I am working on backup software, and the idea was to store all the information about each file in an SQLite db. So there basically is a “file” table that contains things like file size and SHA1 hash (and other information). This works nicely as long as there are about 1 million files: insert speed is acceptable, and queries are reasonably fast.
However, if I assume that someone will back up the contents of a 2 TB disk, and each file is about 20 kB, there would be about 100 million entries in the file table. I did some tests, and the problem is that SQLite gets slower and slower as the number of files (table entries) grows, to the point where it is absolutely unusable. Of course, if we’re conservative and assume that each table entry takes 100 bytes including all index and internal overhead, we’re talking about a 10 GB database, which on most systems will neither be cached completely by the kernel nor fit into the available memory.
Is there any open source alternative (I’d prefer a serverless solution like SQLite, because it’s easier to set up) that can handle huge tables like the one I need? Preferably with C or C++ and Python language bindings. I’d also consider a non-SQL solution; I don’t use many SQL features anyway.
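Roughly, the table looks like this (a simplified sketch; the real schema has more columns):

```python
import sqlite3

conn = sqlite3.connect("backup.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS file (
        id    INTEGER PRIMARY KEY,
        path  TEXT NOT NULL,
        size  INTEGER NOT NULL,
        sha1  TEXT NOT NULL,      -- hex digest of the file contents
        mtime INTEGER             -- ...plus the other metadata columns
    )
""")
conn.commit()
```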
30 comments
Try Firebird (http://www.firebirdsql.org/). It can be used both in standalone mode and in client-server mode, it’s easy to install and lightweight, and I have a 16 GB database (holding hi-res images in BLOB fields) working quite well with it.
Have you thought about using MongoDB? It comes with excellent Python and C support and is really pretty fast. As your documents are rather small and of identical structure, that should have a positive impact on speed.
I have no idea how it will behave with such a large amount of data, but you could try PyTables:
http://www.pytables.org/moin
What’s wrong with all the existing backup software out there?
If you feel adventurous, you could use libglom to start a local PostgreSQL instance and then use libgda (via C, C++, or Python) with it. The test_selhosting_* test cases here would give a rough idea:
http://git.gnome.org/browse/glom/tree/tests
though you would want to adapt the API to avoid the need for a Glom::Document instance, which should be easy if you don’t use much of the API.
I used multiple SQLite databases (rotated so that each was at most 1 GB in size). So one possible solution is to split the database according to a file prefix…
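For example, something along these lines (picking the shard from the first hex digit of the path’s hash is just one illustrative choice):

```python
import hashlib
import sqlite3

def shard_connection(path, base="backup"):
    """Pick one of 16 SQLite files based on the first hex digit of the path's hash."""
    prefix = hashlib.sha1(path.encode("utf-8")).hexdigest()[0]
    return sqlite3.connect("%s-%s.db" % (base, prefix))

conn = shard_connection("/home/user/some/file.txt")
```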
I like Tokyo Cabinet: http://fallabs.com/tokyocabinet/
Python binding: http://packages.python.org/tokyo-python/cabinet.html
You can try Lucene.
My first thought was mongo. Also how about:
http://www.daemonology.net/blog/2011-03-28-kivaloo-data-store.html
Tokyo Cabinet has been superseded by Kyoto Cabinet; it’s the same basic idea, it just scales a lot better.
http://fallabs.com/kyotocabinet/
Make sure you are adding entries to the database in big batches within a transaction. I found that proper use of BEGIN; and END; gives an order-of-magnitude increase in insertion speed.
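Something like this, using the Python sqlite3 module (table and column names are assumptions based on the post):

```python
import sqlite3

# isolation_level=None lets us issue BEGIN/COMMIT ourselves.
conn = sqlite3.connect("backup.db", isolation_level=None)
conn.execute("CREATE TABLE IF NOT EXISTS file (path TEXT, size INTEGER, sha1 TEXT)")

# Hypothetical batch of scanned files: (path, size, sha1 hex digest).
batch = [
    ("/etc/hosts", 220, "2aae6c35c94fcfb415dbe95f408b9ce91ee846ed"),
    ("/etc/fstab", 812, "da39a3ee5e6b4b0d3255bfef95601890afd80709"),
]

# One explicit transaction per batch instead of one implicit transaction per INSERT.
conn.execute("BEGIN")
conn.executemany("INSERT INTO file (path, size, sha1) VALUES (?, ?, ?)", batch)
conn.execute("COMMIT")
```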
Berkeley DB?
It isn’t as hip as the other suggestions, but if you want something battle-tested, go with BDB. If you want something younger, HamsterDB claims to be a contender. (Sorry, but I have no actual experience with HamsterDB.)
+1 on Berkeley DB. In process, highly mature, very fast, bindings for all languages, shipped by default in most distros.
If you want to avoid SQL, don’t go for PostgreSQL or Firebird as others advise.
If I were you, I would try to avoid Mongo as well (seems too unstable wrt data consistency, read http://pastebin.com/raw.php?i=FD3xe6Jt for an interesting and very well documented opinion).
So why not try DB4O? It’s a very mature NoSQL solution that avoids the object/relational impedance mismatch and seems very performant. It is open source and has JVM/.NET bindings, which means you could use languages like C#, Java, Scala, or even Jython/IronPython/Boo/etc.
You could try LevelDB (from Google), it supports compression.
Also, are you using indexes properly? Did you try to use compression? Did you enable SQLite’s WAL?
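For example (table and column names are assumptions based on the post):

```python
import sqlite3

conn = sqlite3.connect("backup.db")
conn.execute("CREATE TABLE IF NOT EXISTS file (path TEXT, size INTEGER, sha1 TEXT)")

# Write-ahead logging: readers and the writer no longer block each other.
conn.execute("PRAGMA journal_mode=WAL")

# Make sure lookups by hash hit an index instead of scanning the whole table.
conn.execute("CREATE INDEX IF NOT EXISTS file_sha1_idx ON file (sha1)")
conn.commit()
```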
I realise this isn’t the question you asked, but are you really expecting someone to be backing up 2TB of 20kb files? Most 2TB drives are full of much larger files…
(Obviously this isn’t “all 2TB drives” – I’ve seen in excess of 10TB of files ~20kb before, but it’s always good to make sure you’re solving the right problem 🙂
If you use GDA directly (http://www.gnome-db.org/) you’ll get access to any supported database: SQLite or PostgreSQL.
While the database is small, SQLite works better; once it grows, you can set up a routine that uses a virtual connection to copy the data to a PostgreSQL database for better performance.
GDA is easy to set up and transparent to your code, no matter what database provider you are using.
GDA is written in C, has C++ bindings, and supports GObject Introspection, so it is available from any language supporting it, like Python and JavaScript. It has Vala bindings too.
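An untested sketch of what that could look like from Python via GObject Introspection, assuming libgda 5.x and its introspection data are installed:

```python
import gi
gi.require_version("Gda", "5.0")
from gi.repository import Gda

Gda.init()

# Open (or create) a local SQLite database; switching to the PostgreSQL
# provider should only require changing the provider name and connection string.
cnc = Gda.Connection.open_from_string(
    "SQLite", "DB_DIR=.;DB_NAME=backup", None,
    Gda.ConnectionOptions.NONE)

cnc.execute_non_select_command(
    "CREATE TABLE IF NOT EXISTS file (path text, size integer, sha1 text)")
data = cnc.execute_select_command("SELECT count(*) FROM file")
print(data.dump_as_string())
```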
Honestly, I’d try a real database first. I am not convinced by SQLite’s underlying premise that, just because the data fits in memory, optimization is secondary.
But perhaps you should benchmark your queries a bit more and find the actual slowdown. Have you used “explain $query” to check whether primary keys are accessed via indices?
You could try looking into Kyoto Cabinet, its predecessor Tokyo Cabinet (in some ways at least), Google’s LevelDB, or Berkeley DB (BDB). All of these offer embedded key/value stores, most (all?) of them written in C/C++ and with Python bindings. There’s a few nice comparisons on the web, e.g. the LevelDB benchmarks by Google, or various highly interesting articles on basho.com.
In my old job, which had software generating very large numbers of small records, qdbm (http://fallabs.com/qdbm/) was very useful and fast, so it’s definitely worth looking at some of their newer stuff, like Kyoto Cabinet.
Maybe have a peek at how other projects do it? I know for example Unison (the file synchronizer) keeps track of that type of metadata in “archive files”. Not a real database though.
Or you can use ZFS for the file system, and back up your files using zfs send and receive every X minutes or so. Thus you have no need whatsoever for databases to keep track of your file manifests.
The problem is that you are not telling us what you are doing with the metadata that you have stored in the SQLite database. Tell us, and we may be able to understand the usage pattern, or even whether it makes sense to do what you are doing.
michael: you mean “explain query plan”; the output of “explain” alone is barely readable.
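For reference, a quick way to check that from Python (table and index names are assumptions):

```python
import sqlite3

conn = sqlite3.connect("backup.db")
conn.execute("CREATE TABLE IF NOT EXISTS file (path TEXT, size INTEGER, sha1 TEXT)")
conn.execute("CREATE INDEX IF NOT EXISTS file_sha1_idx ON file (sha1)")

# With the index in place this should report a SEARCH using file_sha1_idx
# rather than a full SCAN of the file table.
for row in conn.execute("EXPLAIN QUERY PLAN SELECT size FROM file WHERE sha1 = ?",
                        ("da39a3ee5e6b4b0d3255bfef95601890afd80709",)):
    print(row)
```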
Another question: since you deal with files and have chosen SQLite as the backend, why not store the information in a private Tracker instance? It also uses SQLite, has schemas for files and more, and quite a few smart people worked on it full time for a few years to get optimal performance out of it.
I’m doing a similar thing (a host IDS) and am looking at Redis, after a brief look at the alternatives (http://en.wikipedia.org/wiki/No_sql).
@Rudd-O: I’m implementing a FUSE filesystem where the metadata (file size, mode, …) is stored in the DB. The contents of the files are stored in the filesystem and referenced from the database via their SHA1 hash. Additionally, I provide a “commit” command, which later allows access to previous versions. There are also tools for replicating the filesystem content to other computers. It’s basically a from-scratch rewrite of bfsync.
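Roughly, the content store works like this (a simplified sketch, not the real code):

```python
import hashlib
import os
import shutil

def store_contents(path, store_dir):
    """Copy a file into the content store, named after its SHA1 hash.

    Simplified sketch: the real implementation also records the hash
    in the metadata database.
    """
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    digest = h.hexdigest()
    subdir = os.path.join(store_dir, digest[:2])
    if not os.path.isdir(subdir):
        os.makedirs(subdir)
    dest = os.path.join(subdir, digest[2:])
    if not os.path.exists(dest):      # identical contents are stored only once
        shutil.copyfile(path, dest)
    return digest
```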
@kamstrup: I tried Berkeley DB today, and a first performance test indicates that it performs much better than SQLite; this could be the solution I was looking for. I need to do more tests before I can move my code from SQLite to BDB.
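A minimal sketch of the dict-style interface the bsddb3 bindings offer (the key/value layout here is purely illustrative, not my actual test):

```python
import bsddb3

# B-tree keyed by the SHA1 hash; the value packs the remaining metadata.
db = bsddb3.btopen("files.bdb", "c")
db[b"da39a3ee5e6b4b0d3255bfef95601890afd80709"] = b"size=0;path=/tmp/empty"
print(db[b"da39a3ee5e6b4b0d3255bfef95601890afd80709"])
db.close()
```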
Also … mention of backup and hashes reminds me of Bup – backup using git.
Just in case you’re not aware of it — might be interesting to you.
https://github.com/apenwarr/bup
If you are going to use Berkeley DB you can use its SQL API, which is a drop-in replacement for SQLite (it even uses the same parser).
I created a solution just like the one you want. At first I tried to use SQLite, but I had the same problem you had. After looking at other database solutions and not finding anything useful for this scenario (I’m using FUSE too), I developed my own format to accomplish it. Now I have tables using HMAC-SHA512 for signing, AES-256 for the data, random access, and append-only files, and it is very, VERY fast. At the moment I’m optimizing memory consumption.
If you’d like to talk: gjalves at gjalves com br