LevelDB is particular implementation of a key-value store.
We are using FUSE to reuse the filesystem as our interface to the LevelDB store,
although it could be an API accessed by a socket, or other connector software technology.
The actual data is stored in files in a traditional filesystem, however the kinds of
file, their access and use, is conducive to a more robust and distributable data store.
There are three types of files, called level 0, 1 and 2. With the exception of one level 0
file, the open level 0 file, all files are atomic and immutable. They either exist or not,
and if the exist, they never change. The open level 0 file can change, but it accepts only
appending new key-value pairs to the table.
Any two versions of the open level 0 file agree on any key value pair the have in common,
and one of the two contains on those pairs that are in common.
At no point is there an uncertainty about what key-value pairs are in the store or their value.
New tables are created from old tables, without the old tables needing to be taken off line,
since they are immutable. The new tables are redundant with the information in the old
tables that they replace.
Switch over to the new table is seemless as both the new and old agree entirely on the
key-value pairs stored, it is just that the new table summarizes and organizes the old tables.
The files might also be eternal, creating a persistent version of the database at past
moments in time. Or they might be reaped, deleting that history.
A separate program and policy can repeat or archive old tables, according to the requirements
of the particular application.
Project: Mini-levelDB
Your project maintains most of the important principles of LevelDB, but does not implement
it fully. Your project will only keep one level 2 table, and up to two level 0 tables,
one open the other closed. You will most likely experience level 1 tables in your coding,
as your merge and reorganize the tables.
Subproject: Leveldb-merge
Given a level 0 and and level 2, merge the keys into a level table.
In some manner you must,
Sort the level 0 table, careful not to exchange the order of pairs with equal
valued keys.
When multiple pairs exist with the same key, discard all but the most recent pair.
Do retain tombstones.
Do a merge (see merge-sort) of the resulting table with the level 0 table. In cases of
equal keys in both tables, discard the pair in the level 2 table (it is older).
Discard tombstones.
Remember not to modify the input tables. They might still be in use by the leveldbd program.
Your algorithm might combine the sort and glean step, or the glean and merge step.
Subproject: Level 2 table integration
Modify your leveldbd to search first in the open level 0
table (done) and if a key is not found, then binary search in level 2 table (new code to write).
Combine your programs in a simple implementation that,
The leveldbd daemon is launched with empty level 2 and level 0 tables.
The daemon program is stopped, and leveldb-merge runs on the tables.
The leveldbd program is restarted on the result of the merge as the level 2 table
and an empty level 0 table.
Implementation Notes:
As in the previous project, you can assume a standard name for the level 2 file,
// in leveldb.h
#define LEVEL0_NAME "l0.dat"
#define LEVEL2_NAME "l2.dat"
The makefile is a good place to deal with filenames. If the output of merge is not l2.dat,
the rename can be part of the make-target.