As already mentioned, SQLite, JavaDB, and SimpleDB are good examples. I would add Berkeley DB to the list. Berkeley DB is well documented, has been around for years, offers several APIs, and supports multiple access methods, such as HASH, QUEUE and RECNO, in addition to the traditional B-tree. Berkeley DB is a key/value database library written in C. Berkeley DB XML is an XML database library written in C++ on top of Berkeley DB. Berkeley DB Java Edition is a 100% Java key/value database library. All of them are available under an open source license, and source code is included in the distribution.
The Berkeley DB SQL API incorporates the SQLite API, essentially placing a BDB key/value store underneath the SQLite query layer. Berkeley DB was also the first transactional storage engine for MySQL, again taking a SQL query layer and storing the data in a simple key/data format. This is certainly an interesting way to look at the problem: if you have a flexible, fast, scalable and reliable storage engine, you can add any type of API or data presentation/abstraction on top of it. This is exactly what Berkeley DB does, offering a choice between the core key/value storage API or XML, SQL, Java Collections, or a POJO-like persistence layer built over the base key/value infrastructure.
Berkeley DB is about as close to a "pure" storage engine as you will find. It makes no assumptions about the structure, content, or format of the stored data. This allows the upper tiers to provide those abstractions, while the lower tier focuses on fast, scalable, and reliable storage. One of the reasons Berkeley DB is so widely used is that its simplicity and focus make it very fast, reliable and scalable.
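To make the layering idea concrete, here is a minimal sketch in Python. It does not use Berkeley DB itself: the standard library's `dbm.dumb` module stands in for the byte-oriented key/value tier, and `RecordLayer`, `demo`, and the `table/key` naming scheme are hypothetical names made up for illustration. The lower tier only ever sees opaque bytes, and the upper tier decides what they mean.

```python
import dbm.dumb
import json
import os
import tempfile


class RecordLayer:
    """Hypothetical upper tier: maps Python dicts onto an opaque byte store."""

    def __init__(self, store):
        # Any mutable mapping of bytes -> bytes will do; this is an assumption
        # of the sketch, not a Berkeley DB interface.
        self.store = store

    def put(self, table, key, record):
        # The storage tier never learns that the value is JSON.
        encoded = json.dumps(record, sort_keys=True).encode("utf-8")
        self.store[f"{table}/{key}".encode("utf-8")] = encoded

    def get(self, table, key):
        raw = self.store.get(f"{table}/{key}".encode("utf-8"))
        return None if raw is None else json.loads(raw)


def demo():
    # dbm.dumb is a stand-in for the key/value engine, chosen only because
    # it ships with Python.
    path = os.path.join(tempfile.mkdtemp(), "kv")
    with dbm.dumb.open(path, "c") as store:
        # Lower tier used directly: arbitrary bytes in, the same bytes out.
        store[b"raw-key"] = b"\x00\x01 any bytes"

        # Upper tier used on top of the same store.
        layer = RecordLayer(store)
        layer.put("users", 1, {"name": "example"})
        return store[b"raw-key"], layer.get("users", 1), layer.get("users", 2)
```

Calling `demo()` returns the raw bytes unchanged, the decoded record, and `None` for the missing key. Swapping the JSON record layer for a different abstraction would not require touching the storage tier, which is the point being made above.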
Disclaimer: I am one of the product managers for Berkeley DB, so I am clearly a bit biased. But I have also been working on database products for 25 years, and I know a little about DBMS internals :-)
Good luck with your research.
Dave
dsegleau