sqlite3 (or any other good relational database, but sqlite comes with Python and is more convenient for such a small data set) looks like the right approach for your task. If you'd rather not learn SQL, SQLAlchemy is a popular "wrapper" over relational databases, so to speak, which lets you deal with them at any of several different levels of abstraction of your choice.
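If it helps, here is a minimal sketch of what SQLAlchemy's ORM level looks like (assuming a reasonably recent SQLAlchemy is installed; the class and column names are purely illustrative, not anything prescribed by your problem):

```python
# Minimal SQLAlchemy ORM sketch: one mapped class backed by an in-memory SQLite DB.
# Names (Book, title, etc.) are illustrative assumptions, not taken from the question.
from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.orm import declarative_base, Session

Base = declarative_base()

class Book(Base):
    __tablename__ = "books"
    id = Column(Integer, primary_key=True)
    title = Column(String)

engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)          # emits the CREATE TABLE for you

with Session(engine) as session:
    session.add(Book(title="Dive Into Python"))
    session.commit()
    print(session.query(Book).count())    # -> 1
```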
And "doing all this in memory" is not a problem at all (this is stupid , mind you, since you will uselessly pay the overhead for reading all the data from somewhere more persistent for each and every launch of your program, saving the database to disk will save this overhead to you - but that's another problem ;-). Just open your sqlite database as ':memory:' , and there you are - a new new relational database that lives entirely in memory (only for the duration of your process), not a single disk is involved in the procedure at all , so why not? -)
Personally, I would use SQL directly for this task - it gives me excellent control over exactly what's happening and lets me easily add or remove indexes to tune performance, etc. You would use three tables: a Books table (primary key ID, plus other fields such as Title etc.), an Authors table (primary key ID, plus other fields such as Name etc.), and a many-to-many relationship table, say BookAuthors, with just two fields, BookID and AuthorID, and one row per book-author connection.
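A sketch of that three-table layout (the exact column names are my own choice for illustration; nothing forces them on you):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Books (
        BookID   INTEGER PRIMARY KEY,
        Title    TEXT NOT NULL
    );
    CREATE TABLE Authors (
        AuthorID INTEGER PRIMARY KEY,
        Name     TEXT NOT NULL
    );
    -- the many-to-many link: one row per (book, author) pair
    CREATE TABLE BookAuthors (
        BookID   INTEGER NOT NULL REFERENCES Books(BookID),
        AuthorID INTEGER NOT NULL REFERENCES Authors(AuthorID),
        PRIMARY KEY (BookID, AuthorID)
    );
""")
```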
The two fields of the BookAuthors table are what's known as "foreign keys", referring to the ID fields of Books and Authors respectively, and you can define them with ON DELETE CASCADE so that rows referring to a book or author that gets deleted are automatically dropped in turn - an example of the high semantic level at which even "bare" SQL lets you work, and which no other existing data structure can come close to matching.
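Here is the same schema in condensed form with ON DELETE CASCADE on the link table, plus a quick demonstration of the effect (note that in SQLite the foreign_keys pragma must be turned on per connection for foreign keys to be enforced at all; the sample data is, of course, just illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces FKs only with this pragma on
conn.executescript("""
    CREATE TABLE Books   (BookID   INTEGER PRIMARY KEY, Title TEXT);
    CREATE TABLE Authors (AuthorID INTEGER PRIMARY KEY, Name  TEXT);
    CREATE TABLE BookAuthors (
        BookID   INTEGER REFERENCES Books(BookID)     ON DELETE CASCADE,
        AuthorID INTEGER REFERENCES Authors(AuthorID) ON DELETE CASCADE,
        PRIMARY KEY (BookID, AuthorID)
    );
""")
conn.execute("INSERT INTO Books   VALUES (1, 'Python in a Nutshell')")
conn.execute("INSERT INTO Authors VALUES (1, 'Alex Martelli')")
conn.execute("INSERT INTO BookAuthors VALUES (1, 1)")

conn.execute("DELETE FROM Authors WHERE AuthorID = 1")
# the linking row was dropped automatically by the cascade:
print(conn.execute("SELECT * FROM BookAuthors").fetchall())   # -> []
```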
Alex Martelli