Pros and cons of using sqlite3 vs custom table implementation

I noticed that a significant part of my (pure Python) code deals with tables. Of course, I have a Table class that supports the basic functionality, but I keep adding more and more features to it: queries, validation, sorting, indexing, etc.

I am wondering whether I could remove the Table class and reorganize the code to use a regular relational database, which I would create in memory.

Here is my thinking:

  • Query performance and indexing will improve, but communication between the Python code and a separate database process might be less efficient than calls between Python functions. I assume that overhead would be too much, so I would have to go with sqlite, which ships with Python and lives in the same process. Hopefully that makes this a pure performance gain (the non-standard SQL dialect and the limited capabilities of sqlite notwithstanding); see the minimal sketch after this list.

  • With SQL, I get far more powerful functionality than I would ever want to build myself. That seems like a clear advantage (even with sqlite).

  • I will not need to debug my own table implementation, but debugging mistakes in SQL statements is hard: I cannot set breakpoints or easily print intermediate state. I do not know how to judge the overall effect on code reliability and debugging time.

  • The code will be easier to read, because instead of calling my own custom methods I would write SQL (everyone who has to maintain this code knows SQL). However, the Python code for talking to a database can be uglier and more complex than code that uses a pure Python Table class. Again, I do not know which is better on balance.
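For concreteness, here is a minimal sketch of the in-memory setup from the first bullet, using only the sqlite3 module that ships with Python (the table name and columns are made-up placeholders):

```python
import sqlite3

# The database lives inside this Python process: no separate server, no IPC.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE points (x REAL, y REAL, label TEXT)")
conn.execute("CREATE INDEX idx_points_label ON points (label)")  # indexing for free

conn.executemany(
    "INSERT INTO points VALUES (?, ?, ?)",
    [(1.0, 2.0, "a"), (3.0, 4.0, "b"), (5.0, 6.0, "a")],
)

# A WHERE clause plus ORDER BY replaces hand-written filter/sort methods.
rows = conn.execute(
    "SELECT x, y FROM points WHERE label = ? ORDER BY x", ("a",)
).fetchall()
print(rows)  # [(1.0, 2.0), (5.0, 6.0)]
```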

Any corrections to the above or anything else I should think about?

3 answers

SQLite does not run in a separate process, so you actually have no IPC overhead at all. (IPC overhead is not that big anyway, especially over UNIX sockets.) If you need several writers, i.e. more than one process or thread writing to the database concurrently, the locking overhead is probably the bigger problem, and MySQL or PostgreSQL will perform better, especially if they run on the same machine. The basic SQL supported by all three of these databases is the same, so benchmarking them against each other is not that painful.

As a rule, you do not need to do the same kind of debugging on SQL statements that you would do on your own implementation: SQLite itself works and is pretty well debugged already. It is very unlikely that you will ever have to debug "OK, this row exists, so why doesn't the database find it?" and trace a bug in index updating. Debugging SQL is a very different experience from debugging procedural code, and it really only comes up for fairly complex queries.

As for debugging your own code, you can fairly easily centralize your SQL calls and add tracing to log the queries you run, the results you get back, and so on. Python's SQLite interface may already have this built in (I am not sure; I mostly use Perl). It will probably be easiest to turn your existing Table class into a wrapper around SQLite.
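For what it's worth, Python's sqlite3 module does provide this: since Python 3.3, connections have a set_trace_callback() hook that is invoked with the text of each SQL statement as it runs. A minimal sketch:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Call the given function with every SQL statement sqlite executes.
conn.set_trace_callback(lambda stmt: print("SQL:", stmt))

conn.execute("CREATE TABLE t (v INTEGER)")
conn.execute("INSERT INTO t VALUES (?)", (42,))
# Both statements above are printed as they run; pass None to disable tracing.
```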

I would strongly recommend not reinventing the wheel. SQLite will have far fewer bugs and will save you a ton of time. (You may also want to look at Firefox's fairly recent switch to SQLite for storing history and the like; I believe they got some pretty significant speedups from it.)

Also, the well-optimized C implementation of SQLite is probably faster than any pure Python table implementation.


You could try making a wrapper around sqlite that exposes the same interface as your Table class, so that your calling code stays clean while sqlite does the real work.
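As a rough sketch of what such a wrapper might look like (the method names insert and query are hypothetical stand-ins for whatever your Table class actually exposes):

```python
import sqlite3

class Table:
    """Keeps the old Table interface but delegates storage and queries to sqlite."""

    def __init__(self, name, columns):
        # name and columns come from your own code, not user input,
        # so interpolating them into the DDL here is acceptable.
        self._name = name
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute(f"CREATE TABLE {name} ({', '.join(columns)})")

    def insert(self, *values):
        marks = ", ".join("?" for _ in values)
        self._conn.execute(f"INSERT INTO {self._name} VALUES ({marks})", values)

    def query(self, where, params=()):
        # Data values still go through ? placeholders.
        sql = f"SELECT * FROM {self._name} WHERE {where}"
        return self._conn.execute(sql, params).fetchall()

people = Table("people", ["name TEXT", "age INTEGER"])
people.insert("alice", 30)
people.insert("bob", 20)
print(people.query("age > ?", (25,)))  # [('alice', 30)]
```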


If you are doing database-ish work, use a database; if not, don't. It sounds like you are. I would recommend using an ORM to make it more Pythonic; SQLAlchemy is the most flexible (although it is not strictly an ORM).
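For example, a minimal SQLAlchemy sketch against an in-memory sqlite database (the User model is just an illustration; SQLAlchemy 1.4+ syntax assumed):

```python
from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.orm import declarative_base, Session

Base = declarative_base()

class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String)

engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)  # emits CREATE TABLE for all mapped models

with Session(engine) as session:
    session.add(User(name="alice"))
    session.commit()
    # Queries read as Python but run as SQL under the hood.
    print([u.name for u in session.query(User).filter_by(name="alice")])
```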

