Database BLOBs Compared to Disk Files

I have a requirement that the application should allow users to upload and download about 6,000 files per month (mainly pdf, doc, xls).

I am trying to find the best solution for this. The question is whether to store these files as BLOBs in my database or to write and read them in a simple file hierarchy on disk.

The application stack is Java 1.6, Spring 3.1, Dojo and Informix 10.X.

I would appreciate recommendations based on your experience.

2 answers

If the database holds other data related to these files, storing the files in the file system makes things more complicated:

  • Backups must be taken separately.
  • Transactions must be handled separately (to the extent that file system operations can be transactional at all).
  • Integrity checks between the database and the file system structure become your responsibility.
  • No cascades: deleting a user does not automatically delete that user's files.
  • You first have to query the database for the file path and then fetch the file from the file system.
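The integrity-check point can be made concrete with a small audit that compares the paths recorded in the database against what is actually on disk. This is only a sketch under assumptions: the class name `FileStoreAudit` is hypothetical, the database is assumed to store paths relative to a storage root with `/` as separator, and the list of paths is passed in rather than queried. It uses only `java.io`, so it should compile on Java 1.6.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper: compares paths recorded in the database with files on disk. */
class FileStoreAudit {
    /** Relative paths the database knows about but that are missing on disk. */
    final List<String> missingOnDisk = new ArrayList<String>();
    /** Relative paths found on disk that no database row refers to. */
    final List<String> orphanedOnDisk = new ArrayList<String>();

    /** pathsFromDb is assumed to come from something like "SELECT path FROM documents". */
    static FileStoreAudit run(File root, Collection<String> pathsFromDb) {
        FileStoreAudit audit = new FileStoreAudit();
        Set<String> known = new HashSet<String>(pathsFromDb);
        for (String relative : known) {
            if (!new File(root, relative).isFile()) {
                audit.missingOnDisk.add(relative);
            }
        }
        collectOrphans(root, "", known, audit.orphanedOnDisk);
        return audit;
    }

    /** Walks the directory tree and records every file the database does not know. */
    private static void collectOrphans(File dir, String prefix,
                                       Set<String> known, List<String> out) {
        File[] children = dir.listFiles();
        if (children == null) {
            return; // not a directory, or not readable
        }
        for (File child : children) {
            String relative = prefix + child.getName();
            if (child.isDirectory()) {
                collectOrphans(child, relative + "/", known, out);
            } else if (!known.contains(relative)) {
                out.add(relative);
            }
        }
    }
}
```

Run periodically, a check like this at least detects the drift that a BLOB column with foreign keys would have prevented in the first place.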

The advantage of a file system-based solution is that it is sometimes handy to have direct access to the files, for example to copy some of them somewhere else. Also, storing binary data can of course increase the size of the database significantly. Either way, the disk space has to come from somewhere with both solutions.

Of course, all this may demand more database resources than are currently available. In general there can be a significant performance hit, especially if the choice is between a local file system and a remote database. In your case (6,000 files per month) raw throughput will not be a problem, but latency may be.


When you ask what the "best" solution is, it is a good idea to include your evaluation criteria: speed, cost, simplicity, maintenance, etc.

Mikko Manu's answer is pretty much on the money. I have not used Informix in 20 years, but most databases are a little slow when working with BLOBs. In particular, the step of getting a BLOB into and out of the database can be slow.

This problem tends to get worse as more users access the system at the same time, especially if they are using a web application. The application server has to work quite hard to get the files out of the database, it probably consumes far more memory for these requests than usual, and the file-related requests will probably take longer to complete than "regular" pages.
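One common way to keep the memory cost per request bounded is to stream the BLOB in fixed-size chunks instead of loading it into a byte array. The sketch below uses only standard JDBC calls (`PreparedStatement`, `ResultSet.getBinaryStream`). The class name `BlobStreaming` and the table `documents(id, content)` are assumptions for illustration, and whether the driver really streams rather than buffering the whole value internally depends on the driver, so this is worth measuring rather than taking on trust.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

/** Hypothetical helper for sending a stored document to a response stream. */
class BlobStreaming {

    /** Copies in 8 KB chunks, so memory use does not grow with the file size. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    /** Streams one document to 'out'; returns false if the id is unknown or empty. */
    static boolean writeDocument(Connection con, long id, OutputStream out)
            throws SQLException, IOException {
        // Assumed schema: documents(id, content). Adjust to the real table.
        PreparedStatement ps =
                con.prepareStatement("SELECT content FROM documents WHERE id = ?");
        try {
            ps.setLong(1, id);
            ResultSet rs = ps.executeQuery();
            try {
                if (!rs.next()) {
                    return false;
                }
                InputStream in = rs.getBinaryStream(1);
                if (in == null) {
                    return false; // column was SQL NULL
                }
                try {
                    copy(in, out);
                } finally {
                    in.close();
                }
                return true;
            } finally {
                rs.close();
            }
        } finally {
            ps.close();
        }
    }
}
```

In a servlet, `out` would typically be the response output stream. The explicit `try`/`finally` blocks are there because Java 1.6 has no try-with-resources.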

This can slow down the web server even at moderate load. If you decide to store the documents in your database, I strongly recommend running some performance tests to see whether you have a problem. This kind of solution tends to expose weaknesses in your setup that would otherwise go unnoticed (a slow network connection to the database server, insufficient RAM on your web servers, etc.).

To avoid this, I have stored the "master" copies of the documents in the database, so that everything gets backed up together and I can ask the database questions such as "Do I have all the documents for user x?". However, I used a cache on the web server to avoid reading documents from the database more often than necessary. This works well if you have a "write once, read many" kind of solution, such as a content management system, where the cache can earn its keep.
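A minimal sketch of such a web-server cache, assuming documents are identified by a numeric id and are small enough to hold in memory: a read-through LRU cache built on `LinkedHashMap` in access order. The names `DocumentCache` and `Loader` are hypothetical, and the loader stands in for whatever code reads the BLOB from the database.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical read-through cache that keeps the most recently used documents. */
class DocumentCache {

    /** Stand-in for the code that reads a document from the database. */
    interface Loader {
        byte[] load(long id);
    }

    private final Loader loader;
    private final Map<Long, byte[]> entries;

    DocumentCache(Loader loader, final int maxEntries) {
        this.loader = loader;
        // accessOrder = true makes iteration order least-recently-used first.
        this.entries = new LinkedHashMap<Long, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, byte[]> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Returns the cached document, loading it from the database on a miss. */
    synchronized byte[] get(long id) {
        byte[] content = entries.get(id);
        if (content == null) {
            content = loader.load(id);
            if (content != null) {
                entries.put(id, content);
            }
        }
        return content;
    }

    /** Call after a document is replaced or deleted in the database. */
    synchronized void invalidate(long id) {
        entries.remove(id);
    }
}
```

The single lock keeps the sketch short but also serializes database loads; for a "write once, read many" workload that is often acceptable. A cache that stores files on the web server's local disk instead of in memory follows the same shape and avoids holding large documents on the heap.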
