Tracking external files with an SQL database and deleting an external file when deleting an entry for it

I have no idea whether I'm going about this the right way or doing something completely stupid.

I have a file system that will contain a bunch of image files. These are large map images of varying sizes. I use my database to perform spatial queries on them.

Basically, all I want is to be able to add an image's information (name, directory, and spatial information) to the database, and to delete an image from the database (the records in all tables, along with the external file associated with them). I know how to delete all the records, but not the external data. I do not want to embed the images in the database as binary blobs, because I often use external tools on the files.

Basically, my database only tracks the file name and directory along with the spatial data associated with the file.

How do I delete a file from the file system when I delete a record from the database?

Am I even going about this the right way? Is it more common to embed an image in the database as a binary blob? (The overhead of copying the data around makes this seem implausible to me, and there should be a better way.)

I hope this does not matter, but I use PostgreSQL as my SQL database, on Linux.

EDIT: my current strategy is a shell script that handles image deletion. The script runs a transaction that deletes all database records associated with the image, saving the full path of the file to a flat text file. If the transaction succeeds, I delete the image listed in the flat file. Is this reasonable? Is there a better way?

+4
4 answers

Your "current strategy" sounds like the standard approach to me: delete from the database and, if that succeeds (and that is a big "if"), delete the corresponding image file. You will probably want a sanity check to make sure you are not accumulating cruft, but that is just a matter of comparing the database with the file system to make sure they agree with each other.

You do not need to store the images in the database. The file system is perfectly suited to handling files, and it will probably be much more convenient to have them there. And, as David Ryder noted below, a file system will almost certainly be much faster at working with large image files than a database: file systems are pretty good at what they do.


UPDATE: If you need the delete itself to be very fast, you could move the file deletion into a cron job. Once every couple of hours (or once a day, or whatever works), the cron job can compare the database with the file system and delete any stray images. This would also make bulk removal from the database easier: you could do DELETE FROM whatever WHERE ... to kill multiple records, and your cleaner would come along later to clear out the leftover images.
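For what it's worth, the comparison step of such a cleaner might look like the Python sketch below. It assumes the list of paths known to the database has already been fetched (for example with a SELECT on whatever table holds them); `find_orphans`, `find_missing`, and `image_root` are illustrative names, not anything from the question.

```python
import os


def find_orphans(image_root, known_paths):
    """Return files under image_root that have no database record."""
    known = {os.path.abspath(p) for p in known_paths}
    orphans = []
    for dirpath, _dirnames, filenames in os.walk(image_root):
        for name in filenames:
            full = os.path.abspath(os.path.join(dirpath, name))
            if full not in known:
                orphans.append(full)
    return sorted(orphans)


def find_missing(known_paths):
    """Return database paths whose file no longer exists on disk."""
    return sorted(p for p in known_paths if not os.path.exists(p))
```

The cron job would then remove whatever `find_orphans` returns and report whatever `find_missing` returns. A real cleaner would probably also skip very recent files, so that an image whose INSERT has not been committed yet is not mistaken for a stray.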

+2

Much depends on where you want to place your images.

Since a database usually needs fast random I/O, you would put it on a box with a good RAID 10 controller with battery backup.

But a web server serving zillions of mostly static (rarely updated) files needs very different hardware, perhaps RAID 6 or a cloud of cheap servers.

Therefore, you should consider this in your design.

1) ON DELETE trigger

You could delete the file from an ON DELETE trigger. The big problem: if the transaction is rolled back, the files stay deleted!

2) log table

The ON DELETE trigger inserts the deleted image records into a log table. A cron job reads that table and deletes the files later.

==> No problem with ROLLBACK
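Here is a rough, runnable illustration of the log-table idea. To keep it self-contained it uses Python's built-in `sqlite3` as a stand-in, so the trigger syntax is SQLite's; in PostgreSQL the same thing would be written as a trigger function. The table names `images` and `deleted_files` and the function `drain_deletion_log` are made up for the example.

```python
import os
import sqlite3

# Hypothetical schema: the trigger copies the path of every deleted row
# into deleted_files, inside the same transaction as the DELETE.
SCHEMA = """
CREATE TABLE images (id INTEGER PRIMARY KEY, path TEXT NOT NULL);
CREATE TABLE deleted_files (path TEXT NOT NULL);
CREATE TRIGGER log_deleted_image AFTER DELETE ON images
BEGIN
    INSERT INTO deleted_files (path) VALUES (OLD.path);
END;
"""


def drain_deletion_log(conn):
    """Unlink every file listed in deleted_files, then drop the log rows."""
    rows = conn.execute("SELECT rowid, path FROM deleted_files").fetchall()
    removed = []
    for rowid, path in rows:
        try:
            os.remove(path)
        except FileNotFoundError:
            pass  # already gone; the log row can still be dropped
        conn.execute("DELETE FROM deleted_files WHERE rowid = ?", (rowid,))
        removed.append(path)
    conn.commit()
    return removed
```

Because the log row is written by the same transaction as the DELETE, a rollback removes it again, and the job that calls `drain_deletion_log` only ever sees paths whose deletion was actually committed.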

3) garbage collection

A cron job compares the list of files on disk with the contents of the database and deletes any file on disk that has no corresponding database record.

It's safe, but probably a lot slower than the log table!

4) do it in the application:

  • DELETE ... RETURNING gives you the list of deleted records; then COMMIT
  • delete the files from the file system

Failure Points:

  • if your application dies between the COMMIT and the unlink(), you are left with files that have no database record; putting the COMMIT after the unlink() is worse, because a failed commit leaves records whose files are gone.
  • the same applies to INSERTs ...
  • if something other than the application deletes rows from the database, the files are not cleaned up.
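A minimal Python sketch of option 4, assuming a psycopg2-style connection and a hypothetical `images` table with `id` and `path` columns (none of these names come from the question):

```python
import os

# Hypothetical table and column names; adjust to the real schema.
DELETE_SQL = "DELETE FROM images WHERE id = ANY(%s) RETURNING path"


def delete_images(conn, image_ids):
    """Delete image rows, commit, and only then unlink the files.

    conn is assumed to be a psycopg2-style DB-API connection.
    Returns the paths that could not be removed from disk.
    """
    try:
        with conn.cursor() as cur:
            cur.execute(DELETE_SQL, (list(image_ids),))
            paths = [row[0] for row in cur.fetchall()]
        conn.commit()
    except Exception:
        conn.rollback()  # nothing was unlinked yet, so disk and database agree
        raise

    failed = []
    for path in paths:
        try:
            os.remove(path)
        except FileNotFoundError:
            pass  # already gone, nothing to do
        except OSError:
            failed.append(path)  # leave it for a later cleanup pass
    return failed
```

The ordering is the whole point: files are touched only after the COMMIT has succeeded, so the worst case is a stray file that a cleanup job can pick up later, never a record pointing at a missing file.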
+5

There is a PHP function to delete files from the file system:

 unlink(filename) 
+1

How do I delete a file from the file system when I delete a record from the database?

You can use a PL/PerlU function for this. However, if the file is not stored as a LOB, do not do this. Think about what would happen if an error occurs and the transaction is rolled back.

The right way to manipulate the file system is to do it from your application, once you are 100% sure that the file's data has been correctly set or unset in the database, with no further possibility of errors or rollbacks.

0
