Many files in one directory?

I am developing a PHP project on the Linux platform. Are there any drawbacks to putting several thousand images (files) in one directory? This is a closed set that will not grow. An alternative would be to split the files into a directory structure based on some identifier, so that, say, only 100 files end up in each directory.

I ask this question because I often notice this kind of separation when I look at the URLs of images on various sites: the directories are laid out so that no more than a few hundred images end up in any one of them.

What would I gain by not putting several thousand files (from a non-growing set) in one directory, but instead dividing them into groups of, say, one hundred? Is it worth complicating things?

UPDATE:

  • There will be no programmatic iteration over the files in the directory (just direct access to the image by file name)
  • I want to emphasize that the set of images is closed. There are fewer than 5000 images, and that’s it.
  • Logical categorization of these images does not exist.
  • No human access / view required
  • Images have unique file names
  • OS: Debian / Linux 2.6.26-2-686, File system: ext3

VALUABLE INFORMATION FROM ANSWERS:

Why allocate many files to different directories:

  • "Limitation of 32 thousand files for each directory when using ext3 over nfs"
  • performance reasons (access speed) [but for several thousand files it’s hard to say whether it’s worth it without measuring]
+7
linux filesystems php
7 answers

Usually the reason for this splitting is file system performance. For a fixed set of 5000 files, I’m not sure it’s worth the hassle. I suggest you try the simple approach of placing all files in one folder, but don’t forget to measure the time required to access the files.

If you see that it is not fast enough for your needs, you can split them up as you suggested.

I had to split files for performance reasons. In addition, I ran into a limit of 32 thousand files per directory when using ext3 over NFS (I’m not sure whether this is an NFS or an ext3 limitation), so there’s another reason to split into multiple directories. Anyway, try with one dir and only split if you see it is not fast enough.
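The "measure before you split" advice above can be sketched as a quick benchmark. This is a minimal, illustrative script (the directory, file names, and file count are made up, not taken from the question): it creates a handful of files in a throwaway temp directory and times direct-by-name access.

```php
<?php
// Minimal timing sketch: create files in a temp directory, then measure
// how long direct access by file name takes. All names are illustrative.
$dir = sys_get_temp_dir() . '/img_bench_' . getmypid();
mkdir($dir);
for ($i = 0; $i < 100; $i++) {
    file_put_contents("$dir/img$i.jpg", 'x'); // tiny placeholder "images"
}

$start = microtime(true);
for ($i = 0; $i < 100; $i++) {
    file_get_contents("$dir/img$i.jpg");      // direct access by name
}
$elapsed = microtime(true) - $start;
printf("Read 100 files in %.4f s\n", $elapsed);

// Clean up the throwaway directory.
for ($i = 0; $i < 100; $i++) {
    unlink("$dir/img$i.jpg");
}
rmdir($dir);
```

Running the same loop against a flat layout and a split layout would show whether the split actually pays off on your ext3 setup.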

+2

In addition to faster file access, by dividing images into subdirectories you also significantly increase the number of files you can store before hitting the file system’s natural limits.

A simple approach is to md5() the file name, then use the first n characters as the directory name (for example, substr(md5($filename), 0, 2) ). This gives a reasonably uniform distribution (versus taking the first n characters of the file name directly).
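The md5-based bucketing described above might look like this. A minimal sketch; the base directory and file name are hypothetical:

```php
<?php
// Map a file name to a bucketed path using the first two hex characters
// of its md5 hash: up to 256 subdirectories, so roughly 20 files each
// for a 5000-image set. $baseDir and $filename are illustrative.
function bucket_path(string $baseDir, string $filename): string {
    $hash = md5($filename);
    return $baseDir . '/' . substr($hash, 0, 2) . '/' . $filename;
}

echo bucket_path('/var/www/images', 'cat.jpg'), "\n";
```

Because the bucket is derived from the name alone, the same file always maps to the same path, so direct access needs no lookup table.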

+7

I think there are two aspects to this question:

  • Can the Linux file system you use efficiently handle directories with thousands of files? I am not an expert, but I think the newer file systems will have no problems with this.

  • Are there performance issues with specific PHP functions? I think direct file access should be fine, but if you do directory listings, you might run into time or memory issues.
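The contrast drawn in the second point can be sketched in a few lines. This is illustrative only (the file name is made up): direct access touches a single path, while a listing materializes every entry in memory, so its cost grows with the number of files.

```php
<?php
// Direct access vs. directory listing. Names here are illustrative.
$dir = sys_get_temp_dir();

// Direct access: one path lookup, independent of how many siblings exist.
$known = $dir . '/known_name_' . getmypid() . '.txt'; // hypothetical file
file_put_contents($known, 'hello');
$data = file_get_contents($known);   // no directory scan involved

// Listing: reads every entry into an array; memory use grows with the
// number of files in the directory.
$entries = scandir($dir);
printf("Directory holds %d entries\n", count($entries));

unlink($known); // clean up
```

Since the question states there will be no iteration over the directory, only the cheap first pattern applies here.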

+1

There is no reason to split these files into multiple directories if you do not expect file-name conflicts and do not need to iterate over these images anywhere.

But still, if you can think of a meaningful categorization, it would be nice to sort the images, even if only for maintenance reasons.

+1

The only case I can imagine where it would be harmful is iterating over the directory: more files mean more iterations. But that’s basically all I can think of from a programming point of view.

0

A few thousand images are still fine. When you access a directory, the operating system reads its list of files in blocks of 4 KB. With a flat directory structure, reading the entire file list can take a while if there are many files (for example, a hundred thousand).

0

If changing the file system is an option, I would recommend moving all the files to ReiserFS. It is great for fast storage and access of large numbers of small files.

If not, MightyE’s answer of dividing them into folders is the most logical and will improve access time by a significant margin.

0
