A quick way to count the number of files in a directory containing hundreds of thousands of files

Question

I work on a Solaris system that processes a large number of files and stores information about them in a database (yes, I know that querying the database is the fastest way to find out how many files we have). I need a quick way to monitor the files as they move through the system on their way to being stored in the database.

I am currently using a Perl script that reads a directory into an array, takes the size of the array, and sends it to the monitoring script. Unfortunately, as our system grows, this monitor gets slower and slower.

I am looking for a method that runs much faster, so that the monitor no longer pauses and updates only every 15-20 seconds after counting the files in all of the directories involved.

I am fairly confident that my bottleneck is the operation that reads the directory into an array.

I don't need any information about the files themselves, neither sizes nor names, just the number of files in a directory.

In my code, I do not count hidden files or the text files that I use to store configuration information. It would be great if this behavior were preserved, but it is not required.

I found some links to counting inodes with C code or something in that direction, but I'm not very experienced in this area.

I would like to make this monitor as close to real time as possible.

The perl code I use looks like this:

    opendir(DIR, $currentDir) or die "Cannot open directory: $!";
    @files = grep ! m/^\./ && ! /config_file/, readdir DIR; # skip hidden files and config files
    closedir(DIR);
    $count = @files;

+8

unix directory perl count solaris

Andrew Jul 18 '13 at 19:58

2 answers

Keep it short.

    my @entries = readdir(DIR);    # list context returns every directory entry
    my $count = @entries - 2;      # the -2 is because readdir counts "." and ".." as directory entries
    print "$count files found\n";
    exit;

1 file found

-1

Andrew Oct 27 '14 at 15:00

pilcrow · Accepted Answer · 2013-07-18T20:35:05+0000

What you are doing right now reads the entire directory (more or less) into memory, only to discard that content once it has been counted. Avoid this by streaming the directory instead:

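    my $count = 0;
    opendir(my $dh, $curDir) or die "opendir($curDir): $!";
    # defined() keeps the loop from stopping early on a file named "0"
    while (defined(my $de = readdir($dh))) {
      next if $de =~ /^\./ or $de =~ /config_file/;   # skip hidden files and config files
      $count++;
    }
    closedir($dh);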

Importantly, do not use glob() in any form. glob() will stat() every entry, which is an expense you don't need.

Depending on the capabilities of your OS or file system, there may be much more sophisticated and lighter-weight ways to do this (Linux, by way of comparison, offers inotify), but streaming the directory as shown above is about as good as you'll get portably.
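For a monitor that checks several directories, the streaming approach can be wrapped in a small subroutine. The following is only a sketch: the name count_files and its $skip parameter are made up for illustration, and the skip pattern assumes the same two filters as the question (names starting with a dot and names containing config_file).

```perl
use strict;
use warnings;

# Hypothetical helper: count the entries in $dir one at a time,
# without ever building a list of file names in memory.
# $skip is a compiled regex matching names that should not be counted.
sub count_files {
    my ($dir, $skip) = @_;
    my $count = 0;
    opendir(my $dh, $dir) or die "opendir($dir): $!";
    # defined() keeps the loop going for a file literally named "0"
    while (defined(my $entry = readdir($dh))) {
        next if $entry =~ $skip;
        $count++;
    }
    closedir($dh);
    return $count;
}

# Skip hidden files (this also covers "." and "..") and the config file.
my $skip = qr/^\.|config_file/;

# Print one count per directory given on the command line, if any.
print "$_: ", count_files($_, $skip), "\n" for @ARGV;
```

Run with the directories to monitor as arguments, the script prints one count per directory. It still has to visit every entry, so the cost stays proportional to the directory size; what it avoids is holding all the names in an array at once.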
