Incomplete directory listing

Can I get a partial directory listing?

In Python, I have a process that calls os.listdir on a directory containing more than 100,000 files, and it takes forever. I would like to be able to quickly get, for example, just the first 1000 files.

How can I achieve this?

1 answer

I found a solution that returns the files in what looks like random order :) (at least I don't see a pattern).

First I found this post on the Python mailing list. Attached to it are three files that you must copy to your disk (opendir.pyx, setup.py, test.py). Next, you need the Pyrex package to compile the opendir.pyx file from the post. I had trouble installing Pyrex and found that I first had to install python-dev via apt-get. Then I installed the opendir package from the three downloaded files using python setup.py install. The test.py file contains examples of how to use it.

Next, I wondered how much faster this solution would be than os.listdir, so I created 200,000 files with the following little shell script.

    for ((i=0; i<200000; i++))
    do
        touch $i
    done
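If bash is not available, the same test set can be produced from Python. This is only a minimal sketch, not part of the original test: the function name `create_empty_files` and its parameters are made up for illustration, and it assumes the target directory already exists.

```python
import os


def create_empty_files(directory, count):
    """Create `count` empty files named 0, 1, 2, ... inside `directory`."""
    for i in range(count):
        # Opening a new file in "w" mode and closing it right away
        # leaves an empty file behind, much like `touch` does.
        with open(os.path.join(directory, str(i)), "w"):
            pass
```

Calling `create_empty_files(".", 200000)` in an empty directory should give roughly the same layout as the shell loop above.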

The following script is my test; it runs in the directory where I just created the files:

    from opendir import opendir
    from timeit import Timer
    import os

    def list_first_fast(i):
        d = opendir(".")
        filenames = []
        for _ in range(i):
            name = d.read()
            if not name:
                break
            filenames.append(name)
        return filenames

    def list_first_slow(i):
        return os.listdir(".")[:i]

    if __name__ == '__main__':
        t1 = Timer("list_first_fast(100)", "from __main__ import list_first_fast")
        t2 = Timer("list_first_slow(100)", "from __main__ import list_first_slow")
        print "With opendir: ", t1.repeat(5, 100)
        print "With os.list: ", t2.repeat(5, 100)

The output on my system is:

    With opendir:  [0.045053958892822266, 0.04376697540283203, 0.0437769889831543, 0.04387712478637695, 0.04404592514038086]
    With os.list:  [9.50291895866394, 9.567682027816772, 9.865844964981079, 13.486984968185425, 9.51977801322937]

As you can see, I got a speedup of roughly 200 times (about 9.5 s versus about 0.044 s per 100 calls) when returning a list of 100 file names out of 200,000, which is pretty nice :).

I hope this is what you were trying to achieve.
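For readers on Python 3.6 or newer, a similar partial listing may be possible without compiling anything, because the standard library's `os.scandir` returns a lazy iterator over directory entries. The sketch below is a different technique from the opendir package used above, and it was not part of the timings shown, so no speed claim is made for it. The function name `list_first` is made up for illustration.

```python
import os
from itertools import islice


def list_first(path, limit):
    """Return up to `limit` entry names from `path`, in arbitrary order."""
    # os.scandir yields entries one at a time, so islice can stop
    # after `limit` entries instead of building the full list first.
    with os.scandir(path) as entries:
        return [entry.name for entry in islice(entries, limit)]
```

For example, `list_first(".", 1000)` would return at most 1000 names from the current directory. As with the opendir approach, the order is whatever the file system hands back.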
