os.path.isfile is not working properly

I am trying to scan my hard drive for jpg and mp3 files.

I wrote the following script. It works if I point it at the directory that directly contains the files, but it returns nothing if I point it at the root directory.

I'm new to Python, so I would appreciate any help.

    def findfiles(dirname,fileFilter):
        filesBySize = {}
        def filterfiles(f):
            ext = os.path.splitext(f)[1][1:]
            if ext in fileFilter:
                return True
            else:
                False
        for (path, dirs, fnames) in os.walk(dirname):
            if len(fileFilter)>0:
                fnames = filter(filterfiles,fnames)

            d = os.getcwd()
            os.chdir(dirname)
            for f in fnames:
                if not os.path.isfile(f) :
                    continue
                size = os.stat(f)[stat.ST_SIZE]
                if size < 100:
                    continue
                if filesBySize.has_key(size):
                    a = filesBySize[size]
                else:
                    a = []
                    filesBySize[size] = a
                a.append(os.path.join(dirname, f))
                # print 'File Added: %s' %os.path.join(dirname,f)
                _filecount = _filecount + 1
            os.chdir(d)
        return filesBySize
4 answers

Oh yes.

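You call os.path.isfile(f), where f is a bare file name found in path. A bare name is resolved relative to the current working directory, so you need to pass the full path instead. That is, if this call is necessary at all (for names returned by os.walk() it should normally return True).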
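To make the difference concrete, here is a minimal sketch that creates a temporary directory with one file and checks it both ways. The file name is hypothetical, and the sketch assumes the current working directory does not happen to contain a file with the same name.

```python
import os
import tempfile

# Create a throwaway directory holding one file (hypothetical name).
with tempfile.TemporaryDirectory() as folder:
    name = "isfile_demo_example.jpg"
    with open(os.path.join(folder, name), "w") as handle:
        handle.write("x")

    # The bare name is looked up in the current working directory,
    # so this is False unless such a file happens to exist there.
    bare_result = os.path.isfile(name)
    # The joined path points at the file that was actually created.
    joined_result = os.path.isfile(os.path.join(folder, name))

print(bare_result, joined_result)
```

This is the same situation as in the question: os.walk() hands back bare names, and only the joined path says where the file really is.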

Try changing your for loop to:

    qualified_filenames = (os.path.join(path, filename) for filename in fnames)
    for f in qualified_filenames:

And you should be all set!

In addition, the calls to os.chdir() are not needed.

And, as I suggested in the comments, filterfiles should look like this:

    def filterfiles(f):
        ext = os.path.splitext(f)[1][1:]
        return ext in fileFilter

(Your else branch was missing a return.)
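Putting these points together, one possible corrected version might look like the sketch below. It is written for Python 3 and is only an illustration, not the asker's final code: the parameter names `file_filter` and `minsize` are made up here, and the `_filecount` counter is left out because the posted snippet does not show where it is defined.

```python
import os
from collections import defaultdict


def findfiles(dirname, file_filter=(), minsize=100):
    """Group files under `dirname` by size, keeping only wanted extensions.

    `file_filter` holds extensions without the dot, e.g. ["jpg", "mp3"];
    an empty filter keeps every file.
    """
    wanted = set(file_filter)
    files_by_size = defaultdict(list)
    for path, dirs, fnames in os.walk(dirname):
        for fname in fnames:
            ext = os.path.splitext(fname)[1][1:]
            if wanted and ext not in wanted:
                continue
            # Join with `path` (the directory being visited), not `dirname`.
            full = os.path.join(path, fname)
            if not os.path.isfile(full):
                continue
            size = os.path.getsize(full)
            if size < minsize:
                continue
            files_by_size[size].append(full)
    return dict(files_by_size)
```

It could be called as `findfiles("/some/root", ["jpg", "mp3"])`, where the path is a placeholder. No os.chdir() is involved, so the working directory never changes while os.walk() is running.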


filesBySize is a rather unusual grouping. You can move it outside the findfiles() function:

    #!/usr/bin/env python
    import os
    import stat
    import sys
    from collections import defaultdict

    def findfiles(rootdir, extensions=None, minsize=100):
        """Find files with given `extensions` and larger than `minsize`.

        If `extensions` is None then don't filter on extensions.

        Yield size, filepath pairs.
        """
        extensions = tuple(extensions) if extensions is not None else extensions
        for path, dirs, files in os.walk(rootdir):
            if extensions is not None: # get files with given extensions
                files = (f for f in files if f.endswith(extensions))

            for f in files:
                f = os.path.join(path, f)
                try:
                    st = os.stat(f)
                except os.error:
                    continue # skip
                if stat.S_ISREG(st.st_mode): # isfile
                    if st.st_size > minsize:
                        yield st.st_size, f

    rootdir = sys.argv[1] # get it from command-line
    files_by_size = defaultdict(list)
    for size, f in findfiles(rootdir, ['.mp3', '.jpg']):
        files_by_size[size // (1<<20)].append((size, f)) # group in 1M buckets

    import pprint
    pprint.pprint(dict(files_by_size)) # pretty print

There is no need to use os.chdir(); just call os.path.join(path, f).


Not directly relevant to your question, but since you are new to Python, here are some general tips on more modern style:

 os.stat(f)[stat.ST_SIZE] 

can be written as

 os.stat(f).st_size 

and

    if filesBySize.has_key(size):
        a = filesBySize[size]
    else:
        a = []
        filesBySize[size] = a

is better written as:

 a = filesBySize.setdefault(size, []) 
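As a small illustration of the grouping idiom, the sketch below builds the same size-to-names mapping twice: once with setdefault() and once with collections.defaultdict, a closely related standard-library alternative. The sample pairs are made up and stand in for real os.stat() results. Note that dict.has_key() was removed in Python 3, so these forms are also the portable ones.

```python
from collections import defaultdict

# Hypothetical (size, name) pairs standing in for real os.stat results.
samples = [(2048, "a.jpg"), (512, "b.mp3"), (2048, "c.jpg")]

# Variant 1: setdefault returns the existing list or stores a new one.
by_size_plain = {}
for size, name in samples:
    by_size_plain.setdefault(size, []).append(name)

# Variant 2: defaultdict creates the empty list on first access.
by_size_default = defaultdict(list)
for size, name in samples:
    by_size_default[size].append(name)

print(by_size_plain)  # {2048: ['a.jpg', 'c.jpg'], 512: ['b.mp3']}
```

Both variants end up with the same contents; which one reads better is mostly a matter of taste.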

I believe the repeated calls to os.chdir() complicate your program here (and may even break os.walk(), which relies on the working directory staying the same when it is given a relative path).

Here is a cleaner example, copied from the Python documentation, of how to work with path names without changing the directory:

    # Delete everything reachable from the directory named in "top",
    # assuming there are no symbolic links.
    # CAUTION:  This is dangerous!  For example, if top == '/', it
    # could delete all your disk files.
    import os
    for root, dirs, files in os.walk(top, topdown=False):
        for name in files:
            os.remove(os.path.join(root, name))
        for name in dirs:
            os.rmdir(os.path.join(root, name))

The point is that you build the full path with os.path.join(root, name) for each name you take from files.
