Python - reading files from a directory: file not found in a subdirectory (which is there)

I am convinced that this is something simple and syntactic, but I cannot understand why my code:

    import os
    from collections import Counter

    d = {}
    for filename in os.listdir('testfilefolder'):
        f = open(filename,'r')
        d = (f.read()).lower()
        freqs = Counter(d)
        print(freqs)

will not work. It can apparently get into the "testfilefolder" directory and see that there is a file there, because the error message says that "file2.txt" was not found. So it can find the file well enough to tell me that it cannot be found...

This piece of code does work, however:

    from collections import Counter

    d = {}
    f = open("testfilefolder/file2.txt",'r')
    d = (f.read()).lower()
    freqs = Counter(d)
    print(freqs)

Bonus question: is this a good way to do what I'm trying to do (read from a file and count word frequencies)? This is my first day with Python (although I have some programming experience).

I have to say that I like Python!

Thanks,

Brian

2 answers

As Isdeev pointed out, os.listdir() returns only the file names, not full (or relative) paths. Another way to solve this problem is to os.chdir() into the directory in question and then call os.listdir('.').

Secondly, it seems your goal is to count the frequency of words, not letters (characters). Passing a string to Counter() counts its characters, so you need to break the contents of each file into words first. I prefer to use a regular expression for this.

Thirdly, your solution counts word frequencies for each file separately. If you need totals across all files, first create a Counter() object, then call its update() method for each file to accumulate the counts.
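To illustrate the second and third points, here is a minimal sketch. The sample strings are made up for illustration and are not from the question's files.

```python
import re
from collections import Counter

text = "the cat saw the dog"  # made-up sample text

# Passing a string straight to Counter counts individual characters.
char_counts = Counter(text)

# Splitting into words first gives word frequencies instead.
words = re.findall(r"[a-zA-Z0-9']+", text.lower())
word_counts = Counter(words)

# update() adds counts to an existing Counter, so a single Counter
# can accumulate totals across several inputs (here, two strings).
total = Counter()
total.update(words)
total.update(re.findall(r"[a-zA-Z0-9']+", "the bird"))

print(char_counts.most_common(3))
print(word_counts.most_common(3))
print(total.most_common(3))
```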

Without further ado, my solution:

    import collections
    import re
    import os

    all_files_frequency = collections.Counter()

    previous_dir = os.getcwd()
    os.chdir('testfilefolder')
    for filename in os.listdir('.'):
        with open(filename) as f:
            file_contents = f.read().lower()
        words = re.findall(r"[a-zA-Z0-9']+", file_contents)  # Breaks up into words
        frequency = collections.Counter(words)  # For this file only
        all_files_frequency.update(words)  # For all files
        print(frequency)
    os.chdir(previous_dir)  # Go back to where we started

    print('')
    print(all_files_frequency)
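If you would rather not change the working directory, a variant along the following lines should also work. This is only a sketch: the function name count_words_in_dir is made up for illustration, and it assumes every regular file in the folder is readable as text.

```python
import os
import re
from collections import Counter

WORD_RE = re.compile(r"[a-zA-Z0-9']+")


def count_words_in_dir(directory):
    """Return a Counter of word frequencies across all regular files in `directory`.

    Hypothetical helper; assumes the files can be read as text.
    """
    totals = Counter()
    for filename in os.listdir(directory):
        # listdir() yields bare names, so join each one onto the directory.
        path = os.path.join(directory, filename)
        if not os.path.isfile(path):
            continue  # skip subdirectories and other non-files
        with open(path) as f:
            totals.update(WORD_RE.findall(f.read().lower()))
    return totals
```

Calling count_words_in_dir('testfilefolder') would then return the combined counts without touching the current directory, which avoids having to remember to chdir back afterwards.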

Change:

 f = open(filename,'r') 

To:

 f = open(os.path.join('testfilefolder',filename),'r') 

This is effectively what your working version already does:

 f = open("testfilefolder/file2.txt",'r') 

Reason: you are listing the files in "testfilefolder" (a subdirectory of your current directory), but then trying to open each file by its bare name, which Python looks up in the current directory.
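A small sketch of the same idea as a reusable helper. The name list_file_paths is made up for illustration; it simply joins each name returned by os.listdir() back onto the directory so the result can be passed to open().

```python
import os


def list_file_paths(directory):
    """Return paths (directory + name) for the entries in `directory`.

    Hypothetical helper, not part of the os module.
    """
    # os.listdir() returns names such as 'file2.txt', relative to `directory`,
    # so each one is joined onto the directory before it is used.
    return [os.path.join(directory, name) for name in sorted(os.listdir(directory))]
```

With the question's layout, list_file_paths('testfilefolder') would give entries such as the joined path for file2.txt, which open() can find from the current directory.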

