Why does my glob.glob loop not iterate over all text files in a folder?

I am trying to read from a folder containing text documents using Python 3. Specifically, the folder holds a modified version of the LingSpam email spam dataset. I expect my code to return the names of all 1893 text documents, but it returns only the first 420 file names. I do not understand why it stops short of the total number of files. Any ideas?

import glob
import os

if not os.path.exists('train'):  # download data
  from urllib.request import urlretrieve
  import tarfile
  urlretrieve('http://cs.iit.edu/~culotta/cs429/lingspam.tgz', 'lingspam.tgz')
  tar = tarfile.open('lingspam.tgz')
  tar.extractall()
  tar.close()
abc = []
for f in glob.glob("train/*.txt"):
  print(f)
  abc.append(f)
print(len(abc))

I tried changing the glob pattern, but that did not help.

Edit: Apparently, my code works for everyone but me. Here is my conclusion

1 answer

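Success! The problem was this line:

if not os.path.exists('train'):  # download data

A train folder already existed, so the condition was false and the download and extraction were skipped. glob only saw the 420 files that were already in that folder. The fix was to run these lines

from urllib.request import urlretrieve
import tarfile
urlretrieve('http://cs.iit.edu/~culotta/cs429/lingspam.tgz', 'lingspam.tgz')
tar = tarfile.open('lingspam.tgz')
tar.extractall()
tar.close()

outside the if.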
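If you want to keep a guard so the archive is not fetched on every run, one option is to check how many files are actually present instead of only checking that the folder exists. The following is a minimal sketch of that idea, not part of the original code: the helper name needs_download and its parameters are hypothetical, and the expected count of 1893 is simply the number quoted in the question.

```python
import glob
import os


def needs_download(data_dir, expected_count, pattern="*.txt"):
    """Return True when data_dir is missing or holds fewer matching files than expected.

    Hypothetical helper for illustration; it only counts files, it does not download.
    """
    if not os.path.isdir(data_dir):
        return True
    found = glob.glob(os.path.join(data_dir, pattern))
    return len(found) < expected_count
```

With this in place, the first line of the original script could become if needs_download('train', 1893): and the download and extraction would run again whenever the folder is incomplete.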
