Using urllib to download the contents of an online directory

I am trying to write a program that opens an online directory, uses a regular expression to find the PowerPoint file names, and then creates local files and copies the contents into them. The program runs without errors, but when I try to open the downloaded files, I keep getting a message that the version is incorrect.

    from urllib.request import urlopen
    import re

    urlpath = urlopen('http://www.divms.uiowa.edu/~jni/courses/ProgrammignInCobol/presentation/')
    string = urlpath.read().decode('utf-8')

    pattern = re.compile('ch[0-9]*.ppt')  # the pattern actually creates duplicates in the list
    filelist = pattern.findall(string)
    print(filelist)

    for filename in filelist:
        remotefile = urlopen('http://www.divms.uiowa.edu/~jni/courses/ProgrammignInCobol/presentation/' + filename)
        localfile = open(filename, 'wb')
        localfile.write(remotefile.read())
        localfile.close()
        remotefile.close()
1 answer

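This code worked for me. I changed it slightly because your version downloaded every .ppt file twice: the pattern now ends with a double quote, so each name is matched only once, and the quote is stripped before the name is used.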
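The duplication can be reproduced offline on a single line of directory-listing HTML. The sample markup below is an assumption about how the server renders its index (each name appearing once in the `href` and once as the link text); it is not copied from the real page.

```python
import re

# Hypothetical line from an auto-generated directory index; the real page may differ.
sample = '<a href="ch01.ppt">ch01.ppt</a>'

# Original pattern: matches the name in the href and again in the link text.
with_dupes = re.findall(r'ch[0-9]*.ppt', sample)
print(with_dupes)

# Pattern with a trailing quote: matches only the href occurrence,
# at the cost of a quote character that has to be stripped afterwards.
href_only = re.findall(r'ch[0-9]*.ppt"', sample)
print(href_only)
```

If the real listing has this shape, the first pattern yields every name twice, which is why the loop fetched and wrote each file two times.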

    from urllib2 import urlopen  # Python 2; on Python 3 use: from urllib.request import urlopen
    import re

    urlpath = urlopen('http://www.divms.uiowa.edu/~jni/courses/ProgrammignInCobol/presentation/')
    string = urlpath.read().decode('utf-8')

    # The trailing quote makes the pattern match each file name only once
    pattern = re.compile('ch[0-9]*.ppt"')
    filelist = pattern.findall(string)
    print(filelist)

    for filename in filelist:
        filename = filename[:-1]  # strip the trailing quote captured by the pattern
        remotefile = urlopen('http://www.divms.uiowa.edu/~jni/courses/ProgrammignInCobol/presentation/' + filename)
        localfile = open(filename, 'wb')
        localfile.write(remotefile.read())
        localfile.close()
        remotefile.close()
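For Python 3, the same idea might be sketched as follows. This is only one possible variant, not the answerer's code: the helper names `extract_ppt_names` and `download_all` are made up for illustration, the regex uses a capture group on the `href` attribute (assuming the listing links files as `href="chNN.ppt"`) and escapes the dot, and `with` blocks replace the explicit `close()` calls.

```python
import re
from urllib.parse import urljoin
from urllib.request import urlopen

BASE_URL = 'http://www.divms.uiowa.edu/~jni/courses/ProgrammignInCobol/presentation/'


def extract_ppt_names(html):
    """Return the unique ch<N>.ppt names found in href attributes, in page order.

    Hypothetical helper; assumes links look like href="ch01.ppt".
    """
    names = re.findall(r'href="(ch[0-9]*\.ppt)"', html)
    # dict.fromkeys drops repeats while keeping first-seen order
    return list(dict.fromkeys(names))


def download_all(base_url=BASE_URL):
    """Fetch the listing at base_url and save each matched file locally."""
    with urlopen(base_url) as listing:
        html = listing.read().decode('utf-8')
    for name in extract_ppt_names(html):
        # Binary mode ('wb') keeps the .ppt bytes unchanged on disk
        with urlopen(urljoin(base_url, name)) as remote, open(name, 'wb') as local:
            local.write(remote.read())
```

Calling `download_all()` would write the files into the current working directory. Keeping the name extraction in its own function means it can be checked against a saved copy of the listing without touching the network.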