Using urllib to download the contents of an online directory

Question


I am trying to write a program that opens an online directory, uses a regular expression to find the names of the PowerPoint files in it, and then creates local files and copies their contents into them. The program runs without errors, but when I try to open the downloaded files, I keep getting a message that the version is incorrect.

    from urllib.request import urlopen
    import re

    urlpath = urlopen('http://www.divms.uiowa.edu/~jni/courses/ProgrammignInCobol/presentation/')
    string = urlpath.read().decode('utf-8')

    pattern = re.compile('ch[0-9]*.ppt')  # the pattern actually creates duplicates in the list
    filelist = pattern.findall(string)
    print(filelist)

    for filename in filelist:
        remotefile = urlopen('http://www.divms.uiowa.edu/~jni/courses/ProgrammignInCobol/presentation/' + filename)
        localfile = open(filename, 'wb')
        localfile.write(remotefile.read())
        localfile.close()
        remotefile.close()


python directory python-3.x urllib2 urllib

davelupt Jun 04 '12 at 0:51


1 answer

apple16 · Accepted Answer · 2012-06-04T01:10:43+0000

This code worked for me. I changed it slightly because your version duplicated every .ppt file in the list.

    # urllib2 is the Python 2 module; on Python 3 use: from urllib.request import urlopen
    from urllib2 import urlopen
    import re

    urlpath = urlopen('http://www.divms.uiowa.edu/~jni/courses/ProgrammignInCobol/presentation/')
    string = urlpath.read().decode('utf-8')

    # The trailing quote makes the pattern match each file name only once,
    # where it is followed by a double quote, instead of twice per file.
    pattern = re.compile('ch[0-9]*.ppt"')
    filelist = pattern.findall(string)
    print(filelist)

    for filename in filelist:
        filename = filename[:-1]  # drop the trailing quote captured by the pattern
        remotefile = urlopen('http://www.divms.uiowa.edu/~jni/courses/ProgrammignInCobol/presentation/' + filename)
        localfile = open(filename, 'wb')
        localfile.write(remotefile.read())
        localfile.close()
        remotefile.close()
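For Python 3, which the question is tagged with, the same idea might look roughly like the sketch below. It is only an illustration under a few assumptions: the listing is assumed to link each file as `href="chN.ppt"` with double quotes, the pattern requires at least one digit (the original `ch[0-9]*` also allows none), and the helper names `extract_ppt_names`, `looks_like_ppt` and `download_all` are made up for this example. The signature check relies on legacy `.ppt` files being OLE2 compound files; it may help spot a saved error page, but it is not a confirmed explanation of the "version is incorrect" message.

```python
import re
from urllib.parse import urljoin
from urllib.request import urlopen

# Base URL taken from the thread; every helper name below is hypothetical.
BASE_URL = 'http://www.divms.uiowa.edu/~jni/courses/ProgrammignInCobol/presentation/'

# Assumption: the listing links files as href="chN.ppt". Matching only the
# href attribute avoids picking up the link text as a second hit, and the
# dot is escaped so it matches a literal '.'.
PPT_LINK = re.compile(r'href="(ch[0-9]+\.ppt)"')

# First 8 bytes of an OLE2 compound file, the container used by legacy .ppt.
OLE2_MAGIC = bytes.fromhex('d0cf11e0a1b11ae1')


def extract_ppt_names(html):
    """Return the unique chN.ppt names linked from the page, in page order."""
    seen = []
    for name in PPT_LINK.findall(html):
        if name not in seen:
            seen.append(name)
    return seen


def looks_like_ppt(data):
    """Cheap sanity check: does the payload start with the OLE2 signature?"""
    return data.startswith(OLE2_MAGIC)


def download_all(base_url=BASE_URL):
    """Download every linked presentation into the current directory."""
    with urlopen(base_url) as listing:
        html = listing.read().decode('utf-8')
    for name in extract_ppt_names(html):
        with urlopen(urljoin(base_url, name)) as remote:
            data = remote.read()
        if not looks_like_ppt(data):
            # Probably an HTML error page or some other non-PowerPoint payload.
            print('skipping', name, '(no OLE2 signature)')
            continue
        with open(name, 'wb') as local:
            local.write(data)
```

Calling `download_all()` performs the actual network requests; `extract_ppt_names` can be tried offline on any saved copy of the listing page.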
