I was asked to write a script that logs in to the corporate portal, navigates to a specific page, downloads it, compares it with an earlier version, and then emails a specific person depending on what changed. The later steps are fairly easy, but it is the very first step that is giving me trouble.
After failing to connect with urllib2 (I am trying to do this in Python) and spending about 4 or 5 hours searching the Internet, I concluded that the reason I could not connect is that the web page uses NTLM authentication. I tried a number of different connection approaches found on this site and others, to no avail. Based on the NTLM example, I did:
    import urllib2
    from ntlm import HTTPNtlmAuthHandler

    user = 'username'
    password = "password"
    url = "https://portal.whatever.com/"

    passman = urllib2.HTTPPasswordMgrWithDefaultRealm()
    passman.add_password(None, url, user, password)
When I run this (with real username, password and url), I get the following:
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "ntlm2.py", line 21, in <module>
        response = urllib2.urlopen(urllib2.Request(url, None, header))
      File "C:\Python27\lib\urllib2.py", line 126, in urlopen
        return _opener.open(url, data, timeout)
      File "C:\Python27\lib\urllib2.py", line 400, in open
        response = meth(req, response)
      File "C:\Python27\lib\urllib2.py", line 513, in http_response
        'http', request, response, code, msg, hdrs)
      File "C:\Python27\lib\urllib2.py", line 432, in error
        result = self._call_chain(*args)
      File "C:\Python27\lib\urllib2.py", line 372, in _call_chain
        result = func(*args)
      File "C:\Python27\lib\urllib2.py", line 619, in http_error_302
        return self.parent.open(new, timeout=req.timeout)
      File "C:\Python27\lib\urllib2.py", line 400, in open
        response = meth(req, response)
      File "C:\Python27\lib\urllib2.py", line 513, in http_response
        'http', request, response, code, msg, hdrs)
      File "C:\Python27\lib\urllib2.py", line 432, in error
        result = self._call_chain(*args)
      File "C:\Python27\lib\urllib2.py", line 372, in _call_chain
        result = func(*args)
      File "C:\Python27\lib\urllib2.py", line 619, in http_error_302
        return self.parent.open(new, timeout=req.timeout)
      File "C:\Python27\lib\urllib2.py", line 400, in open
        response = meth(req, response)
      File "C:\Python27\lib\urllib2.py", line 513, in http_response
        'http', request, response, code, msg, hdrs)
      File "C:\Python27\lib\urllib2.py", line 438, in error
        return self._call_chain(*args)
      File "C:\Python27\lib\urllib2.py", line 372, in _call_chain
        result = func(*args)
      File "C:\Python27\lib\urllib2.py", line 521, in http_error_default
        raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
    urllib2.HTTPError: HTTP Error 401: Unauthorized
What interests me most about this problem is that the last line says a 401 error was sent back. From what I have read, a 401 is the first response sent to the client when NTLM authentication starts. I was under the impression that the purpose of python-ntlm was to handle the NTLM handshake for me. Am I wrong about that, or am I using it incorrectly? Also, I am not restricted to Python for this, so if there is an easier way to do it in another language, let me know (from what I found on Google, there isn't one). Thanks!
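To make the handshake I am describing concrete, here is a minimal sketch of how the 401 responses in an NTLM exchange can be told apart by their WWW-Authenticate header. As far as I understand it, the server first answers 401 with a bare `NTLM` scheme, and after the client sends its first message the server answers 401 again, this time with `NTLM` followed by a base64 challenge. The function name `classify_401` and the sample header values are hypothetical and only for illustration; this is not part of python-ntlm or of my actual script.

```python
def classify_401(www_authenticate_values):
    """Classify a 401 response by its WWW-Authenticate header values.

    Returns 'ntlm-offer' when the server only advertises NTLM,
    'ntlm-challenge' when the NTLM scheme is followed by a base64 blob,
    and 'other' when NTLM is not mentioned at all.
    """
    for value in www_authenticate_values:
        parts = value.strip().split(None, 1)
        if not parts or parts[0].upper() != "NTLM":
            continue  # some other scheme, e.g. Negotiate or Basic
        # Bare "NTLM" invites the client to start; "NTLM <base64>" is the challenge.
        return "ntlm-challenge" if len(parts) == 2 else "ntlm-offer"
    return "other"


# Hypothetical header values, one list entry per WWW-Authenticate header.
first_reply = ["Negotiate", "NTLM"]
second_reply = ["NTLM TlRMTVNTUAACAAAA"]  # placeholder, truncated challenge

print(classify_401(first_reply))   # ntlm-offer
print(classify_401(second_reply))  # ntlm-challenge
```

If the 401 in my traceback is of the first kind, the handler apparently never answered the offer; if it is of the second kind, the handshake started but did not complete.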