Why is requests.get() not returning? What is the default timeout that requests.get() uses?

In my script, requests.get never returns:

    import requests

    print("requesting..")

    # This call never returns!
    r = requests.get(
        "http://www.justdial.com",
        proxies={'http': '222.255.169.74:8080'},
    )
    print(r.ok)

What could be the possible reason? Any remedy? What is the default timeout that get uses?

+56
python get python-requests
Jul 22 '13 at 7:31
3 answers

What is the default timeout that get uses?

The default timeout is None, which means Requests will wait (hang) indefinitely until the connection is closed.

What happens when you pass a timeout value?

    r = requests.get(
        'http://www.justdial.com',
        proxies={'http': '222.255.169.74:8080'},
        timeout=5,
    )
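
With a timeout set, the call raises an exception instead of hanging forever when the proxy or server never responds. A minimal sketch of handling it; requests.exceptions.Timeout is the base class covering both the connect and read cases:

    import requests

    try:
        r = requests.get(
            'http://www.justdial.com',
            proxies={'http': '222.255.169.74:8080'},
            timeout=5,
        )
        print(r.ok)
    except requests.exceptions.Timeout:
        # Raised when no connection is made or no bytes arrive within 5 s.
        print('no response within 5 seconds')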
+76
Jul 22 '13 at 7:59

From the requests documentation:

You can say "Requests" to stop waiting for a response after a specified number of seconds with a timeout parameter:

    >>> requests.get('http://github.com', timeout=0.001)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    requests.exceptions.Timeout: HTTPConnectionPool(host='github.com', port=80): Request timed out. (timeout=0.001)

Note:

timeout is not a time limit on the entire response download; rather, an exception is raised if the server has not issued a response for timeout seconds (more precisely, if no bytes have been received on the underlying socket for timeout seconds).

It happens to me very often that requests.get() takes a very long time to return, even when timeout is 1 second. There are a few ways to work around this problem:

1. Use the internal TimeoutSauce class

From: https://github.com/kennethreitz/requests/issues/1928#issuecomment-35811896

    import requests
    from requests.adapters import TimeoutSauce

    class MyTimeout(TimeoutSauce):
        def __init__(self, *args, **kwargs):
            connect = kwargs.get('connect', 5)
            read = kwargs.get('read', connect)
            super(MyTimeout, self).__init__(connect=connect, read=read)

    requests.adapters.TimeoutSauce = MyTimeout

This code should cause the read timeout to be set equal to the connect timeout, which is the timeout value you pass in your Session.get() call. (Note that I haven't actually tested this code, so it may need some quick debugging; I just wrote it straight into the GitHub window.)
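
For illustration, a hypothetical usage of the patch above (untested, like the original): once MyTimeout is installed, a single timeout value passed to a request should apply to both the connect and the read phases.

    # Assumes the MyTimeout patch above has already been applied.
    session = requests.Session()
    r = session.get('http://www.justdial.com', timeout=5)
    print(r.status_code)

Keep in mind this monkey-patches a private internal of Requests, so it can break between library versions.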

2. Use kevinburke's fork of requests: https://github.com/kevinburke/requests/tree/connect-timeout

From the documentation: https://github.com/kevinburke/requests/blob/connect-timeout/docs/user/advanced.rst

If you specify a single value for the timeout, for example:

 r = requests.get('https://github.com', timeout=5) 

the timeout value will be applied to both the connect and the read timeouts. Specify a tuple if you would like to set the values separately:

 r = requests.get('https://github.com', timeout=(3.05, 27)) 

NOTE: This change has since been merged into the main Requests project.
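
Since the tuple form is now in mainline Requests, you can also tell which phase timed out. A minimal sketch, using the exception classes available in current Requests releases:

    import requests

    try:
        r = requests.get('https://github.com', timeout=(3.05, 27))
    except requests.exceptions.ConnectTimeout:
        # No connection could be established within 3.05 seconds.
        print('connect timed out')
    except requests.exceptions.ReadTimeout:
        # Connected, but no bytes were received for 27 seconds.
        print('read timed out')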

3. Use eventlet or signal, as mentioned in the similar question Timeout for python requests.get entire response; a sketch of the signal variant follows.
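
A minimal sketch of the signal-based approach (Unix-only, and only usable from the main thread). Unlike the timeout parameter, it enforces a hard deadline on the whole request rather than on individual socket reads:

    import signal
    import requests

    class WholeRequestTimeout(Exception):
        pass

    def _raise_timeout(signum, frame):
        raise WholeRequestTimeout()

    signal.signal(signal.SIGALRM, _raise_timeout)
    signal.alarm(10)  # hard deadline for the entire request, in seconds
    try:
        r = requests.get('http://www.justdial.com')
        print(r.ok)
    except WholeRequestTimeout:
        print('request did not finish within 10 seconds')
    finally:
        signal.alarm(0)  # always cancel the pending alarm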

+21
Mar 13 '14 at 11:40

I looked through all the answers and came to the conclusion that the problem still exists. On some sites requests can hang indefinitely, and using multiprocessing seems like overkill. Here is my approach (Python 3.5+):

    import asyncio
    import aiohttp

    async def get_http(url):
        # conn_timeout bounds connection setup; read_timeout bounds reads.
        async with aiohttp.ClientSession(conn_timeout=1, read_timeout=3) as client:
            try:
                async with client.get(url) as response:
                    content = await response.text()
                    return content, response.status
            except Exception:
                # Timed-out or failed requests fall through and return None.
                pass

    loop = asyncio.get_event_loop()
    task = loop.create_task(get_http('http://example.com'))
    loop.run_until_complete(task)
    result = task.result()
    if result is not None:
        content, status = result
        if status == 200:
            print(content)
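
If you are on a newer aiohttp (3.x), the conn_timeout and read_timeout arguments above are deprecated in favor of a single ClientTimeout object, whose total field caps the entire request, something plain Requests cannot express. A sketch under that assumption:

    import aiohttp

    async def get_http(url):
        # total bounds the whole request; connect and sock_read bound phases.
        timeout = aiohttp.ClientTimeout(total=10, connect=1, sock_read=3)
        async with aiohttp.ClientSession(timeout=timeout) as client:
            async with client.get(url) as response:
                return await response.text(), response.status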
+1
Nov 04 '17 at 6:53