Retrieving HEAD Content Using Python Requests

Question

I am trying to parse the result of a HEAD request made with the Python Requests library, but I can't seem to access the response content.

According to the docs, I should be able to access the content via requests.Response.text. This works fine for me with GET requests, but returns None for HEAD requests.

GET request (works)

    import requests
    response = requests.get(url)
    content = response.text

content = <html>...</html>

HEAD request (no content)

    import requests
    response = requests.head(url)
    content = response.text

content = None

EDIT

OK. I quickly realized that HEAD responses are not supposed to return content, only headers. But does this mean that to access the things found in the <head> of a page, for example the <link> and <meta> tags, you need to GET the whole document?

+8

python head http-request python-requests

Yarin Mar 04 '12 at 12:44

3 answers

HEAD has no content! Try response.headers, which is probably where the action is. An HTTP HEAD request does not fetch the <head> element of the HTML response that you would get from a GET request. I think that is your mistake.
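
To make the difference concrete, here is a minimal sketch. It is not Requests code: it uses only the standard library and a throwaway local server (DemoHandler, PAGE and fetch are names made up for this illustration) so that it runs without network access. With Requests, the equivalent would be requests.head(url) followed by a look at response.status_code and response.headers.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical page served by the demo server.
PAGE = b"<html><head><title>Demo</title></head><body>Hello</body></html>"


class DemoHandler(BaseHTTPRequestHandler):
    """Illustrative handler: same headers for HEAD and GET, body only for GET."""

    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()

    def do_HEAD(self):
        self._send_headers()  # headers only, no message body

    def do_GET(self):
        self._send_headers()
        self.wfile.write(PAGE)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch(method, port):
    """Return (status, headers dict, body bytes) for one request."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request(method, "/")
        resp = conn.getresponse()
        return resp.status, dict(resp.getheaders()), resp.read()
    finally:
        conn.close()


server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

head_status, head_headers, head_body = fetch("HEAD", port)
get_status, get_headers, get_body = fetch("GET", port)

server.shutdown()
server.server_close()

# Same status and headers in both cases; only GET carries a body.
print(head_status, head_headers["Content-Type"], len(head_body))
print(get_status, get_headers["Content-Type"], len(get_body))
```

The HEAD response carries the same Content-Type and Content-Length headers as the GET response, but its body is empty, which is why there is nothing useful in response.text.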

+5

Spacedman Mar 04 '12 at 12:48

HEAD responses do not have a body. They only return HTTP headers, the same ones you would get with a GET request.

+2

Dor shemer Mar 04 '12 at 12:49

phihag · Accepted Answer · 2012-03-04T12:48:05+0000

By definition, responses to HEAD requests do not contain a message body.

Send a GET request if you want to get the body of the response. Send a HEAD request iff you are only interested in the status code and response headers.

HTTP transfers arbitrary content; the HTTP headers are completely unrelated to the HTML <head> element. However, HTTP can be asked to deliver only part of a document. If you know the length of the HTML <head> code (or an upper bound for it), you can include an HTTP Range header in your request, which asks the remote server to return only a certain range of bytes. If the remote server supports HTTP ranges, it will serve the abbreviated response.
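
A rough sketch of the Range idea, under a few assumptions: the server below is a made-up local one (RangeHandler) that honours simple "bytes=START-END" ranges, the 200-byte upper bound is picked for this demo page, and HeadTagCollector and fetch_prefix are illustrative names rather than anything from Requests. With Requests itself, the header would be passed as requests.get(url, headers={"Range": "bytes=0-199"}).

```python
import http.client
import re
import threading
from html.parser import HTMLParser
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical page: a short <head> followed by a large body.
PAGE = (
    b"<html><head><meta charset='utf-8'>"
    b"<link rel='stylesheet' href='style.css'><title>Demo</title></head>"
    b"<body>" + b"x" * 5000 + b"</body></html>"
)


class RangeHandler(BaseHTTPRequestHandler):
    """Illustrative server that honours simple 'bytes=START-END' ranges."""

    def do_GET(self):
        match = re.fullmatch(r"bytes=(\d+)-(\d+)", self.headers.get("Range", ""))
        if match:
            start = int(match.group(1))
            end = min(int(match.group(2)), len(PAGE) - 1)
            body = PAGE[start:end + 1]
            self.send_response(206)  # Partial Content
            self.send_header("Content-Range", f"bytes {start}-{end}/{len(PAGE)}")
        else:
            body = PAGE
            self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


class HeadTagCollector(HTMLParser):
    """Collect <meta> and <link> start tags; ignore everything else."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        if tag in ("meta", "link"):
            self.tags.append((tag, dict(attrs)))


def fetch_prefix(port, max_bytes):
    """GET only the first max_bytes bytes by sending a Range header."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", "/", headers={"Range": f"bytes=0-{max_bytes - 1}"})
        resp = conn.getresponse()
        return resp.status, resp.read()
    finally:
        conn.close()


server = HTTPServer(("127.0.0.1", 0), RangeHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

status, prefix = fetch_prefix(server.server_address[1], 200)

server.shutdown()
server.server_close()

collector = HeadTagCollector()
collector.feed(prefix.decode("utf-8", errors="replace"))
tags = collector.tags
print(status, len(prefix), tags)
```

A server that does not support ranges will typically ignore the header and answer 200 with the full document, so it is worth checking for status 206 before assuming the response is truncated.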
