Unfortunately, cwninja has given you an answer that will only work against casual attacks. A smart attacker will have no trouble defeating that test. There are two main reasons why his method should not be relied on. First, nothing guarantees that the information in the HEAD response matches the corresponding GET response. A properly functioning server will keep them consistent, but a malicious actor is under no obligation to follow the spec. An attacker can simply send a HEAD response claiming a Content-Length below your threshold and then hand you a huge file in the GET response. Second, that does not even cover the server's ability to send a response with the Transfer-Encoding: chunked header set, in which case there is no Content-Length to check at all. A chunked response may well never end. A few people pointing your server at never-ending responses can carry out a trivial resource-exhaustion attack, even if your HTTP client uses a timeout.
To do this correctly, you need to use an HTTP library that lets you count bytes as they are received and abort once the count crosses your threshold. I would recommend Curb for this rather than Net::HTTP. (Can you even do this with Net::HTTP?) If you use the on_body and/or on_progress callbacks, you can count the incoming bytes and abort mid-response if the file turns out to be too large. Obviously, as cwninja already noted, if you receive a Content-Length header that exceeds your threshold, you want to abort on that as well. Curb is also noticeably faster than Net::HTTP.
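For what it's worth, the counting idea itself can be sketched with only the standard library, since the block form of Net::HTTP's read_body yields the body in fragments. This is a minimal sketch and not the Curb approach recommended above; BodyLimiter, ResponseTooLarge and fetch_with_limit are hypothetical names made up for illustration, and the 10-second timeout is an arbitrary assumption.

```ruby
require "net/http"
require "uri"

# Raised when a response is, or claims to be, larger than we accept.
class ResponseTooLarge < StandardError; end

# Hypothetical helper (not part of any library): accumulates body
# fragments and raises as soon as the running total would pass max_bytes.
class BodyLimiter
  attr_reader :body

  def initialize(max_bytes)
    @max_bytes = max_bytes
    @body = String.new # binary buffer
  end

  # Cheap early exit for honest servers. A lying or missing header
  # passes this check and is caught by #<< instead.
  def check_declared_length(header_value)
    return if header_value.nil?
    return unless header_value.to_i > @max_bytes

    raise ResponseTooLarge, "Content-Length #{header_value} exceeds #{@max_bytes}"
  end

  # Append one fragment, refusing it if the limit would be exceeded.
  def <<(chunk)
    if @body.bytesize + chunk.bytesize > @max_bytes
      raise ResponseTooLarge, "body exceeded #{@max_bytes} bytes"
    end

    @body << chunk
    self
  end
end

# Hypothetical helper: GET the URL, giving up once max_bytes is passed.
def fetch_with_limit(url, max_bytes:, timeout: 10)
  uri = URI(url)
  limiter = BodyLimiter.new(max_bytes)

  Net::HTTP.start(uri.host, uri.port,
                  use_ssl: uri.scheme == "https",
                  open_timeout: timeout, read_timeout: timeout) do |http|
    http.request_get(uri.request_uri) do |response|
      limiter.check_declared_length(response["Content-Length"])
      # The block form streams the body instead of buffering all of it;
      # raising here propagates out of the request and ends the transfer.
      response.read_body { |chunk| limiter << chunk }
    end
  end

  limiter.body
end
```

Something like `fetch_with_limit("http://example.com/", max_bytes: 1_000_000)` would then either return the body or raise ResponseTooLarge. Note that read_timeout applies to each individual read, so a server that trickles data slowly is bounded here by size but not by total time; an overall deadline would have to be added separately.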