PHP file_get_contents is very slow when using a full URL

I am working with a script (which I did not originally create) that generates a PDF file from an HTML page. The problem is that it now takes a very long time to run, around 1-2 minutes. Supposedly it worked fine at first, but it has slowed down over the past few weeks.

The script calls file_get_contents on a PHP script, which then outputs the result to an HTML file on the server and runs the PDF generator application on that file.

I seem to have narrowed the problem down to calling file_get_contents on the full URL rather than on a local path.

When I use

 $content = file_get_contents('test.txt'); 

it processes almost instantly. However, if I use the full URL

 $content = file_get_contents('http://example.com/test.txt'); 

processing takes 30 to 90 seconds.

It is not limited to our server; it is slow when accessing any external URL, such as http://www.google.com. I believe the script calls the full URL because it relies on query string variables that don't work if the file is called locally.

I have also tried fopen, readfile, and curl, and they were all slow. Any ideas on where to look to fix this?

+50
php
02 Sep '10 at 17:16
9 answers

Note: This has been fixed in PHP 5.6.14. The Connection: close header is now automatically sent even for HTTP/1.0 requests. See commit 4b1dff6.

It was hard for me to figure out the reason for the slowness of file_get_contents scripts.

Analyzing it with Wireshark, the problem (in my case, and probably yours too) was that the remote web server DOES NOT CLOSE THE TCP CONNECTION FOR UP TO 15 SECONDS (i.e., "keep-alive").

Indeed, file_get_contents does not send a Connection HTTP header, so the remote web server treats the connection as keep-alive by default and does not close the TCP stream for up to 15 seconds (this may not be a standard value; it depends on the server configuration).

A regular browser considers the page fully loaded once the HTTP payload length reaches the length specified in the response's Content-Length HTTP header. file_get_contents does not do this, which is a shame.

Solution

So, if you want the solution, here it is:

    $context = stream_context_create(array(
        'http' => array(
            // Double quotes so that \r\n is sent as a real CRLF in the header.
            'header' => "Connection: close\r\n",
        ),
    ));
    file_get_contents("http://www.something.com/somepage.html", false, $context);

The point is to tell the remote web server to close the connection when the download is complete, since file_get_contents is not smart enough to do it by itself using the response's Content-Length HTTP header.
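As a usage sketch (not from the original answer), the workaround can be applied only on PHP versions older than the 5.6.14 mentioned in the note above, and the effect verified by timing the call with microtime(); the URL is the same placeholder used above:

    $url = "http://www.something.com/somepage.html"; // placeholder URL

    $options = array();
    if (PHP_VERSION_ID < 50614) {
        // Older PHP does not send "Connection: close" itself, so add it explicitly
        // to avoid waiting for the server's keep-alive timeout.
        $options['http'] = array('header' => "Connection: close\r\n");
    }
    $context = stream_context_create($options);

    $start = microtime(true);
    $content = file_get_contents($url, false, $context);
    if ($content === false) {
        echo "Request failed\n";
    } else {
        printf("Fetched %d bytes in %.3f s\n", strlen($content), microtime(true) - $start);
    }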

+161
Nov 21 '18

I would use curl() to retrieve external content, as it is much faster than file_get_contents. Not sure if this will solve the problem, but it is worth a shot.

Also note that the speed of your server will affect the time it takes to retrieve the file.

Here is a usage example:

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, 'http://example.com/test.txt');
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the body instead of printing it
    $output = curl_exec($ch);
    curl_close($ch);
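Not part of the original answer, but since the whole thread is about slowness, it may also help to cap how long cURL will wait; a small sketch with arbitrary example timeouts:

    $ch = curl_init('http://example.com/test.txt');
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5); // max seconds to establish the connection
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);       // max seconds for the whole transfer
    $output = curl_exec($ch);
    if ($output === false) {
        echo 'cURL error: ' . curl_error($ch);
    }
    curl_close($ch);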
+25
Sep 02 '10 at 17:20

Sometimes it is because DNS resolution is too slow on your server. Try the following:

replace

 echo file_get_contents('http://www.google.com'); 

with

    $context = stream_context_create(array(
        'http' => array('header' => "Host: www.google.com\r\n"),
    ));
    echo file_get_contents('http://74.125.71.103', false, $context);
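A way to check whether DNS is really the bottleneck before hard-coding an IP address is to time the lookup separately from the download; a rough sketch (www.google.com is just the example host from above):

    $host = 'www.google.com';

    // Time the DNS lookup on its own.
    $start = microtime(true);
    $ip = gethostbyname($host);
    printf("DNS lookup for %s -> %s took %.3f s\n", $host, $ip, microtime(true) - $start);

    // Time the download itself.
    $start = microtime(true);
    $content = file_get_contents('http://' . $host . '/');
    printf("Download took %.3f s\n", microtime(true) - $start);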
+5
Jun 15 '12 at 7:25

I had the same problem.

The only thing that worked for me was to set a timeout in the $options array.

    $options = array(
        'http' => array(
            'header'  => implode("\r\n", $headers), // note: (glue, pieces) argument order
            'method'  => 'POST',
            'content' => '',
            'timeout' => .5,
        ),
    );
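For completeness (the answer only shows the options array), here is how it would typically be passed to file_get_contents; the URL and $headers values are made-up placeholders:

    $headers = array('Content-Type: application/x-www-form-urlencoded'); // example header
    $url = 'http://example.com/api';                                     // example URL

    $options = array(
        'http' => array(
            'header'  => implode("\r\n", $headers),
            'method'  => 'POST',
            'content' => '',
            'timeout' => .5, // seconds before the stream gives up
        ),
    );

    $context = stream_context_create($options);
    $response = file_get_contents($url, false, $context);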
+2
Jun 24 '15 at 7:22

Can you try fetching that URL on the server from the command line? curl or wget come to mind. If they retrieve the URL at normal speed, then it is not a network problem and most likely something in the Apache/PHP setup.

+1
Sep 02 '10 at 17:18
    $context = stream_context_create(array('http' => array('header' => "Connection: close\r\n")));
    $string = file_get_contents("http://localhost/testcall/request.php", false, $context);

Time: 50976 ms (average of only 5 attempts)

    $ch = curl_init();
    $timeout = 5;
    curl_setopt($ch, CURLOPT_URL, "http://localhost/testcall/request.php");
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    echo $data = curl_exec($ch);
    curl_close($ch);

Time: 46679 ms (average of only 5 attempts)

Note: request.php is used to fetch some data from a MySQL database.
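For reference, timings like these can be gathered with a simple loop around microtime(); this is a sketch assuming the same local endpoint, not the exact script used for the numbers above:

    $url = 'http://localhost/testcall/request.php';
    $runs = 5;
    $context = stream_context_create(array('http' => array('header' => "Connection: close\r\n")));

    $total = 0;
    for ($i = 0; $i < $runs; $i++) {
        $start = microtime(true);
        file_get_contents($url, false, $context);
        $total += (microtime(true) - $start) * 1000; // milliseconds
    }
    printf("Average over %d attempts: %.0f ms\n", $runs, $total / $runs);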

0
Dec 05 '15 at 21:56

I have a large amount of data returned by an API, and I use file_get_contents to read it, but it took about 60 seconds. However, using KrisWebDev's solution, it took only about 25 seconds.

    // For https:// URLs the header option still goes under the 'http' context key.
    $context = stream_context_create(array('http' => array('header' => "Connection: close\r\n")));
    file_get_contents($url, false, $context);
0
Nov 22 '16 at 4:47

Something else to consider with cURL is that you can run the requests in parallel. This has helped me a lot, since I do not have access to a version of PHP that allows threading at the moment.

For example, I was fetching 7 images from a remote server using file_get_contents, and it took 2-5 seconds per request. That alone added 30 seconds or so to the process while the user waited for the PDF to be generated.

This literally reduced the time to roughly that of one image. Another example: I now verify 36 URLs in the time it previously took to do one. I think you get the point. :-)

    $timeout = 30;
    $retTxfr = 1;
    $user = '';
    $pass = '';
    $master = curl_multi_init();
    $node_count = count($curlList);
    $keys = array("url");

    for ($i = 0; $i < $node_count; $i++) {
        foreach ($keys as $key) {
            if (empty($curlList[$i][$key])) continue;
            $ch[$i][$key] = curl_init($curlList[$i][$key]);
            curl_setopt($ch[$i][$key], CURLOPT_TIMEOUT, $timeout); // -- timeout after X seconds
            curl_setopt($ch[$i][$key], CURLOPT_RETURNTRANSFER, $retTxfr);
            curl_setopt($ch[$i][$key], CURLOPT_HTTPAUTH, CURLAUTH_ANY);
            curl_setopt($ch[$i][$key], CURLOPT_USERPWD, "{$user}:{$pass}");
            curl_multi_add_handle($master, $ch[$i][$key]);
        }
    }

    // -- run all requests at once, finish when done or timeout met --
    do {
        curl_multi_exec($master, $running);
    } while ($running > 0);

Then check the results:

    // Runs inside the same $i/$key loop; $results is assumed to have been
    // filled beforehand, e.g. via curl_multi_getcontent().
    if ((int)curl_getinfo($ch[$i][$key], CURLINFO_HTTP_CODE) > 399 || empty($results[$i][$key])) {
        unset($results[$i][$key]);
    } else {
        $results[$i]["options"] = $curlList[$i]["options"];
    }
    curl_multi_remove_handle($master, $ch[$i][$key]);
    curl_close($ch[$i][$key]);

then close the multi handle:

  curl_multi_close($master); 
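As a side note (not part of the original answer), the do/while loop above spins the CPU while waiting; a common refinement is to block on curl_multi_select between curl_multi_exec calls and collect each body with curl_multi_getcontent. A minimal sketch with placeholder URLs:

    $urls = array('http://example.com/a', 'http://example.com/b'); // placeholders

    $master = curl_multi_init();
    $handles = array();
    foreach ($urls as $i => $url) {
        $handles[$i] = curl_init($url);
        curl_setopt($handles[$i], CURLOPT_RETURNTRANSFER, true);
        curl_setopt($handles[$i], CURLOPT_TIMEOUT, 30);
        curl_multi_add_handle($master, $handles[$i]);
    }

    do {
        curl_multi_exec($master, $running);
        if ($running > 0) {
            curl_multi_select($master, 1.0); // wait for activity instead of spinning
        }
    } while ($running > 0);

    $results = array();
    foreach ($handles as $i => $ch) {
        $results[$i] = curl_multi_getcontent($ch);
        curl_multi_remove_handle($master, $ch);
        curl_close($ch);
    }
    curl_multi_close($master);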
0
May 10 '17 at 16:33

I know this is an old question, but I found it today and the answers did not work for me. I have not seen anyone mention that the maximum number of connections per IP may be set to 1. That way, you are making an API request and the API is making another request because you use the full URL. This is why loading directly from disk works. For me, this fixed the problem:

    if (strpos($file->url, env('APP_URL')) === 0) {
        $url = substr($file->url, strlen(env('APP_URL')));
    } else {
        $url = $file->url;
    }
    return file_get_contents($url);
0
Sep 16 '17 at 19:03


