PHP: check whether a URL is available

I want to check whether the URLs stored in my database are accessible. I chose fopen, but testing 30 rows from my database takes almost 20 seconds. Is there a more efficient way to do this? Thanks.

    <?php
    $start_t = microtime(true);

    // connect to the database and run the select query

    while ($row = mysql_fetch_array($result)) {
        // Testing with a fixed URL instead of a database row: one URL takes 0.49 seconds.
        // $url = 'http://www.google.com';
        $url = $row['url'];

        $res = @fopen($url, "r");
        if ($res) {
            fclose($res); // release the stream once we know it opened
            echo $row['url'] . ' yes<br />';
        } else {
            echo $row['url'] . ' no<br />';
        }
    }

    $end_t = microtime(true);
    $totaltime = $end_t - $start_t;
    echo "<br />" . $totaltime . " s";
    ?>
+4
4 answers

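Try fsockopen, which is faster than fopen because it only opens a TCP connection and does not send an HTTP request. Keep in mind that this tells you the host accepts connections on that port, not that a particular page exists.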

    <?php
    $t = microtime(true);
    $valid = @fsockopen("www.google.com", 80, $errno, $errstr, 30);
    echo (microtime(true) - $t);

    if (!$valid) {
        echo "Failure";
    } else {
        echo "Success";
    }
    ?>

Output:

 0.0013298988342285 
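Since the question stores full URLs while fsockopen expects a host and a port, here is a minimal sketch of how the two could be bridged. The function names hostAndPort and hostAcceptsConnections are hypothetical, and the 3-second timeout is an assumption chosen so that one dead host does not stall the whole loop.

```php
<?php
// Hypothetical helper: extract [host, port] from a full URL, or null if there is no host.
function hostAndPort(string $url): ?array
{
    $parts = parse_url($url);
    if ($parts === false || empty($parts['host'])) {
        return null;
    }
    $scheme = strtolower($parts['scheme'] ?? 'http');
    // Use the explicit port if present, otherwise the scheme default.
    $port = $parts['port'] ?? ($scheme === 'https' ? 443 : 80);

    return [$parts['host'], $port];
}

// Hypothetical helper: true if a TCP connection to the URL's host can be opened.
// This does not send an HTTP request, so it cannot detect a 404 on a specific page.
function hostAcceptsConnections(string $url, float $timeoutSeconds = 3.0): bool
{
    $target = hostAndPort($url);
    if ($target === null) {
        return false;
    }
    [$host, $port] = $target;

    $socket = @fsockopen($host, $port, $errno, $errstr, $timeoutSeconds);
    if ($socket === false) {
        return false;
    }
    fclose($socket);

    return true;
}
```

In the loop from the question, this would be called as hostAcceptsConnections($row['url']).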
+3

You can try cURL with the CURLOPT_NOBODY option, which uses the HTTP HEAD method and avoids downloading the entire page:

    $ch = curl_init($row['url']);
    curl_setopt($ch, CURLOPT_NOBODY, true); // send a HEAD request, skip the body
    curl_exec($ch);
    // 404 means not found, 200 means found.
    $retcode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);

From the documentation for CURLOPT_NOBODY:

TRUE to exclude the body from the output. The request method is then set to HEAD. Changing this to FALSE does not change it to GET.
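As a rough sketch, the HEAD check could be wrapped in a function that also sets timeouts and treats any 2xx or 3xx status as available. The function names, the 5-second timeout, and the choice of which status codes count as available are all assumptions, not part of the answer above. Some servers reject HEAD requests (for example with 405), so a "no" from this check is not always conclusive.

```php
<?php
// Hypothetical rule: 2xx and 3xx responses count as "available".
function statusMeansAvailable(int $httpCode): bool
{
    return $httpCode >= 200 && $httpCode < 400;
}

// Hypothetical helper: HEAD request with timeouts; returns true if the URL looks reachable.
function urlIsAvailable(string $url, int $timeoutSeconds = 5): bool
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_NOBODY, true);                    // HEAD instead of GET
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);            // follow redirects
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeoutSeconds); // limit connect time
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);        // limit total time

    $ok = curl_exec($ch);
    // On transport failure (DNS error, timeout, refused connection) treat the status as 0.
    $code = $ok === false ? 0 : (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);

    return statusMeansAvailable($code);
}
```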

+2

You cannot speed this up much in a single sequential loop.

With 30 rows, I assume you are connecting to 30 different URLs. 20 seconds is a reasonable time for that.

I also suggest using file_get_contents to retrieve the HTML or, if you only need the response headers, get_headers().
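A minimal sketch of the get_headers() approach follows. The function names and the 5-second timeout are assumptions, and passing a stream context as the third argument requires PHP 7.1 or later. Note that when a redirect is followed, the first element of the returned array is the status line of the first response (for example a 301), not the final one.

```php
<?php
// Hypothetical helper: pull the numeric status out of a line like "HTTP/1.1 200 OK".
function parseStatusLine(string $statusLine): int
{
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $statusLine, $m)) {
        return (int) $m[1];
    }
    return 0; // not a recognizable status line
}

// Hypothetical helper: status code of the first response, or 0 if the request failed.
function statusViaGetHeaders(string $url, float $timeoutSeconds = 5.0): int
{
    // Ask for headers only and cap the wait time (the values are assumptions).
    $context = stream_context_create([
        'http' => ['method' => 'HEAD', 'timeout' => $timeoutSeconds],
    ]);
    $headers = @get_headers($url, false, $context);

    return $headers === false ? 0 : parseStatusLine($headers[0]);
}
```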

If you want to speed up the process, run several processes in parallel and give each one a share of the URLs.

Addendum

Also, do not forget about Zend_Http_Client, which is well suited to this kind of task.

+1

Try checking the URLs in bulk, i.e. in batches of 10 or 20, using curl_multi_exec.

http://semlabs.co.uk/journal/object-oriented-curl-class-with-multi-threading

Set only the cURL options CURLOPT_NOBODY and CURLOPT_HEADER, so that just the headers are fetched and the response comes back much faster.

Also, do not forget to set CURLOPT_TIMEOUT, otherwise a single bad URL may take far too long.

I did 50 URL checks in 20 seconds.
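The batching idea can be sketched with PHP's built-in curl_multi_* functions, without the class from the link above. The function name checkBatch, the 5-second timeout, and the batch size of 10 are assumptions. The result is keyed by URL, so duplicate URLs in one batch collapse into a single entry.

```php
<?php
// Hypothetical helper: check a batch of URLs in parallel; returns [url => HTTP status code].
// A status of 0 means no HTTP response was received (timeout, DNS failure, and so on).
function checkBatch(array $urls, int $timeoutSeconds = 5): array
{
    if ($urls === []) {
        return [];
    }

    $multi = curl_multi_init();
    $handles = [];
    foreach ($urls as $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_NOBODY, true);                    // HEAD request only
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);            // do not echo anything
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeoutSeconds);
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);        // one bad URL cannot stall the batch
        curl_multi_add_handle($multi, $ch);
        $handles[$url] = $ch;
    }

    // Drive all transfers until none is still running.
    do {
        $status = curl_multi_exec($multi, $running);
        if ($running > 0) {
            curl_multi_select($multi, 1.0); // wait for activity instead of busy-looping
        }
    } while ($running > 0 && $status === CURLM_OK);

    $codes = [];
    foreach ($handles as $url => $ch) {
        $codes[$url] = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
        curl_multi_remove_handle($multi, $ch);
    }
    curl_multi_close($multi);

    return $codes;
}
```

With the URLs from the database collected into an array $allUrls (a hypothetical name), they could be processed as foreach (array_chunk($allUrls, 10) as $batch) { $codes = checkBatch($batch); }.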

Hope this helps.

+1
