Why is the CodeIgniter Curl library slower than using Curl in plain PHP?

I recently moved my scraper code, which uses Curl, to CodeIgniter. I am using the Curl CI library from http://philsturgeon.co.uk/code/codeigniter-curl . I put the scraping process in the controller, and then found that my scraper runs slower than the one I built in plain PHP.

CodeIgniter takes 12 seconds to output the result, while plain PHP takes only 6 seconds. Both timings include parsing with the HTML DOM parser.

Here is my Curl code in CodeIgniter:

    function curl($url, $postdata=false) {
        $agent = "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 Netscape/7.1 (ax)";
        $this->curl->create($url);
        $this->curl->ssl(false);
        $options = array(
            'URL' => $url,
            'HEADER' => 0,
            'AUTOREFERER' => true,
            'FOLLOWLOCATION' => true,
            'TIMEOUT' => 60,
            'RETURNTRANSFER' => 1,
            'USERAGENT' => $agent,
            'COOKIEJAR' => dirname(__FILE__) . "/cookie.txt",
            'COOKIEFILE' => dirname(__FILE__) . "/cookie.txt",
        );
        if($postdata) {
            $this->curl->post($postdata, $options);
        } else {
            $this->curl->options($options);
        }
        return $this->curl->execute();
    }

Non-CodeIgniter (plain PHP):

    function curl($url, $binary = false, $post = false, $cookie = false) {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); // Accepts all CAs
        curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2);
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_HEADER, 0);
        curl_setopt($ch, CURLOPT_REFERER, $url);
        curl_setopt($ch, CURLOPT_ENCODING, 'gzip,deflate');
        curl_setopt($ch, CURLOPT_AUTOREFERER, true);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, 60);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        if($cookie){
            $agent = "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 Netscape/7.1 (ax)";
            curl_setopt($ch, CURLOPT_USERAGENT, $agent);
            curl_setopt($ch, CURLOPT_COOKIEJAR, dirname(__FILE__) . "/cookie.txt");
            curl_setopt($ch, CURLOPT_COOKIEFILE, dirname(__FILE__) . "/cookie.txt");
        }
        if($binary) curl_setopt($ch, CURLOPT_BINARYTRANSFER, 1);
        if($post){
            // Initialise the string before appending to it (avoids an undefined-variable notice)
            $post_array_string1 = '';
            foreach($post as $key=>$value) {
                $post_array_string1 .= $key.'='.$value.'&';
            }
            $post_array_string1 = rtrim($post_array_string1,'&');
            // Send the request as a POST with the assembled data
            curl_setopt($ch, CURLOPT_POST, true);
            curl_setopt($ch, CURLOPT_POSTFIELDS, $post_array_string1);
        }
        return curl_exec($ch);
    }

Does anyone know why the CodeIgniter Curl version is slower? Or could the simple_html_dom parser be the cause?

2 answers

I'm not sure I know the exact answer to this, but I have a few comments regarding Curl and CI, as I use them extensively.

  • Check the status of DNS caches / queries.

I noticed a significant speedup when the code was moved from my dev desktop to a hosted staging server. The slowdown was due to a DNS problem that was resolved by rebooting the bastion host ... You can sometimes verify this by using IP addresses instead of host names.

  • Phil's 'library' is really just a wrapper.

All he really did was map CI-style functions onto the PHP Curl functions. Almost nothing else happens there. I spent some time poking around in it (I forget why), and the overhead was negligible. However, some general CI overhead is possible; you could check what happens in other similar frameworks (Fuel, Kohana, Laravel, etc.).

  • Check reverse lookups.

Some APIs do reverse DNS checks as part of their security checks. Sometimes host names or other headers are poorly set in a configuration and can cause real headaches.

  • Use the Chrome Postman extension to debug the REST API.

No comment needed, it is brilliant: https://github.com/a85/POSTMan-Chrome-Extension/wiki . It gives you fine-grained control over the "conversation".
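To check the DNS point above without guessing, you can ask Curl itself where the time went: curl_getinfo() returns cumulative timers such as namelookup_time and connect_time. Below is a rough sketch in plain PHP, not something taken from Phil's library. The function names curl_phase_times and fetch_with_timings are made up for illustration, and the per-phase split is only approximate, especially when redirects are followed.

```php
<?php
// Hypothetical helper: turns cURL's cumulative timers (seconds since the
// start of the transfer) into approximate per-phase durations.
function curl_phase_times(array $info): array
{
    $dns     = (float) ($info['namelookup_time'] ?? 0.0);
    $connect = (float) ($info['connect_time'] ?? 0.0);
    $pre     = (float) ($info['pretransfer_time'] ?? 0.0);
    $start   = (float) ($info['starttransfer_time'] ?? 0.0);
    $total   = (float) ($info['total_time'] ?? 0.0);

    return array(
        'dns'      => $dns,                          // name resolution
        'connect'  => max(0.0, $connect - $dns),     // TCP connect
        'setup'    => max(0.0, $pre - $connect),     // TLS handshake etc.
        'wait'     => max(0.0, $start - $pre),       // waiting for first byte
        'download' => max(0.0, $total - $start),     // receiving the body
    );
}

// Hypothetical wrapper: fetches a URL and returns the body plus the timings.
// It is defined here but not called, so loading this file makes no request.
function fetch_with_timings(string $url): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 60);
    $body = curl_exec($ch);
    $info = curl_getinfo($ch);

    return array('body' => $body, 'phases' => curl_phase_times($info));
}
```

If 'dns' dominates in the CI environment but not in the plain PHP one, the difference is more likely in the environment than in the wrapper.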


I would need to learn more about the CI library and whether it does any extra processing on the fetched data, but I would try naming your method something other than the library name. I ran into problems with the Facebook library when I called it from a method that was also named facebook. $this->curl can be ambiguous as to whether it refers to the library or the method.

Also try enabling the debug profiler to see where the time goes. Add this to either the constructor or the method:

 $this->output->enable_profiler(TRUE); 
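
If you want the same kind of numbers for the plain PHP version as well, you can time the fetch and the parse separately by hand. This is only a sketch: timed is a hypothetical helper name, and the closure below is a stand-in for your own curl() and simple_html_dom calls.

```php
<?php
// Hypothetical helper: runs a callable and returns its result together with
// the elapsed wall-clock time in seconds (hrtime() is monotonic, PHP 7.3+).
function timed(callable $fn): array
{
    $start = hrtime(true);                      // nanoseconds, as an integer
    $result = $fn();
    $seconds = (hrtime(true) - $start) / 1e9;   // convert to seconds

    return array($result, $seconds);
}

// Stand-in for the fetch step; replace the closure body with your curl() call.
list($html, $fetchSeconds) = timed(function () {
    return '<html><body>placeholder</body></html>';
});

printf("fetch: %.4f s\n", $fetchSeconds);
```

Wrapping the parsing step the same way shows whether the extra seconds come from the request or from the parser.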

Source: https://habr.com/ru/post/927803/

