cURL Multi-Threading with PHP

I use cURL to get some rank data for over 20,000 domain names that I have stored in the database.

The code I'm using is from http://semlabs.co.uk/journal/object-oriented-curl-class-with-multi-threading.

The $competeRequests array holds 20,000 requests to the compete.com API for site rankings.

This is an example request: http://apps.compete.com/sites/stackoverflow.com/trended/rank/?apikey=xxxx&start_date=201207&end_date=201208&jsonp=

Since there are 20,000 of these queries, I break them up into batches using the following code:

    foreach (array_chunk($competeRequests, 1000) as $requests) {
        foreach ($requests as $request) {
            $curl->addSession($request, $opts);
        }
    }

This works for sending the requests in batches of 1000; however, the script takes too long to execute, so I increased max_execution_time to 10 minutes.

Is there a way to send 1000 requests from my array, parse the results, output a status update, and then continue with the next 1000 until the array is empty? At the moment the screen just stays white for the entire run, which can be more than 10 minutes.
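
Something like this is what I am after; a rough sketch, assuming the class's exec() and clear() methods behave as described in the linked article:

    foreach (array_chunk($competeRequests, 1000) as $i => $requests) {
        foreach ($requests as $request) {
            $curl->addSession($request, $opts);
        }
        $results = $curl->exec();  // run this batch of multi-cURL sessions
        $curl->clear();            // reset the object for the next batch
        // ... parse $results and store the rankings here ...
        echo 'Finished batch ' . ($i + 1) . "\n"; // status update
    }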

+7
4 answers

This always works for me ... https://github.com/petewarden/ParallelCurl
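
A minimal sketch of how it is typically used, following the project's README (the callback name, the concurrency limit of 10, and the cURL options are placeholders to adapt):

    require_once 'parallelcurl.php';

    // called by ParallelCurl as each request completes
    function on_request_done($content, $url, $ch, $user_data) {
        $httpcode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        if ($httpcode !== 200) {
            echo "Fetch error $httpcode for $url\n";
            return;
        }
        // ... parse $content here ...
    }

    // never run more than 10 requests at once
    $parallel_curl = new ParallelCurl(10, array(CURLOPT_SSL_VERIFYPEER => false));

    foreach ($competeRequests as $request) {
        $parallel_curl->startRequest($request, 'on_request_done', null);
    }

    $parallel_curl->finishAllRequests(); // wait for the last transfers to finish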

+8

The above answer is out of date, so the correct answer should be upvoted instead.

http://php.net/manual/en/function.curl-multi-init.php

PHP natively supports fetching multiple URLs at the same time via the curl_multi_* family of functions.

There is a very nice wrapper function written by someone: http://archevery.blogspot.in/2013/07/php-curl-multi-threading.html

You can just use it.
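
For completeness, here is a minimal sketch using only the built-in curl_multi_* functions, shaped to the original question: send a batch of 1000, parse it, print a status line, then continue (variable names follow the question; the batch size and timeout are arbitrary):

    foreach (array_chunk($competeRequests, 1000) as $batchNo => $batch) {
        $mh = curl_multi_init();
        $handles = array();

        foreach ($batch as $url) {
            $ch = curl_init($url);
            curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
            curl_setopt($ch, CURLOPT_TIMEOUT, 30);
            curl_multi_add_handle($mh, $ch);
            $handles[] = $ch;
        }

        // drive all transfers in this batch to completion
        do {
            curl_multi_exec($mh, $running);
            curl_multi_select($mh); // sleep until there is activity
        } while ($running > 0);

        foreach ($handles as $ch) {
            $content = curl_multi_getcontent($ch);
            // ... parse $content here ...
            curl_multi_remove_handle($mh, $ch);
            curl_close($ch);
        }
        curl_multi_close($mh);

        echo 'Finished batch ' . ($batchNo + 1) . "\n";
        flush(); // needs output buffering disabled, see the last answer
    }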

+7

https://github.com/krakjoe/pthreads

[screenshot: timing run showing the overhead of starting 20,000 threads, about 18 seconds]

You can do this with threads in PHP. The code shown in the screenshot is awful thread programming, and I don't advise doing it that way, but I wanted to show you the overhead of 20,000 threads: it's 18 seconds on my current hardware, an Intel G620 (dual core) with 8 GB of memory; on server hardware you can expect much faster results. How you thread such a task depends on your resources and on the resources of the service you are requesting.
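
For reference, a minimal sketch of the pthreads API (an illustration, not a recommendation; pthreads needs a thread-safe ZTS PHP build, and the class and property names here are made up for the example):

    // one worker thread per URL: this is exactly the overhead shown above,
    // so cap the thread count in any real use
    class Fetcher extends Thread {
        public $url;
        public $response;

        public function __construct($url) {
            $this->url = $url;
        }

        public function run() {
            $ch = curl_init($this->url);
            curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
            $this->response = curl_exec($ch);
            curl_close($ch);
        }
    }

    $threads = array();
    foreach (array_slice($competeRequests, 0, 100) as $url) { // a sane subset
        $t = new Fetcher($url);
        $t->start();
        $threads[] = $t;
    }
    foreach ($threads as $t) {
        $t->join();
        // ... parse $t->response here ...
    }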

+3

Put this at the top of your PHP script:

    set_time_limit(0);
    @apache_setenv('no-gzip', 1); // comment this out if you use nginx instead of apache
    @ini_set('zlib.output_compression', 0);
    @ini_set('implicit_flush', 1);
    while (ob_get_level() > 0) {
        ob_end_flush(); // close every open output buffer
    }
    ob_implicit_flush(1);

This disables output buffering and compression in PHP and the web server, so your output is displayed in the browser while the script is still running.

Note the comment on the apache_setenv line if you use the nginx web server instead of apache.
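
With buffering disabled like this, the batch loop from the question can report progress as each batch finishes; a short sketch:

    $batches = array_chunk($competeRequests, 1000);
    foreach ($batches as $i => $batch) {
        // ... send and parse this batch here ...
        echo 'Processed batch ' . ($i + 1) . ' of ' . count($batches) . "\n";
        flush(); // without the settings above this line would sit in a buffer
    }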

Update for nginx:

So the OP uses nginx, which makes things a little more complicated, since nginx does not let you disable gzip compression from within PHP. I also use nginx, and I just found out that it is enabled by default; see:

    cat /etc/nginx/nginx.conf | grep gzip
    gzip on;
    gzip_disable "msie6";
    # gzip_vary on;
    # gzip_proxied any;
    # gzip_comp_level 6;
    # gzip_buffers 16 8k;
    # gzip_http_version 1.1;
    # gzip_types text/plain text/css application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript;

So you need to disable gzip in nginx.conf and restart nginx:

    /etc/init.d/nginx restart

Or you can play with gzip_disable or gzip_types to conditionally disable gzip for certain browsers or certain content types, respectively.
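
For example, to switch gzip off only for the long-running script, something like this should work (the location path is an assumption; adjust it to your URL):

    # inside your server { } block:
    location = /rank-checker.php {   # hypothetical script name
        gzip off;                    # skip compression just for this script
        # ... your usual PHP (fastcgi) handling here ...
    }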

+2
