How to get HTML code of a webpage in PHP?

I want to get the HTML code of a link (web page) in PHP. For example, if the link is

https://stackoverflow.com/questions/ask

then I need the HTML code of the page that is served. I want to get this HTML code and save it in a PHP variable.

How can I do this?

+72
html php
May 04 '09 at 7:57 a.m.
9 answers

If your PHP server allows URL fopen wrappers (allow_url_fopen), then the easiest way is:

$html = file_get_contents('http://stackoverflow.com/questions/ask'); 
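
file_get_contents() returns false when the request fails, so it may be worth checking the result and setting a timeout. The following is only a minimal sketch: the helper name fetch_html() and the 10-second default are assumptions made for illustration, not part of the answer above.

```php
<?php
// Hypothetical helper: fetch_html() is an illustrative name, not a PHP built-in.
function fetch_html(string $url, int $timeoutSeconds = 10): ?string
{
    // Options for the http(s) stream wrapper; other wrappers ignore them.
    $context = stream_context_create([
        'http' => [
            'method'  => 'GET',
            'timeout' => $timeoutSeconds, // read timeout in seconds (assumed value)
        ],
    ]);

    // Suppress the warning and rely on the return value instead.
    $html = @file_get_contents($url, false, $context);

    // Map the failure value (false) to null so callers can test for it explicitly.
    return $html === false ? null : $html;
}
```

Calling fetch_html('http://stackoverflow.com/questions/ask') would then give either the page source as a string or null if the request failed.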

If you need more control, you should look at the cURL functions:

    $c = curl_init('http://stackoverflow.com/questions/ask');
    curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
    // curl_setopt(... other options you want ...)

    $html = curl_exec($c);

    if (curl_error($c)) {
        die(curl_error($c));
    }

    // Get the status code
    $status = curl_getinfo($c, CURLINFO_HTTP_CODE);

    curl_close($c);

+108
May 04 '09 at 8:02 a.m.

Also, if you want to process the fetched page in some way, you may want to try a PHP DOM parser. I find PHP Simple HTML DOM Parser very easy to use.

+18
May 04 '09 at 9:01 a.m.

You can check out the YQL libraries from Yahoo: http://developer.yahoo.com/yql

The task at hand is as simple as

 select * from html where url = 'http://stackoverflow.com/questions/ask' 

You can try this in the console: http://developer.yahoo.com/yql/console (login required)

Also see Chris Heilmann's screencast for some good ideas about what else you can do: http://developer.yahoo.net/blogs/theater/archives/2009/04/screencast_collating_distributed_information.html

+12
May 04 '09 at 8:45

Easy way: use file_get_contents():

 $page = file_get_contents('http://stackoverflow.com/questions/ask'); 

Note that allow_url_fopen must be enabled in your php.ini in order to use fopen wrappers with URL support.

More advanced way: if you cannot change your PHP configuration and allow_url_fopen is disabled (it is on by default, but some hosts turn it off), and ext/curl is installed, use the cURL library to connect to the desired page.
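
The fallback described above could be sketched roughly as follows. This is an illustration under stated assumptions rather than a definitive implementation: fetch_page() is a hypothetical helper name, and the 5-second connect timeout is an arbitrary choice.

```php
<?php
// Hypothetical helper: fetch_page() is an illustrative name, not part of PHP.
function fetch_page(string $url): string
{
    // Preferred path: URL-aware fopen wrappers are enabled.
    if (filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN)) {
        $html = @file_get_contents($url);
        if ($html !== false) {
            return $html;
        }
    }

    // Fallback path: the cURL extension, if it is loaded.
    if (function_exists('curl_init')) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);    // assumed timeout, in seconds

        $html  = curl_exec($ch);
        $error = curl_error($ch);

        if ($html !== false) {
            return $html;
        }
        throw new RuntimeException("cURL request failed: $error");
    }

    throw new RuntimeException(
        'Could not fetch the URL: file_get_contents() failed or is disabled, and cURL is unavailable'
    );
}
```

A caller would wrap fetch_page($url) in a try/catch block and treat a RuntimeException as "page could not be retrieved".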

+9
May 04 '09 at 8:04 a.m.

You could use file_get_contents if you want to store the source in a variable; however, cURL is the better option.

    $url = file_get_contents('http://example.com');
    echo $url;

This solution displays the fetched web page on your site. However, cURL is the better option.

+2
Jan 27 '13 at 2:17

    include_once('simple_html_dom.php');

    $url = "http://stackoverflow.com/questions/ask";
    $html = file_get_html($url);

With this code you get the whole HTML in parsed form (a DOM object rather than a raw string). Download the file 'simple_html_dom.php' here: http://sourceforge.net/projects/simplehtmldom/files/simple_html_dom.php/download

+1
Dec 18 '13 at 12:20

Here are two different, easy ways to get content from a URL:

1) first method

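Enable allow_url_fopen in your hosting configuration (php.ini or elsewhere).

    <?php
    // file_get_contents() returns the page as a string;
    // readfile() would print it and return only the byte count.
    $variableee = file_get_contents("http://example.com/");
    echo $variableee;
    ?>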

or

2) second method

Enable php_curl, php_imap and php_openssl

    <?php
    // You can add other cURL options too; see
    // http://php.net/manual/en/function.curl-setopt.php
    function get_dataa($url)
    {
        $ch = curl_init();
        $timeout = 5;
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)");
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        // Note: the next two lines turn off TLS certificate checks.
        curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
        curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
        curl_setopt($ch, CURLOPT_MAXREDIRS, 10);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
        $data = curl_exec($ch);
        curl_close($ch);
        return $data;
    }

    $variableee = get_dataa('http://example.com');
    echo $variableee;
    ?>

0
Apr 03 '13 at

You can also use DOMDocument to pull individual HTML elements into variables:

    $homepage = file_get_contents('https://www.example.com/');

    $doc = new DOMDocument;
    $doc->loadHTML($homepage);

    $titles = $doc->getElementsByTagName('h3');
    echo $titles->item(0)->nodeValue;

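
Real-world pages are often not valid HTML, and the tag you are looking for may be missing. A slightly more defensive sketch of the same idea is shown below; the helper name first_tag_text() is hypothetical and only used for illustration.

```php
<?php
// Hypothetical helper: first_tag_text() is an illustrative name, not part of the DOM extension.
function first_tag_text(string $html, string $tagName): ?string
{
    if ($html === '') {
        return null; // loadHTML() does not accept an empty string
    }

    // Collect libxml parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);

    // Discard the collected errors and restore the previous error mode.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $nodes = $doc->getElementsByTagName($tagName);
    if ($nodes->length === 0) {
        return null; // no such element, so there is nothing to read
    }

    return trim($nodes->item(0)->textContent);
}
```

With the page source in $homepage, first_tag_text($homepage, 'h3') would return the text of the first h3 element, or null if the page has none.
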
0
Dec 11 '18 at 10:29


