How to parse Apache logs using regex in PHP

I am trying to split this log line in PHP:

11.11.11.11 - - [25/Jan/2000:14:00:01 +0100] "GET /1986.js HTTP/1.1" 200 932 "http://domain.com/index.html" "Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.9.1.7) Gecko/20091221 Firefox/3.5.7 GTB6" 

How can I split this into IP, date, HTTP method, domain name and browser?

+6
php regex
4 answers

This is the Apache combined log format. Try this regex:

 /^(\S+) \S+ \S+ \[([^\]]+)\] "([A-Z]+)[^"]*" \d+ \d+ "[^"]*" "([^"]*)"$/m 

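The relevant groups are as follows:

Group 1: the IP address
Group 2: the date and time
Group 3: the HTTP method
Group 4: the User-Agent (browser)

The domain is not captured, because the log line does not record it. The second quoted string is the value of the Referer header.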
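As a rough illustration of how the four groups come out in PHP, here is a minimal sketch that applies the pattern from this answer (with the method class written as `[A-Z]+`) to the line from the question. The function name `parseCombinedLine` and the array keys are made up for this example; the line is trimmed first because trailing whitespace would otherwise stop the final `$` from matching.

```php
<?php
// Hypothetical helper: returns the four captured fields, or null if the
// line does not look like a combined log format entry.
function parseCombinedLine(string $line): ?array
{
    $pattern = '/^(\S+) \S+ \S+ \[([^\]]+)\] "([A-Z]+)[^"]*" \d+ \d+ "[^"]*" "([^"]*)"$/m';

    // trim() removes trailing spaces/newlines that would break the $ anchor.
    if (!preg_match($pattern, trim($line), $m)) {
        return null;
    }

    return [
        'ip'         => $m[1],
        'date'       => $m[2],
        'method'     => $m[3],
        'user_agent' => $m[4],
    ];
}

$line = '11.11.11.11 - - [25/Jan/2000:14:00:01 +0100] "GET /1986.js HTTP/1.1" 200 932 '
      . '"http://domain.com/index.html" "Mozilla/5.0 (Windows; U; Windows NT 5.1; de; '
      . 'rv:1.9.1.7) Gecko/20091221 Firefox/3.5.7 GTB6" ';

$fields = parseCombinedLine($line);
if ($fields !== null) {
    echo $fields['ip'], "\n";     // 11.11.11.11
    echo $fields['date'], "\n";   // 25/Jan/2000:14:00:01 +0100
    echo $fields['method'], "\n"; // GET
}
```

Returning null on a failed match lets the caller skip or log lines that are not in the expected format instead of reading undefined array offsets.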

+12

You should work through a regular expression tutorial, but here is the answer:

    if (preg_match('/^(\S+) \S+ \S+ \[(.*?)\] "(\S+).*?" \d+ \d+ "(.*?)" "(.*?)"/', $line, $m)) {
        $ip      = $m[1];
        $date    = $m[2];
        $method  = $m[3];
        $referer = $m[4];
        $browser = $m[5];
    }

Note that the log does not contain a domain name; that field is the HTTP referrer.
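If the host of the referring page is close enough to the "domain name" you are after, it can be pulled out of the captured referrer with PHP's built-in `parse_url()`. This is only a sketch: the helper name `refererHost` is made up here, and it assumes a missing Referer is logged as `-`, which is how Apache normally writes an empty field.

```php
<?php
// Hypothetical helper: returns the host part of a Referer value,
// or null when there is no usable host.
function refererHost(string $referer): ?string
{
    // Assumption: "-" (or an empty string) means no Referer was sent.
    if ($referer === '' || $referer === '-') {
        return null;
    }

    // parse_url() returns null or false when the host cannot be determined.
    $host = parse_url($referer, PHP_URL_HOST);

    return is_string($host) ? $host : null;
}

echo refererHost('http://domain.com/index.html'), "\n"; // domain.com
```

Keep in mind that this is the host of the page that linked to the resource, not necessarily the virtual host that served the request.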

+4

Here is some Perl, not PHP, but the regex is the same. This regex has parsed everything I have seen, and clients can send some strange requests:

    my ($ip, $date, $method, $url, $protocol, $alt_url, $code, $bytes,
        $referrer, $ua) = (m/
        ^(\S+)\s                    # IP
        \S+\s+                      # remote logname
        (?:\S+\s+)+                 # remote user
        \[([^]]+)\]\s               # date
        "(\S*)\s?                   # method
        (?:((?:[^"]*(?:\\")?)*)\s   # URL
        ([^"]*)"\s|                 # protocol
        ((?:[^"]*(?:\\")?)*)"\s)    # or, possibly URL with no protocol
        (\S+)\s                     # status code
        (\S+)\s                     # bytes
        "((?:[^"]*(?:\\")?)*)"\s    # referrer
        "(.*)"$                     # user agent
    /x);
    die "Couldn't match $_" unless $ip;
    $alt_url ||= '';
    $url ||= $alt_url;
+4
    // Parses NCSA Combined Log Format lines:
    $pattern = '/^([^ ]+) ([^ ]+) ([^ ]+) (\[[^\]]+\]) "(.*) (.*) (.*)" ([0-9\-]+) ([0-9\-]+) "(.*)" "(.*)"$/';

Usage:

    if (preg_match($pattern, $yourstuff, $matches)) {
        // Puts each part of the match in a named variable
        list($whole_match, $remote_host, $logname, $user, $date_time, $method,
             $request, $protocol, $status, $bytes, $referer, $user_agent) = $matches;
    }
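With this pattern, `$date_time` still includes the surrounding square brackets. If a real date object is needed, one possible follow-up is sketched below; the helper name `parseLogDate` is an assumption for this example, and the format string assumes the default Apache timestamp layout shown in the question.

```php
<?php
// Hypothetical helper: converts "[25/Jan/2000:14:00:01 +0100]"
// (brackets optional) into a DateTimeImmutable, or null on failure.
function parseLogDate(string $dateTime): ?DateTimeImmutable
{
    // d/M/Y:H:i:s matches "25/Jan/2000:14:00:01"; O matches "+0100".
    $parsed = DateTimeImmutable::createFromFormat(
        'd/M/Y:H:i:s O',
        trim($dateTime, '[]')
    );

    return $parsed === false ? null : $parsed;
}

$when = parseLogDate('[25/Jan/2000:14:00:01 +0100]');
echo $when !== null ? $when->format(DATE_ATOM) : 'unparseable', "\n";
// 2000-01-25T14:00:01+01:00
```

The UTC offset from the log line is preserved, so timestamps from servers in different time zones can be compared safely.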
+1