Here's some Perl rather than PHP, but the regex is the same. It has parsed every log line I've seen so far, including the strange requests that clients sometimes send:
    my ($ip, $date, $method, $url, $protocol, $alt_url, $code, $bytes,
            $referrer, $ua) = (m/
        ^(\S+)\s                    # IP
        \S+\s+                      # remote logname
        (?:\S+\s+)+                 # remote user
        \[([^]]+)\]\s               # date
        "(\S*)\s?                   # method
        (?:((?:[^"]*(?:\\")?)*)\s   # URL
        ([^"]*)"\s|                 # protocol
        ((?:[^"]*(?:\\")?)*)"\s)    # or, possibly URL with no protocol
        (\S+)\s                     # status code
        (\S+)\s                     # bytes
        "((?:[^"]*(?:\\")?)*)"\s    # referrer
        "(.*)"$                     # user agent
    /x);
    die "Couldn't match $_" unless $ip;
    $alt_url ||= '';
    $url ||= $alt_url;
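To show how the two URL branches behave, here is a minimal sketch that wraps the same regex in a helper and runs it on two log lines. The helper name `parse_log_line` and both sample lines are made up for illustration; they are not from any real log.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original answer): applies the regex
# to one line and returns a hash ref of fields, or nothing if it fails.
sub parse_log_line {
    my ($line) = @_;
    my ($ip, $date, $method, $url, $protocol, $alt_url, $code, $bytes,
            $referrer, $ua) = ($line =~ m/
        ^(\S+)\s                    # IP
        \S+\s+                      # remote logname
        (?:\S+\s+)+                 # remote user
        \[([^]]+)\]\s               # date
        "(\S*)\s?                   # method
        (?:((?:[^"]*(?:\\")?)*)\s   # URL
        ([^"]*)"\s|                 # protocol
        ((?:[^"]*(?:\\")?)*)"\s)    # or, possibly URL with no protocol
        (\S+)\s                     # status code
        (\S+)\s                     # bytes
        "((?:[^"]*(?:\\")?)*)"\s    # referrer
        "(.*)"$                     # user agent
    /x);
    return unless defined $ip;

    # Fall back to the protocol-less URL capture, as in the snippet above.
    $alt_url ||= '';
    $url     ||= $alt_url;

    return {
        ip => $ip, date => $date, method => $method, url => $url,
        protocol => $protocol, code => $code, bytes => $bytes,
        referrer => $referrer, ua => $ua,
    };
}

# Made-up sample lines: one ordinary request, one with no protocol.
my @samples = (
    '192.0.2.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326 "-" "ExampleAgent/1.0"',
    '192.0.2.1 - - [10/Oct/2000:13:55:37 -0700] "GET /index.html" 400 226 "-" "-"',
);

for my $line (@samples) {
    my $f = parse_log_line($line) or die "Couldn't match $line";
    printf "%s %s protocol=%s status=%s\n",
        $f->{method}, $f->{url}, $f->{protocol} // '(none)', $f->{code};
}
```

In the second line the request has no protocol, so the first URL branch cannot match and the URL comes from the `$alt_url` capture instead; `$protocol` is left undefined.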
Daniel S. Sterling