Parsing a URL with no path but with a slash in the query string

I am having problems parsing a URL that has no path but has a slash in the query string. For example: http://example.com?q=a/b

I know that such a URL is most likely invalid (*), since a valid one needs at least a slash as its path: http://example.com/?q=a/b

Every browser in which I tried such a URL fixes it correctly. That is basically what I want to reproduce: detect such a URL and fix it.

Using parse_url, however, produces:

    var_dump( parse_url('http://example.com?q=a/b') );

    array(3) {
      ["scheme"]=>
      string(4) "http"
      ["host"]=>
      string(15) "example.com?q=a"
      ["path"]=>
      string(2) "/b"
    }

By contrast, a URL without a slash in the query string parses fine:

    var_dump( parse_url('http://example.com?q=ab') );

    array(3) {
      ["scheme"]=>
      string(4) "http"
      ["host"]=>
      string(11) "example.com"
      ["query"]=>
      string(4) "q=ab"
    }

All of the external libraries I've tried (Jwage\Purl, League\Url, Sabre\Uri) basically do the same thing, which surprises me a bit.

Why do (all?) browsers get this "right" and (all?) PHP libraries get this "wrong"?

Besides catching these cases with a regular expression before parsing the URL (which may be unreliable, and avoiding that is why I want to use a library in the first place), what are my alternatives?

(*) I consulted three sources: RFC 1738, RFC 3986, and the WHATWG URL Standard, and they disagree on what is considered valid.

+5
2 answers

If you still want to use a regex, the following should produce the URL you are looking for:

    $url = preg_replace('~^([^/?#]+://[^/?#]+)\?~', '$1/?', $url);

The URL must start with a scheme name of at least one character, followed by "://" and a host name of at least one character ("localhost" is also acceptable). The pattern then inserts a '/' before the '?', but only if no '/' appears between the "://" and the '?'.
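To check that the rewrite described above does what the question asks, the result can be passed straight to parse_url. The following is only a minimal sketch: the helper name fix_missing_root_path is hypothetical, and the pattern assumes the URL starts with a scheme and "://".

```php
<?php
// Hypothetical helper name; it wraps the regex rewrite described above.
function fix_missing_root_path(string $url): string
{
    // Insert "/" before "?" when the host is followed directly by the query.
    // preg_replace returns null on a regex error, so fall back to the input.
    return preg_replace('~^([^/?#]+://[^/?#]+)\?~', '$1/?', $url) ?? $url;
}

$fixed = fix_missing_root_path('http://example.com?q=a/b');
$parts = parse_url($fixed);

var_dump($fixed);
var_dump($parts);
```

A URL that already has a path, such as http://example.com/?q=a/b, does not match the pattern and is returned unchanged.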

0

The WHATWG URL Standard comes closest to what browsers do. Other software is not quite aligned yet, although for PHP https://phppackages.org/p/esperecyan/url may work. (I have not tried it.)
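If adding a dependency is not an option, one alternative is to split the URL with the regular expression given in RFC 3986, Appendix B, which ends the authority at the first "/", "?" or "#". This is only a sketch and not an implementation of the WHATWG parsing algorithm. The function names split_uri and add_root_path are hypothetical, and adding "/" whenever an authority is followed by an empty path is an assumption about the desired normalization.

```php
<?php
// Hypothetical helper: split a URI reference into its five components
// using the regular expression from RFC 3986, Appendix B.
function split_uri(string $uri): array
{
    $pattern = '~^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?~s';
    // Every part of the pattern is optional, so it matches any string.
    preg_match($pattern, $uri, $m, PREG_UNMATCHED_AS_NULL);

    return [
        'scheme'    => $m[2] ?? null,
        'authority' => $m[4] ?? null,
        'path'      => $m[5] ?? '',
        'query'     => $m[7] ?? null,
        'fragment'  => $m[9] ?? null,
    ];
}

// Hypothetical helper: give a URL with an authority but an empty path
// the path "/", then reassemble the components.
function add_root_path(string $uri): string
{
    $p = split_uri($uri);
    if ($p['authority'] !== null && $p['path'] === '') {
        $p['path'] = '/';
    }

    $out = '';
    if ($p['scheme'] !== null) {
        $out .= $p['scheme'] . ':';
    }
    if ($p['authority'] !== null) {
        $out .= '//' . $p['authority'];
    }
    $out .= $p['path'];
    if ($p['query'] !== null) {
        $out .= '?' . $p['query'];
    }
    if ($p['fragment'] !== null) {
        $out .= '#' . $p['fragment'];
    }
    return $out;
}

$split = split_uri('http://example.com?q=a/b');
$fixed = add_root_path('http://example.com?q=a/b');
var_dump($split, $fixed);
```

Note that this sketch also turns http://example.com into http://example.com/, which may or may not be wanted.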

-1
