To make sure everyone understands the vocabulary involved, the general structure of a URL is as follows:
http :// www.a.com / path/to/resource.html ? query=value
The path consists of the path and the resource: in the case of path/to/resource.html, the path is path/to/ and the resource is resource.html.
Nasty, brutish and short:
HTML, as it is found in the wild, can be nasty, brutish and short, although quite often it is far from short. In this nasty, brutish world there live links that can themselves be nasty and brutish, even though URLs are supposed to adhere to standards. So, bearing this in mind, I present to you the problem ...
Problem:
I am trying to create a regex that removes the resource from a URL path, which is necessary when a link on a web page is a relative path. For instance:
1. I visit www.domain.com/path/to/page1.html .
2. There is a relative link to /page2.html
3. Remove /page1.html from the URL
4. Add /page2.html to www.domain.com/path/to
Result: I arrive at www.domain.com/path/to/page2.html
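For comparison, the four steps above are roughly what .NET's System.Uri does when it resolves a relative reference. This is only a minimal sketch: the http scheme is an assumption (the URLs above have none), and the class and method names are made up for illustration. Note that under standard resolution a link written with a leading slash, such as /page2.html, resolves against the site root, so the result described above corresponds to a link written as page2.html.

```csharp
using System;

public static class RelativeLinkDemo
{
    // Resolves a link found on a page against that page's own URL.
    public static string Resolve(string pageUrl, string link)
    {
        var baseUri = new Uri(pageUrl);
        // Uri(Uri, string) drops the last path segment of the base (step 3)
        // and appends the relative reference (step 4).
        return new Uri(baseUri, link).AbsoluteUri;
    }

    public static void Main()
    {
        const string page = "http://www.domain.com/path/to/page1.html";

        // Link without a leading slash: resolved against /path/to/
        Console.WriteLine(Resolve(page, "page2.html"));
        // prints http://www.domain.com/path/to/page2.html

        // Link with a leading slash: resolved against the site root
        Console.WriteLine(Resolve(page, "/page2.html"));
        // prints http://www.domain.com/page2.html
    }
}
```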
I'm stuck on step 3!
I have isolated the path and resource, but now I want to separate them. The regular expression that I tried to create is as follows: \z([^\/]\.[^\/])
In C#, the same regular expression is: "\\z([^/]\\.[^/])"
Translated into English, the regular expression is supposed to mean: match, at the end of the string, all characters on either side of a period, as long as those characters are not slashes.
I tried this regex, but it currently fails. What is the correct regex to achieve the specified result?
Here are some examples:
/path/to/resource.html => /path/to/ and resource.html
/pa.th/to/resource.html => /pa.th/to/ and resource.html
/path/to/resource.html/ => /path/to/resource.html/
/*I#$>/78zxdc.78&(!~ => /*I#$>/ and 78zxdc.78&(!~
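To pin down the behaviour these examples describe, here is a minimal C# sketch. It assumes the resource is the trailing run of non-slash characters that contains at least one period; the names PathSplitter, ResourcePattern and Split are made up for illustration. Unlike my attempt, it places \z at the end of the pattern and puts quantifiers on the character classes, so this is one possible reading of the requirement rather than a definitive solution.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PathSplitter
{
    // A trailing run of non-slash characters containing at least one period.
    // \z anchors the match at the very end of the string.
    private static readonly Regex ResourcePattern = new Regex(@"[^/]*\.[^/]*\z");

    // Returns the directory part and the resource part of a URL path.
    // If no resource is found, the whole input is returned as the path.
    public static (string Path, string Resource) Split(string urlPath)
    {
        Match m = ResourcePattern.Match(urlPath);
        if (!m.Success)
        {
            return (urlPath, "");
        }
        return (urlPath.Substring(0, m.Index), m.Value);
    }

    public static void Main()
    {
        string[] samples =
        {
            "/path/to/resource.html",
            "/pa.th/to/resource.html",
            "/path/to/resource.html/",
        };

        foreach (string sample in samples)
        {
            var (path, resource) = Split(sample);
            Console.WriteLine($"{sample} => '{path}' and '{resource}'");
        }
    }
}
```

With this pattern a path ending in a slash produces no match, so it is returned unchanged with an empty resource, which is the third example above.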
Thank you for your help!