In the question about readable URLs, which has since been moved, I allowed myself to elaborate on a small hobby horse of mine:
When I come across URLs like http://www.example.com/product/123/subpage/456.html , I always think of them as an attempt to create meaningful hierarchical URLs that are, however, not completely hierarchical. What I mean is that you should be able to cut off one level at a time. In the example above, the URL has two violations of this principle:
/product/123 is one piece of information presented in two levels. It would be more correctly represented as /product:123 (or any other delimiter you like).
/subpage is most likely not an entity in itself (i.e. you cannot go up one level from 456.html , since http://www.example.com/product/123/subpage is "nothing").
Therefore, I consider the following more correct:
http:
Here you can always move one level at a time:
http://www.example.com/product:123/456.html - Subpage
http://www.example.com/product:123 - Product Page
http://www.example.com/ - Root
Following the same philosophy, the following also makes sense [and provides an additional link to the product list]:
http://www.example.com/products/123/456.html
Where:
http://www.example.com/products/123/456.html - Subpage
http://www.example.com/products/123 - Product Page
http://www.example.com/products - Product List
http://www.example.com/ - Root
My main motivation for this approach is that if each "path element" (delimited by / ) is self-contained (1), you can always go to the "parent" simply by deleting the last element of the URL. This is what I (sometimes) do in my file explorer when I want to go to the parent directory. Following the same logic, the user (or a search engine / crawler) can do the same. Pretty neat, I think.
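The "delete the last element" idea can be sketched in a few lines of Python. This is only an illustration of the principle, not how any browser or crawler actually behaves; the function name ancestors is made up for this example.

```python
from urllib.parse import urlsplit, urlunsplit

def ancestors(url):
    """Return the URL itself, then each parent obtained by dropping the last path element."""
    parts = urlsplit(url)
    # Ignore empty segments produced by leading or trailing slashes.
    segments = [s for s in parts.path.split("/") if s]
    chain = []
    for n in range(len(segments), -1, -1):
        path = "/" + "/".join(segments[:n])
        # Query string and fragment are dropped on purpose.
        chain.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return chain

for u in ancestors("http://www.example.com/products/123/456.html"):
    print(u)
```

In a fully hierarchical scheme, every URL in the returned list should resolve to a real page.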
On the other hand (and this is the important part of the question): although I can never prevent a user from trying to access a URL that he has truncated himself, am I wrong in assuming (and honoring) that a search engine might do the same? I.e., is it reasonable to expect that no search engine (or really: Google) will attempt to access http://www.example.com/product/123/subpage (the second violation above)? (Or am I really only considering the human factor here?)
This is not a matter of personal preference. It is a technical question about what I can expect from a crawler / indexer, and to what extent I should take URL manipulation by non-humans into account when designing URLs.
In addition, the structural "depth" of http://www.example.com/product/123/subpage/456.html is 4, whereas that of http://www.example.com/products/123/456.html is only 3. Rumor has it that this depth affects search engine ranking. At least, that is what I was told. (By now it is obvious that SEO is not what I know best.) Is this (still?) true: does hierarchical depth (the number of directories) affect search ranking?
So, is my “hunch” technically sound, or should I spend my time on something else?
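For clarity, this is how I count "depth": the number of non-empty path elements. A minimal sketch (the helper name path_depth is made up here, and it says nothing about how a search engine actually measures or weighs depth):

```python
from urllib.parse import urlsplit

def path_depth(url):
    """Count the non-empty path elements of a URL."""
    return len([s for s in urlsplit(url).path.split("/") if s])

print(path_depth("http://www.example.com/product/123/subpage/456.html"))  # 4
print(path_depth("http://www.example.com/products/123/456.html"))         # 3
```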
Example: Doing it (almost) right
Good ol' SO gets this almost right. Example: profiles, e.g. http://stackoverflow.com/users/52162 :
http://stackoverflow.com/users/52162 - Single Profile
http://stackoverflow.com/users - Member List
http://stackoverflow.com/ - Root
However, the canonical URL for the profile is actually http://stackoverflow.com/users/52162/jensgram , which seems redundant (the same endpoint, presented at two hierarchical levels). Alternative: http://stackoverflow.com/users/52162-jensgram (or any other delimiter, used consistently).
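With the combined form, a single path element still carries both the id and the display name. A hypothetical sketch of how such an element could be split on the server side (the function name and the choice of "-" as delimiter are assumptions for illustration, not how SO does it):

```python
def parse_user_segment(segment):
    """Split a path element like "52162-jensgram" into (id, name); the name is optional."""
    user_id, _, name = segment.partition("-")
    return int(user_id), name

print(parse_user_segment("52162-jensgram"))
print(parse_user_segment("52162"))
```

Only the numeric part would be needed to look up the profile, so the name can change without breaking the URL.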
1) Carries complete information that does not depend on the "deeper" elements.