Human-readable URLs: preferably hierarchical too?

In the now-migrated question about human-readable URLs, I allowed myself to elaborate on a little hobby horse of mine:

When I come across URLs like http://www.example.com/product/123/subpage/456.html , I always think this is an attempt to create meaningful hierarchical URLs which, however, are not fully hierarchical. What I mean is that you should be able to cut off one level at a time. In the example above, the URL violates this principle in two ways:

  • /product/123 is one piece of information, presented in two levels. This would be more correctly represented as /product:123 (or any other delimiter you like)
  • /subpage is most likely not an entity in itself (i.e. you cannot go up one level from 456.html , since http://www.example.com/product/123/subpage is "nothing").

Therefore, I consider the following more correct:

 http://www.example.com/product:123/456.html 

Here you can always move one level at a time:

  • http://www.example.com/product:123/456.html - Subpage
  • http://www.example.com/product:123 - Product Page
  • http://www.example.com/ - Root

Following the same philosophy, this would also make sense [and provides an additional link to the product list]:

 http://www.example.com/products/123/456.html 

Where:

  • http://www.example.com/products/123/456.html - Subpage
  • http://www.example.com/products/123 - Product Page
  • http://www.example.com/products - Product List
  • http://www.example.com/ - Root

My main motivation for this approach is that if each "path element" (delimited by / ) is self-contained (note 1), you can always go to the "parent" simply by deleting the last element of the URL. This is what I (sometimes) do in my file explorer when I want to go to the parent directory. Following the same logic, the user (or a search engine / crawler) can do the same. Pretty smart, I think.
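The "delete the last element" navigation can be sketched in a few lines of Python. This is only an illustration: the function name `parent_urls` is made up for this example, and it assumes plain URLs without a query string or fragment.

```python
from urllib.parse import urlsplit, urlunsplit


def parent_urls(url):
    """Return the URLs obtained by repeatedly dropping the last path element."""
    parts = urlsplit(url)
    # Keep only the non-empty path elements, e.g. ["products", "123", "456.html"].
    segments = [s for s in parts.path.split("/") if s]
    chain = []
    while segments:
        segments.pop()  # cut off one level
        path = "/" + "/".join(segments)
        chain.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return chain


for u in parent_urls("http://www.example.com/products/123/456.html"):
    print(u)
# http://www.example.com/products/123
# http://www.example.com/products
# http://www.example.com/
```

In a fully hierarchical scheme, every URL in that list would resolve to a real page.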

On the other hand (and this is the important bit of the question): although I can never prevent a user from trying to access a URL that he himself has truncated, am I wrong to assume (and accommodate) that a search engine might do the same? I.e., is it reasonable to expect that no search engine (or really: Google) will attempt to access http://www.example.com/product/123/subpage (point 2 above)? (Or am I really only taking the human factor into account here?)

This is not a matter of personal preference. This is a technical question about what I can expect from a crawler / indexer, and to what extent I should take URL manipulation by non-humans into account when designing URLs.

In addition, the structural "depth" of http://www.example.com/product/123/subpage/456.html is 4, whereas that of http://www.example.com/products/123/456.html is only 3. Rumor has it that this depth affects search engine ranking. At least that's what I have been told. (It should be obvious by now that SEO is not what I know best.) Is this (still?) true: does hierarchical depth (the number of directories) affect search rankings?
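For clarity, this is how I count "depth" here: the number of non-empty path elements. A minimal Python sketch (the helper name `path_depth` is made up for illustration, and it says nothing about how any search engine actually measures or weighs depth):

```python
from urllib.parse import urlsplit


def path_depth(url):
    """Count the non-empty path elements of a URL."""
    return len([s for s in urlsplit(url).path.split("/") if s])


print(path_depth("http://www.example.com/product/123/subpage/456.html"))  # 4
print(path_depth("http://www.example.com/products/123/456.html"))         # 3
```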

So, is my "hunch" technically sound, or should I spend my time on something else?


Example: Getting it (almost) right
Good ol' SO gets this almost right. Case in point: profiles, e.g. http://stackoverflow.com/users/52162 :

  • http://stackoverflow.com/users/52162 - Single Profile
  • http://stackoverflow.com/users - Member List
  • http://stackoverflow.com/ - Root

However, the canonical URL for the profile is actually http://stackoverflow.com/users/52162/jensgram , which seems redundant (the same endpoint, presented at two hierarchical levels). Alternative: http://stackoverflow.com/users/52162-jensgram (or any other delimiter, as long as it is used consistently).
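To show that a single-level element like 52162-jensgram still carries everything needed, here is a rough Python sketch. It assumes the numeric ID always comes first and that a hyphen is the delimiter; the function name `split_id_and_name` is hypothetical and not anything Stack Overflow is known to use.

```python
def split_id_and_name(element):
    """Split a path element such as "52162-jensgram" into (id, name).

    Assumption: the ID is numeric, comes first, and is followed by an
    optional "-name" suffix that is purely decorative.
    """
    id_part, _, name = element.partition("-")
    return int(id_part), name


print(split_id_and_name("52162-jensgram"))  # (52162, 'jensgram')
print(split_id_and_name("52162"))           # (52162, '')
```

With such a scheme the lookup would be done on the ID alone, so both forms point at the same profile while occupying only one level of the hierarchy.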


1) Carries complete information that does not depend on the "deeper" elements.

url seo human-readable hierarchical
1 answer

Hierarchical URLs like "http://www.example.com/product:123/456.html" are just as useless as "http://www.example.com/product/123/subpage", because when users look at your URLs they don't care about identifiers from your database; they want meaningful paths. This is why StackOverflow puts question titles in its URLs: "http://stackoverflow.com/questions/4017365/human-readable-urls-preferably-hierarchical-too".
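A title can be turned into such a URL element with a simple "slug" function. The following Python sketch is one common approach, not necessarily what StackOverflow does; the name `slugify` is chosen here for illustration, and it assumes ASCII titles (other characters are simply collapsed into hyphens).

```python
import re


def slugify(title):
    """Lowercase the title and collapse each run of non-alphanumerics into one hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


print(slugify("Human-readable URLs: preferably hierarchical too?"))
# human-readable-urls-preferably-hierarchical-too
```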

Google advises against replacing conventional query strings, such as "http://www.example.com/?product=123&page=456", with custom schemes, because when each site invents its own scheme the crawler cannot tell what each part means or whether it is important. Google has developed sophisticated mechanisms for finding the important arguments and ignoring the non-essential ones, which means you get more pages in the index and fewer duplicates. But these algorithms often fail when web developers come up with their own scheme.

If you want to please both users and crawlers, you should use URLs like this:

In addition, search engines give a higher ranking to pages with keywords in the URL.
