Best way to create a friendly URI string for SEO

The method must only allow the characters " 0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ- " in the URI strings.

What is the best way to make a good SEO URI string?

3 answers

Here is the commonly recommended approach:

  • Lowercase string.

     string = string.toLowerCase(); 
  • Normalize all characters and get rid of all diacritics (so, for example, é, ö, à become e, o, a).

     string = Normalizer.normalize(string, Form.NFD).replaceAll("\\p{InCombiningDiacriticalMarks}+", ""); 
  • Replace each run of remaining non-alphanumeric characters with a single -.

     string = string.replaceAll("[^\\p{Alnum}]+", "-"); 

So in short:

     public static String toPrettyURL(String string) {
         return Normalizer.normalize(string.toLowerCase(), Form.NFD)
             .replaceAll("\\p{InCombiningDiacriticalMarks}+", "")
             .replaceAll("[^\\p{Alnum}]+", "-");
     }
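
For reference, here is a self-contained sketch of the same three steps with two small additions that are assumptions on my part, not part of the answer above: it lowercases with `Locale.ROOT`, so the result does not depend on the JVM's default locale (in a Turkish locale, "I" would otherwise lowercase to a dotless ı), and it trims a leading or trailing hyphen. The class and method names are hypothetical.

```java
import java.text.Normalizer;
import java.util.Locale;

final class SlugSketch {

    // Hypothetical helper name, used only for illustration.
    static String toSlug(String input) {
        String s = Normalizer
                // Lowercase first, independent of the default locale, then decompose accents.
                .normalize(input.toLowerCase(Locale.ROOT), Normalizer.Form.NFD)
                // Drop the combining marks left behind by NFD decomposition.
                .replaceAll("\\p{InCombiningDiacriticalMarks}+", "")
                // Replace each run of anything that is not a-z or 0-9 with one hyphen.
                .replaceAll("[^a-z0-9]+", "-");
        // Assumption: a slug should not start or end with a hyphen.
        return s.replaceAll("^-|-$", "");
    }

    public static void main(String[] args) {
        // The literal below is "  Crème Brûlée: über-good!  " written with Unicode escapes.
        String title = "  Cr\u00e8me Br\u00fbl\u00e9e: \u00fcber-good!  ";
        System.out.println(toSlug(title)); // prints creme-brulee-uber-good
    }
}
```

Letters that have no decomposition (for example ß or ø) are not transliterated by this approach; they fall into the last replacement and become hyphens.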

The following chain of regex replacements does the same as your algorithm. I do not know of a library for this.

     String s = input
         .replaceAll(" ?- ?", "-")            // remove spaces around hyphens
         .replaceAll("[ ']", "-")             // turn spaces and quotes into hyphens
         .replaceAll("[^ 0-9a-zA-Z-]", "");   // remove everything not in our allowed char set
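
A regex-only chain like this keeps the original case and can leave consecutive hyphens behind. As a rough sketch, assuming the goal is the character set from the question (digits, ASCII letters, and the hyphen), a variant could look like the following. The last two replacements are my additions, and the class and method names are hypothetical.

```java
final class RegexSlugSketch {

    // Hypothetical helper name, used only for illustration.
    static String toSlug(String input) {
        return input
                .replaceAll("[\\s']+", "-")       // whitespace and apostrophes become hyphens
                .replaceAll("[^0-9a-zA-Z-]", "")  // drop everything outside the allowed set
                .replaceAll("-{2,}", "-")         // addition: collapse runs of hyphens
                .replaceAll("^-|-$", "");         // addition: trim a leading or trailing hyphen
    }

    public static void main(String[] args) {
        System.out.println(toSlug("Rock 'n' Roll - Live!")); // prints Rock-n-Roll-Live
    }
}
```

Note that accented letters are simply dropped here rather than mapped to their base letters, so "café" becomes "caf".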

These strings are usually called "slugs", which is the term to search for if you want more information.

You might want to check out other answers, such as "How to create a URL with a friendly search type from a string?" and "How to make Django slugify work correctly with Unicode strings?"

They cover C# and Python more than JavaScript, but they include some language-independent discussion of slug conventions and the problems you might encounter when creating slugs (e.g. uniqueness, problems with Unicode normalization, etc.).
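
To illustrate the uniqueness problem: two different titles can produce the same slug, so something has to disambiguate them. One possible approach, sketched here under the assumption that appending a numeric suffix is acceptable, keeps track of slugs already in use. The in-memory set and all names are hypothetical; a real application would more likely check its database.

```java
import java.util.HashSet;
import java.util.Set;

final class UniqueSlugSketch {

    // Hypothetical in-memory registry of slugs that are already taken.
    private final Set<String> taken = new HashSet<>();

    // Returns the slug itself if it is free, otherwise slug-2, slug-3, and so on.
    String reserve(String slug) {
        String candidate = slug;
        int suffix = 2;
        // Set.add returns false when the candidate is already present.
        while (!taken.add(candidate)) {
            candidate = slug + "-" + suffix++;
        }
        return candidate;
    }

    public static void main(String[] args) {
        UniqueSlugSketch slugs = new UniqueSlugSketch();
        System.out.println(slugs.reserve("hello-world")); // prints hello-world
        System.out.println(slugs.reserve("hello-world")); // prints hello-world-2
        System.out.println(slugs.reserve("hello-world")); // prints hello-world-3
    }
}
```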
