Java - convert string to a valid URI object

I am trying to get a java.net.URI object from a String. The string contains several characters that need to be replaced with their percent-escape sequences. But when I use URLEncoder to encode the String as UTF-8, even characters such as : and / are replaced with their escape sequences.

How to get the correct encoded url from a String object?

http://www.google.com?q=a b gives http%3A%2F%2Fwww.google.com... whereas I want the result to be http://www.google.com?q=a%20b

Can someone please tell me how to achieve this?

I am trying to do this in an android application. Therefore, I have access to a limited number of libraries.

+66
java android encoding utf-8
Feb 21 '09 at 15:07
11 answers

You can try org.apache.commons.httpclient.util.URIUtil.encodeQuery from the Apache commons-httpclient project.

Like this (see URIUtil ):

 URIUtil.encodeQuery("http://www.google.com?q=a b") 

will become:

 http://www.google.com?q=a%20b 

You can, of course, do it yourself, but parsing a URI can get pretty dirty ...

+55
Feb 21 '09 at 15:26

Android has always had a Uri class as part of the SDK: http://developer.android.com/reference/android/net/Uri.html

You can just do something like:

 String requestURL = String.format("http://www.example.com/?a=%s&b=%s", Uri.encode("foo bar"), Uri.encode("100% fubar'd")); 
+45
Apr 07 2018-11-11T00:

I am going to add one suggestion here for Android users. You can do this without pulling in any external libraries. In addition, all of the character search/replace solutions suggested in some of the other answers are dangerous and should be avoided.

Try:

    String urlStr = "http://abc.dev.domain.com/0007AC/ads/800x480 15sec h.264.mp4";
    URL url = new URL(urlStr);
    URI uri = new URI(url.getProtocol(), url.getUserInfo(), url.getHost(), url.getPort(),
            url.getPath(), url.getQuery(), url.getRef());
    url = uri.toURL();

You can see that in this particular URL the spaces need to be encoded before it can be used for a request.

This takes advantage of a couple of features available in the Android classes. First, the URL class can split a URL into its components, so you don't need to do any string search/replace work. Second, the URI class properly escapes each component when you construct a URI from components rather than from a single string.

The beauty of this approach is that you can use any valid url string and work with it without any special knowledge about it.
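A minimal sketch of the same approach wrapped in a reusable helper. The names `UrlToUri` and `toEncodedUri` are made up for illustration; everything else is plain java.net.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

class UrlToUri {
    // Hypothetical helper: URL splits the raw string into components, then the
    // seven-argument URI constructor escapes illegal characters in each one.
    static URI toEncodedUri(String raw) {
        try {
            URL url = new URL(raw);
            return new URI(url.getProtocol(), url.getUserInfo(), url.getHost(),
                    url.getPort(), url.getPath(), url.getQuery(), url.getRef());
        } catch (MalformedURLException | URISyntaxException e) {
            throw new IllegalArgumentException("Not a usable URL: " + raw, e);
        }
    }

    public static void main(String[] args) {
        // Spaces in the path and in the query are both escaped as %20.
        System.out.println(toEncodedUri(
                "http://abc.dev.domain.com/0007AC/ads/800x480 15sec h.264.mp4"));
        System.out.println(toEncodedUri("http://www.google.com?q=a b"));
    }
}
```

One caveat: these constructors always quote the percent character, so a string that is already percent-encoded gets encoded a second time (%20 becomes %2520). This is best used on raw, unencoded input.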

+33
Jan 22 2018-12-22T00:

Even if this is an old post with an already accepted answer, I am posting my alternative answer because it is well suited for this problem and no one seems to mention this method.

With the java.net.URI class:

 URI uri = URI.create(URLString); 

And if you need the corresponding string in URL format:

 String validURLString = uri.toASCIIString(); 

Unlike many other methods (e.g. java.net.URLEncoder), this percent-encodes only the non-ASCII characters (e.g. ç, é ...) and leaves the rest of the URL alone.

In the above example, if the URLString is the following String :

 "http://www.domain.com/façon+word" 

the resulting validURLString will be:

 "http://www.domain.com/fa%C3%A7on+word" 

which is a well formatted url.
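A small sketch of this behaviour, with an important limitation: URI.create only accepts strings that are already legal URIs apart from non-ASCII characters, so it rejects the space in the question's example with an IllegalArgumentException. The class and method names here are invented for the demo.

```java
import java.net.URI;

class AsciiUriDemo {
    // Percent-encodes non-ASCII characters as UTF-8; ASCII is left untouched.
    static String toAscii(String s) {
        return URI.create(s).toASCIIString();
    }

    // Returns true when URI.create refuses the string outright.
    static boolean rejects(String s) {
        try {
            URI.create(s);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // \u00e7 is the letter c with cedilla, written as an escape so the
        // example does not depend on the source file encoding.
        System.out.println(toAscii("http://www.domain.com/fa\u00e7on+word"));
        System.out.println(rejects("http://www.google.com?q=a b"));
    }
}
```

So this method fits URLs whose only problem is accented or other non-ASCII characters; for spaces and similar illegal ASCII characters, one of the multi-argument URI constructors is needed.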

+13
Aug 6

If you don't like libraries, how about this?

Please note that you should not use this function on the whole URL; instead, use it on the individual components, e.g. just the "a b" component, as you build up the URL. Otherwise the computer will not know which characters should have special meaning and which should be taken literally.

    /** Converts a string into something you can safely insert into a URL. */
    public static String encodeURIcomponent(String s) {
        StringBuilder o = new StringBuilder();
        // Escape the UTF-8 bytes rather than the chars, so that characters
        // above U+00FF produce valid two-digit hex escapes.
        for (byte b : s.getBytes(java.nio.charset.StandardCharsets.UTF_8)) {
            int ch = b & 0xFF;
            if (isUnsafe(ch)) {
                o.append('%');
                o.append(toHex(ch / 16));
                o.append(toHex(ch % 16));
            } else {
                o.append((char) ch);
            }
        }
        return o.toString();
    }

    private static char toHex(int ch) {
        return (char) (ch < 10 ? '0' + ch : 'A' + ch - 10);
    }

    private static boolean isUnsafe(int ch) {
        // Control characters, DEL and all non-ASCII bytes are always escaped.
        if (ch < 32 || ch >= 127) return true;
        return " %$&+,/:;=?@<>#".indexOf(ch) >= 0;
    }
+9
Jul 26 '10 at 7:07

You can use the multi-argument constructors of the URI class. From the URI javadoc:

The multi-argument constructors quote illegal characters as required by the components in which they appear. The percent character ('%') is always quoted by these constructors. Any other characters are preserved.

So if you use

 URI uri = new URI("http", "www.google.com?q=a b", null); 

Then you get http:www.google.com?q=a%20b , which is not entirely correct, but a little closer.

If you know that your string will not contain URL fragments (for example, http://example.com/page#anchor ), you can use the following code to get what you want:

    String s = "http://www.google.com?q=a b";
    String[] parts = s.split(":", 2);
    URI uri = new URI(parts[0], parts[1], null);

To be safe, you should scan the string for # characters, but this should get you started.
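A sketch of that idea with the # scan added, assuming the first # in the string starts the fragment. `SchemeSplitDemo` and `encode` are illustrative names, not part of any library.

```java
import java.net.URI;
import java.net.URISyntaxException;

class SchemeSplitDemo {
    // Splits "scheme:rest#fragment" by hand and lets the three-argument
    // URI constructor quote illegal characters in each part.
    static URI encode(String s) {
        String fragment = null;
        int hash = s.indexOf('#');
        if (hash >= 0) {
            fragment = s.substring(hash + 1);
            s = s.substring(0, hash);
        }
        String[] parts = s.split(":", 2);
        if (parts.length < 2) {
            throw new IllegalArgumentException("No scheme in: " + s);
        }
        try {
            return new URI(parts[0], parts[1], fragment);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot encode: " + s, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encode("http://www.google.com?q=a b"));
        System.out.println(encode("http://example.com/my page#anchor"));
    }
}
```

Because everything after the scheme is treated as one scheme-specific part, reserved characters such as ?, & and = are left alone; only characters that are illegal anywhere in a URI (plus %) get escaped.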

+4
Feb 21 '09 at 15:41

I had similar problems for one of my projects to create a URI from a string. I could not find a single clean solution. Here is what I came up with:

    public static URI encodeURL(String url) throws MalformedURLException, URISyntaxException {
        URL urlLink = new URL(url);
        // Use the URL's own scheme rather than hard-coding "http", so https URLs survive.
        return new URI(urlLink.getProtocol(), urlLink.getHost(), urlLink.getPath(),
                urlLink.getQuery(), urlLink.getRef());
    }

Alternatively, you can use the following URI constructor if you need to specify the port:

 URI uri = new URI(scheme, userInfo, host, port, path, query, fragment); 
+4
Jan 19 2018-12-12T00:

Well, I tried using

 String converted = URLDecoder.decode("toconvert","UTF-8"); 

Hope this is what you were really looking for?

+3
Jul 12 '12 at 8:22

The java.net blog had a class the other day that might do what you want (but the site is down right now, so I can't check).

This code here can probably be modified to do what you want:

http://svn.apache.org/repos/asf/incubator/shindig/trunk/java/common/src/main/java/org/apache/shindig/common/uri/UriBuilder.java

Here is the one I was thinking of on java.net: https://urlencodedquerystring.dev.java.net/

+1
Feb 21 '09 at 16:04

Or maybe you can use this class:

http://developer.android.com/reference/java/net/URLEncoder.html

It has been present in Android since API level 1.

Annoyingly, however, it treats spaces specially (replacing them with + instead of %20). To get around this, we simply use this snippet:

URLEncoder.encode(value, "UTF-8").replace("+", "%20");
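A short runnable sketch of the snippet, applied to a single query value rather than to the whole URL. The wrapper names `QueryValueEncoder` and `encodeValue` are made up for the example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class QueryValueEncoder {
    // Form-encodes one value, then swaps the '+' that URLEncoder uses for
    // spaces with %20. A literal '+' in the input is already %2B by then,
    // so the replace cannot corrupt it.
    static String encodeValue(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Encode only the value; the rest of the URL is assembled as-is.
        System.out.println("http://www.google.com?q=" + encodeValue("a b"));
    }
}
```

Passing the entire URL through this method would reproduce the problem from the question, since the : and / of the scheme would be escaped as well.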

+1
Jan 12 '11 at 20:15

I ended up using httpclient-4.3.6:

    import org.apache.http.client.utils.URIBuilder;

    public class UriBuilderExample {
        public static void main(String[] args) {
            URIBuilder uri = new URIBuilder();
            uri.setScheme("http")
               .setHost("www.example.com")
               .setPath("/somepage.php")
               .setParameter("username", "Hello Günter")
               .setParameter("p1", "parameter 1");
            System.out.println(uri.toString());
        }
    }

The output will be:

  http://www.example.com/somepage.php?username=Hello+G%C3%BCnter&p1=parameter+1 
0
Feb 12 '15 at 4:51


