What is the best way to check if a String contains only certain characters?

I have this problem: I have a String, but I need to make sure that it contains only the letters A-Z and the digits 0-9. Here is my current code:

    boolean valid = true;
    for (char c : string.toCharArray()) {
        int type = Character.getType(c);
        if (type == 2 || type == 1 || type == 9) {
            // the character is either a letter or a digit
        } else {
            valid = false;
            break;
        }
    }

But what is the best and most effective way to implement it?

+8
java string regex char
8 answers

Since no one else was worried about the "fastest," here is my contribution:

    boolean valid = true;
    char[] a = s.toCharArray();
    for (char c : a) {
        valid = ((c >= 'a') && (c <= 'z'))
                || ((c >= 'A') && (c <= 'Z'))
                || ((c >= '0') && (c <= '9'));
        if (!valid) {
            break;
        }
    }
    return valid;

Full test code below:

    // Requires: import java.util.regex.Pattern;

    public static void main(String[] args) {
        String[] testStrings = {"abcdefghijklmnopqrstuvwxyz0123456789", "", "00000",
                "abcdefghijklmnopqrstuvwxyz0123456789&", "1", "q", "test123", "(#*$))&v",
                "ABC123", "hello", "supercalifragilisticexpialidocious"};

        long startNanos = System.nanoTime();
        for (String testString : testStrings) {
            isAlphaNumericOriginal(testString);
        }
        System.out.println("Time for isAlphaNumericOriginal: " + (System.nanoTime() - startNanos) + " ns");

        startNanos = System.nanoTime();
        for (String testString : testStrings) {
            isAlphaNumericFast(testString);
        }
        System.out.println("Time for isAlphaNumericFast: " + (System.nanoTime() - startNanos) + " ns");

        startNanos = System.nanoTime();
        for (String testString : testStrings) {
            isAlphaNumericRegEx(testString);
        }
        System.out.println("Time for isAlphaNumericRegEx: " + (System.nanoTime() - startNanos) + " ns");

        startNanos = System.nanoTime();
        for (String testString : testStrings) {
            isAlphaNumericIsLetterOrDigit(testString);
        }
        System.out.println("Time for isAlphaNumericIsLetterOrDigit: " + (System.nanoTime() - startNanos) + " ns");
    }

    private static boolean isAlphaNumericOriginal(String s) {
        boolean valid = true;
        for (char c : s.toCharArray()) {
            int type = Character.getType(c);
            if (type == 2 || type == 1 || type == 9) {
                // the character is either a letter or a digit
            } else {
                valid = false;
                break;
            }
        }
        return valid;
    }

    private static boolean isAlphaNumericFast(String s) {
        boolean valid = true;
        char[] a = s.toCharArray();
        for (char c : a) {
            valid = ((c >= 'a') && (c <= 'z'))
                    || ((c >= 'A') && (c <= 'Z'))
                    || ((c >= '0') && (c <= '9'));
            if (!valid) {
                break;
            }
        }
        return valid;
    }

    private static boolean isAlphaNumericRegEx(String s) {
        return Pattern.matches("[\\dA-Za-z]+", s);
    }

    private static boolean isAlphaNumericIsLetterOrDigit(String s) {
        boolean valid = true;
        for (char c : s.toCharArray()) {
            if (!Character.isLetterOrDigit(c)) {
                valid = false;
                break;
            }
        }
        return valid;
    }

Produces this output for me:

    Time for isAlphaNumericOriginal: 164960 ns
    Time for isAlphaNumericFast: 18472 ns
    Time for isAlphaNumericRegEx: 1978230 ns
    Time for isAlphaNumericIsLetterOrDigit: 110315 ns
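
One caveat on these numbers: each method is timed on a single, un-warmed pass over eleven short strings, so the figures probably include one-time costs such as class loading and JIT compilation rather than steady-state speed alone. The sketch below repeats the measurement after a warm-up phase. The class name `WarmedTiming`, the helper `averageNanos`, and the round count are assumptions made for illustration, not part of the original test, and a dedicated harness such as JMH would be more reliable still.

```java
import java.util.function.Predicate;
import java.util.regex.Pattern;

class WarmedTiming {

    // Same range check as isAlphaNumericFast in the test code.
    static boolean isAlphaNumericFast(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean ok = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9');
            if (!ok) {
                return false;
            }
        }
        return true;
    }

    // Same regex call as isAlphaNumericRegEx in the test code.
    static boolean isAlphaNumericRegEx(String s) {
        return Pattern.matches("[\\dA-Za-z]+", s);
    }

    // Hypothetical helper: average nanoseconds per call, measured after a warm-up phase.
    // Expects rounds > 0 and a non-empty inputs array.
    static long averageNanos(Predicate<String> check, String[] inputs, int rounds) {
        int matches = 0;
        for (int r = 0; r < rounds; r++) {      // warm-up pass, not timed
            for (String s : inputs) {
                if (check.test(s)) matches++;
            }
        }
        long start = System.nanoTime();
        for (int r = 0; r < rounds; r++) {      // timed pass
            for (String s : inputs) {
                if (check.test(s)) matches++;
            }
        }
        long elapsed = System.nanoTime() - start;
        if (matches < 0) {                      // never true; keeps the results in use
            System.out.println(matches);
        }
        return elapsed / ((long) rounds * inputs.length);
    }

    public static void main(String[] args) {
        String[] inputs = {"abcdefghijklmnopqrstuvwxyz0123456789", "00000",
                "test123", "(#*$))&v", "ABC123"};
        System.out.println("fast:  "
                + averageNanos(WarmedTiming::isAlphaNumericFast, inputs, 100_000) + " ns/call");
        System.out.println("regex: "
                + averageNanos(WarmedTiming::isAlphaNumericRegEx, inputs, 100_000) + " ns/call");
    }
}
```

Note that the two checks disagree on the empty string: the loop accepts it, while the `+` quantifier in the regex rejects it.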
+11

If you want to avoid regex, then the Character class can help:

    boolean valid = true;
    for (char c : string.toCharArray()) {
        if (!Character.isLetterOrDigit(c)) {
            valid = false;
            break;
        }
    }

If you need uppercase letters only, use this condition instead:

 if(!((Character.isLetter(c) && Character.isUpperCase(c)) || Character.isDigit(c))) 
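
Be aware that the `Character` methods work on Unicode categories, so both conditions above accept more than A-Z and 0-9. A short sketch of the difference follows; the class and method names are made up for illustration.

```java
class UnicodeVsAscii {

    // The condition from this answer: uppercase letters or digits, in the Unicode sense.
    static boolean upperOrDigitUnicode(String s) {
        for (char c : s.toCharArray()) {
            if (!((Character.isLetter(c) && Character.isUpperCase(c)) || Character.isDigit(c))) {
                return false;
            }
        }
        return true;
    }

    // Explicit ASCII ranges: only 'A'..'Z' and '0'..'9' pass.
    static boolean upperOrDigitAscii(String s) {
        for (char c : s.toCharArray()) {
            if (!((c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9'))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String accented = "\u00C9T\u00C9";      // E-acute, T, E-acute: uppercase but not all in A-Z
        String arabicDigits = "\u0661\u0662";   // Arabic-Indic digits one and two

        System.out.println(upperOrDigitUnicode(accented));      // true
        System.out.println(upperOrDigitAscii(accented));        // false
        System.out.println(upperOrDigitUnicode(arabicDigits));  // true
        System.out.println(upperOrDigitAscii(arabicDigits));    // false
    }
}
```

If the requirement is strictly A-Z and 0-9, the explicit range comparison is the safer of the two.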
+9

You can use Apache Commons Lang:

 StringUtils.isAlphanumeric(String) 
+3

In addition to all the other answers, here's the Guava approach:

 boolean valid = CharMatcher.JAVA_LETTER_OR_DIGIT.matchesAllOf(string); 

Learn more about CharMatcher: https://code.google.com/p/guava-libraries/wiki/StringsExplained#CharMatcher

+3

Use a regex:

 Pattern.matches("[\\dA-Z]+", string) 

[\\dA-Z]+ : at least one occurrence (+) of a digit or a capital letter.

To allow lowercase letters as well, replace [\\dA-Z]+ with [\\dA-Za-z]+ .
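
`Pattern.matches` compiles the expression on every call. If the check runs often, compiling once and reusing the `Pattern` is a common refinement. A minimal sketch, with a hypothetical class name `UpperAlnumCheck`:

```java
import java.util.regex.Pattern;

class UpperAlnumCheck {

    // Compiled once; Pattern instances are immutable and safe to share between threads.
    private static final Pattern UPPER_OR_DIGIT = Pattern.compile("[\\dA-Z]+");

    static boolean isValid(String s) {
        // matches() requires the whole input to match, just like Pattern.matches.
        return UPPER_OR_DIGIT.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("ABC123")); // true
        System.out.println(isValid("abc123")); // false: lowercase letters
        System.out.println(isValid(""));       // false: + needs at least one character
    }
}
```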

+2

The following method is not as quick to implement as the regular expression, but I think it is one of the most efficient solutions at run time, because it uses bitwise operations, which are very fast.

My solution is harder to read and maintain, but I think it is another simple way to do what you want.

A good way to verify that a string contains only digits or capital letters is a simple 128-bit bitmask (two longs) representing the ASCII table.

So, for the standard ASCII table, the mask has a 1 bit for every character that we want to accept (bits 48 to 57 and bits 65 to 90).

So you can check whether a char is:

  • A digit, with this mask: 0x3FF000000000000L (if the character code is < 65)
  • An uppercase letter, with this mask: 0x3FFFFFFL, applied to the character code minus 65 (if the character code is >= 65)

Thus, the following method should work:

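    public boolean validate(String aString) {
        for (int i = 0; i < aString.length(); i++) {
            char c = aString.charAt(i);
            // Reject c > 'Z' first: a long shift uses only the low 6 bits of its count,
            // so without this check a code such as 193 would alias to 'A' (193 - 65 = 128).
            if (c > 'Z'
                    || (c <= 64 && (0x3FF000000000000L & (1L << c)) == 0)
                    || (c > 64 && (0x3FFFFFFL & (1L << (c - 65))) == 0)) {
                return false;
            }
        }
        return true;
    }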
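
As a sanity check on the two constants used above, the following sketch rebuilds them bit by bit from the character ranges. The class and method names are hypothetical and exist only for this illustration.

```java
class MaskDerivation {

    // Set bit c for every digit character '0'..'9' (codes 48..57).
    static long digitMask() {
        long mask = 0L;
        for (char c = '0'; c <= '9'; c++) {
            mask |= 1L << c;
        }
        return mask;
    }

    // Set bit (c - 65) for every letter 'A'..'Z' (codes 65..90), i.e. bits 0..25.
    static long upperMask() {
        long mask = 0L;
        for (char c = 'A'; c <= 'Z'; c++) {
            mask |= 1L << (c - 'A');
        }
        return mask;
    }

    public static void main(String[] args) {
        System.out.println(digitMask() == 0x3FF000000000000L); // true
        System.out.println(upperMask() == 0x3FFFFFFL);         // true
    }
}
```

Subtracting 65 for the letters is what lets the second range fit into a single `long`, since a `long` shift can only address bits 0 to 63.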
+2

For maintainability and simplicity, the best option is the regex already posted. Once you are familiar with the technique, you know what to expect, and it is very easy to extend the criteria if necessary. The drawback is performance.

The fastest way is the char-array approach. Checking whether a character's numeric value falls in the desired ASCII ranges A-Z and 0-9 is extremely fast, but maintainability suffers and the simplicity is gone.

You could also use a Java 7 switch over each char, but that is as bad as the second approach.

In the end, since we are talking about Java, I highly recommend using regular expressions.

+1

StringUtils in Apache Commons Lang 3 has a containsOnly method: https://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/StringUtils.html

The implementation should be fast enough.
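
For readers without the library on the classpath, the idea behind such a method can be sketched in plain Java. This is a hypothetical stand-in written for illustration (the class name, the method body, and the `ALLOWED` constant are assumptions), not the Commons Lang implementation.

```java
class ContainsOnlySketch {

    // The characters the question wants to allow.
    static final String ALLOWED = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

    // True if every character of s appears in allowed; an empty s passes trivially.
    static boolean containsOnly(String s, String allowed) {
        for (int i = 0; i < s.length(); i++) {
            if (allowed.indexOf(s.charAt(i)) < 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(containsOnly("ABC123", ALLOWED)); // true
        System.out.println(containsOnly("abc123", ALLOWED)); // false
    }
}
```

With the library available, the equivalent call would presumably be `StringUtils.containsOnly(string, "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789")`.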

0
