Regular expression to search for the format "lastname, firstname middlename"

I am trying to match the format "abc, def g", which is how a name is written as "lastname, firstname middlename". I think regex is the most suitable tool, but I have no experience with it. I worked through some regular expression tutorials and tried a few expressions, but had no luck. Another point: there may be more than one space between the words.

This is what I tried, but it does not work:

(([A-Z][,]\s?)*([A-Z][a-z]+\s?)+([A-Z]\s?[a-z]*)*)

I need help! Any idea how I can match only the format above?

Thanks!

ANSWER

In the end I used:

 ([A-Za-z]+),\\s*([A-Za-z]+)\\s*([A-Za-z]+) 

Thanks everyone for the suggestions.

+7
java regex
5 answers

Your sample input is "lastname, firstname middlename". For that format you can use the following regexp to extract the last name, first name, and middle name. It allows several spaces between the parts, accepts both upper-case and lower-case letters, and requires all three parts to be present:

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String input = "Lastname, firstname middlename";
    String regexp = "([A-Za-z]+),\\s+([A-Za-z]+)\\s+([A-Za-z]+)";
    Pattern pattern = Pattern.compile(regexp);
    Matcher matcher = pattern.matcher(input);
    // group() throws IllegalStateException if find() did not succeed, so check it first
    if (matcher.find()) {
        System.out.println("Lastname  : " + matcher.group(1));
        System.out.println("Firstname : " + matcher.group(2));
        System.out.println("Middlename: " + matcher.group(3));
    }

Short description:

    ([A-Za-z]+)   First capture group: matches one or more letters to extract the last name
    ,\\s+         The capture group is followed by a comma and one or more whitespace characters
    ([A-Za-z]+)   Second capture group: matches one or more letters to extract the first name
    \\s+          The capture group is followed by one or more whitespace characters
    ([A-Za-z]+)   Third capture group: matches one or more letters to extract the middle name

This only works if your names contain nothing but the letters A-Z and a-z. Perhaps you should match the characters more permissively:

    String input = "Müller, firstname middlename";
    String regexp = "(.+),\\s+(.+)\\s+(.+)";

This matches any characters for the last name, first name, and middle name.

If the spaces are optional (only the first occurrence, after the comma, can be optional, otherwise we cannot tell the first name from the middle name), then use * instead of + :

    String input = "Müller,firstname middlename";
    String regexp = "(.+),\\s*(.+)\\s+(.+)";
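To see how the permissive pattern behaves, here is a minimal sketch that runs it against both sample inputs. The class name `PermissiveNameDemo` and the helper `parse` are made up for illustration, and the sketch assumes the whole input should fit the pattern, so it uses `matches()` rather than `find()`.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PermissiveNameDemo {
    // Same pattern as above: any characters for each part,
    // optional whitespace after the comma.
    private static final Pattern NAME = Pattern.compile("(.+),\\s*(.+)\\s+(.+)");

    // Hypothetical helper: returns {last, first, middle},
    // or null if the input does not match.
    static String[] parse(String input) {
        Matcher m = NAME.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] inputs = {
            "Müller, firstname middlename",
            "Müller,firstname middlename"
        };
        for (String input : inputs) {
            // Both inputs print [Müller, firstname, middlename]
            System.out.println(input + " -> " + Arrays.toString(parse(input)));
        }
    }
}
```

Because `.+` also matches spaces, an input with more than two words after the comma still matches: everything up to the last run of whitespace ends up in the first-name group. Whether that is acceptable depends on your data.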

As @Elliott mentions, there may be other possibilities, such as using String.split() or String.indexOf() with String.substring() - regular expressions are often more flexible, but harder to maintain, especially for complex expressions.

In any case, write unit tests covering as many inputs as possible (including invalid ones) so that you can verify that your algorithm still works after you change it.
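As a rough illustration of such tests, the sketch below checks the letters-only pattern from the first example against a handful of valid and invalid inputs. It is plain Java without a test framework; the class name `NamePatternCheck`, the helper `isValid`, and the chosen inputs are assumptions for illustration. In a real project you would probably put the same cases into your usual test framework.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

public class NamePatternCheck {
    // Letters-only pattern from the first example in this answer.
    static final Pattern NAME =
            Pattern.compile("([A-Za-z]+),\\s+([A-Za-z]+)\\s+([A-Za-z]+)");

    // Hypothetical helper: true only if the whole input has the expected shape.
    static boolean isValid(String input) {
        return NAME.matcher(input).matches();
    }

    public static void main(String[] args) {
        // Input -> expected result; invalid cases matter as much as valid ones.
        Map<String, Boolean> cases = new LinkedHashMap<>();
        cases.put("Lastname, firstname middlename", true);
        cases.put("Lastname,   firstname    middlename", true); // several spaces
        cases.put("Lastname firstname middlename", false);      // comma missing
        cases.put("Lastname, firstname", false);                // middle name missing
        cases.put("Müller, firstname middlename", false);       // letter outside A-Za-z
        cases.put("", false);                                   // empty input

        for (Map.Entry<String, Boolean> c : cases.entrySet()) {
            boolean actual = isValid(c.getKey());
            if (actual != c.getValue()) {
                throw new AssertionError("Unexpected result for: '" + c.getKey() + "'");
            }
        }
        System.out.println("All " + cases.size() + " cases passed");
    }
}
```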

+3

I would try to avoid a complex regex and use String.substring() and indexOf() instead. That is, something like

    String name = "Last, First Middle";
    int comma = name.indexOf(',');
    int lastSpace = name.lastIndexOf(' ');
    String lastName = name.substring(0, comma);
    // trim() copes with any number of spaces after the comma or before the middle name
    String firstName = name.substring(comma + 1, lastSpace).trim();
    String middleName = name.substring(lastSpace + 1);
    System.out.printf("first='%s' middle='%s' last='%s'%n", firstName, middleName, lastName);

Output

 first='First' middle='Middle' last='Last' 
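The snippet above assumes that the comma and a later space are always present. As a sketch of how it could be hardened for messier input, the hypothetical helper `parse` below (the class name `IndexOfNameDemo` is also made up) returns null instead of throwing when a part is missing. It only treats the space character as a separator, not tabs.

```java
import java.util.Arrays;

public class IndexOfNameDemo {
    // Hypothetical helper: returns {last, first, middle}, or null when the
    // comma, the first name, or the middle name is missing.
    static String[] parse(String name) {
        String trimmed = name.trim();
        int comma = trimmed.indexOf(',');
        int lastSpace = trimmed.lastIndexOf(' ');
        // Need a non-empty last name and a space somewhere after the comma.
        if (comma < 1 || lastSpace <= comma + 1) {
            return null;
        }
        String last = trimmed.substring(0, comma).trim();
        String first = trimmed.substring(comma + 1, lastSpace).trim();
        String middle = trimmed.substring(lastSpace + 1);
        if (first.isEmpty()) {
            return null; // e.g. "Last,  Middle"
        }
        return new String[] { last, first, middle };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("Last,   First    Middle")));
        System.out.println(Arrays.toString(parse("Last, First")));
    }
}
```

The first call prints `[Last, First, Middle]` and the second prints `null`.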
+6

As an alternative to directly matching lastname, firstname middlename you can use String.split and provide a regular expression that matches the delimiters. For example:

    import java.util.Arrays;

    static String[] lastFirstMiddle(String input) {
        // Split on any run of commas and whitespace
        String[] result = input.split("[,\\s]+");
        System.out.println(Arrays.asList(result));
        return result;
    }

I tested this with the inputs

    "Müller, firstname middlename"
    "Müller,firstname middlename"
    "O'Gara, Ronan Ramón"

Note: this approach fails with last names that contain spaces, for example "van der Heuvel", "de Valera", "mac Piarais", or "bin Laden". Then again, the OP's specification does not seem to allow spaces in the last name, or in the other names for that matter: I work with a "Mary Kate", and that is her first name, not a first and a middle name. There is an interesting page about personal names at http://www.w3.org/International/questions/qa-personal-names
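If last names with spaces do need to be supported, one possible variation (not something the question asks for) is to cut the input at the first comma and only then split the remainder on whitespace. The class name `SplitOnCommaDemo` is hypothetical, and this version of `lastFirstMiddle` returns null when the input does not have exactly two given names.

```java
import java.util.Arrays;

public class SplitOnCommaDemo {
    // Hypothetical variant: cut at the first comma, then split the rest on whitespace.
    static String[] lastFirstMiddle(String input) {
        String[] lastAndRest = input.split(",", 2);
        if (lastAndRest.length < 2) {
            return null; // no comma at all
        }
        String last = lastAndRest[0].trim();
        String[] given = lastAndRest[1].trim().split("\\s+");
        if (last.isEmpty() || given.length != 2) {
            return null; // need exactly a first and a middle name
        }
        return new String[] { last, given[0], given[1] };
    }

    public static void main(String[] args) {
        String input = "de Valera, Eamon Edward";
        // Splitting on commas and whitespace together yields four parts:
        System.out.println(Arrays.toString(input.split("[,\\s]+")));
        // prints [de, Valera, Eamon, Edward]

        // Cutting at the comma first keeps the last name together:
        System.out.println(Arrays.toString(lastFirstMiddle(input)));
        // prints [de Valera, Eamon, Edward]
    }
}
```

A multi-word first name such as "Mary Kate" is still not handled. With this input format there is no way to tell it apart from a first and a middle name.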

+1
 ^([a-zA-Z]+)\s*,\s*([a-zA-Z]+)\s+([a-zA-Z]+)$ 

I think this is what you are looking for. Use the capture groups to get the parts you need. See the demo:

http://regex101.com/r/hQ1rP0/6
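Since the question is tagged java, here is a minimal sketch of how that regex could be used from Java. The class name `AnchoredNameDemo` and the helper `groups` are made up for illustration. The one thing to watch is that every backslash has to be doubled inside a Java string literal.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AnchoredNameDemo {
    // The regex above as a Java string literal: \s becomes \\s.
    static final Pattern NAME =
            Pattern.compile("^([a-zA-Z]+)\\s*,\\s*([a-zA-Z]+)\\s+([a-zA-Z]+)$");

    // Hypothetical helper: returns {last, first, middle}, or null on no match.
    static String[] groups(String input) {
        Matcher m = NAME.matcher(input);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        // Spaces on either side of the comma are accepted.
        System.out.println(Arrays.toString(groups("Last , First   Middle")));
        // The anchors reject a trailing extra word.
        System.out.println(Arrays.toString(groups("Last, First Middle Extra")));
    }
}
```

The first call prints `[Last, First, Middle]` and the second prints `null`.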

0

I think this one works and is a little shorter than yours:

 ([A-Z][a-z]*)(?:,\s*)?

Demo

Or you can use split with this regex:

 (,?\s+) 
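A short sketch of the split variant in Java, with the backslash doubled for the string literal. The class name `SplitDelimiterDemo` and the helper `parts` are made up for illustration.

```java
import java.util.Arrays;

public class SplitDelimiterDemo {
    // Splits on an optional comma followed by one or more whitespace characters.
    static String[] parts(String input) {
        return input.split("(,?\\s+)");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parts("Last,   First    Middle")));
        // prints [Last, First, Middle]

        // A comma with no whitespace after it is not a delimiter here:
        System.out.println(Arrays.toString(parts("Last,First Middle")));
        // prints [Last,First, Middle]  (two elements)
    }
}
```

So this delimiter only works when the comma is always followed by at least one whitespace character.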
0
