Match to a specific pattern using regex

I have a line in a text file containing text as follows:

txt = "java.awt.GridBagLayout.layoutContainer" 

I want to get everything up to the class name "GridBagLayout".

I tried something like the following, but I can't figure out how to get rid of the "."

 txt = re.findall(r'java\S?[^A-Z]*', txt) 

and I get the following: "java.awt."

instead of what I want: "java.awt"

Any directions on how I can fix this?

3 answers

Without using capture groups, you can use a lookahead (the (?= ... ) construct).

java\s?[^A-Z]*(?=\.[A-Z]) should match everything you need. Here it is broken down:

 java          // Literal word "java"
 \s?           // An optional whitespace character (change to \s* if there can be several)
 [^A-Z]*       // Any number of non-capital-letter characters
 (?=\.[A-Z])   // Look ahead for (but don't add to the match) a literal period and a capital letter
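As a quick check, here is a minimal sketch of the lookahead version run against the string from the question. It assumes the character classes are meant to be [A-Z] and uses Python's re module, as in the question; the variable names are only for illustration.

```python
import re

txt = "java.awt.GridBagLayout.layoutContainer"

# (?=\.[A-Z]) must succeed at the end of the match, but the period and the
# capital letter it looks at are not included in the matched text.
lookahead_pattern = re.compile(r'java\s?[^A-Z]*(?=\.[A-Z])')

matches = lookahead_pattern.findall(txt)
print(matches)  # ['java.awt']
```

Because the pattern has no capture group, re.findall() returns the whole match for each hit, so the trailing period is already gone.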

Make your pattern match up to an uppercase letter, and put the part you want in a capture group:

 '(java\S?[^A-Z]*?)\.[A-Z]' 

Everything in the capture group will be what you want.
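A short sketch of how that might look in practice, assuming the same input string as the question and [A-Z] character classes (the variable names are hypothetical):

```python
import re

txt = "java.awt.GridBagLayout.layoutContainer"

# The group captures the package part; the trailing \.[A-Z] is matched
# but stays outside the group.
group_pattern = re.compile(r'(java\S?[^A-Z]*?)\.[A-Z]')

# With exactly one capture group, findall() returns the group contents.
found = group_pattern.findall(txt)

# With search(), group(1) is the captured part and group(0) the full match.
m = group_pattern.search(txt)
package = m.group(1) if m else None

print(found)    # ['java.awt']
print(package)  # java.awt
```

Note that the full match here also contains the period and the first capital letter, so only the group holds the text you are after.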


This gives what you want with re.findall(): (java\S?[^A-Z]*)\.[A-Z]
