Python regex for hyphenated words

I am looking for a regular expression to match hyphenated words in Python.

The closest I have managed to get is: '\w+-\w+[-\w+]*'

import re
text = "one-hundered-and-three- some text foo-bar some--text"
hyphenated = re.findall(r'\w+-\w+[-\w+]*', text)

which returns the list ['one-hundered-and-three-', 'foo-bar'].

This is almost perfect, except for the trailing hyphen after "three". I only want an additional hyphen if it is followed by a word, i.e. instead of '[-\w+]*' I need something like '(-\w+)*', which I thought would work but does not (it returns ['-three', '']). In other words, I need something that matches a word followed by a hyphen, followed by a word, followed by hyphen-plus-word zero or more times.
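Both attempts can be reproduced with the short script below. It is only a minimal sketch: the variable names with_class and with_group are illustrative and not part of the original code.

```python
import re

text = "one-hundered-and-three- some text foo-bar some--text"

# Character class: matches any run of '-', word characters, and a literal '+',
# so a trailing hyphen is swallowed along with the word.
with_class = re.findall(r'\w+-\w+[-\w+]*', text)

# Capturing group: when the pattern contains one group, findall returns
# the group's last capture for each match instead of the whole match.
with_group = re.findall(r'\w+-\w+(-\w+)*', text)

print(with_class)
print(with_group)
```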

+8
python regex hyphen
1 answer

Try the following:

 re.findall(r'\w+(?:-\w+)+',text) 

Here a hyphenated word is taken to be:

  • one or more word characters
  • followed by one or more repetitions of:
    • a single hyphen
    • followed by one or more word characters
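To check the definition above against the sample text from the question, a minimal sketch follows. The names pattern and hyphenated are just illustrative. The key detail is that (?:...) is a non-capturing group, so re.findall returns each whole match rather than the contents of the group.

```python
import re

text = "one-hundered-and-three- some text foo-bar some--text"

# \w+        one or more word characters
# (?:-\w+)+  one or more repetitions of: a hyphen plus one or more word characters
# The group is non-capturing, so findall returns whole matches.
pattern = re.compile(r'\w+(?:-\w+)+')

hyphenated = pattern.findall(text)
print(hyphenated)  # ['one-hundered-and-three', 'foo-bar']
```

The trailing hyphen after "three" is left out because it is not followed by a word character, and "some--text" is skipped because its first hyphen is followed by another hyphen. Note that \w also matches digits and underscores, so strings such as "a_b-42" count as hyphenated words under this pattern.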
+18