Removing unwanted characters from a phone number string

Question

I am looking for a regex that captures a phone number and removes the unnecessary characters.

    import re

    strs = 'dsds +48 124 cat cat cat245 81243!!'
    match = re.search(r'.[ 0-9\+\-\.\_]+', strs)
    if match:
        print 'found', match.group()  ## 'found word:cat'
    else:
        print 'did not find'

It returns only:

 +48 124

How can I return the whole number?

+4

python regex

Efrin Jun 20 '12 at 11:25

3 answers

The problem with re.sub() is that it leaves extra spaces in the resulting phone number string. A non-regex approach that returns the phone number without any spaces:

    >>> strs = 'dsds +48 124 cat cat cat245 81243!!'
    >>> ''.join(x for x in strs if x.isdigit() or x == '+')
    '+4812424581243'
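The same filter can be wrapped in a small function for reuse. This is only a sketch: the name `normalize_phone` is hypothetical and does not come from the answer, and it keeps nothing but digits and the plus sign.

```python
def normalize_phone(raw):
    """Return only the digits and '+' characters of raw, in their original order.

    normalize_phone is a hypothetical helper name used for illustration.
    """
    return ''.join(ch for ch in raw if ch.isdigit() or ch == '+')


strs = 'dsds +48 124 cat cat cat245 81243!!'
cleaned = normalize_phone(strs)
print(cleaned)
```

Note that in Python 3 `str.isdigit()` also accepts some non-ASCII digit characters, so input containing those would need a stricter test such as `ch in '0123456789'`.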

+4

Burhan khalid Jun 20 '12 at 11:48

This is what I use to replace every run of non-digit characters with a single hyphen, and it seems to work for me:

    # convert sequences of non-digits to a single hyphen
    fixed_phone = re.sub(r"[^\d]+", "-", raw_phone)
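Applied to the sample string from the question, this substitution also produces a leading and a trailing hyphen and drops the plus sign. The sketch below shows that, with an optional `strip("-")` step that is an addition for illustration and not part of the answer; `trimmed_phone` is a hypothetical name.

```python
import re

raw_phone = 'dsds +48 124 cat cat cat245 81243!!'

# Every run of non-digit characters becomes one hyphen,
# including the runs at the start and end of the string.
fixed_phone = re.sub(r"[^\d]+", "-", raw_phone)

# Optional extra step (an assumption, not in the answer): trim the outer hyphens.
trimmed_phone = fixed_phone.strip("-")

print(fixed_phone)
print(trimmed_phone)
```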

0

Rufusvs Dec 11 '15 at 4:21

Tim pietzcker · Accepted Answer · 2012-06-20T11:28:53+0000

You want to use sub(), not search():

    >>> strs = 'dsds +48 124 cat cat cat245 81243!!'
    >>> re.sub(r"[^0-9+._ -]+", "", strs)
    ' +48 124   245 81243'

[^0-9+._ -] is a negated character class. The ^ sign is significant here: the expression means "match any character that is not a digit, a plus, a dot, an underscore, a space, or a dash."

+ tells the regex engine to match one or more instances of the previous token.
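To make the difference concrete, here is a short sketch comparing search(), findall(), and sub() on the same input. The narrower classes `[0-9+]` and `[^0-9+]` are assumptions chosen for illustration (they also discard spaces, dots, underscores, and dashes); they are not the pattern used in the answer above.

```python
import re

strs = 'dsds +48 124 cat cat cat245 81243!!'

# search() returns only the first run of matching characters.
first = re.search(r"[0-9+]+", strs).group()

# findall() returns every run, which still has to be joined.
pieces = re.findall(r"[0-9+]+", strs)

# sub() with a negated class deletes everything else in one pass.
compact = re.sub(r"[^0-9+]+", "", strs)

print(first)
print(pieces)
print(compact)
```

Whether to keep separators such as spaces or dashes depends on how the number will be used afterwards; keep them in the character class if the original grouping matters.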
