Removing all non-numeric characters from a string in Python

Question

How do I remove all non-numeric characters from a string in Python?

+109

python numbers

grizzley Aug 08 '09 at 17:13

8 answers

Not sure if this is the most efficient way, but:

    >>> ''.join(c for c in "abc123def456" if c.isdigit())
    '123456'

The ''.join part means "combine all the resulting characters, with nothing in between". The rest is a list comprehension (strictly speaking a generator expression, since there are no brackets) where, as you probably guessed, we keep only the characters of the string that satisfy the isdigit condition.
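One caveat with this approach: str.isdigit() is true for more than the ASCII characters 0-9 (superscript digits and digits from other scripts also count). Here is a minimal sketch, assuming Python 3; the helper names digits_only and ascii_digits_only are made up for illustration and are not part of the answer above.

```python
def digits_only(s):
    """Keep every character for which str.isdigit() is true."""
    return ''.join(c for c in s if c.isdigit())


def ascii_digits_only(s):
    """Keep only the ASCII digits 0-9 (hypothetical stricter variant)."""
    return ''.join(c for c in s if c in '0123456789')


print(digits_only("abc123def456"))       # 123456
print(digits_only("x\u00b2 = 4"))        # superscript two is kept: ²4
print(ascii_digits_only("x\u00b2 = 4"))  # 4
```

If the result is going to be passed to int() or stored as a plain number, the stricter variant is probably the safer choice.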

+75

Mark Rushakoff Aug 08 '09 at 17:16

This should work for strings and Unicode objects:

    # python <3.0
    def only_numerics(seq):
        return filter(type(seq).isdigit, seq)

    # python ≥3.0
    def only_numerics(seq):
        seq_type = type(seq)
        return seq_type().join(filter(seq_type.isdigit, seq))
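Note that in Python 3, iterating over a bytes object yields integers, so the ≥3.0 version above works for str but, as far as I can tell, not for bytes. A possible sketch that handles both, assuming Python 3 (the name only_numerics_py3 is hypothetical):

```python
def only_numerics_py3(seq):
    """Return only the digit characters of a str, bytes or bytearray."""
    if isinstance(seq, (bytes, bytearray)):
        # Iterating over bytes yields ints, so test membership in the
        # ASCII digit bytes and rebuild an object of the same type.
        return type(seq)(b for b in seq if b in b"0123456789")
    return "".join(filter(str.isdigit, seq))


print(only_numerics_py3("abc123def456"))   # 123456
print(only_numerics_py3(b"abc123def456"))  # b'123456'
```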

+13

tzot Sep 07 '09 at 3:01

The fastest approach, if you need to perform more than one or two such removal operations (or even just one, but on a very long string!), is to rely on the translate method of strings, even though it needs some preparation:

    >>> import string
    >>> allchars = ''.join(chr(i) for i in xrange(256))
    >>> identity = string.maketrans('', '')
    >>> nondigits = allchars.translate(identity, string.digits)
    >>> s = 'abc123def456'
    >>> s.translate(identity, nondigits)
    '123456'
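The session above uses Python 2 names (xrange, string.maketrans). A rough Python 3 equivalent for byte strings might look like the sketch below; it assumes the data is a bytes object, and the names DIGITS, NON_DIGITS and strip_non_digits are illustrative only.

```python
import string

# The ASCII digits as bytes, and every other byte value (built once, reused).
DIGITS = string.digits.encode("ascii")
NON_DIGITS = bytes(range(256)).translate(None, DIGITS)


def strip_non_digits(data):
    """Delete every non-digit byte from a bytes object."""
    # Passing None as the table means "no remapping, only deletion".
    return data.translate(None, NON_DIGITS)


print(strip_non_digits(b"abc123def456"))  # b'123456'
```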

The translate method works differently, and is perhaps simpler to use, on Unicode strings than on byte strings, btw:

    >>> unondig = dict.fromkeys(xrange(65536))
    >>> for x in string.digits: del unondig[ord(x)]
    ...
    >>> s = u'abc123def456'
    >>> s.translate(unondig)
    u'123456'

You might want to use a mapping class rather than an actual dict, especially if your Unicode string may contain characters with very high ord values (which would make the dict excessively large ;-). For example:

    >>> class keeponly(object):
    ...     def __init__(self, keep):
    ...         self.keep = set(ord(c) for c in keep)
    ...     def __getitem__(self, key):
    ...         if key in self.keep:
    ...             return key
    ...         return None
    ...
    >>> s.translate(keeponly(string.digits))
    u'123456'
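The same mapping-class idea carries over to Python 3, where every str is Unicode. The following is only a sketch under that assumption; KeepOnly and digits_only_translate are illustrative names.

```python
import string


class KeepOnly:
    """Mapping for str.translate that deletes every code point not in keep."""

    def __init__(self, keep):
        self.keep = {ord(c) for c in keep}

    def __getitem__(self, codepoint):
        # str.translate looks up each character's ordinal; returning None
        # deletes the character, returning the ordinal keeps it unchanged.
        return codepoint if codepoint in self.keep else None


def digits_only_translate(s):
    return s.translate(KeepOnly(string.digits))


print(digits_only_translate("abc123def456"))  # 123456
```

Because the lookup is computed on demand, no table covering the whole Unicode range has to be built.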

+5

Alex Martelli Aug 08 '09 at 17:35

To add another option to the mix, the string module has several useful constants. Although they are more useful in other cases, they can be used here too.

    >>> from string import digits
    >>> ''.join(c for c in "abc123def456" if c in digits)
    '123456'

There are several constants in the module, including:

ascii_letters (abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ)
hexdigits (0123456789abcdefABCDEF)

If you use these constants heavily, it may be worthwhile to convert them to a frozenset. That gives O(1) membership tests rather than O(n), where n is the length of the original constant string.

    >>> digits = frozenset(digits)
    >>> ''.join(c for c in "abc123def456" if c in digits)
    '123456'
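The frozenset idea can be wrapped in a small reusable helper. This is just a sketch; keep_chars and ASCII_DIGITS are hypothetical names, and for a ten-character constant the speed difference is likely to be small.

```python
import string

# Built once at import time so each membership test is a hash lookup.
ASCII_DIGITS = frozenset(string.digits)


def keep_chars(s, allowed=ASCII_DIGITS):
    """Return s with every character not in allowed removed."""
    return ''.join(c for c in s if c in allowed)


print(keep_chars("abc123def456"))                                   # 123456
print(keep_chars("abc123def456", frozenset(string.ascii_letters)))  # abcdef
```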

+5

Tim McNamara Sep 07 '12 at 10:37

@Ned Batchelder and @newacct gave the correct answer, but ...

Just in case your string contains commas (,) and a decimal point (.):

    >>> import re
    >>> re.sub(r"[^\d.]", "", "$1,999,888.77")
    '1999888.77'
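If the cleaned string is then needed as a number, something along these lines might work. It is a sketch that assumes the dot is the decimal separator and commas are thousands separators; parse_amount and NOT_DIGIT_OR_DOT are made-up names.

```python
import re
from decimal import Decimal

# Anything that is not a digit or a dot; a raw string keeps "\d" intact.
NOT_DIGIT_OR_DOT = re.compile(r"[^\d.]")


def parse_amount(text):
    """Strip currency symbols and thousands separators, then parse the rest."""
    cleaned = NOT_DIGIT_OR_DOT.sub("", text)
    return Decimal(cleaned)


print(parse_amount("$1,999,888.77"))  # 1999888.77
```

Input with more than one dot (for example "1.2.3") would make Decimal raise decimal.InvalidOperation, so real data may need extra validation.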

0

kennyut Nov 09 '18 at 15:49

I do not have enough reputation to comment, but I tried the solution in the second most popular answer ( https://stackoverflow.com/a/165478/ ). There was a problem with it, so I corrected it. I suppose there should be a "[]" to make it a list comprehension?

    def strip_nonnumerics(s):
        return ''.join([i for i in s if i.isdigit()])
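For what it is worth, str.join accepts any iterable of strings, so on reasonably recent Python versions the brackets should be optional. A quick sketch comparing the two forms (the function names are illustrative):

```python
def strip_nonnumerics_gen(s):
    # Generator expression: no intermediate list is written out explicitly.
    return ''.join(i for i in s if i.isdigit())


def strip_nonnumerics_list(s):
    # List comprehension: builds the list first, then joins it.
    return ''.join([i for i in s if i.isdigit()])


sample = "abc123def456"
print(strip_nonnumerics_gen(sample) == strip_nonnumerics_list(sample))  # True
```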

0

Aster May 17 '19 at 14:55

    user = input()
    print("hello")

-5

GEVANS8 Aug 18 '17 at 9:54

Ned Batchelder · Accepted Answer · 2009-08-08 17:25

    >>> import re
    >>> re.sub("[^0-9]", "", "sdkjh987978asd098as0980a98sd")
    '987978098098098'
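If the substitution runs many times, the pattern can be compiled once and reused. The sketch below assumes Python 3 and uses \D with the re.ASCII flag, which should behave like [^0-9]; NON_DIGIT and remove_non_digits are illustrative names.

```python
import re

# With re.ASCII, \D matches anything other than the characters 0-9.
NON_DIGIT = re.compile(r"\D", re.ASCII)


def remove_non_digits(s):
    """Return s with every character except 0-9 removed."""
    return NON_DIGIT.sub("", s)


print(remove_non_digits("sdkjh987978asd098as0980a98sd"))  # 987978098098098
```

Without re.ASCII, \D on a str pattern treats digits from other scripts as digits too, so they would be kept.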
