Regex to separate% ages and values ​​in python

Hello, I am new to python and regex. I have a large CSV file that has a type field %age compositionthat contains values ​​such as:

'34% passed 23% failed 46% deferred'

How would you split this line so that you get a dictionary object:

{'passed': 34, 'failed': 23, 'deferred': 46} for each row?

I tried this:

for line in csv_lines:
    for match in re.findall('[\d\s%%]*\s', line)

but that required the% age value

0
source share
3 answers

And if you still want to go with regular expressions, you can use this:

(\w+)%\s(\w+)

- (: [0-9a-zA-Z_]+), %, - . Parenthesis .

:

>>> import re
>>> s = '34% passed 23% failed 46% deferred'
>>> pattern = re.compile(r'(\w+)%\s(\w+)')
>>> {value: key for key, value in pattern.findall(s)}
{'failed': '23', 'passed': '34', 'deferred': '46'}
+5

:

>>> s = '34% passed 23% failed 46% deferred'
>>> groups = zip(*[iter(s.split())]*2)
>>> groups
[('34%', 'passed'), ('23%', 'failed'), ('46%', 'deferred')]
>>> {result: int(percent.rstrip('%')) for percent, result in groups}
{'failed': 23, 'passed': 34, 'deferred': 46}

zip(*[iter(..)]*2) grouper - itertools ( . zip (* [iter (s)] * n) Python?):

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return izip_longest(fillvalue=fillvalue, *args)
+3

:

[EDIT: ​​ OP. , alecx : /questions/1554882/regex-to-split-ages-and-values-in-python/4670558#4670558

import re

data = """34% passed 23% failed 46% deferred 34% checked"""
checkList = ['passed', 'failed', 'deferred', 'checked']
result = {k:v for (v, k) in re.findall('(\d{1,3})% (' + '|'.join(checkList) + ')', data)}
print(result) # Python 3
#print result # Python 2.7

Here the regular expression \ d {1,3} is to catch the percentage int and pass | failed | set aside to get type. I use list comprehension to create a list of tuples of keys and values, which I then convert to a dictionary

To build the “passed” line failed | .. 'I use the .join function of the string to join the words from the checklist with the pipe symbol as a separator.

0
source

All Articles