I have a solution very similar to Jack's answer:
import re
identifier_pattern = re.compile(r'Identifier: (.*)$')
m = []
with open('huge_file', 'r') as f:
    for line in f:
        # findall returns the captured group for every match on the line
        m.extend(identifier_pattern.findall(line))
You can use another part of the regexp API to get the same result:
import re
identifier_pattern = re.compile(r'Identifier: (.*)$')
m = []
with open('huge_file', 'r') as f:
    for line in f:
        pattern_found = identifier_pattern.search(line)
        if pattern_found:
            # group(1) is the captured value; group(0) would also
            # include the 'Identifier: ' prefix
            value_found = pattern_found.group(1)
            m.append(value_found)
This can be simplified with a generator expression and a list comprehension:
import re
identifier_pattern = re.compile(r'Identifier: (.*)$')
with open('huge_file', 'r') as f:
    # The generator is lazy, so the list must be built while the file is open
    patterns_found = (identifier_pattern.search(line) for line in f)
    m = [pattern_found.group(1)
         for pattern_found in patterns_found if pattern_found]
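To check that the `findall` and `search` variants agree, here is a minimal sketch that runs both over a small in-memory sample. The sample lines and the helper names `with_findall` and `with_search` are made up for illustration; `io.StringIO` only stands in for the real `huge_file`. It also shows why the group index matters: `group(1)` is the captured value, while `group(0)` is the whole match including the prefix.

```python
import io
import re

identifier_pattern = re.compile(r'Identifier: (.*)$')

# Hypothetical sample data standing in for the real file.
sample = io.StringIO(
    "Identifier: abc123\n"
    "some unrelated line\n"
    "Identifier: xyz789\n"
)
lines = sample.readlines()


def with_findall(lines):
    """Collect captured values using findall on each line."""
    found = []
    for line in lines:
        found.extend(identifier_pattern.findall(line))
    return found


def with_search(lines):
    """Collect captured values using search and group(1)."""
    matches = (identifier_pattern.search(line) for line in lines)
    return [match.group(1) for match in matches if match]


via_findall = with_findall(lines)
via_search = with_search(lines)
# group(0) keeps the 'Identifier: ' prefix, so it differs from findall.
whole_match = [match.group(0)
               for match in map(identifier_pattern.search, lines) if match]

print(via_findall)   # ['abc123', 'xyz789']
print(via_search)    # ['abc123', 'xyz789']
print(whole_match)   # ['Identifier: abc123', 'Identifier: xyz789']
```

Note that the trailing newline never ends up in the result: `.` does not match a newline by default, and `$` matches just before a newline at the end of the string.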