How to run a script for all *.txt files in the current directory?

I am trying to run a script on all *.txt files in the current directory. Currently, it only processes the file test.txt and prints a block of text based on a regular expression. What would be the fastest way to scan the current directory for *.txt files and run the script below on every *.txt file found? Also, how can I include the lines containing the words "word1" and "word3"? The script currently prints only the content between those two words, and I would like to print the whole block.

#!/usr/bin/env python
import os, re
file = 'test.txt'
with open(file) as fp:
    for result in re.findall('word1(.*?)word3', fp.read(), re.S):
        print(result)

I would appreciate any advice or suggestions on how to improve the code above, for example its speed when working on a large set of text files. Thanks.

2 answers

Use glob.glob:

import re
import glob

# Compile once, outside the loop, so the pattern is reused for every file
pattern = re.compile('word1(.*?)word3', flags=re.S)
for file in glob.glob('*.txt'):
    with open(file) as fp:
        for result in pattern.findall(fp.read()):
            print(result)
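
The question also asks how to keep "word1" and "word3" in the output. One way to do that, sketched here on a made-up sample string rather than a real file, is to drop the capture group: with no group in the pattern, findall() returns the whole match, delimiters included. The names sample, inclusive and find_blocks are only illustrative.

```python
import re

# Hypothetical sample text standing in for the contents of one .txt file.
sample = "header\nfoo word1 alpha\nbeta\ngamma word3 bar\nfooter\n"

# No capture group: findall() returns the full match, delimiters included.
inclusive = re.compile(r'word1.*?word3', flags=re.S)

def find_blocks(text):
    """Return every span from 'word1' through the next 'word3', inclusive."""
    return inclusive.findall(text)

blocks = find_blocks(sample)
for block in blocks:
    print(block)
```

The same pattern can be swapped into the glob loop above in place of 'word1(.*?)word3'.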

Inspired by falsetru's answer, I rewrote my code, making it more general.

The files to search can now be:

  • described by a string passed as the second argument, which is handed to glob(),
    or selected by a function written for the purpose, if the desired set of files cannot be described by a glob pattern

  • located in the current directory if the third argument is omitted,
    or in a specified directory if its path is passed as the third argument

import re, glob
from os import getcwd, listdir, path
from inspect import isfunction

# Matches from the start of the line containing word1
# to the end of the line containing word3
regx = re.compile('^[^\n]*word1.*?word3.*?$', re.S|re.M)

G = '\n\n'\
    'MWMWMWMWMWMWMWMWMWMWMWMWMWMWMWMWMWMWMWMWMW\n'\
    'MWMWMW  %s\n'\
    'MWMWMW  %s\n'\
    '%s%s'

def search(REGX, how_to_find_files, dirpath='',
           G=G, sepm='\n======================\n'):
    if dirpath == '':
        dirpath = getcwd()

    if isfunction(how_to_find_files):
        # listdir() returns bare names, so join each one with dirpath
        # before testing it with isfile() and opening it
        gen = (path.join(dirpath, fn) for fn in listdir(dirpath)
               if path.isfile(path.join(dirpath, fn))
               and how_to_find_files(fn))
    elif isinstance(how_to_find_files, str):
        gen = glob.glob(path.join(dirpath,
                                  how_to_find_files))
    else:
        raise TypeError('how_to_find_files must be a function or a string')

    for fn in gen:
        with open(fn) as fp:
            found = REGX.findall(fp.read())
            if found:
                yield G % (dirpath, path.basename(fn),
                           sepm, sepm.join(found))

# Example of searching in .txt files

#============ one use ===================
def select(fn):
    return fn.endswith('.txt')
print(''.join(search(regx, select)))

#============= another use ==============
print(''.join(search(regx, '*.txt')))

search() yields one formatted string per file that contains a match, so the results for several files are combined with ''.join().
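
To see what the line-anchored pattern used in this answer returns, here is a small sketch that applies it to an in-memory string instead of a file. The sample text and the name whole_lines are made up for illustration; only the regular expression comes from the code above.

```python
import re

# Same pattern as in the answer: re.M lets ^ and $ anchor at line
# boundaries, re.S lets . cross newlines between word1 and word3.
regx = re.compile('^[^\n]*word1.*?word3.*?$', re.S | re.M)

# Hypothetical sample text standing in for the contents of one .txt file.
sample = "header\nfoo word1 alpha\nbeta\ngamma word3 bar\nfooter\n"

# Each match runs from the start of the line holding word1
# to the end of the line holding word3.
whole_lines = regx.findall(sample)
for block in whole_lines:
    print(block)
```

Unlike a plain 'word1.*?word3' pattern, this one also keeps the text before word1 and after word3 on their respective lines ("foo " and " bar" in the sample).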
