How to run a script for all *.txt files in the current directory?

Question

I am trying to run a script on all *.txt files in the current directory. Currently, it processes only the file test.txt and prints a block of text based on a regular expression. What would be the fastest way to scan the current directory for *.txt files and run the script below on every *.txt file found? Also, how could I include the lines containing the words "word1" and "word3", since the script currently prints only the content between those two lines? I would like to print the whole block.

#!/usr/bin/env python
import os, re
file = 'test.txt'
with open(file) as fp:
    for result in re.findall('word1(.*?)word3', fp.read(), re.S):
        print result

I would appreciate any advice or suggestions on how to improve the code above, for example its speed when working on a large set of text files. Thanks.

+4

python regex

user3066287 Dec 04 '13 at 14:52

2 answers

Inspired by falsetru's answer, I rewrote my code to make it more general.

Now the files to search can be described either by a string passed as the second argument, which is handed to glob.glob(), or by a function written for the purpose when the set of desired files cannot be expressed as a glob pattern. They are looked for in the current directory if no third argument is passed, or in the specified directory if its path is passed as the third argument.

import re,glob
from itertools import ifilter
from os import getcwd,listdir,path
from inspect import isfunction

regx = re.compile('^[^\n]*word1.*?word3.*?$',re.S|re.M)

G = '\n\n'\
    'MWMWMWMWMWMWMWMWMWMWMWMWMWMWMWMWMWMWMWMWMW\n'\
    'MWMWMW  %s\n'\
    'MWMWMW  %s\n'\
    '%s%s'

def search(REGX, how_to_find_files, dirpath='',
           G=G,sepm = '\n======================\n'):
    if dirpath=='':
        dirpath = getcwd()

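    if isfunction(how_to_find_files):
        # listdir() returns bare names: test them with the user function,
        # then join with dirpath so isfile() and open() see the real path
        gen = ifilter(path.isfile,
                      (path.join(dirpath,fn)
                       for fn in listdir(dirpath)
                       if how_to_find_files(fn)))
    elif isinstance(how_to_find_files,str):
        gen = glob.glob(path.join(dirpath,
                                  how_to_find_files))
    else:
        raise TypeError('how_to_find_files must be a function or a string')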

    for fn in gen:
        with open(fn) as fp:
            found = REGX.findall(fp.read())
            if found:
                yield G % (dirpath,path.basename(fn),
                           sepm,sepm.join(found))

# Example of searching in .txt files

#============ one use ===================
def select(fn):
    return fn[-4:]=='.txt'
print ''.join(search(regx, select))

#============= another use ==============
print ''.join(search(regx,'*.txt'))

search() is a generator that can yield several strings, so ''.join() is used to combine them before printing.
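For readers on Python 3, where itertools.ifilter and the print statement no longer exist, the same idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from this answer: the helper name select_files, the use of callable() instead of isfunction(), and the (path, matches) return shape are all choices made for this sketch.

```python
import glob
import os


def select_files(how_to_find_files, dirpath=''):
    """Return full paths of regular files in dirpath.

    how_to_find_files is either a glob pattern (str) or a predicate
    that receives a bare file name and returns True to keep it.
    (select_files is a hypothetical helper introduced for this sketch.)
    """
    dirpath = dirpath or os.getcwd()
    if callable(how_to_find_files):
        candidates = (os.path.join(dirpath, name)
                      for name in sorted(os.listdir(dirpath))
                      if how_to_find_files(name))
    elif isinstance(how_to_find_files, str):
        # glob.escape() keeps special characters in dirpath literal
        pattern = os.path.join(glob.escape(dirpath), how_to_find_files)
        candidates = sorted(glob.glob(pattern))
    else:
        raise TypeError('expected a glob pattern or a callable')
    # Test the joined path, not the bare name, so directories other
    # than the current one are handled correctly.
    return [p for p in candidates if os.path.isfile(p)]


def search(regx, how_to_find_files, dirpath=''):
    """Yield (path, matches) for every selected file with a match."""
    for full_path in select_files(how_to_find_files, dirpath):
        with open(full_path) as fp:
            found = regx.findall(fp.read())
        if found:
            yield full_path, found
```

Both calling styles from the answer carry over: search(regx, '*.txt') uses a glob pattern, and search(regx, lambda name: name.endswith('.txt'), '/some/dir') uses a predicate together with an explicit directory (the path here is a placeholder).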

0

eyquem Dec 04 '13 at 17:09

falsetru · Accepted Answer · 2013-12-04T14:53:57+0000

Use glob.glob:

import glob
import re

# compile once, reuse for every file
pattern = re.compile('word1(.*?)word3', flags=re.S)
for file in glob.glob('*.txt'):
    with open(file) as fp:
        for result in pattern.findall(fp.read()):
            print result
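The loop above prints only the text captured between word1 and word3. The question also asks for the lines that contain the two words. A minimal sketch of one way to do that, written for Python 3 (unlike the Python 2 snippets in this thread), is to anchor the match at the start of the word1 line and extend it to the end of the word3 line. The names BLOCK_RE, find_blocks and blocks_in_directory are illustrative and do not come from the thread.

```python
import glob
import os
import re

# re.M lets ^ match at the start of each line; re.S lets . cross newlines.
# [^\n]* before word1 and after word3 pulls in the rest of those two lines.
BLOCK_RE = re.compile(r'^[^\n]*word1.*?word3[^\n]*', re.S | re.M)


def find_blocks(text):
    """Return each block from the word1 line through the word3 line."""
    return BLOCK_RE.findall(text)


def blocks_in_directory(directory='.', pattern='*.txt'):
    """Yield (filename, block) for every block in every matching file."""
    for filename in sorted(glob.glob(os.path.join(directory, pattern))):
        with open(filename) as fp:
            for block in find_blocks(fp.read()):
                yield filename, block
```

Iterating over blocks_in_directory() and printing each block covers the current directory. Each file is still read fully into memory, as in the answer, so very large files may call for a different approach.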