How to find the position of a word in a file?

For example, I have a file and the word "test". The file is partially binary but contains the string "test". How can I find the position (index) of the word in the file without loading the file into memory?

+5
3 answers

You cannot find the position of the text in the file unless you open the file. It's like asking someone to read a newspaper without opening their eyes.

To answer the first part of your question, this is relatively simple.

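with open('Path/to/file', 'rb') as f:
    # Binary mode, because the file is partially binary.
    content = f.read()  # note: this reads the whole file into memory
    # index() raises ValueError if the word is absent; find() would return -1.
    print(content.index(b'test'))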
+4

You can use memory-mapped files and regular expressions.

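Memory-mapped file objects behave like both strings and file objects. Unlike normal strings, however, they are mutable. You can use mmap objects in most places where strings are expected; for example, you can use the re module to search through a memory-mapped file. Since they're mutable, you can change a single character with obj[index] = 'a', or change a substring by assigning to a slice: obj[i1:i2] = '...'. (In Python 3 the mapping holds bytes, so these become obj[index] = 97 and obj[i1:i2] = b'...'.) You can also read and write data starting at the current file position, and seek() through the file to different positions.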

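import re
import mmap

# Map the file read-only; the OS pages data in on demand
# instead of copying the whole file into memory.
with open('path/filename', 'rb') as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mf:
        # The pattern must be bytes because the mapping exposes bytes.
        m = re.search(b'pattern', mf)
        if m:
            print(m.start(), m.end())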
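
When the target is a literal word rather than a regular expression, the mapping's own find() method should be enough. A minimal sketch, assuming Python 3; the helper name find_word and the sample file contents are made up for illustration:

```python
import mmap
import os
import tempfile


def find_word(path, word):
    """Return the byte offset of the first occurrence of word, or -1."""
    with open(path, 'rb') as f:
        # Length 0 maps the whole file; pages are loaded lazily by the OS.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mf:
            return mf.find(word)


# Hypothetical sample: twelve binary bytes followed by the word.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'\x00\x01\xff' * 4 + b'test' + b'\x00')
sample_path = tmp.name

position = find_word(sample_path, b'test')
missing = find_word(sample_path, b'absent')
print(position, missing)
os.remove(sample_path)
```

Note that mapping an empty file raises ValueError, so a real script may want to check the file size first.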
+2

Another option is to read and search the file in a loop:

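# file_dmp_path (str) and SEARCH_WORD (bytes) are assumed to be defined elsewhere.
bsize = 1024 * 1024  # read 1 MiB blocks instead of the whole file at once
word_len = len(SEARCH_WORD)
with open(file_dmp_path, 'rb') as file:
    offset = 0  # file position of the first byte in buf
    buf = b''
    while True:
        block = file.read(bsize)
        if not block:
            break
        buf += block
        p = buf.find(SEARCH_WORD)
        while p > -1:
            print(offset + p)  # absolute position of this occurrence
            p = buf.find(SEARCH_WORD, p + 1)
        # Keep the last word_len - 1 bytes so a word split across two blocks is found.
        keep = min(len(buf), word_len - 1)
        offset += len(buf) - keep
        buf = buf[len(buf) - keep:]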
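
The block-wise idea can also be wrapped in a generator, which makes it easy to check the case where the word straddles two blocks. This is a sketch under assumptions, not the answer's original code: the name iter_positions, the block_size parameter, and the sample data are invented for illustration.

```python
import os
import tempfile


def iter_positions(path, word, block_size=1024 * 1024):
    """Yield the byte offset of every occurrence of word (bytes) in the file."""
    if not word:
        raise ValueError('word must not be empty')
    overlap = len(word) - 1  # bytes carried over between blocks
    offset = 0               # file offset of buf[0]
    buf = b''
    with open(path, 'rb') as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            buf += block
            p = buf.find(word)
            while p != -1:
                yield offset + p
                p = buf.find(word, p + 1)
            # Keep only the tail that could start a match ending in the next block.
            keep = min(len(buf), overlap)
            offset += len(buf) - keep
            buf = buf[len(buf) - keep:]


# Hypothetical sample; a 4-byte block size forces 'test' to span two blocks.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'\x00\x01test\x02\x03\x04test')
sample_path = tmp.name

positions = list(iter_positions(sample_path, b'test', block_size=4))
print(positions)
os.remove(sample_path)
```

At most one block plus len(word) - 1 bytes is held in memory at a time, so the block size can be tuned to the available memory.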
+2
