Another option, since I just ran into this problem:
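    import subprocess

    import pandas as pd

    # grep -n prints "LINENO:line" for every line that starts with TITLE
    grep = subprocess.check_output(['grep', '-n', '^TITLE', filename]).splitlines()
    # check_output returns bytes on Python 3, so split on b':';
    # grep counts lines from 1, while skiprows expects 0-based indices
    bad_lines = [int(s.split(b':', 1)[0]) - 1 for s in grep]
    df = pd.read_csv(filename, skiprows=bad_lines)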
It is less portable than @eumiro's answer (read: it may not work on Windows, where grep is not normally available) and requires reading the file twice, but it has the advantage that you never need to hold the whole contents of the file in memory.
Of course, you could do the same thing as grep in Python, but it will probably be slower.
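For reference, here is a minimal sketch of that pure-Python version. The function name `find_title_lines` and its `prefix` parameter are made up for illustration; it assumes, like the grep call, that the unwanted lines start with `TITLE`. It still reads the file twice overall, but only one line at a time.

```python
def find_title_lines(path, prefix="TITLE"):
    """Return the 0-based indices of lines that start with `prefix`.

    Hypothetical helper: a stand-in for `grep -n '^TITLE'` that streams
    the file line by line instead of loading it all into memory.
    """
    with open(path) as f:
        # enumerate() counts from 0, which is what skiprows expects,
        # so no "- 1" adjustment is needed here
        return [i for i, line in enumerate(f) if line.startswith(prefix)]
```

The returned list can be passed straight to `pd.read_csv(filename, skiprows=...)` in place of `bad_lines`.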
Dougal