NumPy loadtxt: skip the first line

I have a little problem when I try to import data from CSV files with the numpy loadtxt function. Here is an example of the type of data files that I have.

Name it 'datafile1.csv':

    # Comment 1
    # Comment 2
    x,y,z
    1,2,3
    4,5,6
    7,8,9
    ...
    ...
    # End of File Comment

The script that I thought would work for this situation is as follows:

    import numpy as np
    FH = np.loadtxt('datafile1.csv', comments='#', delimiter=',', skiprows=1)

But I get an error message:

 ValueError: could not convert string to float: x 

This tells me that the "skiprows" kwarg does not skip the title row; it skips the first comment line instead. I could set skiprows=3, but I have a very large number of files, and they do not all have the same number of comment lines at the top. How can I make sure that loadtxt reads only the actual data in this situation?
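To make the failure concrete, here is a minimal in-memory reproduction. It assumes a hypothetical `SAMPLE` string standing in for 'datafile1.csv' (without the elided rows) and a small `load` helper that is only there for illustration:

```python
import io
import numpy as np

# Hypothetical stand-in for 'datafile1.csv' (the elided "..." rows are left out).
SAMPLE = (
    "# Comment 1\n"
    "# Comment 2\n"
    "x,y,z\n"
    "1,2,3\n"
    "4,5,6\n"
    "7,8,9\n"
    "# End of File Comment\n"
)

def load(skiprows):
    """Illustrative helper: parse SAMPLE with the given skiprows value."""
    return np.loadtxt(io.StringIO(SAMPLE), comments='#', delimiter=',',
                      skiprows=skiprows)

# skiprows counts comment lines too, so skiprows=1 only removes "# Comment 1";
# the header "x,y,z" is then parsed as data and raises ValueError.
try:
    load(skiprows=1)
    failed = False
except ValueError as exc:
    failed = True
    print("skiprows=1 fails:", exc)

# skiprows=3 works here, but only because this file has exactly two comment lines.
data = load(skiprows=3)
print(data.shape)
```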

PS - I am open to bash solutions.

+10
3 answers

Skip the comment lines manually with a generator expression; once they are filtered out, skiprows=1 drops the header row:

    import numpy as np

    with open('datafile1.csv') as f:
        lines = (line for line in f if not line.startswith('#'))
        FH = np.loadtxt(lines, delimiter=',', skiprows=1)
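The same idea can be checked without touching the disk. This is a sketch under the assumption that an `io.StringIO` object (the hypothetical `sample` below) behaves like the opened file:

```python
import io
import numpy as np

# Hypothetical in-memory stand-in for the opened 'datafile1.csv'.
sample = io.StringIO(
    "# Comment 1\n"
    "# Comment 2\n"
    "x,y,z\n"
    "1,2,3\n"
    "4,5,6\n"
    "7,8,9\n"
    "# End of File Comment\n"
)

# Drop comment lines first, so the header is always the first remaining line.
lines = (line for line in sample if not line.startswith('#'))

# skiprows=1 now removes the "x,y,z" header no matter how many comments came before it.
data = np.loadtxt(lines, delimiter=',', skiprows=1)
print(data.shape)
```

Because the generator is lazy, loadtxt has to be called while the underlying file (or buffer) is still open.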
+17

Create your own filter function, for example:

    import numpy as np

    def skipper(fname):
        with open(fname) as fin:
            no_comments = (line for line in fin if not line.lstrip().startswith('#'))
            next(no_comments, None)  # skip header
            for row in no_comments:
                yield row

    a = np.loadtxt(skipper('your_file'), delimiter=',')
+1

    import numpy as np

    def skipper(fname, header=False):
        with open(fname) as fin:
            no_comments = (line for line in fin if not line.lstrip().startswith('#'))
            if header:
                next(no_comments, None)  # skip header
            for row in no_comments:
                yield row

    a = np.loadtxt(skipper('your_file'), delimiter=',')

This is just a small modification to @Jon Clements' answer that adds an optional header parameter, since in some cases the CSV file has comment lines (starting with #) but no header.
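A rough sketch of how the header flag might be used on two kinds of files. The temporary files and their names are invented for illustration, and it assumes you know in advance which files carry a header row:

```python
import tempfile
from pathlib import Path

import numpy as np

def skipper(fname, header=False):
    """Yield non-comment lines from fname, optionally dropping the first one."""
    with open(fname) as fin:
        no_comments = (line for line in fin if not line.lstrip().startswith('#'))
        if header:
            next(no_comments, None)  # skip header
        for row in no_comments:
            yield row

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical sample files: one with a header row, one without.
    with_header = Path(tmp) / "with_header.csv"
    without_header = Path(tmp) / "without_header.csv"
    with_header.write_text("# c1\n# c2\nx,y,z\n1,2,3\n4,5,6\n# end\n")
    without_header.write_text("# c1\n1,2,3\n4,5,6\n")

    a = np.loadtxt(skipper(with_header, header=True), delimiter=',')
    b = np.loadtxt(skipper(without_header), delimiter=',')

print(a.shape, b.shape)
```

Leaving header at its default of False on a file that does have a title row brings back the original ValueError, so the flag has to match the file.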

0
