Numpy loadtxt skip the first line

Question

I have a little problem when I try to import data from CSV files with the numpy loadtxt function. Here is an example of the type of data files that I have.

Name it 'datafile1.csv':

    # Comment 1
    # Comment 2
    x,y,z
    1,2,3
    4,5,6
    7,8,9
    ...
    ...
    # End of File Comment

The script that I thought would work for this situation is as follows:

    import numpy as np
    FH = np.loadtxt('datafile1.csv', comments='#', delimiter=',', skiprows=1)

But I get an error message:

 ValueError: could not convert string to float: x

This tells me that the "skiprows" keyword argument does not skip the title row; it skips the first comment line instead. I could set skiprows=3, but the complication is that I have a very large number of files, and not all of them have the same number of comment lines at the top. How can I make sure that loadtxt reads only the actual data in this situation?

PS - I am open to bash solutions.
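The behaviour described in the question can be reproduced in memory. The following is a minimal sketch, assuming the sample data shown earlier (the names `sample`, `header_error` and `data` are only for illustration): `skiprows` counts raw lines, comment lines included, so `skiprows=1` leaves the `x,y,z` header in place, while `skiprows=3` gets past it for this particular file.

```python
import io
import numpy as np

# In-memory copy of the sample file (illustrative, without the "..." rows).
sample = """# Comment 1
# Comment 2
x,y,z
1,2,3
4,5,6
7,8,9
# End of File Comment
"""

# skiprows=1 drops only "# Comment 1"; "# Comment 2" is then ignored as a
# comment, and the header "x,y,z" is parsed as data, which fails.
try:
    np.loadtxt(io.StringIO(sample), comments='#', delimiter=',', skiprows=1)
    header_error = None
except ValueError as exc:
    header_error = exc

# skiprows=3 drops both comment lines and the header, so only numbers remain.
data = np.loadtxt(io.StringIO(sample), comments='#', delimiter=',', skiprows=3)
```

This only works when the number of leading comment lines is known in advance, which is exactly the limitation the question is about.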

+10

python bash numpy csv import-from-csv

astromax Jun 17 '13 at 15:25

3 answers

Create your own filter function, for example:

    import numpy as np

    def skipper(fname):
        with open(fname) as fin:
            # Lazily drop every line whose first non-blank character is '#'.
            no_comments = (line for line in fin if not line.lstrip().startswith('#'))
            next(no_comments, None)  # skip header
            for row in no_comments:
                yield row

    a = np.loadtxt(skipper('your_file'), delimiter=',')

+1

Jon Clements Jun 17 '13 at 15:46

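    import numpy as np

    def skipper(fname, header=False):
        with open(fname) as fin:
            # Lazily drop every line whose first non-blank character is '#'.
            no_comments = (line for line in fin if not line.lstrip().startswith('#'))
            if header:
                next(no_comments, None)  # skip header
            for row in no_comments:
                yield row

    # header defaults to False; pass header=True for files that have a header row.
    a = np.loadtxt(skipper('your_file'), delimiter=',')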

This is just a small modification of @Jon Clements' answer that adds an optional header parameter, since some CSV files contain comment lines (starting with #) but no header.
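To illustrate the two cases this answer distinguishes, here is a minimal sketch that runs `skipper` on one file with a header and one without. The helper `write_temp` and the sample contents are assumptions made for the demonstration; they are not part of the original answer.

```python
import os
import tempfile
import numpy as np

def skipper(fname, header=False):
    """Yield the non-comment lines of fname, optionally dropping the first one."""
    with open(fname) as fin:
        no_comments = (line for line in fin if not line.lstrip().startswith('#'))
        if header:
            next(no_comments, None)  # skip header
        for row in no_comments:
            yield row

def write_temp(text):
    # Hypothetical helper: write text to a temporary CSV file and return its path.
    fd, path = tempfile.mkstemp(suffix='.csv')
    with os.fdopen(fd, 'w') as fh:
        fh.write(text)
    return path

with_header = write_temp("# Comment 1\n# Comment 2\nx,y,z\n1,2,3\n4,5,6\n# End\n")
without_header = write_temp("# Comment 1\n1,2,3\n4,5,6\n")

# The file with a title row needs header=True; the other uses the default.
a = np.loadtxt(skipper(with_header, header=True), delimiter=',')
b = np.loadtxt(skipper(without_header), delimiter=',')

# Clean up the temporary files.
os.remove(with_header)
os.remove(without_header)
```

Both calls should yield the same 2x3 array, since the only difference between the files is the header row and the number of comment lines.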

0

Jeffzheng Jan 23 '19 at 7:51

falsetru · Accepted Answer · 2013-06-17T15:31:58+0000

Skip the comment lines manually using a generator expression:

    import numpy as np

    with open('datafile1.csv') as f:
        lines = (line for line in f if not line.startswith('#'))
        FH = np.loadtxt(lines, delimiter=',', skiprows=1)
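Since the question mentions a large number of files with varying numbers of comment lines, the same idea can be wrapped in a function and applied to each file. This is only a sketch under stated assumptions: the function name `load_data`, the second file `datafile2.csv`, the glob pattern and the temporary directory are all hypothetical, and `lstrip()` is borrowed from the other answers so that indented comment lines are skipped too.

```python
import glob
import os
import shutil
import tempfile
import numpy as np

def load_data(path):
    """Load one CSV, ignoring '#' comment lines and the single header row."""
    with open(path) as f:
        lines = (line for line in f if not line.lstrip().startswith('#'))
        # loadtxt must run inside the with block: the generator reads lazily.
        return np.loadtxt(lines, delimiter=',', skiprows=1)

# Build two small sample files with different numbers of leading comments.
tmpdir = tempfile.mkdtemp()
samples = {
    'datafile1.csv': "# Comment 1\n# Comment 2\nx,y,z\n1,2,3\n4,5,6\n# End of File Comment\n",
    'datafile2.csv': "# Comment 1\nx,y,z\n7,8,9\n10,11,12\n",
}
for name, text in samples.items():
    with open(os.path.join(tmpdir, name), 'w') as fh:
        fh.write(text)

# Load every matching file into a dict keyed by file name.
arrays = {
    os.path.basename(p): load_data(p)
    for p in sorted(glob.glob(os.path.join(tmpdir, 'datafile*.csv')))
}

# Remove the temporary directory once the data is in memory.
shutil.rmtree(tmpdir)
```

Because the comment lines are filtered out before `loadtxt` sees them, `skiprows=1` always refers to the header row, regardless of how many comments precede it.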
