The fastest way to read every nth line with numpy genfromtxt

Question


I read my data using numpy genfromtxt:

    import numpy as np

    measurement = np.genfromtxt('measurementProfile2.txt', delimiter=None, dtype=None,
                                skip_header=4, skip_footer=2, usecols=(3, 0, 2))
    rows, columns = np.shape(measurement)
    x = np.zeros((rows, 1), dtype=measurement.dtype)
    x[:] = 394
    measurement = np.hstack((measurement, x))
    np.savetxt('measurementProfileFormatted.txt', measurement)

This works great, but I want only every 5th or 6th (in general, every n-th) line in the final output file. According to the numpy.genfromtxt documentation, there is no parameter that does this. I do not want to iterate over the array. Is there a recommended way to solve this problem?

+2

python arrays numpy genfromtxt

user69453 Jan 15 '15 at 10:52


3 answers

Either way, the entire file has to be read. Once it is loaded, you can select every nth element by slicing:

    >>> a = np.arange(50)
    >>> a[::5]
    array([ 0,  5, 10, 15, 20, 25, 30, 35, 40, 45])

0

elyase Jan 15 '15 at 11:19


If you only need certain rows in the final output file, why not save just those rows instead of the entire measurement matrix?

    output_rows = [5, 7, 11]
    np.savetxt('measurementProfileFormatted.txt', measurement[output_rows, :])
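For the "every n-th row" case from the question, a strided slice can replace the explicit index list. The following is only a minimal sketch: the small `measurement` array and the in-memory `buffer` are hypothetical stand-ins for the real parsed data and the output file.

```python
import io

import numpy as np

# Hypothetical stand-in for the parsed measurement array (12 rows, 3 columns).
measurement = np.arange(36).reshape(12, 3)

n = 5
# Rows 5, 10, ... when counting from 1, i.e. indices 4, 9, ...
every_nth = measurement[n - 1::n]

# An in-memory buffer stands in for the output file here.
buffer = io.StringIO()
np.savetxt(buffer, every_nth, fmt="%d")
```

Use `measurement[::n]` instead if the first row should be included. Basic slicing returns a view, so no copy of the data is made before saving.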

0

bitspersecond Jan 15 '15 at 14:25


Alex Riley · Accepted Answer · 2015-01-15T12:02:16+0000

To avoid parsing every line into the array, you can combine np.genfromtxt with itertools.islice to skip lines. This is slightly faster than reading the entire array and then slicing it (at least for the smaller arrays I tried).

For example, here are the contents of file.txt:

    12
    34
    22
    17
    41
    28
    62
    71

Then, for example:

    >>> import itertools
    >>> with open('file.txt') as f_in:
    ...     x = np.genfromtxt(itertools.islice(f_in, 0, None, 3), dtype=int)

gives an array x containing the lines at indices 0, 3 and 6 of the file:

 array([12, 17, 62])
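The same idea can be extended to a file with header lines, as in the question, by letting islice start after the header. This is a sketch under assumptions: the `text` string is a made-up stand-in for a real data file (4 header lines followed by 10 three-column rows), and footer lines are not handled.

```python
import io
import itertools

import numpy as np

# Hypothetical stand-in for a data file: 4 header lines, then 10 data rows.
text = "h1\nh2\nh3\nh4\n" + "".join(f"{i} {i * 10} {i * 100}\n" for i in range(10))

n = 5  # keep every 5th data row, starting with the first one

# Approach 1: parse every data row, then slice the resulting array.
full = np.genfromtxt(io.StringIO(text), skip_header=4, dtype=int)
sliced = full[::n]

# Approach 2: islice skips the 4 header lines and then passes on only
# every nth line, so genfromtxt never parses the rows in between.
with io.StringIO(text) as f_in:
    strided = np.genfromtxt(itertools.islice(f_in, 4, None, n), dtype=int)
```

Both approaches should give the same rows. The difference is that the second one does less parsing work, although the file itself is still read line by line.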
