The fastest way to read every nth line with numpy genfromtxt

I read my data using numpy genfromtxt:

    import numpy as np

    measurement = np.genfromtxt('measurementProfile2.txt', delimiter=None, dtype=None,
                                skip_header=4, skip_footer=2, usecols=(3, 0, 2))
    rows, columns = np.shape(measurement)
    x = np.zeros((rows, 1), dtype=measurement.dtype)
    x[:] = 394
    measurement = np.hstack((measurement, x))
    np.savetxt('measurementProfileFormatted.txt', measurement)

This works well, but I want only every 5th or 6th (in general, every n-th) line in the final output file. According to the numpy.genfromtxt documentation (numpy.genfromtxt.html), there is no parameter that does this. I do not want to iterate over the array. Is there a recommended way to solve this problem?

3 answers

To avoid loading the entire array, you can combine np.genfromtxt with itertools.islice to skip lines. This is slightly faster than reading the entire array and then slicing it (at least for the smaller arrays I tried).

For example, here are the contents of file.txt:

    12
    34
    22
    17
    41
    28
    62
    71

Then, for example:

    >>> import itertools
    >>> import numpy as np
    >>> with open('file.txt') as f_in:
    ...     x = np.genfromtxt(itertools.islice(f_in, 0, None, 3), dtype=int)
    ...

This gives an array x containing the lines at indices 0, 3 and 6 of the file:

 array([12, 17, 62]) 
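
The same idea can be adapted to a file with header lines, as in the question. The following is only a minimal sketch: the helper name `read_every_nth` and the sample data are made up for illustration, and it assumes the header is skipped through the `start` argument of `islice` instead of `skip_header`. Footer lines are not handled here and would need to be dropped separately.

```python
import itertools
import os
import tempfile

import numpy as np


def read_every_nth(path, n, header_lines=0):
    """Parse every n-th data line of a text file (hypothetical helper)."""
    with open(path) as f_in:
        # islice(f, start, stop, step): skip the header, then take every n-th line
        selected = itertools.islice(f_in, header_lines, None, n)
        return np.genfromtxt(selected, dtype=float)


# Build a small sample file: 2 header lines followed by 7 rows of 3 columns
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('# header 1\n# header 2\n')
    for i in range(7):
        tmp.write(f'{i} {i * 10} {i * 100}\n')
    sample_path = tmp.name

every_third = read_every_nth(sample_path, n=3, header_lines=2)
os.remove(sample_path)
print(every_third)
```

With `n=3` and two header lines, only the data rows at indices 0, 3 and 6 are handed to `genfromtxt`; the other lines are still read from disk but never parsed.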

Either way, the entire file has to be read to select every nth element. Once the array is loaded, a step slice does the selection:

    >>> a = np.arange(50)
    >>> a[::5]
    array([ 0,  5, 10, 15, 20, 25, 30, 35, 40, 45])
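
Applied to a two-dimensional array such as the one `genfromtxt` returns, the same step slice selects whole rows. Here is a small sketch using made-up data in place of the measurement file; the variable names are illustrative only. Note that `data[::n]` starts at the first row, while `data[n - 1::n]` picks the n-th, 2n-th, ... rows when counting from 1.

```python
import io

import numpy as np

# Made-up stand-in for the measurement file: 10 rows, 2 columns
text = '\n'.join(f'{i} {i * i}' for i in range(1, 11))
data = np.genfromtxt(io.StringIO(text), dtype=int)

n = 5
from_first = data[::n]       # rows 1 and 6 (counting from 1)
nth_rows = data[n - 1::n]    # rows 5 and 10 (counting from 1)

print(from_first)
print(nth_rows)
```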

If you only need certain rows in the final output file, why not save just those rows instead of the entire measurement matrix?

    output_rows = [5, 7, 11]
    np.savetxt('measurementProfileFormatted.txt', measurement[output_rows, :])
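
As a rough end-to-end sketch of this approach, the snippet below writes only the chosen rows and reads them back to check the result. The array contents, the output file name `selected_rows.txt` and the temporary directory are assumptions for illustration; a list built with `range` would select every n-th row in the same way.

```python
import os
import tempfile

import numpy as np

# Made-up stand-in for the loaded measurement array: 12 rows, 3 columns
measurement = np.arange(36, dtype=float).reshape(12, 3)

output_rows = [5, 7, 11]  # zero-based row indices to keep
out_path = os.path.join(tempfile.mkdtemp(), 'selected_rows.txt')

# Integer-list indexing copies just the requested rows before saving
np.savetxt(out_path, measurement[output_rows, :])

# Read the file back to confirm that only three rows were written
saved = np.loadtxt(out_path)
print(saved.shape)
```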