Parallel Processing with numpy.loadtxt()

I have a file > 100 MB in size that needs to be read with numpy.loadtxt().

The reading step is the main bottleneck in my code: a 72 MB file takes 17.3 s to load.

Is it possible to somehow read the file in parallel using loadtxt()?

If possible, without splitting the file into pieces.

python numpy parallel-processing
1 answer

It seems that numpy.loadtxt() is your problem.

http://wesmckinney.com/blog/?p=543

http://codrspace.com/durden/performance-lessons-for-reading-ascii-files-into-numpy-arrays/

According to those benchmarks, you are better off not using NumPy's text-loading functions for large files.

pandas.read_csv and pandas.read_table, from the pandas module, should serve you well here.
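As a minimal sketch of this swap, you can parse the file with pandas' C-based CSV reader and then convert the result back to the ndarray you would have gotten from loadtxt. The in-memory sample data and the whitespace delimiter below are assumptions; adjust `sep` to match your actual file format.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical sample standing in for the large whitespace-delimited file.
text = "1.0 2.0 3.0\n4.0 5.0 6.0\n"

# pandas' C parser is typically much faster than numpy.loadtxt for big
# text files. sep=r"\s+" handles space/tab-separated columns;
# header=None because the file has no header row.
df = pd.read_csv(io.StringIO(text), sep=r"\s+", header=None)

# .to_numpy() yields the same 2-D float array numpy.loadtxt would return.
arr = df.to_numpy()
print(arr.shape)  # (2, 3)
```

In practice you would pass the file path instead of the `StringIO` buffer; the rest of your code can keep working on `arr` unchanged.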

