I have a file of size 50,000 x 5,000 (float). When I use x = np.genfromtxt(readFrom, dtype=float) to load the file into memory, the following error appears:
File "C: \ Python27 \ lib \ site-packages \ numpy \ lib \ npyio.py", line 1583, in genfromtxt for (i, converter) in the listing (converters)])
Memoryerror
I want to load the entire file into memory because I compute the Euclidean distance between each pair of vectors using SciPy: dis = scipy.spatial.distance.euclidean(x[row1], x[row2])
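For context, the access pattern that requires the whole matrix in memory looks roughly like the sketch below (the nested loop and its bounds are my assumption; only the single euclidean call appears in the original):

    import numpy as np
    import scipy.spatial.distance

    # x = np.genfromtxt(readFrom, dtype=float)  # the call that fails with MemoryError
    # Sketch only: compare every pair of rows of the loaded (50000, 5000) matrix x.
    for row1 in range(x.shape[0]):
        for row2 in range(row1 + 1, x.shape[0]):
            dis = scipy.spatial.distance.euclidean(x[row1], x[row2])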
Is there an efficient way to load such a huge matrix into memory?
Thanks.
Update:
I managed to solve the problem. Here is my solution. I'm not sure whether it is efficient or logically sound, but it works fine for me:
    x = open(readFrom, 'r').readlines()
    y = np.asarray([np.array(s.split()).astype('float32') for s in x], dtype=np.float32)
    ....
    dis = scipy.spatial.distance.euclidean(y[row1], y[row2])
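A possible refinement (a sketch only, assuming whitespace-separated values and the 50,000 x 5,000 size stated above): preallocate the float32 array once and fill it line by line, so the raw text lines and the temporary per-row arrays never have to be held in memory at the same time. In float32 the matrix itself takes roughly 1 GB.

    import numpy as np

    n_rows, n_cols = 50000, 5000  # dimensions stated in the question

    # Preallocate the target array once (about 1 GB in float32).
    y = np.empty((n_rows, n_cols), dtype=np.float32)

    # Fill one row per text line; only the current line is kept in memory.
    with open(readFrom, 'r') as f:
        for i, line in enumerate(f):
            y[i] = np.array(line.split(), dtype=np.float32)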
Please help me improve my solution.