I need to iteratively read data files and store the data in (numpy) arrays. I decided to store the data in a dictionary of "data fields": {'field1': array1, 'field2': array2, ...}.
Case 1 (lists):
Using lists (or collections.deque()) to accumulate the new arrays of data, the code is efficient. But when I concatenate the arrays stored in the lists, the memory grows and I could not free it again. Example:
filename = 'test'
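# --- sketch of my Case 1 loop; readData() below is a stand-in for my real
# --- file parser, and the file count / array sizes are made up
import gc
import numpy as np

nFiles = 100
nFields = 56
fieldNames = ['field%d' % i for i in range(1, nFields + 1)]

def readData(fname):
    # placeholder: my real reader parses one data file and returns
    # a dict {'field1': small_array, ..., 'field56': small_array}
    return {name: np.random.rand(1000) for name in fieldNames}

# accumulate the per-file arrays in plain lists: fast, because appending
# only stores a reference
dataDict = {name: [] for name in fieldNames}
for fileNumber in range(nFiles):
    newData = readData('%s_%d' % (filename, fileNumber))
    for name, arr in newData.items():
        dataDict[name].append(arr)

# combine the accumulated arrays: this is the step where the memory grows
for name, value in dataDict.items():
    dataDict[name] = np.concatenate(value, axis=0)
del newData
gc.collect()    # even after this the memory is not released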
Calculation time: 63.4 s
Memory usage (top): 13862 gime_se 20 0 1042m 934m 4148 S 0 5.8 1:00.44 python
Case 2 (numpy arrays):
Concatenating the numpy arrays every time new data is read is inefficient, but the memory remains under control. Example:
nFields = 56
dataDict = {}
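# --- sketch of my Case 2 loop; readData() below is a stand-in for my real
# --- file parser, and the file count / array sizes are made up
import numpy as np

filename = 'test'
nFiles = 100
fieldNames = ['field%d' % i for i in range(1, nFields + 1)]

def readData(fname):
    # placeholder: my real reader parses one data file and returns
    # a dict {'field1': small_array, ..., 'field56': small_array}
    return {name: np.random.rand(1000) for name in fieldNames}

# grow the arrays by concatenating on every read: slow, but the memory
# stays under control
for fileNumber in range(nFiles):
    newData = readData('%s_%d' % (filename, fileNumber))
    for name, arr in newData.items():
        if name not in dataDict:
            dataDict[name] = arr
        else:
            dataDict[name] = np.concatenate((dataDict[name], arr), axis=0)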
Calculation time: 1377.8 s
Memory usage (top): 14850 gime_se 20 0 650m 542m 4144 S 0 3.4 22:31.21 python
Question(s):
Is there a way to get the performance of Case 1 but keep the memory under control as in Case 2?
It seems that in Case 1 the memory grows when the list items are combined (np.concatenate(value, axis=0)). Any better ideas of how to do this?