I get a system error (shown below) while doing some simple numpy-based matrix algebra calculations with the multiprocessing package (Python 2.7.3 with numpy 1.7.0 on Ubuntu 12.04 on Amazon EC2). My code works fine for smaller matrix sizes but crashes for larger ones (with plenty of memory available).
The size of the matrices I use is substantial: my code works fine for 1000000x10 dense matrices but crashes for 1000000x500 ones (I pass these matrices to and from the subprocesses). 10 vs 500 is a run-time parameter; everything else stays the same (input data, other run-time parameters, etc.).
I have also tried running the same (ported) code with Python 3. For larger matrices the subprocesses go into sleep/idle mode (instead of crashing, as in Python 2.7), and the program and its subprocesses just hang there doing nothing. For smaller matrices the code works fine with Python 3.
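For scale, here is a back-of-the-envelope estimate of the raw data size of the two cases. It assumes 8-byte float64 entries (the default dtype of `np.zeros`) and that a full matrix of that shape is transferred; the helper name `dense_matrix_bytes` is made up for this illustration.

```python
def dense_matrix_bytes(n_rows, n_cols, itemsize=8):
    """Raw data size of a dense matrix; itemsize=8 assumes float64 entries."""
    return n_rows * n_cols * itemsize

small = dense_matrix_bytes(1000000, 10)    # the case that works
large = dense_matrix_bytes(1000000, 500)   # the case that crashes

print("1000000x10:  %.2f GB" % (small / 1e9))
print("1000000x500: %.2f GB" % (large / 1e9))
```

Under those assumptions that is roughly 0.08 GB versus 4 GB per matrix, before any overhead from serializing it for transfer between processes.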
Any suggestions would be much appreciated (I'm running out of ideas here)
Error message:
Exception in thread Thread-5:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 551, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 504, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 319, in _handle_tasks
    put(task)
SystemError: NULL result without error in PyObject_Call
Multiprocessing code used:
def runProcessesInParallelAndReturn(proc, listOfInputs, nParallelProcesses):
    if len(listOfInputs) == 0:
        return
Below is the "proc" that is executed in each subprocess. Basically, it solves many systems of linear equations using numpy (it builds the required matrices inside the subprocess) and returns the results as another matrix. Once again, it works fine for smaller values of one run-time parameter but crashes (or hangs in Python 3) for larger ones.
def solveForLFV(param):
    startTime = time.time()
    (chunkI, LFVin, XY, sumLFVinOuterProductLFVallPlusPenaltyTerm, indexByIndexPurch, outerProductChunkSize, confWeight), queue = param
    LFoutChunkSize = XY.shape[0]
    nLFdim = LFVin.shape[1]
    sumLFVinOuterProductLFVpurch = np.zeros((nLFdim, nLFdim))
    LFVoutChunk = np.zeros((LFoutChunkSize, nLFdim))
    for LFVoutIndex in xrange(LFoutChunkSize):
        LFVInIndexListPurch = indexByIndexPurch[LFVoutIndex]
        sumLFVinOuterProductLFVpurch[:, :] = 0.
        LFVInIndexChunkLow, LFVInIndexChunkHigh = getChunkBoundaries(len(LFVInIndexListPurch), outerProductChunkSize)
        for LFVInIndexChunkI in xrange(len(LFVInIndexChunkLow)):
            LFVinSlice = LFVin[LFVInIndexListPurch[LFVInIndexChunkLow[LFVInIndexChunkI] : LFVInIndexChunkHigh[LFVInIndexChunkI]], :]
            sumLFVinOuterProductLFVpurch += sum(LFVinSlice[:, :, np.newaxis] * LFVinSlice[:, np.newaxis, :])
        LFVoutChunk[LFVoutIndex, :] = np.linalg.solve(confWeight * sumLFVinOuterProductLFVpurch + sumLFVinOuterProductLFVallPlusPenaltyTerm, XY[LFVoutIndex, :])
    queue.put((chunkI, LFVoutChunk))
    print 'solveForLFV: ', time.time() - startTime, 'sec'
    sys.stdout.flush()
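To make the core step of `solveForLFV` easier to follow, here is a toy-sized sketch of the sum of outer products and the linear solve. The shapes and the values of `penalty`, `xy` and `confWeight` are hypothetical stand-ins, not the real inputs, and `.sum(axis=0)` is used where the code above uses the builtin `sum`, which should give the same result since both reduce over the first axis.

```python
import numpy as np

rng = np.random.RandomState(0)
LFVinSlice = rng.rand(5, 3)   # toy stand-in: 5 rows, 3 latent dimensions
penalty = 0.1 * np.eye(3)     # hypothetical stand-in for the penalty term
xy = rng.rand(3)              # hypothetical right-hand side
confWeight = 2.0              # hypothetical weight

# Broadcasting builds a (5, 3, 3) stack of per-row outer products;
# summing over axis 0 collapses it to a single (3, 3) matrix.
outer_sum = (LFVinSlice[:, :, np.newaxis] * LFVinSlice[:, np.newaxis, :]).sum(axis=0)

# The same matrix computed without the intermediate (5, 3, 3) stack.
gram = LFVinSlice.T.dot(LFVinSlice)
assert np.allclose(outer_sum, gram)

# One small linear system, as solved once per output row in the loop.
system_matrix = confWeight * outer_sum + penalty
solution = np.linalg.solve(system_matrix, xy)
```

The sum of outer products is the same matrix as `LFVinSlice.T.dot(LFVinSlice)`, which avoids allocating the intermediate three-dimensional array.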