Efficient and economical way to run multiple instances of a Python program?

I wrote a program that calls a function with the following prototype:

def Process(n):

    # The function uses data that is stored as binary files on the hard drive and
    # -- based on the value of 'n' -- scans it using functions from numpy & cython.
    # The function creates new binary files and saves the results of the scan in them.
    #
    # I optimized the running time of the function as much as I could using numpy &
    # cython, and at present it takes about 4 hrs to complete one function run on
    # a typical WinXP desktop (a three-year-old machine, 2 GB of memory, etc.).
    ...

My goal is to perform this function exactly 10,000 times (for 10,000 different "n" values) in the fastest and most economical way. After these runs I will have 10,000 different binary files with the results of all the individual scans. Note that each function run is independent (there is no dependency between the individual runs).

So the question is this: having only one computer at home, it would obviously take me about 4.5 years (10,000 runs x 4 hours per run = 40,000 hours ≈ 4.5 years) to complete all the runs. However, I would like everything to be finished within a week or two.

Given that, what (affordable / accessible) technology would let me run that many instances of the program, e.g. some kind of cloud computing service? Is this realistic, and roughly how much would it cost?

, "Process()" 500 . .


PiCloud: http://www.picloud.com/

import cloud

jid = cloud.call(Process, n)   # queue one run of Process(n) on PiCloud; returns a job id
cloud.result(jid)              # block until that job has finished

Maybe that's the solution you are looking for.
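
For 10,000 independent values of n, PiCloud's cloud.map would queue all the runs at once. A minimal sketch, assuming the picloud client library is installed and configured, that Process lives in an importable module (myscan is a made-up name), and leaving aside how the output files get retrieved from PiCloud's workers:

import cloud
from myscan import Process               # hypothetical module holding the user's Process(n)

jids = cloud.map(Process, range(10000))  # queue all 10,000 independent runs
cloud.join(jids)                         # block until every job has finished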


Can Process itself be parallelized any further? If so, that is the cheapest place to start.

Also, how much memory does one run of Process need - could several runs share a single machine?

If your own hardware is not enough, take a look at Amazon EC2 (see this): instances start at about $0.085 per hour, and you can rent as many of them as you need and run the jobs on them in parallel (the total cost then depends mostly on how many instance-hours the 10,000 runs actually consume).
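
Whichever machine you use (your own box or a rented EC2 instance), the standard-library multiprocessing module can fan the independent runs out across its cores. A rough sketch, where the module name and the worker count are assumptions (with ~500 MB per run, the pool has to fit in RAM):

from multiprocessing import Pool
from myscan import Process            # hypothetical module holding the user's Process(n)

if __name__ == '__main__':
    pool = Pool(processes=4)          # ~500 MB per run, so size the pool to the available RAM
    pool.map(Process, range(10000))   # each worker picks up one value of n at a time
    pool.close()
    pool.join()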


First of all, check whether your runs are really CPU-bound and not IO-bound... parallelism (whether extra local cores or machines in the cloud) only helps if the CPU is the bottleneck.

Second: think about how the data is stored and scanned - reorganizing large binary files into a smarter on-disk format can pay off enormously.... PyTables is great for this!

Alternatively, numpy's mmap support (numpy.memmap) may be all you need. Which of the two is better depends on your access pattern.

Memmapping works well when the access pattern matches the on-disk layout (for example, reading slices along the Z axis of a C-ordered 3D array). If it does not, every read touches data scattered all over the file, and performance falls apart.

In that case the run time is dominated by IO, and adding more (however cheap) CPU power in parallel will not make it finish any faster.
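
A minimal sketch of the memmap approach; the file name, dtype, and shape are invented stand-ins for the real binary files:

import numpy as np

# Map an existing binary file of float32 values as a C-ordered 3D array
# without reading it into RAM.
data = np.memmap('scan_input.bin', dtype=np.float32, mode='r', shape=(1000, 512, 512))

# Iterating along the first (slowest-varying) axis reads contiguous blocks from
# disk, which is exactly the access pattern memmapping handles well.
for z in range(data.shape[0]):
    plane = data[z]              # one contiguous 512x512 slice
    total = float(plane.sum())   # stand-in for the real per-slice scan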

If your access pattern is less regular, or you want compression and chunked storage on top, take a look at pytables.

For what it's worth, I compared a tables.Expr calculation against the equivalent computation on a memmapped numpy array. The comparison (with code) is here:

PyTables vs Numpy Memmap
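
As a small illustration of the PyTables side (using the PyTables 3.x API; the file name, shape, and expression are invented for the example), tables.Expr evaluates an expression chunk by chunk instead of pulling the whole array into memory:

import numpy as np
import tables as tb

with tb.open_file('scan.h5', mode='w') as f:
    # An on-disk array (kept small here so the demo runs quickly).
    a = f.create_carray('/', 'a', tb.Float32Atom(), shape=(2000, 2000))
    a[:] = np.random.rand(2000, 2000).astype(np.float32)

    # Evaluate the expression out-of-core, writing into a second on-disk array.
    out = f.create_carray('/', 'out', tb.Float32Atom(), shape=(2000, 2000))
    expr = tb.Expr('2 * a + 1')
    expr.set_output(out)
    expr.eval()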
