I have read that some of the Python functions implemented in C, which I suppose includes file.read(), can release the GIL while they are running and then reacquire it on completion, and that in doing so they can make use of several cores if they are available.
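For example, a crude way to see this (the file name here is just a placeholder, and the timing is only indicative): run a pure-Python loop while a background thread reads a large file; if read() releases the GIL, the two overlap:

    import threading
    import time

    def cpu_work(n=10_000_000):
        # Pure-Python loop: holds the GIL for its entire duration
        total = 0
        for i in range(n):
            total += i
        return total

    def file_reader(path, block_size=1 << 20):
        # file_.read() drops into C and releases the GIL while the
        # underlying OS read is in flight
        with open(path, 'rb') as file_:
            while file_.read(block_size):
                pass

    reader = threading.Thread(target=file_reader, args=('big_file.bin',))
    start = time.perf_counter()
    reader.start()
    cpu_work()     # overlaps with the reads if the GIL really is released
    reader.join()
    print(time.perf_counter() - start)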
I use multiprocessing to parallelize some code. Currently I have three processes: the parent, one child that reads data from a file, and one child that generates a checksum from the data passed to it by the first child.
Now, if I understand this correctly, it seems that creating a new process for reading the file, as I am doing now, is unnecessary, and I could just call the read function in the main process. The question is: do I understand this correctly, and will performance be better with the reading kept in the main process or done in a separate one?
So, this is my function that reads the data and pipes it out to be processed:
    def read(file_path, pipe_out):
        with open(file_path, 'rb') as file_:
            while True:
                block = file_.read(block_size)  # block_size is defined elsewhere
                if not block:
                    break
                pipe_out.send(block)
        pipe_out.close()
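For context, the checksum child on the other end of the pipe is roughly this shape (hashlib.md5 and the names pipe_in/result_out are stand-ins, not my exact code):

    import hashlib

    def checksum(pipe_in, result_out):
        # Second child: hash incoming blocks until the write end is closed
        md5 = hashlib.md5()
        try:
            while True:
                md5.update(pipe_in.recv())
        except EOFError:
            # Raised only once *every* copy of the write end is closed,
            # including the one the parent process still holds
            pass
        result_out.send(md5.hexdigest())
        result_out.close()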
I suppose that starting the reader like this definitely uses multiple cores, but it also introduces some overhead:
    multiprocessing.Process(target=read, args=args).start()
But now I am wondering whether this would still use multiple cores, just without the overhead:
    read(*args)
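In case it matters, this is roughly how I would time the two variants against each other ('big_file.bin' is a placeholder, read() is the function shown above, and drain() is a simplified stand-in for the checksum child):

    import multiprocessing
    import time

    def drain(pipe_in, pipe_out):
        # Stand-in for the checksum child: close the inherited copy of the
        # write end first, otherwise recv() never sees EOF, then pull blocks
        pipe_out.close()
        try:
            while True:
                pipe_in.recv()
        except EOFError:
            pass

    def run_variant(in_subprocess, file_path='big_file.bin'):
        pipe_in, pipe_out = multiprocessing.Pipe(duplex=False)
        consumer = multiprocessing.Process(target=drain, args=(pipe_in, pipe_out))
        consumer.start()
        pipe_in.close()                    # the parent never reads from the pipe
        start = time.perf_counter()
        if in_subprocess:
            # read() is the function defined above
            reader = multiprocessing.Process(target=read, args=(file_path, pipe_out))
            reader.start()
            pipe_out.close()               # drop the parent's copy of the write end
            reader.join()
        else:
            read(file_path, pipe_out)      # read() closes pipe_out itself
        consumer.join()                    # returns once the consumer hits EOF
        return time.perf_counter() - start

    if __name__ == '__main__':
        print('separate process:', run_variant(True))
        print('main process:    ', run_variant(False))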
Any insight into which of these would be faster, and for what reason, would be much appreciated!