Python 3 - Multiprocessing - Queue.get() not responding

I want to run a brute-force attack and therefore need speed, so I turned to the multiprocessing library. However, in every tutorial I find, something doesn't work. The example below seems to work fine, except that whenever I call get(), it hangs and never returns. Am I just being stupid? I copied the example verbatim, expecting it to work.

    import multiprocessing as mp
    import random
    import string

    # Define an output queue
    output = mp.Queue()

    # Define an example function
    def rand_string(length, output):
        """Generates a random string of numbers, lower- and uppercase chars."""
        rand_str = ''.join(random.choice(
            string.ascii_lowercase + string.ascii_uppercase + string.digits)
            for i in range(length))
        output.put(rand_str)

    # Set up a list of processes that we want to run
    processes = [mp.Process(target=rand_string, args=(5, output)) for x in range(2)]

    # Run processes
    for p in processes:
        p.start()

    # Exit the completed processes
    for p in processes:
        p.join()

    # Get process results from the output queue
    results = [output.get() for p in processes]

    print(results)
python queue multiprocessing core
2 answers

@given hit the nail on the head! You don't have an if __name__ == "__main__": guard, so you get a fork bomb: each child process starts more processes, and so on. You will also notice that I have moved the queue creation inside the guard.

    import multiprocessing as mp
    import random
    import string

    # Define an example function
    def rand_string(length, output):
        """Generates a random string of numbers, lower- and uppercase chars."""
        rand_str = ''.join(random.choice(
            string.ascii_lowercase + string.ascii_uppercase + string.digits)
            for i in range(length))
        output.put(rand_str)

    if __name__ == "__main__":
        # Define an output queue
        output = mp.Queue()

        # Set up a list of processes that we want to run
        processes = [mp.Process(target=rand_string, args=(5, output)) for x in range(2)]

        # Run processes
        for p in processes:
            p.start()

        # Exit the completed processes
        for p in processes:
            p.join()

        # Get process results from the output queue
        results = [output.get() for p in processes]

        print(results)

What happens is that multiprocessing imports your script as a module in each child process, so __name__ is "__main__" only in the parent. Without the guard, each child process (tries to) start two more processes, each of which starts two more, and so on. No wonder IDLE freezes.
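To see the effect of the guard in isolation, here is a minimal, self-contained sketch (not from the question; it forces the 'spawn' start method, which re-imports the module in every child the way Windows always does):

```python
import multiprocessing as mp

def work(q):
    q.put("done")

if __name__ == "__main__":
    # Under 'spawn' (the default on Windows and macOS), each child re-imports
    # this file; without the __main__ guard, the Process(...) call below would
    # run again in every child, spawning processes without end.
    mp.set_start_method("spawn", force=True)
    q = mp.Queue()
    p = mp.Process(target=work, args=(q,))
    p.start()
    print(q.get())  # prints "done"; drain the queue before join
    p.join()
```

Everything that creates processes or queues lives under the guard; only definitions sit at module level, so re-importing the file is harmless.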


I tried using the multiprocessing module to convert a list of text files to BERT embeddings.

A BERT embedding is created for each file, but the process never finishes for one particular file.

Previously, I used join() to wait for the processes to finish, but it deadlocked.

So, as suggested here

Process.join() and Queue() don't work with large numbers
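The pitfall that question describes, as I understand it: a child that has put data on a Queue does not exit until its feeder thread has flushed the data through the underlying pipe, so calling join() before draining the queue can deadlock on large items. A minimal illustrative sketch of the safe ordering (not my real code):

```python
import multiprocessing as mp

def producer(q):
    # A payload comfortably larger than the pipe buffer, so the child's
    # feeder thread cannot flush it until someone reads from the queue.
    q.put("x" * 10_000_000)

if __name__ == "__main__":
    q = mp.Queue()
    p = mp.Process(target=producer, args=(q,))
    p.start()
    data = q.get()   # drain first ...
    p.join()         # ... then join; swapping these two lines can hang
    print(len(data))  # prints 10000000
```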

I changed the code to replace process.join() with a drain-and-poll loop:

    from multiprocessing import Process
    import multiprocessing
    import time
    import sys
    from datetime import datetime                # for the timing calls below
    from bert_serving.client import BertClient   # bert-serving-client package

    # form_path is defined elsewhere in my notebook

    def process(file, appended_data):
        start = datetime.now()
        file1_obj = open(form_path + file, 'r')
        file1 = file1_obj.readlines()
        file1_obj.close()
        file11 = [i.rstrip() for i in file1 if not (bool(not i or i.isspace()))]
        file111 = [' |||'.join(file11)]
        try:
            bc = BertClient()
            embedding1 = bc.encode(file111)
            del bc
        except ValueError:  # some files have '' as their first strings in the list
            embedding1 = None
        appended_data.put({file: embedding1})
        print("finished %s" % file)
        print(datetime.now() - start)
        return appended_data

    def embedding_dic(file_list):
        procs = []
        appended_data = multiprocessing.Queue()
        print(file_list[0])
        print(file_list)
        for file in file_list:
            procs.append(Process(target=process, args=(file, appended_data,)))
        for proc in procs:
            proc.start()

        results = []
        liveprocs = list(procs)
        while liveprocs:
            try:
                while 1:
                    r = appended_data.get(False)
                    results.append(r)
            except Exception:
                pass

            time.sleep(0.05)  # give tasks a chance to put more data in
            if not appended_data.empty():
                continue
            liveprocs = [p for p in liveprocs if p.is_alive()]
            print(liveprocs)
            print(len(results))
        return results
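Since each worker puts exactly one item on the queue, a simpler alternative to the polling loop above would be to read exactly one result per worker and only then join. A hedged sketch, with a dummy worker standing in for the real BertClient call:

```python
import multiprocessing as mp

def worker(name, q):
    # Stand-in for the real embedding work: exactly one put per worker.
    q.put({name: len(name)})

if __name__ == "__main__":
    names = ["a.txt", "bb.txt", "ccc.txt"]
    q = mp.Queue()
    procs = [mp.Process(target=worker, args=(n, q)) for n in names]
    for p in procs:
        p.start()
    # Exactly len(procs) puts happen, so exactly len(procs) gets are needed.
    # Draining before join() means the queue can never block process exit.
    results = [q.get() for _ in procs]
    for p in procs:
        p.join()
    print(len(results))  # prints 3
```

This avoids both the busy-wait and the is_alive() bookkeeping, at the cost of blocking forever if a worker dies without putting anything.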

Nevertheless, a deadlock still happens with certain files.

Description below:

Running the embedding_dic function on a list of files produces:

    No of files available : 7
    Files _names: ['0001368007_10-K_2007-03-22.txt', '0001368007_10-K_2008-03-25.txt',
                   '0001368007_10-K_2009-02-27.txt', '0001368007_10-K_2010-03-01.txt',
                   '0001368007_10-K_2011-02-28.txt', '0001368007_10-K_2012-02-29.txt',
                   '0001368007_10-K_2012-02-29.txt']
    Processes_started: [<Process(Process-1899, started)>, <Process(Process-1900, started)>,
                        <Process(Process-1901, started)>, <Process(Process-1902, started)>,
                        <Process(Process-1903, started)>, <Process(Process-1904, started)>,
                        <Process(Process-1905, started)>]
    [<all 7 processes still alive>] 0
    (the line above repeats 5 times while the polling loop waits)
    finished 0001368007_10-K_2009-02-27.txt
    0:00:03.055049
    finished 0001368007_10-K_2012-02-29.txt
    0:00:03.023879
    finished 0001368007_10-K_2012-02-29.txt
    0:00:03.055496
    finished 0001368007_10-K_2010-03-01.txt
    0:00:03.096127
    finished 0001368007_10-K_2011-02-28.txt
    0:00:03.099099
    [<Process(Process-1899, started)>, <Process(Process-1900, started)>] 5
    finished 0001368007_10-K_2008-03-25.txt
    0:00:04.473414
    [<Process(Process-1899, started)>] 6
    (the line above repeats indefinitely, then, after interrupting:)
    Process Process-1899:
    Traceback (most recent call last):
      File "/home/jovyan/.conda/envs/pycp_py3k/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
        self.run()
      File "/home/jovyan/.conda/envs/pycp_py3k/lib/python3.6/multiprocessing/process.py", line 93, in run
        self._target(*self._args, **self._kwargs)
      File "<ipython-input-315-ffe782d1c2f5>", line 12, in process
        embedding1=bc.encode(file111)
      File "/home/jovyan/.conda/envs/pycp_py3k/lib/python3.6/site-packages/bert_serving/client/__init__.py", line 206, in arg_wrapper
        return func(self, *args, **kwargs)
      File "/home/jovyan/.conda/envs/pycp_py3k/lib/python3.6/site-packages/bert_serving/client/__init__.py", line 291, in encode
        r = self._recv_ndarray(req_id)

So the process deadlocks on the file 0001368007_10-K_2007-03-22.txt when the whole list of files is given as input.

If I run it with just that same file as the input, it finishes!

It also finishes if the number of files is kept to 5.

It even finishes for other lists containing more than 7 files, e.g. 10 or 12.

I cannot figure out how to debug this.
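One way to at least identify the stuck worker is to join with a timeout and report anything still alive. A sketch, with a hypothetical sleepy worker standing in for the bc.encode() call that hangs:

```python
import multiprocessing as mp
import time

def slow(seconds):
    time.sleep(seconds)  # stands in for a BertClient call that may hang

if __name__ == "__main__":
    procs = [mp.Process(target=slow, args=(t,), name=f"worker-{t}")
             for t in (0.1, 30)]
    for p in procs:
        p.start()
    for p in procs:
        p.join(timeout=2)  # don't wait forever on a stuck child
        if p.is_alive():
            print(f"{p.name} is still running -- likely stuck; terminating")
            p.terminate()
            p.join()
```

Printing the process name (or the file it was given) on timeout narrows the hang down to a specific input without blocking the whole run.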

Another symptom I noticed:

  • if I restart the code after a while, it finishes!

Help appreciated!

