ZeroMQ and Python multiprocessing: too many open files

I have an agent-based model in which a central process launches several agents, and the agents exchange data through another central process. Each agent and the communication process interact via ZMQ. However, when I run more than 100 agents, it prints this error:

Invalid argument (src/stream_engine.cpp:143)
Too many open files (src/ipc_listener.cpp:292)

and macOS reports a crash:

Python quit unexpectedly while using the libzmq.5.dylib module.

It seems to me that too many contexts are open. But how can I avoid this when using multiprocessing?

I am attaching a piece of code below:

```python
class Agent(Database, Logger, Trade, Messaging, multiprocessing.Process):
    def __init__(self, idn, group, _addresses, trade_logging):
        multiprocessing.Process.__init__(self)
        ...

    def run(self):
        self.context = zmq.Context()
        self.commands = self.context.socket(zmq.SUB)
        self.commands.connect(self._addresses['command_addresse'])
        self.commands.setsockopt(zmq.SUBSCRIBE, "all")
        self.commands.setsockopt(zmq.SUBSCRIBE, self.name)
        self.commands.setsockopt(zmq.SUBSCRIBE, group_address(self.group))
        self.out = self.context.socket(zmq.PUSH)
        self.out.connect(self._addresses['frontend'])
        time.sleep(0.1)
        self.database_connection = self.context.socket(zmq.PUSH)
        self.database_connection.connect(self._addresses['database'])
        time.sleep(0.1)
        self.logger_connection = self.context.socket(zmq.PUSH)
        self.logger_connection.connect(self._addresses['logger'])
        self.messages_in = self.context.socket(zmq.DEALER)
        self.messages_in.setsockopt(zmq.IDENTITY, self.name)
        self.messages_in.connect(self._addresses['backend'])
        self.shout = self.context.socket(zmq.SUB)
        self.shout.connect(self._addresses['group_backend'])
        self.shout.setsockopt(zmq.SUBSCRIBE, "all")
        self.shout.setsockopt(zmq.SUBSCRIBE, self.name)
        self.shout.setsockopt(zmq.SUBSCRIBE, group_address(self.group))
        self.out.send_multipart(['!', '!', 'register_agent', self.name])
        while True:
            try:
                self.commands.recv()  # catches the group address
            except KeyboardInterrupt:
                print('KeyboardInterrupt: %s, self.commands.recv() to catch own address ~1888' % self.name)
                break
            command = self.commands.recv()
            if command == "!":
                subcommand = self.commands.recv()
                if subcommand == 'die':
                    self.__signal_finished()
                    break
            try:
                self._methods[command]()
            except KeyError:
                if command not in self._methods:
                    raise SystemExit('The method - ' + command + ' - called in the agent_list is not declared (' + self.name + ')')
                else:
                    raise
            except KeyboardInterrupt:
                print('KeyboardInterrupt: %s, Current command: %s ~1984' % (self.name, command))
                break
            if command[0] != '_':
                self.__reject_polled_but_not_accepted_offers()
        self.__signal_finished()
        # self.context.destroy()
```
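One thing that may help on shutdown: explicitly close every socket before terminating the context, so the commented-out `self.context.destroy()` is not the only place file descriptors would get released. A minimal sketch, not ABCE's actual API (`close_agent_sockets` is a hypothetical helper):

```python
import zmq

def close_agent_sockets(context, sockets):
    """Close each socket with LINGER=0 so pending messages don't keep
    the file descriptor alive, then terminate the context."""
    for sock in sockets:
        sock.setsockopt(zmq.LINGER, 0)
        sock.close()
    context.term()
```

An agent could call this at the end of `run()` with all seven of its sockets.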

All the code is at http://www.github.com/DavoudTaghawiNejad/abce

python-multiprocessing zeromq

2 answers

Most likely the problem is not too many contexts but too many sockets. Looking through your repo, I see that you (correctly) use IPC as your transport; IPC uses a file descriptor as the "address" for passing data between processes. If I read it correctly, you open up to 7 sockets per process, and that adds up quickly. I am fairly sure that if you do some debugging in the middle of your code, you will see that it fails not when the last context is created, but when creating the socket that pushes the number of open files over the limit.
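A back-of-the-envelope check of that claim (the numbers are illustrative; `baseline` is a guess at stdin/stdout, libraries, and ordinary open files):

```python
def estimated_fds(n_agents, sockets_per_agent=7, baseline=60):
    # Each IPC socket consumes at least one file descriptor, so the
    # socket count grows linearly with the number of agents.
    return n_agents * sockets_per_agent + baseline

# 100 agents -> 760 FDs, uncomfortably close to a 1024 soft limit.
```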

As I understand it, the typical user limit for open FDs is around 1000, so at about 100 agents you are hitting roughly 700 open FDs for your sockets alone; the rest are probably ordinary files and pipes. Depending on your situation, there should be no problem raising your limit to 10,000. Otherwise you will have to rewrite your design to use fewer sockets per process in order to reach a higher process count.
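You can also raise the soft limit from inside Python with the standard-library `resource` module instead of `ulimit` (a sketch; without elevated privileges the soft limit can only be raised up to the hard limit, and `resource` is Unix-only):

```python
import resource

def raise_fd_limit(target=10_000):
    """Raise the soft RLIMIT_NOFILE toward `target`, capped at the
    hard limit; raising the hard limit itself requires root."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard == resource.RLIM_INFINITY:
        new_soft = target
    else:
        new_soft = min(target, hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    return new_soft
```

Calling this once in the parent before spawning agents is enough, since child processes inherit the limit.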


This has nothing to do with ZeroMQ or Python as such. The underlying operating system allows each process only a certain number of simultaneously open files, and this limit covers not just regular files but also socket connections.

You can see your current limit with `ulimit -n`; it probably defaults to 1024. Machines running servers, or used for other demanding purposes (such as multiprocessing), often need this limit set higher, or set to unlimited. See the ulimit documentation for more information.
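The same check can be done from Python, in case you want your launcher to warn before spawning agents (a sketch using the stdlib `resource` module; `fd_headroom` is a hypothetical helper, Unix-only):

```python
import resource

def fd_headroom(needed):
    """Compare the soft RLIMIT_NOFILE (what `ulimit -n` reports)
    against the number of descriptors we expect to need."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft == resource.RLIM_INFINITY or soft >= needed
```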

In addition, there is another, system-wide global limit, though I have never had to adjust it.

More generally, you should ask yourself whether you really need that many agents. Usually X to 2X worker processes are enough, where X is the number of your CPUs.

