To get speed acceleration for my python program, do I have to spawn a separate thread or a separate registration process?

To get speed acceleration for my python program, do I have to spawn a separate thread or a separate registration process? My program uses a lot of logging, and I'm not sure if threads are suitable because of the GIL. Many resources seem to suggest that I / O should be fine. I think registration is input / output, but I'm not sure what “should be good” means for most resources. I just need speed.

+8
performance python multithreading
source share
2 answers

Before you start trying to optimize the program, you need to do something.

To get started, you must profile your programs. You can, for example, use line_profiler .

If it turns out that your software spends a significant amount of time registering, there are two simple options.

  • Install loglevel in production code so that no or more (er) messages are logged. There will still be some overhead, but they should be significantly reduced.
  • Use mechanical means (such as sed or grep ) to completely remove registration calls from production code. If this does not improve the speed / throughput of your program, registration was not a problem.

If none of them works, and registration is a significant part of the time of your program, you can try to implement logging of threads or processes.

If you want to use threading for logging, you will need a list and a lock. The function called from the main thread for logging captures the lock, adds text to enter the list, and releases the lock. The second thread waits for the lock, captures the lock, pushes a couple of elements from the list, releases the lock, and writes the elements to a file. Since the GIL ensures that only one thread at a time uses Python bytecode, this will slightly decrease the performance of your program; part of its time is spent executing bytecode from the logging stream.

Using multiprocessing slightly different in that you probably want to use, for example. a Queue to send log messages from the main process to the logging process. The logging process takes items from the queue and writes them to disk. This means that the time taken to record the registration actions to disk is spent in another program. But there is some overhead associated with using the queue.

You will need to measure which method uses less time in your program.

+8
source share

I proceed from these assumptions:

  • You have already determined that your journal is the bottleneck of your program.
  • You have a good reason why you are registering what you are registering.

The alleged slowness is most likely associated with confirmation of the success or failure of the registration action. To avoid this “command queue”, invoke a separate process asynchronously and skip the callback. This may end up consuming more resources, but it will ease the lag in your main program. Nodejs handles this naturally, or you can roll your own python listener. Since it will be a separate process. You can redirect the logging function of other programs to this one. You can even have a separate machine to handle this workload.

0
source share

All Articles