Debugging issues for Linux?

I have a daemon process that manages configuration. All other processes must interact with this daemon in order to function. But when I perform a large operation, after a few hours the daemon process stops responding for 2-3 hours. After those 2-3 hours it works fine again.

What debugging utilities are there for diagnosing hangs on Linux?

How can I find out at what point a Linux process freezes?

+4
3 answers
  • strace can display the latest system calls and their result
  • lsof can show open files
  • syslog can be very effective when log messages are written to track progress. It lets you narrow the problem down to a smaller area. Log messages can also be correlated with messages from other systems, which often yields interesting results.
  • wireshark, if the application uses sockets, makes the traffic on the wire visible.
  • ps ax and top can show whether your application is in a busy loop (running all the time), sleeping, or blocked on I/O, and how much CPU and memory it is using.

Each of these provides some information; together they build a picture of the problem.
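As a rough illustration of what ps and top report, the state of a process can also be read directly from /proc. This is only a sketch, assuming Linux with procfs mounted; parse_stat and describe are hypothetical helper names, not part of any tool listed above.

```python
# One-letter process states as shown by ps and in /proc/<pid>/stat.
STATE_NAMES = {
    "R": "running",
    "S": "sleeping (interruptible)",
    "D": "uninterruptible sleep (usually blocked on I/O)",
    "T": "stopped",
    "Z": "zombie",
}


def parse_stat(stat_line):
    """Extract (command, state) from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so locate the last ')' instead of splitting
    # the whole line on whitespace.
    open_paren = stat_line.index("(")
    close_paren = stat_line.rindex(")")
    command = stat_line[open_paren + 1:close_paren]
    state = stat_line[close_paren + 1:].split()[0]
    return command, state


def describe(pid):
    """Return a one-line description of the current state of `pid`."""
    with open(f"/proc/{pid}/stat") as f:
        command, state = parse_stat(f.read())
    return f"{command}: {STATE_NAMES.get(state, state)}"
```

Calling describe(pid) a few times while the daemon is unresponsive gives a first hint: a process that stays in R is probably spinning, while one that stays in S or D is waiting on something, which strace or lsof can then help identify.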

When using gdb, it may be useful to trigger a core dump while the application is hung. You then have a static snapshot that you can analyze post mortem at your leisure. You can trigger the dumps from a script, which quickly gives you a set of snapshots to test your theories against.
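A script for collecting such snapshots could look like the sketch below. It assumes the gcore utility that ships with gdb is installed and that you have permission to attach to the process; the snapshot-N prefix, the function names, and the default count and interval are illustrative choices, not anything prescribed by the answer.

```python
import subprocess
import time


def gcore_command(pid, prefix):
    """Build the gcore invocation; gcore writes the dump to <prefix>.<pid>."""
    return ["gcore", "-o", prefix, str(pid)]


def take_snapshots(pid, count=3, interval=60.0, run=subprocess.run):
    """Dump `count` core files of `pid`, waiting `interval` seconds between them.

    `run` is injectable so the logic can be exercised without gdb installed.
    """
    written = []
    for i in range(count):
        prefix = f"snapshot-{i}"  # hypothetical naming scheme
        run(gcore_command(pid, prefix), check=True)
        written.append(f"{prefix}.{pid}")
        if i < count - 1:
            time.sleep(interval)
    return written
```

Each resulting file can later be opened with gdb together with the executable, without disturbing the live daemon any further.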

+9

One option is to use gdb and its attach command to attach to the running process. You will need to load a file containing the symbols of the executable (using the file command).
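The same steps can be run non-interactively, which is handy when the hang only shows up after hours. The sketch below only builds the command line; it assumes gdb is installed, and the function name and the example path /usr/sbin/configd are hypothetical. Executable paths containing spaces may need extra quoting.

```python
import subprocess


def backtrace_command(pid, executable=None):
    """Build a gdb batch command that prints a backtrace of every thread."""
    cmd = ["gdb", "-batch"]
    if executable is not None:
        # Same as typing `file <executable>` at the gdb prompt: load symbols.
        cmd += ["-ex", f"file {executable}"]
    cmd += [
        "-ex", f"attach {pid}",
        "-ex", "thread apply all bt",
        "-ex", "detach",  # let the process continue afterwards
    ]
    return cmd


def dump_backtrace(pid, executable=None):
    """Run gdb and return its text output (requires permission to attach)."""
    result = subprocess.run(backtrace_command(pid, executable),
                            capture_output=True, text=True)
    return result.stdout
```

Comparing two or three backtraces taken a few minutes apart during the hang usually shows which call the daemon is stuck in.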

+1

There are several ways to make a hang detectable:

  • Listen on a UNIX domain socket to handle status requests. An external application can then ask whether everything is in order. If it does not receive a response within a certain timeout, it can assume that the queried application is deadlocked or dead.

  • Periodically touch a file at a pre-selected path. An external application can check the file's timestamp, and if it is out of date, it can assume that the application is dead or deadlocked.

  • Call the alarm syscall repeatedly, with the signal set up to terminate the process (use sigaction accordingly). As long as you keep calling alarm (that is, while your program is making progress), the process keeps running. As soon as you stop, the signal fires.
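The second option, the heartbeat file, is probably the simplest to sketch. The code below is a minimal illustration under the assumption that the daemon and the monitor agree on a path and a maximum age; both function names are hypothetical.

```python
import os
import time


def touch_heartbeat(path):
    """Called periodically by the daemon: create the file or refresh its mtime."""
    with open(path, "a"):
        pass
    os.utime(path, None)  # set access and modification time to now


def is_stale(path, max_age, now=None):
    """Called by the monitor: True if the heartbeat is missing or too old."""
    if now is None:
        now = time.time()
    try:
        mtime = os.stat(path).st_mtime
    except FileNotFoundError:
        return True
    return now - mtime > max_age
```

The daemon would call touch_heartbeat from its main loop, so a loop that stops making progress also stops refreshing the file. Choosing max_age as a few multiples of the touch interval avoids false alarms under load.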

You can easily restart your process when it dies by using fork and waitpid, as described in this answer. This does not require significant resources, since the OS shares memory pages between the parent and the child.
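The alarm watchdog and the fork/waitpid restart fit together naturally. The following is a sketch, not the linked answer's code: it assumes a POSIX system, and run_child, supervise, and the pet callback are hypothetical names. The child must call pet() regularly; if it stops, the default action of SIGALRM kills it and the parent starts a new one.

```python
import os
import signal
import time


def run_child(work, timeout):
    """Fork a child that runs work(pet); return the raw waitpid status."""
    pid = os.fork()
    if pid == 0:
        code = 1
        try:
            # Default action for SIGALRM: terminate the process.
            signal.signal(signal.SIGALRM, signal.SIG_DFL)

            def pet():
                # Re-arm the watchdog; this replaces any pending alarm.
                signal.alarm(timeout)

            pet()
            work(pet)
            code = 0
        finally:
            os._exit(code)  # never fall back into the parent's code
    _, status = os.waitpid(pid, 0)
    return status


def supervise(work, timeout, max_restarts):
    """Restart the child each time the watchdog alarm kills it."""
    restarts = 0
    while True:
        status = run_child(work, timeout)
        killed_by_alarm = (os.WIFSIGNALED(status)
                           and os.WTERMSIG(status) == signal.SIGALRM)
        if not killed_by_alarm or restarts >= max_restarts:
            return restarts, status
        restarts += 1
```

The max_restarts limit is a design choice added here so that a child that hangs immediately on every start does not get restarted forever.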

0

Source: https://habr.com/ru/post/1312723/