Strace: what a process does

From Linuxintro
Revision as of 20:55, 4 April 2015 by imported>ThorstenStaerk (→‎QuickStart)

strace is a command that shows all requests from a running process to the kernel (aka syscalls). This helps you to do excellent troubleshooting and gain good performance tuning insights.

strace mindmap

QuickStart

Let's start with a real-world example: You want to troubleShoot sound on your computer. You know that speaker-test does some test noise. To find out what exactly happens, issue:

# strace -s 99 -ff speaker-test

You will find a lot of lines like

[pid 15287] write(6, "W", 1)            = 1
[pid 15287] recvmsg(7, {msg_name(0)=NULL, msg_iov(1)=[{"L\0\0\0\2L\0\0\2+U\0\0\0\0\0\31vqU\0\0\0\0\0\0\0\0001TU \240\326\0\4\252\354TU \240\326\0\4\262%r\0\0\0\0\0\20\0\0r\0\0\0\0\0\2\260\220R\0\0\0\0\0\0\0\0R\0\0\0\0\0\2\260\220", 83}], msg_controllen=24, {cmsg_len=24, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=2289, uid=0, gid=0}}, msg_flags=0}, 0) = 83
[pid 15287] write(6, "W", 1)            = 1

Looks complicated, right? But fear not, every line follows the same pattern:

[pid process id] syscall(parameters) = return code

You will only find the first column with the process id if you use the -ff option (follow sub-processes). For the syscall (write or recvmsg in the above example) it is easy to obtain information - just type

man 2 syscall

in the above example

man 2 write 

and

man 2 recvmsg

And you will find abundant information what the syscall does. If you understand 50% you are doing well :) The return code will also be documented in the syscall's man page.

starting strace

You can start a new process with strace or you can attach strace to an already running process. To start a new process just put strace in front of its command:

strace -s 99 -ff ls

To attach strace to an already running process, in this example firefox, find out the process' id:

# ps -A | grep firefox
2728 pts/1    00:00:02 firefox

Then call strace with the id as parameter:

strace -s 99 -ffp 2728

Understanding strace's output

The bad news is that strace's output looks tedious like this:

clock_gettime(CLOCK_MONOTONIC, {4433, 764347636}) = 0
poll([{fd=12, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 1, 0) = 1 ([{fd=12, revents=POLLIN|POLLRDNORM}])
recv(12, "\37\36&\364T<\264\6Og\20\2\30%-\31C\3757\350{\260\0203\351\23&O\365(y\212"..., 16384, 0) = 1448

The good news is that every line follows the same scheme:

syscall(parameters) = return value

Whatever you see for syscall you can get help on using the man(2) pages. So for the above example

man 2 clock_gettime
man 2 poll
man 2 recv

Will give you the information what is going on. For example, clock_gettime retrieves the current time.


Wow, that was easy. Now to the next line, poll. According to man 2 poll it waits for an event on a file descriptor, like "data to be read". The file descriptor is the first argument, 12. Fortunately you can look up the file descriptor's meaning if you know the process id. In this case the process id was 7179:

/proc/7179/fd # ll
total 0
lr-x------ 1 root root 64 Apr 19 08:02 0 -> /dev/null
lrwx------ 1 root root 64 Apr 19 08:02 1 -> socket:[156171]
l-wx------ 1 root root 64 Apr 19 08:02 2 -> /var/log/apache2/error_log
lr-x------ 1 root root 64 Apr 19 08:02 4 -> pipe:[156208]
l-wx------ 1 root root 64 Apr 19 08:02 5 -> pipe:[156208]
l-wx------ 1 root root 64 Apr 19 08:02 6 -> /var/log/apache2/access_log
lrwx------ 1 root root 64 Apr 19 08:02 7 -> anon_inode:[eventpoll]
lrwx------ 1 root root 64 Apr 19 08:02 12 -> socket:[155097]

Ok, to find out what socket:[155097] is we use the command

# lsof | grep 155097
httpd2-pr 7158            root    3u     IPv4     155097      0t0        TCP *:http (LISTEN)
httpd2-pr 7179          wwwrun    3u     IPv4     155097      0t0        TCP *:http (LISTEN)

So this is a socket listening to the http port.

limitations

strace will show you all syscalls from a process, and that's it. This also mean that if your process hangs in an endless loop that does not do any syscall, strace will not show anything.

interprocess communication

Processes will typically communicate:

  • via network
  • via sockets
  • via pipes
  • via shared memory
  • by writing into files

For example when using strace to troubleshoot sound you may see that speaker-test receives messages from file descriptor 7:

# strace -s 99 -ffp 32681
[pid 32682] recvmsg(7, {msg_name(0)=NULL, msg_iov(1)=[{"L\0\0\0\2L\0\0\0}U\0\0\0\0\0\22\10\277U\0\0\0\0\0\0\0\0001TU b|\0\17(\347TU b|\0\0175\342r\0\0\0\0\0\20\0\0r\0\0\0\0\0\2\260\220R\0\0\0\0\0\0\0\0R\0\0\0\0\0\2\260\220", 83}], msg_controllen=24, {cmsg_len=24, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=2289, uid=0, gid=0}}, msg_flags=0}, 0) = 83

Now let's find out what file descriptor 7 of process 32681 is:

# ll /proc/32681/fd/7
lrwx------ 1 root root 64 Apr  4 18:17 /proc/32681/fd/7 -> socket:[198064]

So it is unix socket 198064. Let's find out what process it is:

# ss -p | grep 198064
u_str  ESTAB      0      0                    * 198064                * 198065  users:(("speaker-test",32681,7))
u_str  ESTAB      0      0      /run/user/0/pulse/native 198065                * 198064  users:(("pulseaudio",2289,28))

Search for a syscall

With strace, you can search for a special syscall. For example,

strace -e open command

will show you all files that have been touched (opened) by command. This can be used e.g. to find out where configuration changes are stored.

Performance analysis

With strace, you can find out the biggest time-consuming syscalls during a program run:

strace -c ls -R
Entries  Repository  Root
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 67.09    0.412153          14     29664           getdents64
 27.70    0.170168          11     14849        14 open
  4.24    0.026043           0    123740           write
  0.72    0.004443           0     14837           close
  0.20    0.001204           0     14836           fstat
  0.05    0.000285         285         1           execve
  0.00    0.000000           0        12           read
  0.00    0.000000           0         4         3 stat
  0.00    0.000000           0        33           mmap
  0.00    0.000000           0        18           mprotect
  0.00    0.000000           0         4           munmap
  0.00    0.000000           0        12           brk
  0.00    0.000000           0         2           rt_sigaction
  0.00    0.000000           0         1           rt_sigprocmask
  0.00    0.000000           0         2           ioctl
  0.00    0.000000           0         1         1 access
  0.00    0.000000           0         3           mremap
  0.00    0.000000           0         1           fcntl
  0.00    0.000000           0         1           getrlimit
  0.00    0.000000           0         1           statfs
  0.00    0.000000           0         1           arch_prctl
  0.00    0.000000           0         3         1 futex
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0         8           fadvise64
  0.00    0.000000           0         1           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00    0.614296                198036        19 total

Now you go

man 2 getdents64

to find out what that syscall is about.

See also