The right way to deal with frozen processes on Unix

Those who administer production Unix systems have undoubtedly encountered the problem of frozen processes before. They just sit around, consuming CPU and/or memory indefinitely until you forcefully shut them down.

Phusion Passenger 3 – our high-performance and advanced web application server – was not completely immune to this problem either. That is until today, because we have just implemented a fix which will be part of Phusion Passenger 4.0 (4.0 hasn’t been released yet, but a first beta will appear soon). It’s going into the open source version, not just Phusion Passenger Enterprise, because we believe this is an important change that should benefit the entire community.

Behind our solution lies an array of knowledge about operating systems, process management and Unix. Today, I’d like to take this opportunity to share some of our knowledge with our readers. In this article we’re going to dive into the operating system-level details of frozen processes. Some of the questions answered by this article include:

  • What causes processes to become frozen, and how can you debug them?
  • What facilities does Unix provide to control and to shut down processes?
  • How can a process manager (e.g. Phusion Passenger) automatically clean up frozen processes?
  • Why was Phusion Passenger not immune to the problem of frozen processes?
  • How is Phusion Passenger’s frozen process killing fix implemented, and under what constraints does it work?

This article attempts to be generic, but will also provide Ruby-specific tips.

Not all frozen processes are made equal

Let me first clear some confusion. Frozen processes are sometimes also called zombie processes, but this is not formally correct. Formally, a zombie process as defined by Unix operating systems is a process that has already exited, but its parent process has not waited for its exit yet. Zombie processes show up in ps and on Linux they have “<defunct>” appended to their names. Zombie processes do not consume memory or CPU. Killing them – even with SIGKILL – has no effect. The only way to get rid of them is to either make the parent process call waitpid() on them, or by terminating the parent process. The latter causes the zombie process to be “adopted” by the init process (PID 1), which immediately calls waitpid() on any adopted processes. In any case, zombie processes are harmless.
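
To make the distinction concrete, here is a minimal Ruby sketch (purely illustrative, not related to Phusion Passenger) that deliberately creates a zombie and then reaps it with waitpid():

pid = fork { exit }                        # child exits immediately
sleep 1                                    # we haven't waited yet, so the child is now a zombie
system("ps -o pid,stat,comm -p #{pid}")    # STAT shows 'Z'; on Linux the name shows <defunct>
Process.waitpid(pid)                       # reaping the child makes the zombie disappear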

In this article we’re only covering frozen processes, not zombie processes. A process is often considered frozen if it has stopped responding normally: it appears to be stuck inside something and is not throwing errors. Some of the general causes are:

  1. The process is stuck in an infinite loop. In practice we rarely see this kind of freeze.
  2. The process is very slow, even during shutdown, causing it to appear frozen. Some of the causes include:
    2.1. It is using too much memory, causing it to hit the swap. In this case you should notice the entire system becoming slow.
    2.2. It is performing an unoptimized operation that takes a long time. For example you may have code that iterates through all database table rows to perform a calculation. In development you only had a handful of rows so you never noticed this, but you forgot that in production you have millions of rows.
  3. The process is stuck in an I/O operation. It’s waiting for an external process or a server, be it the database, an HTTP API server, or whatever.
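
As a concrete illustration of cause 3, the following sketch (assuming a server that accepts connections but never sends any data) blocks forever in a read, which is how many real-world freezes look from the outside:

require "socket"

server = TCPServer.new(0)                        # listens, but never writes anything
client = TCPSocket.new("127.0.0.1", server.addr[1])
client.read(1)                                   # blocks indefinitely; the process appears frozen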

Debugging frozen processes

Obviously fixing a frozen process involves more than figuring out how to automatically kill it. Killing it just fixes the symptoms, and should be considered a final defense in the war against frozen processes. It’s better to figure out the root cause. We have a number of tools in our arsenal to find out what a frozen process is doing.

crash-watch

crash-watch is a tool that we’ve written to easily obtain process backtraces. It can be instructed to dump a process’s backtrace when that process crashes, or to dump its current backtrace on demand.

Crash-watch is actually a convenience wrapper around gdb, so you must install gdb first. It dumps C-level backtraces, not language-specific backtraces. If you run crash-watch on a Ruby program it will dump the Ruby interpreter’s C backtraces and not the Ruby code’s backtraces. It also dumps the backtraces of all threads, not just the active thread.

Invoke crash-watch as follows:

crash-watch --dump <PID>

As an example, we invoked crash-watch on a simple “hello world” Rack program that simulates being frozen.

Crash-watch is especially useful for analyzing C-level problems. In this example, its output clearly shows that the program is frozen in a freeze_process call.
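
For reference, a frozen sample program along those lines could look like the following config.ru sketch. The original sample froze inside a C call; here freeze_process simply loops forever, which is enough to make a worker hang so you can try crash-watch yourself:

# config.ru
def freeze_process
  loop { }                     # never returns: the process appears frozen
end

app = lambda do |env|
  freeze_process
  [200, { "Content-Type" => "text/plain" }, ["hello world"]]
end

run app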

Crash-watch can also assist in analyzing problems caused by Ruby C extensions. Ruby’s mysql gem is quite notorious in this regard because it blocks all Ruby threads while it’s doing its work. If the MySQL server doesn’t respond (e.g. because of network problems, because the server is dead, or because the query is too heavy) then the mysql gem will freeze the entire process, leaving it unable even to respond to signals. With crash-watch you can clearly see when a process is frozen in a mysql gem call.

Phusion Passenger’s SIGQUIT handler

Ruby processes managed by Phusion Passenger respond to SIGQUIT by printing their backtraces to stderr. On Phusion Passenger, stderr is always redirected to the global web server error log (e.g. /var/log/apache2/error.log or /var/log/nginx/error.log), not the vhost-specific error log. If your Ruby interpreter supports it, Phusion Passenger will even print the backtraces of all Ruby threads, not just the active one. This latter feature requires either Ruby Enterprise Edition or Ruby 1.9.

Note that for this to work, the Ruby interpreter must be responding to signals. If the Ruby interpreter is frozen inside a C extension call (such as is the case in the sample program) then nothing will happen. In that case you should use crash-watch or the rb_backtrace() trick below.
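
To trigger the handler manually, send the signal from a terminal and then check the global web server error log:

kill -QUIT <PID>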

rb_backtrace()

If you want to debug a Ruby program that’s not managed by Phusion Passenger then there’s another trick to obtain the Ruby backtrace. The Ruby interpreter has a nice function called rb_backtrace() which causes it to print its current Ruby-level backtrace to stdout. You can use gdb to force a process to call that function. This works even when the Ruby interpreter is stuck inside a C call. This method has two downsides:

  1. Its reliability depends on the state of the Ruby interpreter. You are forcing a call from arbitrary places in the code, so you risk corrupting the process state. Use with caution.
  2. It only prints the backtrace of the active Ruby thread. It’s not possible to print the backtraces of any other Ruby threads.

First, start gdb:

$ gdb

Then attach gdb to the process you want:

attach <PID>

This will probably print a whole bunch of messages. Ignore them; if gdb prints a lot of library names and then asks you whether to continue, answer Yes.

Now we get to the cream. Use the following command to force a call to rb_backtrace():

p (void) rb_backtrace()

You should now see a backtrace appearing in the process’s stdout:

from config.ru:5:in `freeze_process'
from config.ru:5
from /Users/hongli/Projects/passenger/lib/phusion_passenger/rack/thread_handler_extension.rb:67:in `call'
from /Users/hongli/Projects/passenger/lib/phusion_passenger/rack/thread_handler_extension.rb:67:in `process_request'
from /Users/hongli/Projects/passenger/lib/phusion_passenger/request_handler/thread_handler.rb:126:in `accept_and_process_next_request'
from /Users/hongli/Projects/passenger/lib/phusion_passenger/request_handler/thread_handler.rb:100:in `main_loop'
from /Users/hongli/Projects/passenger/lib/phusion_passenger/utils/robust_interruption.rb:82:in `disable_interruptions'
from /Users/hongli/Projects/passenger/lib/phusion_passenger/request_handler/thread_handler.rb:98:in `main_loop'
from /Users/hongli/Projects/passenger/lib/phusion_passenger/request_handler.rb:432:in `start_threads'
from /Users/hongli/Projects/passenger/lib/phusion_passenger/request_handler.rb:426:in `initialize'
from /Users/hongli/Projects/passenger/lib/phusion_passenger/request_handler.rb:426
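
When you are done, detach gdb from the process and quit, so that the process can continue running (insofar as it is able to):

detach
quit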

strace and dtruss

The strace tool (Linux) and the dtruss tool (OS X) can be used to see which system calls a process is making. This is especially useful for detecting problems belonging to categories (1) and (2.2).

Invoke strace and dtruss as follows:

sudo strace -p <PID>
sudo dtruss -p <PID>

Phusion Passenger’s role

Phusion Passenger was traditionally architected to trust application processes. That is, it assumed that if we tell an application process to start, it will start, and that if we tell it to stop, it will stop. In practice this is not always true. Applications and libraries contain bugs that can cause freezes, or interaction with a buggy or unreachable external component (network problems, database server problems, etc.) can cause freezes. Our point of view was that the developer and the system administrator should be responsible for these kinds of problems. If the developer or administrator does not manually intervene, the system may remain unusable.

Starting from Phusion Passenger 3, we began turning away from this philosophy. Our core philosophy has always been that software should Just Work™ with the least amount of hassle, and that software should strive to be zero-maintenance. As a result, Phusion Passenger 3 introduced the Watchdog, Phusion Passenger Enterprise introduced request time limiting, and Phusion Passenger 4 will introduce application spawning time limiting. These features attack different aspects of the frozen process problem:

  1. Application spawning time limiting solves the problem of application processes not starting up quickly enough, or application processes freezing during startup. This feature will be included in the open source version of Phusion Passenger.
  2. Request time limiting solves the problem of application processes freezing during a web request. This feature is not available in the open source version of Phusion Passenger; it is only available in Phusion Passenger Enterprise.
  3. The Watchdog traditionally only solved the problem of Phusion Passenger helper processes (namely the HelperAgent and LoggingAgent) freezing during web server shutdown. Now it also solves the problem of application processes freezing during web server shutdown. These features, too, will be included in the open source version of Phusion Passenger.

The shutdown procedure and the fix

When you shut down your web server, the Watchdog will be the one to notice this event. It will send a message to the LoggingAgent and the HelperAgent to tell them to gracefully shut down. In turn, the HelperAgent will tell all application processes to gracefully shut down. The HelperAgent does not wait until they’ve actually shut down. Instead it will just assume that they will eventually shut down. It is for this reason that even if you shut down your web server, application processes may stay behind.

The Watchdog was already designed to assume that agent processes could freeze during shutdown, so it gives agent processes a maximum amount of time to shut down gracefully. If they don’t, the Watchdog forcefully kills them with SIGKILL. When it does so, it doesn’t just kill the agent processes, but also all application processes.

The fix was therefore pretty straightforward: always have the Watchdog kill application processes, even if the HelperAgent terminates normally and in time. The final fix was effectively 3 lines of code.

Utilizing Unix process groups

The most straightforward way to shut down application processes would be to maintain a list of their PIDs and then kill them one by one. The Watchdog however uses a more powerful and little-used Unix mechanism: process groups. Let me first explain them.

A system can have multiple process groups. A process belongs to exactly one process group, and each process group has exactly one “process group leader”. The process group ID is equal to the PID of the group leader.

The process group that a process belongs to is inherited from its parent process upon forking. However, a process can become a member of any process group, no matter which group its parent belongs to.


Process group membership is not constrained by parent-child relationships: processes in completely different branches of a process tree can belong to the same process group, while sibling processes can belong to different ones.

You can build such a process tree yourself using the following Ruby code.

top = $$
puts "Top process PID: #{$$}"
Process.setpgrp                 # the top process becomes the leader of its own process group
pid = fork do
  pid = $$                      # this child inherits the top process's group
  pid2 = fork do
    pid2 = $$
    Process.setpgrp             # pid2 becomes the leader of a new group, which pid3 will inherit
    pid3 = fork do
      pid3 = $$
      sleep 0.1                 # give the parent time to move us into another group
      puts "#{top} belongs to process group: #{Process.getpgid(top)}"
      puts "#{pid} belongs to process group: #{Process.getpgid(pid)}"
      puts "#{pid2} belongs to process group: #{Process.getpgid(pid2)}"
      puts "#{pid3} belongs to process group: #{Process.getpgid(pid3)}"
    end

    # We change the process group ID of pid3 after the fact!
    Process.setpgid(pid3, top)

    Process.waitpid(pid3)
  end
  Process.waitpid(pid2)
end
Process.waitpid(pid)

As you can see, you can change the process group membership of any process at any time, provided you have the permissions to do so.

The kill() system call provides a neat little feature: it allows you to send a signal to an entire process group! You can already guess where this is going.
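
In Ruby this feature is exposed through Process.kill: passing a negated process group ID sends the signal to every member of that group. A minimal sketch:

child = fork { sleep }               # a child that just sleeps
Process.setpgid(child, child)        # put the child in its own process group
Process.kill("TERM", -child)         # negative PID: signal the entire group at once
Process.waitpid(child)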

Whenever the Watchdog spawns an agent process, it creates a new process group for it. The HelperAgent and the LoggingAgent are both process group leaders of their own process group. Since the HelperAgent is responsible for spawning application processes, all application processes automatically belong to the HelperAgent’s process group, unless the application processes themselves change this.

Upon shutdown, the Watchdog sends SIGKILL to the HelperAgent’s and the LoggingAgent’s process groups. This ensures that everything is killed, including all application processes and even whatever subprocesses they spawn.
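
For illustration only (Passenger’s Watchdog itself is written in C++, and helper.rb below is a hypothetical stand-in for an agent that forks further worker processes), the pattern roughly corresponds to this Ruby sketch:

# Spawn the agent as the leader of a brand new process group.
agent_pid = Process.spawn("ruby", "helper.rb", pgroup: true)

# ...the agent forks workers; they inherit its process group...

# On shutdown, a single SIGKILL to the group takes down the agent,
# its workers and any subprocesses they spawned.
Process.kill("KILL", -agent_pid)
Process.waitpid(agent_pid)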

Conclusion

In this article we’ve considered several reasons why processes may freeze and how to debug them. We’ve seen which mechanisms Phusion Passenger provides to combat frozen processes. We’ve introduced the reader to Unix process groups and how they interact with Unix signal handling.

A lot of engineering has gone into Phusion Passenger to ensure that everything works correctly even in the face of failing software components. As you can see, Phusion Passenger uses proven and rock-solid Unix technologies and concepts to implement its advanced features.

Support us by buying Phusion Passenger Enterprise

It has been a little over a month now since we’ve released Phusion Passenger Enterprise, which comes with a wide array of additional features not found in the open source Phusion Passenger. However as we’ve stated before, we remain committed to maintaining and developing the open source version. This is made possible by the support of our customers; your support. If you like Phusion Passenger or if you want to enjoy the premium Enterprise features, please consider buying one or more Phusion Passenger Enterprise licenses. Development of the open source Phusion Passenger is directly funded by Phusion Passenger Enterprise sales.

Let me know your thoughts about this article in the comments section and thank you for your support!

