How the Linux OOM killer works
Most admins have probably experienced failures due to applications leaking memory, or worse yet consuming all of the virtual memory (physical memory + swap) on a host. The Linux kernel has an interesting way of dealing with memory exhaustion, and it comes in the form of the Linux OOM (out-of-memory) killer. When invoked, the OOM killer begins terminating processes in order to free up enough memory to keep the system operational. I was curious how the OOM killer worked, so I decided to spend some time reading through the linux/mm/oom_kill.c Linux kernel source code file to see what it does.
The OOM killer uses a point system to pick which processes to terminate: the more points a process has, the more likely it is to be killed. The points are assigned by the badness() function, which contains the following block comment:
/**
 * badness - calculate a numeric value for how bad this task has been
 * @p: task struct of which task we should calculate
 * @uptime: current uptime in seconds
 *
 * The formula used is relatively simple and documented inline in the
 * function. The main rationale is that we want to select a good task
 * to kill when we run out of memory.
 *
 * Good in this context means that:
 * 1) we lose the minimum amount of work done
 * 2) we recover a large amount of memory
 * 3) we don't kill anything innocent of eating tons of memory
 * 4) we want to kill the minimum amount of processes (one)
 * 5) we try to kill the process the user expects us to kill, this
 *    algorithm has been meticulously tuned to meet the principle
 *    of least surprise ... (be careful when you change it)
 */
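The actual code in this function starts from the amount of memory the process has mapped and then adjusts the score as follows:
- Processes that have the PF_SWAPOFF flag set will be killed first
- Processes which fork a lot of child processes score higher, because part of each child's memory is added to the parent's score
- Niced processes score higher, since they are typically less important
- Superuser processes are usually more important, so their score is reduced to make killing them less likely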
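The steps above can be sketched in user space roughly as follows. This is not the kernel's code: struct task_info, sketch_badness() and pick_victim() are hypothetical names, and the specific factors (counting half of the children's memory, doubling for niced tasks, dividing by four for superuser tasks, dividing by the fourth root of the run time) are assumptions based on my reading of 2.6-era kernels, so check them against oom_kill.c for the kernel you actually run.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the task_struct fields the
 * scoring looks at. None of these names come from the kernel. */
struct task_info {
    const char *name;
    unsigned long total_vm;      /* pages mapped by the task itself */
    unsigned long child_vm;      /* pages mapped by its children */
    unsigned long run_time_secs; /* how long the task has been running */
    int nice;                    /* nice value, > 0 means lower priority */
    bool is_superuser;
    bool swapoff;                /* stands in for the PF_SWAPOFF flag */
};

/* Integer square root, written to avoid overflow for large inputs. */
unsigned long isqrt_ul(unsigned long x)
{
    unsigned long r = 0;

    while (r + 1 <= x / (r + 1))
        r++;
    return r;
}

/* Higher return value means "better candidate to kill". */
unsigned long sketch_badness(const struct task_info *t)
{
    unsigned long points, s;

    /* Tasks flagged as running swapoff go first. */
    if (t->swapoff)
        return ULONG_MAX;

    /* Start from the memory footprint. */
    points = t->total_vm;

    /* Assumed factor: half of the children's memory counts too. */
    points += t->child_vm / 2;

    /* Assumed factor: long-running tasks are penalized less. */
    s = isqrt_ul(isqrt_ul(t->run_time_secs));
    if (s)
        points /= s;

    /* Assumed factor: niced tasks are considered less important. */
    if (t->nice > 0)
        points *= 2;

    /* Assumed factor: superuser tasks are considered more important. */
    if (t->is_superuser)
        points /= 4;

    return points;
}

/* Return the index of the task with the highest score. */
size_t pick_victim(const struct task_info *tasks, size_t n)
{
    size_t i, best = 0;
    unsigned long best_score;

    assert(tasks != NULL && n > 0);
    best_score = sketch_badness(&tasks[0]);
    for (i = 1; i < n; i++) {
        unsigned long score = sketch_badness(&tasks[i]);

        if (score > best_score) {
            best_score = score;
            best = i;
        }
    }
    return best;
}
```

With this sketch, a large, short-lived, unprivileged process outscores a small, long-running root daemon by several orders of magnitude, which matches the goal of recovering a lot of memory while losing little work.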
The code also takes into account the length of time the process has been running: long-running processes score lower, which may or may not be a good thing. It’s interesting to see how technologies we take for granted actually work, and this experience really helped me understand what all the fields in the task_struct structure are used for. Now to dig into mm_struct. :)
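If you would rather look at the score the kernel has computed than reason about it from the source, Linux exposes it per process in /proc/<pid>/oom_score. The helper below is a minimal sketch that reads that file; read_oom_score() is a hypothetical name, and returning -1 on any failure is my own convention.

```c
#include <stdio.h>

/* Read the kernel's current OOM score for a process from
 * /proc/<pid>/oom_score. Returns -1 if the file cannot be opened or
 * parsed, for example when the process does not exist or /proc is
 * not mounted. */
long read_oom_score(long pid)
{
    char path[64];
    long score = -1;
    FILE *fp;

    snprintf(path, sizeof(path), "/proc/%ld/oom_score", pid);
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    if (fscanf(fp, "%ld", &score) != 1)
        score = -1;

    fclose(fp);
    return score;
}
```

Calling read_oom_score((long)getpid()) from a program that includes unistd.h gives the score of the calling process. The range and meaning of the number depend on the kernel version, so treat it as a relative ranking rather than an absolute value.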

Robert Milkowski on October 1st, 2009
The OOM killer exists in Linux mostly to work around problems caused by memory overcommitting. Linux is slowly moving in the direction of getting rid of the overcommit approach (by tweaking /proc you have been able to disable it to some extent for some time now). The truth is that memory overcommitting + OOM killer is a bad thing: semi-randomly killing applications because the system allowed them to allocate more virtual memory than it has is just plain stupid in most environments. But as I said, Linux is slowly catching up and getting rid of that unpleasant feature.
btw: in most other Unixes, like Solaris, an OOM killer is not needed, as the system generally won’t allow memory overcommitment.
See also – http://developers.sun.com/solaris/articles/subprocess/subprocess.html#overcom