http://duartes.org/gustavo/blog/post/anatomy-of-a-program-in-memory/

Many But Finite

Tech and science for curious people.

Anatomy of a Program in Memory

Jan 27th, 2009

Memory management is the heart of operating systems; it is crucial
for both programming and system administration. In the next few posts
I'll cover memory with an eye towards practical aspects, but without
shying away from internals. While the concepts are generic, examples
are mostly from Linux and Windows on 32-bit x86. This first post
describes how programs are laid out in memory.

Each process in a multi-tasking OS runs in its own memory sandbox.
This sandbox is the virtual address space, which in 32-bit mode is
always a 4GB block of memory addresses. These virtual addresses are
mapped to physical memory by page tables, which are maintained by the
operating system kernel and consulted by the processor. Each process
has its own set of page tables, but there is a catch. Once virtual
addresses are enabled, they apply to all software running in the
machine, including the kernel itself. Thus a portion of the virtual
address space must be reserved for the kernel:

                      Kernel/User Memory Split
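
To make the split concrete, here is a minimal Python sketch of the
arithmetic. It assumes the common 32-bit defaults: Linux gives the top
1GB to the kernel (user space ends below 0xC0000000) and Windows gives
it the top 2GB (user space ends below 0x80000000). Both systems can be
configured differently, and the names SPLITS and describe are made up
for illustration.

```python
# Sketch of the 32-bit kernel/user split. The boundaries below are the
# common defaults and are assumptions here; both systems can be
# configured differently.
ADDRESS_SPACE = 2 ** 32          # 4GB of virtual addresses in 32-bit mode
GB = 2 ** 30

# First kernel address for each OS (hypothetical table for illustration).
SPLITS = {
    "linux": 0xC0000000,         # 3GB user / 1GB kernel
    "windows": 0x80000000,       # 2GB user / 2GB kernel
}


def describe(os_name, address):
    """Say whether a virtual address falls in user or kernel space."""
    if not 0 <= address < ADDRESS_SPACE:
        raise ValueError("not a 32-bit address")
    return "kernel" if address >= SPLITS[os_name] else "user"


if __name__ == "__main__":
    for name, boundary in SPLITS.items():
        user_gb = boundary // GB
        kernel_gb = (ADDRESS_SPACE - boundary) // GB
        print(f"{name}: {user_gb}GB user, {kernel_gb}GB kernel")
    # The same address lands on different sides of the two boundaries.
    print(describe("linux", 0xBFFFFFFF))
    print(describe("windows", 0xBFFFFFFF))
```

Note that the same address, 0xBFFFFFFF, is the last user-mode byte under
the assumed Linux split but sits well inside kernel space under the
assumed Windows one.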

This does not mean the kernel uses that much physical memory, only
that it has that portion of address space available to map whatever
physical memory it wishes. Kernel space is flagged in the page tables
as exclusive to privileged code (ring 2 or lower), hence a page fault
is triggered if user-mode programs try to touch it. In Linux, kernel
space is constantly present and maps the same physical memory in all
processes. Kernel code and data are always addressable, ready to
handle interrupts or system calls at any time. By contrast, the
mapping for the user-mode portion of the address space changes
whenever a process switch happens:

              Process Switch Effects on Virtual Memory
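
The behavior described so far can be imitated with a toy model. This is
only a sketch: real page tables are multi-level structures walked by the
processor, not Python dictionaries. It assumes 4KB pages and a 3GB/1GB
split, and the names Process, PageFault, and kernel_pages are
hypothetical. It shows the two points that matter: user mappings are
private to each process, while the kernel mappings are shared and
off-limits to user-mode code.

```python
# Toy model only: dictionaries stand in for page tables.
KERNEL_START = 0xC0000   # first kernel page number (0xC0000000 >> 12)


class PageFault(Exception):
    """Raised when a translation is missing or not permitted."""


# One set of kernel mappings, shared by every process.
kernel_pages = {KERNEL_START: 0x00100}   # virtual page -> physical frame


class Process:
    def __init__(self, name, user_pages):
        self.name = name
        self.user_pages = user_pages     # private to this process

    def translate(self, page, user_mode=True):
        """Return the physical frame for a virtual page, or fault."""
        if page >= KERNEL_START:
            if user_mode:
                raise PageFault(
                    f"{self.name}: user access to kernel page {page:#x}")
            table = kernel_pages
        else:
            table = self.user_pages
        if page not in table:
            raise PageFault(f"{self.name}: page {page:#x} is unmapped")
        return table[page]


if __name__ == "__main__":
    firefox = Process("firefox", {0x08048: 0x2000})
    shell = Process("shell", {0x08048: 0x3000})
    # Same virtual page, different physical frames.
    print(hex(firefox.translate(0x08048)), hex(shell.translate(0x08048)))
    # Same kernel page, same physical frame in both processes.
    print(hex(firefox.translate(KERNEL_START, user_mode=False)),
          hex(shell.translate(KERNEL_START, user_mode=False)))
```

Switching from one Process object to the other changes what every
user-space address means, while kernel-space lookups give the same
answer in both.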

Blue regions represent virtual addresses that are mapped to physical
memory, whereas white regions are unmapped. In the example above,
Firefox has used far more of its virtual address space due to its
legendary memory hunger. The distinct bands in the address space
correspond to memory segments like the heap, stack, and so on. Keep
in mind these segments are simply ranges of memory addresses and
have nothing to do with Intel-style segments. Anyway, here is the
standard segment layout in a Linux process:

           Flexible Process Address Space Layout In Linux
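
On a Linux machine you can look at these segments in a live process by
reading /proc/<pid>/maps, where each line describes one mapped range.
Below is a minimal sketch that parses that file for the current process.
The helper names parse_maps_line and read_maps are made up for
illustration, and the script prints nothing on systems without /proc.

```python
import os


def parse_maps_line(line):
    """Split one line of /proc/<pid>/maps into the fields used here.

    Line format: start-end perms offset dev inode [pathname]
    """
    fields = line.split(None, 5)          # pathname may contain spaces
    start, end = (int(x, 16) for x in fields[0].split("-"))
    return {
        "start": start,
        "end": end,
        "perms": fields[1],
        # Anonymous mappings have no pathname field at all.
        "name": fields[5].strip() if len(fields) > 5 else "",
    }


def read_maps(path="/proc/self/maps"):
    """Parse every mapping of the current process."""
    with open(path) as f:
        return [parse_maps_line(line) for line in f if line.strip()]


if __name__ == "__main__":
    if os.path.exists("/proc/self/maps"):   # Linux only
        for region in read_maps():
            size_kb = (region["end"] - region["start"]) // 1024
            print(f"{region['start']:#x}-{region['end']:#x} "
                  f"{region['perms']} {size_kb:>8}KB {region['name']}")
```

Regions labeled [heap] and [stack], along with the mapped libraries in
between, correspond to the bands in the diagram above.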

When computing was happy and safe and cuddly, the starting virtual
addresses for the segments shown above were exactly the same for
nearly every process in a machine. This made it easy to exploit
security vulnerabilities remotely. An exploit often needs to
reference absolute memory locations: an address on the stack, the
address for a library function, etc. Remote attackers must choose
this location blindly, counting on the fact that address spaces are
all the same. When they are, people get pwned. Thus address space
randomization has become popular. Linux randomizes the stack, memory
mapping segment, and heap by adding offsets to their starting
addresses. Unfortunately the 32-bit address space is pretty tight,
leaving little room for randomization and hampering its effectiveness