Memory Management

   In [1]programming, memory management is (unsurprisingly) the act and the
   various techniques of managing the working [2]memory ([3]RAM) of a
   computer, for example dividing the total physically available memory
   among multiple memory users such as operating system processes and
   ensuring they don't illegally access each other's part of memory. The
   scope of the term may differ depending on context, but tasks falling under
   memory management include e.g. memory [4]allocation (finding and
   assigning blocks of free memory) and deallocation (freeing such blocks),
   ensuring [5]memory safety, organizing blocks of memory and [6]optimizing
   memory access (e.g. with [7]caches or data reorganization), [8]memory
   virtualization and related tasks such as address translation, handling
   out-of-memory [9]exceptions etc.

   Memory management can be handled at different levels: hardware units such
   as the [10]MMU and CPU [11]caches exist to perform certain time-critical
   memory-related tasks (such as address translation) quickly, the
   [12]operating system may help with memory management (e.g. implement
   virtual memory and offer [13]syscalls for dynamic allocation and
   deallocation of memory), a [14]programming language may do some automatic
   memory management (e.g. [15]garbage collection or handling the call stack)
   and the programmer himself may do his own memory management (e.g. deciding
   between static and dynamic allocation or choosing the size of a
   dynamically allocated chunk).
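
   To make the levels a bit more concrete, below is a minimal sketch
   (assuming a POSIX system such as GNU/Linux where the mmap call is
   available) contrasting two of them: the C library's malloc (language and
   library level) and the mmap syscall (operating system level) -- malloc
   itself typically obtains its memory from the OS through calls similar to
   this one; the sizes here are made up just for illustration.

 #include <stdlib.h>   // malloc/free: library level allocation
 #include <sys/mman.h> // mmap/munmap: OS level allocation (POSIX only!)

 int main(void)
 {
   char *a = malloc(4096); // ask the C library for 4096 bytes

   char *b = mmap(NULL, 4096,             // ask the kernel directly for
     PROT_READ | PROT_WRITE,              // one readable/writable chunk
     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); // not backed by any file

   if (a) { a[0] = 1; free(a); }                       // use it, then free
   if (b != MAP_FAILED) { b[0] = 1; munmap(b, 4096); } // use it, then unmap

   return 0;
 }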

   Why all this fuss? As a newbie programmer who only works with simple
   variables and high level languages like [16]Python that do everything for
   you, you don't need to do much memory management yourself, but when
   working with data whose size may wildly differ and is not known in advance
   (e.g. files), someone has to handle e.g. the possibility of the data on
   disk not fitting into the RAM currently allocated for your program, or --
   if the data fits -- there may not be a big enough contiguous chunk of
   memory for it. If we don't know how much memory a process will need, how
   much memory do we give it (too little and it may not be enough, too much
   and there will not be enough memory for others)? Someone has to prevent
   [17]memory leaks so that your computer doesn't run out of memory due to
   [18]bugs in programs. With many [19]processes running [20]simultaneously
   on a computer someone has to keep track of which process uses which part
   of memory and ensure [21]collisions (one process overwriting another
   process's memory) don't happen, and someone needs to make sure that if
   bad things happen (such as a process trying to write to memory that
   doesn't belong to it), they don't have catastrophic consequences like
   [22]crashing or exploding the system.

Memory Management In C

   In [23]C -- a [24]low level language -- you need to do a lot of manual
   memory management and there is a big danger of fucking up, especially with
   dynamic allocation -- C won't hold your hand (but as a reward your program
   will be fast and efficient), there is no uber memory safety. There is no
   automatic [25]garbage collection, i.e. if you allocate memory dynamically,
   YOU need to keep track of it and manually free it once you're done using
   it, or you'll end up with a [26]memory leak.

   For a start let's see which kinds of allocation (and their associated parts
   of memory) there are in C:

      * static allocation (code/data memory): The simplest kind of allocation,
       happening at compile time: if the compiler can do so (i.e. if it knows
       enough things such as the size of the data in advance), it allocates
       space of concrete size at some specific address in the part of memory
       reserved for code or static data (code and data may be in the same or
       separate parts depending on platform, see e.g. [27]Harvard
        architecture) -- this is straightforward, simple and automatic, posing
        no real dangers and bringing in no bloat or burden of dependencies.
        This kind of
       allocation applies to:
          * global variables (variables declared outside any function, i.e.
            even outside main)
          * static variables (variables inside functions declared with static
            keyword)
          * constants/literals (e.g. strings in the source code such as
            "abc")
     * automatic allocation (stack memory): For local variables (variables
       inside functions) the memory is allocated in a special part of memory
        known as the call [28]stack only at the time when the function is
        actually called and executed; i.e. this is similar to dynamic
        allocation (it
       happens at run time) but happens automatically, without needing any
       libraries or other explicit actions from the programmer. I.e. when a
       function is called at run time, a new call frame is created on stack
       which includes space for local variables of that function (along with
       e.g. return address from the function etc.). This is necessary e.g. to
       allow [29]recursion (during which several instances of the same
       function may be active, each of which may have different values of its
       variables), and it also helps consume less RAM. This allows for
        creating variable sized arrays inside functions (e.g. int array[x];
        where x is a variable), which is not possible with a global array
       (however variable size arrays aren't supported in old ANSI C!). The
       disadvantage over dynamic allocation is that stack memory is
       relatively small and overusing it may easily cause stack [30]overflow
       (running out of memory). Still this kind of allocation is better than
       dynamic allocation as it doesn't need any libraries, it doesn't
       generate complex code and the only danger is that of stack overflow --
       memory leaks can't happen (deallocation happens automatically when
        the function is exited). Automatic allocation applies to:
          * local variables (including function arguments and local variable
            size arrays)
     * dynamic allocation (heap memory): A kind of more complex manual
       allocation that happens at run time and is initiated by the programmer
       calling special functions such as malloc from the stdlib standard
       library, which return [31]pointers to the allocated memory. This
       memory is taken from a special part of memory known as [32]heap. This
        allows you to allocate, resize and deallocate potentially very big
        parts of memory (a small realloc sketch follows after this list), but
        requires caution as working with pointers is involved
       and there is a danger of memory leaks -- it is the responsibility of
       the programmer to free allocated memory with the free function once it
       is no longer needed, otherwise that memory will simply remain
       allocated and unusable by others (if this happens for example in a
       loop, the program may just start eating up more and more RAM and
       eventually run out of memory). Dynamic allocation is also pretty
        complex (it usually involves communicating with the operating system
        and also keeping track of the structure of memory) and creates a
       [33]dependency on the stdlib library. Some implementations of the
       allocation functions are also infamously slow (up to the point of some
       programmers resorting to program their own dynamic allocation
       systems). Therefore only use dynamic allocation when absolutely
       necessary! Dynamic allocation applies to:
          * memory allocated with special functions (malloc, calloc, realloc)
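
   The example at the end of this article shows malloc and free in action;
   resizing with realloc deserves a small sketch of its own (the growing
   array of numbers read from stdin is just a made up example) -- note that
   on failure realloc returns NULL and leaves the old block untouched, so
   you must not overwrite your only pointer to it:

 #include <stdio.h>
 #include <stdlib.h>

 int main(void)
 {
   int *numbers = 0;                // realloc(NULL, ...) acts like malloc
   int capacity = 0, count = 0, value;

   while (scanf("%d", &value) == 1) // read numbers until input ends
   {
     if (count >= capacity)         // out of space? grow the block
     {
       capacity = capacity ? 2 * capacity : 8;

       int *newNumbers = realloc(numbers, capacity * sizeof(int));

       if (!newNumbers)             // realloc failed, old block still valid
       {
         free(numbers);
         return 1;
       }

       numbers = newNumbers;
     }

     numbers[count] = value;
     count++;
   }

   printf("read %d numbers\n", count);
   free(numbers);                   // forget this and you have a memory leak
   return 0;
 }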

   Rule of thumb: use the simplest thing possible, i.e. static allocation
   if you can, if not then automatic and only as the last option resort to
   dynamic allocation. The good news is that you mostly won't need dynamic
   allocation -- you basically only need it when working with data whose size
   can potentially be VERY big and is unknown at compile time (e.g. you need
   to load a WHOLE file AT ONCE which may potentially be VERY big). In other
   cases you can get away with static allocation (just reserving some
   reasonable amount of memory in advance and hoping the data fits, e.g. a
   global array such as int myData[DATA_MAX_SIZE]) or automatic allocation if
   the data is reasonably small (i.e. you just create a variable sized array
   inside some function that processes the data). If you end up doing dynamic
   allocation, be careful, but it's not THAT hard to do it right (just pay
   more attention) and there are tools (e.g. [34]valgrind) to help you find
   memory leaks. However by the principles of [35]good design you should
   avoid dynamic allocation if you can, not only because of the potential for
   errors and worse performance, but most importantly to avoid dependencies
   and complexity.
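
   For illustration, here is a tiny sketch of that static preallocation
   approach: loading a whole file into a fixed global buffer and simply
   checking whether it fits (the file name and buffer size are made up):

 #include <stdio.h>

 #define DATA_MAX_SIZE 65536 // reserve a "reasonable" amount of memory

 unsigned char myData[DATA_MAX_SIZE]; // statically allocated buffer

 int main(void)
 {
   FILE *f = fopen("data.bin", "rb"); // hypothetical input file

   if (!f)
     return 1;

   size_t dataSize = fread(myData, 1, DATA_MAX_SIZE, f);

   if (dataSize == DATA_MAX_SIZE && fgetc(f) != EOF) // more data left over?
   {
     fclose(f);
     puts("Data too big, increase DATA_MAX_SIZE and recompile.");
     return 1;
   }

   fclose(f);

   printf("loaded %lu bytes\n",(unsigned long) dataSize);

   // ... process myData here ...

   return 0;
 }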

   For [36]pros: you can also create your own kind of pseudo dynamic
   allocation in pure C if you really want to avoid using stdlib or can't use
   it for some reason. The idea is to allocate a big chunk of memory
   statically (e.g. global unsigned char myHeap[MY_HEAP_SIZE];) and then
   create functions for allocating and freeing blocks of this static memory
   (e.g. myAlloc and myFree with the same signatures as malloc and free; a
   small sketch of this follows below). This
   allows you to use memory more efficiently than if you just dumbly (is it a
   word?) preallocate everything statically, i.e. you may need less total
   memory; this may be useful e.g. on [37]embedded. Yet another uber [38]hack
   to "improve" this may be to allocate the "personal heap" on the stack
   instead of statically, i.e. you create something like a global pointer
   unsigned char *myHeapPointer; and a global variable unsigned int
   myHeapSize;, then somewhere at the beginning of main you compute the size
   myHeapSize and then create a local array myHeap[myHeapSize], then finally
   set the global pointer to it as myHeapPointer = myHeap; the rest remains
   the same (your allocation function will access the heap via the global
   pointer). Just watch out for reinventing wheels, for bugs and for
   actually ending up with a worse mess than if you took a simpler
   approach. Hell, you might even try to write your own garbage collection
   and array bound checking and whatnot, but then why not just say fuck it
   and use an already existing abomination like [39]Java? :)
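
   To illustrate the idea, here is a very small sketch of such "pseudo
   dynamic allocation": a naive first-fit allocator over a static array (the
   names myHeap, myAlloc and myFree are the hypothetical ones from above; the
   scheme is simplified -- it never splits or merges blocks, so it fragments
   easily, and it assumes the compiler aligns the global array suitably for
   size_t, which typical compilers do but the standard doesn't guarantee):

 #include <stddef.h>

 #define MY_HEAP_SIZE 4096

 unsigned char myHeap[MY_HEAP_SIZE]; // the static "heap", zero initialized

 // each block inside myHeap looks like: [size_t size][size_t used][...data...]

 void *myAlloc(size_t size)
 {
   if (size == 0)
     size = 1;

   // round up to a multiple of sizeof(size_t) to keep headers aligned
   size = ((size + sizeof(size_t) - 1) / sizeof(size_t)) * sizeof(size_t);

   size_t pos = 0;

   while (pos + 2 * sizeof(size_t) + size <= MY_HEAP_SIZE)
   {
     size_t *header = (size_t *) (myHeap + pos);

     if (header[0] == 0)          // untouched end of the heap: make a block
     {
       header[0] = size;
       header[1] = 1;
       return myHeap + pos + 2 * sizeof(size_t);
     }

     if (!header[1] && header[0] >= size) // free block big enough: reuse it
     {
       header[1] = 1;
       return myHeap + pos + 2 * sizeof(size_t);
     }

     pos += 2 * sizeof(size_t) + header[0]; // move to the next block
   }

   return 0;                      // no room left in the static heap
 }

 void myFree(void *ptr)
 {
   if (ptr)
     ((size_t *) ptr)[-1] = 0;    // just clear the block's "used" flag
 }

   Usage then looks the same as with malloc and free (e.g. int *x =
   myAlloc(100 * sizeof(int)); ... myFree(x);), only without stdlib; a real
   implementation would at least split and merge blocks to fight
   fragmentation.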

   Finally let's see a fuller code example putting it all together:

 #include <stdio.h>
 #include <stdlib.h> // needed for dynamic allocation :(

 #define MY_DATA_MAX_SIZE 1024 // if you'll ever need more, just change this and recompile

 unsigned char staticMemory[MY_DATA_MAX_SIZE]; // statically allocated array :)
 int simpleNumber; // this is also allocated statically :)

 void myFunction(int x)
 {
   static int staticNumber;  // this is allocated statically, NOT on stack
   int localNumber;          // this is allocated on stack
   int localArray[x + 1];    // variable size array, allocated on stack, hope x isn't too big

   localNumber = 2 * x;      // do something with the memory
   localArray[x] = localNumber;

   if (x > 0)                // recursively call the function
     myFunction(x - 1);
 }

 int main(void)
 {
   int localNumberInMain = 123; // this is also allocated on stack   

   myFunction(10);  // change to 10000000 to see a probable stack overflow

   for (int i = 0; i < 200000; ++i)
   {
     if (i % 1000 == 0)
       printf("i = %d\n",i);

     unsigned char *dynamicMemory = (unsigned char *) malloc((i + 1) * 10000); // oh no, dynamic allocation, BLOAAAT!

     if (!dynamicMemory)
     {
       printf("Couldn't allocate memory, there's probably not enough of it :/");
       return 1;
     }

     dynamicMemory[i * 128] = 123; // do something with the memory

     free(dynamicMemory); // if not done, memory leak occurs! try to remove this and see :)        
   }

   return 0;
 }
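
   You can compile this e.g. with gcc; if you remove the free call to create
   the leak, you can also observe it with a tool such as [34]valgrind (e.g.
   running valgrind --leak-check=full ./program reports memory that was
   allocated but never freed).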

Links:
1. programming.md
2. memory.md
3. ram.md
4. allocation.md
5. memory_safety.md
6. optimization.md
7. cache.md
8. virtual_memory.md
9. exception.md
10. mmu.md
11. cache.md
12. os.md
13. syscall.md
14. programming_language.md
15. garbage_collection.md
16. python.md
17. memory_leak.md
18. bug.md
19. process.md
20. multitasking.md
21. collision.md
22. crash.md
23. c.md
24. low_level.md
25. garbage_collection.md
26. memory_leak.md
27. harvard.md
28. stack.md
29. recursion.md
30. overflow.md
31. pointer.md
32. heap.md
33. dependency.md
34. valgrind.md
35. lrs.md
36. pro.md
37. embedded.md
38. hacking.md
39. java.md