Dealing with explicit dynamic memory is a very hard problem. Tracking allocated buffers in large applications can be overwhelming if not done carefully. In this post, we will review the problems behind dynamic memory management and how we can cope with its complexity.
Overview of memory management paradigms
Memory can be used in a variety of ways.
First of all, let’s review the different paradigms that cover memory management at user level.
Static memory comprises all program data with static storage duration. This includes all variables declared with the static specifier and global variables, both initialized and uninitialized. Static memory is a very powerful paradigm for memory management when the maximum amount of required memory can be determined at compile time. One very interesting feature of static memory is that its size is known after compilation, simply by examining the executable file.
Dynamic memory is the memory that is requested by an application during runtime, causing it to expand its memory image. Dynamic memory comes in a variety of flavors.
The stack is a Last In First Out (LIFO) memory management paradigm and the basis of any computer program. Its main use is storing local variables and the contextual information of function calls.
Fixed-size block memory
Fixed-size block memory allocation is a memory management paradigm where the size of the chunks to be allocated is fixed. There are many places where this paradigm is applied, like hard disk drives or the MMU (handling fixed size physical memory pages).
Variable-size block memory
With variable-size block memory allocation, the user can request blocks of memory of different sizes. This problem is mathematically characterized as a bin packing problem, a combinatorial NP-hard problem, which makes implementing an efficient allocator extremely hard. As a consequence, this kind of memory is non-deterministic: because of fragmentation, it cannot be guaranteed that an allocation request will be satisfied even when enough free memory exists.
There are many implementations of dynamic storage allocators based on this paradigm, such as TLSF, half-fit, dlmalloc and many others. They stand as general-purpose memory allocators, useful in the majority of situations.
The above paradigms cover the general purpose allocation paradigms. However, other custom paradigms with special properties can be used in domain-specific problems. For instance, region-based memory management is a very interesting paradigm when dealing with groups of objects with the same lifetime.
Characterizing use of dynamic memory
The requirements of dynamic memory in any application can be characterized according to its intended use and properties. Characterizing the use of memory for a given application is the basis for deciding which is the memory allocation strategy that best suits our application needs. The right choice can dramatically reduce many of the problems that appear when dealing with dynamic memory.
Memory required for the allocation of fixed-size objects
Each application has to deal with instances of objects or elements with fixed size.
The obvious approach in this case is to have a fixed-size block memory allocator for each type of object. This has two immediate benefits:
- Faster memory allocation. The implementation of a memory pool is trivial, compared with that of a general purpose allocator.
- Lower synchronization burden. When having specialized pools for each kind of object, the synchronization required in a concurrent system is reduced dramatically, as there are fewer accesses to the common memory allocator, thus less contention in mutually exclusive code.
In all the projects in which I’ve been involved, fixed-size block memory allocation has been by far the most common required paradigm. That means that, by identifying fixed-size elements, we can shake off most of the complexity related to memory management.
In some specific cases, the management of such blocks can be simplified even more if the memory blocks can be indexed. When objects are indexable (let’s say, by a number that can be used as an array index), a statically allocated array is the best solution.
Memory required for temporary data storage
The application may need to store intermediate calculation results. Depending on the complexity of the task, the stack or a region-based memory allocator may be the best choice. If the task is simple, for example generating a query string for a database, the best choice is probably the stack. For a more complex task, like running a multi-step algorithm, a region allocator allows allocating several memory buffers as required during the process and freeing them all in a single step at the end.
Memory required for data buffers
Data buffers have a bad habit of being variable in size. In some cases, the size can be bounded. However, most of the time the size is not predictable. Here, the use of a memory allocation paradigm that supports variable size blocks is required. Further characterization of the properties of the data buffers can allow us to choose the best allocator for our purposes.
In general, malloc is the solution. However, identifying the specific situation can help us select a simpler one. Sometimes, tracking the allocated buffers is hard; in those cases, the benefits of a custom free-less allocation paradigm can outweigh the cost of tracking the data buffers.
Memory allocation rules of thumb
Rule of proximity
One rule I follow that is essential for easier tracking of dynamic memory buffers is the proximity rule. The code allocating memory for a certain purpose must be kept as close as possible to the deallocation code. When we leave the deallocation to an external module, nasty dependencies appear, and keeping track of the allocated memory becomes a real pain.
Rule of minimum allocations
Some objects in the system are composed of aggregations of other objects, where all the objects in the resulting aggregate have the same lifetime. In this situation, the aggregate should be allocated in a single step when possible. A single allocation is simpler to trace, causes fewer accesses to the common memory allocator (reducing the synchronization burden), and eliminates the possibility of intertwined allocations, thus avoiding external fragmentation.
Rule of stack
Whenever possible, use the stack. It is the most powerful memory allocation paradigm there is: it requires no synchronization, as the memory is local to the running thread, and the allocation cost is that of a single addition. Most of the temporary data storage of an application can be placed on the stack. Needless to say, care must be taken not to introduce potential buffer overflows. Also, the typically small maximum stack size makes it a limited resource.
Rule of side-effects
Having a function allocate memory as a side effect is a recipe for memory leaks. Therefore, avoid returning allocated memory when the primary goal of a function is not instantiating some object. One example of such a function is the POSIX library function strdup: it allocates memory and then fills it with a copy of the string passed as an argument. As part of the POSIX standard, it is well understood that strdup allocates memory. However, memory allocation as a side effect may not be so obvious in user functions.
A safer alternative is to have the caller allocate the memory and use the callee only to fill in the data. For example, strcpy (actually strncpy) would be a safer alternative to strdup.
Memory allocation is hard. However, a better understanding of the problem allows us to keep it under control.