eheap Part 2: Enhanced debugging

Embedded systems are typically designed by small teams of engineers who have too much to do and need every bit of help they can get. Heap usage problems can be difficult to find. The main ones are:

• Memory leaks due to blocks not being released when they should be.
• Overflows of buffers and stacks that damage the heap and neighboring chunks.

eheap is a new embedded system heap that provides special features to help debug these kinds of problems. eheap will run on any RTOS-based or stand-alone system. This is the second in a series of three posts on eheap:

• eheap Part 1: Configuration
• eheap Part 2: Enhanced debugging
• eheap Part 3: Self-healing

Heap physical structure
The previous configuration post focused on the logical structure of eheap, which comprises bins. This post focuses on the physical structure of eheap, which comprises chunks. Following heap initialization, eheap is structured as shown in Figure 1.

The start and end chunks are in-use chunks with no data blocks; they mark the start and end of the heap space, respectively. All free space is initially contained in the donor chunk (dc) and in the top chunk (tc). These are free chunks. As explained in the configuration post, the optional donor chunk is the source for small chunk allocations, and the top chunk is the source for all other chunk allocations. Initially chunks are calved from dc or tc, then freed to bins when no longer needed. Eventually chunk allocations come almost entirely from bins.

Chunks
Each chunk has a header, and its remainder is available for use as a data block. Chunks are multiples of 8 bytes in size and they are aligned on 8-byte boundaries. eheap supports three types of chunks: inuse, free, and debug. An inuse chunk is allocated by malloc(sz) and is shown in Figure 2.


The forward link (fl) is a pointer to the next chunk, and the backward link is a pointer to the previous chunk. Since chunks are 8-byte aligned, the lower 3 bits of links are always 0 and thus can be used for flags; the backward link and the flags together form the blf field. In this case, flags == INUSE (bit 0). The data block is used by application code and is at least as big as sz.
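
The flag trick can be sketched in C as follows. This is a minimal illustration, not eheap source code: the names EH_INUSE, EH_DEBUG, EH_FLAGS, link_pack(), link_addr(), and link_flags() are hypothetical, and only the bit positions come from this post.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag names; the post only states INUSE is bit 0 and DEBUG is bit 1. */
#define EH_INUSE ((uintptr_t)0x1)
#define EH_DEBUG ((uintptr_t)0x2)
#define EH_FLAGS ((uintptr_t)0x7)  /* low 3 bits are spare on 8-byte-aligned chunks */

/* Combine an 8-byte-aligned chunk address with flag bits in one word. */
static inline uintptr_t link_pack(const void *chunk, uintptr_t flags)
{
    uintptr_t addr = (uintptr_t)chunk;
    assert((addr & EH_FLAGS) == 0);    /* address must be 8-byte aligned */
    assert((flags & ~EH_FLAGS) == 0);  /* flags must fit in the low 3 bits */
    return addr | flags;
}

/* Recover the chunk address by masking off the flag bits. */
static inline void *link_addr(uintptr_t link)
{
    return (void *)(link & ~EH_FLAGS);
}

/* Recover the flag bits. */
static inline uintptr_t link_flags(uintptr_t link)
{
    return link & EH_FLAGS;
}
```

Because the flags share a word with the link, the link must be masked before it is used as a pointer.
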
A free chunk is freed by free(dp) and is shown in Figure 3.


fl and blf are the same as in the inuse chunk, except flags == 0. The chunk size is in bytes and includes the header + free space. The free forward and free backward links are used to link this chunk into the free list of the bin specified in the bin number field. The header requires 24 bytes. Hence, this is the minimum chunk size supported by eheap. The free space is available to be allocated. When a chunk is allocated by malloc(), the fields after blf are overwritten by the data block. Hence, the smallest possible chunk of 24 bytes can hold a data block of up to 16 bytes.
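
The size arithmetic above can be checked with a short sketch. It assumes a 32-bit target, so each link and field is modeled as a 32-bit word. The structure name, the field names and their order, and the rounding rule in chunk_size_for() are assumptions made for illustration and are not taken from the eheap source.

```c
#include <assert.h>
#include <stdint.h>

#define INUSE_HDR_SIZE  8u   /* fl + blf */
#define MIN_CHUNK_SIZE 24u   /* size of the free chunk header */

/* Assumed free chunk header layout on a 32-bit target (six 4-byte fields). */
typedef struct {
    uint32_t fl;     /* forward link */
    uint32_t blf;    /* backward link + flags */
    uint32_t sz;     /* chunk size in bytes */
    uint32_t ffl;    /* free forward link (bin free list) */
    uint32_t fbl;    /* free backward link (bin free list) */
    uint32_t binno;  /* bin number */
} free_chunk_hdr;

_Static_assert(sizeof(free_chunk_hdr) == MIN_CHUNK_SIZE,
               "free chunk header should be 24 bytes");

/* Smallest chunk that can hold a data block of sz bytes: add the inuse
   header, round up to a multiple of 8, and never go below the minimum. */
static inline uint32_t chunk_size_for(uint32_t sz)
{
    uint32_t csz = (sz + INUSE_HDR_SIZE + 7u) & ~7u;
    return csz < MIN_CHUNK_SIZE ? MIN_CHUNK_SIZE : csz;
}
```

Under these assumptions, any request of 16 bytes or fewer fits in the 24-byte minimum chunk, and a 17-byte request needs a 32-byte chunk.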

A debug chunk is allocated by malloc(sz) when the heap debug mode is ON and is shown in Figure 4.


fl and blf are the same as in an inuse chunk, except flags == DEBUG (bit 1) + INUSE (bit 0). Hence, a debug chunk is a special kind of inuse chunk. The chunk size is in bytes and includes the header + data block + fences. The time field is the time when the chunk was allocated, and the owner field is the task or LSR that allocated the chunk. A fence is a fixed word pattern, such as 0xAAAAAAA3. One fence is part of the debug chunk header. The other fences surround the data block, and as many as desired can be specified.

Heap debug mode
Free, inuse, and debug chunks may be freely mixed in eheap. The heap debug mode determines if a chunk is allocated as a debug chunk (debug ON) or as an inuse chunk (debug OFF) when it is allocated by malloc() or realloc(). The debug mode can be turned ON or OFF by application software, thus allowing debug chunks to be selectively created. This is important because they usually are much larger than inuse chunks.

For example, a programmer might want to surround a suspect data block with 10 fences above and 10 fences below. At 4 bytes per fence, that is 40 bytes on each side, and the debug chunk header is 24 bytes versus 8 bytes for the inuse chunk header. Thus the debug chunk would be 40 + 40 + 24 - 8 = 96 bytes larger than an inuse chunk for the same size data block. Adding this much overhead to all inuse chunks might exceed the memory available for the heap and thus be prohibitive. Also, performance takes a hit because fences must be filled with the fence pattern when chunks are allocated. This might result in the system running too slowly to reproduce the problem being debugged, or not running at all.
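
The same arithmetic can be written as a small C helper. The constant and function names are hypothetical, and the sizes (4-byte fences, an 8-byte inuse header, and a 24-byte debug header) are inferred from the calculation above rather than taken from the eheap source.

```c
#include <assert.h>
#include <stdint.h>

#define FENCE_SIZE      4u   /* one 32-bit fence word */
#define INUSE_HDR_SIZE  8u   /* fl + blf */
#define DEBUG_HDR_SIZE 24u   /* assumed: fl, blf, size, time, owner, fence */

/* Extra bytes a debug chunk needs, compared with an inuse chunk holding the
   same data block, when nfences fences are placed above and below the block. */
static inline uint32_t debug_overhead(uint32_t nfences)
{
    return 2u * nfences * FENCE_SIZE + DEBUG_HDR_SIZE - INUSE_HDR_SIZE;
}
```

With 10 fences on each side the helper gives 96 bytes, and with no fences around the data block it gives 16 bytes.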

In a typical case, the addition of a new task to an already debugged partial system might result in heap problems cropping up unexpectedly. Debug mode can be limited to the new task, taskA, as shown in Figure 5.


When taskA first starts, debug mode is turned ON and entry and exit routines are hooked into the scheduler for taskA. When taskA is suspended or preempted, the scheduler automatically turns debug mode OFF via the hooked taskA_exit() routine. When taskA is resumed, the scheduler automatically turns debug mode ON via the hooked taskA_enter() routine. As a consequence, debug mode is ON whenever taskA runs and only when taskA runs. Thus, all chunks allocated or reallocated by taskA will be debug chunks and all chunks allocated or reallocated by other tasks will be inuse chunks.
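
The hooking scheme can be modeled in a few lines of C. This is only a sketch: heap_set_debug(), the task structure, and sched_switch() are hypothetical stand-ins for the heap control call and the scheduler of whatever RTOS is in use. Only the taskA_enter() and taskA_exit() names come from this post.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the heap's debug mode switch. */
static bool heap_debug_on = false;

static inline void heap_set_debug(bool on) { heap_debug_on = on; }

typedef void (*task_hook)(void);

/* Minimal task descriptor with optional scheduler hooks. */
typedef struct {
    const char *name;
    task_hook   enter;  /* called when the task is started or resumed */
    task_hook   leave;  /* called when the task is suspended or preempted */
} task;

/* Hooks for taskA: debug mode follows the task on and off the processor. */
static inline void taskA_enter(void) { heap_set_debug(true); }
static inline void taskA_exit(void)  { heap_set_debug(false); }

/* Model of a context switch: run the outgoing task's exit hook, then the
   incoming task's entry hook. Tasks without hooks leave the mode unchanged. */
static inline void sched_switch(const task *from, const task *to)
{
    if (from != NULL && from->leave != NULL) from->leave();
    if (to != NULL && to->enter != NULL) to->enter();
}
```

In this model, a task with no hooks always runs with debug mode OFF, because the exit hook of taskA has already turned it OFF before the other task gets control.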

Note: If another task is structured like taskA, it also will allocate only debug chunks.

Using debug chunks
Data block overflows are a common problem in heaps. Stacks overflow up (toward location 0) and buffers overflow up or down. A debug chunk allows as many fences as necessary above and below its data block to contain the overflow. This is helpful because heap damage is prevented so the system continues running despite a data block overflow.

Furthermore, the overflow pattern can be observed by looking at the chunk in the memory window of a debugger. Seeing the overflow pattern helps to determine what caused the overflow. As discussed in the next post (self-healing), the eheap scan function stops at chunks with broken fences so that they can be examined.[1] This can be done by setting a breakpoint on the FENCE_BRKN error reported by the heap scan function. Being able to do this is helpful because block overflows are often sporadic and difficult to reproduce. Hence, it may be necessary to allow the system to run for long periods before the overflow occurs.
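
A fence check of the kind a scan function might perform can be sketched as follows. The function names are hypothetical, the fence pattern is the example value given earlier in this post, and the real eheap scan logic may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FENCE_PATTERN 0xAAAAAAA3u  /* example fence pattern from this post */

/* Write the fence pattern into n consecutive fence words. */
static inline void fence_fill(uint32_t *fence, size_t n)
{
    for (size_t i = 0; i < n; i++)
        fence[i] = FENCE_PATTERN;
}

/* Count the fence words that no longer hold the pattern.
   Zero means the fence is intact; nonzero means something overflowed into it. */
static inline size_t fence_damage(const uint32_t *fence, size_t n)
{
    size_t broken = 0;
    for (size_t i = 0; i < n; i++)
        if (fence[i] != FENCE_PATTERN)
            broken++;
    return broken;
}
```

A scan that finds a nonzero count would be the natural place to report an error such as FENCE_BRKN, and the count itself indicates how far the overflow reached.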

Memory leaks can be found by scanning the time and owner fields in debug chunks. In the first case, chunks older than a certain time might be suspect. In the second case, chunks owned by deleted or stopped tasks are suspect. For this type of sleuthing, fences around the data block would probably be eliminated, so that a debug chunk would be only 16 bytes larger than the equivalent inuse chunk rather than 96 bytes larger. Then the net can be cast wider to cover more suspected chunks.
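
Both leak tests can be sketched in C. The debug_info structure, the tick-based time field, the numeric owner ids, and all function names here are assumptions made for illustration. An actual scan would walk the chunks in the heap rather than an array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the two debug fields used for leak hunting. */
typedef struct {
    uint32_t time;   /* tick count when the chunk was allocated */
    uint32_t owner;  /* id of the task or LSR that allocated it */
} debug_info;

/* True if the chunk is older than max_age ticks. Unsigned subtraction keeps
   the result correct across a single wrap of the tick counter. */
static inline bool chunk_is_stale(const debug_info *d, uint32_t now, uint32_t max_age)
{
    return (uint32_t)(now - d->time) > max_age;
}

/* True if owner appears in the list of live task ids. */
static inline bool owner_is_live(uint32_t owner, const uint32_t *live, size_t nlive)
{
    for (size_t i = 0; i < nlive; i++)
        if (live[i] == owner)
            return true;
    return false;
}

/* Count chunks that are stale or whose owner is no longer live. */
static inline size_t count_suspects(const debug_info *chunks, size_t n,
                                    uint32_t now, uint32_t max_age,
                                    const uint32_t *live, size_t nlive)
{
    size_t suspects = 0;
    for (size_t i = 0; i < n; i++)
        if (chunk_is_stale(&chunks[i], now, max_age) ||
            !owner_is_live(chunks[i].owner, live, nlive))
            suspects++;
    return suspects;
}
```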

Chunk fill patterns
It often is necessary to look directly at a heap via a debugger memory window in order to figure out what is wrong. An additional debug aid offered by eheap is chunk fill patterns, which make this task much more pleasant and productive.

When fill mode is ON, malloc() fills the data block with the data fill pattern (e.g., 0xDDDDDDDD). This is helpful both to identify inuse chunks and to see how much of an inuse chunk’s data block has actually been used. free() fills the free space of a chunk with a free fill pattern (e.g., 0xEEEEEEEE). The donor chunk and top chunk are filled with a DTC (donor and top chunk) fill pattern (e.g., 0x88888888). In addition, fences are filled with their pattern in debug chunks.

These fill patterns make it much easier to understand the heap image presented in a memory window. Old chunk headers are overwritten with the above patterns, so it is clear which chunk headers are actually in use and where chunks begin and end. In addition, it is helpful to see what memory is in use and what memory is free, and to see dc and tc clearly delineated.
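
The idea of using the data fill pattern to see how much of a block has been used can be sketched as follows. The pattern values are the examples given in this post; fill_words() and untouched_tail_words() are hypothetical helpers, not eheap functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DATA_FILL 0xDDDDDDDDu  /* example data fill pattern */
#define FREE_FILL 0xEEEEEEEEu  /* example free fill pattern */
#define DTC_FILL  0x88888888u  /* example donor and top chunk fill pattern */

/* Fill nwords 32-bit words with a pattern. */
static inline void fill_words(uint32_t *p, size_t nwords, uint32_t pattern)
{
    for (size_t i = 0; i < nwords; i++)
        p[i] = pattern;
}

/* Count the words at the end of a data block that still hold the data fill
   pattern, i.e. that appear never to have been written by the application. */
static inline size_t untouched_tail_words(const uint32_t *block, size_t nwords)
{
    size_t n = 0;
    while (n < nwords && block[nwords - 1 - n] == DATA_FILL)
        n++;
    return n;
}
```

This is only a heuristic: an application that happens to store the fill value in its own data would be counted as untouched.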

Fill mode, like debug mode, may be selectively turned ON or OFF. This is beneficial because filling chunks greatly reduces the performance of malloc() and free(). Therefore, it is possible to enable filling only chunks of interest. Since such chunks are not likely to be in the same heap area, filling them helps them to stand out against the background of chunks not of interest. In addition, knowing what kind of header to expect helps to find and interpret it.

Error checking and reporting
eheap services check all parameters and report any that are not in their expected ranges, then abort the operation to reduce damage to the heap. eheap services also check links, flags, and fences. In addition, the heap scan functions (see Part 3), if enabled to run, continually look for errors. These checks make debugging easier by detecting problems before the system “goes into the weeds.”
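
One plausible form of link check is shown below: a chunk's forward neighbor should point back to it. The chunk structure and links_ok() are hypothetical and use native pointer-sized fields so that the sketch runs on a host machine. The checks that eheap actually performs are not described in this post.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FLAG_MASK ((uintptr_t)0x7)  /* low 3 bits of blf hold flags */

/* Hypothetical minimal chunk header: forward link, backward link + flags. */
typedef struct chunk {
    struct chunk *fl;
    uintptr_t     blf;
} chunk;

/* A chunk passes if it is 8-byte aligned, has a forward neighbor, and that
   neighbor's backward link (with flags masked off) points back to it. */
static inline bool links_ok(const chunk *c)
{
    if (c == NULL || ((uintptr_t)c & FLAG_MASK) != 0)
        return false;
    if (c->fl == NULL)
        return false;
    return (c->fl->blf & ~FLAG_MASK) == (uintptr_t)c;
}
```

A service that finds a failed check can report the error and return without touching the chunk, which limits further damage.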

Summary
eheap is a new embedded heap that offers significant debug features. These can be of great value in finding and fixing heap usage bugs, which often are difficult to track down by other means. eheap is freely available for non-commercial use. See www.smxrtos.com/eheap.

References
[1] This is true only when debugging. In normal operation, the fence is fixed and the fix is reported.

Ralph Moore, President and Founder of Micro Digital, graduated with a degree in Physics from Caltech. He spent his early career in computer research, then moved into mainframe design and consulting.

Micro Digital
www.smxrtos.com
ralph@smxrtos.com