Allocator Builder
Policy Based C++ Template Allocator Library
|
Let's say that we need an unknown amount of memory inside a function. There are different ways to handle this:
The best would be, if we would have a solution that takes the memory if possible from the stack and if not it goes to the heap. One way would be to use a fallback_allocator in the following way:
Now we get the memory from the stack until 4096 bytes. Just beyond this, a call to ::malloc() would be done. From my point of view this is a cleaner solution. As well it offers the possibility, easily to expand or reallocate the returned block; by design it automatically covers moving data allocated on the stack to the heap, if not enough memory is left on the stack.
Many string class designs use today for short strings a so called "small string optimization". That means that the characters of the string a stored in a const buffer member or the class if the string is shorter than 16 bytes e.g. Memory is allocated from the heap if the representing text is longer. Something similar like: (I do not claim that this is best possible implementation :-) )
Let's further assume that SIMD CPU instructions are used in further methods to implement comparison, copy etc. These instructions work on modern CPUs (at least on x86) always on 16 bytes chunks. So we would allocate on the heap the memory in 16 bytes aligned chunks, e.g. 32, 48, 64 etc. So it would be good, if the allocator, that we want to use, automatically returns such aligned memory. But that is not all. Further we assume that our application does lot's of string operations. So there would be lot's of allocation and deallocation operations. We already discussed that these operations are quite expensive. So why not keep the not any more used memory blocks in a pool (or free-list) and reuse them soon. So we start with a freelist:
This instance allocates all memory from the heap (via the mallocator) and it stores elements up to 32 bytes in the pool. The max pool size is 1024 elements. This allocator would help for strings that would never be bigger than 32 bytes. In normal cases they get bigger, so let's use a bucketizer that can manage several free-lists:
So the typedef specifies a free-list that's min- and max size can be changed during runtime. (Normally this values are set during compile time). The bucketizer creates free-lists in increasing 16 bytes steps capacity, [17-32], [33-48], [49-64], ... Each free-list has a maximum capacity of 1024 elements. So now we can handle strings up to the length of 512 bytes. If this strings can get longer, we have to add a segregator:
Now all allocations up to 512 bytes are handled by the bucketizer and all above is taken directly from the heap. The code from the string class would be rewritten to this:
The std::make_unique and std::make_shared creation function cannot take their memory from custom allocators. They always allocate the memory via the global ::new() functions.
Create an ALB allocator of any that type that has as the outmost allocator an alb::affix_allocator.
Now use the ALB variants of these factory functions.
Let's assume that we see potential in replacing the standard heap with our own, custom optimized version. In general this is pretty easy. We only have to replace the global new(), new[]() and the corresponding delete operators. A regular implementation might look like:
Let's make the assumption that MyAllocator is a combined allocator from the Allocator-Builder (ALB) library that should serve as the basis of the memory allocation.
We cannot use this MyAllocator directly, because the ALB relies on the fact that always an alb::block must be passed to deallocate().
Beside the pointer to the memory it contains the length. The operator ::delete(void*) does not get passed the size of the block. So we have to keep track by ourselves. Probably the easiest way is to encapsulate our MyAllocator within an affix_allocator with a prefix that contains the length of the block. I choose for the moment the prefix of type uint32_t, because that can cover memory allocations up to 4GB which is fine for now. (When we come to the point, that it might happen, our application needs more than 4GB in a single chunk, then we can change the prefix to uint64_t).
Let's make as a next step two convenience function for allocation and releasing the memory:
void* operator new(std::size_t sz) { return operatorNewInternal(sz); }
void* operator new[](std::size_t sz) { return operatorNewInternal(sz); }
void operator delete(void* ptr) { operatorDeleteInternal(ptr); }
void operator delete[](void* ptr) { operatorDeleteInternal(ptr); }
Or even better, we use a ready template from the ALB library and change the code of the helper functions to:
Let's assume for the moment that we have a STL container under heavy load and with lot's of size changes or lot's of re-balancing in case of a tree base one. (I know, that changing the design in a way, that regularly rebalancing of the tree is avoided totally is even better.) But stick to the idea for the moment. So we have to implement the complete interface as it is described e.g. in Nicolai Josuttis excellent book "The Standard Template Library". –> Refer to adendum pdf!
Not very much must be done. Just take the generic alb::stl_allocator and we are done:
Now we can use in in our code as:
Or when we have to deal with a map: