Guides#

pw_allocator: Flexible, safe, and measurable memory allocation

Get started#

Bazel

Add @pigweed//pw_allocator to the deps list in your Bazel target:

cc_library("...") {
  # ...
  deps = [
    # ...
    "@pigweed//pw_allocator",
    # ...
  ]
}

For a narrower dependency, depend on subtargets, e.g. @pigweed//pw_allocator:tracking_allocator.

This assumes @pigweed is the name you pulled Pigweed into your Bazel WORKSPACE as.

This assumes that your Bazel WORKSPACE has a repository named @pigweed that points to the upstream Pigweed repository.

Add dir_pw_allocator to the deps list in your build target:

pw_executable("...") {
  # ...
  deps = [
    # ...
    dir_pw_allocator,
    # ...
  ]
}

For a narrower dependency, depend on subtargets, e.g. "$dir_pw_allocator:tracking_allocator".

If the build target is a dependency of other targets and includes allocators in its public interface, e.g. its header file, use public_deps instead of deps.

CMake

Add pw_allocator to your pw_add_library or similar CMake target:

pw_add_library(my_library STATIC
  HEADERS
    ...
  PRIVATE_DEPS
    # ...
    pw_allocator
    # ...
)

For a narrower dependency, depend on subtargets, e.g. pw_allocator.tracking_allocator, etc.

If the build target is a dependency of other targets and includes allocators in its public interface, e.g. its header file, use PUBLIC_DEPS instead of PRIVATE_DEPS.

Module configuration#

This module has configuration options that globally affect the behavior of pw_allocator via compile-time configuration of this module, see the module documentation for more details.

PW_ALLOCATOR_STRICT_VALIDATION#

Enables additional checks and crashing on check failure.

Individual allocators have a number of invariants that must hold at before and after each API call. For example, for a given block that is not first or last block->Next()->Prev() == block and block->Prev()->Next() == block.

These invariants can be checked at the beginning an end of each API call, but doing so may become expensive. Additionally, it may not always be clear what should be done in a production setting if an invariant fails, e.g. should it crash, log, or something else?

As a result, these checks and the behavior to crash on failure are only enabled when strict validation is requested. Strict validation is always enabled for tests.

PW_ALLOCATOR_BLOCK_POISON_INTERVAL#

Controls how frequently blocks are poisoned on deallocation.

Blocks may be “poisoned” when deallocated by writing a pattern to their useable memory space. When next allocated, the pattern is checked to ensure it is unmodified, i.e. that nothing has changed the memory while it was free. If the memory has been changed, then a heap-overflow, use-after-free, or other memory corruption bug exists and the program aborts.

If set to 0, poisoning is disabled. For any other value N, every Nth block is poisoned. This allows consumers to stochiastically sample allocations for memory corruptions while mitigating the performance impact.

Inject allocators#

Routines that need to allocate memory dynamically should do so using the generic Allocator interface. By using dependency injection, memory users can be kept decoupled from the details of how memory is provided. This yields the most flexibility for modifying or replacing the specific allocators.

Use the `Allocator` interface#

Write or refactor your memory-using routines to take a pointer or reference to such an object and use the Allocate, Deallocate, Reallocate, and Resize methods to manage memory. For example:

using pw::allocator::Layout;

void* AllocateNamedU32(pw::Allocator& allocator) {
  return allocator.Allocate(Layout::Of<NamedU32>());
}

void DeallocateNamedU32(pw::Allocator& allocator, void* ptr) {
  allocator.Deallocate(ptr);
}

To invoke methods or objects that inject allocators now requires an allocator to have been constructed. The exact location where allocators should be instantiated will vary from project to project, but it’s likely to be early in a program’s lifecycle and in a device-specific location of the source tree.

For initial testing on host, a simple allocator such as LibCAllocator can be used. This allocator is trivally constructed and simply wraps malloc and free.

Use New and Delete#

In addition to managing raw memory, an Allocator can also be used to create and destroy objects:

NamedU32* NewNamedU32(pw::Allocator& allocator,
                      std::string_view name,
                      uint32_t value) {
  return allocator.New<NamedU32>(name, value);
}

void DeleteNamedU32(pw::Allocator& allocator, NamedU32* named_u32) {
  allocator.Delete<NamedU32>(named_u32);
}

Use UniquePtr<T>#

Where possible, using RAII is a recommended approach for making memory management easier and less error-prone. UniquePtr is a smart pointer that makes allocating and deallocating memory more transparent:

pw::UniquePtr<NamedU32> MakeNamedU32(pw::Allocator& allocator,
                                     std::string_view name,
                                     uint32_t value) {
  return allocator.MakeUnique<NamedU32>(name, value);
}

Determine an allocation’s Layout#

Several of the Allocator methods take a parameter of the Layout type. This type combines the size and alignment requirements of an allocation. It can be constructed directly, or if allocating memory for a specific type, by using a templated static method:

template <typename T, typename... Args>
T* my_new(Args... args) {
  void* ptr = allocator.Allocate(pw::allocator::Layout::Of<T>());
  return ptr == nullptr ? nullptr : new (ptr) T(std::forward<Args>(args)...);
}

As stated above, you should generally try to keep allocator implementation details abstracted behind the Allocator interface. One exception to this guidance is when integrating allocators into existing code that assumes malloc and free semantics. Notably, free does not take any parameters beyond a pointer describing the memory to be freed.

void* my_malloc(size_t size) {
  return allocator.Allocate(
      pw::allocator::Layout(size, alignof(std::max_align_t)));
}

void my_free(void* ptr) { allocator.Deallocate(ptr); }

Use standard library containers#

All of C++’s standard library containers are AllocatorAwareContainers, with the exception of std::array. These types include a template parameter used to specify an allocator type, and a constructor which takes a reference to an object of this type.

While there are Differences with C++ polymorphic allocators, an Allocator can be used with these containers by wrapping them with a PMR adapter type, PmrAllocator:

#include <map>
#include <string_view>

#include "pw_allocator/allocator.h"
#include "pw_allocator/pmr_allocator.h"

namespace examples {

class LibraryIndex {
 public:
  using StringType = ::std::pmr::string;
  using MapType = ::std::pmr::multimap<StringType, StringType>;
  using Iterator = typename MapType::iterator;

  LibraryIndex(pw::allocator::Allocator& allocator)
      : allocator_(allocator), by_author_(allocator_) {}

  void AddBook(std::string_view title, std::string_view author) {
    by_author_.emplace(author, title);
  }

  std::pair<Iterator, Iterator> FindByAuthor(std::string_view author) {
    return by_author_.equal_range(StringType(author, allocator_));
  }

 private:
  pw::allocator::PmrAllocator allocator_;
  MapType by_author_;
};

}  // namespace examples

Warning

Some of the standard library containers may add a significant amount of additional code size and/or memory overhead. In particular, implementations of std::deque are known to preallocate significant memory in order to meet its complexity requirements, e.g. O(1) insertion at the front of the deque.

Warning

The standard library containers expect their allocators to throw an exception on allocation failure, and do not check for failure themselves. If exceptions are disabled, PmrAllocator instead asserts that allocation succeeded. Care must be taken in this case to ensure that memory is not exhausted.

Choose the right allocator#

Once one or more routines are using allocators, you can consider how best to implement memory allocation for each particular scenario.

Concrete allocator implementations#

This module provides several allocator implementations. The following is an overview. Consult the API reference for additional details.

LibCAllocator: Uses malloc, realloc, and free. This should only be used if the libc in use provides those functions. This allocator is a stateless singleton that may be referenced using GetLibCAllocator().
NullAllocator: Always fails. This may be useful if allocations should be disallowed under specific circumstances. This allocator is a stateless singleton that may be referenced using GetNullAllocator().
BumpAllocator: Allocates objects out of a region of memory and only frees them all at once when the allocator is destroyed.
BuddyAllocator: Allocates objects out of a blocks with sizes that are powers of two. Blocks are split evenly for smaller allocations and merged on free.
BlockAllocator: Tracks memory using Block interfaces. Derived types use specific strategies for how to choose a block to use to satisfy a request. See also pw::allocator::Block. Derived types include:
- FirstFitAllocator: Chooses the first block that’s large enough to satisfy a request. This strategy is very fast, but may increase fragmentation.
- BestFitAllocator: Chooses the smallest block that’s large enough to satisfy a request. This strategy maximizes the avilable space for large allocations, but may increase fragmentation and is slower.
- WorstFitAllocator: Chooses the largest block if it’s large enough to satisfy a request. This strategy minimizes the amount of memory in unusably small blocks, but is slower.
- BucketAllocator: Sorts and stores each free blocks in a Bucket with a given maximum block inner size.
TypedPool: Efficiently creates and destroys objects of a single given type.

Forwarding allocator implementations#

This module provides several “forwarding” allocators, as described in Forwarding allocator concept. The following is an overview. Consult the API reference for additional details.

FallbackAllocator: Dispatches first to a primary allocator, and, if that fails, to a secondary allocator.
PmrAllocator: Adapts an allocator to be a std::pmr::polymorphic_allocator, which can be used with standard library containers that use allocators, such as std::pmr::vector<T>.
SynchronizedAllocator: Synchronizes access to another allocator, allowing it to be used by multiple threads.
TrackingAllocator: Wraps another allocator and records its usage.

Custom allocator implementations#

If none of the allocator implementations provided by this module meet your needs, you can implement your allocator and pass it into any routine that uses the generic interface.

Allocator uses an NVI pattern. To add a custom allocator implementation, you must at a miniumum implement the DoAllocate and DoDeallocate methods.

For example, the following is a forwarding allocator that simply writes to the log whenever a threshold is exceeded:

#include <cstddef>

#include "pw_allocator/allocator.h"

namespace examples {

class CustomAllocator : public pw::Allocator {
 public:
  using Layout = pw::allocator::Layout;

  CustomAllocator(pw::Allocator& allocator, size_t threshold);

 private:
  /// @copydoc pw::Allocator::Allocate
  void* DoAllocate(Layout layout) override;

  /// @copydoc pw::Allocator::Deallocate
  void DoDeallocate(void* ptr) override;

  /// @copydoc pw::Allocator::Deallocate
  void DoDeallocate(void* ptr, Layout) override { DoDeallocate(ptr); }

  pw::Allocator& allocator_;
  size_t used_ = 0;
  size_t threshold_ = 0;
};

}  // namespace examples

#include "examples/custom_allocator.h"

#include <cstdint>

#include "pw_allocator/capability.h"
#include "pw_log/log.h"
#include "pw_result/result.h"

namespace examples {

CustomAllocator::CustomAllocator(Allocator& allocator, size_t threshold)
    : Allocator(pw::allocator::Capabilities()),
      allocator_(allocator),
      threshold_(threshold) {}

// Allocates, and reports if allocated memory exceeds its threshold.
void* CustomAllocator::DoAllocate(Layout layout) {
  void* ptr = allocator_.Allocate(layout);
  if (ptr == nullptr) {
    return nullptr;
  }
  size_t prev = used_;
  pw::Result<Layout> allocated = GetAllocatedLayout(allocator_, ptr);
  if (allocated.ok()) {
    used_ += allocated->size();
  }
  if (prev <= threshold_ && threshold_ < used_) {
    PW_LOG_INFO("more than %zu bytes allocated.", threshold_);
  }
  return ptr;
}

void CustomAllocator::DoDeallocate(void* ptr) {
  if (ptr == nullptr) {
    return;
  }
  pw::Result<Layout> allocated = GetAllocatedLayout(allocator_, ptr);
  if (allocated.ok()) {
    used_ -= allocated->size();
  }
  allocator_.Deallocate(ptr);
}

}  // namespace examples

There are also several optional methods you can provide:

If an implementation of DoResize isn’t provided, then Resize will always return false.
If an implementation of DoReallocate isn’t provided, then Reallocate will try to Resize, and, if unsuccessful, try to Allocate, copy, and Deallocate.
If an implementation of DoGetInfo isn’t provided, then GetInfo will always return pw::Status::Unimplmented.

Custom allocators can indicate which optional methods they implement and what optional behaviors they want from the base class by specifying Capabilities when invoking the base class constructor.

Measure memory usage#

You can observe how much memory is being used for a particular use case using a TrackingAllocator.

struct CustomMetrics {
  PW_ALLOCATOR_METRICS_ENABLE(allocated_bytes);
  PW_ALLOCATOR_METRICS_ENABLE(peak_allocated_bytes);
  PW_ALLOCATOR_METRICS_ENABLE(num_failures);
};

  constexpr pw::metric::Token kToken = PW_TOKENIZE_STRING("CustomMetrics");
  pw::allocator::TrackingAllocatorImpl<CustomMetrics> tracker(kToken,
                                                              allocator);

Metric data can be retrieved according to the steps described in Exporting metrics, or by using the Dump method of Group:

  tracker.metric_group().Dump();

The CustomMetrics type used in the example above is a struct provided by the developer. You can create your own metrics structs that enable zero or more of the following metrics:

requested_bytes: The number of bytes currently requested from this allocator.
peak_requested_bytes: The most bytes requested from this allocator at any one time.
cumulative_requested_bytes: The total number of bytes that have been requested from this allocator across its lifetime.
allocated_bytes: The number of bytes currently allocated by this allocator.
peak_allocated_bytes: The most bytes allocated by this allocator at any one time.
cumulative_allocated_bytes: The total number of bytes that have been allocated by this allocator across its lifetime.
num_allocations: The number of allocation requests this allocator has successfully completed.
num_deallocations: The number of deallocation requests this allocator has successfully completed.
num_resizes: The number of resize requests this allocator has successfully completed.
num_reallocations: The number of reallocation requests this allocator has successfully completed.
num_failures: The number of requests this allocator has failed to complete.

If you only want a subset of these metrics, you can implement your own metrics struct. For example:

struct CustomMetrics {
  PW_ALLOCATOR_METRICS_ENABLE(allocated_bytes);
  PW_ALLOCATOR_METRICS_ENABLE(peak_allocated_bytes);
  PW_ALLOCATOR_METRICS_ENABLE(num_failures);
};

If you have multiple routines that share an underlying allocator that you want to track separately, you can create multiple tracking allocators that forward to the same underlying allocator:

  using MyTrackingAllocator =
      pw::allocator::TrackingAllocatorImpl<pw::allocator::internal::AllMetrics>;

  constexpr pw::metric::Token kToken0 = PW_TOKENIZE_STRING("Combined");
  MyTrackingAllocator combined(kToken0, allocator);

  constexpr pw::metric::Token kToken1 = PW_TOKENIZE_STRING("Tracker1");
  MyTrackingAllocator tracker1(kToken1, combined);

  constexpr pw::metric::Token kToken2 = PW_TOKENIZE_STRING("Tracker2");
  MyTrackingAllocator tracker2(kToken2, combined);

  combined.metric_group().Add(tracker1.metric_group());
  combined.metric_group().Add(tracker2.metric_group());

Measure fragmentation#

If you are using a BlockAllocator, you can use the MeasureFragmentation method to examine how fragmented the heap is. This method returns a Fragmentation struct, which includes the “sum of squares” and the sum of the inner sizes of the current free blocks. On a platform or host with floating point support, you can divide the square root of the sum of squares by the sum to obtain a number that ranges from 0 to 1 to indicate maximal and minimal fragmenation, respectively. Subtracting this number from 1 can give a more intuitive “fragmenation score”.

For example, consider a heap consisting of the following blocks:

100 bytes in use.
200 bytes free.
50 bytes in use.
10 bytes free.
200 bytes in use.
300 bytes free.

For such a heap, MeasureFragmentation will return 130100 and 510. The above calculation gives a fragmentation score of 1 - sqrt(130100) / 510, which is approximately 0.29.

Detect memory corruption#

The pw::allocator::Block class provides a few different mechanisms to help detect memory corruptions when they happen. First, on every deallocation it will check the integrity of the block header and assert if it has been modified.

Additionally, you can enable poisoning to detect additional memory corruptions such as use-after-frees. The Module configuration for pw_allocator includes the PW_ALLOCATOR_BLOCK_POISON_INTERVAL option, which will “poison” every N-th Block. Allocators “poison” blocks on deallocation by writing a set pattern to the usable memory, and later check on allocation that the pattern is intact. If it’s not, some routine has modified unallocated memory.

Test custom allocators#

If you create your own allocator implementation, it’s strongly recommended that you test it as well. If you’re creating a forwarding allocator, you can use AllocatorForTest. This simple allocator provides its own backing storage and automatically frees any outstanding allocations when it goes out of scope. It also tracks the most recent values provided as parameters to the interface methods.

For example, the following tests the custom allocator from Custom allocator implementations:

// TODO: b/328076428 - Use pwrev/193642.
TEST(CustomAllocatorExample, MakeUnique) {
  AllocatorForTest allocator;
  constexpr size_t kThreshold = sizeof(examples::NamedU32) * 3;
  examples::CustomAllocator custom(allocator, kThreshold);
  // log_basic::test::LogCache<32> logs;

  auto ptr1 = custom.MakeUnique<examples::NamedU32>("test", 111);
  auto ptr2 = custom.MakeUnique<examples::NamedU32>("test", 222);
  auto ptr3 = custom.MakeUnique<examples::NamedU32>("test", 333);
  // EXPECT_FALSE(logs.HasNext());

  auto ptr4 = custom.MakeUnique<examples::NamedU32>("test", 444);
  // ASSERT_TRUE(logs.HasNext());
  // log_basic::test::LogMessage log = logs.Next();
  // EXPECT_EQ(log.level, PW_LOG_LEVEL_INFO);

  // StringBuffer<40> expected;
  // expected << "more than " << kThreshold << " bytes allocated.";
  // EXPECT_STREQ(log.message.c_str(), expected.c_str());
}

You can also extend the TestHarness to perform pseudorandom sequences of allocations and deallocations, e.g. as part of a performance test:

#include <cstddef>

#include "examples/custom_allocator.h"
#include "pw_allocator/test_harness.h"
#include "pw_allocator/testing.h"

namespace examples {

class CustomAllocatorTestHarness : public pw::allocator::test::TestHarness {
 public:
  static constexpr size_t kCapacity = 0x1000;
  static constexpr size_t kThreshold = 0x800;

  CustomAllocatorTestHarness() : custom_(allocator_, kThreshold) {}

  pw::Allocator* Init() override { return &custom_; }

 private:
  pw::allocator::test::AllocatorForTest<kCapacity> allocator_;
  CustomAllocator custom_;
};

}  // namespace examples

#include <cstdint>

#include "examples/custom_allocator_test_harness.h"
#include "pw_perf_test/perf_test.h"
#include "pw_perf_test/state.h"

namespace examples {

void PerformAllocations(pw::perf_test::State& state, uint64_t seed) {
  static CustomAllocatorTestHarness harness;
  harness.set_prng_seed(seed);
  while (state.KeepRunning()) {
    harness.GenerateRequest(CustomAllocatorTestHarness::kCapacity);
  }
  harness.Reset();
}

PW_PERF_TEST(CustomAllocatorPerfTest, PerformAllocations, 1);

}  // namespace examples

Even better, you can easily add fuzz tests for your allocator. This module uses the TestHarness to integrate with pw_fuzzer and provide FuzzTest support.

void NeverCrashes(const pw::Vector<pw::allocator::test::Request>& requests) {
  static examples::CustomAllocatorTestHarness harness;
  harness.HandleRequests(requests);
}

constexpr size_t kMaxRequests = 256;
constexpr size_t kMaxSize = examples::CustomAllocatorTestHarness::kCapacity / 2;
FUZZ_TEST(CustomAllocatorFuzzTest, NeverCrashes)
    .WithDomains(
        pw::allocator::test::ArbitraryRequests<kMaxRequests, kMaxSize>());

Measure custom allocator size#

If you create your own allocator implementation, you may wish to measure its code size, similar to measurements in the module’s own Code size analysis. You can use pw_bloat and SizeReporter to create size reports as described in Defining size reports in GN.

For example, the C++ code for a size report binary might look like:

#include <cstdint>

#include "examples/custom_allocator.h"
#include "pw_allocator/first_fit.h"
#include "pw_allocator/size_reporter.h"

int main() {
  pw::allocator::SizeReporter reporter;
  reporter.SetBaseline();

  pw::allocator::FirstFitAllocator<> allocator(reporter.buffer());
  examples::CustomAllocator custom(allocator, 128);
  reporter.Measure(custom);

  return 0;
}

The resulting binary could be compared with the binary produced from pw_allocator/size_report/first_fit.cc to identify just the code added in this case by CustomAllocator.

For example, the GN build rule to generate a size report might look liek:

pw_size_diff("size_report") {
  title = "Example size report"
  binaries = [
    {
      target = ":size_report"
      base = "$dir_pw_allocator/size_report:first_fit"
      label = "CustomAllocator"
    },
  ]
}

The size report produced by this rule would render as:

Label

Segment

Delta

CustomAllocator

+240