pw_enum#
pw_enum: Rich enum support
pw_enum supports automatic stringifying and tokenizing of C++ enums. It
works by parsing standard C++ header files and generating versions of those
headers with the minimal additions needed to support these features.
Why use pw_enum?
Efficient string or tokenized logging: Logs enums as strings or tokens automatically, depending on the logging backend.
Automatic content-based versioning: Generates version hashes so that tokenized enums do not collide as their values change.
Automatic tokenized and stringified enums#
pw_enum works on enums declared in standard C++ header files. To use
pw_enum:
Declare one or more enums in a header file.
Include the header file in a pw_cc_enum target instead of a standard cc_library.
Include pw_enum/generate.h in the header file.
Register the enum using the PW_ENUM(MyEnum, …) macro at global scope. List the fully qualified enum name, followed by its enumerators. Aliases (enumerators that share a value) may be listed together or left out; see Enumerator aliases.
Important
The PW_ENUM macro must be called at global scope (outside of any
namespace blocks, class definitions, or functions).
If PW_ENUM is called inside a namespace block, class, or function, the
C++ compiler will reject it with a compilation error indicating that the
template specialization of
_PW_ENUM_cannot_be_used_within_namespaces must occur at global scope.
pw_enum headers are parsed during the build to support versioned
tokenization and stringification with pw::EnumToString().
Example#
Declare enums in a standard C++ header and call PW_ENUM(MyEnum, …) at the bottom of the file, outside of any namespace blocks (in the global namespace).
#pragma once
#include <cstdint>
#include "pw_enum/generate.h"
namespace my::nested::pkg {
// Declare the enum as normal.
enum class MyEnum : uint8_t {
  kAlpha = 0,
  kBeta,
  kAliasedBeta = kBeta,
};
} // namespace my::nested::pkg
PW_ENUM(my::nested::pkg::MyEnum, kAlpha, kBeta, kAliasedBeta);
Use the enum normally. It is tokenized with PW_TOKENIZE_ENUM and works with tokenized logs and pw::EnumToString().
#include "enum_example/basic_enum.h"
#include "pw_enum/to_string.h"
#include "pw_log/log.h"
#include "pw_log/tokenized_args.h"
namespace my::nested::pkg {
const char* HandleEnum(MyEnum value) {
  // Log the enum as a string or token, depending on the logging backend.
  PW_LOG_INFO("The enum value is: " MY_NESTED_PKG_MY_ENUM, PW_LOG_ENUM(value));
  switch (value) {
    case MyEnum::kAlpha:
      // Handle case kAlpha
      break;
    case MyEnum::kBeta:
      // Handle case kBeta
      break;
  }
  // The const char* string version of the enum is always available.
  return pw::EnumToString(value);
}
} // namespace my::nested::pkg
Enums can reference values from other enums, even if they reside in different files and namespaces.
#pragma once
#include <cstdint>
#include "enum_example/basic_enum.h"
#include "pw_enum/generate.h"
namespace my::nested::pkg {
enum class OtherEnum : uint8_t {
  kFirst = 0,
  kSecond = static_cast<uint8_t>(my::nested::pkg::MyEnum::kBeta),
};
enum class AnotherEnum {
  kX = 1,
  kY = 2,
};
} // namespace my::nested::pkg
PW_ENUM(my::nested::pkg::OtherEnum, kFirst, kSecond);
PW_ENUM(my::nested::pkg::AnotherEnum, kX, kY);
#pragma once
#include "enum_example/basic_enum.h"
#include "pw_enum/generate.h"
namespace my::other::pkg {
enum class ReferencesOtherEnum {
  kFirstValue = 0,
  kFromOther = static_cast<int>(my::nested::pkg::MyEnum::kAlpha),
};
} // namespace my::other::pkg
PW_ENUM(my::other::pkg::ReferencesOtherEnum, kFirstValue, kFromOther);
pw_cc_enum(
    name = "advanced_enum",
    hdrs = ["enum_example/references_other_enum.h"],
    strip_include_prefix = ".",
    deps = [":basic_enum"],
)
Enumerator names#
By default, enumerator names that follow Google’s kEnumName style are
converted to upper snake case, without the k prefix (ENUM_NAME). Names
that do not follow Google style are used directly.
To override the default enumerator name, specify it in the PW_ENUM(name,
…) macro with a string literal after =. For example:
PW_ENUM(my::Enum,                 // String name:
        kStandardStyle,           // "STANDARD_STYLE"
        kCustom = "custom_name",  // "custom_name"
        nonStandard,              // "nonStandard"
);
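The default naming rule can be sketched in a few lines of C++. This is only an illustration: DefaultDisplayName is a hypothetical helper, not part of pw_enum, and the test for "Google style" (a leading k followed by an uppercase letter) and the word-splitting rule are assumptions that may differ from the real generator.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper, not part of pw_enum: derives a default display name.
std::string DefaultDisplayName(std::string_view name) {
  // Assumed test for Google style: 'k' followed by an uppercase letter.
  const bool google_style =
      name.size() >= 2 && name[0] == 'k' &&
      std::isupper(static_cast<unsigned char>(name[1])) != 0;
  if (!google_style) {
    return std::string(name);  // Other names are used directly.
  }

  std::string result;
  for (std::size_t i = 1; i < name.size(); ++i) {
    const unsigned char c = static_cast<unsigned char>(name[i]);
    // Start a new word at an uppercase letter that follows a lowercase
    // letter or a digit.
    if (i > 1 && std::isupper(c) != 0) {
      const unsigned char prev = static_cast<unsigned char>(name[i - 1]);
      if (std::islower(prev) != 0 || std::isdigit(prev) != 0) {
        result.push_back('_');
      }
    }
    result.push_back(static_cast<char>(std::toupper(c)));
  }
  return result;
}

// Usage: mirrors the names listed in the PW_ENUM example.
void DemoDefaultDisplayName() {
  assert(DefaultDisplayName("kStandardStyle") == "STANDARD_STYLE");
  assert(DefaultDisplayName("nonStandard") == "nonStandard");
}
```

A name given explicitly with = in PW_ENUM would bypass a helper like this entirely.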
Enumerator aliases#
If multiple enumerator names share the same value (aliases), they can be
registered together in the PW_ENUM macro. The generator groups registered
aliases, sorting their display names alphabetically and joining them with |
(e.g. "ALIAS_ALPHA|ALPHA"). To omit aliases, simply leave them out of
PW_ENUM.
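The grouping step can be pictured with the following sketch. It is not the generator's code: GroupAliases is a hypothetical function, the input is assumed to be display names paired with already-evaluated values, and byte-wise string ordering is assumed for "alphabetically".

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch, not pw_enum's generator: groups display names by
// value, sorts each group, and joins aliases with '|'.
std::map<long long, std::string> GroupAliases(
    const std::vector<std::pair<std::string, long long>>& enumerators) {
  std::map<long long, std::vector<std::string>> names_by_value;
  for (const auto& [name, value] : enumerators) {
    names_by_value[value].push_back(name);
  }

  std::map<long long, std::string> joined;
  for (auto& [value, names] : names_by_value) {
    std::sort(names.begin(), names.end());  // Byte-wise order (an assumption).
    std::string label;
    for (const std::string& name : names) {
      if (!label.empty()) {
        label += '|';
      }
      label += name;
    }
    joined[value] = label;
  }
  return joined;
}

// Usage: BETA and ALIASED_BETA share the value 1, so they are joined.
void DemoGroupAliases() {
  const auto groups =
      GroupAliases({{"ALPHA", 0}, {"BETA", 1}, {"ALIASED_BETA", 1}});
  assert(groups.at(0) == "ALPHA");
  assert(groups.at(1) == "ALIASED_BETA|BETA");
}
```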
Logging enums#
Enums generated by pw_enum natively support Pigweed’s tokenized logging
infrastructure.
Versioned format macro: pw_enum generates a macro to use in the format string for the enum. The macro is named for the namespace and enum name (e.g. MY_NESTED_PKG_MY_ENUM). The macro evaluates to a string literal that can be concatenated into a format string.
The macro is versioned based on the enum’s contents. The version changes automatically when the enum changes, so tokenized logs of enums never have collisions.
A *_DOMAIN macro (e.g. MY_NESTED_PKG_MY_ENUM_DOMAIN) is also generated with the enum’s tokenization domain, for use with nested tokenization.
Argument macro: Include pw_log/tokenized_args.h and use PW_LOG_ENUM(value) as the argument to the log statement.
When using a tokenizing logging backend, the generated format macro evaluates to
PW_TOKEN_FMT(::namespace::Enum), and PW_LOG_ENUM resolves to
pw::tokenizer::EnumToToken(), logging the 32-bit token. When using a
standard string-based logging backend, the format macro yields the string format
specifier %s, and PW_LOG_ENUM resolves to pw::EnumToString(),
which yields the string representation.
Example#
PW_LOG_INFO("State " MY_NESTED_PKG_OTHER_ENUM ": received packet",
            PW_LOG_ENUM(state));
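The format macro works because adjacent string literals are merged by the compiler. The sketch below imitates the string-backend case without any Pigweed headers. DEMO_PKG_STATE_FMT, State, StateName, and FormatStateLog are all hypothetical stand-ins; the real macro is generated by pw_cc_enum and the real argument comes from PW_LOG_ENUM.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical stand-in for a generated format macro under a string-based
// logging backend.
#define DEMO_PKG_STATE_FMT "%s"

enum class State { kIdle, kBusy };

// Hypothetical stand-in for the string conversion used by the argument macro.
const char* StateName(State state) {
  switch (state) {
    case State::kIdle:
      return "IDLE";
    case State::kBusy:
      return "BUSY";
  }
  return "UNKNOWN";
}

std::string FormatStateLog(State state) {
  char buffer[64];
  // The three adjacent literals merge into "State %s: received packet".
  std::snprintf(buffer,
                sizeof(buffer),
                "State " DEMO_PKG_STATE_FMT ": received packet",
                StateName(state));
  return buffer;
}

// Usage: the enum's name lands where the macro was placed.
void DemoFormatStateLog() {
  assert(FormatStateLog(State::kBusy) == "State BUSY: received packet");
}
```

With a tokenizing backend, only the macro's expansion and the argument type would change; the call site would stay the same.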
Build integration#
pw_enum provides build integration for Bazel, GN, and CMake.
Bazel: Use the pw_cc_enum rule from //pw_enum:pw_cc_enum.bzl.
pw_cc_enum(
    name = "basic_enum",
    hdrs = ["private/pw_enum_private/basic_enum.h"],
    strip_include_prefix = "private",
    deps = [":base_enum"],
)
GN: Use the pw_cc_enum template from "$dir_pw_enum/pw_cc_enum.gni".
pw_cc_enum("basic_enum") {
  public = [ "private/pw_enum_private/basic_enum.h" ]
  deps = [ ":base_enum" ]
  include_dirs = [ "private" ]
}
CMake: Use the pw_cc_enum function from pw_enum/pw_cc_enum.cmake.
pw_cc_enum(pw_enum.basic_enum
  HEADERS
    private/pw_enum_private/basic_enum.h
  PUBLIC_INCLUDES
    private
  PUBLIC_DEPS
    pw_enum.base_enum
)
Stringifying enums#
pw::EnumToString() returns a string version of an enum. It uses a FTADLE
extension point PwEnumToString(enum). FTADLE is a pattern that enables
customization by searching for a matching function via Argument-Dependent Lookup
(ADL). For more information, see Designing Extension Points With FTADLE.
If you don’t use pw_cc_enum, you can manually use PW_TOKENIZE_ENUM
in pw_tokenizer/enum.h to
tokenize the enum and implement PwEnumToString.
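A minimal sketch of the ADL-based extension point follows. It does not use Pigweed headers: demo::EnumToString is a stand-in for a dispatch function like pw::EnumToString(), and app::Color is a hypothetical enum. The exact signature that pw_enum expects for PwEnumToString is an assumption here (a free function taking the enum and returning const char*).

```cpp
#include <string_view>

namespace demo {
// Stand-in for a dispatch function such as pw::EnumToString(). The
// unqualified call lets ADL find PwEnumToString in the enum's namespace.
template <typename Enum>
constexpr const char* EnumToString(Enum value) {
  return PwEnumToString(value);
}
}  // namespace demo

namespace app {
enum class Color { kRed, kGreen };

// Extension point: a free function declared alongside the enum.
constexpr const char* PwEnumToString(Color color) {
  switch (color) {
    case Color::kRed:
      return "RED";
    case Color::kGreen:
      return "GREEN";
  }
  return "UNKNOWN";
}
}  // namespace app

// Usage: demo:: never names app::, yet the overload is found through ADL.
static_assert(std::string_view(demo::EnumToString(app::Color::kRed)) == "RED");
```

The dispatch function needs no knowledge of the enum's namespace, which is what lets a library-level function serve enums defined anywhere.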
Cross language support#
pw_enum is currently C++-only, but could be expanded to support other
languages. The parser extracts the full enum definition and resolves all
enumerator values, so it would be straightforward to generate compatible enum
definitions for other languages from a C++ header. pw_enum could also
support an alternate format, such as JSON, for the original enum definition, and
generate C++ and other languages from that.
Background#
pw_tokenizer is one of Pigweed’s most widely adopted features. It has supported nested tokenization—a tokenized message inside another tokenized message—since the early days. Initially, only Base64-encoded messages were supported, which is inefficient. Support for directly encoding nested messages as 32-bit integers was added later (see 0105: Nested Tokens and Tokenized Log Arguments).
With support for encoding tokens as integers, supporting rich enums was a clear next step. This culminated in the creation of pw_tokenizer/enum.h and its supporting macros. This approach uses the enum’s integral value as a nested token, discriminated by its namespace to avoid collisions between different enum types. The result is highly efficient enum logs that are still readable and user-friendly.
The need for versioning#
Real-world deployment of pw_tokenizer/enum.h soon revealed a critical flaw. When enum values were changed or reordered during development, the resulting tokens changed. When merging token databases from different builds, this led to collisions, where the same token mapped to different string representations. It became clear that enum tokenization required versioning.
Several alternatives were explored for automatic enum versioning. A key constraint was that enumerator values must be settable with arbitrary expressions, which may reference constants or other enum values. Approaches considered included:
Tokenize names instead of values: Hash the enumerator names and generate a function with a switch statement to map values to tokens at runtime.
Version in the domain: Incorporate a hash of the enum’s contents (names and values) into the tokenization domain, requiring two arguments to log an enum (the version and the value).
Calculate tokens from a base: Use a hash of the enum’s contents as a base offset and add the enum value to it at the call site.
Ultimately, these approaches were ruled out because they increased code size relative to the existing implementation, primarily due to the additional code required at the call site.
The code size penalty could be avoided if there were a constexpr way to
insert the enum’s version into the log format string. Then, the existing token
logging macros could be used (PW_LOG_FMT). For example:
// If there were a way to define this macro during compilation, versioned
// enums would have no code size cost relative to unversioned enums.
#define MY_ENUM_FMT PW_LOG_FMT("::my::Enum::version_1234")
PW_LOG_INFO("My enum: " MY_ENUM_FMT, PW_LOG_ENUM(my_enum));
Unfortunately, there is no way to get the versioned enum domain into a
concatenatable string literal. This is required for compatibility with
pw_log’s C-style API. If pw_log offered a C++-only API, this would be
feasible, but adding such an API was out of scope.
Generating enums#
Generating enums appeared to be the only way to get the enum’s version into a
string literal at compile time. This led to the creation of pw_enum.
JSON definition#
The initial implementation of pw_enum generated C++ headers from JSON files.
While this worked well technically, it proved too difficult for projects to
adopt due to the friction of maintaining JSON definitions for standard C++
enums. Protocol Buffers were considered in place of JSON, but they are too
limited for this use case. Protobufs do not support setting enumerator values
based on other enums or external constants.
Parse C++ source#
Finally, multiple approaches for parsing enum definitions out of C++ source code were explored. These included:
Use libclang from Python to parse header files. This would be robust and even perform constexpr evaluation of enumerator values. Unfortunately, libclang is a large dependency, and is not readily available on all platforms.
Parse clang’s -ast-dump output. This would be fairly robust, but would involve parsing moderately complex, non-standard text output intended for human consumption. It also requires clang, which not all projects build with.
Use a custom Python parser. This approach would be utterly impractical and brittle.
Ultimately, parsing C++ source directly proved infeasible. The final design avoids parsing arbitrary C++ source by having users register each enum with the PW_ENUM(name, …) macro.
Design#
The final design of pw_enum addresses the constraints identified during its
evolution by combining standard C++ header files with a specialized build-time
generator powered by compile-time template evaluation.
This architecture provides a seamless user experience with zero runtime overhead and robust protection against database collisions, while maintaining the full expressiveness of standard C++ enum definitions.
Source files#
Users define enums in standard C++ header files. To opt in to pw_enum, the
header includes pw_enum/generate.h and registers each enum by calling the
PW_ENUM(…) macro with the enum name and names of all of its
enumerators.
The macro’s primary purpose is to capture enum metadata in an easily parsable
format. PW_ENUM() expands to the enum name and list of enumerators,
surrounded by unique markers. A Python script searches a preprocessed source file for the
markers and extracts the enum metadata.
The macro also serves to require that users list the file in a pw_cc_enum
target (see Build system integration), which is necessary for it to be
processed. If the file is not processed by pw_enum machinery, the macro
expands to static_assert(false), causing the build to fail with an
informative error.
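The marker search can be illustrated with a short sketch. The real extraction is done by a Python script, and the marker strings that PW_ENUM() emits are not given here, so kBegin, kEnd, and ExtractMarked are hypothetical names chosen only to show the idea in C++.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical marker strings; pw_enum's actual markers are not shown here.
constexpr std::string_view kBegin = "@DEMO_ENUM_BEGIN@";
constexpr std::string_view kEnd = "@DEMO_ENUM_END@";

// Returns the text between each begin/end marker pair in preprocessed source.
std::vector<std::string> ExtractMarked(std::string_view source) {
  std::vector<std::string> found;
  std::size_t pos = 0;
  while ((pos = source.find(kBegin, pos)) != std::string_view::npos) {
    const std::size_t start = pos + kBegin.size();
    const std::size_t end = source.find(kEnd, start);
    if (end == std::string_view::npos) {
      break;  // Unterminated marker: stop searching.
    }
    found.emplace_back(source.substr(start, end - start));
    pos = end + kEnd.size();
  }
  return found;
}

// Usage: surrounding code is ignored; only the marked metadata is returned.
void DemoExtractMarked() {
  const auto found = ExtractMarked(
      "int x; @DEMO_ENUM_BEGIN@my::Enum, kA, kB@DEMO_ENUM_END@ int y;");
  assert(found.size() == 1);
  assert(found[0] == "my::Enum, kA, kB");
}
```

Because only the marked spans are read, the extractor never has to understand the rest of the C++ in the file.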
Enumerator evaluation#
Enumerator values can be defined by arbitrary C++ expressions. The values may change, even if the individual source file does not change.
The parse.py script evaluates enumerators by generating a source file that references them. The source file instantiates a template with the enumerators as template arguments. Compilation fails, but the compiler’s error output includes the enumerator values in an easily parsable form.
This solution is far from ideal, but has proven to be robust. It evaluates enumerators with the same toolchain as the rest of the project. Printing compile-time constants with failed template instantiations is a common workaround to achieve compile-time “printf” functionality. The script searches for a unique template name and doesn’t depend on a particular compiler or version.
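The compile-time "printf" trick can be sketched as follows. This is not the file that parse.py generates: DemoPrintEnumValues, AsInteger, and the DEMO_TRIGGER_ERROR define are hypothetical names, and the enum is repeated here only to keep the example self-contained. The failing declaration is guarded so that the file compiles by default.

```cpp
#include <cstdint>

namespace my::nested::pkg {
enum class MyEnum : std::uint8_t { kAlpha = 0, kBeta, kAliasedBeta = kBeta };
}  // namespace my::nested::pkg

// Declared but never defined, so any use that needs a complete type fails.
template <long long... kValues>
struct DemoPrintEnumValues;

// Converts an enumerator to a common integer type usable as a template
// argument.
template <typename Enum>
constexpr long long AsInteger(Enum value) {
  return static_cast<long long>(value);
}

#ifdef DEMO_TRIGGER_ERROR
// Compiling with -DDEMO_TRIGGER_ERROR fails on this declaration. The
// diagnostic typically names the incomplete type together with its
// evaluated template arguments, which a script can then search for.
DemoPrintEnumValues<AsInteger(my::nested::pkg::MyEnum::kAlpha),
                    AsInteger(my::nested::pkg::MyEnum::kBeta),
                    AsInteger(my::nested::pkg::MyEnum::kAliasedBeta)>
    demo_values;
#endif

// Without the define, the same constants can be checked directly.
static_assert(AsInteger(my::nested::pkg::MyEnum::kBeta) == 1);
static_assert(AsInteger(my::nested::pkg::MyEnum::kAliasedBeta) == 1);
```

The exact wording of the diagnostic varies by compiler, which is why searching for a unique template name is more portable than matching the full message.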
Build-time generation#
After parsing, the pw_cc_enum target runs a Python script
(pw_enum/py/pw_enum/generate.py) that generates a header.
Enum generation: The script generates a “shadowed” version of the header in the build directory. This generated header contains the original content, plus a footer with tokenization metadata. It also replaces the PW_ENUM(...) calls with _PW_ENUM_GENERATED(...). PW_ENUM(...) expands to static_assert(false) to require users to build headers with pw_cc_enum.
Versioning: A unique version hash is calculated for each enum based on its fully qualified name and the names and values of all its enumerators. This hash is used to construct a unique tokenization domain (e.g., ::namespace::_pw_enum_HASH::EnumName). This ensures that if the enum changes, the domain changes, preventing collisions in merged token databases.
Tokenization: The generated footer includes a call to PW_TOKENIZE_ENUM_CUSTOM from pw_tokenizer, which registers the enum values and their string representations in the database.
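The versioning step might look roughly like the sketch below. The hash function pw_enum uses is not specified here, so 64-bit FNV-1a is used purely as a stand-in, and Enumerator, VersionHash, and VersionedDomain are hypothetical names. Only the overall shape of the domain string follows the description above.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>
#include <string_view>
#include <vector>

struct Enumerator {
  std::string name;
  long long value;
};

// FNV-1a (64-bit) is a stand-in; the hash pw_enum uses is an unknown here.
constexpr std::uint64_t kFnvOffset = 14695981039346656037ull;
constexpr std::uint64_t kFnvPrime = 1099511628211ull;

std::uint64_t Fnv1a(std::string_view text, std::uint64_t hash) {
  for (unsigned char c : text) {
    hash = (hash ^ c) * kFnvPrime;
  }
  return hash;
}

// Hashes the qualified name plus every enumerator name and value.
std::uint64_t VersionHash(std::string_view qualified_name,
                          const std::vector<Enumerator>& enumerators) {
  std::uint64_t hash = Fnv1a(qualified_name, kFnvOffset);
  for (const Enumerator& e : enumerators) {
    hash = Fnv1a(";", hash);
    hash = Fnv1a(e.name, hash);
    hash = Fnv1a("=", hash);
    hash = Fnv1a(std::to_string(e.value), hash);
  }
  return hash;
}

// Builds a domain shaped like ::namespace::_pw_enum_HASH::EnumName.
std::string VersionedDomain(std::string_view ns,
                            std::string_view enum_name,
                            std::uint64_t hash) {
  char hex[17];
  std::snprintf(hex, sizeof(hex), "%016llx",
                static_cast<unsigned long long>(hash));
  return "::" + std::string(ns) + "::_pw_enum_" + hex +
         "::" + std::string(enum_name);
}

// Usage: changing one value changes the hash, and therefore the domain.
void DemoVersioning() {
  const std::vector<Enumerator> before = {{"kAlpha", 0}, {"kBeta", 1}};
  const std::vector<Enumerator> after = {{"kAlpha", 0}, {"kBeta", 2}};
  assert(VersionHash("my::Enum", before) != VersionHash("my::Enum", after));
  assert(VersionedDomain("my", "Enum", 0x1234) ==
         "::my::_pw_enum_0000000000001234::Enum");
}
```

Because the version lives in the domain rather than in each token, the integral enum value can still be logged directly, with no extra work at the call site.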
Build system integration#
The pw_cc_enum build rule automates the process of parsing C++ headers and
generating versioned enum metadata. It invokes the pw_enum generator with
the correct compilation flags and ensures the generated headers are prioritized
during compilation.
Bazel (pw_cc_enum.bzl) creates an internal library target to collect the compilation flags (includes, defines) required to parse the header correctly and passes them to the Python script.
CMake (pw_cc_enum.cmake) creates an internal interface library to collect includes and defines from dependencies, and uses file(GENERATE) to produce a flags file for the generator. It uses -iquote to ensure that the build system prioritizes the generated shadowed header over the original source header.
GN (pw_cc_enum.gni) compiles a placeholder C++ file with the enum’s dependencies to generate a target Ninja file. The generator script parses the target Ninja file and the toolchain’s Ninja file to extract the compiler and its compilation flags (defines, includes, and flags) to run the evaluation step. This is similar to how pw_compilation_testing works. Like CMake, the GN build uses -iquote to ensure that the build system prioritizes the generated shadowed header over the original source header.