#include <token_database.h>
Classes
class Entries
struct Entry
    An entry in the token database.
class iterator
    Iterator for TokenDatabase values.
Public Types
using value_type = Entry
using size_type = std::size_t
using difference_type = std::ptrdiff_t
using reference = value_type &
using const_reference = const value_type &
using pointer = const value_type *
using const_pointer = const value_type *
using const_iterator = iterator
using reverse_iterator = std::reverse_iterator<iterator>
using const_reverse_iterator = std::reverse_iterator<const_iterator>
Public Member Functions
constexpr TokenDatabase()
    Creates a database with no data. ok() returns false.
Entries Find(uint32_t token) const
    Returns all entries associated with this token. This is O(n).
constexpr size_type size() const
    Returns the total number of entries (unique token-string pairs).
constexpr bool ok() const
constexpr iterator begin() const
    Returns an iterator for the first token entry.
constexpr iterator end() const
    Returns an iterator for one past the last token entry.
Static Public Member Functions
template<typename ByteArray>
static constexpr bool IsValid(const ByteArray &bytes)
template<const auto & kDatabaseBytes>
static constexpr TokenDatabase Create()
template<typename ByteArray>
static constexpr TokenDatabase Create(const ByteArray &database_bytes)
Static Public Attributes
static constexpr uint32_t kDateRemovedNever = 0xFFFFFFFF
Reads entries from a v0 binary token string database. This class does not copy or modify the contents of the database.
The v0 token database has two significant shortcomings:
- Strings cannot contain null terminators (\0). If a string contains a \0, the database will not work correctly.
- The domain is not included in entries. All tokens belong to a single domain, which must be known independently.
A v0 binary token database consists of a 16-byte header followed by an array of 8-byte entries and a table of null-terminated strings. The header specifies the number of entries. Each entry contains information about a tokenized string: the token and removal date, if any. All fields are little-endian.
The token removal date is stored within an unsigned 32-bit integer. It is stored as <day> <month> <year>, where <day> and <month> are 1 byte each and <year> is two bytes. The fields are set to their maximum value (0xFF or 0xFFFF) if they are unset. With this format, dates may be compared naturally as unsigned integers.
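The comparison property follows from the byte order: read as one little-endian 32-bit integer, the year lands in the two most significant bytes, the month in the next byte, and the day in the lowest byte. The sketch below illustrates this under that reading of the layout. PackDate is a hypothetical helper introduced for the example, not part of TokenDatabase.

```cpp
#include <cstdint>

// Hypothetical helper (not part of TokenDatabase): builds the 32-bit value
// that the four date bytes <day> <month> <year> produce when read as a single
// little-endian integer. Day is byte 0, month is byte 1, year is bytes 2-3.
constexpr uint32_t PackDate(uint16_t year, uint8_t month, uint8_t day) {
  return (static_cast<uint32_t>(year) << 16) |
         (static_cast<uint32_t>(month) << 8) | static_cast<uint32_t>(day);
}

// A later date always packs to a larger unsigned integer.
static_assert(PackDate(2020, 1, 31) < PackDate(2020, 2, 1),
              "month outranks day");
static_assert(PackDate(2019, 12, 31) < PackDate(2020, 1, 1),
              "year outranks month");

// With every field unset, the value equals kDateRemovedNever (0xFFFFFFFF), so
// an entry that was never removed compares later than any real date.
static_assert(PackDate(0xFFFF, 0xFF, 0xFF) == 0xFFFFFFFFu,
              "unset date is the maximum value");
```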
====== ==== =========================
Header (16 bytes)
-------------------------------------
Offset Size Field
====== ==== =========================
0      6    Magic number (``TOKENS``)
6      2    Version (``00 00``)
8      4    Entry count
12     4    Reserved
====== ==== =========================

====== ==== ==================================
Entry (8 bytes)
----------------------------------------------
Offset Size Field
====== ==== ==================================
0      4    Token
4      1    Removal day (1-31, 255 if unset)
5      1    Removal month (1-12, 255 if unset)
6      2    Removal year (65535 if unset)
====== ==== ==================================
Entries are sorted by token. A string table with a null-terminated string for each entry in order follows the entries.
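To make the layout concrete, here is a minimal sketch of walking such a buffer by hand: read the entry count from the header, then pair each 8-byte entry with the next null-terminated string. RawEntry, ReadUint32, and ReadEntries are hypothetical names for this illustration. Unlike the class documented here, the sketch copies each string, and it assumes the buffer has already been validated.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical type for illustration only.
struct RawEntry {
  uint32_t token;
  uint32_t date_removed;
  std::string text;
};

// Reads a little-endian uint32_t starting at data[offset].
inline uint32_t ReadUint32(const std::vector<uint8_t>& data,
                           std::size_t offset) {
  return static_cast<uint32_t>(data[offset]) |
         (static_cast<uint32_t>(data[offset + 1]) << 8) |
         (static_cast<uint32_t>(data[offset + 2]) << 16) |
         (static_cast<uint32_t>(data[offset + 3]) << 24);
}

// Walks a v0 database buffer. Assumes the data is valid: an intact header and
// one null-terminated string per entry.
inline std::vector<RawEntry> ReadEntries(const std::vector<uint8_t>& data) {
  constexpr std::size_t kHeaderSize = 16;
  constexpr std::size_t kEntrySize = 8;

  const uint32_t count = ReadUint32(data, 8);  // Entry count at offset 8.
  // The string table starts immediately after the last entry.
  std::size_t string_pos = kHeaderSize + count * kEntrySize;

  std::vector<RawEntry> entries;
  for (uint32_t i = 0; i < count; ++i) {
    const std::size_t entry_pos = kHeaderSize + i * kEntrySize;
    RawEntry entry{ReadUint32(data, entry_pos),
                   ReadUint32(data, entry_pos + 4), std::string()};
    // The i-th string in the table belongs to the i-th entry.
    while (data[string_pos] != 0) {
      entry.text.push_back(static_cast<char>(data[string_pos++]));
    }
    ++string_pos;  // Skip the null terminator.
    entries.push_back(std::move(entry));
  }
  return entries;
}
```

Because the strings carry no length prefix, the only way to reach the string for a given entry is to walk the table from the start, which is why access is by iteration.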
Entries are accessed by iterating over the database. An O(n) Find function is also provided. In typical use, a TokenDatabase is preprocessed by a pw::tokenizer::Detokenizer into a std::unordered_map.
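The trade-off between a linear scan and a one-time preprocessing pass can be sketched as follows. This is not the Detokenizer's implementation. SimpleEntry, TokenIndex, LinearFind, and BuildIndex are hypothetical names, and the entries are plain structs rather than values read from a real database.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a database entry.
struct SimpleEntry {
  uint32_t token;
  std::string text;
};

// One token may map to several strings when tokens collide.
using TokenIndex = std::unordered_map<uint32_t, std::vector<std::string>>;

// Linear scan: every lookup visits every entry, so each call is O(n).
inline std::vector<std::string> LinearFind(
    const std::vector<SimpleEntry>& entries, uint32_t token) {
  std::vector<std::string> matches;
  for (const SimpleEntry& entry : entries) {
    if (entry.token == token) {
      matches.push_back(entry.text);
    }
  }
  return matches;
}

// Preprocessing: a single O(n) pass builds a hash map, after which each
// lookup takes constant time on average.
inline TokenIndex BuildIndex(const std::vector<SimpleEntry>& entries) {
  TokenIndex index;
  for (const SimpleEntry& entry : entries) {
    index[entry.token].push_back(entry.text);
  }
  return index;
}
```

The linear scan is adequate for an occasional lookup. Building the map pays off when many tokens are decoded against the same database.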
template<const auto & kDatabaseBytes>
static constexpr TokenDatabase Create() [inline, static, constexpr]
Creates a TokenDatabase and checks if the provided data is valid at compile time. Accepts references to constexpr containers (array, span, string_view, etc.) with static storage duration.
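The mechanism can be sketched with a stand-in class. MiniDatabase below is hypothetical and is not the real TokenDatabase: it checks only the 6-byte magic number, enough to show how passing the data as a `const auto&` template parameter lets a static_assert reject bad data during compilation. With the real class, the equivalent call would have the form TokenDatabase::Create<kMyData>().

```cpp
#include <cstddef>
#include <iterator>
#include <string_view>

// Hypothetical stand-in for illustration; it validates only the magic number.
class MiniDatabase {
 public:
  // The data is a template argument, so it is known at compile time and can
  // be checked with static_assert. Invalid data fails to compile.
  template <const auto& kBytes>
  static constexpr MiniDatabase Create() {
    static_assert(HasMagic(kBytes), "Data does not start with TOKENS");
    return MiniDatabase(true);
  }

  constexpr bool ok() const { return ok_; }

 private:
  explicit constexpr MiniDatabase(bool ok) : ok_(ok) {}

  template <typename Bytes>
  static constexpr bool HasMagic(const Bytes& bytes) {
    constexpr std::string_view kMagic = "TOKENS";
    if (std::size(bytes) < kMagic.size()) {
      return false;
    }
    for (std::size_t i = 0; i < kMagic.size(); ++i) {
      if (static_cast<char>(bytes[i]) != kMagic[i]) {
        return false;
      }
    }
    return true;
  }

  bool ok_;
};

// A reference template argument must name an object with static storage
// duration, such as this namespace-scope constexpr array. The bytes form a
// header with an entry count of zero.
constexpr char kMyData[] = "TOKENS\0\0\0\0\0\0\0\0\0\0";

constexpr MiniDatabase kDatabase = MiniDatabase::Create<kMyData>();
static_assert(kDatabase.ok(), "kMyData passed the compile-time check");
```

Replacing kMyData with data that lacks the magic number would stop the build at the static_assert instead of yielding a database for which ok() is false.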
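template<typename ByteArray>
static constexpr TokenDatabase Create(const ByteArray &database_bytes) [inline, static, constexpr]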
Creates a TokenDatabase from the provided byte array. The array may be a span, array, or other container type. If the data is not valid, returns a default-constructed database for which ok() is false.
Prefer the Create overload that takes the data as a template parameter when possible, since that overload verifies data integrity at compile time.
template<typename ByteArray>
static constexpr bool IsValid(const ByteArray &bytes) [inline, static, constexpr]
Returns true if the provided data is a valid token database. This checks the magic number (TOKENS), the version (which must be 0), and that there is one string for each entry in the database. A database with extra strings or other trailing data is considered valid.
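The checks listed above can be sketched as a standalone constexpr function. This is a hypothetical re-implementation written from the description, not the library's code. The name LooksLikeV0Database and the sample arrays are made up for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <iterator>

// Hypothetical re-implementation of the described checks.
template <typename ByteArray>
constexpr bool LooksLikeV0Database(const ByteArray& bytes) {
  constexpr std::size_t kHeaderSize = 16;
  constexpr std::size_t kEntrySize = 8;
  constexpr char kMagic[] = {'T', 'O', 'K', 'E', 'N', 'S'};

  const std::size_t size = std::size(bytes);
  if (size < kHeaderSize) {
    return false;
  }
  // Magic number at offset 0.
  for (std::size_t i = 0; i < 6; ++i) {
    if (static_cast<char>(bytes[i]) != kMagic[i]) {
      return false;
    }
  }
  // Version at offset 6 must be 00 00.
  if (bytes[6] != 0 || bytes[7] != 0) {
    return false;
  }
  // Little-endian entry count at offset 8.
  uint32_t count = 0;
  for (std::size_t i = 0; i < 4; ++i) {
    count |= static_cast<uint32_t>(static_cast<uint8_t>(bytes[8 + i]))
             << (8 * i);
  }
  // All entries must fit in the data.
  if (count > (size - kHeaderSize) / kEntrySize) {
    return false;
  }
  // Count null terminators after the entries: one string per entry. Anything
  // after the last required string is ignored.
  uint32_t strings = 0;
  for (std::size_t pos = kHeaderSize + count * kEntrySize;
       pos < size && strings < count; ++pos) {
    if (bytes[pos] == 0) {
      ++strings;
    }
  }
  return strings == count;
}

// One entry with the string "hi".
constexpr unsigned char kOneEntry[] = {
    'T', 'O', 'K', 'E', 'N', 'S', 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
    0x78, 0x56, 0x34, 0x12, 0xFF, 0xFF, 0xFF, 0xFF, 'h', 'i', 0};
// The same database followed by trailing bytes.
constexpr unsigned char kTrailingData[] = {
    'T', 'O', 'K', 'E', 'N', 'S', 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
    0x78, 0x56, 0x34, 0x12, 0xFF, 0xFF, 0xFF, 0xFF, 'h', 'i', 0, 'x', 'y'};
// The string is missing its null terminator.
constexpr unsigned char kUnterminated[] = {
    'T', 'O', 'K', 'E', 'N', 'S', 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
    0x78, 0x56, 0x34, 0x12, 0xFF, 0xFF, 0xFF, 0xFF, 'h', 'i'};

static_assert(LooksLikeV0Database(kOneEntry), "valid database");
static_assert(LooksLikeV0Database(kTrailingData), "trailing data is allowed");
static_assert(!LooksLikeV0Database(kUnterminated), "string lacks terminator");
```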
constexpr bool ok() const [inline, constexpr]
True if this database was constructed with valid data. The database might be empty, but it has an intact header and a string for each entry.
static constexpr uint32_t kDateRemovedNever = 0xFFFFFFFF [static, constexpr]
Default date_removed for an entry in the token database if it was never removed.