#include <detokenize.h>
Public Member Functions | |
Detokenizer (const TokenDatabase &database) | |
Detokenizer (std::unordered_map< std::string, std::unordered_map< uint32_t, std::vector< TokenizedStringEntry > > > &&database) | |
Constructs a detokenizer by directly passing the parsed database. | |
DetokenizedString | Detokenize (const span< const std::byte > &encoded) const |
DetokenizedString | Detokenize (const span< const uint8_t > &encoded) const |
Overload of Detokenize for span<const uint8_t> . | |
DetokenizedString | Detokenize (std::string_view encoded) const |
Overload of Detokenize for std::string_view . | |
DetokenizedString | Detokenize (const void *encoded, size_t size_bytes) const |
Overload of Detokenize for a pointer and length. | |
DetokenizedString | DetokenizeBase64Message (std::string_view text) const |
std::string | DetokenizeText (std::string_view text, unsigned max_passes=3) const |
std::string | DetokenizeBase64 (std::string_view text) const |
std::string | DecodeOptionallyTokenizedData (const span< const std::byte > &optionally_tokenized_data) |
const DomainTokenEntriesMap & | database () const |
Static Public Member Functions | |
static Result< Detokenizer > | FromElfSection (span< const std::byte > elf_section) |
static Result< Detokenizer > | FromElfSection (span< const uint8_t > elf_section) |
Overload of FromElfSection for a uint8_t span. | |
static Result< Detokenizer > | FromElfFile (stream::SeekableReader &stream) |
static Result< Detokenizer > | FromCsv (std::string_view csv) |
Constructs a detokenizer from a parsed CSV database. | |
Decodes and detokenizes from a token database. This class builds a hash table of tokens to give O(1) average-case token lookups.
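The hash-table lookup described above can be pictured with a small standalone sketch. It does not use the pw_tokenizer types: ToyEntry, ToyTokenMap, and LookUp are hypothetical names invented for illustration, and the layout only loosely mirrors the token-to-entries map that appears in the constructor signature.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a database entry; not the pw_tokenizer type.
struct ToyEntry {
  std::string format_string;
};

// Token -> every entry that shares the token. Colliding tokens keep all
// of their candidate strings.
using ToyTokenMap = std::unordered_map<std::uint32_t, std::vector<ToyEntry>>;

// Returns the candidate entries for `token`, or an empty list if the token
// is unknown. The lookup is one hash-table probe, so it is O(1) on average.
inline const std::vector<ToyEntry>& LookUp(const ToyTokenMap& map,
                                           std::uint32_t token) {
  static const std::vector<ToyEntry> kNone;
  const auto it = map.find(token);
  return it == map.end() ? kNone : it->second;
}
```

Keeping a vector per token is one way to represent collisions, so a caller can see every string that a token might stand for.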
|
explicit |
Constructs a detokenizer from a TokenDatabase. The TokenDatabase is not referenced by the Detokenizer after construction; its memory can be freed.
std::string pw::tokenizer::Detokenizer::DecodeOptionallyTokenizedData | ( | const span< const std::byte > & | optionally_tokenized_data | ) |
Decodes data that may or may not be tokenized, such as proto fields marked as optionally tokenized.
This function currently only supports Base64 nested tokenized messages. Support for hexadecimal-encoded string literals will be added.
This function currently assumes that data which is not tokenized is printable ASCII. If it is not printable ASCII, the returned string is Base64-encoded.
[in] | optionally_tokenized_data | Data optionally tokenized. |
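The fallback rule for untokenized data (printable ASCII is returned as text, anything else is Base64-encoded) can be sketched without the library. This is a minimal illustration only: IsPrintableAscii, ToyBase64Encode, and DecodeUntokenizedFallback are hypothetical helpers, the token lookup step is deliberately left out, and the printable range 0x20 to 0x7E is an assumption of the sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: true if every byte is printable ASCII (0x20..0x7E).
inline bool IsPrintableAscii(const std::vector<std::uint8_t>& data) {
  for (std::uint8_t b : data) {
    if (b < 0x20 || b > 0x7E) return false;
  }
  return true;
}

// Hypothetical helper: standard Base64 (RFC 4648) with '=' padding.
inline std::string ToyBase64Encode(const std::vector<std::uint8_t>& data) {
  static const char kAlphabet[] =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  std::string out;
  for (std::size_t i = 0; i < data.size(); i += 3) {
    const std::size_t remaining = data.size() - i;
    // Pack up to three bytes into the top 24 bits of `chunk`.
    std::uint32_t chunk = static_cast<std::uint32_t>(data[i]) << 16;
    if (remaining > 1) chunk |= static_cast<std::uint32_t>(data[i + 1]) << 8;
    if (remaining > 2) chunk |= data[i + 2];
    out += kAlphabet[(chunk >> 18) & 0x3F];
    out += kAlphabet[(chunk >> 12) & 0x3F];
    out += remaining > 1 ? kAlphabet[(chunk >> 6) & 0x3F] : '=';
    out += remaining > 2 ? kAlphabet[chunk & 0x3F] : '=';
  }
  return out;
}

// Fallback path only: printable ASCII passes through, other data is
// returned Base64-encoded. Token lookup is omitted from this sketch.
inline std::string DecodeUntokenizedFallback(
    const std::vector<std::uint8_t>& data) {
  if (IsPrintableAscii(data)) return std::string(data.begin(), data.end());
  return ToyBase64Encode(data);
}
```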
DetokenizedString pw::tokenizer::Detokenizer::Detokenize | ( | const span< const std::byte > & | encoded | ) | const |
Decodes and detokenizes the binary-encoded message. Returns a DetokenizedString that stores all possible detokenized string results.
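As background that this page does not state, a binary-encoded message is assumed here to begin with a 32-bit little-endian token, followed by any encoded arguments. Under that assumption, the first decoding step could look like the sketch below. ReadToken is a hypothetical helper, and returning an empty optional for short input is a choice made for this sketch, not documented library behavior.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Returns the leading 32-bit little-endian token, or std::nullopt if the
// message has fewer than 4 bytes. Bytes after the token are ignored here.
inline std::optional<std::uint32_t> ReadToken(
    const std::vector<std::uint8_t>& encoded) {
  if (encoded.size() < 4) return std::nullopt;
  return static_cast<std::uint32_t>(encoded[0]) |
         static_cast<std::uint32_t>(encoded[1]) << 8 |
         static_cast<std::uint32_t>(encoded[2]) << 16 |
         static_cast<std::uint32_t>(encoded[3]) << 24;
}
```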
|
inline |
Deprecated version of DetokenizeText with no recursive detokenization. Use DetokenizeText instead.
DetokenizedString pw::tokenizer::Detokenizer::DetokenizeBase64Message | ( | std::string_view | text | ) | const
Decodes and detokenizes a Base64-encoded message. Returns a DetokenizedString that stores all possible detokenized string results.
std::string pw::tokenizer::Detokenizer::DetokenizeText | ( | std::string_view | text, unsigned | max_passes = 3 | ) | const
Decodes and detokenizes nested tokenized messages in a string.
This function currently only supports Base64 nested tokenized messages. Support for hexadecimal-encoded string literals will be added.
[in] | text | Text potentially containing tokenized messages. |
[in] | max_passes | DetokenizeText supports recursive detokenization. Tokens can expand to other tokens. The maximum number of detokenization passes is specified by max_passes (0 is equivalent to 1). |
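The effect of max_passes can be illustrated with a toy expander. This is a sketch under stated assumptions, not the library's algorithm: ToyDatabase, ExpandOnce, and ExpandText are hypothetical names, plain-text markers such as "$A" stand in for Base64 tokens, and stopping early when a pass changes nothing is a choice made for the sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical toy database: marker text -> replacement text.
using ToyDatabase = std::map<std::string, std::string>;

// One pass: scans left to right and replaces each marker found in the
// input. Replacement text is not rescanned within the same pass.
inline std::string ExpandOnce(const ToyDatabase& db, const std::string& text) {
  std::string out;
  std::size_t pos = 0;
  while (pos < text.size()) {
    bool matched = false;
    for (const auto& [marker, replacement] : db) {
      if (!marker.empty() && text.compare(pos, marker.size(), marker) == 0) {
        out += replacement;
        pos += marker.size();
        matched = true;
        break;
      }
    }
    if (!matched) out += text[pos++];
  }
  return out;
}

// Runs up to max_passes passes; 0 is treated the same as 1.
inline std::string ExpandText(const ToyDatabase& db, std::string text,
                              unsigned max_passes = 3) {
  const unsigned passes = std::max(max_passes, 1u);
  for (unsigned i = 0; i < passes; ++i) {
    std::string next = ExpandOnce(db, text);
    if (next == text) break;  // Nothing left to expand.
    text = std::move(next);
  }
  return text;
}
```

With a database in which "$A" expands to text containing "$B", and "$B" to text containing "$C", one pass resolves only the outermost marker, while three passes resolve the whole chain.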
|
static |
Constructs a detokenizer from the .pw_tokenizer.entries section of an ELF binary.
|
static |
Constructs a detokenizer from the .pw_tokenizer.entries section of an ELF binary.