Pigweed
 
pw::tokenizer::Detokenizer Class Reference

#include <detokenize.h>

Public Member Functions

 Detokenizer (const TokenDatabase &database)
 
 Detokenizer (std::unordered_map< std::string, std::unordered_map< uint32_t, std::vector< TokenizedStringEntry > > > &&database)
 Constructs a detokenizer by directly passing the parsed database.
 
DetokenizedString Detokenize (const span< const std::byte > &encoded) const
 
DetokenizedString Detokenize (const span< const uint8_t > &encoded) const
 Overload of Detokenize for span<const uint8_t>.
 
DetokenizedString Detokenize (std::string_view encoded) const
 Overload of Detokenize for std::string_view.
 
DetokenizedString Detokenize (const void *encoded, size_t size_bytes) const
 Overload of Detokenize for a pointer and length.
 
DetokenizedString DetokenizeBase64Message (std::string_view text) const
 
std::string DetokenizeText (std::string_view text, unsigned max_passes=3) const
 
std::string DetokenizeBase64 (std::string_view text) const
 
std::string DecodeOptionallyTokenizedData (const span< const std::byte > &optionally_tokenized_data)
 
const DomainTokenEntriesMap & database () const
 

Static Public Member Functions

static Result< Detokenizer > FromElfSection (span< const std::byte > elf_section)
 
static Result< Detokenizer > FromElfSection (span< const uint8_t > elf_section)
 Overload of FromElfSection for a uint8_t span.
 
static Result< Detokenizer > FromElfFile (stream::SeekableReader &stream)
 
static Result< Detokenizer > FromCsv (std::string_view csv)
 Constructs a detokenizer from a parsed CSV database.
 

Detailed Description

Decodes and detokenizes from a token database. This class builds a hash table of tokens to give O(1) token lookups.
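
The hash-table lookup described above can be illustrated with a minimal, self-contained sketch. It does not use the Pigweed API: ToyTokenTable and LookUpToken are hypothetical names, and the real class stores richer entries keyed by domain. The sketch only shows why a lookup is constant time on average and why one token can yield several candidate strings.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a parsed token database: each 32-bit token maps
// to every string that hashed to that token.
using ToyTokenTable =
    std::unordered_map<std::uint32_t, std::vector<std::string>>;

// Returns all candidate strings for a token, or an empty vector if the token
// is unknown. The lookup is a single hash-table probe, O(1) on average.
std::vector<std::string> LookUpToken(const ToyTokenTable& table,
                                     std::uint32_t token) {
  const auto it = table.find(token);
  if (it == table.end()) {
    return {};
  }
  return it->second;
}
```

Returning every candidate, rather than only the first, mirrors the idea of a result that stores all possible detokenized strings when tokens collide.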

Constructor & Destructor Documentation

◆ Detokenizer()

pw::tokenizer::Detokenizer::Detokenizer ( const TokenDatabase &  database)
explicit

Constructs a detokenizer from a TokenDatabase. The TokenDatabase is not referenced by the Detokenizer after construction; its memory can be freed.

Member Function Documentation

◆ DecodeOptionallyTokenizedData()

std::string pw::tokenizer::Detokenizer::DecodeOptionallyTokenizedData ( const span< const std::byte > &  optionally_tokenized_data)

Decodes data that may or may not be tokenized, such as proto fields marked as optionally tokenized.

This function currently only supports Base64 nested tokenized messages. Support for hexadecimal-encoded string literals will be added.

This function currently assumes that data that is not tokenized is printable ASCII. If the data is neither tokenized nor printable, the returned string is Base64-encoded.

Parameters
[in]  optionally_tokenized_data  Data optionally tokenized.
Returns
The decoded text if the data was successfully detokenized or is printable; otherwise, the data Base64-encoded.
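
The fallback described above (return printable data unchanged, otherwise Base64-encode it) can be sketched as follows. This is a minimal illustration, not the Pigweed implementation: IsPrintableAscii, Base64Encode, and PrintableOrBase64 are hypothetical helpers, the detokenization attempt itself is omitted, and treating "printable" as the byte range 0x20 through 0x7E is an assumption of the sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

// Assumption: "printable ASCII" means bytes 0x20 through 0x7E inclusive.
bool IsPrintableAscii(std::string_view data) {
  for (const char c : data) {
    const unsigned char byte = static_cast<unsigned char>(c);
    if (byte < 0x20 || byte > 0x7E) {
      return false;
    }
  }
  return true;
}

// Standard Base64 (RFC 4648) encoding with '=' padding.
std::string Base64Encode(std::string_view data) {
  static constexpr char kAlphabet[] =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  std::string out;
  out.reserve((data.size() + 2) / 3 * 4);
  for (std::size_t i = 0; i < data.size(); i += 3) {
    const std::size_t remaining = data.size() - i;
    // Pack up to three bytes into one 24-bit group.
    std::uint32_t group = static_cast<unsigned char>(data[i]) << 16;
    if (remaining > 1) {
      group |= static_cast<unsigned char>(data[i + 1]) << 8;
    }
    if (remaining > 2) {
      group |= static_cast<unsigned char>(data[i + 2]);
    }
    // Emit four 6-bit symbols, padding where input bytes were missing.
    out.push_back(kAlphabet[(group >> 18) & 0x3F]);
    out.push_back(kAlphabet[(group >> 12) & 0x3F]);
    out.push_back(remaining > 1 ? kAlphabet[(group >> 6) & 0x3F] : '=');
    out.push_back(remaining > 2 ? kAlphabet[group & 0x3F] : '=');
  }
  return out;
}

// Hypothetical fallback for data that did not detokenize: printable data is
// returned unchanged, anything else is returned Base64-encoded.
std::string PrintableOrBase64(std::string_view data) {
  return IsPrintableAscii(data) ? std::string(data) : Base64Encode(data);
}
```

With this sketch, PrintableOrBase64("hello") returns the text unchanged, while the three bytes 0x01 0x02 0x03 come back as "AQID".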

◆ Detokenize()

DetokenizedString pw::tokenizer::Detokenizer::Detokenize ( const span< const std::byte > &  encoded) const

Decodes and detokenizes the binary encoded message. Returns a DetokenizedString that stores all possible detokenized string results.
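
As background for the binary format, the sketch below reads the token from the front of an encoded message. It assumes the usual pw_tokenizer layout of a 32-bit little-endian token followed by the encoded arguments. ReadToken is a hypothetical helper, not part of this class, and rejecting inputs shorter than four bytes is a choice made for the sketch rather than documented behavior.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical helper: extracts the leading 32-bit little-endian token.
// Returns std::nullopt when fewer than four bytes are available (a choice
// made for this sketch only).
std::optional<std::uint32_t> ReadToken(const std::uint8_t* encoded,
                                       std::size_t size_bytes) {
  if (encoded == nullptr || size_bytes < 4) {
    return std::nullopt;
  }
  // Assemble the bytes explicitly so the result does not depend on the
  // host's endianness.
  return static_cast<std::uint32_t>(encoded[0]) |
         (static_cast<std::uint32_t>(encoded[1]) << 8) |
         (static_cast<std::uint32_t>(encoded[2]) << 16) |
         (static_cast<std::uint32_t>(encoded[3]) << 24);
}
```

The extracted token is what a detokenizer would look up in its database; any bytes after the first four hold the arguments for the matched format string.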

◆ DetokenizeBase64()

std::string pw::tokenizer::Detokenizer::DetokenizeBase64 ( std::string_view  text) const
inline

Deprecated version of DetokenizeText with no recursive detokenization.

Deprecated:
Call DetokenizeText instead.

◆ DetokenizeBase64Message()

DetokenizedString pw::tokenizer::Detokenizer::DetokenizeBase64Message ( std::string_view  text) const

Decodes and detokenizes a Base64-encoded message. Returns a DetokenizedString that stores all possible detokenized string results.

◆ DetokenizeText()

std::string pw::tokenizer::Detokenizer::DetokenizeText ( std::string_view  text,
unsigned  max_passes = 3 
) const

Decodes and detokenizes nested tokenized messages in a string.

This function currently only supports Base64 nested tokenized messages. Support for hexadecimal-encoded string literals will be added.

Parameters
[in]  text  Text potentially containing tokenized messages.
[in]  max_passes  Maximum number of detokenization passes. DetokenizeText supports recursive detokenization because tokens can expand to other tokens. A value of 0 is equivalent to 1.
Returns
The original string with nested tokenized messages decoded in context. Messages that fail to decode are left as-is.
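
The effect of max_passes can be illustrated with a toy model of recursive expansion. This is a sketch under stated assumptions, not the Pigweed algorithm: ToyExpansions, ExpandOnce, and ExpandText are hypothetical, and plain "$NAME" placeholders (a dollar sign followed by uppercase letters) stand in for Base64-encoded tokenized messages. Stopping early when a pass changes nothing is also a choice made for the sketch.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <unordered_map>

// Hypothetical toy database: "$NAME" placeholder -> expansion text.
using ToyExpansions = std::unordered_map<std::string, std::string>;

// Performs one pass: copies `text` to the result, replacing each known
// placeholder with its expansion. Expansions are not rescanned within the
// same pass. Sets `changed` to true if any placeholder was replaced.
std::string ExpandOnce(const ToyExpansions& expansions,
                       std::string_view text,
                       bool& changed) {
  std::string out;
  changed = false;
  std::size_t i = 0;
  while (i < text.size()) {
    if (text[i] != '$') {
      out.push_back(text[i++]);
      continue;
    }
    // Read the placeholder: '$' followed by zero or more uppercase letters.
    std::size_t end = i + 1;
    while (end < text.size() && text[end] >= 'A' && text[end] <= 'Z') {
      ++end;
    }
    const std::string name(text.substr(i, end - i));
    const auto it = expansions.find(name);
    if (it != expansions.end()) {
      out += it->second;
      changed = true;
    } else {
      out += name;  // Unknown placeholders are left as-is.
    }
    i = end;
  }
  return out;
}

// Repeats ExpandOnce up to max_passes times; 0 is treated the same as 1.
std::string ExpandText(const ToyExpansions& expansions,
                       std::string_view text,
                       unsigned max_passes = 3) {
  std::string result(text);
  const unsigned passes = max_passes == 0 ? 1 : max_passes;
  for (unsigned pass = 0; pass < passes; ++pass) {
    bool changed = false;
    result = ExpandOnce(expansions, result, changed);
    if (!changed) {
      break;  // Nothing left to expand.
    }
  }
  return result;
}
```

Given a database where "$OUTER" expands to "wrapped($INNER)" and "$INNER" expands to "done", a single pass over "log: $OUTER" leaves the nested placeholder in place, while the default of three passes resolves it fully. Bounding the number of passes also keeps a self-referential expansion from looping forever.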

◆ FromElfFile()

static Result< Detokenizer > pw::tokenizer::Detokenizer::FromElfFile ( stream::SeekableReader &  stream)
static

Constructs a detokenizer from the .pw_tokenizer.entries section of an ELF binary.

◆ FromElfSection()

static Result< Detokenizer > pw::tokenizer::Detokenizer::FromElfSection ( span< const std::byte >  elf_section)
static

Constructs a detokenizer from the contents of the .pw_tokenizer.entries section of an ELF binary.


The documentation for this class was generated from the following file: