namespace orcus::sax

Enum

parse_token_t

enum class orcus::sax::parse_token_t

Values:

enumerator unknown
enumerator start_element
enumerator end_element
enumerator characters
enumerator parse_error

Type aliases

parse_tokens_t

typedef std::vector<parse_token> orcus::sax::parse_tokens_t

Functions

decode_xml_encoded_char

char orcus::sax::decode_xml_encoded_char(const char *p, size_t n)

Given an encoded name (such as ‘quot’ or ‘amp’), return the single character that corresponds to the name. The name must not include the leading ‘&’ or the trailing ‘;’.

Parameters:
  • p – pointer to the first character of encoded name

  • n – length of encoded name

Returns:

single character that corresponds with the encoded name. ‘\0’ is returned if decoding fails.
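The contract described above can be illustrated with a minimal sketch. The function `decode_named_entity` below is a hypothetical stand-in written for this example, not the library's implementation; it assumes that only the five entity names predefined by XML 1.0 are recognized.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical stand-in that mirrors the documented contract of
// decode_xml_encoded_char(); it is not the library's own code.
char decode_named_entity(const char* p, std::size_t n)
{
    const std::string_view name(p, n);

    // The five entity names predefined by XML 1.0 (assumed set).
    if (name == "lt")   return '<';
    if (name == "gt")   return '>';
    if (name == "amp")  return '&';
    if (name == "apos") return '\'';
    if (name == "quot") return '"';

    return '\0'; // unknown name: report failure as the documentation describes
}

// Usage: pass the name without the leading '&' and the trailing ';'.
void example_named_entity()
{
    assert(decode_named_entity("amp", 3) == '&');
    assert(decode_named_entity("nbsp", 4) == '\0');
}
```

Because ‘\0’ doubles as the failure value, a caller would typically test the result before appending it to any output buffer.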

decode_xml_unicode_char

std::string orcus::sax::decode_xml_unicode_char(const char *p, size_t n)

Given an encoded Unicode value (such as ‘#x20A9’), return a UTF-8 string that corresponds to the Unicode value. The value must not include the leading ‘&’ or the trailing ‘;’.

Parameters:
  • p – pointer to the first character of the encoded value

  • n – length of the encoded value

Returns:

string that corresponds with the encoded value. An empty string is returned if decoding fails.
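As a rough illustration of what such a conversion involves, the sketch below turns a numeric character reference into UTF-8. `decode_char_reference` is a hypothetical name used only here, and the accepted forms (‘#x’ followed by hexadecimal digits, or ‘#’ followed by decimal digits, as in XML character references) are an assumption about the input rather than a statement about the library's code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for illustration; not the library's implementation.
// Input is the reference body without '&' and ';', e.g. "#x20A9" or "#8361".
std::string decode_char_reference(const char* p, std::size_t n)
{
    if (n < 2 || p[0] != '#')
        return {};

    const bool hex = (p[1] == 'x');
    const std::size_t start = hex ? 2 : 1;
    if (start == n)
        return {}; // "#x" with no digits

    std::uint32_t cp = 0;
    for (std::size_t i = start; i < n; ++i)
    {
        const char c = p[i];
        std::uint32_t digit = 0;
        if (c >= '0' && c <= '9')
            digit = static_cast<std::uint32_t>(c - '0');
        else if (hex && c >= 'a' && c <= 'f')
            digit = static_cast<std::uint32_t>(c - 'a' + 10);
        else if (hex && c >= 'A' && c <= 'F')
            digit = static_cast<std::uint32_t>(c - 'A' + 10);
        else
            return {}; // not a digit in the chosen base

        cp = cp * (hex ? 16u : 10u) + digit;
        if (cp > 0x10FFFF)
            return {}; // beyond the Unicode code space
    }

    if (cp >= 0xD800 && cp <= 0xDFFF)
        return {}; // surrogate code points have no UTF-8 encoding

    // Encode the code point as 1 to 4 UTF-8 bytes.
    std::string out;
    if (cp < 0x80)
        out += static_cast<char>(cp);
    else if (cp < 0x800)
    {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    else if (cp < 0x10000)
    {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    else
    {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Usage: U+20A9 (WON SIGN) encodes to the three bytes E2 82 A9.
void example_char_reference()
{
    assert(decode_char_reference("#x20A9", 6) == "\xE2\x82\xA9");
    assert(decode_char_reference("20A9", 4).empty());
}
```

Returning an empty string on every failure path mirrors the documented behavior, so a caller can treat "empty" as "could not decode".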

Struct

doctype_declaration

struct doctype_declaration

Document type declaration passed by sax_parser to its handler’s doctype() call.

Public Types

enum class keyword_type

Values:

enumerator dtd_public
enumerator dtd_private

Public Members

keyword_type keyword
std::string_view root_element
std::string_view fpi
std::string_view uri

parse_token

struct parse_token

Public Types

using value_type = std::variant<std::string_view, parse_error_value_t, const xml_token_element_t*>

Public Functions

parse_token()
parse_token(std::string_view _characters)
parse_token(parse_token_t _type, const xml_token_element_t *_element)
parse_token(std::string_view msg, std::ptrdiff_t offset)
parse_token(const parse_token &other)
parse_token &operator=(parse_token) = delete
bool operator==(const parse_token &other) const
bool operator!=(const parse_token &other) const

Public Members

parse_token_t type
value_type value
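The constructors listed above suggest that each token type is paired with one alternative of `value_type`: character data with `std::string_view`, element tokens with an element pointer, and parse errors with an error value. The sketch below illustrates that pairing with simplified stand-in types. `element_stub`, `error_stub`, `token_type`, `token` and `describe` are all hypothetical names for this example, and the pairing itself is an inference from the signatures, not something the reference states.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <variant>

// Simplified stand-ins; the real types are defined in the orcus headers.
struct element_stub { std::string_view name; };
struct error_stub { std::string_view str; std::ptrdiff_t offset; };

enum class token_type { unknown, start_element, end_element, characters, parse_error };

struct token
{
    using value_type = std::variant<std::string_view, error_stub, const element_stub*>;

    token_type type = token_type::unknown;
    value_type value;
};

// Pick the variant alternative that matches the token type (assumed pairing).
std::string describe(const token& t)
{
    switch (t.type)
    {
        case token_type::start_element:
            return "<" + std::string(std::get<const element_stub*>(t.value)->name) + ">";
        case token_type::end_element:
            return "</" + std::string(std::get<const element_stub*>(t.value)->name) + ">";
        case token_type::characters:
            return std::string(std::get<std::string_view>(t.value));
        case token_type::parse_error:
            return "error: " + std::string(std::get<error_stub>(t.value).str);
        default:
            return "unknown";
    }
}

void example_describe()
{
    const element_stub el{"row"};
    const token open{token_type::start_element, &el};
    assert(describe(open) == "<row>");
}
```

`std::get` throws `std::bad_variant_access` if the stored alternative does not match, so a consumer that cannot trust the pairing could use `std::get_if` instead.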

parser_attribute

struct parser_attribute

Attribute properties passed by sax_parser to its handler’s attribute() call. When an attribute value is “transient”, it has been converted due to the presence of encoded character(s) and is stored in a temporary buffer. The handler must assume that a transient value does not survive beyond the end of the callback, and must copy it if it needs the value later.

Public Members

std::string_view ns

Optional attribute namespace. It may be empty if it’s not given.

std::string_view name

Attribute name.

std::string_view value

Attribute value.

bool transient

Whether or not the attribute value is in a temporary buffer.
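A handler that keeps attribute values beyond the callback needs to copy the transient ones. The sketch below shows one way to do that. `attribute_view` is a local mirror of the members documented above, and `attribute_store` is a hypothetical handler-side helper; neither is part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <string_view>
#include <vector>

// Local mirror of the documented members, for illustration only.
struct attribute_view
{
    std::string_view ns;
    std::string_view name;
    std::string_view value;
    bool transient;
};

// Hypothetical helper that retains attribute values after the callback returns.
class attribute_store
{
    std::deque<std::string> m_owned;        // deque never relocates stored strings
    std::vector<std::string_view> m_values; // views that remain valid

public:
    void on_attribute(const attribute_view& attr)
    {
        if (attr.transient)
        {
            // The value lives in a temporary buffer: take an owning copy.
            m_owned.emplace_back(attr.value);
            m_values.push_back(m_owned.back());
        }
        else
        {
            // Assumed to point into the original stream, which outlives the store.
            m_values.push_back(attr.value);
        }
    }

    const std::vector<std::string_view>& values() const { return m_values; }
};

void example_attribute_store()
{
    attribute_store store;
    std::string scratch = "a&b"; // plays the role of the temporary buffer
    store.on_attribute({"", "title", scratch, true});
    scratch = "xxx";             // the buffer is reused after the callback
    assert(store.values()[0] == "a&b");
}
```

Copying only when `transient` is set avoids an allocation for the common case where the value contains no encoded characters.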

parser_element

struct parser_element

Element properties passed by sax_parser to its handler’s open_element() and close_element() calls.

Public Members

std::string_view ns

Optional element namespace. It may be empty if it’s not given.

std::string_view name

Element name.

std::ptrdiff_t begin_pos

Position of the opening angle bracket ‘<’.

std::ptrdiff_t end_pos

Position immediately after the closing angle bracket ‘>’.
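The two positions make it possible to recover the source text of an element. The sketch below assumes that both positions are offsets from the start of the parsed stream, and that the value passed to close_element() describes the closing tag; neither assumption is confirmed by the reference. `element_span` is a local mirror of the documented members and `element_source` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Local mirror of the documented members, for illustration only.
struct element_span
{
    std::string_view ns;
    std::string_view name;
    std::ptrdiff_t begin_pos;
    std::ptrdiff_t end_pos;
};

// Returns the text from the '<' of the opening tag to just past the '>'
// of the closing tag. Positions are assumed to be offsets into `content`.
std::string_view element_source(
    std::string_view content, const element_span& open, const element_span& close)
{
    const auto first = static_cast<std::size_t>(open.begin_pos);
    const auto last = static_cast<std::size_t>(close.end_pos);
    return content.substr(first, last - first);
}

void example_element_source()
{
    const std::string_view xml = "<a><b>x</b></a>";
    const element_span open{"", "b", 3, 6};   // "<b>" occupies offsets 3..5
    const element_span close{"", "b", 7, 11}; // "</b>" occupies offsets 7..10
    assert(element_source(xml, open, close) == "<b>x</b>");
}
```

Because `end_pos` points one past the ‘>’, the length of the slice is simply the difference of the two positions.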

Classes

parser_base

class parser_base : public orcus::parser_base

Subclassed by orcus::sax_parser< handler_wrapper >, orcus::sax_parser< HandlerT, ConfigT >

parser_thread

class parser_thread

Public Functions

parser_thread(const char *p, size_t n, const orcus::tokens &tks, xmlns_context &ns_cxt, size_t min_token_size)
parser_thread(const char *p, size_t n, const orcus::tokens &tks, xmlns_context &ns_cxt, size_t min_token_size, size_t max_token_size)
~parser_thread()
void start()
bool next_tokens(parse_tokens_t &tokens)

Wait until a new set of tokens becomes available.

Parameters:

tokens – new set of tokens.

Returns:

true if parsing is still in progress and more tokens will follow; false if parsing has finished, in which case this is the last token set.

void swap_string_pool(string_pool &pool)
void abort()
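The return value of next_tokens() implies a consumer loop that still processes the batch delivered alongside a false result. The sketch below shows that loop shape against a hypothetical `fake_token_source`, which hands out pre-built batches of `int` instead of parsing on a worker thread. It only imitates the documented next_tokens() signature; the real class additionally requires start() to be called, and its threading behavior is not modeled here.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in with the same next_tokens() shape as parser_thread.
class fake_token_source
{
    std::vector<std::vector<int>> m_batches;
    std::size_t m_pos = 0;

public:
    explicit fake_token_source(std::vector<std::vector<int>> batches) :
        m_batches(std::move(batches)) {}

    // Fills `tokens` with the next batch; returns false once the batch
    // just handed out was the last one.
    bool next_tokens(std::vector<int>& tokens)
    {
        tokens.clear();
        if (m_pos < m_batches.size())
            tokens = std::move(m_batches[m_pos++]);
        return m_pos < m_batches.size();
    }
};

// Consumer loop: process every batch, including the one that arrives
// together with a false return value.
template<typename Source, typename Func>
void drain_tokens(Source& src, Func consume)
{
    std::vector<int> batch; // would be parse_tokens_t with the real API
    bool more = true;
    while (more)
    {
        more = src.next_tokens(batch);
        for (int t : batch)
            consume(t);
    }
}

void example_drain_tokens()
{
    fake_token_source src({{1, 2}, {3}, {4, 5}});
    int sum = 0;
    drain_tokens(src, [&sum](int t) { sum += t; });
    assert(sum == 15);
}
```

Checking the return value before processing the batch would silently drop the final set of tokens, which is the main pitfall this loop shape avoids.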

Child namespaces