Symbol Tables and Attributes

Overview

Symbol tables can be built into xpressive regular expressions with just a std::map<>. The map keys are the strings to be matched, and the map values are the data made available to your semantic action. Xpressive attributes, named a1 through a9, hold the value corresponding to a matching key so that it can be used in a semantic action. A default value can be specified for an attribute, to be used if no symbol is found.

Symbol Tables

An xpressive symbol table is just a std::map<>, where the key is a string type and the value can be anything. For example, the following regular expression matches a key from map1 and assigns the corresponding value to the attribute a1. Then, in the semantic action, it assigns the value stored in attribute a1 to an integer result.

int result;
std::map<std::string, int> map1;
// ... (fill the map)
sregex rx = ( a1 = map1 ) [ ref(result) = a1 ];

Consider the following example code, which translates number names into integers. It is described below.

#include <map>
#include <string>
#include <iostream>
#include <boost/xpressive/xpressive.hpp>
#include <boost/xpressive/regex_actions.hpp>
using namespace boost::xpressive;

int main()
{
    std::map<std::string, int> number_map;
    number_map["one"] = 1;
    number_map["two"] = 2;
    number_map["three"] = 3;
    // Match a string from number_map
    // and store the integer value in 'result'
    // if not found, store -1 in 'result'
    int result = 0;
    cregex rx = ((a1 = number_map ) | *_)
        [ ref(result) = (a1 | -1)];

    regex_match("three", rx);
    std::cout << result << '\n';
    regex_match("two", rx);
    std::cout << result << '\n';
    regex_match("stuff", rx);
    std::cout << result << '\n';
    return 0;
}

This program prints the following:

3
2
-1

First, the program builds a number map, with number names as string keys and the corresponding integers as values. Then it constructs a static regular expression that uses the attribute a1 to hold the result of the symbol table lookup. In the semantic action, the attribute is assigned to the integer variable result. If the symbol was not found, the expression (a1 | -1) supplies a default value of -1, which is assigned to result instead. The wildcard alternative, *_, makes sure the regex matches even if the symbol is not found.

A more complete version of this example can be found in libs/xpressive/example/numbers.cpp[2]. It translates number names up to "nine hundred ninety nine million nine hundred ninety nine thousand nine hundred ninety nine" along with some special number names like "dozen".

Symbol table matches are case-sensitive by default, but they can be made case-insensitive by enclosing the expression in icase().

Attributes

Up to nine attributes can be used in a regular expression. They are named a1, a2, ..., a9 in the boost::xpressive namespace. The type of an attribute is the mapped type of the map that is assigned to it (the second template argument of std::map<>). A default value for an attribute can be specified in a semantic action with the syntax (a1 | default-value).

Attributes are properly scoped, so you can do crazy things like: ( (a1=sym1) >> (a1=sym2)[ref(x)=a1] )[ref(y)=a1]. The inner semantic action sees the inner a1, and the outer semantic action sees the outer one. They can even have different types.

Note

Xpressive builds a hidden ternary search trie from the map so it can search quickly. If BOOST_DISABLE_THREADS is defined, the hidden ternary search trie "self adjusts", so after each search it restructures itself to improve the efficiency of future searches based on the frequency of previous searches.
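For readers unfamiliar with the data structure, the following is a rough sketch of a plain ternary search trie that maps strings to integers. It is a generic illustration only: the class name TernarySearchTrie and its members are hypothetical, it does not perform the self-adjusting step mentioned above, and it is not xpressive's internal implementation.

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical illustration of a ternary search trie; NOT xpressive's code.
// Each node holds one character and three links: keys whose current
// character is smaller (lo), equal (eq), or greater (hi).
class TernarySearchTrie
{
    struct Node
    {
        char ch;
        bool has_value;
        int value;
        std::unique_ptr<Node> lo, eq, hi;
        explicit Node(char c) : ch(c), has_value(false), value(0) {}
    };

    std::unique_ptr<Node> root_;

public:
    // Insert or overwrite the value stored under a non-empty key.
    void insert(const std::string& key, int value)
    {
        if (key.empty()) return; // empty keys are not supported in this sketch
        std::unique_ptr<Node>* link = &root_;
        std::size_t i = 0;
        for (;;)
        {
            if (!*link) link->reset(new Node(key[i]));
            Node* n = link->get();
            if (key[i] < n->ch)      link = &n->lo;
            else if (key[i] > n->ch) link = &n->hi;
            else if (i + 1 == key.size())
            {
                // Last character of the key: store the value here.
                n->has_value = true;
                n->value = value;
                return;
            }
            else { ++i; link = &n->eq; } // advance to the next character
        }
    }

    // Return a pointer to the stored value, or a null pointer if absent.
    const int* find(const std::string& key) const
    {
        if (key.empty()) return nullptr;
        const Node* n = root_.get();
        std::size_t i = 0;
        while (n)
        {
            if (key[i] < n->ch)      n = n->lo.get();
            else if (key[i] > n->ch) n = n->hi.get();
            else if (i + 1 == key.size())
                return n->has_value ? &n->value : nullptr;
            else { ++i; n = n->eq.get(); }
        }
        return nullptr;
    }
};
```

After inserting "one", "two", and "three" with the values 1, 2, and 3, find("three") returns a pointer to 3, while find("th") and find("stuff") return a null pointer because neither is a complete key. Each lookup compares one character per node and follows a single link, which is what makes the structure fast for this kind of key search.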



[2] Many thanks to David Jenkins, who contributed this example.