I am playing around with Boost.Spirit. As part of a larger project, I am trying to build a grammar for parsing C/C++-style string literals. I ran into a problem:
How do I create a sub-grammar that appends a std::string() result to the calling grammar's std::string() attribute (instead of just a char)?
Here is my code, which works so far. (Actually, I already had a lot more, including handling of things like '\n', but I trimmed it down to the essentials.)
    #define BOOST_SPIRIT_UNICODE

    #include <iostream>
    #include <string>

    #include <boost/spirit/include/qi.hpp>
    #include <boost/spirit/include/phoenix_operator.hpp>

    using namespace boost;
    using namespace boost::spirit;
    using namespace boost::spirit::qi;

    // Parses a \uXXXX or \UXXXXXXXX escape and exposes it as a single char.
    template < typename Iterator >
    struct EscapedUnicode : grammar< Iterator, char() > // <-- should be std::string
    {
        EscapedUnicode() : EscapedUnicode::base_type( escaped_unicode )
        {
            escaped_unicode %= "\\" > ( ( "u" >> uint_parser< char, 16, 4, 4 >() )
                                      | ( "U" >> uint_parser< char, 16, 8, 8 >() ) );
        }

        rule< Iterator, char() > escaped_unicode; // <-- should be std::string
    };

    // Parses a double-quoted string, delegating escapes to EscapedUnicode.
    template < typename Iterator >
    struct QuotedString : grammar< Iterator, std::string() >
    {
        QuotedString() : QuotedString::base_type( quoted_string )
        {
            quoted_string %= '"' >> *( escaped_unicode | ( char_ - ( '"' | eol ) ) ) >> '"';
        }

        EscapedUnicode< Iterator > escaped_unicode;
        rule< Iterator, std::string() > quoted_string;
    };

    int main()
    {
        // The backslash is doubled so that the parser, not the compiler,
        // gets to see the escape sequence.
        std::string input = "\"foo\\u0041\"";
        typedef std::string::const_iterator iterator_type;
        QuotedString< iterator_type > qs;
        std::string result;
        bool r = parse( input.cbegin(), input.cend(), qs, result );
        std::cout << result << std::endl;
    }
This prints fooA: the QuotedString grammar calls the EscapedUnicode grammar, which results in a char (A, 0x41) being appended to the std::string attribute of QuotedString.
But of course, I will need to generate a sequence of characters (bytes) for anything beyond 0x7f. EscapedUnicode would have to produce a std::string, which would then have to be appended to the string generated by QuotedString.
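To make the "sequence of bytes" concrete, here is a minimal sketch of the kind of conversion I have in mind. It assumes UTF-8 as the target encoding, which is only one possible choice, and the helper name encode_utf8 is hypothetical; it is plain C++ and not part of Boost.Spirit.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not a Boost.Spirit API): encodes one Unicode code
// point as UTF-8 bytes. Assumption: surrogates and values above 0x10FFFF
// are rejected by returning an empty string.
std::string encode_utf8( std::uint32_t cp )
{
    std::string out;
    if ( cp < 0x80 )
    {
        // 1 byte: 0xxxxxxx
        out += static_cast< char >( cp );
    }
    else if ( cp < 0x800 )
    {
        // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast< char >( 0xC0 | ( cp >> 6 ) );
        out += static_cast< char >( 0x80 | ( cp & 0x3F ) );
    }
    else if ( cp < 0x10000 )
    {
        // UTF-16 surrogates are not valid code points.
        if ( cp >= 0xD800 && cp <= 0xDFFF )
        {
            return out;
        }
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast< char >( 0xE0 | ( cp >> 12 ) );
        out += static_cast< char >( 0x80 | ( ( cp >> 6 ) & 0x3F ) );
        out += static_cast< char >( 0x80 | ( cp & 0x3F ) );
    }
    else if ( cp <= 0x10FFFF )
    {
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast< char >( 0xF0 | ( cp >> 18 ) );
        out += static_cast< char >( 0x80 | ( ( cp >> 12 ) & 0x3F ) );
        out += static_cast< char >( 0x80 | ( ( cp >> 6 ) & 0x3F ) );
        out += static_cast< char >( 0x80 | ( cp & 0x3F ) );
    }
    return out;
}
```

So 0x41 would still give the single byte A, while something like 0x20AC would give three bytes, and it is this multi-byte std::string that would need to be appended to the attribute of QuotedString.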
And this is where I hit a roadblock. I don't understand how Boost.Spirit works together with Boost.Phoenix, and every attempt I made led to lengthy and rather opaque template-related compiler errors.
So how can I do this? The answer does not actually need to perform a proper Unicode conversion; it is the std::string issue that I need a solution for.