Boost::Spirit::Qi expectation parser and parser grouping: unexpected behavior

I'm hoping someone can shed some light on my ignorance of how to use the > and >> operators in Spirit parsing.

I have a working grammar where a top-level rule looks like

 test = identifier >> operationRule >> repeat(1,3)[any_string] >> arrow >> any_string >> conditionRule; 

It relies on attributes to automatically distribute the parsed values to a Fusion-adapted structure (i.e., a boost tuple).

However, I know that once we match operationRule, we must either continue or fail (i.e. we do not want to allow backtracking to try other rules that begin with identifier).

 test = identifier >> operationRule > repeat(1,3)[any_string] > arrow > any_string > conditionRule; 

This causes a cryptic compiler error ( 'boost::Container' : use of class template requires template argument list ). After futzing around a bit, the following compiles:

 test = identifier >> (operationRule > repeat(1,3)[any_string] > arrow > any_string > conditionRule); 

but the attribute propagation no longer works: my data structure contains garbage after parsing. This can be fixed by adding actions such as [at_c<0>(_val) = _1] , but that seems a bit clunky, and it also makes things slower according to the Boost docs.

So my questions are:

  • Is it worth preventing backtracking?
  • Why do I need the grouping operator ()?
  • Does my last example above really stop backtracking after operationRule is matched (I suspect not; it seems that if the entire parser inside the (...) fails, backtracking will be allowed)?
  • If the answer to the previous question is /no/, how do I construct a rule that allows backtracking if operationRule is /not/ matched, but does not allow backtracking once operationRule /is/ matched?
  • Why does the grouping operator destroy the attribute grammar, requiring actions?

I realize this is quite a broad question; any hints that point in the right direction will be greatly appreciated!

1 answer
  • Is it worth preventing backtracking?

    Absolutely. Preventing backtracking in general is a proven way to improve parser performance:

    • reduce the use of (negative) lookahead (operator!, operator- and some operator&)
    • order branches (operator|, operator||, operator^ and some operator*/-/+) so that the most frequent/likely branch comes first, or so that the most expensive branch is tried last

    Using expectation points ( > ) does not essentially reduce backtracking: it simply disallows it. This enables targeted error messages and prevents useless "parsing into the unknown".

  • Why do I need the grouping operator ()?

    I'm not sure. I checked, using my what_is_the_attr helpers:

    • ident >> op >> repeat(1,3)[any] >> "->" >> any
      synthesizes an attribute:

       fusion::vector4<string, string, vector<string>, string> 
    • ident >> op > repeat(1,3)[any] > "->" > any
      synthesizes an attribute:

       fusion::vector3<fusion::vector2<string, string>, vector<string>, string> 

    I haven't found any need to group the subexpressions with parentheses (things compile), but obviously DataT needs to be modified to match the changed layout.

     typedef boost::tuple< boost::tuple<std::string, std::string>, std::vector<std::string>, std::string > DataT; 

The full code below shows how I'd prefer to do this, using adapted structs.

  • Does my last example above really stop backtracking after operationRule is matched (I suspect not; it seems that if the entire parser inside the (...) fails, backtracking will be allowed)?

    Absolutely. If the expectation(s) are not met, a qi::expectation_failure<> exception is thrown. By default this aborts the parse. You can use qi::on_error to retry , fail , accept or rethrow . The MiniXML example has very good examples of using expectation points with qi::on_error

  • If the answer to the previous question is /no/, how do I construct a rule that allows backtracking if operationRule is /not/ matched, but does not allow backtracking once operationRule /is/ matched?

  • Why does the grouping operator destroy the attribute grammar, requiring actions?

    It doesn't destroy the attribute grammar; it just changes the exposed type. So, if you bind an appropriate attribute reference to the rule/grammar, you won't need semantic actions. Now, I felt there should be ways to do this without the grouping, so I tried it (preferably on your short self-contained sample). And indeed, I found no such need. I've added a full example to help you see what happens in my testing, without using semantic actions.

Full code

The full code shows 5 scenarios:

  • OPTION 1: Original without expectations

    (no relevant changes)

  • OPTION 2: with expectations

    Using a modified typedef for DataT (as shown above)

  • OPTION 3: adapted structure, no expectations

    Using custom structure with BOOST_FUSION_ADAPT_STRUCT

  • OPTION 4: adapted structure with expectations

    Changing the adapted struct from OPTION 3

  • OPTION 5: lookahead hack

    This one uses a clever (?) hack: it turns every >> into an expectation and detects the presence of an operationRule match beforehand with a lookahead. This is of course suboptimal, but it lets you keep DataT unmodified, without using semantic actions.

Obviously, define OPTION to the desired value before compiling.

    #include <boost/spirit/include/qi.hpp>
    #include <boost/spirit/include/karma.hpp>
    #include <boost/spirit/include/phoenix.hpp>
    #include <boost/fusion/adapted.hpp>
    #include <iostream>
    #include <string>
    #include <vector>

    namespace qi    = boost::spirit::qi;
    namespace karma = boost::spirit::karma;

    #ifndef OPTION
    #define OPTION 5
    #endif

    #if OPTION == 1 || OPTION == 5 // original without expectations (OR lookahead hack)
        typedef boost::tuple<std::string, std::string, std::vector<std::string>, std::string> DataT;
    #elif OPTION == 2 // with expectations
        typedef boost::tuple<boost::tuple<std::string, std::string>, std::vector<std::string>, std::string> DataT;
    #elif OPTION == 3 // adapted struct, without expectations
        struct DataT
        {
            std::string identifier, operation;
            std::vector<std::string> values;
            std::string destination;
        };

        BOOST_FUSION_ADAPT_STRUCT(DataT, (std::string, identifier)(std::string, operation)(std::vector<std::string>, values)(std::string, destination));
    #elif OPTION == 4 // adapted struct, with expectations
        struct IdOpT
        {
            std::string identifier, operation;
        };
        struct DataT
        {
            IdOpT idop;
            std::vector<std::string> values;
            std::string destination;
        };

        BOOST_FUSION_ADAPT_STRUCT(IdOpT, (std::string, identifier)(std::string, operation));
        BOOST_FUSION_ADAPT_STRUCT(DataT, (IdOpT, idop)(std::vector<std::string>, values)(std::string, destination));
    #endif

    template <typename Iterator>
    struct test_parser : qi::grammar<Iterator, DataT(), qi::space_type, qi::locals<char> >
    {
        test_parser() : test_parser::base_type(test, "test")
        {
            using namespace qi;

            // remember the opening quote in local _a and require the same one to close
            quoted_string =
                   omit    [ char_("'\"") [_a =_1] ]
                >> no_skip [ *(char_ - char_(_a))  ]
                 > lit(_a);

            any_string = quoted_string | +qi::alnum;

            identifier = lexeme [ alnum >> *graph ];

            operationRule = string("add") | "sub";
            arrow = "->";

    #if OPTION == 1 || OPTION == 3 // without expectations
            test = identifier >> operationRule >> repeat(1,3)[any_string] >> arrow >> any_string;
    #elif OPTION == 2 || OPTION == 4 // with expectations
            test = identifier >> operationRule > repeat(1,3)[any_string] > arrow > any_string;
    #elif OPTION == 5 // lookahead hack
            test = &(identifier >> operationRule) > identifier > operationRule > repeat(1,3)[any_string] > arrow > any_string;
    #endif
        }

        qi::rule<Iterator, qi::space_type/*, qi::locals<char> */> arrow;
        qi::rule<Iterator, std::string(), qi::space_type/*, qi::locals<char> */> operationRule;
        qi::rule<Iterator, std::string(), qi::space_type/*, qi::locals<char> */> identifier;
        qi::rule<Iterator, std::string(), qi::space_type, qi::locals<char> > quoted_string, any_string;
        qi::rule<Iterator, DataT(), qi::space_type, qi::locals<char> > test;
    };

    int main()
    {
        std::string str("addx001 add 'str1' \"str2\" -> \"str3\"");
        test_parser<std::string::const_iterator> grammar;
        std::string::const_iterator iter = str.begin();
        std::string::const_iterator end  = str.end();

        DataT data;
        bool r = phrase_parse(iter, end, grammar, qi::space, data);

        if (r)
        {
            using namespace karma;
            std::cout << "OPTION " << OPTION << ": " << str << " --> ";
    #if OPTION == 1 || OPTION == 3 || OPTION == 5 // without expectations (OR lookahead hack)
            std::cout << format(delimit[auto_ << auto_ << '[' << auto_ << ']' << " --> " << auto_], data) << "\n";
    #elif OPTION == 2 || OPTION == 4 // with expectations
            std::cout << format(delimit[auto_ << '[' << auto_ << ']' << " --> " << auto_], data) << "\n";
    #endif
        }
        if (iter!=end)
            std::cout << "Remaining: " << std::string(iter,end) << "\n";
    }

Output for all OPTIONS:

    for a in 1 2 3 4 5; do g++ -DOPTION=$a -I ~/custom/boost/ test.cpp -o test$a && ./test$a; done
    OPTION 1: addx001 add 'str1' "str2" -> "str3" --> addx001 add [ str1 str2 ] --> str3
    OPTION 2: addx001 add 'str1' "str2" -> "str3" --> addx001 add [ str1 str2 ] --> str3
    OPTION 3: addx001 add 'str1' "str2" -> "str3" --> addx001 add [ str1 str2 ] --> str3
    OPTION 4: addx001 add 'str1' "str2" -> "str3" --> addx001 add [ str1 str2 ] --> str3
    OPTION 5: addx001 add 'str1' "str2" -> "str3" --> addx001 add [ str1 str2 ] --> str3