How to express a branch in the Rebol PARSE dialect?

I have a MySQL schema as shown below:

    data: {
        `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
        `name` varchar(10) DEFAULT '' COMMENT 'the name',
        `content` text COMMENT 'something',
    }

Now I want to extract some information from it: name, type and comment, if any. See below:

 ["id" "int" "" "name" "varchar" "the name" "content" "text" "something" ] 

My code is:

    parse data [
        any [
            thru {`} copy field to {`} {`}
            thru some space
            copy field-type to [{(} | space]
            (comm: "")
            opt [
                thru {COMMENT} thru some space
                thru {'} copy comm to {'}
            ]
            (
                repend temp field
                repend temp field-type
                either comm [
                    repend temp comm
                ][
                    repend temp ""
                ]
            )
        ]
    ]

but I get something like this:

 ["id" "int" "the name" "content" "text" "something"] 

I know that the opt [...] rule is incorrect.

I want to express the following: if the COMMENT keyword is found first, extract the comment; if the end of the line is found first, continue with the next iteration. But I do not know how to express that. Can anyone help?

+5
4 answers

I much prefer (where possible) building a set of grammar rules from positive terms that match the target input. I find it more literate, precise, flexible and easier to debug. In your snippet above, we can identify five core components:

    space: use [space][
        space: charset "^-^/ "
        [some space]
    ]
    word: use [letter][
        letter: charset [#"a" - #"z" #"A" - #"Z" "_"]
        [some letter]
    ]
    id: use [letter][
        letter: complement charset "`"
        [some letter]
    ]
    number: use [digit][
        digit: charset "0123456789"
        [some digit]
    ]
    string: use [char][
        char: complement charset "'"
        [any [some char | "''"]]
    ]

With these terms defined, writing a rule that describes the grammar of the input is relatively trivial:

    result: collect [
        parsed?: parse/all data [ ; parse/all for Rebol 2 compatibility
            opt space
            some [
                (field: type: none comment: copy "")
                "`" copy field id "`"
                space copy type word opt ["(" number ")"]
                any [
                    space [
                        "COMMENT" space "'" copy comment string "'"
                        | word
                        | "'" string "'"
                        | number
                    ]
                ]
                opt space "," (keep reduce [field type comment])
                opt space
            ]
        ]
    ]

As an added bonus, we can validate the input:

 if parsed? [new-line/all/skip result true 3] 

A small application of new-line, to tidy things up a little, should yield:

    == [
        "id" "int" ""
        "name" "varchar" "the name"
        "content" "text" "something"
    ]
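For readers who have not met new-line before: it sets or clears the line-break marker on the items of a block, and the /skip refinement applies that to every n-th item. The following is only a small sketch on a literal block (the word flat and the sample values are made up for illustration), not part of the parser itself:

```rebol
Rebol []

; Illustrative data only: two field/type/comment triples.
flat: ["id" "int" "" "name" "varchar" "the name"]

; Mark every third item as starting a new line when the block is molded.
new-line/all/skip flat true 3
probe flat

; Clear all markers again to get the one-line form back.
new-line/all flat false
probe flat
```

The block contents are unchanged by either call; only the way the block is displayed differs.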
+5

I think this is closer to what you need.

    data: {
        `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
        `name` varchar(10) DEFAULT '' COMMENT 'the name',
        `content` text COMMENT 'something',
    }
    temp: []
    parse data [
        any [
            thru {`} copy field to {`} {`}
            some space
            copy field-type to [{(} | space]
            (comm: copy "")
            opt [
                thru {COMMENT} some space
                thru {'} copy comm to {'}
            ]
            (
                repend temp field
                repend temp field-type
                either comm [
                    repend temp comm
                ][
                    repend temp ""
                ]
            )
        ]
    ]
    probe temp

To break down the differences:

  • Set up the word temp with an empty block
  • Changed thru some space to just some space, as this moves forward through the series in the same way. Note that the following returns false:

     parse " " [ thru some space ] 
  • Changed comm: "" to comm: copy "" to make sure you get a new string every time you extract a comment (this does not affect the output here, but it is good practice)

  • Changed {COMMENT} thru some space to {COMMENT} some space, as per the second point above
  • Just added a probe at the end for debugging
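
The copy "" point is easier to see outside of parse. Below is a minimal sketch, assuming Rebol 2 or Rebol 3 semantics, where a literal string inside a function body is the same series on every call; the function names shared and fresh are made up for illustration:

```rebol
Rebol []

; Hypothetical helpers, for illustration only.
shared: func [/local s] [
    s: ""            ; the same literal series on every call
    append s "x"
]
fresh: func [/local s] [
    s: copy ""       ; a new empty string on every call
    append s "x"
]

shared
probe shared         ; "xx": the literal kept the "x" from the first call
fresh
probe fresh          ; "x": each call starts from an empty string
```

In the parse rule above, comm is always rebound by copy comm to {'} rather than modified in place, which is presumably why the change makes no difference to the output there.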

As a debugging note, you can place ?? (almost) anywhere in a parse rule, and it will show you the current position.

+3

Use parse/all for string parsing:

    data: {
        `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
        `name` varchar(10) DEFAULT '' COMMENT 'the name',
        `content` text COMMENT 'something',
    }
    nodata: charset { ()'}
    dat: complement nodata

    collect [
        parse/all data [
            some [
                thru {`} copy field to {`} (keep field) skip
                some " "
                copy type some dat (keep type comm: copy "")
                copy rest thru "," (
                    parse/all rest [
                        some [
                            ["," (keep comm)]
                            | ["COMMENT" some nodata copy comm to "'"]
                            | skip
                        ]
                    ]
                )
            ]
        ]
    ]
    == ["id" "int" "" "name" "varchar" "the name" "content" "text" "something"]

Another (better) solution, using a single parse:

    collect [
        probe parse/all data [
            some [
                thru {`} copy field to {`} (keep field) skip
                some " "
                copy type some dat (keep type comm: "" further: [])
                some [
                    "," (keep comm further: [to end skip])
                    | ["COMMENT" some nodata copy comm to "'"]
                    | skip further
                ]
            ]
        ]
    ]
+3

I figured out an alternative: read the data as a block! of lines rather than as a single string!.

    data: read/lines %data.txt   ; file names need the % prefix
    probe data
    temp: copy []
    foreach d data [
        ; each d is one line, so thru {COMMENT} cannot reach the next column
        parse d [
            thru {`} copy field to {`} {`}
            thru some space
            copy field-type to [{(} | space]
            (comm: "")
            opt [
                thru {COMMENT} thru some space
                thru {'} copy comm to {'}
            ]
            (
                repend temp field
                repend temp field-type
                either comm [repend temp comm][repend temp ""]
            )
        ]
    ]
    probe temp
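
As far as I can tell, parsing line by line helps because thru {COMMENT} searches forward as far as it needs to, so on a multi-line string it can jump from a column without a comment into the next column that has one. The sketch below tries to show that behaviour in isolation; the two sample strings and the word names are made up for illustration, and parse/all is used for Rebol 2 compatibility:

```rebol
Rebol []

; Illustrative inputs: two column definitions in one string, then one alone.
two-columns: "`id` int, `name` varchar COMMENT 'the name',"
one-column: "`id` int,"

rule: [
    thru "int"
    opt [thru "COMMENT" thru "'" copy comm to "'"]
    to end
]

comm: none
parse/all two-columns rule
probe comm    ; "the name": picked up from the second column while still on id

comm: none
parse/all one-column rule
probe comm    ; none: no COMMENT in this input, so opt matches nothing
```

This would also explain the output in the question, where "id" ends up paired with "the name" and the name column is skipped.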
+1
