What is a free context grammar?

Can someone explain to me what contextual free grammar is? After looking at a Wikipedia entry and then a Wikipedia entry on formal grammar, I was completely and completely intoxicated. Would anyone be so kind as to explain what these things are?

I am interested in this because I want to explore parsing, and also on the side, the limitation of the regex engine.

I am not sure if these terms are directly related to programming, or if they are more related to linguistics in general. If so, I apologize, maybe this can be rescheduled, if so?

+66
regex parsing context-free-grammar grammar
Jul 15 2018-11-21T00:
source share
2 answers

A contextual free grammar is a grammar that satisfies certain properties. In computer science, grammars describe languages; in particular, they describe formal languages.

A formal language is simply a collection (a mathematical term for a collection of objects) of strings (sequences of characters ... very similar to the use of programming the word "string"). A simple example of a formal language is the set of all binary strings of length three, {000, 001, 010, 011, 100, 101, 110, 111}.

Grammars work by defining the transformations you can make to build a string in the language described by the grammar. Grammars will tell you how to convert the start character (usually S) to some string of characters. Grammar for this language:

S -> BBB B -> 0 B -> 1 

The way to interpret this means that S can be replaced by B, and B can be replaced by 0, and B can be replaced by 1. So, to build row 010, we can do S β†’ BBB β†’ 0BB β†’ 01B β†’ 010.

A context-free grammar is just a grammar where the thing you are replacing (to the left of the arrow) is the only "non-terminal" symbol. A nonterminal character is any character that you use in a grammar that cannot appear in your final lines. In the grammar above, β€œS” and β€œB” are non-terminal characters, and β€œ0” and β€œ1” are β€œterminal” characters. Grammar like

 S -> AB AB -> 1 A -> AA B -> 0 

They are not regular, as they contain rules such as "AB β†’ 1".

+70
Jul 15 2018-11-21T00:
source share
β€” -

The theory of language is connected with the theory of computing. Which one is the more philosophical side in the field of computer science, about which programs are possible or what will ever be possible to write, and which problems it is impossible to write an algorithm for solving.

Regular expression is a way of describing a regular language. A regular language is a language that can be solved by a deterministic finite state machine.

You should read the article on finite machines: http://en.wikipedia.org/wiki/Finite_state_machine

And common languages: http://en.wikipedia.org/wiki/Regular_language

All common languages ​​are free context languages, but there are context languages ​​that are not regular. Context Free Language is the set of all strings accepted by Context Free Grammer or Automation Pushdown, which is a single-stack end state machine: http://en.wikipedia.org/wiki/Pushdown_automaton#PDA_and_Context-free_Languages

There are more complex languages ​​that require a Turing machine (any possible program that you can write to your computer) to decide if there is a line in that language or not.

Language theory is also very related to the P vs NP problem and some other interesting things.

In my third year course, Introduction to Computer Science, in the third year it was pretty good to explain this material: Introduction to Computing Theory. Michael Sipser. But it cost me, like $ 160, to buy it new, and it's not very big. Perhaps you can find a used copy or find a copy in the library or something that might help you.

EDIT:

The limitations of Regular Expressions and higher language classes have been investigated per ton over the past 50 years or so. You may be interested in the pumping lemma for regular languages. This is a means of proving that a particular language is not regular:

http://en.wikipedia.org/wiki/Pumping_lemma_for_regular_languages

If the language is not regular, it can be Context Free, which means it can be described by Context Free Grammer, or it can even be in a higher class of languages, you can prove that it is not Context Free by the context transfer pumping lemma Free languages ​​that are similar to those used for regular expressions.

The language can even be unsolvable, which means that even a Turing machine (can program your computer can work) cannot be programmed to choose whether to accept a string in a language or reject it.

I think that you are most interested in finite machines (both deterministic and deterministic) to find out which languages ​​the Regular Expression can choose, and the pumping lemma to prove which languages ​​are not regular.

Basically, a language is not regular if it needs some memory or the ability to count. The language of matching parentheses is not regular, for example, because the machine must remember whether it opened the bracket to find out if it needs to be closed.

The language of all strings using the letters a and b that contain at least three b is the usual language: abababa

The language of all strings using the letters a and b containing more than b than a is not regular.

It also does not follow that all final languages ​​are regular, for example:

The language of all lines less than 50 characters long using the letters a and b that contain more than b than a is regular because it is finite, we know that it can be described as (b | abb | bab | bba | aabbb | ababb | ...) until all possible combinations are indicated.

+16
Jul 15 2018-11-21T00:
source share



All Articles