How to prevent table regeneration in PLY

Question

I use PLY in a command line application that I package as a Python egg and install through pip. Each time I run my script from the command line, I see the following message:

 "Generating LALR tables"

In addition, the parser.out and parsetab.py files are written to the directory from which the script is called. Is there a way to ship these files with the application so that it does not regenerate the tables every time?

+6

python parsing ply

Michael Sep 28 '12 at 17:47

5 answers

Use:

 yacc.yacc(debug=0, write_tables=0)

+11

eserge Feb 13 '13 at 22:28

You want to use optimized mode by calling lex like:

 lexer = lex.lex(optimize=1)

It is worth emphasizing (from the same link):

On subsequent executions, lextab.py will simply be imported to build the lexer. This approach substantially improves the startup time of the lexer, and it works in Python's optimized mode.
When running in optimized mode, it is important to note that lex disables most error checking. Thus, this is really only recommended if you are sure everything is working correctly and you are ready to start releasing production code.

Since this is production code, it looks like what you want.

On this topic, I also came across a relevant note in the Yacc documentation:

Because generating LALR tables is relatively expensive, previously generated tables are cached and reused if possible. The decision to regenerate the tables is made by taking an MD5 checksum of all grammar rules and precedence rules. Only in the event of a mismatch are the tables regenerated.

Going deeper into the yacc function inside yacc.py, we see that optimize mode bypasses this check in the following snippet:

 if optimize or (read_signature == signature):
     try:
         lr.bind_callables(pinfo.pdict)
         parser = LRParser(lr, pinfo.error_func)
         parse = parser.parse
         return parser

where signature is compared with the checksum stored in parsetab.py (as _lr_signature).
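The caching decision described above can be sketched with the standard library alone. This is only an illustration of the idea, not PLY's actual code: the function names grammar_signature and should_reuse_tables are hypothetical, and the way the rules are serialized before hashing is an assumption.

```python
import hashlib

def grammar_signature(rules, precedence=()):
    """MD5 digest over precedence and grammar rules (illustrative only)."""
    digest = hashlib.md5()
    for item in list(precedence) + list(rules):
        digest.update(repr(item).encode("utf-8"))
    return digest.hexdigest()

def should_reuse_tables(cached_signature, current_signature, optimize=False):
    """Mirror the `optimize or (read_signature == signature)` condition."""
    return optimize or (cached_signature == current_signature)

old = grammar_signature(["expr : expr PLUS term", "expr : term"])
new = grammar_signature(["expr : expr PLUS term", "expr : term", "term : NUMBER"])

print(should_reuse_tables(old, old))                 # unchanged grammar: reuse
print(should_reuse_tables(old, new))                 # changed grammar: regenerate
print(should_reuse_tables(old, new, optimize=True))  # optimize skips the check
```

The last call shows why optimize mode is only safe once the grammar is stable: a stale table file would be accepted without complaint.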

+2

Andy hayden Oct 4 '12 at 16:45

This is an old question, but I ran into a similar problem with PLY when I tried to use the outputdir keyword argument of yacc to place the generated parser tables in specific directories in my project: it placed them there, but regenerated them every time regardless. I found a patch on GitHub that solved the regeneration problem without any noticeable side effects. Basically, all it does is change the read_table method in yacc.py to take an additional parameter, outputdir, and look in that directory before regenerating the tables. To make this work, the only call site of read_table (in the yacc function) must also be changed to pass the outputdir keyword argument.

0

Syzygy Apr 14 '15 at 21:13

There are apparently arguments for this in ply.yacc:

 def yacc(method='LALR', debug=yaccdebug, module=None, tabmodule=tab_module, start=None, check_recursion=1, optimize=0, write_tables=1, debugfile=debug_file,outputdir='', debuglog=None, errorlog = None, picklefile=None):

So you just pass a different errorlog and debuglog (objects with debug() methods, etc., that do not print to stdout/stderr), and you specify a fixed outputdir. That's all you need to do.
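One way to build such a logger is with the standard logging module. This is a minimal sketch: make_quiet_logger and the logger name "ply.quiet" are hypothetical, and the yacc.yacc call is left commented out because it assumes PLY is installed and a grammar is defined in the calling module.

```python
import logging

def make_quiet_logger(name="ply.quiet"):
    """Return a logger that discards every record it receives."""
    logger = logging.getLogger(name)
    logger.addHandler(logging.NullHandler())
    logger.propagate = False  # keep records away from the root logger's handlers
    return logger

quiet = make_quiet_logger()
quiet.warning("this message is silently dropped")

# Hypothetical usage, assuming PLY is installed and a grammar is defined:
# parser = yacc.yacc(errorlog=quiet, debuglog=quiet)
```

A logging.Logger provides the debug(), info(), warning() and error() methods that a log object of this kind is expected to have, so nothing reaches stdout or stderr.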

UPDATE: I just checked and this is the correct parameter:

 yacc.yacc(
     debug=False,                # do not create parser.out
     outputdir=r"c:\temp\aaa"    # instruct to place parsetab here
 )

In fact, you need to point outputdir at a directory that already contains parsetab.py. This will not only eliminate the message; your program will not write parsetab.py either. It will just use it.
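For an installed package, a fixed outputdir is usually the directory of the module that builds the parser rather than a hard-coded path. The helpers below are a sketch under that assumption; table_dir and has_cached_tables are hypothetical names, and whether PLY actually reuses the file found there depends on the PLY version (another answer here reports regeneration despite outputdir).

```python
import os

def table_dir(module_file):
    """Directory containing module_file, for use as a fixed outputdir."""
    return os.path.dirname(os.path.abspath(module_file))

def has_cached_tables(directory, tabmodule="parsetab"):
    """True if a previously generated table module already sits in directory."""
    return os.path.isfile(os.path.join(directory, tabmodule + ".py"))

# Hypothetical usage inside the installed package, assuming PLY is available:
# here = table_dir(__file__)
# parser = yacc.yacc(debug=False, outputdir=here)
```

Checking has_cached_tables(here) before building the parser is a cheap way to confirm that parsetab.py was actually shipped with the package.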

-1

nagylzs Oct 3 '12 at 14:25

Michael · Accepted Answer · 2012-10-07T03:55:44+0000

What I ended up doing was disabling optimization. I went through the source of PLY 3.4 and found this little nugget in the lexer code:

 # If in optimize mode, we write the lextab
 if lextab and optimize:
     lexobj.writetab(lextab,outputdir)
 return lexobj

After changing the code that creates the lexer and parser to the following:

self.lexer = lex.lex(module=self, optimize=False, debug=False, **kwargs)

and

self.parser = yacc.yacc(module=self, optimize=False, debug=False, **kwargs)

I avoided all file writes. Debug mode writes the .out files to the directory, and the Python files are a result of the optimize flag.

While this works, I cannot say that I am completely satisfied with this approach. Presumably, some way to keep the optimization while also keeping the working directory clean would be a better solution, since it would give better performance. If anyone has a better method, I am more than open to it.
