What is the Pythonic way to report non-fatal errors in the parser?

The parser I created reads the recorded chess games from a file. The API is used as follows:

    import chess.pgn

    pgn_file = open("games.pgn")
    first_game = chess.pgn.read_game(pgn_file)
    second_game = chess.pgn.read_game(pgn_file)
    # ...

Sometimes illegal moves (or other problems) are encountered. What is a good Pythonic way to handle them?

  • Raise an exception as soon as an error occurs. However, this makes every problem fatal, because execution stops. Often there is still useful data that has already been parsed and could be returned. In addition, you cannot simply continue with the next game, because the file is still positioned in the middle of half-read data.

  • Accumulate exceptions and raise them at the end of the game. This makes the error fatal again, but at least you can catch it and continue parsing the next game.

  • Introduce an optional argument, like this:

     game = chess.pgn.read_game(pgn_file, parser_info)
     if parser_info.error:
         # This appears to be quite verbose.
         # Now you can at least make the best of the successfully parsed parts.
         # ...

Are some of these or other methods used in the wild?

+8
python exception-handling warnings error-handling
5 answers

Actually, these are fatal errors, at least as far as reproducing a correct game is concerned. On the other hand, it is possible that the player really did make an illegal move and nobody noticed at the time (which would make it a warning rather than a fatal error).

Given the possibility of both fatal errors (the file is corrupted) and warnings (an illegal move was made, but the subsequent moves are consistent with it; in other words, a player's mistake that nobody caught at the time), I recommend a combination of the first and second options:

  • raise an exception when continuing to parse is not an option
  • collect any errors / warnings that do not prevent further parsing, and report them at the end.

If you do not encounter a fatal error, you can return the game, as well as any warnings / non-fatal errors at the end:

 return game, warnings, errors 

But what if you do hit a fatal error?

No problem: create a custom exception to which you can attach the useful part of the game and any other warnings / non-fatal errors:

    raise ParsingError(
        'error explanation here',
        game=game,
        warnings=warnings,
        errors=errors,
    )

Then, when you catch the error, you can access the recovered part of the game, as well as the warnings and errors.

The custom error might be:

    class ParsingError(Exception):
        def __init__(self, msg, game, warnings, errors):
            super().__init__(msg)
            self.game = game
            self.warnings = warnings
            self.errors = errors

and when using:

    try:
        first_game, warnings, errors = chess.pgn.read_game(pgn_file)
    except chess.pgn.ParsingError as err:
        first_game = err.game
        warnings = err.warnings
        errors = err.errors
        # whatever else you want to do to handle the exception

This is similar to how the subprocess module handles errors: subprocess.CalledProcessError carries the return code and the captured output as attributes.
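To see the pieces above working together, here is a minimal self-contained sketch. The read_moves function and its "<corrupt>" and "?!" markers are made up for illustration; they stand in for a real parser and are not part of any chess library.

```python
class ParsingError(Exception):
    """Fatal parse error that still carries whatever was recovered."""

    def __init__(self, msg, game, warnings, errors):
        super().__init__(msg)
        self.game = game
        self.warnings = warnings
        self.errors = errors


def read_moves(tokens):
    """Hypothetical toy parser: each token is one move string."""
    game, warnings, errors = [], [], []
    for token in tokens:
        if token == "<corrupt>":  # made-up marker for unrecoverable input
            raise ParsingError(
                "cannot continue parsing",
                game=game, warnings=warnings, errors=errors,
            )
        if token.endswith("?!"):  # made-up marker for a dubious move
            warnings.append(f"suspicious move: {token}")
        game.append(token)
    return game, warnings, errors


try:
    game, warnings, errors = read_moves(["e4", "e5", "Ke2?!", "<corrupt>", "Nf6"])
    message = None
except ParsingError as err:
    # The partial game and the collected warnings survive the fatal error.
    game, warnings, errors = err.game, err.warnings, err.errors
    message = str(err)

print(game, warnings, errors, message)
```

Either way the caller ends up with the same three values, so the code after the try block does not need to care which path was taken.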

To be able to retrieve and parse subsequent games after a game with a fatal error, I would suggest changing your API:

  • have a game iterator that simply returns the raw data for each game (it only needs to know how to tell when one game ends and the next begins)
  • have the parser take that raw game data and parse it (so that it is no longer responsible for where in the file you happen to be)

This way, if you have a file with five games and game two fails, you can still try to parse games 3, 4, and 5.
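A rough sketch of that split might look like the following. It assumes every game starts with an [Event "..."] tag line, which is common in PGN files but is an assumption here, and parse_game is a hypothetical stand-in for the real parser.

```python
import io


def iter_raw_games(lines):
    """Yield the raw text of each game, one string per game.

    Assumption: a line starting with '[Event ' begins a new game.
    """
    chunk = []
    for line in lines:
        if line.startswith("[Event ") and any(part.strip() for part in chunk):
            yield "".join(chunk)
            chunk = []
        chunk.append(line)
    if any(part.strip() for part in chunk):
        yield "".join(chunk)


def parse_game(raw):
    """Hypothetical stand-in parser: fails on a made-up marker."""
    if "??illegal??" in raw:
        raise ValueError("fatal parse error")
    return raw.strip().splitlines()[0]  # just return the first tag line


sample = io.StringIO(
    '[Event "A"]\n\n1. e4 e5 *\n\n'
    '[Event "B"]\n\n1. ??illegal?? *\n\n'
    '[Event "C"]\n\n1. d4 d5 *\n'
)

results = []
for raw in iter_raw_games(sample):
    try:
        results.append(parse_game(raw))
    except ValueError:
        results.append(None)  # this game failed, the following ones are unaffected

print(results)
```

Because the iterator alone tracks the position in the file, a failure inside parse_game cannot leave the reader stuck halfway through a game.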

+6

The most Pythonic way is the logging module. It was mentioned in the comments, but unfortunately without enough emphasis. There are many reasons why it is preferable to warnings:

  • The warnings module is designed to alert you to possible problems in code, not to bad user data.
  • The first reason is actually enough. :-)
  • The logging module provides adjustable message severity: not only warnings, but anything from debug messages to critical errors.
  • You can fully control the output of the logging module. Messages can be filtered by source, content and severity, formatted in any way, and sent to different output targets (console, pipes, files, memory, etc.).
  • The logging module separates the reporting of errors, warnings and other messages from their output: your code generates messages of the appropriate type and does not have to care how they are presented to the end user.
  • The logging module is the de facto standard for Python code. Everyone uses it everywhere. So if your code uses it, combining it with third-party code (which probably uses logging too) will be a breeze. Well, maybe something stronger than a breeze, but definitely not a category 5 hurricane. :-)

A basic usage example for the logging module would look like this:

    import logging

    logger = logging.getLogger(__name__)  # module-level logger

    # (tons of code)
    logger.warning('illegal move: %s in file %s', move, file_name)
    # (more tons of code)

With the default format of logging.basicConfig(), this prints messages such as:

 WARNING:chess_parser:illegal move: a2-b7 in file parties.pgn 

(if your module is called chess_parser.py)

Most importantly, you do not need to do anything else in the parser module. You declare that you use the logging system, you use a logger with a specific name (the same as the parser module name in this example), and you send it a warning-level message. Your module does not need to know how these messages are processed, formatted and reported to the user, or whether they are reported at all. For example, you can configure the logging module (usually at the very start of your program) to use a different format and write to a file:

    logging.basicConfig(filename='parser.log', format='%(name)s [%(levelname)s] %(message)s')

And suddenly, without any changes to your module's code, your warning messages are saved to a file in a different format instead of being printed to the screen:

 chess_parser [WARNING] illegal move: a2-b7 in file parties.pgn 

Or you can suppress warnings if you want:

    logging.basicConfig(level=logging.ERROR)

And the warnings of your module will be completely ignored, while any ERROR or higher level messages from your module will be processed.
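Here is a small sketch that can be run to see this separation at work. It attaches a handler that writes into an in-memory buffer instead of calling basicConfig, purely so the output can be inspected; the logger name "chess_parser" and the report_illegal_move helper are illustrative assumptions.

```python
import io
import logging

# "Parser module" side: only emits messages, knows nothing about output.
logger = logging.getLogger("chess_parser")  # in a real module: getLogger(__name__)


def report_illegal_move(move, file_name):
    logger.warning("illegal move: %s in file %s", move, file_name)


# "Application" side: decides where messages go and how they look.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(name)s [%(levelname)s] %(message)s"))
logger.addHandler(handler)
logger.propagate = False  # keep this demo's messages away from the root logger

report_illegal_move("a2-b7", "parties.pgn")
first_output = stream.getvalue()

# Raising the threshold silences warnings without touching the parser code.
logger.setLevel(logging.ERROR)
report_illegal_move("a2-b7", "parties.pgn")
second_output = stream.getvalue()

print(first_output, end="")
```

The second call adds nothing to the buffer, because the warning is below the ERROR threshold that the application chose.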

+7

I offered the bounty because I would like to know whether this really is the best way to do it. However, I am also writing a parser and so need this functionality, and this is what I came up with.

The warnings module is exactly what you want.

At first, I was put off by the fact that every example of a warning in the docs looks like this:

    Traceback (most recent call last):
      File "warnings_warn_raise.py", line 15, in <module>
        warnings.warn('This is a warning message')
    UserWarning: This is a warning message

... which is undesirable because I do not want it to be a UserWarning; I want my own custom warning name.

Here's a solution to this:

    import warnings

    class AmbiguousStatementWarning(Warning):
        pass

    def x():
        warnings.warn("unable to parse statement syntax", AmbiguousStatementWarning, stacklevel=3)
        print("after warning")

    def x_caller():
        x()
    x_caller()

which gives:

    $ python3 warntest.py
    warntest.py:12: AmbiguousStatementWarning: unable to parse statement syntax
      x_caller()
    after warning
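If the calling code wants to collect such warnings instead of having them printed, or wants to turn them into exceptions, the standard warnings.catch_warnings context manager can do both. A minimal sketch, where IllegalMoveWarning and parse_moves are made-up names for illustration:

```python
import warnings


class IllegalMoveWarning(UserWarning):
    """Hypothetical warning category for this example."""


def parse_moves(tokens):
    """Toy parser: anything not starting with a letter counts as illegal."""
    moves = []
    for token in tokens:
        if not token[0].isalpha():
            warnings.warn(f"skipping illegal move {token!r}",
                          IllegalMoveWarning, stacklevel=2)
            continue
        moves.append(token)
    return moves


# Collect warnings instead of printing them.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    moves = parse_moves(["e4", "??", "Nf3"])
messages = [str(w.message) for w in caught]

# Escalate the same warnings to exceptions when strictness is wanted.
with warnings.catch_warnings():
    warnings.simplefilter("error", IllegalMoveWarning)
    try:
        parse_moves(["??"])
        escalated = False
    except IllegalMoveWarning:
        escalated = True

print(moves, messages, escalated)
```

The parser code is identical in both cases; only the filter chosen by the caller differs.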
+3

I'm not sure whether this solution is Pythonic or not, but I use it quite often with minor modifications: the parser does its job inside a generator and yields results together with a status code. The receiving code decides what to do with the failed items:

    def process_items(items):
        for item in items:
            try:
                # process item
                yield processed_item, None
            except Exception as err:
                yield None, (SOME_ERROR_CODE, str(err), item)

    for processed, err in process_items(items):
        if err:
            # process and log err, collect failed items, etc.
            continue
        # further process processed
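For something that can actually be run, here is a toy variant of the same idea. Converting strings to integers stands in for the real parsing work, and the try block covers only the conversion so that errors raised by the consumer are not swallowed; both choices are mine, not part of the pattern itself.

```python
def parse_numbers(tokens):
    """Yield (value, None) on success and (None, (message, token)) on failure."""
    for token in tokens:
        try:
            value = int(token)  # stand-in for the real per-item parsing
        except ValueError as err:
            yield None, (str(err), token)
        else:
            yield value, None


parsed, failed = [], []
for value, err in parse_numbers(["1", "x", "3"]):
    if err:
        failed.append(err[1])  # remember the offending token
        continue
    parsed.append(value)

print(parsed, failed)
```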

A more general approach is to use design patterns. A simplified version of Observer (where you register callbacks for specific errors) or a kind of Visitor (where the visitor has methods for handling specific errors; see how SAX parsers do it) can be a clear and well-understood solution.
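A simplified Observer along those lines could be sketched like this. MoveParser, its on method and the error kinds "illegal" and "empty" are all invented for the example.

```python
class MoveParser:
    """Hypothetical parser that reports problems through registered callbacks."""

    def __init__(self):
        self._handlers = {}

    def on(self, kind, callback):
        """Register a callback for one kind of problem."""
        self._handlers.setdefault(kind, []).append(callback)

    def _report(self, kind, detail):
        # Problems nobody registered for are silently dropped.
        for callback in self._handlers.get(kind, []):
            callback(detail)

    def parse(self, tokens):
        moves = []
        for token in tokens:
            if token == "":
                self._report("empty", "empty move token")
            elif not token[0].isalpha():
                self._report("illegal", token)
            else:
                moves.append(token)
        return moves


illegal = []
parser = MoveParser()
parser.on("illegal", illegal.append)  # the caller decides how to react
moves = parser.parse(["e4", "??", "", "Nf3"])

print(moves, illegal)
```

Parsing always runs to the end, and each caller subscribes only to the kinds of problems it cares about.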

+3

Without libraries, it's hard to do it cleanly, but still possible.

Depending on the situation, there are various ways to handle it.

Method 1:

Put the entire contents of the while loop inside the following:

    while True:
        try:
            # codecodecode
        except Exception as detail:
            print(detail)

Method 2:

The same as method 1, except with several try / except blocks, so that an error does not skip too much code and you know the exact location of the error.

Sorry, in a hurry, hope this helps!

0