Is paranoia (excessive logging and exception handling) normal in simple file-management scripts?

I use Python for a large number of file-management scripts, like the one below. When I search for examples online, I am struck by how little logging and exception handling the examples contain. Every time I write a new script I intend to do less of it, but where files are concerned my paranoia takes over, and the end result looks nothing like the examples I see on the net. Since I am a beginner, I would like to know whether this is normal. If not, how do you deal with the unknowns and the fear of deleting valuable information?

 def flatten_dir(dirname):
     '''Flattens a given root directory by moving all files from its
     sub-directories and nested sub-directories into the root directory and
     then deletes all sub-directories and nested sub-directories. Creates a
     backup directory preserving the original structure of the root directory
     and restores this in case of errors.
     '''
     RESTORE_BACKUP = False
     log.info('processing directory "%s"' % dirname)
     backup_dirname = str(uuid.uuid4())
     try:
         shutil.copytree(dirname, backup_dirname)
         log.debug('directory "%s" backed up as directory "%s"' % (dirname, backup_dirname))
     except shutil.Error:
         log.error('shutil.Error: Error while trying to back up the directory')
         sys.stderr.write('the program is terminating with an error\n')
         sys.stderr.write('please consult the log file\n')
         sys.stderr.flush()
         time.sleep(0.25)
         print 'Press any key to quit this program.'
         msvcrt.getch()
         sys.exit()
     for root, dirs, files in os.walk(dirname, topdown=False):
         log.debug('os.walk passing: (%s, %s, %s)' % (root, dirs, files))
         if root != dirname:
             for file in files:
                 full_filename = os.path.join(root, file)
                 try:
                     shutil.move(full_filename, dirname)
                     log.debug('"%s" copied to directory "%s"' % (file, dirname))
                 except shutil.Error:
                     RESTORE_BACKUP = True
                     log.error('file "%s" could not be copied to directory "%s"' % (file, dirname))
                     log.error('flagging directory "%s" for reset' % dirname)
             if not RESTORE_BACKUP:
                 try:
                     shutil.rmtree(root)
                     log.debug('directory "%s" deleted' % root)
                 except shutil.Error:
                     RESTORE_BACKUP = True
                     log.error('directory "%s" could not be deleted' % root)
                     log.error('flagging directory "%s" for reset' % dirname)
         if RESTORE_BACKUP:
             break
     if RESTORE_BACKUP:
         RESTORE_FAIL = False
         try:
             shutil.rmtree(dirname)
         except shutil.Error:
             log.error('modified directory "%s" could not be deleted' % dirname)
             log.error('manual restoration from backup directory "%s" necessary' % backup_dirname)
             RESTORE_FAIL = True
         if not RESTORE_FAIL:
             try:
                 os.renames(backup_dirname, dirname)
                 log.debug('back up of directory "%s" restored' % dirname)
                 print '>'
                 print '>******WARNING******'
                 print '>There was an error while trying to flatten directory "%s"' % dirname
                 print '>back up of directory "%s" restored' % dirname
                 print '>******WARNING******'
                 print '>'
             except WindowsError:
                 log.error('backup directory "%s" could not be renamed to original directory name' % backup_dirname)
                 log.error('manual renaming of backup directory "%s" to original directory name "%s" necessary' % (backup_dirname, dirname))
                 print '>'
                 print '>******WARNING******'
                 print '>There was an error while trying to flatten directory "%s"' % dirname
                 print '>back up of directory "%s" was NOT restored successfully' % dirname
                 print '>no information is lost'
                 print '>check the log file for information on manually restoring the directory'
                 print '>******WARNING******'
                 print '>'
     else:
         try:
             shutil.rmtree(backup_dirname)
             log.debug('back up of directory "%s" deleted' % dirname)
             log.info('directory "%s" successfully processed' % dirname)
             print '>directory "%s" successfully processed' % dirname
         except shutil.Error:
             log.error('backup directory "%s" could not be deleted' % backup_dirname)
             log.error('manual deletion of backup directory "%s" necessary' % backup_dirname)
             print '>'
             print '>******WARNING******'
             print '>directory "%s" successfully processed' % dirname
             print '>cleanup of backup directory "%s" failed' % backup_dirname
             print '>manual cleanup necessary'
             print '>******WARNING******'
             print '>'
Tags: python, logging, exception
5 answers

Learning to let go (or: how I learned to live with the bomb)...

Ask yourself: what exactly are you afraid of, and how will you deal with it if it happens? In your example you want to avoid data loss. The way you deal with it at the moment is to look for every combination of conditions you think is an error and pour a huge amount of logging onto it. Things will keep going wrong in ways you did not anticipate, and it is not clear that a large volume of logging is a good way to handle them. Sketch out what you are trying to achieve:

 for each file in a tree
     if file is below the root
         move it into the root
 if nothing went wrong
     delete empty subtrees

So, what can go wrong in this process? Well, there are many ways a file move can fail in the underlying file system. Can we list them all and provide a good way to deal with each one? No... but in general you will deal with them all the same way. Sometimes an error is just an error, no matter what caused it.

So in this case, if any error occurs, you want to stop and undo your changes. The way you've chosen to do this is to make a backup beforehand and restore it when something goes wrong. But your most likely failure is a full file system, in which case those recovery steps themselves are likely to fail. OK, so this is a fairly common problem: if you are worried about unknown errors at any point, how do you stop your recovery path from being hit by them as well?

The general answer is to do all of the intermediate work first, and then take a single dangerous (hopefully atomic) step at the end. In your case you need to flip the recovery around. Instead of making a backup to fall back on, build a copy of the result. If all goes well, you can then swap the new result over the old source tree. Or, if you are really paranoid, you can leave that final step to the human. The advantage is that if anything goes wrong, you simply abort and throw away the partial state you created.

Then your structure will look like this:

 make empty result directory
 for every file in the tree
     copy file into new result
     on failure abort
 otherwise move result over old source directory

By the way, your current script has a bug that this pseudo-code makes more obvious: if you have files with the same name in different branches, they will overwrite each other in the new flattened version.

The second point about this pseudo-code is that all of the error handling sits in one place (i.e. wrap the creation of the new directory and the recursive copy inside a single try block and catch every error after it). This answers your original question about the high ratio of logging / error checking to actual working code:

 backup_dirname = str(uuid.uuid4())
 try:
     os.mkdir(backup_dirname)  # os.mkdir, not shutil: shutil has no mkdir
     for root, dirs, files in os.walk(dirname, topdown=False):
         for file in files:
             full_filename = os.path.join(root, file)
             target_filename = os.path.join(backup_dirname, file)
             shutil.copy(full_filename, target_filename)
 except Exception, e:
     print >>sys.stderr, "Something went wrong: %s" % e
     sys.exit(-1)
 shutil.move(backup_dirname, dirname)  # I would do this bit by hand really

It's good to be a little paranoid. But there are different kinds of paranoia :). During development I use a lot of debugging statements so that I can see where things go wrong (if they go wrong). Sometimes I leave these statements in, but guard them with a flag that controls whether they are printed (essentially a debug flag). You can also have a verbosity flag to control how much you log.
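The standard logging module supports this pattern directly; a small sketch (the `VERBOSE` flag name is invented, and in a real script it might come from argparse):

```python
import logging

VERBOSE = False  # hypothetical flag; flip it or set it from the command line

logging.basicConfig(
    level=logging.DEBUG if VERBOSE else logging.INFO,
    format='%(asctime)s %(levelname)s %(message)s',
)
log = logging.getLogger(__name__)

log.debug('only emitted when VERBOSE is True')
log.info('always emitted')
```

The debug statements stay in the code permanently; only the level decides whether they show up.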

Another type of paranoia comes with sanity checks. This paranoia comes into play when you rely on external data or tools: basically anything that does not originate inside your program. Here you can never be too paranoid (especially about the data you receive: never trust it).

It is also normal to be paranoid when checking whether a particular operation completed successfully. This is just part of normal error handling. I notice that you perform operations such as deleting directories and files. These operations can fail, and so you must handle the scenario in which they do. If you simply ignore it, your code may end up in an undefined state and could do bad (or at least unwanted) things.

As for log files and debug output, you can leave them in if you want. I usually do a decent amount of logging; enough to tell me what is going on. This is subjective, of course. The key is not to drown yourself in logging, with so much information that you cannot easily pick out what matters. Logging mostly helps you figure out what went wrong when a script that used to work suddenly stops working. Instead of stepping through the program, you can get a general idea of where the problem lies by reading the logs.


Paranoia can definitely obscure what your code is trying to do, and that is bad for several reasons: it hides bugs, it makes the program harder to change when you need it to do something else, and it makes debugging difficult.

Assuming Amoss can't cure you of your paranoia, here is how I might rewrite the program. Note:

  • Each paranoia-heavy block of code is split out into its own function.

  • Each time an exception is caught, it is re-raised, until it is finally caught in the main function. This eliminates the need for flag variables like RESTORE_BACKUP and RESTORE_FAIL.

  • The heart of the program (in flatten_dir) is now only 17 lines long and paranoia-free.


 def backup_tree(dirname, backup_dirname):
     try:
         shutil.copytree(dirname, backup_dirname)
         log.debug('directory "%s" backed up as directory "%s"' % (dirname, backup_dirname))
     except:
         log.error('Error trying to back up the directory')
         raise

 def move_file(full_filename, dirname):
     try:
         shutil.move(full_filename, dirname)
         log.debug('"%s" moved to directory "%s"' % (full_filename, dirname))
     except:
         log.error('file "%s" could not be moved to directory "%s"' % (full_filename, dirname))
         raise

 def remove_empty_dir(dirname):
     try:
         os.rmdir(dirname)
         log.debug('directory "%s" deleted' % dirname)
     except:
         log.error('directory "%s" could not be deleted' % dirname)
         raise

 def remove_tree_for_restore(dirname, backup_dirname):
     try:
         shutil.rmtree(dirname)
     except:
         log.error('modified directory "%s" could not be deleted' % dirname)
         log.error('manual restoration from backup directory "%s" necessary' % backup_dirname)
         raise

 def restore_backup(backup_dirname, dirname):
     try:
         os.renames(backup_dirname, dirname)
         log.debug('back up of directory "%s" restored' % dirname)
         print '>'
         print '>******WARNING******'
         print '>There was an error while trying to flatten directory "%s"' % dirname
         print '>back up of directory "%s" restored' % dirname
         print '>******WARNING******'
         print '>'
     except:
         log.error('backup directory "%s" could not be renamed to original directory name' % backup_dirname)
         log.error('manual renaming of backup directory "%s" to original directory name "%s" necessary' % (backup_dirname, dirname))
         print '>'
         print '>******WARNING******'
         print '>There was an error while trying to flatten directory "%s"' % dirname
         print '>back up of directory "%s" was NOT restored successfully' % dirname
         print '>no information is lost'
         print '>check the log file for information on manually restoring the directory'
         print '>******WARNING******'
         print '>'
         raise

 def remove_backup_tree(backup_dirname, dirname):
     try:
         shutil.rmtree(backup_dirname)
         log.debug('back up of directory "%s" deleted' % dirname)
         log.info('directory "%s" successfully processed' % dirname)
         print '>directory "%s" successfully processed' % dirname
     except shutil.Error:
         log.error('backup directory "%s" could not be deleted' % backup_dirname)
         log.error('manual deletion of backup directory "%s" necessary' % backup_dirname)
         print '>'
         print '>******WARNING******'
         print '>directory "%s" successfully processed' % dirname
         print '>cleanup of backup directory "%s" failed' % backup_dirname
         print '>manual cleanup necessary'
         print '>******WARNING******'
         print '>'
         raise

 def flatten_dir(dirname):
     '''Flattens a given root directory by moving all files from its
     sub-directories and nested sub-directories into the root directory and
     then deletes all sub-directories and nested sub-directories. Creates a
     backup directory preserving the original structure of the root directory
     and restores this in case of errors.
     '''
     log.info('processing directory "%s"' % dirname)
     backup_dirname = str(uuid.uuid4())
     backup_tree(dirname, backup_dirname)
     try:
         for root, dirs, files in os.walk(dirname, topdown=False):
             log.debug('os.walk passing: (%s, %s, %s)' % (root, dirs, files))
             if root != dirname:
                 for file in files:
                     full_filename = os.path.join(root, file)
                     move_file(full_filename, dirname)
                 remove_empty_dir(root)
     except:
         remove_tree_for_restore(dirname, backup_dirname)
         restore_backup(backup_dirname, dirname)
         raise
     else:
         remove_backup_tree(backup_dirname, dirname)

 def main(dirname):
     try:
         flatten_dir(dirname)
     except:
         import traceback
         logging.exception('error flattening directory "%s"' % dirname)
         traceback.print_exc()
         sys.stderr.write('the program is terminating with an error\n')
         sys.stderr.write('please consult the log file\n')
         sys.stderr.flush()
         time.sleep(0.25)
         print 'Press any key to quit this program.'
         msvcrt.getch()
         sys.exit()

It seems reasonable to me. It depends on how important your data is.

I often start out like this too, and I make the logging optional, with a flag at the top of the file (or set by the caller) that turns logging on or off. You can also add a verbosity setting.

Typically, once something has been working for a while and is no longer under development, I stop reading the logs and just generate giant log files that I never look at. But if something does go wrong, it is good to know they are there.


If it is acceptable for the job to be left half-done on error (only some of the files moved), as long as no files are lost, then the backup directory is not needed and you can write much simpler code:

 import os, logging

 def flatten_dir(dirname):
     for root, dirs, files in os.walk(dirname, topdown=False):
         if root != dirname:
             for file in files:
                 full_filename = os.path.join(root, file)
                 target_filename = os.path.join(dirname, file)
                 if os.path.exists(target_filename):
                     raise Exception('Unable to move file "%s" because "%s" already exists' % (full_filename, target_filename))
                 os.rename(full_filename, target_filename)
             os.rmdir(root)

 def main():
     try:
         flatten_dir(somedir)
     except:
         logging.exception('Failed to flatten directory "%s".' % somedir)
         print "ERROR: Failed to flatten directory. Check log files for details."

Each individual system call here makes progress without destroying any data you wanted to keep. There is no need for a backup directory because you never need to “restore” anything.

