Debugging strategy for a bug (apparently) affected by a timing

I'm pretty inexperienced with python, so I don't understand how to approach this error. I have inherited a python application that basically just copies files and folders from one place to another. All files and folders are located on the local computer, and the user has full administrator rights, so there are no problems with network or security.

I found that several files cannot be copied from one directory to another unless I somehow slow down the code. If I just run the program, it fails, but if I go through the debugger or add print statements to the copy loop, it will succeed. The difference, apparently, is connected either with the synchronization of the cycle or with the movement in the memory.

I have seen such an error in compiled languages ​​before, and usually this indicates a race condition or memory corruption. However, there is only one thread and interaction with other processes, so the state of the race seems impossible. Memory corruption remains an opportunity, but I'm not sure how to investigate this possibility with python.

Here is the cycle in question. As you can see, this is pretty simple:

 def concat_dirs(dir, subdir): return dir + "/" + subdir for name in fullnames: install_name = concat_dirs(install_path, name) dirname = os.path.dirname(install_name) if not os.path.exists(dirname): os.makedirs(dirname) shutil.copyfile(concat_dirs(java_path, name), install_name) 

This loop usually does not copy files unless I step over it using the debugger or add this statement after the shutil.copyfile line.

  print "copied ", concat_dirs(java_path, name), " to ", install_name 

If I add this statement or go through debug, the loop will work perfectly and consistently. I was tempted to say “good enough” with the print statement, but I know that it just masks the underlying problem.

I am not asking you to debug my code because I know that you cannot; I am asking for a debugging strategy. How can I find this error?

+6
source share
1 answer

You have a race condition: you check for the dirname and then try to create it. This should make the program bomb if something unexpected happens, but ...

In Python, we say it's easier to ask forgiveness than permission. Continue and create this directory every time, and then apologize if it already exists:

 import errno import os for name in fullnames: source_name = os.path.join(java_path, name) dest_name = os.path.join(install_path, name) dest_dir = os.path.dirname(dest_name) try: os.makedirs(dest_dir) except OSError as exc: if exc.errno != errno.EEXIST: raise shutil.copyfile(source_name, dest_name) 

I’m not sure how to fix this problem, except by trying it in an indifferent way and seeing what happens. There may be a problem with the thin file system, which makes this run weird.

+1
source

All Articles