I am trying to write a Python script that computes a single md5sum for all the files in a directory (on Linux), which I believe I have done in the code below.
I want to be able to run this to make sure that no files in the directory have been changed and that no files have been added or deleted.
The problem is that if I make a change to a file in the directory and then change it back, I get a different result from running the function below, even though the modified file is back to its original contents.
Can someone explain this? And let me know if you can think of a workaround?
    import hashlib
    import tarfile

    def get_dir_md5(dir_path):
        """Build a tar file of the directory and return its md5 sum"""
        temp_tar_path = 'tests.tar'
        t = tarfile.TarFile(temp_tar_path, mode='w')
        t.add(dir_path)
        t.close()
        m = hashlib.md5()
        # Read the tar file in binary mode and close it when done
        with open(temp_tar_path, 'rb') as f:
            m.update(f.read())
        ret_str = m.hexdigest()
        return ret_str
Edit: As these lovely people have replied, it looks like tar includes header information such as each file's modification date. Would using zip behave any differently, or would another format?
Any other ideas for a workaround?
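One possible workaround, as a rough sketch rather than a definitive answer: skip the archive entirely and feed the file contents straight into the hash, so that header metadata such as modification times never enters the digest. The version below reuses the name get_dir_md5 from my code above; the sorted traversal, the use of relative paths, the NUL separator and the 64 KiB chunk size are all my own assumptions. It does not notice empty directories or permission changes, and it will raise an error on a broken symlink.

```python
import hashlib
import os


def get_dir_md5(dir_path):
    """Return an md5 hex digest of the relative paths and contents of all files under dir_path."""
    m = hashlib.md5()
    for root, dirs, files in os.walk(dir_path):
        dirs.sort()  # sort in place so os.walk visits subdirectories in a fixed order
        for name in sorted(files):
            full_path = os.path.join(root, name)
            rel_path = os.path.relpath(full_path, dir_path)
            # Hash the relative path so added, deleted, or renamed files change the digest
            m.update(os.fsencode(rel_path))
            m.update(b'\0')  # assumed separator between the path and the file contents
            with open(full_path, 'rb') as f:
                # Read in chunks so large files are not loaded into memory at once
                for chunk in iter(lambda: f.read(65536), b''):
                    m.update(chunk)
    return m.hexdigest()
```

With this, changing a file and then changing it back should give the original digest again, because only paths and bytes are hashed, while adding or deleting a file should still change the result.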
Greg