We have a setup that monitors and downloads files from remote FTP servers that are not under our control. A script connects to the remote FTP server, captures the names of the files there, and checks each one against a list of files we have already downloaded. If a file has not been downloaded yet, we download it and add it to the list.
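For reference, the detection step can be sketched with ftplib roughly like this (the function name and the `seen` set bookkeeping are my own illustration, not the original script):

```python
from ftplib import FTP


def new_files(ftp, seen):
    """Return names on the server that have not been downloaded yet.

    ftp  -- a connected ftplib.FTP instance (or anything with .nlst())
    seen -- set of file names already downloaded
    """
    return [name for name in ftp.nlst() if name not in seen]
```

The problem described below is that a name can show up in `nlst()` while the file is still being written.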
Recently we ran into a problem: someone on the remote side copies in a single massive file (> 1 GB), our script wakes up, sees a new file, and starts downloading it while it is still being written.
What is the best way to check for this? I was thinking of capturing the file size, waiting a few seconds, checking the size again, and only downloading if it has not grown. But time is a concern: we cannot afford to wait a few seconds per file across whole sets of files just to see whether the size has increased.
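The size-polling idea could look roughly like this (a sketch only: SIZE is an extension not every server supports, and the wait interval is arbitrary):

```python
import time
from ftplib import FTP


def size_is_stable(ftp, name, wait=5):
    """Heuristic: treat the file as complete if its reported size does
    not change over a short interval. Not reliable on servers that
    pre-allocate the full size at the start of the transfer."""
    before = ftp.size(name)
    time.sleep(wait)
    return ftp.size(name) == before
```

This is exactly the approach the question wants to avoid, since the sleep is paid once per file.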
What would be the best way to do this? Everything is currently done with Python's ftplib. How can we handle this other than with the method above?
Once again, let me repeat: we have zero control over the remote FTP sites.
Thanks.
Update 1:
I thought about trying to rename the file: since we have full permissions on the FTP server, will the rename command fail while the file is still being uploaded?
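The rename probe could be sketched like this (a heuristic only: whether renaming an in-progress upload fails depends on the server's OS and software — Windows servers typically lock open files, while Unix servers usually allow the rename; the `.probe` suffix is my own choice):

```python
from ftplib import error_perm


def probe_rename(ftp, name):
    """Try renaming the file away and back. If the server refuses,
    assume the upload is still in progress. Returns True if the file
    appears complete (rename succeeded), False otherwise."""
    tmp = name + ".probe"
    try:
        ftp.rename(name, tmp)
        ftp.rename(tmp, name)  # restore the original name
        return True
    except error_perm:
        return False
```

One risk worth noting: if the remote uploader retries by file name while our probe has the file renamed, the two sides could conflict, so the rename-back should happen as quickly as possible.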
We have no real options here, do we?
Update 2: Here's something interesting: some of the FTP servers I tested seem to allocate the full file size as soon as the transfer starts.
E.g. if I transfer a 200 MB file to the FTP server, then connect while the transfer is still active and request the file's size, it reports 200 MB even though only 10% of the file has actually arrived.
Permission behavior also seems inconsistent: the FTP server that ships with IIS sets permissions only AFTER the file completes, while some of the other, older FTP servers set them as soon as you start sending the file.
: '(