Gracefully reading files without locking

Board Overview

(Board diagram images omitted: 1000 × 750 px, ~130 kB JPEGs hosted on ImageShack.)


Additional Information

I should mention that each user (from client boxes) will work directly with the /Foo resource. Due to the nature of the business, users will never have to browse or work with each other at the same time, so conflicts of this nature will never be a problem. Access should be as simple as possible for them, which probably means mapping the drive to the appropriate /Foo/username subdirectory.

In addition, no one except my applications (my own and those on the server) will directly use the FTP directory.


Possible implementations

Unfortunately, I can't just use off-the-shelf tools such as WinSCP, because some other logic needs to be tied closely into the process.

I can see two straightforward ways to do this in-house.

  • Method One (Slow):

    • Walk the /Foo directory tree every N minutes.

    • Diff against the previous tree using a combination of timestamps (these can be faked by file-copying tools, but that is not a concern here) and checksums.

    • Sync the changes to an off-site FTP server.

  • Method Two:

    • Register for directory change notifications (for example, using ReadDirectoryChangesW from the WinAPI, or FileSystemWatcher in .NET).

    • Log the changes.

    • Sync the changes to the FTP server every N minutes.
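The first step of Method Two can be sketched in C# with FileSystemWatcher. This is a minimal illustration, not a full service: the watched path is hypothetical, and a real implementation would debounce the duplicate events the watcher is known to raise and queue paths for the sync pass rather than printing them.

```csharp
using System;
using System.IO;

class WatchFoo
{
    static void Main()
    {
        // Hypothetical share path; adjust to the real /Foo location.
        var watcher = new FileSystemWatcher(@"D:\Foo")
        {
            IncludeSubdirectories = true,
            NotifyFilter = NotifyFilters.FileName | NotifyFilters.LastWrite | NotifyFilters.Size
        };

        // Log each change; a real service would enqueue these for upload.
        watcher.Changed += (s, e) => Console.WriteLine($"Changed: {e.FullPath}");
        watcher.Created += (s, e) => Console.WriteLine($"Created: {e.FullPath}");
        watcher.Renamed += (s, e) => Console.WriteLine($"Renamed: {e.OldFullPath} -> {e.FullPath}");
        watcher.EnableRaisingEvents = true;

        Console.WriteLine("Watching... press Enter to stop.");
        Console.ReadLine();
    }
}
```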

I will probably end up using something like the second method for performance reasons.


Problem

Since this synchronization needs to run during business hours, the first problem arises during the off-site upload phase.

While I am moving a file off-site, I really need to prevent users from writing to it (for example, by opening it via CreateFile with a share mode of FILE_SHARE_READ only, or something similar) while I read it. The upstream Internet speed at their office is nowhere near adequate for the size of the files they will work with, so it is quite possible that they will come back to a file and try to modify it while I am still reading it.
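In .NET, the CreateFile share mode mentioned above corresponds to the FileShare argument of FileStream. A sketch, with a hypothetical path: FileShare.Read lets other processes keep reading the file but makes any attempt to open it for writing fail with a sharing violation until the stream is closed.

```csharp
using System.IO;

class DenyWriters
{
    static void Main()
    {
        // Hypothetical path on the share.
        // FileShare.Read maps to CreateFile's dwShareMode = FILE_SHARE_READ:
        // other readers are allowed, writers get a sharing violation.
        using (var fs = new FileStream(@"D:\Foo\user1\report.doc",
                                       FileMode.Open,
                                       FileAccess.Read,
                                       FileShare.Read))
        {
            // ... stream fs to the FTP server here ...
        }
        // Writers are blocked for the entire upload, which is exactly
        // the duration problem the snapshot approach below avoids.
    }
}
```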


Possible Solution

The simplest solution to this problem would be to create a copy of the file(s) in question elsewhere on the file system and transfer those "snapshots" without interference.

The files (some of which will be binary) that these guys work with are relatively small, probably ≤ 20 MB, so copying them (and therefore temporarily locking them) will be almost instantaneous. The chance that someone tries to write to a file at the exact moment I am copying it should be close to zero.
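The snapshot idea can be sketched in a few lines of C#. This is an illustration under stated assumptions: paths and the staging directory are hypothetical, and retry/error handling is omitted. The point is that File.Copy holds the source open (denying writers) only for the duration of the copy, which is near-instant for files of ~20 MB or less.

```csharp
using System.IO;

static class Snapshotter
{
    // Copy sourcePath into stagingDir and return the snapshot's path.
    // The source is locked only while the copy runs; the upload then
    // works from the snapshot, so users can modify the original freely.
    public static string Snapshot(string sourcePath, string stagingDir)
    {
        Directory.CreateDirectory(stagingDir);
        string dest = Path.Combine(stagingDir, Path.GetFileName(sourcePath));
        File.Copy(sourcePath, dest, overwrite: true);
        return dest;
    }
}
```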

This solution seems pretty ugly, and I'm sure there is a better way to handle this problem.

One thing that comes to mind is something like a file system filter driver that takes care of replication and synchronization at the IRP level, like some A/V products do. That, however, is overkill for my project.


Questions

This is the first time I have encountered this type of problem, so maybe I am overthinking it.

I am interested in clean solutions that do not go overboard in implementation complexity. Perhaps I have missed something in the WinAPI that handles this problem gracefully?

I have not decided what language I will write this in, but I like C, C++, C#, D and Perl.

synchronization windows filesystems locking winapi
2 answers

After the discussion in the comments, my suggestion would be:

  • Create a partition on your data server, about 5 GB to be safe.
  • Create a Windows service project in C# that will monitor your data drive/location.
  • When a file has been modified, create a local copy of it on the new partition, preserving the same directory structure and location.
  • Create another service that will perform the following actions:
    • Monitor bandwidth usage.
    • Watch for file creation on the temporary partition.
    • Transfer several files at once (using threading) to your FTP server, watching current bandwidth usage and decreasing/increasing the number of worker threads depending on network traffic.
    • Delete successfully transferred files from the partition.

So basically you have your drives:

  • C: Windows installation
  • D: Shared storage
  • X: Temporary partition

Then you will have the following services:

  • LocalMirrorService - watches D: and copies changed files to X: with the same directory structure
  • TransferClientService - moves files from X: to the FTP server, then deletes them from X:
    • Also uses multiple threads to transfer several files at once, and tracks bandwidth.
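The TransferClientService's throttled, multi-threaded transfer could be sketched like this. Everything here is hypothetical scaffolding (the class and method names are not from any library): a semaphore caps the number of concurrent uploads, and a bandwidth monitor could adjust that cap by releasing or withholding slots.

```csharp
using System.Collections.Concurrent;
using System.Threading;

class TransferClient
{
    // Cap concurrent FTP uploads; a bandwidth monitor could raise or
    // lower this cap based on measured network traffic.
    private readonly SemaphoreSlim _slots = new SemaphoreSlim(4);
    private readonly BlockingCollection<string> _queue = new BlockingCollection<string>();

    public void Enqueue(string path) => _queue.Add(path);

    public void Run()
    {
        foreach (string path in _queue.GetConsumingEnumerable())
        {
            _slots.Wait();                        // block while all worker slots are busy
            ThreadPool.QueueUserWorkItem(_ =>
            {
                try { UploadFile(path); }          // hypothetical FTP upload
                finally { _slots.Release(); }
            });
        }
    }

    private void UploadFile(string path)
    {
        // FTP transfer here, then delete the file from X: on success.
    }
}
```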

I would imagine this is roughly the idea you already had in mind, but it seems a reasonable approach as long as you are good at application development and can build a robust system that copes with most problems.

When a user edits, for example, a document in Microsoft Word, the file will be modified on the share and could be copied to X: even while the user is still working on it. There are APIs in Windows to check whether a file handle is still held open by a user; if so, you can simply hook in to detect when the user actually closes the document, so that all edits are complete, and only then copy it to the X: drive.
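There is no single "is this file open?" call; a common heuristic (an assumption here, not a dedicated API) is to attempt an exclusive open and treat a sharing violation as "still in use". A sketch:

```csharp
using System.IO;

static class FileProbe
{
    // Returns true if another process still holds the file open.
    // FileShare.None requests exclusive access, so the open fails with
    // an IOException (sharing violation) while, e.g., Word has the
    // document open. Poll until this returns false before copying.
    public static bool IsFileInUse(string path)
    {
        try
        {
            using (new FileStream(path, FileMode.Open, FileAccess.ReadWrite, FileShare.None))
            {
                return false;  // we got exclusive access
            }
        }
        catch (IOException)
        {
            return true;       // someone still has a handle open
        }
    }
}
```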

The caveat is that if a user is working on a document and the machine crashes for some reason, the file handle may not be released until the document is opened again later, which will cause problems.


For those in a similar situation (I assume the original asker has long since implemented a solution), I would suggest looking at rsync.

rsync.net's Windows Backup Agent does what is described in Method One and can also run as a service (see "Advanced Usage"). Although I'm not quite sure whether it supports bandwidth limiting...

Another (probably better) solution with bandwidth limiting is Duplicati. It also correctly backs up open or locked files. It uses SharpRSync, a managed implementation of rsync, for its back end. It's open source too, which is always a plus!

