Windows file system API for querying large files

I have an HDD (say 1 TB) with FAT32 and NTFS partitions. I have no information about which files are stored on it, but when necessary I want to quickly locate large files, say larger than 500 MB. I do not want to scan the entire hard drive, as that takes a lot of time; I need quick results. I was wondering whether there are any NTFS/FAT32 APIs that I can call directly. If the file systems keep some metadata about the files they store, querying that would be faster. I want to write my program in C++ and C#.

EDIT: If scanning the hard drive is the only option, what can I do to ensure maximum performance? For example, I could skip system folders, since I am only interested in user data.

+4
4 answers

If you're willing to target Vista and later, you can use the Windows Search indexer API.

If you look here, you can find information about the search index. The index records each file's size, so it can do what you want.
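As a rough illustration of what such a query might look like, the sketch below only builds the SQL text for Windows Search. `buildLargeFileQuery` is a hypothetical helper name; `SystemIndex`, `System.Size` and `System.ItemPathDisplay` are the catalog and property names Windows Search exposes. Actually running the query (typically through the OLE DB provider `Search.CollatorDSO`) is Windows-specific and is not shown here. Note also that the index only covers locations the indexer is configured to include.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: returns a Windows Search SQL statement that lists
// indexed files larger than minBytes, biggest first.
std::string buildLargeFileQuery(std::uint64_t minBytes) {
    return "SELECT System.ItemPathDisplay, System.Size FROM SystemIndex "
           "WHERE System.Size > " + std::to_string(minBytes) +
           " ORDER BY System.Size DESC";
}
```

For the 500 MB threshold from the question, the call would be `buildLargeFileQuery(500ull * 1024 * 1024)`.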

+3

Impossible. Neither file system maintains a list of large files that you could query directly. You will have to recurse through every folder and check the size of each file to find the ones you consider large.
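A minimal sketch of such a recursive scan, using the portable C++17 `std::filesystem` library rather than any Windows-specific API. The names `LargeFile`, `findLargeFiles` and `skipDirNames` are made up for this example. The skip list addresses the idea from the question's edit: directories whose name matches an entry are not descended into. The name comparison here is case-sensitive, which is an assumption you may want to change on Windows.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

struct LargeFile {
    fs::path path;
    std::uintmax_t size;
};

// Walks `root` recursively and returns regular files of at least `minBytes`.
// Directories named in `skipDirNames` (e.g. "Windows") are pruned.
// Unreadable directories and files are silently skipped.
std::vector<LargeFile> findLargeFiles(const fs::path& root,
                                      std::uintmax_t minBytes,
                                      const std::vector<fs::path>& skipDirNames = {}) {
    std::vector<LargeFile> result;
    std::error_code ec;
    fs::recursive_directory_iterator it(
        root, fs::directory_options::skip_permission_denied, ec);
    const fs::recursive_directory_iterator end;

    for (; !ec && it != end; it.increment(ec)) {
        std::error_code e;
        if (it->is_directory(e)) {
            const fs::path name = it->path().filename();
            if (std::find(skipDirNames.begin(), skipDirNames.end(), name) !=
                skipDirNames.end()) {
                it.disable_recursion_pending();  // do not enter this folder
            }
            continue;
        }
        if (it->is_regular_file(e)) {
            const std::uintmax_t size = it->file_size(e);
            if (!e && size >= minBytes) {
                result.push_back({it->path(), size});
            }
        }
    }
    return result;
}
```

A call such as `findLargeFiles("D:\\", 500ull * 1024 * 1024, {"Windows", "Program Files"})` would then report only files of 500 MB or more outside those folders. The scan still touches every remaining directory, so it is thorough rather than fast.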

+2

Your only hope is to tap into a file indexer; otherwise you will have to iterate over all the files. Depending on your machine, you can tap into the native Microsoft indexer (searchindexer.exe) or, if you have Google Desktop Search installed, you can tap into that.

A possible method for tapping into the Microsoft indexer

+2

If you are willing to do a lot of extra work yourself to speed up the process, you might be able to do something. Much will depend on what you need.

Let's start with FAT32. FAT (in general, not only the 32-bit version) is named for the file allocation table. This is a block of data near the beginning of the partition that records which clusters in the partition belong to which files. The FAT is organized essentially as linked lists of clusters. If you just want to find the data areas of large files, you can read the FAT as a series of raw sectors and scan it for linked lists longer than X clusters (where X is the lower limit for what you consider a large file). You can then go to those clusters and look at the actual data of each file. What you don't know at that point, however, is the file's name. File names are stored in directories, which are mostly like files except that they contain fixed-size records in a specified format. To find the file names, you have to start from the root directory and walk the directory tree.
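The chain-scanning idea can be sketched as follows, assuming the FAT has already been read from the raw volume and decoded into 32-bit little-endian entries. `FatChain` and `findLongChains` are names invented for this example. The special values used are the standard FAT32 ones: only the low 28 bits of an entry are significant, 0 marks a free cluster, `0x0FFFFFF7` a bad cluster, and `0x0FFFFFF8` or above the end of a chain. A chain start is any allocated cluster that no other entry points to.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct FatChain {
    std::uint32_t firstCluster;
    std::uint32_t clusterCount;
};

// Returns every cluster chain of at least `minClusters` clusters.
// `fat` holds one decoded FAT32 entry per cluster (entries 0 and 1 are reserved).
std::vector<FatChain> findLongChains(const std::vector<std::uint32_t>& fat,
                                     std::uint32_t minClusters) {
    constexpr std::uint32_t kEntryMask = 0x0FFFFFFF;
    constexpr std::uint32_t kBadCluster = 0x0FFFFFF7;
    constexpr std::uint32_t kEndOfChainMin = 0x0FFFFFF8;
    const std::size_t n = fat.size();

    // Pass 1: mark clusters that some other entry links to.
    std::vector<bool> hasPredecessor(n, false);
    for (std::size_t c = 2; c < n; ++c) {
        const std::uint32_t next = fat[c] & kEntryMask;
        if (next >= 2 && next < kBadCluster && next < n) hasPredecessor[next] = true;
    }

    // Pass 2: walk each chain from its first cluster and count its length.
    std::vector<FatChain> chains;
    for (std::size_t start = 2; start < n; ++start) {
        const std::uint32_t first = fat[start] & kEntryMask;
        if (first == 0 || first == kBadCluster || hasPredecessor[start]) continue;

        std::uint32_t count = 0;
        std::size_t cur = start;
        while (cur >= 2 && cur < n && count < n) {  // count < n guards against loops
            const std::uint32_t next = fat[cur] & kEntryMask;
            if (next == 0 || next == kBadCluster) break;  // corrupt chain
            ++count;
            if (next >= kEndOfChainMin) break;  // normal end of chain
            cur = next;
        }
        if (count >= minClusters) {
            chains.push_back({static_cast<std::uint32_t>(start), count});
        }
    }
    return chains;
}
```

With 32 KB clusters, for instance, a 500 MB file occupies 16,000 clusters, so that would be the value of X to pass as `minClusters`. The cluster count gives only an upper bound on the file size, and directories are stored as chains too, so the results still have to be matched against directory entries.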

NTFS is both simpler and more complex. An NTFS volume has a master file table (MFT) that contains an entry for every file in the partition. The good point is that you can read the MFT and get information about all the files on the disk without having to chase through the directory tree. The bad point is that decoding the contents of an NTFS partition is decidedly non-trivial. Reading the data is fairly difficult, and writing it is much more difficult. In addition, recent versions of Windows have added more restrictions on raw reads from disk partitions, so depending on which partition you need, you may not be able to access the data at all.

Neither of these, however, is supported more than minimally. To do it, you open a file with the name "\\.\D:" (where D is the letter of the drive you care about). You can then read raw sectors from that drive (assuming the open succeeds). This lets you see the raw data for the entire disk (or partition, as the case may be), starting from the boot sector and continuing through everything else (FAT, root directory, subdirectories, and so on, all as sectors of raw data). The system will let you read the raw data, but making sense of that data is 100% your responsibility. If the speed you asked for is an absolute must, this may be a possibility, but it will take a lot of work for FAT volumes and considerably more for NTFS. Unless you need the extra speed as badly as you implied, you probably should not even attempt it.
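To give an idea of what "making sense of the data" involves, here is a sketch of the very first step: decoding the 512-byte boot sector once it has been read. Reading it on Windows would normally mean calling `CreateFile` on the `\\.\D:` name and `ReadFile` in whole-sector units, usually with administrator rights; that part is not shown. `BootSectorInfo`, `VolumeKind`, `readLe` and `parseBootSector` are names invented for this example, and the FAT32 test (16-bit FAT size zero, 32-bit FAT size non-zero) is a simplified heuristic rather than the full detection rule.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

enum class VolumeKind { Unknown, Fat32, Ntfs };

struct BootSectorInfo {
    VolumeKind kind = VolumeKind::Unknown;
    std::uint16_t bytesPerSector = 0;
    std::uint8_t sectorsPerCluster = 0;  // raw byte from offset 0x0D
    std::uint64_t fatStartSector = 0;    // FAT32 only: first sector of the FAT
    std::uint32_t sectorsPerFat = 0;     // FAT32 only
    std::uint32_t rootDirCluster = 0;    // FAT32 only
    std::uint64_t mftStartCluster = 0;   // NTFS only: cluster number of the MFT
};

// Reads a little-endian unsigned integer of `bytes` bytes starting at p.
static std::uint64_t readLe(const std::uint8_t* p, std::size_t bytes) {
    std::uint64_t v = 0;
    for (std::size_t i = bytes; i-- > 0;) v = (v << 8) | p[i];
    return v;
}

BootSectorInfo parseBootSector(const std::array<std::uint8_t, 512>& s) {
    BootSectorInfo info;
    if (s[510] != 0x55 || s[511] != 0xAA) return info;  // no boot signature

    info.bytesPerSector = static_cast<std::uint16_t>(readLe(&s[0x0B], 2));
    info.sectorsPerCluster = s[0x0D];

    if (std::memcmp(&s[0x03], "NTFS    ", 8) == 0) {
        info.kind = VolumeKind::Ntfs;
        info.mftStartCluster = readLe(&s[0x30], 8);
    } else if (readLe(&s[0x16], 2) == 0 && readLe(&s[0x24], 4) != 0) {
        info.kind = VolumeKind::Fat32;
        info.fatStartSector = readLe(&s[0x0E], 2);  // reserved sector count
        info.sectorsPerFat = static_cast<std::uint32_t>(readLe(&s[0x24], 4));
        info.rootDirCluster = static_cast<std::uint32_t>(readLe(&s[0x2C], 4));
    }
    return info;
}
```

On a FAT32 volume, `fatStartSector` and `sectorsPerFat` say which sectors to read to obtain the FAT itself; on NTFS, `mftStartCluster` says where the MFT begins. Everything after that point (directory records, MFT entries) has to be decoded by hand in the same way.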

+2