Why are file handles such an expensive resource?

In the holy wars over whether garbage collection is a good thing, people often point out that it doesn't handle things like freeing file handles. Putting that logic in a finalizer is considered a bad idea because the resource then gets freed non-deterministically. However, it seems like a simple solution would be for the OS to just make sure that many, many file handles are available, so that they are a cheap and abundant resource and you can afford to waste a few at any given time. Why is this not done in practice?

+6
garbage-collection file operating-system
6 answers

Closing a file also flushes the writes to disk - from your application's perspective, anyway. After the file is closed, the application can crash, and as long as the system itself doesn't fail, the changes will not be lost. So it is not a good idea to let the GC close files at its leisure, even if that may be technically possible these days.
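A minimal C sketch of that point, assuming a writable file named data.txt (the file name is only an example): the writes sit in a user-space buffer until fclose() flushes them, so deferring the close also defers the moment the data reaches the OS.

  #include <stdio.h>

  int main(void)
  {
      FILE *f = fopen("data.txt", "w");   /* example file name */
      if (f == NULL) { perror("fopen"); return 1; }

      fputs("important record\n", f);     /* buffered in user space */

      /* fclose() flushes the buffer to the OS and releases the handle.
         If the application crashes after this point, the data survives
         as long as the system itself stays up. */
      if (fclose(f) != 0) { perror("fclose"); return 1; }
      return 0;
  }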

Besides, quite frankly, old habits die hard. File handles used to be expensive and, for historical reasons, are still regarded that way.

+2

In practice it cannot be done, because the OS would then have to allocate a lot more memory overhead to keep track of which handles are in use by the various processes. In the C example below I will demonstrate a simple OS process structure, stored in a circular queue, as an illustration...

  struct ProcessRecord {
    int ProcessId;
    CPURegs cpuRegs;
    TaskPointer **children;
    int *baseMemAddress;
    int sizeOfStack;
    int sizeOfHeap;
    int *baseHeapAddress;
    int granularity;
    int time;
    enum State { Running, Runnable, Zombie /* ... */ } state;
    /* ... a few more fields here ... */
    long *fileHandles;      /* offsets into the OS file table */
    long fileHandlesCount;  /* number of handles this process holds */
  } proc;

Think of fileHandles as a pointer to an array of integers, each of which holds the location (possibly in an encoded form) of an offset into the OS table that records where the files live on disk.

Now imagine how much memory that would eat up, how much it would slow down the whole kernel, and how it could lead to instability: the whole concept of a "multitasking" system would fall apart under the need to track how many file handles are in use and to provide a mechanism for dynamically growing and shrinking that array of integers - which in turn could have the knock-on effect of slowing the user program down whenever the OS has to hand out file handles on demand for every file the program wants to work with.
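As a rough, purely hypothetical sketch (grow_handle_table and its parameters are made up for illustration and do not come from any real kernel), the dynamic grow/shrink mechanism for that per-process array might look like this:

  #include <stdio.h>
  #include <stdlib.h>

  /* Hypothetical sketch: grow a process's file-handle array on demand.
     Returns 0 on success, -1 if the allocation fails. */
  static int grow_handle_table(long **fileHandles, long *capacity, long newCapacity)
  {
      long *bigger = realloc(*fileHandles, newCapacity * sizeof **fileHandles);
      if (bigger == NULL)
          return -1;            /* out of memory inside the kernel */
      *fileHandles = bigger;
      *capacity = newCapacity;
      return 0;
  }

  int main(void)
  {
      long *handles = NULL;
      long capacity = 0;
      if (grow_handle_table(&handles, &capacity, 16) == 0)
          printf("table now has room for %ld handles\n", capacity);
      free(handles);
      return 0;
  }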

Hope this helps you understand why this is neither implemented nor practical.

Hope this makes sense. Best regards, Tom.

+5

It is not just the number of file handles; when they are used in certain modes, they can prevent other callers from accessing the same file.
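The classic case is Windows' mandatory share modes, but even on POSIX systems a held handle can keep other (cooperating) callers out. A minimal sketch, assuming the path /tmp/example.lock is usable:

  #include <fcntl.h>
  #include <stdio.h>
  #include <sys/file.h>
  #include <unistd.h>

  int main(void)
  {
      int fd = open("/tmp/example.lock", O_CREAT | O_RDWR, 0644);
      if (fd < 0) { perror("open"); return 1; }

      /* While this exclusive advisory lock is held, other cooperating
         processes that also use flock() will block or fail here. */
      if (flock(fd, LOCK_EX | LOCK_NB) < 0) {
          perror("flock");      /* someone else already holds the lock */
          return 1;
      }

      /* ... do work while other lockers are kept out ... */

      flock(fd, LOCK_UN);       /* releasing promptly matters */
      close(fd);
      return 0;
  }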

+2

I'm sure more complete answers will follow, but based on my limited experience and understanding of how Windows works underneath, file handles (the structures the OS uses to represent them) are kernel objects, and as such they require a specific type of memory to be available - not to mention the processing the kernel has to do to maintain consistency and coherence among multiple processes requesting access to the same resources (e.g. files).
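One visible consequence is that operating systems cap how many of these kernel objects a single process may hold. A small POSIX sketch (just an illustration, not anyone's production code) that prints the per-process descriptor limit:

  #include <stdio.h>
  #include <sys/resource.h>

  int main(void)
  {
      struct rlimit rl;

      /* RLIMIT_NOFILE is the per-process cap on open file descriptors. */
      if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
          perror("getrlimit");
          return 1;
      }
      printf("soft limit: %llu, hard limit: %llu\n",
             (unsigned long long)rl.rlim_cur,
             (unsigned long long)rl.rlim_max);
      return 0;
  }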

+2

I don't think they are necessarily expensive - if your application only holds a few unneeded files open, it won't kill the system, just as leaking a handful of strings in C++ goes unnoticed unless someone looks very carefully. Where it does become a problem:

  • if you leak hundreds or thousands of them (the sketch after this list shows how quickly a leaking loop hits the per-process limit)
  • if keeping the file open prevents other operations on that file (other applications may be unable to open or delete it)
  • it is a sign of sloppiness - if your program cannot keep track of what it owns and is using, or has stopped using, what other problems is it going to have? Sometimes a small leak turns into a big one when something small changes or the user does something slightly differently than before.
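
A minimal POSIX sketch of the first point above, assuming /dev/null can be opened repeatedly (the file name and the loop are illustrative only): it leaks descriptors on purpose until the OS refuses to hand out more.

  #include <errno.h>
  #include <fcntl.h>
  #include <stdio.h>
  #include <string.h>

  int main(void)
  {
      long leaked = 0;

      /* Deliberately never call close(): every iteration leaks a descriptor. */
      for (;;) {
          int fd = open("/dev/null", O_RDONLY);
          if (fd < 0) {
              /* Typically fails with EMFILE once the per-process limit is hit. */
              printf("open failed after %ld leaked descriptors: %s\n",
                     leaked, strerror(errno));
              return 0;
          }
          leaked++;
      }
  }
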
+1

In the Linux paradigm, sockets are file descriptors. There are definite advantages to freeing up TCP ports as soon as possible.
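A small sketch of that point (port 8080 is only an example): the socket is nothing more than an ordinary file descriptor, and closing it promptly is what lets the kernel start releasing the port.

  #include <netinet/in.h>
  #include <stdio.h>
  #include <string.h>
  #include <sys/socket.h>
  #include <unistd.h>

  int main(void)
  {
      /* A socket is just another file descriptor. */
      int fd = socket(AF_INET, SOCK_STREAM, 0);
      if (fd < 0) { perror("socket"); return 1; }

      struct sockaddr_in addr;
      memset(&addr, 0, sizeof addr);
      addr.sin_family = AF_INET;
      addr.sin_addr.s_addr = htonl(INADDR_ANY);
      addr.sin_port = htons(8080);        /* example port */

      if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
          perror("bind");
          close(fd);
          return 1;
      }

      /* Closing promptly lets the kernel begin releasing the port
         (subject to TIME_WAIT on previously connected sockets). */
      close(fd);
      return 0;
  }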

0
