Why does std::filesystem provide so many non-member functions?

Question

Consider, for example, file_size. To get the size of a file, we write:

    std::filesystem::path p = std::filesystem::current_path();
    // ... usual "does this exist && is this a file" boilerplate
    auto n = std::filesystem::file_size(p);

Nothing wrong with that if it were plain ol' C, but having been taught that C++ is an OO language [I do know it is multi-paradigm, apologies to our language lawyers :-)], this just feels so ... imperative (shudder) to me, where I have come to expect the object-ish

 auto n = p.file_size();

instead. The same applies to other functions such as resize_file, remove_file and possibly more.

Do you know any rationale why Boost, and consequently std::filesystem, chose this imperative style over an object-oriented one? What is the benefit? Boost mentions the rule (at the very bottom) but gives no rationale for it.

I was thinking about inherent issues like the state of p after remove_file(p), or error flags (overloads with an additional argument), but neither approach solves these less elegantly than the other.

You can observe a similar pattern with iterators, where nowadays we can (are supposed to?) do begin(it) instead of it.begin(), but here I think the rationale was to be more in line with the non-modifying next(it) and such.

+8

c++ c++17 boost-filesystem std-filesystem

dlw Mar 27 '17 at 17:57


3 answers

The Filesystem library has a very clear separation between the filesystem::path type, which represents an abstract path name (one that need not even be the name of an existing file), and operations that access the actual physical file system, i.e. read and write data on disks.

You even pointed to the explanation for this:

The design rule is that purely lexical operations are provided as member functions of the class, and operations performed by the operating system are provided as free functions.

That's why.

Theoretically, you can use filesystem::path on a system with no disks at all. The path class simply holds a string of characters and lets you manipulate that string, convert between character sets, and apply some rules that define the structure of file names and path names on the host OS. For example, it knows that directory names are separated by / on POSIX systems and by \ on Windows. Manipulating the string held in a path is a "lexical operation" because it is nothing more than string manipulation.

The non-member functions, known as file system operations, are completely different. They do not just work with the abstract path object, which is only a string of characters; they perform actual I/O operations that access the file system (system calls such as stat, open, readdir, etc.). These operations take a path argument that names the files or directories to work on, and then they access the real files or directories. They do not just manipulate strings in memory.

These operations depend on the API the OS provides for accessing files, and on hardware that can fail in ways completely unlike in-memory string manipulation. Disks may be full, may be unplugged before an operation completes, or may have hardware faults.

Looked at this way, of course file_size is not a member of path, because it has nothing to do with the path itself. A path is just a representation of a file name, not of an actual file. The file_size function looks for a physical file with the given name and tries to read its size. That is not a property of the file name; it is a property of a persistent file on the file system, something that exists entirely separately from the string of characters in memory that holds the file's name.

In other words, I can have a path object that contains complete nonsense, such as filesystem::path p("hgkugkkgkuegakugnkunfkw"), and that is fine. I can append to that path, ask whether it has a root directory, etc. But I cannot read the size of such a file if it does not exist. I can also have a path to files that do exist but that I have no permission to access, for example filesystem::path p("/root/secret_admin_files.txt");, and that is fine too, because it is just a string of characters. I would only get a "permission denied" error when I tried to access something at that location using the file system operation functions.

Because path member functions never touch the file system, they can never fail due to permissions or nonexistent files. That is a useful guarantee.
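As a rough sketch of that split, assuming C++17 and that no file with the nonsense name exists in the current directory (the helper names lexical_demo and size_query_fails are made up for illustration):

```cpp
#include <filesystem>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Purely lexical: only edits the character string held by the path.
// Nothing is looked up on disk, so it works even if nothing with that name exists.
std::string lexical_demo(const fs::path& p) {
    fs::path q = p / "child.txt";   // append a component
    q.replace_extension(".log");    // rewrite the extension in memory
    return q.filename().string();   // last component as a std::string
}

// Operational: asks the OS about a real file, so it can fail.
// The error_code overload reports the failure instead of throwing.
bool size_query_fails(const fs::path& p) {
    std::error_code ec;
    const auto size = fs::file_size(p, ec);
    (void)size;                     // value is not meaningful when ec is set
    return static_cast<bool>(ec);
}
```

Calling lexical_demo("hgkugkkgkuegakugnkunfkw") succeeds regardless of what is on disk, while size_query_fails on the same path reports an error as long as no such file exists.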

You can observe a similar pattern with iterators, where nowadays we can (are supposed to?) do begin(it) instead of it.begin(), but here I think the rationale was to be more in line with the non-modifying next(it) and such.

No, that was because it works equally well with arrays (which cannot have member functions) and class types. If you know that the range you are dealing with is a container, not an array, then you can use x.begin(), but if you are writing generic code and do not know whether it will be a container or an array, then std::begin(x) works in both cases.
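A minimal sketch of such generic code (sum_all is a hypothetical name, and the element type is assumed to be int to keep it short):

```cpp
#include <iterator>
#include <numeric>
#include <vector>

// Works for built-in arrays and for containers alike, because
// std::begin/std::end are non-member functions with array overloads.
template <typename Range>
int sum_all(const Range& r) {
    return std::accumulate(std::begin(r), std::end(r), 0);
}
```

Writing r.begin() inside sum_all would compile for std::vector<int> but not for int[3].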

The reasons for both of these things (the file system design and the non-member range access functions) are not some anti-OO preference; they are far more sensible, practical reasons. It would have been poor design to base either of them on what feels better to people who like OO, or on what feels better to people who do not.

In addition, there are things you cannot do when everything is a member function:

    struct ConvertibleToPath {
      operator const std::filesystem::path& () const;
      // ...
    };
    ConvertibleToPath c;
    auto n = std::filesystem::file_size(c); // works fine

But if file_size were a member of path:

    c.file_size(); // wouldn't work
    static_cast<const std::filesystem::path&>(c).file_size(); // yay, feels object-ish!
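For anyone who wants to try this, here is one way the elided parts could be filled in. The stored member and the names_a_directory helper are assumptions for illustration, not part of the original snippet:

```cpp
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical wrapper that is not a path but converts to one.
struct ConvertibleToPath {
    fs::path stored;
    operator const fs::path&() const { return stored; }
};

// The free function accepts the wrapper directly: the implicit
// conversion to const fs::path& is applied to the argument.
bool names_a_directory(const ConvertibleToPath& c) {
    std::error_code ec;
    return fs::is_directory(c, ec);
}
```

The conversion is applied because the wrapper is a function argument; it is never considered for the object on the left of a member call, which is why c.file_size() could not work.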

+10

Jonathan Wakely Mar 27 '17 at 19:21


Several reasons (somewhat speculative, as I do not follow the standardization process very closely):

1. Because it is based on boost::filesystem, which is designed this way. Now you may ask "Why is boost::filesystem designed this way?", which would be a fair question, but given that it is, and that it has seen a lot of mileage the way it is, it was accepted into the standard with very few changes. So were some other Boost constructs (although sometimes with changes, mostly under the hood).
2. A general principle in class design is: "if a function does not need access to the class's protected/private members and can be written using the existing members, do not make it a member as well." Not everyone subscribes to this, but it seems the boost::filesystem designers do. See the discussion of (and an argument for) this in the context of std::string, a "monolithic" class with a zillion methods, by C++ luminary Herb Sutter, in "Guru of the Week" #84.
3. It was expected that C++17 might already have Uniform Call Syntax (see Bjarne Stroustrup's highly readable proposal). Had it been accepted into the standard, calling
```
 p.file_size(); 
```
would be equivalent to calling
```
 file_size(p); 
```
so you could choose whichever you prefer. Basically.

0

einpoklum Mar 27 '17 at 19:29


Nir Friedman · Accepted Answer · 2017-03-27T19:38:36+0000

There are some good answers already posted, but they do not get to the heart of the matter: all other things being equal, if you can implement something as a free, non-friend function, you always should.

Why?

Because free, non-friend functions do not have privileged access to state. Testing classes is much harder than testing functions, because you have to convince yourself that the class invariants are maintained no matter which member functions are called, or even which combinations of member functions. The more member and friend functions you have, the more work you have to do.

Free functions can be reasoned about and tested on their own. Since they have no privileged access to the state of the class, they cannot possibly violate any class invariant.
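To make that concrete, here is a small sketch with an invented class (SortedInts and contains are hypothetical and have nothing to do with the filesystem library). The invariant is that the stored values are always sorted:

```cpp
#include <algorithm>
#include <vector>

// Invariant: values_ is always sorted in ascending order.
class SortedInts {
public:
    // The only mutating member, so the only place the invariant must be checked.
    void insert(int v) {
        values_.insert(std::upper_bound(values_.begin(), values_.end(), v), v);
    }
    const std::vector<int>& values() const { return values_; }

private:
    std::vector<int> values_;
};

// Free, non-friend function: it sees only the public, read-only interface,
// so it cannot break the sortedness invariant no matter how it is written.
bool contains(const SortedInts& s, int v) {
    return std::binary_search(s.values().begin(), s.values().end(), v);
}
```

Only insert has to be audited against the invariant; contains can be tested in isolation.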

I do not know the details of which invariants path has and what privileged access it allows, but evidently they were able to implement much of the functionality as free functions, and they made the right choice in doing so.

Scott Meyers has a brilliant article on this topic, which gives an "algorithm" for deciding whether to make a function a member or not.

Here is Herb Sutter bemoaning the massive interface of std::string. Why? Because much of the string interface could have been implemented as free functions. That can occasionally be a little more cumbersome to use, but it is easier to test and reason about, improves encapsulation and modularity, opens up opportunities for code reuse that did not exist before, etc.
