Are there any obstacles in C ++ that prevent the use of D ranges?

Question

Are there any obstacles in C ++ that prevent the use of D ranges?

This is a C ++ / D intersection issue. The D language has ranges that, unlike C ++ libraries, such as Boost.Range, are not based on iterator pairs. The official C ++ Range Study Group seems to be bogged down in solving technical problems.

Question : Does the current C ++ 11 or the upcoming C ++ 14 standard have any obstacles preventing the use of D ranges, as well as a suitable version of <algorithm> - wholesale?

I don't know D or its ranges well enough, but they seem lazy and complex, and are also capable of providing a superset of STL algorithms. Given their claim to success for D, it would be nice to have a library for C ++. I wonder how important D unique functions are (for example, string mixins, syntax of uniform functions) for implementing its ranges and whether C ++ can simulate this without much effort (for example, C ++ 14 constexpr seems very similar to the compilation time of the D evaluation function)

Note. I am looking for technical answers, not opinions on whether D ranges match the correct design as a C ++ library.

+17

c ++ c ++ 11 range c ++ 14 d

TemplateRex Aug 31 '13 at 15:29

source share

2 answers

My C ++ 11 knowledge is much more limited than I would like, so there may be newer functions that improve things that I don't know yet, but there are three areas that I can think of at the moment, which at least least problematic: template constraints, static if and introspection type.

In D, a range-based function typically has a pattern constraint indicating what type of ranges it accepts (e.g., direct range and random access range). For example, here is a simplified signature for std.algorithm.sort :

 auto sort(alias less = "a < b", Range)(Range r) if(isRandomAccessRange!Range && hasSlicing!Range && hasLength!Range) {...}

It checks that the type being passed is a random access range that can be sliced, and that it has a length property. Any type that does not satisfy these requirements will not be compiled with sort , and when the template constraint fails, it makes it clear to the programmer why their type will not work with sort (and not just gives an unpleasant compiler error from the middle of the template function when it does not compile with this type).

Now, although this may seem like an improvement in usability just to just give a compilation error when sort failed to compile because the type does not have the correct operations, it actually has a big impact on function overloading, as well as the type of introspection. For example, here are two of the std.algorithm.find overloads:

 R find(alias pred = "a == b", R, E)(R haystack, E needle) if(isInputRange!R && is(typeof(binaryFun!pred(haystack.front, needle)) : bool)) {...} R1 find(alias pred = "a == b", R1, R2)(R1 haystack, R2 needle) if(isForwardRange!R1 && isForwardRange!R2 && is(typeof(binaryFun!pred(haystack.front, needle.front)) : bool) && !isRandomAccessRange!R1) {...}

The first accepts a needle, which represents only one element, while the second accepts a needle, which represents a front range. These two can have different types of parameters, based solely on the limitations of the template, and can have very different code inside. Without any restrictions on the template, you cannot have template functions that are overloaded by the attributes of their arguments (as opposed to overloading by the specific types themselves), which makes it much harder (if not impossible) to have different implementations based on the range genre used ( e.g. input range and forward range) or other attributes of the types used. Some work in this area is done in C ++ with concepts and similar ideas, but AFAIK, C ++ is still seriously absent in the functions needed to overload templates (be it template functions or template types) based on the attributes of their argument types, and what specializes in specific types of arguments (as happens with specialized specialization).

A related function would be static if . This is the same as if , except that its condition is evaluated at compile time and whether it determines true or false , which branch is compiled, and not which branch is started. It allows you to deploy code based on conditions known at compile time. eg.

 static if(isDynamicArray!T) {} else {}

or

 static if(isRandomAccessRange!Range) {} else static if(isBidirectionalRange!Range) {} else static if(isForwardRange!Range) {} else static if(isInputRange!Range) {} else static assert(0, Range.stringof ~ " is not a valid range!");

static if can to some extent eliminate the need for template restrictions, since you can essentially overload overloaded templates into a single function. eg.

 R find(alias pred = "a == b", R, E)(R haystack, E needle) { static if(isInputRange!R && is(typeof(binaryFun!pred(haystack.front, needle)) : bool)) {...} else static if(isForwardRange!R1 && isForwardRange!R2 && is(typeof(binaryFun!pred(haystack.front, needle.front)) : bool) && !isRandomAccessRange!R1) {...} }

but this still leads to more unpleasant errors when compilation fails and actually makes it impossible for you to overload the template (at least using the D implementation), because the overload is determined before the template is instantiated. So you can use static if to specialize parts of the template implementation, but it doesn’t quite suit you which template constraints will force you to not need template constraints (or something like that).

Rather, static if great for creating things like specializing only part of your implementation of a function, or for creating it so that the range type can correctly inherit the attributes of the type of range it wraps. For example, if you call std.algorithm.map in an array of integers, the resulting range may have a cut (because the range of the source code), whereas if you called map in a range that did not have a cut (for example, the ranges returned by std.algorithm.filter , cannot have slices), then the resulting ranges will not chop. For this, map uses static if to compile in opSlice only if the range of sources supports it. Currently, map code that looks like this:

 static if (hasSlicing!R) { static if (is(typeof(_input[ulong.max .. ulong.max]))) private alias opSlice_t = ulong; else private alias opSlice_t = uint; static if (hasLength!R) { auto opSlice(opSlice_t low, opSlice_t high) { return typeof(this)(_input[low .. high]); } } else static if (is(typeof(_input[opSlice_t.max .. $]))) { struct DollarToken{} enum opDollar = DollarToken.init; auto opSlice(opSlice_t low, DollarToken) { return typeof(this)(_input[low .. $]); } auto opSlice(opSlice_t low, opSlice_t high) { return this[low .. $].take(high - low); } } }

This is the code in determining the type of the returned type map , and whether this code is compiled or not depends entirely on the results of static if s, none of which can be replaced by template specialists based on specific types without the need to write a new specialized template for map for each new type that you use with it (which is obviously not acceptable). To compile code based on type attributes, rather than specific types, you really need something like static if (which C ++ currently doesn't have).

The third important element that is missing in C ++ (and which I have more or less touched on) is type introspection. The fact that you can do something like is(typeof(binaryFun!pred(haystack.front, needle)) : bool) or isForwardRange!Range is critical. Without the ability to check whether a particular type has a specific set of attributes or that a particular piece of code is compiling, you cannot even write conditions, template constraints, and static if . For example, std.range.isInputRange looks something like this:

 template isInputRange(R) { enum bool isInputRange = is(typeof( { R r = void; // can define a range object if (r.empty) {} // can test for empty r.popFront(); // can invoke popFront() auto h = r.front; // can get the front of the range })); }

It verifies that a particular piece of code is compiling for a given type. If so, then this type can be used as an input range. If this is not so, then this is not possible. AFAIK, in C ++ it is impossible to do something even vaguely similar. But in order to intelligently implement ranges, you really need to be able to do things like 2 →, or check if a particular type compiles with sort - is(typeof(sort(myRange))) . Without this, you cannot specialize implementations based on what types of operations support a certain range, you cannot properly redirect the attributes of a range when it is transferred (and range functions constantly transfer their arguments to new ranges), and you cannot even properly protect your function from compiling with types that won't work with it. And, of course, the results of static if and template constraints also affect type introspection (since they affect what will and will not compile), so the three functions are very interconnected.

Indeed, the main reasons why ranges do not work very well in C ++ are some reasons why metaprogramming in C ++ is primitive compared to metaprogramming in D. AFAIK, there is no reason that these features (or similar ones) could not be added to C ++ and fix the problem, but as long as C ++ does not have metaprogramming capabilities similar to those of D, ranges in C ++ will be seriously violated.

Other functions, such as mixins and Uniform Function Call Syntax, would also help, but they are nowhere near the fundamental ones. Mikshinda will primarily help reduce code duplication, and UFCS will help in the first place so that the common code can simply call all functions as if they were member functions, so if the type defines a particular function (for example, find ), then will be used instead of the more general version of the free function (and the code still works if such a member function is not declared, because then the free function is used). UFCS is fundamentally not required, and you could go in the opposite direction and use free functions for everything (for example, C ++ 11 with begin and end ), although for this you need, in fact, to require free functions to be able to verify the existence member function, then call the member function inside, and not use your own implementations. So again, you need type introspection with static if and / or template constraints.

As much as I like ranges, at the moment I have largely abandoned trying to do anything with them in C ++, because functions to make them reasonable just don't exist. But if other people can figure out how to do this, the more power they have. Regardless of the ranges, I would like to see C ++ features, such as pattern restrictions, static if and the type of introspection, because without them metaprogramming is less pleasant, to the extent that while I do this all the time in D, I almost never I do not do this in C ++.

+8

Jonathan M Davis Aug 31 '13 at 21:04

source share

Dietmar Kühl · Accepted Answer · 2013-08-31 22:38

I do not think that in C ++ there are any inherent technical limitations that would not allow us to define a system of D-style ranges and the corresponding algorithms in C ++. The biggest problem at the language level would be that for -loops based C ++ level requires begin() and end() be used in ranges, but assuming we will use the length of the library definition using ranges D-style, expanding the for -loops range to solve these problems seems insignificant.

The main technical problem that I encountered when experimenting with algorithms in D-style ranges in C ++ was that I could not make the algorithms as fast as my iterators (in fact, based on the cursor). Of course, this can only be an implementation of my algorithm, but I have not seen anyone provide a reasonable set of algorithms based on the D ++ style range in C ++ with which I could profile. Performance is important, and the C ++ standard library should provide at least poor performance implementations of the algorithms (a general implementation of an algorithm is called poor performance if it is at least as fast when applied to a data structure as a user implementation of the same algorithm using the same data structure using the same programming language). I could not create weakly efficient algorithms based on D-style ranges, and my goal is actually highly efficient algorithms (similar to weakly efficient, but allowing any programming language and assuming only the same basic equipment).

When experimenting with D-style-based algorithms, I found that algorithms are much more difficult to implement than iterator-based algorithms, and found it necessary to deal with the codes in order to circumvent some of their limitations. Of course, not everything in the current algorithm, as indicated in C ++, is also ideal. A rough outline of how I want to change the algorithms and abstractions that they work with can be STL 2.0 . However, this page does not have much to do with ranges, as it is a related, but somewhat different topic. I would rather assume that ranges based on an iterator (well, really a cursor) than D-style ranges, but that wasn't the question.

One technical problem: all range abstractions in C ++ do face deal with temporary objects in a reasonable way. For example, consider this expression:

 auto result = ranges::unique(ranges::sort(std::vector<int>{ read_integers() }));

Depending on whether ranges::sort() or ranges::unique() lazy or not, you need to consider the representation of the time range. A simple representation of the source range is not an option for any of these algorithms, since the temporary object will be deleted at the end of the expression. One possibility may be to move the range if it is included as the value of r, requiring a different result for ranges::sort() and ranges::unique() to distinguish between the cases of the actual argument of both the temporary object and the object which was saved independently. D does not have this particular problem, because it is garbage collection, and therefore the original range will be supported anyway.

The above example also shows one of the problems with a possibly lazy evaluation algorithm: since any type, including types that cannot be otherwise written, can be inferred based on auto variables or template functions, there is nothing lazy evaluation at the end of the expression . Thus, results from expression patterns can be obtained, and the algorithm is not actually executed. That is, if the l-value is passed to the algorithm, you need to make sure that the expression is actually evaluated to obtain the actual effect. For example, any sort() algorithm that mutates the entire sequence explicitly makes the mutation in place (if you want the version not to do this in place, just copy the container and apply the version in place, if you only have it in place, you don’t you can avoid extra sequence, which can be an immediate problem, for example, for giant sequences). Assuming he's lazy somehow, accessing the original l-value sequence provides a peak in the current state, which is almost certainly bad. This may mean that lazy evaluation of mutation algorithms is not such a great idea.

In any case, there are some aspects of C ++ that make it impossible to immediately accept D-sytle ranges, although the same considerations apply to other range abstractions. I think that these considerations, therefore, are somewhat beyond the scope of the question. In addition, the obvious “solution” to the first problem (add garbage collection) is unlikely to happen. I don’t know if there is a solution to the second problem in D. There may be a solution to the second problem (a pre-occupied auto operator), but I don’t know a specific proposal or how such a function will really look like.

BTW, the Ranges research team is not really bogged down in any technical details. So far, we have simply tried to find out what problems we are actually trying to solve, and to some extent expand the space of solutions. In addition, groups do not do any work at all! Actual work is always done by people, often very few people. Since most of the work is actually developing a set of abstractions, I would expect that the basics of any results of the Range Study Group would be from 1 to 3 people who have some vision of what is needed and how it should look.

Are there any obstacles in C ++ that prevent the use of D ranges?

More articles: