Would C# benefit from aggregate structs/classes?

Foreword

tl;dr: This is a discussion.

I know that this "question" is more of a discussion, so I will mark it as community wiki. However, according to the How to Ask page, it may belong here: it is specifically related to programming, I could not find it discussed anywhere on the web after an hour of research, and it is relevant to most C# programmers and on topic. Moreover, the question itself is meant to get an answer, to which I will remain open regardless of my bias: could C# benefit from aggregate structs? Despite this foreword, I would understand if it gets closed, but I would appreciate it if users with the authority and intent to close it redirected me to an appropriate place on the web for this discussion.


Introduction


Downsides of struct mutability

Structs are flexible but much-debated types in C#. They offer the organizational paradigm of a value type with stack allocation, but not the immutability of other value types.

Some say that structs should represent values, and values do not change (for example, in int i = 5;, the 5 is immutable), while others see them as OOP layouts with subfields.

The debate about struct immutability (1, 2, 3), where the current solution seems to be leaving it to the programmer to enforce immutability, is also unresolved.

For example, the C# compiler detects possible data loss when structs are accessed as if by reference (at the bottom of this page) and restricts the assignment. Moreover, since struct constructors, properties and methods can perform any operation, subject only to the requirement (for constructors) that all fields be assigned before control returns, structs cannot be declared as const, which would be a valid declaration if they were limited to representing data.


An immutable subset of structs: aggregates

Aggregate classes (Wikipedia) are strict data structures with limited functionality, intended to offer syntactic sugar in exchange for their lack of flexibility. In C++, they have "no user-declared constructors, no private or protected non-static data members, no base classes, and no virtual functions." The theoretical specifics of such classes in C# are open to discussion here, although the basic concept remains the same.

Since aggregate structs are strictly data holders with labeled accessors, their immutability (in a possible C# context) would be ensured. Aggregates also could not be null unless the nullable operator (?) is specified, as with other pure value types. For these reasons, many struct operations that are currently illegal would become possible, along with some syntactic sugar.


Uses


  • Aggregates could be declared as const, because their constructor would only perform field assignments.
  • Aggregates could be used as default values for method parameters.
  • Aggregates could be implicitly sequential, facilitating interaction with native libraries.
  • Aggregates would be immutable, so reference-like access could not cause data loss. When the compiler detects such a subfield modification, it could turn it into a complete, implicit reassignment.
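
For comparison, here is a rough sketch of how close current C# (7.2 or later) gets to the first two points without any new kind of type. The names Physics and ResolveGravity are made up for illustration; Vector and Gravity mirror the examples used later in this post. It only approximates the proposal: a static readonly field is not a compile-time constant, and the default is resolved at run time.

```csharp
using System;

// A read-only struct: the compiler rejects field writes outside the constructor.
public readonly struct Vector
{
    public readonly double X, Y, Z;

    public Vector(double x, double y, double z) { X = x; Y = y; Z = z; }
}

// Hypothetical holder class, for illustration only.
public static class Physics
{
    // "const Vector" does not compile today; "static readonly" is the usual stand-in.
    public static readonly Vector Gravity = new Vector(0, -9.8, 0);

    // A struct parameter can only default to default(T) or new T(),
    // so a nullable parameter is used to pick a non-zero default at run time.
    public static Vector ResolveGravity(Vector? gravity = null)
    {
        return gravity ?? Gravity;
    }
}
```

Calling Physics.ResolveGravity() with no argument falls back to Physics.Gravity, which is the behaviour the proposed "Vector gravity = Foo.Gravity" default parameter would give directly.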

Hypothetical syntax


Borrowing from C++ syntax, we could imagine something like the following. (Remember, this is a community wiki; improvements are welcome and encouraged.)

    aggregate Size
    {
        int Width;
        int Height;
    }

    aggregate Vector
    {
        // Default values for constructor.
        double X = 0, Y = 0, Z = 0;
    }

    aggregate Color
    {
        byte R, G, B, A = 255;
    }

    aggregate Bar
    {
        int X;
        Qux Qux;
    }

    aggregate Qux
    {
        int X, Y;
    }

    static class Foo
    {
        // Constant is possible.
        const Size Big = new Size(200, 100);

        // Inline constructor.
        const Vector Gravity = { 0, -9.8, 0 };

        // Default value / labeled parameter.
        const Color Fuschia = { 255, 0, 255 };
        const Vector Up = { y: 1 };

        // Sub-aggregate initialization
        const Bar Test = { 20, { 4, 3 } };

        static void SetVelocity(Vector velocity = { 0, 1, 0 }) { ... }
        static void SetGravity(Vector gravity = Foo.Gravity) { ... }

        static void Main()
        {
            Vector v = { 1, 2, 3 };
            double y = v.Y; // Valid.
            v.Y = 5;        // Invalid, immutable.
        }
    }

Implicit (re)assignment

Today, assigning to a struct subfield in C# 4.0 is legal:

    Vector v = new Vector(1, 2, 3);
    v.Z = 5; // Legal in current C#.

However, the compiler can sometimes detect when structs are mistakenly accessed as if they were references, and it forbids changing their subfields. For example (example question):

    // (in a Windows.Forms context)
    control.Size.Width = 20; // Illegal in current C#.

Since Size is a property and struct Size is a value type, we would be editing a copy of the actual property value, which would be useless in this case. As C# users, we tend to assume that most things are accessed by reference, especially in OOP designs, which makes us think that such a call is legal (and it would be if struct Size were a class).

In addition, when accessing collections, the compiler also forbids us from modifying a struct subfield (example question):

    List<Vector> vectors = ... // Imagine populated data.
    vectors[4].Y = 10; // Illegal in current C#.

The good news about these unfortunate restrictions is that the compiler already does half of a possible solution for such cases: it detects when they occur. The other half would be to implicitly reassign a new aggregate with the changed value.

  • In local scope, simply reassign the variable.
  • In external scope, locate the get and, if a matching, accessible set is available, reassign through it.

For this to be done, and to avoid confusion, the aggregate must be marked as implicit:

    implicit aggregate Vector { ... }
    implicit aggregate Size { ... }

    // Example 1
    {
        Vector v = new Vector(1, 2, 3);
        v.Z = 5; // Legal with implicit aggregates.

        // What is implicitly done:
        v = new Vector(v.X, v.Y, 5); // Local variable, simply reassign.
    }

    // Example 2
    {
        // (in a Windows.Forms context)
        control.Size.Width = 20; // Legal with implicit aggregates.

        // What is implicitly done:
        Size old = control.Size.__get(); // External, MSIL detects a get.
        // If MSIL can find a matching, accessible __set:
        control.Size.__set({ 20, old.Height });
    }

    // Example 3
    {
        List<Vector> vectors = ... // Imagine populated data.
        vectors[4].Y = 10; // Legal with implicit aggregates.

        // What is implicitly done:
        Vector old = vectors[4].__get(); // External, MSIL detects a get.
        // If MSIL can find a matching, accessible __set:
        vectors[4].__set({ old.X, 10, old.Z });
    }

    // Example 4
    {
        Vector The5thVector(List<Vector> vectors)
        {
            return vectors[4];
        }

        ...

        List<Vector> vectors = ...;
        The5thVector(vectors).Y = 10; // Illegal with implicit aggregates.

        // This is illegal because the compiler cannot find an implicit
        // "set" to match, as it is a function return, not a property or
        // indexer.
    }

Of course, this last implicit reassignment is just a syntactic simplification that may or may not be adopted. I only suggest it because the compiler seems able to detect such reference-like accesses to structs and could easily convert the code for the programmer if the type were an aggregate.
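
The get-then-set expansion described above can already be written by hand in current C#. The sketch below is only an illustration of that manual pattern, not of the proposed feature: Control and its Position property are hypothetical stand-ins (no Windows.Forms dependency), and ReassignDemo and its methods are invented names.

```csharp
using System.Collections.Generic;

// Plain mutable struct, as in the "current C#" examples of this post.
public struct Vector
{
    public double X, Y, Z;

    public Vector(double x, double y, double z) { X = x; Y = y; Z = z; }
}

// Hypothetical stand-in for a control that exposes a struct-typed property.
public class Control
{
    public Vector Position { get; set; }
}

public static class ReassignDemo
{
    public static double SetListElementY(List<Vector> vectors, int index, double y)
    {
        // vectors[index].Y = y;  // error CS1612: the indexer returns a copy
        Vector old = vectors[index];                   // get
        vectors[index] = new Vector(old.X, y, old.Z);  // set
        return vectors[index].Y;
    }

    public static double SetControlX(Control control, double x)
    {
        // control.Position.X = x;  // error CS1612: the property returns a copy
        Vector old = control.Position;                   // get
        control.Position = new Vector(x, old.Y, old.Z);  // set
        return control.Position.X;
    }
}
```

Both methods spell out the copy, rebuild and write-back steps that an implicit aggregate would generate. As in Example 4, there is nothing to write back to when the value comes from a plain method return.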


Summary

  • Aggregates may have fields;
  • Aggregates are value types;
  • Aggregates are immutable;
  • Aggregates are allocated on the stack;
  • Aggregates cannot inherit;
  • Aggregates have a sequential layout;
  • Aggregates have a default sequential constructor;
  • Aggregates cannot have user-defined constructors;
  • Aggregates may have default values and labeled constructions;
  • Aggregates can be defined inline;
  • Aggregates may be declared as const;
  • Aggregates can be used as default parameters;
  • Aggregates are non-nullable unless specified (?);

Maybe:

  • Aggregates (may be) implicitly reassigned; see the answer and comment by Marcelo Cantos.
  • Aggregates (may) have interfaces;
  • Aggregates (may) have methods;

Cons

Since aggregates would not replace structs but rather be another organizational scheme, I cannot find many cons, but I hope the C# veterans of S/O will be able to fill out this CW section. As a last note, please answer the question directly as well as discussing it: would C# benefit from aggregate classes as described in this post? I am by no means a C# expert, only a C# enthusiast who misses this feature, which seems important to me. I am seeking advice and feedback from seasoned programmers on this matter. I know that numerous workarounds exist, and I actively use them every day; I just think they are too common to be ignored.

+7
3 answers

I wish structs had been defined with something like your proposed semantics in the first place.

However, we are stuck with what we have now, and I think we are unlikely to ever get a completely new "kind of type" in the CLR. Introducing a new kind of type means implementing it in every .NET language, not just C#, and that is a big change.

I think it is more likely that we will find a way to make better annotations and enforcement for both classes and structs. (Remember, when I speculate about hypothetical language features for hypothetical, unannounced future products that do not exist and may never exist, I do so for entertainment purposes only.) The compiler could get better both at enforcing immutability and at making it easier to program in an immutable style, regardless of whether the type in question is a value type or a reference type. And the compiler or the CLR could also potentially do a better job of optimizing code that runs on multi-core machines if more immutability guarantees were known at compile time or JIT time.

While you are mulling over your proposal, an interesting question you might want to consider is this: if aggregate types have methods, is "this" a value or a variable? For example:

    aggregate Vector
    {
        int x, y, z;

        public void M(Action action)
        {
            Console.WriteLine(this.x);
            action();
            Console.WriteLine(this.x);
        }
    }

    ...

    Vector v = new Vector(1, 2, 3);
    Action action = () => { v = new Vector(4, 5, 6); };
    v.M(action);

What happens? Is "this" passed to M by value, in which case it writes out "1" twice, or is it passed as a reference to a variable, in which case the so-called "immutable" type is observed to mutate? (What is mutating is the variable; by definition, variables are allowed to mutate, which is why they are called "variables".)
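
For reference, the same experiment can be run against today's structs. The sketch below uses a readonly struct as a stand-in for the hypothetical aggregate and returns the two reads instead of printing them; ThisDemo and Run are invented names. In current C#, "this" in a struct method refers to the caller's variable, so the second read sees the reassignment.

```csharp
using System;

// Stand-in for the hypothetical aggregate: every field is read-only.
public readonly struct Vector
{
    public readonly int X, Y, Z;

    public Vector(int x, int y, int z) { X = x; Y = y; Z = z; }

    // Reads X, runs the callback, then reads X again.
    public (int Before, int After) M(Action action)
    {
        int before = this.X;
        action();
        int after = this.X;
        return (before, after);
    }
}

// Hypothetical driver class, for illustration only.
public static class ThisDemo
{
    public static (int Before, int After) Run()
    {
        Vector v = new Vector(1, 2, 3);
        // The lambda replaces the whole variable while M is still running on it.
        Action action = () => { v = new Vector(4, 5, 6); };
        return v.M(action);
    }
}
```

ThisDemo.Run() yields Before = 1 and After = 4: the type's fields are never written after construction, yet the method observes a change because the variable it was called on was reassigned.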

+4

What will this do?

    List<Vector> vectors = ...;
    Vector v = vectors[4];
    v.Y = 10;

Or this?

    Vector The5thVector(List<Vector> vectors)
    {
        return vectors[4];
    }

    ...

    List<Vector> vectors = ...;
    The5thVector(vectors).Y = 10;

Replacing a diagnostic with an implicit assignment will not get you very far. There are reasons why mutable structs are so problematic, and simply declaring a new concept, aggregates, will not eliminate any of those problems.

The best solution would have been to ban mutable structs from the language in the first place. The second-best solution is to act as if they were banned. Structs should be small and self-contained, which removes any drawback to making them immutable.

+3

No, it would not do any good. Structs are better as mutable types.

First of all, "immutability with implicit reassignment" is really just "inefficient mutability."

Given a "Point" struct, if you intend to change only the value of X, why force a rewrite of the struct's entire memory? Rewriting only X is more efficient than rewriting X with a new value and pointlessly rewriting Y with its current value. There would be no benefit to such a scheme.

Honestly, the whole topic of mutability is a matter of perspective. It really only makes sense to talk about mutability when referring to a complex object as a whole and asking whether its individual parts change while references to the object as a whole are held.

For example, it makes sense to call a string immutable because you refer to it as a specific block of memory representing a collection of characters, and those characters do not change from the perspective of anything that holds a reference to it. On the other hand, an int struct is mutable, because its value can be changed by simple assignment, and any references (pointers) to the int struct will see those changes.

As for "this" in struct or aggregate methods, it should always refer to the memory location of the struct/aggregate on the stack, so updates through anonymous methods and delegates that change the struct's value should be reflected and visible as mutations. In summary, mutability is a good idea at the fundamental level of a variable, and immutability is best handled at a higher level, where complex objects are represented and "immutable" behavior is coded explicitly.

+1