Immutable objects that reference each other?

Today I was trying to wrap my head around immutable objects that reference each other. I came to the conclusion that you cannot do it without lazy evaluation, but in the process I wrote this (in my opinion) interesting code.

    public class A
    {
        public string Name { get; private set; }
        public B B { get; private set; }

        public A()
        {
            B = new B(this);
            Name = "test";
        }
    }

    public class B
    {
        public A A { get; private set; }

        public B(A a)
        {
            // a.Name is null here
            A = a;
        }
    }

What I find interesting is that I cannot think of another way to observe an object of type A in a state that is not yet fully constructed, and that includes using threads. Why is this even valid? Are there other ways to observe the state of an object that is not fully constructed?

+75
immutability c#
Oct 05 2018-11-12T00:
8 answers

Why is this even valid?

Why do you expect it to be invalid?

Because a constructor is supposed to guarantee that the code it contains is executed before outside code can observe the state of the object.

Correct. But the compiler is not responsible for maintaining that invariant. You are. If you write code that breaks that invariant, and it hurts when you do that, then stop doing that.

Are there other ways to observe the state of an object that is not fully constructed?

Sure. For reference types, all of them involve somehow passing "this" out of the constructor, obviously, since the only user code that holds a reference to the storage is the constructor. Some of the ways the constructor can leak "this" are:

  • Put "this" in a static field and read it from another thread.
  • Make a method call or constructor call and pass "this" as an argument.
  • Make a virtual call. This is particularly nasty if the virtual method is overridden by a derived class, because the override then runs before the derived class's ctor body runs.
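The third case is the easiest one to hit by accident, so here is a minimal sketch of it. The type names `Base` and `Derived` and the members `Observed` and `Describe` are made up for illustration; they are not from the question.

```csharp
using System;

public class Base
{
    // Records what the virtual call returned while Base() was running.
    public string Observed { get; }

    public Base()
    {
        // Virtual call from a constructor: it dispatches to the most
        // derived override, whose constructor body has not run yet.
        Observed = Describe();
    }

    protected virtual string Describe() => "base";
}

public class Derived : Base
{
    private readonly string name;

    public Derived()
    {
        // Runs only after Base() has already called Describe().
        name = "derived";
    }

    // Sees 'name' as null when called from Base().
    protected override string Describe() => name ?? "(name not yet assigned)";
}
```

With these definitions, `new Derived().Observed` is `"(name not yet assigned)"`: the override ran against a `Derived` whose constructor body had not executed.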

I said that the only user code that holds a reference is the ctor, but of course the garbage collector also holds a reference. Therefore, another interesting way in which an object can be observed in a half-constructed state is if the object has a destructor and the constructor throws an exception (or gets an asynchronous exception such as a thread abort; more on that later). In that case the object is about to be dead and therefore needs to be finalized, but the finalizer thread can see the half-initialized state of the object. And now we are back in user code that can see the half-constructed object!

Destructors are required to be robust in the face of this scenario. A destructor must not depend on any invariant of the object set up by the constructor being maintained, because the object being destroyed might never have been fully constructed.
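A defensive destructor might look roughly like the sketch below. `NativeBuffer` is a hypothetical type invented for this example; the point is only that the destructor checks whether the constructor got far enough to acquire anything before releasing it.

```csharp
using System;
using System.Runtime.InteropServices;

public sealed class NativeBuffer
{
    // Stays IntPtr.Zero if the constructor throws before allocating.
    private IntPtr handle;

    public NativeBuffer(int size)
    {
        if (size <= 0)
            throw new ArgumentOutOfRangeException(nameof(size));

        handle = Marshal.AllocHGlobal(size);
    }

    ~NativeBuffer()
    {
        // The object is eligible for finalization even when the constructor
        // threw, so do not assume the allocation ever happened.
        if (handle != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(handle);
            handle = IntPtr.Zero;
        }
    }
}
```

In real code a `SafeHandle` is usually a better fit than a hand-written destructor; the sketch is only meant to show the "do not trust the constructor's invariants" rule.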

Another crazy way that a half-constructed object could be observed by outside code is, of course, if the destructor sees the half-initialized object in the scenario above and then copies a reference to that object to a static field, thereby ensuring that the half-constructed, half-finalized object is rescued from death. Please do not do that. Like I said, if it hurts, don't do it.

If you are in the constructor of a value type then things are basically the same, but there are some small differences in the mechanism. The language requires that a constructor call on a value type creates a temporary variable that only the ctor has access to, mutates that variable, and then does a struct copy of the mutated value to the actual storage. That ensures that if the constructor throws, the final storage is not left in a half-mutated state.

Note that since struct copies are not guaranteed to be atomic, it is possible for another thread to see the storage in a half-mutated state; use locks correctly if you are in that situation. Also, an asynchronous exception such as a thread abort can be thrown halfway through a struct copy. These non-atomicity problems arise regardless of whether the copy is from a ctor temporary or is a "regular" copy. In general, very few invariants are maintained if there are asynchronous exceptions.

In practice, the C# compiler will optimize away the temporary allocation and copy if it can determine that there is no way for that scenario to arise. For example, if the new value is initializing a local that is not closed over by a lambda and is not in an iterator block, then S s = new S(123); just mutates s directly.
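Here is a small sketch of the case where the temporary matters. `Pair` and `StructCopyDemo` are hypothetical names made up for the example; the constructor deliberately throws after it has set one field.

```csharp
using System;

public struct Pair
{
    public int X;
    public int Y;

    public Pair(int x, int y, bool failHalfway)
    {
        X = x;
        if (failHalfway)
            throw new InvalidOperationException("failed after setting X");
        Y = y;
    }
}

public static class StructCopyDemo
{
    public static Pair Run()
    {
        Pair p = new Pair(1, 2, false);
        try
        {
            // The constructor works on a temporary; the copy into p
            // never happens because the constructor throws first.
            p = new Pair(10, 20, true);
        }
        catch (InvalidOperationException)
        {
            // Swallowed on purpose so the caller can inspect p.
        }
        return p;
    }
}
```

`StructCopyDemo.Run()` returns the original value, with `X == 1` and `Y == 2`; the half-mutated `X == 10` never becomes visible in `p`.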

For more information on how value type constructors work, see:

Debunking another myth about value types

And for more information on how the C# language semantics try to save you from yourself, see:

Why Do Initializers Run In The Opposite Order As Constructors? Part One

Why Do Initializers Run In The Opposite Order As Constructors? Part Two

I seem to have strayed from the topic at hand. In a struct you can of course observe an object to be half-constructed in the same ways: copy the half-constructed object to a static field, call a method with "this" as an argument, and so on. (Obviously, calling a virtual method on a more derived type is not a problem with structs.) And, as I said, the copy from the temporary to the final storage is not atomic, and therefore another thread can observe the half-copied struct.




Now let's consider the root cause of your question: how do you make immutable objects that reference each other?

Typically, as you've discovered, you don't. If you have two immutable objects that reference each other then logically they form a directed cyclic graph. You might consider simply building an immutable directed graph! Doing so is quite easy. An immutable directed graph consists of:

  • An immutable list of immutable nodes, each of which contains a value.
  • An immutable list of immutable node pairs, each of which has the start and end point of a graph edge.

Now the way you make nodes A and B "reference" each other is:

    A = new Node("A");
    B = new Node("B");
    G = Graph.Empty.AddNode(A).AddNode(B).AddEdge(A, B).AddEdge(B, A);

And you're done: you've got a graph where A and B "reference" each other.

The problem, of course, is that you cannot get to B from A without having G in hand. Having that extra level of indirection might be unacceptable.
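For concreteness, the `Node` and `Graph` types used above could be sketched as follows. This is one possible shape, not a definitive implementation: it assumes `System.Collections.Immutable` is available, and the `Successors` helper is an addition of mine to show how you get from A to B through G.

```csharp
using System.Collections.Generic;
using System.Collections.Immutable;
using System.Linq;

public sealed class Node
{
    public string Value { get; }
    public Node(string value) { Value = value; }
}

public sealed class Graph
{
    public static readonly Graph Empty = new Graph(
        ImmutableList<Node>.Empty,
        ImmutableList<(Node From, Node To)>.Empty);

    public ImmutableList<Node> Nodes { get; }
    public ImmutableList<(Node From, Node To)> Edges { get; }

    private Graph(ImmutableList<Node> nodes,
                  ImmutableList<(Node From, Node To)> edges)
    {
        Nodes = nodes;
        Edges = edges;
    }

    // Each "mutation" returns a new graph; the old one is unchanged.
    public Graph AddNode(Node node) => new Graph(Nodes.Add(node), Edges);

    public Graph AddEdge(Node from, Node to) =>
        new Graph(Nodes, Edges.Add((from, to)));

    // Hypothetical helper: the nodes reachable from 'node' by one edge.
    public IEnumerable<Node> Successors(Node node) =>
        Edges.Where(e => ReferenceEquals(e.From, node)).Select(e => e.To);
}
```

With the graph built as above, `G.Successors(A)` yields B and `G.Successors(B)` yields A, and neither node was ever observed half-constructed. The cost is exactly the indirection mentioned: every lookup goes through G.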

+105
Oct 05 2018-11-11T00:

Yes, this is the only way for two immutable objects to refer to each other: at least one of them must see the other in a not-fully-constructed state.

It's generally a bad idea to let this escape from your constructor, but in cases where you're confident of what both constructors do, and it's the only alternative to mutability, I don't think it's too bad.

+47
Oct 05 2018-11-12T00:

"Fully constructed" is defined by your code, not by the language.

This is a variation on calling a virtual method from the constructor; the general guideline is: don't do that.

To correctly implement the notion of "fully constructed", don't pass this out of your constructor.

+22
05 Oct 2018-11-12T00:

Indeed, leaking the this reference out during the constructor will allow you to do this; it may cause problems if methods get invoked on the incomplete object, obviously. As for "other ways to observe the state of an object that is not fully constructed":

  • Invoke a virtual method in a constructor; the subclass constructor will not have been called yet, so an override may try to access incomplete state (fields declared or initialized in the subclass, etc.).
  • Reflection, perhaps using FormatterServices.GetUninitializedObject (which creates an object without calling the constructor at all).
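A sketch of the second bullet follows. Note one substitution: it calls `RuntimeHelpers.GetUninitializedObject` rather than `FormatterServices.GetUninitializedObject`, on the assumption that you are on a recent .NET where the `FormatterServices` version is marked obsolete; both allocate the object without running any constructor. `Person` and `UninitializedDemo` are hypothetical names.

```csharp
using System;
using System.Runtime.CompilerServices;

public sealed class Person
{
    public string Name { get; }

    public Person()
    {
        Name = "test";
    }
}

public static class UninitializedDemo
{
    // Allocates a zero-initialized Person; Person() is never invoked.
    public static Person CreateWithoutConstructor() =>
        (Person)RuntimeHelpers.GetUninitializedObject(typeof(Person));
}
```

`UninitializedDemo.CreateWithoutConstructor().Name` is null, even though no code path through the constructor could ever leave it that way.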
+8
Oct 05 2018-11-12T00:

If you consider the initialization order

  • Derived static fields
  • Derived static constructor
  • Derived instance fields
  • Base static fields
  • Base static constructor
  • Base instance fields
  • Base instance constructor
  • Derived instance constructor

it is clear that through up-casting you can access the class before the derived instance constructor is called. (This is the reason you should not call virtual methods from constructors: an override can easily access derived fields that the derived constructor has not initialized yet, or run before the derived constructor has brought the derived class into a "consistent" state.)
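The order above can be checked with a small sketch that records each step as it happens. `InitLog`, `BaseType`, and `DerivedType` are hypothetical names; the explicit static constructors are there so that type initialization happens at a precisely defined point.

```csharp
using System.Collections.Generic;

public static class InitLog
{
    public static readonly List<string> Steps = new List<string>();

    // Returns a dummy value so it can be used in field initializers.
    public static int Record(string step)
    {
        Steps.Add(step);
        return 0;
    }
}

public class BaseType
{
    public static readonly int BaseStatic = InitLog.Record("base static field");
    public readonly int BaseInstance = InitLog.Record("base instance field");

    static BaseType() { InitLog.Record("base static constructor"); }

    public BaseType() { InitLog.Record("base instance constructor"); }
}

public class DerivedType : BaseType
{
    public static readonly int DerivedStatic = InitLog.Record("derived static field");
    public readonly int DerivedInstance = InitLog.Record("derived instance field");

    static DerivedType() { InitLog.Record("derived static constructor"); }

    public DerivedType() { InitLog.Record("derived instance constructor"); }
}
```

After the first `new DerivedType()` in a process, `InitLog.Steps` holds the eight entries in the same order as the list above, with the derived instance constructor last.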

+6
Oct 05 '11 at 12:53

You can avoid this problem by instantiating B last in your constructor:

    public A()
    {
        Name = "test";
        B = new B(this);
    }

If what you are suggesting were not possible, then A would not be immutable.

Edit: fixed, thanks to leppie.
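To see the effect of the reordering, here is a sketch of the question's two classes with one extra member, `NameSeenDuringConstruction`, which is hypothetical and exists only to record what B's constructor observed.

```csharp
public sealed class A
{
    public string Name { get; }
    public B B { get; }

    public A()
    {
        Name = "test";     // assign state first
        B = new B(this);   // let 'this' escape only as the last step
    }
}

public sealed class B
{
    public A A { get; }

    // Hypothetical: captures what a.Name looked like inside B's constructor.
    public string NameSeenDuringConstruction { get; }

    public B(A a)
    {
        A = a;
        NameSeenDuringConstruction = a.Name;
    }
}
```

Here `new A().B.NameSeenDuringConstruction` is `"test"`. Note that B's constructor still sees an A whose B property is null, so at least one of the two objects is always observed incomplete.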

+4
Oct 05 2018-11-12T00:

The principle is: don't let your this object escape from the constructor body.

Another way to observe such a problem is to call virtual methods inside the constructor.

+3
Oct 05 '11 at 12:53

As noted, the compiler has no means of knowing at what point an object has been constructed well enough to be useful; it therefore assumes that a programmer who passes this out of a constructor will know whether the object has been constructed well enough to satisfy his needs.

I would add, however, that for objects which are intended to be truly immutable, one must avoid passing this to any code which will examine the state of a field before it has been assigned its final value. This implies that this must not be passed to arbitrary outside code, but it does not imply that there is anything wrong with having an object under construction pass itself to another object for the purpose of storing a back-reference which will not actually be used until after the first constructor has completed.
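Applied to the classes from the question, that rule might look like the sketch below: A passes itself to B, but B's constructor only stores the reference and never reads through it. The get-only properties are my choice here, to make both types immutable after construction.

```csharp
public sealed class A
{
    public string Name { get; }
    public B B { get; }

    public A()
    {
        // 'this' escapes before Name is assigned, which is harmless only
        // because B's constructor does not look at any of A's state.
        B = new B(this);
        Name = "test";
    }
}

public sealed class B
{
    public A A { get; }

    public B(A a)
    {
        // Store the back-reference; do not dereference 'a' here.
        A = a;
    }
}
```

Once `new A()` returns, `a.B.A` is the same object as `a` and `a.B.A.Name` is `"test"`; no code ever observed a field before it received its final value. The safety rests entirely on B's constructor keeping that promise, which the compiler does not check.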

If one were designing a language to facilitate the construction and use of immutable objects, it might be helpful for it to let methods be declared as usable only during construction, only after construction, or either; fields could be declared as non-dereferenceable during construction and read-only afterward; parameters could likewise be tagged to indicate that they should be non-dereferenceable. Under such a system, it would be possible for a compiler to allow the construction of data structures which referred to each other, but where no property could ever change after it was observed. As to whether the benefits of such static checking would outweigh the cost, I'm not sure, but it might be interesting.

Incidentally, a related feature which would be helpful would be the ability to declare parameters and function returns as ephemeral, returnable, or (the default) persistable. If a parameter or function return were declared ephemeral, it could not be copied to any field nor passed as a persistable parameter to any method. Additionally, passing an ephemeral or returnable value as a returnable parameter to a method would cause the return value of the function to inherit the restrictions of that value (if a function has two returnable parameters, its return value would inherit the more restrictive constraint from its parameters). A major weakness of Java and .NET is that all object references are promiscuous; once outside code gets its hands on one, there's no telling who may end up with it. If parameters could be declared ephemeral, it would more often be possible for code which held the only reference to something to know that it held the only reference, and thus avoid needless defensive copy operations. Additionally, things like closures could be recycled if the compiler could know that no references to them existed after they returned.

+1
Jul 25 2018-12-12T00:


