How safe are fast collections when used with invalid iterators / indexes?

I do not see much information in the stdlib help system. For example, a dictionary says that certain methods (like remove) will override indexes, but that is.

For a language that calls itself "safe", it needs a solution for classic C ++ pedals:

  • get a pointer to an element in the vector, then add more elements (now the pointer is now invalid), now use the pointer, crash

  • start iterating through the collection. while iterating, delete some elements (before or after the current position of the iterator). continue iteration, crash.

(edit: in C ++, you are lucky to crash - worst case - memory corruption)

I find that 1 is solved quickly because if the collection stores classes, using a reference (like a strong pointer) for an element increases the conversion factor. However, I do not know the answer to 2.

It would be very useful if there was a comparison of pedals in C ++ that / were not resolved fast.

EDIT due to Rob's answer:

There seems to be some kind of undocumented snapshot-like behavior with a dictionary and / or for loops. The iteration creates a snapshot / blind carbon copy of it at startup.

Which gives me both a big "WAT" and "cool, so safe, I think," and "how expensive is this copy?"

I do not see this being registered either in the generator or in the for-loop.

The code below prints two logical snapshots of the dictionary. The first snapshot of userInfo , as it was at the beginning of the iteration loop, and does not reflect any changes made during the loop.

 var userInfo: [String: String] = [ "first_name" : "Andrei", "last_name" : "Puni", "job_title" : "Mad scientist" ] userInfo["added_one"] = "1" // can modify because it var print("first snapshot:") var hijacked = false for (key, value) in userInfo { if !hijacked { userInfo["added_two"] = "2" // doesn't error userInfo.removeValueForKey("first_name") // doesn't error hijacked = true } print("- \(key): \(value)") } userInfo["added_three"] = "3" // modify again print("final snapshot:") for (key, value) in userInfo { print("- \(key): \(value)") } 
+5
source share
1 answer

As you say, No. 1 is not a problem. You do not have a pointer to an object in Swift. You either have a value or a link to it. If you have a value, then this is a copy. If you have a link, it is protected. Therefore, there is no problem.

But consider the second and the experiment, be surprised, and then stop being surprised.

 var xs = [1,2,3,4] for x in xs { // (1) if x == 2 { xs.removeAll() // (2) } print(x) // Prints "1\n2\n3\n\4\n" } xs // [] (3) 

Wait for it to print all the values ​​when we blow away the values ​​in (2). Now we are very surprised.

But we should not be. Fast arrays are values. The value xs at (1) is the value. Nothing can change it. This is not a "pointer to memory, which includes an array structure containing 4 elements." This value is [1,2,3,4] . In (2) we do not "remove all elements from the marked xs thing". We take the xs is thing, create an array that occurs if you delete all the elements (which will be [] in all cases), and then assign this new xs array. Nothing bad happens.

So what does the documentation mean, "invalidates all indexes?" That is exactly so. If we generated indexes, they will no longer be useful. Let's get a look:

 var xs = [1,2,3,4] for i in xs.indices { if i == 2 { xs.removeAll() } print(xs[i]) // Prints "1\n2\n" and then CRASH!!! } 

Once xs.removeAll() is called, there is no promise that the old xs.indices result xs.indices nothing more. You are not allowed to use these indexes safely against the collection from which they came.

"Invalidates indices" in Swift does not match C ++ "invalidates iterators." I would call it pretty safe, except for the fact that using collection indexes is always a little dangerous, so you should avoid indexing collections when you can; iterate them instead. Even if you need indexes for any reason, use enumerate to get them without creating the danger of indexing.

(Side note, dict["key"] not indexed in dict . Dictionaries are a bit confusing because their key is not their index. Accessing dictionaries by DictionaryIndex just as dangerous as accessing arrays by their Int index.)

Also note that the above does not apply to NSArray . If you modify the NSArray when repeating it, you will get the error "mutated collection while iterating". I discuss only Swift data types.


EDIT: for-in very clear on how it works:

The generate () method is called in the collection expression to get the value of the generator type, that is, the type that conforms to the GeneratorType protocol. The program starts the execution of the loop by calling the next () method in the stream. If the return value is not None, it is assigned to the element template, the program executes the instructions, and then continues execution at the beginning of the loop. Otherwise, the program does not perform assignment or execution of statements, and it completes the execution of the for-in statement.

The returned Generator is a struct and contains the value of the collection. You did not expect any changes to any other value to change its behavior. Remember: [1,2,3] no different from 4 . They are both values. When you assign them, they make copies. Therefore, when you create a Generator over a collection value, you will take a snapshot of this value as if I created a generator at number 4. (This causes an interesting problem, because generators are not really values, and therefore really should not be structures. They should be classes. The stdlib fix fixed this. See, for example, the new AnyGenerator . But they still contain the value of the array, and you never expect changes to any other value of the array to work on them.)

See also "Structures and Enumerations are Value Types" for a more detailed discussion of the importance of value types in Swift. Arrays are just structures.

Yes, that means it logically copies. Swift has many optimizations to minimize actual copying when it is not needed. In your case, when you mutate a dictionary while it repeats, it will make a copy happen. Mutation is cheap if you are the only consumer of a particular storage of reserve value. But this is O (n) if you haven’t. (This is determined by the Swift isUniquelyReferenced() .) In short: Swift Collections is Copy-on-Write, and simply passing the array does not cause the memory to be allocated or copied.

You will not receive KOR for free. Your own structures are not COW. This is what Swift does in stdlib. (See Mike Ash for a great discussion on how you recreate it.) Passing your own custom structures results in real copies. However, most of the memory in most structures is stored in collections, and these collections are COW, so the cost of copying structures is usually quite small.

The book does not spend a lot of time rolling up value types in Swift (it explains everything, but just doesn't say “hey, and that's what it means”). On the other hand, it was a constant topic at WWDC. You may be interested, in particular, in Creating the best applications with type values ​​in Swift that relate to this topic. I believe Swift in Practice also discussed this.


EDIT2:

@KarlP brings up an interesting point in the comments below, and it's worth considering. None of the promises security issues we discuss are related to for-in . They are based on Array . for-in does not make promises at all about what happens if you mutate a collection while it repeats. That would not even make sense. for-in does not "iterate over collections", it calls next() on Generators . Therefore, if your Generator becomes undefined, if the collection is modified, then the for-in explode because the Generator explode.

This means that the following may be unsafe, depending on how strictly you read the specification:

 func nukeFromOrbit<C: RangeReplaceableCollectionType>(var xs: C) { var hijack = true for x in xs { if hijack { xs.removeAll() hijack = false } print(x) } } 

And the compiler will not help you. This will work well for all Swift collections. But if calling next() after a mutation for your collection is undefined behavior, then this is undefined behavior.

My opinion is that it would be bad if Swift created a collection that allows this Generator to become undefined in this case. You can even claim that you violated the Generator specification if you do (it does not offer UB "out" unless the generator has been copied or nil returned). Therefore, you can claim that the above code is completely within the specification, and your generator is damaged. These arguments are usually a bit messy with a “spec” like Swift, which does not plunge into all corner cases.

Does this mean that you can write unsafe code in Swift without getting a clear warning? Absolutely. But in many cases that usually cause errors in the real world, Swift's built-in behavior does the right thing. And in this, it is safer than some other options.

+7
source

All Articles