Why is iterating through an array of structs MUCH faster than iterating through an array of classes?

I am developing a game in Swift, and I have a static array of positional data that I process in the game loop. I originally stored this data in an array of structs, but I decided to switch to classes so that I could use references. However, after making the change and profiling, I noticed that the CPU spends much more time in the method that processes this data than it did with structs.

So, I decided to create a simple test to find out what was going on.

    final class SomeClass {}
    struct SomeStruct {}

    let classes = [
        SomeClass(), SomeClass(), SomeClass(), SomeClass(), SomeClass(),
        SomeClass(), SomeClass(), SomeClass(), SomeClass(), SomeClass(),
        SomeClass(), SomeClass(), SomeClass(), SomeClass(), SomeClass(),
        SomeClass(), SomeClass(), SomeClass(), SomeClass(), SomeClass(),
    ]

    let structs = [
        SomeStruct(), SomeStruct(), SomeStruct(), SomeStruct(), SomeStruct(),
        SomeStruct(), SomeStruct(), SomeStruct(), SomeStruct(), SomeStruct(),
        SomeStruct(), SomeStruct(), SomeStruct(), SomeStruct(), SomeStruct(),
        SomeStruct(), SomeStruct(), SomeStruct(), SomeStruct(), SomeStruct(),
    ]

    func test1() {
        for i in 0...10000000 {
            for s in classes {}
        }
    }

    func test2() {
        for i in 0...10000000 {
            for s in structs {}
        }
    }

test1 takes 15.4722579717636 s, while test2 takes only 0.276068031787872 s. Iterating repeatedly through the array of structs was 56 times faster. So my question is: why is this? I am looking for a detailed answer. If I had to guess, I would say that the structs themselves are stored sequentially in memory, while classes are stored only as addresses, so they have to be dereferenced each time. But then again, don't structs have to be copied each time?

Side note: both arrays are small, but I iterate over them repeatedly. If I change the code to iterate only once over very large arrays, for example:
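The layout guess above can be checked directly. This is a minimal sketch in current Swift (not the Swift 1.x dialect used in the question), and the type names PointStruct and PointClass are hypothetical. It only compares how much space one array element occupies; it does not by itself explain the timing difference.

```swift
// Hypothetical types for illustration only.
struct PointStruct {
    var x: Int
    var y: Int
}

final class PointClass {
    var x: Int
    var y: Int
    init(x: Int, y: Int) { self.x = x; self.y = y }
}

// Stride is the distance in bytes between consecutive array elements.
// An array of structs stores the two Ints inline, one element after another.
let structStride = MemoryLayout<PointStruct>.stride

// An array of class instances stores only one reference per element;
// the instance itself lives elsewhere on the heap.
let classStride = MemoryLayout<PointClass>.stride

print("struct element stride:", structStride)
print("class element stride:", classStride)
```

On the copying question: a struct made only of plain integers can be copied as raw bytes with no reference counting involved, whereas copying a class reference normally means updating the object's reference count.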

    for i in 0...10000000 {
        structs.append(SomeStruct())
        classes.append(SomeClass())
    }

    func test1() {
        for s in classes {}
    }

    func test2() {
        for s in structs {}
    }

Then I get the following: test1 takes 0.841085016727448 s, and test2 takes 0.00960797071456909 s. Structs are 88 times faster.

I am building for OS X, and the optimization level is set to Fastest, Smallest [-Os].


Edit

As requested, I have edited this question to include a test in which the structs and classes are no longer empty. They use the same properties that I use in my game. The result has not changed: structs are still much faster, and I don't know why. I hope someone can provide an answer.

    import Foundation

    final class StructTest {
        let surfaceFrames = [
            SurfaceFrame(a: SurfacePoint(x: 0, y: 410), b: SurfacePoint(x: 0, y: 400), c: SurfacePoint(x: 875, y: 410), surfaceID: 0, dynamic:false),
            SurfaceFrame(a: SurfacePoint(x: 880, y: 304), b: SurfacePoint(x: 880, y: 294), c: SurfacePoint(x: 962, y: 304), surfaceID: 1, dynamic:false),
            SurfaceFrame(a: SurfacePoint(x: 787, y: 138), b: SurfacePoint(x: 791, y: 129), c: SurfacePoint(x: 1031, y: 248), surfaceID: 2, dynamic:false),
            SurfaceFrame(a: SurfacePoint(x: 523, y: 138), b: SurfacePoint(x: 523, y: 128), c: SurfacePoint(x: 806, y: 144), surfaceID: 3, dynamic:false),
            SurfaceFrame(a: SurfacePoint(x: 1020, y: 243), b: SurfacePoint(x: 1020, y: 233), c: SurfacePoint(x: 1607, y: 241), surfaceID: 4, dynamic:false),
            SurfaceFrame(a: SurfacePoint(x: 1649, y: 304), b: SurfacePoint(x: 1649, y: 294), c: SurfacePoint(x: 1731, y: 305), surfaceID: 5, dynamic:false),
            SurfaceFrame(a: SurfacePoint(x: 1599, y: 240), b: SurfacePoint(x: 1595, y: 231), c: SurfacePoint(x: 1852, y: 128), surfaceID: 6, dynamic:false),
            SurfaceFrame(a: SurfacePoint(x: 1807, y: 141), b: SurfacePoint(x: 1807, y: 131), c: SurfacePoint(x: 2082, y: 138), surfaceID: 7, dynamic:false),
            SurfaceFrame(a: SurfacePoint(x: 976, y: 413), b: SurfacePoint(x: 976, y: 403), c: SurfacePoint(x: 1643, y: 411), surfaceID: 8, dynamic:false),
            SurfaceFrame(a: SurfacePoint(x: 1732, y: 410), b: SurfacePoint(x: 1732, y: 400), c: SurfacePoint(x: 2557, y: 410), surfaceID: 9, dynamic:false),
            SurfaceFrame(a: SurfacePoint(x: 2130, y: 490), b: SurfacePoint(x: 2138, y: 498), c: SurfacePoint(x: 2109, y: 512), surfaceID: 10, dynamic:false),
            SurfaceFrame(a: SurfacePoint(x: 1598, y: 828), b: SurfacePoint(x: 1597, y: 818), c: SurfacePoint(x: 1826, y: 823), surfaceID: 11, dynamic:false),
            SurfaceFrame(a: SurfacePoint(x: 715, y: 826), b: SurfacePoint(x: 715, y: 816), c: SurfacePoint(x: 953, y: 826), surfaceID: 12, dynamic:false),
            SurfaceFrame(a: SurfacePoint(x: 840, y: 943), b: SurfacePoint(x: 840, y: 933), c: SurfacePoint(x: 920, y: 943), surfaceID: 13, dynamic:false),
            SurfaceFrame(a: SurfacePoint(x: 1005, y: 1011), b: SurfacePoint(x: 1005, y: 1001), c: SurfacePoint(x: 1558, y: 1011), surfaceID: 14, dynamic:false),
            SurfaceFrame(a: SurfacePoint(x: 1639, y: 943), b: SurfacePoint(x: 1639, y: 933), c: SurfacePoint(x: 1722, y: 942), surfaceID: 15, dynamic:false),
            SurfaceFrame(a: SurfacePoint(x: 1589, y: 825), b: SurfacePoint(x: 1589, y: 815), c: SurfacePoint(x: 1829, y: 825), surfaceID: 16, dynamic:false),
            SurfaceFrame(a: SurfacePoint(x: 0, y: 0), b: SurfacePoint(x: 1, y: 1), c: SurfacePoint(x: 2, y: 2), surfaceID: 17, dynamic:true)
        ]

        func run() {
            let startTime = CFAbsoluteTimeGetCurrent()
            for i in 0 ... 10000000 {
                for s in surfaceFrames {
                }
            }
            let timeElapsed = CFAbsoluteTimeGetCurrent() - startTime
            println("Time elapsed \(timeElapsed) s")
        }
    }

    struct SurfacePoint {
        var x,y: Int
    }

    struct SurfaceFrame {
        let a,b,c :SurfacePoint
        let surfaceID: Int
        let dynamic: Bool
    }


    import Foundation

    final class ClassTest {
        let surfaceFrames = [
            SurfaceFrame(a: SurfacePoint(x: 0, y: 410), b: SurfacePoint(x: 0, y: 400), c: SurfacePoint(x: 875, y: 410), surfaceID: 0, dynamic:false),
            SurfaceFrame(a: SurfacePoint(x: 880, y: 304), b: SurfacePoint(x: 880, y: 294), c: SurfacePoint(x: 962, y: 304), surfaceID: 1, dynamic:false),
            SurfaceFrame(a: SurfacePoint(x: 787, y: 138), b: SurfacePoint(x: 791, y: 129), c: SurfacePoint(x: 1031, y: 248), surfaceID: 2, dynamic:false),
            SurfaceFrame(a: SurfacePoint(x: 523, y: 138), b: SurfacePoint(x: 523, y: 128), c: SurfacePoint(x: 806, y: 144), surfaceID: 3, dynamic:false),
            SurfaceFrame(a: SurfacePoint(x: 1020, y: 243), b: SurfacePoint(x: 1020, y: 233), c: SurfacePoint(x: 1607, y: 241), surfaceID: 4, dynamic:false),
            SurfaceFrame(a: SurfacePoint(x: 1649, y: 304), b: SurfacePoint(x: 1649, y: 294), c: SurfacePoint(x: 1731, y: 305), surfaceID: 5, dynamic:false),
            SurfaceFrame(a: SurfacePoint(x: 1599, y: 240), b: SurfacePoint(x: 1595, y: 231), c: SurfacePoint(x: 1852, y: 128), surfaceID: 6, dynamic:false),
            SurfaceFrame(a: SurfacePoint(x: 1807, y: 141), b: SurfacePoint(x: 1807, y: 131), c: SurfacePoint(x: 2082, y: 138), surfaceID: 7, dynamic:false),
            SurfaceFrame(a: SurfacePoint(x: 976, y: 413), b: SurfacePoint(x: 976, y: 403), c: SurfacePoint(x: 1643, y: 411), surfaceID: 8, dynamic:false),
            SurfaceFrame(a: SurfacePoint(x: 1732, y: 410), b: SurfacePoint(x: 1732, y: 400), c: SurfacePoint(x: 2557, y: 410), surfaceID: 9, dynamic:false),
            SurfaceFrame(a: SurfacePoint(x: 2130, y: 490), b: SurfacePoint(x: 2138, y: 498), c: SurfacePoint(x: 2109, y: 512), surfaceID: 10, dynamic:false),
            SurfaceFrame(a: SurfacePoint(x: 1598, y: 828), b: SurfacePoint(x: 1597, y: 818), c: SurfacePoint(x: 1826, y: 823), surfaceID: 11, dynamic:false),
            SurfaceFrame(a: SurfacePoint(x: 715, y: 826), b: SurfacePoint(x: 715, y: 816), c: SurfacePoint(x: 953, y: 826), surfaceID: 12, dynamic:false),
            SurfaceFrame(a: SurfacePoint(x: 840, y: 943), b: SurfacePoint(x: 840, y: 933), c: SurfacePoint(x: 920, y: 943), surfaceID: 13, dynamic:false),
            SurfaceFrame(a: SurfacePoint(x: 1005, y: 1011), b: SurfacePoint(x: 1005, y: 1001), c: SurfacePoint(x: 1558, y: 1011), surfaceID: 14, dynamic:false),
            SurfaceFrame(a: SurfacePoint(x: 1639, y: 943), b: SurfacePoint(x: 1639, y: 933), c: SurfacePoint(x: 1722, y: 942), surfaceID: 15, dynamic:false),
            SurfaceFrame(a: SurfacePoint(x: 1589, y: 825), b: SurfacePoint(x: 1589, y: 815), c: SurfacePoint(x: 1829, y: 825), surfaceID: 16, dynamic:false),
            SurfaceFrame(a: SurfacePoint(x: 0, y: 0), b: SurfacePoint(x: 1, y: 1), c: SurfacePoint(x: 2, y: 2), surfaceID: 17, dynamic:true)
        ]

        func run() {
            let startTime = CFAbsoluteTimeGetCurrent()
            for i in 0 ... 10000000 {
                for s in surfaceFrames {
                }
            }
            let timeElapsed = CFAbsoluteTimeGetCurrent() - startTime
            println("Time elapsed \(timeElapsed) s")
        }
    }

    struct SurfacePoint {
        var x,y: Int
    }

    final class SurfaceFrame {
        let a,b,c :SurfacePoint
        let surfaceID: Int
        let dynamic: Bool

        init(a: SurfacePoint, b: SurfacePoint, c: SurfacePoint, surfaceID: Int, dynamic: Bool) {
            self.a = a
            self.b = b
            self.c = c
            self.surfaceID = surfaceID
            self.dynamic = dynamic
        }
    }

In this test, the class version took 14.5261079668999 s, while the struct version took only 0.310304999351501 s. Structs were 47 times faster.

3 answers

As Martin R recommended, I profiled both tests, and indeed, the retain / release calls are what make iterating through an array of classes much slower than iterating through an array of structs. To be clear, here are the tests that I ran.

    import Foundation

    final class SomeClass {}
    struct SomeStruct {}

    var classes = [
        SomeClass(), SomeClass(), SomeClass(), SomeClass(), SomeClass(),
        SomeClass(), SomeClass(), SomeClass(), SomeClass(), SomeClass(),
        SomeClass(), SomeClass(), SomeClass(), SomeClass(), SomeClass(),
        SomeClass(), SomeClass(), SomeClass(), SomeClass(), SomeClass(),
    ]

    var structs = [
        SomeStruct(), SomeStruct(), SomeStruct(), SomeStruct(), SomeStruct(),
        SomeStruct(), SomeStruct(), SomeStruct(), SomeStruct(), SomeStruct(),
        SomeStruct(), SomeStruct(), SomeStruct(), SomeStruct(), SomeStruct(),
        SomeStruct(), SomeStruct(), SomeStruct(), SomeStruct(), SomeStruct(),
    ]

    let startTime = CFAbsoluteTimeGetCurrent()
    /*
    structTest()
    classTest()
    */
    let timeElapsed = CFAbsoluteTimeGetCurrent() - startTime
    println("Time elapsed \(timeElapsed) s")


    func structTest() {
        for i in 0 ... 1000000 {
            for e in structs {}
        }
    }


    func classTest() {
        for i in 0 ... 1000000 {
            for e in classes {}
        }
    }

Here are screenshots from profiling both tests with Instruments. Just by adding up the running times, you can see that the class test spends almost all of its time retaining / releasing during each iteration. I would be interested to see how Swift 2.0 deals with this.

[Image: profiling results for the struct test]

[Image: profiling results for the class test]
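To illustrate why binding a class instance to a new variable involves reference counting at all, here is a minimal sketch in current Swift (not the Swift 1.x dialect of the tests above). The names Node and isSharedWhileCopied are hypothetical. It only shows that copying a reference adds a strong reference to the object; whether the optimizer can remove such retain / release pairs inside a loop depends on the compiler version.

```swift
// Hypothetical class for illustration only.
final class Node {
    var value = 0
}

/// Reports whether `first` is seen as shared while a second
/// reference to the same object is guaranteed to be alive.
func isSharedWhileCopied() -> Bool {
    var first = Node()
    // Copying a class reference adds a strong reference to the object.
    let second = first
    // withExtendedLifetime keeps `second` alive until the closure returns,
    // so the optimizer cannot release it before the check runs.
    return withExtendedLifetime(second) {
        !isKnownUniquelyReferenced(&first)
    }
}

let shared = isSharedWhileCopied()
print("object is shared while the copy is alive:", shared)
```

A struct holding only plain values has no reference count to update, so the equivalent copy is just a copy of bytes.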

So, just out of curiosity, I wondered what would happen if I could get around the retain / release calls by doing pointer arithmetic directly on the array (side note: I recommend that you never do this in a real application). So I created one last test. In this test, instead of iterating over a small array many times, I simply created one large array and iterated over it once, because that is where most of the overhead occurs. I also decided to access the properties in this test to reduce ambiguity about what the optimizer might remove.

So here are the final test results:

  • One iteration over a large array of structs: 1.00037097930908 s
  • One iteration over a large array of classes: 11.3165299892426 s
  • One iteration over a large array of structs using pointer arithmetic: 0.773443996906281 s
  • One iteration over a large array of classes using pointer arithmetic: 2.81995397806168 s

Below is the code for the test.

    final class SomeClass {
        var a: Int
        init(a: Int) {
            self.a = a
        }
    }

    struct SomeStruct {
        var a: Int
        init(a: Int) {
            self.a = a
        }
    }

    var classes: [SomeClass] = []
    var structs: [SomeStruct] = []
    var total: Int = 0

    for i in 0 ... 100000000 {
        classes.append(SomeClass(a:i))
        structs.append(SomeStruct(a:i))
    }

    let startTime = CFAbsoluteTimeGetCurrent()
    /*structTest()
    classTest()
    structTestPointer()
    classTestPointer()*/
    let timeElapsed = CFAbsoluteTimeGetCurrent() - startTime
    println("Time elapsed \(timeElapsed) s")

    func structTest() {
        for someStruct in structs {
            let a = someStruct.a
            total = total &+ a
        }
    }

    func structTestPointer() {
        var pointer = UnsafePointer<SomeStruct>(structs)
        for j in 0 ..< structs.count {
            let someStruct = pointer.memory
            let a = someStruct.a
            total = total &+ a
            pointer++
        }
    }

    func classTest() {
        for someClass in classes {
            let a = someClass.a
            total = total &+ a
        }
    }

    func classTestPointer() {
        var pointer = UnsafePointer<SomeClass>(classes)
        for j in 0 ..< classes.count {
            let someClass = pointer.memory
            let a = someClass.a
            total = total &+ a
            pointer++
        }
    }
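The pointer tests above use Swift 1.x syntax (`pointer.memory`, `pointer++`), which later Swift versions no longer accept. As a rough sketch of the same idea in current Swift, assuming the standard `withUnsafeBufferPointer` API, the struct case could look like this. The function name sumWithBufferPointer is hypothetical, the array is much smaller than in the test above, and this is not the code that produced the timings listed.

```swift
struct SomeStruct {
    var a: Int
}

/// Sums the `a` fields by walking the array's contiguous storage directly.
func sumWithBufferPointer(_ values: [SomeStruct]) -> Int {
    return values.withUnsafeBufferPointer { buffer -> Int in
        var total = 0
        // The buffer is only valid inside this closure; do not let it escape.
        for element in buffer {
            total = total &+ element.a   // wrapping add, as in the original test
        }
        return total
    }
}

let structs = (0..<1_000).map { SomeStruct(a: $0) }
let total = sumWithBufferPointer(structs)
print("total:", total)
```

The same warning applies as before: this kind of access is for experiments, not for normal application code.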

It depends a lot on the compiler.

Your struct is empty, which makes this a poor comparison. To do a proper test, you should create actual structs with unique properties and real class objects with unique properties (say, a random Int in each?).

My assumption is that since structs are value types, Swift actually creates a contiguous block of memory to hold the values and does pointer math to retrieve the individual structs, whereas class objects require some indirection and message passing. Still, the difference is huge.

Actually, since your structures are identical and immutable, the Swift compiler can fold all of them into a single object and ignore array indexing.
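One way to set up the kind of test suggested above is sketched below in current Swift. The names ValueItem, RefItem and measure are hypothetical, the element count is arbitrary, and `Date` is used for timing instead of the question's `CFAbsoluteTimeGetCurrent`. Each element carries a different random value and the loop consumes it, so the compiler cannot simply discard the iteration.

```swift
import Foundation

// Hypothetical element types carrying one distinct value each.
struct ValueItem {
    var payload: Int
}

final class RefItem {
    var payload: Int
    init(payload: Int) { self.payload = payload }
}

let count = 100_000
let structItems = (0..<count).map { _ in ValueItem(payload: Int.random(in: 0..<100)) }
// Same payloads in the same order, so both loops must produce the same sum.
let classItems = structItems.map { RefItem(payload: $0.payload) }

/// Runs `body`, prints how long it took, and returns its result.
func measure(_ label: String, _ body: () -> Int) -> Int {
    let start = Date()
    let result = body()
    print(label, Date().timeIntervalSince(start), "s")
    return result
}

let structSum = measure("structs") {
    var total = 0
    for item in structItems { total = total &+ item.payload }
    return total
}

let classSum = measure("classes") {
    var total = 0
    for item in classItems { total = total &+ item.payload }
    return total
}

print("sums match:", structSum == classSum)
```

The timings will vary with compiler version and optimization level, so treat any single run as indicative only.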


I read somewhere that the creation and copying of structs and enums is deferred until the very last moment in order to optimize performance. If so, a struct would not actually be created until it is accessed.

Have you tried accessing the structs inside

    for i in 0 ... 10000000 {
        for s in surfaceFrames {

e.g. by printing their contents to the console? It would be interesting to know whether that affects the measured performance.
