How to effectively combine class design and matrix math?

For some time now I have been struggling with the collision of two design concepts for modeling physical systems, and I wonder what solutions the community has come up with for this.

For more complex simulations I like the abstraction of creating classes for my objects: class instances can be identified with the real objects I want to study, and their attributes represent the physical properties of those real-life objects. As a simple example, take a system of ballistic particles:

    class Particle(object):
        def __init__(self, x=0, y=0, z=0):
            self.x = x
            self.y = y
            self.z = z

        def __repr__(self):
            return "x={}\ny={}\nz={}".format(self.x, self.y, self.z)

        def apply_lateral_wind(self, dx, dy):
            self.x += dx
            self.y += dy

If I initialize this with a million values, I can do this:

    import numpy as np

    start_values = np.random.random((int(1e6), 3))
    particles = [Particle(*i) for i in start_values]

Now suppose I need to do something to all of my particles, for example apply a crosswind that simply shifts x and y. Since all I have is a plain list of my particles, I would have to loop over every one of them, and that takes this long:

    %timeit _ = [p.apply_lateral_wind(0.5, 1.2) for p in particles]
    1 loop, best of 3: 551 ms per loop

The obvious opposite paradigm, which is clearly more efficient, is to stay at the NumPy level and do the math directly on the array, which is more than 10 times faster:

    %timeit start_values[..., :2] += np.array([0.5, 1.2])
    10 loops, best of 3: 20.3 ms per loop

My question is: are there any design patterns that combine these two approaches effectively, so that I don't lose this kind of efficiency? Personally I find it much easier to think in terms of object methods and attributes; it is much clearer in my head, and for me that is also the main reason for the success of object-oriented programming (at least as used in physical modeling). But the drawbacks are obvious. I would love it if there were some elegant way to go back and forth between these two approaches.

+6
2 answers

You can define a class that holds many particles at once:

    class Particles(object):
        def __init__(self, coords):
            self.coords = coords

        def __repr__(self):
            return "Particles(coords={})".format(self.coords)

        def apply_lateral_wind(self, dx, dy):
            self.coords[:, 0] += dx
            self.coords[:, 1] += dy

    start_values = np.random.random((int(1e6), 3))
    particles = Particles(start_values)
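
As a small usage sketch of my own (not part of the original answer): individual particles remain reachable by indexing the underlying array, since coords is stored by reference rather than copied:

    # Continuing from the Particles instance created above.
    # Each particle is simply one row of the coords array.
    x, y, z = particles.coords[0]

    # In-place edits on a row are visible through the container,
    # because __init__ keeps a reference to the array, not a copy.
    particles.coords[0, 2] += 1.0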

Timings on my system show that this is actually faster than your NumPy version, apparently because it does not create an additional array and does not have to worry about broadcasting:

    %timeit particles.apply_lateral_wind(0.5, 1.2)
    100 loops, best of 3: 3.17 ms per loop

while your NumPy example gives:

    %timeit start_values[..., :2] += np.array([0.5, 1.2])
    10 loops, best of 3: 21.1 ms per loop
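
To make the suspected difference concrete, here is a minimal sketch of my own (not from the original answer) contrasting the two update styles; the array name is arbitrary:

    import numpy as np

    coords = np.random.random((int(1e6), 3))

    # Style used inside Particles.apply_lateral_wind: two in-place scalar
    # additions; no new array is allocated for the right-hand side.
    coords[:, 0] += 0.5
    coords[:, 1] += 1.2

    # Style from the question: allocates a small temporary array and
    # broadcasts it against the (1e6, 2) slice.
    coords[:, :2] += np.array([0.5, 1.2])
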
+6

Cython extension types can be an interesting option if you really want to work with one object per particle instead of a bare NumPy array (although there are some limitations, described in the related documentation).

I slightly modified your sample code so it can be Cythonized and came up with the following:

    cdef class Particle(object):
        cdef double x, y, z

        def __init__(self, double x=0, double y=0, double z=0):
            self.x = x
            self.y = y
            self.z = z

        def __repr__(self):
            return "x={}\ny={}\nz={}".format(self.x, self.y, self.z)

        cpdef apply_lateral_wind(self, double dx, double dy):
            self.x += dx
            self.y += dy
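
For completeness, one way to compile this is a minimal setup.py using Cython's cythonize; this is my own sketch, and the file name particle.pyx is an assumption, since the original answer does not say how it was built:

    # setup.py -- minimal build script; the file name particle.pyx is assumed
    from setuptools import setup
    from Cython.Build import cythonize

    setup(ext_modules=cythonize("particle.pyx"))

Running "python setup.py build_ext --inplace" then produces an importable extension module; alternatively, the %%cython cell magic in IPython/Jupyter compiles such snippets on the fly.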

With this basic Cython version, I got the following timing when looping over 1 million particles:

    >>> %timeit _ = [p.apply_lateral_wind(0.5, 1.2) for p in particles]
    10 loops, best of 3: 102 ms per loop

This is about 5 times faster than the plain Python version, which took 532 ms on the same machine. It is admittedly still an order of magnitude slower than just using a bare NumPy array, but I suppose that is the price of readability.

+2
