Understanding NumPy einsum

I am trying to understand how einsum works. I looked through the documentation and a few examples, but it still doesn't quite click.

Here is an example that we covered in class:

 C = np.einsum("ij,jk->ki", A, B) 

for two arrays A and B.

I think this takes A^T * B, but I'm not sure (it's taking the transpose of one of them, right?). Can someone walk me through what is happening here (and in general when using einsum)?

4 answers

(Note: this answer is based on a short blog post about einsum that I wrote a while ago.)

What does einsum do?

Imagine that we have two multi-dimensional arrays, A and B. Now suppose we want to ...

  • multiply A with B in a particular way to create a new array of products; and then maybe
  • sum this new array along particular axes; and then maybe
  • transpose the axes of the new array in a particular order.

There's a good chance that einsum will help us do this faster and more memory-efficiently than combinations of NumPy functions like multiply, sum and transpose would allow.
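As a rough illustration, here is the same computation done both ways (a minimal sketch; the arrays and the 'ij,jk->ki' signature are just an example, and are unpacked below):

    import numpy as np

    A = np.random.rand(3, 4)
    B = np.random.rand(4, 5)

    # the "multiply, sum, transpose" pipeline, which materializes a (3, 4, 5)
    # temporary array of products
    slow = (A[:, :, np.newaxis] * B[np.newaxis, :, :]).sum(axis=1).T

    # the same computation in a single einsum call
    fast = np.einsum('ij,jk->ki', A, B)

    print(np.allclose(slow, fast))  # True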

How does einsum work?

Here is a simple (but not entirely trivial) example. Take the following two arrays:

    A = np.array([0, 1, 2])

    B = np.array([[ 0,  1,  2,  3],
                  [ 4,  5,  6,  7],
                  [ 8,  9, 10, 11]])

We will multiply A and B element-wise and then sum along the rows of the new array. In "normal" NumPy we would write:

    >>> (A[:, np.newaxis] * B).sum(axis=1)
    array([ 0, 22, 76])

So, the indexing operation on A lines up the first axes of the two arrays so that the multiplication can be broadcast. The rows of the array of products are then summed to return the answer.
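To make the broadcasting step concrete, here is a quick look at the shapes involved (using the A and B defined above):

    >>> A[:, np.newaxis].shape   # A reshaped to broadcast against B
    (3, 1)
    >>> B.shape
    (3, 4)
    >>> (A[:, np.newaxis] * B).shape   # the temporary array of products
    (3, 4)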

Now, if we wanted to use einsum , we could write:

    >>> np.einsum('i,ij->i', A, B)
    array([ 0, 22, 76])

The signature string 'i,ij->i' is the key here and needs a little explanation. You can think of it in two halves. On the left-hand side (left of the ->) we've labelled the two input arrays. To the right of the ->, we've labelled the array we want to end up with.

Here's what will happen next:

  • A has one axis; we've labelled it i. And B has two axes; we've labelled axis 0 as i and axis 1 as j.

  • By repeating the label i in both input arrays, we are telling einsum that these two axes should be multiplied together. In other words, we're multiplying array A with each column of array B, just like A[:, np.newaxis] * B does.

  • Notice that j does not appear as a label in our desired output; we've just used i (we want to end up with a 1D array). By omitting a label, we're telling einsum to sum along this axis. In other words, we're summing the rows of the products, just like .sum(axis=1) does.

That's basically all you need to know to use einsum. It helps to play about a little; if we leave both labels in the output, 'i,ij->ij', we get back a 2D array of products (same as A[:, np.newaxis] * B). If we specify no output labels, 'i,ij->', we get back a single number (same as doing (A[:, np.newaxis] * B).sum()).
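Here are those two variations side by side (a quick sketch, reusing the same A and B):

    >>> np.einsum('i,ij->ij', A, B)   # keep both labels: the 2D array of products
    array([[ 0,  0,  0,  0],
           [ 4,  5,  6,  7],
           [16, 18, 20, 22]])
    >>> np.einsum('i,ij->', A, B)     # no output labels: sum over everything
    98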

However, the great thing about einsum is that it does not build a temporary array of products first; it just sums the products as it goes. This can lead to big savings in memory use.

A slightly bigger example

To explain the dot product, here are two new arrays:

    A = array([[1, 1, 1],
               [2, 2, 2],
               [5, 5, 5]])

    B = array([[0, 1, 0],
               [1, 1, 0],
               [1, 1, 1]])

We will compute the dot product using np.einsum('ij,jk->ik', A, B). Here is a picture showing the labelling of A and B, and the output array that we get from the function:

[image: A and B labelled with the indices i, j, k, alongside the output array]

You can see that the label j is repeated: this means we're multiplying the rows of A with the columns of B. Furthermore, the label j is not included in the output: we're summing these products. The labels i and k are kept for the output, so we get back a 2D array.

It might be even clearer to compare this result with the array where the label j is not summed. Below, on the left, you can see the 3D array that results from writing np.einsum('ij,jk->ijk', A, B) (i.e. we've kept the label j):

[image: left, the 3D array of products from 'ij,jk->ijk'; right, the 2D dot product after summing along j]

Summing along the axis j gives the expected dot product, shown on the right.
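You can check this equivalence yourself (a quick sketch using the A and B above):

    import numpy as np

    A = np.array([[1, 1, 1], [2, 2, 2], [5, 5, 5]])
    B = np.array([[0, 1, 0], [1, 1, 0], [1, 1, 1]])

    products = np.einsum('ij,jk->ijk', A, B)             # keep j: 3D array of products
    print(np.array_equal(products.sum(axis=1), A @ B))   # summing over j gives the dot product: True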

Some exercises

To get more of a feel for einsum, it is useful to implement familiar NumPy array operations using the subscript notation. Anything that involves combinations of multiplying and summing axes can be written using einsum. (Each list below is followed by a short sketch that checks the identities.)

Let A and B be two 1D arrays with the same length. For example, A = np.arange(10) and B = np.arange(5, 15) .

  • The sum of A can be written:

     np.einsum('i->', A) 
  • Element-wise multiplication, A * B, can be written:

     np.einsum('i,i->i', A, B) 
  • The inner product or dot product, np.inner(A, B) or np.dot(A, B), can be written:

     np.einsum('i,i->', A, B) # or just use 'i,i' 
  • The outer product, np.outer(A, B), can be written:

     np.einsum('i,j->ij', A, B) 
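
Here is that checking sketch, comparing each einsum call against its plain-NumPy counterpart (using the example A and B from above):

    import numpy as np

    A = np.arange(10)
    B = np.arange(5, 15)

    assert np.einsum('i->', A) == A.sum()                              # sum
    assert np.array_equal(np.einsum('i,i->i', A, B), A * B)            # element-wise product
    assert np.einsum('i,i->', A, B) == np.dot(A, B)                    # inner/dot product
    assert np.array_equal(np.einsum('i,j->ij', A, B), np.outer(A, B))  # outer product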

For two 2D arrays, C and D, provided that the axes are compatible lengths (either the same length, or one of them has length 1), here are a few examples:

  • The trace of C (the sum of the main diagonal), np.trace(C), can be written:

     np.einsum('ii', C) 
  • Element-wise multiplication of C and the transpose of D, C * D.T, can be written:

     np.einsum('ij,ji->ij', C, D) 
  • Multiplying each element of C by the array D (to make a 4D array), C[:, :, None, None] * D, can be written:

     np.einsum('ij,kl->ijkl', C, D) 
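
And the corresponding checks for the 2D cases (a sketch; the random square C and D here are made up just for the comparison):

    import numpy as np

    rng = np.random.default_rng(0)
    C = rng.integers(0, 10, size=(3, 3))
    D = rng.integers(0, 10, size=(3, 3))

    assert np.einsum('ii', C) == np.trace(C)                      # trace
    assert np.array_equal(np.einsum('ij,ji->ij', C, D), C * D.T)  # C * D.T
    assert np.array_equal(np.einsum('ij,kl->ijkl', C, D),
                          C[:, :, None, None] * D)                # 4D array of products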

Let's make two arrays with different, but compatible, dimensions to highlight their interplay:

    In [43]: A = np.arange(6).reshape(2,3); A
    Out[43]:
    array([[0, 1, 2],
           [3, 4, 5]])

    In [44]: B = np.arange(12).reshape(3,4); B
    Out[44]:
    array([[ 0,  1,  2,  3],
           [ 4,  5,  6,  7],
           [ 8,  9, 10, 11]])

Your calculation takes the "dot" (sum of products) of a (2,3) with a (3,4) to produce a (4,2) array. i is the first dimension of A and the last of C; k is the last dimension of B and the first of C. j is "consumed" by the summation.

    In [45]: C = np.einsum('ij,jk->ki', A, B); C
    Out[45]:
    array([[20, 56],
           [23, 68],
           [26, 80],
           [29, 92]])

This is the same as np.dot(A,B).T - it is the final output that has been transposed.
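That claim is easy to verify (a minimal check, continuing with the A and B above):

    import numpy as np

    A = np.arange(6).reshape(2, 3)
    B = np.arange(12).reshape(3, 4)

    C = np.einsum('ij,jk->ki', A, B)
    print(np.array_equal(C, np.dot(A, B).T))  # True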

To see more clearly what happens to j, change the output indices to ijk:

    In [46]: np.einsum('ij,jk->ijk', A, B)
    Out[46]:
    array([[[ 0,  0,  0,  0],
            [ 4,  5,  6,  7],
            [16, 18, 20, 22]],

           [[ 0,  3,  6,  9],
            [16, 20, 24, 28],
            [40, 45, 50, 55]]])

This can also be created using

 A[:,:,None]*B[None,:,:] 

That is, we add a k dimension to the end of A and an i dimension to the front of B, resulting in a (2,3,4) array.

0 + 4 + 16 = 20, 9 + 28 + 55 = 92, etc. Sum along j and transpose to get the earlier result:

    np.sum(A[:,:,None] * B[None,:,:], axis=1).T

    # C[k,i] = sum(j) A[i,j(,k)] * B[(i,)j,k]
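And to confirm that this decomposition reproduces the einsum result (a minimal check under the same A and B):

    import numpy as np

    A = np.arange(6).reshape(2, 3)
    B = np.arange(12).reshape(3, 4)

    manual = np.sum(A[:, :, None] * B[None, :, :], axis=1).T
    print(np.array_equal(manual, np.einsum('ij,jk->ki', A, B)))  # True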

I found NumPy: The tricks of the trade (Part II) instructive.

We use -> to indicate the order of the output array. So think of 'ij, i->j' as having a left-hand side (LHS) and a right-hand side (RHS). Any repetition of labels on the LHS computes the product element-wise and then sums over it. By changing the label on the RHS (output) side, we can define the axis along which we want to act with respect to the input array, i.e. summation along axis 0, 1, and so on.

    >>> import numpy as np
    >>> a
    array([[1, 1, 1],
           [2, 2, 2],
           [3, 3, 3]])
    >>> b
    array([[0, 1, 2],
           [3, 4, 5],
           [6, 7, 8]])
    >>> d = np.einsum('ij, jk->ki', a, b)

Note that there are three axes, i, j, k, and that j is repeated (on the left-hand side). i,j represent the rows and columns of a; j,k those of b.

In order to compute the product and align the j axis, we need to add an axis to a. (b will be broadcast along the first axis.)

    a[i, j, k]
       b[j, k]

    >>> c = a[:,:,np.newaxis] * b
    >>> c
    array([[[ 0,  1,  2],
            [ 3,  4,  5],
            [ 6,  7,  8]],

           [[ 0,  2,  4],
            [ 6,  8, 10],
            [12, 14, 16]],

           [[ 0,  3,  6],
            [ 9, 12, 15],
            [18, 21, 24]]])

j is absent from the right-hand side, so we sum over j, which is the second axis of the 3x3x3 array:

    >>> c = c.sum(1)
    >>> c
    array([[ 9, 12, 15],
           [18, 24, 30],
           [27, 36, 45]])

Finally, the indices are reversed (alphabetically) on the right-hand side, so we transpose:

    >>> c.T
    array([[ 9, 18, 27],
           [12, 24, 36],
           [15, 30, 45]])

    >>> np.einsum('ij, jk->ki', a, b)
    array([[ 9, 18, 27],
           [12, 24, 36],
           [15, 30, 45]])

Here are some examples illustrating the use of np.einsum() for implementing some common tensor or nd-array operations.

Inputs

    In [197]: vec
    Out[197]: array([0, 1, 2, 3])

    In [198]: A
    Out[198]:
    array([[11, 12, 13, 14],
           [21, 22, 23, 24],
           [31, 32, 33, 34],
           [41, 42, 43, 44]])

    In [199]: B
    Out[199]:
    array([[1, 1, 1, 1],
           [2, 2, 2, 2],
           [3, 3, 3, 3],
           [4, 4, 4, 4]])

1) Matrix multiplication (similar to np.matmul(arr1, arr2))

    In [200]: np.einsum("ij, jk -> ik", A, B)
    Out[200]:
    array([[130, 130, 130, 130],
           [230, 230, 230, 230],
           [330, 330, 330, 330],
           [430, 430, 430, 430]])

2) Extract elements along the main diagonal (similar to np.diag(arr))

    In [202]: np.einsum("ii -> i", A)
    Out[202]: array([11, 22, 33, 44])

3) Hadamard product (i.e. the element-wise product of two arrays) (similar to arr1 * arr2)

    In [203]: np.einsum("ij, ij -> ij", A, B)
    Out[203]:
    array([[ 11,  12,  13,  14],
           [ 42,  44,  46,  48],
           [ 93,  96,  99, 102],
           [164, 168, 172, 176]])

4) Element-wise squaring (similar to np.square(arr) or arr ** 2)

    In [210]: np.einsum("ij, ij -> ij", B, B)
    Out[210]:
    array([[ 1,  1,  1,  1],
           [ 4,  4,  4,  4],
           [ 9,  9,  9,  9],
           [16, 16, 16, 16]])

5) Trace (i.e. the sum of the main-diagonal elements) (similar to np.trace(arr))

    In [217]: np.einsum("ii -> ", A)
    Out[217]: 110

6) Matrix transpose (similar to np.transpose(arr))

    In [221]: np.einsum("ij -> ji", A)
    Out[221]:
    array([[11, 21, 31, 41],
           [12, 22, 32, 42],
           [13, 23, 33, 43],
           [14, 24, 34, 44]])

7) Outer product (of vectors) (similar to np.outer(vec1, vec2))

    In [255]: np.einsum("i, j -> ij", vec, vec)
    Out[255]:
    array([[0, 0, 0, 0],
           [0, 1, 2, 3],
           [0, 2, 4, 6],
           [0, 3, 6, 9]])

8) Inner product (of vectors) (similar to np.inner(vec1, vec2))

    In [256]: np.einsum("i, i -> ", vec, vec)
    Out[256]: 14

9) Sum along axis 0 (similar to np.sum(arr, axis=0))

    In [260]: np.einsum("ij -> j", B)
    Out[260]: array([10, 10, 10, 10])

10) Sum along axis 1 (similar to np.sum(arr, axis=1))

    In [261]: np.einsum("ij -> i", B)
    Out[261]: array([ 4,  8, 12, 16])

11) Batch matrix multiplication

    In [287]: BM = np.stack((A, B), axis=0)

    In [288]: BM
    Out[288]:
    array([[[11, 12, 13, 14],
            [21, 22, 23, 24],
            [31, 32, 33, 34],
            [41, 42, 43, 44]],

           [[ 1,  1,  1,  1],
            [ 2,  2,  2,  2],
            [ 3,  3,  3,  3],
            [ 4,  4,  4,  4]]])

    In [289]: BM.shape
    Out[289]: (2, 4, 4)

    # batch matrix multiply using einsum
    In [292]: BMM = np.einsum("bij, bjk -> bik", BM, BM)

    In [293]: BMM
    Out[293]:
    array([[[1350, 1400, 1450, 1500],
            [2390, 2480, 2570, 2660],
            [3430, 3560, 3690, 3820],
            [4470, 4640, 4810, 4980]],

           [[  10,   10,   10,   10],
            [  20,   20,   20,   20],
            [  30,   30,   30,   30],
            [  40,   40,   40,   40]]])

    In [294]: BMM.shape
    Out[294]: (2, 4, 4)
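Since np.matmul also broadcasts over the leading batch dimension, the einsum version can be cross-checked against it (a small sketch, rebuilding BM from the A and B above):

    import numpy as np

    A = np.array([[11, 12, 13, 14],
                  [21, 22, 23, 24],
                  [31, 32, 33, 34],
                  [41, 42, 43, 44]])
    B = np.array([[1, 1, 1, 1],
                  [2, 2, 2, 2],
                  [3, 3, 3, 3],
                  [4, 4, 4, 4]])

    BM = np.stack((A, B), axis=0)                  # shape (2, 4, 4)
    BMM = np.einsum("bij, bjk -> bik", BM, BM)     # batch matrix multiply
    print(np.array_equal(BMM, np.matmul(BM, BM)))  # True: matmul batches too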



12) Sum along axis 2 (similar to np.sum(arr, axis=2))

    In [330]: np.einsum("ijk -> ij", BM)
    Out[330]:
    array([[ 50,  90, 130, 170],
           [  4,   8,  12,  16]])

13) Sum all the elements in the array (similar to np.sum(arr))

    In [335]: np.einsum("ijk -> ", BM)
    Out[335]: 480

14) Sum over multiple axes (i.e. marginalization)
(similar to np.sum(arr, axis=(axis0, axis1, axis2, axis3, axis4, axis6, axis7)))

    # 8D array
    In [354]: R = np.random.standard_normal((3,5,4,6,8,2,7,9))

    # marginalize out axis 5 (i.e. "n" here)
    In [363]: esum = np.einsum("ijklmnop -> n", R)

    # marginalize out axis 5 (i.e. sum over the rest of the axes)
    In [364]: nsum = np.sum(R, axis=(0,1,2,3,4,6,7))

    In [365]: np.allclose(esum, nsum)
    Out[365]: True

15) Double dot product (similar to np.sum(hadamard-product); cf. 3)

    In [772]: A
    Out[772]:
    array([[1, 2, 3],
           [4, 2, 2],
           [2, 3, 4]])

    In [773]: B
    Out[773]:
    array([[1, 4, 7],
           [2, 5, 8],
           [3, 6, 9]])

    In [774]: np.einsum("ij, ij -> ", A, B)
    Out[774]: 124
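This is just the element-wise (Hadamard) product summed over all entries, which is easy to verify (a quick sketch with the arrays above):

    import numpy as np

    A = np.array([[1, 2, 3], [4, 2, 2], [2, 3, 4]])
    B = np.array([[1, 4, 7], [2, 5, 8], [3, 6, 9]])

    print((A * B).sum())                  # 124
    print(np.einsum("ij, ij -> ", A, B))  # 124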

Read more here: Einstein-Summation and specifically here: Tensor-notation



