Broadcasting
Mathematical operations are applied to arrays in numpy in the same way as we are used to in mathematics. If the operands have different (formally incompatible) shapes, the smaller of the two operands is expanded according to a fixed scheme so that the operation becomes possible. This process is called broadcasting.
The simplest case is the addition of a scalar to an array. Here, the scalar is added to all elements of the array:
np.array([1, 2, 3]) + 1 # array([2, 3, 4])
If the dimensions match exactly, the operation is carried out element by element:
np.array([1, 2, 3]) + np.array([4, 5, 6]) # array([5, 7, 9])
The method is also used in higher dimensions and is carried out in the following steps:
- The dimensions of the two arrays are compared pairwise in reverse order, starting with the last dimension.
- For each pair, either both dimensions must be the same or one of the two dimensions must have the value 1. If both dimensions are equal, the operation is performed element by element. If one of the two dimensions is 1, the array is stretched along this dimension by repeating its single entry, and the operation is then carried out element by element. If neither condition holds, the shapes are incompatible and numpy raises a ValueError.
- For all remaining dimensions, the array with fewer dimensions is padded on the left with dimensions of size 1 so that the number of dimensions matches.
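The steps above can be sketched as a small helper that computes the shape of the result. The function name broadcast_shape is hypothetical and only serves as an illustration; it is cross-checked against np.broadcast_shapes, which assumes NumPy 1.20 or newer.

```python
import numpy as np
from itertools import zip_longest


def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: apply the broadcasting rules to two shapes."""
    result = []
    # Compare the dimensions pairwise, starting with the last one.
    # Missing leading dimensions are treated as 1.
    for a, b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if a == b or b == 1:
            result.append(a)
        elif a == 1:
            result.append(b)
        else:
            raise ValueError(f"incompatible dimensions: {a} and {b}")
    return tuple(reversed(result))


print(broadcast_shape((3, 3), (3,)))       # (3, 3)
print(broadcast_shape((3, 1), (3,)))       # (3, 3)
print(broadcast_shape((2, 1, 4), (5, 1)))  # (2, 5, 4)

# Cross-check against numpy's built-in implementation.
assert broadcast_shape((2, 1, 4), (5, 1)) == np.broadcast_shapes((2, 1, 4), (5, 1))

# Shapes that violate the rule are rejected by numpy.
try:
    np.array([1, 2, 3]) + np.array([1, 2])
except ValueError as err:
    print("ValueError:", err)
```

The helper only works on shapes, not on data, which makes it a convenient way to predict the result of an operation before running it.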
An example:
np.identity(3) + np.array([1, 2, 3]) # array([[2., 2., 3.],
# [1., 3., 3.],
# [1., 2., 4.]])
Here, the array np.identity(3) has the shape (3, 3) and the array np.array([1, 2, 3]) has the shape (3,). The dimensions are compared in pairs:
np.identity(3) -> 3, 3
np.array([1, 2, 3]) -> 3
In the last dimension, the two entries are the same, so the operation can be carried out element by element. There is no counterpart in the first dimension, as the second array only has one dimension. A 1 is therefore assumed here:
np.identity(3) -> 3, 3
np.array([1, 2, 3]) -> 1, 3
This logically corresponds to three copies of the second array in the first dimension:
np.identity(3) + np.array([[1, 2, 3], [1, 2, 3], [1, 2, 3]])
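This equivalence can be checked directly. The following is a minimal sketch using np.broadcast_to, which returns a read-only view with the stretched shape instead of making real copies; the variable names are chosen for illustration only.

```python
import numpy as np

row = np.array([1, 2, 3])

# Stretch the one-dimensional array to shape (3, 3) without copying data.
stretched = np.broadcast_to(row, (3, 3))
print(stretched)

explicit = np.array([[1, 2, 3], [1, 2, 3], [1, 2, 3]])

# Both ways of writing the sum give the same result.
same = np.array_equal(np.identity(3) + row, np.identity(3) + explicit)
print(same)  # True
```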
Since the copy is made along the first axis, the row vectors are identical. If you want to have identical column vectors instead, you must instruct numpy to make the copy along the second axis by inserting an artificial axis in the second operand. This can be done with reshape, for example:
np.identity(3) + np.array([1, 2, 3]).reshape(3, 1)
This is formally equivalent to:
np.identity(3) + np.array([[1], [2], [3]]) # array([[2., 1., 1.],
# [2., 3., 2.],
# [3., 3., 4.]])
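As a quick check of the column case, the following sketch compares the shapes involved and confirms that the reshaped operand behaves like the explicitly written column. It also shows a case in which both operands are stretched; the variable names are illustrative.

```python
import numpy as np

row = np.array([1, 2, 3])
column = row.reshape(3, 1)

print(row.shape)     # (3,)
print(column.shape)  # (3, 1)

# The reshaped array equals the explicitly written column.
print(np.array_equal(column, np.array([[1], [2], [3]])))  # True

# (3, 1) combined with (3,) stretches both operands to (3, 3).
table = column + row
print(table)
# [[2 3 4]
#  [3 4 5]
#  [4 5 6]]
```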
We will get to know an elegant way to do this with np.newaxis later.