NumPy arrays

The central data structure in NumPy is np.ndarray (array for short). Unlike a list, an array holds elements of a single data type. This homogeneity allows the data to be stored compactly, which is essential for the efficient implementation of the algorithms in NumPy and the underlying libraries.
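As a small illustration of this homogeneity, the following sketch shows how np.array settles on one common data type for mixed input. The variable names are chosen here purely for illustration.

```python
import numpy as np

# Mixed int/float input: NumPy promotes all elements to one common dtype.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)     # float64

# Every element occupies the same number of bytes.
print(mixed.itemsize)  # 8
print(mixed.nbytes)    # 24

# Assigning into an integer array converts the value to the array's dtype.
ints = np.array([1, 2, 3], dtype=np.int64)
ints[0] = 7.9          # the fractional part is dropped
print(ints[0])         # 7
```

A Python list would simply store 7.9 as a float next to the integers; the array keeps its single data type instead.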

The data type of an array's elements is described by np.dtype. These dtypes closely resemble the data types already familiar from Python: integers and floating-point numbers are both available. Floating-point numbers (np.float64) behave the same way as Python's float. NumPy integers, however, have a fixed value range (again for reasons of efficiency), whereas Python integers can grow arbitrarily large. Frequently used data types and their restrictions are:

Data type     Range
np.int8       -128 to 127
np.int32      -2147483648 to 2147483647
np.int64      -9223372036854775808 to 9223372036854775807
np.uint8      0 to 255
np.uint16     0 to 65535
np.uint32     0 to 4294967295
np.uint64     0 to 18446744073709551615
np.float64    64-bit IEEE 754 (double precision)
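The ranges in the table can be queried at runtime. The sketch below uses np.iinfo and np.finfo for this and shows what happens when an array operation leaves the fixed range. The variable names are illustrative only.

```python
import numpy as np

# np.iinfo reports the representable range of an integer dtype.
info = np.iinfo(np.int8)
print(info.min, info.max)  # -128 127

# Array arithmetic stays in the array's dtype, so a result outside
# the range wraps around instead of growing.
small = np.array([127], dtype=np.int8)
one = np.array([1], dtype=np.int8)
wrapped = small + one
print(wrapped)             # [-128]

# Python's built-in int has no such limit.
print(127 + 1)             # 128

# np.finfo gives the corresponding information for floating-point dtypes.
print(np.finfo(np.float64).eps)  # 2.220446049250313e-16
```

Because the wrap-around in array arithmetic happens silently, it is worth choosing a data type whose range comfortably covers the expected values.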

It is easy to create arrays from lists:

import numpy as np

values = [1, 2, 3, 4, 5]
array = np.array(values)

A multidimensional array is created for nested lists of the same length:

values = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
array = np.array(values)
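The following sketch puts the two cases side by side and additionally shows how the data type can be set explicitly through the dtype argument of np.array instead of being inferred from the list contents. The variable names are illustrative.

```python
import numpy as np

flat = np.array([1, 2, 3, 4, 5])
nested = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# The nesting depth of the list determines the number of dimensions.
print(flat.ndim)       # 1
print(nested.ndim)     # 2

# By default the dtype is inferred; it can also be requested explicitly.
as_float = np.array([1, 2, 3], dtype=np.float64)
print(as_float.dtype)  # float64

as_bytes = np.array([1, 2, 3], dtype=np.uint8)
print(as_bytes.dtype)  # uint8
```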

The size of an array can be queried using the .shape property:

values = [[1, 2, 3], [4, 5, 6]]
array = np.array(values)
print(array.shape)  # (2, 3)

Here the number of rows comes first, then the number of columns. In NumPy, a dimension is referred to as an axis. Conversely, an array can also be converted back into a list:

values = [[1, 2, 3], [4, 5, 6]]
array = np.array(values)
print(array.tolist())  # [[1, 2, 3], [4, 5, 6]]
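To tie shape, axes and list conversion together, here is a minimal sketch that unpacks the shape, queries the related attributes ndim and size, and converts the array to a list and back. The variable names are illustrative.

```python
import numpy as np

array = np.array([[1, 2, 3], [4, 5, 6]])

# The shape has one entry per axis: axis 0 (rows) first, then axis 1 (columns).
rows, cols = array.shape
print(rows, cols)   # 2 3
print(array.ndim)   # 2  (number of axes)
print(array.size)   # 6  (total number of elements)

# tolist() produces nested lists of plain Python numbers.
as_list = array.tolist()
print(type(as_list[0][0]))          # <class 'int'>

# Converting the list back yields an equal array.
back = np.array(as_list)
print(np.array_equal(array, back))  # True
```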

Where possible, NumPy operations work on the original data without copying it. The result of such an operation is called a view. For example, the transpose of a matrix is accessible as a view via .T:

values = [[1, 2, 3], [4, 5, 6]]
array = np.array(values)
print(array.T)  # [[1, 4], [2, 5], [3, 6]]
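A practical consequence of views is that writing through the view changes the original array. The sketch below checks this with np.shares_memory and the .base attribute, and uses .copy() where independent data is wanted. The variable names are illustrative.

```python
import numpy as np

array = np.array([[1, 2, 3], [4, 5, 6]])
transposed = array.T

# The transpose does not own its data; it refers to the original array.
print(np.shares_memory(array, transposed))  # True
print(transposed.base is array)             # True

# Element [0, 1] of the transpose is element [1, 0] of the original.
transposed[0, 1] = 40
print(array[1, 0])                          # 40

# An explicit copy is independent of the original.
independent = array.T.copy()
independent[0, 0] = -1
print(array[0, 0])                          # 1
```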

The shape of an array can likewise be changed using reshape. It returns a view whenever the memory layout allows it, so the data is not copied again; otherwise it returns a copy.

values = [1, 2, 3, 4]
array = np.array(values)
reshaped = array.reshape(2, 2)
print(reshaped)  # [[1, 2], [3, 4]]
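Whether reshape can avoid copying depends on the memory layout of its input. The following sketch, with illustrative variable names, shows one case where the result shares memory with the original and one case (flattening a transposed array) where NumPy has to copy.

```python
import numpy as np

array = np.arange(6)        # [0 1 2 3 4 5]
grid = array.reshape(2, 3)  # contiguous data: reshape returns a view
print(np.shares_memory(array, grid))  # True

# Writing through the view changes the original.
grid[0, 0] = 100
print(array[0])                       # 100

# One dimension may be given as -1 and is then inferred from the size.
print(array.reshape(3, -1).shape)     # (3, 2)

# Flattening the transpose needs the elements in an order that cannot
# be described on top of the existing memory, so a copy is made.
flat = grid.T.reshape(6)
print(np.shares_memory(array, flat))  # False
print(flat.tolist())                  # [100, 3, 1, 4, 2, 5]
```

When it matters whether data is shared, np.shares_memory is a direct way to check rather than relying on assumptions about a particular operation.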

A variety of functions can be evaluated on each array, e.g.

np.arange(4).sum() # 6
np.arange(4).mean() # 1.5
np.arange(4).std() # 1.118033988749895
np.arange(4).min() # 0
np.arange(4).max() # 3
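On multidimensional arrays these functions also accept an axis argument, which restricts the reduction to one dimension. The sketch below illustrates this on a small matrix and also shows the ddof parameter of std. The variable names are illustrative.

```python
import numpy as np

matrix = np.arange(6).reshape(2, 3)  # rows: [0, 1, 2] and [3, 4, 5]

# Without an axis argument the reduction runs over all elements.
print(matrix.sum())                  # 15

# axis=0 reduces over the rows and gives one result per column.
col_sums = matrix.sum(axis=0)
print(col_sums.tolist())             # [3, 5, 7]

# axis=1 reduces over the columns and gives one result per row.
row_sums = matrix.sum(axis=1)
row_means = matrix.mean(axis=1)
print(row_sums.tolist())             # [3, 12]
print(row_means.tolist())            # [1.0, 4.0]

# std() divides by N by default (ddof=0); ddof=1 divides by N - 1.
population_std = np.arange(4).std()
sample_std = np.arange(4).std(ddof=1)
print(sample_std > population_std)   # True
```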