# Udacity Data Scientist Nanodegree : Prerequisite — Python(L5, L6)

## Lesson 5: Scripting / Lesson 6: NumPy

## Try Statement

We can use `try` statements to handle exceptions. There are four clauses you can use (one more in addition to those shown in the video).

- `try`: This is the only mandatory clause in a `try` statement, although it must be followed by at least one `except` or `finally` clause. The code in this block is the first thing that Python runs in a `try` statement.
- `except`: If Python runs into an exception while running the `try` block, it will jump to the `except` block that handles that exception.
- `else`: If Python runs into no exceptions while running the `try` block, it will run the code in this block after running the `try` block.
- `finally`: Before Python leaves this `try` statement, it will run the code in this `finally` block under any conditions, even if it's ending the program. E.g., if Python ran into an error while running code in the `except` or `else` block, this `finally` block will still be executed before stopping the program.

## Specifying Exceptions

We can actually specify which error we want to handle in an `except` block like this:

**try**:

*# some code*

**except** ValueError:

*# some code*

Now, it catches the `ValueError` exception, but not other exceptions. If we want this handler to address more than one type of exception, we can include a parenthesized tuple of the exceptions after the `except`.

**try**:

*# some code*

**except** (ValueError, KeyboardInterrupt):

*# some code*

Or, if we want to execute different blocks of code depending on the exception, we can have multiple `except` blocks.

**try**:

*# some code*

**except** ValueError:

*# some code*

**except** KeyboardInterrupt:

*# some code*
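To make the order of the four clauses concrete, here is a minimal sketch. The helper name `parse_int` and the `clauses` list are hypothetical and exist only to record which blocks ran; they are not from the course material.

```python
def parse_int(text):
    """Convert text to an int and record which clauses were executed."""
    clauses = []
    try:
        clauses.append("try")
        value = int(text)           # may raise ValueError
    except ValueError:
        clauses.append("except")    # runs only if int() raised ValueError
        value = None
    else:
        clauses.append("else")      # runs only if no exception was raised
    finally:
        clauses.append("finally")   # runs in both cases
    return value, clauses


print(parse_int("42"))         # (42, ['try', 'else', 'finally'])
print(parse_int("forty-two"))  # (None, ['try', 'except', 'finally'])
```

Note that `else` and `except` are mutually exclusive for a given run, while `finally` appears in both results.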

## Introduction to NumPy

**NumPy** stands for *Numerical Python* and it’s a fundamental package for scientific computing in Python. NumPy provides Python with an extensive math library capable of performing numerical computations effectively and efficiently.

## Creating NumPy ndarrays

ndarray — *nd* stands for n-dimensional. An ndarray is a multidimensional array of elements all of the same type.

# We create a 1D ndarray that contains only integers

# It is important to remember that np.array() is NOT a class, it is just a function that returns an ndarray.

import numpy as np

x = np.array([1, 2, 3, 4, 5])

>>> x = [1 2 3 4 5]

## Rank of an Array (numpy.ndarray.ndim)

- It returns the number of array dimensions.

# 1-D array

x = np.array([1, 2, 3])

x.ndim

>>> 1

# 2-D array

Y = np.array([[1,2,3],[4,5,6],[7,8,9], [10,11,12]])

Y.ndim

>>> 2

# The tuple (2, 3, 4) passed as an argument represents the shape of the ndarray

y = np.zeros((2, 3, 4))

y.ndim

>>> 3

## numpy.ndarray.shape

- It returns a tuple representing the array dimensions.

## numpy.dtype

The type tells us the data-type of the elements. Remember, a NumPy array is **homogeneous, meaning all elements will have the same data-type**. In the example below, we will create a rank 1 array and learn how to obtain its shape, its type, and the data-type (*dtype*) of its elements.

## Example 1

x = np.array([1, 2, 3, 4, 5])

print('x = ', x)

print('x has dimensions:', x.shape)

print('x is an object of type:', type(x))

print('The elements in x are of type:', x.dtype)

x = [1 2 3 4 5]

x has dimensions: (5,)

The shape `(5,)` tells us that `x` is of rank 1 (i.e. `x` has only 1 dimension) and that it has 5 elements.

x is an object of type: <class 'numpy.ndarray'>

The elements in x are of type: int64

## Example 2

`Y = np.array([[1,2,3],[4,5,6],[7,8,9], [10,11,12]])`

print('Y = \n', Y)

# We print information about Y

print('Y has dimensions:', Y.shape)

print('Y has a total of', Y.size, 'elements')

print('Y is an object of type:', type(Y))

print('The elements in Y are of type:', Y.dtype)

Y =

[[ 1 2 3]

[ 4 5 6]

[ 7 8 9]

[10 11 12]]

Y has dimensions: (4, 3)

Y has a total of 12 elements

Y is an object of type: <class 'numpy.ndarray'>

The elements in Y are of type: int64

## Example 3 — Save the NumPy array to a File

# We create a rank 1 ndarray

x = np.array([1, 2, 3, 4, 5])

# We save x into the current directory as my_array.npy

np.save('my_array', x)

The above saves the `x` ndarray into a file named `my_array.npy`. You can *load* the saved ndarray into a variable by using the `load()` function.

*# We load the saved array from our current directory into variable y*

y = np.load('my_array.npy')

>>> y = [1 2 3 4 5]

When loading an array from a file, make sure you include the name of the file together with the extension `.npy`, otherwise you will get an error.
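The save and load round trip can be sketched as follows. Writing into a temporary directory is an assumption made here so that the example leaves no file behind; the notes above save into the current directory instead.

```python
import os
import tempfile

import numpy as np

x = np.array([1, 2, 3, 4, 5])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "my_array")
    np.save(path, x)                    # NumPy appends ".npy" to the file name
    saved_files = os.listdir(tmp_dir)   # what actually got written
    y = np.load(path + ".npy")          # the extension is needed when loading

print(saved_files)            # ['my_array.npy']
print(np.array_equal(x, y))   # True
```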

## Using Built-in Functions to Create ndarrays

- `np.zeros(shape)` creates an ndarray full of zeros with the given shape, e.g. `(rows, columns)`. By default, the `np.zeros()` function creates an array with dtype `float64`. If desired, the data type can be changed by using the keyword `dtype`.
- `np.ones()` works the same way, but fills the array with ones.
- `np.full(shape, constant value)` takes two arguments. The first argument is the `shape` of the ndarray you want to make and the second is the `constant value` you want to populate the array with.
- `np.eye(N)` creates a square `N x N` ndarray corresponding to the **identity matrix**. Since all identity matrices are square, the `np.eye()` function only needs a single integer as an argument.
- The `np.diag()` function creates an ndarray corresponding to a **diagonal matrix** (a matrix in which all elements off the main diagonal are zero).
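The functions listed above can be tried out with a short sketch like the one below. The variable names are illustrative only.

```python
import numpy as np

zeros = np.zeros((3, 4))                  # 3 rows, 4 columns, float64 by default
int_zeros = np.zeros((3, 4), dtype=int)   # same shape, integer elements
ones = np.ones((2, 3))                    # filled with 1.0
fives = np.full((2, 3), 5)                # filled with the constant 5
identity = np.eye(3)                      # 3 x 3 identity matrix
diagonal = np.diag([10, 20, 30])          # 3 x 3 matrix with 10, 20, 30 on the diagonal

print(zeros.dtype)      # float64
print(fives)
print(identity)
print(diagonal)
```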

## numpy.arange

`numpy.arange([start, ]stop, [step, ]dtype=None)`

The `np.arange()` function is very versatile and can be used with either one, two, or three arguments.

When used with only one argument, `np.arange(N)` will create a rank 1 ndarray with consecutive integers between `0` and `N - 1`. Therefore, notice that if I want an array to have integers between 0 and 9, I have to use N = 10, *NOT* N = 9, as in the example below:

## Example 4— `np.arange(start,stop,step)`

# We create a rank 1 ndarray that has sequential integers from 0 to 9

x = np.arange(10)

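>>> x = [0 1 2 3 4 5 6 7 8 9]

# We create a rank 1 ndarray that has sequential integers from 4 to 9, using np.arange(start,stop)

x = np.arange(4,10)

>>> x = [4 5 6 7 8 9]

# We create a rank 1 ndarray that has integers from 1 to 13 in steps of 3, using np.arange(start,stop,step)

x = np.arange(1,14,3)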

>>> x = [1 4 7 10 13]

The evenly spaced numbers will include `start` but *exclude* `stop`.

## numpy.linspace

`numpy.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None, axis=0)`

- It returns `num` evenly spaced values calculated over the interval `[start, stop]`.
- Even though the `np.arange()` function allows for non-integer steps, such as 0.3, the output is often inconsistent because of finite floating-point precision (for example, the array may end up with one more element than expected). For this reason, in the cases where non-integer steps are required, it is usually better to use the function `np.linspace()`.
- The `np.linspace(start, stop, N)` function returns `N` evenly spaced numbers over the *closed* interval `[start, stop]`. This means that both the `start` and the `stop` values are included. We should also note that the `np.linspace()` function needs to be called with at least two arguments in the form `np.linspace(start,stop)`. In this case, the default number of elements in the specified interval will be *N = 50*.
- The reason `np.linspace()` works better than the `np.arange()` function is that `np.linspace()` uses the number of elements we want in a particular interval, instead of the step between values. Let's see some examples:

## Example 5 — np.`linspace(start, stop, n)`

x = np.linspace(0,25,10)

>>> x = [ 0. 2.77777778 5.55555556 8.33333333 11.11111111 13.88888889 16.66666667 19.44444444 22.22222222 25. ]

# We create a rank 1 ndarray that has 10 numbers evenly spaced between 0 and 25,

# with 25 excluded.

x = np.linspace(0,25,10, endpoint = False)

>>> x = [ 0. 2.5 5. 7.5 10. 12.5 15. 17.5 20. 22.5]

As we can see from the above example, the function `np.linspace(0,25,10)` returns an ndarray with `10` evenly spaced numbers in the closed interval `[0, 25]`. We can also see that both the start and end points, `0` and `25` in this case, are included. However, you can let the endpoint of the interval be excluded by setting the keyword `endpoint = False` in the `np.linspace()` function.
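The difference between specifying a step and specifying a count can be sketched with a non-integer step. The values 0.1 and 0.4 are chosen here purely for illustration. The length of the `np.arange()` result depends on floating-point rounding, so it is printed rather than stated, whereas the length of the `np.linspace()` result is exactly what we ask for.

```python
import numpy as np

start, stop, step = 0.1, 0.4, 0.1

# With a float step, the number of elements depends on rounding of
# (stop - start) / step, so the result may or may not contain a value near stop.
by_step = np.arange(start, stop, step)

# With linspace we state the number of elements directly.
by_count = np.linspace(start, stop, 4)                    # stop included
half_open = np.linspace(start, stop, 3, endpoint=False)   # stop excluded

print(len(by_step), by_step)
print(len(by_count), by_count)
print(len(half_open), half_open)
```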

## numpy.reshape — This is a Function.

`numpy.reshape(array, newshape, order='C')`

- It gives a new shape to an array without changing its data.
- So far, we have only used the built-in functions `np.arange()` and `np.linspace()` to create rank 1 ndarrays. However, we can use these functions to create rank 2 ndarrays of any shape by combining them with the `np.reshape()` function.
- The `np.reshape(ndarray, new_shape)` function converts the given `ndarray` into the specified `new_shape`. It is important to note that the `new_shape` should be compatible with the number of elements in the given `ndarray`.
- For example, you can convert a rank 1 ndarray with 6 elements into a 3 x 2 rank 2 ndarray, or a 2 x 3 rank 2 ndarray, since both of these rank 2 arrays will have a total of 6 elements. However, you can't reshape the rank 1 ndarray with 6 elements into a 3 x 3 rank 2 ndarray, since this rank 2 array will have 9 elements, which is greater than the number of elements in the original ndarray. Let's see some examples:

## Example 6. The `reshape()` function

# We create a rank 1 ndarray with sequential integers from 0 to 19

x = np.arange(20)

>>> Original x = [ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19]

# We reshape x into a 4 x 5 ndarray

x = np.reshape(x, (4,5))

>>>Reshaped x =

[[ 0 1 2 3 4]

[ 5 6 7 8 9]

[10 11 12 13 14]

[15 16 17 18 19]]

## numpy.ndarray.reshape — This one is a Method.

`ndarray.reshape(shape, order='C')`

- It returns an array containing the same data with a new shape.

**Method vs. Function**

A *function* is a piece of code that is called by name. It can be passed data to operate on (i.e. the parameters) and can optionally return data (the return value). All data that is passed to a function is explicitly passed.

A *method* is a piece of code that is called by a name that is associated with an object. In most respects it is identical to a function except for two key differences:

- A method is implicitly passed the object on which it was called.
- A method is able to operate on data that is contained within the class (remembering that an object is an instance of a class; the class is the definition, the object is an instance of that data).

- One great feature of NumPy is **that some functions can also be applied as methods.** This allows us to apply different functions in sequence **in just one line of code.**
- ndarray methods are similar to ndarray attributes in that they are both applied using *dot* notation (`.`). Let's see how we can accomplish the same result as in the above example, but in just one line of code:

## Example 7. Create a NumPy array by calling the `reshape()` method on the output of the `arange()` function

# We create a rank 1 ndarray with sequential integers from 0 to 19 and

# reshape it to a 4 x 5 array

Y = np.arange(20).reshape(4, 5)

>>> Y =

[[ 0 1 2 3 4]

[ 5 6 7 8 9]

[10 11 12 13 14]

[15 16 17 18 19]]

As we can see, we get the exact same result as before. Notice that when we use `reshape()` as a method, it's applied as `ndarray.reshape(new_shape)`. This converts the `ndarray` into the specified shape `new_shape`. As before, it is important to note that the `new_shape` should be compatible with the number of elements in `ndarray`. In the example above, the function `np.arange(20)` creates an ndarray and serves as the `ndarray` to be reshaped by the `reshape()` method. Therefore, when using `reshape()` as a method, we don't need to pass the `ndarray` as an argument to the `reshape()` function; instead we only need to pass the `new_shape` argument.
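As a small sketch of the points above, the snippet below reshapes a 6-element array with both the function and the method, and then attempts the incompatible 3 x 3 shape mentioned earlier. Catching the resulting `ValueError` with `try`/`except` is an illustrative choice, not part of the course example.

```python
import numpy as np

x = np.arange(6)                      # 6 elements: 0..5

as_function = np.reshape(x, (3, 2))   # function form: the array is passed explicitly
as_method = x.reshape(2, 3)           # method form: the array is implicit

print(as_function.shape)   # (3, 2)
print(as_method.shape)     # (2, 3)

# A 3 x 3 array would need 9 elements, so this reshape is rejected.
try:
    x.reshape(3, 3)
    reshape_failed = False
except ValueError as error:
    reshape_failed = True
    print("Cannot reshape:", error)
```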

Let's start by using the `np.random.random(shape)` function to create an ndarray of the given `shape` with random floats in the half-open interval [0.0, 1.0).

## Example 8. Create a NumPy array using the `numpy.random.random()` function

# We create a 3 x 3 ndarray with random floats in the half-open interval [0.0, 1.0).

X = np.random.random((3,3))

>>> X =

[[ 0.12379926 0.52943854 0.3443525 ]

[ 0.11169547 0.82123909 0.52864397]

[ 0.58244133 0.21980803 0.69026858]]

NumPy also allows us to create ndarrays with random integers within a particular interval. The function `np.random.randint(start, stop, size = shape)` creates an ndarray of the given `shape` with random integers in the half-open interval `[start, stop)`. Let's see an example:

## Example 9. Create a NumPy array using the `numpy.random.randint()` function

*# We create a 3 x 2 ndarray with random integers in the half-open interval [4, 15).*

X = np.random.randint(4,15,size=(3,2))

>>> *X =*

[[ 7 11]

[ 9 11]

[ 6 7]]

In some cases, you may need to create ndarrays with random numbers that satisfy certain statistical properties. For example, you may want the random numbers in the ndarray to have an average of 0. NumPy allows you to create random ndarrays with numbers drawn from various probability distributions. The function `np.random.normal(mean, standard deviation, size=shape)`, for example, creates an ndarray with the given `shape` that contains random numbers picked from a normal (Gaussian) distribution with the given `mean` and `standard deviation`. Let's create a 1,000 x 1,000 ndarray of random floating point numbers drawn from a normal distribution with a mean (average) of zero and a standard deviation of 0.1.

## Example 10. Create a NumPy array of normally distributed random numbers using the `numpy.random.normal()` function

# We create a 1000 x 1000 ndarray of random floats drawn from normal (Gaussian) distribution

# with a mean of zero and a standard deviation of 0.1.

X = np.random.normal(0, 0.1, size=(1000,1000))

# We print X

print()

print('X = \n', X)

print()

# We print information about X

print('X has dimensions:', X.shape)

print('X is an object of type:', type(X))

print('The elements in X are of type:', X.dtype)

print('The elements in X have a mean of:', X.mean())

print('The maximum value in X is:', X.max())

print('The minimum value in X is:', X.min())

print('X has', (X < 0).sum(), 'negative numbers')

print('X has', (X > 0).sum(), 'positive numbers')

X =

[[ 0.04218614 0.03247225 -0.02936003 …, 0.01586796 -0.05599115 -0.03630946]

[ 0.13879995 -0.01583122 -0.16599967 …, 0.01859617 -0.08241612 0.09684025]

[ 0.14422252 -0.11635985 -0.04550231 …, -0.09748604 -0.09350044 0.02514799]

…,

[-0.10472516 -0.04643974 0.08856722 …, -0.02096011 -0.02946155 0.12930844]

[-0.26596955 0.0829783 0.11032549 …, -0.14492074 -0.00113646 -0.03566034]

[-0.12044482 0.20355356 0.13637195 …, 0.06047196 -0.04170031 -0.04957684]]

X has dimensions: (1000, 1000)

X is an object of type: <class 'numpy.ndarray'>

The elements in X are of type: float64

The elements in X have a mean of: -0.000121576684405

The maximum value in X is: 0.476673923106

The minimum value in X is: -0.499114224706

X has 500562 negative numbers

X has 499438 positive numbers

As we can see, the average of the random numbers in the ndarray is close to zero, the maximum and minimum values in `X` are roughly symmetric about zero (the average), and we have about the same number of positive and negative numbers.
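Because these arrays are random, the printed values above will differ from run to run. The sketch below checks the properties that should hold on every run. Calling `np.random.seed(0)` is an assumption added here only to make the run repeatable; it is not used in the examples above.

```python
import numpy as np

np.random.seed(0)   # assumption: a fixed seed, used only for repeatability

floats = np.random.random((3, 3))                      # floats in [0.0, 1.0)
ints = np.random.randint(4, 15, size=(3, 2))           # integers in [4, 15)
normal = np.random.normal(0, 0.1, size=(1000, 1000))   # mean 0, std 0.1

print(floats.min() >= 0.0, floats.max() < 1.0)
print(ints.min() >= 4, ints.max() < 15)
print(normal.mean(), normal.std())   # close to 0 and 0.1 respectively
```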

# Accessing, Deleting, and Inserting Elements Into ndarrays

- NumPy ndarrays are **mutable**, meaning that the elements in ndarrays can be changed after the ndarray has been created. NumPy ndarrays **can also be sliced.**
- Elements can be accessed using indices inside square brackets, [ ]. NumPy allows you to use both positive and negative indices to access elements in the ndarray.
- We can also access and modify specific elements of rank 2 ndarrays. To access elements in rank 2 ndarrays we need to provide 2 indices in the form `[row, column]`. Let's see some examples.

## Example 1 — Access individual elements of 2-D array

# We create a 3 x 3 rank 2 ndarray that contains integers from 1 to 9

X = np.array([[1,2,3],[4,5,6],[7,8,9]])

# Let's access some elements in X

print('This is (0,0) Element in X:', X[0,0])

print('This is (2,2) Element in X:', X[2,2])

X =

[[1 2 3]

[4 5 6]

[7 8 9]]

This is (0,0) Element in X: 1

This is (2,2) Element in X: 9

We can delete elements using the `np.delete(ndarray, elements, axis)` function. This function deletes the elements at the given list of indices (`elements`) from the given `ndarray` along the specified `axis`. For rank 1 ndarrays the `axis` keyword is not required. For rank 2 ndarrays, `axis = 0` is used to select *rows*, and `axis = 1` is used to select *columns*. Let's see some examples:

## Example 2 — Delete elements

x = np.array([1, 2, 3, 4, 5])

Y = np.array([[1,2,3],[4,5,6],[7,8,9]])

# We delete the first and last element of x

x = np.delete(x, [0,4])

# We delete the first row of Y

w = np.delete(Y, 0, axis=0)

# We delete the first and last column of Y

v = np.delete(Y, [0,2], axis=1)

Original x = [1 2 3 4 5]

Modified x = [2 3 4]

Original Y =

[[1 2 3]

[4 5 6]

[7 8 9]]

w =

[[4 5 6]

[7 8 9]]

v =

[[2]

[5]

[8]]

## numpy.append

`numpy.append(array, values, axis=None)`

It appends values to the end of an array.

Now, let's see how we can append values to ndarrays. We can append values to ndarrays using the `np.append(ndarray, elements, axis)` function. This function appends the given list of `elements` to `ndarray` along the specified `axis`. Let's see some examples:

## Example 3 — Append elements

x = np.array([1, 2, 3, 4, 5])

Y = np.array([[1,2,3],[4,5,6]])

# We append the integer 6 to x

x = np.append(x, 6)

# We append the integers 7 and 8 to x

x = np.append(x, [7,8])

# We append a new row containing 7,8,9 to Y

v = np.append(Y, [[7,8,9]], axis=0)

# We append a new column containing 9 and 10 to Y

q = np.append(Y,[[9],[10]], axis=1)

Original x = [1 2 3 4 5]

x = [1 2 3 4 5 6]

x = [1 2 3 4 5 6 7 8]

Original Y =

[[1 2 3]

[4 5 6]]

v =

[[1 2 3]

[4 5 6]

[7 8 9]]

q =

[[ 1 2 3 9]

[ 4 5 6 10]]

`np.insert(ndarray, index, elements, axis)`

This function inserts the given list of `elements` into `ndarray` right before the given `index` along the specified `axis`. Let's see some examples:

## Example 4— Insert elements

x = np.array([1, 2, 5, 6, 7])

Y = np.array([[1,2,3],[7,8,9]])

# We insert the integers 3 and 4 between 2 and 5 in x.

x = np.insert(x,2,[3,4])

# We insert a row between the first and last row of Y

w = np.insert(Y,1,[4,5,6],axis=0)

# We insert a column full of 5s between the first and second column of Y

v = np.insert(Y,1,5, axis=1)

Original x = [1 2 5 6 7]

x = [1 2 3 4 5 6 7]

Original Y =

[[1 2 3]

[7 8 9]]

w =

[[1 2 3]

[4 5 6]

[7 8 9]]

v =

[[1 5 2 3]

[7 5 8 9]]
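One detail worth checking for all three functions is that they return a new array and leave the input untouched, which is why the examples above assign the result back to a variable. A minimal sketch follows; the variable names are illustrative.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
Y = np.array([[1, 2, 3], [4, 5, 6]])

without_ends = np.delete(x, [0, 4])   # removes indices 0 and 4
with_six = np.append(x, 6)            # adds 6 at the end
with_zero = np.insert(x, 0, 0)        # inserts 0 before index 0

# Without an axis argument, np.append flattens a rank 2 array first.
flattened = np.append(Y, [7, 8, 9])

print(x)              # [1 2 3 4 5]  (unchanged)
print(without_ends)   # [2 3 4]
print(with_six)       # [1 2 3 4 5 6]
print(with_zero)      # [0 1 2 3 4 5]
print(flattened)      # [1 2 3 4 5 6 7 8 9]
```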

## numpy.hstack and numpy.vstack

`numpy.hstack(sequence_of_ndarray)`

It returns a stacked array formed by stacking the given arrays in sequence **horizontally** (column-wise).

`numpy.vstack(sequence_of_ndarray)`

It returns a stacked array formed by stacking the given arrays in sequence **vertically** (row-wise). The result will be at least 2-D.

NumPy also allows us to stack ndarrays on top of each other, or to stack them side by side. The stacking is done using either the `np.vstack()` function for vertical stacking, or the `np.hstack()` function for horizontal stacking. It is important to note that in order to stack ndarrays, their shapes must match, except along the axis being stacked. Let's see some examples:

## Example 5 — Stack arrays

x = np.array([1,2])

Y = np.array([[3,4],[5,6]])

# We stack x on top of Y

z = np.vstack((x,Y))

# We stack x on the right of Y. We need to reshape x in order to stack it on the right of Y.

w = np.hstack((Y,x.reshape(2,1)))

x = [1 2]

Y =

[[3 4]

[5 6]]

z =

[[1 2]

[3 4]

[5 6]]

w =

[[3 4 1]

[5 6 2]]
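The shape requirement for stacking can be sketched as follows. The three-element array `too_long` is a hypothetical input added here to trigger the mismatch; it does not appear in the example above.

```python
import numpy as np

x = np.array([1, 2])
Y = np.array([[3, 4], [5, 6]])

stacked_rows = np.vstack((x, Y))                  # x becomes a new first row
stacked_cols = np.hstack((Y, x.reshape(2, 1)))    # x becomes a new last column

print(stacked_rows.shape)   # (3, 2)
print(stacked_cols.shape)   # (2, 3)

# A row with 3 elements cannot be stacked onto rows with 2 elements.
too_long = np.array([1, 2, 3])
try:
    np.vstack((Y, too_long))
    stack_failed = False
except ValueError as error:
    stack_failed = True
    print("Cannot stack:", error)
```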

# Slicing ndarrays

NumPy provides a way to access **subsets** of ndarrays. This is known as **slicing**. Slicing is performed by combining indices with the colon `:` symbol inside the square brackets, in the form `ndarray[start:end]`.

## Example 1. Slicing in a 2-D ndarray

# We create a 4 x 5 ndarray that contains integers from 0 to 19

X = np.arange(20).reshape(4, 5)

# Slices take the form [rows, columns]; rows run horizontally and columns run vertically

W = X[1:,2:5] # rows from index 1 through the last index

Y = X[:3,2:5]

v = X[2,:]

q = X[:,2]

R = X[:,2:3]

X =

[[ 0 1 2 3 4]

[ 5 6 7 8 9]

[10 11 12 13 14]

[15 16 17 18 19]]

W =

[[ 7 8 9]

[12 13 14]

[17 18 19]]

Y =

[[ 2 3 4]

[ 7 8 9]

[12 13 14]]

v = [10 11 12 13 14]

q = [ 2 7 12 17]

R =

[[ 2]

[ 7]

[12]

[17]]

It is important to note that when we perform slices on ndarrays and save them into new variables, as we did above, the data is not copied into the new variable. This is one feature that often causes confusion for beginners. Therefore, we will look at this in a bit more detail.

In the above examples, when we make assignments, such as:

`Z = X[1:4,2:5]`

the slice of the original array `X` is not copied into the variable `Z`. **Rather, `Z` refers to the same underlying data as the corresponding part of `X`** (i.e. if you make any changes in `Z`, you'll also be changing the elements in `X`). We say that slicing only creates a *view* of the original array.

## numpy.ndarray.copy

`ndarray.copy(order='C')`

It returns a copy of the array.

However, **if we want to create a new ndarray that contains a copy of the values in the slice we need to use the `np.copy()` function.** The `np.copy(ndarray)` function creates a copy of the given `ndarray`. This function can also be used as a method.
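The difference between a view and a copy can be sketched directly. The values 555 and -1 are arbitrary markers chosen for this illustration, and `np.shares_memory()` is used only to confirm which arrays share data.

```python
import numpy as np

X = np.arange(20).reshape(4, 5)

Z = X[1:4, 2:5]          # a slice: a view onto X's data
Z[0, 0] = 555            # writing through the view ...
print(X[1, 2])           # 555  (... changes X as well)

C = X[1:4, 2:5].copy()   # an independent copy of the same slice
C[0, 0] = -1             # writing to the copy ...
print(X[1, 2])           # 555  (... leaves X untouched)

print(np.shares_memory(X, Z))   # True
print(np.shares_memory(X, C))   # False
```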

## Example 2a — Use an array as indices to either make slices, select, or change elements

# We create a 4 x 5 ndarray that contains integers from 0 to 19

X = np.arange(20).reshape(4, 5)

# We create a rank 1 ndarray that will serve as indices to select elements from X

indices = np.array([1,3])

# We use the indices ndarray to select the 2nd and 4th row of X

Y = X[indices,:]

# We use the indices ndarray to select the 2nd and 4th column of X

Z = X[:, indices]

X =

[[ 0 1 2 3 4]

[ 5 6 7 8 9]

[10 11 12 13 14]

[15 16 17 18 19]]

indices = [1 3]

Y =

[[ 5 6 7 8 9]

[15 16 17 18 19]]

Z =

[[ 1 3]

[ 6 8]

[11 13]

[16 18]]

## Example 2b — Use an array as indices to extract specific rows from a rank 2 ndarray.

X = np.random.randint(1,20, size=(50,5))

>>> Shape of X is: (50, 5)

# Create a rank 1 ndarray that contains 10 randomly chosen values from 0 up to, but not including, len(X) (50)

# The row_indices would represent the indices of rows of X

row_indices = np.random.randint(0,50, size=10)

>>> Random 10 indices are: [1 38 31 45 44 21 6 24 19 33]

# We use row_indices to extract the corresponding rows of X

X_subset = X[row_indices, :]

## numpy.diag

`numpy.diag(array, k=0)`

It extracts or constructs the diagonal elements.

NumPy also offers built-in functions to select specific elements within ndarrays. For example, the `np.diag(ndarray, k=N)` function extracts the elements along the diagonal defined by `N`. The default is `k=0`, which refers to the main diagonal. Values of `k > 0` are used to select elements in diagonals above the main diagonal, and values of `k < 0` are used to select elements in diagonals below the main diagonal. Let's see an example:

## Example 5. Demonstrate the `diag()` function

# We create a 5 x 5 ndarray that contains integers from 0 to 24

X = np.arange(25).reshape(5, 5)

# We print the elements in the main diagonal of X

print('z =', np.diag(X)) # default k=0

# We print the elements above the main diagonal of X

print('y =', np.diag(X, k=1))

# We print the elements below the main diagonal of X

print('w = ', np.diag(X, k=-1))

X =

[[ 0 1 2 3 4]

[ 5 6 7 8 9]

[10 11 12 13 14]

[15 16 17 18 19]

[20 21 22 23 24]]

z = [ 0 6 12 18 24]

y = [ 1 7 13 19]

w = [ 5 11 17 23]

## numpy.unique

`numpy.unique(array, return_index=False, return_inverse=False, return_counts=False, axis=None)`

- It returns the sorted unique elements of an array.

**It is often useful to extract only the unique elements in an ndarray**. We can find the unique elements in an ndarray by using the `np.unique()` function. The `np.unique(ndarray)` function returns the unique elements in the given `ndarray`, as in the example below:

## Example 6. Demonstrate the `unique()` function

# Create 3 x 3 ndarray with repeated values

X = np.array([[1,2,3],[5,2,8],[1,2,3]])

# We print the unique elements of X

print('The unique elements in X are:',np.unique(X))

X =

[[1 2 3]

[5 2 8]

[1 2 3]]

The unique elements in X are: [1 2 3 5 8]

# Boolean Indexing, Set Operations, and Sorting

For example, suppose we have a 10,000 x 10,000 ndarray of random integers ranging from 1 to 15,000 and we only want to select those integers that are less than 20. *Boolean* indexing can help us in these cases, by allowing us to select elements using logical arguments instead of explicit indices. Let's see some examples:

## Example 1. Boolean indexing

# We create a 5 x 5 ndarray that contains integers from 0 to 24

X = np.arange(25).reshape(5, 5)

# We use Boolean indexing to select elements in X:

print('The elements in X that are greater than 10:', X[X > 10])

print('The elements in X that are less than or equal to 7:', X[X <= 7])

print('The elements in X that are between 10 and 17:', X[(X > 10) & (X < 17)])

# We use Boolean indexing to assign the elements that are between 10 and 17 the value of -1

X[(X > 10) & (X < 17)] = -1

Original X =

[[ 0 1 2 3 4]

[ 5 6 7 8 9]

[10 11 12 13 14]

[15 16 17 18 19]

[20 21 22 23 24]]

The elements in X that are greater than 10: [11 12 13 14 15 16 17 18 19 20 21 22 23 24]

The elements in X that less than or equal to 7: [0 1 2 3 4 5 6 7]

The elements in X that are between 10 and 17: [11 12 13 14 15 16]

X =

[[ 0 1 2 3 4]

[ 5 6 7 8 9]

[10 -1 -1 -1 -1]

[-1 -1 17 18 19]

[20 21 22 23 24]]
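It can help to look at the Boolean array itself. The sketch below stores the condition in a variable (the name `mask` is an illustrative choice) and reuses it for selecting, counting, and assigning.

```python
import numpy as np

X = np.arange(25).reshape(5, 5)

mask = (X > 10) & (X < 17)   # Boolean ndarray with the same shape as X
selected = X[mask]           # rank 1 array of the elements where mask is True
count = mask.sum()           # True counts as 1, so this counts the matches

print(mask.dtype, mask.shape)   # bool (5, 5)
print(selected)                 # [11 12 13 14 15 16]
print(count)                    # 6

X[mask] = -1                    # assign through the same mask
print((X == -1).sum())          # 6
```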

## Example 2. Set operations

x = np.array([1,2,3,4,5])

y = np.array([6,7,2,8,4])

# We use set operations to compare x and y:

print('The elements that are both in x and y:', np.intersect1d(x,y))

print('The elements that are in x that are not in y:', np.setdiff1d(x,y))

print('All the elements of x and y:',np.union1d(x,y))

x = [1 2 3 4 5]

y = [6 7 2 8 4]

The elements that are both in x and y: [2 4]

The elements that are in x that are not in y: [1 3 5]

All the elements of x and y: [1 2 3 4 5 6 7 8]

# numpy.ndarray.sort method

`ndarray.sort(axis=-1, kind=None, order=None)`

- The method above sorts an array in-place.

As with other functions we saw before, `sort` can be used as a method as well as a function. The difference in this case lies in whether the original array is modified.

- When `numpy.sort()` is used as a function, it sorts the ndarray out of place, meaning that it doesn't change the original ndarray being sorted.
- On the other hand, when you use `numpy.ndarray.sort()` as a method, `ndarray.sort()` sorts the ndarray in place, meaning that the original array will be changed to the sorted one.

## Example 3. Sort arrays using sort() function

x = np.random.randint(1,11,size=(10,))

# We sort x and print the sorted array using sort as a function.

print('Sorted x (out of place):', np.sort(x))

Original x = [9 6 4 4 9 4 8 4 4 7]

Sorted x (out of place): [4 4 4 4 4 6 7 8 9 9]

x after sorting: [9 6 4 4 9 4 8 4 4 7]

Notice that `np.sort()` sorts the array but, if the ndarray being sorted has repeated values, `np.sort()` leaves those values in the sorted array. However, if desired, we can use the `unique()` function. Let's see how we can sort the unique elements of `x` above:

*# Returns the sorted unique elements of an array*

print(np.unique(x))

[4 6 7 8 9]

## Example 4. Sort rank-1 arrays using sort() method

# We create an unsorted rank 1 ndarray

x = np.random.randint(1,11,size=(10,))

# We sort x and print the sorted array using sort as a method.

x.sort()

# When we sort in place the original array is changed to the sorted array. To see this we print x again

print()

print('x after sorting:', x)

Original x = [9 9 8 1 1 4 3 7 2 8]

x after sorting: [1 1 2 3 4 7 8 8 9 9]
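The in-place versus out-of-place behaviour can be checked side by side. This sketch uses a fixed array (the values from Example 3) instead of random integers so that the results are repeatable.

```python
import numpy as np

x = np.array([9, 6, 4, 4, 9, 4, 8, 4, 4, 7])

sorted_copy = np.sort(x)    # function: returns a new, sorted array
print(sorted_copy)          # [4 4 4 4 4 6 7 8 9 9]
print(x)                    # [9 6 4 4 9 4 8 4 4 7]  (unchanged)

result = x.sort()           # method: sorts x itself and returns None
print(result)               # None
print(x)                    # [4 4 4 4 4 6 7 8 9 9]
```

Because the method returns `None`, writing `x = x.sort()` would discard the array, which is a common slip when switching between the two forms.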

# numpy.sort function

`numpy.sort(array, axis=-1, kind=None, order=None)`

It returns a **sorted copy** of an array. The `axis` denotes the axis along which to sort. It can take values in the range `-1` to `(ndim-1)`. Axis can take the following possible values for a given 2-D ndarray:

- If nothing is specified, the default value is `axis = -1`, which sorts along the **last** axis. In the case of a given 2-D ndarray, the last axis value is `1`.
- If `axis = None` is specified explicitly, the array is flattened before sorting. It will return a 1-D array.
- If `axis = 0` is specified for a given 2-D array, the function sorts the values within each column, one column at a time, without disturbing the other columns. In the final output, *you will see that each column has been sorted individually.*
- The output of `axis = 1` for a given 2-D array is the converse of `axis = 0`. In the final output, *you will see that each row has been sorted individually.*

Tip: As mentioned in this discussion, you can read `axis = 0` as "down" and `axis = 1` as "across" the given 2-D array, to have a correct usage of axis in your methods/functions.

When sorting rank 2 ndarrays, we need to specify to the `np.sort()` function whether we are sorting by rows or columns. This is done by using the `axis` keyword. Let's see some examples:

## Example 5. Sort rank-2 arrays by specific axis.

# We create an unsorted rank 2 ndarray

X = np.random.randint(1,11,size=(5,5))

# We sort the columns of X and print the sorted array

print('X with sorted columns :\n', np.sort(X, axis = 0))

# We sort the rows of X and print the sorted array

print('X with sorted rows :\n', np.sort(X, axis = 1))

Original X =

[[6 1 7 6 3]

[3 9 8 3 5]

[6 5 8 9 3]

[2 1 5 7 7]

[9 8 1 9 8]]

X with sorted columns :

[[2 1 1 3 3]

[3 1 5 6 3]

[6 5 7 7 5]

[6 8 8 9 7]

[9 9 8 9 8]]

X with sorted rows :

[[1 3 6 6 7]

[3 3 5 8 9]

[3 5 6 8 9]

[1 2 5 7 7]

[1 8 8 9 9]]
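The axis options described above, including `axis = None`, can be compared on a small fixed array. The 3 x 3 values below are chosen only for illustration.

```python
import numpy as np

X = np.array([[6, 1, 7],
              [3, 9, 8],
              [2, 5, 4]])

by_columns = np.sort(X, axis=0)      # each column sorted top to bottom
by_rows = np.sort(X, axis=1)         # each row sorted left to right
default = np.sort(X)                 # axis=-1, the last axis (same as axis=1 here)
flattened = np.sort(X, axis=None)    # flattened first, then sorted

print(by_columns)   # [[2 1 4] [3 5 7] [6 9 8]]
print(by_rows)      # [[1 6 7] [3 8 9] [2 4 5]]
print(flattened)    # [1 2 3 4 5 6 7 8 9]
```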

# Arithmetic operations and Broadcasting

In order to do element-wise operations, NumPy sometimes uses something called *Broadcasting*. **Broadcasting is the term used to describe how NumPy handles element-wise arithmetic operations with ndarrays of different shapes**. For example, broadcasting is used implicitly when doing arithmetic operations between scalars and ndarrays.

It is important to note that when performing element-wise operations, the ndarrays being operated on **must have the same shape or be broadcastable**. We'll explain more about this later in this lesson. Let's start by performing element-wise arithmetic operations on rank 1 ndarrays:

## Example 1. Element-wise arithmetic operations on 1-D arrays

x = np.array([1,2,3,4])

y = np.array([5.5,6.5,7.5,8.5])

# We perform basic element-wise operations using arithmetic symbols and functions

x = [1 2 3 4]

y = [ 5.5 6.5 7.5 8.5]

x + y = [ 6.5 8.5 10.5 12.5]

add(x,y) = [ 6.5 8.5 10.5 12.5]

x - y = [-4.5 -4.5 -4.5 -4.5]

subtract(x,y) = [-4.5 -4.5 -4.5 -4.5]

x * y = [ 5.5 13. 22.5 34. ]

multiply(x,y) = [ 5.5 13. 22.5 34. ]

x / y = [ 0.18181818 0.30769231 0.4 0.47058824]

divide(x,y) = [ 0.18181818 0.30769231 0.4 0.47058824]

We can also perform the same element-wise arithmetic operations on rank 2 ndarrays. Again, remember that in order to do these operations the shapes of the ndarrays being operated on, **must have the same shape or be broadcastable**.

## Example 2. Element-wise arithmetic operations on a 2-D array (Same shape)

X = np.array([1,2,3,4]).reshape(2,2)

Y = np.array([5.5,6.5,7.5,8.5]).reshape(2,2)# We perform basic element-wise operations using arithmetic symbols and functions

X =

[[1 2]

[3 4]]

Y =

[[ 5.5 6.5]

[ 7.5 8.5]]

X + Y =

[[ 6.5 8.5]

[ 10.5 12.5]]

add(X,Y) =

[[ 6.5 8.5]

[ 10.5 12.5]]

X - Y =

[[-4.5 -4.5]

[-4.5 -4.5]]

subtract(X,Y) =

[[-4.5 -4.5]

[-4.5 -4.5]]

X * Y =

[[ 5.5 13. ]

[ 22.5 34. ]]

multiply(X,Y) =

[[ 5.5 13. ]

[ 22.5 34. ]]

X / Y =

[[ 0.18181818 0.30769231]

[ 0.4 0.47058824]]

divide(X,Y) =

[[ 0.18181818 0.30769231]

[ 0.4 0.47058824]]

We can also apply mathematical functions, such as `sqrt(x)`, to all elements of an ndarray at once.

## Example 3. Additional mathematical functions

x = np.array([1,2,3,4])  # We apply different mathematical functions to all elements of x

print('x =', x)

print('EXP(x) =', np.exp(x))

print('SQRT(x) =', np.sqrt(x))

print('POW(x,2) =', np.power(x,2))  # We raise all elements to the power of 2

x = [1 2 3 4]

EXP(x) = [ 2.71828183 7.3890561 20.08553692 54.59815003]

SQRT(x) = [ 1. 1.41421356 1.73205081 2. ]

POW(x,2) = [ 1 4 9 16]
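To make "all elements at once" concrete, the sketch below compares `np.sqrt` with an explicit Python loop over the same values. The names `roots`, `roots_loop`, and `squares` are illustrative, not part of the lesson.

```python
import math
import numpy as np

x = np.array([1, 2, 3, 4])

# One call applies the function to every element of the array.
roots = np.sqrt(x)

# The same values computed one element at a time in plain Python.
roots_loop = [math.sqrt(v) for v in x]

# np.power on an integer array with an integer exponent keeps an integer dtype,
# whereas np.sqrt returns floats.
squares = np.power(x, 2)

print('Same values:', np.allclose(roots, roots_loop))
print('dtype of sqrt result:', roots.dtype)
print('dtype of power result:', squares.dtype)
```

This is why `POW(x,2)` prints without decimal points in the output above, while `EXP(x)` and `SQRT(x)` print as floats.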

Note: Most statistical operations can be done using either a function or an equivalent method. For example, both the `numpy.mean` function and the `numpy.ndarray.mean` method return the arithmetic mean of the array elements along the given axis.
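As a quick check of that equivalence, here is a small sketch that computes the column means both ways. The variable names are illustrative. It also shows one exception worth knowing: ndarrays have no `median` method, so the median is only available through the `np.median` function.

```python
import numpy as np

X = np.array([[1, 2], [3, 4]])

# Function form and method form of the same operation.
col_means_function = np.mean(X, axis=0)
col_means_method = X.mean(axis=0)

print('np.mean(X, axis=0):', col_means_function)
print('X.mean(axis=0):', col_means_method)
print('Equal:', np.array_equal(col_means_function, col_means_method))

# Not every function has a method counterpart.
has_median_method = hasattr(X, 'median')
print('ndarray has a median method:', has_median_method)
print('np.median(X):', np.median(X))
```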

## Example 4. Statistical functions

X = np.array([[1,2], [3,4]])

print('X = \n', X)

print('Average of all elements in X:', X.mean())

print('Average of all elements in the columns of X:', X.mean(axis=0))

print('Average of all elements in the rows of X:', X.mean(axis=1))

print('Sum of all elements in X:', X.sum())

print('Sum of all elements in the columns of X:', X.sum(axis=0))

print('Sum of all elements in the rows of X:', X.sum(axis=1))

print('Standard Deviation of all elements in X:', X.std())

print('Standard Deviation of all elements in the columns of X:', X.std(axis=0))

print('Standard Deviation of all elements in the rows of X:', X.std(axis=1))

print('Median of all elements in X:', np.median(X))

print('Median of all elements in the columns of X:', np.median(X,axis=0))

print('Median of all elements in the rows of X:', np.median(X,axis=1))

print('Maximum value of all elements in X:', X.max())

print('Maximum value of all elements in the columns of X:', X.max(axis=0))

print('Maximum value of all elements in the rows of X:', X.max(axis=1))

print('Minimum value of all elements in X:', X.min())

print('Minimum value of all elements in the columns of X:', X.min(axis=0))

print('Minimum value of all elements in the rows of X:', X.min(axis=1))

X =

[[1 2]

[3 4]]

Average of all elements in X: 2.5

Average of all elements in the columns of X: [ 2. 3.]

Average of all elements in the rows of X: [ 1.5 3.5]

Sum of all elements in X: 10

Sum of all elements in the columns of X: [4 6]

Sum of all elements in the rows of X: [3 7]

Standard Deviation of all elements in X: 1.11803398875

Standard Deviation of all elements in the columns of X: [ 1. 1.]

Standard Deviation of all elements in the rows of X: [ 0.5 0.5]

Median of all elements in X: 2.5

Median of all elements in the columns of X: [ 2. 3.]

Median of all elements in the rows of X: [ 1.5 3.5]

Maximum value of all elements in X: 4

Maximum value of all elements in the columns of X: [3 4]

Maximum value of all elements in the rows of X: [2 4]

Minimum value of all elements in X: 1

Minimum value of all elements in the columns of X: [1 2]

Minimum value of all elements in the rows of X: [1 3]

## Example 5. Change value of all elements of an array

X = np.array([[1,2], [3,4]])

print('X = \n', X)

print('3 * X = \n', 3 * X)

print()

print('3 + X = \n', 3 + X)

print()

print('X - 3 = \n', X - 3)

print()

print('X / 3 = \n', X / 3)

X =

[[1 2]

[3 4]]

3 * X =

[[ 3 6]

[ 9 12]]

3 + X =

[[4 5]

[6 7]]

X - 3 =

[[-2 -1]

[ 0 1]]

X / 3 =

[[ 0.33333333 0.66666667]

[ 1. 1.33333333]]

In the examples above, NumPy is working behind the scenes to broadcast `3` along the ndarray so that they have the same shape. This allows us to add 3 to each element of `X` with just one line of code.
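One way to picture what happens behind the scenes is to build the stretched array explicitly. The sketch below uses `np.broadcast_to` for that purpose; NumPy does not need this step to be written out, and the name `threes` is only illustrative.

```python
import numpy as np

X = np.array([[1, 2], [3, 4]])

# Stretch the scalar 3 into a read-only array with the same shape as X.
threes = np.broadcast_to(3, X.shape)

print('threes = \n', threes)
print('3 + X = \n', 3 + X)
print('threes + X = \n', threes + X)
print('Same result:', np.array_equal(3 + X, threes + X))
```

Writing `3 + X` gives the same values without ever creating a full 2 x 2 array of threes in your own code.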

Subject to certain constraints, NumPy can do the same for two ndarrays of different shapes, as we can see below.

## Example 6. Arithmetic operations on 2-D arrays (Compatible shape)

x = np.array([1,2,3])

Y = np.array([[1,2,3],[4,5,6],[7,8,9]])

Z = np.array([1,2,3]).reshape(3,1)

print('x + Y = \n', x + Y)

print('Z + Y = \n', Z + Y)

x = [1 2 3]

Y =

[[1 2 3]

[4 5 6]

[7 8 9]]

Z =

[[1]

[2]

[3]]

x + Y =

[[ 2 4 6]

[ 5 7 9]

[ 8 10 12]]

Z + Y =

[[ 2 3 4]

[ 6 7 8]

[10 11 12]]

As before, NumPy is able to add 1 x 3 and 3 x 1 ndarrays to 3 x 3 ndarrays by broadcasting the smaller ndarrays along the larger ndarray so that they have compatible shapes. In general, NumPy can do this provided that the smaller ndarray, **such as the 1 x 3 ndarray in our example, can be expanded to the shape of the larger ndarray in such a way that the resulting broadcast is unambiguous.** Concretely, NumPy compares the two shapes dimension by dimension, starting from the last one, and two dimensions are compatible when they are equal or when one of them is 1.
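The sketch below checks shape compatibility for the arrays from Example 6 and adds one array that cannot be broadcast. The array `w` and the flag `compatible` are hypothetical additions for illustration. `np.broadcast(...).shape` reports the shape a broadcast would produce, and an incompatible pair raises a `ValueError`.

```python
import numpy as np

x = np.array([1, 2, 3])                              # shape (3,)
Y = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])      # shape (3, 3)
Z = np.array([1, 2, 3]).reshape(3, 1)                # shape (3, 1)
w = np.array([1, 2])                                 # shape (2,), hypothetical mismatch

# Compatible pairs: the broadcast result takes the larger shape.
print('x with Y:', np.broadcast(x, Y).shape)
print('Z with Y:', np.broadcast(Z, Y).shape)

# Incompatible pair: the last dimensions are 2 and 3, and neither is 1.
try:
    w + Y
    compatible = True
except ValueError as error:
    compatible = False
    print('Cannot broadcast w with Y:', error)
```

Catching `ValueError` specifically, rather than using a bare `except`, keeps any unrelated errors visible.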