Udacity Data Scientist Nanodegree : Prerequisite — Python(L5, L6)

Lesson 5: Scripting / Lesson 6: NumPy

Try Statement

We can use try statements to handle exceptions. There are four clauses you can use (one more in addition to those shown in the video).

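  • try: This is the only mandatory clause in a try statement. The code in this block is the first thing that Python runs in a try statement.
  • except: If Python runs into an exception while running the try block, it will jump to the except block that handles that exception.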
  • else: If Python runs into no exceptions while running the try block, it will run the code in this block after running the try block.
  • finally: Before Python leaves this try statement, it will run the code in this finally block under any conditions, even if it's ending the program. E.g., if Python ran into an error while running code in the except or else block, this finally block will still be executed before stopping the program.
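
As a minimal sketch of how these clauses fit together, the hypothetical helper safe_divide below (it is not part of the lesson) records which clauses run for a given input:

```python
def safe_divide(numerator, denominator):
    """Hypothetical helper: returns (result, list of clauses that ran)."""
    ran = ['try']
    result = None
    try:
        result = numerator / denominator
    except ZeroDivisionError:
        # Runs only if the try block raised ZeroDivisionError
        ran.append('except')
    else:
        # Runs only if the try block raised no exception
        ran.append('else')
    finally:
        # Runs in every case before leaving the try statement
        ran.append('finally')
    return result, ran


print(safe_divide(6, 3))  # (2.0, ['try', 'else', 'finally'])
print(safe_divide(6, 0))  # (None, ['try', 'except', 'finally'])
```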

Specifying Exceptions

We can actually specify which error we want to handle in an except block like this:

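try:
    # some code
except ValueError:
    # some code

# Handle more than one type of exception in a single except block
try:
    # some code
except (ValueError, KeyboardInterrupt):
    # some code

# Run different code depending on the exception
try:
    # some code
except ValueError:
    # some code
except KeyboardInterrupt:
    # some code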
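
A concrete version of the pattern might look like the sketch below. The helper parse_int is hypothetical, and TypeError is used in place of KeyboardInterrupt only because it is easier to trigger in a short example:

```python
def parse_int(text):
    """Hypothetical helper: convert text to int, reporting why it failed."""
    try:
        return int(text)
    except ValueError:
        # int('abc') raises ValueError
        return 'not a number'
    except TypeError:
        # int(None) raises TypeError
        return 'wrong type'


print(parse_int('42'))   # 42
print(parse_int('abc'))  # not a number
print(parse_int(None))   # wrong type
```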

Introduction to NumPy

NumPy stands for Numerical Python and it’s a fundamental package for scientific computing in Python. NumPy provides Python with an extensive math library capable of performing numerical computations effectively and efficiently.

Creating NumPy ndarrays

ndarray — nd stands for n-dimensional. An ndarray is a multidimensional array of elements all of the same type.

# We create a 1D ndarray that contains only integers
# it is important to remember that np.array() is NOT a class, it is just a function that returns an ndarray.
import numpy as np
x = np.array([1, 2, 3, 4, 5])
print('x = ', x)
>>> x = [1 2 3 4 5]

Rank of an Array (numpy.ndarray.ndim)

  • It returns the number of array dimensions.
# 1-D array
x = np.array([1, 2, 3])
x.ndim
>>> 1
# 2-D array
Y = np.array([[1,2,3],[4,5,6],[7,8,9], [10,11,12]])
Y.ndim
>>> 2

# The tuple (2, 3, 4) passed as an argument represents the shape of the ndarray
y = np.zeros((2, 3, 4))
y.ndim
>>> 3

numpy.ndarray.shape

  • It returns a tuple representing the array dimensions.

numpy.dtype

The dtype attribute tells us the data-type of the elements. Remember, a NumPy array is homogeneous, meaning all elements will have the same data-type. In the example below, we will create a rank 1 array and learn how to obtain its shape, its type, and the data-type (dtype) of its elements.

Example 1

x = np.array([1, 2, 3, 4, 5])
print('x = ', x)
print('x has dimensions:', x.shape)
print('x is an object of type:', type(x))
print('The elements in x are of type:', x.dtype)

Example 2

Y = np.array([[1,2,3],[4,5,6],[7,8,9], [10,11,12]])

print('Y = \n', Y)

# We print information about Y
print('Y has dimensions:', Y.shape)
print('Y has a total of', Y.size, 'elements')
print('Y is an object of type:', type(Y))
print('The elements in Y are of type:', Y.dtype)

Example 3 — Save the NumPy array to a File

# We create a rank 1 ndarray
x = np.array([1, 2, 3, 4, 5])
# We save x into the current directory as my_array.npy
np.save('my_array', x)
# We load the saved array from our current directory into variable y
y = np.load('my_array.npy')
>>> y = [1 2 3 4 5]

Using Built-in Functions to Create ndarrays

  • np.zeros(shape) creates an ndarray full of zeros with the given shape, e.g. (rows, columns). By default, np.zeros() creates an array with dtype float64. If desired, the data type can be changed by using the keyword dtype. np.ones() works the same way but fills the array with ones.
  • np.full(shape, constant value) takes two arguments. The first argument is the shape of the ndarray you want to make and the second is the constant value you want to populate the array with.
  • np.eye(N) creates a square N x N ndarray corresponding to the identity matrix. Since identity matrices are square, np.eye() needs only a single integer as an argument.
  • np.diag() creates an ndarray corresponding to a diagonal matrix, i.e. a matrix whose elements outside the main diagonal are all zero.
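
The functions above can be tried out with a short sketch like the following. The variable names are only illustrative:

```python
import numpy as np

zeros = np.zeros((2, 3))            # 2 x 3 array of 0.0, dtype float64 by default
ones = np.ones((2, 3), dtype=int)   # the dtype keyword changes the element type
fives = np.full((2, 2), 5)          # 2 x 2 array filled with the constant 5
identity = np.eye(3)                # 3 x 3 identity matrix
diagonal = np.diag([10, 20, 30])    # 3 x 3 matrix with 10, 20, 30 on the main diagonal

print('zeros has dtype:', zeros.dtype)
print('ones has dtype:', ones.dtype)
print('fives = \n', fives)
print('identity = \n', identity)
print('diagonal = \n', diagonal)
```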

numpy.arange

numpy.arange([start, ]stop, [step, ]dtype=None)

Example 4— np.arange(start,stop,step)

# We create a rank 1 ndarray that has sequential integers from 0 to 9
x = np.arange(10)
>>> x = [0 1 2 3 4 5 6 7 8 9]
# We create a rank 1 ndarray that has sequential integers from 4 to 9.
# np.arange(start,stop)
x = np.arange(4,10)
>>> x = [4 5 6 7 8 9]
x = np.arange(1,14,3)
>>> x = [1 4 7 10 13]

numpy.linspace

numpy.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None, axis=0)
  • Even though the np.arange() function allows for non-integer steps, such as 0.3, the output is usually inconsistent, due to the finite floating point precision. For this reason, in the cases where non-integer steps are required, it is usually better to use the function np.linspace().
  • The np.linspace(start, stop, N) function returns N evenly spaced numbers over the closed interval [start, stop]. This means that both the start and the stop values are included. We should also note that np.linspace() needs to be called with at least two arguments, in the form np.linspace(start, stop). In this case, the default number of elements in the specified interval will be N = 50.
  • The reason np.linspace() works better than the np.arange() function, is that np.linspace() uses the number of elements we want in a particular interval, instead of the step between values. Let's see some examples:
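
A small sketch of the difference, using the illustrative names by_count and by_step: with np.linspace() we state how many values we want, while with np.arange() the number of values is derived from the step and can therefore be affected by floating point rounding.

```python
import numpy as np

# Ask for exactly 4 values from 0.1 to 0.4, both endpoints included
by_count = np.linspace(0.1, 0.4, 4)

# Ask for a step of 0.1; the number of elements is derived from
# (stop - start) / step, which is subject to floating point rounding
by_step = np.arange(0.1, 0.4, 0.1)

print('linspace gives', len(by_count), 'values:', by_count)
print('arange gives', len(by_step), 'values:', by_step)
```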

Example 5 — np.linspace(start, stop, n)

x = np.linspace(0,25,10)
>>> x = [ 0. 2.77777778 5.55555556 8.33333333 11.11111111 13.88888889 16.66666667 19.44444444 22.22222222 25. ]
# We create a rank 1 ndarray that has 10 evenly spaced values between 0 and 25,
# with 25 excluded.
x = np.linspace(0,25,10, endpoint = False)
>>> x = [ 0. 2.5 5. 7.5 10. 12.5 15. 17.5 20. 22.5]

numpy.reshape — This is a Function.

numpy.reshape(array, newshape, order='C')
  • So far, we have only used the built-in functions np.arange() and np.linspace() to create rank 1 ndarrays. However, we can use these functions to create rank 2 ndarrays of any shape by combining them with the np.reshape() function.
  • The np.reshape(ndarray, new_shape) function converts the given ndarray into the specified new_shape. It is important to note that the new_shape should be compatible with the number of elements in the given ndarray.
  • For example, you can convert a rank 1 ndarray with 6 elements, into a 3 x 2 rank 2 ndarray, or a 2 x 3 rank 2 ndarray, since both of these rank 2 arrays will have a total of 6 elements. However, you can't reshape the rank 1 ndarray with 6 elements into a 3 x 3 rank 2 ndarray, since this rank 2 array will have 9 elements, which is greater than the number of elements in the original ndarray. Let's see some examples:
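
The compatibility rule described above can be checked with a brief sketch. The flag reshape_failed is only an illustrative name:

```python
import numpy as np

x = np.arange(6)               # rank 1 ndarray with 6 elements

a = np.reshape(x, (3, 2))      # works: 3 * 2 == 6
b = np.reshape(x, (2, 3))      # works: 2 * 3 == 6

try:
    np.reshape(x, (3, 3))      # fails: 3 * 3 == 9, but x has only 6 elements
    reshape_failed = False
except ValueError as error:
    reshape_failed = True
    print('Cannot reshape:', error)

print(a.shape, b.shape, reshape_failed)  # (3, 2) (2, 3) True
```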

Example 6 —reshape() function.

# We create a rank 1 ndarray with sequential integers from 0 to 19
x = np.arange(20)
>>> Original x = [ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19]
# We reshape x into a 4 x 5 ndarray
x = np.reshape(x, (4,5))
>>>
Reshaped x =
[[ 0 1 2 3 4]
[ 5 6 7 8 9]
[10 11 12 13 14]
[15 16 17 18 19]]

numpy.ndarray.reshape — This one is a Method.

ndarray.reshape(shape, order='C')
  • ndarray methods are similar to ndarray attributes in that they are both applied using dot notation (.). Let's see how we can accomplish the same result as in the above example, but in just one line of code:

Example 7. Create a NumPy array by calling the reshape() method on the output of the arange() function.

# We create a rank 1 ndarray with sequential integers from 0 to 19 and
# reshape it to a 4 x 5 array
Y = np.arange(20).reshape(4, 5)
>>> Y =
[[ 0 1 2 3 4]
[ 5 6 7 8 9]
[10 11 12 13 14]
[15 16 17 18 19]]

Example 8 — Create a Numpy array using the numpy.random.random() function.

# We create a 3 x 3 ndarray with random floats in the half-open interval [0.0, 1.0).
X = np.random.random((3,3))
>>> X =
[[ 0.12379926 0.52943854 0.3443525 ]
[ 0.11169547 0.82123909 0.52864397]
[ 0.58244133 0.21980803 0.69026858]]

Example 9 — Create a Numpy array using the numpy.random.randint() function.

# We create a 3 x 2 ndarray with random integers in the half-open interval [4, 15).
X = np.random.randint(4,15,size=(3,2))
>>> X =
[[ 7 11]
[ 9 11]
[ 6 7]]

Example 10 — Create a Numpy array of “Normal” distributed random numbers, using the numpy.random.normal() function.

# We create a 1000 x 1000 ndarray of random floats drawn from a normal (Gaussian) distribution
# with a mean of zero and a standard deviation of 0.1.
X = np.random.normal(0, 0.1, size=(1000,1000))
# We print X
print()
print('X = \n', X)
print()
# We print information about X
print('X has dimensions:', X.shape)
print('X is an object of type:', type(X))
print('The elements in X are of type:', X.dtype)
print('The elements in X have a mean of:', X.mean())
print('The maximum value in X is:', X.max())
print('The minimum value in X is:', X.min())
print('X has', (X < 0).sum(), 'negative numbers')
print('X has', (X > 0).sum(), 'positive numbers')

Accessing, Deleting, and Inserting Elements Into ndarrays

  • NumPy ndarrays are mutable, meaning that the elements in ndarrays can be changed after the ndarray has been created. NumPy ndarrays can also be sliced.
  • Elements can be accessed using indices inside square brackets, [ ]. NumPy allows you to use both positive and negative indices to access elements in the ndarray.
  • We can also access and modify specific elements of rank 2 ndarrays. To access elements in rank 2 ndarrays we need to provide 2 indices in the form [row, column]. Let's see some examples.
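
As a short sketch of the points above (negative indices and in-place modification), with illustrative variable names:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
first = x[0]       # 1
last = x[-1]       # 5, negative indices count from the end
x[3] = 20          # ndarrays are mutable: change the 4th element in place

X = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
X[0, 0] = 20       # change the element at row 0, column 0

print(first, last)       # 1 5
print(x.tolist())        # [1, 2, 3, 20, 5]
print(X[0].tolist())     # [20, 2, 3]
```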

Example 1 — Access individual elements of 2-D array

# We create a 3 x 3 rank 2 ndarray that contains integers from 1 to 9
X = np.array([[1,2,3],[4,5,6],[7,8,9]])
# Let's access some elements in X
print('This is (0,0) Element in X:', X[0,0])
print('This is (2,2) Element in X:', X[2,2])

Example 2 — Delete elements

x = np.array([1, 2, 3, 4, 5])
Y = np.array([[1,2,3],[4,5,6],[7,8,9]])
# We delete the first and last element of x
x = np.delete(x, [0,4])
# We delete the first row of Y
w = np.delete(Y, 0, axis=0)
# We delete the first and last column of Y
v = np.delete(Y, [0,2], axis=1)

numpy.append

numpy.append(array, values, axis=None)

Example 3 — Append elements

x = np.array([1, 2, 3, 4, 5])
Y = np.array([[1,2,3],[4,5,6]])
# We append the integer 6 to x
x = np.append(x, 6)
# We append the integers 7 and 8 to x
x = np.append(x, [7,8])
# We append a new row containing 7,8,9 to Y
v = np.append(Y, [[7,8,9]], axis=0)
# We append a new column containing 9 and 10 to Y
q = np.append(Y,[[9],[10]], axis=1)

np.insert(ndarray, index, elements, axis)

This function inserts the given elements into the ndarray right before the given index along the specified axis. Let's see some examples:

Example 4— Insert elements

x = np.array([1, 2, 5, 6, 7])
Y = np.array([[1,2,3],[7,8,9]])
# We insert the integers 3 and 4 between 2 and 5 in x.
x = np.insert(x,2,[3,4])
# We insert a row between the first and last row of Y
w = np.insert(Y,1,[4,5,6],axis=0)
# We insert a column full of 5s between the first and second column of Y
v = np.insert(Y,1,5, axis=1)

numpy.hstack and numpy.vstack

numpy.hstack(sequence_of_ndarray)
numpy.vstack(sequence_of_ndarray)

Example 5 — Stack arrays

x = np.array([1,2])
Y = np.array([[3,4],[5,6]])
# We stack x on top of Y
z = np.vstack((x,Y))
# We stack x on the right of Y. We need to reshape x in order to stack it on the right of Y.
w = np.hstack((Y,x.reshape(2,1)))

Slicing ndarrays

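NumPy provides a way to access subsets of ndarrays. This is known as slicing. Slicing is performed by combining indices with the colon : symbol inside the square brackets. The basic form is ndarray[start:end], where the element at index end is not included. Either bound can be omitted, as in ndarray[start:] or ndarray[:end].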

Example 1. Slicing in a 2-D ndarray

# We create a 4 x 5 ndarray that contains integers from 0 to 19
X = np.arange(20).reshape(4, 5)
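# Indexing is [rows, columns]: rows run horizontally, columns run vertically
W = X[1:,2:5]   # rows 1 to the last row, columns 2 to 4
Y = X[:3,2:5]   # rows 0 to 2, columns 2 to 4
v = X[2,:]      # all elements in row 2 (rank 1 result)
q = X[:,2]      # all elements in column 2 (rank 1 result)
R = X[:,2:3]    # column 2 kept as a rank 2 ndarray of shape (4, 1)
Z = X[1:4,2:5]  # rows 1 to 3, columns 2 to 4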

numpy.ndarray.copy

ndarray.copy(order='C')
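
The notes give only the signature here. A brief sketch, assuming the point of this section is that a slice is a view of the original data while copy() returns an independent array:

```python
import numpy as np

X = np.arange(20).reshape(4, 5)

view = X[1:3, 1:3]            # a slice is a view onto X's data
view[0, 0] = -1               # this also changes X[1, 1]

copied = X[1:3, 1:3].copy()   # an independent copy of the same region
copied[1, 1] = -2             # X[2, 2] is left untouched

print(X[1, 1])  # -1
print(X[2, 2])  # 12
```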

Example 2a — Use an array as indices to either make slices, select, or change elements

# We create a 4 x 5 ndarray that contains integers from 0 to 19
X = np.arange(20).reshape(4, 5)
# We create a rank 1 ndarray that will serve as indices to select elements from X
indices = np.array([1,3])
# We use the indices ndarray to select the 2nd and 4th row of X
Y = X[indices,:]
# We use the indices ndarray to select the 2nd and 4th column of X
Z = X[:, indices]

Example 2b — Use an array as indices to extract specific rows from a rank 2 ndarray.

X = np.random.randint(1,20, size=(50,5))
>>> Shape of X is: (50, 5)
# Create a rank 1 ndarray that contains 10 randomly chosen integers from 0 up to (but not including) len(X), i.e. 50
# The row_indices represent the indices of rows of X
row_indices = np.random.randint(0,50, size=10)
>>> Random 10 indices are: [1 38 31 45 44 21 6 24 19 33]
# We use row_indices to extract the corresponding rows of X
X_subset = X[row_indices, :]
>>> Shape of X_subset is: (10, 5)

numpy.diag

numpy.diag(array, k=0)

Example 5. Demonstrate the diag() function

# We create a 5 x 5 ndarray that contains integers from 0 to 24
X = np.arange(25).reshape(5, 5)
# We print the elements in the main diagonal of X
print('z =', np.diag(X)) # default k=0
# We print the elements in the first diagonal above the main diagonal of X
print('y =', np.diag(X, k=1))
# We print the elements in the first diagonal below the main diagonal of X
print('w = ', np.diag(X, k=-1))

numpy.unique

numpy.unique(array, return_index=False, return_inverse=False, return_counts=False, axis=None)

Example 6. Demonstrate the unique() function

# Create 3 x 3 ndarray with repeated values
X = np.array([[1,2,3],[5,2,8],[1,2,3]])
# We print the unique elements of X
print('The unique elements in X are:',np.unique(X))

Boolean Indexing, Set Operations, and Sorting

Suppose we have a 10,000 x 10,000 ndarray of random integers ranging from 1 to 15,000 and we only want to select those integers that are less than 20. Boolean indexing can help us in these cases, by allowing us to select elements using logical arguments instead of explicit indices. Let's see some examples:

Example 1. Boolean indexing

# We create a 5 x 5 ndarray that contains integers from 0 to 24
X = np.arange(25).reshape(5, 5)
# We use Boolean indexing to select elements in X:
print('The elements in X that are greater than 10:', X[X > 10])
print('The elements in X that are less than or equal to 7:', X[X <= 7])
print('The elements in X that are between 10 and 17:', X[(X > 10) & (X < 17)])
# We use Boolean indexing to assign the elements that are between 10 and 17 the value of -1
X[(X > 10) & (X < 17)] = -1

Example 2. Set operations

x = np.array([1,2,3,4,5])
y = np.array([6,7,2,8,4])
# We use set operations to compare x and y:
print('The elements that are both in x and y:', np.intersect1d(x,y))
print('The elements that are in x that are not in y:', np.setdiff1d(x,y))
print('All the elements of x and y:',np.union1d(x,y))

numpy.ndarray.sort method

ndarray.sort(axis=-1, kind=None, order=None)
  • When used as a function, np.sort() returns a sorted copy and leaves the original array unchanged. On the other hand, when you use sort() as a method, ndarray.sort() sorts the ndarray in place, meaning that the original array will be changed to the sorted one.

Example 3. Sort arrays using sort() function

x = np.random.randint(1,11,size=(10,))
# We sort x and print the sorted array using sort as a function.
print('Sorted x (out of place):', np.sort(x))
# Returns the sorted unique elements of an array
print(np.unique(x))

Example 4. Sort rank-1 arrays using sort() method

# We create an unsorted rank 1 ndarray
x = np.random.randint(1,11,size=(10,))
# We sort x and print the sorted array using sort as a method.
x.sort()
# When we sort in place the original array is changed to the sorted array. To see this we print x again
print()
print('x after sorting:', x)

numpy.sort function

numpy.sort(array, axis=-1, kind=None, order=None)
  • If axis = None is specified explicitly, the array is flattened before sorting, and the function returns a 1-D array.
  • If axis = 0 is specified for a 2-D array, the values within each column are sorted independently of the other columns. In the output, each column is sorted from top to bottom.
  • If axis = 1 is specified for a 2-D array, the values within each row are sorted independently of the other rows. In the output, each row is sorted from left to right.
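
To make the effect of each axis setting easy to verify by eye, here is a small sketch on a fixed (non-random) array. The values are chosen only for illustration:

```python
import numpy as np

X = np.array([[3, 1, 2],
              [9, 7, 8],
              [6, 4, 5]])

flat = np.sort(X, axis=None)   # flattened, then sorted: 1-D result
cols = np.sort(X, axis=0)      # each column sorted from top to bottom
rows = np.sort(X, axis=1)      # each row sorted from left to right

print(flat.tolist())  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(cols.tolist())  # [[3, 1, 2], [6, 4, 5], [9, 7, 8]]
print(rows.tolist())  # [[1, 2, 3], [7, 8, 9], [4, 5, 6]]
```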

Example 5. Sort rank-2 arrays by specific axis.

# We create an unsorted rank 2 ndarray
X = np.random.randint(1,11,size=(5,5))
# We sort the columns of X and print the sorted array
print('X with sorted columns :\n', np.sort(X, axis = 0))
# We sort the rows of X and print the sorted array
print('X with sorted rows :\n', np.sort(X, axis = 1))

Arithmetic operations and Broadcasting

In order to do element-wise operations, NumPy sometimes uses something called Broadcasting. Broadcasting is the term used to describe how NumPy handles element-wise arithmetic operations with ndarrays of different shapes. For example, broadcasting is used implicitly when doing arithmetic operations between scalars and ndarrays.
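
As a rough sketch of what broadcasting allows and what it rejects, the snippet below adds a scalar, a row-shaped array, and a column-shaped array to a 3 x 3 array, then tries a shape that cannot be broadcast. The variable names are illustrative:

```python
import numpy as np

Y = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])   # shape (3, 3)
row = np.array([10, 20, 30])                      # shape (3,)
col = row.reshape(3, 1)                           # shape (3, 1)

plus_scalar = Y + 1    # the scalar is applied to every element
plus_row = Y + row     # row is added to each row of Y
plus_col = Y + col     # col is added to each column of Y

try:
    Y + np.array([1, 2])   # shape (2,) is not compatible with (3, 3)
    compatible = True
except ValueError:
    compatible = False

print(plus_row.tolist())  # [[11, 22, 33], [14, 25, 36], [17, 28, 39]]
print(plus_col.tolist())  # [[11, 12, 13], [24, 25, 26], [37, 38, 39]]
print(compatible)         # False
```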

Example 1. Element-wise arithmetic operations on 1-D arrays

x = np.array([1,2,3,4])
y = np.array([5.5,6.5,7.5,8.5])
# We perform basic element-wise operations using arithmetic symbols and functions
print('x + y = ', x + y)
print('add(x,y) = ', np.add(x,y))
print('x - y = ', x - y)
print('subtract(x,y) = ', np.subtract(x,y))
print('x * y = ', x * y)
print('multiply(x,y) = ', np.multiply(x,y))
print('x / y = ', x / y)
print('divide(x,y) = ', np.divide(x,y))

Example 2. Element-wise arithmetic operations on a 2-D array (Same shape)

X = np.array([1,2,3,4]).reshape(2,2)
Y = np.array([5.5,6.5,7.5,8.5]).reshape(2,2)
# We perform basic element-wise operations using arithmetic symbols and functions
print('X + Y = \n', X + Y)
print('add(X,Y) = \n', np.add(X,Y))

print('X - Y = \n', X - Y)
print('subtract(X,Y) = \n', np.subtract(X,Y))

print('X * Y = \n', X * Y)
print('multiply(X,Y) = \n', np.multiply(X,Y))

print('X / Y = \n', X / Y)
print('divide(X,Y) = \n', np.divide(X,Y))

Example 3. Additional mathematical functions

x = np.array([1,2,3,4])
# We apply different mathematical functions to all elements of x
print('EXP(x) =', np.exp(x))
print('SQRT(x) =',np.sqrt(x))
print('POW(x,2) =',np.power(x,2)) # We raise all elements to the power of 2

Example 4. Statistical functions

X = np.array([[1,2], [3,4]])
print('Average of all elements in X:', X.mean())
print('Average of all elements in the columns of X:', X.mean(axis=0))
print('Average of all elements in the rows of X:', X.mean(axis=1))
print('Sum of all elements in X:', X.sum())
print('Sum of all elements in the columns of X:', X.sum(axis=0))
print('Sum of all elements in the rows of X:', X.sum(axis=1))
print('Standard Deviation of all elements in X:', X.std())
print('Standard Deviation of all elements in the columns of X:', X.std(axis=0))
print('Standard Deviation of all elements in the rows of X:', X.std(axis=1))
print('Median of all elements in X:', np.median(X))
print('Median of all elements in the columns of X:', np.median(X,axis=0))
print('Median of all elements in the rows of X:', np.median(X,axis=1))
print('Maximum value of all elements in X:', X.max())
print('Maximum value of all elements in the columns of X:', X.max(axis=0))
print('Maximum value of all elements in the rows of X:', X.max(axis=1))
print('Minimum value of all elements in X:', X.min())
print('Minimum value of all elements in the columns of X:', X.min(axis=0))
print('Minimum value of all elements in the rows of X:', X.min(axis=1))

Example 5. Change value of all elements of an array

X = np.array([[1,2], [3,4]])
print('3 * X = \n', 3 * X)
print()
print('3 + X = \n', 3 + X)
print()
print('X - 3 = \n', X - 3)
print()
print('X / 3 = \n', X / 3)

Example 6. Arithmetic operations on 2-D arrays (Compatible shape)

x = np.array([1,2,3])
Y = np.array([[1,2,3],[4,5,6],[7,8,9]])
Z = np.array([1,2,3]).reshape(3,1)
print('x + Y = \n', x + Y)
print()
print('Z + Y = \n',Z + Y)

Summary

An accountant whose soul is woven from science and art. Loves drama and photography, but also enjoys data science.
