Udacity Data Scientist Nanodegree : Prerequisite — Python(L2, L3, L4)

Lesson 2: Data Types and Operators / Lesson 3: Control Flow / Lesson 4: Functions

Lesson 2: Data Types and Operators

Cautions!

  1. Spacing is important.
  2. Use error messages to help you learn.

In scientific notation, 4.445e8 is equal to 4.445 * 10 ** 8.

Arithmetic Operators

  • // Divides and rounds down to the nearest integer
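
A small sketch of how // behaves, including the negative case, where rounding down moves away from zero. The variable names are only for illustration.

```python
# Floor division rounds toward negative infinity, not toward zero.
positive = 7 // 2      # 3
negative = -7 // 2     # -4, because -3.5 rounds down to -4
as_float = 7.0 // 2    # 3.0, the result is a float if either operand is a float

print(positive, negative, as_float)
```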

Variables

1. Only use ordinary letters, numbers and underscores in your variable names. They can’t have spaces, and need to start with a letter or underscore.

2. You can’t use reserved words or built-in identifiers

3. The pythonic way to name variables is to use all lowercase letters and underscores to separate words. (Class names start with a capital letter; constants use all uppercase letters.)

Integers and Floats

Because the float, or approximation, for 0.1 is actually slightly more than 0.1, when we add several of them together we can see the difference between the mathematically correct answer and the one that Python creates.

>>> print(.1 + .1 + .1 == .3)
False
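
One common way to compare floats despite this rounding is to test whether they are close rather than exactly equal. The sketch below uses math.isclose from the standard library with its default tolerance; whether that tolerance suits a given calculation is an assumption you would need to check.

```python
import math

total = 0.1 + 0.1 + 0.1

print(total == 0.3)              # False, exact comparison fails
print(math.isclose(total, 0.3))  # True, the values agree within a small tolerance
```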

String

this_string = 'Simon\'s skateboard is in the garage.'
print(this_string)
>>> Simon's skateboard is in the garage.

Split()

new_str = "The cow jumped over the moon."
new_str.split()
>>> ['The', 'cow', 'jumped', 'over', 'the', 'moon.']
  • Here the separator is space, and the max split argument is set to 3.
new_str.split(' ', 3)
>>> ['The', 'cow', 'jumped', 'over the moon.']
  • Using ‘.’ or period as a separator.
new_str.split('.')
>>> ['The cow jumped over the moon', '']

Slice and Dice with Lists

Slicing returns the elements from the start index up to, but not including, the end index:

>>> list_of_random_things = [1, 3.4, 'a string', True]
>>> list_of_random_things[1:2]
[3.4]
>>> list_of_random_things[:2]
[1, 3.4]
>>> list_of_random_things[1:]
[3.4, 'a string', True]

Mutability and Order

>>> my_list = [1, 2, 3, 4, 5]
>>> my_list[0] = 'one'
>>> print(my_list)
['one', 2, 3, 4, 5]

As shown above, you are able to replace 1 with ‘one’ in the above list. This is because lists are mutable.

However, the following does not work:

>>> greeting = "Hello there"
>>> greeting[0] = 'M'

This is because strings are immutable. This means to change this string, you will need to create a completely new string.
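
As a rough illustration of what "create a completely new string" can look like, the sketch below catches the error from the failed assignment and then builds a new string with slicing. The name new_greeting is made up for this example.

```python
greeting = "Hello there"

try:
    greeting[0] = 'M'          # strings do not support item assignment
except TypeError as error:
    print("TypeError:", error)

# Build a new string instead of changing the old one.
new_greeting = 'M' + greeting[1:]
print(new_greeting)   # Mello there
print(greeting)       # Hello there, the original is unchanged
```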

There are two things to keep in mind for each of the data types you are using:

  1. Are they mutable?
  2. Are they ordered?

Order is about whether the position of an element in the object can be used to access the element. Both strings and lists are ordered. We can use the order to access parts of a list and string.

However, you will see some data types in the next sections that are unordered. For each of the upcoming data structures, it is useful to understand how you index it, whether it is mutable, and whether it is ordered.

Useful Functions for Lists

  1. join(): Join is a string method that takes a list of strings as an argument, and returns a string consisting of the list elements joined by a separator string.
name = "-".join(["García", "O'Kelly"])
print(name)
>>> García-O'Kelly

Tuple

Tuples can also be used to assign multiple variables in a compact way.

dimensions = 52, 40, 100
length, width, height = dimensions
print("The dimensions are {} x {} x {}".format(length, width, height))

The parentheses are optional when defining tuples, and programmers frequently omit them if parentheses don’t clarify the code.

In the second line, three variables are assigned from the content of the tuple dimensions. This is called tuple unpacking. You can use tuple unpacking to assign the information from a tuple into multiple variables without having to access them one by one and make multiple assignment statements.
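
Two other places where tuple unpacking tends to show up are swapping variables and collecting "the rest" of a sequence with a starred name. This is only a sketch; the names a, b, first and rest are illustrative.

```python
a, b = 1, 2
a, b = b, a                    # swap without a temporary variable

first, *rest = (52, 40, 100)   # the starred name collects the remaining items in a list

print(a, b)          # 2 1
print(first, rest)   # 52 [40, 100]
```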

Sets

numbers = [1, 2, 6, 3, 1, 1, 6]
unique_nums = set(numbers)
print(unique_nums)
>>> {1, 2, 3, 6}

Dictionaries

elements = {"hydrogen": 1, "helium": 2, "carbon": 6}

We can check whether a key is in a dictionary the same way we check whether a value is in a list or set: with the in keyword. Dicts have a related method that's also useful, get. get looks up values in a dictionary, but unlike square brackets, get returns None (or a default value of your choice) if the key isn't found.

print("carbon" in elements)
print(elements.get("dilithium"))

This would output:

True
None

Carbon is in the dictionary, so True is printed. Dilithium isn’t in our dictionary so None is returned by get and then printed. If you expect lookups to sometimes fail, get might be a better tool than normal square bracket lookups because errors can crash your program.
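
To make the difference concrete, here is a minimal sketch that contrasts a square bracket lookup with get and a default value. The default string "no such element" is an arbitrary choice for the example.

```python
elements = {"hydrogen": 1, "helium": 2, "carbon": 6}

# Square brackets raise KeyError for a missing key.
try:
    elements["dilithium"]
except KeyError as error:
    print("KeyError:", error)

# get() returns the supplied default instead of raising.
number = elements.get("dilithium", "no such element")
print(number)   # no such element
```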

You can check whether get returned None with the is operator. You can check for the opposite using is not.

n = elements.get("dilithium")
print(n is None)
print(n is not None)

This would output:

True
False

Dictionary keys must be immutable, that is, they must be of a type that is not modifiable, such as a string, a number, or a tuple of immutable values. A list cannot be used as a key.
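
A quick way to see this rule in action is to try a tuple and a list as keys. The dictionary name grid and its contents are invented for this sketch.

```python
grid = {(0, 0): "origin"}      # a tuple is immutable, so it can be a key

try:
    grid[[1, 0]] = "east"      # a list is mutable and therefore cannot be a key
except TypeError as error:
    print("TypeError:", error)

print(grid)   # {(0, 0): 'origin'}
```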

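is: It depends on the object. is checks identity, that is, whether two names refer to the same object, while == checks whether the values are equal.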
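
The following sketch shows how is and == can disagree for two lists with equal contents. The variable names are illustrative only.

```python
a = [1, 2, 3]
b = a            # b refers to the same list object as a
c = [1, 2, 3]    # c is a new list with equal contents

print(a == b, a is b)   # True True
print(a == c, a is c)   # True False
```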

Lesson 3: Control Flow

Indentation

If — Good and Bad Examples

  1. Be careful writing expressions that use logical operators
  2. Don’t compare a boolean variable with == True or == False
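
The two points above might look like this in practice. This is a sketch with made-up variables (weather, is_cold), not the examples from the lesson itself.

```python
weather = "sunny"
is_cold = False

# Bad: the right-hand side "rain" is a non-empty string, so this is always truthy.
bad_check = bool(weather == "snow" or "rain")

# Good: compare the variable on both sides of `or`.
good_check = weather == "snow" or weather == "rain"

# Bad:  if is_cold == False:
# Good: use the boolean itself, or `not` for the False case.
if not is_cold:
    message = "No coat needed"
else:
    message = "Wear a coat"

print(bad_check, good_check, message)   # True False No coat needed
```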

For Loops

An iterable is an object that can return one of its elements at a time. This can include sequence types, such as strings, lists, and tuples, as well as non-sequence types, such as dictionaries and files.
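
One way to see "one element at a time" is to drive an iterable by hand with the built-in iter and next functions, which is roughly what a for loop does for you. The sample data here is arbitrary.

```python
letters = ['a', 'b', 'c']
iterator = iter(letters)     # ask the iterable for an iterator

first = next(iterator)       # 'a'
second = next(iterator)      # 'b'
print(first, second)

# Iterating over a dictionary yields its keys.
keys = [key for key in {"hydrogen": 1, "helium": 2}]
print(keys)   # ['hydrogen', 'helium']
```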

Iterating Through Dictionaries with For Loops

cast = {
    "Jerry Seinfeld": "Jerry Seinfeld",
    "Julia Louis-Dreyfus": "Elaine Benes",
    "Jason Alexander": "George Costanza",
    "Michael Richards": "Cosmo Kramer"
}

If you wish to iterate through both keys and values, you can use the built-in method items like this:

for key, value in cast.items():
    print("Actor: {} Role: {}".format(key, value))

This outputs:

Actor: Jerry Seinfeld Role: Jerry Seinfeld
Actor: Julia Louis-Dreyfus Role: Elaine Benes
Actor: Jason Alexander Role: George Costanza
Actor: Michael Richards Role: Cosmo Kramer

While Loops

Break, Continue

  • break terminates a loop
  • continue skips one iteration of a loop
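
A short sketch of both keywords in one loop; the list name evens_seen and the cutoff of 6 are arbitrary choices for the example.

```python
evens_seen = []
for number in range(10):
    if number % 2 == 1:
        continue          # skip odd numbers and go on to the next iteration
    if number > 6:
        break             # leave the loop entirely
    evens_seen.append(number)

print(evens_seen)   # [0, 2, 4, 6]
```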

Zip and Enumerate

Zip

list(zip(['a', 'b', 'c'], [1, 2, 3])) would output [('a', 1), ('b', 2), ('c', 3)].

zip returns an iterator, so, like we did for range(), we need to convert it to a list or iterate through it with a loop to see the elements.

You could unpack each tuple in a for loop like this.

letters = ['a', 'b', 'c']
nums = [1, 2, 3]
for letter, num in zip(letters, nums):
    print("{}: {}".format(letter, num))

In addition to zipping two lists together, you can also unzip a list into tuples using an asterisk.

some_list = [('a', 1), ('b', 2), ('c', 3)]
letters, nums = zip(*some_list)

This would create the same letters and nums tuples we saw earlier.
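
Printing the unzipped results shows that they come back as tuples rather than lists. The sketch also shows that zip stops at the shortest input, which is easy to overlook; the name shortest is invented here.

```python
some_list = [('a', 1), ('b', 2), ('c', 3)]
letters, nums = zip(*some_list)

print(letters)   # ('a', 'b', 'c')
print(nums)      # (1, 2, 3)

# zip stops as soon as the shortest input runs out.
shortest = list(zip(['a', 'b', 'c'], [1, 2]))
print(shortest)  # [('a', 1), ('b', 2)]
```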

Enumerate

letters = ['a', 'b', 'c', 'd', 'e']
for i, letter in enumerate(letters):
    print(i, letter)

This code would output:

0 a
1 b
2 c
3 d
4 e

Some examples

x_coord = [23, 53, 2, -12, 95, 103, 14, -5]
y_coord = [677, 233, 405, 433, 905, 376, 432, 445]
z_coord = [4, 16, -6, -42, 3, -6, 23, -1]
labels = ["F", "J", "A", "Q", "Y", "B", "W", "X"]

points = []
for point in zip(labels, x_coord, y_coord, z_coord):
    points.append("{}: {}, {}, {}".format(*point))

for point in points:
    print(point)

Output:

F: 23, 677, 4
J: 53, 233, 16
A: 2, 405, -6
Q: -12, 433, -42
Y: 95, 905, 3
B: 103, 376, -6
W: 14, 432, 23
X: -5, 445, -1

Notice here, the tuple was unpacked using * in the format method. This can help make your code cleaner!

Transpose with Zip

data = ((0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11))
data_transpose = tuple(zip(*data))
print(data_transpose)

Output:

((0, 3, 6, 9), (1, 4, 7, 10), (2, 5, 8, 11))

List Comprehensions

capitalized_cities = []
for city in cities:
    capitalized_cities.append(city.title())

can be reduced to:

capitalized_cities = [city.title() for city in cities]

Conditionals in List Comprehensions

squares = [x**2 for x in range(9) if x % 2 == 0]
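
For reference, this is what the comprehension produces. The sketch also adds the if/else form, where the condition moves in front of the for; the labels list is a made-up example.

```python
# A trailing `if` filters which elements are kept.
squares = [x**2 for x in range(9) if x % 2 == 0]
print(squares)   # [0, 4, 16, 36, 64]

# An if/else expression goes before the `for` and keeps every element.
labels = ["even" if x % 2 == 0 else "odd" for x in range(4)]
print(labels)    # ['even', 'odd', 'even', 'odd']
```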

Examples:

Extract First Names

names = ["Rick Sanchez", "Morty Smith", "Summer Smith", "Jerry Smith", "Beth Smith"]
first_names = [name.split()[0].lower() for name in names]
print(first_names)
>>> ['rick', 'morty', 'summer', 'jerry', 'beth']

Lesson 4: Functions

  • Within this function body, we can refer to the argument variables and define new variables, which can only be used within these indented lines.

Function — Default Arguments

def cylinder_volume(height, radius=5):
    pi = 3.14159
    return height * pi * radius ** 2

It is possible to pass values in two ways: by position and by name. Each of these function calls is evaluated the same way.

cylinder_volume(10, 7)  # pass in arguments by position
cylinder_volume(height=10, radius=7) # pass in arguments by name
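
A sketch of the remaining case, where the default is actually used, alongside the two call styles. The result variable names are illustrative.

```python
def cylinder_volume(height, radius=5):
    pi = 3.14159
    return height * pi * radius ** 2

default_radius = cylinder_volume(10)             # radius falls back to 5
by_position = cylinder_volume(10, 7)
by_name = cylinder_volume(radius=7, height=10)   # order does not matter with names

print(default_radius)
print(by_position == by_name)   # True
```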

Variable Scope

If a variable is created inside a function, it can only be used within that function. Accessing it outside that function is not possible.

# This will result in an error
def some_function():
    word = "hello"

print(word)

In the example above and the example below, word is said to have scope that is only local to each function. This means you can use the same name for different variables that are used in different functions.

# This works fine
def some_function():
    word = "hello"

def another_function():
    word = "goodbye"

Variables defined outside functions, as in the example below, can still be accessed within a function. Here, word is said to have a global scope.

# This works fine
word = "hello"

def some_function():
    print(word)

some_function()

Notice that we can still access the value of the global variable word within this function. However, by default the value of a global variable cannot be modified inside the function: an assignment such as word += "!" makes word local to the function, so reading it there raises UnboundLocalError. If you want to modify that variable's value inside this function, it should be passed in as an argument.
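
A minimal sketch of the UnboundLocalError and of the "pass it in as an argument" alternative. The names count, broken_increment and increment are hypothetical.

```python
count = 0

def broken_increment():
    count += 1          # assignment makes `count` local, so reading it first fails

def increment(value):
    return value + 1    # take the value as an argument and return the new one

try:
    broken_increment()
except UnboundLocalError as error:
    print("UnboundLocalError:", error)

count = increment(count)
print(count)   # 1
```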

Documentation

def population_density(population, land_area):
    """Calculate the population density of an area."""
    return population / land_area

Docstrings are surrounded by triple quotes. The first line of the docstring is a brief explanation of the function’s purpose. If you feel that this is sufficient documentation you can end the docstring at this point; single line docstrings are perfectly acceptable, as in the example above.
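
When one line is not enough, a docstring can continue after a blank line. The layout below (Args and Returns sections) is one common convention, not the only one, and the wording of the descriptions is an assumption for this sketch.

```python
def population_density(population, land_area):
    """Calculate the population density of an area.

    Args:
        population: number of people living in the area.
        land_area: size of the area, in any consistent unit.

    Returns:
        population divided by land_area.
    """
    return population / land_area

# The docstring is stored on the function and is what help() displays.
print(population_density.__doc__.splitlines()[0])
```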

Lambda Expressions

With a lambda expression, this function:

def multiply(x, y):
    return x * y

can be reduced to:

multiply = lambda x, y: x * y

Both of these functions are used in the same way. In either case, we can call multiply like this:

multiply(4, 7)

This returns 28.
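
Lambdas are often passed directly to another function instead of being given a name. The sketch below uses the built-ins sorted and filter; the list of cities is sample data made up for the example.

```python
cities = ["New York City", "Los Angeles", "Chicago"]

# Sort by name length using a lambda as the key function.
short_first = sorted(cities, key=lambda name: len(name))
print(short_first)   # ['Chicago', 'Los Angeles', 'New York City']

# Keep only the names shorter than 10 characters.
short_names = list(filter(lambda name: len(name) < 10, cities))
print(short_names)   # ['Chicago']
```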

Iterators And Generators

An iterator is an object that represents a stream of data. This is different from a list, which is also an iterable, but is not an iterator because it is not a stream of data.

Generators are a simple way to create iterators using functions. You can also define iterators using classes.

Here is an example of a generator function called my_range, which produces an iterator that is a stream of numbers from 0 to (x - 1).

def my_range(x):
    i = 0
    while i < x:
        yield i
        i += 1

Notice that instead of using the return keyword, it uses yield. This allows the function to return values one at a time, and start where it left off each time it’s called. This yield keyword is what differentiates a generator from a typical function.

Remember, since this returns an iterator, we can convert it to a list or iterate through it in a loop to view its contents. For example, this code:

for x in my_range(5):
    print(x)

outputs:

0
1
2
3
4

Why Generators?

Generators are a lazy way to build iterables. They are useful when the fully realized list would not fit in memory, or when the cost to calculate each list element is high and you want to do it as late as possible. But they can only be iterated over once.
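
Both points, laziness and single use, can be seen in a few lines. This sketch repeats the my_range generator so that it runs on its own; the other names are illustrative.

```python
def my_range(x):
    i = 0
    while i < x:
        yield i
        i += 1

numbers = my_range(3)
first_pass = list(numbers)    # [0, 1, 2]
second_pass = list(numbers)   # [] because the generator is already exhausted

# A generator expression computes values on demand, without building a list first.
total = sum(n ** 2 for n in my_range(4))   # 0 + 1 + 4 + 9

print(first_pass, second_pass, total)      # [0, 1, 2] [] 14
```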

An accountant whose soul is woven from science and art; loves theater and photography, but also enjoys data science.

Joe Chao
