# Udacity Data Scientist Nanodegree : Prerequisite — Python(L2, L3, L4)

## Cautions!

1. Python is case sensitive.
2. Spacing is important.
3. Scientific notation: `4.445e8` means `4.445 * 10 ** 8`, and the result is always a float.

## Arithmetic Operators

• `**` Exponentiation (note that `^` does not do this operation, as you might have seen in other languages; in Python `^` is bitwise XOR)
• `//` Floor division: divides and rounds down to the nearest integer
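The difference between these operators is easy to check in an interpreter. The following is a small illustrative sketch; the variable names are made up for the example.

```python
# Exponentiation uses **; ^ is bitwise XOR in Python.
power = 2 ** 3       # 8
xor = 2 ^ 3          # 1, because binary 10 XOR 11 is 01

# Floor division rounds down (toward negative infinity), not toward zero.
floor_pos = 7 // 2   # 3
floor_neg = -7 // 2  # -4

print(power, xor, floor_pos, floor_neg)
```

Note the negative case: `-7 // 2` is `-4`, not `-3`, because the result is rounded down.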

## Variables

Assign: `x, y, z = 3, 4, 5` assigns three variables in one line, but it is usually less readable than separate assignments.

1. Only use ordinary letters, numbers and underscores in your variable names. They can’t have spaces, and need to start with a letter or underscore.

2. You can’t use reserved words, and you should avoid built-in identifiers such as `list` or `str`, since reusing them hides the built-in.

3. The pythonic way to name variables is to use all lowercase letters and underscores to separate words. By convention, class names capitalize the first letter of each word (`MyClass`) and constants use all uppercase letters (`MAX_SIZE`).

## Integers and Floats

`x = int(4.7)` gives `x == 4`: converting a float to an integer cuts off the decimal part without rounding.

Because the float, or approximation, for 0.1 is actually slightly more than 0.1, when we add several of them together we can see the difference between the mathematically correct answer and the one that Python creates.

`>>> print(.1 + .1 + .1 == .3)`
`False`
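When you need to compare floats, a common approach is to test for closeness rather than exact equality. The sketch below uses `math.isclose` from the standard library; the variable names are only for illustration.

```python
import math

total = 0.1 + 0.1 + 0.1

# Exact comparison fails because of the tiny approximation error.
exact_match = total == 0.3

# math.isclose compares within a small relative tolerance instead.
close_match = math.isclose(total, 0.3)

print(total, exact_match, close_match)
```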

## String

You can place a backslash (`\`) before a quote character to include that quote inside a string delimited by the same kind of quote:

`this_string = 'Simon\'s skateboard is in the garage.'`
`print(this_string)`
`>>> Simon's skateboard is in the garage.`

## Split()

• A basic split method:
`new_str = "The cow jumped over the moon."`
`new_str.split()`
`>>> ['The', 'cow', 'jumped', 'over', 'the', 'moon.']`
• Here the separator is a space, and the `maxsplit` argument is set to 3.
`new_str.split(' ', 3)`
`>>> ['The', 'cow', 'jumped', 'over the moon.']`
• Using ‘.’ or period as a separator.
`new_str.split('.')`
`>>> ['The cow jumped over the moon', '']`

## Slice and Dice with Lists

You saw that we can pull more than one value from a list at a time by using slicing. When using slicing, it is important to remember that the `lower` index is `inclusive` and the `upper` index is `exclusive`.

For example:

`>>> list_of_random_things = [1, 3.4, 'a string', True]`
`>>> list_of_random_things[1:2]`
`[3.4]`
`>>> list_of_random_things[:2]`
`[1, 3.4]`
`>>> list_of_random_things[1:]`
`[3.4, 'a string', True]`

## Mutability and Order

Mutability is about whether or not we can change an object once it has been created. If an object can be changed in place (like a list), it is called mutable. If an object cannot be changed without creating a completely new object (like a string), it is considered immutable.

`>>> my_list = [1, 2, 3, 4, 5]`
`>>> my_list[0] = 'one'`
`>>> print(my_list)`
`['one', 2, 3, 4, 5]`

As shown above, you are able to replace 1 with ‘one’ in the above list. This is because lists are mutable.

However, the following does not work:

`>>> greeting = "Hello there"`
`>>> greeting[0] = 'M'`
`TypeError: 'str' object does not support item assignment`

This is because strings are immutable. This means to change this string, you will need to create a completely new string.
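As a rough sketch of what "creating a completely new string" can look like, the example below catches the error from the failed assignment and then builds a new string with slicing. The name `new_greeting` is just an illustrative choice.

```python
greeting = "Hello there"

# Item assignment on a string raises TypeError.
try:
    greeting[0] = "M"
except TypeError as error:
    message = str(error)

# Build a new string instead; the original is left untouched.
new_greeting = "M" + greeting[1:]

print(message)
print(greeting, "->", new_greeting)
```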

There are two things to keep in mind for each of the data types you are using:

1. Are they mutable?
2. Are they ordered?

Order is about whether the position of an element in the object can be used to access the element. Both strings and lists are ordered. We can use the order to access parts of a list and string.

However, you will see some data types in the next sections that are unordered. For each of the upcoming data structures, it is useful to understand how you index it, whether it is mutable, and whether it is ordered. Knowing this about a data structure is really useful!

## Useful Functions for Lists

1. `sorted()`: returns a new list with the elements in order from smallest to largest, leaving the original list unchanged.
2. `join()`: a string method that takes a list of strings as an argument, and returns a string consisting of the list elements joined by the separator string.
`name = "-".join(["García", "O'Kelly"])`
`print(name)`
`>>> García-O'Kelly`
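The two functions above can be combined, as in this small sketch. The list `sizes` is made-up sample data. Because `join` only accepts strings, the numbers are converted with `str` first.

```python
sizes = [15, 6, 89, 34]

ascending = sorted(sizes)                 # new list, smallest to largest
descending = sorted(sizes, reverse=True)  # new list, largest to smallest

# join needs strings, so convert each number before joining.
joined = ", ".join(str(size) for size in ascending)

print(sizes)       # the original list is unchanged
print(ascending, descending)
print(joined)
```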

## Tuple

A tuple is another useful container. It’s a data type for immutable ordered sequences of elements: you can’t add or remove items from a tuple, or sort it in place. Tuples are often used to store related pieces of information.

Tuples can also be used to assign multiple variables in a compact way.

`dimensions = 52, 40, 100`
`length, width, height = dimensions`
`print("The dimensions are {} x {} x {}".format(length, width, height))`

The parentheses are optional when defining tuples, and programmers frequently omit them if parentheses don’t clarify the code.

In the second line, three variables are assigned from the content of the tuple dimensions. This is called tuple unpacking. You can use tuple unpacking to assign the information from a tuple into multiple variables without having to access them one by one and make multiple assignment statements.

## Sets

A set is a data type for mutable unordered collections of unique elements. One application of a set is to quickly remove duplicates from a list.

`numbers = [1, 2, 6, 3, 1, 1, 6]`
`unique_nums = set(numbers)`
`print(unique_nums)`
`>>> {1, 2, 3, 6}`
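Since sets are mutable and unordered, they support adding and removing elements but not indexing. The sketch below uses a made-up `fruit` set to show membership tests, `add`, and `pop`. Note that `pop` removes an arbitrary element, so the example does not assume which one.

```python
fruit = {"apple", "banana", "orange", "grapefruit"}

has_watermelon = "watermelon" in fruit  # membership test

fruit.add("watermelon")  # adds a new element
fruit.add("apple")       # already present, so nothing changes
count_after_add = len(fruit)

removed = fruit.pop()    # removes and returns an arbitrary element

print(has_watermelon, count_after_add, removed)
```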

## Dictionaries

A dictionary is a mutable data type that stores mappings of unique keys to values. Here’s a dictionary that stores elements and their atomic numbers.

`elements = {"hydrogen": 1, "helium": 2, "carbon": 6}`

We can check whether a key is in a dictionary the same way we check whether a value is in a list or set, with the `in` keyword. Dicts have a related method that's also useful, `get`. `get` looks up values in a dictionary, but unlike square brackets, `get` returns `None` (or a default value of your choice) if the key isn't found.

`print("carbon" in elements)`
`print(elements.get("dilithium"))`

This would output:

`True`
`None`

Carbon is in the dictionary, so True is printed. Dilithium isn’t in our dictionary so None is returned by `get` and then printed. If you expect lookups to sometimes fail, `get` might be a better tool than normal square bracket lookups because errors can crash your program.
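To make the contrast concrete, here is a minimal sketch comparing square-bracket lookup with `get`, including the optional default value. The fallback string is an arbitrary choice for the example.

```python
elements = {"hydrogen": 1, "helium": 2, "carbon": 6}

# Square brackets raise KeyError when the key is missing.
lookup_failed = False
try:
    elements["dilithium"]
except KeyError:
    lookup_failed = True

# get returns None, or the supplied default, instead of raising.
missing = elements.get("dilithium")
fallback = elements.get("dilithium", "There's no such element!")
present = elements.get("carbon", 0)

print(lookup_failed, missing, fallback, present)
```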

You can check whether a lookup returned `None` with the `is` operator. You can check for the opposite using `is not`.

`n = elements.get("dilithium")`
`print(n is None)`
`print(n is not None)`

This would output:

`True`
`False`

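Dictionary keys must be immutable, that is, they must be of a type that is not modifiable (more precisely, they must be hashable). Strings, numbers, and tuples of immutable values work as keys; lists and dictionaries do not.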
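A quick way to see this restriction is to try a tuple and a list as keys. The following sketch uses made-up coordinate data; the list key fails with a `TypeError`.

```python
# A tuple of numbers is immutable, so it works as a key.
locations = {(48.85, 2.35): "Paris"}

# A list is mutable, so using it as a key raises TypeError.
try:
    bad_locations = {[48.85, 2.35]: "Paris"}
except TypeError as error:
    reason = str(error)

print(locations[(48.85, 2.35)])
print(reason)
```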

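`is` checks identity rather than equality: two objects with equal values are not necessarily the same object, so the result of `is` depends on the objects being compared.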

# Lesson 3: Control flow

## Indentation

Spaces or Tabs? — The Python Style Guide recommends using 4 spaces to indent, rather than using a tab. Whichever you use, be aware that “Python 3 disallows mixing the use of tabs and spaces for indentation.”

## If — Good and Bad Examples

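1. Don’t use the literals `True` or `False` as conditions (for example `if True:`), since the branch would then either always run or never run.
2. Be careful writing expressions that use logical operators: each side of `and` or `or` should be a complete expression.
3. Don’t compare a boolean variable with `== True` or `== False`; use the variable itself, or `not` the variable.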
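The second and third points are easier to see with an example. This is an illustrative sketch with made-up variables (`weather`, `is_cold`), showing a typical logical-operator mistake next to two correct versions.

```python
weather = "sunny"

# Pitfall: this is parsed as (weather == "snow") or "rain".
# A non-empty string is truthy, so the condition is always true.
buggy = bool(weather == "snow" or "rain")

# Each side of `or` is a complete comparison.
correct = weather == "snow" or weather == "rain"

# A membership test is a concise alternative.
concise = weather in ("snow", "rain")

is_cold = True
if is_cold:  # preferred over `if is_cold == True:`
    advice = "Wear a coat"
else:
    advice = "No coat needed"

print(buggy, correct, concise, advice)
```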

## For Loops

A `for` loop is used to "iterate", or do something repeatedly, over an iterable.

An iterable is an object that can return one of its elements at a time. This can include sequence types, such as strings, lists, and tuples, as well as non-sequence types, such as dictionaries and files.
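As a small sketch of the same `for` syntax working across different iterables, the loop below walks over a string, a tuple, and a dictionary (iterating a dictionary directly yields its keys). The `visited` list is only there to collect what each loop sees.

```python
visited = []

for letter in "abc":            # a string yields its characters
    visited.append(letter)

for number in (1, 2):           # a tuple yields its elements
    visited.append(number)

for key in {"x": 10, "y": 20}:  # a dictionary yields its keys
    visited.append(key)

print(visited)
```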

## Iterating Through Dictionaries with `For` Loops

`cast = {`
`    "Jerry Seinfeld": "Jerry Seinfeld",`
`    "Julia Louis-Dreyfus": "Elaine Benes",`
`    "Jason Alexander": "George Costanza",`
`    "Michael Richards": "Cosmo Kramer"`
`}`

If you wish to iterate through both keys and values, you can use the built-in method `items` like this:

`for key, value in cast.items():`
`    print("Actor: {}    Role: {}".format(key, value))`

This outputs:

`Actor: Jerry Seinfeld    Role: Jerry Seinfeld`
`Actor: Julia Louis-Dreyfus    Role: Elaine Benes`
`Actor: Jason Alexander    Role: George Costanza`
`Actor: Michael Richards    Role: Cosmo Kramer`

## `While` Loops

`For` loops are an example of "definite iteration", meaning that the loop's body is run a predefined number of times. This differs from "indefinite iteration", in which a loop repeats an unknown number of times and ends when some condition is met; this is what happens in a `while` loop.

`pop` is a list method that removes the last element from a list and returns it, which makes it handy for consuming a list inside a `while` loop.
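Here is a minimal sketch of indefinite iteration using `pop`. The card values and the threshold of 17 are made-up sample data; the loop stops as soon as the hand total reaches the threshold, or the deck runs out.

```python
card_deck = [4, 11, 8, 5, 13, 2, 8, 10]
hand = []

# Keep drawing from the end of the deck until the hand totals 17 or more.
while card_deck and sum(hand) < 17:
    hand.append(card_deck.pop())

print(hand)
print(card_deck)
```

How many times the loop runs depends on the data, not on a count fixed in advance.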

## Break, Continue

Sometimes we need more control over when a loop should end, or skip an iteration. In these cases, we use the `break` and `continue` keywords, which can be used in both `for` and `while` loops.

• `break` terminates a loop
• `continue` skips one iteration of a loop
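The two keywords can be sketched together in one loop. In this illustrative example (the numbers are arbitrary), negative values are skipped with `continue`, and the loop stops entirely at the first value above 10.

```python
numbers = [3, 8, -1, 6, 12, 5]
evens = []

for n in numbers:
    if n < 0:
        continue  # skip this iteration, move on to the next number
    if n > 10:
        break     # leave the loop; 5 is never examined
    if n % 2 == 0:
        evens.append(n)

print(evens)
```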

## Zip and Enumerate

`zip` and `enumerate` are useful built-in functions that can come in handy when dealing with loops.

## Zip

`zip` returns an iterator that combines multiple iterables into one sequence of tuples. Each tuple contains the elements in that position from all the iterables. For example, printing

`list(zip(['a', 'b', 'c'], [1, 2, 3]))` would output `[('a', 1), ('b', 2), ('c', 3)]`.

As with `range()`, we need to convert it to a list or iterate through it with a loop to see the elements.

You could unpack each tuple in a `for` loop like this.

`letters = ['a', 'b', 'c']`
`nums = [1, 2, 3]`
`for letter, num in zip(letters, nums):`
`    print("{}: {}".format(letter, num))`

In addition to zipping two lists together, you can also unzip a list into tuples using an asterisk.

`some_list = [('a', 1), ('b', 2), ('c', 3)]`
`letters, nums = zip(*some_list)`

This would create tuples `letters` and `nums` with the same contents as the lists we saw earlier.

## Enumerate

`enumerate` is a built in function that returns an iterator of tuples containing indices and values of a list. You'll often use this when you want the index along with each element of an iterable in a loop.

`letters = ['a', 'b', 'c', 'd', 'e']`
`for i, letter in enumerate(letters):`
`    print(i, letter)`

This code would output:

`0 a`
`1 b`
`2 c`
`3 d`
`4 e`

Some examples

`x_coord = [23, 53, 2, -12, 95, 103, 14, -5]`
`y_coord = [677, 233, 405, 433, 905, 376, 432, 445]`
`z_coord = [4, 16, -6, -42, 3, -6, 23, -1]`
`labels = ["F", "J", "A", "Q", "Y", "B", "W", "X"]`
`points = []`
`for point in zip(labels, x_coord, y_coord, z_coord):`
`    points.append("{}: {}, {}, {}".format(*point))`
`for point in points:`
`    print(point)`

Output:

`F: 23, 677, 4`
`J: 53, 233, 16`
`A: 2, 405, -6`
`Q: -12, 433, -42`
`Y: 95, 905, 3`
`B: 103, 376, -6`
`W: 14, 432, 23`
`X: -5, 445, -1`

Notice here, the tuple was unpacked using `*` in the `format` method. This can help make your code cleaner!

Transpose with Zip

`data = ((0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11))`
`data_transpose = tuple(zip(*data))`
`print(data_transpose)`

Output:

`((0, 3, 6, 9), (1, 4, 7, 10), (2, 5, 8, 11))`

## List Comprehensions

In Python, you can create lists really quickly and concisely with list comprehensions. This example from earlier:

`capitalized_cities = []`
`for city in cities:`
`    capitalized_cities.append(city.title())`

can be reduced to:

`capitalized_cities = [city.title() for city in cities]`

## Conditionals in List Comprehensions

You can also add conditionals to list comprehensions (listcomps). After the iterable, you can use the `if` keyword to check a condition in each iteration.

`squares = [x**2 for x in range(9) if x % 2 == 0]`
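To show what the filter above produces, and how it differs from a conditional that chooses between two values, here is a small sketch. The second comprehension (with `else`) is an added illustration: when an `else` is needed, the conditional expression goes before the `for` rather than after the iterable.

```python
# Filter: keep only the even numbers, then square them.
squares = [x**2 for x in range(9) if x % 2 == 0]

# Choose per element: square the evens, add 3 to the odds.
# The if/else expression comes before the `for` in this form.
mixed = [x**2 if x % 2 == 0 else x + 3 for x in range(9)]

print(squares)
print(mixed)
```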

Examples:

Extract First Names

`names = ["Rick Sanchez", "Morty Smith", "Summer Smith", "Jerry Smith", "Beth Smith"]`
`first_names = [name.split()[0].lower() for name in names]`
`print(first_names)`
`>>> ['rick', 'morty', 'summer', 'jerry', 'beth']`

# Lesson 4: Functions

• Arguments, or parameters, are values that are passed in as inputs when the function is called, and are used in the function body. If a function doesn’t take arguments, the parentheses after the function name are left empty.
• Within the function body, we can refer to the argument variables and define new variables, which can only be used within the indented lines of that function.

## Function — Default Arguments

We can add default arguments in a function to have default values for parameters that are unspecified in a function call.

`def cylinder_volume(height, radius=5):`
`    pi = 3.14159`
`    return height * pi * radius ** 2`

It is possible to pass values in two ways: by position and by name. Each of these function calls is evaluated the same way.

`cylinder_volume(10, 7)  # pass in arguments by position`
`cylinder_volume(height=10, radius=7)  # pass in arguments by name`
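Putting the definition and the calls together, the sketch below also shows a call that omits `radius` so the default of 5 is used. The result variable names are illustrative.

```python
def cylinder_volume(height, radius=5):
    pi = 3.14159
    return height * pi * radius ** 2

default_volume = cylinder_volume(10)           # radius falls back to 5
by_position = cylinder_volume(10, 7)           # arguments by position
by_name = cylinder_volume(height=10, radius=7) # arguments by name
mixed = cylinder_volume(10, radius=7)          # positional first, then named

print(default_volume, by_position, by_name, mixed)
```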

## Variable Scope

Variable scope refers to which parts of a program a variable can be referenced, or used, from.

If a variable is created inside a function, it can only be used within that function. Accessing it outside that function is not possible.

`# This will result in an error`
`def some_function():`
`    word = "hello"`
`print(word)`

In the example above and the example below, `word` is said to have scope that is only local to each function. This means you can use the same name for different variables that are used in different functions.

`# This works fine`
`def some_function():`
`    word = "hello"`
`def another_function():`
`    word = "goodbye"`

Variables defined outside functions, as in the example below, can still be accessed within a function. Here, `word` is said to have a global scope.

`# This works fine`
`word = "hello"`
`def some_function():`
`    print(word)`
`some_function()`

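Notice that we can still access the value of the global variable `word` within this function. However, assigning to a global variable's name inside a function creates a new local variable instead of changing the global one, and reading the name before that assignment raises an `UnboundLocalError`. If you want a function to work with a changed value, pass the value in as an argument and return the result (or, less commonly, declare the name with the `global` keyword).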
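The `UnboundLocalError` and the argument-passing alternative can be sketched as follows. The names `egg_count`, `buy_eggs_broken`, and `buy_eggs` are hypothetical, chosen only for this illustration.

```python
egg_count = 0

def buy_eggs_broken():
    # The assignment makes egg_count local to this function, so reading
    # it on the right-hand side fails with UnboundLocalError.
    egg_count += 12

def buy_eggs(count):
    # Take the value as an argument and return the updated value.
    return count + 12

failed = False
try:
    buy_eggs_broken()
except UnboundLocalError:
    failed = True

egg_count = buy_eggs(egg_count)
print(failed, egg_count)
```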

## Documentation

Functions are especially readable because they often use documentation strings, or docstrings. Docstrings are a type of comment used to explain the purpose of a function, and how it should be used. Here’s a function for population density with a docstring.

`def population_density(population, land_area):`
`    """Calculate the population density of an area. """`
`    return population / land_area`

Docstrings are surrounded by triple quotes. The first line of the docstring is a brief explanation of the function’s purpose. If you feel that this is sufficient documentation you can end the docstring at this point; single line docstrings are perfectly acceptable, as in the example above.
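If one line is not enough, a docstring can continue after a blank line. The sketch below shows one possible layout (the `Args`/`Returns` wording is an assumed style, not a requirement) and reads the docstring back through the function's `__doc__` attribute.

```python
def population_density(population, land_area):
    """Calculate the population density of an area.

    Args:
        population: int. The population of the area.
        land_area: int or float. The area, in any consistent unit.

    Returns:
        The population divided by the land area.
    """
    return population / land_area

# The docstring is stored on the function object.
summary = population_density.__doc__.splitlines()[0]
print(summary)
print(population_density(864816, 121.4))
```

Calling `help(population_density)` in an interpreter displays the same text.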

## Lambda Expressions

You can use lambda expressions to create anonymous functions. That is, functions that don’t have a name. They are helpful for creating quick functions that aren’t needed later in your code. This can be especially useful for higher order functions, or functions that take in other functions as arguments.

With a lambda expression, this function:

`def multiply(x, y):`
`    return x * y`

can be reduced to:

`multiply = lambda x, y: x * y`

Both of these functions are used in the same way. In either case, we can call `multiply` like this:

`multiply(4, 7)`

This returns 28.
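Lambdas are most often seen as arguments to higher order functions. As an illustrative sketch with made-up data, the example below passes a lambda to the built-ins `filter` and `sorted`.

```python
cities = ["New York City", "Los Angeles", "Chicago",
          "Mountain View", "Denver", "Boston"]

# filter keeps the items for which the lambda returns True.
short_cities = list(filter(lambda name: len(name) < 10, cities))

pairs = [("b", 2), ("a", 3), ("c", 1)]

# sorted uses the lambda to compute the sort key for each item.
by_number = sorted(pairs, key=lambda pair: pair[1])

print(short_cities)
print(by_number)
```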

## Iterators And Generators

Iterables are objects that can return one of their elements at a time, such as a list. Many of the built-in functions we’ve used so far, like `enumerate`, return an iterator.

An iterator is an object that represents a stream of data. This is different from a list, which is also an iterable, but is not an iterator because it is not a stream of data.
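The distinction can be sketched with the built-ins `iter` and `next`: a list is an iterable, and `iter` creates an iterator over it that hands out one element per `next` call until it is exhausted.

```python
letters = ["a", "b"]

letter_iter = iter(letters)  # create an iterator over the list

first = next(letter_iter)
second = next(letter_iter)

# Once the stream is used up, next raises StopIteration.
exhausted = False
try:
    next(letter_iter)
except StopIteration:
    exhausted = True

print(first, second, exhausted)
print(letters)  # the list itself is unchanged and can be iterated again
```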

Generators are a simple way to create iterators using functions. You can also define iterators using classes.

Here is an example of a generator function called `my_range`, which produces an iterator that is a stream of numbers from 0 to (x - 1).

`def my_range(x):`
`    i = 0`
`    while i < x:`
`        yield i`
`        i += 1`

Notice that instead of using the return keyword, it uses `yield`. This allows the function to return values one at a time, and start where it left off each time it’s called. This `yield` keyword is what differentiates a generator from a typical function.

Remember, since this returns an iterator, we can convert it to a list or iterate through it in a loop to view its contents. For example, this code:

`for x in my_range(5):`
`    print(x)`

outputs:

`0`
`1`
`2`
`3`
`4`

## Why Generators?

You may be wondering why we’d use generators over lists. Here’s an excerpt from a Stack Overflow page that addresses this:

Generators are a lazy way to build iterables. They are useful when the fully realized list would not fit in memory, or when the cost to calculate each list element is high and you want to do it as late as possible. But they can only be iterated over once.
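Both points in the excerpt can be sketched with the `my_range` generator from above: values are produced only on demand, and a generator is empty after one pass. The very large bound is an arbitrary illustration of a sequence that would not be practical to hold as a list.

```python
def my_range(x):
    i = 0
    while i < x:
        yield i
        i += 1

# Lazy: only the three requested values are ever computed.
huge = my_range(10**12)
first_three = [next(huge) for _ in range(3)]

# Single use: a second pass over the same generator yields nothing.
stream = my_range(3)
first_pass = list(stream)
second_pass = list(stream)

print(first_three, first_pass, second_pass)
```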
