[C2, Week2] The Bits and Bytes of Computer Networking (Google IT Support): Introduction to Networking

Joe Chao
14 min read · Feb 21, 2021


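This week's concepts started to get tricky. Beyond the course material, I had to look up supplementary sources to explain things clearly. This week mainly covers layer 3, the Network Layer, whose job is to let data travel across different networks. It also touches on concepts such as subnetting and routing.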

The Network Layer

The Network Layer

On a local area network or LAN, nodes can communicate with each other through their physical MAC addresses. There is no way of knowing where on the planet a certain MAC address might be at any one point in time, so it’s not ideal for communicating across distances.

On a local area network, nodes can talk to each other just fine through MAC addresses. But MAC addresses are not arranged in any systematic order, and it is hard to know where on the planet a given MAC address is at any point in time, so they are not suited to long-distance communication.

IP Addresses

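IP addresses are 32-bit numbers made up of four octets, and each octet is normally described in decimal numbers.

An IP address is really 32 bits grouped into four octets (bytes), and each octet is usually written in decimal, e.g. 00001100.00100010.00111000.01001110 = 12.34.56.78, which is a valid IP. Because 2⁸ is 256, each octet can only represent the numbers 0–255 (256 values in total), so something like 123.456.789.100 is invalid because some of its octets exceed 255. This decimal representation is called dotted decimal notation.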
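To make the octet rule concrete, here is a minimal Python sketch that checks a dotted-decimal string and shows its binary form. The function names `parse_dotted_decimal` and `to_binary_octets` are my own, not from the course.

```python
def parse_dotted_decimal(address: str) -> int:
    """Convert a dotted-decimal IPv4 string into its 32-bit integer value."""
    octets = address.split(".")
    if len(octets) != 4:
        raise ValueError(f"expected 4 octets, got {len(octets)}")
    value = 0
    for text in octets:
        octet = int(text)
        if not 0 <= octet <= 255:
            raise ValueError(f"octet {octet} is outside 0-255")
        value = (value << 8) | octet  # shift in one octet at a time
    return value


def to_binary_octets(address: str) -> str:
    """Render an address as four 8-bit binary groups separated by dots."""
    value = parse_dotted_decimal(address)
    return ".".join(f"{(value >> shift) & 0xFF:08b}" for shift in (24, 16, 8, 0))


print(to_binary_octets("12.34.56.78"))
```

Calling `parse_dotted_decimal("123.456.789.100")` raises a `ValueError`, because 456 does not fit in eight bits.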

The important thing to know for now is that IP addresses are distributed in large sections to various organisations and companies instead of being determined by hardware vendors.

IP addresses belong to the networks, not the devices attached to those networks.

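A device's IP address can be assigned to it automatically through a technology known as Dynamic Host Configuration Protocol (DHCP). An address assigned this way is known as a dynamic IP address.

In most cases static IP addresses are reserved for servers and network devices, while dynamic IP addresses are reserved for clients.

One important point: unlike MAC addresses, which are determined by hardware vendors, IP addresses are distributed in large blocks to various organizations and companies. This makes IP addresses more hierarchical and makes it easier to store data about them.

Because an IP address belongs to the network rather than living on the hardware itself like a MAC address, it changes depending on which network you are using.

So how are IP addresses handed out? That is done automatically through Dynamic Host Configuration Protocol (DHCP). An address assigned this way is called a dynamic IP address. The opposite is a static IP address, which has to be configured manually on the node.

Generally speaking, static IP addresses are reserved for servers and network devices, while dynamic IPs are reserved for clients.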

IP Datagrams and Encapsulation

Under the IP protocol, a packet is usually referred to as an IP datagram.

Just like any Ethernet frame, an IP datagram is a highly structured series of fields that are strictly defined.

The two primary sections of an IP datagram are the header and the payload.

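Just as an Ethernet data packet has its own name, the Ethernet frame, an IP data packet is called an IP datagram. Like an Ethernet frame, an IP datagram has a strict, clearly defined structure. It has two parts: first the header, then the payload.

Fields of the IP header

This diagram is very useful because it lays out the header clearly. Note that the full field order is read left to right, top to bottom.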

Version: Indicates what version of Internet protocol is being used. The most common version of IP is version four or IPv4.

Header Length field: declares how long the entire header is. This is almost always 20 bytes in length when dealing with IPv4. In fact, 20 bytes is the minimum length of an IP header.

Service Type field: These eight bits can be used to specify details about quality of service or QoS technologies. The important takeaway about QoS is that there are services that allow routers to make decisions about which IP datagram may be more important than others.

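Total Length field: indicates the total length of the IP datagram it’s attached to.

Starting with the four fields in the first row: Version simply tells you which version of IP is in use, usually IPv4 (IP version 4). Header Length states how long the whole header is; under IPv4 it is at least 20 bytes. Service Type carries details about Quality of Service (QoS). What matters about QoS is that it lets routers decide which datagrams are more important than others. Total Length, as the name suggests, tells you how long the entire IP datagram is, header and payload (data) included.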

Identification field: 16-bit number that’s used to group messages together.

IP datagrams have a maximum size. Since the Total Length field is 16 bits, and this field indicates the size of an individual datagram, the maximum size of a single datagram is the largest number you can represent with 16 bits: 65,535.

If the total amount of data that needs to be sent is larger than what can fit in a single datagram, the IP layer needs to split this data up into many individual packets. When this happens, the identification field is used so that the receiving end understands that every packet with the same value in that field is part of the same transmission.

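The Identification field is used to group messages together. An IP datagram (packet) has a maximum size of 2¹⁶ - 1 = 65,535 bytes, the largest value the 16-bit Total Length field can hold. Data larger than that is split into many individual packets, and this is where the Identification field comes in: when the receiving end sees packets carrying the same value, it knows they all belong to the same transmission.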

Flag field: used to indicate if a datagram is allowed to be fragmented, or to indicate that the datagram has already been fragmented.

Fragmentation: the process of taking a single IP datagram and splitting it up into several smaller datagrams.

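Next is the Flag field, which indicates whether a datagram is allowed to be fragmented or has already been fragmented. Fragmentation is the process of splitting a single IP datagram into several smaller datagrams.
The reason for this is that when a datagram crosses from a network that allows larger datagrams into one that only allows smaller ones, it has to be fragmented before it can be transmitted. Besides splitting the data up, the fragmentation mechanism also makes it possible to put the pieces back in their original order when they arrive.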
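The splitting and regrouping can be sketched in a few lines of Python. This is a simplified model and not the real wire format: the `Fragment` class, its field names and the byte-based `offset` are my own assumptions (real IPv4 stores the offset in 8-byte units next to the flag bits).

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    identification: int   # same value for every piece of one transmission
    offset: int           # byte position of this piece in the original payload
    more_fragments: bool  # False only on the last piece
    data: bytes


def fragment(payload: bytes, identification: int, max_data: int) -> list[Fragment]:
    """Split a payload into pieces of at most max_data bytes."""
    pieces = []
    for offset in range(0, len(payload), max_data):
        chunk = payload[offset:offset + max_data]
        last = offset + max_data >= len(payload)
        pieces.append(Fragment(identification, offset, not last, chunk))
    return pieces


def reassemble(fragments: list[Fragment]) -> bytes:
    """Put fragments that share an identification back in their original order."""
    ordered = sorted(fragments, key=lambda f: f.offset)
    return b"".join(f.data for f in ordered)
```

Even if the pieces arrive out of order, sorting by offset restores the original payload, and the shared identification value is what tells the receiver which pieces belong together.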

Time to Live(TTL) field: an 8-bit field that indicates how many router hops a datagram can traverse before it’s thrown away

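This field sets how many routers a datagram can pass through before it is discarded. Every time the datagram passes through a router, the TTL is decremented by one, and when it reaches 0 the datagram is dropped. The purpose of this field is to keep datagrams from looping forever.

TTL diagram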
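As a rough illustration of the countdown, the sketch below walks a datagram across a number of routers. The function `hops_survived` is a made-up helper for these notes, not part of any networking library.

```python
def hops_survived(ttl: int, routers_on_path: int) -> tuple[int, bool]:
    """Walk a datagram across routers, decrementing TTL at each one.

    Returns (routers actually crossed, whether the datagram was delivered).
    """
    crossed = 0
    for _ in range(routers_on_path):
        ttl -= 1                    # each router decrements the TTL by one
        if ttl <= 0:
            return crossed, False   # discarded at this router
        crossed += 1
    return crossed, True


print(hops_survived(64, 3))
print(hops_survived(2, 5))
```

With a TTL of 64 the datagram easily crosses three routers. With a TTL of 2 it gets past the first router and is dropped at the second.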

Protocol field: 8-bit field that contains data about what transport layer protocol is being used. e.g. TCP or UDP

Header checksum field: a checksum of the contents of the entire IP datagram header

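This is much like the checksum value mentioned for Ethernet frames. Because the TTL changes at every router, the checksum necessarily changes as well and has to be recalculated at each hop.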
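The course does not spell out how the checksum is computed. As far as I know, IPv4 uses the standard Internet checksum: add up the header as 16-bit words, fold the carries back in, and invert the result. Here is a minimal sketch under that assumption. The sample header bytes are an illustrative UDP datagram from 192.168.0.1 to 192.168.0.199, with the checksum field set to zero before computing.

```python
import struct


def ipv4_checksum(header: bytes) -> int:
    """One's-complement sum of the header's 16-bit words, then inverted."""
    if len(header) % 2:
        header += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", header):
        total += word
    while total >> 16:                      # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# 20-byte header with the checksum field (bytes 10-11) zeroed out
header = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
print(hex(ipv4_checksum(header)))

# Decrementing the TTL (byte 8) changes the result, so a router must recompute it
after_hop = header[:8] + bytes([header[8] - 1]) + header[9:]
print(hex(ipv4_checksum(after_hop)))
```

A handy property of this scheme is that running the same function over a header that already contains the correct checksum gives 0, which is how a receiver can verify it.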

Source IP address(32 bits)

Destination IP address

IP Options field: an optional field and is used to set special characteristics for datagrams primarily used for testing purposes.

Optional. Mainly used for testing and for setting special characteristics on datagrams.

Padding field: just a series of zeros used to ensure the header is the correct total size.

Fills whatever space is left with zeros.
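Putting the fields together, here is a sketch that unpacks the fixed 20-byte part of a header with Python's `struct` module. The function name and dictionary keys are my own. The fragment offset, which shares 16 bits with the flags, is not covered in the notes above and is included only so the bits add up. Options and padding are ignored.

```python
import struct
from ipaddress import IPv4Address


def parse_ipv4_header(raw: bytes) -> dict:
    """Unpack the fixed 20-byte part of an IPv4 header into named fields."""
    (version_ihl, service_type, total_length, identification,
     flags_offset, ttl, protocol, checksum, source, destination) = struct.unpack(
        "!BBHHHBBH4s4s", raw[:20])
    return {
        "version": version_ihl >> 4,
        "header_length": (version_ihl & 0x0F) * 4,   # stored as 32-bit words
        "service_type": service_type,
        "total_length": total_length,
        "identification": identification,
        "flags": flags_offset >> 13,
        "fragment_offset": flags_offset & 0x1FFF,
        "ttl": ttl,
        "protocol": protocol,                          # 6 = TCP, 17 = UDP
        "header_checksum": checksum,
        "source": str(IPv4Address(source)),
        "destination": str(IPv4Address(destination)),
    }


sample = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")
fields = parse_ipv4_header(sample)
print(fields["version"], fields["header_length"], fields["ttl"], fields["source"])
```

The format string reads the header in the same left-to-right, top-to-bottom order as the field list above.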

Encapsulation: The entire contents of an IP datagram are encapsulated as the payload of an Ethernet frame.

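The explanation here was a little unclear, so I looked it up on Wikipedia, which describes encapsulation as a way of designing communication protocols that abstracts network functions so that lower-layer details are hidden from higher-layer functions. The English version says much the same thing. I then checked 鳥哥的 Linux 私房菜, which has a very detailed explanation. If you are interested, have a look there; I won't quote it here.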

IP Address Classes

IP addresses can be split into two sections, the network ID and the host ID.

Class A

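Address class system: a way of defining how the global IP address space is split up. The three primary classes are Class A, Class B and Class C.

Class A: the first octet is the network ID, and the rest is the host ID

Class B: the first two octets are the network ID, and the rest is the host ID

Class C: the first three octets are the network ID, and the rest is the host ID

Classes D and E are less important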
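Here is a small sketch of the network ID / host ID split. The first-octet ranges (0–127 for A, 128–191 for B, 192–223 for C) come from the classful addressing scheme itself and are not listed in my notes above. The function name `classify` is my own.

```python
def classify(address: str) -> tuple[str, str, str]:
    """Return (class, network ID, host ID) under the classful system."""
    octets = address.split(".")
    first = int(octets[0])
    if first < 128:
        name, network_octets = "A", 1
    elif first < 192:
        name, network_octets = "B", 2
    elif first < 224:
        name, network_octets = "C", 3
    else:
        raise ValueError("class D and E addresses have no network/host split")
    return name, ".".join(octets[:network_octets]), ".".join(octets[network_octets:])


print(classify("9.100.100.100"))
print(classify("192.168.1.100"))
```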

Address Resolution Protocol — The relation of MAC and IP

Address Resolution Protocol(ARP): a protocol used to discover the hardware address of a node with a certain IP address.

Once an IP datagram has been fully formed, it needs to be encapsulated inside an Ethernet frame. This means that the transmitting device needs a destination MAC address to complete the Ethernet frame header.

ARP table: a list of IP addresses and the MAC addresses associated with them.

ARP table entries generally expire after a short amount of time to ensure changes in the network are accounted for.

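Suppose we want to send data to the IP 10.20.30.40. We need its MAC address, but the ARP table may not have an entry for it. When that happens, the sending node broadcasts an ARP message to the MAC broadcast address (all Fs). The message reaches every computer on the local network, and when the network interface that has been assigned the IP 10.20.30.40 receives this ARP broadcast, it sends back an ARP response containing its MAC address. The Ethernet frame is now ready to be transmitted, and the MAC address will probably be added to the ARP table so that the next communication with that IP does not need another ARP broadcast.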
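The cache-then-broadcast flow can be sketched as a tiny in-memory model. Everything here is hypothetical: the `ArpTable` class, the 60-second expiry (the course only says "a short amount of time"), and the `network` dictionary that stands in for the machines that would hear a real broadcast.

```python
import time
from typing import Optional

BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"


class ArpTable:
    """IP -> MAC cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float = 60.0, clock=time.monotonic):
        self._entries: dict[str, tuple[str, float]] = {}
        self._ttl = ttl_seconds
        self._clock = clock

    def lookup(self, ip: str) -> Optional[str]:
        entry = self._entries.get(ip)
        if entry is None:
            return None
        mac, stored_at = entry
        if self._clock() - stored_at > self._ttl:
            del self._entries[ip]      # stale entry: force a new ARP broadcast
            return None
        return mac

    def learn(self, ip: str, mac: str) -> None:
        self._entries[ip] = (mac, self._clock())


def resolve(table: ArpTable, ip: str, network: dict[str, str]) -> str:
    """Return the MAC for ip, broadcasting only when the cache has no entry."""
    mac = table.lookup(ip)
    if mac is None:
        # A real ARP request would go to BROADCAST_MAC; the owner of `ip` replies.
        mac = network[ip]
        table.learn(ip, mac)
    return mac
```

The first `resolve` call for 10.20.30.40 takes the broadcast path, and later calls are answered from the table until the entry expires.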

Subnetting

Subnetting

Subnetting: The process of taking a large network and splitting it up into many individual smaller subnetworks or subnets.

Subnetting is the process of splitting a large network into many "subnetworks".

If you want to communicate with the IP address 9.100.100.100, core routers on the Internet know that this IP belongs to the 9.0.0.0 Class A Network. They then route the message to the gateway router responsible for the network by looking at the network ID.

A gateway router specifically serves as the entry and exit path to a certain network. You can contrast this with core internet routers, which might only speak to other core routers.

Once your packet gets to the gateway router for the 9.0.0.0 Class A network, that router is now responsible for getting that data to the proper system by looking at the host ID.

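Address classes give us a way to split the global IP space into discrete networks.

If you want to communicate with the IP 9.100.100.100, the core routers on the Internet know that this IP belongs to the 9.0.0.0 Class A network. By looking at the network ID, they route the message to the gateway router responsible for that network.

A gateway router specifically serves as the entry and exit path for a particular network. You can contrast it with core Internet routers, which might only talk to other core routers.

Once your data reaches the gateway router for the 9.0.0.0 Class A network, that router looks at the host ID to deliver the data to the correct system.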

This all makes sense until you remember that a single Class A network contains 16,777,216(=256³) individual IPs. That’s just way too many devices to connect to the same router. This is where subnetting comes in.

Why do we need subnets? Because a Class A network is far too big: it contains more than 16 million individual IPs.

With subnets you can split your large network up into many smaller ones. These individual subnets will all have their own gateway routers serving as the ingress and egress point for each subnet.

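With subnets, you can cut these large networks into many smaller ones. Each subnet has its own gateway router, which serves as the entry and exit point for that subnet.

As mentioned earlier, the 32 bits of an IP address are divided into a network number and a host number, and in Class C the network number takes up 24 bits. We can actually cut such a network even finer by using the first bit of the Host_ID as part of the Net_ID, so that the Net_ID has 25 bits and the Host_ID shrinks to 7 bits. In that case, the original Class C network is split into two subnets, and each subnet has 256/2 - 2 = 126 usable IPs. One network has become two smaller ones, which makes it easier to organize things by category. (Quoted from 鳥哥的 Linux 私房菜)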
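The 24-bit to 25-bit split can be checked with Python's standard `ipaddress` module. The network 192.168.1.0/24 is just an example I picked; any Class C sized block behaves the same way.

```python
from ipaddress import ip_network

class_c = ip_network("192.168.1.0/24")
halves = list(class_c.subnets(prefixlen_diff=1))   # borrow one host bit

for subnet in halves:
    usable = subnet.num_addresses - 2               # minus network and broadcast
    print(subnet, "usable hosts:", usable)
```

Each half has 128 addresses, and removing the network and broadcast addresses leaves the 126 usable IPs from the quote.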

Subnet Masks

In a world with subnetting, some bits that would normally comprise the host ID are actually used for the subnet ID.

Some of the bits that would normally make up the host ID are used for the subnet ID instead.

At the internet level, core routers only care about the network ID and use this to send the datagram along to the appropriate gateway router to that network. That gateway router then has some additional information that it can use to send that datagram along to the destination machine or the next router in the path to get there. Finally, the host ID is used by that last router to deliver the datagram to the intended recipient machine.

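At the Internet level, core routers only care about the network ID and use it to pass the datagram along to the gateway router for that network. That gateway router has some additional information it can use to send the datagram on to the destination machine or to the next router on the path. Finally, the last router uses the host ID to deliver the datagram to the intended recipient.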

Subnet IDs are calculated via what’s known as a subnet mask.

Subnet mask: a 32-bit number that is normally written as four octets in decimal.

Subnet IDs are calculated via the subnet mask. A subnet mask is a 32-bit number, usually written as four decimal octets (8 bits each).

Comparing the subnet ID with the network ID

The subnet mask has two parts. The beginning part, which is the mask itself, is a string of ones; only zeros come after it. The part with all the ones tells us what we can ignore when computing a host ID. The part with all the zeros tells us what to keep.

In other words, the subnet mask splits into two parts. The first part, the mask itself, is a run of 1s, and everything after it is 0s. The run of 1s tells us what we can ignore when working out the host ID, while the run of 0s tells us what to keep.

The purpose of the mask or the part that’s all ones is to tell a router what part of an IP address is the subnet ID.

The purpose of the subnet mask is to tell a router which part of an IP address is the subnet ID.

For 9.100.100.100, which sits in a Class A network, we know that the network ID is just the first octet. This leaves us with the last three octets.

For 9.100.100.100 in this Class A network, only the first octet is the network ID, leaving the remaining three octets, which would normally be the host ID.

Let’s take those remaining octets and imagine them next to the subnet mask in binary form. The numbers in the remaining octets that have a corresponding one in the subnet mask are the subnet ID. The numbers in the remaining octets that have a corresponding zero are the host ID. The size of a subnet is entirely defined by its subnet mask.

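Let's put the remaining three octets next to the subnet mask in binary. The bits in those octets that line up with a 1 in the subnet mask are the subnet ID, and the bits that line up with a 0 are the host ID. The size of a subnet is determined entirely by its subnet mask.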

So for example, with the subnet mask of 255.255.255.0, we know that only the last octet is available for host IDs, regardless of what size the network and subnet IDs are.

For example, with the subnet mask 255.255.255.0 we know that only the last octet is available for host IDs, regardless of the sizes of the network and subnet IDs.

In general, a subnet can usually only contain two less than the total number of host IDs available.

In general, a subnet can usually hold two fewer hosts than the total number of host IDs available.

Again, using a subnet mask of 255.255.255.0, we know that the octet available for host IDs can contain the numbers 0–255, but zero is generally not used and 255 is normally reserved as a broadcast address for the subnet. This means that, really, only the numbers 1–254 are available for assignment to a host.

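Using the subnet mask 255.255.255.0 as the example again, the octet available for host IDs can hold the numbers 0–255, but 0 is generally not used and 255 is reserved as the subnet's broadcast address. Only 1–254 can be assigned to hosts.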

The subnet mask 255.255.255.224 would translate to 27 ones followed by five zeros. This means that we have five bits of host ID space or a total of 32 addresses. This brings up a shorthand way of writing subnet masks: since the mask is 27 ones, it can be written as /27.

A subnet mask is a way for a computer to use AND operators to determine if an IP address exists on the same network.
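Both ideas, counting the ones in a mask and using AND to compare addresses, fit in a short sketch. The helper names `same_subnet` and `mask_summary` are mine, and the example host addresses are made up for illustration.

```python
from ipaddress import IPv4Address


def same_subnet(ip_a: str, ip_b: str, mask: str) -> bool:
    """Two addresses share a subnet when (address AND mask) matches."""
    mask_bits = int(IPv4Address(mask))
    return (int(IPv4Address(ip_a)) & mask_bits) == (int(IPv4Address(ip_b)) & mask_bits)


def mask_summary(mask: str) -> tuple[int, int]:
    """Return (number of one bits, total addresses) for a subnet mask."""
    ones = bin(int(IPv4Address(mask))).count("1")
    return ones, 2 ** (32 - ones)


print(mask_summary("255.255.255.224"))
print(same_subnet("9.100.100.100", "9.100.100.120", "255.255.255.224"))
print(same_subnet("9.100.100.100", "9.100.100.130", "255.255.255.224"))
```

The first call confirms the 27 ones and 32 addresses mentioned above. With that mask, .100 and .120 fall in the same block of 32, while .130 does not.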

CIDR Notation(with slash)

CIDR — Classless Inter-Domain Routing

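Arguably a better approach than traditional subnetting.

Address classes were the first attempt at splitting up the global Internet IP space. Subnetting was introduced when it became clear that address classes themselves weren’t an efficient enough way of keeping everything organised.

For many organizations a Class C network is too small while a Class B network is too big, so they ended up using several adjacent Class C networks to meet their needs. That in turn meant routers carried a pile of Class C entries that all routed to the same place.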

CIDR is an even more flexible approach to describing blocks of IP addresses. It expands on the concept of subnetting by using subnet masks to demarcate networks

Demarcation point: describe where one network or system ends and another one begins.

In our previous model, we relied on a network ID, subnet ID, and host ID to deliver an IP datagram to the correct location. With CIDR, the network ID and subnet ID are combined into one.(slash notation)

CIDR basically just abandons the concept of address classes entirely, allowing an address to be defined by only two individual IDs.

Before, network sizes were static: only Class A, Class B, or Class C, and only subnets could be of different sizes. CIDR allows for networks themselves to be differing sizes.

Before this, if a company needed more addresses than a single Class C could provide, they needed an entire second Class C. With CIDR, they could combine that address space into one contiguous chunk with a net mask of /23 or 255.255.254.0.

This means that routers now only need to know one entry in their routing table to deliver traffic to these addresses instead of two.
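The /23 example can be reproduced with the standard `ipaddress` module. The two /24 blocks below are example values of my own choosing. Note that they have to be adjacent and aligned (an even third octet followed by the next odd one) for the merge to produce a single /23.

```python
from ipaddress import collapse_addresses, ip_network

blocks = [ip_network("192.168.0.0/24"), ip_network("192.168.1.0/24")]
combined = list(collapse_addresses(blocks))

for network in combined:
    print(network, network.netmask, network.num_addresses)
```

Two routing entries collapse into one network with the mask 255.255.254.0 and 512 addresses.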

Routing

Basic Routing Concepts

Today most intensive routing issues are almost exclusively handled by ISPs and only the largest of companies.

Routing can be very simple or very complicated. Most of the complicated problems are handled by ISPs.

Router: a network device that forwards traffic depending on the destination address of that traffic. A router is a device that has at least two network interfaces, since it has to be connected to two networks to do its job.

Step

One, a router receives a packet of data on one of its interfaces.

Two, the router examines the destination IP of this packet.

Three, the router then looks up the destination network of this IP in its routing table.

Four, the router forwards the packet out through the interface that’s closest to the remote network, as determined by additional info within the routing table.

A computer on Network A with an IP address of 192.168.1.100 sends a packet to the address 10.0.0.10. This computer knows that 10.0.0.10 isn’t on its local subnet. So it sends this packet to the MAC address of its gateway, the router.

The router’s interface on Network A receives the packet because it sees that destination MAC address belongs to it. The router then strips away the data-link layer encapsulation, leaving the network layer content, the IP datagram.

Now, the router can directly inspect the IP datagram header for the destination IP field. It finds the destination IP of 10.0.0.10. The router looks at its routing table and sees that Network B, or the 10.0.0.0/24 network, is the correct network for the destination IP.

In fact, since it’s directly connected, the router even has the MAC address for this IP in its ARP table.

In fact, because Network B is directly connected to the router, the router even has the MAC address for this IP in its ARP table.

Next, the router needs to form a new packet to forward along to Network B. It takes all of the data from the first IP datagram and duplicates it, but decrements the TTL field by one and calculates a new checksum.

Next, the router has to form a new packet to forward to Network B. It copies all of the data from the first IP datagram, but decrements the TTL field by one and calculates a new checksum. It then encapsulates this new IP datagram in a new Ethernet frame.

Routing Tables

Routing tables can vary a ton depending on the make and class of the router, but they all share a few things in common.

The most basic routing table will have four columns.

Destination network: this column contains a row for each network that the router knows about. Each row is just the definition of the remote network: a network ID and the net mask. A routing table will generally have a catchall entry that matches any IP address it doesn’t have an explicit network listing for.

Next hop: this is the IP address of the next router that should receive data intended for the destination network in question, or it could just state that the network is directly connected and that no additional hops are needed.

Total hops: this is the crucial part for understanding routing and how routing tables work. On any complex network like the Internet, there will be lots of different paths to get from point A to point B.

For now, it’s just important to know that for each next hop and each destination network, the router will have to keep track of how far away that destination currently is. That way, when it receives updated information from neighboring routers, it will know if it currently knows about the best path or if a new better path is available.

Interface: the router also has to know which of its interfaces it should forward traffic matching the destination network out of.
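A four-column table like this maps naturally onto a small data structure. The sketch below is only an illustration: the `Route` class, the interface names and the next-hop address 203.0.113.1 are made up, and choosing the most specific matching network (longest prefix) is how routers commonly resolve overlaps, which the course does not cover at this point.

```python
from dataclasses import dataclass
from ipaddress import IPv4Network, ip_address, ip_network
from typing import Optional


@dataclass
class Route:
    destination: IPv4Network   # network ID plus net mask
    next_hop: Optional[str]    # None means directly connected
    total_hops: int
    interface: str


def lookup(table: list[Route], destination_ip: str) -> Route:
    """Pick the most specific matching route, preferring fewer hops on a tie."""
    address = ip_address(destination_ip)
    matches = [r for r in table if address in r.destination]
    if not matches:
        raise LookupError(f"no route to {destination_ip}")
    return max(matches, key=lambda r: (r.destination.prefixlen, -r.total_hops))


ROUTES = [
    Route(ip_network("192.168.1.0/24"), None, 0, "eth0"),       # Network A
    Route(ip_network("10.0.0.0/24"), None, 0, "eth1"),          # Network B
    Route(ip_network("0.0.0.0/0"), "203.0.113.1", 1, "eth2"),   # catchall entry
]

print(lookup(ROUTES, "10.0.0.10").interface)
print(lookup(ROUTES, "8.8.8.8").next_hop)
```

The packet for 10.0.0.10 from the walkthrough above leaves through the interface on Network B, while an address with no explicit listing falls through to the catchall entry.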

Interior Gateway Protocols

Routing protocols: special protocols the routers use to speak to each other in order to share what information they might have. Routing protocols fall into two main categories, interior gateway protocols, and exterior gateway protocols.

Interior gateway protocols are further split into two categories, link state routing protocols and distance-vector protocols.

Interior gateway protocols: used by routers to share information within a single autonomous system.

Autonomous system: a collection of networks that all fall under the control of a single network operator.

Distance-vector protocol: an older standard. A router using a distance-vector protocol basically just takes its routing table, which is a list of every network known to it and how far away these networks are in terms of hops. Then the router sends this list to every neighboring router, which is basically every router directly connected to it. A list aka a vector.

With a distance-vector protocol, routers don’t really know that much about the total state of an autonomous system, they just have some information about their immediate neighbors.

Distance vector protocols are pretty simple, but they don’t allow for a router to have much information about the state of the world outside of their own direct neighbors. Because of this, a router might be slow to react to a change in the network far away from it.
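The "send your list to your neighbors" idea can be sketched as a single merge step. This is a toy model with names of my own (`merge_vector`, the router labels, the example networks) and not the behaviour of any specific routing protocol.

```python
def merge_vector(own: dict[str, tuple[int, str]],
                 neighbor: str,
                 advertised: dict[str, int]) -> bool:
    """Fold a neighbor's advertised hop counts into our own table.

    own maps network -> (hops, next hop). Returns True if anything changed.
    """
    changed = False
    for network, hops in advertised.items():
        candidate = hops + 1                   # one extra hop to reach the neighbor
        current = own.get(network)
        if current is None or candidate < current[0]:
            own[network] = (candidate, neighbor)
            changed = True
    return changed


table = {"10.0.0.0/24": (0, "direct"), "10.0.2.0/24": (4, "router-c")}
merge_vector(table, "router-b", {"10.0.1.0/24": 0, "10.0.2.0/24": 1, "10.0.0.0/24": 3})
print(table)
```

The router only learns what its neighbor tells it, which is why news about a change several hops away takes a while to arrive.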

Link state protocols get their name because each router advertises the state of the link of each of its interfaces. These interfaces could be connected to other routers, or they could be direct connections to networks.

Link state protocols get their name from the fact that every router advertises the link state of each of its interfaces.

The information about each router is propagated to every other router on the autonomous system. This means that every router on the system knows every detail about every other router in the system. Each router then uses this much larger set of information and runs complicated algorithms against it to determine what the best path to any destination network might be.

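Information about each router is propagated to every other router in the autonomous system, which means every router in the system knows the details of every other router. Each router then runs algorithms over this set of information to determine the best path to any destination.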

Link state protocols require both more memory in order to hold all of this data and also much more processing power. This is because it has to run algorithms against this data in order to determine the quickest path to update the routing tables. As computer hardware has become more powerful and cheaper over the years, link state protocols have mostly made distance vector protocols outdated.
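The course only says "complicated algorithms". A classic choice for finding the cheapest path once every link is known is Dijkstra's shortest-path algorithm, so the sketch below assumes that. The four-router topology and its link costs are invented for the example.

```python
import heapq


def shortest_paths(links: dict[str, dict[str, int]], source: str) -> dict[str, int]:
    """Dijkstra's algorithm over a map of router -> {neighbor: link cost}."""
    distances = {source: 0}
    queue = [(0, source)]
    while queue:
        cost, router = heapq.heappop(queue)
        if cost > distances.get(router, float("inf")):
            continue                            # stale queue entry
        for neighbor, link_cost in links.get(router, {}).items():
            new_cost = cost + link_cost
            if new_cost < distances.get(neighbor, float("inf")):
                distances[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return distances


LINKS = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "C": 1, "D": 4},
    "C": {"A": 5, "B": 1, "D": 1},
    "D": {"B": 4, "C": 1},
}
print(shortest_paths(LINKS, "A"))
```

Because every router holds the full `LINKS` map, each one can run this calculation on its own. That is where the extra memory and processing cost comes from.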

Exterior Gateway Protocols

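Exterior gateway protocols are used to communicate data between routers representing the edges of an autonomous system. Since routers sharing data using interior gateway protocols are all under control of the same organization, routers use exterior gateway protocols when they need to share information across different organizations. Exterior gateway protocols are really key to the Internet operating how it does today.

Exterior gateway protocols carry data between the routers that sit at the edges of autonomous systems. Routers that share data through interior gateway protocols are all under the control of one organization, so exterior gateway protocols are what routers use when they need to share information across organizations. They are key to how the Internet runs today.

This week was really hard, and I hope these notes are free of mistakes. If you spot any, corrections are welcome. That said, these notes did help me get full marks on the test. Thanks for reading.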


Joe Chao

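I have an accounting background and am currently interning in management consulting. I like a lot of things, so I choose to write them all down one by one.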