Tcp_ip_model
Chapter 2: TCP/IP Model and Protocols - Complete Deep Dive
The TCP/IP model is the practical implementation of network communication that the modern internet runs on. While the OSI model provides a theoretical framework, TCP/IP is what actually powers our networks. This chapter covers everything you need to know about TCP/IP for DevOps, SRE, and SysAdmin roles.
2.1 TCP/IP Model Overview
What is TCP/IP?
TCP/IP (Transmission Control Protocol/Internet Protocol) is a suite of protocols that form the foundation of the internet. Unlike the OSI model’s 7 layers, TCP/IP has only 4 layers, but it represents the actual implementation used worldwide.
+------------------------------------------------------------------+
| TCP/IP 4-Layer Model vs OSI 7-Layer Model
+------------------------------------------------------------------+

     TCP/IP Model                      OSI Model
+----------------------+       +----------------------+
| 4. Application       |   =   | 7. Application       |
|                      |       | 6. Presentation      |
|                      |       | 5. Session           |
+----------------------+       +----------------------+
| 3. Transport         |   =   | 4. Transport         |
+----------------------+       +----------------------+
| 2. Internet          |   =   | 3. Network           |
+----------------------+       +----------------------+
| 1. Link (Network     |   =   | 2. Data Link         |
|    Access)           |       | 1. Physical          |
+----------------------+       +----------------------+

Why 4 Layers?
+------------------------------------------------------------------+
| The TCP/IP model was designed for practicality, not theory.
| It combines OSI layers that perform similar functions:
|   - Application layer combines OSI 5, 6, 7
|   - Link layer combines OSI 1, 2
+------------------------------------------------------------------+

+------------------------------------------------------------------+

2.2 Layer 1: Link/Network Access Layer
This layer handles the physical transmission of data and includes both the Physical and Data Link layers of OSI.
Ethernet (IEEE 802.3)
Ethernet is the dominant technology for local area networks. Understanding Ethernet frames, MAC addresses, and switching is crucial.
+------------------------------------------------------------------+
| Ethernet Frame Structure Deep Dive
+------------------------------------------------------------------+

IEEE 802.3 Ethernet Frame:
+----------+-----+---------+---------+--------+----------+----------+-----+
| Preamble | SFD | Dst MAC | Src MAC | Length | 802.2    | Data     | FCS |
| 7 bytes  | 1B  | 6B      | 6B      | 2B     | LLC/SNAP | 46-1500B | 4B  |
+----------+-----+---------+---------+--------+----------+----------+-----+
Field-by-Field Explanation:+------------------------------------------------------------------+
1. PREAMBLE (7 bytes = 56 bits) +------------------------------------------------------------------+ | Pattern: 10101010 10101010 10101010 10101010 | | 10101010 10101010 10101010 | | Purpose: Synchronization of receiver clock | | Speed: 10 Mbps - all stations can detect incoming frame | +------------------------------------------------------------------+
2. START FRAME DELIMITER (SFD) (1 byte) +------------------------------------------------------------------+ | Pattern: 10101011 | | Purpose: Marks end of preamble, start of frame | | Note: The last two 1 bits signal start of frame | +------------------------------------------------------------------+
3. DESTINATION MAC ADDRESS (6 bytes) +------------------------------------------------------------------+ | Format: XX:XX:XX:XX:XX:XX (48 bits) | | First 24 bits: OUI (assigned to manufacturer) | | Last 24 bits: NIC specific | | Types: Unicast, Multicast, Broadcast | +------------------------------------------------------------------+
4. SOURCE MAC ADDRESS (6 bytes) +------------------------------------------------------------------+ | Format: XX:XX:XX:XX:XX:XX | | Always unicast - source never broadcasts | +------------------------------------------------------------------+
5. LENGTH/TYPE (2 bytes) +------------------------------------------------------------------+ | If <= 1500: Length field (802.3) | | If >= 1536: EtherType field (Ethernet II) | | | | Common EtherTypes: | | 0x0800 = IPv4 | | 0x0806 = ARP | | 0x86DD = IPv6 | | 0x8100 = VLAN Tag | +------------------------------------------------------------------+
6. PAYLOAD (46-1500 bytes) +------------------------------------------------------------------+ | Minimum: 46 bytes (pad if data < 46) | | Maximum: 1500 bytes (MTU - Maximum Transmission Unit) | | Note: MTU is typically 1500 bytes for Ethernet | +------------------------------------------------------------------+
7. FRAME CHECK SEQUENCE (4 bytes) +------------------------------------------------------------------+ | Algorithm: CRC-32 (Cyclic Redundancy Check) | | Polynomial: 0x04C11DB7 | | Coverage: Everything except Preamble, SFD, and FCS itself | | Purpose: Error detection | | Action: Receiver discards frames with invalid FCS | +------------------------------------------------------------------+
Frame Size Calculation:
+------------------------------------------------------------------+
| Minimum frame: 64 bytes
|   = 6 (Dst) + 6 (Src) + 2 (Len/Type) + 46 (min data) + 4 (FCS)
|
| On the wire: 72 bytes
|   = 64 (frame) + 7 (Preamble) + 1 (SFD)
| Note: The preamble and SFD are usually not counted as part of
|       the frame, although some sources include them
|
| Maximum frame: 1518 bytes
|   = 14 (header) + 1500 (data) + 4 (FCS)
+------------------------------------------------------------------+
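The header fields and the Length/Type rule described above can be sketched in a few lines of Python. This is a minimal illustration, not a full frame decoder: it assumes the capture starts at the destination MAC (preamble, SFD, and FCS already stripped), and the function names and the sample frame are made up for the example.

```python
import struct

# EtherType values listed in the Length/Type field description.
ETHERTYPES = {0x0800: "IPv4", 0x0806: "ARP", 0x86DD: "IPv6", 0x8100: "VLAN Tag"}


def format_mac(raw: bytes) -> str:
    """Render 6 raw bytes as XX:XX:XX:XX:XX:XX."""
    return ":".join(f"{b:02X}" for b in raw)


def parse_ethernet_header(frame: bytes) -> dict:
    """Parse the 14-byte header (dst MAC, src MAC, Length/Type)."""
    if len(frame) < 14:
        raise ValueError("need at least 14 bytes")
    dst, src, length_or_type = struct.unpack("!6s6sH", frame[:14])
    if length_or_type <= 1500:
        kind = f"802.3 length={length_or_type}"
    elif length_or_type >= 1536:
        kind = ETHERTYPES.get(length_or_type, f"EtherType 0x{length_or_type:04X}")
    else:
        kind = "undefined (1501-1535)"
    return {"dst": format_mac(dst), "src": format_mac(src), "kind": kind}


# Hypothetical frame: broadcast destination, made-up source, EtherType ARP,
# and a zero-filled payload padded to the 46-byte minimum.
sample = bytes.fromhex("ffffffffffff" "001a2b3c4d5e" "0806") + bytes(46)
header = parse_ethernet_header(sample)
print(header)
print("frame size including 4-byte FCS:", len(sample) + 4)
```

With the minimum payload, header plus payload plus FCS comes to the 64-byte minimum frame size from the calculation above.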
+------------------------------------------------------------------+

MAC Address Deep Dive
+------------------------------------------------------------------+
| MAC Address Deep Dive
+------------------------------------------------------------------+
MAC Address Structure:+------------------------------------------------------------------+| 48 bits (6 bytes) = 12 hex digits || || Bit 0 (I/G): Individual (0) or Group (1) || Bit 1 (U/L): Universal (0) or Locally Administered (1) || || Example: 00:1A:2B:3C:4D:5E || OUI: 00:1A:2B (first 3 bytes) - manufacturer || NIC: 3C:4D:5E (last 3 bytes) - specific device |+------------------------------------------------------------------+
OUI (Organizationally Unique Identifier):+------------------------------------------------------------------+| Some common OUIs: || 00:00:0C - Cisco Systems || 00:1A:2B - General Electric || 00:50:56 - VMware || 00:0C:29 - VMware (older) || 08:00:27 - VirtualBox || 52:54:00 - QEMU/KVM || B8:27:EB - Raspberry Pi || DC:A6:32 - Raspberry Pi 4 || F0:18:98 - Apple || 3C:22:0B - HP || 00:1C:42 - Parallels |+------------------------------------------------------------------+
Special MAC Addresses:+------------------------------------------------------------------+| Address | Purpose ||----------------------|-------------------------------------------|| 00:00:00:00:00:00 | Blank/unspecified || FF:FF:FF:FF:FF:FF | Ethernet broadcast || 01:00:5E:00:00:00 | IPv4 multicast (lower 23 bits used) || 33:33:00:00:00:00 | IPv6 multicast (lower 32 bits used) || 01:80:C2:00:00:00 | Spanning Tree (STP) || 01:80:C2:00:00:0E | Link Layer Discovery Protocol (LLDP) |+------------------------------------------------------------------+
MAC Address Types:+------------------------------------------------------------------+
UNICAST:+------------------------------------------------------------------+| - Sent to single specific interface || - First byte is even (bit 0 = 0) || Example: 00:1A:2B:3C:4D:5E |+------------------------------------------------------------------+
MULTICAST:+------------------------------------------------------------------+| - Sent to group of interfaces || - First byte is odd (bit 0 = 1) || Example: 01:00:5E:00:00:01 (IPv4 multicast 224.0.0.1) |+------------------------------------------------------------------+
BROADCAST:+------------------------------------------------------------------+| - Sent to all devices on the segment || - Always FF:FF:FF:FF:FF:FF || - Example: ARP requests |+------------------------------------------------------------------+
Locally Administered Addresses:
+------------------------------------------------------------------+
| - Set the U/L bit (bit 1 of first byte)
| - Override burned-in addresses
| - Used for virtual interfaces, VMs
| Example: CA:FE:BA:BE:00:01
|          (the second hex digit is 2, 6, A, or E when U/L is set
|           on a unicast address)
+------------------------------------------------------------------+
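The I/G and U/L bit rules above are easy to check in code. The following is a small sketch under the assumption that addresses arrive as colon-separated hex strings; `classify_mac` is a name invented for this example.

```python
def classify_mac(mac: str) -> dict:
    """Classify a MAC address using the two low-order bits of its first byte."""
    octets = bytes(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected 6 colon-separated octets")
    first = octets[0]
    if octets == b"\xff" * 6:
        cast = "broadcast"
    elif first & 0x01:          # I/G bit set: group address
        cast = "multicast"
    else:
        cast = "unicast"
    admin = "local" if first & 0x02 else "universal"  # U/L bit
    return {"cast": cast, "admin": admin, "oui": mac.upper()[:8]}


for address in ("00:1A:2B:3C:4D:5E", "01:00:5E:00:00:01",
                "CA:FE:BA:BE:00:01", "FF:FF:FF:FF:FF:FF"):
    print(address, classify_mac(address))
```

Note that the first three bytes are only a registered OUI when the U/L bit is clear; for locally administered addresses the `oui` value is just a prefix.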
+------------------------------------------------------------------+

2.3 Layer 2: Internet Layer
The Internet Layer is equivalent to OSI’s Network Layer and handles logical addressing and routing.
IPv4 Header Deep Dive
+------------------------------------------------------------------+
| IPv4 Header Complete Analysis
+------------------------------------------------------------------+

IPv4 Header (20-60 bytes):
+------------------------------------------------------------------+

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version|  IHL  |   DSCP    |ECN|         Total Length          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|        Identification         |Flags|     Fragment Offset     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Time To Live  |   Protocol    |        Header Checksum        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Source IP Address                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Destination IP Address                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     Options (if IHL > 5)                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Field Details:+------------------------------------------------------------------+
VERSION (4 bits):+------------------------------------------------------------------+| Value: 4 for IPv4 || Purpose: Indicates IP version being used |+------------------------------------------------------------------+
IHL - Internet Header Length (4 bits):+------------------------------------------------------------------+| Value: 5-15 (number of 32-bit words) || Default: 5 (20 bytes without options) || Calculation: Header length = IHL × 4 bytes || Purpose: Locate where data begins |+------------------------------------------------------------------+
DSCP - Differentiated Services Code Point (6 bits):+------------------------------------------------------------------+| Purpose: Quality of Service (QoS) || Used for: Traffic prioritization || Values: 0-63 (0 = default, 46 = EF = Expedited Forwarding) || Note: Replaces old Type of Service (ToS) |+------------------------------------------------------------------+
ECN - Explicit Congestion Notification (2 bits):
+------------------------------------------------------------------+
| Purpose: Signal congestion without dropping packets
| Values:
|   00 - Not ECN-capable transport (Not-ECT)
|   01 - ECN-capable transport, ECT(1)
|   10 - ECN-capable transport, ECT(0)
|   11 - Congestion experienced (CE)
+------------------------------------------------------------------+
TOTAL LENGTH (16 bits):+------------------------------------------------------------------+| Value: 20-65,535 bytes || Purpose: Total size of packet (header + data) || Note: fragmentation may split packets |+------------------------------------------------------------------+
IDENTIFICATION (16 bits):+------------------------------------------------------------------+| Purpose: Unique ID for fragments of same original packet || Value: 0-65,535 || Used for: Reassembly of fragmented packets |+------------------------------------------------------------------+
FLAGS (3 bits):+------------------------------------------------------------------+| Bit 0: Reserved (must be 0) || Bit 1: Don't Fragment (DF) || 0 = May fragment || 1 = Don't fragment || Bit 2: More Fragments (MF) || 0 = Last fragment || 1 = More fragments follow |+------------------------------------------------------------------+
FRAGMENT OFFSET (13 bits):+------------------------------------------------------------------+| Value: 0-8191 (in 8-byte units) || Purpose: Position of fragment in original packet || Calculation: Actual byte offset = Fragment Offset × 8 |+------------------------------------------------------------------+
TIME TO LIVE - TTL (8 bits):+------------------------------------------------------------------+| Value: 1-255 hops || Default: Usually 64 (Linux/Unix) or 128 (Windows) || Purpose: Prevents packets from circulating forever || Mechanism: Each router decrements by 1 || If TTL = 0: Router discards and sends ICMP Time Exceeded |+------------------------------------------------------------------+
PROTOCOL (8 bits):+------------------------------------------------------------------+| Common Values: || 1 = ICMP || 6 = TCP || 17 = UDP || 47 = GRE || 50 = ESP || 51 = AH || 89 = OSPF |+------------------------------------------------------------------+
HEADER CHECKSUM (16 bits):+------------------------------------------------------------------+| Algorithm: One's complement sum of 16-bit words || Calculation: Recalculated at each router (TTL changes!) || Purpose: Detect corrupted headers |+------------------------------------------------------------------+
SOURCE IP ADDRESS (32 bits):+------------------------------------------------------------------+| IPv4 address of sender || May be changed by NAT |+------------------------------------------------------------------+
DESTINATION IP ADDRESS (32 bits):+------------------------------------------------------------------+| IPv4 address of intended recipient || Used for routing decisions |+------------------------------------------------------------------+
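To tie the field descriptions together, here is a hedged sketch that builds a 20-byte IPv4 header, parses it back, and verifies the header checksum. The helper names, the documentation-range addresses, and the identification value are illustrative assumptions; options are not handled beyond using IHL to find the header length.

```python
import ipaddress
import struct

IPV4_FORMAT = "!BBHHHBBH4s4s"  # 20 bytes, no options


def internet_checksum(data: bytes) -> int:
    """One's complement of the one's complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_ipv4_header(src: str, dst: str, payload_len: int,
                      protocol: int = 6, ttl: int = 64) -> bytes:
    """Build a header with Version=4, IHL=5, DF set, and a valid checksum."""
    fields = [0x45, 0, 20 + payload_len, 0x1C46, 0x4000, ttl, protocol, 0,
              ipaddress.IPv4Address(src).packed, ipaddress.IPv4Address(dst).packed]
    fields[7] = internet_checksum(struct.pack(IPV4_FORMAT, *fields))
    return struct.pack(IPV4_FORMAT, *fields)


def parse_ipv4_header(raw: bytes) -> dict:
    (ver_ihl, dscp_ecn, total_len, ident, flags_frag,
     ttl, proto, _checksum, src, dst) = struct.unpack(IPV4_FORMAT, raw[:20])
    ihl = ver_ihl & 0x0F
    return {
        "version": ver_ihl >> 4,
        "header_len": ihl * 4,
        "dscp": dscp_ecn >> 2,
        "ecn": dscp_ecn & 0x03,
        "total_length": total_len,
        "id": ident,
        "df": bool(flags_frag & 0x4000),
        "mf": bool(flags_frag & 0x2000),
        "fragment_offset_bytes": (flags_frag & 0x1FFF) * 8,
        "ttl": ttl,
        "protocol": proto,
        "src": str(ipaddress.IPv4Address(src)),
        "dst": str(ipaddress.IPv4Address(dst)),
        # Summing a valid header including its checksum field yields 0.
        "checksum_ok": internet_checksum(raw[:ihl * 4]) == 0,
    }


raw = build_ipv4_header("192.0.2.10", "198.51.100.20", payload_len=40)
info = parse_ipv4_header(raw)
print(info)

# Decrementing TTL without recomputing the checksum invalidates the header,
# which is why each router must recalculate it.
stale = bytearray(raw)
stale[8] -= 1
print(parse_ipv4_header(bytes(stale))["checksum_ok"])
```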
+------------------------------------------------------------------+

ICMP (Internet Control Message Protocol)
ICMP is used for error reporting and diagnostics. It’s often called “the troubleshooting protocol.”
+------------------------------------------------------------------+| ICMP Deep Dive |+------------------------------------------------------------------+
ICMP Header (8 bytes minimum):+------------------------------------------------------------------+
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Type      |     Code      |           Checksum            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Rest of Header                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Data (Variable)                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Common ICMP Types:+------------------------------------------------------------------+
| Type | Code | Name                    | Purpose                                 |
|------|------|-------------------------|-----------------------------------------|
| 0    | 0    | Echo Reply              | Response to echo request                |
| 3    | 0    | Destination Unreachable | Network unreachable                     |
| 3    | 1    | Destination Unreachable | Host unreachable                        |
| 3    | 3    | Destination Unreachable | Port unreachable                        |
| 3    | 13   | Destination Unreachable | Communication administratively filtered |
| 8    | 0    | Echo Request            | Ping request                            |
| 11   | 0    | Time Exceeded           | TTL expired in transit                  |
| 11   | 1    | Time Exceeded           | Fragment reassembly time exceeded       |
| 12   | 0    | Parameter Problem       | Header problem                          |
ICMP Message Categories:+------------------------------------------------------------------+
ERROR MESSAGES:
+------------------------------------------------------------------+
| - Destination Unreachable (Type 3)
| - Time Exceeded (Type 11)
| - Parameter Problem (Type 12)
| - Source Quench (Type 4) - deprecated
| - Redirect (Type 5) - still defined, but often ignored or
|   disabled for security reasons
+------------------------------------------------------------------+

DIAGNOSTIC MESSAGES:
+------------------------------------------------------------------+
| - Echo Request/Reply (Type 8/0) - PING
| - Timestamp Request/Reply (Type 13/14) - rarely used
| - Information Request/Reply (Type 15/16) - deprecated
+------------------------------------------------------------------+
ICMP in Practice - Ping:+------------------------------------------------------------------+
Ping uses Echo Request (Type 8) and Echo Reply (Type 0):
$ ping -c 4 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=117 time=14.2 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=117 time=14.3 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=117 time=14.1 ms
64 bytes from 8.8.8.8: icmp_seq=4 ttl=117 time=14.2 ms

--- 8.8.8.8 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3003ms
rtt min/avg/max/mdev = 14.1/14.2/14.3/0.0 ms
Packet breakdown:+------------------------------------------------------------------+| Request: Type=8 (Echo Request), Code=0 || Reply: Type=0 (Echo Reply), Code=0 || Data: Padding pattern for timing |+------------------------------------------------------------------+
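The Echo Request described above can be assembled by hand to see where the sizes in the ping output come from. This sketch only builds and checks the bytes; it does not send anything (sending ICMP normally needs a raw socket and elevated privileges). The identifier, sequence number, and zero-filled payload are arbitrary choices for the example.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """One's complement of the one's complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload: bytes) -> bytes:
    """Type=8, Code=0, then checksum, identifier, sequence, and payload."""
    unchecked = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence) + payload
    checksum = internet_checksum(unchecked)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload


def parse_icmp(message: bytes) -> dict:
    icmp_type, code, _checksum, identifier, sequence = struct.unpack(
        "!BBHHH", message[:8])
    return {"type": icmp_type, "code": code, "id": identifier, "seq": sequence,
            "checksum_ok": internet_checksum(message) == 0}


message = build_echo_request(identifier=0x1234, sequence=1, payload=bytes(56))
print(len(message), parse_icmp(message))
```

A 56-byte payload plus the 8-byte ICMP header gives the 64 bytes that ping reports per reply, and adding a 20-byte IPv4 header gives the 84 shown in `56(84) bytes of data`.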
ICMP in Practice - Traceroute:+------------------------------------------------------------------+
Traceroute uses TTL=1 to get Time Exceeded from first hop:
$ traceroute 8.8.8.8
traceroute to 8.8.8.8 (8.8.8.8), 30 hops max, 60 byte packets
 1  gateway (192.168.1.1)  1.234 ms  1.123 ms  1.089 ms
 2  10.0.0.1 (10.0.0.1)  5.432 ms  5.123 ms  5.098 ms
 3  172.16.0.1 (172.16.0.1)  10.234 ms  10.123 ms  10.098 ms
 ...
Process:+------------------------------------------------------------------+| Hop 1: Send packet with TTL=1 || Router decrements to 0, sends Time Exceeded || We learn first hop IP || Hop 2: Send packet with TTL=2 || Second router decrements to 0 || We learn second hop IP || ...and so on until destination reached |+------------------------------------------------------------------+
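The TTL-stepping process above can be modelled without touching the network. The sketch below is a pure simulation: `simulate_traceroute` and the path of documentation-range addresses are hypothetical, and real traceroute implementations also deal with timeouts, multiple probes per hop, and routers that do not answer.

```python
def simulate_traceroute(path: list, max_hops: int = 30) -> list:
    """Simulate probes with TTL=1, 2, 3... along a fixed path.

    `path` lists the routers in order, with the destination as last entry.
    Returns (ttl, responder, reply) tuples, one per probe.
    """
    discovered = []
    for ttl in range(1, max_hops + 1):
        remaining = ttl
        for index, hop in enumerate(path):
            if index == len(path) - 1:
                # The probe reached the destination before TTL ran out.
                discovered.append((ttl, hop, "destination reached"))
                return discovered
            remaining -= 1  # each router decrements TTL by 1
            if remaining == 0:
                discovered.append((ttl, hop, "Time Exceeded (type 11, code 0)"))
                break
    return discovered


# Made-up path: three routers, then the destination.
hops = simulate_traceroute(["192.0.2.1", "198.51.100.1", "203.0.113.1", "203.0.113.99"])
for ttl, responder, reply in hops:
    print(ttl, responder, reply)
```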
ICMP Security Concerns:+------------------------------------------------------------------+| Attack Type | Description ||----------------------|------------------------------------------|| Ping of Death | Oversized ICMP packets || Smurf Attack | Broadcast ping amplification || ICMP Tunnel | Encapsulating data in ICMP || Firewalking | Using ICMP to map firewall rules |
Mitigation:+------------------------------------------------------------------+| - Block ICMP at firewall (except essential types) || - Rate limit ICMP || - Disable directed broadcast |+------------------------------------------------------------------+
+------------------------------------------------------------------+

2.4 Layer 3: Transport Layer
The Transport Layer provides end-to-end communication services.
TCP Deep Dive
+------------------------------------------------------------------+
| TCP Header Deep Dive
+------------------------------------------------------------------+

TCP Header (20-60 bytes):
+------------------------------------------------------------------+

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Source Port          |       Destination Port        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Sequence Number                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     Acknowledgment Number                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data  |     |N|C|E|U|A|P|R|S|F|                               |
|Offset | Res |S|W|C|R|C|S|S|Y|I|          Window Size          |
|       |     | |R|E|G|K|H|T|N|N|                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Checksum            |        Urgent Pointer         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                 Options (if Data Offset > 5)                  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             Data                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Field-by-Field:+------------------------------------------------------------------+
SOURCE PORT (16 bits):+------------------------------------------------------------------+| Range: 0-65,535 || Typically: Ephemeral port (49152-65535) || For clients: Assigned by OS |+------------------------------------------------------------------+
DESTINATION PORT (16 bits):+------------------------------------------------------------------+| Range: 0-65,535 || Well-known: 0-1023 (HTTP=80, SSH=22, etc.) || Registered: 1024-49151 (MySQL=3306, etc.) |+------------------------------------------------------------------+
SEQUENCE NUMBER (32 bits):+------------------------------------------------------------------+| Purpose: Position of first data byte in segment || Initial: ISN (Initial Sequence Number) - randomly chosen || After: Sequence number + bytes sent || Wrap-around: 32-bit, can wrap around |+------------------------------------------------------------------+
ACKNOWLEDGMENT NUMBER (32 bits):+------------------------------------------------------------------+| Purpose: Next expected byte from sender || Only valid when ACK flag is set || Formula: Acknowledgment = Last received byte + 1 |+------------------------------------------------------------------+
DATA OFFSET (4 bits):+------------------------------------------------------------------+| Value: 5-15 (number of 32-bit words) || Default: 5 (20 bytes, no options) || Calculation: Header length = Data Offset × 4 |+------------------------------------------------------------------+
RESERVED (3 bits):+------------------------------------------------------------------+| Must be 0 || Reserved for future use |+------------------------------------------------------------------+
CONTROL FLAGS (9 bits total):+------------------------------------------------------------------+
| Flag | Bit | Name           | Purpose                              |
|------|-----|----------------|--------------------------------------|
| NS   | 8   | ECN-Nonce      | ECN concealment protection           |
| CWR  | 7   | Congestion     | Window reduced (from ECN)            |
| ECE  | 6   | ECN-Echo       | ECN capable during handshake         |
| URG  | 5   | Urgent         | Urgent pointer is valid              |
| ACK  | 4   | Acknowledgment | Acknowledgment number valid          |
| PSH  | 3   | Push           | Push data to application immediately |
| RST  | 2   | Reset          | Abort connection immediately         |
| SYN  | 1   | Synchronize    | Initialize connection                |
| FIN  | 0   | Finish         | Close connection gracefully          |
WINDOW SIZE (16 bits):+------------------------------------------------------------------+| Purpose: Receiver's buffer capacity || Mechanism: Flow control || Max: 65,535 bytes (can be extended with options) || Scale: Can be scaled (Window Scale option) up to 1GB |+------------------------------------------------------------------+
CHECKSUM (16 bits):+------------------------------------------------------------------+| Coverage: Header + Data + Pseudo-header || Algorithm: One's complement sum || Purpose: Error detection || Pseudo-header includes: Source IP, Dest IP, Protocol, Length |+------------------------------------------------------------------+
URGENT POINTER (16 bits):+------------------------------------------------------------------+| Only valid when URG flag is set || Points to end of urgent data (after sequence number) || Use case: Out-of-band data (rarely used) |+------------------------------------------------------------------+
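As a rough illustration of the fixed 20-byte layout and the flag bit positions described above, the sketch below unpacks a TCP header with `struct`. The function name and the sample segment (a SYN-ACK with made-up ports and sequence numbers) are assumptions for the example; options and checksum verification are left out.

```python
import struct

# Index in this list = bit position in the 9-bit flags field (FIN is bit 0).
FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG", "ECE", "CWR", "NS"]


def parse_tcp_header(segment: bytes) -> dict:
    """Parse the fixed part of a TCP header (first 20 bytes)."""
    (src_port, dst_port, seq, ack, offset_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    data_offset = offset_flags >> 12          # upper 4 bits, in 32-bit words
    flag_bits = offset_flags & 0x01FF         # lower 9 bits
    flags = [name for bit, name in enumerate(FLAG_NAMES) if flag_bits & (1 << bit)]
    return {"src_port": src_port, "dst_port": dst_port, "seq": seq, "ack": ack,
            "header_len": data_offset * 4, "flags": flags,
            "window": window, "checksum": checksum, "urgent": urgent}


# Hypothetical SYN-ACK: Data Offset=5, SYN (bit 1) and ACK (bit 4) set.
syn_ack = struct.pack("!HHIIHHHH", 80, 49152, 200, 101,
                      (5 << 12) | 0x012, 65535, 0, 0)
tcp = parse_tcp_header(syn_ack)
print(tcp)
```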
OPTIONS (variable):+------------------------------------------------------------------+
Common TCP Options:+------------------------------------------------------------------+
| Kind | Name                 | Length   | Purpose                            |
|------|----------------------|----------|------------------------------------|
| 0    | End of Option List   | 1 byte   | Marks end of options               |
| 1    | No Operation         | 1 byte   | Padding for alignment              |
| 2    | Maximum Segment Size | 4 bytes  | MSS - max segment size             |
| 3    | Window Scale         | 3 bytes  | Shift count for window scaling     |
| 4    | SACK Permitted       | 2 bytes  | Selective ACK supported            |
| 5    | SACK                 | Variable | Selective ACK blocks               |
| 8    | Timestamp            | 10 bytes | RTTM (Round Trip Time Measurement) |
Maximum Segment Size (MSS):+------------------------------------------------------------------+| Typical: 1460 bytes (1500 MTU - 20 IP header - 20 TCP header) || Can be negotiated during handshake || Path MTU Discovery can find optimal MSS |+------------------------------------------------------------------+
Window Scaling:
+------------------------------------------------------------------+
| Without: Max window = 65,535 bytes
| With scaling: Max window = 65,535 × 2^shift (up to 1GB)
| Shift: 0-14 (modern hosts commonly advertise values around 7)
| Must be offered by both sides during the handshake
+------------------------------------------------------------------+
Selective ACK (SACK):+------------------------------------------------------------------+| Without SACK: Must retransmit from lost packet onward || With SACK: Can selectively acknowledge non-contiguous blocks || Significantly improves performance over lossy links |+------------------------------------------------------------------+
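The MSS and window scaling arithmetic above can be worked through directly. This is a small sketch with invented helper names; it assumes option-free 20-byte IP and TCP headers, as in the MSS calculation above.

```python
def mss_for_mtu(mtu: int, ip_header: int = 20, tcp_header: int = 20) -> int:
    """MSS = MTU minus the IP and TCP header sizes."""
    return mtu - ip_header - tcp_header


def scaled_window(window_field: int, shift: int) -> int:
    """Effective receive window = 16-bit window field × 2^shift."""
    if not 0 <= window_field <= 0xFFFF:
        raise ValueError("window field is 16 bits")
    if not 0 <= shift <= 14:
        raise ValueError("shift count must be 0-14")
    return window_field << shift


print(mss_for_mtu(1500))          # standard Ethernet MTU
print(scaled_window(65535, 0))    # no scaling
print(scaled_window(65535, 14))   # largest possible window, just under 1 GiB
```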
+------------------------------------------------------------------+

TCP Connection Lifecycle
+------------------------------------------------------------------+
| TCP Connection Lifecycle
+------------------------------------------------------------------+
TCP THREE-WAY HANDSHAKE:+------------------------------------------------------------------+
Step 1: SYN (Client -> Server)+------------------------------------------------------------------+| Client -> Server: SYN, Seq=100 || || Meaning: "I want to establish connection" || Client state: SYN-SENT || Client picks random Initial Sequence Number (ISN=100) |+------------------------------------------------------------------+
Step 2: SYN-ACK (Server -> Client)+------------------------------------------------------------------+| Server -> Client: SYN, ACK, Seq=200, Ack=101 || || Meaning: "I'm ready, here's my ISN, acknowledge yours" || Server state: SYN-RECEIVED || Server picks its ISN=200 || ACK=101 confirms client's ISN was received |+------------------------------------------------------------------+
Step 3: ACK (Client -> Server)+------------------------------------------------------------------+| Client -> Server: ACK, Seq=101, Ack=201 || || Meaning: "I acknowledge your ISN, connection established" || Client state: ESTABLISHED || Server state: ESTABLISHED (after receiving ACK) |+------------------------------------------------------------------+
Why Three-Way Handshake?+------------------------------------------------------------------+| Problem: A delayed SYN from old connection could cause issues || Solution: Both sides confirm they received ISN || Result: Synchronized sequence numbers prevent confusion |+------------------------------------------------------------------+
DATA TRANSFER:+------------------------------------------------------------------+
Simple Data Transfer:
+------------------------------------------------------------------+
| Client -> Server: PSH, ACK, Seq=101, Ack=201, Data="GET /"
| Server -> Client: ACK, Seq=201, Ack=106   (101 + 5 bytes)
|
| Client -> Server: ACK, Seq=106, Ack=201, Data="Hello"
| Server -> Client: ACK, Seq=201, Ack=111   (106 + 5 bytes)
+------------------------------------------------------------------+
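The sequence and acknowledgment numbers in the handshake and data transfer follow one rule: each data byte consumes one sequence number, and SYN and FIN each consume one as well. The sketch below replays the exchange with the same ISNs (100 and 200); the `Endpoint` class and `send` helper are made up for illustration and assume in-order delivery with no loss.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    name: str
    next_seq: int           # sequence number of the next byte this side sends
    next_expected: int = 0  # next byte expected from the peer (the Ack field)


def send(sender: Endpoint, receiver: Endpoint, flags: tuple, data: bytes = b"") -> dict:
    """Emit one segment and update both endpoints' counters."""
    segment = {"from": sender.name, "flags": flags, "seq": sender.next_seq,
               "ack": sender.next_expected if "ACK" in flags else None,
               "len": len(data)}
    # SYN and FIN occupy one sequence number each; data occupies len(data).
    consumed = len(data) + (1 if "SYN" in flags or "FIN" in flags else 0)
    sender.next_seq += consumed
    receiver.next_expected = segment["seq"] + consumed
    return segment


client = Endpoint("client", next_seq=100)
server = Endpoint("server", next_seq=200)
segments = [
    send(client, server, ("SYN",)),
    send(server, client, ("SYN", "ACK")),
    send(client, server, ("ACK",)),
    send(client, server, ("PSH", "ACK"), b"GET /"),
    send(server, client, ("ACK",)),
    send(client, server, ("ACK",), b"Hello"),
    send(server, client, ("ACK",)),
]
for segment in segments:
    print(segment)
```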
Flow Control:
+------------------------------------------------------------------+
| Receiver advertises its window size in every ACK
| Sender must not have more unacknowledged data in flight than
| the window allows
| Example: Window=1000, sent 1000 bytes, wait for an ACK before
|          sending more
|
| Zero Window: Receiver buffer full
| Sender sends probe packets to check when window opens
+------------------------------------------------------------------+
CONNECTION TERMINATION:+------------------------------------------------------------------+
Four-Way Handshake (Graceful Close):+------------------------------------------------------------------+
Step 1: FIN (Client -> Server)+------------------------------------------------------------------+| Client -> Server: FIN, ACK, Seq=100, Ack=200 || Client state: FIN-WAIT-1 || Meaning: "I'm done sending data" |+------------------------------------------------------------------+
Step 2: ACK (Server -> Client)
+------------------------------------------------------------------+
| Server -> Client: ACK, Seq=200, Ack=101
| Server state: CLOSE-WAIT
| Meaning: "Acknowledged; I may still have data to send"
| Client state: FIN-WAIT-2
+------------------------------------------------------------------+

Step 3: FIN (Server -> Client)
+------------------------------------------------------------------+
| Server -> Client: FIN, ACK, Seq=200, Ack=101
| Server state: LAST-ACK
| Meaning: "I'm also done sending data"
+------------------------------------------------------------------+

Step 4: ACK (Client -> Server)
+------------------------------------------------------------------+
| Client -> Server: ACK, Seq=101, Ack=201
| Client state: TIME-WAIT (for 2MSL)
| Server state: CLOSED (as soon as the ACK arrives)
+------------------------------------------------------------------+

Why TIME-WAIT (2MSL)?
+------------------------------------------------------------------+
| - Let delayed segments from the old connection expire
| - Ensure remote gets final ACK (it can be resent if the FIN is
|   retransmitted)
| - MSL = Maximum Segment Lifetime (RFC 793 suggests 2 minutes;
|   implementations commonly use 30-60 seconds)
| - Total wait: 2 × MSL, typically 1-4 minutes (Linux uses a
|   fixed 60 seconds)
| - Prevents a new connection on the same address/port pair from
|   accepting stale segments
+------------------------------------------------------------------+
Connection Reset (Abrupt Close):+------------------------------------------------------------------+
RST Flag Usage:+------------------------------------------------------------------+| Scenario 1: Connection refused || Client -> Server: SYN || Server: No listener on port || Server -> Client: RST |+------------------------------------------------------------------+
Scenario 2: Half-open connection
+------------------------------------------------------------------+
| One side crashes or reboots and loses its connection state
| Other side keeps sending on the old connection
| The restarted side has no matching connection and replies RST
+------------------------------------------------------------------+
Scenario 3: Application error+------------------------------------------------------------------+| Application crashes || Stack sends RST |+------------------------------------------------------------------+
TCP STATE TRANSITION DIAGRAM:+------------------------------------------------------------------+
Active side (client)                Passive side (server)

+--------------+                    +--------------+
| CLOSED       |                    | LISTEN       |
+--------------+                    +--------------+
       | send SYN                          | recv SYN, send SYN+ACK
       v                                   v
+--------------+                    +--------------+
| SYN-SENT     |                    | SYN-RECEIVED |
+--------------+                    +--------------+
       | recv SYN+ACK, send ACK            | recv ACK
       v                                   v
+--------------+                    +--------------+
| ESTABLISHED  |                    | ESTABLISHED  |
+--------------+                    +--------------+
       | close: send FIN                   | recv FIN, send ACK
       v                                   v
+--------------+                    +--------------+
| FIN-WAIT-1   |                    | CLOSE-WAIT   |
+--------------+                    +--------------+
       | recv ACK                          | close: send FIN
       v                                   v
+--------------+                    +--------------+
| FIN-WAIT-2   |                    | LAST-ACK     |
+--------------+                    +--------------+
       | recv FIN, send ACK                | recv ACK
       v                                   v
+--------------+                    +--------------+
| TIME-WAIT    |                    | CLOSED       |
+--------------+                    +--------------+
       | 2MSL timeout
       v
+--------------+
| CLOSED       |
+--------------+
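The states in the diagram can also be written as a lookup table, which makes the normal open and close paths easy to trace. This is a simplified sketch: the event names are invented for the example, the transitions follow the usual RFC 793 description, and simultaneous open/close, RST handling, and the CLOSING state are left out.

```python
# (current state, event) -> next state
TRANSITIONS = {
    ("CLOSED", "active_open"): "SYN-SENT",        # send SYN
    ("CLOSED", "passive_open"): "LISTEN",
    ("LISTEN", "recv_syn"): "SYN-RECEIVED",       # send SYN+ACK
    ("SYN-SENT", "recv_syn_ack"): "ESTABLISHED",  # send ACK
    ("SYN-RECEIVED", "recv_ack"): "ESTABLISHED",
    ("ESTABLISHED", "close"): "FIN-WAIT-1",       # send FIN
    ("ESTABLISHED", "recv_fin"): "CLOSE-WAIT",    # send ACK
    ("FIN-WAIT-1", "recv_ack"): "FIN-WAIT-2",
    ("FIN-WAIT-2", "recv_fin"): "TIME-WAIT",      # send ACK
    ("TIME-WAIT", "timeout_2msl"): "CLOSED",
    ("CLOSE-WAIT", "close"): "LAST-ACK",          # send FIN
    ("LAST-ACK", "recv_ack"): "CLOSED",
}


def run(start: str, events: list) -> list:
    """Return the list of states visited, starting from `start`."""
    states = [start]
    for event in events:
        key = (states[-1], event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} not valid in state {states[-1]}")
        states.append(TRANSITIONS[key])
    return states


client_path = run("CLOSED", ["active_open", "recv_syn_ack", "close",
                             "recv_ack", "recv_fin", "timeout_2msl"])
server_path = run("CLOSED", ["passive_open", "recv_syn", "recv_ack",
                             "recv_fin", "close", "recv_ack"])
print(" -> ".join(client_path))
print(" -> ".join(server_path))
```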
+------------------------------------------------------------------+

TCP Congestion Control
TCP includes sophisticated algorithms to prevent network congestion.
+------------------------------------------------------------------+| TCP Congestion Control Deep Dive |+------------------------------------------------------------------+
Why Congestion Control?+------------------------------------------------------------------+| Without it: Packet loss, timeouts, network collapse || Goal: Utilize available bandwidth without overwhelming network |+------------------------------------------------------------------+
CONGESTION WINDOW (cwnd):+------------------------------------------------------------------+| Sender's limit on unacknowledged data || Different from receive window (rwnd) || Controls how much sender can transmit |+------------------------------------------------------------------+
SLOW START:
+------------------------------------------------------------------+
| Purpose: Start conservatively, increase quickly                  |
|                                                                  |
| Algorithm:                                                       |
| - Start with cwnd = 1 MSS (typically 1460 bytes)                 |
|   (classic value; RFC 6928 permits up to 10 MSS initially)       |
| - For each ACK: cwnd = cwnd + MSS                                |
| - Exponential growth per RTT: 1, 2, 4, 8, 16...                  |
| - Until cwnd reaches ssthresh or loss occurs                     |
|                                                                  |
| Example:                                                         |
| RTT 1: Send 1 segment, get 1 ACK -> cwnd = 2                     |
| RTT 2: Send 2, get 2 ACKs -> cwnd = 4                            |
| RTT 3: Send 4, get 4 ACKs -> cwnd = 8                            |
| ...                                                              |
+------------------------------------------------------------------+

CONGESTION AVOIDANCE:
+------------------------------------------------------------------+
| Purpose: Grow window linearly once past slow start               |
|                                                                  |
| Algorithm:                                                       |
| - Per ACK: cwnd = cwnd + MSS × MSS / cwnd                        |
| - Roughly: 1 additional segment per RTT                          |
|                                                                  |
| Example:                                                         |
| cwnd = 10 MSS                                                    |
| After one RTT: cwnd = 10 + 1 = 11                                |
| After another: cwnd = 11 + 1 = 12                                |
+------------------------------------------------------------------+
FAST RETRANSMIT / FAST RECOVERY:
+------------------------------------------------------------------+
| Trigger: 3 duplicate ACKs received                               |
|                                                                  |
| Algorithm:                                                       |
| 1. ssthresh = cwnd / 2                                           |
| 2. Retransmit the missing segment (fast retransmit)              |
| 3. cwnd = ssthresh + 3 × MSS, keep sending new data              |
| 4. When a new ACK arrives: cwnd = ssthresh, enter congestion     |
|    avoidance                                                     |
+------------------------------------------------------------------+

TIMEOUT RECOVERY:
+------------------------------------------------------------------+
| Trigger: Retransmission timeout (RTO) expires, no ACK received   |
|                                                                  |
| Algorithm:                                                       |
| 1. ssthresh = cwnd / 2                                           |
| 2. cwnd = 1 MSS (start over)                                     |
| 3. Slow start again                                              |
+------------------------------------------------------------------+
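The two loss reactions above differ mainly in how far cwnd is cut. A minimal Python sketch of both follows. The function names are hypothetical, windows are in bytes, and the 2 × MSS floor on ssthresh is taken from RFC 5681 rather than from the boxes above (the RFC also bases the halving on the amount of data in flight, which is approximated here by cwnd).

```python
MSS = 1460  # bytes, as in the examples above


def on_triple_dup_ack(cwnd: int) -> tuple[int, int]:
    """Enter fast recovery after 3 duplicate ACKs; returns (cwnd, ssthresh)."""
    ssthresh = max(cwnd // 2, 2 * MSS)    # floor of 2 MSS: assumption from RFC 5681
    return ssthresh + 3 * MSS, ssthresh   # inflate by the 3 segments that left the network


def on_recovery_ack(ssthresh: int) -> int:
    """A new ACK arrived: deflate cwnd and continue in congestion avoidance."""
    return ssthresh


def on_timeout(cwnd: int) -> tuple[int, int]:
    """Retransmission timeout; returns (cwnd, ssthresh) and slow start restarts."""
    ssthresh = max(cwnd // 2, 2 * MSS)
    return MSS, ssthresh
```

Starting from cwnd = 20 MSS, three duplicate ACKs leave the sender at 13 MSS (10 MSS after recovery completes), while a timeout drops it to 1 MSS.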
TCP VARIANTS:
+------------------------------------------------------------------+

TCP Reno (Fast Recovery):
+------------------------------------------------------------------+
| Standard congestion control                                      |
| Good for moderate packet loss                                    |
+------------------------------------------------------------------+

TCP Tahoe (Early):
+------------------------------------------------------------------+
| Always does slow start after loss                                |
| More conservative                                                |
+------------------------------------------------------------------+

TCP BIC (Binary Increase Congestion):
+------------------------------------------------------------------+
| Linux default from 2.6.8 through 2.6.18                          |
| Good for high-speed networks                                     |
| Uses binary search for optimal window                            |
+------------------------------------------------------------------+

TCP CUBIC (Linux Default):
+------------------------------------------------------------------+
| Since Linux 2.6.19                                               |
| Uses cubic function for window growth                            |
| W(t) = C(t-K)³ + Wmax, t = time since last loss                  |
| Good for high-speed, high-latency networks                       |
+------------------------------------------------------------------+
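To make the CUBIC formula concrete, the sketch below evaluates W(t) = C(t-K)³ + Wmax in Python. The constants C = 0.4 and a decrease factor of 0.7 are the values suggested in RFC 8312 and are assumptions here, as are the function names; real kernels use scaled integer arithmetic and additional rules not shown.

```python
C = 0.4      # scaling constant (RFC 8312 suggestion, assumed here)
BETA = 0.7   # window is cut to BETA * w_max on loss (RFC 8312 suggestion)


def cubic_k(w_max: float) -> float:
    """Seconds needed for the curve to climb back to w_max after a loss."""
    return (w_max * (1 - BETA) / C) ** (1 / 3)


def cubic_window(t: float, w_max: float) -> float:
    """W(t) = C * (t - K)^3 + w_max, in segments, t seconds after the loss."""
    d = t - cubic_k(w_max)
    return C * d ** 3 + w_max
```

With w_max = 100 segments the curve starts at 70 segments right after the loss, flattens as it approaches 100 at t = K (about 4.2 seconds), and then accelerates again to probe for more bandwidth.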
BBR (Bottleneck Bandwidth and Round-trip propagation time):
+------------------------------------------------------------------+
| Newer algorithm from Google (2016)                               |
| Models bottleneck bandwidth and RTT instead of reacting to loss  |
| Typically less queueing and packet loss                          |
| Available in Linux 4.9+                                          |
+------------------------------------------------------------------+

Visualizing Congestion Control:
+------------------------------------------------------------------+

Window Size
    ^
    |  Slow Start  |  Congestion   |  Fast Recovery  |
    |            / |  Avoidance    |         /       |
    |          /   |          /    |      /          |
    |        /     |      /        |   /             |
    |      /       |  /            |/                |
    |    /         |               |                 |
    |  /           |               |                 |
    |/             |               |                 |
    +-------------------------------------------------> Time

Labels:
- ssthresh: Slow start threshold
- Wmax: Maximum window before loss
+------------------------------------------------------------------+

2.5 Layer 4: Application Layer

The Application Layer encompasses all protocols that applications use to communicate.

HTTP/HTTPS Deep Dive

+------------------------------------------------------------------+
|                     HTTP Protocol Deep Dive                      |
+------------------------------------------------------------------+

HTTP Request Structure:
+------------------------------------------------------------------+

Request Line:
+------------------------------------------------------------------+
| METHOD /path HTTP/Version                                        |
| Example: GET /index.html HTTP/1.1                                |
+------------------------------------------------------------------+

Headers:
+------------------------------------------------------------------+
| Header: value                                                    |
| Multiple headers, one per line                                   |
| Blank line separates headers from body                           |
+------------------------------------------------------------------+

Message Body:
+------------------------------------------------------------------+
| For POST, PUT requests                                           |
| Contains data being sent                                         |
+------------------------------------------------------------------+

Example HTTP Request:
+------------------------------------------------------------------+
GET /index.html HTTP/1.1
Host: www.example.com
User-Agent: Mozilla/5.0 (X11; Linux x86_64)
Accept: text/html,application/xhtml+xml
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
Connection: keep-alive
Upgrade-Insecure-Requests: 1
(blank line)
+------------------------------------------------------------------+
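The request structure above (request line, headers, blank line, optional body) can be taken apart with a few string operations. The Python sketch below is illustrative only: `parse_request` is a hypothetical helper, the sample request is a shortened version of the example, and a real server would rely on a proper HTTP parser.

```python
RAW_REQUEST = (
    "GET /index.html HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "Accept-Encoding: gzip, deflate, br\r\n"
    "Connection: keep-alive\r\n"
    "\r\n"
)


def parse_request(raw: str) -> tuple[str, str, str, dict[str, str]]:
    """Split a raw HTTP/1.1 request into method, path, version and headers."""
    head, _, _body = raw.partition("\r\n\r\n")    # blank line ends the headers
    request_line, *header_lines = head.split("\r\n")
    method, path, version = request_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()   # header names are case-insensitive
    return method, path, version, headers
```

Calling `parse_request(RAW_REQUEST)` yields the method `GET`, the path `/index.html`, the version `HTTP/1.1`, and a dictionary with three headers keyed by lowercase name.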
HTTP Response Structure:
+------------------------------------------------------------------+

Status Line:
+------------------------------------------------------------------+
| HTTP/VERSION STATUS_CODE STATUS_MESSAGE                          |
| Example: HTTP/1.1 200 OK                                         |
+------------------------------------------------------------------+

Headers:
+------------------------------------------------------------------+
| Same format as request                                           |
+------------------------------------------------------------------+

Message Body:
+------------------------------------------------------------------+
| Actual content                                                   |
+------------------------------------------------------------------+

Example HTTP Response:
+------------------------------------------------------------------+

HTTP/1.1 200 OK
Date: Sun, 27 Jul 2025 12:28:53 GMT
Server: Apache/2.4.41 (Ubuntu)
Content-Type: text/html; charset=UTF-8
Content-Length: 1256
Connection: keep-alive
(blank line)
<!DOCTYPE html>
<html>
<head>
 <title>Example</title>
...
</html>
+------------------------------------------------------------------+
HTTP Methods:
+------------------------------------------------------------------+

| Method  | Safe | Idempotent | Description                         |
|---------|------|------------|-------------------------------------|
| GET     | Yes  | Yes        | Retrieve resource                   |
| HEAD    | Yes  | Yes        | Like GET but headers only           |
| POST    | No   | No         | Submit data to create resource      |
| PUT     | No   | Yes        | Replace resource completely         |
| PATCH   | No   | No         | Partial update                      |
| DELETE  | No   | Yes        | Remove resource                     |
| OPTIONS | Yes  | Yes        | Query allowed methods               |
| CONNECT | No   | No         | Establish tunnel                    |
| TRACE   | Yes  | Yes        | Debug (echo back request)           |

Safe: The request does not modify the resource.
Idempotent: Repeating the same request has the same effect as sending it once.
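One practical use of the Safe and Idempotent columns is deciding whether a client may automatically re-send a request after a network error. The Python sketch below encodes the table as two sets; `can_retry` is a hypothetical helper, and real clients usually add further conditions such as retry limits and backoff.

```python
SAFE = {"GET", "HEAD", "OPTIONS", "TRACE"}
IDEMPOTENT = SAFE | {"PUT", "DELETE"}    # every safe method is also idempotent


def can_retry(method: str) -> bool:
    """Return True if re-sending the request cannot change the outcome."""
    return method.upper() in IDEMPOTENT
```

A retry of `PUT` or `DELETE` is acceptable under this rule, while `POST` and `PATCH` are not retried because a second delivery could apply the change twice.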
HTTP Status Codes:
+------------------------------------------------------------------+

1xx - Informational:
+------------------------------------------------------------------+
| 100 Continue - Client can continue                               |
| 101 Switching Protocols - Protocol upgrade (WebSocket)           |
+------------------------------------------------------------------+

2xx - Success:
+------------------------------------------------------------------+
| 200 OK - Request successful                                      |
| 201 Created - Resource created successfully                      |
| 204 No Content - Success, no body to return                      |
+------------------------------------------------------------------+

3xx - Redirection:
+------------------------------------------------------------------+
| 301 Moved Permanently - Resource moved permanently               |
| 302 Found - Temporary redirect                                   |
| 304 Not Modified - Cached version still valid                    |
| 307 Temporary Redirect - Temporary, keep method                  |
| 308 Permanent Redirect - Permanent, keep method                  |
+------------------------------------------------------------------+

4xx - Client Error:
+------------------------------------------------------------------+
| 400 Bad Request - Malformed request                              |
| 401 Unauthorized - Authentication required                       |
| 403 Forbidden - Access denied                                    |
| 404 Not Found - Resource doesn't exist                           |
| 405 Method Not Allowed                                           |
| 408 Request Timeout                                              |
| 429 Too Many Requests - Rate limited                             |
+------------------------------------------------------------------+

5xx - Server Error:
+------------------------------------------------------------------+
| 500 Internal Server Error                                        |
| 502 Bad Gateway - Upstream server error                          |
| 503 Service Unavailable - Temporary overload                     |
| 504 Gateway Timeout - Upstream timeout                           |
+------------------------------------------------------------------+
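Because the class of a status code is determined by its first digit, monitoring scripts often bucket responses that way. The sketch below uses Python's standard `http.HTTPStatus` enum for the reason phrases; the helper names `status_class` and `describe` are made up for this example.

```python
from http import HTTPStatus


def status_class(code: int) -> str:
    """Map a status code to its class using the first digit."""
    classes = {1: "informational", 2: "success", 3: "redirection",
               4: "client error", 5: "server error"}
    return classes.get(code // 100, "unknown")


def describe(code: int) -> str:
    """Format a code with its standard phrase and class."""
    status = HTTPStatus(code)    # raises ValueError for codes the stdlib does not know
    return f"{status.value} {status.phrase} ({status_class(code)})"
```

For example, `describe(404)` returns `404 Not Found (client error)`.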
HTTP Headers:
+------------------------------------------------------------------+

Request Headers:
+------------------------------------------------------------------+
| Header             | Purpose                                     |
|--------------------|---------------------------------------------|
| Host               | Domain name (required in HTTP/1.1)          |
| User-Agent         | Client application                          |
| Accept             | Acceptable content types                    |
| Accept-Language    | Acceptable languages                        |
| Accept-Encoding    | Acceptable encodings (gzip, br)             |
| Authorization      | Credentials                                 |
| Cookie             | Cookies                                     |
| Referer            | Referring page URL                          |
| Origin             | Origin for CORS                             |
+------------------------------------------------------------------+

Response Headers:
+------------------------------------------------------------------+
| Header             | Purpose                                     |
|--------------------|---------------------------------------------|
| Content-Type       | MIME type                                   |
| Content-Length     | Body length in bytes                        |
| Content-Encoding   | Encoding (gzip, br)                         |
| Set-Cookie         | Cookies to set                              |
| Cache-Control      | Caching directives                          |
| ETag               | Version identifier for caching              |
| Expires            | Expiration time                             |
| Server             | Server software                             |
| Location           | Redirect URL (for 3xx)                      |
+------------------------------------------------------------------+
HTTP/2 vs HTTP/1.1:
+------------------------------------------------------------------+

| Feature            | HTTP/1.1                | HTTP/2                     |
|--------------------|-------------------------|----------------------------|
| Multiplexing       | No (one at a time)      | Yes (parallel streams)     |
| Header Compression | None                    | HPACK                      |
| Server Push        | No                      | Yes (now rarely supported) |
| Format             | Text                    | Binary frames              |
| Prioritization     | No                      | Yes                        |
| Connection         | Persistent (keep-alive) | One multiplexed connection |

HTTP/3 (QUIC):
+------------------------------------------------------------------+
| - Runs over QUIC, which uses UDP instead of TCP                  |
| - Built-in encryption (TLS 1.3)                                  |
| - 1-RTT setup for new connections, 0-RTT when resuming           |
| - Loss on one stream does not block the others                   |
| - Supported by major browsers and servers                        |
+------------------------------------------------------------------+
HTTPS (HTTP over TLS/SSL):
+------------------------------------------------------------------+

TLS Handshake (TLS 1.2):
+------------------------------------------------------------------+
| 1. ClientHello: TLS version, cipher suites, random bytes         |
| 2. ServerHello + Certificate: Selected cipher, server cert       |
| 3. (Optional) ServerKeyExchange: DH parameters                   |
| 4. ServerHelloDone                                               |
| 5. ClientKeyExchange: Pre-master secret (RSA) or DH public key   |
| 6. ChangeCipherSpec (Client): Switch to encrypted                |
| 7. Finished (Client): Encrypted handshake complete               |
| 8. ChangeCipherSpec (Server)                                     |
| 9. Finished (Server)                                             |
| 10. Application data                                             |
|                                                                  |
| TLS 1.3 merges these steps into a single round trip (1-RTT)      |
+------------------------------------------------------------------+

Certificate Verification:
+------------------------------------------------------------------+
| - Browser checks certificate validity dates                      |
| - Verifies certificate chain up to a trusted root CA             |
| - Checks certificate matches the requested domain                |
| - Checks certificate is not revoked (CRL/OCSP)                   |
+------------------------------------------------------------------+
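Non-browser clients perform a similar set of checks. As a hedged illustration, the Python sketch below inspects the context returned by the standard library's `ssl.create_default_context()` and reports which of the checks above it enforces. The helper name `verification_settings` is hypothetical, and the result reflects Python's defaults rather than browser behavior.

```python
import ssl

# The default context loads the system trust store and enables chain and
# hostname verification; validity dates are checked as part of the chain.
context = ssl.create_default_context()


def verification_settings(ctx: ssl.SSLContext) -> dict[str, bool]:
    """Report which certificate checks this context will enforce."""
    return {
        "verifies_chain": ctx.verify_mode == ssl.CERT_REQUIRED,
        "checks_hostname": ctx.check_hostname,
        # CRL checking is opt-in via verify_flags and is off by default.
        "checks_crl": bool(ctx.verify_flags & ssl.VERIFY_CRL_CHECK_LEAF),
    }
```

With the default context, chain and hostname verification are on, while revocation checking through CRLs has to be enabled explicitly.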
+------------------------------------------------------------------+

2.6 Network Address Translation (NAT)

NAT allows multiple devices to share a single public IP address.

+------------------------------------------------------------------+
|                          NAT Deep Dive                           |
+------------------------------------------------------------------+

Why NAT?
+------------------------------------------------------------------+
| IPv4 address exhaustion                                          |
| Private addresses can't traverse internet                        |
| Single public IP for entire network                              |
+------------------------------------------------------------------+

NAT TYPES:
+------------------------------------------------------------------+

Static NAT (One-to-One):
+------------------------------------------------------------------+
| Public IP          Private IP                                    |
| 203.0.113.10  <->  192.168.1.10                                  |
| 203.0.113.11  <->  192.168.1.11                                  |
|                                                                  |
| Use case: Servers that must be publicly accessible               |
+------------------------------------------------------------------+
Dynamic NAT (Many-to-Many Pool):
+------------------------------------------------------------------+
| Pool: 203.0.113.10 - 203.0.113.20                                |
|                                                                  |
| Client A (192.168.1.10) -> Gets 203.0.113.10                     |
| Client B (192.168.1.11) -> Gets 203.0.113.11                     |
| When released, IP returns to pool                                |
+------------------------------------------------------------------+

PAT/NAT Overload (Many-to-One):
+------------------------------------------------------------------+
| All clients share a single public IP                             |
| Distinguishes connections by source port number                  |
|                                                                  |
| 192.168.1.10:5000 -> 203.0.113.5:50001                           |
| 192.168.1.11:5000 -> 203.0.113.5:50002                           |
| 192.168.1.12:5000 -> 203.0.113.5:50003                           |
|                                                                  |
| Most common for home/office networks                             |
+------------------------------------------------------------------+
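The port mapping in the PAT example can be modeled as a pair of lookup tables. The following Python sketch is a toy model under simplifying assumptions: the `PatTable` class and its method names are invented for illustration, ports are handed out sequentially from 50001, and entries never expire.

```python
from itertools import count
from typing import Optional

PUBLIC_IP = "203.0.113.5"    # shared public address from the example above


class PatTable:
    """Toy port address translation table (no timeouts, one protocol)."""

    def __init__(self, first_port: int = 50001) -> None:
        self._next_port = count(first_port)
        self._out: dict[tuple[str, int], int] = {}   # private endpoint -> public port
        self._in: dict[int, tuple[str, int]] = {}    # public port -> private endpoint

    def translate_outbound(self, ip: str, port: int) -> tuple[str, int]:
        """Rewrite a private source endpoint to the shared public endpoint."""
        key = (ip, port)
        if key not in self._out:
            public_port = next(self._next_port)
            self._out[key] = public_port
            self._in[public_port] = key
        return PUBLIC_IP, self._out[key]

    def translate_inbound(self, public_port: int) -> Optional[tuple[str, int]]:
        """Map a reply back to the private endpoint, or None if unsolicited."""
        return self._in.get(public_port)
```

The `None` result for an unknown public port corresponds to the reachability problem that port forwarding addresses: without an existing mapping, the router has no internal host to deliver the packet to.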
NAT Traversal Issues:
+------------------------------------------------------------------+
| Problem: Internal hosts can't be reached from outside            |
| Solution: Port forwarding (static NAT)                           |
|                                                                  |
| Example: Forward port 80 to internal web server                  |
| External: 203.0.113.5:80 -> 192.168.1.10:80                      |
+------------------------------------------------------------------+

NAT Table Example:
+------------------------------------------------------------------+

$ iptables -t nat -L -n -v
Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
    0     0 MASQUERADE  all  --  *      eth0    0.0.0.0/0            0.0.0.0/0

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination

NAT Traversal for P2P:
+------------------------------------------------------------------+
| Problem: Both peers behind NAT can't connect directly            |
| Solutions:                                                       |
| - STUN (Session Traversal Utilities for NAT)                     |
| - TURN (Traversal Using Relays around NAT)                       |
| - ICE (Interactive Connectivity Establishment)                   |
+------------------------------------------------------------------+
+------------------------------------------------------------------+

2.7 DNS (Domain Name System)

DNS translates domain names to IP addresses.

+------------------------------------------------------------------+
|                          DNS Deep Dive                           |
+------------------------------------------------------------------+

DNS Hierarchy:
+------------------------------------------------------------------+

                        . (Root)
                 /          |          \
              .com        .org        .net
                |           |           |
           google.com  wikipedia.org
                |
         mail.google.com
Root Servers:
+------------------------------------------------------------------+
| 13 root server identities (A-M), each with many instances        |
| Instances are reached via anycast for speed and redundancy       |
| Operated by various organizations                                |
+------------------------------------------------------------------+

DNS Query Process:
+------------------------------------------------------------------+

Recursive Query:
+------------------------------------------------------------------+
|                                                                  |
| Client -> Recursive Resolver -> Root -> TLD -> Authoritative     |
|                                                                  |
| Client: "What's google.com's IP?"                                |
| Resolver: "I'll find out for you"                                |
| Root -> Resolver: "Ask the .com servers"                         |
| .com -> Resolver: "Ask google's nameservers"                     |
| google.com NS -> Resolver: "IP is 142.250.190.46"                |
| Resolver -> Client: "142.250.190.46"                             |
|                                                                  |
| The resolver's own queries to Root/TLD/NS are iterative.         |
+------------------------------------------------------------------+
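The referral chain that the resolver follows can be imitated with a small lookup loop. In the Python sketch below, the `SERVERS` dictionary is hypothetical, hard-coded data standing in for the root, TLD, and authoritative servers; no network traffic is involved, and caching, timeouts, and most of the real protocol are omitted.

```python
# Hypothetical zone data standing in for real name servers.
SERVERS = {
    "root": {"referral": {"com.": "tld-com"}},
    "tld-com": {"referral": {"google.com.": "ns-google"}},
    "ns-google": {"answer": {"google.com.": "142.250.190.46"}},
}


def resolve(name: str) -> tuple[str, list[str]]:
    """Follow referrals from the root until a server answers authoritatively."""
    server, path = "root", []
    while True:
        path.append(server)
        zone = SERVERS[server]
        if name in zone.get("answer", {}):
            return zone["answer"][name], path
        # Follow the referral whose zone is a suffix of the queried name.
        matches = [s for z, s in zone.get("referral", {}).items() if name.endswith(z)]
        if not matches:
            raise LookupError(f"no referral for {name} at {server}")
        server = matches[0]
```

Resolving `google.com.` visits the three stand-in servers in order (root, then the .com TLD, then the authoritative server) and returns the address from the example.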
DNS Record Types:
+------------------------------------------------------------------+

| Type  | Number | Purpose                               | Example                      |
|-------|--------|---------------------------------------|------------------------------|
| A     | 1      | IPv4 address                          | example.com -> 93.184.216.34 |
| AAAA  | 28     | IPv6 address                          | example.com -> 2001:db8::1   |
| CNAME | 5      | Alias (canonical name)                | www -> example.com           |
| MX    | 15     | Mail exchange                         | @ -> mail.example            |
| NS    | 2      | Nameserver                            | @ -> ns1.example             |
| TXT   | 16     | Text record                           | SPF, DKIM                    |
| SOA   | 6      | Start of Authority                    | Zone info                    |
| PTR   | 12     | Pointer (reverse DNS)                 | IP -> hostname               |
| SRV   | 33     | Service location                      | _http._tcp -> ...            |
| CAA   | 257    | Certification Authority Authorization | Issue to...                  |
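The numbers in the table are the values that appear in the type field of a DNS message. As a sketch of how they are used, the Python code below builds the bytes of a single-question query with the standard `struct` module. The function names and the fixed query ID are assumptions for the example, and nothing is sent over the network.

```python
import struct

# Record type numbers taken from the table above.
QTYPE = {"A": 1, "NS": 2, "CNAME": 5, "MX": 15, "TXT": 16, "AAAA": 28}


def encode_name(name: str) -> bytes:
    """Encode 'example.com' as length-prefixed labels ending in a zero byte."""
    labels = name.rstrip(".").split(".")
    return b"".join(bytes([len(label)]) + label.encode("ascii") for label in labels) + b"\x00"


def build_query(name: str, rtype: str, query_id: int = 0x1234) -> bytes:
    """Build a one-question DNS query with the recursion-desired flag set."""
    flags = 0x0100    # RD bit: ask the resolver to recurse on our behalf
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", QTYPE[rtype], 1)   # class 1 = IN
    return header + question
```

A query for the MX record of `example.com` is 29 bytes long: a 12-byte header, 13 bytes for the encoded name, and 4 bytes for the type (15) and class (1).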
DNS Tools:
+------------------------------------------------------------------+

dig:
+------------------------------------------------------------------+
| $ dig example.com                                                |
|                                                                  |
| ;; QUESTION                                                      |
| example.com.                 IN  A                               |
|                                                                  |
| ;; ANSWER                                                        |
| example.com.         86400   IN  A    93.184.216.34              |
|                                                                  |
| ;; AUTHORITY                                                     |
| example.com.         172800  IN  NS   ns1.example.com            |
|                                                                  |
| ;; ADDITIONAL                                                    |
| ns1.example.com.     172800  IN  A    93.184.216.34              |
+------------------------------------------------------------------+

dig with specific record:
+------------------------------------------------------------------+
| $ dig example.com MX                                             |
| $ dig -x 93.184.216.34 (reverse)                                 |
| $ dig +trace example.com (full resolution path)                  |
+------------------------------------------------------------------+

nslookup:
+------------------------------------------------------------------+
| $ nslookup example.com                                           |
| $ nslookup -type=MX example.com                                  |
| $ nslookup 8.8.8.8 (reverse)                                     |
+------------------------------------------------------------------+

host:
+------------------------------------------------------------------+
| $ host example.com                                               |
| $ host -t AAAA example.com                                       |
+------------------------------------------------------------------+
+------------------------------------------------------------------+

2.8 Chapter Summary

In this chapter, you learned:
- ✅ TCP/IP 4-layer model - how it compares to OSI
- ✅ Link Layer - Ethernet frames, MAC addresses
- ✅ Internet Layer - IPv4 headers, ICMP
- ✅ Transport Layer - TCP headers, three-way handshake, congestion control
- ✅ Application Layer - HTTP/HTTPS, DNS
- ✅ NAT - types and how it works
- ✅ DNS - hierarchy, query process, record types
This comprehensive knowledge of TCP/IP protocols is essential for network administration and troubleshooting.
Last Updated: February 2026