TCP / IP

198.18.0.0 - 198.19.255.255 are addresses for testing network components.

The address space provided by IP is no longer sufficient to address all end devices. Possible remedies:

  • Dynamic assignment of IP addresses: This procedure is used when dialing in at the provider. It is also suitable in the local network if it can be assumed that only some of the computers are in operation at any one time. The user is assigned an IP address for the duration of a connection. The best known method is called DHCP (dynamic host configuration protocol).
  • Further development of the IP protocol: With IP version 6, an address space expanded to 128 bits is created. This means that there are enough addresses available.
  • Network Address Translation (NAT): A gateway uses a different IP address in the Internet than in the local network (private address spaces). The implementation even allows a complete private network (see above) to be operated with a single external IP address.

Network Address Translation (NAT) and IP masquerading

The limited availability of IP addresses has forced people to think about ways of covering a larger area with the existing addresses. Possibilities for connecting private networks (and this ultimately also includes a private connection with more than one PC) to the Internet with as few addresses as possible are NAT, PAT and IP masquerading. All of these procedures map private addresses according to RFC 1918, or a proprietary (unregistered) address space of a network, to publicly registered IP addresses.
  • NAT (Network Address Translation)
    With NAT (Network Address Translation), the addresses of a private network are assigned to publicly registered IP addresses via tables. The advantage is that computers that communicate with one another in a private network do not need any public IP addresses. IP addresses of internal computers that establish communication with destinations on the Internet are given a table entry in the router, which is located between the Internet Service Provider (ISP) and the private network. Due to this one-to-one assignment, these computers are not only able to establish a connection to destinations on the Internet, but they can also be reached from the Internet. However, the internal structure of the company network remains hidden from the outside world.
  • IP masquerading
    IP masquerading, which is sometimes also referred to as PAT (Port and Address Translation), maps all addresses of a private network onto a single public IP address. This is done by exchanging the port numbers in addition to the addresses for each existing connection. In this way, an entire private network only needs a single registered public IP address. The disadvantage of this solution is that the computers in the private network cannot be reached from the Internet. This method is therefore ideal for connecting two or more computers with a private connection to the Internet via dial-up networking or ISDN routers.

    With this functionality, IP masquerading comes very close to proxy and firewall solutions, although a proxy must exist and be addressed explicitly for each protocol (e.g. HTTP).
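
To make the address-and-port rewriting more concrete, here is a minimal, purely illustrative sketch of a PAT translation table in Python. The public address, the port range and the helper names are invented for the example; real masquerading is of course performed by the router or the operating system kernel.

    # Illustrative PAT/IP-masquerading table (assumed values; real NAT runs in the router/kernel).
    PUBLIC_IP = "203.0.113.5"      # the single registered address of the gateway (example)

    nat_table = {}                 # (private IP, private port) -> public port
    reverse_table = {}             # public port -> (private IP, private port)
    next_port = 40000              # arbitrary start of the translated port range

    def translate_outgoing(private_ip, private_port):
        """Map an internal endpoint to the single public address plus a unique port."""
        global next_port
        key = (private_ip, private_port)
        if key not in nat_table:
            nat_table[key] = next_port
            reverse_table[next_port] = key
            next_port += 1
        return PUBLIC_IP, nat_table[key]

    def translate_incoming(public_port):
        """Map a reply arriving at the public address back to the internal host."""
        return reverse_table.get(public_port)   # None if no matching connection exists

    # Two internal hosts share the one public address:
    print(translate_outgoing("192.168.1.10", 51000))   # ('203.0.113.5', 40000)
    print(translate_outgoing("192.168.1.11", 51000))   # ('203.0.113.5', 40001)
    print(translate_incoming(40001))                   # ('192.168.1.11', 51000)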

Subnets

Now that it is clear what a class A or B network is, we should point out the formation of subnets. These serve to subdivide an existing network into further, smaller networks.
  • Subnets are structuring options for networks without the need for additional class A, class B or class C IP addresses.
  • The standard procedure for dividing a network into subnetworks is called "subnetting".
  • The host address of the A, B or C network is divided into a subnet address (subnet ID) and a host address (the remaining, shortened host ID). Part of the host address range is therefore used to differentiate between the subnets.
  • The network address and the subnet part of the host address space are called "extended network prefix".
  • The internal subnet structure of A, B or C networks is invisible from the outside.
  • In order for routers to be able to deliver datagrams to the correct network, they must be able to differentiate between the network and host parts of the IP address.
  • Traditionally, this is done using the network mask or subnet mask.
The subnet mask is used by the computer to determine the network part and the host part of an address. It has the same structure as an IP address (32 bits or 4 bytes). By definition, all bits of the network part are set to 1 and all bits of the host part to 0. For the above address classes, the subnet mask looks as follows:
Address class    Subnet mask (binary)                     Subnet mask (decimal)
Class A          11111111 00000000 00000000 00000000      255.0.0.0
Class B          11111111 11111111 00000000 00000000      255.255.0.0
Class C          11111111 11111111 11111111 00000000      255.255.255.0

This subnet mask (also known as the "default subnet mask") can be overwritten manually.

A subnet mask for a class C network is therefore 255.255.255.0. This means that the first three bytes specify the network address and the fourth byte addresses the computer. A subnet mask with the value 255.255.0.0 would therefore indicate a class B network and the mask 255.255.255.0 stands for a C network.
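
This split can be reproduced with a few lines of Python: the network address is simply the bitwise AND of address and mask, and the host number is the remainder. The address values below are arbitrary examples.

    # Applying a subnet mask: network part = address AND mask (example values).
    import ipaddress

    addr = ipaddress.IPv4Address("192.168.98.73")
    mask = ipaddress.IPv4Address("255.255.255.0")

    network = ipaddress.IPv4Address(int(addr) & int(mask))
    host_part = int(addr) & ~int(mask) & 0xFFFFFFFF

    print(network)     # 192.168.98.0  -> network address
    print(host_part)   # 73            -> host number within the network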

Division into subnets

Network part    Host part    Number of       Number of     Subnet mask
(bits)          (bits)       subnets *)      hosts **)
8               24           1               16777216      255.0.0.0 (Class A)
9               23           2               128*65536     255.128.0.0
10              22           4               64*65536      255.192.0.0
11              21           8               32*65536      255.224.0.0
12              20           16              16*65536      255.240.0.0
13              19           32              8*65536       255.248.0.0
14              18           64              4*65536       255.252.0.0
15              17           128             2*65536       255.254.0.0
16              16           1               65536         255.255.0.0 (Class B)
17              15           2               128*256       255.255.128.0
18              14           4               64*256        255.255.192.0
19              13           8               32*256        255.255.224.0
20              12           16              16*256        255.255.240.0
21              11           32              8*256         255.255.248.0
22              10           64              4*256         255.255.252.0
23              9            128             2*256         255.255.254.0
24              8            1               256           255.255.255.0 (Class C)
25              7            2               128           255.255.255.128
26              6            4               64            255.255.255.192
27              5            8               32            255.255.255.224
28              4            16              16            255.255.255.240
29              3            32              8             255.255.255.248
30              2            64              4             255.255.255.252

Remarks:

    *)  The first and last addresses resulting from the subdivision must not be used (confusion with the network and broadcast address of the higher-level network). The number of usable subnets is therefore reduced by two:
    If the subnet part comprises n bits, you get 2^n - 2 subnets.

    **) The number of computers is also reduced by two because of the subnet address (all host bits 0) and the broadcast address (all host bits 1):
    If the host part of the IP address comprises m bits, you get 2^m - 2 hosts per subnet.

For example, a company with a class C network might want to divide it into separate segments so that the broadcast traffic of one segment cannot affect the others. In this case a subnet mask is used that divides the host addresses into several ranges. If the computers are to be divided into four equally sized subnets with 64 nodes each, the subnet mask is 255.255.255.192. The following formula applies to the mask byte:

Byte value = 256 - (number of nodes in a segment)

When subnetting was first standardized, it was forbidden to use the subnets in which all subnet bits had the value 0 or 1 (see the remarks above). In the example, this leaves only two usable subnets with 62 hosts each. Almost all systems now handle these subnets correctly ("classless" routing).

Example: Division into 4 subnets

A class C network is to be divided into four equally sized subnets. The network address is 192.168.98.0. The administrator therefore selects the subnet mask 255.255.255.192 for the subdivision. The four computers with the IP addresses 192.168.98.3, 192.168.98.73, 192.168.98.156 and 192.168.98.197 are therefore located in four different subnets between which routing must be carried out. Broadcasts in subnet 1 are not transmitted to the other subnets. For example, the company can now organize the sales computers in subnet 1, the purchasing computers in subnet 2, the development computers in subnet 3 and a network of demo computers in subnet 4. This ensures that disruptions in individual subnets remain locally confined to them and do not affect the data traffic of the entire company.
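
A quick way to check this assignment is Python's standard ipaddress module; the following sketch simply tests which of the four /26 subnets each of the example addresses falls into.

    # Which of the four /26 subnets does each example host belong to?
    import ipaddress

    network = ipaddress.ip_network("192.168.98.0/24")
    subnets = list(network.subnets(new_prefix=26))      # the four /26 subnets

    for host in ["192.168.98.3", "192.168.98.73", "192.168.98.156", "192.168.98.197"]:
        addr = ipaddress.ip_address(host)
        for i, subnet in enumerate(subnets, start=1):
            if addr in subnet:
                print(f"{host} lies in subnet {i}: {subnet}")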

In general, the following list results for a C network:

Subnets of a C network

In brackets, the reduced number of subnets (number - 2). Rows in which no usable subnets or host addresses remain cannot be used in practice.
Subnet bits    Host bits    Possible subnets    Host addresses    Subnet mask
1              7            2 (0)               126 (0)           255.255.255.128
2              6            4 (2)               62                255.255.255.192
3              5            8 (6)               30                255.255.255.224
4              4            16 (14)             14                255.255.255.240
5              3            32 (30)             6                 255.255.255.248
6              2            64 (62)             2                 255.255.255.252
7              1            128                 0                 255.255.255.254

Example: Division into 8 (6) subnets

Of the eight variably usable bits, the administrator uses the three most significant bits for the subnet and the remaining five bits for the host address. The first address of each subnet is the address in which all host bits have the value 0.
Subnet            Subnet bits    Host bits      Decimal value
                  (128 64 32)    (16 8 4 2 1)
first subnet      0 0 0          0 0 0 0 0      0
second subnet     0 0 1          0 0 0 0 0      32
third subnet      0 1 0          0 0 0 0 0      64
fourth subnet     0 1 1          0 0 0 0 0      96
fifth subnet      1 0 0          0 0 0 0 0      128
sixth subnet      1 0 1          0 0 0 0 0      160
seventh subnet    1 1 0          0 0 0 0 0      192
eighth subnet     1 1 1          0 0 0 0 0      224

The eight available subnets are now known:

192.168.0.0/27 192.168.0.32/27 192.168.0.64/27 192.168.0.96/27 192.168.0.128/27 192.168.0.160/27 192.168.0.192/27 192.168.0.224/27

Annotation:

    The number after the slash (27 in the example above) indicates how many bits of the 32-bit IP address are used as the network part.

These subnets can now be assigned to individual networks. The following table shows the network and broadcast addresses of each individual subnet and the computer addresses.

Subnet            Network    Hosts        Broadcast
                  (all values are the last octet of the IP address)
first subnet      0          1 - 30       31
second subnet     32         33 - 62      63
third subnet      64         65 - 94      95
fourth subnet     96         97 - 126     127
fifth subnet      128        129 - 158    159
sixth subnet      160        161 - 190    191
seventh subnet    192        193 - 222    223
eighth subnet     224        225 - 254    255

As a little help, there is a subnet calculator as a JavaScript program. An IP network address is entered in CIDR format (Classless Inter-Domain Routing, e.g. 10.1.2.0/24). After clicking on "Calculate", the network address, the subnet mask and the range of the associated IP addresses appear in the lower field, whereby the first address of the specified range represents the network address and the last address the broadcast address.
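
For readers without the JavaScript calculator, the same information can be computed with Python's ipaddress module; the CIDR string below is just the example from the text.

    # A rough stand-in for the subnet calculator: network, mask, broadcast and host range.
    import ipaddress

    def subnet_info(cidr: str) -> None:
        net = ipaddress.ip_network(cidr, strict=False)
        print("Network address :", net.network_address)
        print("Subnet mask     :", net.netmask)
        print("Broadcast       :", net.broadcast_address)
        print("Host range      :", net.network_address + 1, "-", net.broadcast_address - 1)
        print("Usable hosts    :", max(net.num_addresses - 2, 0))

    subnet_info("10.1.2.0/24")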

ICMP - Internet Control Message Protocol

ICMP allows error messages and control messages to be exchanged at the IP level. ICMP uses IP like an upper-layer protocol (ULP), but is an integral part of the IP implementation. It does not turn IP into a 'reliable service', but it is the only way to inform hosts and gateways about the status of the network (e.g. if a host is temporarily unreachable -> timeout).

The ICMP message is carried in the data part of the IP datagram; in the case of error messages (e.g. a timeout) it may contain the IP header and the first 64 bits of the datagram that triggered the message.

The five fields of the ICMP message have the following meanings:

Type
Identifies the ICMP message
  • 0 echo reply
  • 3 destination unreachable
  • 4 source quench
  • 5 Redirect (Change a Route)
  • 8 Echo request
  • 11 Time exceeded for a datagram
  • 12 Parameter problem on a datagram
  • 13 Timestamp request
  • 14 Timestamp reply
  • 15 Information request
  • 16 Information reply
  • 17 Address mask request
  • 18 Address mask reply
code
Detailed information on the message type
Checksum
Check sum of the ICMP message (data part of the IP datagram)
Identifier and sequence number
are used to assign incoming responses to the respective inquiries, since a station can send out several inquiries or several responses can come in to one inquiry.
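
The layout just described (type, code, checksum, identifier, sequence number) can be assembled by hand; the following sketch builds an echo request with the usual Internet checksum. Actually sending it would require a raw socket and the corresponding privileges, which is omitted here; identifier and payload are arbitrary.

    # Building an ICMP echo request (type 8, code 0) to illustrate the field layout.
    import struct

    def icmp_checksum(data: bytes) -> int:
        """16-bit one's-complement sum over the ICMP message."""
        if len(data) % 2:
            data += b"\x00"
        total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
        total = (total >> 16) + (total & 0xFFFF)
        total += total >> 16
        return ~total & 0xFFFF

    def echo_request(identifier: int, sequence: int, payload: bytes = b"ping") -> bytes:
        header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)  # checksum still 0
        checksum = icmp_checksum(header + payload)
        return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload

    packet = echo_request(identifier=0x1234, sequence=1)
    print(packet.hex())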

Let us now turn to the individual message types:

Echo request / reply
Checking the reachability of a target node. Test data can also be sent, which are then returned unchanged (-> ping command under UNIX).
Destination unreachable
The cause is described in more detail in the code field:
  • 0 Network unreachable
  • 1 Host unreachable
  • 2 Protocol unreachable
  • 3 Port unreachable
  • 4 Fragmentation needed
  • 5 Source route failed
Source quench
If more datagrams come in than a station can process, it sends this message to the sending station.
Redirect
is sent from the first gateway to hosts in the same subnet if there is a better route connection through another gateway. The IP address of the other gateway is specified in the message.
Time exceeded
There are two reasons for this message to the source node:
  • Time-to-live exceeded (Code 0): When a gateway eliminates a datagram whose TTL counter has expired.
  • Fragment reassembly time exceeded (Code 1): If a timer expires before all fragments of the datagram have arrived.
Parameter problem on a datagram
Problems interpreting the IP header. A reference to the error location and the IP header in question are returned.
Timestamp request / reply
Allows time measurements and synchronization in the network. Three times are sent (in ms since midnight, Universal Time):
  • Originate timestamp: time the request was sent (by the sender)
  • Receive timestamp: time of arrival (at the recipient)
  • Transmit timestamp: time the reply was sent (by the recipient)
Information request / reply
With this message, a host can query the netid of its network by setting its netid to zero.
Address mask request / reply
With subnetting (see above), a host can request the subnet mask of its network.

ICMP is used by the user primarily through the commands ping and traceroute ("tracert" under Windows). These commands send out ICMP echo requests and wait for the ICMP echo reply. This is how you can determine whether a node can be reached. If you want to discover all nodes in the local network, a single ping to the broadcast address is sufficient, e.g.:

ping 192.168.33.255

To display the ARP table, there is the arp command; with arp -a you get a list of the currently cached MAC addresses and their assignment to IP addresses.

Running the above ping command and the arp command one after the other yields a list of the IP and MAC addresses of the active local nodes, e.g.:

ping -b -c1 192.168.33.255
arp -a
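
The two steps can also be scripted; the sketch below simply calls the same commands from Python (Linux conventions assumed, broadcast address as in the example above).

    # Broadcast ping followed by a dump of the ARP cache (Linux, example addresses).
    import subprocess

    subprocess.run(["ping", "-b", "-c", "1", "192.168.33.255"], check=False)
    result = subprocess.run(["arp", "-a"], capture_output=True, text=True)
    print(result.stdout)   # IP addresses of active local nodes and their MAC addresses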

UDP - User Datagram Protocol

UDP is a simple Layer 4 protocol that provides an unreliable, connectionless transport service without flow control. UDP enables several independent communication relationships (multiplex connection) between two stations: The two processes of a communication relationship are identified (as with TCP, see below) using port numbers ("ports" for short) that are permanently assigned to generally known applications. However, ports can also be assigned dynamically or, in the case of an application, their behavior can be controlled using different ports. The transport units are called 'UDP datagrams' or 'user datagrams'. They have the following structure:

Source port
Identifies the sending process (if not required, the value is set to zero).
Destination port
Identifies the target node's process.
Length
Length of the UDP datagram in bytes (at least 8 = header length)
UDP checksum
Optional specification of a checksum (if not used, the field is set to zero). To calculate it, the UDP datagram is prefixed with a pseudo header of 12 bytes (which is not transmitted); it contains the IP source address, the IP destination address, a zero byte, the protocol number (UDP = 17) and the UDP length.
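
How source and destination ports identify the two processes can be seen in a minimal loopback exchange; the port number 50007 below is an arbitrary choice.

    # Minimal UDP exchange over the loopback interface.
    import socket

    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 50007))               # well-defined destination port

    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender.sendto(b"hello", ("127.0.0.1", 50007))     # source port chosen by the OS

    data, (addr, port) = receiver.recvfrom(1024)
    print(data, "from", addr, "port", port)           # shows the sender's dynamic source port

    sender.close()
    receiver.close()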

TCP - Transmission Control Protocol

Which higher-level protocol of the transport layer receives the data packet is shown in the 'Protocol' field of each IP packet. Each protocol of the transport layer is assigned a unique identification number, on the basis of which the IP layer can decide how to proceed with the packet. One of the most important protocols of the transport layer is TCP.

The task of TCP is to hide the above-mentioned deficits of IP. For the TCP user it should no longer be visible that the underlying protocol layers are sending data packets, but the user should be able to work with a byte stream like a normal file (or a terminal). Above all, TCP guarantees the correct transport of the data - each packet arrives only once, without errors and in the correct order. In addition, with TCP, several programs can use the connection between two computers quasi-simultaneously. TCP divides the connection into many virtual channels ("ports"), which are supplied with time-division multiplexed data. Only in this way is it possible that, for example, several users of a computer can use the network at the same time or that one can receive e-mail and transfer files via FTP at the same time with a single dial-up connection to the provider.

This protocol implements a connection-oriented, reliable transport service as a layer 4 protocol. Reliability is achieved through positive acknowledgments and retransmission of faulty blocks. Almost all standard applications of many operating systems use TCP and the underlying IP as the transport protocol, which is why the entire protocol family is generally summarized under 'TCP / IP'. TCP can be used in local and worldwide networks, as IP and the layers below can work with a wide variety of network and transmission systems (Ethernet, radio, serial lines, ...). A sliding window mechanism (variable window size) is used to implement flow control. TCP connections are full duplex. As with all connection-oriented services, a virtual connection must first be established and disconnected again when communication is terminated. "Connection establishment" here means an agreement between both stations about the modalities of the transmission (e.g. window size, acceptance of a certain service, etc.). As with UDP, the starting and end points of a virtual connection are identified by ports. Generally available services can be reached via 'well known' ports (-> fixed assigned port number). Other port numbers are agreed upon when the connection is established.

Two tricks are used so that the constant confirmation of each data segment does not unduly slow down the transport. First, the acknowledgment can be attached to a segment travelling in the opposite direction, which saves a separate acknowledgment segment. Second, not every byte has to be confirmed immediately; there is a so-called 'window'. The window size indicates how many bytes may be sent before the transmission has to be acknowledged. If no acknowledgment arrives, the data is sent again. The received acknowledgment contains the number of the byte that the recipient expects next, which also acknowledges all previous bytes. The window size can be changed dynamically with each acknowledgment from the recipient. If resources become scarce, the window size is reduced. In the extreme case of zero, the transmission is interrupted until the recipient acknowledges again. In addition to reliable data transport, this also provides flow control.

The principle of the window mechanism is actually quite simple. If you look at the picture, the following facts emerge:

  • The window size in the example is three bytes.
  • Byte 1 was sent by the data source and acknowledged by the recipient.
  • The source has sent bytes 2, 3 and 4, but they have not yet been acknowledged by the recipient (acknowledgment may still be on the way).
  • Byte 5 has not yet been sent by the source. It only starts its journey when the receipt for byte 2 (or higher) has arrived.
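
The bookkeeping behind this figure fits into a few lines; the following toy sketch uses the numbers from the example (window of three bytes, byte 1 acknowledged, bytes 2-4 in flight).

    # Toy model of the byte-oriented sliding window from the example.
    window_size = 3
    last_acked = 1            # byte 1 has been acknowledged
    last_sent = 4             # bytes 2, 3 and 4 are "in flight"

    def may_send(byte_number: int) -> bool:
        """A byte may be sent while it still fits into the current window."""
        return byte_number <= last_acked + window_size

    print(may_send(4))        # True  -- still inside the window
    print(may_send(5))        # False -- must wait until at least byte 2 is acknowledged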

The TCP packet is often referred to as a 'segment'. Each TCP block is preceded by a header, which is much more extensive than the previous ones:

Source port
Identifies the sending process.
Destination port
Identifies the target node's process.
Sequence Number
TCP regards the data to be transmitted as a numbered stream of bytes, whereby the number of the first byte is specified when the connection is established. This byte stream is divided into blocks (TCP segments) during transmission. The 'Sequence Number' is the number of the first data byte in the respective segment (-> the correct order of incoming segments can be restored even if they arrive via different routes).
Acknowledgment Number
This field confirms data from the receiving station, while data is being sent in the opposite direction at the same time; the confirmation is therefore piggybacked onto the data. The number refers to a sequence number of the received data; all data up to (but excluding) this number is thereby confirmed -> number of the next expected byte. The validity of the number is indicated by the ACK flag (-> code).
Data offset
Since the segment header can contain options similar to the IP header, the length of the header is specified here in 32-bit words.
Res.
Reserved for later use
code
Specification of the function of the segment:
  • URG Urgent-Pointer (see below)
  • ACK acknowledgment segment (acknowledgment number valid)
  • PSH Immediate sending of the data on the sender side (before the send buffer is full) and, on the receiver side, immediate transfer to the application (before the receive buffer is full), e.g. for interactive programs.
  • RST reset, disconnect
  • SYN The 'Sequence Number' field contains the initial byte number (ISN) -> numbering begins with ISN + 1. In the confirmation, the destination station transfers its ISN (connection establishment).
  • FIN Terminate the connection (the sender has sent all its data); the connection is closed as soon as the receiver has also received everything correctly and has no more data to send.
Window
Specifies the window size the recipient is willing to accept - can be changed dynamically.
Checksum
16-bit checksum over header and data; as with UDP, a pseudo header is included in the calculation.
Urgent pointer
Marks part of the data as urgent. This part is passed to the user program immediately, regardless of its position in the data stream (the URG flag must be set). The urgent pointer identifies the last byte to be delivered as urgent: its number is the sequence number plus the value of the urgent pointer.
Options
This field is used to exchange information between the two stations at the TCP level, e.g. the maximum segment size (which in turn should depend on the size of the IP datagram in order to optimize throughput in the network).
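
To relate the field list to actual bytes, the following sketch unpacks the fixed 20-byte TCP header with Python's struct module; the sample segment is fabricated for illustration.

    # Unpacking the fixed 20-byte TCP header described above.
    import struct

    def parse_tcp_header(segment: bytes) -> dict:
        (src_port, dst_port, seq, ack, offset_flags,
         window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
        return {
            "source port": src_port,
            "destination port": dst_port,
            "sequence number": seq,
            "acknowledgment number": ack,
            "data offset (32-bit words)": offset_flags >> 12,
            "flags URG/ACK/PSH/RST/SYN/FIN": format(offset_flags & 0x3F, "06b"),
            "window": window,
            "checksum": hex(checksum),
            "urgent pointer": urgent,
        }

    # A fabricated SYN segment: ports 49152 -> 80, data offset 5, SYN flag, window 65535.
    sample = struct.pack("!HHIIHHHH", 49152, 80, 1000, 0, (5 << 12) | 0x02, 65535, 0, 0)
    print(parse_tcp_header(sample))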

Process of a TCP session

Unlike IP, TCP is connection-oriented. That has to be the case, because TCP connections should basically behave like files for the user. This means that a TCP connection is opened and closed like a file, and its position within the data stream is well defined, just as the read or write position can be specified for a file. TCP also sends the data in larger units in order to keep the administrative overhead of header and control information to a minimum. In contrast to the IP packets, the units of the transport layer are called "segments". Each transmitted TCP segment has a unique sequence number which indicates the position of its first byte in the byte stream of the connection. Using this number, the order of the segments can be corrected and segments that have arrived twice can be sorted out. Since the length of the segment is known from the IP header, gaps in the data stream can also be discovered and the recipient can request retransmission of segments that have been lost.

When a TCP connection is opened, both communication partners exchange control information that ensures that the respective partner exists and can accept data. To do this, station A sends a segment with the request to synchronize the sequence numbers.
The introductory packet with the SYN bit set ("Synchronize" or "Open" request) announces the initial "Sequence Number" of the client. This initial "Sequence Number" is determined randomly. The ACK bit ("Acknowledge") is set for all subsequent packets. The server replies with ACK, SYN and the client confirms with ACK. It looks like this:

Station B now knows that the transmitter wants to open a connection and at which position in the data stream the transmitter will start counting. It confirms receipt of the message and in turn defines a sequence number for transmissions in the opposite direction.

Station A now confirms receipt of the sequence number from B and then begins to transmit data.
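
In practice this handshake is carried out by the operating system when an application calls connect() and accept(); the following minimal local sketch makes that visible (the port number 50008 is arbitrary).

    # The kernel performs SYN, SYN+ACK, ACK when connect()/accept() are used.
    import socket
    import threading

    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 50008))
    srv.listen(1)                                  # server side enters the LISTEN state

    def accept_one():
        conn, addr = srv.accept()                  # handshake completed by the kernel
        conn.sendall(b"hello")
        conn.close()

    threading.Thread(target=accept_one, daemon=True).start()

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(("127.0.0.1", 50008))           # sends SYN, waits for SYN+ACK, sends ACK
    print(client.recv(1024))                       # b'hello'
    client.close()
    srv.close()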

This type of exchange of control information, in which each side must confirm the actions of the other side before they can take effect, is called "three-way handshake". Even when a connection is cleared down, this ensures that both sides have received all data correctly and completely. In relation to time, a TCP / IP connection is represented as follows:

The following example shows how the TCP / IP protocol works. A message is sent from a computer in the green network to a computer in the orange network.

TCP state transition diagram

The following graphic describes the entire life cycle of a TCP connection in a relatively rough representation.

Explanation of the states:

  • LISTEN: Waiting for a connection request.
  • SYN-SENT: Waiting for a suitable connection request after a SYN has been sent.
  • SYN-RECEIVED: Waiting for confirmation of the connection request acknowledgment after both participants have received and sent a connection request.
  • ESTABLISHED: Open connection.
  • FIN-WAIT-1: Waiting for a connection termination request from the communication partner or for a confirmation of the connection termination that was previously sent.
  • FIN-WAIT-2: Waiting for a connection termination request from the communication partner.
  • CLOSE-WAIT: Waiting for a connection termination request (CLOSE) from the layer above.
  • CLOSING: Waiting for the acknowledgment of the connection termination request from the communication partner.
  • LAST-ACK: Waiting for the acknowledgment of the connection termination request that was previously sent to the communication partner.

Time monitoring

Time plays an important role in all protocol implementations: all processes are monitored with respect to time. For this purpose, so-called "timers" are started in the protocol implementation; their expiry triggers error handling.

Packet retry timer or retransmission timeout

In wide area networks with a wide variety of connection types, which are also subject to timing fluctuations, the choice of the waiting time for acknowledgments is difficult. The retransmission timeout (RTO) expires when the specified period between the sending of a TCP packet and the arrival of the associated acknowledgment is exceeded. In this case the packet has to be sent again. However, the period must not be fixed once and for all, as otherwise TCP could not be operated over networks with different delays: if you compare, for example, Ethernet and a serial connection via several gateways, there is a thousandfold difference in the transmission rate. For this reason, TCP measures for each packet the time that elapses between sending it and receiving its acknowledgment, the so-called round trip time (RTT). The measured time is fed into a formula that filters out upward and downward peaks but still adapts gradually to a longer or shorter running time. The result is the smoothed round trip time (SRTT), i.e. the mean time a packet exchange takes. This time is scaled once more to create some leeway for unforeseen delays:

    SRTT:  S = a*S + (1 - a)*R
    RTO:   T = min[U, max[L, ß*S]]

    R   measured round trip time
    S   smoothed round trip time (SRTT)
    T   retransmission timeout (RTO)
    U   upper time limit (e.g. 1 minute)
    L   lower time limit (e.g. 1 second)
    a   smoothing factor (e.g. 0.9)
    ß   scaling factor (e.g. 2.0)

The two formulas are specified by RFC 793: first the SRTT filter, then the determination of the RTO. If, after repeating the packet, the repetition timer expires one more time, the RTO is usually increased exponentially up to twelve times. Only when this increase does not show any effect is the connection considered interrupted.
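
Expressed as code, the two formulas look as follows; the constants are the example values quoted above, and the RTT measurements are invented.

    # RFC 793 style retransmission timeout from measured round trip times.
    ALPHA = 0.9          # smoothing factor a
    BETA = 2.0           # scaling factor ß
    LBOUND = 1.0         # lower limit L in seconds
    UBOUND = 60.0        # upper limit U in seconds

    srtt = None

    def update_rto(measured_rtt: float) -> float:
        """Feed one RTT measurement and return the new retransmission timeout."""
        global srtt
        srtt = measured_rtt if srtt is None else ALPHA * srtt + (1 - ALPHA) * measured_rtt
        return min(UBOUND, max(LBOUND, BETA * srtt))

    for rtt in [0.8, 1.2, 3.0, 0.9]:                   # example measurements in seconds
        print(round(update_rto(rtt), 3))               # follows the trend, not single outliers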

Persistence timer

When data is exchanged via TCP, it is in principle possible that the receive window is currently set to 0 - and at that very moment a packet is lost that should open the window again. As a result, both TCPs then wait for each other forever. An antidote to this is the persistence timer, which sends small TCP packets (1 byte) in certain time intervals and thus checks whether the recipient is ready again. If the receive window is still 0, a negative acknowledgment is returned; if it is larger, further data can be sent after the positive acknowledgment.

Standstill time or quiet time

Any possibility of confusing connections due to outdated TCP packets still wandering around the network must be prevented. This is why, after a TCP connection has been terminated, its port numbers are only released again after a period of twice the "Maximum Segment Lifetime" (MSL) has passed. Under UNIX, the MSL corresponds to the time that is entered in the TTL field of IP. The UNIX user notices this waiting time if he wants to reopen a connection between the same partners (i.e. the same port numbers) immediately after the termination: the system then informs him that the port number used is still in use. A new connection can only be established after approx. 30 seconds have elapsed.

Keep Alive Timer and Idle Timer

These are two timers that are not provided for in the TCP specification, but are implemented in UNIX systems. Both are related to each other. The keep alive timer causes an empty packet to be sent at regular intervals in order to check that the connection to the partner still exists. If the partner computer does not answer, the connection is terminated after the idle timer has expired. An application activates these timers with the KEEP_ALIVE option via the socket interface. The table below shows the values for the timers mentioned above. Note that the duration of the timers depends on the implementation and does not always have to be set to the values given below.
Settings of the TCP timers (depending on the implementation)

Timer                     Duration [s]
Retransmission timeout    dynamic
Persistence timer         5
Quiet timer               30
Keep alive timer          45
Idle timer                360

Algorithms to increase efficiency

There is a long way to go between a TCP implementation according to specification and an optimized TCP subsystem, as found in UNIX systems. Countless improvements have flowed into the UNIX TCP implementations over the years and new algorithms have been integrated in subsequent versions:

Acknowledgment delay

Normally, after receiving a packet, the recipient sends a response packet in which the size of the receive window is reduced and the data is acknowledged. After the data has been transferred to the receiving process, the data buffers in the system become free, which results in a packet being sent that enlarges the receive window again. Once the program has processed the data, a response usually follows shortly afterwards, so three packets are usually necessary for a transaction. However, it has been found that in some cases, e.g. with Telnet or SSH operation, delaying the acknowledgment packet by 0.2 seconds has advantages: after this short waiting time, all three pieces of information (receive window, acknowledgment and response) can be sent in a single packet. So that data transfers requiring high throughput are not slowed down, the acknowledgment is not delayed if the receive window has changed by at least 35% or by two maximum-size packets.

Silly Window Syndrome Avoidance

In certain situations, reception window details are sent that are so small that the network and computer are excessively burdened by the many acknowledgment packets. To prevent this, the receive window is only enlarged again if there is sufficient space (more than 1/4 of the data buffer or a maximum packet). Likewise, the sender behaves conservatively and only sends if the window offered is sufficiently large.

Nagle Algorithm or Small Packet Avoidance

Named after its inventor, John Nagle, this algorithm tries to prevent small TCP packets from being sent. If certain applications only send very small packets, the header is usually larger than the payload. The algorithm therefore tries to combine several packets. The first packet is always sent out immediately, but further data is buffered on the sender side until a full packet can be sent or an acknowledgment has been received for the previous packet. If a packet is not full, it is sent when there are no more unconfirmed packets on the way. However, problems arise with applications that send many small messages without receiving a response (e.g. SSH). In this case the Nagle algorithm can be switched off.
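
Switching the Nagle algorithm off is done per socket via the TCP_NODELAY option; a minimal sketch (the target host and port are placeholders):

    # Disable the Nagle algorithm for a latency-sensitive connection.
    import socket

    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)   # send small packets immediately
    sock.connect(("example.com", 80))                            # placeholder target
    sock.close()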

Slow Start with Congestion Avoidance

These interconnected algorithms, sometimes also referred to as Jacobson algorithms, became known relatively recently and are primarily important for slow networks and the operation of networks with gateways. It was observed over the years that the Internet provided less and less data throughput as the load increased, and in some cases almost collapsed. On closer inspection, it was found that more than half of the data consisted of repetitions of lost TCP packets. What had happened? A network path (the data buffers from the sender via possible gateways to the receiver) can only accommodate a finite amount of data. If a gateway or host is very busy with traffic, there may not be enough buffer space to accommodate packets. In this case, the packets are discarded by the gateway, whereupon the sender of the packet retransmits after the retransmission timeout has expired, thereby further and unnecessarily increasing the load on the network. The slow start algorithm tries to determine how much data can be on its way towards the recipient at a time without causing any losses. This is achieved by gradually increasing the amount of data sent out to a point where there is a steady flow of data without repetitions. Whereas the amount of data to be sent was previously determined solely by the size of the receive window, the capacity of the network path, the so-called "congestion window", is now the determining factor, with the congestion window always being smaller than or equal to the receive window. Once the congestion window has leveled out, it is only changed again when repetitions signal an increase in the network load: in this case "congestion avoidance" comes into effect. At the same time, by constantly and carefully enlarging the congestion window, an attempt is made to use resources that may become available. Due to this conservative behavior, the throughput can be increased by up to 30% and the number of repeated packets can be reduced by over 50%. In connection with these two algorithms, the determination of the retransmission timeout has also been improved: this value now adapts more quickly to changes in the RTT and prevents additional packet repetitions.
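
The interplay of slow start and congestion avoidance can be caricatured in a few lines; the window is counted in segments and all numbers are purely illustrative.

    # Toy model of slow start and congestion avoidance.
    cwnd = 1            # congestion window in segments
    ssthresh = 16       # slow start threshold

    def on_round_trip(loss: bool = False) -> int:
        global cwnd, ssthresh
        if loss:
            ssthresh = max(cwnd // 2, 2)
            cwnd = 1                   # a loss restarts slow start
        elif cwnd < ssthresh:
            cwnd *= 2                  # slow start: exponential growth
        else:
            cwnd += 1                  # congestion avoidance: linear growth
        return cwnd

    for _ in range(8):
        print(on_round_trip())         # 2 4 8 16 17 18 19 20
    print(on_round_trip(loss=True))    # 1 (threshold halved to 10)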

Ports for each service

With UDP and TCP, server processes listen on certain port numbers. By agreement, ports with low numbers are used for this. These port numbers are specified in the RFCs for the standard services. A port in "listen" mode is, so to speak, a half-open connection: only the local IP address and port are known. The server process can be duplicated by the operating system so that further requests can be handled on this port.
  • The port numbers are configured on the host system and have two functions:
    • Generally available services can be reached via 'well known' ports (-> fixed port number assigned by RFC). So they stand for a protocol that is addressed directly via the number
    • or they are agreed upon when the connection is established and assigned to a server program
  • The port specification is necessary if several server programs are running on the addressed computer.
  • The port number is in the TCP header and is 16 bits in size. In theory, up to 65535 TCP connections can be established on a computer with a single IP address.
  • Port numbers are often required as parameters when configuring Internet clients.
  • The client processes normally use free port numbers that are assigned by the local operating system (port number > 1024).

The "well known" port numbers (0 to 1023), which must be uniquely addressed worldwide, are assigned by the IANA (Internet Assigned Numbers Authority). Some examples of TCP ports (UDP uses a different mapping):

Port number    Protocol
20             FTP (data)
21             FTP (commands)
22             Secure Shell
23             Telnet
25             SMTP
53             DNS server
80             HTTP (WWW server)
110            POP3
143            IMAP

A complete port list is available from http://www.isi.edu/in-notes/iana/assignments/port-numbers.

Well known ports (1 – 1023)
    These ports are permanently assigned to an application or a protocol. The fixed assignment enables a simpler configuration. The Internet Assigned Numbers Authority (IANA) is responsible for managing these ports.
Registered ports (1024 – 49151)
    These ports are intended for various services.
Dynamically allocated ports (49152 – 65535)
    These ports are assigned dynamically. Every client can use these ports. When a process needs a port, it requests it from its host.

The IP address and port number together define a communication end point, which in the TCP / IP world is called a "socket". In most implementations, the boundary between the application layer and the transport layer is also the boundary between the operating system and the application programs. In the OSI model, this is roughly the boundary between layers 4 and 5. Therefore, IP is usually assigned roughly to layer 3 and TCP roughly to layer 4 of the OSI model. However, since TCP / IP is older and simpler than the OSI model, this classification does not fit exactly.

Port scans

When scanning, an attempt is made to determine the open ports of a computer. This is usually the first step taken by an attacker trying to break into a computer; for the same reason, a port scan is also used to check the security of your own system. Scanning methods have been developed that attempt to keep the scanning process undetected on the scanned computer.
  • TCP connect scan
    This method tries to establish a connection to a port on the target computer. The scanner lets the full three-way handshake complete before breaking the connection again (see the sketch after this list). However, this type of scan is very easy to detect and can also be easily blocked with the help of firewalls.
  • TCP SYN scan
    This method is often referred to as a "half-open scan". The scanner sends a SYN packet to the target computer, just like a normal connection setup. If the target computer responds with an RST, the scanner knows that this port is closed. However, if the target computer responds with a SYN / ACK, the port is open. In this case, the connection from the scanner is immediately terminated with an RST. This type of scanning is not as easy to discover on the target computer as the Connect Scan.
  • Stealth FIN scan
    Stealth scans should not be detected by the target computer. However, there are programs that detect precisely such scans. With the "Stealth FIN scan", only a packet with a FIN flag is sent, without an accompanying ACK flag. This type of packet is not allowed. If the port is open, the packet from the scanner is ignored by the target computer. If the port is closed, the target computer responds with an RST packet.
  • Stealth Xmastree scan
    In this scan, the FIN, URG and PUSH flags are all set together. This packet is also not permitted. If the port is open, the packet from the scanner is ignored by the target computer. If the port is closed, the target computer responds with an RST packet.
  • Stealth zero scan
    In this scan, all flags are set to zero. Everything else as above.
  • ACK scan
    This scan is used to test whether a firewall performs "stateful inspection" (e.g. FireWall-1) or whether it is just a simple packet filter that discards incoming SYN packets. The ACK scan sends a packet with the ACK flag set and a random sequence number to the ports. If the packet is allowed through by the firewall, the server sends an RST because the packet cannot be assigned to any connection; in this case the port is classified as "unfiltered". If the firewall monitors the status of a connection, the packet is discarded by the target computer without a response, or the scanner receives an ICMP destination unreachable message.
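
As an illustration of the first and simplest method, here is a minimal TCP connect scan; host and port list are placeholders, and such scans should of course only be run against systems you are allowed to test.

    # Minimal TCP connect scan over a handful of ports.
    import socket

    def connect_scan(host: str, ports) -> None:
        for port in ports:
            with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
                sock.settimeout(0.5)
                if sock.connect_ex((host, port)) == 0:   # 0: three-way handshake succeeded
                    print(f"port {port:5d} open")
                else:
                    print(f"port {port:5d} closed or filtered")

    connect_scan("127.0.0.1", [22, 25, 80, 110, 143])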

PPP

The Point-to-Point Protocol (PPP) is currently in widespread use. It works with three sub-protocols:
  • The Data Link Layer Protocol enables the transmission (encapsulation) of datagrams over serial connections with the help of HDLC.
  • The Link Control Protocol (LCP) controls the establishment, configuration and testing of the connection.
  • The Network Control Protocol (NCP) enables the transmission of configuration data for various protocols of the network layer.
PPP is suitable for the simultaneous use of different protocols of the network layer, so it is a so-called "multi-protocol protocol". It is a stateful protocol:

PPP is a connection-oriented protocol and distinguishes between three phases: connection establishment, data transmission and connection disconnection. The implementation of these phases, taking into account the partial protocols of PPP, is shown in the picture below.

  1. The calling PPP node sends LCP frames to set up and configure the connection (data link). The LCP packets have a field with configuration options. These options include, for example, the Maximum Transmission Unit (MTU), whether certain PPP fields are compressed, and the link authentication protocol to be used.
  2. In an optional phase, it is checked whether the quality of the connection is sufficient to set up a transmission of the packets of the network layer.
  3. An authentication phase follows.
  4. The calling PPP node sends NCP frames to select and configure the network layer protocol to be transmitted.
  5. The data can now be transferred.
  6. The connection remains in place until it is terminated by an LCP or NCP frame or until an external event occurs. These can include an interruption by the user, the termination of the transmission or the expiry of an "inactivity timer".

PPP supports various protocols for authentication. All protocols only implement one-sided authentication. This means that the calling node or its user must authenticate itself and the called node must check this authentication. The called node authenticates itself through its availability under this physical connection. The most important authentication protocols are:

  • the Password Authentication Protocol (PAP),
  • the Shiva Password Authentication Protocol (SPAP),
  • the Challenge Handshake Authentication Protocol (CHAP) and
  • a variant of CHAP, Microsoft-CHAP (MS-CHAP), which is available in two versions.

The most widely used authentication protocols are PAP and CHAP. MS-CHAPv2 is also quite common. Most ISPs first ask the dialing host for CHAP.

PAP uses a so-called two-way handshake. The combination "username / password" is transmitted by the calling node until the authentication is confirmed or rejected. In the event of rejection, the connection is terminated. However, this approach offers only a low level of security: the password is transmitted unencrypted, any number of repetitions is possible, and the frequency and speed of the attempts are determined by the calling node, so that a brute-force attack is possible.

CHAP offers an increased level of security through a so-called three-way handshake. The calling node may only start the authentication if it has been requested to do so by the called node; in this way the frequency and speed of attempts are determined by the called node. In addition, the combination "username / password" is never sent directly; only a hash value (Message Digest 5, MD5) computed over a challenge and the password is transmitted. The check can therefore take place not only when the connection is established, but also periodically during the connection.
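
A sketch of the response calculation as described in RFC 1994 (MD5 over the identifier, the shared secret and the challenge); the secret and the identifier below are made up.

    # CHAP response value: MD5(identifier || secret || challenge), per RFC 1994.
    import hashlib
    import os

    def chap_response(identifier: int, secret: bytes, challenge: bytes) -> bytes:
        return hashlib.md5(bytes([identifier]) + secret + challenge).digest()

    challenge = os.urandom(16)                        # sent by the called node
    response = chap_response(1, b"shared-secret", challenge)
    print(response.hex())                             # the password itself never crosses the line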

IP next generation

by Heiko Holtkamp (http://www.rvs.uni-bielefeld.de/~heiko/tcpip/tcpip.pdf)

The rapid (exponential) growth of the Internet is forcing the Internet Protocol version 4 (IPv4) to be replaced by a successor protocol (IPv6, Internet Protocol version 6).

Vinton Cerf (the 'father' of the Internet) described the Internet in an interview with c't magazine as "(...) the most important infrastructure for all types of communication". When asked how one could imagine the new communication services of the Internet, Cerf replied:

"What I find most exciting is to connect all the household appliances to the mains. I am not only thinking about the fact that the refrigerator will in future exchange information with the heating system to determine whether the kitchen is too warm. Electricity companies could, for example, control appliances such as dishwashers and allow them Providing electricity when there is no peak demand. Such applications, however, depend on being offered at an affordable price. That is not necessarily a long way off; the programmers should really just start by finally adding software for intelligent network applications And of course the security of such systems has to be guaranteed. After all, I don't want the children in the neighborhood to program my house! "

In the near future, completely new demands will be made on the Internet protocols.

Classless InterDomain Routing - CIDR

An attempt is first being made to counteract the shortage of Internet addresses caused by the constantly increasing number of users with Classless Inter-Domain Routing (CIDR). Assigning Internet addresses to classes (A, B, C, ...) wastes a large number of addresses. Class B in particular presents a problem here: many companies claim a class B network, since a class A network with up to 16 million hosts seems oversized even for a very large company, while a class C network with 254 hosts is too small.

A larger host range for class C networks (e.g. 10-bit, 1022 hosts per network) would probably have alleviated the problem of increasingly scarce IP addresses. Another problem would have arisen as a result: the entries in the routing tables would have increased many times over.

Another concept is Classless Inter-Domain Routing (RFC 1519): the remaining Class C networks are allocated in blocks of variable sizes. If, for example, 2000 addresses are required, then eight consecutive Class C networks can be assigned. In addition, the remaining class C addresses are assigned in a more restrictive and structured manner (RFC 1519). The world is divided into four zones, each of which receives part of the remaining class C address space:

As a result, each of the zones is assigned around 32 million addresses. The advantage of this procedure is that the addresses of a region can in principle be compressed into a single entry in the routing tables, and every router that receives an address outside of its region can safely ignore it.
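
The aggregation effect can be demonstrated with Python's standard ipaddress module: eight consecutive class C networks (a block of 2048 addresses; the prefixes are example values) collapse into a single routing-table entry.

    # Eight consecutive class C networks collapse into one CIDR entry.
    import ipaddress

    blocks = [ipaddress.ip_network(f"198.51.{i}.0/24") for i in range(96, 104)]
    print(list(ipaddress.collapse_addresses(blocks)))   # [IPv4Network('198.51.96.0/21')]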

Internet Protocol Version 6 - IPv6 (IP Next Generation, IPnG)

The main reason for changing the IP protocol is the limited address space and the growing size of the routing tables. CIDR provides some breathing room here, but it is clear that this measure will not be sufficient to keep the shortage of addresses under control for a longer period of time. Other reasons for changing the IP protocol are the new demands placed on the Internet, which IPv4 cannot meet. Streaming methods such as RealAudio or video on demand require a minimum throughput to be defined, which must not be undercut. With IPv4, however, such a "quality of service" cannot be defined, and therefore cannot be guaranteed. The IETF (Internet Engineering Task Force) therefore began work on a new version of IP in 1990. The main objectives of the project are:
  • Support billions of hosts, even with inefficient use of the address space
  • Reduction of the size of the routing tables
  • Simplification of the protocol so that the routers can process packets faster
  • Higher security (authentication and data protection) than today's IP
  • More emphasis on service types, especially for real-time applications
  • Support of multicasting through the possibility of defining the scope
  • Possibility for hosts to travel without changing their address (laptop)
  • Possibility for the protocol to develop further in the future
  • Supporting old and new protocols in coexistence for years
In December 1993, with RFC 1550, the IETF asked the Internet community to make suggestions for a new Internet protocol. A large number of proposals were submitted in response. These ranged from minor changes to the existing IPv4 to its complete replacement by a new protocol. From these proposals, the IETF selected the Simple Internet Protocol Plus (SIPP) as the basis for the new IP version.

When the developers started working on the new version of the Internet Protocol, a name for the project and the new protocol was required. Inspired by the television series "Star Trek - Next Generation", IP Next Generation (IPng) was chosen as the working name. Finally, the new IP was assigned an official version number: IP version 6, or IPv6 for short. Protocol number 5 (IPv5) had already been used for an experimental protocol.

The characteristics of IPv6

Many of the features of IPv4 are retained in IPv6. Nevertheless, IPv6 is generally not compatible with IPv4, but it is compatible with the overlying Internet protocols, especially the protocols of the transport layer (TCP, UDP). The main features of IPv6 are:
  • Address size: Instead of the previous 32 bits, 128 bits are now available for addresses. Theoretically, 2^128 ≈ 3.4 * 10^38 addresses can be assigned.
  • Header format: The IPv6 header has been completely redesigned. It contains only seven fields instead of the previous 13. This change enables faster processing of the packets in the routers. In contrast to IPv4, there is no longer just a single header with IPv6, but several: a datagram consists of a base header and one or more additional headers, followed by the user data.

  • Extended support of options and extensions: The expansion of the options has become necessary because some of the fields required for IPv4 are now optional. In addition, the way the options are presented also differs. This makes it easier for routers to skip options that are not intended for them.
  • Service types: IPv6 places more emphasis on the support of service types. IPv6 thus meets the demand for improved support for the transmission of video and audio data, e.g. through an option for real-time transmission.
  • Security: