Network programming with TCP: a client-server application on a TCP stream socket. Reducing congestion by reducing the data sent over the network

Journey through network protocols.

TCP and UDP are both transport-layer protocols. UDP is a connectionless protocol with no guarantee of packet delivery. TCP (Transmission Control Protocol) is a connection-oriented protocol with guaranteed packet delivery. First a handshake takes place ("Hello." | "Hello." | "Shall we chat?" | "Let's."), after which the connection is considered established. From then on, packets travel back and forth over this connection (a conversation is under way), with a check that each packet has reached the recipient. If a packet is lost, or arrives with a bad checksum, it is sent again ("say again, I didn't catch that"). TCP is therefore more reliable, but it is harder to implement and, accordingly, needs more CPU cycles and memory, which is no small consideration for microcontrollers. Examples of application protocols that use TCP include FTP, HTTP, SMTP, and many others.

TL;DR

HTTP (Hypertext Transfer Protocol) is the application protocol by which a server sends pages to our browser. HTTP is now ubiquitous on the World Wide Web for retrieving information from websites. The picture shows a lamp driven by a microcontroller running an OS, whose colors are set through a browser.

The HTTP protocol is text-based and quite simple. This is what the GET method looks like when sent by the netcat utility to the local IPv6 address of the lamp server:

~$ nc fe80::200:e2ff:fe58:b66b%mazko 80

An HTTP method is usually a short English word written in capital letters, and it is case-sensitive. Every server must support at least the GET and HEAD methods. In addition to GET and HEAD, the POST, PUT, and DELETE methods are often used. The GET method requests the contents of the specified resource; in our case GET /b HTTP/1.0, where the /b path selects the color (blue). The server's response:

HTTP/1.0 200 OK
Server: Contiki/2.4 http://www.sics.se/contiki/
Connection: close
Cache-Control: no-cache, no-store, must-revalidate
Pragma: no-cache
Expires: 0
Content-type: text/html

Contiki RGB

Red is OFF

Green is OFF

Blue is ON

The status code (200 in our case) is part of the first line of the server's response. It is a three-digit integer, and its first digit indicates the status class. The code is usually followed by a space and an explanatory phrase in English, which tells a human the reason for such a response. In our case the server ran without errors and everything was fine (OK).

Both the request and the response contain headers (each line is a separate header field, the name-value pair is separated by a colon). Headers end with an empty line, after which data can follow.
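For readers who want to poke at this exchange themselves, here is a minimal Python sketch of the same request: a raw GET is written to a TCP socket, and the status line and body come back as plain text. The lamp server is of course not reachable here, so a throwaway local HTTP server (a stand-in, not part of the original setup) answers instead.

```python
# Send a raw HTTP GET over a TCP socket and read the reply, as netcat does.
import socket
import threading
from http.server import HTTPServer, BaseHTTPRequestHandler

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-type", "text/html")
        self.end_headers()
        self.wfile.write(b"Blue is ON")
    def log_message(self, *args):      # keep the demo quiet
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
host, port = server.server_address

# The same request netcat would carry: method, path, version, blank line.
with socket.create_connection((host, port)) as sock:
    sock.sendall(b"GET /b HTTP/1.0\r\n\r\n")
    reply = b""
    while chunk := sock.recv(4096):    # HTTP/1.0: server closes when done
        reply += chunk

status_line = reply.split(b"\r\n", 1)[0]
print(status_line.decode())            # e.g. HTTP/1.0 200 OK
server.shutdown()
```

The blank line after the request line is what separates headers from data, exactly as described above.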

My browser refuses to open a link-local IPv6 address, so an additional address is configured in the microcontroller firmware, and the same prefix must also be assigned to the simulator's virtual network interface:

~$ sudo ip addr add abcd::1/64 dev mazko # linux
~$ netsh interface ipv6 set address mazko abcd::1 # windows
~$ curl http://

TCP integrates naturally into the client/server environment (see Figure 10.1). Server applications listen (listen) for incoming connection requests. For example, WWW, file transfer, or terminal access services listen for requests from clients. Communication in TCP is initiated by the client's appropriate subroutines, which open the connection to the server (see Chapter 21 on the socket API).

Fig. 10.1. The client calls the server.

In reality, the client may be another server. For example, mail servers can connect with other mail servers to send email messages between computers.
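The listen/connect pattern of Fig. 10.1 can be sketched with Python's socket API. This is a toy echo service on the loopback interface, with illustrative names; a real server would accept clients in a loop.

```python
# A sketch of Fig. 10.1: the server passively listens, the client
# actively connects, then bytes flow over the stream in both directions.
import socket
import threading

def serve_once(listener: socket.socket) -> None:
    conn, addr = listener.accept()        # blocks until a client connects
    with conn:
        data = conn.recv(1024)            # read the client's request
        conn.sendall(b"echo: " + data)    # answer on the same stream

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))           # port 0: let the OS pick a port
listener.listen()                         # passive open
threading.Thread(target=serve_once, args=(listener,), daemon=True).start()

with socket.create_connection(listener.getsockname()) as client:  # active open
    client.sendall(b"hello")
    answer = client.recv(1024)
print(answer)                             # typically b'echo: hello'
listener.close()
```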

10.2 TCP concepts

In what form should applications send data in TCP? How does TCP transfer data to IP? How do transmitting and receiving TCP protocols identify an application-to-application connection and the data elements required to implement it? All of these questions are answered in the following sections, which describe the basic concepts of TCP.

10.2.1 Input and output data streams

The conceptual model of a connection assumes that an application sends a stream of data to its peer application while simultaneously being able to receive a stream of data from its connection partner. TCP provides a full-duplex (full duplex) mode of operation, in which both data streams flow at once (see Figure 10.2).


Fig. 10.2. Applications exchange data streams.

10.2.2 Segments

TCP can convert the outgoing data stream from an application into a form suitable for placement in datagrams. How?

The application passes data to TCP, which places it into an output buffer (send buffer). TCP then cuts pieces of data from the buffer and sends them with a header attached (these pieces are called segments). Fig. 10.3 shows how data from the TCP output buffer is packed into segments. TCP passes each segment to IP for delivery as a single datagram. Packing data into well-sized chunks makes forwarding efficient, so TCP waits until an appropriate amount of data accumulates in the output buffer before creating a segment.


Fig. 10.3. Creating a TCP segment

10.2.3 Pushing

However, large chunks of data are often impractical for real-world applications. For example, when an end user's client program opens an interactive session with a remote server, the user only types commands (each followed by pressing Return).

The user's client program needs TCP to know that data must be sent to the remote host and to do so immediately. In this case a push operation is used.

If you look at the operations in an interactive session, you will find many segments carrying little data and, what is more, a push in almost every data segment. During file transfers, however, push should not be used (except on the very last segment), so that TCP can pack data into segments as efficiently as possible.
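The sockets API offers applications no explicit push call; the nearest widely available knob is TCP_NODELAY, which disables Nagle-style coalescing so small writes go out immediately. This is an analogy to push, not the push mechanism itself:

```python
# TCP_NODELAY: ask the stack to send small writes immediately rather
# than coalescing them into larger segments (relevant for interactive
# sessions, counterproductive for bulk file transfers).
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

# Verify the option took effect on this socket.
nodelay = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
print(nodelay)   # nonzero once enabled
sock.close()
```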

10.2.4 Urgent data

The application's data-forwarding model assumes an ordered stream of bytes on its way to the destination. Returning to the interactive-session example, suppose the user presses an attention or break (interrupt) key. The remote application must be able to skip the intervening bytes and respond to the keystroke as soon as possible.

The urgent data mechanism marks special information in a segment as urgent. With it, TCP tells its peer that the segment contains urgent data and can indicate where that data is. The partner should forward this information to the destination application as soon as possible.

10.2.5 Application ports

The client must identify the service it wants to access. This is done by specifying the IP address of the service's host and its TCP port number. As with UDP, TCP port numbers range from 0 to 65535. Ports 0 through 1023, known as well-known ports, are used to access standard services.

A few examples of well-known ports and their corresponding applications are shown in Table 10.1. The Discard (port 9) and Chargen (port 19) services are TCP versions of services we already know from UDP. Keep in mind that traffic on TCP port 9 is completely isolated from traffic on UDP port 9.


Table 10.1 Well-Known TCP Ports and Their Corresponding Applications

Port   Application   Description
9      Discard       Discards all incoming data
19     Chargen       Character generator; character stream exchange
20     FTP-Data      FTP data forwarding port
21     FTP           Port for the FTP dialog
23     TELNET        Port for remote login via telnet
25     SMTP          SMTP protocol port
110    POP3          Mail retrieval service for personal computers
119    NNTP          Access to network news

What about the ports used by clients? A client rarely runs on a well-known port. Instead, wanting to open a connection, it typically asks the operating system to assign it an unused, unreserved port. When the connection ends, the client returns this port, after which it can be reused by another client. Since there are more than 63,000 TCP ports in the non-reserved number pool, the limit on client ports can be ignored.
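The OS-assigned client port can be observed from Python by binding to port 0, a common idiom for "give me any free port" (the exact port chosen varies by system):

```python
# Binding to port 0 asks the OS for an unused ephemeral port; the port
# returns to the pool when the socket is closed.
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.bind(("127.0.0.1", 0))        # port 0: assign any free port
addr, port = sock.getsockname()    # see which port the OS picked
print(port)                        # some port above the well-known range
sock.close()
```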

10.2.6 Socket addresses

As we already know, the combination of an IP address and a port used for communication is called a socket address. A TCP connection is fully identified by the socket addresses at its two ends. Fig. 10.4 shows a connection between a client with socket address (128.36.1.24, port = 3358) and a server with socket address (130.42.88.22, port = 21).

Fig. 10.4. Socket addresses

The header of each datagram contains the source and destination IP addresses. Later you will see that the source and destination port numbers are specified in the TCP segment header.

Typically, a server manages multiple clients at the same time. The server's single well-known socket address is used by all of its clients at once (see Figure 10.5).


Fig. 10.5. Multiple clients connected to the server's socket address

Since a datagram carrying a TCP segment is identified by the IP addresses and ports of both ends, it is easy for a server to keep track of multiple connections to clients.
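The two socket addresses identifying a connection are visible through getsockname() and getpeername(); a quick loopback sketch shows that each side's local address is the other side's peer address:

```python
# Each TCP connection is identified by the socket addresses at its two
# ends; a server can key per-client state on the remote (IP, port) pair.
import socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen()

client = socket.create_connection(listener.getsockname())
conn, peer = listener.accept()

# The client's view of the connection mirrors the server's view.
same_a = client.getsockname() == conn.getpeername()
same_b = client.getpeername() == conn.getsockname()
print(same_a, same_b)   # True True

for s in (client, conn, listener):
    s.close()
```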

10.3 TCP Reliability Mechanism

In this section, we'll look at the mechanism TCP uses to deliver data reliably, preserving its order and avoiding loss or duplication.

10.3.1 Numbering and acknowledgment

TCP uses numbering and acknowledgment (ACK) to ensure reliable data transfer. The TCP numbering scheme is somewhat unusual: every octet forwarded over the connection is considered to have a sequence number. The TCP segment header contains the sequence number of the first data octet in that segment.

The receiver is required to acknowledge received data. If no ACK arrives within the timeout interval, the data is retransmitted. This method is called positive acknowledgment with retransmission.

The receiver of TCP data strictly checks incoming sequence numbers to verify the order in which data arrived and that no parts are missing. Since an ACK can be lost or delayed, duplicate segments may arrive at the receiver. Sequence numbers make it possible to detect duplicate data, which is then discarded.
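This positive-acknowledgment scheme can be modeled in a few lines of Python. It is a toy model, not a TCP implementation: the receiver tracks the next expected octet number, and a retransmitted duplicate is recognized by its sequence number and discarded while the cumulative ACK is simply repeated.

```python
# Toy model of positive acknowledgment with retransmission: the receiver
# accepts only the next expected octets and re-ACKs everything else.
class Receiver:
    def __init__(self, initial_seq: int):
        self.next_expected = initial_seq
        self.data = b""

    def on_segment(self, seq: int, payload: bytes) -> int:
        if seq == self.next_expected:      # in order: accept the data
            self.data += payload
            self.next_expected += len(payload)
        # duplicate (or out-of-order) data: drop it, repeat the ACK
        return self.next_expected          # cumulative ACK value

rx = Receiver(initial_seq=1001)
ack1 = rx.on_segment(1001, b"abcd")        # first copy: accepted
ack2 = rx.on_segment(1001, b"abcd")        # retransmitted copy: discarded
print(ack1, ack2)                          # 1005 1005
```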

Fig. 10.6 shows a simplified view of timeout and retransmission in TCP.


Fig. 10.6. Timeout and retransmission in TCP

10.3.2 Port, sequence, and ACK fields in the TCP header

As shown in Fig. 10.7, the first few fields of the TCP header provide space for the source and destination port values, the sequence number of the first byte of the enclosed data, and an ACK equal to the sequence number of the next byte expected from the other end. In other words, if TCP has received all bytes through byte 30 from its peer, this field holds the value 31, naming the byte expected next.


Fig. 10.7. Initial values in the TCP header fields

One small detail is worth noting. Suppose TCP has sent bytes 1 through 50 and has no more data to send. If data arrives from the peer, TCP must acknowledge it by sending a header with no data attached. Naturally, this header carries an ACK value. Its sequence field contains 51, i.e. the number of the next byte TCP intends to send. When TCP later sends new data, that new TCP header will again carry 51 in the sequence field.

10.4 Establishing a connection

How do two applications get connected? Before communicating, each calls a routine that builds a block of memory used to store the TCP and IP parameters of the connection, such as the socket addresses, the current sequence number, the initial lifetime value, and so on.

The server application waits for a client, which, wanting to reach the server, issues a connect request identifying the server's IP address and port.

There is one technical subtlety. Each side numbers its bytes starting not from one but from a random initial sequence number (we will see later why this is done). The original specification advises deriving the initial sequence number from a 32-bit internal timer that increments approximately every 4 µs.

10.4.1 Connection scenario

The connection procedure is often called a three-way handshake, since three messages are exchanged to establish a connection: SYN, SYN/ACK, and ACK.

During the establishment of a connection, partners exchange three important pieces of information:

1. The amount of buffer space for receiving data

2. The maximum amount of data carried in the incoming segment

3. Initial sequence number used for outgoing data

Note that each side uses items 1 and 2 to announce the limits within which the other side must act. A personal computer may have a small receive buffer, while a supercomputer may have a huge one. The memory structure of a personal computer may limit incoming chunks of data to 1 KB, while a supercomputer can handle larger segments.

The ability to control how the other side sends data is an important feature that makes TCP/IP scalable.

Fig. 10.8 shows an example connection scenario. Deliberately simple initial sequence numbers are used so as not to clutter the figure. Note that in this figure the client can receive larger segments than the server can.


Fig. 10.8. Establishing a connection

The following operations are performed:

1. The server initializes and becomes ready to connect with clients (this state is called passive open - passive open).

2. The client asks TCP to open a connection to the server at the specified IP address and port (this state is called active open).

3. The client's TCP picks an initial sequence number (1000 in this example) and sends a synchronize segment (SYN). This segment carries the sequence number, the size of the receive window (4K), and the size of the largest segment the client can receive (1460 bytes).

4. When the SYN arrives, the server's TCP picks its own initial sequence number (3000). It sends a SYN segment containing that initial sequence number (3000), an ACK of 1001 (meaning the first byte sent by the client should be numbered 1001), the receive window size (4K), and the size of the largest segment the server can receive (1024 bytes).

5. The client TCP, having received a SYN/ACK message from the server, sends back ACK 3001 (the first byte of the data sent by the server should be numbered as 3001).

6. Client TCP tells its application to open a connection.

7. The server TCP, having received an ACK message from the client TCP, informs its application that the connection has been opened.

The client and server announce their rules for received data, synchronize their sequence numbers, and become ready to exchange data. The TCP specification also allows another (not very good) scenario, in which peer applications perform active opens toward each other at the same time.
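The sequence-number bookkeeping of Fig. 10.8 can be replayed in a few lines, using the same toy ISNs (1000 and 3000) as the figure. A SYN consumes one sequence number, which is why each ACK is the peer's ISN plus one:

```python
# Bookkeeping of the three-way handshake in Fig. 10.8.
def handshake(client_isn: int, server_isn: int):
    syn = {"seq": client_isn}                                  # 1. client SYN
    syn_ack = {"seq": server_isn, "ack": syn["seq"] + 1}       # 2. server SYN/ACK
    ack = {"seq": syn["seq"] + 1, "ack": syn_ack["seq"] + 1}   # 3. client ACK
    return syn, syn_ack, ack

syn, syn_ack, ack = handshake(1000, 3000)
print(syn_ack["ack"], ack["ack"])   # 1001 3001
```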

10.4.2 Setting IP parameter values

An application's request to establish a connection can also specify parameters for the IP datagrams that will carry the connection's data. If no specific parameter value is specified, the default value is used.

For example, an application can choose a desired value for the IP precedence or type of service. Since each connected party independently sets its own precedence and type of service, in theory these values can differ for the two directions of data flow. In practice, the same values are usually applied in each direction.

When an application uses government or military security options, each connection endpoint must use the same security levels or the connection will fail.

10.5 Data forwarding

Data transfer starts after the three-way handshake completes (see Figure 10.9). The TCP standard allows normal data to be included in acknowledgment segments, but such data will not be delivered to the application until connection setup is complete. To simplify the numbering, 1000-byte messages are used. Each TCP segment header has an ACK field identifying the byte sequence number expected next from the connection partner.


Fig. 10.9. Simple data flow and ACKs

The first segment sent by the client contains bytes from 1001 to 2000. Its ACK field must contain the value 3001, which indicates the byte sequence number that is expected to be received from the server.

The server replies to the client with a segment containing 1000 bytes of data (beginning with number 3001). Its ACK field in the TCP header will indicate that bytes 1001 to 2000 have already been successfully received, so the next expected segment sequence number from the client should be 2001.

The client then sends segments starting with bytes 2001, 3001, and 4001 in that order. Note that the client does not expect an ACK after each segment sent. Data is sent to the peer until its buffer space is full (we will see below that the receiver can very precisely specify the amount of data to be sent to it).

The server saves connection bandwidth by using a single ACK to indicate that all segments were successfully forwarded.

Fig. 10.10 shows data forwarding in which the first segment is lost. When the timeout expires, the segment is retransmitted. Note that upon finally receiving the lost segment, the receiver sends a single ACK acknowledging the delivery of both segments.


Fig. 10.10. Data loss and retransmission

10.6 Closing a connection

Normal termination of a connection is performed with the same kind of handshake procedure used to open one. Either party can start closing the connection, in the following scenario:

A: "I've finished the job. There is no more data to send."

B: "Fine."

B: "I've also finished the job."

A: "Fine."

The following scenario is also acceptable (although it is used extremely rarely):

A: "I've finished the job. There is no more data to send."

B: "Fine. However, there is some data…"

B: "I've also finished the job."

A: "Fine."

In the example below the server closes the connection, as is often the case in client/server interactions. Here, after the user types the logout command in a telnet session, the server initiates the request to close the connection. In the situation shown in Fig. 10.11, the following actions are performed:

1. The application on the server tells TCP to close the connection.

2. The server TCP sends a Final Segment (FIN), informing its peer that there is no more data to send.

3. The client TCP sends an ACK on the FIN segment.

4. The client's TCP tells its application that the server wants to close the connection.

5. The client application informs its TCP that the connection is closed.

6. The client TCP sends a FIN message.

7. The server TCP receives the FIN from the client and responds with an ACK message.

8. The server's TCP tells its application to close the connection.


Fig. 10.11. Closing a connection

Both parties can start closing at the same time. In this case, the normal closing of the connection is completed after each of the peers sends an ACK message.
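In the sockets API the FIN exchange surfaces as shutdown() on the sending side and an end-of-stream read (recv() returning an empty byte string) on the receiving side. A loopback sketch of the half-close:

```python
# The client's shutdown(SHUT_WR) sends a FIN; the server sees it as
# end-of-stream (recv() -> b"") but can still send its remaining data.
import socket
import threading

def serve(listener: socket.socket) -> None:
    conn, _ = listener.accept()
    with conn:
        while conn.recv(1024):       # drain until the client's FIN (b"")
            pass
        conn.sendall(b"goodbye")     # half-closed: server can still send

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen()
threading.Thread(target=serve, args=(listener,), daemon=True).start()

client = socket.create_connection(listener.getsockname())
client.sendall(b"done")
client.shutdown(socket.SHUT_WR)      # send FIN; reading is still possible
farewell = client.recv(1024)
print(farewell)                      # b'goodbye'
client.close()
listener.close()
```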

10.6.1 Abrupt termination

Either party may request an abrupt termination of the connection. This is acceptable when an application wishes to terminate a connection, or when TCP detects a serious communication problem that it cannot resolve on its own. An abrupt termination is requested by sending one or more reset messages to the peer, as indicated by a specific flag in the TCP header.

10.7 Flow control

The receiver of a TCP data stream determines how much information it can accept, and this limit constrains the TCP sender. The explanation of the mechanism given here is conceptual; developers may implement it differently in their products.

During connection setup, each peer allocates space for the connection's input buffer and notifies the other party of this. Typically, the buffer size is expressed as an integer number of maximum segment sizes.

The data stream enters the input buffer and is held there until it is passed to the application (identified by its TCP port). Fig. 10.12 shows an input buffer that can hold 4 KB.


Fig. 10.12. The input buffer's receive window

The buffer space fills up as data arrives. When the receiving application pulls data from the buffer, the freed space becomes available for new incoming data.

10.7.1 Receiving window

The receive window is whatever space in the input buffer is not already occupied by data. Data remains in the input buffer until the target application consumes it. Why doesn't the application collect the data immediately?

A simple scenario helps answer this question. Suppose a client has uploaded a file to an FTP server running on a very busy multi-user computer. The FTP program must read the data from the buffer and write it to disk. While the server performs disk I/O, the program waits for those operations to complete. Meanwhile another program may be scheduled to run, and by the time the FTP program runs again, the next data has already arrived in the buffer.

The receive window extends from the last acknowledged byte to the end of the buffer. In Fig. 10.12 the entire buffer is available at first, so a 4K receive window is advertised. When the first kilobyte arrives, the receive window shrinks to 3 KB (for simplicity we assume each segment is 1 KB, although in practice this varies with the needs of the application). The arrival of the next two 1K segments reduces the receive window to 1 KB.

Each ACK sent by the receiver carries the current state of the receive window, and the data flow from the source is regulated accordingly.

For the most part the size of the input buffer is set when the connection starts, although the TCP standard does not specify how this buffer must be managed. The input buffer can grow or shrink, providing feedback to the sender.

What happens if an incoming segment fits in the receive window but arrives out of order? All implementations are generally expected to hold out-of-order data in the receive window and send an acknowledgment (ACK) only for a whole contiguous block of segments. This is the right approach, since discarding out-of-order data would significantly degrade performance.
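That buffering behavior can be modeled with a toy reassembly buffer (illustrative only): out-of-order segments are held, and the cumulative ACK advances only once the data is contiguous.

```python
# Out-of-order segments are held in the buffer; the cumulative ACK
# advances only when a contiguous run of bytes is complete.
class ReassemblyBuffer:
    def __init__(self, initial_seq: int):
        self.next_expected = initial_seq
        self.pending = {}                   # seq -> payload, held out of order

    def on_segment(self, seq: int, payload: bytes) -> int:
        self.pending[seq] = payload
        # deliver any contiguous run starting at next_expected
        while self.next_expected in self.pending:
            chunk = self.pending.pop(self.next_expected)
            self.next_expected += len(chunk)
        return self.next_expected           # cumulative ACK

buf = ReassemblyBuffer(1)
ack_a = buf.on_segment(1001, b"x" * 1000)   # out of order: held, ACK still 1
ack_b = buf.on_segment(1, b"x" * 1000)      # fills the gap: ACK jumps to 2001
print(ack_a, ack_b)                         # 1 2001
```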

10.7.2 Send window

A system transmitting data must track two quantities: how much data has already been sent and acknowledged, and the current size of the receiver's receive window. The active send space extends from the first unacknowledged octet and is bounded by the current receive window; the unused part of the window indicates how much additional data can still be sent to the partner.

The initial sequence number and initial receive window size are set during connection setup. Fig. 10.13 illustrates some features of the data transfer mechanism.

1. The sender starts with a send window of 4 KB.

2. The sender sends 1 KB. A copy of this data is retained until an acknowledgment (ACK) is received as it may need to be retransmitted.

3. An ACK for the first kilobyte arrives, and the next 2 KB of data are sent. The result is shown in the third part from the top of Fig. 10.13. These 2 KB remain stored.

4. Finally, an ACK arrives for all transmitted data (i.e., everything has been received by the receiver). This ACK restores the send window to 4 KB.

Fig. 10.13. The send window

Several interesting features should be pointed out:

■ The sender does not wait for an ACK for each data segment it sends. The only limit on transmission is the size of the receive window (for example, the sender may have no more than 4 KB of unacknowledged data outstanding).

■ Suppose the sender transmits data in several very short segments (for example, 80 bytes each). In this case the data can be repackaged for more efficient transmission (e.g., into a single segment).
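The first of these points is simple window arithmetic, sketched here with the numbers of Fig. 10.13:

```python
# Send-window accounting: the sender may have at most the peer's
# advertised window of unacknowledged data outstanding.
def usable_window(advertised_window: int, last_byte_sent: int,
                  last_byte_acked: int) -> int:
    in_flight = last_byte_sent - last_byte_acked   # sent but unacknowledged
    return advertised_window - in_flight

# The scenario of Fig. 10.13: a 4 KB window, 1 KB sent and unacknowledged.
remaining = usable_window(4096, last_byte_sent=1024, last_byte_acked=0)
print(remaining)   # 3072

# Once the ACK for that 1 KB arrives, the full window is usable again.
restored = usable_window(4096, last_byte_sent=1024, last_byte_acked=1024)
print(restored)    # 4096
```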

10.8 TCP header

Fig. 10.14 shows the segment format (TCP header plus data). The header starts with the source and destination port identifiers. The next field, the sequence number, indicates the position in the outgoing data stream that this segment occupies. The ACK (acknowledgment) field identifies the next byte expected in the incoming data stream.


Fig. 10.14. The TCP segment

There are six flags: URG, ACK, PSH, RST, SYN, and FIN.

The Data Offset field contains the size of the TCP header in 32-bit words. The TCP header must end on a 32-bit boundary.

10.8.1 Maximum segment size option

The maximum segment size (MSS) parameter is used to declare the largest piece of data that the system can receive and process. The name is somewhat inaccurate, however. Usually in TCP a segment is understood as header plus data, yet the maximum segment size is defined as:

The size of the largest datagram that can be received, minus 40

In other words, the MSS reflects the largest payload at the receiver when the TCP and IP headers are 20 bytes each. If there are additional options, their length must be subtracted. Therefore, the amount of data that can be sent in a segment is defined as:

Declared MSS value + 40 − (sum of the TCP and IP header lengths)

Typically, peers exchange MSS values in the initial SYN messages when a connection is opened. If a system does not advertise a maximum segment size, the default value of 536 bytes is used.

The maximum segment size option is encoded as a 2-byte preamble followed by a 2-byte value, so the largest possible value is 2^16 − 1 (65,535 bytes).

The MSS imposes a hard upper limit on the data sent to TCP: the receiver cannot process anything larger. However, the sender may use smaller segments, since the path MTU for the connection is also determined.
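This MSS arithmetic can be captured in a small helper (assuming 20-byte IP and TCP headers with no options, as the text does):

```python
# Usable payload per segment is bounded both by the peer's advertised
# MSS and by the path MTU (so the segment fits in one datagram).
def segment_payload(peer_mss: int, path_mtu: int,
                    ip_header: int = 20, tcp_header: int = 20) -> int:
    by_mss = peer_mss + 40 - (ip_header + tcp_header)  # the book's formula
    by_mtu = path_mtu - (ip_header + tcp_header)       # fit in one datagram
    return min(by_mss, by_mtu)

# Default MSS of 536 over a standard 1500-byte Ethernet MTU:
print(segment_payload(536, 1500))   # 536
# A 1460-byte MSS is exactly what a 1500-byte MTU allows:
print(segment_payload(1460, 1500))  # 1460
```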

10.8.2 Using header fields in a connection request

The first segment sent to open a connection has the SYN flag set to 1 and the ACK flag set to 0. The initial SYN is the only segment whose ACK field is 0. Note that security tools use this feature to detect incoming requests to open TCP sessions.

The sequence number field contains the initial sequence number, and the window field the initial size of the receive window. The only TCP option currently defined is the maximum segment size (when it is not specified, the default of 536 bytes is used) that TCP expects to receive. This option is 32 bits long and usually appears in the connection request in the options field. A TCP header carrying the MSS option is 24 bytes long.

10.8.3 Using header fields in a connection response

In a response granting the connection request, both flags (SYN and ACK) are set to 1. The responding system indicates its initial sequence number in the corresponding field and the size of its receive window in the Window field. The maximum segment size that the responder wishes to use is usually found in the connection response (in the options). It may differ from the value of the party requesting the connection: the two directions may use different values.

A connection request can be rejected by setting the reset flag (RST) to 1 in the response.

10.8.4 Selecting a starting sequence number

The TCP specification assumes that, during connection establishment, each party chooses an initial sequence number (based on the current value of a 32-bit internal timer). Why is this done?

Imagine what happens when a system crashes. Suppose a user opened a connection just before the crash and sent a small amount of data. After recovery, the system remembers nothing of what was done before the crash, including which connections were running and which port numbers were assigned. The user re-establishes the connection. The port numbers do not match the original assignments, and some of them may already be in use by other connections established a few seconds before the crash.

Therefore, the party at the other end of the connection may not realize that its partner crashed and was then restored. All this leads to serious trouble, especially when old data lingers in the network long enough to mix with data from the newly created connection. Starting the timer afresh (fresh start) eliminates such problems: the old data will carry numbering outside the sequence-number range of the new connection. Hackers, spoofing the source IP address of a trusted host, try to gain access to computers by guessing a predictable initial sequence number. A cryptographic hash function based on internal keys is the best way to generate secure initial sequence numbers.

10.8.5 Common use of fields

When a TCP header is prepared for transmission, the sequence number of the first octet of the transmitted data is placed in the Sequence Number field.

The number of the next octet expected from the connection partner is entered in the Acknowledgment Number field when the ACK bit is set to 1. The Window field holds the current size of the receive window: the number of bytes, counted from the acknowledgment number, that can be accepted. Note that this value allows precise control of the data flow: with it, the peer reports the actual state of its receive window throughout the exchange.

If the application requests a TCP push operation, the PUSH flag is set to 1. The receiving TCP must react to this flag by promptly delivering the data to the application, just as the sender intended.

The URGENT flag, when set to 1, signals urgent data, and the corresponding pointer must point to the last octet of the urgent data. A typical use of urgent data is sending cancel or abort signals from a terminal.

Urgent data is often called out-of-band information. This term is inaccurate, however: urgent data travels in the normal TCP stream, although individual implementations may provide special mechanisms that tell an application urgent data has arrived, so that the application can examine it before all the bytes of the message have come in.

The RESET flag is set to 1 when a connection should be aborted. The same flag is set in the response when a segment is received that is not associated with any of the current TCP connections.

The FIN flag is set to 1 for connection close messages.


10.8.6 Checksum

The IP checksum covers only the IP header, whereas the TCP checksum is computed over the entire segment plus a pseudo header built from the IP header. During the TCP checksum computation, the checksum field itself is set to 0. Fig. 10.15 shows the pseudo header, which closely resembles the one used in the UDP checksum.


Fig. 10.15. The pseudo header fields included in the TCP checksum

The TCP length is computed by adding the TCP header length to the data length. Unlike in UDP, the TCP checksum is mandatory. The receiver first computes the checksum of an incoming segment and then compares it with the contents of the checksum field in the TCP header. If the values do not match, the segment is discarded.
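The computation can be reproduced in Python. This is a sketch of the standard Internet checksum over a pseudo header plus segment; the header layout follows Fig. 10.14, and the addresses and port numbers are made up for illustration.

```python
# One's-complement checksum over a pseudo header plus TCP segment, as in
# Fig. 10.15; the checksum field is zero while the sum is computed.
import struct

def inet_checksum(data: bytes) -> int:
    if len(data) % 2:
        data += b"\x00"                       # pad to a 16-bit boundary
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                        # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def tcp_checksum(src_ip: bytes, dst_ip: bytes, segment: bytes) -> int:
    # pseudo header: source IP, destination IP, zero, protocol 6, TCP length
    pseudo = src_ip + dst_ip + struct.pack("!BBH", 0, 6, len(segment))
    return inet_checksum(pseudo + segment)

# A 20-byte SYN header: ports, seq, ack, offset/flags, window, csum, urgent.
src, dst = bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2])
header = struct.pack("!HHIIBBHHH", 1024, 80, 1, 0, 5 << 4, 0x02, 4096, 0, 0)
csum = tcp_checksum(src, dst, header)

# The receiver sums over the segment with the checksum in place: a
# correct segment yields 0.
verified = struct.pack("!HHIIBBHHH", 1024, 80, 1, 0, 5 << 4, 0x02, 4096, csum, 0)
print(tcp_checksum(src, dst, verified))   # 0
```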

10.9 TCP Segment Example

Fig. 10.16, produced by the Sniffer protocol analyzer from Network General, shows a sequence of TCP segments. The first three segments establish the connection between a client and a telnet server. The last segment carries 12 bytes of data.


Fig. 10.16. TCP header display by the Sniffer analyzer

The Sniffer analyzer translates most values into decimal, but flag values are printed in hexadecimal: the flag value 12 corresponds to 010010. The checksum is also printed in hexadecimal.

10.10 Support for session operation

10.10.1 Window probing

A fast sender and a slow receiver can end up with a 0-byte receive window. This outcome is called closing the window. When free space appears, an ACK is used to update the receive window size. If such a message is lost, however, both parties could wait forever.

To avoid this situation, the sender sets a persist timer when the receive window closes. The timer's value is the retransmission timeout. When the timer expires, a window probe segment is sent to the partner (some implementations include data in it). The probe makes the peer send back an ACK reporting the current state of the window.

If the window is still zero, the persist timer value is doubled. This process repeats until the timer reaches a maximum of 60 s. TCP then continues to send probe messages every 60 seconds until the window opens, the user terminates the process, or the application times out.

10.11 Ending a session

10.11.1 Timeout

A connection partner may crash, or the connection may be completely interrupted by a gateway or link failure. TCP has several mechanisms that prevent it from retransmitting data forever.

Upon reaching the first retransmission (relay) threshold, TCP tells IP to check for the failed router and at the same time informs the application of the problem. TCP continues to send data until the second limit value is reached, and only then closes the connection.

Of course, before this happens, there may be an ICMP message indicating that the destination is unreachable for some reason. In some implementations, even after this, TCP will continue trying to access the destination until the timeout interval expires (at which point the problem may be fixed). Next, the application is informed that the destination is unreachable.

An application can set its own data delivery timeout and perform its own operations when this interval expires. Usually the connection is terminated.

10.11.2 Maintaining a connection

A connection that has had no data to send for a long time is considered idle. During a period of inactivity, a network crash or a physical link failure may occur; as soon as the network becomes operational again, the partners can continue to exchange data without interrupting the session. This strategy met the requirements of the Department of Defense.

However, any connection, active or idle, occupies a considerable amount of computer memory, and some administrators need to reclaim unused resources for their systems. Therefore, many TCP implementations can send keep-alive messages that test inactive connections. Such messages are sent to the partner periodically to check that it is still on the network; the expected responses are ACKs. The use of keep-alive messages is optional, and if a system has this capability, the application can override it with its own mechanism. The default period for the keep-alive timeout is a full two hours!

Recall that an application can set its own timer and decide, at its own level, when to terminate the connection.

10.12 Performance

How efficient is TCP? Performance is affected by many factors, chief among them memory and bandwidth (see Fig. 10.17).


Fig. 10.17. TCP performance factors

Bandwidth and delays in the physical network in use severely limit throughput. Poor data transfer quality results in a large volume of discarded datagrams, which causes retransmissions and consequently reduces bandwidth efficiency.

The receiving side must provide sufficient buffer space to allow the sender to transfer data without pauses in operation. This is especially important for networks with high latency, where there is a long time between sending data and receiving ACKs (and also when negotiating the window size). To maintain a steady stream of data from the source, the receiving side must have a window no smaller than the product of bandwidth and delay.

For example, if the source can send data at a rate of 10,000 bytes/s, and it takes 2 seconds to return an ACK, then the receiving window on the other side must be at least 20,000 bytes in size, otherwise the data flow will not be continuous. A receive buffer of 10,000 bytes will cut the throughput in half.
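The rule just stated, window size at least bandwidth times delay, can be checked with a one-line helper. The figures are the ones from the example in the text, not measured values:

```python
def required_window(bytes_per_sec: float, rtt_sec: float) -> float:
    """Smallest receive window (in bytes) that keeps the sender busy:
    the bandwidth-delay product."""
    return bytes_per_sec * rtt_sec

# 10,000 bytes/s with a 2 s ACK round trip needs a 20,000-byte window.
window = required_window(10_000, 2.0)
```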

Another important performance factor is the host's ability to respond to high-priority events and to switch context quickly, i.e., to suspend one operation and take up another. A host may interactively support many local users, background batch processes, and dozens of simultaneous communication connections. Context switching makes it possible to serve all of these operations, but it imposes its own load on the system. Implementations that integrate TCP/IP with the operating system kernel can significantly reduce context-switching overhead.

Computer CPU resources are required for TCP header processing operations. If the processor cannot quickly calculate the checksums, this leads to a decrease in the speed of data transfer over the network.

In addition, developers should look to simplify the configuration of TCP settings so that a network administrator can customize them to suit their local requirements. For example, the ability to adjust the buffer size for bandwidth and network latency will greatly improve performance. Unfortunately, many implementations do not pay enough attention to this issue and hard-code the communication parameters.

Let's assume that the network environment is perfect: there are sufficient resources and context switching is faster than cowboys draw their guns. Will excellent performance be obtained?

Not always. The quality of the TCP software matters as well. Over the years, many performance problems have been diagnosed and fixed in various TCP implementations. The best software complies with RFC 1122, which defines the communication-layer requirements for Internet hosts.

Equally important is the application of the Jacobson, Karn, and Partridge algorithms (these interesting algorithms are discussed below).

Software developers can gain significant benefits by creating programs that eliminate unnecessary small data transfers and have built-in timers to free network resources that are not currently in use.

10.13 Algorithms for improving performance

Moving on to an introduction to the rather complex part of TCP, we'll look at mechanisms for improving performance and dealing with throughput degradations. This section discusses the following issues:

■ Slow start prevents a new session from consuming a large share of network capacity, which could otherwise lead to congestion.

■ Silly window syndrome avoidance prevents poorly designed applications from flooding the network with tiny messages.

■ Delayed ACKs reduce congestion by reducing the number of stand-alone acknowledgment messages.

■ Computed retransmission timeout adapts to the real-time behavior of the session, reducing unnecessary retransmissions without delaying genuinely needed exchanges.

■ Throttling TCP transmission during network congestion allows routers to return to normal operation and share network resources among all sessions.

■ Sending duplicate ACKs when a segment arrives out of sequence lets peers retransmit before a timeout occurs.

10.13.1 Slow start

If all the electrical appliances in a house are turned on at the same time, the electrical circuit will be overloaded. In computer networks, slow start keeps the fuses from blowing.

A new connection that instantly starts pumping a large amount of data into an already busy network can cause problems. The idea of slow start is to ensure that a new connection ramps up successfully, increasing its transfer rate gradually in line with the actual load on the network. The sender is limited by the size of the congestion window, not by the (possibly larger) receive window.

The congestion window starts at a size of 1 segment. For each segment successfully acknowledged, the congestion window grows by 1 segment, as long as it remains smaller than the receive window. If the network is not congested, the congestion window gradually reaches the size of the receive window; in the normal steady state, the two windows are the same size.

Note that slow start is not all that slow. After the first ACK, the congestion window is 2 segments; after those two segments are acknowledged, it can grow to 4 segments, and after the next round, to 8. In other words, the window size increases exponentially.

Now suppose that instead of an ACK, a timeout occurs. The behavior of the congestion window in that case is discussed below.
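A toy simulation of this growth (loss-free, measured in segments, capped only by the receive window) might look like the following sketch:

```python
def slow_start(recv_window: int, rounds: int) -> list[int]:
    """Congestion window, in segments, at the start of each ACK round.
    Every ACKed segment adds one segment to the window, so the window
    doubles per round until it reaches the receive window size
    (losses and timeouts are not modeled here)."""
    cwnd, history = 1, []
    for _ in range(rounds):
        history.append(cwnd)
        cwnd = min(cwnd * 2, recv_window)
    return history

# With an 8-segment receive window the window climbs 1, 2, 4, 8 and stays.
growth = slow_start(8, 5)
```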

10.13.2 Silly window syndrome

In early TCP/IP implementations, developers encountered the silly window syndrome (SWS), which appeared quite often. To understand what happens, consider the following scenario, which has undesirable consequences but is entirely possible:

1. The sending application sends data quickly.

2. The receiving application reads 1 byte of data from the input buffer (i.e. slowly).

3. The input buffer quickly fills up.

4. The receiving application reads 1 byte and the TCP sends an ACK meaning "I have free space for 1 byte of data".

5. The transmitting application sends a TCP packet of 1 byte over the network.

6. The receiving TCP sends an ACK meaning "Thank you. I received the packet and have no more free space."

7. The receiving application again reads 1 byte and sends an ACK, and the whole process is repeated.

A slow receiving application keeps data waiting for a long time and constantly nudges the left edge of the window along by tiny amounts, an almost useless activity that generates extra traffic on the network.

Real situations are, of course, not this extreme. A fast sender and a slow receiver will exchange small (relative to the maximum segment size) chunks of data over a nearly closed receive window. Fig. 10.18 shows the conditions under which silly window syndrome appears.


Fig. 10.18. Receive window buffer with very little free space

Solving this problem is easy. As soon as the receive window shrinks below a given target size, TCP begins to deceive the sender: TCP must not advertise additional window space as the receiving application reads data from the buffer in small chunks. Instead, the freed resources are kept secret from the sender until enough of them have accumulated. The recommended target is one segment, unless the entire input buffer holds only a single segment (in that case a size equal to half the buffer is used). The target size that TCP should report can be expressed as:

minimum(1/2 input buffer, Max segment size)

TCP cheats while the available window is smaller than this target and tells the truth once the window reaches it. Note that the sender is not harmed, since the receiving application could not process the data any faster anyway.

It is easy to verify the proposed solution against the byte-by-byte ACK scenario discussed above. The same method also works when the input buffer can store several segments (as is often the case in practice). The fast sender fills the input buffer, but the receiver indicates that it has no free space and does not open this resource until a whole segment's worth is available.
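The receiver-side rule can be condensed into a small function. This is a sketch; the buffer sizes and MSS values used below are illustrative:

```python
def advertised_window(free_space: int, buffer_size: int, mss: int) -> int:
    """SWS avoidance at the receiver: hide freed space until it reaches
    min(half the buffer, one maximum segment size); below that target,
    advertise a zero window ("cheat")."""
    target = min(buffer_size // 2, mss)
    return free_space if free_space >= target else 0
```

With an 8192-byte buffer and a 1460-byte MSS, one free byte is reported as a zero window, while 1460 free bytes are advertised in full; with a tiny 1000-byte buffer, the half-buffer rule (500 bytes) applies instead.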

10.13.3 Nagle's algorithm

Independently of the receiver, the sender should avoid transmitting very short segments by accumulating data before sending. Nagle's algorithm implements a very simple idea for reducing the number of short datagrams sent over the network.

The algorithm recommends delaying transmission (and pushes) while an ACK for previously transmitted data is outstanding. The accumulated data is sent when an ACK arrives for a previously sent piece of information, when a full segment's worth of data has accumulated, or when a timeout expires. The algorithm should not be used for real-time applications that must send data as quickly as possible.
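The core decision of Nagle's algorithm fits in a few lines. Timeout handling is deliberately omitted from this sketch:

```python
def nagle_may_send(pending_bytes: int, mss: int, unacked_data: bool) -> bool:
    """Send now only if a full segment has accumulated or nothing is in
    flight; otherwise hold the bytes until the outstanding ACK arrives
    (or a timeout fires, which this sketch does not model)."""
    return pending_bytes >= mss or not unacked_data

# 100 buffered bytes go out at once when nothing is in flight,
# but are held back while an ACK is still outstanding.
```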

10.13.4 Delayed ACK

Another performance-improving mechanism is delaying ACKs. Reducing the number of ACKs frees up bandwidth for other traffic. If the TCP partner delays sending the ACK slightly, then:

■ Multiple segments can be acknowledged with a single ACK.

■ The receiving application may produce response data within the delay interval, so the ACK can be carried in an outgoing data segment and no separate message needs to be generated.

In order to avoid delays when forwarding a stream of full-length segments (for example, when exchanging files), an ACK should be sent for at least every second full-length segment.

Many implementations use a 200 ms timeout. A delayed ACK does not reduce the exchange rate: when a short segment arrives, there is still enough free space in the input buffer to receive new data, so the sender can continue transmitting (and retransmission timeouts are far longer anyway). When a second full-length segment arrives, an ACK is sent immediately.

10.13.5 Retransmission timeout

After sending a segment, TCP sets a timer and waits for an ACK. If no ACK arrives within the timeout period, TCP retransmits the segment. But how long should the timeout be?

If it is too short, the sender will flood the network with unnecessary segments that duplicate data already sent. Too long a timeout prevents segments that really were lost in transit from being repaired quickly, which reduces throughput.

How should the correct timeout interval be chosen? A value suitable for a high-speed local network is unsuitable for a long-distance connection with many hops, so the principle of "one value for all conditions" is clearly inadequate. Moreover, even for a given connection, network conditions change over time, and delays may grow or shrink.

The algorithms of Jacobson, Karn, and Partridge (described in the articles Congestion Avoidance and Control by Van Jacobson, and Improving Round-Trip Time Estimates in Reliable Transport Protocols by Karn and Partridge) allow TCP to adapt to changing network conditions. They are recommended for use in new implementations, and we review them briefly below.

Common sense suggests that the best basis for estimating the correct timeout for a particular connection is to track the round-trip time: the interval between sending data and receiving the acknowledgment of its receipt.

Good timeout values can be derived from elementary statistics on round-trip times (see Fig. 10.19). Do not rely on the average alone, however, since a large share of the samples will exceed it. Taking the deviation into account as well gives better estimates that track the actual distribution and avoid excessively long retransmission delays.


Fig. 10.19. Distribution of round-trip times

There is no need for a large amount of calculations to obtain formal mathematical estimates of deviations. You can use fairly rough estimates based on the absolute value of the difference between the last value and the average estimate:

Last deviation = | Last Cycle - Average |

Another factor to consider when calculating the timeout is the change in round-trip time caused by current network conditions. What happened on the network in the last minute matters more than what happened an hour ago.

Assume that you are calculating the cycle average for a very long session. Suppose that at the beginning the network was lightly loaded, and we determined 1000 small values, but then there was an increase in traffic with a significant increase in delay time.

For example, if 1000 values ​​gave an average value of 170 units, but then 50 values ​​were measured with an average of 282, then the current average would be:

170×1000/1050 + 282×50/1050 ≈ 175

A more reasonable measure is the smoothed round-trip time (SRTT), which gives greater weight to recent values:

New SRTT = (1 – α)×(old SRTT) + α×Last Cycle Value

The value of α lies between 0 and 1. Increasing α gives the current round-trip time a greater influence on the smoothed average. Since computers can divide by powers of 2 quickly (by shifting binary numbers to the right), α is always chosen as (1/2)^n (usually 1/8), so:

New SRTT = 7/8×old SRTT + 1/8×Last cycle time

Table 10.2 shows how the SRTT formula adjusts an initial SRTT of 230 units when changing network conditions produce successively longer round-trip times (assuming no timeouts occur). The values in column 3 become the column-1 values (the old SRTT) for the next row of the table.


Table 10.2 Computing Smoothed Cycle Time

Old SRTT Latest RTT (7/8)×(old SRTT) + (1/8)×(RTT)
230.00 294 238.00
238.00 264 241.25
241.25 340 253.59
253.59 246 252.64
252.64 201 246.19
246.19 340 257.92
257.92 272 259.68
259.68 311 266.10
266.10 282 268.09
268.09 246 265.33
265.33 304 270.16
270.16 308 274.89
274.89 230 269.28
269.28 328 276.62
276.62 266 275.29
275.29 257 273.00
273.00 305 277.00
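The smoothing step used to produce Table 10.2 can be reproduced directly:

```python
def smooth(old_srtt: float, rtt_sample: float, alpha: float = 1 / 8) -> float:
    """New SRTT = (1 - alpha) * old SRTT + alpha * latest RTT sample."""
    return (1 - alpha) * old_srtt + alpha * rtt_sample

# First two rows of Table 10.2:
#   smooth(230.00, 294) gives 238.00
#   smooth(238.00, 264) gives 241.25
```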

Now the question arises of choosing the retransmission timeout. Analysis of round-trip times shows that they deviate significantly from the current average, so it makes sense to build a margin for those deviations into the timeout. Good retransmission timeout values (called Retransmission TimeOut, or RTO, in the RFC standards) are given by formulas that add a smoothed deviation estimate (SDEV). The original formula was:

T = Retransmission Timeout = SRTT + 2×SDEV

Current practice recommends the larger margin:

T = SRTT + 4×SDEV

To calculate SDEV, first determine the absolute value of the current deviation:

DEV = | Last Cycle Time - Old SRTT |

Then a smoothing formula is used to account for the last value:

New SDEV = 3/4×old SDEV + 1/4×DEV

One question remains: what initial values should be used? The recommendations are:

Initial timeout = 3 s

Initial SRTT = 0

Initial SDEV = 1.5 s

Van Jacobson has defined a fast algorithm that calculates the retransmission timeout very efficiently.
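The complete estimator (SRTT, SDEV, and the RTO with the 4×SDEV rule) can be sketched as a single update function. The sample round-trip values below are illustrative, not taken from the text:

```python
def update_rto(srtt: float, sdev: float, rtt_sample: float):
    """One measurement update: DEV = |sample - old SRTT|, then the two
    smoothing formulas from the text, then RTO = SRTT + 4 * SDEV."""
    dev = abs(rtt_sample - srtt)
    srtt = 7 / 8 * srtt + 1 / 8 * rtt_sample
    sdev = 3 / 4 * sdev + 1 / 4 * dev
    return srtt, sdev, srtt + 4 * sdev

# Starting from the recommended initial values (SRTT = 0, SDEV = 1.5 s),
# feed in measured round-trip samples one at a time:
srtt, sdev = 0.0, 1.5
for sample in (0.5, 0.6, 0.4):  # hypothetical RTT measurements, in seconds
    srtt, sdev, rto = update_rto(srtt, sdev, sample)
```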

10.13.6 Statistics example

How well does the timeout computed above work? Implementations that adopted it saw significant performance improvements. An example is the netstat statistics collected on the system tiger, an Internet server accessed by hosts from all over the world.


1510769 packets (314955304 bytes) received in-sequence

On the tiger system, fewer than 2.5% of TCP data segments were retransmitted. Of the one and a half million incoming data segments (the rest were pure ACKs), only 0.6% were duplicates. The loss rate for incoming data roughly corresponds to the rate for outgoing segments, so useless retransmission traffic amounts to about 0.6% of the total.

10.13.7 Calculations after retransmission

The above formulas use the cycle time value as the interval between sending a segment and receiving an acknowledgment of its receipt. However, suppose that no acknowledgment is received during the timeout period and the data must be resent.

Karn's algorithm says that the round-trip estimate should not be changed in this case: the current smoothed round-trip time and smoothed deviation keep their values until an acknowledgment is received for a segment that was sent without being retransmitted. From that point on, calculations resume using the stored values and new measurements.

10.13.8 Actions after retransmission

But what happens until that acknowledgment arrives? After a retransmission, TCP's behavior changes drastically, because data loss usually signals network congestion. The response to a retransmission is therefore:

■ A reduced transmission rate

■ A reduction in overall traffic to fight network congestion

10.13.9 Exponential backoff

After a retransmission, the timeout interval is doubled. But what happens if the timer expires again? The data is sent once more, and the retransmission period doubles again. This process is called exponential backoff.

If the network failure persists, the timeout period keeps doubling until it reaches a preset maximum (typically 1 minute). Only a single segment may be sent after each timeout. The connection is also dropped when the preset limit on the number of transmissions without an ACK is exceeded.
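The doubling-with-a-cap behavior is easy to tabulate. The sketch below uses a 3 s initial timeout and a 60 s cap, as suggested in the text:

```python
def backoff_intervals(initial_rto: float, cap: float = 60.0,
                      tries: int = 8) -> list[float]:
    """Successive retransmission timeouts under exponential backoff:
    each interval is double the previous one, clamped at the maximum
    (typically one minute)."""
    rto, intervals = initial_rto, []
    for _ in range(tries):
        intervals.append(rto)
        rto = min(rto * 2, cap)
    return intervals

# backoff_intervals(3.0) yields 3, 6, 12, 24, 48, then 60 repeatedly.
```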

10.13.10 Reducing congestion by reducing the data sent over the network

Reducing the amount of data sent is somewhat more involved than the mechanisms discussed above. Recovery begins like the slow start already described, but since a bound must be placed on the traffic level that originally led to trouble, the congestion window will not keep growing by a full segment per acknowledgment indefinitely. Boundary values are set so that the sending rate really is reined in. First, the danger threshold is calculated:

Boundary = 1/2 × minimum(current congestion window, partner's receive window)

If the resulting value is more than two segments, it is used as the boundary. Otherwise, the boundary is set to two segments. The full recovery algorithm is:

■ Set the congestion window size to one segment.

■ For each ACK received, increase the congestion window by one segment until the boundary is reached (just as in the slow start mechanism).

■ Thereafter, with each ACK received, add a smaller amount to the congestion window, chosen so that the window grows by roughly one segment per round-trip time (the increase is computed as MSS/N, where N is the congestion window size in segments).

An idealized scenario gives a simplified picture of how the recovery mechanism works. Assume that the peer's receive window (and the current congestion window) was 8 segments when the timeout was detected, so the boundary is 4 segments. If the receiving application instantly reads data from the buffer, the receive window stays at 8 segments.

■ 1 segment is sent (congestion window = 1 segment).

■ ACK received - 2 segments are sent.

■ ACK received for 2 segments - 4 segments are sent (boundary reached).

■ Received ACK for 4 segments. 5 segments are sent.

■ Received ACK for 5 segments. 6 segments are sent.

■ ACK received for 6 segments. 7 segments are sent.

■ ACK received for 7 segments. 8 segments are sent (the congestion window again equals the receive window).

Because acknowledgment of all the data sent fits within one retransmission timeout interval, the process continues until the congestion window reaches the receive window size. These events are shown in Fig. 10.20: the window grows exponentially (doubling) during the slow-start period, and once the boundary is reached, the growth becomes linear.


Fig. 10.20. Transmission rate limiting during congestion
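The ACK-by-ACK sequence above can be reproduced by a small model (counting segments, not bytes; an idealized loss-free recovery):

```python
def recovery_window_sizes(boundary: int, recv_window: int,
                          ack_rounds: int) -> list[int]:
    """Congestion window after a timeout: it doubles per ACK round up
    to the boundary (slow start), then grows by one segment per round
    (approximating the MSS/N increase), capped by the receive window."""
    cwnd, sizes = 1, [1]
    for _ in range(ack_rounds):
        if cwnd < boundary:
            cwnd = min(cwnd * 2, boundary)     # exponential region
        else:
            cwnd = min(cwnd + 1, recv_window)  # linear region
        sizes.append(cwnd)
    return sizes

# Boundary 4, receive window 8: 1, 2, 4, 5, 6, 7, 8, as in the scenario.
```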

10.13.11 Duplicate ACKs

Some implementations use an optional feature called fast retransmit to speed up the retransmission of data under certain conditions. Its key idea is that the receiver sends additional ACKs to indicate a gap in the received data.

When a segment arrives out of order, the receiver sends back an ACK pointing to the first byte of the missing data (see Fig. 10.21).


Fig. 10.21. Duplicate ACKs

The sender does not retransmit instantly, because IP may simply have delivered the data out of order. But when several duplicate ACKs arrive (for example, three), the missing segment is retransmitted without waiting for the timeout to expire.

Note that each duplicate ACK signals the receipt of some data segment. Several duplicate ACKs thus show that the network is still delivering data and is therefore not too heavily loaded. As part of the overall algorithm, only a modest reduction of the congestion window is performed, since traffic is genuinely still flowing; the drastic window-resizing of the full recovery procedure is not applied in this case.
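A sketch of the sender-side trigger follows. The three-duplicate threshold is the conventional value; the ACK numbers are illustrative:

```python
def fast_retransmit_needed(acks: list[int], threshold: int = 3) -> bool:
    """Count consecutive ACKs repeating the same acknowledgment number;
    `threshold` duplicates trigger retransmission of the missing
    segment before the retransmission timer expires."""
    duplicates, last_ack = 0, None
    for ack in acks:
        if ack == last_ack:
            duplicates += 1
            if duplicates >= threshold:
                return True
        else:
            last_ack, duplicates = ack, 0  # a new ACK number resets the count
    return False
```

A stream such as 1000, 2000, 2000, 2000, 2000 (the original ACK plus three duplicates) triggers retransmission, while 1000, 2000, 2000, 3000 does not.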

10.13.12 Source quench

According to the Host Requirements standard, TCP must perform the same slow start described above when it receives an ICMP source quench message. However, this mechanism is neither targeted nor efficient, because the connection that receives the message may not be the one generating the heavy traffic. The current Router Requirements specification states that routers must not send source quench messages.

10.13.13 TCP statistics

Finally, let's look at the statistics output of the netstat command to see many of the mechanisms described above in action.

In these statistics, segments are called packets.

879137 data packets (226966295 bytes)
21815 data packets (8100927 bytes) retransmitted
        Retransmissions.
132957 ack-only packets (104216 delayed)
        Note the large number of delayed ACKs.
...
        Probes of the opening of a zero-size window.
...
        These are the SYN and FIN messages.
762469 acks (for 226904227 bytes)
...
        Signals of packets arriving out of sequence.
1510769 packets (314955304 bytes) received in-sequence
9006 completely duplicate packets (867042 bytes)
        The result of a timeout when the data was actually delivered.
74 packets with some dup. data (12193 bytes duped)
        For efficiency, some data was repackaged to include extra bytes when resent.
13452 out-of-order packets (2515087 bytes)
530 packets (8551 bytes) of data after window
        Perhaps this data was included in probe messages.
402 packets received after close
        These are later retransmissions.
108 discarded for bad checksums
        Invalid TCP checksums.
0 discarded for bad header offset fields
7 discarded because packet too short
14677 connections established (including accepts)
18929 connections closed (including 643 drops)
4100 embryonic connections dropped
572187 segments updated rtt (of 587397 attempts)
        The unsuccessful attempts are cases where the ACK did not arrive before the timeout expired.
26 connections dropped by rexmit timeout
        Repeated retransmission failures, indicating lost connections.
...
        Persist timeouts: probes of a zero window.
...
        Keepalive timeouts: checks of idle connections.
472 connections dropped by keepalive

10.14 Compliance with developer requirements

The current TCP standard requires implementations to adhere strictly to the slow start procedure when initializing a connection, and to use the Karn and Jacobson algorithms for estimating the retransmission timeout and controlling congestion. Tests have shown that these mechanisms lead to significant performance improvements.

What happens when you install a system that does not adhere strictly to these standards? It will not provide adequate performance for its own users, and will be a bad neighbor for other systems on the network, preventing normal operation from being restored after a temporary overload and generating excessive traffic resulting in dropped datagrams.

10.15 Barriers to performance

TCP has proven its flexibility, operating over networks with transfer rates from hundreds of bits per second to millions of bits per second. It has given good results on modern local networks with Ethernet, Token Ring, and Fiber Distributed Data Interface (FDDI) topologies, as well as over low-speed links and long-haul connections (such as satellite links).

TCP is designed to respond to extreme conditions such as network congestion. However, the current version of the protocol has features that limit performance in emerging technologies offering bandwidths of hundreds or thousands of megabits per second. To understand the problems involved, consider a simple (though unrealistic) example.

Let's assume that when you move a file between two systems, you want to exchange a continuous stream as efficiently as possible. Let's assume that:

■ The maximum segment size is 1 KB.

■ Receiving window - 4 KB.

■ The bandwidth permits sending two segments per second.

■ The receiving application consumes data as it arrives.

■ ACK messages arrive after 2 seconds.

Under these conditions the sender can transmit continuously: just as the window's worth of data has been sent, an ACK arrives that permits sending another segment:

After 2 s:

RECEIVE ACK OF SEGMENT 1, CAN SEND SEGMENT 5.
RECEIVE ACK OF SEGMENT 2, CAN SEND SEGMENT 6.
RECEIVE ACK OF SEGMENT 3, CAN SEND SEGMENT 7.
RECEIVE ACK OF SEGMENT 4, CAN SEND SEGMENT 8.

After 2 more s:

RECEIVE ACK OF SEGMENT 5, CAN SEND SEGMENT 9.

If the receive window was only 2K, the sender would have to wait one second out of every two before sending the next data. In fact, to keep a continuous stream of data, the receiving window must be at least:

Window = Bandwidth × Round-Trip Time

Although the example is somewhat exaggerated (to provide simpler numbers), a small window can lead to problems with high latency satellite connections.

Now let's look at high-speed connections. If the bandwidth is 10 Mbit/s but the round-trip time is 100 ms (1/10 of a second), then for a continuous stream the receive window must hold at least 1,000,000 bits, i.e., 125,000 bytes. But the largest value that can be written in the TCP receive window header field is 65,535.
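The arithmetic of this example can be restated as a small helper (bits in, bytes out):

```python
def required_window_bytes(bits_per_sec: float, rtt_sec: float) -> int:
    """Receive window (in bytes) needed for a continuous stream:
    bandwidth times round-trip time, converted from bits to bytes."""
    return int(bits_per_sec * rtt_sec / 8)

TCP_MAX_WINDOW = 65_535  # largest value the 16-bit window field can carry

# 10 Mbit/s with a 100 ms round trip already overflows the field:
needed = required_window_bytes(10_000_000, 0.1)
```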

Another problem arises at high transfer rates: sequence numbers are exhausted very quickly. If a connection could move data at 4 GB/s, the sequence numbers would wrap around within a second, and there would be no way to distinguish old duplicate datagrams delayed in the Internet by more than a second from fresh new data.

New research is being actively conducted to improve TCP/IP and remove the obstacles mentioned above.

10.16 TCP functions

This chapter covers the many features of TCP. The main ones are listed below:

■ Associating ports with connections

■ Initializing connections through a three-way handshake

■ Performing a slow start to avoid network congestion

■ Data segmentation in transit

■ Data numbering

■ Handling incoming duplicate segments

■ Checksum calculation

■ Regulation of data flow through the receiving window and sending window

■ Closing the connection gracefully

■ Terminating the connection abruptly

■ Forwarding urgent data

■ Positive acknowledgment with retransmission

■ Retransmission Timeout Calculation

■ Reducing traffic during network congestion

■ Signaling out-of-order segments

■ Probing a closed receive window

10.17 TCP states

A TCP connection goes through several stages: the connection is established through an exchange of messages, data is sent, and then the connection is closed through an exchange of special messages. Each stage in the life of a connection corresponds to a particular state of that connection. The TCP software at each end of the connection constantly tracks the current state of the other side.

Below we briefly consider the typical state changes of a server and a client at opposite ends of a connection. We do not aim to give an exhaustive description of every possible state during data transfer; that is provided in RFC 793 and in the Host Requirements document.

During the establishment of connections, the server and client go through similar sequences of states. The server states are shown in Table 10.3 and the client states are shown in Table 10.4.


Table 10.3 Server State Sequence

Server State Event Description
CLOSED The dummy state before connection setup begins.
Passive open by the server application.
LISTEN The server is waiting for a connection from a client.
The TCP server receives a SYN and sends a SYN/ACK.
SYN-RECEIVED The server has received a SYN and sent a SYN/ACK; it is waiting for an ACK.
The TCP server receives the ACK.
ESTABLISHED The ACK has been received; the connection is open.

Table 10.4 Client State Sequence

Client State Event Description
CLOSED The dummy state before connection setup begins.
The client application requests a connection; TCP sends a SYN.
SYN-SENT The client is waiting for a response to its SYN.
The TCP client receives the SYN/ACK and sends an ACK.
ESTABLISHED The handshake is complete; the connection is open.

If the peers were trying to establish a connection with each other at the same time (which is extremely rare), each would go through the CLOSED, SYN-SENT, SYN-RECEIVED, and ESTABLISHED states.

Both ends of the connection remain in the ESTABLISHED state until one side initiates closing the connection by sending a FIN segment. During a normal close, the side that initiates it goes through the states shown in Table 10.5; its partner goes through the states shown in Table 10.6.


Table 10.5 State sequence of the side that closes the connection

Closing-side state   Event / Description
ESTABLISHED    The local application requests that the connection be closed.
                 Event: TCP sends a FIN/ACK.
FIN-WAIT-1     The closing side waits for the partner's response. Recall that new data may still arrive from the partner.
                 Event: TCP receives an ACK.
FIN-WAIT-2     An ACK has arrived from the partner, but not yet a FIN. The closing side waits for the FIN while continuing to accept incoming data.
                 Event: TCP receives a FIN/ACK and sends an ACK.
TIME-WAIT      The connection is held in an indeterminate state so that duplicate data or a duplicate FIN still in the network can arrive and be discarded. The wait period is twice the estimated maximum segment lifetime (2MSL).
                 Event: the 2MSL timer expires.
CLOSED         The connection information is removed.

Table 10.6 State sequence of the partner in closing the connection

Partner state   Event / Description
ESTABLISHED    Event: TCP receives a FIN/ACK.
CLOSE-WAIT     A FIN has arrived; TCP sends an ACK and waits for its application to close the connection. Meanwhile the application may still send a substantial amount of data.
                 Event: the local application initiates the close; TCP sends a FIN/ACK.
LAST-ACK       TCP is waiting for the final ACK.
                 Event: TCP receives the ACK.
CLOSED         All connection information has been removed.
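The normal sequences in Tables 10.3 through 10.6 can be condensed into a toy transition table. The sketch below is illustrative only: the event names are our own shorthand, and it covers just the paths shown in the tables, not the complete RFC 793 state machine.

```csharp
// Toy model of the normal TCP open and active-close sequences
// (Tables 10.3 and 10.5). Illustrative sketch, not the full RFC 793 machine.
using System;
using System.Collections.Generic;

class TcpStateSketch
{
    public static readonly Dictionary<(string State, string Event), string> Transitions =
        new Dictionary<(string State, string Event), string>
        {
            // Server open (Table 10.3)
            { ("CLOSED",       "passive open"),          "LISTEN" },
            { ("LISTEN",       "rcv SYN, snd SYN/ACK"),  "SYN-RECEIVED" },
            { ("SYN-RECEIVED", "rcv ACK"),               "ESTABLISHED" },
            // Active close (Table 10.5)
            { ("ESTABLISHED",  "app close, snd FIN"),    "FIN-WAIT-1" },
            { ("FIN-WAIT-1",   "rcv ACK"),               "FIN-WAIT-2" },
            { ("FIN-WAIT-2",   "rcv FIN, snd ACK"),      "TIME-WAIT" },
            { ("TIME-WAIT",    "2MSL timeout"),          "CLOSED" },
        };

    static void Main()
    {
        string state = "CLOSED";
        string[] events = { "passive open", "rcv SYN, snd SYN/ACK", "rcv ACK" };
        foreach (string ev in events)
        {
            state = Transitions[(state, ev)];
            Console.WriteLine("{0,-22} -> {1}", ev, state);
        }
    }
}
```

Running the server-open sequence prints the LISTEN, SYN-RECEIVED, and ESTABLISHED states in order; feeding the close events would walk the connection back to CLOSED.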

10.17.1 Analyzing TCP connection states

The netstat -an command lets you check the current state of connections. The listing below shows connections in the SYN_RCVD, ESTABLISHED, FIN_WAIT_1, CLOSE_WAIT, and TIME_WAIT states.

Note that the connection port number appears at the end of each local and foreign address. You can see that some connections have data in both the receive (Recv-Q) and send (Send-Q) queues.

Proto Recv-Q Send-Q Local Address        Foreign Address       (state)
tcp   0      0      128.121.50.145.25    128.252.223.5.1526    SYN_RCVD
tcp   0      0      128.121.50.145.25    148.79.160.65.3368    ESTABLISHED
tcp   0      0      127.0.0.1.1339       127.0.0.1.111         TIME_WAIT
tcp   0      438    128.121.50.145.23    130.132.57.246.2219   ESTABLISHED
tcp   0      0      128.121.50.145.25    192.5.5.1.4022        TIME_WAIT
tcp   0      0      128.121.50.145.25    141.218.1.100.3968    TIME_WAIT
tcp   0      848    128.121.50.145.23    192.67.236.10.1050    ESTABLISHED
tcp   0      0      128.121.50.145.1082  128.121.50.141.6000   ESTABLISHED
tcp   0      0      128.121.50.145.1022  128.121.50.141.1017   ESTABLISHED
tcp   0      0      128.121.50.145.514   128.121.50.141.1020   CLOSE_WAIT
tcp   0      1152   128.121.50.145.119   192.67.239.23.3572    ESTABLISHED
tcp   0      0      128.121.50.145.1070  192.41.171.5.119      TIME_WAIT
tcp   579    4096   128.121.50.145.119   204.143.19.30.1884    ESTABLISHED
tcp   0      0      128.121.50.145.119   192.67.243.13.3704    ESTABLISHED
tcp   0      53     128.121.50.145.119   192.67.236.218.2018   FIN_WAIT_1
tcp   0      0      128.121.50.145.119   192.67.239.14.1545    ESTABLISHED

10.18 Implementation notes

From the very beginning, the TCP protocol has been designed for interoperability of network equipment from different manufacturers. The TCP specification does not specify exactly how the implementation's internal structures should work. These questions are left to developers, who are called upon to find the best mechanisms for each particular implementation.

Even RFC 1122 (the Host Requirements document) leaves plenty of room for variation. Each implemented function is marked with a compliance level:

■ MUST (required)

■ MUST NOT (prohibited)

■ SHOULD (recommended)

■ SHOULD NOT (not recommended)

■ MAY (allowed)

Unfortunately, sometimes there are products that do not implement the MUST requirements. As a result, users experience the inconvenience of reduced performance.

Some good implementation practices are not covered by the standards. For example, security can be improved by restricting the use of well-known ports to privileged processes, if the local operating system supports this. To improve performance, implementations should copy and move transmitted or received data as little as possible.

The standard leaves the application programming interface undefined (as it does the security policy), so there is free rein to experiment with different toolkits. However, this can result in a different programming interface on each platform and hinder porting application software between platforms.

In practice, developers base their toolkits on the Socket API borrowed from Berkeley. The importance of the programming interface grew with the advent of WINSock (Windows Sockets), which led to a proliferation of new desktop applications that could run on top of any TCP/IP stack supporting the WINSock interface.

10.19 Further reading

The original TCP standard is defined in RFC 793. Updates, fixes, and compliance requirements are covered in RFC 1122. Karn and Partridge published the article "Improving Round-Trip Time Estimates in Reliable Transport Protocols" in Proceedings of ACM SIGCOMM 1987. Jacobson's paper "Congestion Avoidance and Control" appeared in Proceedings of the ACM SIGCOMM 1988 Workshop. Jacobson has also published several RFCs revising algorithms for improving performance.

Client-server application on a TCP stream socket

The following example uses TCP to provide ordered, reliable two-way byte streams. Let's build a complete application that includes a client and a server. First, we demonstrate how to construct a server on TCP stream sockets, and then a client application to test our server.

The following program creates a server that receives connection requests from clients. The server is built synchronously, so execution of its thread blocks until the server accepts a connection from a client. The application demonstrates a simple server that replies to the client. The client ends the session by sending a termination message to the server.

TCP server

The creation of the server structure is shown in the following functional diagram:

Here is the complete code for the SocketServer.cs program:

// SocketServer.cs
using System;
using System.Text;
using System.Net;
using System.Net.Sockets;

namespace SocketServer
{
    class Program
    {
        static void Main(string[] args)
        {
            // Set the local endpoint for the socket
            IPHostEntry ipHost = Dns.GetHostEntry("localhost");
            IPAddress ipAddr = ipHost.AddressList[0];
            IPEndPoint ipEndPoint = new IPEndPoint(ipAddr, 11000);

            // Create a TCP/IP socket
            Socket sListener = new Socket(ipAddr.AddressFamily,
                SocketType.Stream, ProtocolType.Tcp);

            // Bind the socket to the local endpoint and listen for incoming connections
            try
            {
                sListener.Bind(ipEndPoint);
                sListener.Listen(10);

                while (true)
                {
                    Console.WriteLine("Waiting for a connection on port {0}", ipEndPoint);

                    // The program pauses here, waiting for an incoming connection
                    Socket handler = sListener.Accept();

                    // A client has connected; receive its message
                    byte[] bytes = new byte[1024];
                    int bytesRec = handler.Receive(bytes);
                    string data = Encoding.UTF8.GetString(bytes, 0, bytesRec);

                    // Show the data on the console
                    Console.Write("Received text: " + data + "\n\n");

                    // Send a response to the client
                    string reply = "Thanks for the request in "
                        + data.Length.ToString() + " characters";
                    byte[] msg = Encoding.UTF8.GetBytes(reply);
                    handler.Send(msg);

                    // "<TheEnd>" is a placeholder: the termination marker was
                    // garbled in the original listing
                    bool finished = data.IndexOf("<TheEnd>") > -1;

                    handler.Shutdown(SocketShutdown.Both);
                    handler.Close();

                    if (finished)
                    {
                        Console.WriteLine("Server ended connection with client.");
                        break;
                    }
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.ToString());
            }
            finally
            {
                Console.ReadLine();
            }
        }
    }
}

Let's look at the structure of this program.

The first step is to set the local endpoint for the socket. Before a socket can be opened to listen for connections, a local endpoint address must be prepared for it. The unique address of a TCP/IP service is the combination of the host's IP address with the port number of the service; together they form the service endpoint.

The Dns class provides methods that return information about the network addresses supported by the device on the local network. If a LAN device has more than one network address, the Dns class returns information about all network addresses, and the application must select the appropriate address to serve from the array.

Create an IPEndPoint for the server by combining the first IP address of the host, obtained from the Dns.GetHostEntry() method, with the port number:

IPHostEntry ipHost = Dns.GetHostEntry("localhost");
IPAddress ipAddr = ipHost.AddressList[0];
IPEndPoint ipEndPoint = new IPEndPoint(ipAddr, 11000);

Here, the IPEndPoint class represents localhost on port 11000. Next, we create a stream socket with a new instance of the Socket class. By setting up a local endpoint to listen for connections, you can create a socket:

Socket sListener = new Socket(ipAddr.AddressFamily, SocketType.Stream, ProtocolType.Tcp);

The AddressFamily enumeration specifies the addressing schemes that an instance of the Socket class can use to resolve an address.

The SocketType parameter distinguishes TCP sockets from UDP sockets. It can take the following values:

Dgram

Supports datagrams. The Dgram value requires Udp for the protocol type and InterNetwork for the address family.

Raw

Supports access to the underlying transport protocol.

Stream

Supports stream sockets. The Stream value requires Tcp to be specified for the protocol type.

The third and final parameter specifies the protocol type required for the socket. The most important values of the ProtocolType parameter are Tcp, Udp, Ip, and Raw.

The next step is to assign a name to the socket with the Bind() method. When a socket is opened by the constructor, it has no name; only a handle is reserved. The Bind() method is called to assign a name to the server socket. For a client socket to be able to identify a TCP stream socket, the server program must name its socket:

sListener.Bind(ipEndPoint);

The Bind() method binds a socket to a local endpoint. You must call the Bind() method before any attempts to call the Listen() and Accept() methods.

Now, having created the socket and associated a name with it, you can listen for incoming messages using the Listen() method. In the listening state, the socket waits for incoming connection attempts:

sListener.Listen(10);

The parameter specifies the backlog: the maximum number of connections waiting to be processed in the queue. In the code above, the parameter allows up to ten connections to accumulate in the queue.

In the listening state, you must be ready to accept a connection from a client, which is done with the Accept() method. This method accepts a client connection and completes the binding of the client and server names. The Accept() method blocks the calling thread until a connection arrives.

The Accept() method retrieves the first connection request from the queue of pending requests and creates a new socket to handle it. While the new socket is created, the original socket continues to listen and can be used with multithreading to accept multiple connection requests from clients. No server application should close the listening socket. It must continue to function along with the sockets created by the Accept method to process incoming client requests.
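Because the listening socket keeps working alongside the sockets returned by Accept(), each accepted connection can be served on its own thread. The sketch below is our own restructuring of the listing, not code from the original tutorial; it binds explicitly to 127.0.0.1:11000, and the buffer size is an assumption.

```csharp
// Sketch: serve each accepted client on its own thread so the listening
// socket can return to Accept() immediately. Illustrative only.
using System;
using System.Net;
using System.Net.Sockets;
using System.Text;
using System.Threading;

class ThreadedServerSketch
{
    public static void Main()
    {
        IPEndPoint ipEndPoint = new IPEndPoint(IPAddress.Loopback, 11000);
        Socket sListener = new Socket(ipEndPoint.AddressFamily,
            SocketType.Stream, ProtocolType.Tcp);
        sListener.Bind(ipEndPoint);
        sListener.Listen(10);

        while (true)
        {
            Socket handler = sListener.Accept();       // blocks until a client connects
            new Thread(() => Serve(handler)).Start();  // listener keeps accepting
        }
    }

    static void Serve(Socket handler)
    {
        try
        {
            byte[] bytes = new byte[1024];
            int bytesRec = handler.Receive(bytes);
            string data = Encoding.UTF8.GetString(bytes, 0, bytesRec);
            byte[] msg = Encoding.UTF8.GetBytes(
                "Thanks for the request in " + data.Length + " characters");
            handler.Send(msg);
        }
        finally
        {
            handler.Shutdown(SocketShutdown.Both);
            handler.Close();
        }
    }
}
```

Spawning a thread per client is the simplest model; a production server would more likely use asynchronous Accept/Receive or a thread pool, but the blocking calls above match the style of the listing.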

while (true)
{
    Console.WriteLine("Waiting for a connection on port {0}", ipEndPoint);
    // The program pauses, waiting for an incoming connection
    Socket handler = sListener.Accept();

Once the client and server have established a connection between themselves, messages can be sent and received using the Send() and Receive() methods of the Socket class.

The Send() method writes outgoing data to the connected socket. The Receive() method reads incoming data from the stream socket. On a TCP-based system, a connection must be established between the sockets before Send() and Receive() are called. The exact protocol between the two communicating parties must be agreed in advance, so that the client and server applications do not block each other, each waiting for the other to send first.

When the data exchange between the server and the client is complete, close the connection using the Shutdown() and Close() methods:

handler.Shutdown(SocketShutdown.Both);
handler.Close();

SocketShutdown is an enumeration with three values: Both stops both sending and receiving on the socket, Receive stops receiving data, and Send stops sending data.

The socket is closed when the Close() method is called, which also sets the socket's Connected property to false.

Client on TCP

The functions used to build a client application are much the same as for the server. As with the server, the same methods are used to set up the endpoint, instantiate the socket, send and receive data, and close the socket.
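The client listing itself does not appear in this excerpt, so here is a minimal SocketClient.cs sketch assembled from the same methods the text names (endpoint setup, Connect(), Send(), Receive(), Shutdown(), Close()). The message text and buffer size are our assumptions; the port 11000 matches the server listing.

```csharp
// SocketClient.cs -- minimal client sketch for the server above.
// The message text is an assumption; port 11000 matches the server listing.
using System;
using System.Net;
using System.Net.Sockets;
using System.Text;

class SocketClient
{
    public static void Main()
    {
        // Prepare the same endpoint the server bound to
        IPHostEntry ipHost = Dns.GetHostEntry("localhost");
        IPAddress ipAddr = ipHost.AddressList[0];
        IPEndPoint ipEndPoint = new IPEndPoint(ipAddr, 11000);

        Socket sender = new Socket(ipAddr.AddressFamily,
            SocketType.Stream, ProtocolType.Tcp);

        // Connect() performs the TCP handshake with the listening server
        sender.Connect(ipEndPoint);

        byte[] msg = Encoding.UTF8.GetBytes("Hello from the client");
        sender.Send(msg);

        // Read the server's reply
        byte[] bytes = new byte[1024];
        int bytesRec = sender.Receive(bytes);
        Console.WriteLine("Server reply: {0}",
            Encoding.UTF8.GetString(bytes, 0, bytesRec));

        sender.Shutdown(SocketShutdown.Both);
        sender.Close();
    }
}
```

Note the symmetry with the server: the client calls Connect() where the server calls Bind(), Listen(), and Accept(), but Send(), Receive(), Shutdown(), and Close() are used identically on both sides.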

Servers that implement these protocols on a corporate network provide the client with an IP address, gateway, netmask, name servers, and even a printer. Users do not have to configure their hosts manually in order to use the network.

The QNX Neutrino operating system implements another autoconfiguration protocol, AutoIP, which is a project of the IETF autoconfiguration effort. The protocol is used in small networks to assign link-local IP addresses to hosts. AutoIP determines a link-local IP address on its own, using a negotiation scheme with other hosts and without consulting a central server.

Using the PPPoE protocol

The abbreviation PPPoE stands for "Point-to-Point Protocol over Ethernet". This protocol encapsulates data for transmission over an Ethernet network with a bridged topology.

PPPoE is a specification for connecting users on an Ethernet network to the Internet over a broadband connection, such as a leased digital subscriber line, a wireless device, or a cable modem. Using the PPPoE protocol and a broadband modem gives the users of a local area network individual, authenticated access to high-speed data networks.

The PPPoE protocol combines Ethernet technology with the PPP protocol, which allows you to effectively create a separate connection to a remote server for each user. Access control, connection accounting, and service provider selection are defined for users, not for hosts. The advantage of this approach is that neither the telephone company nor the ISP has to provide any special support for this.

Unlike dial-up connections, DSL and cable modem connections are always active. Because the physical connection to the remote service provider is shared by multiple users, an accounting method is needed that records traffic senders and destinations and bills users accordingly. The PPPoE protocol lets a user and a remote host participating in a communication session learn each other's network addresses during an initial exchange called discovery. Once a session between an individual user and the remote host (for example, an Internet service provider) is established, the session can be monitored for billing purposes. Many homes, hotels, and corporations share Internet access over digital subscriber lines using Ethernet technology and the PPPoE protocol.

A PPPoE connection consists of a client and a server, which operate over any interface close to the Ethernet specification. The interface is used to issue IP addresses to clients, binding those IP addresses to users and, optionally, to workstations, instead of authenticating by workstation alone. The PPPoE server creates a point-to-point connection for each client.

Setting up a PPPoE session

In order to create a PPPoE session, use the pppoed service. The io-pkt-* module provides PPPoE protocol services. First, start io-pkt-* with a suitable driver. Example: