Friday, February 11, 2011

How does TCP/IP report errors?

How does TCP/IP report errors when packet delivery fails permanently? All Socket.write() APIs I've seen simply pass bytes to the underlying TCP/IP output buffer and transfer the data asynchronously. How then is TCP/IP supposed to notify the developer if packet delivery fails permanently (i.e. the destination host is no longer reachable)?

Any protocol that requires the sender to wait for confirmation from the remote end will get an error message. But what happens for protocols where a sender doesn't have to read any bytes from the destination? Does TCP/IP just fail silently? Perhaps Socket.close() will return an error? Does the TCP/IP specification say anything about this?

  • The sockets API has no way of informing the writer exactly how many bytes have been received as acknowledged by the peer. There are no guarantees made by the presence of a successful shutdown or close either.

    The TCP/IP specification says nothing about the application interface (which is nearly always the sockets API).

    SCTP is an alternative to TCP which attempts to address these shortcomings, among others.

    Gili : SCTP looks very cool! I wonder how one would go about migrating the web from TCP/IP to SCTP though. Sounds like a task with overwhelming complexity.
    From ephemient
  • In C, if you write to a socket that has failed with send(), you will get back the number of bytes that were sent. If this does not match the number of bytes you meant to send, then you have a problem. But also, when you write to a failed socket, you get SIGPIPE back. Before you start socket handling, you need to have a signal handler in place that will alert you when you get SIGPIPE.

    If you are reading from a socket, you really should wrap it with an alarm so you can timeout. Like "alarm(timeout_val); recv(); alarm(0)". Check the return code of recv, and if it's 0, that indicates that the connection has been closed. A negative return result indicates a read failure and you need to check errno.

    Gili : Is send() a blocking function? Does it really wait for acknowledgment before returning the number of bytes sent? Or is it simply dumping the bytes into the output buffer and returning immediately?
    From codebunny
  • TCP/IP is a reliable byte stream protocol. All your bytes will get to the receiver or you'll get an error indication.

    The error indication will come in the form of a closed socket. Regardless of what the communication pattern (who does the sending), if the bytes can't be delivered, the socket will close.

    So the question is, how do you see the socket close? If you're never reading, you'd eventually get an error trying to write to the closed socket (with ECONNRESET errno, I think).

    If you have a need to sleep or wait for input on another file handle, you might want to do your waiting in a select() call where you include the socket in the list of sources you're waiting on (even if you never expect to receive anything). If the select() indicates that the socket is ready for a read call, you may get a -1 return (with ECONNRESET, I think). An EOF would indicate an orderly close (other side did a shutdown() or close().

    How to distinguish this error close from a clean close (other program exiting, for example)? The errno values may be enough to distinguish error from orderly close.

    If you want an unambiguous indication of a problem, you'll probably need to build some sort of application level protocol above the socket layer. For example, a short "ack" message sent by the receiver back to the sender. Then the violation of that higher level application protocol (sender didn't see an ack) would be a confirmation that it was an error close vs a clean close.

    From John M
  • TCP is built upon the IP protocol, which is the centerpiece for the Internet, providing much of the interoperability that drives Routing, which is what determines how to get packets from their source to their destination. The IP protocol specifies that error messages should be sent back to the sender via Internet Control Message Protocol(ICMP) in the case of a packet failing to get to the sender. Some of these reasons include the Time To Live(TTL) field being decremented to zero, often meaning that the packet got stuck in a routing loop, or the packet getting dropped due to switch contention causing buffer overruns. As others have said, it is the responsibility of the Socket API that is being used to relay these errors at the IP layer up to the application interacting with the network at the TCP layer.

    From cdeszaq

0 comments:

Post a Comment