From 54a4cb25f54ea65440c1bb5d07d30acac684f73c Mon Sep 17 00:00:00 2001 From: Andreas Baumann Date: Sat, 4 Apr 2009 17:28:09 +0200 Subject: added more documentation about asynchronous connects --- docs/network/README | 2 + docs/network/connect-intr.html | 371 +++++++++++++++++++++++++++++++++++++++++ 2 files changed, 373 insertions(+) create mode 100644 docs/network/connect-intr.html (limited to 'docs') diff --git a/docs/network/README b/docs/network/README index f7fc47b..d2829e5 100644 --- a/docs/network/README +++ b/docs/network/README @@ -30,3 +30,5 @@ Links: and Proactor and a hybrid the emulated Proactor - http://www.developerweb.net/forum/forumdisplay.php?s=cb7c1122ba4551d2fa866b1d6cf2b97f&f=70: UNIX Socket FAQ: very good resource for all kind of detail problems +- http://www.madore.org/~david/computers/connect-intr.html: on the + behaviour of asynchronous connects on different Unixes diff --git a/docs/network/connect-intr.html b/docs/network/connect-intr.html new file mode 100644 index 0000000..c0ad3f7 --- /dev/null +++ b/docs/network/connect-intr.html @@ -0,0 +1,371 @@ + + + + + + + + + +Unix connect() and interrupted system calls + + + + + + + + + + + + + + + + + + + + + +

Unix `connect()` and interrupted system calls

+ + + + +

Summary: This page makes a fine (and admittedly +minor) point about the behavior of the Unix (socket API) +connect() system call when it is interrupted by a signal. +It points out how obscure the Single Unix Specification is, and notes +that many existing Unix implementations seem to have f*cked this up +somehow.

+ +

The question is this: if a blocking connect() (on a +blocking stream socket, that is) is interrupted by a signal, returning +EINTR, in what state is the socket left, and is it +permissible to restart the system call? What happens if a second +connect() with the same arguments is attempted +immediately after one failed with EINTR?

+ +

The reference +for connect() (hereafter, “the Spec”) is +part of the Open Group's Single +Unix Specification, version 3 (note: you may +need to register to read this; see also here). +Here is the relevant part of it:

+ +

+ +
If the initiating socket is connection-mode, then +connect() shall attempt to establish a connection to the +address specified by the address argument. If the +connection cannot be established immediately and +O_NONBLOCK is not set for the file descriptor for the +socket, connect() shall block for up to an unspecified +timeout interval until the connection is established. If the timeout +interval expires before the connection is established, +connect() shall fail and the connection attempt shall be +aborted. If connect() is interrupted by a signal that is +caught while blocked waiting to establish a connection, +connect() shall fail and set connect() to +[EINTR], but the connection request shall not be aborted, +and the connection shall be established asynchronously.
+ +
If the connection cannot be established immediately and +O_NONBLOCK is set for the file descriptor for the socket, +connect() shall fail and set errno to +[EINPROGRESS], but the connection request shall not be +aborted, and the connection shall be established +asynchronously. Subsequent calls to connect() for the +same socket, before the connection is established, shall fail and set +errno to [EALREADY].
+ +
When the connection has been established asynchronously, +select() and poll() shall indicate that the +file descriptor for the socket is ready for writing.
+ +

+ +

Later on, when listing possible error codes for +connect(), the Spec mentions:

+ +

+ +
The connect() function shall fail if:
+ +
+ +
[…]
+ +
[EALREADY]
+ +
A connection request is already in progress for the specified +socket.
+ +
+ +

+ +

How does this answer the question above? That is, what is supposed +to happen, according to the Spec, when a second connect() +is attempted (with the same arguments) just after one which returned +EINTR (on a blocking stream socket)?

+ +

To me it seems that the Spec is contradictory. On the one hand, it +is stated that If the connection cannot be established immediately +and O_NONBLOCK is not set for the file descriptor for the +socket, connect() shall block for up to an unspecified +timeout interval until the connection is established.—now +that sort of implies, since no provision is made for exceptions, that +the second call to connect() should continue the +connection attempted by the first (and which has not been aborted but +run asynchronously), and block until the connection is established (or +an error is returned). Let us call this the “Liberal +Behavior” in what follows (I will later explain that this is how +Linux behaves). On the other hand, the EALREADY error is +documented as being returned whenever a connection request is +underway. So the Spec seems to say that the second call to +connect() should fail with the EALREADY +error code. Let us call this the “Unforgiving Behavior” +in what follows (I will later explain that this is how Solaris +behaves). One thing is certain: the Spec is highly unclear about this +point.

+ +

I have asked several people's opinion as to how they read the Spec, +and most seem to favor the “Unforgiving Behavior”: they +say the Spec requires the second connect() call to fail +with EALREADY. So admittedly it is how the Spec should +be read; whether this is desirable behavior, on the other +hand, is dubious (see below).

+ +

In Unix Network Programming, volume 1, +section 5.9, W. Richard Stevens states:

+ +

+ +
What we are doing […] is restarting the interrupted system +call ourself. This is fine for accept, along with the +functions such as read, write, +select and open. But there is one function +that we cannot restart ourself: connect. If this +function returns EINTR, we cannot call it again, as doing +so will return an immediate error. When connect is +interrupted by a caught signal and is not automatically restarted, we +must call select to wait for the connection to complete, +as we describe in section 15.3.
+ +

+ +

This, indeed, clearly describes the “Unforgiving +Behavior” (connect() failing immediately when +restarted); note that Stevens does not say which error code is +produced, and indeed Solaris returns EALREADY but BSD +returns EADDRINUSE (a highly illogical error code in my +opinion).

+ +

The “Liberal Behavior” consists of making +connect() like every other system call: it can be +restarted with the same arguments, without thinking, so long as it +returns EINTR (only one minor difference remains: +EISCONN needs to be checked, to avoid a race condition +between two calls to connect()). This seems to be what +Linux does (whether this is against the Spec, is, as I say, a matter +of interpretation, though most people seem to think that indeed it +is). There is much to be said in favor of this “Liberal +Behavior”: basically, the whole point of using blocking +sockets is for system calls to block rather than stupidly +returning a temporary error code (be it +EWOULDBLOCK/EAGAIN, EINPROGRESS +or EALREADY) and forcing us to use select() +or poll() to know when that temporary error will be gone; +of what use are blocking sockets at all, if we are forced to use +select() or poll() anyway? This is my main +reason for preferring the “Liberal Behavior”.

+ +

Besides, the “Liberal Behavior” makes things +much easier to program. In clear, I can write the +following:

+ +

+/* Start with fd just returned by socket(), blocking, SOCK_STREAM... */
+while ( connect (fd, &name, namelen) == -1 && errno != EISCONN )
+  if ( errno != EINTR )
+    {
+      perror ("connect");
+      exit (EXIT_FAILURE);
+    }
+/* At this point, fd is connected. */
+

+ +

—instead of having to write all this:

+ +

+/* Start with fd just returned by socket(), blocking, SOCK_STREAM... */
+if ( connect (fd, &name, namelen) == -1 )
+  {
+    struct pollfd unix_really_sucks;
+    int some_more_junk;
+    socklen_t yet_more_useless_junk;
+
+    if ( errno != EINTR /* && errno != EINPROGRESS */ )
+      {
+        perror ("connect");
+        exit (EXIT_FAILURE);
+      }
+    unix_really_sucks.fd = fd;
+    unix_really_sucks.events = POLLOUT;
+    while ( poll (&unix_really_sucks, 1, -1) == -1 )
+      if ( errno != EINTR )
+        {
+          perror ("poll");
+          exit (EXIT_FAILURE);
+        }
+    yet_more_useless_junk = sizeof(some_more_junk);
+    if ( getsockopt (fd, SOL_SOCKET, SO_ERROR,
+                     &some_more_junk,
+                     &yet_more_useless_junk) == -1 )
+      {
+        perror ("getsockopt");
+        exit (EXIT_FAILURE);
+      }
+    if ( some_more_junk != 0 )
+      {
+        fprintf (stderr, "connect: %s\n",
+                 strerror (some_more_junk));
+        exit (EXIT_FAILURE);
+      }
+  }
+/* At this point, fd is connected. */
+

+ +

—which anyone will admit is longer (over five times longer, +as a matter of fact) and more tedious to write. Hence my calling this +behavior the “Liberal Behavior” because it does not force +the programmer to go through all this pain just to connect a +socket.

+ +

Unfortunately, my opinion has not been consulted in defining Unix +implementations, nor in writing the Single Unix specification, so it +seems that the “Liberal Behavior” is not highly thought +of, except under Linux, and anyone who wants to write a Unix program +(except if it is to run solely on the Linux kernel) that performs the +trivial act of opening a socket has to go through all the mess I have +just written (and which, in case it is of any use, I put in the Public +Domain). Why not use SA_RESTART on all signal +handlers, some will ask? Well, the problem is that one is never quite +sure that all signals have been treated, and just one signal +is enough to interrupt system calls. Most people agree that +SA_RESTART does not dispense you from testing +EINTR everywhere, if you want to be safe. Anyway, it is +not the purpose of this page to descuss this point.

+ +

Annoyingly, not only Unix implementations vary in this, but also +the documentation is either imprecise or positively wrong.

+ +

Linux adopts the pleasant “Liberal Behavior” I have +described. The man page for connect(2) under Linux does +not document EINTR, however, nor does the info page of +the GNU libc, whereas the code can be shown to occur: +connect() can be interrupted by a signal under +Linux (fortunately!).

+ +

Solaris adopts the “Unforgiving Behavior” that seems to +be the literal interpretation of the Spec. However, the Solaris man +page for connect(3SOCKET) states that the +EALREADY error code occurs when The socket is +non-blocking and a previous connection attempt has not yet been +completed. But I have cases when the EALREADY error +code was returned for a blocking socket (this is the very point I'm +arguing about).

+ +

Both FreeBSD and OpenBSD adopt the “Unforgiving +Behavior” with the following departure from the Spec that the +error code returned on the second connect() call is +EADDRINUSE rather than EALREADY. This is +probably tradition, but it seems rather absurd. There is, however, +one difference between FreeBSD and OpenBSD, but it concerns +non-blocking sockets: OpenBSD returns EALREADY +for non-blocking sockets as the Spec prescribes, whereas FreeBSD never +seems to return EALREADY at all. However, this page is +essentially about blocking sockets. Also, FreeBSD documents +EALREADY (strange, since it does not return it), but not +EINTR (which it does return); the OpenBSD man page is +essentially correct. Note that in both cases the return code of +EADDRINUSE is not clearly documented.

+ +

Information about the behavior of other Unixen, or other +implementations of the Unix socket API, in this regard, is +welcome.

+ +

See also this +thread on the comp.unix.programmer newsgroup (thanks +to Google Groups for making +this available), where the issue was discussed (note that I mentioned +EISCONN in error whereas I meant +EADDRINUSE).

+ +

This +trivial program (Public Domain) will run tests on a given Unix +implementation, to determine what the actual behavior happens to be, +and display the result in a perfectly obscure and incomprehensible +form. See the comments at the beginning of the file for information +on use.

+ + +

+ + + +

+David Madore +

+ +

Last modified: $Date: 2003/04/25 04:22:23 $

+ + -- cgit v1.2.3-54-g00ecf