author Andreas Baumann <abaumann@yahoo.com> 2009-04-04 17:28:09 +0200
committer Andreas Baumann <abaumann@yahoo.com> 2009-04-04 17:28:09 +0200
commit 54a4cb25f54ea65440c1bb5d07d30acac684f73c (patch)
tree 95589d2ab1e1a143679c6ab21f6768260fd33e18 /docs
parent b1523384ad20f62ee94418e1ed6b83a4090c74e7 (diff)
added more documentation about asynchronous connects
Diffstat (limited to 'docs')
-rw-r--r-- docs/network/README 2
-rw-r--r-- docs/network/connect-intr.html 371
2 files changed, 373 insertions, 0 deletions
diff --git a/docs/network/README b/docs/network/README
index f7fc47b..d2829e5 100644
--- a/docs/network/README
+++ b/docs/network/README
@@ -30,3 +30,5 @@ Links:
and Proactor and a hybrid the emulated Proactor
- http://www.developerweb.net/forum/forumdisplay.php?s=cb7c1122ba4551d2fa866b1d6cf2b97f&f=70:
UNIX Socket FAQ: very good resource for all kind of detail problems
+- http://www.madore.org/~david/computers/connect-intr.html: on the
+ behaviour of asynchronous connects on different Unixes
diff --git a/docs/network/connect-intr.html b/docs/network/connect-intr.html
new file mode 100644
index 0000000..c0ad3f7
--- /dev/null
+++ b/docs/network/connect-intr.html
@@ -0,0 +1,371 @@
+<?xml version="1.0" encoding="us-ascii"?>
+
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
+
+<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
+
+
+<head>
+
+<title>Unix connect() and interrupted system calls</title>
+
+<meta name="description" content="Nitpicking about the semantics of the Unix connect() system call when interrupted" lang="en" />
+
+<meta name="keywords" content="computers, Unix, connect, EINTR, EALREADY, EINPROGRESS" lang="en" />
+
+<meta http-equiv="Content-Type" content="text/html; charset=us-ascii" />
+
+<meta http-equiv="Content-Language" content="en" />
+
+<link rel="Stylesheet" title="Preferred" lang="en" type="text/css" href="../preferred.css" />
+<link rel="Alternate Stylesheet" title="Black" lang="en" type="text/css" href="../black.css" />
+<link rel="Alternate Stylesheet" title="Classic" lang="en" type="text/css" href="../classic.css" />
+<link rel="Alternate Stylesheet" title="Blue" lang="en" type="text/css" href="../blue.css" />
+<script src="../jscript/functions.js" type="text/javascript"></script>
+
+</head>
+
+
+<body onload="defaultOnLoad()" onunload="defaultOnUnload()">
+
+
+<h1 align="center">Unix <code>connect()</code> and interrupted system calls</h1>
+
+<p align="center" class="navbar">
+<small>
+[<a href="http://www.ens.fr/">ENS</a>]
+[<a href="http://www.eleves.ens.fr:8080/">ENS students</a>]
+[<a href="http://www.eleves.ens.fr:8080/home/madore/">David Madore</a>]
+<br />
+[<a href="../math/">Mathematics</a>]
+[<a href="../computers/">Computer science</a>]
+[<a href="../programs/">Programs</a>]
+[<a href="../linux/">Linux</a>]
+[<a href="../lit/">Literature</a>]
+<br />
+[<a href="../pages_new.html">What's new?</a>]
+[<a href="../pages_cool.html">What's cool?</a>]
+[<a href="../#sitemap">Site map</a>]
+</small>
+</p>
+
+
+<p><strong>Summary:</strong> This page makes a fine (and admittedly
+minor) point about the behavior of the Unix (socket API)
+<code>connect()</code> system call when it is interrupted by a signal.
+It points out how obscure the Single Unix Specification is, and notes
+that many existing Unix implementations seem to have f*cked this up
+somehow.</p>
+
+<p>The question is this: if a blocking <code>connect()</code> (on a
+blocking stream socket, that is) is interrupted by a signal, returning
+<code>EINTR</code>, in what state is the socket left, and is it
+permissible to restart the system call? What happens if a second
+<code>connect()</code> with the same arguments is attempted
+immediately after one failed with <code>EINTR</code>?</p>
+
+<p>The <a
+href="http://www.opengroup.org/onlinepubs/007904975/functions/connect.html">reference
+for <code>connect()</code></a> (hereafter, &ldquo;the Spec&rdquo;) is
+part of the <a href="http://www.opengroup.org/">Open Group</a>'s <a
+href="http://www.unix-systems.org/single_unix_specification/">Single
+Unix Specification</a>, version&nbsp;3 (<strong>note:</strong> you may
+need to register to read this; see also <a
+href="http://www.unix-systems.org/version3/online.html">here</a>).
+Here is the relevant part of it:</p>
+
+<blockquote>
+
+<p>If the initiating socket is connection-mode, then
+<code>connect()</code> shall attempt to establish a connection to the
+address specified by the <var>address</var> argument. If the
+connection cannot be established immediately and
+<code>O_NONBLOCK</code> is not set for the file descriptor for the
+socket, <code>connect()</code> shall block for up to an unspecified
+timeout interval until the connection is established. If the timeout
+interval expires before the connection is established,
+<code>connect()</code> shall fail and the connection attempt shall be
+aborted. If <code>connect()</code> is interrupted by a signal that is
+caught while blocked waiting to establish a connection,
+<code>connect()</code> shall fail and set <code>connect()</code> to
+[<code>EINTR</code>], but the connection request shall not be aborted,
+and the connection shall be established asynchronously.</p>
+
+<p>If the connection cannot be established immediately and
+<code>O_NONBLOCK</code> is set for the file descriptor for the socket,
+<code>connect()</code> shall fail and set <code>errno</code> to
+[<code>EINPROGRESS</code>], but the connection request shall not be
+aborted, and the connection shall be established
+asynchronously. Subsequent calls to <code>connect()</code> for the
+same socket, before the connection is established, shall fail and set
+<code>errno</code> to [<code>EALREADY</code>].</p>
+
+<p>When the connection has been established asynchronously,
+<code>select()</code> and <code>poll()</code> shall indicate that the
+file descriptor for the socket is ready for writing.</p>
+
+</blockquote>
+
+<p>Later on, when listing possible error codes for
+<code>connect()</code>, the Spec mentions:</p>
+
+<blockquote>
+
+<p>The <code>connect()</code> function shall fail if:</p>
+
+<dl>
+
+<dt>[&hellip;]</dt><dd></dd>
+
+<dt><code>[EALREADY]</code></dt>
+
+<dd>A connection request is already in progress for the specified
+socket.</dd>
+
+</dl>
+
+</blockquote>
+
+<p>How does this answer the question above? That is, what is supposed
+to happen, according to the Spec, when a second <code>connect()</code>
+is attempted (with the same arguments) just after one which returned
+<code>EINTR</code> (on a blocking stream socket)?</p>
+
+<p>To me it seems that the Spec is contradictory. On the one hand, it
+is stated that <q>If the connection cannot be established immediately
+and <code>O_NONBLOCK</code> is not set for the file descriptor for the
+socket, <code>connect()</code> shall block for up to an unspecified
+timeout interval until the connection is established.</q>&mdash;now
+that sort of implies, since no provision is made for exceptions, that
+the second call to <code>connect()</code> should <em>continue</em> the
+connection attempted by the first (and which has not been aborted but
+run asynchronously), and block until the connection is established (or
+an error is returned). Let us call this the &ldquo;Liberal
+Behavior&rdquo; in what follows (I will later explain that this is how
+Linux behaves). On the other hand, the <code>EALREADY</code> error is
+documented as being returned whenever a connection request is
+underway. So the Spec seems to say that the second call to
+<code>connect()</code> should fail with the <code>EALREADY</code>
+error code. Let us call this the &ldquo;Unforgiving Behavior&rdquo;
+in what follows (I will later explain that this is how Solaris
+behaves). One thing is certain: the Spec is highly unclear about this
+point.</p>
+
+<p>I have asked several people how they read the Spec,
+and most seem to favor the &ldquo;Unforgiving Behavior&rdquo;: they
+say the Spec requires the second <code>connect()</code> call to fail
+with <code>EALREADY</code>. So admittedly this is how the Spec should
+be read; whether this is <em>desirable</em> behavior, on the other
+hand, is dubious (see below).</p>
+
+<p>In <cite>Unix Network Programming</cite>, volume&nbsp;1,
+section&nbsp;5.9, W.&nbsp;Richard Stevens states:</p>
+
+<blockquote>
+
+<p>What we are doing [&hellip;] is restarting the interrupted system
+call ourself. This is fine for <code>accept</code>, along with the
+functions such as <code>read</code>, <code>write</code>,
+<code>select</code> and <code>open</code>. But there is one function
+that we cannot restart ourself: <code>connect</code>. If this
+function returns <code>EINTR</code>, we cannot call it again, as doing
+so will return an immediate error. When <code>connect</code> is
+interrupted by a caught signal and is not automatically restarted, we
+must call <code>select</code> to wait for the connection to complete,
+as we describe in section&nbsp;15.3.</p>
+
+</blockquote>
+
+<p>This, indeed, clearly describes the &ldquo;Unforgiving
+Behavior&rdquo; (<code>connect()</code> failing immediately when
+restarted); note that Stevens does not say which error code is
+produced, and indeed Solaris returns <code>EALREADY</code> but BSD
+returns <code>EADDRINUSE</code> (a highly illogical error code in my
+opinion).</p>
+
+<p>The &ldquo;Liberal Behavior&rdquo; consists of making
+<code>connect()</code> like every other system call: it can be
+restarted with the same arguments, without thinking, so long as it
+returns <code>EINTR</code> (only one minor difference remains:
+<code>EISCONN</code> needs to be checked, to avoid a race condition
+between two calls to <code>connect()</code>). This seems to be what
+Linux does (whether this is against the Spec is, as I say, a matter
+of interpretation, though most people seem to think that it
+is). There is much to be said in favor of this &ldquo;Liberal
+Behavior&rdquo;: basically, <em>the whole point of using blocking
+sockets is for system calls to block</em> rather than stupidly
+returning a temporary error code (be it
+<code>EWOULDBLOCK</code>/<code>EAGAIN</code>, <code>EINPROGRESS</code>
+or <code>EALREADY</code>) and forcing us to use <code>select()</code>
+or <code>poll()</code> to know when that temporary error will be gone;
+of what use are blocking sockets at all, if we are forced to use
+<code>select()</code> or <code>poll()</code> anyway? This is my main
+reason for preferring the &ldquo;Liberal Behavior&rdquo;.</p>
+
+<p>Besides, the &ldquo;Liberal Behavior&rdquo; makes things
+<em>much</em> easier to program. Concretely, I can write the
+following:</p>
+
+<blockquote>
+
+<pre>
+/* Start with fd just returned by socket(), blocking, SOCK_STREAM... */
+while ( connect (fd, &amp;name, namelen) == -1 &amp;&amp; errno != EISCONN )
+  if ( errno != EINTR )
+    {
+      perror (&quot;connect&quot;);
+      exit (EXIT_FAILURE);
+    }
+/* At this point, fd is connected. */
+</pre>
+
+</blockquote>
+
+<p>&mdash;instead of having to write all this:</p>
+
+<blockquote>
+
+<pre>
+/* Start with fd just returned by socket(), blocking, SOCK_STREAM... */
+if ( connect (fd, &amp;name, namelen) == -1 )
+  {
+    struct pollfd unix_really_sucks;
+    int some_more_junk;
+    socklen_t yet_more_useless_junk;
+
+    if ( errno != EINTR /* &amp;&amp; errno != EINPROGRESS */ )
+      {
+        perror (&quot;connect&quot;);
+        exit (EXIT_FAILURE);
+      }
+    unix_really_sucks.fd = fd;
+    unix_really_sucks.events = POLLOUT;
+    while ( poll (&amp;unix_really_sucks, 1, -1) == -1 )
+      if ( errno != EINTR )
+        {
+          perror (&quot;poll&quot;);
+          exit (EXIT_FAILURE);
+        }
+    yet_more_useless_junk = sizeof(some_more_junk);
+    if ( getsockopt (fd, SOL_SOCKET, SO_ERROR,
+                     &amp;some_more_junk,
+                     &amp;yet_more_useless_junk) == -1 )
+      {
+        perror (&quot;getsockopt&quot;);
+        exit (EXIT_FAILURE);
+      }
+    if ( some_more_junk != 0 )
+      {
+        fprintf (stderr, &quot;connect: %s\n&quot;,
+                 strerror (some_more_junk));
+        exit (EXIT_FAILURE);
+      }
+  }
+/* At this point, fd is connected. */
+</pre>
+
+</blockquote>
+
+<p>&mdash;which anyone will admit is longer (over five times longer,
+as a matter of fact) and more tedious to write. Hence my calling this
+behavior the &ldquo;Liberal Behavior&rdquo; because it does not force
+the programmer to go through all this pain just to connect a
+socket.</p>
+
+<p>Unfortunately, my opinion has not been consulted in defining Unix
+implementations, nor in writing the Single Unix Specification, so it
+seems that the &ldquo;Liberal Behavior&rdquo; is not highly thought
+of, except under Linux, and anyone who wants to write a Unix program
+(except if it is to run solely on the Linux kernel) that performs the
+trivial act of opening a socket has to go through all the mess I have
+just written (and which, in case it is of any use, I put in the Public
+Domain). <small>Why not use <code>SA_RESTART</code> on all signal
+handlers, some will ask? Well, the problem is that one is never quite
+sure that <em>all</em> signals have been treated, and just one signal
+is enough to interrupt system calls. Most people agree that
+<code>SA_RESTART</code> does not exempt you from testing
+<code>EINTR</code> everywhere, if you want to be safe. Anyway, it is
+not the purpose of this page to discuss this point.</small></p>
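+
+<p>For readers unfamiliar with <code>SA_RESTART</code>, a minimal
+sketch of installing a handler with or without that flag might look
+like this. The names <code>install_handler</code>,
+<code>note_signal</code> and <code>got_signal</code> are hypothetical
+and serve only as illustration; the sketch shows how the flag is set
+and does not settle whether relying on it is wise.</p>

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* ask for the POSIX declarations */
#endif
#include <signal.h>
#include <string.h>

/* Set by the handler so the main flow can tell that a signal arrived. */
static volatile sig_atomic_t got_signal = 0;

static void
note_signal (int signo)
{
  (void) signo;
  got_signal = 1;
}

/* Hypothetical helper: install note_signal() for signo.  With restart
   non-zero the handler is installed with SA_RESTART; with restart zero
   the flag is left clear.  Returns 0 on success, -1 on failure with
   errno set. */
int
install_handler (int signo, int restart)
{
  struct sigaction sa;

  memset (&sa, 0, sizeof (sa));
  sa.sa_handler = note_signal;
  sigemptyset (&sa.sa_mask);
  sa.sa_flags = restart ? SA_RESTART : 0;
  return sigaction (signo, &sa, NULL);
}
```

+<p>Calling <code>install_handler (SIGALRM, 0)</code> before a blocking
+<code>connect()</code> is one way to provoke the <code>EINTR</code>
+case this page is about; every other signal that has a handler still
+needs the same treatment, which is exactly the difficulty noted
+above.</p>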
+
+<p>Annoyingly, not only do Unix implementations vary in this, but
+the documentation is also either imprecise or positively wrong.</p>
+
+<p>Linux adopts the pleasant &ldquo;Liberal Behavior&rdquo; I have
+described. The man page for <code>connect(2)</code> under Linux does
+not document <code>EINTR</code>, however, nor does the info page of
+the GNU libc, whereas the code can be shown to occur:
+<code>connect()</code> <em>can</em> be interrupted by a signal under
+Linux (fortunately!).</p>
+
+<p>Solaris adopts the &ldquo;Unforgiving Behavior&rdquo; that seems to
+be the literal interpretation of the Spec. However, the Solaris man
+page for <code>connect(3SOCKET)</code> states that the
+<code>EALREADY</code> error code occurs when <q>The socket is
+non-blocking and a previous connection attempt has not yet been
+completed.</q> But I have observed cases in which the <code>EALREADY</code> error
+code was returned for a blocking socket (this is the very point I'm
+arguing about).</p>
+
+<p>Both FreeBSD and OpenBSD adopt the &ldquo;Unforgiving
+Behavior&rdquo;, with one departure from the Spec: the
+error code returned on the second <code>connect()</code> call is
+<code>EADDRINUSE</code> rather than <code>EALREADY</code>. This is
+probably tradition, but it seems rather absurd. There is, however,
+one difference between FreeBSD and OpenBSD, but it concerns
+<em>non-blocking</em> sockets: OpenBSD returns <code>EALREADY</code>
+for non-blocking sockets as the Spec prescribes, whereas FreeBSD never
+seems to return <code>EALREADY</code> at all. However, this page is
+essentially about blocking sockets. Also, FreeBSD documents
+<code>EALREADY</code> (strange, since it does not return it), but not
+<code>EINTR</code> (which it does return); the OpenBSD man page is
+essentially correct. Note that in both cases the return code of
+<code>EADDRINUSE</code> is not clearly documented.</p>
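+
+<p>Putting the observations above together, a program that retries
+<code>connect()</code> after <code>EINTR</code> could interpret the
+result of the second call along these lines. This is a sketch built
+only from the behaviors reported on this page; the enum and the
+function name <code>classify_restart</code> are hypothetical, and the
+mapping is meaningful only for a <code>connect()</code> retried on a
+blocking socket right after <code>EINTR</code> (on a first call,
+<code>EADDRINUSE</code> can be a genuine error).</p>

```c
#include <errno.h>

enum restart_outcome
{
  RESTART_CONNECTED,     /* success or EISCONN: the connection is up */
  RESTART_INTERRUPTED,   /* EINTR again: Liberal Behavior, keep looping */
  RESTART_UNFORGIVING,   /* EALREADY or EADDRINUSE: fall back to poll() */
  RESTART_FAILED         /* anything else: the connection attempt failed */
};

/* Hypothetical helper: rc is the return value of the retried connect(),
   err is the value of errno sampled right after it. */
enum restart_outcome
classify_restart (int rc, int err)
{
  if ( rc == 0 || err == EISCONN )
    return RESTART_CONNECTED;
  switch ( err )
    {
    case EINTR:
      return RESTART_INTERRUPTED;
    case EALREADY:      /* reported above for Solaris */
    case EADDRINUSE:    /* reported above for FreeBSD and OpenBSD */
      return RESTART_UNFORGIVING;
    default:
      return RESTART_FAILED;
    }
}
```

+<p>On <code>RESTART_UNFORGIVING</code> the caller would switch to
+waiting with <code>poll()</code> and reading <code>SO_ERROR</code>, as
+in the longer listing earlier on this page.</p>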
+
+<p>Information about the behavior of other Unixen, or other
+implementations of the Unix socket API, in this regard, is
+welcome.</p>
+
+<p>See also <a
+href="http://www.google.com/groups?threadm=tON4XI9hD5aY%24comp.unix.programmer%40clipper.ens.fr">this
+thread</a> on the <code>comp.unix.programmer</code> newsgroup (thanks
+to <a href="http://www.google.com/grphp">Google Groups</a> for making
+this available), where the issue was discussed (note that I mentioned
+<code>EISCONN</code> in error whereas I meant
+<code>EADDRINUSE</code>).</p>
+
+<p><a
+href="ftp://quatramaran.ens.fr/pub/madore/misc/connect_test.c">This
+trivial program</a> (Public Domain) will run tests on a given Unix
+implementation, to determine what the actual behavior happens to be,
+and display the result in a perfectly obscure and incomprehensible
+form. See the comments at the beginning of the file for information
+on use.</p>
+
+
+<hr />
+
+
+<address>
+<a href="mailto:david.madore@ens.fr">David Madore</a>
+</address>
+
+<p>Last modified: $Date: 2003/04/25 04:22:23 $</p>
+</body>
+</html>