Wietse Venema
2013-03-25 15:15:29 UTC
Postfix snapshot 20130324 uses kqueue() for MacOS X 8.x in Postfix
event handling routines (instead of using select()).
Unfortunately, we missed one MacOS bug.
When Postfix uses kqueue() for event handling, it relies on poll()
to enforce time limits on individual read/write operations. Prior
to snapshot 20130324, Postfix on MacOS X would use select() for
both event handling and for time-limiting individual read/write
operations.
Viktor Dukhovni reports that MacOS poll() support is still broken
for /dev/urandom. This breaks tlsmgr(8), as discussed in:
http://archives.neohapsis.com/archives/postfix/2009-12/thread.html#805
Workaround:
$ make makefiles "CCARGS=-DNO_KQUEUE"
I wrote a quick program to test whether Postfix could use kqueue()
instead of poll(), but that program already fails on FreeBSD (fatal:
kevent EV_ADD: Operation not supported by device). As FreeBSD is
a major provider of MacOS kernel code, I've decided not to pursue
this path further.
Postfix insists on using {read,write}_wait() and {read,writ}able()
with each individual read/write operation, because the program must
never block forever. I'm sure many system administrators appreciate
that Postfix does not lock up easily.
Postfix wants to use poll() instead of select() in {read,write}_wait()
and {read,writ}able(), because these functions may be called with
file descriptors >= FD_SETSIZE, the limit of the size of a file
descriptor set used by select().
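
To illustrate the kind of poll()-based time limit described above, here
is a minimal sketch. It is not the Postfix source: sketch_read_wait()
is a hypothetical name, and the return convention (0 when readable,
-1 with errno set to ETIMEDOUT on timeout) is an assumption made for
this example. Because poll() takes an array of descriptor numbers
rather than a bitset, it is not limited by FD_SETSIZE.

```c
#include <errno.h>
#include <poll.h>

/*
 * Hypothetical helper, not the actual Postfix read_wait().
 * Wait until fd is readable or timeout_sec seconds have passed.
 * Returns 0 when a read() would not block, -1 with errno set to
 * ETIMEDOUT on timeout, or -1 with another errno value on error.
 * A negative timeout_sec means wait without a time limit.
 */
int sketch_read_wait(int fd, int timeout_sec)
{
    struct pollfd pfd;
    int rc;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    for (;;) {
        rc = poll(&pfd, 1, timeout_sec < 0 ? -1 : timeout_sec * 1000);
        if (rc > 0) {
            if (pfd.revents & POLLNVAL) {
                errno = EBADF;          /* fd is not an open descriptor */
                return -1;
            }
            return 0;                   /* data, EOF or error is pending */
        }
        if (rc == 0) {
            errno = ETIMEDOUT;          /* time limit reached */
            return -1;
        }
        if (errno != EINTR)
            return -1;                  /* real poll() failure */
        /* Interrupted by a signal: retry. Simplification: the full
         * time limit starts over instead of being reduced. */
    }
}
```

A caller would invoke this before each read() and treat the ETIMEDOUT
case as a fatal or retryable condition, depending on the protocol.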
Also relevant is that Postfix servers that manage many file handles
leave some descriptors < 128 unused for the benefit of (non-Postfix)
library code that wants to use select() internally.
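
One way to keep low-numbered descriptors free is to move a newly
opened descriptor above a threshold with fcntl(F_DUPFD). The sketch
below only illustrates that idea. The helper name and the macro are
hypothetical, and this is not necessarily how Postfix implements it;
only the value 128 comes from the text above.

```c
#include <fcntl.h>
#include <unistd.h>

#define SKETCH_LOW_FD_RESERVE 128       /* threshold from the text above */

/*
 * Hypothetical helper, not taken from Postfix.
 * Return a descriptor numbered >= reserve that refers to the same
 * open object as fd. The original descriptor is closed when a copy
 * was made. Returns -1 on error and leaves fd open in that case.
 * Note: F_DUPFD clears the close-on-exec flag on the new descriptor.
 */
int sketch_move_fd_up(int fd, int reserve)
{
    int new_fd;

    if (fd >= reserve)
        return fd;                      /* already out of the low range */
    if ((new_fd = fcntl(fd, F_DUPFD, reserve)) < 0)
        return -1;                      /* e.g. descriptor limit reached */
    (void) close(fd);                   /* release the low-numbered slot */
    return new_fd;
}
```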
I see the following options:
- Until MacOS X is fixed, keep using select() for event handling
(and to enforce time limits on each read/write operation). It's
not primarily a server platform anyway. This is what I have chosen
as an initial solution (i.e. Postfix works exactly as before).
- Use kqueue() for event handling, and use select() instead of
poll() to enforce time limits on each read/write operation, but
increase FD_SETSIZE at compile time. The FreeBSD and Darwin
select() manpages document this as a legitimate way to handle
larger file descriptor numbers. This means dragging kbyte-size
bitsets into and out of the select() interface. Such code is slow,
and to avoid this, kqueue() was created (followed soon by Solaris
and Linux equivalents).
- Use kqueue() for event handling, and use select() instead of
poll() to enforce time limits on each read/write operation, but
add code that dup()s a descriptor >= FD_SETSIZE to a temporary
descriptor with a lower number, and select() on that temporary
descriptor instead. This might work (we're unlikely to encounter
POSIX fcntl() locking brain damage on non-file objects), but burns
more CPU cycles. On the other hand, the odds that Postfix will
handle file descriptors >= FD_SETSIZE are small on MacOS X.
Wietse