Stdio Filedescriptor Limitation Workaround
==========================================
Johann Klasek 

The official documentation of this issue can also be found
in the milter-greylist wiki:
http://milter-greylist.wikidot.com/solaris-filedescriptor-limitation


Idea
----

Some stdio stream implementations suffer from limitations that are
kept for backward compatibility. In particular, Solaris 9 and earlier
versions in the 32-bit (ILP32) programming model limit the file
descriptor field of the FILE data structure to a width of 8 bits.
Therefore stdio streams can only handle file descriptors in the
range from 0 to 255.
Over the years some applications have accessed the FILE structure
components directly (e.g. the _file field). The structure has never
been changed, in order to stay backward compatible and to avoid
breaking existing applications (binaries).
Even though Solaris 9 offers a maximum of 65536 file descriptors,
only descriptors 0 to 255 can be stored in the FILE structure.
If a descriptor value greater than 255 is passed to e.g. fdopen(),
this library function simply fails with errno unset!
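
The effect of the 8-bit field can be sketched in a few lines of C. The
struct below is a hypothetical stand-in and not the real Solaris FILE
layout; the name legacy_file and the helper fd_fits_in_legacy_file() are
made up for this illustration. Only the width of the descriptor field
matters.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant part of a 32-bit Solaris FILE.
 * The real structure has more members; only the 8-bit field matters here. */
struct legacy_file {
    unsigned char _file;    /* file descriptor, 8 bits wide */
};

/* Returns true if fd can be stored in the 8-bit field without loss. */
bool fd_fits_in_legacy_file(int fd)
{
    struct legacy_file f;

    assert(CHAR_BIT == 8);          /* the sketch assumes 8-bit bytes */
    if (fd < 0)
        return false;
    f._file = (unsigned char)fd;    /* values above 255 wrap around */
    return f._file == fd;
}
```

Descriptor 255 still fits, while 256 wraps around to 0 and would silently
refer to a different descriptor. This is why the library has to refuse
such values.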

Solaris 10 offers several solutions to cope with this situation
(for source-based or binary-only applications).

The workaround described here is mainly intended for versions prior
to Solaris 10.
This module provides replacements for the functions
fdopen() and fclose(). The idea is to reserve a pool of
open descriptors (associated with /dev/null)
with values <256 which are acquired as needed by fdopen_ext().
Conversely, fclose_ext() tries to return a descriptor
lower than 256 to the reserved pool. The latter cannot be
guaranteed in all cases because the real fclose() call releases
the associated descriptor, which cannot be reclaimed without
a delay. A thread running in parallel could grab the closed
descriptor during that delay ...
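
The pool idea can be sketched as follows. This is not the code from
file_ext.c: the names pool_init() and pool_take(), the pool size and the
use of dup2() are assumptions made for illustration, and the sketch omits
the locking a multi-threaded program would need.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

#define LOW_FD_LIMIT 256    /* stdio can only store descriptors below this */
#define POOL_SIZE    8      /* hypothetical pool size, chosen for the sketch */

static int pool[POOL_SIZE];
static int pool_count = 0;

/* Reserve low descriptors by opening /dev/null.
 * Returns the number of descriptors reserved. */
int pool_init(void)
{
    pool_count = 0;
    while (pool_count < POOL_SIZE) {
        int fd = open("/dev/null", O_RDONLY);

        if (fd < 0)
            break;
        if (fd >= LOW_FD_LIMIT) {   /* too high to be useful for stdio */
            close(fd);
            break;
        }
        pool[pool_count++] = fd;
    }
    return pool_count;
}

/* Move high_fd onto a reserved low slot. Returns the low descriptor,
 * or -1 if the pool is empty or dup2() fails. On success high_fd is closed. */
int pool_take(int high_fd)
{
    int low_fd;

    assert(high_fd >= 0);
    if (pool_count == 0)
        return -1;
    low_fd = pool[pool_count - 1];
    if (dup2(high_fd, low_fd) < 0)  /* replaces the /dev/null placeholder */
        return -1;
    pool_count--;
    close(high_fd);
    return low_fd;
}
```

Because dup2() replaces the placeholder in one step, the low number is
never free for another thread to grab while it is being handed out. The
way back is the fragile part: once fclose() has released the descriptor,
the number is up for grabs until the pool re-opens it.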

The whole scheme works only in situations where no more
than 256 streams are in use at the same time, even though the maximum
number of open descriptors may be much greater.

References:
  http://www.oracle.com/technetwork/server-storage/solaris10/stdio-256-136698.html
  http://www.science.uva.nl/pub/solaris/solaris2.html
        3.48) How can I increase the number of file descriptors per process?
  http://www.research.att.com/~gsf/download/ref/sfio/sfio.html


Source
------

file_ext/file_ext.c
file_ext/file_ext.h


Usage
-----

Typical usage is to replace calls to fopen(), fdopen() and fclose() by
Fopen(), Fdopen() and Fclose(). Depending on whether USE_FD_POOL is
defined, the new functions are mapped either to the workaround or to the
default functions from the stdio library. The module needs some
initialisation, which is provided by the function file_ext_init(). It
should be called as early as possible, normally together with other
initialisation in function main(), and in any case before any file
descriptor is used.

#include "file_ext.h"

#ifdef USE_FD_POOL
         /* initialize file descriptor pool as soon as possible ... */
         file_ext_init();
#endif

... stream = Fopen(file, "r") ...

... stream = Fdopen(descriptor, "r") ...

... Fclose(stream) ...


Compile with flag/define -DUSE_FD_POOL

Link with file_ext.o (add file_ext.o to the Makefile object list).
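
How the mapping controlled by USE_FD_POOL might look is sketched below.
The real definitions live in file_ext.h and may differ: the prototypes of
fopen_ext(), fdopen_ext() and fclose_ext() are assumed to mirror their
stdio counterparts, and probe_file() is a hypothetical caller added only
to show the macros in use.

```c
#include <assert.h>
#include <stdio.h>

#ifdef USE_FD_POOL
/* Workaround enabled: assumed prototypes, mirroring the stdio originals. */
FILE *fopen_ext(const char *path, const char *mode);
FILE *fdopen_ext(int fd, const char *mode);
int   fclose_ext(FILE *stream);
#define Fopen(path, mode)  fopen_ext((path), (mode))
#define Fdopen(fd, mode)   fdopen_ext((fd), (mode))
#define Fclose(stream)     fclose_ext((stream))
#else
/* Workaround disabled: fall through to plain stdio. */
#define Fopen(path, mode)  fopen((path), (mode))
#define Fdopen(fd, mode)   fdopen((fd), (mode))
#define Fclose(stream)     fclose((stream))
#endif

/* Hypothetical caller: returns 0 if the file can be opened and closed. */
int probe_file(const char *path)
{
    FILE *stream;

    assert(path != NULL);
    stream = Fopen(path, "r");
    if (stream == NULL)
        return -1;
    return Fclose(stream) == 0 ? 0 : -1;
}
```

With this arrangement the calling code is identical on all platforms; only
the compile flag decides whether the pool is involved.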



Logging
-------

Typical log messages for acquiring a new low descriptor through a call to fdopen()
and for getting a descriptor back by means of fclose() are shown below.
These log messages only appear if the fopen/fdopen/fclose functions
have to handle a descriptor value above 255. It may take some time until a
situation arises where more than 250 threads (concurrent sendmail jobs) exist!

Jun 15 10:01:50 server milter-greylist: [ID 411381 mail.info] fdopen_ext: get_pool_desc: descriptor 13424 reused as 11
Jun 15 10:02:05 server milter-greylist: [ID 956266 mail.info] fclose_ext: taking descriptor 9 back into low fd pool


To find any occurrences of FD pool usage:

grep _ext: /var/log/maillog



All log messages and their sources:

file_ext.c: (LOG_ERR, "file_ext: can't create dummy file descriptor file %s: %s", FNAME, strerror(errno))
file_ext.c: (LOG_ERR, "file_ext: %s", (descriptor < 0 ? "can't allocate a new descriptor" : "can't allocate small dummy file descriptor (<256) - out of luck!") )
file_ext.c: (LOG_INFO, "fclose_ext: adding new descriptor %d into low fd pool", new_descriptor)
file_ext.c: (LOG_ERR, "fclose_ext: fd_new_desc failed: low descriptor %d lost! (%s)", old_descriptor, strerror(errno))
file_ext.c: (LOG_ERR, "fclose_ext: dup failed: low descriptor %d lost! (%s)", old_descriptor, strerror(errno))
file_ext.c: (LOG_ERR, "fclose_ext: low descriptor %d lost!", old_descriptor)
file_ext.c: (LOG_INFO, "fclose_ext: taking descriptor %d back into low fd pool", descriptor)
file_ext.c: (LOG_INFO, "fdopen_ext: get_pool_desc: descriptor %d reused as %d", desc, i)
file_ext.c: (LOG_ERR, "fdopen_ext: no free low file descriptor")
file_ext.c: (LOG_INFO, "fopen_ext: descriptor %d lost!", descriptor)
file_ext.c: (LOG_ERR, "fopen_ext: failed and descriptor %d lost!", descriptor)



milter-greylist patches
-----------------------

Milter-greylist Home: http://hcpnet.free.fr/milter-greylist/
Milter-greylist Wiki: http://milter-greylist.wikidot.com/

The STDIO workaround feature has been incorporated into milter-greylist since
version 4.0 and is enabled with the configure option "--enable-stdio-hack".


Autoconfiguration patch
- - - - - - - - - - - -

For milter-greylist 4.0rc1, 4.0rc2

These are changes to configure.ac, which is part of the 4.0rc{1,2}
distribution. They correct the binding into a Solaris multi-threading
environment. Two suggested variations exist. Both take GCC and the
Sun Studio C compiler into account. The former needs the CFLAG
"-pthreads", the latter "-mt" to compile multi-threaded sources with the
correct prototypes.

   Variation A: test by means of strtok_r() whether its usage leads to a compiler warning.
   Variation B: always add option "-mt" or "-pthreads" if the compiler understands it.

configure.ac.thread-option-test			Variation A
configure.ac.thread-option-test-2		Variation B
configure.ac.thread-option-test-2.patch		Variation B patch (context diff)
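
The probe source used by Variation A is not reproduced here. A minimal
sketch of such a probe, assuming it does little more than call strtok_r(),
could look like this; count_tokens() is a hypothetical name. If the
compile emits a warning about a missing prototype for strtok_r(), the
thread option is evidently missing.

```c
#include <assert.h>
#include <string.h>

/* Counts blank- or tab-separated tokens using the reentrant strtok_r().
 * The buffer is modified in place. */
int count_tokens(char *buf)
{
    char *save = NULL;
    char *tok;
    int n = 0;

    assert(buf != NULL);
    for (tok = strtok_r(buf, " \t", &save); tok != NULL;
         tok = strtok_r(NULL, " \t", &save))
        n++;
    return n;
}
```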


Without this changed autoconfiguration setup, configure does not do its
job correctly in Solaris environments and the compile produces a lot of
warnings (missing prototypes, ...). The application will still be
compiled and linked, though ...


### Making a new configure script

autoconf configure.ac >configure



Complete patch cluster
- - - - - - - - - - - 

**** NOTE: The patches below are already included in milter-greylist beginning with 4.0rc1!!!!!

milter-greylist-3.0.patch	file_ext workaround integration into milter-greylist 3.0
milter-greylist-4.0a6.patch	file_ext workaround integration into milter-greylist 4.0a6

These patches cover the following issues:

 * [Solaris] Integration of the file_ext module as workaround for the stdio file descriptor
	limitation, targeting the sync.c and dump.c modules.

 * [Solaris] Proper handling of fdopen() error situations: Solaris explicitly does *not* set errno
	in some cases where a failure is returned (indicating that fdopen() has failed).
	In this situation errno and the corresponding error message have no meaning.
	This occurs if the descriptor is out of range (>255).

 * [General] The dumper thread does not terminate if opening the dump file fails; instead
	it tries again after 60 seconds.
	(only in milter-greylist-3.0.patch, as an option)

 * [General] Extended (more portable) handling of error codes after a call to accept() in
	module sync.c.
 
 * [Solaris] dnsbl.c (with configure --enable-dnsrbl) must be forced to use the thread-safe
	resolver routines. At least on Solaris the existence of these routines is indicated by
	the __RES define from <resolv.h> (which represents the resolver version
	as a date) - this seems to be true for all BIND-derived
	resolver implementations ... (notably on Linux, too).
	A test for the macro "res_ninit" is not sufficient, or even incorrect, because
	at least Solaris <=9 defines this routine as a native library call and not as
	a macro!
	bzero needs <strings.h>.
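
The fdopen() item in the list above can be illustrated with a small
wrapper. This is a sketch and not the code from the patches:
fdopen_checked() and its message texts are hypothetical. The point is to
clear errno before the call, so that a failure which leaves errno
untouched is not reported with a stale, misleading error text.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Wraps fdopen() and writes a usable reason into msg on failure,
 * even if the library left errno untouched. */
FILE *fdopen_checked(int fd, const char *mode, char *msg, size_t msglen)
{
    FILE *stream;

    assert(msg != NULL && msglen > 0);
    errno = 0;                      /* so a stale value is not misreported */
    stream = fdopen(fd, mode);
    if (stream == NULL) {
        if (errno != 0)
            snprintf(msg, msglen, "fdopen(%d) failed: %s",
                     fd, strerror(errno));
        else
            snprintf(msg, msglen,
                     "fdopen(%d) failed: errno not set "
                     "(descriptor out of range?)", fd);
    } else {
        msg[0] = '\0';
    }
    return stream;
}
```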


Separated patches against milter-greylist-4.0a6
- - - - - - - - - - - - - - - - - - - - - - - -

**** NOTE: The patches below are already included in milter-greylist beginning with 4.0rc1!!!!!

These are separate patches corresponding to the milter-greylist-4.0a6.patch
above. The only notable difference is that the file_ext module (the
Solaris STDIO workaround) is more hidden. Because these patches
overlap, they cannot easily be applied all together (without resolving
conflicts manually). Taken together, however, they are equivalent to
milter-greylist-4.0a6.patch.

mg.accept.patch			be more flexible in different Unix environments 
				regarding the EINTR error code ...

mg.resolver.patch		fixes the usage of the thread-proof resolver library
				(for some environments, e.g. Solaris 8 and 9 at least).

mg.stdio-handling.patch		fixes the handling of STDIO functions returning
				file descriptors (not setting errno, this
				is the case at least for Solaris).

mg.stdio-solaris.patch		workaround for Solaris 8 and 9 STDIO library
				limitation of 256 STDIO file descriptors.

mg.temp-file-fix.patch          fixes the cleanup of temporary files during
				the DB dumping process in case of failures.  
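
What the more flexible accept() handling of mg.accept.patch could amount
to is sketched below. The text above only names EINTR; treating
ECONNABORTED and EPROTO as retryable as well is an assumption of this
sketch, and accept_error_is_transient() is a hypothetical helper, not a
function from sync.c.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Decides whether an errno value seen after accept() returned -1 means
 * "try again" rather than "give up". The set of codes is an assumption. */
bool accept_error_is_transient(int err)
{
    assert(err != 0);   /* only meaningful after accept() has failed */
    switch (err) {
    case EINTR:         /* interrupted by a signal */
#ifdef ECONNABORTED
    case ECONNABORTED:  /* peer gave up before accept() returned */
#endif
#ifdef EPROTO
    case EPROTO:        /* some systems report aborted connections this way */
#endif
        return true;
    default:
        return false;
    }
}
```

An accept loop would call accept() again when the helper returns true and
log and bail out otherwise.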



Building milter-greylist
- - - - - - - - - - - - 

Using libspf2, feature dnsrbl



### Solaris 9 x86 (32bit) with GCC

# assuming libspf2 in /usr/local, Berkeley DB in /pd/db/
# --enable-stdio-hack is needed for Solaris <=9 platforms! Solaris 10 64bit does not need
#	this workaround anymore.

./configure --with-libspf2=/usr/local --with-db=/pd/db --enable-dnsrbl --enable-stdio-hack
make

### Linux FC6 x86_64 (64bit) with GCC

./configure --with-libspf2 --with-thread-safe-resolver --enable-dnsrbl
make


### RPM for Linux FC6 x86_64 (64bit) with GCC

milter-greylist-zid.spec		Customized .spec File with smmsp User, enabled dnsrbl feature
milter-greylist-zid.spec.patch		Patch against milter-greylist.spec from 4.0rc1
milter-greylist.spec			Generated with 4.0rc1



libspf2
-------

Official documentation regarding this issue can also be found at
http://milter-greylist.wikidot.com/libspf2


libspf2 is a new generation SPF library which may be included into milter-greylist.

References:
  http://www.libspf2.org/ official libspf2 site
  http://www.city-fan.org/ftp/contrib/libraries/ (RPM Spec and patches)


Patches
- - - -

Applicable to libspf2 1.2.5


libspf2-1.2.5-64bit.patch		fixes int/size_t problem, taken from http://www.city-fan.org/
libspf2-1.2.5-bogus-header.patch	fixes unused header file reference, taken from http://www.city-fan.org/
libspf2-1.2.5-malloc.patch		fixes malloc error handling
libspf2-1.2.5-res_ninit.patch		fixes the initialisation and usage of res_ninit()


ad libspf2-1.2.5-res_ninit.patch:

	The static data structure for res_ninit() is now correctly
	initialized to 0. Otherwise some newer resolver libraries, which
	try to clean up any state that is not already zero, will find
	spurious pointer values and try to free them as if they were
	allocated memory ... sometimes leading to segmentation faults :(
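
The principle behind this fix can be sketched as follows. This is not the
patch itself: resolver_state_init() is a hypothetical helper, and the
sketch assumes a BIND-derived resolver that offers res_ninit() and the
res_state type in <resolv.h>. On some platforms linking needs -lresolv.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <netinet/in.h>
#include <arpa/nameser.h>
#include <resolv.h>

/* Zero the state before res_ninit() so that the library never sees
 * leftover pointer values. Returns the result of res_ninit(). */
int resolver_state_init(res_state state)
{
    assert(state != NULL);
    memset(state, 0, sizeof *state);
    return res_ninit(state);
}
```

An object with static storage duration is zero-initialized by the C
language anyway; the explicit memset() matters for state that lives on
the stack or the heap, or that is reused.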


Building RPMs
- - - - - - -

libspf2.spec		RPM spec file which includes all the patches given above.


cp -p libspf2-1.2.5-*.patch /usr/src/redhat/SOURCES/
cp -p libspf2.spec /usr/src/redhat/SPECS/

rpmbuild -ba libspf2.spec



Thu Oct 4 09:38:04 CEST 2012