For quite some time I managed to keep the number of “construction sites” in PDCLib to a minimum. Sure, there were plenty of unfinished parts (floating point, multibyte / wide characters, locales, …), but my actual work was focussed on one part of the library only.
Unfortunately, I have strayed a bit from that path, and ended up with more “action items” than I am comfortable with. That is why I opened this “drawing board”: to write down my thoughts about all those “construction sites”, to get them organized, and to make the path onward a bit clearer.
With later versions of glibc now finally supporting <threads.h>, we can expect to see software emerging that makes actual use of this header (instead of <pthread.h>). However, when conducting some more involved tests with my implementation, I found a couple of severe defects. (The thrd_exit() / thrd_join() return value handling is broken, for one.)
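For reference, this is the behaviour the standard asks for: the result code passed to thrd_exit(), or returned from the thread function, must be what thrd_join() stores for the caller. The following is a minimal sketch against the standard <threads.h> interface only; the helper names worker_exit, worker_return and run_and_join are made up for illustration and are not part of PDCLib.

```c
#include <threads.h>

/* Thread function that terminates through thrd_exit(). */
static int worker_exit(void *arg)
{
    (void)arg;
    thrd_exit(42); /* does not return */
}

/* Thread function that terminates through a plain return. */
static int worker_return(void *arg)
{
    (void)arg;
    return 7;
}

/* Hypothetical helper: run func in a new thread and hand back the
 * result code reported by thrd_join(), or -1 if the thread could
 * not be created or joined. */
static int run_and_join(thrd_start_t func)
{
    thrd_t thread;
    int result = -1;

    if (thrd_create(&thread, func, NULL) != thrd_success) {
        return -1;
    }
    if (thrd_join(thread, &result) != thrd_success) {
        return -1;
    }
    return result;
}
```

Both exit paths should yield the same kind of result: run_and_join(worker_exit) is expected to give 42 and run_and_join(worker_return) to give 7 on a conforming implementation.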
With <threads.h> we get the ability to handle thread-local storage. While I implemented a thread-local errno (which was simple enough), thread-local locale handling might turn out to be a bit more complicated. It might require initializing things, and we don't get to call functions from _PDCLIB_stdinit.c …
The idea was to write a function _PDCLIB_load_lc_<category> for each locale category (collate, ctype, monetary, numeric, time, messages). This worked rather well at first. For ctype I delved into Unicode, getting the “right” character classes directly from the Unicode database (auxiliary/uctype).
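To give an idea of what “directly from the Unicode database” involves: UnicodeData.txt is a semicolon-separated text file in which the first field is the code point (hexadecimal) and the third field is the two-letter general category. The sketch below extracts just those two fields from one line. It is an illustration under that assumption only, not the code in auxiliary/uctype, and the function name parse_unicodedata_line is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: pull the code point (field 0) and the general
 * category (field 2) out of one UnicodeData.txt line.
 * The name field (field 1) is skipped.
 * Returns 1 on success, 0 if the line does not match. */
static int parse_unicodedata_line(const char *line,
                                  unsigned long *codepoint,
                                  char category[3])
{
    /* %lx      -> hexadecimal code point
     * %*[^;]   -> character name, read but not stored
     * %2[^;]   -> up to two category characters, NUL-terminated */
    return sscanf(line, "%lx;%*[^;];%2[^;]", codepoint, category) == 2;
}

/* General categories starting with 'L' are the letter classes
 * (Lu, Ll, Lt, Lm, Lo). */
static int is_letter_category(const char *category)
{
    return category[0] == 'L';
}
```

Note that a full ctype mapping needs more than the general category (derived properties live in separate files), so this only covers the very first step.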
Then I wanted to do the same for collate (sorting equivalence), and this was where I got stuck. Unicode collation is a pretty big subject in the Unicode standard, and information about it is scattered over multiple chapters, even multiple documents. In a kind of repeat performance of the block I had with <stdio.h>, I did not find the necessary uninterrupted time to really grasp what was before me.
The thing to do here would be to identify which data from which Unicode input files I would need, in which format, in order to implement (initially) strcoll and strxfrm. Ideally, whatever architecture I come up with would also serve for (upcoming) multibyte and wide character collation.
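Whatever data format comes out of this, it has to uphold the contract between the two functions: comparing two strxfrm() results with strcmp() must give the same ordering as calling strcoll() on the original strings. The following is a small sketch of that contract using only standard library calls; the helper name compare_via_strxfrm is made up for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: compare a and b by way of their strxfrm()
 * images. The sign of the result matches strcoll(a, b).
 * *ok is set to 0 (and 0 is returned) if allocation fails. */
static int compare_via_strxfrm(const char *a, const char *b, int *ok)
{
    /* With n == 0 the destination may be NULL; the return value is
     * the length needed for the transformed string, without the NUL. */
    size_t size_a = strxfrm(NULL, a, 0) + 1;
    size_t size_b = strxfrm(NULL, b, 0) + 1;
    char *image_a = malloc(size_a);
    char *image_b = malloc(size_b);
    int result = 0;

    *ok = (image_a != NULL && image_b != NULL);
    if (*ok) {
        strxfrm(image_a, a, size_a);
        strxfrm(image_b, b, size_b);
        result = strcmp(image_a, image_b);
    }

    free(image_a);
    free(image_b);
    return result;
}
```

In the “C” locale this degenerates to a plain strcmp(); the interesting part is that the same relation has to hold once real collation tables are loaded.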
Several time functions are not implemented yet. The gmtime / localtime / mktime group requires timezone information (which in turn requires me to look into the timezone database for proper support code). For asctime / ctime I need alternative access to the “C” strings in the time locale category, because both functions are defined to be locale-independent.
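The locale independence is easy to see from the algorithm the C standard gives for asctime: fixed English abbreviations and a fixed format. Below is a sketch along those lines, with the name tables hard-coded instead of taken from any locale structure. The function name sketch_asctime and the caller-supplied buffer are assumptions for illustration (the real asctime returns a pointer to static storage).

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical stand-in for asctime(): formats *tp into buf using
 * the fixed "C" locale names. Returns buf, or NULL if a field is out
 * of range or the buffer is too small (26 bytes suffice for
 * four-digit years). */
static char *sketch_asctime(const struct tm *tp, char *buf, size_t size)
{
    static const char wday_name[7][4] = {
        "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"
    };
    static const char mon_name[12][4] = {
        "Jan", "Feb", "Mar", "Apr", "May", "Jun",
        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
    };
    int written;

    if (tp->tm_wday < 0 || tp->tm_wday > 6 ||
        tp->tm_mon < 0 || tp->tm_mon > 11) {
        return NULL;
    }

    /* Same format string as in the standard's description. */
    written = snprintf(buf, size, "%.3s %.3s%3d %.2d:%.2d:%.2d %d\n",
                       wday_name[tp->tm_wday], mon_name[tp->tm_mon],
                       tp->tm_mday, tp->tm_hour, tp->tm_min, tp->tm_sec,
                       1900 + tp->tm_year);

    return (written < 0 || (size_t)written >= size) ? NULL : buf;
}
```

In the library itself the two tables would have to come from wherever the “C” time strings are stored, independent of the currently active locale.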
A request from downstream was to add FP support to my printf() implementation (which currently breaks for %f/%g et al. because it doesn't draw the accompanying value from the argument list, so every conversion after it reads the wrong argument. Not nice!).
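The stop-gap fix for the breakage is independent of actually printing floating point values: a conversion specifier has to consume its argument even if the formatter cannot render it. The toy formatter below shows the idea for just %d and %f. It is a sketch and not PDCLib's printf code; the name mini_format and the '?' placeholder are made up for illustration.

```c
#include <stdarg.h>
#include <stdio.h>

/* Toy formatter supporting only %d and %f. For %f the double is
 * consumed from the argument list and a '?' placeholder is written,
 * so that later conversions still see the right arguments.
 * Returns the number of characters written (excluding the NUL). */
static int mini_format(char *out, size_t size, const char *fmt, ...)
{
    va_list ap;
    size_t pos = 0;

    if (size == 0) {
        return 0;
    }

    va_start(ap, fmt);
    for (; *fmt != '\0' && pos + 1 < size; ++fmt) {
        if (*fmt != '%') {
            out[pos++] = *fmt;
            continue;
        }
        ++fmt;
        if (*fmt == 'd') {
            int n = snprintf(out + pos, size - pos, "%d", va_arg(ap, int));
            if (n < 0 || (size_t)n >= size - pos) {
                break; /* does not fit; stop before the number */
            }
            pos += (size_t)n;
        } else if (*fmt == 'f') {
            (void)va_arg(ap, double); /* consume, even though not printed */
            out[pos++] = '?';
        } else {
            break; /* unsupported or incomplete specifier */
        }
    }
    va_end(ap);

    out[pos] = '\0';
    return (int)pos;
}
```

Without the va_arg(ap, double) line, the %d in a format like "%f %d" would pick up part of the double instead of the integer that was actually passed.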
I got a good introduction to the Dragon4 binary-to-string conversion algorithm as well as the paper on the Grisu3 optimization (which handles most values with fixed-size integer arithmetic), but this would be another major construction site (touching <math.h> and <fenv.h> matters as well), and I feel it would be just one thing too many to tackle at this point.