====== Development Blog ======

Here I jot down thoughts, roadmaps, to-dos and other things related to [[pdclib:start | PDCLib]]. Newest entry first.

==== 2021-10-07 ====

This past month seemed a lot longer than a month. There has been productivity elsewhere, new professional challenges, and some private heartbreak. No progress on the library though. I hope this is excusable.

==== 2021-09-09 ====

The reimplementation is online. Please //do// pull the new version. I am a bit embarrassed at the poor quality of the previous attempt, and at how long it took me to actually realize it.

==== 2021-09-06 ====

Turns out my ''printf( "%a" )'' support was (is) not only inefficiently implemented, but also bugged in multiple ways. I am in the process of reimplementing the whole thing in a more robust way.

==== 2021-08-14 ====

I decided to shut down the Subversion repository (which had been the master repo until now). I still consider Subversion to be the better option for a small project like PDCLib, but I guess it is time to move on. I basically need the practice with Git, so... yea.

==== 2021-08-08 ====

Technically it is not that much, but it feels like a huge step forward -- PDCLib now supports printing floating point values using the ''%a'' conversion specifier. Why this weird specifier that is not used by anybody out there? Because it is the one that works without changing the base of the mantissa, i.e. this is the one format that avoids all the issues of the other FP conversions. And //having// the ability to print //some// kind of FP output will help immensely when debugging the other conversions.

==== 2021-07-12 ====

Back in the groove with a new employer. In pursuit of floating point support, I pulled apart the rather monolithic ''_PDCLIB_print.c'' and did some cleanups. Big integer support is mostly done, so I can try my hand at implementing the Dragon algorithm for float support in ''printf()''.

==== 2021-04-12 ====

Sorry for the long silence.
An opportunity has opened up for me employment-wise which, however, requires my full attention. I hope to return to PDCLib by mid '21.

==== 2020-10-25 ====

I was asked to add floating point support to my ''printf()'' function family. I have been looking into the Dragon4 algorithm, which seems to be "the thing to do" here. This requires some bigint support for the high-precision conversions; I have started to add functions to that end.

==== 2020-10-23 ====

Two functions in stdio() (''fread()'', ''fgetpos()'') did not handle ''ungetc()'''ed characters correctly. Fixed.

==== 2020-08-03 ====

I erased my previous work on tzcode, and started anew. This time, I kept the original mostly untouched for the initial setup (instead of trying to refactor major parts of it as I go, the way I tried in the first go). This means that, at this point, I got a lot of code in the PDCLib repo that is... well... unkempt. Also, no documentation. But the '''' functions work now, even if the testing is rudimentary. For one, the local time function tests will fail in different time zones, because I cannot set a specific time zone for the test yet... This is a dirty hack, but it gives me a base from which to refactor ''functions/_tzcode'' till it feels like a true part of PDCLib.

==== 2020-07-24 ====

The last two months had been... unsavory. I had a lot of things on my hands, and unfortunately had to drop the ball on PDCLib for some time. I've returned to the keyboard though. The current work will take some more polish before being checked in, but I am confident that I tackled the tzcode issue from the correct angle this time.

==== 2020-05-29 ====

Well, that was to be expected. In my effort to untangle the internal data flows of tzcode, I have painted myself into a corner. Nothing serious really, but something that requires a couple hours of uninterrupted focus. Which is hard to come by currently... I hope to get this done over the long weekend.
==== 2020-04-23 ====

I've made inroads on [[https://data.iana.org/time-zones/tz-link.html|tzcode]], using it as a basis and reference for reading the Olson database for time zone and leap second information, in order to (you guessed it) provide actually functional implementations of the '''' functions. As opposed to ''dlmalloc'', which I assimilated more or less unchanged into PDCLib so it serves as its ''malloc'' / ''free'' implementation, tzcode requires a bit more work. I am not adopting the code, but am more or less rewriting the functionality along its general lines. I might get into some more detail on the why and how when I am satisfied that I am on the right track; right now I am still testing the waters so to speak. Progress has slowed a bit toward the end of the holidays; both work and real life have caught up with me again. But my partner has shown a remarkable interest in what I'm doing with PDCLib, and I guess that will keep me doing it whenever time allows. 8-)

==== 2020-04-07 ====

See that strike-through text in the previous entry? I went ahead and implemented that change anyway, because I was a bit fed up that types kept being "a bit off" whenever I switched platform. For the record, I am working on-and-off on either a desktop, a netbook (both x86_64 Linux), a Raspberry Pi (ARM Linux), my mobile (ARM Android, and yes, I //am// actually working on that at times), and more recently I got a Windows laptop (for home office work) which allowed me to double-check x86_64 Cygwin, MinGW 32- and 64-bit compilations. Which had apparently stopped working some time ago because the types weren't set up correctly by ''_PDCLIB_config.h''. It was time to do an overhaul of the whole type handling. One thing that had bugged me (pun intended) for a long time was that I originally implemented the ''leastN_t'' types in terms of the exact-width ''intN_t'' types. That was bass-ackward because the latter are optional and the former aren't.
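
To illustrate the dependency direction described above, here is a minimal sketch (not the actual PDCLib headers). It assumes the ''%%__INT_LEAST16_TYPE__%%'' and ''%%__INT16_TYPE__%%'' predefines that GCC and clang provide; every ''sketch_'' / ''SKETCH_'' name is hypothetical and made up for this example. The mandatory least-width type is always defined, the optional exact-width type only if the compiler says it exists.

```c
#include <limits.h>

/* All sketch_* / SKETCH_* names are hypothetical, not PDCLib identifiers. */

/* Least-width type: mandatory, so it is defined unconditionally.
   GCC and clang predefine __INT_LEAST16_TYPE__ (assumption: one of those
   compilers); otherwise fall back to short, which the standard guarantees
   to hold at least 16 bits. */
#ifdef __INT_LEAST16_TYPE__
typedef __INT_LEAST16_TYPE__ sketch_int_least16_t;
#else
typedef short sketch_int_least16_t;
#endif

/* Exact-width type: optional, so it is defined only if the compiler
   reports that such a type exists. Nothing else depends on it. */
#ifdef __INT16_TYPE__
typedef __INT16_TYPE__ sketch_int16_t;
#define SKETCH_HAS_INT16 1
#else
#define SKETCH_HAS_INT16 0
#endif

/* Storage size in bits of the least-width type (includes padding bits,
   should the platform have any). */
unsigned sketch_least16_bits( void )
{
    return (unsigned)( sizeof( sketch_int_least16_t ) * CHAR_BIT );
}
```

Code that only needs "at least 16 bits" would use the least-width type and compile everywhere; code that really needs exactly 16 bits would have to check ''SKETCH_HAS_INT16'' first.
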
I also relied on ''_PDCLIB_config.h'' being set up "just right" instead of using compiler predefines, and of course manual setup gets it wrong from time to time. So I sat down, wrote little test programs, and ran those on //all// the platforms at my disposal to figure out what was actually required. I also made an overview of what GCC / clang provided (which is //almost// identical across platforms and compilers, but not completely). This was not only for the types mentioned above, but (because that was what I was //originally// working on) for ''clock_t'' and ''time_t'' as well. For obvious reasons, these have to fit the types used by the platform API. In the end this necessitated a complete rework of all the files affected -- ''_PDCLIB_config.h'', ''_PDCLIB_int.h'', ''stdint.h'', and ''inttypes.h''. It took a while to figure out what actually belonged where, and how the logic could work out, but I think I got it right eventually. Should any problems occur at your end due to this change, please tell me so I can adjust the screws. But right now it's 4 AM. I am happy this is checked in, but I am even more happy to go to bed now. I guess I will take a break tomorrow (today?) and enjoy a day of //real// vacation for a change.

==== 2020-04-01 ====

We had a one-hour power outage this morning... and it took me another hour to figure out I had misconfigured the server so it didn't spin up on its own after the power came back. Sorry. On the other hand, I'm chunking away at the '''' implementation, using IANA's reference implementation (which is public domain) as a guide on time zone / leap second handling. While I am at it, I'm making some changes to internal plumbing as well, reducing the amount of ''#if'' guesswork in ''_PDCLIB_config.h'' in favor of using GCC / clang predefines -- hoping to finally get PDCLib to compile properly on all my test platforms, including Cygwin, MinGW, and the occasional [[https://termux.com/|Termux]] compile on the road.
(Postponed, this turned out to be more of a change than I was willing to do on the side.)

==== 2020-03-15 ====

Between Stefan Schmidt's contribution of a ''gmtime()'' implementation I promised to review and the sorry not-even-half-implemented state of '''' I decided to make that my next "action item". Stay healthy, stay at home, meet you on the flip side.

==== 2020-03-10 ====

New priority is ''''. During some tests related to thread-local ''errno'' support (which is now implemented), I found some serious flaws in my implementation, most importantly handling of result codes (''thrd_exit()'' / ''thrd_join()'') and failure codes. This should not be too hard to fix though.

==== 2020-03-08 ====

I pushed the reworked ''freopen()'' (and flanking work) to git just now. There is a lot yet to be done (like proper setting of ''errno''), but at least this rework should stomp on various errors lurking in the old ''freopen()''.

==== 2020-03-05 ====

Sorry for keeping quiet for so long. There has been activity in the repo, I just didn't find the leisure to make a blog post. There had been fixes to the ''*scanf()'' functions, and a lot of peripheral work regarding the ''freopen()'' rewrite, which hopefully improved overall code quality. I am in the last throes of ''freopen()''; the new code is done, but I got a bug affecting ''stdin'' reopening, as used by the ''*scanf()'' test drivers, which is why I haven't committed that work yet. I got an implementation for ''gmtime()'' contributed by downstream, which I will review ASAP (but no sooner), and then I guess I'll get cracking at one of the numerous other construction sites, with an eye on getting the number of such construction sites down so this doesn't "feel" so bad anymore.

==== 2019-09-20 ====

Not much to report as I have been focussing on private matters (including a renovation project). Another downstream request was to add floating point output to my ''printf()'' implementation.
I've had a look at [[http://www.ryanjuckett.com/programming/printing-floating-point-numbers/ | this excellent presentation of the Dragon4 algorithm]], as well as [[https://www.cs.tufts.edu/~nr/cs257/archive/florian-loitsch/printf.pdf | Florian Loitsch's paper on the Grisu improvement]], and found my spontaneous reaction of "definitely not at this point in time" to be justified... But I took note of the state of the art, and will study it when the time comes...

==== 2019-07-11 ====

Coming back to my "to-do list" from early June, things are a bit ugly.

  * I realized '''' is only partially implemented. All the date functions are awaiting locale support (for figuring out time zones, and leap second handling).
  * Of Annex K, '''' and '''' are still missing. Annex K '''' faces the same issue as regular '''' -- locale support.
  * What's there of Annex K isn't really well tested. AFAIK there is only one good reference implementation (slib), and it's not easy to reference that in PDCLib tests.
  * '''' still isn't thoroughly tested. Integration of thread-local storage is missing, which makes handling of ''errno'', locale etc. non-compliant at this time.
  * ''freopen()'' still isn't fixed.
  * The elephant in the room, Unicode support to finally get going on locales, wide chars, and multibyte strings.
  * Downstream expressed some dismay that PDCLib isn't really optimized yet. I could make huge improvements to this end by providing an overlay that uses GCC / Clang builtins instead of the naïve ''memcpy'' / ''strcpy'' functions, but I am afraid to open //yet// another can of worms.
  * '''' looming on the horizon.

I feel a bit overwhelmed at the moment, as there are so many construction sites, and no easy way to reduce their number anytime soon. I guess fixing ''freopen()'' is the easiest among them, but then things get... interesting.

==== 2019-07-02 ====

My exit code -- the one handling process termination on ''exit()'' -- apparently never worked.
Streams did not get flushed and closed, buffered output got lost. This is **not** good, and I profusely apologize to all. In fixing that particular bug, I came to realize that, while ''exit()'' now does what it is intended to, a return from ''main()'' still doesn't. Apparently I never actually solved the issue. Until I do, be warned: A return from ''main()'', at this point, //does not close open streams//, or indeed call //any// of the functions registered with ''atexit()''. I have to figure out how to make this happen. Note that it is the duty of the C runtime support code -- the part that actually **calls** ''main()'' -- to call ''exit()'' with what ''main()'' returned[[https://github.com/evanphx/ulysses-libc/blob/master/src/env/__libc_start_main.c#L46|[1]]][[https://sourceware.org/git/?p=glibc.git;a=blob;f=csu/libc-start.c;h=5d9c3675fa38fcd7ee4e034c103bea4b74606366;hb=HEAD#l339|[2]]]. PDCLib does not come with C runtime support code, as that is platform specific. I should have a line or two about that in my Readme...

==== 2019-06-12 ====

I finished ''strtok_s()'', and am having a look at the remaining Annex K functions... those in '''' are easy enough, but I shudder a bit at the thought of diving into the ''*printf()'' implementation to get the bounds checking implemented...

==== 2019-06-04 ====

As a summary update, my to-do list:

  * implement ''strtok_s()'' as per C11 Annex K (due to popular demand)
  * fix ''freopen()''
  * double-check the '''' implementation
  * ...and then either the rest of Annex K, or return to the Unicode support functions for '''' et al.

==== 2019-05-20 (2) ====

Turns out the ''fseek'' issue was easily fixed, thanks to the high-quality bug report. Take that for a Monday. :-)

==== 2019-05-20 ====

Yes, I am aware of **breaking bugs** in the current PDCLib. There's something wrong with the thread implementation still, there //might// be some problems with the dlmalloc integration, and I also know of issues with ''freopen'' and ''fseek''.
It's a bit overwhelming right now -- I thought I'd be looking at a mostly functional build until a simple test program convinced me otherwise... But I am definitely working on it. Especially since that ''fseek'' issue has been brought to my attention by a group of PDCLib early adopters, which I am rather keen to support, as they have provided me with valuable feedback in the past. So... for now, you're probably better off looking at SVN revision 769 (pre 2019-04-16) if what you want is a halfway-stable, functional PDCLib (that's using a makeshift memory management and is strictly single-threaded). You'd still have to accept the problems with ''freopen'' (possible resource leaks) and ''fseek'' (probably completely broken, I'm still looking into it). So much to code, so little time...

==== 2019-04-30 ====

Adapting my '''' solution from x86_64-Linux to Raspbian Linux went surprisingly smoothly (despite the jump scare I got the first time around when I forgot to adjust the settings in ''_PDCLIB_config.h'', as you can see from the repository log...). Then I tried to adapt it to Windows / MinGW, just for the sake of giving it a try, and... oh, my. OK, there has to be some more work poured into this. (Among other things, Windows / MinGW does some things //very// differently in pthreads.h, most importantly the data structures not being data structures at all but ''typedef''ed ''void *'', so most of what I did in ''pthread_readout'' does not help -- instead it very much gets in the way.) Ah well. We're further down the road than we were a week ago, so all is good I guess. ;-)

==== 2019-04-28 ====

Back from vacation, and got around to committing the '''' implementation. I know of the following shortcomings at this point:

  * The implementation does not have test drivers yet (a.k.a. "untested").
  * '''' needs to be thread-specific storage; I am thinking about how to initialize things that way.
  * ''freopen()'' is flaky, probably broken in more than one way.
I am working on that, but wanted the rest of the code committed right away, for backup purposes if nothing else. On the upside, most of '''' (with the exception of aforementioned ''freopen()'') is thread-safe, as are the memory management functions.

==== 2019-04-18 ====

Enjoying two weeks of vacation at the North Sea, I spent quite some time relaxing at the keyboard. (Yes, this can be actually relaxing, if you go at it the right way.) I integrated dlmalloc (using default settings only for the time being), and am making some progress toward implementing ''''. That was not at all on the to-do list, since it's C11 and I claimed that as being out of scope until I got C99 covered. But as I received feedback from several adopters of PDCLib, and the subject of multithreading support popped up in almost every single one, I bowed to popular demand. The example platform will implement '''' as a wrapper for pthread, but it should be comparatively easy to come up with other adaptions. **Note that contributions supporting other mainstream APIs and / or platforms will always be welcome!** It's also simpler to implement those pthread wrapper functions than digging through the Unicode specs. ;-) Once I got the functionality nailed down, I will wade through the existing code to implement thread safety as required. (Looking at you, ''''...) I might add some C11 extensions while I am at it (''strtok_s'' was among the requested functions, and I do not see a reason not to oblige, really). So... yes. Progress is being made. ;-)

==== 2019-03-26 ====

It's been a long time since I last did anything with / for PDCLib, but I won't make excuses for it. I just could not get myself to dig into that Unicode standard again. And as I said to a fellow developer some time ago:

> A hobby should always be a CAN do, not a TO do. Have a good hard look at what each of your hobbies is giving you, and be ready to drop hobbies that drain your energy instead of recharging it.
After the ePub debacle, and due to several other (private) issues, my energy was drained. So I focussed on more enjoyable things... but I'm back. Since I //still// could not bear the thought of going full Unicode mode again, I had a look at integrating [[https://g.oswego.edu/dl/html/malloc.html|Doug Lea's ''malloc()'']], properly this time, to replace the makeshift ''malloc()'' / ''free()'' implementations PDCLib currently "offers". To do this with a minimum of changes to the ''dlmalloc()'' code (desirable because easier to maintain facing future changes), that meant I had to tackle the issue of symbol visibility (''dllexport''), which ''dlmalloc()'' supports and PDCLib doesn't (yet). That in turn meant I had to //test// the stuff, which in turn meant it was time to enable building PDCLib as a //shared// library instead of the static one it currently is. But that meant touching ''Makefile''... and that thing, while I liked its results, was not exactly a beauty to behold in an editor. So I started working toward supporting [[https://www.cmake.org|CMake]], which would bring several other benefits as well. And today I committed the first version of just that, so... Let's see if I get back on track on this. ;-)

==== 2018-10-29 ====

Quickly saving a link for later reference: [[https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html|What Every Computer Scientist Should Know About Floating-Point Arithmetic]].

==== 2018-09-24 ====

The ePub conversion was a dead end; I should have spent the time reading instead of working on "conversion to better readable format". So now I am looking at wasted time, a reading backlog, and lots of things I neglected while working on the now-abandoned conversion. *sigh*

==== 2018-08-17 ====

Since I was asked, I thought I could just as well give the answer here:

> Why are you doing '''' first? I would think floating point support would be more important.

Three reasons, really.
The first is just a minor snag -- FP I/O is locale-dependent (decimal point vs. decimal comma). The second is that, to do the FP logic //right// (instead of naïve 80-20 solutions), you need to take //lots// of platform specifics into account. This will blow up ''<_PDCLIB_config.h>'' //significantly//, and result in lots of rather ugly conditional code. Third, it is quite simply the area I have the least expertise in. I want to save the hardest part for last.

==== 2018-08-07 ====

Sometimes we find ourselves approaching new technologies from rather unexpected angles. Right now I am working on an ePub conversion of The Unicode Standard for easier reading, as PDF handles poorly on my tolino ebook reader. I would probably never have bothered with looking into the ePub format if it had not been for PDCLib... we live and learn.

==== 2018-08-05 ====

There is no way around it. Too much of the whole ctype, wctype, uchar, locale issue is pointing to Unicode all over again. And I have been cursing at getting tangled by lots of cross-references and internal dependencies, so now I made myself sit down and tackle the monster that is The Unicode Standard. From cover to cover, as there seems to be no real shortcut to "just what I need right now". So... yeah. Stay put.

==== 2018-07-27 ====

In these past two days, I learned a lot about the Unicode Collation Algorithm. Yes, I can do this, I can make this part of the PDCLib. But no, not in the immediate future. That will have to "make do" with the "C" locale.

==== 2018-07-25 ====

I have added ''_PDCLIB_load_lc_*()'' functions for all the locale categories mandated by C99, plus ''LC_MESSAGES'' which is a C99-compliant POSIX extension which is required anyway for ''strerror()'' and ''perror()'' to be locale-aware. The one thing left is ''LC_COLLATE''. Collation in the C locale is comparatively simple, but //Unicode// aware collation?
Let's just say that the [[http://www.unicode.org/reports/tr10/|corresponding Unicode document]], converted to PDF for easier offline reading, amounts to 61 pages. I will have to dig through that at some point, so why not now.

==== 2018-07-02 (2) ====

Bah. //Think first.// There already **is** a function to load contents for the various locale-data structures from file, and its name is ''setlocale()''. Also, while loading from the filesystem is rather "raw", any other mechanic will be even more "raw", and less standard (as in, ''<**std**io.h>''). So stop dithering and make ''setlocale()'' do more than ''return NULL;''. ;-)

==== 2018-07-02 ====

Looking at what I already had in '''', I decided some reworking was required. Stuffing everything into ''struct lconv'' was not the smartest idea I had, so I did split things up into separate ''struct _PDCLIB_lc_*''. I also moved the ''extern'' declaration of the actual data instances from '''' to ''<_PDCLIB_int.h>'' where they are less confusing to the casual observer. I am currently thinking in terms of ''_PDCLIB_load_lc()'' to load contents for the various locale-data structures from file. I do not like the idea of having raw filesystem access inside PDCLib, though... this needs some pondering.

==== 2018-06-29 ====

With ''get-uctypes'' (the source of which is in the repo at ''auxiliary/uctype/''), I now have a program to get character classification information (as required by '''' and, more importantly, '''') directly from data files available from [[http://unicode.org|unicode.org]]. The ''shepherd'' branch already had this functionality, but it was a) written in Python (which IMHO has no place in the source tree of a C library); b) including the raw data files which made them prone to getting outdated and required additional legalese added due to Unicode licensing; c) not giving correct results, and more importantly, not offering an easy way to test against the system library's results.
Now I have to provide a way to actually //use// the derived information in PDCLib proper.
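
For illustration, this is roughly the kind of work a tool like ''get-uctypes'' has to do at its core. It is a minimal sketch and **not** the code in ''auxiliary/uctype/''. It assumes the input is ''UnicodeData.txt'' from unicode.org, where each line is a list of semicolon-separated fields: code point (hex) first, character name second, general category (two letters, e.g. ''Lu'' for uppercase letter) third. The struct and function names are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type for this sketch, not a PDCLib structure. */
struct sketch_ucd_record
{
    unsigned long codepoint;   /* field 0, hexadecimal in the file   */
    char category[ 3 ];        /* field 2, two letters plus '\0'     */
};

/* Parse one line of UnicodeData.txt (assumed format, see above).
   Returns 0 on success, -1 if the line does not look like a record. */
int sketch_parse_ucd_line( const char * line, struct sketch_ucd_record * out )
{
    char * end;
    const char * name_end;
    const char * cat_end;

    /* Field 0: code point in hex, terminated by ';'. */
    out->codepoint = strtoul( line, &end, 16 );

    if ( end == line || *end != ';' )
    {
        return -1;
    }

    /* Field 1: character name, skipped. */
    name_end = strchr( end + 1, ';' );

    if ( name_end == NULL )
    {
        return -1;
    }

    /* Field 2: general category, must be exactly two characters. */
    cat_end = strchr( name_end + 1, ';' );

    if ( cat_end == NULL || ( cat_end - ( name_end + 1 ) ) != 2 )
    {
        return -1;
    }

    memcpy( out->category, name_end + 1, 2 );
    out->category[ 2 ] = '\0';

    return 0;
}

/* Nonzero if the record's general category is "Lu" (uppercase letter). */
int sketch_is_upper( const struct sketch_ucd_record * rec )
{
    return strcmp( rec->category, "Lu" ) == 0;
}
```

Feeding it the line for U+0041, ''%%0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;%%'', yields code point 0x41 with category ''Lu''. A real tool would additionally have to handle ranges, the case mapping fields, and the other data files, all of which this sketch ignores.
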