In which a simple “Do you know an easy way to convert an integer into a string of (character) digits?” turned into an expedition through ancient UNIX codebases. The process went something like this:
Labmate: Do you know an easy way to convert an integer into a string of digits?
< both of us type it into Google>
Looks like sprintf() is best; there must be a simple, efficient algorithm underneath.
How does printf() do it?
Let’s look in libc!
< grep through the various toolchain sources on my system>
Well, here’s the uClibc implementation… which is a terrifying mess of ambiguously named function calls and preprocessor directives.
I wish I had some old UNIX sources to look at, that would be simple.
< a bit of googling later>
Holy crap, this is amazing! Full root images for a bunch of early UNIXes, many of them from dmr himself!
< download v5, grep /usr/source judiciously to find /usr/source/s4/printf.s>
Well crap, it’s in PDP-11 assembly; maybe a later version is in C.
< download v7, grep /usr/src judiciously to find /usr/src/libc/stdio/doprnt.s>
Damn, still in PDP-11 assembly, but this is a fancier algorithm.
Hmm… the most understandable UNIX I’ve ever looked at was old MINIX…
< spin up BasiliskII, in ‘030 mode, use this workaround to make MacMinix run>
< more grepping for justice>
Eventually, we found the printk() from MacMinix 1.5, in all its awful K&R/ANSI transitional C glory:
...
#define NO_FLOAT
#ifdef NO_FLOAT
#define MAXDIG 11 /*32 bits in radix 8*/
#else
#define MAXDIG 128 /* this must be enough */
#endif
_PROTOTYPE(void putc, (int ch)); /*user-supplied, should be putk */
PRIVATE _PROTOTYPE( char *_itoa, (char *p, unsigned num, int radix));
#ifndef NO_LONGD
PRIVATE _PROTOTYPE( char *_ltoa, (char *p, unsigned long num, int radix));
#endif
PRIVATE char *_itoa(p, num, radix)
register char *p;
register unsigned num;
register radix;
{
  register i;
  register char *q;

  q = p + MAXDIG;
  do {
        i = (int) (num % radix);
        i += '0';
        if (i > '9') i += 'A' - '0' - 10;
        *--q = i;
  } while (num = num / radix);
  i = p + MAXDIG - q;
  do
        *p++ = *q++;
  while (--i);
  return(p);
}
...
Which is, of course, a digit at a time, in one of the most straightforward ways imaginable. MINIX’s kernel is designed for systems so resource-constrained that it has a separate prints() which handles only char and char* just to save overhead, so I can’t imagine it would use a sub-optimal technique here.
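(For the record, the “easy way” the conversation started with really is just sprintf()/snprintf() with %d.) Below is a minimal sketch of the same digit-at-a-time technique in modern C, purely a restatement for illustration, not the MINIX or libc code; the function name utoa_radix, the buffer sizes, and the demo values are all mine. It peels off num % radix as the low-order digit, divides, and then copies the digits back out in the right order, exactly like _itoa above.

#include <stdio.h>

/* Sketch only: the same digit-at-a-time idea as _itoa above, restated in modern C.
 * Writes the digits of num in the given radix (2..16) into buf, NUL-terminated,
 * and returns buf. buf must be big enough for the digits plus the NUL
 * (33 bytes covers the radix-2 worst case of a 32-bit value). */
static char *utoa_radix(char *buf, unsigned num, unsigned radix)
{
    char tmp[33];                 /* enough for 32 binary digits */
    char *q = tmp + sizeof tmp;   /* build the digits backwards from the end */
    char *p = buf;

    do {
        unsigned d = num % radix;                        /* peel off low digit */
        *--q = (char)(d < 10 ? '0' + d : 'A' + d - 10);
    } while ((num /= radix) != 0);

    while (q < tmp + sizeof tmp)  /* copy digits forward into caller's buffer */
        *p++ = *q++;
    *p = '\0';
    return buf;
}

int main(void)
{
    char lib[16], dec[16], hex[16];

    /* The "easy way" from the original question: let the library do it. */
    snprintf(lib, sizeof lib, "%d", 31337);

    /* The same value converted by hand, in decimal and hex. */
    printf("%s %s %s\n", lib, utoa_radix(dec, 31337, 10),
           utoa_radix(hex, 31337, 16));   /* prints: 31337 31337 7A69 */

    return 0;
}

The only real differences from the MINIX routine are that this version NUL-terminates its result and sizes its scratch buffer for the binary worst case instead of relying on MAXDIG; the core loop is the same remainder-then-divide dance.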
This kind of thing really makes me wish I had learned OSes the old way, with a death march through the UNIX sources (or at least the old Tanenbaum book with MINIX); things are too complicated and opaque now. It always seems to me that there was a golden age from around 1970 into the early 1990s, when the modern computing abstractions were in place but the complexity of production hardware and software hadn’t yet grown out of control.
On a related note, as a tool writer, I find looking at the earliest versions of cc AMAZING. Most decent programmers should be able to work through the cc from v5 UNIX (574 lines of C for cc proper, plus another 6454 lines of C and 1606 lines of PDP-11 assembly in the parts it calls) in a couple of hours and come away with a fairly complete understanding of how it all works. Sadly, (pre-3) MINIX came with a binary-only cc built with ACK, which is fancy and portable and way, way, way harder to understand. dmr’s simple genius was just that: simple.