Category Archives: Computers

Bookish Dreaming

I only remember a dream every year or so, but I realized on the way back from getting breakfast with friends this morning that a book I thought I’d been reading was entirely in a dream. It was a long dream with various dream-space fucked-up-ness to the setting building (No University space is that large, that nice… or has three large highly styled cafe/lounge spaces in the same complex) and interaction with various old acquaintances, but there was one section I didn’t realize was a dream because it was so normal:
< dream content >
I picked up a book (roughly A4 sized, an inch or so thick, nicely black cloth-bound with gold embossing) about code generation for a particular class of exotic hybrid-SIMD machines (I remember details, which are realistic, but not specific enough to pick out which machine) by David Padua (respected figure in parallel computing, who I’ve met at conferences) and a coauthor I couldn’t remember when I woke up. I got the book from a well stocked engineering library, and discussed it with various engineering types I know, including my current adviser.
< /dream content >
Until we were headed back from breakfast and I realized the setting was “improbable,” I was sure it had happened. When I got home I had to see if it was something I may have seen referenced – the content and authors were probably based on “Optimizing data permutations for SIMD devices,” which I read a year or two ago, but it isn’t an exact match. The description I remember also matches a section in the Encyclopedia of Parallel Computing (four volumes, $1500), a book that I’ve never seen before (and now want access to). I also want the dream book, because it would be all kinds of useful for my MS project.
Aren’t brains interesting…

Posted in Computers, General, Navel Gazing, Objects, School

SC’11 Lessons

I learned some really interesting things at SC this year, and now that I’ve had a day to process, I want to share. Many of these observations come from first or second hand conversations, or justifiable interpretations of press releases, so I don’t promise they are correct, but they are plausible, explanatory, and interesting. I apologize for the 1,000 word wall of text, but there is a lot of good stuff.

  • This is the big one: I’m pretty sure I understand the current long term architecture plan being pursued by Intel, AMD, and Nvidia. This plan signals the end of the current style of monolithic symmetric processor cores.
    They are all apparently pursuing designs with a small N of large integer units, coupled to M >> N SIMD engines.

    • Nvidia’s “Project Denver” is a successor/big sibling to the Tegra design, and appears to be the beginning of a line with 2-8 64-bit (probably) ARM cores tightly integrated with a big honking GPU-like SIMD structure for FP. The stale press release about this stuff is kind of nauseating to read, but it looks like they’re betting the farm on that design.
    • Intel’s HPC efforts are going to be based on a lot of MIC (Many Integrated Core, successor to the Larrabee stuff) parts coupled with a few big cores like the current Xeons. The MIC chips are basically large numbers of super-Atoms: tiny, simple, dumb integer units attached to big SWAR (SIMD Within a Register) units focused on SSE/AVX performance. This is less speculative than most observations; they made a pretty good press push (This for example) on this idea.
      The ring interconnects and higher per-“thread” hardware complexity are probably not a good idea in the long run (IMHO), but having an integer unit for every big SWAR engine will be a major advantage in terms of programming environment and code generation. I suspect the more cautious approach is because Intel doesn’t want/can’t afford another Itanic, where the tools couldn’t generate good code for the programming model on their intended high-end part.
    • AMD’s two current products are stepping stones to a design similar to Nvidia’s – Bulldozer is a design with some ridiculously powerful x86-64 integer units decoupled from a smaller number of shared FPUs. The APU (I haven’t heard the “Fusion” name in a while) designs are CPUs tightly coupled to GPU structures. The successor parts will be a hybrid of the two – a few big, Bulldozer-style integer units, with a large number of wide next-gen GPU SIMD structures coupled to them.

    I think this is generally a good design direction, particularly with current directions in computing in mind, but it is going to make the compiler/concurrent programming world exciting for a while.

  • AMD appears to be gearing up to abandon a fifth generation of GPGPU products. CTM, CAL, Brook+, and OpenCL on 4000 series cards have all been deprecated while still shipping, and indications are that OpenCL (and general driver) support for the current architecture (4-wide VLIW SIMDs, like in the 5- and 6- series) has been relegated to second-class citizen status, while they work on a next generation architecture. The rumor is the next gen parts will be 4 independent banks of SIMD engines instead of 4-wide VLIW SIMD engines, which should be both nicer to program and generate code for and more similar to Nvidia.
  • Nvidia is going to open source their CUDA environment. One of the primary objections to CUDA in a lot of circles is reluctance to use a proprietary single-vendor programming environment (people who have been in super/scientific computing for long have all been burnt on that in the past), and the Integer+SIMD model is going to require that not be an issue. This is assembled from information from several places, including PGI, Nvidia, and various scientific compute facilities, much of it second hand or further, but it would make sense.
  • I still don’t exactly know what went down at Infiscale, but the impression that the Perceus community was abandoned by the company, the developers fled, and it was a bad scene seems to be correct. No one I know that was there seems to be talking, but they’re all on their way to other interesting things, especially Greg Kurtzer’s Warewulf3 project at LBL.
  • The dedicated high performance compute nodes in Amazon’s EC2 cloud are actually connected as a few large partitionable clusters, users just can’t (nominally, don’t need to) see and instrument the topology like they could with a normal cluster. This is from interpreting press releases, because the people manning Amazon’s booth really didn’t want to chat (and, in fact, were kind of dicks when we tried). This explains how they’ve been getting performance out of a loosely coupled cloud — which is to say they aren’t, they just have a huge cluster attached to their cloud that shares the interface.
  • The current hard drive production problems have given SSDs the opportunity they need to become first class citizens. Talking to OEMs, the wholesale cost per capacity on HDDs almost tripled, and the supply lines aren’t all that stable, so everyone is scrambling to make things work with mostly SSDs. I saw a lot of interesting new form factors for SSDs, and several flavors of flash- or battery-backed “nonvolatile” DRAM floating about as well, so the nature of storing data-sets is changing.
  • I saw motherboards with 32 DIMM slots (mostly AMD Interlagos based) on the floor. I saw 32GB DIMMs on the floor. I saw some shared-memory systems with multiple Terabytes of RAM in them. The standard for high memory machines has roughly quadrupled in the last year or two.
  • The number of women (not booth babes, real technical people, especially younger ones) and educators on the show floor this year was way higher than in the past. This is very good for the field.

I think that covers most of the really good stuff coming off the floor this year, although I am still processing and may come up with some other insights when I’ve had more sleep and discussion.
Also, Pictures! WOO! (Still sorting and uploading the last batch at time of posting).

Posted in Announcements, Computers, Electronics, General

Stop SOPA

Internet censorship and overbearing laws to maintain the entertainment industry’s business model are always bad, but SOPA is the worst. Not that it is entirely technically implementable, but if passed it will genuinely break the internet, and ruin lives in the process. I have the support logo blackbar up for the day, and after that, check for details at http://americancensorship.org/

Posted in Computers, General

Headed out to SC11 in Seattle, WA for the week. Technical interest, travel complaints, booth hacks, advertising mockery, and schwag to follow.

Posted by pappp

Source Scroller

I’ve been working on a side project to make a source display widget for the aggregate.org SC’11 exhibit, and it is surprisingly problematic. The goals for the program are as follows:

  • Take a directory of source files as input
  • Automatically perform appropriate syntax highlighting on the source files.
  • Display (On a dark background – intended use is a floating projection display) all the source files consecutively.
  • Scroll automatically through the output in a pleasing, readable way.
  • Be capable of repeating this action unattended indefinitely.

Continue reading

Posted in Computers, DIY, General, School

A Day in the Life of the KAOS Lab Thought Process

In which a simple “Do you know an easy way to convert an integer into a string of (character) digits?” turned into an expedition into ancient UNIX codebases. The process went something like this:

Labmate: Do you know an easy way to convert an integer into a string of digits?
< both of us type it in to google>
Looks like sprintf() is best, there must be a simple efficient algorithm.
How does printf() do it?
Let’s look in libc!
< grep through the various toolchain sources on my system >
Well, here’s the uClibc implementation… which is a terrifying mess of ambiguously named function calls and preprocessor directives.
I wish I had some old UNIX sources to look at, that would be simple.
< a bit of googling later>
Holy crap, this is amazing! Full root images for a bunch of early UNIXes, many of them from dmr himself!
< download v5, grep /usr/source judiciously to find /usr/source/s4/printf.s>
Well crap, it’s in PDP-11 assembly, maybe a later version.
< download v7, grep /usr/src judiciously to find /usr/src/libc/stdio/doprnt.s>
Damn, still in PDP-11 assembly, but this is a fancier algorithm.
Hmm… the most understandable UNIX I’ve ever looked at was old MINIX
< spin up BasiliskII, in ‘030 mode, use this workaround to make MacMinix run>
< more grepping for justice>
Eventually, we found the printk() from MacMinix 1.5, in all its awful K&R/ANSI transitional C glory

...
#define NO_FLOAT

#ifdef NO_FLOAT
#define MAXDIG 11 /*32 bits in radix 8*/
#else
#define MAXDIG 128 /* this must be enough */
#endif

_PROTOTYPE(void putc, (int ch)); /*user-supplied, should be putk */

PRIVATE _PROTOTYPE( char *_itoa, (char *p, unsigned num, int radix));
#ifndef NO_LONGD
PRIVATE _PROTOTYPE( char *_itoa, (char *p, unsigned long num, int radix));
#endif

PRIVATE char *_itoa(p, num, radix)
register char *p;
register unsigned num;
register radix;
{
  register i;
  register char *q;

  q = p + MAXDIG;
  do {
	i = (int) (num % radix);
	i += '0';
	if (i > '9') i += 'A' - '0' - 10;
	*--q = i;
  } while (num = num / radix);
  i = p + MAXDIG - q;
  do
	*p++ = *q++;
  while(--i);
  return(p);
}
...

Which is, of course, a digit at a time in one of the most straightforward ways imaginable. Minix’s kernel is designed for systems so resource constrained it has a separate prints() that can only handle char and char* to save overhead, so I can’t imagine it uses a sub-optimal technique.
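For comparison, here is a minimal modern-C sketch of the same digit-at-a-time technique: take the remainder by the radix, store the digit from the back of a buffer, divide, and repeat until nothing is left. It is not the MINIX code. The name utoa_radix, the uint32_t argument, the lookup table of digits, and the separate scratch buffer are all choices made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* 32 binary digits is the worst case for a 32-bit value, plus the NUL. */
#define MAXDIG 33

/* Write num in the given radix (2..36) into out as a NUL-terminated string.
 * out must have room for at least MAXDIG bytes. Returns out.
 * (Hypothetical helper for illustration, not part of any libc.) */
char *utoa_radix(uint32_t num, unsigned radix, char *out)
{
    static const char digits[] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    char tmp[MAXDIG];
    char *q = tmp + sizeof tmp;      /* fill the scratch buffer from the end */

    assert(radix >= 2 && radix <= 36);

    *--q = '\0';
    do {
        *--q = digits[num % radix];  /* peel off the least significant digit */
        num /= radix;
    } while (num != 0);              /* the do/while makes 0 print as "0" */

    /* q now points at the most significant digit; copy digits and NUL out. */
    memcpy(out, q, (size_t)(tmp + sizeof tmp - q));
    return out;
}
```

Calling utoa_radix(255, 16, buf) with a MAXDIG-byte buf leaves "FF" in buf. The structure is the same as the printk() helper above: one division and one remainder per digit, generated backwards and then copied forwards.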
This kind of thing really makes me wish I had learned OSes in the old death-march through the UNIX sources (or at least the old Tanenbaum book with MINIX) way; things are too complicated and opaque now. There always seems to me to have been a golden age from around 1970 into the early 1990s where the modern computing abstractions were in place, but the complexity of production hardware and software hadn’t yet grown out of control.
On a related note, as a tool writer, looking at the earliest versions of cc is AMAZING. Most decent programmers should be able to work through the cc from v5 UNIX (574 lines of C for cc proper, 6454 more lines of C and 1606 of PDP-11 assembly in called parts) in a couple hours, and fairly fully understand how it all works. Sadly, (pre-3) MINIX came with a (binary only) CC made with ACK, which is fancy and portable and way, way, way harder to understand. dmr’s simple genius was just that.

Posted in Computers, Entertainment, General, School

RIP Dennis Ritchie

It’s been a bad week for computing pioneers. Steve Jobs on October 5th, and, more quietly, Dennis Ritchie died October 9th. We’re hearing all about Steve Jobs in the news, because he was a salesman and a showboat. We’re not hearing about Dennis Ritchie because he did his very best to avoid attention while he did interesting things.
He developed the C programming language, in which virtually all low level programming has been done for the last twenty some years, and from which most widely used modern languages inherit their syntactic structure. He was a major player in the development of UNIX, an operating system which has become so universal that both the vast majority of smartphones and the vast majority of supercomputers run one of its derivatives or descendants.
His contributions are so fundamental that they shape the nomenclature and notation we use to discuss computing, and in essence created the world I live in.

I have a copy of K&R in the home directory of all my computers, and always hoped to meet him. His wisdom and knowledge will be sorely missed, but he long ago discovered the secret to immortality: he didn’t just make things, he made things that make things, and as such he will live on through the tools he designed – tools so elegant that we’ll all, mostly unknowingly, be using them every day on every computer for as long as computers remain recognizable.
exit 0;

Posted in Computers, General

Arduino Promotion Behavior

I was pulling up an old project (A little Simon game which has apparently fallen off the ‘net when I moved my site- will have to repost) to use as a classroom demonstration, and discovered that sometime in the last couple years, the int->unsigned long promotion in Arduino/Processing broke/changed without comment.

The code uses primitive nested counters and delays to generate sounds and control difficulty, which means rather large numbers (100-microsecond scale delays running for seconds) are being thrown about. To isolate the change, I wrote the following test:
Continue reading
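The test itself is behind the link and is not reproduced here. As background only, this is a minimal sketch of the general hazard when int values feed an unsigned long on a target where int is 16 bits wide, as it is on the 8-bit AVR Arduino boards. It is not the test from the post and may not be the exact change described. The function names are made up for illustration, uint32_t stands in for the 32-bit unsigned long on AVR, and the wraparound is modelled explicitly because signed overflow is undefined behaviour in C and only wraps in practice.

```c
#include <stdint.h>

/* Model "unsigned long t = a * b;" on a target with a 16-bit int:
 * the product is formed at int width, and only then widened.
 * (Hypothetical helper; the wrap is emulated so this runs anywhere.) */
uint32_t product_16bit_int(int16_t a, int16_t b)
{
    /* Exact product, then keep only the low 16 bits as a 16-bit int would. */
    uint16_t low = (uint16_t)((int32_t)a * (int32_t)b);

    /* Reinterpret those 16 bits as a two's complement signed value. */
    int32_t narrow = (low & 0x8000u) ? (int32_t)low - 65536 : (int32_t)low;

    /* Widening a negative int to an unsigned 32-bit type wraps modulo 2^32. */
    return (uint32_t)narrow;
}

/* Same product with the operands widened before multiplying, which is
 * what a cast or a UL suffix on one operand achieves.
 * Intended for non-negative inputs. */
uint32_t product_widened_first(int16_t a, int16_t b)
{
    return (uint32_t)a * (uint32_t)b;
}
```

With a = 100 and b = 1000 (a 100-microsecond delay repeated a thousand times, say), product_widened_first gives 100000, while product_16bit_int gives 4294936224, because the 16-bit intermediate wraps to -31072 before it is widened. Any change in where a toolchain or preprocessor does the widening changes which of the two results a sketch sees.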

Posted in Computers, DIY, Electronics, General, Objects

X11


Dear AMD,
Randall Munroe has correctly pointed out that you are adversely affecting my quality of life. Please fix your shit.
Hate,
PAPPP

Posted in Computers, DIY, Entertainment, General

Power Factor Corrector

We’ve been tracking down the failure mode of power supplies in the clusters on campus, and picked up a plug-in “Power Saver” power factor corrector box for around $5 to look at in our experiments. Prices on these things range from about $5 (less than the cost of the components in small quantities… and in a nice wallwart case – this is what we paid) to over $70 (fleecing the morons).
This particular device is a “PowerSaver PowerStar CHT” (or some similar random string, the model is “CHT-001A”), about which a variety of bemusingly improbable claims are made.


Upon opening it up, it contains a large (5uF,450V) capacitor, two LEDs, three quarter-watt resistors, and a single-sided PCB with “Comment” all over the silkscreen where fields were not filled in. As best I can make out, the resistors are a very high impedance voltage divider to step down the 120V/60Hz from the wall to a level the LEDs can handle, and the LEDs are acting as their own rectifier. The capacitor is at least connected across the outlet. Curiously, only about half the board is populated, and I can’t figure out what some of the missing components would do – they appear to be a second independent indicator LED circuit rotated 90 degrees from the populated one.
Under a very limited set of circumstances (linear inductive load, like an AC motor) these things could actually help the power factor, but since most residential customers are billed for real energy (kWh), with no power factor penalty, it wouldn’t actually reduce bills. In the case of non-linear loads, like say, computer power supplies, it will do nothing useful. Experiment complete, although it may still be plugged in while instrumented for funsises.
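To put a number on how much correction a box like this can offer, here is a back-of-the-envelope sketch using the 5uF capacitor and the 120V/60Hz line from above. It assumes an ideal capacitor and a clean sinusoidal supply, and the function names are made up for illustration. The reactive power a capacitor supplies is Q = V² · 2πfC, and the current it draws by itself is I = V · 2πfC.

```c
/* Reactive power in VAR supplied by an ideal capacitor of c_farads
 * connected across an AC line of v_rms volts at f_hz hertz.
 * Q = V^2 * 2*pi*f*C  (hypothetical helper for illustration) */
double capacitor_var(double v_rms, double f_hz, double c_farads)
{
    const double pi = 3.14159265358979323846;
    return v_rms * v_rms * 2.0 * pi * f_hz * c_farads;
}

/* RMS current in amps drawn by the same capacitor with no load attached.
 * I = V * 2*pi*f*C */
double capacitor_amps(double v_rms, double f_hz, double c_farads)
{
    const double pi = 3.14159265358979323846;
    return v_rms * 2.0 * pi * f_hz * c_farads;
}
```

capacitor_var(120.0, 60.0, 5e-6) comes out at roughly 27 VAR, and capacitor_amps(120.0, 60.0, 5e-6) at roughly 0.23 A. So at best the box cancels about 27 VAR of lagging reactive power from an inductive load. With no such load running, it just adds about a quarter of an amp of leading reactive current to the circuit.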

Posted in Computers, DIY, Electronics, Entertainment, General, Objects, School