Monthly Archives: November 2011

Bookish Dreaming

I only remember a dream every year or so, but I realized on the way back from getting breakfast with friends this morning that a book I thought I’d been reading was entirely in a dream. It was a long dream with various dream-space fucked-up-ness to the setting (no university building is that large, that nice… or has three large, highly styled cafe/lounge spaces in the same complex) and interaction with various old acquaintances, but there was one section I didn’t realize was a dream because it was so normal:
< dream content >
I picked up a book (roughly A4 sized, an inch or so thick, nicely black cloth-bound with gold embossing) about code generation for a particular class of exotic hybrid-SIMD machines (I remember details, which are realistic, but not specific enough to pick out which machine) by David Padua (respected figure in parallel computing, who I’ve met at conferences) and a coauthor I couldn’t remember when I woke up. I got the book from a well-stocked engineering library, and discussed it with various engineering types I know, including my current adviser.
< /dream content >
Until we were headed back from breakfast and I realized the setting was “improbable,” I was sure it had happened. When I got home I had to see if it was something I may have seen referenced – the content and authors were probably based on “Optimizing data permutations for SIMD devices,” which I read a year or two ago, but it isn’t an exact match. The description I remember also matches a section in the Encyclopedia of Parallel Computing (four volumes, $1500), a book that I’ve never seen before (and now want access to). I also want the dream book, because it would be all kinds of useful for my MS project.
Aren’t brains interesting…

Posted in Computers, General, Navel Gazing, Objects, School

Beer Reduction Sauce

While travelling I had a meal at a Gordon Biersch (chain brewery restaurant) with what they call “Märzen sauce,” which was pretty good, and I wanted to check my guess for how it was made. Nothing else particularly appealed while I was shopping for dinner ingredients tonight, so I decided to try to clone it with the wheat ale I had in the fridge.

Pork in a thickened beer reduction with mushrooms, with asparagus and rice.

I was pretty sure it was made with the following process: cook meat in oil with salt and pepper (and garlic powder?) -> remove meat -> add flour to whatever is left in the pan -> cook until dark -> deglaze with beer -> add sliced mushrooms (and garlic?) -> cook down -> plate meat with pan sauce. I decided to try it. I went with real garlic later in the process, which I suspect was wrong, and I would have to test again to see if I got the ordering right in one place (the flour and mushroom steps may be swapped), but it seems to be right otherwise, and, more importantly, came out tasty.

Posted in FoodBlogging

SC’11 Lessons

I learned some really interesting things at SC this year, and now that I’ve had a day to process, I want to share. Many of these observations come from first or second hand conversations, or justifiable interpretations of press releases, so I don’t promise they are correct, but they are plausible, explanatory, and interesting. I apologize for the 1,000 word wall of text, but there is a lot of good stuff.

  • This is the big one: I’m pretty sure I understand the current long term architecture plan being pursued by Intel, AMD, and Nvidia. This plan signals the end of the current style of monolithic symmetric processor cores.
    They are all apparently pursuing designs with a small N of large integer units, coupled to M >> N SIMD engines.

    • Nvidia’s “Project Denver” is a successor/big sibling to the Tegra design, and appears to be the beginning of a line with 2-8 64-bit (probably) ARM cores tightly integrated with a big honking GPU-like SIMD structure for FP. The stale press release about this stuff is kind of nauseating to read, but it looks like they’re betting the farm on that design.
    • Intel’s HPC efforts are going to be based on a lot of MIC (Many Integrated Core, successor to the Larrabee stuff) parts coupled with a few big cores like the current Xeons. The MIC chips are basically large numbers of super-Atoms: tiny, simple, dumb integer units attached to big SWAR (SIMD Within a Register) units focused on SSE/AVX performance. This is less speculative than most observations; they made a pretty good press push (This for example) on this idea.
      The ring interconnects and higher per-“thread” hardware complexity are probably not a good idea in the long run (IMHO), but having an integer unit for every big SWAR engine will be a major advantage in terms of programming environment and code generation. I suspect the more cautious approach is because Intel doesn’t want/can’t afford another Itanic, where the tools couldn’t generate good code for the programming model on their intended high-end part.
    • AMD’s two current products are stepping stones to a design similar to Nvidia’s – Bulldozer is a design with some ridiculously powerful x86-64 integer units decoupled from a smaller number of shared FPUs. The APU (I haven’t heard the “Fusion” name in a while) designs are CPUs tightly coupled to GPU structures. The successor parts will be a hybrid of the two – a few big, Bulldozer-style integer units, with a large number of wide next-gen GPU SIMD structures coupled to them.

    I think this is generally a good design direction, particularly with current directions in computing in mind, but it is going to make the compiler/concurrent programming world exciting for a while.

  • AMD appears to be gearing up to abandon a fifth generation of GPGPU products. CTM, CAL, Brook+, OpenCL on 4000 series cards have all been deprecated while still shipping, and indications are that OpenCL (and general driver) support for the current architecture (4-wide VLIW SIMDs, like in the 5- and 6- series) has been relegated to second-class citizen status, while they work on a next generation architecture. The rumor is the next gen parts will be 4 independent banks of SIMD engines instead of 4-wide VLIW SIMD engines, which should be both nicer to program and generate code for, and more similar to Nvidia.
  • Nvidia is going to open source their CUDA environment. One of the primary objections to CUDA in a lot of circles is reluctance to use a proprietary single-vendor programming environment (people who have been in super/scientific computing for long have all been burnt on that in the past), and the Integer+SIMD model is going to require that not be an issue. This is assembled from information from several places, including PGI, Nvidia, and various scientific compute facilities, much of it second hand or further, but it would make sense.
  • I still don’t exactly know what went down at Infiscale, but the impression that the Perceus community was abandoned by the company, the developers fled, and it was a bad scene seems to be correct. No one I know that was there seems to be talking, but they’re all on their way to other interesting things, especially Greg Kurtzer’s Warewulf3 project at LBL.
  • The dedicated high performance compute nodes in Amazon’s EC2 cloud are actually connected as a few large partitionable clusters, users just can’t (nominally, don’t need to) see and instrument the topology like they could with a normal cluster. This is from interpreting press releases, because the people manning Amazon’s booth really didn’t want to chat (and, in fact, were kind of dicks when we tried). This explains how they’ve been getting performance out of a loosely coupled cloud — which is to say they aren’t, they just have a huge cluster attached to their cloud that shares the interface.
  • The current hard drive production problems have given SSDs the opportunity they need to become first class citizens. Talking to OEMs, the wholesale cost per capacity on HDDs almost tripled, and the supply lines aren’t all that stable, so everyone is scrambling to make things work with mostly SSDs. I saw a lot of interesting new form factors for SSDs, and several flavors of flash- or battery-backed “nonvolatile” DRAM floating about as well, so the nature of storing data-sets is changing.
  • I saw motherboards with 32 DIMM slots (mostly AMD Interlagos based) on the floor. I saw 32GB DIMMs on the floor. I saw some shared-memory systems with multiple Terabytes of RAM in them. The standard for high memory machines has roughly quadrupled in the last year or two.
  • The number of women (not booth babes, real technical people, especially younger ones) and educators on the show floor this year was way higher than in the past. This is very good for the field.

I think that covers most of the really good stuff coming off the floor this year, although I am still processing and may come up with some other insights when I’ve had more sleep and discussion.
Also, Pictures! WOO! (Still sorting and uploading the last batch at time of posting).

Posted in Announcements, Computers, Electronics, General

Stop SOPA

Internet censorship and overbearing laws to maintain the entertainment industry’s business model are always bad, but SOPA is the worst. Not that it is entirely technically implementable, but if passed it will genuinely break the internet, and ruin lives in the process. I have the support logo blackbar up for the day, and after that, check for details at http://americancensorship.org/

Posted in Computers, General

Headed out to SC11 in Seattle, WA for the week. Technical interest, travel complaints, booth hacks, advertising mockery, and schwag to follow.

Posted by pappp

Source Scroller

I’ve been working on a side project to make a source display widget for the aggregate.org SC’11 exhibit, and it is surprisingly problematic. The goals for the program are as follows:

  • Take a directory of source files as input
  • Automatically perform appropriate syntax highlighting on the source files.
  • Display (On a dark background – intended use is a floating projection display) all the source files consecutively.
  • Scroll automatically through the output in a pleasing, readable way.
  • Be capable of repeating this action unattended indefinitely.
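As a rough illustration of the goals above, here is a minimal Python sketch of the non-graphical parts: gather the files, glue them together, and loop over them forever. This is not the actual widget; the function names `collect_lines` and `scroll_frames`, the extension list, and the `====` file banner are all made up for the example, and syntax highlighting and the dark-background display are left out entirely.

```python
import itertools
import os


def collect_lines(src_dir, extensions=(".c", ".h", ".py")):
    """Concatenate every matching source file under src_dir into one list of lines.

    The extension filter and the banner format are assumptions for this sketch.
    """
    lines = []
    for root, dirs, files in os.walk(src_dir):
        dirs.sort()  # sort in place so the traversal order is deterministic
        for name in sorted(files):
            if not name.endswith(extensions):
                continue
            path = os.path.join(root, name)
            # Banner line so viewers can tell where one file ends and the next begins
            lines.append(f"==== {os.path.relpath(path, src_dir)} ====")
            with open(path, encoding="utf-8", errors="replace") as f:
                lines.extend(line.rstrip("\n") for line in f)
            lines.append("")  # blank separator between files
    return lines


def scroll_frames(lines, height):
    """Yield successive height-line windows forever, wrapping around at the end."""
    if not lines:
        return  # nothing to display, so produce no frames
    n = len(lines)
    for top in itertools.count():
        # Advance one line per frame; the modulo makes the loop seamless
        yield [lines[(top + i) % n] for i in range(height)]
```

To drive it in a terminal, something like looping over `scroll_frames(collect_lines("src"), 40)`, clearing the screen, printing each frame, and sleeping a fraction of a second between frames would do. Because the frames come from an endless generator, the unattended-repeat goal falls out for free; the scroll speed and window height are the knobs to tune for readability.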


Posted in Computers, DIY, General, School