Daily Archives: 2024-06-25

Researchers upend AI status quo by eliminating matrix multiplication in LLMs

Source: Ars Technica

Article note: It isn't really matrix-math-free; it's just matrices of ternaries. That said, I'm a fan of small-range number systems for sloppy approximators: (-1, 0, 1) ternaries map well to LLMs, and not using huge, expensive, power-hungry monstrosities for dumb bullshit is a win for everyone.
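
To make the ternary point concrete, here's a minimal sketch (my own illustration, not the paper's actual kernel) of why weights restricted to (-1, 0, 1) eliminate multiplication: each weight just selects add, subtract, or skip. The `ternary_dot` function is hypothetical.

```python
# Minimal sketch (not the paper's exact method): with weights constrained to
# {-1, 0, +1}, a dot product needs no multiplies -- each weight just selects
# add, subtract, or skip for the corresponding activation.
def ternary_dot(activations, weights):
    """Dot product where every weight is -1, 0, or +1."""
    acc = 0.0
    for a, w in zip(activations, weights):
        if w == 1:
            acc += a      # +1: accumulate the activation
        elif w == -1:
            acc -= a      # -1: subtract it
        # 0: skip entirely -- no work at all
    return acc

print(ternary_dot([0.5, -2.0, 3.0], [1, 0, -1]))  # 0.5 - 3.0 = -2.5
```

Hardware that only has to add and flip signs is exactly the sort of thing a cheap FPGA or ASIC can do wide and fast.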

Researchers claim to have developed a new way to run AI language models more efficiently by eliminating matrix multiplication from the process. This fundamentally redesigns neural network operations that are currently accelerated by GPU chips. The findings, detailed in a recent preprint paper from researchers at the University of California Santa Cruz, UC Davis, LuxiTech, and Soochow University, could have deep implications for the environmental impact and operational costs of AI systems.

Matrix multiplication (often abbreviated to "MatMul") is at the center of most neural network computational tasks today, and GPUs are particularly good at executing the math quickly because they can perform large numbers of multiplication operations in parallel. That ability momentarily made Nvidia the most valuable company in the world last week; the company currently holds an estimated 98 percent market share for data center GPUs, which are commonly used to power AI systems like ChatGPT and Google Gemini.
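
For a sense of scale, a single dense layer's forward pass is one matrix-vector multiply; the sizes below are arbitrary, but they show how the multiply count grows quadratically with layer width, which is exactly the embarrassingly parallel workload GPUs are built for.

```python
import numpy as np

# One dense layer for one token is a single MatMul. At a (made-up) width of
# 4096, that's 4096 * 4096 ~= 16.8M multiply-accumulates -- per layer, per
# token -- which is why GPU multiply throughput dominates LLM inference.
d = 4096
x = np.random.randn(d).astype(np.float32)     # activations for one token
W = np.random.randn(d, d).astype(np.float32)  # layer weight matrix
y = W @ x                                     # the MatMul everything hinges on
print(y.shape, f"{d * d:,} multiplies in this one layer")
```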

In the new paper, titled "Scalable MatMul-free Language Modeling," the researchers describe creating a custom 2.7 billion parameter model without using MatMul that features similar performance to conventional large language models (LLMs). They also demonstrate running a 1.3 billion parameter model at 23.8 tokens per second on a GPU that was accelerated by a custom-programmed FPGA chip that uses about 13 watts of power (not counting the GPU's power draw). The implication is that a more efficient FPGA "paves the way for the development of more efficient and hardware-friendly architectures," they write.
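
A quick back-of-envelope using the figures quoted above (and, like the article, counting only the FPGA's 13 watts, not the GPU's draw):

```python
# Energy per token = power / throughput, using the reported numbers.
power_watts = 13.0          # custom FPGA power draw (GPU draw excluded)
tokens_per_second = 23.8    # reported throughput for the 1.3B model
joules_per_token = power_watts / tokens_per_second
print(f"~{joules_per_token:.2f} J/token")  # roughly 0.55 J per token
```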

A buddy of mine dropped me a link to TI’s new brushed DC motor driver with sensorless speed control, the DRV8214. I spent a few minutes trying to hunt down the mechanism they use to derive speed from back EMF …
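
For what it's worth, the textbook sensorless approach estimates back EMF from terminal voltage and current via the motor model V = I·R + Ke·ω; whether the DRV8214 actually works this way (versus, say, commutation-ripple counting) is exactly the open question. The sketch below is the generic method with invented numbers, not anything from TI's datasheet.

```python
# Textbook back-EMF speed estimate for a brushed DC motor -- a guess at the
# kind of math involved, NOT TI's documented mechanism. All values invented.
# Model: V = I*R + Ke*omega, so omega ~= (V - I*R) / Ke.
def estimate_speed_rad_s(v_motor, i_motor, r_winding, ke):
    """v_motor: terminal volts, i_motor: amps, r_winding: ohms,
    ke: back-EMF constant in V/(rad/s)."""
    back_emf = v_motor - i_motor * r_winding  # volts not dropped across R
    return back_emf / ke

# Example: 12 V drive, 0.8 A draw, 2.1-ohm winding, Ke = 0.015 V/(rad/s)
print(f"{estimate_speed_rad_s(12.0, 0.8, 2.1, 0.015):.0f} rad/s")  # ~688
```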

Testing AMD’s Giant MI300X

Source: Hacker News

Article note: It's an absurd $15,000 OAM module with 192GB of RAM on it... but it's also benching out around twice as fast as an Nvidia H100 (and drastically better in some circumstances), and the H100 is a $25,000 OAM module. If your software will run on AMD's toolchain(s), it's clearly a wildly preferable part; the question is for how many workloads that's reliably true after all the pooch-screwings they've made on the software front.
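
Taking the note's ballpark numbers at face value, the throughput-per-dollar gap works out like this (rough figures from the note, not measured benchmarks):

```python
# Relative throughput per dollar, using the note's rough numbers.
mi300x_price, mi300x_speed = 15_000, 2.0  # ~2x an H100 on the cited runs
h100_price, h100_speed = 25_000, 1.0      # baseline
ratio = (mi300x_speed / mi300x_price) / (h100_speed / h100_price)
print(f"~{ratio:.1f}x more throughput per dollar")  # ~3.3x, if the benches hold
```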

Nvidia loses a cool $500B as market questions AI boom

Source: The Register

Article note: Yesssss. The AI hype is dangerously out of hand, and this is a sign of correction.

Cisco was briefly the world's most valuable company too, you know, just before the dot com bust

Nvidia has rapidly lost about $500 billion off its market capitalization amid concerns that the GPU maker may have become overvalued or that the AI market powered by its chips is a bubble set to burst.…

LCM to permanently close and sell DEC-10 at auction

Source: Hacker News

Article note: Oh, that's really sad that the LCM is being dismantled. Nice that SDF is adopting their emulated remote-access experiential systems. It will be interesting to see who can provide a good home for white elephants like the CDC6500 and the KI10.