Processing-In-Memory, 1960s-Style

There’s a surge of interest these days in Processing in Memory, or PIM.  Some call it “Compute in Memory.”  Either way, it often involves adding small amounts of logic to a memory chip to allow it to offload tasks from the CPU.

There are various reasons to do this.  The most important is that PIM frees the CPU to do other chores.  PIM can also reduce I/O traffic, which saves bus bandwidth and power.  Less common reasons are to harness the phenomenal parallelism within the chip, or to allow computing power to scale with the number of memory chips in the system.  The Memory Guy blog has covered PIM a few times before.

The above reasons to use PIM are all valid, but in the 1960s one very notable computer used PIM to trim the weight and power consumption of a spacecraft by reducing the complexity of its CPU.

The CPU in this case was in NASA’s Apollo Guidance Computer on the Lunar Lander that put the first humans on the moon in 1969.  It was an intriguing custom CPU built entirely out of 3-input NOR gates.  These were some of the earliest ICs, and were specified into the system by MIT in 1963, only five years after the integrated circuit was invented in 1958.

Since this computer was specified so early on, it didn’t use SRAM or DRAM, but was based on the core memories prevalent in that era’s computers.  (Before his passing, Joel Karp, the designer of Intel’s first DRAM, told me that he was busily debugging the first silicon of the first commercial DRAM while listening to radio coverage of the lunar landing.)  The Apollo Guidance Computer used standard core read/write memory for data, and an unusual core rope read-only memory for code storage, where the code was hand-woven into the array.

Naturally, the PIM processing was performed in the read/write core memory.  It consisted of four memory addresses, 0020-0023 (octal), each of which performed its own function on any data written into it:

      • Rotate Left
      • Rotate Right
      • Signed Shift Right
      • Unsigned Shift Right by 7 Places

These may not look like much, and they were clearly simple to wire into the core array, but the first three performed very useful functions for multiplication algorithms and for serializing and deserializing data streams.  Serial data streams helped reduce the spacecraft’s weight, since only a single wire needed to run from a sensor to the computer.

That odd-looking shift-by-seven function was used to help divide the processor’s 15-bit words into two 7-bit instructions.

Address 0007 provided another, somewhat similar function: it read back as zero no matter what was written into it.  This was probably used in place of a dedicated instruction to clear registers.
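To make the mechanism concrete, here is a minimal sketch, in Python rather than AGC assembly, of how such memory-mapped “editing” locations behave: anything written to one of these addresses reads back already transformed.  The 15-bit word size and the five behaviors come from the description above; the particular assignment of addresses to behaviors shown here is illustrative, not a faithful AGC emulation.

```python
# Minimal sketch of memory-mapped "editing" locations (Python, not AGC code).
# The 15-bit word size and the five behaviors match the article; the mapping
# of addresses to behaviors below is illustrative, not a faithful emulation.

WORD_BITS = 15
MASK = (1 << WORD_BITS) - 1


def cycle_left(w):
    """Rotate a 15-bit word left by one place."""
    return ((w << 1) | (w >> (WORD_BITS - 1))) & MASK


def cycle_right(w):
    """Rotate a 15-bit word right by one place."""
    return ((w >> 1) | ((w & 1) << (WORD_BITS - 1))) & MASK


def shift_right_signed(w):
    """Shift right by one place, preserving the sign (top) bit."""
    return (w >> 1) | (w & (1 << (WORD_BITS - 1)))


def shift_right_7(w):
    """Unsigned shift right by seven places, exposing the upper 7-bit field."""
    return w >> 7


class EditingMemory:
    """Memory in which certain addresses transform whatever is stored there."""

    EDITORS = {
        0o0020: cycle_left,        # illustrative address assignments
        0o0021: cycle_right,
        0o0022: shift_right_signed,
        0o0023: shift_right_7,
        0o0007: lambda w: 0,       # always reads back as zero
    }

    def __init__(self):
        self.cells = {}

    def write(self, addr, word):
        transform = self.EDITORS.get(addr, lambda w: w)  # plain cells store as-is
        self.cells[addr] = transform(word & MASK)

    def read(self, addr):
        return self.cells.get(addr, 0)


mem = EditingMemory()
mem.write(0o0020, 0b100_0000_0000_0001)
print(bin(mem.read(0o0020)))            # rotated left -> 0b11

mem.write(0o0023, 0b1010101_0110011)    # two packed 7-bit fields (hypothetical)
print(bin(mem.read(0o0023)))            # upper field -> 0b1010101

mem.write(0o0007, 0o77777)
print(mem.read(0o0007))                 # -> 0, handy for clearing
```

In this model the CPU needs no rotate or shift logic of its own: a write followed by a read does the work, which is exactly the gate-count saving described in the next paragraph.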

Each of these addresses performed a function that the CPU could have handled itself, but only at the cost of added complexity.  The designers found this approach allowed them to use fewer gates to build the CPU, which kept the computer within its power and weight budgets.

My main reason for writing this, though, is to point out that PIM is anything but new.  The first I heard of it was about 25 years later, in the late 1980s, when an inventor hoped to add graphics primitives to a memory chip to harness the parallelism of the memory’s 1,024-bit internal data path.

More recently, Samsung produced special DRAM chips in an HBM stack for its Aquabolt XL, announced in 2020 and designed to accelerate AI.  In 2013 Micron introduced its Automata Processor PIM, which is now sold by Natural Intelligence Semiconductor.  Many other firms are using emerging memory technologies like ReRAM to prototype analog neural networks that massively reduce the computing burden of AI inference.  We explain this in some depth in our emerging memory report, Emerging Memories Branch Out.

So PIM is not at all a new concept, but rather a good idea that has never really caught on.  Will things be different this time around?  Objective Analysis keeps a very close watch over this and over all memory technologies and markets.  With our help you can be the first to know when the business is moving in a new direction.  Please contact us to see how we can make this work to the benefit of your own efforts.

2 thoughts on “Processing-In-Memory, 1960s-Style”

    1. Brian, thanks for the comment.

      While I “kind of” understand how GSI’s APU works, and I admire its novel approach, I don’t know how it looks to the people who actually have to specify such chips into their systems.

      Typically they are responsible for getting the greatest performance at the lowest cost. This is a different goal for each different kind of system.

      Today AI is widely used in hyperscale data centers, but not much AI has found its way into smaller systems, and I would guess that the GSI part would do best in smaller systems. It will probably take a while for such systems to become common. Once they do, there should be a great opportunity for GSI.

      For now, though, I can’t really give you a valid outlook for the part’s chance of success.

      Best,

      Jim
