I got a phone call yesterday from Russell Fish of Venray Technology. He wanted to talk about how and why computer architecture is destined for a change.
I will disclose right up front that he and I were college classmates. Even so, I will do my best to give the unbiased viewpoint that my clients expect of me.
Russell is tormented by an affliction that troubles many of us in technology: We see the direction that technology is headed, then we consider what makes sense, and we can’t tolerate any conflicts between the two.
In Russell’s case, the problem is the memory/processor speed bottleneck.
Most semiconductor manufacturers think that a memory is a memory and a processor is a processor, and believe that the only way to get past the memory/processor bottleneck is to increase data transfer rates. This is the Rambus approach, and the approach used by several subsequent technologies including synchronous DRAMs and all the variants of DDR. The hybrid memory cube detailed in an earlier post is yet another way of confronting this issue head-on.
This viewpoint makes a lot of sense, since DRAM manufacturing processes differ significantly from the logic processes used to make CPUs. You can’t make a fast CPU using a DRAM process, and you can’t make an inexpensive DRAM using a logic process.
What Venray advocates is the use of slow CPUs that are tightly coupled with the DRAM chips. In fact, the processors are integrated right inside the DRAMs! This completely eliminates the processor/DRAM interface bottleneck.
Even though processors manufactured using DRAM technology are slower than those made with an advanced logic process, they can access memory through extraordinarily wide data paths (thousands of bits wide), and one processor can be dedicated to every bank of DRAM. Venray is trading processing horsepower for DRAM bandwidth.
The company’s models tell a striking tale: According to Venray, a MapReduce benchmark runs nearly 12 times as fast on a system designed around Venray’s 2.1GHz TOMI architecture as it does on a system based on Intel’s 2.4GHz Xeon E5620, while consuming less than 1/10th the power at about 1/45th the cost.
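To put those three ratios in perspective, it helps to combine them. The short Python sketch below is my own back-of-the-envelope arithmetic, not anything supplied by Venray: it takes the company's claimed figures at face value, rounds "nearly 12 times" up to 12, treats "less than 1/10th" as exactly 1/10th, and assumes that speed, power, and cost can simply be divided into one another. The variable and function names are invented for illustration.

```python
# Ratios claimed by Venray for a TOMI-based system relative to a
# Xeon E5620-based system. These are the vendor's modeled numbers as quoted
# in the article, rounded here for simplicity; they are not measurements.
speedup = 12.0        # "nearly 12 times as fast" (rounded up, an assumption)
power_ratio = 1 / 10  # "less than 1/10th the power" (taken as exactly 1/10)
cost_ratio = 1 / 45   # "about 1/45th the cost"


def relative_efficiency(speed_ratio, resource_ratio):
    """Performance per unit of a resource, relative to the baseline system."""
    return speed_ratio / resource_ratio


# Performance per watt and per dollar, relative to the Xeon baseline.
perf_per_watt = relative_efficiency(speedup, power_ratio)
perf_per_dollar = relative_efficiency(speedup, cost_ratio)

print(f"performance per watt:   about {perf_per_watt:.0f}x the baseline")
print(f"performance per dollar: about {perf_per_dollar:.0f}x the baseline")
```

If the claims hold, that works out to roughly 120 times the performance per watt and roughly 540 times the performance per dollar of the Xeon system. Those compound figures are only as good as the three inputs, which come from the company's own models.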
So, then, why is Mr. Fish frustrated? In addition to the preconceptions mentioned above, there are two mindsets that work against the immediate adoption of Venray’s design.
First, this concept involves a completely new approach to computer architecture. In a world that revolves around the Intel Architecture (the instruction set used by Intel and AMD), whose beginning can be traced back to the 8008 processor from the early 1970s, the adoption of multiple distributed processors would require millions of lines of legacy code to be reconfigured. Chip makers worry that software firms might not migrate popular platforms to this new scheme.
The other mindset that works against this is that DRAM is always manufactured to leverage the economies of scale – DRAM makers believe that it is not worthwhile to make any new chip unless they can sell billions of units. This automatically works against any new ideas, since anything new will start small before it becomes big.
In fact, the only way that DRAMs change from DDR to DDR2 to DDR3 etc. is through Intel’s promotion of these interfaces. Without that assurance DRAM companies would move much more slowly from one interface to the next as they sought assurance that a market for the newer product really did exist.
Venray has chosen a difficult path. The company must convince DRAM manufacturers and software companies that there is a solid reason to support its technology. This analyst has no doubts that there are real speed and cost advantages to the Venray architecture, but the sheer inertia of existing technologies stands in the way of future adoption.
We wish the company good luck, and hope that the industry will understand the need to adopt either this approach or one that is similarly divergent from the path the industry is currently following.