The Memory Guy has found that some people get confused about the terminology surrounding flash “Layers” and “Levels,” sometimes confusing the two, and often misunderstanding what each one means. This post is meant to be a low-level primer to address that confusion.
There are actually three places where such terminology is used: The number of chips in a package, the number of conductor/insulator pairs in 3D NAND, and the number of voltage levels stored on any single bit cell within the chip. I will address them in that order.
CHIP STACKING: Since the 1990s both NAND and NOR flash chip makers have been stacking chips within a single plastic package. Originally this approach was used to reduce the size of thin flip phones like the Motorola Razr by stacking an SRAM chip on top of NOR flash, but soon afterwards NAND chips began to use the same approach to get incredible storage capacities into a single IC package or eMMC, or into a microSD card format. What began as 2-die stacks became 4, then 8, and now 16 high. This post’s photo illustrates an 8-high stack.
Since the height of a standard plastic package for a chip is smaller than a stack of 16 full-thickness dice, the wafers had to be back-ground to minimize the thickness of each die. I have heard that these wafers are made so thin that they become flexible enough to be rolled into a tube the diameter of a cigar! Back-grinding is relatively inconsequential to the circuit, since the functioning circuitry sits in less than the top 10% of the wafer’s thickness – the bulk of the wafer is there for mechanical support during manufacture.
What I find amazing is how well producers have been able to drive the costs out of this approach. 4-high stacks usually command the same price as four of their single-die counterparts, with no additional cost for getting everything into a single package.
Make no mistake – stacking chips does nothing to reduce the total cost. It merely allows more NAND bits to fit into less space. This is very different from the next two approaches.
3D NAND LAYERS: A lot of people mistake 3D NAND for the approach above of stacking chips. This is completely wrong – 3D NAND is a different way of producing NAND flash chips that drives much of the cost out. Although the industry still hasn’t reached this point, the ideal 48-layer 3D NAND flash chip should have a price per gigabyte (GB) that is only a little more than half of the price per GB of its 24-layer counterpart.
A cost analysis that Objective Analysis did a year or more ago estimates that the cost per GB to manufacture a 32-layer 3D NAND chip should (ideally) be a little more than half the price of a 16nm planar NAND chip, which some people call “2D” NAND.
3D NAND was invented to drive costs out of NAND after planar flash reached its scaling limit. Not very many people can tell the difference between a 3D NAND manufacturing plant and a planar NAND plant even if they’re right in the middle of it, nor can they distinguish a 3D NAND wafer from a planar NAND wafer. There’s not much difference between the two until you get down to a very technical level. Those who want to learn about that might want to read a series of blog posts I wrote about 5 years ago to explain 3D NAND.
It’s most important to know, though, that a 3D NAND chip is just one single chip. That’s what allows its cost to be lower. Most NAND flash chip wafers cost around the same amount: between $1,000 and $2,500. If you stack chips then you use more wafers and the cost goes up. When you build 3D NAND on a single wafer, that wafer costs about the same amount, but it stores many times as many gigabytes, driving the cost per GB down. This is all enabled by adding layers of material to the basic chip, and this is the magic of 3D NAND.
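The arithmetic behind this can be sketched in a few lines of Python. The wafer cost below simply sits inside the range quoted above; the gigabytes-per-wafer figure and the capacity gain are made-up round numbers chosen only for illustration, not measurements of any real product.

```python
def cost_per_gb(wafer_cost_usd: float, gb_per_wafer: float) -> float:
    """Spread the cost of one wafer over the gigabytes it stores."""
    if gb_per_wafer <= 0:
        raise ValueError("gb_per_wafer must be positive")
    return wafer_cost_usd / gb_per_wafer


WAFER_COST_USD = 2_000.0        # assumption: a value inside the quoted range
PLANAR_GB_PER_WAFER = 10_000.0  # hypothetical round number
CAPACITY_GAIN_3D = 4.0          # hypothetical stand-in for "many times as many gigabytes"

planar = cost_per_gb(WAFER_COST_USD, PLANAR_GB_PER_WAFER)
three_d = cost_per_gb(WAFER_COST_USD, PLANAR_GB_PER_WAFER * CAPACITY_GAIN_3D)

print(f"planar: ${planar:.3f}/GB, 3D: ${three_d:.3f}/GB")
```

The wafer cost is the same in both calls; only the gigabytes in the denominator change, which is the whole point.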
MULTILEVEL CELLS: Back in the mid-1990s both Intel and SanDisk began shipping NOR flash in which 2 data bits were stored on a single NOR flash memory cell. (Some may be amazed to learn that SanDisk shipped NOR flash, but the company’s early products were indeed NOR-based.) Multilevel cell flash, or MLC, is a sneaky way of making one bit cell act as two. It’s done by storing multiple voltages on the bit cell. The original 2-bit MLC stored four voltage levels.
You can imagine the cost savings you can get by making one bit cell act like two – you basically cut the cost per gigabyte in half.
This wasn’t the first use of this technology. Intel used an MLC approach in its 8087 math co-processors’ microcode ROM in 1981, well before bringing it to NOR flash, and Yoshishige Kitamura of NEC was granted a patent for MLC EPROM in 1985.
As the technique became better understood, three bits were stored, and then mSystems devised a scheme to reliably get four-bit MLC to work in 2006.
Producers tried valiantly to guide nomenclature to more accurate terms like “X3” and “X4”, or “3-bit MLC” and “4-bit MLC”, but people didn’t follow their lead, and now use “TLC” or “Triple-Level Cell” for 3-bit MLC, and for 4-bit MLC most people say “QLC” or “Quad-Level Cell.” I like to ask people how many voltage levels there are in a QLC NAND, and often get the mistaken answer 4. It’s really sixteen.
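The relationship between bits per cell and voltage levels is just a power of two, as this small Python sketch shows. The function name is mine, and “SLC” is the common industry name for an ordinary one-bit cell, which this post has not used so far.

```python
def voltage_levels(bits_per_cell: int) -> int:
    """Each stored bit doubles the number of voltage levels the cell must hold."""
    if bits_per_cell < 1:
        raise ValueError("bits_per_cell must be at least 1")
    return 2 ** bits_per_cell


for name, bits in [("SLC", 1), ("MLC", 2), ("TLC", 3), ("QLC", 4)]:
    print(f"{name}: {bits} bit(s) per cell, {voltage_levels(bits)} voltage levels")
```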
The relative cost benefit decreases the more bits you add: Moving from one bit to two per cell may cut costs in half, but moving from two to three only gives you 2/3 the cost of MLC, and moving from three to four cuts costs only to about 3/4 of the cost of TLC. Meanwhile the challenge balloons: Sensing 2 levels is already a challenge, but going to the four levels in MLC is much harder, and a few vendors struggled for a couple of years simply to get it working. You can imagine the challenge, then, of sensing the 8 levels of TLC and the huge headache QLC’s 16 levels must pose.
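Those fractions fall out of simple division, sketched here in Python. It assumes the idealized case in which a cell costs the same no matter how many bits it stores and ignores any extra overhead for error correction or slower sensing; the function names are only illustrative.

```python
from fractions import Fraction


def relative_cost_per_bit(bits_per_cell: int) -> Fraction:
    """Cost per bit relative to a one-bit cell, assuming the cell cost is fixed."""
    if bits_per_cell < 1:
        raise ValueError("bits_per_cell must be at least 1")
    return Fraction(1, bits_per_cell)


def step_ratio(bits_per_cell: int) -> Fraction:
    """Cost per bit compared with a cell that stores one bit fewer."""
    if bits_per_cell < 2:
        raise ValueError("bits_per_cell must be at least 2")
    return relative_cost_per_bit(bits_per_cell) / relative_cost_per_bit(bits_per_cell - 1)


for bits in (2, 3, 4):
    print(f"{bits - 1} -> {bits} bits per cell: "
          f"cost per bit falls to {step_ratio(bits)} of the previous step")
```

Each extra bit takes a smaller slice off the cost while doubling the number of levels to sense, which is why the trade-off gets steadily harder to justify.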
But the important thing to keep in mind is that nothing has changed on the chip itself: Each bit cell is the same, it’s just now being used to store an increasing number of voltage levels, and this helps to drive cost per gigabyte down.
LAYERS OF LAYERS OF LAYERS: Bringing this around to the title of the post, we can see that all three of these technologies can be used together. If you start with a chip that has 64 billion bit cells, this can be built either on a planar NAND flash process or on a 3D NAND process. If it’s built as 3D NAND it will have a smaller die size and cost a lot less. The more layers of 3D NAND you use, the smaller the chip, and the lower the cost.
Now you can decide to put more than one bit on each of those 64 billion bit cells. With MLC it will contain twice as many bits – 128 billion – and with TLC it contains three times as many, or 192 billion. With QLC the number reaches 256 billion bits, but the cost of the chip is the same no matter which you use: one bit, or two, or three, or four, so it’s more economical to use as many bits per cell as you can manage.
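The bit counts above are a single multiplication, shown here as a short Python sketch using the 64-billion-cell chip from the example. The constant and function names are mine.

```python
CELLS_PER_CHIP = 64_000_000_000  # the 64 billion bit cells in the example


def chip_bits(cells: int, bits_per_cell: int) -> int:
    """Total bits stored when every cell holds bits_per_cell bits."""
    if cells < 0 or bits_per_cell < 1:
        raise ValueError("cells must be non-negative and bits_per_cell at least 1")
    return cells * bits_per_cell


for name, bits in [("one bit per cell", 1), ("MLC", 2), ("TLC", 3), ("QLC", 4)]:
    total = chip_bits(CELLS_PER_CHIP, bits)
    print(f"{name}: {total // 1_000_000_000} billion bits")
```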
OK – so these two approaches have helped you get better economics while increasing the density of the chip, but you still may not be able to squeeze enough storage into the tiny corner of space that you have set aside for the chip. That’s where stacking comes into play.
If you take four, or eight, or even sixteen of those chips and put them on top of one another it won’t cost any less than having sixteen individual chips, and it may cost more, but you will certainly get a lot of storage into a very tiny space.
And, in the end, this might give you a 16-high stack of quad-level 64-layer 3D NAND: Layers upon layers, upon layers!
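To put a number on that final example, here is a rough Python sketch that multiplies the three factors together. It assumes each die has the 64 billion cells used earlier and that every cell stores user data, ignoring spare cells and error-correction overhead. The 3D layer count does not appear in the formula because, in this post's framing, more layers shrink the die rather than add cells.

```python
def package_bits(dice_in_stack: int, cells_per_die: int, bits_per_cell: int) -> int:
    """Raw bits in one package: stacked dice x cells per die x bits per cell."""
    if min(dice_in_stack, cells_per_die, bits_per_cell) < 1:
        raise ValueError("all three factors must be at least 1")
    return dice_in_stack * cells_per_die * bits_per_cell


# 16-high stack of QLC dice, each with the 64 billion cells from the example
bits = package_bits(dice_in_stack=16, cells_per_die=64_000_000_000, bits_per_cell=4)
gigabytes = bits / 8 / 1e9  # decimal gigabytes, raw capacity only

print(f"{bits:,} bits, or about {gigabytes:.0f} GB raw")
```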