… and it looks nothing like today’s large ASICs
The current state of the art
For years, large ASICs like the ones used in network processing, supercomputing and high-end personal computing have had very interesting similarities. The figure below is a fairly typical floorplan of such an ASIC. After taping out over a dozen of these types of chips a year, it is interesting to see that the interfaces have changed, processors are faster and memory data rates have increased, but the basic floorplan remains similar.
Conventional networking ASICs are memory hungry and have embedded memory such as 1T memories as well as large interfaces to external memory. These large memory interfaces have very significant effects on the die size due to the periphery needed to fit these interfaces. Access to more external memory is generally limited to the most advanced JEDEC standard and the width of the IP available to support them. Alas, although these external memories may have high data rates and ever-growing capacity, the two drawbacks are the high power needed to drive these nets and the costs of interfacing to these memories. Yes, I did imply that existing memory solutions may be more expensive than a 2.5D implementation. Once you consider that the package size is usually driven up by these high-pin-count memory interfaces and that the PCB layer count is also driven by the memory interface on the PCB, then it is plausible that at the system-level, it is less expensive to go down the 2.5D route. If not for these massive memory interfaces, large networking systems that are in the 32-PCB-layer range may only need six PCB layers.
Embedded memories can be used to address the needs of low-latency memory in an ASIC. There are several choices, but the most common embedded memory is 1T memory. In a large ASIC, this can take up the lion’s share of the area along with the processors, but will not come close to the capacity of an external memory. The largest 1T memories in use are in the ballpark of just above 100Mb. This is several orders of magnitude less than the external memory, but it serves the purpose of having very low-latency memory.
As nodes shrink, the maximum SerDes data rates increase as well, yet they occupy a similar amount of space on a die. I would have thought that we would end up using fewer SerDes as data rates increased, but that guess was certainly wrong. Higher-data-rate SerDes in the same package with very large memories is a challenge, but not for the reasons many would assume. It is routing these hundreds of higher-speed SerDes (10Gbps-28Gbps) in large packages that is tricky because the transmission lines are longer on lossy dielectrics. While we can engineer the most elegant transition from bump-to-trace-via-ball, etc., we still need to work within the bounds of the materials available to make a robust large package. The properties that make dielectrics mechanically robust generally give them higher loss tangents, hence they do not perform as well at these higher frequencies.
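To put a rough number on the loss-tangent point, here is a minimal sketch using the textbook low-loss approximation for dielectric attenuation of a TEM line, alpha_d = pi * f * sqrt(eps_r) * tan(delta) / c. The function name, the 30 mm route length and both sets of material parameters are illustrative assumptions, not data for any real package substrate, and conductor loss is ignored entirely.

```python
import math

C0 = 299_792_458.0                    # speed of light in vacuum, m/s
NEPER_TO_DB = 20.0 / math.log(10.0)   # about 8.686 dB per neper


def dielectric_loss_db(freq_hz, length_m, eps_r, loss_tangent):
    """Dielectric attenuation (dB) of a line in a homogeneous dielectric.

    Low-loss TEM approximation; conductor (skin-effect) loss is ignored.
    """
    alpha_np_per_m = math.pi * freq_hz * math.sqrt(eps_r) * loss_tangent / C0
    return alpha_np_per_m * NEPER_TO_DB * length_m


if __name__ == "__main__":
    route_m = 0.030  # hypothetical 30 mm package route
    # Hypothetical materials: (relative permittivity, loss tangent)
    materials = {
        "mechanically robust film": (3.5, 0.020),
        "low-loss film": (3.5, 0.005),
    }
    for gbps in (10, 28):
        nyquist_hz = gbps * 1e9 / 2  # fundamental frequency of an NRZ stream
        for name, (eps_r, tan_d) in materials.items():
            loss = dielectric_loss_db(nyquist_hz, route_m, eps_r, tan_d)
            print(f"{gbps} Gbps, {name}: {loss:.2f} dB")
```

In this approximation the loss in dB grows linearly with frequency, route length and loss tangent, which is why longer routes in a large package and higher data rates compound each other.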
The future as I see it
While we have covered some of the difficulties associated with the current architecture, which is a monolithic solution as seen above, the future as seen below is completely different. This future 2.5D solution may appear to be a more complex solution, but its elegance lies in the simplicity it brings to the architecture. The custom ASIC seen in the image below has not only partitioned some functions out of the monolithic ASIC, but it has also brought in memory that would have otherwise been outside of the package.
The image above shows what a new ASIC architecture may look like. There are several key benefits to this implementation:
- The package is not shown, but it would be much smaller than it would have been with a monolithic die. The reason is that the memory interface has been removed from the package, as it is already inside the package.
- The die itself can be considerably smaller if the embedded memory can be mostly on the adjacent memory stack.
- The SerDes can be removed and replaced with a tile that takes a highly parallel interface and multiplexes it to several high-speed channels.
- If a processor is needed, it would be best located on top of the ASIC because these processor architectures work best when vertically stacked, rather than placed side by side.
Once these portions of the die are removed, there is little need to use the latest wafer node. The final benefit is a lower barrier to entry for a new device because this ASIC can likely be done in a legacy node that has a much lower foundry NRE. Assuming that the other tiles mounted on the interposer already exist, it is possible that the unit system cost of this 2.5D implementation is lower than that of the monolithic die solution.
When the performance and cost both benefit from an implementation, it is no longer a matter of whether a solution will come to bear, but when. The future has become clearer. The challenges of the monolithic die architecture and the emerging packaging capabilities for tying these partitioned elements together may provide the environment needed to propel our industry into another dimension in system architecture.
There are two general flavors of 3D-TSV technology. Images for these can be seen in the previous blog entry “The Future of ASICs in 3D.”
- 3D-IC has vias in silicon containing active circuitry.
- 2.5D is similar, but uses passive silicon, glass or organic interposers to enable very fine pitch interconnection between the active die mounted on top. There is some discussion about adding some basic active circuitry to these silicon carriers to enable better testability, but that is for a different blog entry. If glass or organic interposers are used, it is possible that this may not even include a TSV, but for simplicity in this entry we’ll just assume it does have TSVs.
These simple descriptions do not cover all flavors as simply as chocolate and vanilla. There are also several combinations like rocky road that combine elements of each and introduce other ingredients. Much like consumers of ice cream that gravitate to certain flavors, so are the types of companies that will utilize each of these technology flavors, and for very good reasons.
Who will use 3D-IC?
ASIC 3D-IC implementations will be driven by several key market drivers. 3D-IC offers the promise of considerable miniaturization. Stacking die that would otherwise sit adjacent to each other reduces the area required on a PCB. While both 3D-IC and 2.5D technologies reduce PCB real estate, 3D-IC promises the greater reduction. There is also the promise of lower power consumption when communicating with ASICs or ASSPs. Signals do not need to be driven at higher voltages since the IR (current × resistance) drop between vertically stacked die is minimal, so the IO may just need to operate at the same voltage as the core of the die. In fact, the miniaturization must be coupled with lower-power solutions or else these compact devices will overheat. Heat is the biggest problem with 3D-IC, so this will likely be restricted to low-power devices, or to systems that can afford inordinately expensive heatsinking, such as supercomputers.
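The power argument can be illustrated with the standard CMOS switching-power relation P = activity * C * V^2 * f. The sketch below is only that, a sketch: the function name is made up for this example, and the capacitance and voltage figures for the two nets are placeholders chosen to show the trend, not measurements of any real interface.

```python
def switching_power_w(cap_f, vdd_v, toggle_hz, activity=0.5):
    """Average dynamic power (W) of one net: activity * C * V^2 * f."""
    return activity * cap_f * vdd_v ** 2 * toggle_hz


if __name__ == "__main__":
    clock_hz = 1e9  # hypothetical 1 GHz signalling clock

    # Hypothetical off-package net: large load, dedicated higher IO voltage.
    board_net = switching_power_w(cap_f=5e-12, vdd_v=1.5, toggle_hz=clock_hz)

    # Hypothetical stacked-die net: small load, driven at core voltage.
    stacked_net = switching_power_w(cap_f=0.1e-12, vdd_v=0.9, toggle_hz=clock_hz)

    print(f"off-package net: {board_net * 1e3:.3f} mW")
    print(f"stacked-die net: {stacked_net * 1e3:.3f} mW")
    print(f"ratio: {board_net / stacked_net:.0f}x")
```

Because power scales with the square of the voltage, being able to drive the IO at core voltage matters even before the smaller load capacitance is taken into account.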
There is also a cost premium to this solution, as it would be less expensive to design these active die as a monolithic ASIC or as a different MCM (multi-chip module) solution. The users for this will be those willing to pay a premium for this smaller, lower-power option. Hence, the main users for this will be mobile devices. A smaller phone with a longer battery life is worth more to the end consumer, so the mobile component manufacturers can justify this premium. The lower power usage in this space is also a great fit for this technology.
Who will choose a scoop of the 2.5D flavor?
2.5D technology doesn’t really save a lot of space. It still has some benefit of power reduction, although you still need to drive many millimeters (instead of many inches) across an interposer. Here the benefits of partitioning, reuse and mixing technology will dominate. Partitioning is the ability to separate various portions of an ASIC into separate die. This would include a CPU, embedded memory, SerDes, etc. The partitioning would enable:
- Better yields due to smaller die
- The ability to reuse these “tiles” of partitioned silicon in future designs
- Elimination of the reticle limit governing die sizes. (The area sum of several large tiles on a silicon interposer may exceed what would have otherwise been possible on a single monolithic die due to the size limit of the reticle.)
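The yield and reticle bullets can be sketched with the simple Poisson die-yield model, Y = exp(-A * D0). Everything numeric here is an assumption for illustration: the defect density, the die areas, and the 26 mm x 33 mm exposure field used as the reticle limit. The model also ignores interposer yield, assembly yield and the extra area each tile spends on its interface, so treat it as a trend, not a cost estimate.

```python
import math

# Assumed full-field exposure size (26 mm x 33 mm); check your fab's value.
RETICLE_LIMIT_MM2 = 26.0 * 33.0


def poisson_yield(area_mm2, d0_per_cm2):
    """Poisson die-yield model: Y = exp(-A * D0), A converted to cm^2."""
    return math.exp(-(area_mm2 / 100.0) * d0_per_cm2)


def silicon_per_good_unit(total_area_mm2, n_tiles, d0_per_cm2):
    """Wafer area (mm^2) consumed per good unit when the design is split
    into n equal tiles and only known-good tiles are assembled."""
    tile_area = total_area_mm2 / n_tiles
    return n_tiles * tile_area / poisson_yield(tile_area, d0_per_cm2)


if __name__ == "__main__":
    d0 = 0.2          # hypothetical defect density, defects per cm^2
    total = 1000.0    # hypothetical total active silicon, mm^2
    for n in (1, 2, 4):
        tile = total / n
        fits = tile <= RETICLE_LIMIT_MM2
        used = silicon_per_good_unit(total, n, d0)
        print(f"{n} tile(s) of {tile:.0f} mm^2: "
              f"tile yield {poisson_yield(tile, d0):.1%}, "
              f"{used:.0f} mm^2 of wafer per good unit, "
              f"fits reticle: {fits}")
```

With these placeholder numbers the single 1000 mm² die cannot be built at all because it exceeds the assumed reticle field, while the partitioned versions both fit and waste less silicon per good unit.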
The most strategic approach to 2.5D technology is the ability to mix silicon technologies such as SiGe, DRAM, RFCMOS or silicon of various nodes such as 28nm and 130nm on the same interposer. This ties back to the reuse benefit mentioned earlier. If a CPU tile is taped out at 28nm and proven, it can be reused across multiple designs.
Ask yourself what drives you to shrink silicon wafer nodes. It is generally an area, power efficiency or speed limitation for one or two silicon IPs on your die. The logic on your ASIC is unlikely to benefit much from the smallest node available. Therefore, produce tiles for the aggressive IPs such as the CPU or SerDes in an advanced silicon node such as 28nm. Then use a legacy node for the logic, which would bring down the overall risk and cost of a system. If the advanced tiles can be reused, then only the legacy-node logic tiles would need to be released for future generations, which would reduce the cost associated with new devices. Let’s say your next-generation device requires two CPUs: just put a second CPU tile on your interposer. There would be no additional risk since this tile would be proven and reused from a previous design. There is more area associated with this approach, but the benefits are very compelling. The products that would benefit the most from a 2.5D strategy are the more power-intensive enterprise-level systems such as network processors, large FPGAs, non-mobile CPU applications, etc.
3D technology is generating a lot of interest as a way to reduce NRE costs and speed time to market. It is in its nascency so people are looking for a single standard in through-silicon vias (TSVs). This is mainly for reducing infrastructure costs. Unfortunately, I do not think this will be the case. There are at least two fundamentally different applications for 3D technology that are driven by completely different incentives. The mobile space is driven mostly by the need for reduced power, height and area. The infrastructure and networking space is driven by the need for yield improvement and the ability to insert more memory than is monolithically possible — at much lower power. Mobile devices need thin architectures and very thin packages. On the other hand, larger networking devices require thicker 3D-ICs or interposers in order to handle the flatness needed for larger die and the side-by-side architectures of the devices.
Basic 2.5D Structure
These are really exciting times: 3D and 2.5D technology could change the entire landscape and architecture of ASICs. This has already started in FPGAs and ASSPs, but ASICs face a particular challenge. ASICs do not generally have the benefit of high volume required to secure sources, influence foundries, and gain early access to 3D technology — which they need if they want to be in a leadership role in this implementation.
The exponentially rising cost of tapeouts at lower nodes has resulted in fewer tapeouts at these emerging technologies. Therefore, there are fewer experts in this field. Some companies will be able to spend a lot of money developing the technology and hence developing the expertise in the field. The rest of us will need to depend on strategic partnerships to help, to hand-hold, as we cross the threshold into this technology.
Foundries and assembly houses are keeping their 3D-IC cards close to their chest and waiting for industry leadership to come from the users of 2.5D and 3D technology. Obviously, they do not want to spend all that money to determine later they need to change course to follow the prevailing current.
eSilicon has already spent a good amount of time and effort on 3D- and 2.5D-IC technology. We believe that ASICs will need what we are referring to as a menu for “tiles,” such as memories, microprocessor subsystems, integrated passive devices, FPGA die, and other devices. In the eSilicon model, tiles are proven building blocks. A 2.5D or 3D-IC implementation could include tiles in leading-edge technologies like 28nm, with a lower NRE thanks to a 65nm based interposer. The proven tiles mean the design team doesn’t have to re-invent the wheel, saving time and reducing risk.
3D-IC Structure Example
We — along with our partners — are moving forward to provide leadership in the 3D-IC space. At the same time, we look within and beyond our customer base to make sure we know where the prevailing currents are flowing. I do not think any of us will have all the answers, but ongoing conversations with partners and customers are getting us closer to understanding where the need is. Once you know where the need is, the direction will be abundantly clear.
Cross Section of TSVs, source: P. Leduc, LETI, D43D, 2010
Ready for primetime in ASICs…almost
Thru-silicon-vias (TSVs) have become a very hot topic in recent months. Ever since Xilinx reported that they are using a 2.5D TSV approach for their Virtex-7 FPGAs (http://bit.ly/ayfOgy), the industry started to salivate with the prospects of this new technology. While this technology may be accessible for larger stacked memory, FPGAs, MEMS devices, and CMOS image sensors, this does not inherently mean it is ready for ASIC applications. Before we get into some of the details, it is important we take a moment to calibrate with the terminology used in this space.
- 2.5D: refers to having one or several die mounted to another inactive die with thru-silicon vias in order to route nets between the active die and to the substrate.
- 3D-IC: refers to one or several die mounted to the backside of an active silicon die through these TSVs
- Glass interposer: A die made of glass with vias that connect both sides of glass die together for signal/power transmission.
- Silicon interposer: A die made of silicon with vias that connect both sides of the silicon die together for signal/power transmission.
- Tile: a die mounted to a glass-interposer, silicon interposer or 3D-IC. These generally have microbump pitches of 30-80um.
Short flat microbumps, source: KK Tzu, ITRI, RTI 2010
- TSV: Thru-silicon-via, a via that connects two opposite sides of a silicon die/wafer. This can be seen in the image in the upper right corner.
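To give a feel for what the 30-80um microbump pitch range in the tile definition means, the sketch below counts connections per square millimeter on a simple square grid. The function name is made up for this example, and the 150um figure is only an assumed conventional flip-chip bump pitch used for comparison.

```python
def bumps_per_mm2(pitch_um):
    """Connections per mm^2 on a square grid with the given pitch (um)."""
    per_mm = 1000.0 / pitch_um
    return per_mm ** 2


if __name__ == "__main__":
    # 150 um: assumed conventional flip-chip pitch, for comparison only.
    for pitch_um in (150, 80, 30):
        print(f"{pitch_um:>3} um pitch: {bumps_per_mm2(pitch_um):8.1f} bumps/mm^2")
```

Density goes with the inverse square of the pitch, so the tight end of the microbump range offers several times more connections per unit area than the loose end.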
EDA tool infrastructure
The market for the design of 3D-IC and 2.5D interposers really started with several niche players making stand-alone tools to address this need. Most of these are on open-architecture platforms so they share data with some other EDA tools. It is not clear if these niche EDA tool companies will gain significant market share before the larger EDA tool companies have a chance to surpass them. At least one of the major EDA tool companies is already presenting a solution at tradeshows. The lack of design kits from the wafer fabs has given these large EDA companies a chance to catch up and apply their greater resources to enter this 3D-IC and 2.5D design space.
One interesting observation I made after seeing some of these tools in action is that they appear to be built on package design platforms instead of physical design platforms. This may be because 2.5D solutions look like miniature package substrates that then get inserted into other more-conventional package substrates. Therefore, from an EDA tool perspective, this can appear much more like a stacked-die package design rather than a physical design on silicon. What has not been clearly demonstrated is the solution for 3D-IC in ASIC designs, by a major EDA tool company. This would appear much less like a stacked-die layout and more like a physical design, so there is still some more evolution needed in the tool space to address this 3D-IC technology. Critical steps such as LVS (layout versus schematic) checking still have limitations with this technology. Additionally, timing analysis of nets between chips in this space is further complicated by the TSV connections and routing on different die without signal buffering.
Short copper posts with solder, source: E. Beyne, IMEC, RTI 2010
Most sources agree that 3D-IC is unlikely to be mature enough for wide adoption in the ASIC space for another two years or so. On the other hand, 2.5D is much further along with regard to EDA tool readiness, likely due to the silicon-interposer’s similarity to an embedded package substrate.
The good news here is that there are several interposer suppliers in the market enabling the 2.5D marketplace. The bad news is that their solutions are considerably different from one another and so are their cost structures. The major wafer fabs are keeping their cards close to their chest until clear standards emerge in order to avoid the expense of re-tooling at a later date. For those of us in the ASIC space, this poses some interesting questions. Either partner with new suppliers for early access to the technology or wait until the major industry players open their doors with standard design kits. Only a select few are being given a sneak preview of the incomplete design kits as early adopters. The rest either end up waiting by the sidelines or partnering with the select few able to access these design kits.
Probing of the tiles that interface to the silicon interposer or 3D-IC die cannot be done with conventional vertical-probe or cantilever probe technology. After all, these microbump pitches of 30-80um are too tight for these conventional approaches. Instead, several companies are devising new families of probe cards, which are generally based on MEMS technology. This means that the up-front cost for a probe card may go up considerably. MEMS probe cards have been available for some time, but the finer technology needed may make these a little more difficult to manufacture and maintain. The production cost structure here is not well understood yet, but at least a solution exists.
Tall posts with solder tips, source: P. Royannez, et al., IME, RTI 2010
Assembly is one of the hurdles that has been addressed, but unfortunately there is little uniformity in how this is done. Some solutions in the wafer-to-wafer (W2W) format utilize a multitude of bonding techniques, but in the ASIC space, this should not be a major concern. It is unlikely that W2W bonding will be used in ASICs other than embedding stacked memory die in a 2.5D or 3D ASIC solution. At this point, the wafers are already bonded to each other and will likely be delivered by the memory suppliers in tape-and-reel format.
The two options for ASIC assembly of TSV devices will be die-to-wafer (D2W) and die-to-die (D2D), and I expect D2D to be the prevalent format for ASIC solutions. It allows the greatest flexibility in what to put on the TSV wafer, and it also eliminates the difficult thin-wafer handling. Wafers with TSVs will be somewhere in the 50-150um thickness range, and, if given the option, the assembly sites would surely opt for the more robust D2D solution.
There are no clear assembly standards. Standards are being initiated for wafer handling as well as reliability, but assembly still has a hole in standards coverage. For example, some TSV technologies have copper posts with solder on the end, others have round bumps, while others may have relatively flat connections that are meant for having copper posts on both die in order to form the interconnect. Examples of these can be seen on the lower three images in this post. These different formats may require different assembly strategies that are not easily mixed. In addition, the gap between the die may differ, resulting in different underfilling (plastic gap-filling between the die) materials and methodologies. In order to have multiple die capable of assembly on the same 2.5D or 3D-IC device, the assembly processes need to be compatible. At the moment, ASSP and FPGA providers design all of the die in the package, but this may not be the case in the ASIC space. For this, standards will be required to enable this technology for those of us in the ASIC realm.
JEDEC’s JC-14.3 committee is working on the reliability standards required for this technology. This will clearly address concerns currently preventing wider adoption of the technology. Having standards that clearly define reliable packaging will help us all, so we are looking forward to the output of this committee.
Shipment of TSV wafer and die
Semi and Sematech have been working on standards primarily for handling these delicate TSV wafers and die. The current directions include bonded wafers, bare die, W2W attachment methods, etc. These standards normally take about a half-year to release, so expect to see the fruit of this effort towards the middle of 2011. This happens to coincide with the expected release of the wide-IO memory standard being developed by JEDEC. This means that the end of 2011 should see a significant flurry of activity.