Why Advanced Packaging Means More Emulation

By Paul Morrison

Solution Specialist, Emulation Division


August 27, 2018


This is part five of a series. Read part four here.

When we think of emulation as part of our verification plan, most of us are likely to think in terms of a full chip being exercised through a wide range of scenarios, including running software on the design. After all, that is realistically what emulation has delivered over the last decade, even as ASIC sizes have grown.

But changes at the physical level are affecting pre-silicon verification in emulation, which has typically focused on an individual die: advanced packaging techniques are letting engineers co-package multiple die and present them to customers as a single unit. This may allow us to integrate common die like memory with our own custom die, or it can let us mix and match technology nodes so that each die uses a process appropriate to its contents, reducing cost by not overusing the most advanced technologies.

Multi-die integration can happen at two levels, with variations on how it’s accomplished. In the first, multiple die are mounted on an interposer (typically silicon), which connects signals between die and reroutes them to package pins. This is called 2.5D integration, because it sits somewhere between packaging a single die (2D) and full 3D integration.

3D integration involves die stacked on top of each other, connected directly through microbumps and through-silicon vias. Rerouting signals, if necessary, can be implemented on the backsides of some of the die.

From the standpoint of the user of the package, it’s immaterial whether the internals comprise one or multiple die. It just has to work. This becomes a verification goal: proving to yourself and to your customer that, regardless of how the package contents are arranged, everything works as expected.

That, then, becomes a job for emulation. Since the individual die are likely to have been verified separately, the job largely becomes one of ensuring that inter-die connections and communication are working properly. The design files for the multiple die can be combined into a single unified design, with the interposer or the package pins acting as the top level of the hierarchy. Veloce emulators are large enough to contain these full multi-die designs.
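To make the idea of inter-die checking concrete, here is a minimal Python sketch of the kind of structural property the unified top level has to satisfy before any traffic is run: every net on the interposer joins an output on one die to an input of the same width on another, and no port is left dangling. The `Port` class, the `check_interposer` function, and the die and port names are hypothetical and chosen only for illustration. This is a static sanity check, not a substitute for exercising the connections in emulation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    die: str        # name of the die that owns the port (hypothetical)
    name: str       # port name on that die (hypothetical)
    direction: str  # "out" or "in"
    width: int      # bus width in bits


def check_interposer(ports, nets):
    """Return a list of problems found in the inter-die wiring.

    ports: iterable of Port objects, one per die-level port.
    nets:  iterable of ((src_die, src_port), (dst_die, dst_port)) pairs.
    """
    by_key = {(p.die, p.name): p for p in ports}
    problems = []
    connected = set()
    for src, dst in nets:
        s, d = by_key.get(src), by_key.get(dst)
        if s is None or d is None:
            problems.append(f"unknown port in net {src} -> {dst}")
            continue
        connected.update([src, dst])
        if s.direction != "out" or d.direction != "in":
            problems.append(f"direction mismatch on {src} -> {dst}")
        if s.width != d.width:
            problems.append(
                f"width mismatch on {src} -> {dst}: {s.width} vs {d.width}"
            )
    # Any port that never appeared in a net is left floating.
    for key in by_key:
        if key not in connected:
            problems.append(f"unconnected port {key}")
    return problems


# Illustrative two-die package: a custom SoC die and a memory die.
ports = [
    Port("soc", "mem_req", "out", 64),
    Port("mem", "req", "in", 64),
    Port("mem", "rsp", "out", 32),   # deliberately narrower than its peer
    Port("soc", "mem_rsp", "in", 64),
]
nets = [
    (("soc", "mem_req"), ("mem", "req")),
    (("mem", "rsp"), ("soc", "mem_rsp")),
]
problems = check_interposer(ports, nets)
for problem in problems:
    print(problem)
```

In this example the only problem reported is the width mismatch on the response path. A check like this catches wiring slips early; protocol-level behavior across the die boundary still needs to be run on the emulator.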

Standards for Interconnecting Die

There are several consortium-based (i.e., not proprietary) standards that specify the means of interaction between die that are closely packaged together. Which one to use can depend on the application. With individual die, it’s impossible to fully verify an implementation of these standards, since each die implements only one side of the interaction. So, a big part of the emulation job will be confirming that the standard implementations are working correctly across all inter-communicating die.

  • Gen-Z is a new memory-semantic interconnect standard. It allows memory access to other die through direct attachment, a switch fabric or a routing fabric. The die accessing the memory will think it’s accessing local memory.
  • CCIX is a standard that extends coherency beyond the CPU. Other memories and accelerators can be included in the coherency plan so that software doesn’t need to manage it explicitly. Built over PCIe, it supports transfer rates of up to 25 GT/s per lane (T being “transfers”).
  • OpenCAPI is effectively a superset of Gen-Z and CCIX (although defined by a different standards body). It’s based on IBM’s Coherent Accelerator Processor Interface (CAPI). It also competes with Intel’s EMIB, a proprietary approach to die interconnect. OpenCAPI is still working to achieve traction (as is EMIB).
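A figure quoted in GT/s is a per-lane signaling rate, not a usable data rate. As a rough worked example, the Python sketch below converts 25 GT/s into bytes per second. It assumes one bit per transfer per lane, the 128b/130b line encoding used by recent PCIe generations, and a 16-lane link; the encoding and lane count are assumptions for illustration, not figures from this article, and protocol overhead above the physical layer is ignored.

```python
def link_bandwidth_gbytes(gt_per_s, lanes, payload_bits=128, encoded_bits=130):
    """Approximate one-direction link bandwidth in GB/s.

    gt_per_s:     per-lane transfer rate in GT/s (one bit per transfer)
    lanes:        number of lanes in the link (assumed, not from the article)
    payload_bits / encoded_bits: line-encoding efficiency (assumed 128b/130b)
    """
    bits_per_s = gt_per_s * 1e9 * lanes * payload_bits / encoded_bits
    return bits_per_s / 8 / 1e9


per_lane = link_bandwidth_gbytes(25, 1)
x16 = link_bandwidth_gbytes(25, 16)
print(f"{per_lane:.2f} GB/s per lane, {x16:.2f} GB/s for x16")
# prints: 3.08 GB/s per lane, 49.23 GB/s for x16
```

Under these assumptions a 16-lane link moves a little under 50 GB/s in each direction, which gives a sense of how much inter-die traffic an emulation run has to generate to stress such an interface.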

ASICs Plus FPGAs and Other Applications

Another emerging application for multi-die verification involves pairing ASICs or SoCs with FPGAs. An ASIC represents an optimized implementation of some set of capabilities. The benefit is that performance, power, and cost can be tailored to suit the needs of the application. The downside is that an ASIC is expensive and time-consuming to design, verify, and build – and, once it’s done, there’s no making changes.

So, if you’re not sure which of several options might work best for your customers, it becomes hard to build all of those options onto the chip while still keeping costs in check. In other situations, there may be application variations, with large portions of fixed functionality and more limited circuits that need configuration and personalization. You may even acquire an ASIC and then have accompanying silicon that adds your “secret sauce” in a manner that gets you to market as quickly as possible.

This is where FPGAs start to look attractive. FPGAs can’t implement functions with the same efficiency as an ASIC can, but you have the flexibility to experiment with different functionality options, test different versions with customers, and even perform in-field updates after the chip is deployed in a system.

This pairing of ASICs or SoCs with an FPGA looks to become a more common option as design costs continue to rise while time-to-market windows shrink. However, given that these two (or more) die are packaged together, it’s necessary to validate the combined pair.

In yet another application, processor makers like Nvidia are considering moving to multi-die implementations of their GPUs. This will require extensive emulation to ensure that the multiple die behave for the user like a unified graphics processor.

Emulation Is the Only Viable Verification Solution

Multiple die in a single package constitute a very large design; there is no alternative to emulation for thorough validation. The design would be cumbersome, at best, to simulate, and the number of tests necessary means that there’s no way to accomplish them in a timely fashion without emulation. Similarly, emulation enables the co-verification of ASIC and companion FPGA designs together, providing for a complete checkout of their interactions. The Veloce family has the size and performance required for handling these large designs.
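The throughput argument can be illustrated with simple arithmetic. The Python sketch below estimates wall-clock time for a fixed number of design clock cycles at two execution rates. Every number in it is a hypothetical placeholder chosen to show the shape of the calculation: the cycle count, the simulation rate, and the emulation rate are assumptions, not measurements of any particular simulator or of Veloce.

```python
def wall_clock_hours(total_cycles, cycles_per_second):
    """Hours needed to execute total_cycles at a given effective clock rate."""
    return total_cycles / cycles_per_second / 3600


# All three values below are illustrative assumptions, not measured data.
CYCLES = 10e9          # design clock cycles in a hypothetical test suite
SIM_HZ = 100.0         # assumed effective rate for simulating a very large design
EMU_HZ = 1_000_000.0   # assumed effective rate for the same design in emulation

sim_hours = wall_clock_hours(CYCLES, SIM_HZ)
emu_hours = wall_clock_hours(CYCLES, EMU_HZ)
print(f"simulation: {sim_hours:,.0f} hours")
print(f"emulation:  {emu_hours:.1f} hours")
```

With these placeholder rates the same workload drops from years to hours. The real ratio depends on the design, the testbench, and the tools, so substitute measured rates before drawing conclusions for a specific project.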

Paul Morrison, Technical Marketing Engineer, Mentor, a Siemens Business

Paul Morrison has been a digital design engineer, group lead, manager, and TME for the past 21+ years. He has spent his career in the Storage and MilAero industries, and his expertise is in ASIC design, FPGA design, verification, lab debug, and emulation. In addition to Mentor, his resume includes jobs at Fujitsu, Xilinx, Quantum, DRS (now Leonardo), and Micron.



Read part six of the series here
