PCIe Gen 2: Not just more bandwidth
February 01, 2009
Plugging in a Gen 2 switch is the easiest way for designers to upgrade their legacy Gen 1 system.
It's hard to believe that two years have passed since the PCI Special Interest Group (PCI-SIG) published the PCI Express Base Specification Revision 2.0. More commonly referred to as PCIe Gen 2, this upgrade to the widely used PCIe protocol doubled the per-lane signaling rate from Gen 1's 2.5 Gbps to 5.0 Gbps. But that's not all the new specification makes possible.
While market adoption of Gen 2 is well under way, some designers still rely on Gen 1 in their products and aren't yet ready to migrate to Gen 2. What these designers might not realize is that Gen 2 is bringing significantly more to the table than simply additional bandwidth.
Switch maturation
PCIe switches are the standard building blocks of a PCIe system. Because chipsets offer a limited number of native PCIe ports, switches are typically used to create additional PCIe ports. Using a switch allows the chipset to fan out to an expanded number of PCIe endpoints or I/O.
As the PCIe protocol has matured, so has its implementation. Chipsets used to come with a single PCIe port but now typically support several PCIe ports. A few years ago, designers were hard-pressed to find an embedded processor supporting PCIe. Today, an embedded processor supporting two or even three PCIe ports is common.
Next-gen features
PCIe switches likewise have matured. Whereas Gen 1 switches originally did little more than provide fan-out capabilities, Gen 2 switches provide support for application-specific performance-enhancing features such as read pacing and dual cast.
Read pacing
It is not uncommon for today's server and storage systems to use a mix of constant and bursty I/O implemented with adapters connected to PCIe. A Fibre Channel Host Bus Adapter (FC HBA) is an example of a constant, data-hungry endpoint that spends most of its time sending large read requests to the host. On the other hand, a Gigabit Ethernet Network Interface Card (GE NIC) is an endpoint that is bursty in nature and sends small, infrequent read requests to the host on an as-needed basis.
In a standard PCIe system, if the GE NIC sends a small read request after the FC HBA has already issued perhaps eight read requests, the GE NIC must wait for the CPU to service all eight FC HBA requests before its own request is serviced. GE NIC performance suffers as the adapter spends several cycles waiting for its requested data. Because FC HBA read requests tend to be much larger and more frequent than those of the GE NIC, the degradation only gets worse. This problem is a byproduct of the PCIe protocol and the First-In, First-Out (FIFO) scheme the CPU uses to handle incoming read requests.
Read pacing resolves this conundrum by fairly allocating CPU bandwidth when servicing read requests. In Figure 1, instead of waiting for all the queued FC HBA read requests to be serviced, the GE NIC read request is allowed by the PCIe switch to jump ahead in the queue, significantly reducing the GE NIC's worst-case wait time.
Bursty I/O can see a performance increase of 5x or better, and constant I/O does not suffer: the read pacing algorithm takes constant I/O into account and ensures that its performance is not degraded. Read pacing comes with its own default settings based on this algorithm, but designers who want to customize the feature can program their own thresholds as needed.
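The effect of pacing on queue order can be illustrated with a toy model. The sketch below is not the switch's actual algorithm, which is not detailed here. It assumes a switch that lets each port keep at most `limit` read requests outstanding at the host and holds the rest back. The function names, port labels, request sizes (4,096 and 64 bytes), and the cost model (service time proportional to request size) are all illustrative assumptions.

```python
from collections import deque

def service_order(arrivals, limit=None):
    """Return the order in which a FIFO host services read requests.

    arrivals: list of (port, size) in the order they reach the switch.
    limit:    assumed pacing threshold, the maximum number of requests a
              port may have outstanding at the host (None = no pacing).
    """
    host_fifo = deque()   # requests already forwarded to the host
    held = {}             # port -> requests held back by the switch
    outstanding = {}      # port -> requests currently in host_fifo

    for port, size in arrivals:
        if limit is None or outstanding.get(port, 0) < limit:
            host_fifo.append((port, size))
            outstanding[port] = outstanding.get(port, 0) + 1
        else:
            held.setdefault(port, deque()).append((port, size))

    order = []
    while host_fifo:
        port, size = host_fifo.popleft()
        order.append((port, size))
        outstanding[port] -= 1
        # A completion frees a slot, so release one held request.
        if held.get(port):
            host_fifo.append(held[port].popleft())
            outstanding[port] += 1
    return order

def completion_time(order, target_port):
    """Time units until the first request from target_port completes."""
    elapsed = 0
    for port, size in order:
        elapsed += size   # assumed: service time proportional to size
        if port == target_port:
            return elapsed
    raise ValueError(f"no request from {target_port!r}")

# Eight large FC HBA reads arrive just ahead of one small GE NIC read.
arrivals = [("fc_hba", 4096)] * 8 + [("ge_nic", 64)]

fifo_wait = completion_time(service_order(arrivals), "ge_nic")
paced_wait = completion_time(service_order(arrivals, limit=2), "ge_nic")
print(fifo_wait, paced_wait)
```

In this toy model, a threshold of two lets the GE NIC request complete after 8,256 time units instead of 32,832, while the total time to service all nine requests stays the same. The real gain depends on the traffic mix and the thresholds programmed into the switch.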
Dual cast
The dual cast feature allows an ingress packet to be copied to two egress ports simultaneously: one packet in, two packets out. The ingress port and the two egress ports are user-programmable. Any time a packet is written from the selected ingress port to a specified address range behind the selected egress port, the switch automatically generates a copy of the egress packet and sends that copy to the second designated egress port.
In redundant and failover applications, it is common for the CPU to send copies of packets to a redundant endpoint or secondary system to ensure that a backup copy of the data is available in case the system crashes. With dual cast (shown in Figure 2), the PCIe switch relieves the CPU of the burden of managing redundant traffic, cutting the number of writes that the CPU needs to execute in half.
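The dual cast forwarding decision can be sketched as a simple rule match. This is an illustration only: `DualCastRule`, its fields, and `egress_ports` are hypothetical names, not a real switch register layout or vendor API, and the port numbers and address window are made up.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DualCastRule:
    """Hypothetical dual cast configuration (not a real register map)."""
    ingress_port: int
    primary_egress_port: int
    window_base: int            # start of the watched address range
    window_size: int            # length of the watched range in bytes
    secondary_egress_port: int  # receives the copy

def egress_ports(rule, ingress_port, routed_egress_port, address):
    """Return every egress port that should receive this write."""
    in_window = rule.window_base <= address < rule.window_base + rule.window_size
    if (ingress_port == rule.ingress_port
            and routed_egress_port == rule.primary_egress_port
            and in_window):
        # One packet in, two packets out.
        return [routed_egress_port, rule.secondary_egress_port]
    return [routed_egress_port]

# Assumed setup: the CPU sits on port 0, the primary endpoint on port 1,
# and the redundant endpoint on port 2.
rule = DualCastRule(ingress_port=0, primary_egress_port=1,
                    window_base=0x8000_0000, window_size=0x1000,
                    secondary_egress_port=2)

print(egress_ports(rule, 0, 1, 0x8000_0040))  # write inside the window
print(egress_ports(rule, 0, 1, 0x9000_0000))  # write outside the window
```

Under these assumptions the CPU issues a single write for data inside the window and the switch produces the second copy, which is where the halving of CPU writes comes from.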
Debugging and diagnosing links
In addition to these application-specific performance features, the latest PCIe Gen 2 switches offer a number of integrated debug and diagnostic features that can accelerate system bring-up.
PCIe packet generator
A built-in PCIe packet generator allows designers to exercise the PCIe switch's external links at full wire speed (5.0 Gbps). This programmable generator, which enables designers to create their own traffic patterns, is robust enough to saturate a x16 Gen 2 link. The packet generator can be extremely useful during system debug and bring-up, allowing designers to test their systems against customizable, high-density traffic.
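To put a saturated x16 Gen 2 link in numbers: Gen 1 and Gen 2 both use 8b/10b encoding, so 8 of every 10 bits on the wire carry data. The sketch below works out the resulting rate per direction. The function and constant names are illustrative, and the figures exclude packet-level overhead such as headers and flow-control traffic, so achievable payload throughput is lower.

```python
GEN1_LANE_RATE_MBPS = 2500  # 2.5 Gbps per lane, per direction
GEN2_LANE_RATE_MBPS = 5000  # 5.0 Gbps per lane, per direction

def usable_link_rate_mbps(lane_rate_mbps: int, lanes: int) -> int:
    """Link rate after 8b/10b encoding overhead, in Mbit/s per direction."""
    return lane_rate_mbps * lanes * 8 // 10

gen1_x16 = usable_link_rate_mbps(GEN1_LANE_RATE_MBPS, 16)
gen2_x16 = usable_link_rate_mbps(GEN2_LANE_RATE_MBPS, 16)

print("Gen 1 x16:", gen1_x16, "Mbit/s =", gen1_x16 // 8, "MB/s per direction")
print("Gen 2 x16:", gen2_x16, "Mbit/s =", gen2_x16 // 8, "MB/s per direction")
```

A x16 Gen 2 link therefore tops out at 64,000 Mbit/s, or 8,000 MB/s, in each direction before packet overhead, twice the Gen 1 figure.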
Performance monitoring
An integrated real-time performance monitor, viewed through GUI design tools for PCIe switches, shows designers the ingress and egress performance of each port as traffic passes through the switch. Performance monitoring is completely passive and thus has no effect on overall system performance. Internal counters provide granularity down to traffic and packet types. Designers can also modify the sample application code for further customization, such as traffic filtering. This feature is useful for exposing performance bottlenecks, identifying underutilized links, and optimizing performance in a system.
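As a rough illustration of how counter readings translate into bottleneck and underutilization findings, the sketch below turns two snapshots of a per-port byte counter into a utilization figure. It does not use any real switch API: the counter values, the 10 percent and 90 percent thresholds, and all names are assumptions chosen for the example.

```python
def port_utilization(bytes_before, bytes_after, interval_s, capacity_bytes_per_s):
    """Fraction of link capacity used between two counter snapshots."""
    return (bytes_after - bytes_before) / (interval_s * capacity_bytes_per_s)

def classify_ports(samples, interval_s, capacity_bytes_per_s, low=0.10, high=0.90):
    """Label each port from its (before, after) byte-counter snapshots."""
    report = {}
    for port, (before, after) in samples.items():
        used = port_utilization(before, after, interval_s, capacity_bytes_per_s)
        if used >= high:
            label = "possible bottleneck"
        elif used <= low:
            label = "underutilized"
        else:
            label = "ok"
        report[port] = (used, label)
    return report

# Assumed x4 Gen 2 ports: 5.0 Gbps x 4 lanes x 8/10 = 2,000 MB/s per direction.
X4_GEN2_BYTES_PER_S = 2_000_000_000

# Hypothetical one-second snapshots of an egress byte counter per port.
samples = {
    0: (0, 1_900_000_000),
    1: (0, 100_000_000),
    2: (0, 1_000_000_000),
}

report = classify_ports(samples, interval_s=1.0,
                        capacity_bytes_per_s=X4_GEN2_BYTES_PER_S)
for port, (used, label) in report.items():
    print(f"port {port}: {used:.0%} {label}")
```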
SERDES eye capture
Designers can evaluate a system's signal integrity at the physical layer using a switch's SERDES eye capture feature, as seen in Figure 3. Again using software tools for the switch, designers can view the receiver eye of any lane on the switch. This feature is useful for spotting gross signal integrity errors: a designer can modify the SERDES settings and see each adjustment's impact on the receiver eye.
Error injection
Error injection allows designers to inject malformed packets and/or fatal errors into their systems, enabling them to evaluate their systems' abilities to detect and recover from such errors.
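The idea behind an error-injection test can be shown in miniature: corrupt a protected packet on purpose, then confirm that the receiver's check catches it. The sketch below is a software analogy, not a model of the switch feature. It uses Python's `zlib.crc32` as a stand-in for a link-level CRC, and the framing and function names are assumptions.

```python
import zlib

def frame(payload: bytes) -> bytes:
    """Append a 32-bit CRC to the payload (stand-in for a link CRC)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")

def is_intact(framed: bytes) -> bool:
    """Receiver-side check: recompute the CRC and compare."""
    payload, received_crc = framed[:-4], int.from_bytes(framed[-4:], "big")
    return zlib.crc32(payload) == received_crc

def inject_bit_error(framed: bytes, bit_index: int) -> bytes:
    """Flip a single bit to simulate an injected error."""
    corrupted = bytearray(framed)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)

good = frame(b"memory write payload")
bad = inject_bit_error(good, bit_index=3)

print(is_intact(good))  # clean packet passes the check
print(is_intact(bad))   # injected error is detected
```

Detection is only half of what such a test exercises. In a real system the more interesting question is whether the system recovers afterward, which this sketch does not model.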
Bridging the gap
Designers who overlook the benefits of PCIe Gen 2 beyond the extra bandwidth may also not realize that the transition from Gen 1 to Gen 2 is substantially simpler than it seems. As required by the PCI Express Base Specification Revision 2.0, PCIe Gen 2 is backward-compatible with Gen 1. Therefore, all PCIe Gen 2 devices, including switches, link up with all Gen 1 devices. As shown in Figure 4, a Gen 2 switch can be used as a Gen 1-to-Gen 2 bridge.
Designers using a legacy PCIe Gen 1 chipset, for example, can use a Gen 2 switch to connect to Gen 2 endpoints. Using the Gen 2 switch not only offers various performance and debug features but also allows peer-to-peer traffic between the Gen 2 endpoints to run at 5.0 Gbps, doubling the peer-to-peer performance a Gen 1 switch offers.
Conversely, designers using a Gen 2 chipset can utilize a Gen 2 switch to fan out to Gen 1 endpoints. By using a Gen 2 switch, designers can take advantage of the aforementioned performance and debug features built into the Gen 2 switch, while at the same time gaining the flexibility to connect to a Gen 2 endpoint as the design evolves.
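Both scenarios follow from one rule: each link runs at the highest rate that both of its partners support. A minimal sketch of that rule follows, with illustrative names and the per-lane rates quoted earlier in the article.

```python
GEN_LANE_RATE_GBPS = {1: 2.5, 2: 5.0}

def negotiated_generation(gen_a: int, gen_b: int) -> int:
    """A link settles on the highest generation both partners support."""
    return min(gen_a, gen_b)

def link_rate_gbps(gen_a: int, gen_b: int) -> float:
    """Per-lane rate of the link between two devices."""
    return GEN_LANE_RATE_GBPS[negotiated_generation(gen_a, gen_b)]

SWITCH_GEN = 2

# Scenario 1: legacy Gen 1 chipset, Gen 2 switch, Gen 2 endpoints.
upstream = link_rate_gbps(1, SWITCH_GEN)      # chipset <-> switch
downstream = link_rate_gbps(SWITCH_GEN, 2)    # switch <-> each endpoint
print("Gen 1 chipset link:", upstream, "Gbps per lane")
print("Gen 2 endpoint links (peer-to-peer path):", downstream, "Gbps per lane")

# Scenario 2: Gen 2 chipset, Gen 2 switch, Gen 1 endpoints.
print("Gen 2 chipset link:", link_rate_gbps(2, SWITCH_GEN), "Gbps per lane")
print("Gen 1 endpoint links:", link_rate_gbps(SWITCH_GEN, 1), "Gbps per lane")
```

In the first scenario only the chipset link is held to 2.5 Gbps per lane. Traffic between the Gen 2 endpoints never crosses that link, so it runs at the full 5.0 Gbps.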
PCIe Gen 2 here and now
As the PCIe market continues to expand, so does innovation in PCIe switches. Today's Gen 2 switches offer high performance, integrated application-specific performance-enhancement features, and built-in debug and diagnostic features. Plugging in a Gen 2 switch is the easiest way for designers to upgrade their legacy Gen 1 system to be Gen 2-capable, future-proofing their systems in this rapidly evolving market.
Touseef Bhatti is senior product marketing engineer at PLX Technology, based in Sunnyvale, California. He holds a BSCE from the University of Michigan.
PLX Technology
408-774-9060
www.plxtech.com