The Advent of PCI Express
ATI and NVIDIA Discuss Their Plans

By: Chris Angelini
March 23, 2004

There was once a time when running an intensive 3D game at a resolution of 1600x1200 was wishful thinking.  Frame rates were laughable at best.  And what about anti-aliasing and anisotropic filtering? No can do.

Believe it or not, that was a scant three and a half years ago, when NVIDIA's GeForce2 Ultra roamed the earth devouring hapless RADEON and Voodoo5 cards, before NVIDIA swallowed the once-dominant 3dfx whole.  Now where are we?  Blasting through the latest first-person shooters at that mythological 1600x1200, toying with the extra knobs and switches enabled by gratuitous fill rate and ever-widening memory buses.  There's no doubt about it - hardware is outpacing software, and it shows no sign of relenting.  Sure, Doom III is on the way and so is Half-Life 2, eventually.  Yet NVIDIA and ATI are both determined to enable even more realism: an era of interactive 3D that rivals "Finding Nemo" for graphical glory. 

Clearly, that caliber of content is a ways off.  No matter, though; we're perfectly happy playing multiplayer Halo on RADEON 9800 XT and GeForce FX 5950 Ultra cards until then, right?  But it's a different story entirely for the big swinging GPU manufacturers anxious to convince you that dropping $500 on cutting-edge graphics is justified.

Also consider that the FCC voted in 2002 to require electronics manufacturers to include digital tuners in all new television sets.  What does that have to do with PC graphics?  Well, HDTV is on the way, and nobody wants to be left behind.  We've already seen ATI's HDTV Wonder announcement, which precedes mainstream acceptance of a technology already available from companies like DVICO and MIT.  Once high-definition video starts finding its way into more homes, it seems natural that HD editing will follow.  And ATI already demonstrated an example of HD editing in conjunction with Intel and Pinnacle Systems at IDF in February.  According to ATI's press release, one very important architectural feature enabled that demonstration - PCI Express.

PCI Express
A little more about the technology...

PCI-SIG (the Peripheral Component Interconnect Special Interest Group) defines PCI Express as "...an open specification designed from the start to address the wide range of current and future system interconnect requirements of multiple market segments in the computing and communications industries. The PCI Express Architecture defines a flexible, scalable, high-speed, serial, point-to-point, hot pluggable/hot swappable interconnect that is software-compatible with PCI."

That's a pretty dense specification summary, so we'll go over it in a bit more depth.  Firstly, PCI Express is an open specification, meaning anyone can implement it.  Although Intel will be the first manufacturer to debut the technology, representatives at NVIDIA have commented that the firm will also support PCI Express in an upcoming version of the nForce3 chipset once Intel unveils its Grantsdale and Alderwood chipsets.  Moreover, SiS has already announced its own 965 South Bridge, which incorporates two PCI Express 1x connectors in addition to eight-channel audio, four-port Serial ATA, and integrated Gigabit Ethernet.

Secondly, PCI Express is flexible.  That is, its architecture scales by adding "lanes."  Initial implementations of PCI Express will employ single-lane and 16-lane designs.  Should the need arise, there's also a 32-lane specification for even higher levels of bandwidth.  As it stands, the 16x slot offers 4GB per second of throughput in each direction, nearly double the bandwidth of AGP 8x.
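
To put those figures in context, here's a quick back-of-the-envelope calculation - a sketch based on the well-known first-generation signaling numbers, not anything supplied by ATI or NVIDIA.  Each lane runs at 2.5 gigatransfers per second with 8b/10b encoding, which works out to roughly 250MB per second of payload capacity in each direction, per lane; packet overhead pulls the practical figures a little lower.

    # Back-of-the-envelope PCI Express bandwidth, per direction.
    # Assumes first-generation signaling: 2.5GT/s per lane with 8b/10b encoding.
    TRANSFERS_PER_SEC = 2.5e9       # raw line rate per lane
    ENCODING_EFFICIENCY = 8 / 10    # 8b/10b line code: 8 data bits per 10 line bits

    def pcie_mb_per_sec(lanes):
        """Usable bytes per second, per direction, for a given lane count."""
        bits = TRANSFERS_PER_SEC * ENCODING_EFFICIENCY * lanes
        return bits / 8 / 1e6       # bits -> bytes -> megabytes

    for lanes in (1, 16, 32):
        print(f"{lanes}x: {pcie_mb_per_sec(lanes):,.0f}MB/s per direction")

    # AGP 8x for comparison: a 32-bit bus at an effective 533MHz
    print(f"AGP 8x: {32 / 8 * 533e6 / 1e6:,.0f}MB/s")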

Finally, PCI Express is high-speed, serial, and point-to-point.  Existing platforms that employ PCI share the interface's 133MB per second among every device on the bus; PCI Express, by contrast, provides more than 200MB per second of dedicated bandwidth between each 1x slot and the platform's core logic.  The PCI-SIG explains the interface's serial nature by saying, "Serial bus architectures deliver more bandwidth per pin than parallel bus architectures, and they scale more easily to higher bandwidths. Serial bus architectures enable a network of dedicated point-to-point links between devices as opposed to the multi-drop basis of parallel bus architectures. This eliminates the need for bus arbitration, provides deterministic low latency, and greatly simplifies hot plug/hot swap system implementations."
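
The difference between a shared bus and dedicated links is easy to see with a little arithmetic.  The sketch below is illustrative only; the device counts are hypothetical.

    # Conventional PCI: one shared 32-bit, 33MHz bus for every device on it.
    pci_total = 4 * 33.33e6 / 1e6          # roughly 133MB/s for the entire bus
    for devices in (1, 2, 4):
        print(f"{devices} PCI device(s): about {pci_total / devices:.0f}MB/s apiece, shared")

    # PCI Express: every 1x link is a dedicated point-to-point connection, so each
    # device keeps its full per-direction bandwidth regardless of how many other
    # devices the system has.
    print("each PCI Express 1x device: >200MB/s per direction, dedicated")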

Of course, PCI Express won't replace PCI overnight.  Rather, the two interfaces will co-exist for some time. There also shouldn't be any problems when it comes to software support.  PCI Express graphics cards won't require any programming considerations from third-party software developers.  Only those toying with BIOS files and drivers will have to concern themselves with the difference.

NVIDIA's Approach
The bridge chip

Now, there's been a fair bit of debate between ATI and NVIDIA over the "best" way to move from today's AGP standard to the impending takeover of PCI Express.  Jen-Hsun Huang, NVIDIA's CEO, has already commented that the NV4x family will be natively PCI Express.  However, the PCI Express cards slated to emerge before NV4x will require NVIDIA's own HSI (High Speed Interconnect) bridge chip for compliance.  Four products with the bridge chip have already been announced - the GeForce PCX 5950, the GeForce PCX 5750, the GeForce PCX 5300 (a GeForce FX 5200 with PCI Express support), and the GeForce PCX 4300 (an older GeForce4 MX 440 adapted for PCI Express). 

According to NVIDIA, its decision to use the HSI bridge is influenced by a couple of factors.  First, it allows the firm to manufacture one GPU with support for two interfaces.  In the case of its NV3x family, AGP is supported natively, and the HSI bridge provides quick access to PCI Express operability.  When NV4x rolls around with native PCI Express support, the same HSI bridge can be reversed to produce an AGP variant of the board.  NVIDIA also claims that a bridge chip is cheaper to manufacture, since it precludes the need for two versions of the same GPU - one with AGP support and the other sporting PCI Express.  The cost of taping out a graphics processor is reportedly about $1 million, so the economic argument seems sound for a product range comprising multiple chips that would otherwise need to exist in both AGP and PCI Express form.
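
Using the roughly $1 million tape-out figure NVIDIA cites, the economics are simple to sketch.  The lineup size below is hypothetical, chosen purely for illustration.

    # Rough tape-out economics: bridge chip versus dual native designs.
    # The ~$1 million per tape-out figure is the one NVIDIA cites; the lineup
    # size is a hypothetical example.
    TAPE_OUT_COST = 1_000_000
    gpus_in_lineup = 4              # e.g. high-end, performance, mainstream, value

    # Option A: tape out separate AGP and PCI Express versions of every GPU.
    dual_native = gpus_in_lineup * 2 * TAPE_OUT_COST

    # Option B: tape out each GPU once, plus a single HSI bridge that converts
    # any of them between AGP and PCI Express.
    bridged = (gpus_in_lineup + 1) * TAPE_OUT_COST

    print(f"Two native designs per GPU: ${dual_native:,}")
    print(f"One design per GPU plus a bridge: ${bridged:,}")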

"But what about the augmented bandwidth numbers?" you ask.  According to NVIDIA's spokespeople, the current generation of graphics processors is manufactured with enough tolerance to withstand an accelerated AGP bus - the equivalent of AGP 16x, they claim.  Older architectures, presumably the GeForce MX 440, are limited to an equivalent of AGP 12x made possible by employing short traces from the GPU to bridge chip. The following diagram illustrates a typical PCI Express usage model.  Note that the bandwidth numbers can be altered, and the total 4.2GB per second can be allocated however needed.  And the 266MB per second upstream limitation commonly believed to ail AGP cards is not a factor at all, as that only holds true for platforms limited to PCI-writes.  The HSI bridge supports AGP writes, alleviating that potential bottleneck. 


[Figure: Theoretical PCI Express Bandwidth]

[Figure: Effective PCI Express Bandwidth]

[Figure: Typical PCI Express Usage, Per NVIDIA]
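
To make the "allocate it however needed" point concrete, here is a trivial sketch of dividing that AGP-16x-equivalent budget between downstream and upstream traffic.  The specific splits are hypothetical.

    # NVIDIA's AGP-16x-equivalent budget is simply double AGP 8x's 2.1GB/s,
    # shared across both directions.  The splits below are hypothetical examples.
    TOTAL_GB_PER_SEC = 2 * 2.1      # 4.2GB/s in total

    for downstream_share in (0.5, 0.75, 0.9):
        downstream = TOTAL_GB_PER_SEC * downstream_share
        upstream = TOTAL_GB_PER_SEC - downstream
        print(f"downstream {downstream:.2f}GB/s, upstream {upstream:.2f}GB/s")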

ATI claims that NVIDIA's implementation introduces the possibility of added latency, which NVIDIA vehemently denies.  Instead, by maximizing in-flight requests, utilizing 64-byte request sizes instead of 32, and bolstering AGP speeds, the PCX family purportedly circumvents any latency penalty.  Then again, the argument is hard to settle given the lack of PCI Express hardware and the general absence of software demanding enough to expose adverse amounts of latency.
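
NVIDIA's counter-argument boils down to keeping the link saturated: with enough outstanding requests of a large enough size, added round-trip latency stops limiting throughput.  The Little's-law-style sketch below shows the shape of that reasoning; only the 32- and 64-byte request sizes come from NVIDIA, while the latency and request counts are invented for illustration.

    # Sustained throughput = bytes in flight / round-trip latency (Little's law).
    # The 1-microsecond latency and the outstanding-request counts are
    # hypothetical; only the 32- and 64-byte request sizes come from the article.
    ROUND_TRIP_S = 1e-6

    for request_bytes in (32, 64):
        for in_flight in (4, 8, 16):
            mb_per_sec = request_bytes * in_flight / ROUND_TRIP_S / 1e6
            print(f"{in_flight:>2} outstanding {request_bytes}-byte requests: "
                  f"{mb_per_sec:,.0f}MB/s sustained")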

NVIDIA is confident in the readiness of its PCI Express lineup, which is reportedly already validated on Intel's Grantsdale (desktop), Tumwater (workstation), and Alviso (mobile) chipsets.  The company demonstrated a PCI Express platform playing back HDTV content during the most recent IDF, though it should be noted that even today's AGP cards are capable of that.

ATI's Approach and Conclusion