The ATi Radeon 9800 Pro Full Release Review
ATi Technologies Distances Itself From NVIDIA Once Again

By Dave Altavilla
March 5, 2003

 

During our Tech Briefing with ATi in New York last month, ATi provided a foil set comparing the Radeon 9800 Pro to the GeForce FX.  We've snipped a couple of the salient benefits slides for you here.

This chart shows you a few of the raw specifications of the 9800 Pro versus the GeForce FX.  As you can see, ATi is claiming full OpenGL 2.0 support, which means the VPU needs to support shader programs of unlimited instruction length.  NVIDIA's claim to fame with the GFFX has always been "DX9+", with support for shaders up to 1024 instructions long.  The Radeon 9800 Pro has the ability to go well beyond that with its new "F-Buffer" technology, which we'll cover shortly.  Also, Peak Memory Bandwidth (setting aside the Marketing games played with compression ratios) is now up to 21.8GB/sec, significantly higher than that of its rival's flagship GPU.
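For the curious, that 21.8GB/sec figure is easy to sanity check.  Here's a minimal back-of-the-envelope sketch in Python.  It assumes a 256-bit memory bus and a 340MHz DDR memory clock for the 9800 Pro, and a 128-bit bus with 500MHz DDR-II memory for the GeForce FX 5800 Ultra.  Those bus widths and clocks are our assumptions for illustration, not numbers taken from ATi's slide, and the function name is just ours.

```python
def peak_bandwidth_gb_per_sec(bus_width_bits, memory_clock_mhz, transfers_per_clock=2):
    """Theoretical peak memory bandwidth in GB/sec (1 GB = 10**9 bytes).

    transfers_per_clock=2 models DDR memory, which moves data on both
    the rising and falling clock edges.
    """
    bytes_per_transfer = bus_width_bits / 8
    transfers_per_sec = memory_clock_mhz * 1_000_000 * transfers_per_clock
    return bytes_per_transfer * transfers_per_sec / 1_000_000_000


# Assumed specifications, for illustration only.
radeon_9800_pro = peak_bandwidth_gb_per_sec(bus_width_bits=256, memory_clock_mhz=340)
geforce_fx_5800_ultra = peak_bandwidth_gb_per_sec(bus_width_bits=128, memory_clock_mhz=500)

print(f"Radeon 9800 Pro:       {radeon_9800_pro:.1f} GB/sec")
print(f"GeForce FX 5800 Ultra: {geforce_fx_5800_ultra:.1f} GB/sec")
```

Under those assumptions, the wider bus is what puts ATi ahead here, even though the assumed memory clock is lower.  Keep in mind these are theoretical peaks; neither figure accounts for compression or real-world memory controller efficiency.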

 

 
Shots from the New York ATi Tech Briefing

New enhancements for the R350
More than just a speed bump

Next, we'll dig into some detail on the major differences between the R300 (or Radeon 9700 Pro) and the R350 / Radeon 9800 Pro.  While the base VPU hasn't changed much, ATi has taken steps to further optimize its high end core.

ATi's new R350 VPU incorporates a few new tweaks and optimizations beyond just a simple Engine Clock speed boost.  The R350 picks up where the R300 left off in terms of DX9 and OpenGL 2.0 support, as well as ATi's Image Quality enhancement techniques, called "SmoothVision", and its Memory Bandwidth saving Z-cache compression algorithms, now known as "Hyper Z III+".

 

Smart Shader 2.1
The "F-Buffer" for Limitless Shader Instructions

 

The "F-Buffer" is a fairly simple concept really, and it's exactly what allows the Radeon 9800 Pro, and other cards based on the R350 VPU, to offer shaders of unlimited instruction length to the Game Developer.  "F-Buffer" stands for Fragment Stream FIFO (First In, First Out) Buffer, which has been implemented on chip.  Specifically, it provides temporary storage for pixels that need to be processed over multiple passes of the shader engines, so their intermediate results don't have to be written out to the frame buffer.  Pixels that require only a single pass are written straight out to the frame buffer.  This provides a memory bandwidth savings and allows the VPU to handle processing workloads more efficiently.
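To make the idea a bit more concrete, here's a toy software model of the F-Buffer concept in Python.  This is only a sketch of the general approach described above, not ATi's hardware design.  The function and parameter names are hypothetical, and we're assuming that a multi-pass pixel lands in the frame buffer once its final pass is finished.

```python
from collections import deque


def render_with_f_buffer(fragments, passes_needed, shade_pass):
    """Toy model of multi-pass shading with a FIFO fragment buffer.

    fragments     -- list of (pixel_xy, initial_value) in rasterization order
    passes_needed -- dict mapping pixel_xy to the number of shader passes it needs
    shade_pass    -- function(pass_index, value) returning the new value
    Returns (frame_buffer, f_buffer_writes).
    """
    frame_buffer = {}
    f_buffer = deque()      # FIFO: fragments leave in the order they arrived
    f_buffer_writes = 0

    # First pass: single-pass pixels go straight to the frame buffer,
    # everything else is parked in the F-Buffer.
    for xy, value in fragments:
        value = shade_pass(0, value)
        if passes_needed[xy] > 1:
            f_buffer.append((xy, value, 1))
            f_buffer_writes += 1
        else:
            frame_buffer[xy] = value

    # Later passes: only the parked fragments are touched again.
    while f_buffer:
        xy, value, pass_index = f_buffer.popleft()
        value = shade_pass(pass_index, value)
        if pass_index + 1 < passes_needed[xy]:
            f_buffer.append((xy, value, pass_index + 1))
            f_buffer_writes += 1
        else:
            # Assumption: the finished result is written out here.
            frame_buffer[xy] = value

    return frame_buffer, f_buffer_writes


# Hypothetical example: one simple pixel and one that needs three passes.
fragments = [((0, 0), 1.0), ((1, 0), 2.0)]
passes_needed = {(0, 0): 1, (1, 0): 3}
frame_buffer, f_buffer_writes = render_with_f_buffer(
    fragments, passes_needed, lambda pass_index, value: value + 1.0
)
```

The point of the sketch is that the frame buffer is written once per pixel, no matter how many passes a long shader takes, while the intermediate traffic stays in the FIFO.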

SmoothVision 2.1

ATi has also tweaked the Radeon 9800 Pro's memory controller, allowing higher performance and greater efficiency with Anti-Aliasing loads.  At higher resolutions with 4X and 6X AA settings, the 9800 Pro should by all rights be somewhat faster than the 9700 Pro, clock for clock.  Regardless, ATi's pristine looking Gamma Corrected 4X and 6X AA methods are still arguably the best looking approach to getting rid of the jaggies available on the market today.
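If you're wondering what "Gamma Corrected" buys you, the general idea is that the AA samples for a pixel get blended in linear light rather than in display (gamma encoded) space.  Here's a minimal Python sketch of that general technique, assuming a simple power-law display gamma of 2.2.  The gamma value and the function names are our assumptions, and this is not a description of ATi's actual hardware implementation.

```python
GAMMA = 2.2  # assumed display gamma, for illustration only


def to_linear(value):
    """Convert a gamma encoded intensity (0.0 to 1.0) to linear light."""
    return value ** GAMMA


def to_display(value):
    """Convert a linear light intensity (0.0 to 1.0) back to gamma encoded."""
    return value ** (1.0 / GAMMA)


def resolve_naive(samples):
    """Average the stored sample values directly."""
    return sum(samples) / len(samples)


def resolve_gamma_corrected(samples):
    """Average in linear light, then re-encode for the display."""
    linear = [to_linear(s) for s in samples]
    return to_display(sum(linear) / len(linear))


# A 4X AA pixel sitting on a white/black edge: two samples white, two black.
edge_samples = [1.0, 1.0, 0.0, 0.0]
naive = resolve_naive(edge_samples)
corrected = resolve_gamma_corrected(edge_samples)
print(f"naive: {naive:.2f}  gamma corrected: {corrected:.2f}")
```

With these assumptions the naive blend stores 0.50 while the gamma corrected blend stores roughly 0.73, which is the encoded value that actually displays as half brightness.  That difference is why edge gradients look smoother with a gamma corrected resolve.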

Hyper Z III+

Hyper Z III is ATi's set of compression and caching techniques, aimed at providing valuable memory bandwidth savings in the Z-Buffer and Stencil Buffer.  Rather than dissecting the technology for you here, we'll let the folks at ATi go through its benefits.  Here's what they claim Hyper Z III+ brings to the table.

HYPER Z III+ takes this technology a step further with an enhanced Z-cache that is more flexible and better optimized to work with stencil buffer data. The stencil buffer co-exists with the Z-buffer and behaves similarly, in that an application can set a pixel's stencil value and compare it against the value stored in the stencil buffer to determine if the pixel gets rendered or not. The main difference is that the Z values in the Z-buffer represent the depth of a pixel, while the values in the stencil buffer can represent anything the programmer wants them to.

One of the most common uses for the stencil buffer is for rendering real-time shadow volumes. In this case, the application calculates which parts of the image fall in the shadow of other objects, and stores these shadowed areas in the stencil buffer. The graphics processor can then compare each pixel it renders with the stencil buffer values to determine if it falls within the shadow of any objects that have already been rendered. As long as all objects are rendered in the correct order, this technique makes it possible to generate accurate shadows for any moving object and/or light sources in a scene.

This process requires a lot of extra computation, so it has been used sparingly (if at all) in most existing games. Future game engines, however, such as the Doom 3 engine, are expected to use it heavily to create very realistic environments. The enhanced Z-cache feature of HYPER Z III+ increases the performance of stencil shadow volumes and will help to deliver a superior experience when playing the next generation of 3D games.
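For readers who haven't bumped into stencil shadow volumes before, here's a small Python sketch of one common way the stencil buffer gets used for them, usually called depth-pass counting.  We're naming that scheme ourselves; ATi's text above doesn't say which variant any particular game uses.  The sketch works on a single pixel, assumes the camera sits outside every shadow volume, and uses hypothetical function names.

```python
def stencil_count(scene_depth, volume_faces):
    """Depth-pass stencil counting for one pixel.

    scene_depth  -- depth of the visible surface at this pixel (smaller = closer)
    volume_faces -- list of (face_depth, is_front_facing) for every shadow
                    volume polygon that covers this pixel
    """
    stencil = 0
    for face_depth, is_front_facing in volume_faces:
        if face_depth < scene_depth:      # depth test passes: face is in front
            if is_front_facing:
                stencil += 1              # ray enters a shadow volume
            else:
                stencil -= 1              # ray leaves a shadow volume
    return stencil


def in_shadow(scene_depth, volume_faces):
    """Stencil test: a non-zero count means the surface is inside a volume."""
    return stencil_count(scene_depth, volume_faces) != 0


# One shadow volume whose front face is at depth 2.0 and back face at 4.0.
volume = [(2.0, True), (4.0, False)]

surface_inside = in_shadow(3.0, volume)   # sits between the two faces
surface_behind = in_shadow(5.0, volume)   # sits past the whole volume
surface_front = in_shadow(1.0, volume)    # sits in front of the volume
```

Every shadow volume polygon means extra reads and writes against the stencil and Z data for each pixel it covers, which is why caching and compression aimed at those buffers should pay off in engines that lean on this technique.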

Translation?  Doom3 = faster...  Stencil Shadow Volumes and real-time shadow effects = Faster... Piece of cake, right?  OK then, let's move out.
  

 

A New Growing Family Of Radeons