Intel's MCM Design Patent Could Be A Sneak Peek At Future Arc Gaming GPUs

Here's a processor primer for novices: classical computing workloads are largely "serial", meaning they work through problems A, to B, to C, in a series. CPUs have been designed over the years to get better and better at this kind of workload. More and more, though, we're also doing parallel work, which is to say workloads that can process problems A, B, and C simultaneously. 3D graphics rendering (whether for games or otherwise) is one such workload, and that's why the most common parallel processors today are GPUs.

One relatively simple way to make a faster parallel processor is to make it bigger. By adding more and more functional elements, you can make your parallel processor capable of handling more work simultaneously. As long as your workload is massively parallel, as graphics rendering usually is, you can get a near-linear increase in performance this way. This kind of thinking is what has led us to the gigantic, power-slurping GPUs currently sitting at the top of AMD's and NVIDIA's product stacks.
Intel figure pointing out poor scaling of past multi-GPU solutions.
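
To put a rough number on that "near-linear" claim, here's a quick sketch using Amdahl's law. This is our own illustration, not something taken from Intel's filing; the function name and the sample fractions are made up for the example, and the model ignores real-world overheads like inter-chip communication.

```python
def amdahl_speedup(parallel_fraction: float, units: int) -> float:
    """Ideal speedup under Amdahl's law.

    parallel_fraction: share of the work (0.0 to 1.0) that can be spread
                       across processing units (an illustrative input).
    units:             number of identical processing units.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if units < 1:
        raise ValueError("units must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part runs at full cost; the parallel part is divided up.
    return 1.0 / (serial_fraction + parallel_fraction / units)


if __name__ == "__main__":
    for fraction in (0.5, 0.99):
        print(fraction, round(amdahl_speedup(fraction, 4), 2))
```

With 99% of the work parallelizable, four units come out to roughly 3.88x in this model, which is close to linear. At 50%, the same four units manage only 1.6x, which is why this approach pays off for graphics but not for typical serial code.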

There's only so big you can make a parallel processor, though. Even setting aside practical issues such as cost and yield (the percentage of dice that come out problem-free), there's a literal physical limit to how large a processor it's possible to fabricate with current chip fab processes. With that limit in mind, every company creating these compute-oriented chips is looking toward multi-chip module (MCM) technology, or "chiplets", and Intel is no exception.

Using multiple processors to perform graphics work is no novel idea, of course; even before real-time 3D graphics were a thing, folks were using powerful multi-processing systems to perform so-called "offline" 3D rendering. The difficulty with MCM GPUs is in figuring out exactly how best to sub-divide the work among multiple separate processors. It's one thing when you have parallel functional elements inside a processor working from the same caches; it's another thing altogether when you have two or more discrete processors working in tandem.
Intel image illustrating "checkerboard" of tiles. T0-T3 represent GPUs.

Intel thinks it has a clever way to go about this. In a patent revealed by Underfox (@Underfox3 on Twitter), the company lays out its plans for multi-die graphics processing in fairly explicit detail. Essentially, it seems like the company is doing a sort of tile-based checkerboard rendering using vertex position calculation to determine visibility. By limiting each batch of work to a set of pre-defined screen space tiles, Intel's design can apparently scale out to multiple dice more efficiently.
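
For a feel of how a checkerboard split like this might hand out work, here's a minimal sketch. It assumes four GPUs (T0 through T3) arranged in a repeating 2x2 pattern of screen-space tiles; the 64-pixel tile size, the function names, and the exact pattern are our own hypothetical choices for illustration, not details taken from the patent.

```python
TILE_SIZE = 64  # hypothetical tile edge length in pixels
GRID = 2        # assumed 2x2 repeating pattern, i.e. four GPUs (T0-T3)


def tile_owner(tx: int, ty: int) -> int:
    """Return the GPU index that owns tile column tx, tile row ty."""
    return (ty % GRID) * GRID + (tx % GRID)


def gpus_for_bbox(x_min: int, y_min: int, x_max: int, y_max: int) -> set[int]:
    """Return the GPUs whose tiles overlap an inclusive pixel bounding box.

    The box stands in for a primitive's screen-space extent, which would
    be known once its vertex positions have been calculated. Coordinates
    are assumed to be non-negative.
    """
    owners = set()
    for ty in range(y_min // TILE_SIZE, y_max // TILE_SIZE + 1):
        for tx in range(x_min // TILE_SIZE, x_max // TILE_SIZE + 1):
            owners.add(tile_owner(tx, ty))
    return owners


if __name__ == "__main__":
    print(gpus_for_bbox(0, 0, 10, 10))    # small primitive inside one tile
    print(gpus_for_bbox(0, 0, 127, 127))  # primitive spanning a 2x2 block
```

In this sketch, a small primitive that sits inside a single tile only concerns one GPU, while a large one spanning a 2x2 block of tiles touches all four. Interleaving the tiles this way tends to spread busy regions of the screen across every chip instead of loading up just one.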

AMD has already filed its own patent for a similar technology; indeed, its distributed graphics processing patent also works by using a checkerboard render system. You can read these filings yourself for reference if you'd like: Intel's patent is here, and AMD's is over here.

Intel chart demonstrating distribution of work across four GPUs.

We haven't heard any whispers of an MCM Intel GPU intended for the desktop, but Intel's Ponte Vecchio Xe-HPC accelerator is exactly such a processor. It will be interesting to see how soon this concept makes its way to gaming PCs. Of course, Intel needs to get Alchemist to gamers first.