
Historically, GPUs have been designed as monolithic dies with all of their functionality under one 'roof.' This hasn't always been the case; the earliest GPUs sometimes used separate chips for specific functionality, and both AMD and Nvidia have, at various times, used different cores to provide support for additional monitors or to bridge connections between PCI Express and AGP.

As far as the core components of the GPU itself, however, those have been single-die affairs for a long time. That's why it's a tad surprising to see Nvidia is now evaluating the possibility of a multi-chip GPU that would communicate with other parts of the core in something like an MCM (multi-chip module).

But monolithic GPU designs suffer from a number of problems. First, they can be reticle-busters, pushing the limits of what TSMC, GlobalFoundries, or Samsung can build into a single core. Harvesting good GPUs from partially defective die can also cause problems, as happened with Nvidia and the GTX 970. (Long story short: the method Nvidia used to recover parts for the GTX 970 also had an impact on the GPU's memory bandwidth when accessing its last 512MB of RAM.) If GPUs were built in modules, then connected together on a common package, the resulting chip could theoretically be larger and more powerful than any single card.


In the authors' study (available here), they believe they can surpass the performance of the largest buildable GPU by 44.5 percent, and come within 10 percent of a monolithic GPU die that surpasses any product currently buildable at any foundry.


Now, in an ideal situation, this approach could yield huge improvements to performance and even power consumption and TDP, since you wouldn't have the entire GPU's horsepower concentrated in such a small space. I would caution, however, against leaping to conclusions. The authors of the report acknowledge that this would require software that was NUMA (Non-Uniform Memory Access) aware. There would be an inevitable performance hit when accessing data held in a different GPU module or sharing information across multiple cores.
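To make that locality concern concrete, here's a minimal sketch using today's CUDA multi-GPU runtime API (it assumes a system with at least two GPUs, and it's illustrative only, not anything from the paper or from Nvidia's MCM plans). Data allocated on one device is "remote" to another, so software has to choose between reaching across the interconnect or copying the data to local memory first, which is the same kind of NUMA-style decision MCM-aware software would have to make for each module.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Minimal multi-GPU locality sketch. A buffer allocated on device 0 is remote
// to device 1, so device 1 must either enable peer access (every access pays
// the interconnect penalty) or copy the buffer into its own local memory.

#define CHECK(call)                                                       \
    do {                                                                  \
        cudaError_t err = (call);                                         \
        if (err != cudaSuccess) {                                         \
            fprintf(stderr, "CUDA error %s at %s:%d\n",                   \
                    cudaGetErrorString(err), __FILE__, __LINE__);         \
            return 1;                                                     \
        }                                                                 \
    } while (0)

int main() {
    const size_t bytes = 64 << 20;  // 64MB test buffer

    // Allocate the buffer in device 0's local memory.
    float *buf0 = nullptr;
    CHECK(cudaSetDevice(0));
    CHECK(cudaMalloc(&buf0, bytes));

    // Option A: let device 1 read device 0's memory directly (if supported).
    // Every access crosses the inter-GPU link: the "remote access" penalty.
    int canPeer = 0;
    CHECK(cudaDeviceCanAccessPeer(&canPeer, 1, 0));
    CHECK(cudaSetDevice(1));
    if (canPeer) {
        CHECK(cudaDeviceEnablePeerAccess(0, 0));
        printf("Device 1 can access device 0's memory directly (remote, slower).\n");
    }

    // Option B: copy the data into device 1's local memory up front, then
    // work on the local copy. More bookkeeping, but fast local accesses.
    float *buf1 = nullptr;
    CHECK(cudaMalloc(&buf1, bytes));
    CHECK(cudaMemcpyPeer(buf1, 1, buf0, 0, bytes));
    printf("Copied buffer to device 1's local memory (local, faster access).\n");

    CHECK(cudaFree(buf1));
    CHECK(cudaSetDevice(0));
    CHECK(cudaFree(buf0));
    return 0;
}
```

Scale that same decision up to every block of shaders and every slice of cache on a package, and you can see why getting the software side right matters as much as the packaging.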

The flip side to this is that these kinds of performance impacts already happen when customers try to deploy arrays of GPUs. The performance penalties today are significantly harsher than they would be in an MCM.

How credible is this approach?

I don't want to claim that Nvidia is preparing to roll out MCM-style GPUs right around the corner, but I'll say this: It's not crazy. The paper has more details, but the big-picture takeaway is that by giving each GPU block enough bandwidth, keeping latency low, and properly allocating cache resources, you can hit some significant performance targets. The trick, of course, is keeping all those resources properly balanced.

For those who would argue that this is just the same two-GPUs-on-one-card approach that we've seen from both AMD and Nvidia for years, no, it truly isn't. Nvidia isn't contemplating sharing data across an on-board PCIe bridge; they're talking about designing a GPU that's built, from the ground up, to share workloads across multiple GPU modules. Better or worse, it'd be greatly different from anything we've seen before.