Software allows us to make fewer GPUs, but use more
Understanding how the things we buy and use affect our environment is not a simple exercise. Unless you’re foraging for food and crafting hand tools, everything you consume has a complex environmental impact that’s probably bigger than you think.
For example, that little piece of sashimi you eat in SoHo may have been caught thousands of miles away in the Pacific and brought back to Japan (the boat burning fuel back and forth), transported on an airplane (burning more fuel) and delivered to the restaurant in a truck (fuel). Everyone involved – fisherman, airline crew, truck driver, chef, wait staff – had to get to and from work. And if you eat an apex predator like a tuna, there’s an additional set of ecosystem impacts downstream.
Computer chips are no different. In fact, manufacturing at such a staggering level of microscopic precision comes with a surprisingly large investment of energy and other resources, as well as other ecological impacts. And beyond manufacturing, chips must be distributed, installed, and ultimately powered inside computers — all activities that impact the environment.
As the behemoths of metaverse, machine learning, artificial intelligence, and gaming advance, GPU acceleration will become even more critical and pervasive – with corresponding increases in environmental pressures.
The impact of GPU manufacturing
The environmental impacts of modern chip manufacturing, or “fab”, can be categorized as follows (Note: I don’t want to pick on TSMC specifically, but they are the biggest chipmaker):
Construction of manufacturing facilities: a full account of the environmental impact of a massive new factory goes beyond this article, but the projected cost of a modern TSMC factory is close to $20 billion – you can imagine the magnitude of the environmental impact of such a construction project.
Carbon footprint of powering these facilities: Major manufacturers like TSMC are moving towards renewable energy, but the majority of their electricity still comes from traditional fossil fuel power plants. And those chip factories use a shocking amount of energy (yes, I know…) – TSMC uses about 15 TWh per year, more than the whole country of Paraguay.
Other greenhouse gases (excluding CO2): According to the EPA, “Semiconductor manufacturing processes use [high global-warming-potential] fluorinated compounds including perfluorocarbons (e.g. CF4, C2F6, C3F8 and c-C4F8), hydrofluorocarbons (CHF3, CH3F and CH2F2), nitrogen trifluoride (NF3) and sulfur hexafluoride (SF6)…[up to] 80% of fluorinated GHGs pass through the chambers of manufacturing tools without reacting and are released into the air.”
Water consumption: TSMC uses 150,000 tons of water a day in Taiwan alone, the equivalent of 500,000 households. This isn’t necessarily a problem in monsoon country until a drought hits, but it could continue to be a problem in places like Arizona.
Toxic chemicals: The manufacturing process includes hundreds of carefully orchestrated physical and chemical steps. Most of the chemicals used are protected as trade secrets and don’t even need to be disclosed. Those known to us include a wide range of carcinogenic, mutagenic and teratogenic metals, solvents and polymers. In addition to the health effects on workers inside, the resulting toxic waste ends up leaving factories one way or another, all too often ending up in soil and groundwater.
Extraction of precious metals: The 7.8 billion people on Earth generate an average of 13 pounds each of electronic waste per year. Only 20% of it is recycled, leaving the majority of chip manufacturing to be driven by new materials. Extracting the metals needed to produce chips scars landscapes and leaves toxic chemicals used in the refining process in soil, groundwater and surface water. The demand for metals in electronics is only increasing, and these impacts are increasing with it.
Most of this impact is attributable to CPUs rather than discrete GPUs. But despite their smaller numbers, GPUs are much bigger and heavier, so each one has a higher unit impact – and with 41 million discrete GPUs shipped in 2020, the total is significant.
Other Environmental Impacts of GPUs
Distribution: New GPUs must, of course, be delivered worldwide by ships, planes, and trucks.
Installation: Even if all you need is acceleration capability, you can’t just plug a GPU into your network – GPUs need to run inside computers. In the data center, this means “racking and stacking” entire servers to house the GPUs, along with the corresponding hardware that has its own impact on the environment.
Operation: GPUs consume a lot of power – and they heat up, so even more power is needed to cool them to operating temperature limits. They also draw a surprising amount of electricity when idle but “on” (ready for a workload). Overall, the data centers they operate in require constant cooling and other environmental controls that consume as much energy as the servers themselves.
Water, electricity, fossil fuels, greenhouse gases, toxic chemicals – the manufacture, delivery and operation of GPUs comes with a host of environmental impacts. And it’s ironic, for example, that the massive computing power that goes into things like climate simulation is also a substantial contributor to climate change.
Imagine instead that we could do a lot more with a lot less GPU. It may sound too good to be true.
Make less, use more
GPU utilization today is woefully low, averaging below 15%, as we explained earlier. Such underutilization is, no doubt, a colossal waste of resources – but it is also needlessly destructive to the environment.
If we could find a way to increase utilization to, say, 90%, we would be able to do six times the computation with the GPUs we have, do the same amount of computation with 1/6 the number of GPUs, or strike some balance in between. The leverage we could create would be incredible.
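To make that leverage concrete, here is a minimal Python sketch of the arithmetic. The function name and the fleet size of 600 GPUs are illustrative assumptions, not figures from this article; only the 15% and 90% utilization levels come from the text.

```python
def utilization_leverage(current_pct: float, target_pct: float) -> float:
    """How many times more work the same GPUs do at the target utilization."""
    if current_pct <= 0 or target_pct <= 0:
        raise ValueError("utilization percentages must be positive")
    return target_pct / current_pct


# Utilization levels from the text: roughly 15% today, 90% as the goal.
leverage = utilization_leverage(15, 90)

# Hypothetical fleet size, chosen only to make the division easy to read.
fleet_today = 600
gpus_for_same_work = fleet_today / leverage

print(f"Leverage: {leverage:.0f}x")
print(f"GPUs needed for today's workload: {gpus_for_same_work:.0f} of {fleet_today}")
```

Any mix between the two extremes works the same way: whatever share of the 6x is not spent on extra computation can be taken as GPUs that never have to be built.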
Imagine two futures five years from now where the world will need 10 times the accelerated computing capacity it has today:
Scenario 1: Business as usual (15% utilization)
Let’s expect roughly a 20% annual increase in compute capacity per GPU, based on the recent Moore’s Law slowdown. Over five years, that gives us 2.5x of our 10x (1.2⁵=2.49). We still need 4x on top of that 2.5x to get our 10x.
So we’ll need 4x more GPU cards, with a corresponding 4x in nasty environmental impacts. Let’s be generous and say that innovation towards cleaner and more efficient manufacturing, mining, delivery systems, etc. reduces these impacts by 10% (a 1.1x effect).
10x increase in computing / 2.5x “Moore’s Law” / 1.1x cleantech innovation → 3.6x
In Scenario 1, we are still left with a ~3.6x environmental impact to get our computing capacity multiplied by 10.
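The Scenario 1 estimate can be reproduced in a few lines of Python. This is a back-of-the-envelope sketch using only the assumptions stated above; the variable names are my own, and the calculation is shown both with the rounded 2.5x used in the text and with the unrounded five-year compounding.

```python
YEARS = 5
ANNUAL_GAIN_PER_GPU = 1.20  # assumed ~20% more compute per GPU per year
DEMAND_GROWTH = 10.0        # the world needs 10x the accelerated compute
CLEANTECH_FACTOR = 1.1      # the "generous" 1.1x cleantech effect

# Compounded per-GPU improvement over five years (the text rounds this to 2.5x).
moores_law = ANNUAL_GAIN_PER_GPU ** YEARS

# Extra GPUs still needed once per-GPU gains are accounted for.
extra_gpus = DEMAND_GROWTH / moores_law

# Environmental impact multiplier, with the text's rounding and without it.
impact_rounded = DEMAND_GROWTH / 2.5 / CLEANTECH_FACTOR
impact_unrounded = extra_gpus / CLEANTECH_FACTOR

print(f"Per-GPU gain over {YEARS} years: {moores_law:.2f}x")
print(f"Extra GPUs needed: {extra_gpus:.2f}x")
print(f"Impact (rounded inputs): {impact_rounded:.2f}x")
print(f"Impact (unrounded inputs): {impact_unrounded:.2f}x")
```

Both versions land between 3.6x and 3.7x, so the rounding does not change the conclusion.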
Scenario 2: Breakthrough (90% utilization)
- “Moore’s Law”: 2.5x, as above
- Innovations to reduce environmental impacts: 1.1x, as above
- Breakthrough innovation increases utilization from 15% to 90%: 6x
- This 6x cannot be fully applied to all environmental impacts, because higher utilization increases the energy needed to operate the GPUs – but we know that most of the environmental impact of computing comes from manufacturing. Let’s say 1/3 of the environmental impact comes from operations, so our 6x utilization gain produces a 2x effect in the opposite direction: 0.5x
10x increase in computing / 2.5x “Moore’s Law” / 1.1x cleantech innovation / 6x utilization / 0.5x operational impact → 1.2x
1.2x! In Scenario 2, incredibly, we can meet this 10x accelerated computing future while barely increasing our environmental impact.
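For readers who want to check the chain of divisions, here is a small Python sketch that applies the same factors to both scenarios. The helper name `impact_multiplier` is hypothetical, and the factors are simply the rounded values from the lists above, including the 0.5x operational offset, which is this article's own rule of thumb rather than a measured figure.

```python
from math import prod


def impact_multiplier(demand_growth: float, divisors: list[float]) -> float:
    """Divide the growth in compute demand by every mitigating factor."""
    return demand_growth / prod(divisors)


# Factors as stated in the text: "Moore's Law", cleantech innovation,
# the utilization gain, and the operational offset (0.5x, i.e. a 2x penalty).
scenario_1 = impact_multiplier(10, [2.5, 1.1])
scenario_2 = impact_multiplier(10, [2.5, 1.1, 6, 0.5])

# Compute delivered per unit of environmental impact in Scenario 2.
compute_per_impact = 10 / scenario_2

print(f"Scenario 1 impact: {scenario_1:.1f}x")
print(f"Scenario 2 impact: {scenario_2:.1f}x")
print(f"Scenario 2 compute per unit of impact: {compute_per_impact:.2f}x")
```

Because the operational offset is the softest input, it is the first number to revisit if better data on the split between manufacturing and operating impacts becomes available.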
Sounds good – but how?
It might sound too good to be true, but we don’t need to bend the laws of physics to get this awesome 8x result (10x computation with 1.2x environmental impact) – we just need to unlock the additional 75% of utilization in our GPUs. It’s already there.
This is not an easy task. If it were, someone would have done it already. But disruptive innovation has arrived.
By abstracting the PCIe connection between a GPU and its application host with software, two fundamental changes in how GPUs are used – both necessary to increase utilization by 6x – are now possible:
Remotely access a GPU over a network. Removing the physical connection between the application host and the GPU allows for broader and more efficient pooling of resources and allows any host to dynamically attach and detach the remote GPU.
Dynamic sharing of a GPU. Today’s limited splitting options simply create smaller fixed partitions, which are themselves underutilized. General-purpose software abstraction is far more efficient: it enables a new sharing paradigm in which multiple clients (GPU consumers) can keep a single GPU working harder.
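A toy model shows why pooling alone raises utilization. The Python sketch below is purely illustrative: the hosts, the demand pattern, and the function names are invented for the example, and it does not represent how any particular product works. It also ignores network latency, GPU memory limits, and several clients sharing one GPU at the same time, which would push utilization higher still.

```python
# Hypothetical workload: 1 means the host needs a GPU in that time slot.
demand = {
    "host_a": [1, 0, 0, 0, 1, 0],
    "host_b": [1, 1, 0, 0, 0, 0],
    "host_c": [0, 0, 1, 1, 0, 0],
    "host_d": [0, 0, 0, 0, 0, 1],
}


def dedicated_gpus(demand: dict[str, list[int]]) -> int:
    """One locally attached GPU for every host that ever needs one."""
    return sum(1 for slots in demand.values() if any(slots))


def pooled_gpus(demand: dict[str, list[int]]) -> int:
    """A shared pool only has to cover the peak number of simultaneous users."""
    return max(sum(column) for column in zip(*demand.values()))


def utilization(demand: dict[str, list[int]], gpu_count: int) -> float:
    """Busy GPU-slots divided by available GPU-slots."""
    busy = sum(sum(slots) for slots in demand.values())
    slot_count = len(next(iter(demand.values())))
    return busy / (gpu_count * slot_count)


dedicated = dedicated_gpus(demand)
pooled = pooled_gpus(demand)

print(f"Dedicated: {dedicated} GPUs at {utilization(demand, dedicated):.0%} utilization")
print(f"Pooled:    {pooled} GPUs at {utilization(demand, pooled):.0%} utilization")
```

In this made-up workload the same jobs run on half as many GPUs once any host can attach to any free GPU, which is the effect the first change above is aiming for.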
We’ve already described how our team did this, deployed our solution with customers, and imagined a world where the remote GPU is widely adopted.
As we head into an uncertain environmental future, the last thing we need is rampant waste. We hope you will join us in our mission to bring the world to 10 times the computing capacity with only a marginal increase in environmental impact – driven by the simple idea of doing more with what we already have.
Steve Golik is co-founder of Juice Labs, a startup whose vision is to make computing power flow as easily as electricity.