Although migrating on-premises operations to the cloud can help organizations become more sustainable, cloud computing* itself has a significant carbon footprint.
According to recent estimates, cloud computing contributes between 2.5% and 3.7% of all global carbon emissions, which is comparable to the 2.5% of the aviation industry. In fact, a single data center can consume as much electricity as 50,000 homes, and by 2030, data centers are projected to consume 13% of global electricity.
Different cloud providers, however, have different impacts on the environment. According to the 2017 Clicking Clean report by Greenpeace, choosing the right cloud provider is crucial for minimizing your carbon footprint. It is therefore no surprise that, according to Gartner, the carbon emissions of cloud services will become one of the top three factors that users look for in a cloud service provider by 2025.
Based on the Greenpeace report, out of the three biggest cloud service providers, Google Cloud Platform (GCP), Amazon Web Services (AWS), and Microsoft Azure, GCP came out as the most environmentally friendly cloud infrastructure. But has anything changed since the report was published?
As the public cloud market is dominated by these three companies (GCP, AWS and Microsoft Azure together account for two-thirds of the world’s cloud computing), this blog focuses on them. Keeping track of the environmental measures of three companies is also simpler than following dozens. But how do you evaluate these three companies to make the ultimate choice of provider?
In this blog, we evaluate the carbon footprint of the three cloud service providers on two different levels:
- By calculating the carbon footprint of a language model when trained on any of these cloud platforms.
- By comparing the providers on their ambitions, transparency and tooling/best practices when it comes to reducing their carbon footprint (based on content available on the internet).
* Cloud computing services, including data storage, software, analytics and AI-based models, are delivered over the internet. Cloud computing enables users to access and use these services on demand, without needing to have the infrastructure on-premises.
Calculating cloud carbon footprint using the Green Algorithms method
To investigate how the carbon footprints of GCP, AWS, and Microsoft Azure have evolved, we put the state-of-the-art NLP model BERT-base under the microscope. We chose this use case because large language models (LLMs) like BERT are among the biggest ML models in terms of their number of parameters. As a result, many of these models require multiple instances of specialized hardware such as GPUs or TPUs, and emit a lot of carbon in the process. Moreover, we chose this particular model because BERT is the most popular pretrained language model.
The evaluated use case had the following characteristics:
The carbon footprint is calculated by estimating the energy draw of the algorithm and the carbon intensity of producing this energy at a given location, resulting in the following numbers:
Table 1: Measurements created using the Green Algorithms calculator.
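The calculation described above (energy draw multiplied by the carbon intensity of the local grid) can be illustrated with a minimal Python sketch. It assumes the simplified formula published by the Green Algorithms project as we understand it: device power plus memory power, scaled by runtime and by the data center's PUE. The `Job` class, the function names and all input values are hypothetical placeholders; they are not the measurements behind Table 1.

```python
from dataclasses import dataclass

# Assumed per-GB memory power draw in watts (the constant this sketch uses).
MEMORY_POWER_W_PER_GB = 0.3725


@dataclass
class Job:
    runtime_h: float       # wall-clock runtime in hours
    n_devices: int         # number of GPUs (or CPU cores)
    device_power_w: float  # power draw per device in watts
    usage: float           # average utilisation of the devices, 0..1
    memory_gb: float       # memory available to the job in GB
    pue: float             # Power Usage Effectiveness of the data center


def energy_kwh(job: Job) -> float:
    """Estimated energy draw of the job in kWh."""
    device_w = job.n_devices * job.device_power_w * job.usage
    memory_w = job.memory_gb * MEMORY_POWER_W_PER_GB
    return job.runtime_h * (device_w + memory_w) * job.pue / 1000.0


def footprint_kg(job: Job, carbon_intensity_g_per_kwh: float) -> float:
    """Estimated emissions in kg CO2eq for a given grid carbon intensity."""
    return energy_kwh(job) * carbon_intensity_g_per_kwh / 1000.0


# Hypothetical training run; every number here is an illustrative assumption.
job = Job(runtime_h=24, n_devices=8, device_power_w=300,
          usage=1.0, memory_gb=64, pue=1.1)
for region, intensity in {"low-carbon grid": 100, "high-carbon grid": 500}.items():
    print(f"{region}: {footprint_kg(job, intensity):.1f} kg CO2eq")
```

Because the energy term is identical for every provider, the ranking in such a comparison is driven by the PUE and the carbon intensity of the chosen data center location.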
The outcome: Based on this method, Google has the most environmentally-friendly cloud infrastructure.
Critical note: Despite being a practical tool to estimate the carbon footprint of computations, the Green Algorithms calculator has some limitations. For example, it does not perform a life cycle assessment. Consequently, it does not consider the full environmental impact of manufacturing, maintaining, and disposing of the hardware used.
Hugging Face set a new standard for model life cycle assessments with the method they recently developed to calculate CO2 emissions. Their paper showed that not only the training phase of a model causes emissions, but also the inference phase. Deploying the model on a GCP instance in the us-central1 region (which has a carbon intensity of 394 gCO2eq/kWh) resulted in approximately 19 kg of CO2eq emitted per day of API deployment, or about 340 kg over the total period of 18 days. This is almost as much as a return flight from Amsterdam to Rome.
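The inference figures quoted above can be checked with a few lines of arithmetic. The sketch below only uses the three numbers from the text; the implied daily energy use is our own back-of-the-envelope derivation and is not a figure reported in the paper.

```python
# Figures quoted in the text above.
CARBON_INTENSITY_G_PER_KWH = 394  # us-central1
DAILY_EMISSIONS_KG = 19           # approximate emissions per day of deployment
DEPLOYMENT_DAYS = 18

# Total emissions over the deployment period.
total_kg = DAILY_EMISSIONS_KG * DEPLOYMENT_DAYS

# Energy use per day implied by the daily emissions and the grid intensity
# (derived here for illustration only).
implied_daily_energy_kwh = DAILY_EMISSIONS_KG * 1000 / CARBON_INTENSITY_G_PER_KWH

print(f"Total over {DEPLOYMENT_DAYS} days: {total_kg} kg CO2eq")
print(f"Implied energy use: {implied_daily_energy_kwh:.1f} kWh per day")
```

The total comes to 342 kg, which matches the rounded figure of 340 kg.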
Which cloud provider is the greenest when it comes to ambitions, transparency, tooling and best practice?
Even though measuring the carbon footprint gives us a useful indication of how environmentally-friendly the cloud providers are, other factors that deserve our attention include:
- The future ambitions of each cloud provider when it comes to reducing their environmental impact;
- The transparency of each cloud provider about their impact on the environment, and what tools and best practices they have developed to combat carbon emission.
In the following sections, we take an in-depth look at these questions for the top three cloud platform providers.
What are the future ambitions of each cloud provider regarding reducing their environmental impact?
Google reached its 100% renewable energy target in 2018 and has set itself a more specific target for 2030: its “moonshot” goal is to operate on 24/7 carbon-free energy in all its data centers and campuses worldwide.
Amazon has set the goal of powering its operations with 100% renewable energy by 2025 and becoming carbon net-zero by 2040. In 2019, Amazon co-founded The Climate Pledge, a pact to meet the targets of the Paris Climate Accords ten years early by achieving net-zero carbon by 2040. Many other companies, including Heineken, PepsiCo and Microsoft, have since signed this pledge.
Microsoft is aiming to achieve 100% renewable energy in its cloud data centers by 2025 and is investing in various sustainability initiatives such as carbon capture technology and energy-efficient data centers. Moreover, Microsoft wants to become ‘carbon negative’ by 2030, and has gone as far as promising to remove from the atmosphere, by 2050, all the carbon the company has emitted since its founding in 1975. This obviously goes further than becoming carbon neutral.
The outcome: For now, it seems that Google is the frontrunner in terms of ambitions. But overall, all three cloud providers (Google, Amazon, and Microsoft) have promised to decarbonize their data centers, each with its own approach to making it happen. It will be interesting to see if and how these providers fulfill their ambitions while expanding their capabilities.
What are the tools and best practices that cloud providers developed to combat carbon emissions, and how transparent are they?
Traditionally, companies that aim to power their operations with 100% renewables do so by matching their electricity usage with renewable energy purchases on an annual basis. Google, on the other hand, wants to match its consumption with carbon-free energy hour by hour. Rather than leaning on the standard REC concept (buying renewable energy certificates), Google works with the non-profit organization M-RETS and relies on hourly time-based energy attribute certificates, called T-EACs.
T-EACs are instruments that track how, where and when electricity is produced. Matching on an hourly basis also requires solutions to meet demand when renewable energy is not available. Google announced an example of such a solution at a data center in Belgium, where the diesel backup power system was replaced with a battery backup system.
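The difference between annual and hourly matching is easiest to see with numbers. The sketch below is a deliberately simplified illustration with a hypothetical four-hour profile. It is not the accounting method used by Google or M-RETS, and the function names are our own.

```python
def annual_match(consumption, renewable):
    """Share of consumption covered when totals are compared over the whole period."""
    return min(1.0, sum(renewable) / sum(consumption))


def hourly_match(consumption, renewable):
    """Share of consumption covered when each hour must be matched on its own."""
    matched = sum(min(c, r) for c, r in zip(consumption, renewable))
    return matched / sum(consumption)


# Hypothetical profile in MWh: constant demand, renewable supply only at midday.
consumption = [10, 10, 10, 10]
renewable = [0, 20, 20, 0]

print(annual_match(consumption, renewable))  # 1.0
print(hourly_match(consumption, renewable))  # 0.5
```

On paper the period is 100% matched, yet half of the consumption took place in hours without any renewable supply. Hourly certificates are meant to expose that gap.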
Moreover, Google created the tool Carbon Footprint. In this tool, Google applies the location-based method for scope 2 emissions (emissions from electricity production), which calculates emissions based on the emissions intensity of the local grid area where the electricity usage occurs. The tool also discloses scope 3 emissions (all indirect emissions that occur in upstream and downstream activities).
On the other hand, Google’s reporting of its Power Usage Effectiveness (PUE), the ratio of the total energy used by a data center to the energy used by its computing equipment, has some limitations, because the energy use of its data centers is not calculated in a consistent way. The reported figures are highly variable, ranging from yearly averages to best-case scenarios (such as in winter, when minimal cooling is required).
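To show why the reporting basis matters, here is a small sketch that computes PUE for a yearly total and for a best-case month. The two monthly readings are hypothetical and chosen only to make the gap visible; they are not Google's figures.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """PUE = total facility energy / energy used by the IT equipment."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


# Hypothetical readings: month -> (total facility kWh, IT equipment kWh).
monthly = {
    "January": (1050, 1000),  # little cooling needed
    "July": (1250, 1000),     # heavy cooling load
}

best_case = min(pue(total, it) for total, it in monthly.values())
period_average = pue(sum(total for total, _ in monthly.values()),
                     sum(it for _, it in monthly.values()))

print(f"Best-case month: {best_case:.2f}")
print(f"Average over all readings: {period_average:.2f}")
```

A PUE of 1.0 would mean that all energy goes to computing, so quoting the best-case month makes a data center look more efficient than it is over the year.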
Amazon created a tool called the Customer Carbon Footprint Tool. AWS users are able to see the estimated carbon impact of their AWS workloads down to the service level for the EC2 compute and S3 storage services.
This tool has received a lot of criticism because it only shows emissions data for extremely high-level geographical groupings. That makes it impossible to identify AWS regions with lower emissions and move workloads there. Because the location of the data center is an important factor in your carbon footprint, reducing it with this tool is quite challenging.
Also, the methodology behind this tool uses the market-based method to measure scope 2 emissions, which is problematic because it does not reflect the actual location-based electricity generation emissions resulting from the company's electricity consumption.
Last but not least, Amazon does not disclose its scope 3 emissions, which is something the company is frequently criticized for.
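To make the difference between the two scope 2 accounting methods concrete, consider the simplified sketch below. It assumes that every certificate covers its kWh at zero emissions and that the remaining consumption is counted at the grid intensity. Real market-based accounting has more rules, so treat the function names and numbers as illustrative assumptions only.

```python
def location_based_kg(consumption_kwh: float, grid_intensity_g_per_kwh: float) -> float:
    """Emissions based on the average intensity of the local grid."""
    return consumption_kwh * grid_intensity_g_per_kwh / 1000.0


def market_based_kg(consumption_kwh: float, certificates_kwh: float,
                    residual_intensity_g_per_kwh: float) -> float:
    """Emissions after subtracting consumption covered by purchased certificates."""
    uncovered_kwh = max(0.0, consumption_kwh - certificates_kwh)
    return uncovered_kwh * residual_intensity_g_per_kwh / 1000.0


# Hypothetical workload: 1,000 kWh on a grid with 394 gCO2eq/kWh,
# fully covered by purchased renewable energy certificates.
consumption_kwh = 1000
location_kg = location_based_kg(consumption_kwh, 394)
market_kg = market_based_kg(consumption_kwh, certificates_kwh=1000,
                            residual_intensity_g_per_kwh=394)

print(f"Location-based: {location_kg:.0f} kg CO2eq")
print(f"Market-based:   {market_kg:.0f} kg CO2eq")
```

The same workload is reported as 394 kg under the location-based method and as 0 kg under the market-based method, although the electricity physically came from the same grid.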
A tool called Microsoft Cloud for Sustainability helps Microsoft Azure users record, report, and reduce their environmental impact through automated data connections and actionable insights.
Just like the tool created by Amazon, this tool uses the market-based method for scope 2 emissions. Even though the tool can be used for reporting, critics say that the main goal of this calculator is to tell people how many emissions Microsoft Azure has prevented.
The outcome: Overall, Google and Microsoft seem to be most transparent about the emissions of their cloud platforms and the methodologies to measure them.
A word of caution: Even though all the tools mentioned above give users some insight into their emissions when using the cloud platforms, users have to stay critical. Using RECs to offset carbon emissions to reach ‘net zero’ goals while retaining fossil fuel energy sources does not reduce emissions in a meaningful way.
In addition, there are multiple examples of cloud providers that have managed to claim green energy infrastructure that was built with government subsidies (e.g. in the Netherlands). Although their energy usage counts as green, they took that capacity away from the rest of the Dutch grid.
Tips to reduce your carbon footprint when using a cloud platform
Even though Google, Amazon and Microsoft have their own ways to compete for the title of ‘most-sustainable cloud provider’, some of which are more ambitious or transformative than others, the good news is that all three cloud providers have made progress in sustainability. And they continue to innovate in this area.
Ultimately, the size of your carbon footprint will also depend on how you make use of your cloud platform of choice. Below are a few recommendations on how to make it more sustainable:
Choose the right location of the data center:
If you want to compare the carbon intensity of data centers, the best option available is the data explorer created by Climatiq.
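Once you have carbon intensity figures per region, picking the greenest candidate is a simple lookup. In the sketch below, only the us-central1 value comes from this blog. The other region names and values are hypothetical placeholders that you would replace with data from a source such as the Climatiq data explorer.

```python
# Carbon intensity in gCO2eq/kWh per region.
# Only "us-central1" is taken from the text; the rest are made-up examples.
REGION_INTENSITY = {
    "us-central1": 394,
    "example-region-a": 120,
    "example-region-b": 560,
}


def greenest_region(intensities, allowed=None):
    """Return the region with the lowest carbon intensity.

    `allowed` optionally restricts the choice, for example to regions that
    meet your latency or data residency requirements.
    """
    candidates = {region: value for region, value in intensities.items()
                  if allowed is None or region in allowed}
    if not candidates:
        raise ValueError("no candidate regions left")
    return min(candidates, key=candidates.get)


print(greenest_region(REGION_INTENSITY))
print(greenest_region(REGION_INTENSITY, allowed={"us-central1", "example-region-b"}))
```

In practice you would weigh carbon intensity against latency, cost and legal constraints, which is why the sketch allows you to restrict the candidates first.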
Schedule batch jobs to green data centers:
By setting batch jobs to run at green data centers when the share of renewable energy is high, you can significantly reduce emissions. Scheduling software, like Cloudflare’s Green Compute, can be used to optimize the placement and timing of batch jobs, taking into account factors such as energy availability, server utilization, and workload.
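The idea behind carbon-aware scheduling can be sketched in a few lines: given a forecast of the grid's carbon intensity per hour, start the job in the window with the lowest total intensity. The forecast values below are hypothetical, and the function is our own illustration rather than the API of any scheduling product.

```python
def best_start_hour(forecast, duration_h):
    """Return the start index of the window with the lowest summed intensity.

    forecast   -- carbon intensity per hour (gCO2eq/kWh), index 0 = next hour
    duration_h -- number of consecutive hours the batch job needs
    """
    if not 0 < duration_h <= len(forecast):
        raise ValueError("duration must fit inside the forecast")
    starts = range(len(forecast) - duration_h + 1)
    return min(starts, key=lambda s: sum(forecast[s:s + duration_h]))


# Hypothetical 8-hour forecast with a midday dip caused by solar production.
forecast = [450, 420, 300, 180, 150, 210, 380, 460]

start = best_start_hour(forecast, duration_h=3)
print(f"Start the 3-hour job in {start} hours")
```

With this forecast the job is delayed by three hours, so that it runs during the hours with intensities 180, 150 and 210 instead of starting immediately at 450.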
Profile your code:
Optimizing code to reduce your carbon footprint often comes down to writing efficient code and minimizing unnecessary computations. To do so, you can profile your code, for example with the Python package cProfile. This gives you more information about the relative time spent in different functions and shows you which parts of your code are the most inefficient, so that you know what to rewrite first. (If you start rewriting your code, you might also consider reading our blog about writing Clean Code.)
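As a minimal illustration of profiling with cProfile, the sketch below profiles a small, deliberately slow function and prints the five entries with the highest cumulative time. The functions `slow_sum_of_squares` and `pipeline` are hypothetical stand-ins for your own code.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Deliberately naive loop that acts as the hotspot in this example."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def pipeline():
    """Stand-in for a larger program that calls the hotspot repeatedly."""
    return [slow_sum_of_squares(50_000) for _ in range(20)]


# Collect profiling data only while the pipeline runs.
profiler = cProfile.Profile()
profiler.enable()
result = pipeline()
profiler.disable()

# Write the report to a string, sorted by cumulative time, top 5 entries.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

For a whole script you can get a similar report without changing the code by running `python -m cProfile -s cumulative your_script.py`.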
Use compiled programming languages instead of interpreted languages:
In general, compiled programming languages like C, Rust and C++ are more energy-efficient than interpreted languages like Python. Even though the choice of programming language typically matters less for your footprint than other factors, it is worth mentioning that Python consumes roughly 70 times more energy than C, Rust and C++. If you are not familiar with programming in these compiled languages, you could start by executing data transformations with libraries that are built on top of some of them, e.g. Polars.
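The principle behind libraries like Polars is that the heavy loop runs in compiled code instead of in the Python interpreter. The sketch below shows this with the standard library only: it compares a hand-written Python loop with the built-in `sum`, which is implemented in C in the standard CPython interpreter. It measures runtime, which we use here as a rough proxy for energy use (an assumption), and the timings will differ per machine.

```python
import timeit

values = list(range(100_000))


def python_loop_sum(xs):
    """Sum computed step by step in the Python interpreter."""
    total = 0
    for x in xs:
        total += x
    return total


# Both approaches give the same answer; only the execution path differs.
loop_seconds = timeit.timeit(lambda: python_loop_sum(values), number=20)
builtin_seconds = timeit.timeit(lambda: sum(values), number=20)

print(f"Python loop:  {loop_seconds:.4f} s")
print(f"Built-in sum: {builtin_seconds:.4f} s")
```

The same reasoning applies to data transformations: expressing them as operations of a compiled library keeps the per-row work out of the interpreter.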