Introduction
Before COVID-19 turned the world upside down, the theme of
2020 was meant to be the Climate Emergency. If there is anything good to come
from COVID it is perhaps that CO2 emissions for 2020 will be lower than expected,
although still far too high, and COVID has been a distraction from what is
almost certainly a bigger issue to address in historical terms. As we approach the
end of the year, with some good news of vaccines finally on the horizon, I’m seeing CO-two rather
than CO-vid appearing in my tech news feed more often, with E&T Magazine devoting most of its
November issue to "Growing Back Green", and the Azure Podcast this week focusing on sustainability.
Cloud … Part of the problem
Although many sectors have seen their carbon footprint fall
during the pandemic, IT is a definite outlier. As the output from aviation and
other forms of transport cratered with everyone spending more time at home, the use
of both cloud based work systems (Teams, Zoom, etc), and the cloud required
to keep everyone entertained (Netflix, gaming platforms) went the other way. This means our use of compute continues unabated and there’s a
surprising amount of carbon produced by IT. The UN
Environment Programme estimates that ICT as a whole accounts for between 2% and 6% of
global greenhouse gas emissions – a proportion bigger than aviation
(although emitting directly into the upper atmosphere is a whole issue in
itself). If ICT is this big a producer of carbon it’s not something we can
ignore, particularly as big data, machine learning, and the increase of digital citizens in developing countries continue to drive demand. The same UN source shows that 45% and 24% of those ICT emissions come from
data centers and networks respectively – meaning that the internet and cloud compute are a huge part of the problem.
Cloud … Part of the solution
Fortunately the cloud can also help to solve this problem in
a number of ways:
- More efficient than on-prem: The first way the cloud can help is that running applications with
cloud providers is more energy efficient than running them on-premises, or even in co-located
data centers. This is for a variety of reasons:
- economies of scale in mega data centers allow more efficiency in non-compute costs such as cooling
- virtualisation makes better use of servers which, coupled with extensive automation tools, allows
customers to scale back when demand is lower or turn things off entirely
when not needed (e.g. test environments overnight).
- virtualisation/automation at scale efficiently distributes virtual machines, and controls things like
power and cooling better than smaller providers can (e.g. GCP uses machine learning to predict and optimize cooling)
- OpEx Pricing: Perhaps the biggest change businesses need to tackle when moving to the cloud is the move away from
Capital Expense (CapEx) to Operational Expense (OpEx) – namely paying by
the minute/second for a server rather than paying for it up front then running it
for 4 or 5 years. Similarly data storage and data transfer are charged by
the megabyte rather than as a fixed price. This mindset shift is a good thing in terms of efficiency as it creates
a market which encourages businesses to reduce their consumption. If you run more efficient code you
pay less; if you turn off servers when not needed you pay less; if you are
more active in deleting old data you... well you get the idea. It also
encourages shifting workloads to quieter times of the day (e.g. use of
spot instances) to help iron out the peaks in demand. This
works well for the planet because generally cost = carbon, so reducing cost means reducing carbon,
although unfortunately this isn’t always the case. There is no incentive
built into the system to move workloads to areas with lots of hydro rather than
areas with lots of coal-fired power stations, for instance, and there's no guarantee that the energy mix is greener during quieter hours (perhaps at night when the sun doesn't shine) – so reducing
power is good, but we still need green clouds (see below).
- Serverless: Serverless is a concept born in the cloud and is the next iteration of the above trends (improved efficiency, and OpEx pricing). For this discussion it’s also spectacularly badly named. Of course there are servers, big power-hungry servers made from minerals mined from the ground, which take electricity to build and had to be shipped across the world to be installed in our cloud region. That said, Serverless is still beneficial in two major ways:
- Serverless isn’t just about only paying when something is being used, it’s about only running it when it’s needed (with some real-world caveats about how long it takes to scale down). Essentially serverless products like functions as a service and database engines can scale down to zero when not in use, meaning that the power needed to run them drops too.
- Serverless functions are (almost always) smaller than Virtual Machines and relatively short lived, which means all that magical cloud provider automation can more efficiently move these around the data center, filling gaps to make the best use of the underlying hardware.
- Clouds going green: Of course being more efficient isn’t the same as being
carbon neutral; fortunately all the major public cloud providers have targets to
be net zero (although one might argue that some are taking it more seriously
than others). As with most things in the world of cloud everyone is going
in the same direction, but that’s not to say they’re equal (as noted in this
now year-old article in Wired). At present a lot of this is achieved through Renewable Energy Credits (RECs), meaning the cloud is not 100% clean all the time. That said it’s a step in
the right direction and the fact that cloud providers are committing to
buy so much from renewable sources helps the business cases to finance those renewable
energy providers in the first place (as they can access credit because of their guaranteed revenue streams). This is an area of very positive competition:
- Azure: Microsoft have been net renewable since 2014 (including RECs) and have sustainability
targets to use 100% renewable energy by 2025, in addition to targets
on waste, deforestation, and water usage. Microsoft also made headlines earlier
this year when it announced it would become carbon negative by 2030, and by 2050 aims
to have removed all the carbon it has emitted since its foundation in 1975.
- Google Cloud: Google advertise themselves as being the “cleanest cloud in the
industry” and hit their target of purchasing 100% renewable energy in
2017 – although this is an annual target, not a minute-by-minute
commitment to use only renewable energy. They have subsequently announced
plans to go greener, committing to
use only renewable energy by 2030. This includes new initiatives such as
building co-located power plants and installing batteries in GCP data
centers.
- AWS: In so many areas of cloud AWS are the leaders with the others playing catch-up, but that’s not true here. Much of AWS’s focus on sustainability talks of
the efficiencies of moving to the cloud (see above) rather than of being carbon neutral. Amazon has (after public pressure from employees and shareholders) announced a target to be
carbon neutral, but not
until 2040. Although this is not entirely comparable with GCP/Azure, as it’s a
target for Amazon as a whole not just AWS, it's definitely not as good. As far as I can see, there
isn’t a specific AWS target to switch entirely to renewable, but rather some fairly vague talk of a “long-term commitment to use 100% renewable energy”. In
addition Greenpeace
have criticized Amazon in the past for breaking their own commitments.
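To make the earlier "cost = carbon, but not always" point concrete, here is a minimal Python sketch. It assumes emissions can be roughly estimated as energy used multiplied by the carbon intensity of the local grid. The region names, prices, power draw, and intensity figures are all invented for illustration and are not real provider or grid data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A cloud region with made-up pricing and grid data (not real figures)."""
    name: str
    price_per_hour: float      # hypothetical instance price, in dollars
    grams_co2_per_kwh: float   # hypothetical grid carbon intensity


def job_cost(region: Region, hours: float) -> float:
    """Bill for running one instance for the given number of hours."""
    return region.price_per_hour * hours


def job_emissions_kg(region: Region, hours: float, avg_power_kw: float) -> float:
    """Estimated emissions: energy used (kWh) times grid carbon intensity."""
    energy_kwh = avg_power_kw * hours
    return energy_kwh * region.grams_co2_per_kwh / 1000.0


# Illustrative values only: same price, very different energy mix.
hydro_region = Region("hydro-heavy", price_per_hour=0.10, grams_co2_per_kwh=30.0)
coal_region = Region("coal-heavy", price_per_hour=0.10, grams_co2_per_kwh=800.0)

if __name__ == "__main__":
    for region in (hydro_region, coal_region):
        cost = job_cost(region, hours=100)
        kg = job_emissions_kg(region, hours=100, avg_power_kw=0.2)
        print(f"{region.name}: ${cost:.2f}, {kg:.1f} kg CO2")
```

With these invented numbers the bill is identical in both regions while the estimated emissions differ by more than an order of magnitude, which is exactly the gap that pricing alone does not expose.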
What more can we do?
Although all this is good news from the cloud providers, what can we Engineers and Architects do to help (other than choosing a nice green cloud provider)?
Inefficient Software Engineering
One of the downsides of Moore’s law is that over the decades software has become significantly less efficient. To run Word on Windows I now need what would have been a supercomputer only a few decades ago (when, remember, my computer could already run a version of Word on Windows!).
Although it's easy to criticize, this inefficiency is in part understandable and even justifiable. Reusable components in software, frameworks, and abstraction all significantly reduce development effort even if they usually come with a cost. At a time when a day of development costs more than a month running a Virtual Machine, why spend hours reinventing the wheel or optimising code when we can just scale out or use a slightly bigger SKU? Similarly, with compute and network so cheap and fast, there are real benefits to breaking up software and distributing it over the network, even if it adds an overhead and more network traffic, because that software is far more fault tolerant.
Sadly this is death by a thousand short cuts, but a few small changes made often enough can have equally big results.
Efficient Software Engineering
Microsoft have launched some thoughts on Efficient Software Engineering Principles both as a short learning path and a manifesto of sorts at https://principles.green/. The latter even includes some concrete ideas (although this is an area which needs expanding, and as it's on GitHub I feel a PR may be in my future).
Off the top of my head, here are a few ways we
could improve energy efficiency – many of these are good practice for other
reasons (cost, reliability, simplicity, latency etc). Unfortunately, it does take
some thought as almost all of these are, as is so often the case, a trade-off between reducing energy
usage and something else:
- Efficient code: Writing better code in more efficient languages will make things
faster and smaller – particularly when moving to microservices (e.g. packaging
a whole JVM and App Server with a Java Microservice to run 10 lines of
code is obviously less efficient than writing that 10-line service in Go
or C). If that service is being used a lot it could need many times the compute to run it. That said, frameworks improve speed and reduce the risk of introducing
security vulnerabilities or other bugs, and prepackaged code may already be optimized.
- Efficient distribution: Should we use a whole virtual machine; containers built with big images;
containers built with small minimal images; or serverless functions? It
depends a lot on the scenario, but if I’m running a small microservice
which processes a few messages from a queue once a day, I know which I’d
choose.
- Efficient deployments: Co-locating code avoids network transits. Pushing assets to a
CDN reduces the use of both network and compute. Caching both at the edge
and at the client side can improve user experience and reduce electricity usage. These are all good practice even before we think of electricity usage, but other things may not be so obvious or clear cut. For example, can a compute-heavy process (which doesn't require low latency) be run in a low-carbon area? If so, does a lot of data need to be shipped there just to run the compute (in which case it might not be worth it)?
- Efficient data: Tuning retention of data can save on electricity (and ensure GDPR compliance!) but an even bigger saving can be made by not capturing data in the first
place. This is true of business data, but even more true of technical data. We now drown in wonderful data which tools like AppDynamics and Splunk provide, but every metric and each log
line piped into a monitoring tool means data being sent over the network,
processed in the tool, and stored (even if only for a few days). Generally there are a lot of
reasons why collecting more data is better but carbon footprint certainly
isn’t one!
- Efficient sizing & scaling: There is an obvious cost benefit to rightsizing compute for the workload, which means it should already be on everyone's radar. This comes into play both when deploying (not selecting too large an instance) and when running (using autoscaling to handle variable load and turn off instances when not needed). That said, this can be forgotten, or large instances are selected to start with and then never reassessed once the load is understood. It's perhaps the easiest area to make savings.
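As a rough illustration of the rightsizing point above, the following Python sketch picks the smallest instance size that would still have coped with the busiest CPU sample seen so far. The size catalogue, the function names, and the 70% target utilisation are all assumptions made up for this example; a real review would also look at memory, disk, and network, and would use the provider's own sizing data.

```python
from typing import Optional, Sequence

# Hypothetical instance catalogue as (name, vCPUs), ordered smallest first.
INSTANCE_SIZES = [("small", 2), ("medium", 4), ("large", 8), ("xlarge", 16)]


def peak_vcpus_used(cpu_percent_samples: Sequence[float], current_vcpus: int) -> float:
    """Convert the busiest CPU utilisation sample into vCPUs actually in use."""
    if not cpu_percent_samples:
        raise ValueError("need at least one utilisation sample")
    return max(cpu_percent_samples) / 100.0 * current_vcpus


def recommend_size(
    cpu_percent_samples: Sequence[float],
    current_vcpus: int,
    target_utilisation: float = 0.7,  # assumed headroom: peak should stay under 70%
) -> Optional[str]:
    """Return the smallest catalogue size that keeps peak load under the target."""
    needed = peak_vcpus_used(cpu_percent_samples, current_vcpus) / target_utilisation
    for name, vcpus in INSTANCE_SIZES:
        if vcpus >= needed:
            return name
    return None  # nothing in the catalogue is big enough


if __name__ == "__main__":
    # A 16 vCPU machine that never gets above 25% CPU.
    print(recommend_size([12, 18, 25, 15], current_vcpus=16))
```

On these made-up numbers the 16 vCPU machine peaks at the equivalent of 4 vCPUs, so the 8 vCPU size would be enough while still leaving headroom. The same check is worth repeating once real load is understood, since that is when oversized instances tend to be forgotten.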
Postscript
Just after I posted this Mike Dodds pointed out there is an Azure Sustainability Calculator for working out the carbon footprint of your Azure Services which is well worth a look.