As companies prioritise sustainability and strive to become more carbon efficient, reducing the carbon footprint of IT services is becoming increasingly important.

In this blog post, we explore how to use real-time carbon intensity data to reschedule IT jobs and reduce their carbon footprint. We explain what carbon intensity is and show how simple APIs that retrieve real-time carbon intensity can identify the best time to run a job, making it possible to reduce the carbon footprint of any scheduled IT service.

The Green Software Foundation has been doing some great work in this area to allow organisations to baseline the Software Carbon Intensity of IT services. Their Software Carbon Intensity (SCI) Specification defines a methodology for calculating the rate of carbon emissions for a software system.

This blog shows how, by reducing the carbon intensity of the energy an IT service uses (one of the key parameters in the SCI methodology), we can reduce the carbon footprint of that service.
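For reference, the SCI specification expresses the score as operational emissions plus embodied emissions, per functional unit. Written as a formula (the specification words this as "per R"; it is shown here as a fraction):

```latex
\mathrm{SCI} = \frac{(E \times I) + M}{R}
```

Here E is the energy consumed by the software, I is the carbon intensity of that energy, M is the embodied emissions of the hardware and R is the functional unit (for example, per job run). Rescheduling a job leaves E, M and R unchanged and lowers only I, which is why the timing of a job matters.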

Background

Scheduled IT jobs are automated processes that run on servers, often hosted in data centres, which consume energy to carry out computation, data processing and network transfers. This energy consumption has an associated carbon footprint, which varies with the energy source, or sources, used to provide the power.

In BJSS’ work creating a data platform for the Retail Trust, a charity with the mission of creating hope, health and happiness for everyone in the retail industry, we are conscious that this platform has a carbon footprint. Our vision has been to ensure that we reduce the carbon footprint of our data platform as much as possible.

Retail Trust has a Databricks data pipeline hosted in Microsoft Azure that currently runs on a static daily schedule. The pipeline uses Delta Live Tables and Databricks Notebooks to ingest data from multiple external data sources, then processes and aggregates data resulting in a set of tables for consumption by downstream services.

Power BI reports are set up to refresh their data from the Delta Live Tables daily.

What Is Carbon Intensity, And Why Is It Important?

Carbon intensity is a measure of the amount of carbon dioxide emissions produced per unit of energy consumed. It varies depending on the energy source, with renewable sources such as wind and solar having a much lower carbon intensity than fossil fuel sources like coal or natural gas.

By using services such as the Carbon Intensity API (UK) or WattTime (USA), it is possible to determine the current carbon intensity for a particular region.

For this to be useful for scheduling, though, predictions of future carbon intensity are needed. These services can also forecast carbon intensity, based on an understanding of the weather.

The Carbon Intensity API is freely accessible without sign-up. Other services require sign-up and some require payment to use.
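As an illustration, a forecast can be retrieved from the Carbon Intensity API with a few lines of Python. This is a minimal sketch, not the code we ran: the function names are ours, and the endpoint path and response fields reflect our reading of the public API documentation, so check them against the current docs before relying on them.

```python
import json
from datetime import datetime, timezone
from urllib.request import Request, urlopen

# Public Carbon Intensity API for Great Britain; no API key is needed.
BASE_URL = "https://api.carbonintensity.org.uk"
TIME_FORMAT = "%Y-%m-%dT%H:%MZ"


def parse_forecast(payload):
    """Convert an API response into a list of (slot_start, forecast) tuples.

    Assumes each entry in payload["data"] has a "from" timestamp and an
    "intensity" object with a "forecast" value in gCO2/kWh.
    """
    slots = []
    for entry in payload["data"]:
        start = datetime.strptime(entry["from"], TIME_FORMAT)
        start = start.replace(tzinfo=timezone.utc)
        slots.append((start, entry["intensity"]["forecast"]))
    return slots


def fetch_forecast(start):
    """Fetch the half-hourly forecast for the 48 hours after `start` (UTC)."""
    url = f"{BASE_URL}/intensity/{start.strftime(TIME_FORMAT)}/fw48h"
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request, timeout=10) as response:
        return parse_forecast(json.load(response))
```

Calling `fetch_forecast(datetime.now(timezone.utc))` would return one tuple per half-hour slot, which is the shape needed to compare candidate run times.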

How We Made A Difference To The Retail Trust Data Platform

Retail Trust and BJSS decided to run a Hackathon to see if we could reduce the carbon footprint of both the Databricks refresh and the Power BI refresh.

We built a small API which, when called with the original refresh time and a future time window (for when the job must be run), would return the best time to run the job (the time with the lowest carbon intensity) and the percentage reduction in carbon this would achieve.
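The selection logic can be sketched as follows. This is a simplified illustration rather than the code of the API we built: the function name `greenest_slot`, its parameters and the keys of the returned dictionary are hypothetical, and it assumes the forecast is a chronological list of (time, intensity) pairs that includes the original run time.

```python
from datetime import datetime, timedelta


def greenest_slot(forecast, original_time, window_start, window_end):
    """Pick the lowest-intensity slot inside the window.

    forecast: list of (slot_start, intensity) tuples.
    Returns the best time, its intensity, and the saving relative to
    running at original_time (absolute and as a percentage).
    """
    intensity_at = dict(forecast)
    if original_time not in intensity_at:
        raise ValueError("forecast has no slot at the original run time")
    baseline = intensity_at[original_time]

    # Only slots inside the allowed window are candidates.
    candidates = [(t, i) for t, i in forecast if window_start <= t <= window_end]
    if not candidates:
        raise ValueError("no forecast slots fall inside the window")

    # min() keeps the first minimum, so ties go to the earliest slot
    # when the forecast is in chronological order.
    best_time, best_intensity = min(candidates, key=lambda slot: slot[1])
    saving = baseline - best_intensity
    percent = round(100 * saving / baseline, 1) if baseline else 0.0
    return {
        "best_time": best_time,
        "intensity": best_intensity,
        "saving": saving,
        "percent_reduction": percent,
    }


if __name__ == "__main__":
    start = datetime(2023, 5, 1, 9, 0)
    # Made-up intensities at half-hour steps, for illustration only.
    values = [123, 110, 92, 105]
    forecast = [(start + timedelta(minutes=30 * n), v) for n, v in enumerate(values)]
    print(greenest_slot(forecast, start, start, start + timedelta(hours=2)))
```

If the original slot is already the greenest in the window, the saving is simply zero and the job stays where it is.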

The API was built with the Serverless Framework and deployed as a serverless (AWS Lambda) function into a BJSS AWS instance, with Swagger endpoints made available for easy consumption.

The API returns a simple response showing the proposed time to run the job and the carbon intensity at that time. It also returns the carbon intensity saving (31 in our example) and the percentage reduction in carbon emissions (25.2% in our example).

Joining The Services Together

With an API that let us easily calculate the best time to run a service, it was time to connect the services together. We used a simple Azure DevOps Pipeline to orchestrate the refreshes by:

  1. Calling the BJSS API to find the greenest time to run the service
  2. Updating the Power BI Data Refresh Schedule
  3. Updating the Azure Databricks Refresh Schedule
  4. Publishing the carbon reduction statistics to a Power BI Dashboard
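Steps 2 and 3 come down to sending a new run time to each service. The sketch below only builds the request bodies and is based on assumptions, not on our pipeline code: that the Power BI schedule is changed through the REST API's dataset refresh-schedule update, that the Databricks pipeline is triggered by a job whose schedule is changed through the Jobs API (`/api/2.1/jobs/update`), and that all times are in UTC. The function names are hypothetical.

```python
from datetime import datetime


def power_bi_schedule_body(run_at):
    """Request body setting a single daily Power BI refresh time.

    Power BI refresh times are expressed as "HH:MM" strings.
    """
    return {
        "value": {
            "times": [run_at.strftime("%H:%M")],
            "localTimeZoneId": "UTC",
        }
    }


def databricks_schedule_body(job_id, run_at):
    """Request body rescheduling a Databricks job to run daily at run_at.

    Quartz cron fields: seconds minutes hours day-of-month month day-of-week.
    """
    cron = f"0 {run_at.minute} {run_at.hour} * * ?"
    return {
        "job_id": job_id,
        "new_settings": {
            "schedule": {
                "quartz_cron_expression": cron,
                "timezone_id": "UTC",
            }
        },
    }


if __name__ == "__main__":
    greenest = datetime(2023, 5, 1, 13, 30)  # illustrative value only
    print(power_bi_schedule_body(greenest))
    print(databricks_schedule_body(42, greenest))  # 42 is a placeholder job id
```

Half-hourly carbon intensity slots map neatly onto both services, since each accepts a run time on the hour or half hour.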

How Much Carbon Can We Save?

What we’ve demonstrated is that it is very much possible to reduce the carbon footprint of a data platform by running it at the greenest time possible, when carbon intensity is at its lowest.

By following the technical steps outlined above, any organisation that has batch or scheduled jobs can retrieve real-time carbon intensity data, analyse the data to identify optimal scheduling times, reschedule jobs accordingly and monitor the impact.

We would encourage businesses to adopt this approach to make their IT operations more environmentally friendly and contribute to a greener future. And, as these approaches continue to advance, we plan to constantly monitor and review the carbon footprint of the platform we are building and make technology decisions that are as green as possible.

Find out more about how we can help your organisation act on its sustainability ambitions here.