Kyle Galbraith
Kyle Galbraith is a senior software engineer, AWS Certified Professional Solutions Architect, blogger, and aspiring entrepreneur. He is the creator of Learn AWS By Using It, a course that focuses on learning Amazon Web Services by actually using it.

Using Serverless Reapers to Lower Your AWS Bill

There are a lot of factors to consider when building your applications and services inside of Amazon Web Services. Often there is a learning curve to climb before you have a full understanding. That learning curve is a large reason why I created my Learn AWS By Using It course, which helps you learn the basics of AWS by actually diving in and using it to build a solution to a common problem.

But there are things even the most seasoned AWS gurus could stand to understand better. The primary one is the AWS bill: what is costing you money and what is not. Many folks monitor their bill, and if they're smart they use a service like CloudForecast to do so.

But monitoring is only one-third of the equation.

Understanding your AWS bill and establishing a baseline for what you expect to spend month over month is another piece of the puzzle. Once you have a baseline, it's time to start taking action when things breach it.

In this post, we are going to focus on how you can take proactive actions to keep your AWS bill low. We are going to automatically destroy unused resources that are left lying around. We are going to focus on a common resource I have seen left around in AWS accounts: Elastic Block Store (EBS) volumes.

If you recall, EBS volumes are block storage volumes that we can attach to and detach from EC2 instances. They allow us to write to storage on one instance, detach the volume, attach it to a different instance, and read the same data.

On the surface, these aren't all that expensive. If we focus on SSD gp2 volumes, the pricing is $0.10 per GB-month of provisioned storage. So if I provision 8 GB of EBS storage for a total of 6 hours in a month, my cost would be:

$0.10 * 8 GB * (6 hours / 720 hours in a 30-day month) ≈ $0.007 per month

But what if that 8 GB volume ran 24 hours a day for the entire month? Now things get a bit more expensive:

$0.10 * 8 = $0.80 per month

A difference of a few cents isn't going to make or break you, but imagine if that 8 GB was 800 GB. For 6 hours out of the month, we are looking at about $0.67. But if that EBS volume was provisioned around the clock, it would be $80 a month. Quite the difference, right?
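To make the proration concrete, here is a small sketch of that math in Node.js. The rate constant is the gp2 price quoted above; the 30-day month is an assumption to keep the arithmetic simple.

```javascript
// Prorated EBS cost: pricing is quoted per GB-month, so a volume that only
// exists for part of the month is billed for that fraction of the month.
const GP2_PRICE_PER_GB_MONTH = 0.10; // USD, the gp2 rate quoted above
const HOURS_PER_MONTH = 24 * 30;     // assuming a 30-day month

function monthlyEbsCost(sizeGb, hoursProvisioned) {
  const fractionOfMonth = Math.min(hoursProvisioned / HOURS_PER_MONTH, 1);
  return GP2_PRICE_PER_GB_MONTH * sizeGb * fractionOfMonth;
}

console.log(monthlyEbsCost(8, 6).toFixed(4));     // the 6-hour example, ~$0.0067
console.log(monthlyEbsCost(8, 720).toFixed(2));   // around the clock, $0.80
console.log(monthlyEbsCost(800, 720).toFixed(2)); // 800 GB around the clock, $80.00
```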

There are all kinds of developer errors, bugs in the code, and brain farts that can leave an EBS volume running even though it's not used. In this context, we define used as attached to an active production instance. You are charged for the provisioned GBs of an EBS volume whether or not it is attached to an instance. These charges can add up if you leave lots of volumes lying around.

The simplest solution is…well to not let that happen. Monitor your infrastructure and have a firm understanding of what contributes to your monthly spend in AWS. This should help you detect these types of issues sooner.

We can also put some automated processes in place to prevent these types of surprises.

Serverless Reapers

Turning mundane manual tasks into well-oiled automated machines is a prime use case for serverless and AWS Lambda. So we are going to build a quick Lambda function that finds available EBS volumes and destroys them.

At a high-level, our reaper is going to follow this idea:

  • Trigger from a CloudWatch rule that runs on a schedule
  • List all EBS volumes in a given region that are in an available state
  • Destroy those available volumes

This reaper is going to provide us a huge benefit: automation. We write the code once, set up the schedule trigger, and then volumes left lying around in an available state are automatically destroyed. We never have to think about this problem again.

Let’s dive into some code.

Implementing a reaper in AWS Lambda

For this blog post we are going to make use of the Serverless Framework for provisioning our Lambda function. I enjoy the interface the framework provides, and it makes representing your infrastructure as code a breeze. That said, it is optional; if you choose not to use it, you will need to deploy your function by some other means.

If you want to see the code, I have created a GitHub repo that has this code and steps to deploy this into your own AWS account. https://github.com/kylegalbraith/aws-reaper-lambda

The first step is getting our serverless template the way we want. Let’s start by creating an initial project from the command line.

$ serverless create --template aws-nodejs --path aws-reaper-lambda

Cool, we now have our initial code configuration in handler.js and an initial serverless.yml file. Remember, we want our function to be triggered by a CloudWatch rule that runs on a cadence. We can add that to our serverless template in the events section.

functions:
  ebs-reaper:
    handler: handler.reaper
    events:
      - schedule: rate(1 day)

For this reaper function, we also need to grant it access to list and destroy EBS volumes. We can add that to our serverless template as well in the role statements section.

iamRoleStatements:
  - Effect: "Allow"
    Action:
      - "ec2:DescribeVolumes"
      - "ec2:DeleteVolume"
    Resource:
      - "*"

Now that we have our serverless template file all ready to go, let’s write the actual code in our function handler.

Our reaper function is going to run once per day. When it runs, it issues a describeVolumes call to the EC2 service, filtered to the volumes that are in an available state. We can then loop over that list of volumes and delete each one.

Let’s take a look at what that code looks like in our handler.js file.

const AWS = require("aws-sdk");

module.exports.reaper = async (event, context, callback) => {
  const ec2Client = new AWS.EC2();
  // Only volumes in the "available" state, i.e. not attached to any instance.
  const describeVolumeParams = {
    Filters: [
      {
        Name: "status",
        Values: ["available"]
      }
    ]
  };

  try {
    const availableVolumes = await ec2Client.describeVolumes(describeVolumeParams).promise();
    const deleteVolumePromises = [];
    availableVolumes.Volumes.forEach((volume) => {
      const deleteVolumeParams = {
        VolumeId: volume.VolumeId
      };
      console.log(`Scheduling delete of volume: ${volume.VolumeId}`);
      deleteVolumePromises.push(ec2Client.deleteVolume(deleteVolumeParams).promise());
    });

    await Promise.all(deleteVolumePromises);
    return callback(null, `Successfully removed ${availableVolumes.Volumes.length} EBS volumes.`);
  } catch (err) {
    return callback(err);
  }
};

We start off by getting a new ec2Client from the aws-sdk. Next, we declare the parameters object that we are going to pass to the describeVolumes API. Notice that we are using the Filters syntax to only grab the EBS volumes that have a status of available.

Now inside of our try block we can call describeVolumes and ask the SDK for a promise back. Once the promise returns we will have our response from the API call and we can loop over the Volumes property.

For each volume that was returned, we are going to issue a deleteVolume API call for that VolumeId and push the resulting promise onto an array of promises we are watching.

Once we are done looping through the volumes, we wait for all of the promises in our array to resolve using Promise.all(). When that returns, we know we have deleted the volumes that were in an available state, and we exit the function with a successful callback: callback(null, `Successfully removed ${availableVolumes.Volumes.length} EBS volumes.`);.

With that one function, we now have an automated process that runs every day and cleans up any EBS volumes that are lying around in an available state. This could save us a bit of money on our AWS storage costs if we are making heavy use of EBS volumes and they are being left around.
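One caveat: the handler above assumes all available volumes fit in a single describeVolumes response, but the API is paginated and returns a NextToken when there are more results. Here is a sketch of a pagination-aware listing, assuming the same aws-sdk v2 .promise() interface the handler uses; the client is injected so the loop itself can be exercised without AWS credentials.

```javascript
// Page through describeVolumes until no NextToken is returned, collecting
// every volume in the "available" state.
async function listAvailableVolumes(ec2Client) {
  const volumes = [];
  let nextToken;
  do {
    const response = await ec2Client
      .describeVolumes({
        Filters: [{ Name: "status", Values: ["available"] }],
        NextToken: nextToken
      })
      .promise();
    volumes.push(...response.Volumes);
    nextToken = response.NextToken;
  } while (nextToken);
  return volumes;
}
```

In the handler, `availableVolumes.Volumes` would then be replaced with the array this helper returns.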

Where could this go from here

This is a rather simple example of a serverless reaper, but this one function could save you significant amounts of money on your AWS bill. I had a client that had over 150 EBS volumes lying around in one region at any given point. Deploying this function allowed them to save hundreds of dollars on their monthly bill.

Of course, the longer-term solution is to figure out why those volumes are being left lying around. But sometimes it's better to save a bit of money first and then start digging into the root cause.

This is only one example of a serverless reaper.

This same idea can be applied to other areas of your AWS account. Perhaps you don't want to remove every EBS volume that is in an available state, but only those that are missing a certain tag. You can do that by filtering the volumes returned from describeVolumes.
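As a sketch of that refinement (the "keep" tag name is hypothetical): EC2's Filters syntax can match volumes that have a given tag, but not volumes that lack one, so the missing-tag check happens client-side on the describeVolumes response.

```javascript
// Keep only the volumes that do NOT carry the given tag key. Volumes from
// describeVolumes may have an empty Tags array or no Tags property at all.
function volumesMissingTag(volumes, tagKey) {
  return volumes.filter(
    volume => !(volume.Tags || []).some(tag => tag.Key === tagKey)
  );
}

// Example shapes, as returned by describeVolumes:
const volumes = [
  { VolumeId: "vol-1", Tags: [{ Key: "keep", Value: "true" }] },
  { VolumeId: "vol-2", Tags: [] },
  { VolumeId: "vol-3" } // Tags can be absent entirely
];
console.log(volumesMissingTag(volumes, "keep").map(v => v.VolumeId));
// → [ 'vol-2', 'vol-3' ]
```

In the reaper, this function would sit between the describeVolumes call and the delete loop, so tagged volumes survive.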

Or maybe you want to apply this idea not to EBS volumes but to expensive EC2 instances; you can do that too. You describe the instances in your region, filter for the ones you want to terminate, and then terminate them.
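A hypothetical instance reaper follows the same describe-filter-destroy shape. Here the selection step is a pure function over the describeInstances response (which nests instances under Reservations), so the policy for which instances are fair game stays testable; the "expendable" tag is an assumption for illustration.

```javascript
// Collect the IDs of instances explicitly tagged expendable=true.
// describeInstances groups instances under Reservations.
function expendableInstanceIds(describeInstancesResponse) {
  const ids = [];
  for (const reservation of describeInstancesResponse.Reservations) {
    for (const instance of reservation.Instances) {
      const tags = instance.Tags || [];
      if (tags.some(tag => tag.Key === "expendable" && tag.Value === "true")) {
        ids.push(instance.InstanceId);
      }
    }
  }
  return ids;
}

// The resulting IDs would then be handed to the EC2 client, e.g.:
//   await ec2Client.terminateInstances({ InstanceIds: ids }).promise();
```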

This idea can help you with any use case where you want to automatically destroy resources.

Conclusion

Leaving EBS volumes lying around likely isn't going to break the bank. But as organizations scale and more developers begin provisioning resources, this type of problem can grow much worse.

Leaving instances or RDS databases lying around that aren’t used can be 10-100x more expensive than EBS volumes depending on the instance sizes.

What we have laid out in this blog post is one approach to avoiding those surprise AWS bills. Using serverless reaper functions, we can detect when resources like EBS volumes are left lying around and delete them so we are no longer charged for them. This idea can be extended to things like EC2 instances or RDS databases. With that approach, I have seen clients save 25-50% on their AWS bill just by destroying resources that were no longer being used.

But this isn't the only process you should think about implementing for this problem. Resources left lying around unused are usually a sign that you aren't using infrastructure as code. And if you are surprised by your AWS bill at the end of the month, you should consider monitoring your costs more closely using a service like CloudForecast.

As is the case with most things in AWS software development, there are many approaches, but you have to decide on the ones that work for you.

If you have any questions about serverless reapers, feel free to ping me on Twitter: @kylegalbraith. If you have questions about CloudForecast, feel free to ping Tony: tony@cloudforecast.io