Using Spot Instances as Jenkins Slaves for Cost Optimization

Written entirely in Java, Jenkins is an open source, cross-platform automation server used to build and test software projects continuously, which makes the software development process smoother. This widely used server-based application has around 300k installations worldwide, and that number is growing quickly. Jenkins accelerates software development by supporting the entire lifecycle, from building, testing, and documenting to deploying the software. It supports a Master-Slave architecture, also known as Jenkins Distributed Builds, which allows jobs to run on different machines and operating systems. By distributing builds, Jenkins delivers results faster, which in turn helps with cost optimization.

Last month, one of our customers based in the Nordics came to us with a problem. They were looking for a solution that would help them optimize cost without compromising value-added service delivery. Loves Cloud used Spot Instances as Jenkins Slaves to help them achieve that. But before we tell you how we reached the solution, here’s a brief introduction to the client:

About the Client

The client is a major provider of SaaS applications for the Oil and Gas industry. They provide extensive services with large data sets to solve complex problems for their customers. Their applications use Machine Learning, Data Science, and AWS services to solve some of the most difficult tasks accurately for their customers. So, it is quite clear that the company is not only an active contributor to digital transformation but also helps other enterprises adopt the same culture.

The challenges our client was facing

In automated Continuous Integration (CI), developers submit changes to a central version control repository, where they are automatically built and run through a test suite. As builds and tests succeed or fail, the development team gets both a high-level and a detailed view of the status of the code base, which gives more confidence in deployments. The problem with automated CI is that it typically requires an always-on server or an expensive SaaS product in order to run builds and tests at any given time. This can be extremely expensive, as a large development team needs more resources to run its tests in a reasonable amount of time. That is exactly what had happened to our client. Their costs were spiraling and affecting the bottom line.

A deeper dive into the problem revealed another issue our customer was facing: the build environment. Companies like theirs, with large and heavy projects ongoing at any given point in time, need builds on a regular basis. Running all of the builds on a central machine can slow down the software development lifecycle. In such circumstances, several different environments are needed to run and test the builds, which our client did not have. This resulted in lower build capacity and led to inadequate software deployment.

Goal-setting

The problem analysis followed by a detailed discussion with the CTO of the client company helped us to identify the following goals:

  • Increase the effective build capacity of the Jenkins server
  • Categorize build slaves based on build requirements such as UI, AWS SAM, Docker, and Python
  • Use different build slaves for different use cases
  • Reduce the overall cost of running Jenkins continuous integration (CI)

The next step was solution implementation. This is how we approached the task:

Solution

Amazon Web Services (AWS) Spot Instances are a major cost saver for software companies, as they are considerably cheaper than On-Demand Instances. The problem with Spot Instances is that they can disappear at any point. To keep this unpredictability from affecting our client, we ran the Spot Instances as Jenkins Slaves and let Spot Fleet maintain the required build capacity. By combining right-sizing with AWS Spot Instances, we implemented the following solution:

  • Decreased the instance size of the Jenkins master to a small instance
  • Updated the Jenkins configuration to run all build jobs on build slaves
  • Used AWS Spot Fleet to request and maintain the required build capacity. By using Spot Instances, we saved about 70% on instance cost while still maintaining the required build capacity.
  • Created two Spot Fleets with different CPU and memory requirements based on build types:
    • For Node.js and UI builds
    • For AWS SAM and Docker builds
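
As a rough illustration of the two-fleet setup above, the following Python sketch builds one Spot Fleet request per build type. It is not the configuration used in the project: the role ARN, AMI ID, subnet IDs, instance types, and target capacities are all hypothetical placeholders. The dictionary keys follow the SpotFleetRequestConfig structure accepted by the EC2 API.

```python
from typing import Any, Dict, List


def spot_fleet_config(
    fleet_role_arn: str,
    ami_id: str,
    subnet_ids: List[str],
    instance_types: List[str],
    target_capacity: int,
) -> Dict[str, Any]:
    """Build a SpotFleetRequestConfig dict for one pool of build slaves."""
    # One launch specification per (instance type, subnet) pair, so the
    # fleet can draw from several Spot capacity pools.
    launch_specs = [
        {"ImageId": ami_id, "InstanceType": itype, "SubnetId": subnet}
        for itype in instance_types
        for subnet in subnet_ids
    ]
    return {
        "IamFleetRole": fleet_role_arn,
        "AllocationStrategy": "lowestPrice",
        "TargetCapacity": target_capacity,
        "Type": "maintain",  # ask the fleet to replace interrupted instances
        "LaunchSpecifications": launch_specs,
    }


# Hypothetical values, for illustration only.
ROLE = "arn:aws:iam::123456789012:role/example-spot-fleet-role"
AMI = "ami-0123456789abcdef0"
SUBNETS = ["subnet-aaaa1111", "subnet-bbbb2222"]

# Smaller instances for Node.js and UI builds, larger ones for SAM and Docker.
ui_fleet = spot_fleet_config(ROLE, AMI, SUBNETS, ["t3.medium", "t3a.medium"], 2)
docker_fleet = spot_fleet_config(ROLE, AMI, SUBNETS, ["c5.xlarge", "m5.xlarge"], 2)

# With boto3 installed and AWS credentials configured, each dict could be
# submitted like this (not executed here):
#   import boto3
#   boto3.client("ec2").request_spot_fleet(SpotFleetRequestConfig=ui_fleet)
```

Keeping the fleet definition in a small function makes it easy to add a third pool later by changing only the instance types and capacity.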

The process allowed us to maintain different build slave pools for different job types based on the project requirements.
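
In Jenkins, jobs are usually tied to a group of slaves through labels. The sketch below shows, under assumed names, how job types could be mapped to the two slave pools. The label names "spot-ui" and "spot-docker" and the job type keys are hypothetical and do not come from the client's setup.

```python
from typing import Dict

# Hypothetical mapping from job type to the label of the slave pool
# that should run it.
POOL_LABELS: Dict[str, str] = {
    "nodejs": "spot-ui",
    "ui": "spot-ui",
    "sam": "spot-docker",
    "docker": "spot-docker",
}


def label_for_job(job_type: str) -> str:
    """Return the slave-pool label for a job type, or raise if unknown."""
    try:
        return POOL_LABELS[job_type.lower()]
    except KeyError:
        raise ValueError(
            f"no build slave pool configured for job type {job_type!r}"
        ) from None


print(label_for_job("Docker"))  # prints "spot-docker"
```

Failing loudly on an unknown job type is a deliberate choice here: a job that silently falls back to the master would defeat the goal of running all builds on slaves.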

Tech stack used by Loves Cloud

To implement the solution, we used the following tools, platforms, services, and programming languages:

  • Public cloud platform – AWS. The following AWS services were prominently used:
    • Amazon EC2
    • EC2 Spot Fleet
    • AWS SAM templates
    • AWS CloudFormation
    • AWS Lambda
    • Amazon Elastic Container Registry (ECR)
    • Amazon Elastic Container Service (ECS)
    • Amazon Simple Storage Service (S3)
  • Jenkins – as the continuous integration server
  • Docker – to containerize Node.js and Python
  • AWS SAM CLI – for building and packaging Lambda applications for the AWS platform
  • Programming languages – Node.js, Python

Results

Needless to say, structured problem analysis, a careful choice of tools and services, and the solution implementation helped us reach our goals within the set timeline. Better still, by combining right-sizing with AWS Spot Instances, we were able to deliver the following benefits to our client:

  1. Almost 2X increase in Jenkins CI build capacity
    This would help them across the software development lifecycle while maintaining the consistency of the build process.
  2. 40% reduction in the cost of running Jenkins CI
    We used AWS Spot Instances, which are spare Amazon Elastic Compute Cloud (EC2) capacity. EC2 Spot Fleet automatically replenishes these instances when they are interrupted, which helps maintain the target capacity. Each combination of instance type and Availability Zone forms a separate capacity pool. By launching a Spot Fleet across multiple such capacity pools, we could launch the lowest-priced instances currently available. This enabled us to reduce the cost of running Jenkins CI by 40%.
  3. No queued builds due to increased build capacity
    A build queue is a list of builds that have been triggered but are waiting to be started. Before our engagement, our client had too many builds waiting in the queue, which can happen for many reasons. After we implemented the changes, no more queued builds were observed during software development. This showed that the effective build capacity of the Jenkins server had increased, as targeted.
  4. Faster build and deployment
    This would benefit our client in more than one way. The development process would become faster, with less risky releases. Faster build and deployment would also support continuous improvement, giving new momentum to their digital transformation journey.
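
The capacity-pool idea behind the cost reduction can be sketched in a few lines of Python. This is only an illustration of lowest-price selection across pools, not how Spot Fleet is implemented internally. The instance types, Availability Zones, and hourly prices below are made up for the example.

```python
from typing import Dict, List, Tuple

Pool = Tuple[str, str]  # (instance type, Availability Zone)


def cheapest_pools(prices: Dict[Pool, float], count: int = 1) -> List[Pool]:
    """Return the `count` lowest-priced pools, cheapest first."""
    if count < 1:
        raise ValueError("count must be at least 1")
    # Sort by price, then by pool name so that ties are resolved predictably.
    return sorted(prices, key=lambda pool: (prices[pool], pool))[:count]


# Made-up hourly prices in USD, for illustration only.
example_prices = {
    ("c5.xlarge", "eu-north-1a"): 0.061,
    ("c5.xlarge", "eu-north-1b"): 0.058,
    ("m5.xlarge", "eu-north-1a"): 0.064,
    ("m5.xlarge", "eu-north-1b"): 0.066,
}

print(cheapest_pools(example_prices, count=2))
# prints [('c5.xlarge', 'eu-north-1b'), ('c5.xlarge', 'eu-north-1a')]
```

The more pools a fleet is allowed to choose from, the more likely it is that at least one of them has cheap capacity available at any given moment.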

At Loves Cloud, we support various public cloud computing platforms along with multiple open source software solutions to make our customers’ digital transformation journey smooth. To learn more about our services aimed at the digital transformation of your business, please visit https://www.loves.cloud/ or write to us at biz@loves.cloud.