
Cloud FinOps and GKE: How Generali optimized spend with Google Cloud Consulting

May 31, 2024
Ecaterina Grisaeva

Strategic Cloud Engineer, Google Cloud Consulting

Sam Moss

FinOps and Digital Transformation Consultant, Google Cloud Consulting


“As a small platform team, how do we optimize hundreds of workloads running in multiple GKE clusters to be cost-effective, without sacrificing performance or reliability?”

This was the question facing the team in charge of a suite of applications hosted on Google Kubernetes Engine (GKE) at Generali Switzerland in the summer and autumn of 2023 — a tough challenge that would resonate with many of our customers.

GKE can be difficult to optimize for a couple of reasons:

  1. There isn’t one single metric which you can use to optimize a cluster reliably. Instead, you’ll need to take into account multiple factors to determine what an effective optimization might be (keep reading to learn more about the four golden signals of GKE).

  2. GKE is often decentralized to some extent. A central team maintains the clusters, and application teams deploy the workloads on those clusters. This model works well as it separates the application and the infrastructure into different spheres of responsibility. However, it also means that if the application teams oversize their requests, it will waste resources. Engineers need to be conscious of sizing workloads correctly.

With GKE usage contributing a significant portion of its cloud spend, Generali felt there was room for improvement, and engaged Google Cloud Consulting to help the global insurance company explore how best to optimize its clusters and workloads across both its production and non-production environments. The result was the identification of more than €260,000 in annual savings, representing a 41% reduction in GKE spend, achieved over a 12-week project between July and September 2023.

Analyze the ‘golden signals’

So how was this achieved? What tools were used? And, more importantly, how can you achieve the same results?

When optimizing GKE spend, we always start by analyzing what we call the “golden signals,” a set of metrics that you can use to analyze the efficiency of your GKE cluster. These signals include the following:

  1. Workload rightsizing and utilization

  2. Demand-based autoscaling

  3. Bin packing

  4. Discount coverage


You can read more about all of these signals in the State of Kubernetes Cost Optimization report, but think of them as levers you can pull to optimize the costs of your cluster. So how did Generali use these signals?

Workload rightsizing

Let's start with workload rightsizing, which means setting CPU and memory requests as close as possible to actual resource utilization. It should be the very first step when optimizing a cluster, because rightsizing changes affect the optimal settings for bin packing and discounts. As an activity, workload rightsizing is relatively simple when you have a small number of applications to consider. However, it becomes extremely difficult to enforce at scale when, like Generali, you have many teams and hundreds of applications running on your clusters.

Working with Google Cloud Consulting, Generali observed that while most applications had requests set, resource utilization was less than 20% of what was being requested — a huge opportunity for savings. To optimize this golden signal, Generali reviewed and lowered all application requests and ran sessions with its application teams to educate and upskill them on the importance of setting correct requests. This amounted to approximately €100,000 of savings per year!
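To make that concrete, here is a minimal sketch of what a rightsized container spec can look like. The workload name, image, and numbers are hypothetical; the right values for your own workloads come from observing actual utilization over time:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: quotes-api                 # hypothetical application name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: quotes-api
  template:
    metadata:
      labels:
        app: quotes-api
    spec:
      containers:
      - name: quotes-api
        image: europe-docker.pkg.dev/example-project/apps/quotes-api:1.4.2   # hypothetical image
        resources:
          requests:
            cpu: 250m        # previously 1000m; observed usage peaks near 200m
            memory: 512Mi    # previously 2Gi; observed usage peaks near 400Mi
          limits:
            memory: 512Mi    # memory limit set equal to the request to avoid unbounded growth
```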

Tip: You can check how your workloads are rightsized by using the out-of-the-box GKE Cost Optimization graphs, along with the recommended requests. The GKE Workloads at Risk dashboards, available in the Google Cloud console, also point out where containers are not setting any requests, and surface a wealth of other information about your clusters.

Rightsizing your requests can have a dramatic impact on cluster efficiency and cost. Poorly set requests, such as not setting any requests or setting them too low, can also affect application reliability, as workloads might be evicted from the cluster. You should always investigate this signal first and make sure you continually monitor those requests.
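One lightweight way to keep monitoring requests, assuming vertical Pod autoscaling is enabled on the cluster, is to run the Vertical Pod Autoscaler in recommendation-only mode against a workload. A minimal sketch, targeting the hypothetical Deployment from the earlier example:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: quotes-api-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: quotes-api          # hypothetical Deployment from the earlier example
  updatePolicy:
    updateMode: "Off"         # recommendation only: VPA suggests requests but never evicts or resizes pods
```

Running kubectl describe against the VerticalPodAutoscaler object then shows recommended CPU and memory requests, which teams can compare against what they have actually set.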

Demand-based autoscaling

The next signal that Generali considered was demand-based autoscaling, which means configuring both the application and the infrastructure to autoscale based on end-user demand. This can cut significant costs for workloads with fluctuating traffic. For example, Generali already had infrastructure autoscaling in place, but discovered after analyzing this signal that it didn't always trigger properly and, in particular, would not scale down when traffic was low. This was caused by the way the pods had been configured to autoscale, together with some application configuration errors.

To change this behavior, Generali fixed the issues preventing the scale down and configured the horizontal and vertical pod autoscaler based on best practices provided by Google Cloud Consulting. 
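As an illustration of what such a configuration can look like, here is a sketch of a Horizontal Pod Autoscaler that scales on CPU utilization and waits before scaling down. The names and thresholds are hypothetical, not Generali's actual settings:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: quotes-api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: quotes-api            # hypothetical Deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70  # add replicas when average CPU passes 70% of requests
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300   # wait 5 minutes of low demand before removing replicas
```

Note that the utilization target is measured against the requests you set, which is one more reason rightsizing comes first; a common caveat is to avoid having the Horizontal and Vertical Pod Autoscalers both actively manage the same CPU or memory metric for the same workload.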

For Generali, unblocking vertical and horizontal Pod autoscaling for certain workloads helped realize immediate savings, as the company was no longer paying for nodes it didn't need. The team also applied a scale-to-zero technique, which realized an estimated €35,000 in annual savings.

Bin packing

The third signal, bin packing, involves picking the best number and characteristics of nodes to fit the workload: the better you pack apps onto nodes, the more you save. Generali was using GKE Standard with various node pools, including Spot VMs, for better bin packing. While bin packing was already above 60%, there was still room to improve. The team focused on migrating from N2 machines to cost-optimized E2 machines to strike a better balance of performance and cost. With this change and further bin packing optimizations, Generali identified around €105,000 of yearly savings.
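For workloads that tolerate interruption, steering them onto Spot nodes is one way to pair tight bin packing with cheaper capacity. A minimal sketch, assuming a Spot node pool already exists; the workload and image names are hypothetical:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: report-worker            # hypothetical fault-tolerant batch workload
spec:
  replicas: 4
  selector:
    matchLabels:
      app: report-worker
  template:
    metadata:
      labels:
        app: report-worker
    spec:
      nodeSelector:
        cloud.google.com/gke-spot: "true"   # GKE labels Spot nodes with this key
      tolerations:
      - key: cloud.google.com/gke-spot      # only needed if the Spot node pool is tainted
        operator: Equal                     # to keep non-tolerant workloads off those nodes
        value: "true"
        effect: NoSchedule
      containers:
      - name: worker
        image: europe-docker.pkg.dev/example-project/apps/report-worker:2.0.1   # hypothetical image
        resources:
          requests:
            cpu: 500m
            memory: 1Gi
```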

To optimize even more, Generali also experimented with GKE Autopilot, which has a lower total cost of ownership because you aren't billed for unused capacity on your nodes. GKE picks and manages the most optimal nodes for you, leading to better bin packing, and takes care of many operational tasks, such as configuring autoscaling, that you would otherwise have to do yourself. To learn more, read the Autopilot documentation guide or check out how GKE Standard and GKE Autopilot features compare.

Discount coverage

Finally, discount coverage is a crucial part of getting the best ROI on GKE. Specifically, it's important to ensure that any cloud resources that can be covered with a committed use discount (CUD) are covered.

CUDs are a way to get significant discounts on your GKE usage in exchange for committing to use those resources for a set period (typically 1 or 3 years). They are most effective when you have a predictable and consistent need for GKE resources.
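To illustrate with purely hypothetical numbers: if a baseline of nodes costs €10,000 per month on demand and a one-year commitment offers a 30% discount, covering that baseline brings it down to roughly €7,000 per month, provided usage really does stay at or above the committed level for the full term.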

GKE offers two types of CUDs:

  1. Resource-based commitments, which apply to the Compute Engine resources (such as vCPU and memory) that back your nodes.

  2. Spend-based flexible CUDs, which apply to eligible GKE spend across machine families and regions.

With CUDs, it's best to start conservatively and build up coverage. Generali already had CUDs in place and opted to pause purchasing new commitments until this optimization work was complete, since any optimization would change the overall amount of infrastructure in use. Generali has forecast around €75,000 of savings from CUDs.

Great results for Generali and Google Cloud

Working with Google Cloud Consulting, Generali was able to achieve double-digit percentage savings on its GKE investment. According to Mohamed Talhaoui, the project lead, “If there was one takeaway from this engagement that Generali took, it was the importance of proactive financial management and optimization.”

As next steps for cost optimization, Generali plans to finalize its rightsizing and bin packing for all clusters and complete its evaluation of GKE Autopilot. The team is also preparing new commitments to purchase once all the cost optimization exercises are complete.

“The engagement with the Google Cloud Consulting FinOps team proved immensely beneficial for our organization,” Talhaoui said. “Through their expertise, guidance, and focused efforts, we were able to achieve significant cost optimization within a remarkably brief timeframe, without compromising performance or quality.”

Get started

Ready to learn more? Check out our services focused on FinOps and discover how Google Cloud Consulting can help you learn, build, operate and succeed.


Special thanks and huge congratulations to the GKE team at Generali Switzerland, who worked on identifying and implementing these savings and shared their insights for this blog post: Nada Kammoun-Ksibi, Mohamed Talhaoui, Christophe Mugnier, Mohamed Marzouk, Daniele Altomare, Jomon Joseph and Gustavo Benitez.
