FinOps: Reduce Your Kubernetes Costs by 40%

Most organizations running Kubernetes don't actually know what it costs them. You pay cloud bills that seem reasonable in aggregate, but there's no visibility into which teams, applications, or projects consume which resources. The result: predictable waste.

We've worked with dozens of Swiss and European companies to implement FinOps discipline. The pattern is consistent: a 30-40% reduction in cloud spending without sacrificing performance or reliability.

Here's how.

The FinOps Mindset

FinOps isn't "squeeze resources until things break." It's an operational discipline where:

Everyone owns cost: Engineers, not just finance teams
Visibility is foundational: You can't optimize what you can't measure
Efficiency compounds: Small improvements across hundreds of workloads add up
Trade-offs are intentional: Sometimes spending more is the right choice

This requires three capabilities: measuring costs accurately, allocating them to teams, and building incentives to optimize.

Phase 1: Visibility (Weeks 1-4)

Step 1: Instrument Everything

You need cost visibility at every layer:

Cluster level:

How much does this Kubernetes cluster cost per month?
How much of that is compute, storage, networking?

Namespace level:

How much does the "payments" team's infrastructure cost?
How much does the "marketing" namespace consume?

Workload level:

How much does the PostgreSQL database cost?
How much does the web frontend consume?

Tools that provide this:

Kubecost (most comprehensive for Kubernetes)
Cloudability (multi-cloud, integrates with Kubernetes)
Native cloud tools (AWS Cost Explorer, GCP Billing, Azure Cost Management)

Kubecost deployment:

helm repo add kubecost https://kubecost.github.io/cost-analyzer/
helm install kubecost kubecost/cost-analyzer \
  --namespace kubecost \
  --create-namespace

Within 30 minutes, you have a dashboard showing costs by namespace, pod, and container.

Step 2: Tag Everything

Cloud billing is only useful if you can slice it by dimensions that matter to your business.

Minimum tags:

team: ownership (who pays for this?)
environment: dev, staging, production
application: name of the application
cost-center: internal billing code
project: what business initiative does this support?

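In Kubernetes, express these as labels. The example uses a bare Pod for brevity; in practice, set the labels on the pod template of a Deployment or StatefulSet so every pod inherits them: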

apiVersion: v1
kind: Pod
metadata:
  name: payment-processor
  labels:
    team: payments
    environment: production
    application: payment-processor
    cost-center: revenue-ops

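Kubecost reads these labels and allocates costs accordingly. Cloud provider billing does not see pod labels by default; it only picks them up if you enable a feature such as GKE cost allocation, or mirror the same keys as tags on the underlying cloud resources.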
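A label policy like this is easy to check mechanically, for example in CI before a manifest is applied. The following is a minimal Python sketch, not a Kubecost feature; the function name missing_labels and the choice to pass labels as a plain dict are assumptions made for illustration.

```python
# Required label keys, taken from the "Minimum tags" list above.
REQUIRED_LABELS = ("team", "environment", "application", "cost-center", "project")


def missing_labels(labels: dict[str, str]) -> list[str]:
    """Return the required label keys that are absent or empty, in policy order."""
    return [key for key in REQUIRED_LABELS if not labels.get(key)]


# Labels copied from the payment-processor manifest above.
pod_labels = {
    "team": "payments",
    "environment": "production",
    "application": "payment-processor",
    "cost-center": "revenue-ops",
}

print(missing_labels(pod_labels))  # ['project']
```

Run against the labels from the manifest above, the check reports that project is still missing, which is exactly the kind of gap that makes cost reports incomplete later.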

Step 3: Set a Cost Baseline

Before optimization, measure your current spending.

Questions to answer:

What's our monthly Kubernetes bill?
What's the breakdown by environment? (Prod should be ~70%, staging ~15%, dev ~15%)
What percentage is compute vs. storage vs. networking?
What percentage is idle/unused capacity?

Reality check: Most organizations waste 25-35% on unused resources. This isn't negligence; it's just lack of visibility.
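The baseline questions above mostly reduce to computing shares of a total. As a rough sketch, assuming you have exported monthly cost per environment and per category into plain dictionaries (the figures below are illustrative, not real billing data, and cost_shares is a hypothetical helper):

```python
def cost_shares(costs: dict[str, float]) -> dict[str, float]:
    """Return each entry's share of the total, in percent, rounded to one decimal."""
    total = sum(costs.values())
    if total <= 0:
        raise ValueError("total cost must be positive")
    return {name: round(100 * cost / total, 1) for name, cost in costs.items()}


# Illustrative monthly figures in USD.
by_environment = {"production": 84_000, "staging": 18_000, "dev": 18_000}
by_category = {"compute": 78_000, "storage": 24_000, "networking": 18_000}

print(cost_shares(by_environment))  # {'production': 70.0, 'staging': 15.0, 'dev': 15.0}
print(cost_shares(by_category))     # {'compute': 65.0, 'storage': 20.0, 'networking': 15.0}
```

Recording these shares before any optimization gives you the reference point that every later saving is measured against.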

Phase 2: Optimization (Weeks 5-12)

Optimization 1: Right-Size Resource Requests and Limits

Most teams set resource requests conservatively ("we might need 2GB of memory") when actual usage is half that.

How to measure:

kubectl top pods -n your-namespace

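Compare actual usage to requested resources. Keep in mind that kubectl top requires metrics-server and only shows a point-in-time snapshot; for usage over days or weeks, query a metrics store such as Prometheus or use Kubecost's right-sizing recommendations.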

Example optimization:

# Before: Conservative
resources:
  requests:
    cpu: 1000m
    memory: 2Gi
  limits:
    cpu: 2000m
    memory: 4Gi

# After: Right-sized (based on actual usage)
resources:
  requests:
    cpu: 250m
    memory: 512Mi
  limits:
    cpu: 500m
    memory: 1Gi

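This single change can reduce compute costs by 20-30%, provided the freed capacity is actually released: smaller requests only lower the bill once pods pack onto fewer or smaller nodes.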

Process:

Run in production for 2 weeks with monitoring
Measure p95 usage (not peak, not average)
Set requests to p95, limits to p95 + 25%
Monitor for pod evictions, OOM kills, and CPU throttling
Adjust if needed
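The p95 rule in the process above can be sketched in a few lines of Python. This is an illustration under stated assumptions: usage samples are already exported as a list of numbers (for example from your metrics store), the nearest-rank method is used for the percentile, and the helper names percentile and recommend are hypothetical.

```python
import math


def percentile(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def recommend(samples: list[float], headroom: float = 0.25) -> dict[str, int]:
    """Request = p95 of observed usage, limit = p95 plus headroom (25% by default)."""
    p95 = percentile(samples, 95)
    return {"request": math.ceil(p95), "limit": math.ceil(p95 * (1 + headroom))}


# Hypothetical CPU samples in millicores: steady load plus one short spike.
cpu_millicores = list(range(100, 290, 10)) + [900]

print(recommend(cpu_millicores))  # {'request': 280, 'limit': 350}
```

The single 900m spike does not drive the recommendation, which is the point of using p95 rather than the peak. If such spikes matter for your workload, that is a reason to widen the headroom, not to size for the maximum.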

Optimization 2: Eliminate Idle Resources

Development and staging clusters are often over-provisioned.

What to do:

Scale down non-production environments outside business hours
Implement pod autoscaling based on actual traffic
Shut down unused databases, caches, and load balancers

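Kubernetes example (scale down at 6 PM, scale up at 8 AM), using the open-source kubernetes-cronhpa-controller from Alibaba Cloud, which must be installed in the cluster first:

apiVersion: autoscaling.alibabacloud.com/v1beta1
kind: CronHorizontalPodAutoscaler
metadata:
  name: scale-down-staging
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: staging-app   # placeholder: the Deployment to scale
  jobs:
  # Schedules use six cron fields, seconds first
  - name: scale-down
    schedule: "0 0 18 * * *"
    targetSize: 1
  - name: scale-up
    schedule: "0 0 8 * * *"
    targetSize: 3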

Impact: 30-40% reduction in staging and development costs.
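To see where a figure in that range can come from, here is a small worked calculation in Python. It assumes the schedule from the example above (3 replicas from 08:00 to 18:00, 1 replica overnight) and compares replica-hours only; the function name is hypothetical, and replica-hours translate into money only if the underlying nodes are scaled down as well.

```python
def replica_hours_per_day(day_replicas: int, night_replicas: int,
                          day_start_hour: int, day_end_hour: int) -> int:
    """Replica-hours consumed in 24 hours with a daytime and an overnight replica count."""
    day_hours = day_end_hour - day_start_hour
    return day_replicas * day_hours + night_replicas * (24 - day_hours)


always_on = 3 * 24                                # 72 replica-hours
scheduled = replica_hours_per_day(3, 1, 8, 18)    # 3*10 + 1*14 = 44 replica-hours
reduction = 1 - scheduled / always_on

print(f"{reduction:.1%}")  # 38.9%
```

Scaling down on weekends too would push the reduction higher; the sketch leaves that out to stay close to the example schedule.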

Optimization 3: Implement Horizontal Pod Autoscaling (HPA)

Most applications have variable load. Autoscaling means you only pay for capacity you use.

Example:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-frontend
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-frontend
  minReplicas: 3
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80

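When traffic increases, more pods spin up automatically. When it decreases, pods terminate. The savings only materialize if node capacity shrinks too, so pair HPA with a node autoscaler such as Cluster Autoscaler or Karpenter. Utilization targets are measured against each pod's resource requests, which is another reason to right-size requests first.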

Realistic impact: 15-25% cost reduction for variable workloads.
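It helps to know roughly how the autoscaler arrives at a replica count. The Kubernetes documentation describes the core rule as desired = ceil(current replicas × current metric / target metric), with the largest proposal winning when several metrics are configured. The Python sketch below reproduces only that rule with the min/max bounds and targets from the manifest above; it deliberately ignores stabilization windows, pod readiness, and missing metrics, and the function names are hypothetical.

```python
import math


def desired_for_metric(current_replicas: int, current: float, target: float,
                       tolerance: float = 0.1) -> int:
    """Replica proposal for one metric; no change while within the tolerance band."""
    ratio = current / target
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


def desired_replicas(current_replicas: int, metrics: list[tuple[float, float]],
                     min_replicas: int = 3, max_replicas: int = 20) -> int:
    """Take the largest per-metric proposal, then clamp to the configured bounds."""
    proposal = max(desired_for_metric(current_replicas, cur, tgt) for cur, tgt in metrics)
    return max(min_replicas, min(max_replicas, proposal))


# (current utilization %, target utilization %) for CPU and memory.
print(desired_replicas(5, [(90, 70), (60, 80)]))  # 7  (CPU drives a scale-up)
print(desired_replicas(5, [(20, 70), (30, 80)]))  # 3  (scale-down stops at minReplicas)
```

The second call shows why minReplicas matters for cost: it is the floor you pay for even when traffic is near zero.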

Optimization 4: Clean Up Unused Storage

Persistent volumes and snapshots accumulate. Old databases, backups, and configuration often linger long after anyone needs them.

Audit questions:

Which persistent volumes haven't been accessed in 30 days?
Which snapshots are older than our retention policy?
Which databases are disconnected from production?

Process:

Tag all storage with creation date and owner
Every 90 days, audit unused storage
Delete what's not needed
Implement automated cleanup for logs older than 30 days

Impact: 10-15% reduction in storage costs.
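The retention part of this audit is simple to automate once storage carries a creation date and an owner. The following Python sketch assumes a snapshot inventory has already been exported into a list of dictionaries; the field names, snapshot names, and the 90-day retention value are illustrative assumptions, and the function only lists candidates rather than deleting anything.

```python
from datetime import date, timedelta


def expired_snapshots(snapshots: list[dict], today: date, retention_days: int) -> list[str]:
    """Return names of snapshots created before the retention cutoff."""
    cutoff = today - timedelta(days=retention_days)
    return [snap["name"] for snap in snapshots if snap["created"] < cutoff]


# Hypothetical inventory, tagged with creation date and owner.
snapshots = [
    {"name": "orders-db-2024-01-15", "created": date(2024, 1, 15), "owner": "payments"},
    {"name": "orders-db-2024-06-01", "created": date(2024, 6, 1), "owner": "payments"},
]

print(expired_snapshots(snapshots, today=date(2024, 6, 30), retention_days=90))
# ['orders-db-2024-01-15']
```

Keeping the owner in the inventory means the list of candidates can be sent to the responsible team for confirmation before anything is removed.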

Optimization 5: Negotiate Commitments

Once you understand your actual baseline usage (after optimization), lock in committed use discounts.

Cloud provider options:

AWS: Reserved Instances or Savings Plans (1-year or 3-year)
GCP: Committed Use Discounts (1-year or 3-year)
Azure: Reservations or savings plan for compute (1-year or 3-year)

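Reality: If you use 10 CPUs consistently, a 1-year commitment for those 10 CPUs typically saves on the order of 30-40% vs. on-demand pricing. The exact discount depends on the provider, instance family, and payment terms.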

Caveat: Only commit to what you'll actually use. Don't predict growth you can't guarantee.
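The caveat can be made concrete with a break-even calculation. A commitment is paid in full whether or not you use it, so it only beats on-demand pricing when utilization of the committed capacity stays above 100% minus the discount. The Python sketch below uses a 35% discount as an illustrative value inside the range quoted above; the function name and the $10,000 figure are assumptions.

```python
def monthly_saving(on_demand_cost: float, discount_pct: int, utilization_pct: int) -> float:
    """Saving from committing to capacity that costs on_demand_cost per month on demand.

    The commitment is paid in full regardless of use; on demand you would pay
    only for the utilized share. A negative result means the commitment loses money.
    """
    committed = on_demand_cost * (100 - discount_pct) / 100
    pay_as_you_go = on_demand_cost * utilization_pct / 100
    return pay_as_you_go - committed


print(monthly_saving(10_000, 35, 100))  # 3500.0  (fully used: the whole discount is saved)
print(monthly_saving(10_000, 35, 65))   # 0.0     (break-even utilization)
print(monthly_saving(10_000, 35, 50))   # -1500.0 (half used: cheaper to stay on demand)
```

This is why commitments belong after right-sizing: committing to the pre-optimization footprint locks in capacity you are about to stop using.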

Phase 3: Culture (Weeks 13+)

Step 1: Make Costs Visible to Engineers

Most developers have no idea what their applications cost to run.

How to fix it:

Monthly cost reports per team
Cost breakdown in pull requests (Kubecost has GitHub integration)
Cost alerts when deployments exceed expected spending

Example PR integration:

This PR adds a new database, estimated additional cost: $500/month.
Estimated monthly savings from right-sizing: $200.
Net monthly impact: +$300.

Engineers become cost-aware quickly when they see numbers.

Step 2: Incentivize Optimization

Once cost awareness is shared, make optimization a shared goal too.

Approaches:

Budget per team with carryover (save money, reinvest in infrastructure)
Cost optimization contests (30% reduction = team reward)
Cost-per-request metrics (visible on dashboards alongside latency and error rate)

Reality: Competitive teams usually optimize aggressively if they see the impact.

Step 3: Build Cost Governance

Not all cost reduction is good. Sometimes spending more prevents bigger problems.

Governance rules:

Standard resource profiles (small/medium/large) to reduce decision paralysis
Budget thresholds (this team gets $50k/month; anything above requires approval)
Cost review meetings (monthly, quick, focused on outliers)
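The budget threshold rule is straightforward to automate as input for the monthly review. A minimal Python sketch, assuming monthly spend per team is available as a dictionary; the $50k budget for one team comes from the example above, while the other team, its budget, and all spend figures are made up for illustration.

```python
def budget_overruns(spend: dict[str, float], budgets: dict[str, float]) -> dict[str, float]:
    """Return the overrun amount for every team whose spend exceeds its budget."""
    return {
        team: spend[team] - limit
        for team, limit in budgets.items()
        if spend.get(team, 0) > limit
    }


# Monthly figures in USD.
budgets = {"payments": 50_000, "marketing": 20_000}
spend = {"payments": 56_000, "marketing": 18_500}

print(budget_overruns(spend, budgets))  # {'payments': 6000}
```

Only teams above their threshold appear in the result, which keeps the approval conversation focused on outliers.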

Step 4: Establish a FinOps Dashboard and Review Cadence

Governance rules only work if they are backed by visible, up-to-date data. Build a centralized FinOps dashboard that consolidates cost data from Kubecost, your cloud provider billing APIs, and any internal chargeback systems into a single view. The dashboard should answer three questions at a glance: where are we spending, who is responsible, and what changed since last month.

We recommend organizing the dashboard into four panels. The first panel shows total monthly spend with a trend line and budget threshold markers. The second breaks cost down by team or namespace, highlighting any team that exceeds its allocated budget. The third panel tracks idle resource percentage over time, which serves as a proxy for optimization health. The fourth panel lists the top ten cost anomalies of the month: deployments, scaling events, or storage allocations that deviated significantly from historical patterns.

Pair the dashboard with a monthly FinOps review meeting of no more than 30 minutes. Invite one representative per engineering team plus a finance stakeholder. The agenda is simple: review the anomaly panel, discuss any budget overruns, and agree on optimization actions for the next cycle. This lightweight cadence prevents cost drift without burdening teams with excessive process.
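One simple way to populate the anomaly panel is to score each workload by how far its current cost sits from its own history, measured in standard deviations. This Python sketch is one possible approach and not how Kubecost or any cloud provider detects anomalies; the threshold of 3, the workload names, and the cost figures are illustrative assumptions, and each history needs at least two data points.

```python
from statistics import mean, stdev


def deviation(history: list[float], current: float) -> float:
    """Distance of current from the historical mean, in standard deviations."""
    sigma = stdev(history)
    if sigma == 0:
        return 0.0 if current == history[0] else float("inf")
    return abs(current - mean(history)) / sigma


def top_anomalies(history: dict[str, list[float]], current: dict[str, float],
                  threshold: float = 3.0, limit: int = 10) -> list[str]:
    """Names of the workloads that deviate most, largest deviation first."""
    scored = [(deviation(history[name], cost), name)
              for name, cost in current.items() if name in history]
    flagged = sorted((item for item in scored if item[0] >= threshold), reverse=True)
    return [name for _, name in flagged[:limit]]


# Hypothetical monthly cost per workload in USD.
history = {
    "web-frontend": [100, 102, 98, 101, 99],
    "postgres": [200, 210, 190, 205, 195],
}
current = {"web-frontend": 150, "postgres": 204}

print(top_anomalies(history, current))  # ['web-frontend']
```

The frontend is flagged because its cost jumped far outside its usual narrow band, while the database's change is within normal month-to-month variation.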

Real Numbers: A 500-Person Company's Optimization

Before:

Monthly Kubernetes bill: $120,000
Resource utilization: 35%
Idle capacity: $42,000/month (35% of the bill)

After (three months of optimization):

Monthly bill: $72,000 (40% reduction)
Resource utilization: 60%
Applied optimizations:
- Right-sizing requests: -20%
- Scaling down non-prod: -8%
- Horizontal autoscaling: -7%
- Cleaning up unused storage: -5%

Total savings: $48,000/month or $576,000/year.
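The figures above can be checked with a few lines of arithmetic. The sketch assumes the individual reductions are percentage points of the original $120,000 bill, which is the reading under which they add up to the reported 40%.

```python
baseline_monthly = 120_000  # USD, monthly bill before optimization

# Reductions in percentage points of the baseline bill.
reductions_pct = {
    "right-sizing requests": 20,
    "scaling down non-prod": 8,
    "horizontal autoscaling": 7,
    "cleaning up unused storage": 5,
}

total_pct = sum(reductions_pct.values())
monthly_savings = baseline_monthly * total_pct // 100
new_monthly_bill = baseline_monthly - monthly_savings
annual_savings = monthly_savings * 12

print(total_pct, monthly_savings, new_monthly_bill, annual_savings)
# 40 48000 72000 576000
```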

Effort: Two engineers for three months, one ongoing FTE for maintenance.

Common Mistakes

Mistake 1: Optimizing without a baseline
Fix: Measure before, measure after. Otherwise you won't know what worked.

Mistake 2: Overly aggressive right-sizing
Fix: Set limits slightly above p95, not at average. Leave headroom for unexpected spikes.

Mistake 3: Killing production features to save cost
Fix: FinOps is about efficiency, not compromise. If a feature is valuable, its cost is justified.

Mistake 4: Neglecting multi-cloud or hybrid scenarios
Fix: Use normalized cost reporting. Ensure you're comparing apples to apples across providers.

The Timeline

Month 1: Visibility (Kubecost, tagging, baseline measurement)
Months 2-3: Quick wins (right-sizing, cleanup, scaling)
Month 4 onward: Culture (reporting, incentives, governance)

Most organizations see 15-20% cost reduction by month two, 30-40% by month three.

Moving Forward

FinOps is not a project you finish. It's an operational discipline that compounds over time. Every 5% reduction in waste, across hundreds of workloads, adds up to significant savings.

The first 30% is relatively easy (right-sizing, cleanup, autoscaling). The next 20% requires more effort (commitment planning, architecture redesign). Beyond that, you're looking at core product changes.

Most organizations see the greatest ROI in focusing on the easy 30%.
