FinOps: Reduce Your Kubernetes Costs by 40%
Most organizations running Kubernetes don't actually know what it costs them. You pay cloud bills that seem reasonable in aggregate, but there's no visibility into which teams, applications, or projects consume which resources. The result: predictable waste.
We've worked with dozens of Swiss and European companies to implement FinOps discipline. The pattern is consistent: a 30-40% reduction in cloud spending without sacrificing performance or reliability.
Here's how.
The FinOps Mindset
FinOps isn't "squeeze resources until things break." It's an operational discipline where:
- Everyone owns cost: Engineers, not just finance teams
- Visibility is foundational: You can't optimize what you can't measure
- Efficiency compounds: Small improvements across hundreds of workloads add up
- Trade-offs are intentional: Sometimes spending more is the right choice
This requires three capabilities: measuring costs accurately, allocating them to teams, and building incentives to optimize.
Phase 1: Visibility (Weeks 1-4)
Step 1: Instrument Everything
You need cost visibility at every layer:
Cluster level:
- How much does this Kubernetes cluster cost per month?
- How much of that is compute, storage, networking?
Namespace level:
- How much does the "payments" team's infrastructure cost?
- How much does the "marketing" namespace consume?
Workload level:
- How much does the PostgreSQL database cost?
- How much does the web frontend consume?
Tools that provide this:
- Kubecost (most comprehensive for Kubernetes)
- Cloudability (multi-cloud, integrates with Kubernetes)
- Native cloud tools (AWS Cost Explorer, GCP Billing, Azure Cost Management)
Kubecost deployment:
helm repo add kubecost https://kubecost.github.io/cost-analyzer/
helm install kubecost kubecost/cost-analyzer \
  --namespace kubecost \
  --create-namespace
Within 30 minutes, you have a dashboard showing costs by namespace, pod, and container.
Step 2: Tag Everything
Cloud billing is only useful if you can slice it by dimensions that matter to your business.
Minimum tags:
- team: ownership (who pays for this?)
- environment: dev, staging, production
- application: name of the application
- cost-center: internal billing code
- project: what business initiative does this support?
In Kubernetes:
apiVersion: v1
kind: Pod
metadata:
  name: payment-processor
  labels:
    team: payments
    environment: production
    application: payment-processor
    cost-center: revenue-ops
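Kubernetes-aware cost tools such as Kubecost read these labels and allocate costs accordingly. Cloud provider billing works from tags on the underlying cloud resources (nodes, disks, load balancers), so keep those tags consistent with your Kubernetes labels.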
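A label policy only helps if it is enforced. As a minimal sketch, assuming manifests have already been parsed into Python dictionaries, the check below reports which of the minimum tags are missing. The function name `missing_labels` and the `REQUIRED_LABELS` constant are illustrative and not part of any Kubernetes or Kubecost API.

```python
# Minimum tags from the list above.
REQUIRED_LABELS = ("team", "environment", "application", "cost-center", "project")


def missing_labels(manifest, required=REQUIRED_LABELS):
    """Return the required label keys that are absent or empty on a manifest."""
    labels = (manifest.get("metadata") or {}).get("labels") or {}
    return [key for key in required if not labels.get(key)]


# The example Pod from this section, written as a dictionary.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {
        "name": "payment-processor",
        "labels": {
            "team": "payments",
            "environment": "production",
            "application": "payment-processor",
            "cost-center": "revenue-ops",
        },
    },
}

print(missing_labels(pod))  # ['project']
```

A check like this could run in CI so that unlabeled workloads are caught before they reach the cluster.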
Step 3: Set a Cost Baseline
Before optimization, measure your current spending.
Questions to answer:
- What's our monthly Kubernetes bill?
- What's the breakdown by environment? (Prod should be ~70%, staging ~15%, dev ~15%)
- What percentage is compute vs. storage vs. networking?
- What percentage is idle/unused capacity?
Reality check: Most organizations waste 25-35% on unused resources. This isn't negligence; it's just lack of visibility.
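Once cost data is exported, the baseline questions reduce to simple arithmetic. The sketch below assumes a hypothetical export in which each record carries an environment, a category, a monthly cost, and an average utilization between 0 and 1. All figures are made up, and estimating idle spend as cost times (1 - utilization) is a simplifying assumption.

```python
from collections import defaultdict

# Hypothetical monthly cost export; every figure here is illustrative.
RECORDS = [
    {"environment": "production", "category": "compute", "cost": 60_000, "utilization": 0.70},
    {"environment": "production", "category": "storage", "cost": 10_000, "utilization": 0.80},
    {"environment": "staging", "category": "compute", "cost": 15_000, "utilization": 0.60},
    {"environment": "dev", "category": "compute", "cost": 15_000, "utilization": 0.40},
]


def share_by(records, key):
    """Return each group's fraction of total cost, grouped by the given field."""
    totals = defaultdict(float)
    for record in records:
        totals[record[key]] += record["cost"]
    grand_total = sum(totals.values())
    return {group: cost / grand_total for group, cost in totals.items()}


def idle_share(records):
    """Estimate the fraction of spend that pays for unused capacity."""
    total = sum(record["cost"] for record in records)
    idle = sum(record["cost"] * (1 - record["utilization"]) for record in records)
    return idle / total


for environment, share in share_by(RECORDS, "environment").items():
    print(f"{environment}: {share:.0%}")
print(f"idle: {idle_share(RECORDS):.0%}")
```

The same `share_by` call with `"category"` as the key gives the compute versus storage split.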
Phase 2: Optimization (Weeks 5-12)
Optimization 1: Right-Size Resource Requests and Limits
Most teams set resource requests conservatively ("we might need 2GB of memory") when actual usage is half that.
How to measure:
kubectl top pods -n your-namespace
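Compare actual usage to requested resources. Note that kubectl top needs the Metrics Server installed and shows only a point-in-time snapshot, so use your monitoring stack to see usage over time.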
Example optimization:
# Before: Conservative
resources:
  requests:
    cpu: 1000m
    memory: 2Gi
  limits:
    cpu: 2000m
    memory: 4Gi

# After: Right-sized (based on actual usage)
resources:
  requests:
    cpu: 250m
    memory: 512Mi
  limits:
    cpu: 500m
    memory: 1Gi
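This single change can reduce compute costs by 20-30%, provided the freed capacity is actually released (for example, by a cluster autoscaler removing underused nodes).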
Process:
- Run in production for 2 weeks with monitoring
- Measure p95 usage (not peak, not average)
- Set requests to p95, limits to p95 + 25%
- Monitor for OOM kills, CPU throttling, and pod evictions
- Adjust if needed
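The sizing rule in the process above can be sketched in a few lines. This assumes usage samples (for example CPU in millicores or memory in MiB) have already been collected from monitoring. The nearest-rank percentile method and the helper names are choices made for this illustration.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample that at least pct% of samples do not exceed."""
    if not samples:
        raise ValueError("need at least one usage sample")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


def right_size(samples, headroom=0.25):
    """Set the request to p95 usage and the limit to p95 plus headroom, rounded up."""
    p95 = percentile(samples, 95)
    return {"request": p95, "limit": math.ceil(p95 * (1 + headroom))}


# Hypothetical CPU samples in millicores: mostly steady, with one large spike.
cpu_samples = [180] * 15 + [250] * 4 + [900]
print(right_size(cpu_samples))  # {'request': 250, 'limit': 313}
```

Using p95 rather than the peak keeps the single 900m spike from driving the request, while the average (about 230m here) would leave no room for the regular 250m plateaus.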
Optimization 2: Eliminate Idle Resources
Development and staging clusters are often over-provisioned.
What to do:
- Scale down non-production environments outside business hours
- Implement pod autoscaling based on actual traffic
- Shut down unused databases, caches, and load balancers
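Kubernetes example (scale down at 6 PM, scale up at 8 AM). Core Kubernetes has no scheduled autoscaler; this example uses the CronHorizontalPodAutoscaler resource from the open-source kubernetes-cronhpa-controller, which must be installed in the cluster first:
apiVersion: autoscaling.alibabacloud.com/v1beta1
kind: CronHorizontalPodAutoscaler
metadata:
  name: scale-down-staging
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-frontend   # example target; replace with your Deployment
  jobs:
    # Schedules use six cron fields: second minute hour day month weekday
    - name: scale-down
      schedule: "0 0 18 * * *"
      targetSize: 1
    - name: scale-up
      schedule: "0 0 8 * * *"
      targetSize: 3      # back to the daytime baseline of 3 replicas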
Impact: 30-40% reduction in staging and development costs.
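To see where a figure in that range can come from, compare replica-hours with and without the schedule. The sketch assumes the same schedule every day of the week, three replicas during the day and one overnight as in the example above, and cost proportional to replica-hours. The function name is illustrative.

```python
def weekly_replica_hours(day_replicas, night_replicas, day_start=8, day_end=18):
    """Replica-hours per week when replicas differ between day and night."""
    day_hours = day_end - day_start
    night_hours = 24 - day_hours
    return 7 * (day_replicas * day_hours + night_replicas * night_hours)


always_on = weekly_replica_hours(day_replicas=3, night_replicas=3)  # 3 replicas around the clock
scheduled = weekly_replica_hours(day_replicas=3, night_replicas=1)  # scaled to 1 from 18:00 to 08:00
reduction = 1 - scheduled / always_on

print(always_on, scheduled, f"{reduction:.0%}")  # 504 308 39%
```

Scaling to zero overnight or on weekends would push the reduction higher, at the cost of a slower start in the morning.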
Optimization 3: Implement Horizontal Pod Autoscaling (HPA)
Most applications have variable load. Autoscaling means you only pay for capacity you use.
Example:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-frontend
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-frontend
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
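When traffic increases, more pods spin up automatically. When it decreases, pods terminate. Utilization targets are measured against each pod's resource requests, so right-size requests before relying on HPA.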
Realistic impact: 15-25% cost reduction for variable workloads.
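The scaling decision behind this manifest follows the formula in the Kubernetes documentation: desired replicas = ceil(current replicas * current metric / target metric). The sketch below applies it with the min and max bounds from the example. It deliberately ignores the controller's tolerance band and stabilization windows, so treat it as an approximation of the real behavior rather than a reimplementation.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas=3, max_replicas=20):
    """Apply the HPA ratio formula, then clamp to the configured bounds."""
    raw = math.ceil(current_replicas * current_utilization / target_utilization)
    return max(min_replicas, min(max_replicas, raw))


print(desired_replicas(4, 90, 70))    # 6: CPU above target, scale out
print(desired_replicas(4, 35, 70))    # 3: formula says 2, minReplicas holds it at 3
print(desired_replicas(10, 200, 70))  # 20: formula says 29, maxReplicas caps it
```

When several metrics are configured, as with CPU and memory above, the controller computes a proposal per metric and uses the largest one.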
Optimization 4: Clean Up Unused Storage
Persistent volumes and snapshots accumulate. Old databases, old backups, old configuration often linger.
Audit questions:
- Which persistent volumes haven't been accessed in 30 days?
- Which snapshots are older than our retention policy?
- Which databases are disconnected from production?
Process:
- Tag all storage with creation date and owner
- Every 90 days, audit unused storage
- Delete what's not needed
- Implement automated cleanup for logs older than 30 days
Impact: 10-15% reduction in storage costs.
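The retention audit is easy to automate once storage carries a creation date and an owner. As a rough sketch, assuming a hypothetical inventory list (in practice it would come from your cloud provider's API), the function below returns the items that have outlived the retention period. The names and dates are invented for illustration.

```python
from datetime import date, timedelta

# Hypothetical snapshot inventory; names, owners, and dates are illustrative.
SNAPSHOTS = [
    {"name": "orders-db-2024-01-05", "owner": "payments", "created": date(2024, 1, 5)},
    {"name": "orders-db-2024-05-20", "owner": "payments", "created": date(2024, 5, 20)},
    {"name": "cms-2024-03-01", "owner": "marketing", "created": date(2024, 3, 1)},
]


def past_retention(items, today, retention_days):
    """Return the names of items created before the retention cutoff."""
    cutoff = today - timedelta(days=retention_days)
    return [item["name"] for item in items if item["created"] < cutoff]


print(past_retention(SNAPSHOTS, today=date(2024, 6, 1), retention_days=90))
# ['orders-db-2024-01-05', 'cms-2024-03-01']
```

Reporting candidates to their owners before deleting anything keeps the cleanup safe.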
Optimization 5: Negotiate Commitments
Once you understand your actual baseline usage (after optimization), lock in committed use discounts.
Cloud provider options:
- AWS: Reserved Instances or Savings Plans (1-year or 3-year)
- GCP: Committed Use Discounts (1-year or 3-year)
- Azure: Reserved Instances (1-year or 3-year)
Reality: If you use 10 CPUs consistently, buying a 1-year commitment saves 30-40% vs. on-demand pricing.
Caveat: Only commit to what you'll actually use. Don't predict growth you can't guarantee.
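The caveat can be made concrete: a commitment is billed whether or not it is used, so it only pays off when utilization of the committed capacity stays above 1 minus the discount. The sketch below assumes a hypothetical on-demand price of 30 per CPU per month and a 35% discount; both numbers and the function names are illustrative.

```python
def commitment_savings(committed_units, used_units, on_demand_price, discount):
    """Monthly savings of a commitment versus pure on-demand (negative means a loss)."""
    on_demand_cost = used_units * on_demand_price
    # Usage beyond the commitment is still billed at the on-demand rate.
    overflow = max(0, used_units - committed_units)
    committed_cost = (committed_units * on_demand_price * (1 - discount)
                      + overflow * on_demand_price)
    return on_demand_cost - committed_cost


def break_even_utilization(discount):
    """Fraction of the commitment that must be used for it to cost the same as on-demand."""
    return 1 - discount


# Hypothetical: 10 CPUs committed at 30 per CPU-month on demand, 35% discount.
for used in (10, 6):
    print(used, round(commitment_savings(10, used, 30.0, 0.35), 2))
print(f"break-even at {break_even_utilization(0.35):.0%} utilization")
```

With these assumed numbers, full use of the 10 CPUs saves money every month, while using only 6 of them costs more than staying on demand.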
Phase 3: Culture (Weeks 13+)
Step 1: Make Costs Visible to Engineers
Most developers have no idea what their applications cost to run.
How to fix it:
- Monthly cost reports per team
- Cost breakdown in pull requests (Kubecost has GitHub integration)
- Cost alerts when deployments exceed expected spending
Example PR integration:
This PR adds a new database, estimated additional cost: $500/month.
Estimated monthly savings from right-sizing: $200.
Net monthly impact: +$300.
Engineers become cost-aware quickly when they see numbers.
Step 2: Incentivize Optimization
If cost-awareness is shared, make optimization a shared goal.
Approaches:
- Budget per team with carryover (save money, reinvest in infrastructure)
- Cost optimization contests (30% reduction = team reward)
- Cost-per-request metrics (visible on dashboards alongside latency and error rate)
Reality: Competitive teams usually optimize aggressively if they see the impact.
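The cost-per-request metric mentioned above is a simple ratio, which makes it easy to put on a dashboard. A minimal sketch, assuming monthly cost and request counts are already available per service; the service figures are hypothetical.

```python
def cost_per_thousand_requests(monthly_cost, monthly_requests):
    """Unit cost of serving traffic, expressed per 1,000 requests."""
    if monthly_requests <= 0:
        raise ValueError("monthly_requests must be positive")
    return monthly_cost / monthly_requests * 1000


# Hypothetical service: same traffic before and after right-sizing.
before = cost_per_thousand_requests(monthly_cost=4_500, monthly_requests=90_000_000)
after = cost_per_thousand_requests(monthly_cost=3_600, monthly_requests=90_000_000)
print(f"before: {before:.3f}, after: {after:.3f} per 1,000 requests")
```

Unlike a raw monthly bill, this number stays comparable as traffic grows, so a team can tell efficiency gains apart from changes in load.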
Step 3: Build Cost Governance
Not all cost reduction is good. Sometimes spending more prevents bigger problems.
Governance rules:
- Standard resource profiles (small/medium/large) to reduce decision paralysis
- Budget thresholds (this team gets $50k/month; anything above requires approval)
- Cost review meetings (monthly, quick, focused on outliers)
Real Numbers: A 500-Person Company's Optimization
Before:
- Monthly Kubernetes bill: $120,000
- Resource utilization: 35%
- Idle capacity: $42,000/month
After (three months of optimization):
- Monthly bill: $72,000 (40% reduction)
- Resource utilization: 60%
- Applied optimizations:
  - Right-sizing requests: -20%
  - Scaling down non-prod: -8%
  - Horizontal autoscaling: -7%
  - Cleanup unused storage: -5%
Total savings: $48,000/month or $576,000/year.
Effort: Two engineers for three months, one ongoing FTE for maintenance.
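The figures above can be checked with a few lines of arithmetic. The sketch assumes each percentage is measured against the original monthly bill, which is what the 40% total implies; integer percentage points keep the calculation exact.

```python
BASELINE_MONTHLY = 120_000  # monthly Kubernetes bill before optimization

# Reductions in percentage points of the original bill, from the list above.
REDUCTIONS = {
    "right-sizing requests": 20,
    "scaling down non-prod": 8,
    "horizontal autoscaling": 7,
    "cleanup unused storage": 5,
}

total_points = sum(REDUCTIONS.values())
monthly_savings = BASELINE_MONTHLY * total_points // 100
new_monthly_bill = BASELINE_MONTHLY - monthly_savings
annual_savings = monthly_savings * 12

print(total_points, monthly_savings, new_monthly_bill, annual_savings)
# 40 48000 72000 576000
```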
Common Mistakes
Mistake 1: Optimizing without baseline
Fix: Measure before, measure after. Otherwise you won't know what worked.
Mistake 2: Overly aggressive right-sizing
Fix: Set limits slightly above p95, not at average. Leave headroom for unexpected spikes.
Mistake 3: Killing production features to save cost
Fix: FinOps is about efficiency, not compromise. If a feature is valuable, its cost is justified.
Mistake 4: Neglecting multi-cloud or hybrid scenarios
Fix: Use normalized cost reporting. Ensure you're comparing apples to apples across providers.
The Timeline
Month 1: Visibility (Kubecost, tagging, baseline measurement)
Month 2: Quick wins (right-sizing, cleanup, scaling)
Month 3: Culture (reporting, incentives, governance)
Most organizations see 15-20% cost reduction by month two, 30-40% by month three.
Moving Forward
FinOps is not a project you finish. It's an operational discipline that compounds over time. Every 5% reduction in waste, across hundreds of workloads, adds up to significant savings.
The first 30% is relatively easy (right-sizing, cleanup, autoscaling). The next 20% requires more effort (commitment planning, architecture redesign). Beyond that, you're looking at core product changes.
Most organizations see the greatest ROI in focusing on the easy 30%.