AWS Cloud Cost Optimization 2026: Cut Your Bill by 60% Without Killing Performance

Sanjeev Sharma

AWS Cost Optimization 2026: Stop Burning Money in the Cloud

Industry surveys put average cloud waste at roughly 32% of spend. AWS bills grow silently — dev environments left running overnight, over-provisioned instances, unused Elastic IPs, data transfer charges nobody noticed. Here is how to find and eliminate that waste, layer by layer.

The Cost Visibility Stack

Before you optimize, you need to see where money is going:

# Enable Cost Explorer (free)
# AWS Console → Billing → Cost Explorer → Enable

# AWS CLI: Get monthly costs by service
# (note: the End date is exclusive, so use the first day of the next month)
aws ce get-cost-and-usage \
  --time-period Start=2026-03-01,End=2026-04-01 \
  --granularity MONTHLY \
  --metrics "UnblendedCost" \
  --group-by Type=DIMENSION,Key=SERVICE \
  --query 'ResultsByTime[0].Groups[*].{Service:Keys[0],Cost:Metrics.UnblendedCost.Amount}' \
  --output table

# Get costs by tag (requires cost allocation tags enabled)
aws ce get-cost-and-usage \
  --time-period Start=2026-03-01,End=2026-04-01 \
  --granularity MONTHLY \
  --metrics "UnblendedCost" \
  --group-by Type=TAG,Key=Environment \
  --output json

For ongoing visibility, a short script can post a weekly summary to Slack:

// cost-reporter.ts — Weekly cost report to Slack
import { CostExplorerClient, GetCostAndUsageCommand } from '@aws-sdk/client-cost-explorer'

const client = new CostExplorerClient({ region: 'us-east-1' })

async function getWeeklyCosts() {
  const end = new Date()
  const start = new Date()
  start.setDate(start.getDate() - 7)

  const { ResultsByTime } = await client.send(new GetCostAndUsageCommand({
    TimePeriod: {
      Start: start.toISOString().split('T')[0],
      End: end.toISOString().split('T')[0],
    },
    Granularity: 'DAILY',
    Metrics: ['UnblendedCost'],
    GroupBy: [{ Type: 'DIMENSION', Key: 'SERVICE' }],
  }))

  const totals: Record<string, number> = {}

  ResultsByTime?.forEach(day => {
    day.Groups?.forEach(group => {
      const service = group.Keys?.[0] ?? 'Unknown'
      const cost = parseFloat(group.Metrics?.UnblendedCost?.Amount ?? '0')
      totals[service] = (totals[service] ?? 0) + cost
    })
  })

  // Sort by cost descending
  return Object.entries(totals)
    .sort((a, b) => b[1] - a[1])
    .filter(([, cost]) => cost > 1)  // Only show >$1
}

async function sendSlackReport() {
  const costs = await getWeeklyCosts()
  const total = costs.reduce((sum, [, cost]) => sum + cost, 0)

  const lines = costs.slice(0, 10).map(([service, cost]) =>
    `${service}: $${cost.toFixed(2)}`
  )

  const message = `*AWS Weekly Cost Report*\n*Total: $${total.toFixed(2)}*\n\n${lines.join('\n')}`

  await fetch(process.env.SLACK_WEBHOOK_URL!, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ text: message }),
  })
}

Reserved Instances vs Savings Plans

Commit to usage for 1-3 years and save 40-75%:

Reserved Instances (RIs):
- Locked to specific instance type + region + OS
- Up to 75% savings vs On-Demand
- Best for: Stable workloads, specific instance needs

Savings Plans:
- Flexible across instance families, regions, and OS
- Up to 66% savings (Compute) or 72% (EC2 Instance)
- Two types:
  1. Compute Savings Plans — most flexible (EC2, Lambda, Fargate)
  2. EC2 Instance Savings Plans — higher savings, locked to an instance family in a region

Recommendation for 2026:
- Use Compute Savings Plans for flexibility
- Start with 1-year, no upfront (zero risk)
- Buy enough to cover 70-80% of your baseline and let the rest run On-Demand (a rough cost model follows below)
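
Before buying, a back-of-envelope model helps sanity-check the commitment level. A minimal sketch, assuming an illustrative 30% Compute Savings Plan discount; substitute your own baseline and pricing from Cost Explorer:

# sp-coverage-model.py — rough estimate only, not a substitute for AWS recommendations
def blended_monthly_cost(on_demand_baseline: float, coverage: float,
                         sp_discount: float = 0.30) -> float:
    """Monthly compute cost if `coverage` (0-1) of the baseline is covered
    by a Savings Plan priced at `sp_discount` off On-Demand."""
    committed = on_demand_baseline * coverage * (1 - sp_discount)
    on_demand = on_demand_baseline * (1 - coverage)
    return committed + on_demand

# Example: $10,000/month On-Demand baseline at several coverage levels
for cov in (0.0, 0.5, 0.75, 0.9):
    print(f"{cov:.0%} coverage: ${blended_monthly_cost(10_000, cov):,.2f}/month")

AWS will also compute recommendations for you: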
# Find RI/Savings Plan recommendations via CLI
aws ce get-savings-plans-purchase-recommendation \
  --savings-plans-type COMPUTE_SP \
  --term-in-years ONE_YEAR \
  --payment-option NO_UPFRONT \
  --lookback-period-in-days THIRTY_DAYS

# Check RI utilization
aws ce get-reservation-utilization \
  --time-period Start=2026-03-01,End=2026-04-01

# See Savings Plans coverage
aws ce get-savings-plans-coverage \
  --time-period Start=2026-03-01,End=2026-04-01 \
  --granularity MONTHLY

Spot Instances: 90% Off for Fault-Tolerant Workloads

Spot Instances use spare AWS capacity at discounts of up to 90%. The trade-off: they can be reclaimed with two minutes' notice, so handle interruptions gracefully:

// spot-handler.ts — Handle Spot interruption notices
import http from 'http'

// Poll EC2 instance metadata every 5 seconds.
// Note: this assumes IMDSv1; if IMDSv2 is enforced, first fetch a session
// token from /latest/api/token and send it as the X-aws-ec2-metadata-token header.
async function checkSpotInterruption() {
  return new Promise<boolean>((resolve) => {
    const req = http.get(
      'http://169.254.169.254/latest/meta-data/spot/termination-time',
      (res) => resolve(res.statusCode === 200)
    )
    req.on('error', () => resolve(false))
    req.setTimeout(1000, () => {
      req.destroy()
      resolve(false)
    })
  })
}

let isShuttingDown = false

async function gracefulShutdown() {
  console.log('Spot interruption detected — graceful shutdown starting')

  // Stop accepting new work
  isShuttingDown = true

  // Finish current tasks (you have roughly two minutes from the notice)
  await drainQueue()         // app-specific: flush in-flight jobs
  await db.disconnect()      // app-specific: close DB connections

  // Deregister from the load balancer (app-specific helper)
  await deregisterFromALB()

  console.log('Shutdown complete')
  process.exit(0)
}

// Check every 5 seconds
setInterval(async () => {
  const interrupted = await checkSpotInterruption()
  if (interrupted) await gracefulShutdown()
}, 5000)

Pair the handler with an Auto Scaling Group that mixes Spot and On-Demand capacity:

# Auto Scaling Group with Spot + On-Demand mix
# cloudformation-asg.yml
Resources:
  AppASG:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      MixedInstancesPolicy:
        InstancesDistribution:
          OnDemandBaseCapacity: 1          # Always 1 On-Demand
          OnDemandPercentageAboveBaseCapacity: 20  # 20% On-Demand, 80% Spot
          SpotAllocationStrategy: capacity-optimized  # Fewest interruptions
        LaunchTemplate:
          LaunchTemplateSpecification:
            LaunchTemplateId: !Ref LaunchTemplate
            Version: !GetAtt LaunchTemplate.LatestVersionNumber
          Overrides:
            - InstanceType: t3.large
            - InstanceType: t3a.large   # AMD variant
            - InstanceType: m5.large    # Fallback
            - InstanceType: m5a.large

Right-Sizing EC2 Instances

Over-provisioned instances are the single biggest source of waste. Use AWS Compute Optimizer to find them:

# Enable Compute Optimizer (free)
aws compute-optimizer update-enrollment-status --status Active

# Get EC2 recommendations
aws compute-optimizer get-ec2-instance-recommendations \
  --query 'instanceRecommendations[*].{
    Instance:instanceArn,
    CurrentType:currentInstanceType,
    RecommendedType:recommendationOptions[0].instanceType,
    Savings:recommendationOptions[0].savingsOpportunity.estimatedMonthlySavings.value
  }' \
  --output table

# Right-sizing script — check underutilized instances
aws cloudwatch get-metric-statistics \
  --namespace AWS/EC2 \
  --metric-name CPUUtilization \
  --dimensions Name=InstanceId,Value=i-1234567890abcdef0 \
  --start-time 2026-03-01T00:00:00Z \
  --end-time 2026-03-26T00:00:00Z \
  --period 86400 \
  --statistics Average \
  --query 'Datapoints[*].Average'

To sweep a whole region for idle instances, script the same check with boto3:

# find-idle-instances.py
import boto3
from datetime import datetime, timedelta

ec2 = boto3.client('ec2', region_name='us-east-1')
cloudwatch = boto3.client('cloudwatch', region_name='us-east-1')

def get_average_cpu(instance_id: str, days: int = 14) -> float:
    end = datetime.utcnow()
    start = end - timedelta(days=days)

    response = cloudwatch.get_metric_statistics(
        Namespace='AWS/EC2',
        MetricName='CPUUtilization',
        Dimensions=[{'Name': 'InstanceId', 'Value': instance_id}],
        StartTime=start,
        EndTime=end,
        Period=86400,
        Statistics=['Average'],
    )

    if not response['Datapoints']:
        return 0.0
    return sum(d['Average'] for d in response['Datapoints']) / len(response['Datapoints'])

# Find instances with <5% average CPU
paginator = ec2.get_paginator('describe_instances')
idle_instances = []

for page in paginator.paginate(Filters=[{'Name': 'instance-state-name', 'Values': ['running']}]):
    for reservation in page['Reservations']:
        for instance in reservation['Instances']:
            instance_id = instance['InstanceId']
            avg_cpu = get_average_cpu(instance_id)

            if avg_cpu < 5.0:
                name = next(
                    (tag['Value'] for tag in instance.get('Tags', []) if tag['Key'] == 'Name'),
                    'Unnamed'
                )
                idle_instances.append({
                    'id': instance_id,
                    'name': name,
                    'type': instance['InstanceType'],
                    'avg_cpu': avg_cpu,
                })

for inst in sorted(idle_instances, key=lambda x: x['avg_cpu']):
    print(f"{inst['name']} ({inst['id']}) — {inst['type']}{inst['avg_cpu']:.1f}% CPU")

S3 Cost Optimization

S3 storage and data transfer are sneaky cost sources:

# Analyze S3 bucket sizes and storage classes
# (lists every object, which is slow on large buckets; prefer S3 Storage Lens
#  or the CloudWatch BucketSizeBytes metric at scale)
aws s3api list-buckets --query 'Buckets[*].Name' --output text | \
  tr '\t' '\n' | while read bucket; do
    echo "=== $bucket ==="
    aws s3api list-objects-v2 --bucket "$bucket" \
      --query '{
        Total: length(Contents),
        Size: sum(Contents[*].Size)
      }' --output json 2>/dev/null || echo "Access denied"
  done

Lifecycle rules are the biggest S3 lever. A configuration like this (applied with aws s3api put-bucket-lifecycle-configuration) tiers objects down as they age:

{
  "Rules": [
    {
      "ID": "auto-tiering",
      "Status": "Enabled",
      "Transitions": [
        {
          "Days": 30,
          "StorageClass": "STANDARD_IA"
        },
        {
          "Days": 90,
          "StorageClass": "GLACIER_IR"
        },
        {
          "Days": 365,
          "StorageClass": "DEEP_ARCHIVE"
        }
      ],
      "NoncurrentVersionTransitions": [
        {
          "NoncurrentDays": 7,
          "StorageClass": "STANDARD_IA"
        },
        {
          "NoncurrentDays": 30,
          "StorageClass": "GLACIER"
        }
      ],
      "NoncurrentVersionExpiration": {
        "NoncurrentDays": 90
      }
    }
  ]
}
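
To roll the same policy out across buckets, a boto3 sketch works; this assumes the JSON above is saved as lifecycle.json and uses a hypothetical bucket name:

# apply-lifecycle.py
import json
import boto3

s3 = boto3.client('s3')

with open('lifecycle.json') as f:
    lifecycle = json.load(f)

# Each rule needs a Filter; an empty prefix applies the rule bucket-wide
for rule in lifecycle['Rules']:
    rule.setdefault('Filter', {'Prefix': ''})

s3.put_bucket_lifecycle_configuration(
    Bucket='my-bucket',  # hypothetical bucket name
    LifecycleConfiguration=lifecycle,
)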
# Enable S3 Intelligent-Tiering for unpredictable access patterns
aws s3api put-bucket-intelligent-tiering-configuration \
  --bucket my-bucket \
  --id EntireBucket \
  --intelligent-tiering-configuration '{
    "Id": "EntireBucket",
    "Status": "Enabled",
    "Tierings": [
      {"Days": 90, "AccessTier": "ARCHIVE_ACCESS"},
      {"Days": 180, "AccessTier": "DEEP_ARCHIVE_ACCESS"}
    ]
  }'

Lambda Cost Optimization

Lambda is cheap but can surprise you at scale:

// Lambda cost factors:
// Requests: $0.20 per 1M
// Duration: $0.0000166667 per GB-second (x86); arm64 is ~20% cheaper
// Memory: more memory also means more CPU, so higher memory often runs
//         faster; total cost can stay flat or even drop

// Optimize: Use arm64 (20% cheaper)
// In serverless.yml:
// architecture: arm64

// Profile memory usage to find optimal size
// Lambda Power Tuning: https://github.com/alexcasalboni/aws-lambda-power-tuning

// Reduce cold starts: minimize package size
// Check bundle size after build:
// du -sh .serverless/webcoderspeed-api.zip

// Use Lambda SnapStart (Java) or keep functions warm

// Avoid Lambda for:
// - Tasks longer than 15 minutes (hard timeout; use ECS Fargate instead)
// - Sustained heavy traffic of tens of millions of requests/month (EC2 or
//   Fargate is often cheaper; run the numbers with the calculator below)
// - Tasks needing more than 10GB memory (the Lambda ceiling; use EC2)

To see where the break-even actually sits, a simple calculator against the published pricing:

# lambda-cost-calculator.py
def calculate_lambda_cost(
    monthly_requests: int,
    avg_duration_ms: float,
    memory_mb: int,
    architecture: str = 'arm64'
) -> dict:
    # Pricing (arm64)
    request_price = 0.20 / 1_000_000  # per request

    # arm64: $0.0000133334 per GB-second
    # x86:   $0.0000166667 per GB-second
    gb_second_price = 0.0000133334 if architecture == 'arm64' else 0.0000166667

    # Free tier: 1M requests + 400,000 GB-seconds
    FREE_REQUESTS = 1_000_000
    FREE_GB_SECONDS = 400_000

    billable_requests = max(0, monthly_requests - FREE_REQUESTS)

    duration_seconds = avg_duration_ms / 1000
    gb_seconds = (memory_mb / 1024) * duration_seconds * monthly_requests
    billable_gb_seconds = max(0, gb_seconds - FREE_GB_SECONDS)

    request_cost = billable_requests * request_price
    compute_cost = billable_gb_seconds * gb_second_price
    total = request_cost + compute_cost

    return {
        'requests': f'${request_cost:.4f}',
        'compute': f'${compute_cost:.4f}',
        'total': f'${total:.4f}',
        'vs_ec2_t3_small': f'{"CHEAPER" if total < 15 else "EC2 cheaper"} than EC2 t3.small ($15/mo)',
    }

# Example: 5M requests, 200ms avg, 512MB, arm64
print(calculate_lambda_cost(5_000_000, 200, 512))
# {'requests': '$0.8000', 'compute': '$1.3333', 'total': '$2.1333', ...}

RDS Cost Optimization

Databases are often the stickiest line item. For spiky or low-traffic workloads, Aurora Serverless v2 scales capacity with load and bills per ACU-hour:

# Aurora Serverless v2 — scales with load, pay per ACU
# aurora-serverless.tf
resource "aws_rds_cluster" "serverless" {
  cluster_identifier      = "my-app-db"
  engine                  = "aurora-postgresql"
  engine_mode             = "provisioned"  # v2 uses provisioned mode
  engine_version          = "15.4"
  database_name           = "myapp"
  master_username         = "postgres"
  master_password         = var.db_password
  skip_final_snapshot     = false

  serverlessv2_scaling_configuration {
    min_capacity = 0.5   # Minimum 0.5 ACU (scales to near-zero)
    max_capacity = 16    # Maximum 16 ACU
  }
}

resource "aws_rds_cluster_instance" "serverless" {
  cluster_identifier = aws_rds_cluster.serverless.id
  instance_class     = "db.serverless"
  engine             = aws_rds_cluster.serverless.engine
}

For provisioned RDS, hunt down idle and oversized instances:

# Identify idle RDS instances
aws rds describe-db-instances \
  --query 'DBInstances[*].{ID:DBInstanceIdentifier,Class:DBInstanceClass,Status:DBInstanceStatus,MultiAZ:MultiAZ}' \
  --output table

# Stop dev/staging RDS when not in use (stops compute billing; storage is
# still charged, and AWS restarts stopped instances after 7 days)
aws rds stop-db-instance --db-instance-identifier dev-db

# Schedule auto-stop/start with EventBridge
# (stop at 8pm, start at 8am on weekdays; a Lambda sketch follows)
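
A minimal scheduler sketch, assuming two EventBridge rules that invoke one Lambda with an "action" field, and hypothetical instance identifiers:

# rds-scheduler.py — Lambda handler for scheduled stop/start
# Suggested EventBridge schedules (UTC):
#   stop:  cron(0 20 ? * MON-FRI *)  ->  {"action": "stop"}
#   start: cron(0 8 ? * MON-FRI *)   ->  {"action": "start"}
import boto3

rds = boto3.client('rds')
INSTANCES = ['dev-db', 'staging-db']  # hypothetical identifiers

def handler(event, context):
    action = event.get('action')
    for db_id in INSTANCES:
        if action == 'stop':
            rds.stop_db_instance(DBInstanceIdentifier=db_id)
        elif action == 'start':
            rds.start_db_instance(DBInstanceIdentifier=db_id)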

Quick Wins Checklist

| Optimization | Typical Savings | Effort |
| --- | --- | --- |
| Delete unused Elastic IPs | $3.65/IP/month | Low |
| Delete unattached EBS volumes | $0.08/GB/month | Low |
| Remove unused Load Balancers | ~$16/month each | Low |
| Right-size oversized EC2 | 20-60% | Medium |
| Spot Instances for batch jobs | 70-90% | Medium |
| Reserved Instances / Savings Plans | 40-70% | Low |
| S3 Lifecycle Policies | 50-90% on old data | Low |
| NAT Gateway → VPC Endpoints | Varies | Medium |
| Compress data before S3 upload | 60-80% storage | Medium |
| CloudFront in front of S3 (fewer S3 GETs) | Varies | Medium |

The low-effort rows take one command each to find:

# Find unattached EBS volumes (snapshot before deleting)
aws ec2 describe-volumes \
  --filters Name=status,Values=available \
  --query 'Volumes[*].{ID:VolumeId,Size:Size,Type:VolumeType,Created:CreateTime}' \
  --output table

# Find unused Elastic IPs
aws ec2 describe-addresses \
  --query 'Addresses[?AssociationId==`null`].{IP:PublicIp,AllocationId:AllocationId}' \
  --output table

# Find old EBS snapshots created before a cutoff date (check for AMI references before deleting)
aws ec2 describe-snapshots --owner-ids self \
  --query 'Snapshots[?StartTime<=`2026-02-26`].{ID:SnapshotId,Size:VolumeSize,Date:StartTime}' \
  --output table
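
To act on the volume findings programmatically, here is a cautious cleanup sketch (dry-run by default, snapshots before deleting):

# cleanup-ebs.py — dry-run by default; review output before enabling deletion
import boto3

ec2 = boto3.client('ec2', region_name='us-east-1')
DRY_RUN = True

volumes = ec2.describe_volumes(
    Filters=[{'Name': 'status', 'Values': ['available']}]
)['Volumes']

for vol in volumes:
    vol_id = vol['VolumeId']
    print(f"Unattached: {vol_id} ({vol['Size']} GiB, {vol['VolumeType']})")
    if not DRY_RUN:
        # Snapshot first so data stays recoverable, then delete the volume
        ec2.create_snapshot(VolumeId=vol_id,
                            Description=f'Backup before deleting {vol_id}')
        ec2.delete_volume(VolumeId=vol_id)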

FinOps: Cost as a Culture

Cost optimization is not a one-time project — it is a culture:

  1. Tag everything — every resource must have Environment, Team, Project tags (an audit sketch follows below)
  2. Set budget alerts — get notified at 80% and 100% of monthly budget
  3. Weekly cost reviews — make it a team ritual, not an audit
  4. Cost as a team KPI — include cloud cost per feature in engineering metrics
  5. Use AWS Budgets Actions — auto-stop resources when budget exceeded

# Create a budget with an email alert at 80% of the limit
aws budgets create-budget \
  --account-id 123456789012 \
  --budget '{
    "BudgetName": "MonthlyBudget",
    "BudgetLimit": {"Amount": "500", "Unit": "USD"},
    "TimeUnit": "MONTHLY",
    "BudgetType": "COST"
  }' \
  --notifications-with-subscribers '[{
    "Notification": {
      "NotificationType": "ACTUAL",
      "ComparisonOperator": "GREATER_THAN",
      "Threshold": 80
    },
    "Subscribers": [{"SubscriptionType": "EMAIL", "Address": "devops@company.com"}]
  }]'
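
For point 1, the Resource Groups Tagging API can surface untagged resources. A sketch that flags anything missing an Environment tag:

# find-untagged.py
import boto3

tagging = boto3.client('resourcegroupstaggingapi', region_name='us-east-1')

paginator = tagging.get_paginator('get_resources')
untagged = []

# Walk every taggable resource in the region and collect ARNs without the tag
for page in paginator.paginate():
    for resource in page['ResourceTagMappingList']:
        tags = {t['Key'] for t in resource.get('Tags', [])}
        if 'Environment' not in tags:
            untagged.append(resource['ResourceARN'])

print(f"{len(untagged)} resources missing an Environment tag")
for arn in untagged[:20]:
    print(arn)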

The goal is not to spend the least — it is to get the most value per dollar. A well-optimized AWS bill reflects a well-architected system.


Written by Sanjeev Sharma, Full Stack Engineer · E-mopro