AWS Cost Optimisation
S3 Cost Optimisation: Reducing Storage Spend with Smart Strategies
Comprehensive guide to optimising S3 storage costs through lifecycle policies, intelligent tiering, and data management best practices for Australian businesses.
CloudPoint Team
Amazon S3 is one of the most used AWS services, and storage costs can accumulate surprisingly quickly. For Australian businesses storing terabytes of data, S3 optimisation can yield substantial savings.
Understanding S3 Pricing
S3 costs come from three main sources:
1. Storage ($0.023/GB/month in Sydney for Standard)
2. Requests ($0.0055 per 1,000 PUT requests, $0.00044 per 1,000 GET requests)
3. Data transfer ($0.114/GB out to internet)
A 10 TB dataset costs ~$230/month just for storage in S3 Standard - but could cost as little as $10/month in Glacier Deep Archive.
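As a rough back-of-the-envelope sketch (using the indicative prices above; actual rates vary by usage tier and change over time), a bucket's monthly bill can be estimated like this:
# Rough monthly estimate using the indicative prices above (assumed, not official rates)
STORAGE_PER_GB = 0.023        # S3 Standard, per GB-month
PUT_PER_1000 = 0.0055         # per 1,000 PUT requests
GET_PER_1000 = 0.00044        # per 1,000 GET requests
TRANSFER_OUT_PER_GB = 0.114   # per GB out to the internet

def estimate_monthly_cost(storage_gb, puts, gets, transfer_out_gb):
    return (storage_gb * STORAGE_PER_GB
            + puts / 1000 * PUT_PER_1000
            + gets / 1000 * GET_PER_1000
            + transfer_out_gb * TRANSFER_OUT_PER_GB)

# 10 TB stored, 1M PUTs, 10M GETs, 500 GB egress
print(round(estimate_monthly_cost(10_000, 1_000_000, 10_000_000, 500), 2))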
S3 Storage Classes
Choose the right storage class for access patterns:
S3 Standard - $0.023/GB/month
- Frequent access
- Millisecond latency
- No retrieval fees
S3 Intelligent-Tiering - $0.023/GB/month + $0.0025/1000 objects monitoring
- Unknown or changing access patterns
- Automatic optimisation
- No retrieval fees
S3 Standard-IA - $0.0125/GB/month
- Infrequent access (< once per month)
- Millisecond latency
- $0.01/GB retrieval fee
- Minimum 30-day storage
S3 One Zone-IA - $0.01/GB/month
- Infrequent access, non-critical data
- Single AZ (less resilience)
- $0.01/GB retrieval fee
S3 Glacier Instant Retrieval - $0.004/GB/month
- Long-term, rarely accessed
- Millisecond retrieval
- Minimum 90-day storage
S3 Glacier Flexible Retrieval - $0.0045/GB/month
- Archive data
- Minutes to hours retrieval
- Minimum 90-day storage
S3 Glacier Deep Archive - $0.00099/GB/month
- Long-term archive (7-10 years)
- 12-48 hour retrieval
- Minimum 180-day storage
- Cheapest option
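If you already know an object will be accessed infrequently, you can choose the class at upload time rather than waiting for a lifecycle transition. A minimal boto3 sketch (bucket, key and file names are placeholders):
import boto3

s3 = boto3.client('s3')

# Upload straight into Standard-IA instead of Standard (placeholder names)
with open('annual-report.pdf', 'rb') as f:
    s3.put_object(
        Bucket='my-bucket',
        Key='reports/2024/annual-report.pdf',
        Body=f,
        StorageClass='STANDARD_IA'
    )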
Strategy 1: Lifecycle Policies
Automatically transition objects to cheaper storage classes.
Basic Lifecycle Policy
{
  "Rules": [{
    "ID": "Move to cheaper storage",
    "Filter": {"Prefix": ""},
    "Status": "Enabled",
    "Transitions": [
      {
        "Days": 30,
        "StorageClass": "STANDARD_IA"
      },
      {
        "Days": 90,
        "StorageClass": "GLACIER_IR"
      },
      {
        "Days": 365,
        "StorageClass": "DEEP_ARCHIVE"
      }
    ],
    "Expiration": {
      "Days": 2555
    }
  }]
}
Typical savings: 60-80% for data following this pattern
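To apply a policy like this programmatically, one option is boto3's put_bucket_lifecycle_configuration; a sketch, assuming the JSON above is saved locally as lifecycle.json (placeholder path):
import json
import boto3

s3 = boto3.client('s3')

# Load the policy JSON shown above and apply it to the bucket (placeholder names)
with open('lifecycle.json') as f:
    lifecycle = json.load(f)

s3.put_bucket_lifecycle_configuration(
    Bucket='my-bucket',
    LifecycleConfiguration=lifecycle
)
The same document can also be applied from the command line with aws s3api put-bucket-lifecycle-configuration --bucket my-bucket --lifecycle-configuration file://lifecycle.json.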
Application-Specific Policies
Log files:
{
  "Rules": [{
    "ID": "Log retention policy",
    "Filter": {"Prefix": "logs/"},
    "Status": "Enabled",
    "Transitions": [
      {
        "Days": 7,
        "StorageClass": "STANDARD_IA"
      },
      {
        "Days": 30,
        "StorageClass": "GLACIER_IR"
      },
      {
        "Days": 90,
        "StorageClass": "GLACIER"
      }
    ],
    "Expiration": {
      "Days": 365
    }
  }]
}
Backups:
{
  "Rules": [{
    "ID": "Backup retention",
    "Filter": {"Prefix": "backups/"},
    "Status": "Enabled",
    "Transitions": [
      {
        "Days": 1,
        "StorageClass": "GLACIER_IR"
      },
      {
        "Days": 30,
        "StorageClass": "DEEP_ARCHIVE"
      }
    ],
    "Expiration": {
      "Days": 2555
    }
  }]
}
Intelligent-Tiering Configuration
For unpredictable access patterns:
aws s3api put-bucket-intelligent-tiering-configuration \
  --bucket my-bucket \
  --id EntireBucket \
  --intelligent-tiering-configuration '{
    "Id": "EntireBucket",
    "Status": "Enabled",
    "Tierings": [
      {
        "Days": 90,
        "AccessTier": "ARCHIVE_ACCESS"
      },
      {
        "Days": 180,
        "AccessTier": "DEEP_ARCHIVE_ACCESS"
      }
    ]
  }'
Objects automatically move between tiers based on access:
- Frequent access tier
- Infrequent access tier (after 30 days)
- Archive access tier (after 90 days)
- Deep archive access tier (after 180 days)
Benefit: No retrieval fees, automatic optimisation
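Existing objects do not move into Intelligent-Tiering on their own; they need either a lifecycle transition or an in-place copy. A minimal sketch using copy_object (bucket and key names are placeholders):
import boto3

s3 = boto3.client('s3')

# Re-copy an object onto itself to change its storage class (placeholder names)
# Note: copy_object handles objects up to 5 GB; larger objects need a multipart copy
s3.copy_object(
    Bucket='my-bucket',
    Key='data/events.parquet',
    CopySource={'Bucket': 'my-bucket', 'Key': 'data/events.parquet'},
    StorageClass='INTELLIGENT_TIERING',
    MetadataDirective='COPY'
)
For new data, it is simpler to upload with StorageClass='INTELLIGENT_TIERING' in the first place.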
Strategy 2: Delete Unnecessary Data
Identify Old and Unused Objects
S3 Storage Lens: Free metrics and insights:
- Storage by age
- Incomplete multipart uploads
- Unencrypted objects
- Objects without lifecycle policies
Analyse with Athena:
-- Query S3 inventory
SELECT
  bucket,
  key,
  size,
  storage_class,
  last_modified_date,
  size / 1024 / 1024 / 1024 as size_gb
FROM s3_inventory
WHERE last_modified_date < DATE_ADD('day', -365, CURRENT_DATE)
  AND storage_class = 'STANDARD'
ORDER BY size DESC
LIMIT 100;
Clean Up Incomplete Multipart Uploads
Incomplete multipart uploads are often forgotten but still incur storage costs:
{
  "Rules": [{
    "ID": "Delete incomplete uploads",
    "Filter": {"Prefix": ""},
    "Status": "Enabled",
    "AbortIncompleteMultipartUpload": {
      "DaysAfterInitiation": 7
    }
  }]
}
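Before the rule takes effect, it can be useful to see what is actually sitting there. A quick boto3 sketch to list in-progress multipart uploads (bucket name is a placeholder):
import boto3

s3 = boto3.client('s3')

# List in-progress multipart uploads that are quietly accruing storage charges
paginator = s3.get_paginator('list_multipart_uploads')
for page in paginator.paginate(Bucket='my-bucket'):
    for upload in page.get('Uploads', []):
        print(upload['Key'], upload['UploadId'], upload['Initiated'])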
Delete Old Versions
If versioning is enabled:
{
  "Rules": [{
    "ID": "Expire old versions",
    "Filter": {"Prefix": ""},
    "Status": "Enabled",
    "NoncurrentVersionTransitions": [
      {
        "NoncurrentDays": 30,
        "StorageClass": "STANDARD_IA"
      }
    ],
    "NoncurrentVersionExpiration": {
      "NoncurrentDays": 90
    }
  }]
}
Strategy 3: Optimise Request Costs
Batch Operations
Instead of individual API calls:
# Bad - Individual requests
for key in keys:
    s3.delete_object(Bucket='my-bucket', Key=key)

# Good - Batch delete (up to 1,000 keys per request)
s3.delete_objects(
    Bucket='my-bucket',
    Delete={
        'Objects': [{'Key': key} for key in keys]
    }
)
Use CloudFront for Frequently Accessed Data
Benefits:
- Cheaper data transfer ($0.095/GB vs $0.114/GB)
- Reduced S3 GET requests
- Better performance (edge caching)
- Free HTTPS certificates
// CloudFront Distribution
{
  "Origins": [{
    "Id": "S3-my-bucket",
    "DomainName": "my-bucket.s3.amazonaws.com",
    "S3OriginConfig": {
      "OriginAccessIdentity": "origin-access-identity/cloudfront/ABCDEFG"
    }
  }],
  "DefaultCacheBehavior": {
    "TargetOriginId": "S3-my-bucket",
    "ViewerProtocolPolicy": "redirect-to-https",
    "AllowedMethods": ["GET", "HEAD"],
    "CachedMethods": ["GET", "HEAD"],
    "DefaultTTL": 86400
  }
}
Implement Caching Headers
Reduce repeat requests:
s3.put_object(
    Bucket='my-bucket',
    Key='image.jpg',
    Body=image_data,
    CacheControl='public, max-age=31536000',  # 1 year
    ContentType='image/jpeg'
)
Strategy 4: Optimise Data Transfer
VPC Endpoint (Gateway Endpoint)
Gateway endpoints are free and keep EC2-to-S3 traffic within the region on the AWS network, off NAT gateways:
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-1234567890abcdef0 \
  --service-name com.amazonaws.ap-southeast-2.s3 \
  --route-table-ids rtb-12345678
</gr-replace>
Savings: S3 traffic no longer flows through NAT gateways, avoiding their per-GB data processing charges
S3 Transfer Acceleration
For uploads from distant locations:
import boto3
from botocore.config import Config

# Requires Transfer Acceleration to be enabled on the bucket first
s3 = boto3.client('s3', config=Config(
    s3={'use_accelerate_endpoint': True}
))
Cost: $0.04/GB (but faster uploads may be worth it)
Cross-Region Replication Optimisation
If using CRR, optimise:
- Replicate only necessary prefixes (see the sketch after this list)
- Use lifecycle policies on replica
- Consider Batch Replication for one-time migrations
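For example, a replication rule scoped to a single prefix might look like the following sketch (role ARN, bucket names and prefix are placeholders, and versioning must already be enabled on both buckets):
import boto3

s3 = boto3.client('s3')

# Replicate only the critical/ prefix instead of the whole bucket (placeholder names)
s3.put_bucket_replication(
    Bucket='my-bucket',
    ReplicationConfiguration={
        'Role': 'arn:aws:iam::123456789012:role/s3-replication-role',
        'Rules': [{
            'ID': 'replicate-critical-prefix',
            'Status': 'Enabled',
            'Priority': 1,
            'Filter': {'Prefix': 'critical/'},
            'DeleteMarkerReplication': {'Status': 'Disabled'},
            'Destination': {
                'Bucket': 'arn:aws:s3:::my-bucket-replica',
                # Land replicas straight into a cheaper class
                'StorageClass': 'STANDARD_IA'
            }
        }]
    }
)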
Strategy 5: Compression and Deduplication
Compress Before Storing
import gzip
import boto3

s3 = boto3.client('s3')

# Compress data
compressed_data = gzip.compress(original_data)

# Upload compressed
s3.put_object(
    Bucket='my-bucket',
    Key='data.json.gz',
    Body=compressed_data,
    ContentEncoding='gzip',
    ContentType='application/json'
)
Savings: 70-90% for text data, 20-50% for binary data
Deduplicate Before Upload
import hashlib

import boto3
from botocore.exceptions import ClientError

s3 = boto3.client('s3')

def get_hash(data):
    return hashlib.sha256(data).hexdigest()

def upload_if_new(bucket, data):
    data_hash = get_hash(data)
    key = f'data/{data_hash}'
    # Check if an object with this content hash already exists
    try:
        s3.head_object(Bucket=bucket, Key=key)
        print(f"Object {key} already exists")
        return key
    except ClientError:
        # Upload new object
        s3.put_object(Bucket=bucket, Key=key, Body=data)
        return key
Strategy 6: Right-Size Objects
Avoid Small Objects
Several storage classes have a minimum billable object size:
- Standard-IA and One Zone-IA: 128 KB minimum per object
- Glacier Instant Retrieval: 128 KB minimum per object
- Intelligent-Tiering: objects under 128 KB are never auto-tiered and stay in the Frequent Access tier
In Standard-IA, 1,000 objects of 1 KB each are billed as 128 MB, not 1 MB.
Solution: Aggregate small files:
import io
import tarfile

def create_archive(files):
    tar_buffer = io.BytesIO()
    with tarfile.open(fileobj=tar_buffer, mode='w:gz') as tar:
        for filename, content in files.items():
            info = tarfile.TarInfo(name=filename)
            info.size = len(content)
            tar.addfile(info, io.BytesIO(content))
    tar_buffer.seek(0)
    return tar_buffer.read()

# Upload single archive instead of many small files
s3.put_object(
    Bucket='my-bucket',
    Key='archive.tar.gz',
    Body=create_archive(files)
)
Object Size Recommendations
- < 100 KB: Consider aggregating
- 100 KB - 5 GB: Optimal range
- > 5 GB: Use multipart upload
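For large objects, boto3's transfer manager handles multipart uploads automatically. A sketch (thresholds, file and bucket names are placeholders):
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client('s3')

# Switch to multipart uploads above 100 MB, in 64 MB parts (placeholder sizes/names)
config = TransferConfig(
    multipart_threshold=100 * 1024 * 1024,
    multipart_chunksize=64 * 1024 * 1024
)

s3.upload_file('backup.tar.gz', 'my-bucket', 'backups/backup.tar.gz', Config=config)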
Strategy 7: S3 Storage Class Analysis
Automated recommendations for storage class transitions.
Enable Storage Class Analysis
aws s3api put-bucket-analytics-configuration \
  --bucket my-bucket \
  --id entire-bucket \
  --analytics-configuration '{
    "Id": "entire-bucket",
    "StorageClassAnalysis": {
      "DataExport": {
        "OutputSchemaVersion": "V_1",
        "Destination": {
          "S3BucketDestination": {
            "Format": "CSV",
            "Bucket": "arn:aws:s3:::analysis-results",
            "Prefix": "analysis/"
          }
        }
      }
    }
  }'
Analyses access patterns and recommends transitions (allow around 30 days of observation before recommendations appear).
Monitoring and Governance
S3 Storage Lens
Enable default dashboard:
- Storage metrics by bucket
- Cost optimisation opportunities
- Data protection best practices
- Access management recommendations
Cost Allocation Tags
Tag all buckets:
aws s3api put-bucket-tagging \
  --bucket my-bucket \
  --tagging 'TagSet=[
    {Key=Application,Value=customer-portal},
    {Key=Environment,Value=production},
    {Key=Owner,Value=platform-team},
    {Key=CostCenter,Value=engineering}
  ]'
Bucket Policies for Cost Control
Prevent storage of unencrypted objects:
{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "DenyUnencryptedObjectUploads",
    "Effect": "Deny",
    "Principal": "*",
    "Action": "s3:PutObject",
    "Resource": "arn:aws:s3:::my-bucket/*",
    "Condition": {
      "StringNotEquals": {
        "s3:x-amz-server-side-encryption": "AES256"
      }
    }
  }]
}
Real-World Example
Scenario: 50 TB of application logs
Before optimisation:
- Storage class: S3 Standard
- Cost: 50,000 GB × $0.023 = $1,150/month
After optimisation:
First 30 days (recent logs):
- 5 TB in Standard: $115/month
Days 31-90 (compliance access):
- 15 TB in Standard-IA: $187.50/month
Days 91-365 (archive):
- 20 TB in Glacier Instant Retrieval: $80/month
Older than 365 days:
- 10 TB in Glacier Deep Archive: $9.90/month
Total: $392.40/month
Savings: $757.60/month (66% reduction)
Annual savings: $9,091
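The arithmetic behind those numbers, as a quick sketch using the per-GB prices listed earlier:
# Tiered cost for 50 TB of logs, using the indicative per-GB prices from this article
tiers = {
    'STANDARD':     (5_000,  0.023),
    'STANDARD_IA':  (15_000, 0.0125),
    'GLACIER_IR':   (20_000, 0.004),
    'DEEP_ARCHIVE': (10_000, 0.00099),
}

total = sum(gb * price for gb, price in tiers.values())
before = 50_000 * 0.023

print(f"After:  ${total:.2f}/month")   # $392.40
print(f"Before: ${before:.2f}/month")  # $1150.00
print(f"Saving: ${before - total:.2f}/month ({(1 - total / before):.0%})")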
S3 Cost Optimisation Checklist
Immediate:
- Enable S3 Storage Lens
- Review largest buckets
- Delete incomplete multipart uploads
- Identify and delete old, unused data
30 days:
- Implement lifecycle policies
- Enable Intelligent-Tiering for unpredictable access
- Set up VPC endpoints
- Configure CloudFront for frequently accessed content
90 days:
- Review Storage Class Analysis recommendations
- Implement compression where applicable
- Optimise request patterns
- Review and optimise cross-region replication
Ongoing:
- Monthly review of storage growth
- Quarterly lifecycle policy review
- Regular cleanup of old data
- Monitor storage costs per bucket
Conclusion
S3 cost optimisation combines storage class selection, lifecycle policies, data management, and access pattern optimisation. For Australian businesses with significant S3 usage, these strategies can reduce storage costs by 50-80% while maintaining data accessibility and compliance requirements.
Start with lifecycle policies and Intelligent-Tiering, then progress to compression, deduplication, and access optimisation for maximum savings.
CloudPoint specialises in S3 cost optimisation for Australian businesses. We analyse your storage patterns, implement lifecycle policies, and build sustainable data management practices. Contact us for an S3 cost assessment.
Want to Optimise Your S3 Costs?
CloudPoint analyses your storage usage and implements lifecycle policies, intelligent tiering, and cleanup strategies. Get in touch to start saving.