AWS Cost Optimisation
S3 Cost Optimisation: Reducing Storage Spend with Smart Strategies
Comprehensive guide to optimising S3 storage costs through lifecycle policies, intelligent tiering, and data management best practices for Australian businesses.
CloudPoint Team
Amazon S3 is one of the most used AWS services, and storage costs can accumulate surprisingly quickly. For Australian businesses storing terabytes of data, S3 optimisation can yield substantial savings.
Understanding S3 Pricing
S3 costs come from three main sources:
1. Storage ($0.023/GB/month in Sydney for Standard)
2. Requests ($0.0055 per 1,000 PUT requests, $0.00044 per 1,000 GET requests)
3. Data transfer ($0.114/GB out to internet)
A 10 TB dataset costs ~$230/month just for storage in S3 Standard - but could cost as little as $10/month in Glacier Deep Archive.
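As a rough back-of-the-envelope sketch (using the indicative prices above; actual rates vary by usage tier and change over time), a bucket's monthly bill can be estimated like this:
# Rough monthly estimate using the indicative prices above (assumed, not official rates)
STORAGE_PER_GB = 0.023        # S3 Standard, per GB-month
PUT_PER_1000 = 0.0055         # per 1,000 PUT requests
GET_PER_1000 = 0.00044        # per 1,000 GET requests
TRANSFER_OUT_PER_GB = 0.114   # per GB out to the internet

def estimate_monthly_cost(storage_gb, puts, gets, transfer_out_gb):
    return (storage_gb * STORAGE_PER_GB
            + puts / 1000 * PUT_PER_1000
            + gets / 1000 * GET_PER_1000
            + transfer_out_gb * TRANSFER_OUT_PER_GB)

# 10 TB stored, 1M PUTs, 10M GETs, 500 GB egress
print(round(estimate_monthly_cost(10_000, 1_000_000, 10_000_000, 500), 2))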
S3 Storage Classes
Choose the right storage class for access patterns:
S3 Standard - $0.023/GB/month
- Frequent access
- Millisecond latency
- No retrieval fees
S3 Intelligent-Tiering - $0.023/GB/month + $0.0025/1000 objects monitoring
- Unknown or changing access patterns
- Automatic optimisation
- No retrieval fees
S3 Standard-IA - $0.0125/GB/month
- Infrequent access (< once per month)
- Millisecond latency
- $0.01/GB retrieval fee
- Minimum 30-day storage
S3 One Zone-IA - $0.01/GB/month
- Infrequent access, non-critical data
- Single AZ (less resilience)
- $0.01/GB retrieval fee
S3 Glacier Instant Retrieval - $0.004/GB/month
- Long-term, rarely accessed
- Millisecond retrieval
- Minimum 90-day storage
S3 Glacier Flexible Retrieval - $0.0045/GB/month
- Archive data
- Minutes to hours retrieval
- Minimum 90-day storage
S3 Glacier Deep Archive - $0.00099/GB/month
- Long-term archive (7-10 years)
- 12-48 hour retrieval
- Minimum 180-day storage
- Cheapest option
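If you already know an object will be accessed infrequently, you can choose the class at upload time rather than waiting for a lifecycle transition. A minimal boto3 sketch (bucket, key and file names are placeholders):
import boto3

s3 = boto3.client('s3')

# Upload straight into Standard-IA instead of Standard (placeholder names)
with open('annual-report.pdf', 'rb') as f:
    s3.put_object(
        Bucket='my-bucket',
        Key='reports/2024/annual-report.pdf',
        Body=f,
        StorageClass='STANDARD_IA'
    )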
Strategy 1: Lifecycle Policies
Automatically transition objects to cheaper storage classes.
Basic Lifecycle Policy
{
  "Rules": [{
    "ID": "Move to cheaper storage",
    "Filter": {"Prefix": ""},
    "Status": "Enabled",
    "Transitions": [
      {
        "Days": 30,
        "StorageClass": "STANDARD_IA"
      },
      {
        "Days": 90,
        "StorageClass": "GLACIER_IR"
      },
      {
        "Days": 365,
        "StorageClass": "DEEP_ARCHIVE"
      }
    ],
    "Expiration": {
      "Days": 2555
    }
  }]
}
Typical savings: 60-80% for data following this pattern
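To apply a policy like this programmatically, one option is boto3's put_bucket_lifecycle_configuration; a sketch, assuming the JSON above is saved locally as lifecycle.json (placeholder path):
import json
import boto3

s3 = boto3.client('s3')

# Load the policy JSON shown above and apply it to the bucket (placeholder names)
with open('lifecycle.json') as f:
    lifecycle = json.load(f)

s3.put_bucket_lifecycle_configuration(
    Bucket='my-bucket',
    LifecycleConfiguration=lifecycle
)
The same document can also be applied from the command line with aws s3api put-bucket-lifecycle-configuration --bucket my-bucket --lifecycle-configuration file://lifecycle.json.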
Application-Specific Policies
Log files:
{
  "Rules": [{
    "ID": "Log retention policy",
    "Filter": {"Prefix": "logs/"},
    "Status": "Enabled",
    "Transitions": [
      {
        "Days": 7,
        "StorageClass": "STANDARD_IA"
      },
      {
        "Days": 30,
        "StorageClass": "GLACIER_IR"
      },
      {
        "Days": 90,
        "StorageClass": "GLACIER"
      }
    ],
    "Expiration": {
      "Days": 365
    }
  }]
}
Backups:
{
  "Rules": [{
    "ID": "Backup retention",
    "Filter": {"Prefix": "backups/"},
    "Status": "Enabled",
    "Transitions": [
      {
        "Days": 1,
        "StorageClass": "GLACIER_IR"
      },
      {
        "Days": 30,
        "StorageClass": "DEEP_ARCHIVE"
      }
    ],
    "Expiration": {
      "Days": 2555
    }
  }]
}
Intelligent-Tiering Configuration
For unpredictable access patterns:
aws s3api put-bucket-intelligent-tiering-configuration \
  --bucket my-bucket \
  --id EntireBucket \
  --intelligent-tiering-configuration '{
    "Id": "EntireBucket",
    "Status": "Enabled",
    "Tierings": [
      {
        "Days": 90,
        "AccessTier": "ARCHIVE_ACCESS"
      },
      {
        "Days": 180,
        "AccessTier": "DEEP_ARCHIVE_ACCESS"
      }
    ]
  }'
Objects automatically move between tiers based on access:
- Frequent access tier
- Infrequent access tier (after 30 days)
- Archive access tier (after 90 days)
- Deep archive access tier (after 180 days)
Benefit: No retrieval fees, automatic optimisation
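Existing objects do not move into Intelligent-Tiering on their own; they need either a lifecycle transition or an in-place copy. A minimal sketch using copy_object (bucket and key names are placeholders):
import boto3

s3 = boto3.client('s3')

# Re-copy an object onto itself to change its storage class (placeholder names)
# Note: copy_object handles objects up to 5 GB; larger objects need a multipart copy
s3.copy_object(
    Bucket='my-bucket',
    Key='data/events.parquet',
    CopySource={'Bucket': 'my-bucket', 'Key': 'data/events.parquet'},
    StorageClass='INTELLIGENT_TIERING',
    MetadataDirective='COPY'
)
For new data, it is simpler to upload with StorageClass='INTELLIGENT_TIERING' in the first place.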
Strategy 2: Delete Unnecessary Data
Identify Old and Unused Objects
S3 Storage Lens: Free metrics and insights:
- Storage by age
- Incomplete multipart uploads
- Unencrypted objects
- Objects without lifecycle policies
Analyse with Athena:
-- Query S3 inventory
SELECT
  bucket,
  key,
  size,
  storage_class,
  last_modified_date,
  size / 1024 / 1024 / 1024 as size_gb
FROM s3_inventory
WHERE last_modified_date < DATE_ADD('day', -365, CURRENT_DATE)
  AND storage_class = 'STANDARD'
ORDER BY size DESC
LIMIT 100;
Clean Up Incomplete Multipart Uploads
Incomplete multipart uploads are often forgotten but still incur storage costs:
{
  "Rules": [{
    "ID": "Delete incomplete uploads",
    "Filter": {"Prefix": ""},
    "Status": "Enabled",
    "AbortIncompleteMultipartUpload": {
      "DaysAfterInitiation": 7
    }
  }]
}
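Before the rule takes effect, it can be useful to see what is actually sitting there. A quick boto3 sketch to list in-progress multipart uploads (bucket name is a placeholder):
import boto3

s3 = boto3.client('s3')

# List in-progress multipart uploads that are quietly accruing storage charges
paginator = s3.get_paginator('list_multipart_uploads')
for page in paginator.paginate(Bucket='my-bucket'):
    for upload in page.get('Uploads', []):
        print(upload['Key'], upload['UploadId'], upload['Initiated'])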
Delete Old Versions
If versioning is enabled:
{
  "Rules": [{
    "ID": "Expire old versions",
    "Filter": {"Prefix": ""},
    "Status": "Enabled",
    "NoncurrentVersionTransitions": [
      {
        "NoncurrentDays": 30,
        "StorageClass": "STANDARD_IA"
      }
    ],
    "NoncurrentVersionExpiration": {
      "NoncurrentDays": 90
    }
  }]
}
Strategy 3: Optimise Request Costs
Batch Operations
Instead of individual API calls:
# Bad - Individual requests
for key in keys:
    s3.delete_object(Bucket='my-bucket', Key=key)

# Good - Batch delete (up to 1,000 keys per request)
s3.delete_objects(
    Bucket='my-bucket',
    Delete={
        'Objects': [{'Key': key} for key in keys]
    }
)
Use CloudFront for Frequently Accessed Data
Benefits:
- Cheaper data transfer ($0.095/GB vs $0.114/GB)
- Reduced S3 GET requests
- Better performance (edge caching)
- Free HTTPS certificates
// CloudFront Distribution
{
  "Origins": [{
    "Id": "S3-my-bucket",
    "DomainName": "my-bucket.s3.amazonaws.com",
    "S3OriginConfig": {
      "OriginAccessIdentity": "origin-access-identity/cloudfront/ABCDEFG"
    }
  }],
  "DefaultCacheBehavior": {
    "TargetOriginId": "S3-my-bucket",
    "ViewerProtocolPolicy": "redirect-to-https",
    "AllowedMethods": ["GET", "HEAD"],
    "CachedMethods": ["GET", "HEAD"],
    "DefaultTTL": 86400
  }
}
Implement Caching Headers
Reduce repeat requests:
s3.put_object(
    Bucket='my-bucket',
    Key='image.jpg',
    Body=image_data,
    CacheControl='public, max-age=31536000',  # 1 year
    ContentType='image/jpeg'
)
Strategy 4: Optimise Data Transfer
VPC Endpoint (Gateway Endpoint)
Gateway endpoints are free and keep EC2-to-S3 traffic within the region on the AWS network, off NAT gateways:
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-1234567890abcdef0 \
  --service-name com.amazonaws.ap-southeast-2.s3 \
  --route-table-ids rtb-12345678
</gr-replace>
Savings: S3 traffic no longer flows through NAT gateways, avoiding their per-GB data processing charges
S3 Transfer Acceleration
For uploads from distant locations:
import boto3
from botocore.config import Config

# Requires Transfer Acceleration to be enabled on the bucket first
s3 = boto3.client('s3', config=Config(
    s3={'use_accelerate_endpoint': True}
))
Cost: $0.04/GB (but faster uploads may be worth it)
Cross-Region Replication Optimisation
If using CRR, optimise:
- Replicate only necessary prefixes (see the sketch after this list)
- Use lifecycle policies on replica
- Consider Batch Replication for one-time migrations
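For example, a replication rule scoped to a single prefix might look like the following sketch (role ARN, bucket names and prefix are placeholders, and versioning must already be enabled on both buckets):
import boto3

s3 = boto3.client('s3')

# Replicate only the critical/ prefix instead of the whole bucket (placeholder names)
s3.put_bucket_replication(
    Bucket='my-bucket',
    ReplicationConfiguration={
        'Role': 'arn:aws:iam::123456789012:role/s3-replication-role',
        'Rules': [{
            'ID': 'replicate-critical-prefix',
            'Status': 'Enabled',
            'Priority': 1,
            'Filter': {'Prefix': 'critical/'},
            'DeleteMarkerReplication': {'Status': 'Disabled'},
            'Destination': {
                'Bucket': 'arn:aws:s3:::my-bucket-replica',
                # Land replicas straight into a cheaper class
                'StorageClass': 'STANDARD_IA'
            }
        }]
    }
)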
Strategy 5: Compression and Deduplication
Compress Before Storing
import gzip
import boto3

s3 = boto3.client('s3')

# Compress data
compressed_data = gzip.compress(original_data)

# Upload compressed
s3.put_object(
    Bucket='my-bucket',
    Key='data.json.gz',
    Body=compressed_data,
    ContentEncoding='gzip',
    ContentType='application/json'
)
Savings: 70-90% for text data, 20-50% for binary data
Deduplicate Before Upload
import hashlib

import boto3
from botocore.exceptions import ClientError

s3 = boto3.client('s3')

def get_hash(data):
    return hashlib.sha256(data).hexdigest()

def upload_if_new(bucket, data):
    data_hash = get_hash(data)
    key = f'data/{data_hash}'
    # Check if an object with this content hash already exists
    try:
        s3.head_object(Bucket=bucket, Key=key)
        print(f"Object {key} already exists")
        return key
    except ClientError:
        # Upload new object
        s3.put_object(Bucket=bucket, Key=key, Body=data)
        return key
Strategy 6: Right-Size Objects
Avoid Small Objects
Several storage classes have a minimum billable object size:
- Standard-IA and One Zone-IA: 128 KB minimum per object
- Glacier Instant Retrieval: 128 KB minimum per object
- Intelligent-Tiering: objects under 128 KB are never auto-tiered and stay in the Frequent Access tier
In Standard-IA, 1,000 objects of 1 KB each are billed as 128 MB, not 1 MB.
Solution: Aggregate small files:
import io
import tarfile

def create_archive(files):
    tar_buffer = io.BytesIO()
    with tarfile.open(fileobj=tar_buffer, mode='w:gz') as tar:
        for filename, content in files.items():
            info = tarfile.TarInfo(name=filename)
            info.size = len(content)
            tar.addfile(info, io.BytesIO(content))
    tar_buffer.seek(0)
    return tar_buffer.read()

# Upload single archive instead of many small files
s3.put_object(
    Bucket='my-bucket',
    Key='archive.tar.gz',
    Body=create_archive(files)
)
Object Size Recommendations
- < 100 KB: Consider aggregating
- 100 KB - 5 GB: Optimal range
- > 5 GB: Use multipart upload
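For large objects, boto3's transfer manager handles multipart uploads automatically. A sketch (thresholds, file and bucket names are placeholders):
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client('s3')

# Switch to multipart uploads above 100 MB, in 64 MB parts (placeholder sizes/names)
config = TransferConfig(
    multipart_threshold=100 * 1024 * 1024,
    multipart_chunksize=64 * 1024 * 1024
)

s3.upload_file('backup.tar.gz', 'my-bucket', 'backups/backup.tar.gz', Config=config)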
Strategy 7: S3 Storage Class Analysis
Automated recommendations for storage class transitions.
Enable Storage Class Analysis
aws s3api put-bucket-analytics-configuration \
  --bucket my-bucket \
  --id entire-bucket \
  --analytics-configuration '{
    "Id": "entire-bucket",
    "StorageClassAnalysis": {
      "DataExport": {
        "OutputSchemaVersion": "V_1",
        "Destination": {
          "S3BucketDestination": {
            "Format": "CSV",
            "Bucket": "arn:aws:s3:::analysis-results",
            "Prefix": "analysis/"
          }
        }
      }
    }
  }'
Analyses access patterns and recommends transitions (allow around 30 days of observation before recommendations appear).
Monitoring and Governance
S3 Storage Lens
Enable default dashboard:
- Storage metrics by bucket
- Cost optimisation opportunities
- Data protection best practices
- Access management recommendations
Cost Allocation Tags
Tag all buckets:
aws s3api put-bucket-tagging \
  --bucket my-bucket \
  --tagging 'TagSet=[
    {Key=Application,Value=customer-portal},
    {Key=Environment,Value=production},
    {Key=Owner,Value=platform-team},
    {Key=CostCenter,Value=engineering}
  ]'
Bucket Policies for Cost Control
Prevent storage of unencrypted objects:
{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "DenyUnencryptedObjectUploads",
    "Effect": "Deny",
    "Principal": "*",
    "Action": "s3:PutObject",
    "Resource": "arn:aws:s3:::my-bucket/*",
    "Condition": {
      "StringNotEquals": {
        "s3:x-amz-server-side-encryption": "AES256"
      }
    }
  }]
}
Real-World Example
Scenario: 50 TB of application logs
Before optimisation:
- Storage class: S3 Standard
- Cost: 50,000 GB × $0.023 = $1,150/month
After optimisation:
First 30 days (recent logs):
- 5 TB in Standard: $115/month
Days 31-90 (compliance access):
- 15 TB in Standard-IA: $187.50/month
Days 91-365 (archive):
- 20 TB in Glacier Instant Retrieval: $80/month
Older than 365 days:
- 10 TB in Glacier Deep Archive: $9.90/month
Total: $392.40/month
Savings: $757.60/month (66% reduction)
Annual savings: $9,091
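The arithmetic behind those numbers, as a quick sketch using the per-GB prices listed earlier:
# Tiered cost for 50 TB of logs, using the indicative per-GB prices from this article
tiers = {
    'STANDARD':     (5_000,  0.023),
    'STANDARD_IA':  (15_000, 0.0125),
    'GLACIER_IR':   (20_000, 0.004),
    'DEEP_ARCHIVE': (10_000, 0.00099),
}

total = sum(gb * price for gb, price in tiers.values())
before = 50_000 * 0.023

print(f"After:  ${total:.2f}/month")   # $392.40
print(f"Before: ${before:.2f}/month")  # $1150.00
print(f"Saving: ${before - total:.2f}/month ({(1 - total / before):.0%})")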
S3 Cost Optimisation Checklist
Immediate:
- Enable S3 Storage Lens
- Review largest buckets
- Delete incomplete multipart uploads
- Identify and delete old, unused data
30 days:
- Implement lifecycle policies
- Enable Intelligent-Tiering for unpredictable access
- Set up VPC endpoints
- Configure CloudFront for frequently accessed content
90 days:
- Review Storage Class Analysis recommendations
- Implement compression where applicable
- Optimise request patterns
- Review and optimise cross-region replication
Ongoing:
- Monthly review of storage growth
- Quarterly lifecycle policy review
- Regular cleanup of old data
- Monitor storage costs per bucket
Conclusion
S3 cost optimisation combines storage class selection, lifecycle policies, data management, and access pattern optimisation. For Australian businesses with significant S3 usage, these strategies can reduce storage costs by 50-80% while maintaining data accessibility and compliance requirements.
Start with lifecycle policies and Intelligent-Tiering, then progress to compression, deduplication, and access optimisation for maximum savings.
CloudPoint specialises in S3 cost optimisation for Australian businesses. We analyse your storage patterns, implement lifecycle policies, and build sustainable data management practices. Contact us for an S3 cost assessment.
Want to Optimise Your S3 Costs?
CloudPoint analyses your storage usage and implements lifecycle policies, intelligent tiering, and cleanup strategies. Get in touch to start saving.