Created by Laurence J MacGuire a.k.a. 刘建明 a.k.a Liu Jian Ming
ThoughtWorks XiAn, 2017/02/17
Usually simple. Just requires some effort.
Check out the billing console.
Last Month's Costs, This Month (so far), This Month's Forecast
Yeah. We spent a lot.
Check out CloudWatch in us-east-1
CloudWatch > Metrics > Billing
All > Billing > By Service
Select All
Point In Time Invoice Estimate
An extra $1,200? It was me :(
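The same billing graph can be pulled with a script. A minimal sketch, assuming boto3 is installed, credentials are configured, and billing alerts are enabled on the account. The function names `billing_query` and `estimated_charges` are made up for this example.

```python
from datetime import datetime, timedelta, timezone


def billing_query(service=None, days=14, now=None):
    """Build the keyword arguments for CloudWatch get_metric_statistics."""
    now = now or datetime.now(timezone.utc)
    dimensions = [{"Name": "Currency", "Value": "USD"}]
    if service:
        # Per-service breakdown, e.g. "AmazonEC2" or "AmazonS3".
        dimensions.append({"Name": "ServiceName", "Value": service})
    return {
        "Namespace": "AWS/Billing",
        "MetricName": "EstimatedCharges",
        "Dimensions": dimensions,
        "StartTime": now - timedelta(days=days),
        "EndTime": now,
        "Period": 86400,            # one datapoint per day
        "Statistics": ["Maximum"],  # the estimate accumulates over the month
    }


def estimated_charges(service=None, days=14):
    """Return (timestamp, estimated charge) pairs, oldest first."""
    import boto3  # imported lazily so billing_query() works without it

    # Billing metrics are only published in us-east-1.
    cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")
    response = cloudwatch.get_metric_statistics(**billing_query(service, days))
    points = sorted(response["Datapoints"], key=lambda p: p["Timestamp"])
    return [(p["Timestamp"], p["Maximum"]) for p in points]
```

Calling `estimated_charges("AmazonEC2")` should give the per-service line from the graph. A sudden jump between two days is the thing to look for.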
Pretty much all the pricing can be accessed from here:
M class instances are general purpose
T class instances have burstable capacity
R class instances have high memory / lower CPU
C class instances have high CPU / lower memory
ap-southeast-2 pricing
It’s more complicated. But that’s the idea.
Similar to EC2.
Simple Pricing: $0.020 to $0.025 per GB per month
Watch out for versioning!
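Why versioning matters: old object versions are billed as storage too. A rough sketch of the arithmetic, storage only (requests and data transfer are ignored). The bucket sizes are invented for illustration, and the default price is the upper end of the range above.

```python
def s3_monthly_cost(current_gb, noncurrent_gb=0.0, price_per_gb=0.025):
    """Storage-only estimate: every stored version counts towards the bill."""
    return (current_gb + noncurrent_gb) * price_per_gb


# Hypothetical bucket: 500 GB of live data.
print(f"no versioning:   ${s3_monthly_cost(500):.2f} / month")
# Same bucket with 1500 GB of old versions that were never expired.
print(f"with versioning: ${s3_monthly_cost(500, noncurrent_gb=1500):.2f} / month")
```

A lifecycle rule that expires noncurrent versions is usually the fix.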
Enable ASG Metrics and see GroupTotalInstances
Instance Price * Instance hours = ASG Price
Use CloudWatch and do some math
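The math, as a small sketch. It assumes you have exported one GroupTotalInstances datapoint per hour from CloudWatch. The instance counts and the hourly price below are placeholders, not real figures.

```python
def asg_cost(hourly_instance_counts, price_per_instance_hour):
    """Instance price * instance hours = ASG price."""
    instance_hours = sum(hourly_instance_counts)
    return instance_hours * price_per_instance_hour


# Hypothetical day: 2 instances for 18 hours, scaled out to 4 for 6 hours.
counts = [2] * 18 + [4] * 6
print(f"instance hours: {sum(counts)}")
print(f"cost for the day: ${asg_cost(counts, 0.10):.2f}")  # assumed $0.10/hour
```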
Know your app. Is it CPU intensive? Is it memory or I/O intensive?
I can’t answer these questions. But it’s critical that you know.
How much traffic does it see?
1 RPM? 10 RPM? 100 RPM? 1000 RPM? 10000 RPM?
Do the numbers make sense?
Simple CRUD app w/ 10 RPM? On 8 instances?
Look at all the numbers, and ask yourself.
Investigate.
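One quick way to ask "do the numbers make sense?" is to divide the traffic by the fleet size. A tiny sketch using the CRUD example above. The helper names are made up.

```python
def rpm_per_instance(requests_per_minute, instances):
    """Average load each instance actually sees."""
    return requests_per_minute / instances


def seconds_between_requests(requests_per_minute, instances):
    """How long each instance sits idle between requests, on average."""
    return 60 / rpm_per_instance(requests_per_minute, instances)


# The simple CRUD app: 10 RPM spread over 8 instances.
print(rpm_per_instance(10, 8))          # requests per minute per instance
print(seconds_between_requests(10, 8))  # idle seconds between requests
```

If an instance is waiting most of a minute between requests, the fleet is probably too big.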
You NEED an SLA.
And the means to measure.
Every change is weighing trade-offs. What is acceptable?
Re-size your stuff.
EC2 Instance Console: Monitoring
CloudWatch > Metrics > EC2 > By-ASG > CPUUtilization
Before
After
This code is perfect.
– said no one ever
Chances are, there are easy optimisations you can do.
“Premature optimisation blah blah”
Mostly READ ONLY Database after a large data processing pipeline.
Before
After
Not anymore. Since data-services is hammering us :(
Bid a percentage of the normal price. Uptime is not guaranteed.
Low SLA? Offline processing? Can survive delays?
Try a Spot instance.
Examples: event processors, CI agents, dashboards, report runners …
All events come into AWS Kinesis and get buffered for 7 days. State is stored in DynamoDB.
Spot instance is up 95%+ of the time for 30% of the price.
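To compare like with like, divide the price ratio by the uptime. A sketch that pessimistically assumes you pay the spot rate for the whole period, even while the instance is down. The $0.10 on-demand price is a placeholder.

```python
def effective_price_ratio(price_ratio=0.30, uptime=0.95):
    """Spot cost per hour of actual uptime, relative to on-demand."""
    return price_ratio / uptime


def monthly_saving(on_demand_hourly, hours=730, price_ratio=0.30):
    """Money saved per month by paying the spot price instead of on-demand."""
    return on_demand_hourly * hours * (1 - price_ratio)


print(f"effective ratio: {effective_price_ratio():.3f}")
print(f"saving on a $0.10/hour instance: ${monthly_saving(0.10):.2f} / month")
```

Even after accounting for the downtime, each useful hour costs roughly a third of the on-demand price.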
Before
After
Trusted Advisor
$ authenticate
$ ./auto/ec2-utilisation --summary
...
Stat         Min  Avg   Max     σ
Instances     23   31  40.0   7.1
vCPUs avail   33   45  60.0  11.2
vCPUs used     2    3  10.8   1.5
Util %         6    8  19.3   2.0
Tells you average CPU usage across an entire account.
Under 20% and you’re really under-utilizing
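A guess at how a summary like this could be computed: take vCPUs used over vCPUs available for each sample, then summarise. This is not the actual `ec2-utilisation` script. The samples are invented, and the 20% threshold is the rule of thumb above.

```python
from statistics import mean, pstdev


def utilisation_summary(samples, threshold=20.0):
    """samples: (vcpus_available, vcpus_used) pairs, one per time slot."""
    util = [100.0 * used / avail for avail, used in samples if avail > 0]
    return {
        "min": min(util),
        "avg": mean(util),
        "max": max(util),
        "sigma": pstdev(util),
        "under_utilised": mean(util) < threshold,
    }


# Hypothetical samples, not taken from a real account.
summary = utilisation_summary([(40, 2), (40, 4), (60, 6)])
print(summary)
```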
$ authenticate
$ ./auto/stack-costs
---
stack: tlap-demo
since: 2017-02-16 05:42:46.765942071 Z
cost:
  since: 0.8240000000000001
  per_hour: 0.034333333333333334
  per_day: 0.8240000000000001
  per_month: 24.720000000000002
  per_year: 300.76000000000005
  per_request: .inf
resources:
  auto_scaling_groups:
    items:
      tlap-demo-autoScalingGroup-19URKBUAGO3WB:
        config:
          instance_type: t2.nano
          spot_price:
        size:
          min: 2
          max: 2
          avg: 1.1666666666666667
        cost: 0.2240000000000001
        usage:
          memory_percentage:
            min: 61.578199229834965
            avg: 66.04526662297977
            max: 72.2794131419596
          cpu_percentage:
            min: 2.5
            avg: 8.089285714285714
            max: 67.33
    cost: 0.2240000000000001
  load_balancers:
    items:
      tlap-demo-loadBala-DH21FTCDREX0:
        usage:
          hits:
            sum: 0
            average: 0
            minimum: 0
            maximum: 0
        cost: 0.6
    cost: 0.6
recommendations:
- elb 'tlap-demo-loadBala-DH21FTCDREX0' gets very little traffic ( 0.00 hits / hour).
Is this stack needed 24 hours a day? or at all?
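The cost figures in the output above are consistent with a straight-line projection: cost so far divided by hours elapsed, then scaled to a 30-day month and a 365-day year. That is inferred from the numbers, not from the tool's source. A sketch that reproduces them:

```python
def project_cost(cost_since, hours_elapsed, requests=0):
    """Straight-line projection of the cost so far (assumed method)."""
    per_hour = cost_since / hours_elapsed
    per_day = per_hour * 24
    return {
        "per_hour": per_hour,
        "per_day": per_day,
        "per_month": per_day * 30,   # 30-day month, inferred from the output
        "per_year": per_day * 365,
        # No traffic means an infinite cost per request, as in the output.
        "per_request": cost_since / requests if requests else float("inf"),
    }


# Values from the tlap-demo output: $0.824 over roughly 24 hours, 0 hits.
projection = project_cost(0.824, 24)
for key, value in projection.items():
    print(f"{key}: {value:.4f}")
```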
Feel free to ask for help when you look at the costs :)