On 10/2 at 18:43:00, one of our EC2-based services saw a priceout of the last AZ available to us in the region for that instance type, which took out all 9 of our spot instances and left the service with 0 running instances. That state of 0 running instances lasted about 7 minutes before the spot market briefly returned to normal. The only scale-up we saw for the on-demand group came 10 minutes after the total priceout.
Looking at the spotswap Lambda function's logs, these are the invocations closest to the priceout:
START
2017-10-02T18:23:08.038Z Finding instance ids in spot group: SpotGroup
2017-10-02T18:23:08.342Z Checking for termination tags on 9 instances
2017-10-02T18:23:09.624Z Found 9 instances with SpotTermination tag
2017-10-02T18:23:12.362Z No-op on stack during a CloudFormation update
END
REPORT Duration: 4325.97 ms Billed Duration: 4400 ms Memory Size: 128 MB Max Memory Used: 53 MB
START
2017-10-02T18:25:08.775Z Finding instance ids in spot group: SpotGroup
2017-10-02T18:25:09.199Z No instances listed
2017-10-02T18:25:09.255Z Found 0 instances with SpotTermination tag
2017-10-02T18:25:09.276Z Checking spot group SpotGroup for scaledown
2017-10-02T18:25:09.276Z Checking spot group SpotGroup for scaledown
END
Strangely, the No-op on stack during a CloudFormation update message appeared when no CloudFormation update was in progress. Then, in the second invocation, the function logged No instances listed and appears to have no-oped, when it should have treated the absence of spot instances as a signal to scale up the on-demand group.
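To make that expectation concrete, here is a minimal sketch of what acting on that signal could look like, written in TypeScript against the v2 AWS SDK. This is not spotswap's actual code: the group names and the coverSpotShortfall function are hypothetical, and the real function presumably has more state to consider. The idea is simply that "spot group describes zero instances" should lead to raising the on-demand group's desired capacity rather than returning early.

```typescript
import * as AWS from 'aws-sdk';

const autoscaling = new AWS.AutoScaling();

// Hypothetical group names, for illustration only.
const SPOT_GROUP = 'SpotGroup';
const ONDEMAND_GROUP = 'OnDemandGroup';

async function coverSpotShortfall(): Promise<void> {
  const res = await autoscaling
    .describeAutoScalingGroups({ AutoScalingGroupNames: [SPOT_GROUP, ONDEMAND_GROUP] })
    .promise();

  const spot = res.AutoScalingGroups.find(g => g.AutoScalingGroupName === SPOT_GROUP);
  const onDemand = res.AutoScalingGroups.find(g => g.AutoScalingGroupName === ONDEMAND_GROUP);
  if (!spot || !onDemand) throw new Error('Could not describe both Auto Scaling groups');

  // The "No instances listed" case: the spot group has been priced out entirely.
  if ((spot.Instances || []).length === 0 && spot.DesiredCapacity > 0) {
    // Instead of no-oping, ask the on-demand group to cover the spot group's
    // desired capacity, capped at the on-demand group's MaxSize.
    const desired = Math.min(spot.DesiredCapacity, onDemand.MaxSize);
    console.log(`Spot group is empty; setting on-demand desired capacity to ${desired}`);
    await autoscaling
      .setDesiredCapacity({ AutoScalingGroupName: ONDEMAND_GROUP, DesiredCapacity: desired })
      .promise();
  }
}
```

If spotswap is intended to handle this case through a different path (termination-notice tags or CloudWatch alarms rather than an empty describe result), that path clearly didn't fire here.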
Two questions:
1. Why did spotswap conclude a CloudFormation update was in progress when none was? (A sketch of a direct status check is included below for reference.)
2. Should the complete lack of spot instances have triggered a scale-up of the on-demand group?
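For context on question 1, this is the most direct way to ask CloudFormation whether a stack operation is in progress. Again, this is not spotswap's actual implementation and the stackUpdateInProgress helper is hypothetical; if the function instead relies on cached state or a tag written at deploy time, that could explain the stale no-op decision.

```typescript
import * as AWS from 'aws-sdk';

const cloudformation = new AWS.CloudFormation();

// Hypothetical helper: true when the named stack reports any *_IN_PROGRESS
// status, i.e. a create, update, or rollback is currently underway.
async function stackUpdateInProgress(stackName: string): Promise<boolean> {
  const res = await cloudformation.describeStacks({ StackName: stackName }).promise();
  const status = res.Stacks && res.Stacks[0] ? res.Stacks[0].StackStatus : '';
  return status.endsWith('_IN_PROGRESS');
}
```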