Have you ever had an Azure instance that just was not performing up to your expectations?
We have. And in the past, such instances were very hard to get rid of without affecting your entire cloud deployment.
How-To: Remove Azure instances that became slow or unhealthy
In this week’s How-To post, we’ll cover the new API that lets us easily remove an Azure instance that is performing poorly or has become unhealthy. We also share a tool we wrote to automate the removal, so you can remove dead Azure instances quickly whenever you need.
Read more: Azure: Remove unhealthy or slow role instances.
A bit of history
Back when we launched LeanSentry 2 years ago, we had a lot of issues with Azure instances not performing up to our expectations. In particular, the Azure host processes would die whenever the instances experienced high memory utilization from our custom cache layer, and the affected instances would get stuck constantly recycling the role or rebooting the VM. This caused service outages during times of peak usage.
Back then, the only way to take an instance out of rotation was to do a VIP swap, or to scale down the service to the point where the offending instance would be removed (along with all other instances with higher instance IDs). Because we maintained a lot of in-memory and on-disk state on each instance, both of those options were a huge no-no. So, we lobbied Microsoft to create an option to remove a specific instance, instead of trashing half of your service.
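To make the difference concrete, here is a toy sketch in Python. It is not Azure code and calls no Azure API; the function names are made up for illustration. It assumes only what is described above: scaling down drops the highest-numbered instances first, while a targeted removal drops just the one instance.

```python
def lost_by_scale_down(instance_ids, offending_id):
    """Instances removed if we scale down far enough to drop offending_id.

    Assumption (from the post): scale-down removes the instances with the
    highest IDs first, so every instance at or above offending_id goes too.
    """
    if offending_id not in instance_ids:
        raise ValueError(f"instance {offending_id} is not in the deployment")
    return sorted(i for i in instance_ids if i >= offending_id)


def lost_by_targeted_removal(instance_ids, offending_id):
    """Instances removed if we can delete one specific instance."""
    if offending_id not in instance_ids:
        raise ValueError(f"instance {offending_id} is not in the deployment")
    return [offending_id]


if __name__ == "__main__":
    deployment = [0, 1, 2, 3, 4, 5]  # hypothetical six-instance role
    bad = 2                          # the instance that keeps recycling
    print("scale down loses:      ", lost_by_scale_down(deployment, bad))
    print("targeted removal loses:", lost_by_targeted_removal(deployment, bad))
```

With six instances and a bad instance 2, scaling down costs four instances (and their warm caches), while removing the specific instance costs one.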
The Azure team came through and finally released an API to do this. This has been a godsend, allowing us to intelligently manage how we scale up and down so we can keep the instances with the highest efficiency / warmest cache.
For more on how and when to do this, and a tool to quickly delete instances, check out the How-To post here.