This is a good reason to start investing in multi-region architecture at some point.
Not trying to be smug here, but we updated a single config value, opened a PR, and merged the change, and we were switched over to a different region within a few minutes. Smooth sailing after that.
(This still depends to some degree on AWS to actually execute the failover, which is something we’re mulling over how to solve.)
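A minimal sketch of what a single-config-value region switch could look like, in Python. This is an illustration under assumptions, not the setup described above: `REGION_ENDPOINTS`, `FailoverConfig`, and `resolve_endpoint` are hypothetical names, and the example.com endpoints are placeholders.

```python
from dataclasses import dataclass

# Hypothetical map of region name to service endpoint (placeholder URLs).
REGION_ENDPOINTS = {
    "us-east-1": "https://api.us-east-1.example.com",
    "us-west-2": "https://api.us-west-2.example.com",
}


@dataclass(frozen=True)
class FailoverConfig:
    """The one value that would be changed in the PR to fail over."""
    active_region: str


def resolve_endpoint(config: FailoverConfig) -> str:
    """Return the endpoint for the configured region, rejecting unknown regions."""
    try:
        return REGION_ENDPOINTS[config.active_region]
    except KeyError:
        raise ValueError(f"unknown region: {config.active_region!r}") from None


# Normal operation points at one region...
primary = FailoverConfig(active_region="us-east-1")
# ...and failing over is a one-line change to the config value.
failover = FailoverConfig(active_region="us-west-2")

print(resolve_endpoint(primary))
print(resolve_endpoint(failover))
```

The point of the design is that everything downstream reads the region from one place, so the switch is a reviewable one-line diff rather than a scramble across services.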
Now, our work demands that we invest in such things; we’re even investing in multi-cloud (an actual nightmare). Not everyone can do this, and some systems just aren’t built for it, but if it’s within reach it’s probably worth it.
Last night from 12 to 4 a.m., almost every region was impacted, so multi-region didn’t help that much.
We do have failovers that customers can activate to start working in another region.
Our canaries and infrastructure alarms can’t move over like that, though, since they exist to alert on the region they run in.