Recent incidents are exposing plenty of risk in the SaaS landscape, from compromised systems to extended operational outages.

What are the implications of this?

In terms of risk mitigation for the provider, perhaps deeper sharding is warranted. This would provide stronger isolation between tenants, limiting the blast radius of a compromise. As I recently described, CloudSinc uses the highly audited AWS IAM service to enforce data vaults, but there are many ways to do this.
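To make the isolation idea concrete, here is a minimal sketch of one way per-tenant vaults could be expressed with IAM: a policy document that confines a tenant's role to its own S3 prefix. This is an illustration under assumptions, not a description of any particular vendor's implementation. The bucket name, the `tenants/<id>/` prefix layout, and the `tenant_vault_policy` helper are all hypothetical.

```python
import json


def tenant_vault_policy(bucket: str, tenant_id: str) -> dict:
    """Build an IAM policy document confining a tenant's role to its own S3 prefix.

    The bucket layout (tenants/<tenant_id>/...) is an assumption for this sketch.
    """
    if not tenant_id.isalnum():
        # Reject ids that could smuggle wildcards or path separators into the ARN.
        raise ValueError(f"unsafe tenant id: {tenant_id!r}")
    prefix = f"tenants/{tenant_id}/"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Object reads and writes are limited to the tenant's prefix.
                "Sid": "TenantObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            },
            {
                # Listing is allowed on the bucket, but only under the same prefix.
                "Sid": "TenantListing",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": f"{prefix}*"}},
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket and tenant names, for illustration only.
    print(json.dumps(tenant_vault_policy("example-saas-data", "acme42"), indent=2))
```

The point of the design is that the boundary is enforced by the access-control service rather than by application code, so a bug in the application cannot widen a tenant's reach beyond its own prefix.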

For the service consumer, the obvious risk reduction is backups: even if the primary service is down, having some form of redundant data copy is likely to be useful. It’s unlikely you’ll have a usable DR capability, since you are paying the SaaS provider precisely for the value it adds, but in many cases, such as the (at the time of writing) JIRA outage, being able to look up or track some items will have benefits. If the decision is taken, temporarily or not, to migrate to another provider, then at least porting is tenable, versus trying to get attention from the vendor at the moment of peak stress.
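A consumer-side backup does not need to be elaborate to be useful during an outage. The sketch below assumes you already have some way of pulling items out of the service (an export job or API client, not shown here) and simply writes them to a timestamped JSON Lines file that can be searched offline. The function names and the `key`/`summary` fields are hypothetical.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def write_snapshot(items, dest_dir):
    """Write items (an iterable of dicts) to a timestamped JSON Lines file."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    path = dest / f"snapshot-{stamp}.jsonl"
    with path.open("w", encoding="utf-8") as fh:
        for item in items:
            # One JSON object per line keeps the file greppable and easy to port.
            fh.write(json.dumps(item, sort_keys=True, ensure_ascii=False) + "\n")
    return path


def find_items(snapshot_path, text):
    """Return items whose serialised form contains text (case-insensitive)."""
    needle = text.lower()
    matches = []
    with Path(snapshot_path).open(encoding="utf-8") as fh:
        for line in fh:
            if needle in line.lower():
                matches.append(json.loads(line))
    return matches


if __name__ == "__main__":
    # Hypothetical records standing in for whatever the real export returns.
    exported = [
        {"key": "OPS-1", "summary": "Renew TLS certificate"},
        {"key": "OPS-2", "summary": "Rotate signing keys"},
    ]
    snapshot = write_snapshot(exported, "backups")
    print(find_items(snapshot, "tls"))
```

This is nowhere near a DR capability, but a plain-text copy like this is enough to look items up while the service is down, and it gives you something to import elsewhere if you do decide to move.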

For similar reasons, and again with restricted usage in mind, there is value in externalising service configuration, even if in a proprietary format. Ideally it is still in a readable textual format with a stable schema, backed by a suitable API. The current state-of-the-art end-game for this is to have all service config stored in the client’s own git repo, so that controlled change can be made in a coordinated manner: the essence of GitOps. More on this in another post.
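As a small step towards that end-game, externalised config is most useful when it is rendered deterministically, so that a git diff shows only real changes. The following is a minimal sketch, assuming the service config can be fetched as a dictionary by some API call not shown here. The `render_config` and `sync_config` helpers and the example settings are hypothetical.

```python
import json
from pathlib import Path


def render_config(config: dict) -> str:
    """Serialise config deterministically so identical settings give identical text."""
    return json.dumps(config, sort_keys=True, indent=2, ensure_ascii=False) + "\n"


def sync_config(config: dict, path) -> bool:
    """Write the rendered config to path; return True only if the file changed."""
    target = Path(path)
    rendered = render_config(config)
    if target.exists() and target.read_text(encoding="utf-8") == rendered:
        return False  # Nothing to commit.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(rendered, encoding="utf-8")
    return True


if __name__ == "__main__":
    # Hypothetical settings standing in for whatever the service's API returns.
    settings = {"sso": {"enabled": True}, "retention_days": 90}
    if sync_config(settings, "service-config/settings.json"):
        print("config changed; review and commit the diff")
    else:
        print("config unchanged")
```

Run on a schedule inside a working copy of your own repo, a script like this turns configuration drift into an ordinary diff that can be reviewed and committed like any other change.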