Alerts should be actionable, a.k.a. do not email on success!
Following the Unix philosophy: Do one thing, do it well and be quiet about it.
In software engineering, if you're writing a useful system and one day you decide it would be nice to email users every time their useful thing gets done, you're making a mistake.
Emails from software systems should be actionable. If a system sends an email to a user, the email should be helpful: it should give enough context about what the system is, where it's running, who owns or runs it, and what problem requires human attention. Ideally, the alert email should clearly specify the next steps and the dashboards that can be used to confirm that the problem is fixed.
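To make that checklist concrete, here is a minimal sketch in Python of an alert that cannot be built without its context. The `Alert` class, the `build_alert_email` function and all the field names are my own illustration, not part of any particular alerting library; only `email.message.EmailMessage` comes from the standard library.

```python
from dataclasses import dataclass, field
from email.message import EmailMessage


@dataclass
class Alert:
    """Hypothetical container for the context an actionable alert needs."""
    system: str    # what is sending the alert
    host: str      # where it is running
    owner: str     # who owns / runs it
    problem: str   # what needs human attention (single line)
    next_steps: list[str] = field(default_factory=list)
    dashboards: list[str] = field(default_factory=list)


def build_alert_email(alert: Alert, to_addr: str) -> EmailMessage:
    """Render the alert as an email; actually sending it is left to the caller."""
    msg = EmailMessage()
    msg["To"] = to_addr
    msg["Subject"] = f"[ACTION NEEDED] {alert.system} on {alert.host}: {alert.problem}"

    lines = [
        f"System: {alert.system}",
        f"Host: {alert.host}",
        f"Owner: {alert.owner}",
        f"Problem: {alert.problem}",
        "",
        "Next steps:",
    ]
    # Number the steps so the reader can follow them in order.
    lines += [f"  {i}. {step}" for i, step in enumerate(alert.next_steps, start=1)]
    lines += ["", "Dashboards to confirm the fix:"]
    lines += [f"  {url}" for url in alert.dashboards]

    msg.set_content("\n".join(lines))
    return msg
```

Making the context fields required means a half-empty alert fails at construction time instead of landing in someone's inbox.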
The worst offense a system can commit is sending success emails. This fails on two counts:
1. Success emails are not actionable - if I read a success email, I am informed and I promptly create a Gmail filter to never see another success email from the system again. The system made me do active work to ignore it.
2. Success emails are not trackable - if I want to see the ratio of successes to failures for the system, Gmail is a terrible way to do it. From first-hand experience, measuring alert volume over time in Gmail is a time sink. Please build a dashboard and make the world a better place. Your future self will thank you.
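Here is a rough sketch of the alternative, again in Python: count every outcome somewhere a dashboard can read it, and only bother a human on failure. The `JobReporter` class and its method names are hypothetical, and the in-memory `Counter` is a stand-in for whatever metrics system you actually use.

```python
from collections import Counter
from typing import Callable


class JobReporter:
    """Hypothetical reporter: successes are counted quietly, failures alert."""

    def __init__(self, send_alert: Callable[[str], None]) -> None:
        # Assumption: in a real system these counts would be exported to a
        # metrics backend that feeds a dashboard, not kept in memory.
        self.counts: Counter = Counter()
        self._send_alert = send_alert

    def record(self, succeeded: bool, detail: str = "") -> None:
        if succeeded:
            # Success: bump the counter and stay quiet. No email.
            self.counts["success"] += 1
            return
        # Failure: count it and notify a human with the details.
        self.counts["failure"] += 1
        self._send_alert(detail)

    def success_ratio(self) -> float:
        """Fraction of recorded runs that succeeded (0.0 if nothing recorded)."""
        total = self.counts["success"] + self.counts["failure"]
        return self.counts["success"] / total if total else 0.0
```

The success-to-failure ratio now comes from a single method call instead of an inbox search, and the only mail anyone receives is about runs that failed.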
The best alert emails are those that just tell me the commands to fix the problem. Oh happy me, I don't have to read a runbook, talk to people, or poke around dashboards to see what the problem is. Run a few commands and presto, it's fixed.
Take some time and make your life better: don't send success emails, and do send actionable emails on failure.
Cheers!
Divye