On Error Messages

Here’s a pet peeve of mine: Customers who don’t read the error messages. The usual symptom is a belief that there is just on error: “Doesn’t work”, and that all forms of “doesn’t work” are the same. So if you tried something, got an error, your changed something and you are still getting an error, nothing changed.

I hope everyone who reads this blog understand why this behavior makes any troubleshooting nearly impossible. So I won’t bother to explain why I find this so annoying and so self defeating. Instead, I’ll explain what can we, as developers, can do to improve the situation a bit. (OMG, did I just refer to myself as a developer? I do write code that is then used by customers, so I may as well take responsibility for it)

Here’s what I see as main reasons people don’t read error messages:

  1. Error message is so long that they don’t know where to start reading. Errors with multiple Java stack dumps are especially fun. Stack traces are useful only to people who look at the code, so while its important to get them (for support), in most cases your users don’t need to see all that very specific information.
  2. Many different errors lead to the same message. The error message simply doesn’t indicate what the error may be, because it can be one of many different things. I think Kerberos is the worst offender here, so many failures look identical. If this happens very often, you tune out the error message.
  3. The error is so technical and cryptic that it gives you no clue on where to start troubleshooting.  “Table not Found” is clear. “Call to localhost failed on local exception” is not.

I spend a lot of time explaining to my customers “When <app X> says <this> it means that <misconfiguration> happened and you should <solution>”.

To get users to read error messages, I think error messages should be:

  1. Short. Single line or less.
  2. Clear. As much as possible, explain what went wrong in terms your users should understand.
  3. Actionable. There should be one or two actions that the user should take to either resolve the issue or gather enough information to deduce what happened.

I think Oracle are doing a pretty good job of it. Every one of their errors has an ID number, a short description, an explanation and a proposed solution. See here for example: http://docs.oracle.com/cd/B28359_01/server.111/b28278/e2100.htm#ORA-02140

If we don’t make our errors short, clear and actionable – we shouldn’t be surprised when our users simply ignore them and then complain that our app is impossible to use (or worse – don’t complain, but also don’t use our app).