Logs and Other Instrumentation Tools
Posted: December 4, 2007
Attention All Application Developers:
One day, your application is going to be deployed in production.
We both know that even when that fateful day arrives, the application may still have plenty of bugs and perhaps even a performance bottleneck or two.
And sooner or later the operations team that owns the production system will notice the bugs and bottlenecks, and they will call you. They will expect you to diagnose the cause of the problem and, depending on the severity of the issue, either suggest a workaround, issue an emergency fix, or implement a solution in the next version.
They will also expect you to do all that without interfering with production work. This means that they will not allow you to upload new executables to production. They are not doing this to make your life miserable; they are doing this because they are getting paid to protect the production system from interruptions.
If you don’t want to look like an idiot at that point, you should prepare your application for live debugging.
A log with error messages that you can understand is a terrific start. It is even better when every log message carries a timestamp. If your application is multithreaded, you want to know which thread wrote which message to the log.
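To make that concrete, here is a minimal sketch using Python's standard `logging` module. The logger name `orders_app`, the `process_order` function, and the `worker-N` thread names are all made up for illustration; the point is only that the format string puts a timestamp and the thread name on every line.

```python
import io
import logging
import threading


def build_logger(stream):
    """Return a logger whose lines carry a timestamp, thread name, and level."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(
        "%(asctime)s [%(threadName)s] %(levelname)s %(message)s"))
    logger = logging.getLogger("orders_app")  # hypothetical application name
    logger.handlers.clear()
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def process_order(logger, order_id):
    """Hypothetical unit of work that logs success or an understandable error."""
    try:
        if order_id < 0:
            raise ValueError(f"order id must be non-negative, got {order_id}")
        logger.info("processed order %d", order_id)
    except ValueError:
        # logger.exception records the message at ERROR level plus the traceback.
        logger.exception("failed to process order %d", order_id)


# An in-memory stream keeps the sketch self-contained; a real application
# would more likely log to a file or to stderr.
buffer = io.StringIO()
log = build_logger(buffer)

workers = [
    threading.Thread(target=process_order, args=(log, order_id), name=f"worker-{i}")
    for i, order_id in enumerate([7, -1])
]
for w in workers:
    w.start()
for w in workers:
    w.join()

log_lines = buffer.getvalue().splitlines()
print("\n".join(log_lines))
```

Each line starts with the time it was written and names the thread that wrote it, so when operations sends you a log excerpt you can tell what happened, when, and where.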
A switch that causes the application to start writing incredibly detailed messages to the log, usually known as a “debug” or “trace” switch, will be highly useful and well worth the two hours it may take you to write one. Make sure your application polls this switch while it is running – the operations people will be unhappy if you tell them you need to restart production to generate a trace.
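One simple way to build such a switch, sketched here under my own assumptions, is a flag file: while the file exists the application logs at debug level, and when it is removed the application goes back to normal. The file name `debug.on`, the logger name `switch_demo`, and the function `apply_debug_switch` are hypothetical. In a real application you would call the function periodically, for example once per main-loop iteration or from a timer thread.

```python
import logging
import os
import tempfile


def apply_debug_switch(logger, switch_path):
    """Set the logger to DEBUG while the switch file exists, INFO otherwise.

    Meant to be called repeatedly while the application runs, so the log
    level can change without a restart.
    """
    level = logging.DEBUG if os.path.exists(switch_path) else logging.INFO
    if logger.level != level:
        logger.setLevel(level)
    return level


logger = logging.getLogger("switch_demo")  # hypothetical logger name

# Simulate three polls: before the switch is set, while it is set, and after
# it is cleared. A temporary directory stands in for a real location.
with tempfile.TemporaryDirectory() as tmp:
    switch = os.path.join(tmp, "debug.on")  # hypothetical flag file name

    before = apply_debug_switch(logger, switch)

    open(switch, "w").close()               # operations "flips the switch"
    during = apply_debug_switch(logger, switch)

    os.remove(switch)                       # and turns it back off
    after = apply_debug_switch(logger, switch)

print(logging.getLevelName(before), logging.getLevelName(during), logging.getLevelName(after))
```

A flag file is only one option. A signal handler, a row in a configuration table, or an admin command would work just as well, as long as the running process notices the change on its own.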
After this, the sky is the limit – writing key performance events to a table, allowing a client to attach and inspect the application's memory while it is running, secret APIs for turning on even more detailed debugging messages – I’ve seen them all, and they are all very useful.
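As an illustration of the first idea, writing performance events to a table, here is a rough sketch using an in-memory SQLite database. The table name `perf_events`, its columns, the `timed_event` helper, and the event name `load_report` are all my own inventions; a production system would write to whatever database it already has.

```python
import sqlite3
import time
from contextlib import contextmanager

# In-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE perf_events (name TEXT, started_at REAL, duration_ms REAL)"
)


@contextmanager
def timed_event(conn, name):
    """Time the enclosed block and record one row in perf_events."""
    started_at = time.time()          # wall-clock start, for correlating with logs
    t0 = time.perf_counter()          # monotonic clock, for the duration itself
    try:
        yield
    finally:
        duration_ms = (time.perf_counter() - t0) * 1000.0
        conn.execute(
            "INSERT INTO perf_events (name, started_at, duration_ms) VALUES (?, ?, ?)",
            (name, started_at, duration_ms),
        )
        conn.commit()


# Hypothetical key operation; the sleep stands in for real work.
with timed_event(conn, "load_report"):
    time.sleep(0.01)

rows = conn.execute("SELECT name, duration_ms FROM perf_events").fetchall()
print(rows)
```

With events in a table, anyone with query access can ask which operations got slower and when, without touching the running application.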