Environment Variables in Grid Control User Defined Metrics

This post originally appeared at the Pythian blog.

Emerson wrote: “A foolish consistency is the hobgoblin of little minds.” I love this quote, because it allows me to announce a presentation titled “7 Sins of Concurrency” and then show up with only 5. There are places where consistency is indeed foolish, and other times when I wish for more of it.

Here is a nice story that illustrates both types of consistency, or the lack of it.

This customer had Grid Control installed in their environment. We were asked to configure all kinds of metrics and monitors for several databases, and we decided to use Grid Control for this. One of the things we decided to monitor was the success of the backup jobs.

Like many others, this customer runs their backup jobs from cron, and the cron job generates an RMAN logfile. I thought that a monitor that checks the logfile for RMAN- and ORA- errors would be just the thing we needed.

To be consistent, I could have moved the backup jobs to run from Grid Control scheduler instead of cron. But in my opinion, this would have been foolish consistency – why risk breaking perfectly good backups? Why divert my attention from the monitoring project to take on side improvements?

To have Grid Control check the log files, I decided to use an OS UDM: that's a “User Defined Metric” that is defined on “host” targets and allows you to run a script on the server. I wrote a very small shell script that finds the latest log, greps for errors and counts them. The script returns the error count to Grid Control. More than 0 errors is a critical status for the monitor. I followed the instructions in the documentation to the letter, and indeed everything worked easily. Hurray!
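For illustration, here is a minimal sketch of what such a script might look like. It is not the exact script from this project: the LOG_DIR variable, the rman_*.log naming pattern and the function name are all made up for the example. The em_result= prefix is how I recall OS UDM scripts reporting a value back to Grid Control; treat it as an assumption and check the documentation for your release.

```shell
#!/bin/bash
# Sketch: count RMAN-/ORA- errors in the newest backup log.
# LOG_DIR and the rman_*.log naming pattern are assumptions for illustration.

count_backup_errors() {
    local log_dir="$1"
    local latest
    # Newest file matching the (hypothetical) naming pattern
    latest=$(ls -t "$log_dir"/rman_*.log 2>/dev/null | head -n 1)
    if [ -z "$latest" ]; then
        # No log at all: report 1 so that a missing backup is noticed too
        echo 1
        return
    fi
    # grep -c prints the number of matching lines (0 when there are none)
    grep -c -E 'RMAN-[0-9]+|ORA-[0-9]+' "$latest"
}

# Assumed output convention for OS UDM scripts: em_result=<value>
echo "em_result=$(count_backup_errors "${LOG_DIR:-/tmp}")"
```

Reporting 1 when no log is found is a design choice of the sketch: a backup that never ran should not look like a clean backup.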

Wait. There’s a catch (and a reason for this blog post). I actually had two instances that are backed up, and therefore two logs to check. I wanted to use the same script and just change the ORACLE_SID in the environment.

No worries. The UI has a field called “Environment” and the documentation says: “Enter any environmental variable(s) required to run the user-defined script.”

One could imagine, based on the field name and the documentation, that if I type: “ORACLE_SID=mysid” in this field, and later run “echo $ORACLE_SID” in my script, the result would be “mysid”.

Wrong. What does happen is that $ORACLE_SID is empty. $1, on the other hand, is “{ORACLE_SID=mysid}”.

To get the environment variable I wanted, I had to do: tmp=(`echo $1 | tr '{}' '  '`); eval $tmp
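The one-liner works, but eval will run whatever arrives in $1. A slightly more defensive variant, sketched here as an alternative rather than what I actually deployed, strips the braces with parameter expansion and exports the variable without eval. The function name import_udm_env is hypothetical, and the sketch assumes the argument always has the single-pair “{NAME=value}” shape described above; I have not tested what Grid Control passes when several variables are entered.

```shell
#!/bin/bash
# Sketch: turn a "{NAME=value}" argument into an exported variable, without eval.
# Assumes a single NAME=value pair wrapped in braces, as observed in $1.

import_udm_env() {
    local arg="$1"
    arg="${arg#\{}"              # drop the leading brace
    arg="${arg%\}}"              # drop the trailing brace
    # Require an '=' so that name and value can be told apart
    case "$arg" in
        *=*) ;;
        *) return 1 ;;
    esac
    local name="${arg%%=*}"      # text before the first '='
    local value="${arg#*=}"      # text after the first '='
    # Only accept something that looks like a shell variable name
    case "$name" in
        ''|[0-9]*|*[!A-Za-z0-9_]*) return 1 ;;
    esac
    export "$name=$value"
}

if import_udm_env "${1:-}"; then
    echo "ORACLE_SID is $ORACLE_SID"
fi
```

Called with “{ORACLE_SID=mysid}” as the first argument, the function leaves ORACLE_SID set to mysid for the rest of the script; anything that does not look like NAME=value is rejected instead of being executed.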

It took me a while to figure this out as this behavior is not documented and I found no usage examples that refer to this issue.

Consistency between your product, the UI and the documentation is not foolish consistency. I expect the documentation and field descriptions to help me do my job, and I’m annoyed when they don’t.

At least now this behavior is documented somewhere, so future googlers may have an easier time.

