I'm just a simple DBA on a complex production system

Writing about all things production. Especially Oracle databases.

Kosher Visualization September 23, 2009

Filed under: Visualization — prodlife @ 12:04 am

I’m working on my visualization presentation (OOW unconference, October 12, 4PM – don’t miss it!), and one of the topics I keep rethinking is how to present results of research in a visual way. Especially when the report or presentation is for non-technical management.

It is perfectly easy to take true data and arrange it on a chart in a way that “proves” whatever it is you want to show. But is it wrong? Is there a one true way to display data and everything else is a lie and a distortion?

Let’s look at a handy example: http://www.cunningham.me.uk/wordpress/2007/07/11/how-to-lie-with-statistics-as-shown-by-the-bbc/

In the example (look at the graphs with the red lines), scientists measured a temperature increase of 0.5 degrees over a period of 30 years. The blogger thinks that the first chart lies – because they make a tiny change look scary by changing the scale of the chart. He then shows the “correct” chart where the scale is changed to the point that the temperature increase is barely noticeable.

But is it really that straightforward? Another point of view is that a temperature change of 0.5 degrees over 30 years is a huge deal and is indeed scary, and that the graph was scaled to make the correct scientific view more visible. By rescaling the graph you are actually obscuring an important truth and misleading the audience.

What is the truth? I’m just a simple DBA, I’ve no idea about global warming.

But when I do research about a performance issue and then I write a report about the results of my research, and I use charts to demonstrate the important points in my results – I find it legitimate to scale the graphs in a way that makes the important points as clear as possible. If my graphs don’t demonstrate my points in the clearest way possible then I’m doing a bad job.

However, to keep myself ethical, I follow a few rules about these modifications:

  1. You are 100% sure, to the best of your knowledge and research, that the point you are making is indeed correct. You are not allowed to hide data just because you did not do a very good job at collecting or analyzing it.
  2. You mention the modification in the report or presentation. You make the original data available to anyone who wants to verify your results.
  3. You have very good reasons for the modifications you did and you feel comfortable presenting them to anyone who questions your charts.
  4. You will be extra careful when rescaling data that is displayed as two-dimensional shapes, and make sure that the proportions between the rescaled areas indeed reflect the proportions of the data. In 2D, scaling both dimensions by the same factor multiplies the area by the square of that factor, so small changes are exaggerated.

You’ll notice that my advice is somewhat subjective – that’s because I don’t really see an objective way to differentiate between “highlighting an important truth” and “making mountains out of molehills”. You did the research, you know if a 0.002ms increase in storage network round-trip time is a big deal or not, and you should decide how to display it. Obviously, if you manage to find a clear and unquestionable way to display your results, so much the better.
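The arithmetic behind rescaling is easy to check for yourself. Here is a minimal Python sketch, not a charting recipe: the first function shows how much of a chart's height the same 0.5 degree change occupies under two different y-axis ranges, and the second computes the side of a square whose area (not side) follows the data. The function names and the axis limits are my own assumptions for illustration; they are not taken from the BBC chart.

```python
import math


def visible_fraction(change, axis_min, axis_max):
    """Share of the chart's height that a change of `change` units occupies."""
    span = axis_max - axis_min
    if span <= 0:
        raise ValueError("axis_max must be greater than axis_min")
    return change / span


def side_for_area(base_side, value_ratio):
    """Side of a square whose AREA is value_ratio times the base square's area."""
    if value_ratio <= 0:
        raise ValueError("value_ratio must be positive")
    return base_side * math.sqrt(value_ratio)


if __name__ == "__main__":
    # Hypothetical axis limits, chosen only to show the effect.
    print(visible_fraction(0.5, 14.0, 15.0))  # tight axis: change fills half the chart
    print(visible_fraction(0.5, 0.0, 20.0))   # axis from zero: change is a sliver

    # The data doubled. Doubling the side of the square quadruples its area;
    # scaling the side by sqrt(2) keeps the area proportional to the data.
    naive_side = 10.0 * 2
    honest_side = side_for_area(10.0, 2.0)
    print((naive_side / 10.0) ** 2, (honest_side / 10.0) ** 2)
```

Neither axis range is "the right one"; the numbers only show how large the effect of the choice is, which is why the choice should be mentioned in the report.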

 

Good Stuff September 16, 2009

Filed under: links — prodlife @ 11:14 pm

Oracle Open World! I’ll be there, and so will lots of other cool people. Don’t miss the blogger meetup where we’ll all hang out :) Don’t miss the unconference. The line-up is better than what I see at most events – Greg Rahn, Cary Millsap, Kevin Closson, Rob van Wijk, Alex Gorbatchev, Richard Foote and I will all be there.

Dr. Neil Gunther! One of the top performance specialists. His blog is not easy to read, and is not strictly Oracle related, but I’m always glad I take the time to read it because I learn so much. It’s also quite entertaining (for load testing nerds). For example: “Without knowing any details, I can see is that the test rig was driven into saturation, starting with the first concurrent request! Therefore, the first data points provide all the comparison information. The other measurements are redundant (log axis or no). So, what’s the point of the plot?”. Oh, and he also has good twittings!

Exadata v2! A DB server so fast the only way to describe it is ridiculous! There’s still not a lot of technical information out there about it, but the FAQ is a good start.

Advanced Oracle Troubleshooting Seminar at NoCoug! Unbelievable, but two months before the event 50% of the seats are already taken. If you are interested, you should probably hurry up. Early bird registration ends in a week. Don’t say I didn’t warn you.

Shell tricks! Don’t know about you, but I still do my scripting with BASH. Jared Still posted some useful tricks.

Please post more cool stuff in the comments. Suggestions for books I should read on my daily 3-hour train commute to OpenWorld would also be nice.

 

Two Things Everyone Should Know About Queues September 16, 2009

Filed under: musing,performance — prodlife @ 10:38 pm

If you are in the performance business, you should know a lot about queues. How to use them to find performance problems, predict issues, plan your capacity, model your load test results, etc. Queues are just a part of what you should know and be comfortable discussing.

But what if you are not a performance professional? What if you are a sales person or a manager or a dentist? Do you still need to understand queues?

Obviously not everyone should know queues at a precise mathematical level. But queues are everywhere, and sometimes I wish people around me understood queues better. It’ll make it easier for me to explain things. There are two things I think everyone should know about queues:

  1. If it takes me one hour on average to handle a request, and I get one request every hour – most of the time requests will be delayed due to queueing and backlog. Running your DBAs (or servers, or doctors, or toll-booths) at full utilization with every minute accounted for means queueing and delays.
  2. If there are multiple servers (or DBAs or DMV clerks), the most efficient way to get service is to arrange all the requests in a single queue and have all servers accept requests from that queue. The way supermarkets do it, with a different queue per cashier, is inefficient. Deciding that you want all your requests to be handled by a specific DBA because she is better looking is also less efficient than entering the request in the general DBA queue.
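If you would rather see both points than take my word for them, here is a small Python simulation. It is a sketch under stated assumptions, not a capacity planning tool: requests arrive at random (exponential gaps), work takes a random (exponential) amount of time averaging one hour, and with separate queues each request joins a line at random and stays there. The function name `mean_wait` and all the numbers are made up for illustration.

```python
import random


def mean_wait(n_requests, arrival_rate, service_rate, n_servers=1,
              shared_queue=True, seed=42):
    """Simulate FIFO service and return the average time a request waits
    before someone starts working on it (in the time unit of the rates)."""
    arrivals = random.Random(seed)      # arrival gaps and amounts of work
    picker = random.Random(seed + 1)    # which line a request joins
    free_at = [0.0] * n_servers         # when each server next becomes idle
    now = 0.0
    total_wait = 0.0
    for _ in range(n_requests):
        now += arrivals.expovariate(arrival_rate)
        work = arrivals.expovariate(service_rate)
        if shared_queue:
            # One common line: the next request goes to whichever
            # server frees up first.
            server = min(range(n_servers), key=free_at.__getitem__)
        else:
            # One line per server: pick a line at random and stay in it.
            server = picker.randrange(n_servers)
        start = max(now, free_at[server])
        total_wait += start - now
        free_at[server] = start + work
    return total_wait / n_requests


if __name__ == "__main__":
    # Thing 1: one DBA, one hour of work per request on average.
    for utilization in (0.5, 0.8, 0.95):
        wait = mean_wait(200_000, arrival_rate=utilization, service_rate=1.0)
        print(f"utilization {utilization:.0%}: mean wait {wait:.1f} hours")

    # Thing 2: two DBAs at 90% utilization, one line versus two lines.
    one_line = mean_wait(200_000, 1.8, 1.0, n_servers=2, shared_queue=True)
    two_lines = mean_wait(200_000, 1.8, 1.0, n_servers=2, shared_queue=False)
    print(f"single queue: {one_line:.1f} hours, separate queues: {two_lines:.1f} hours")
```

Real supermarket shoppers pick the shortest line instead of a random one, which narrows the gap, but a request pinned to one specific DBA behaves much like the random-line case.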

Spread the word :)

 

I Can Has Training Budget September 11, 2009

Filed under: tips,training — prodlife @ 10:37 pm

We know how it goes – there is a recession, and companies try to reduce expenses. The next thing you know, your training budget is all gone. Or maybe there is some training budget left, but now 6 DBAs share a sum that is not enough for one Oracle University course. How do you convince your managers that paying for your training is the best investment they can make?

Start by convincing yourself. Remember that your manager probably got to his position because he is good at reading people, so if you don’t really want the training, or don’t really believe you need this training, he may see that and you’ve lost. You have to be 100% sure that you want this training because it will really allow you to improve the way you work.

As an example, let’s assume you want to go to a Linux Administration course. It’s an interesting case, because it is not even evident that a DBA should go to such a course.

Then think about your boss for a bit – what parts of the job are most important to him? What are his pet projects? His pet peeves?

Once you have your desire for the course and your boss’s desires in mind, make a list of all the benefits you can see from going to the course. The important thing is to highlight how the things you want to learn will help with the projects that are most important to your boss, or will address his specific pain points.

So, if your boss loves automation, say: “I will learn more Linux shell tools so I’ll be able to write better automation scripts”.
If he is a capacity planning person, say: “I will be able to better monitor the OS so we can be more proactive about provisioning”.
If he is a big fan of RAC, say: “With my improved Linux knowledge, I’ll be able to understand low-level clusterware issues and solve them faster!”

Now you need to decide if you make your pitch face to face or by email. I prefer email. Information I put in the email:
* Course title and instructor (or school name)
* Dates/Times
* Location
* Price
* The list of 3-5 reasons I need this course (as you prepared in the previous paragraph).

Until he makes his decision, keep mentioning once or twice a day how the things you do now will be much better after you take the course: “I still don’t understand how to debug coredumps after the process crashes, but the Linux course may help”, “It takes me 2 hours to copy old files to the second disk, but I’ll probably learn how to do it faster in the Linux course”. Don’t force it, but keep an eye open for opportunities to explain and demonstrate the value you see in the course.

And a questionable tactic that sometimes works: Get an ultra-expensive course rejected before asking for a reasonably-priced course. “I can totally understand you don’t have the budget to send me to Collaborate in Denver, but what about a one-day training given by our local usergroup at a nearby location?”. I’m not sure if this tactic works because the manager feels guilty about rejecting my request, or if the lower-price seminar just looks better in comparison. I’m not even sure if I recommend it, really. Consider and act at your own risk ;)

 

OOW09 – Tradition Edition September 9, 2009

Filed under: openworld09 — prodlife @ 10:35 pm

This year will be the third time I’m attending Oracle Open World. When you do something every year for 3 years in a row, you develop a few traditions around it.

Even though I know I always have an amazing time there, I’m always worried beforehand. I remember the commute, and the fatigue and the boring marketing content. Somehow the memories of great discussions in the OTN lounge with amazing people are less vivid. So being anxious beforehand is definitely a tradition.

Some traditions do not continue – this year there seems to be no blogger meeting. I guess I’ll need to be a bit more proactive about meeting my online colleagues. Like, email everyone to check if they will attend OOW and ask if they want to date me :) You can also leave a comment here if you want to hang out together.

A tradition I hope not to continue is over-scheduling sessions. I looked for presenters I know, especially those I enjoyed in previous years. Some Streams and RAC 11gR2 sessions, to make sure I keep on top of my favorite technologies. I made it a habit to attend the “Current Trends in Real World Performance” session – it is consistently the most enlightening session in OpenWorld. I’ll probably rewrite my schedule a few times before the conference, and a few times a day during the conference. That’s traditional too.

I’m excited to continue the tradition (started last year) of giving an Unconference session at OpenWorld. Last year was my first ever Oracle presentation – I gave a live demo of Streams configuration and troubleshooting. It was wonderful. This year I feel like a veteran presenter – I gave 4 presentations at conferences in the last year. I am going to talk about graphical methods (under the sexy name – visualization). To be honest, I still don’t know what exactly I’ll talk about. I have lots of ideas – using charts to explore the data and solve problems, using charts to prove a point in reports and presentations, how not to lie or confuse when charting data. I plan for lots of examples. I’m looking forward to cooking all these ingredients into one delicious presentation.

I’m presenting on Monday, 4pm – looking forward to seeing you all there, because meeting amazing people is my favorite OOW tradition.

 

Real Life Block Corruption (Maybe) September 4, 2009

Filed under: musing — prodlife @ 1:28 am

What’s the worst thing that can happen to a database? I think most DBAs will agree that block corruption is a good candidate for the list. When DBAs debate the soundness of their backup policy, corrupted blocks are often used as test cases and rhetorical devices: “Keep just 3 days of backup? But what if a block is corrupted on Saturday and we don’t find out until Monday?”.

Until this week, I only knew about block corruptions from my certification studies and from recovery practices (using dd to corrupt blocks is a common gambit).

We had a block corruption this week. At least, we think we did – neither we nor Oracle support are 100% certain. It was nothing like what the textbooks described.

On Saturday, our DB crashed. The error in the alert log indicated a corrupted block. We restarted the DB, and… did nothing. My manager sent me an email asking me to open a ticket with Oracle about this. I saw the email on Monday, failed to realize the importance of the problem (I suck!) and proceeded to work on other tasks.

On Tuesday the DB crashed again. This time it also sprouted endless ORA-600 [2662] error messages once it started. We gave it another restart, and this time it started fine. I did open the ticket with Oracle. Priority 1. We ran a bunch of verifications – RMAN validation, DBV, analyzing a bunch of tables and indexes.

RMAN and DBV did not detect any issues. Full export completed successfully. No one is actually certain this is a block corruption. The only strangeness was an index that appeared in DBA_INDEXES but did not exist when we tried to run analyze. We asked our sys admins to check the machine, the OS and the connected storage.

On Wednesday the server crashed again. Again a corrupt block. Different file this time. Oracle support found that one of the millions of ORA-600 and ORA-7445 errors we’ve seen could be related to a SQL parsing bug and suggested a patch.

We’ve had it. In an emergency 10-hour maintenance, we used export/import to move all the schemas to a new DB server.

We hope this is the end of the problem, but we can’t really tell. Which is exactly how real DBA life is so different from textbook descriptions and recovery practices.

 

 