Lessons From OOW09 #2 – Consolidation Tips
Posted: October 19, 2009
The session was called “All in One” and it was given by Husnu Sensoy, a young and very accomplished DBA from Turkey. I chatted with him during the ACE dinner and it turns out we have many colleagues in common. This was probably the most useful presentation I’ve heard this OpenWorld. As I am going into a large consolidation project for next year, I am glad I can learn from the experience of someone like Husnu, who has already gone through this and is very willing to share the experience.
His presentation is shared on his blog, so I’ll just give the parts that I consider to be highlights. You will probably learn more by reading his slides. He had tons of good content and he talks very very fast, so I’m sure I missed a bunch of good stuff. Since his content is readily available, I’m mixing in a lot of my own thoughts here.
The problems he set out to solve:
- Too many DBs and too few DBAs.
- Some servers are doing almost nothing.
- Some servers have no HA.
Pick candidates for consolidation based on: license costs, utilization, data center location, dependencies, I/O characteristics, risk levels.
The driver for our consolidation was license costs – our new machines had two quad cores instead of one dual core, so license costs suddenly quadrupled and we were forced into cost-saving consolidations. We mostly used data center location and risk levels to decide on the plan. Our most OLTP-heavy system, the one that is most sensitive to slow-downs, will remain unconsolidated for now.
Prior to consolidation collect lots of system/performance metrics. They will help pick candidates, plan capacity, test and later troubleshoot.
Don’t forget to talk to DBAs and business reps when making the consolidation plans. They will have their own ideas, and this can be important input.
Additive linear models are recommended for capacity planning. He gave lots of guidelines on how to do this. Pages 26-36 in his PDF have the details. I could have sworn he recommended staying below 65% utilization when planning for CPU capacity, but I cannot see it in his slides. In any case – do this, because at any higher utilization the linear additive model is questionable.
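To make the idea concrete, here is a minimal sketch of an additive linear CPU model in Python. It is an illustration only and is not taken from Husnu's slides: the `Workload` fields, the `core_speed_ratio` adjustment and the sample numbers are all assumptions, and the 65% ceiling is the figure recalled above rather than a confirmed recommendation.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """One database that is a candidate for consolidation (hypothetical fields)."""
    name: str
    peak_cpu_util: float            # peak CPU utilization on the source server, 0..1
    source_cores: int               # cores in the source server
    core_speed_ratio: float = 1.0   # assumed speed of a source core relative to a target core


def consolidated_utilization(workloads, target_cores):
    """Additive model: sum each workload's demand in target-core units, divide by capacity."""
    demand = sum(
        w.peak_cpu_util * w.source_cores * w.core_speed_ratio for w in workloads
    )
    return demand / target_cores


def fits(workloads, target_cores, ceiling=0.65):
    """True if the summed demand stays at or below the planning ceiling."""
    return consolidated_utilization(workloads, target_cores) <= ceiling


# Made-up example workloads, not real measurements.
candidates = [
    Workload("crm", peak_cpu_util=0.40, source_cores=2),
    Workload("hr", peak_cpu_util=0.10, source_cores=2),
    Workload("dw", peak_cpu_util=0.60, source_cores=4),
]

if __name__ == "__main__":
    for cores in (4, 8):
        util = consolidated_utilization(candidates, cores)
        print(f"{cores} cores: {util:.1%} planned utilization, fits={fits(candidates, cores)}")
```

The model simply assumes the peaks add up, which is pessimistic if the databases peak at different times. That is one reason to treat the result as a first pass and confirm it with a benchmark.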
Also pay attention to the part about preferring larger servers and fewer RAC nodes, since RAC adds complexity. And to the part about every storage system delivering about 70-80% of spec. Actually, this is more true for the EMC system he used. Our NetApps seem to be up to spec.
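The 70-80% rule of thumb is easy to fold into an I/O capacity check. The sketch below is only an illustration under assumed numbers: the function names, the rated IOPS figure and the per-database demands are hypothetical, and the derating factor should come from your own measurements.

```python
def usable_iops(spec_iops, derate=0.7):
    """Throughput to plan for: the vendor's rated IOPS scaled by a derating factor."""
    if not 0 < derate <= 1:
        raise ValueError("derate must be in (0, 1]")
    return spec_iops * derate


def io_headroom(spec_iops, demands, derate=0.7):
    """Planned capacity minus the summed peak demand; negative means over-committed."""
    return usable_iops(spec_iops, derate) - sum(demands)


if __name__ == "__main__":
    # Made-up figures: an array rated at 10,000 IOPS and two databases' peak demand.
    for derate in (0.7, 0.8, 1.0):
        print(f"derate {derate}: headroom {io_headroom(10000, [3000, 2500], derate):.0f} IOPS")
```

Running the same demand list against both ends of the 70-80% range shows quickly whether a plan only works on paper at full spec.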
Not mixing sequential and random I/O (i.e. DW and OLTP) is a good idea. A lot of places can’t really do this because of the way their apps are designed.
Benchmarking the new system to test the capacity plan is a great idea. I’d love to see more concrete information on how to benchmark, maybe a whole other presentation on this. One of the things that worry me most about our consolidation plans is that I’m not sure how good our tests will be. Husnu recommended HammerOra, which I’ll check out.
Crash tests. We did those ages ago when we moved to RAC architecture and then again for NetApp clusters. It was lots of fun and maybe it’s time to do this again. Husnu advised asking support for a list of good test scenarios. I recommend taking your sysadmins, storage admins and net admins for a few beers and asking them for scenarios – they generally come up with very creative stuff.
Good tip: In 11g, MEMORY_TARGET does not work with huge pages. You are using huge pages, right?
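If you are not sure whether a Linux host has huge pages configured, the counters in `/proc/meminfo` will tell you. The following is a small sketch in Python that parses that text. The helper names are hypothetical and the sample input is made up for illustration; on a real host you would pass in the contents of `/proc/meminfo` instead.

```python
def parse_hugepages(meminfo_text):
    """Collect the HugePages_* and Hugepagesize entries from /proc/meminfo text."""
    info = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        # Matches HugePages_Total, HugePages_Free, Hugepagesize, etc.
        if sep and fields and key.strip().lower().startswith("hugepage"):
            info[key.strip()] = int(fields[0])
    return info


def hugepages_configured(info):
    """True if the kernel has a non-zero huge page pool."""
    return info.get("HugePages_Total", 0) > 0


# Made-up sample in the /proc/meminfo format, not output from a real server.
SAMPLE = """\
MemTotal:       33554432 kB
AnonHugePages:         0 kB
HugePages_Total:    4096
HugePages_Free:     1024
HugePages_Rsvd:        0
Hugepagesize:       2048 kB
"""

if __name__ == "__main__":
    pages = parse_hugepages(SAMPLE)
    print(pages)
    print("huge pages configured:", hugepages_configured(pages))
```

A pool that is configured but entirely free (`HugePages_Free` equal to `HugePages_Total`) while the instance is up is a hint that the SGA is not actually using it.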
Write the backup and recovery document as the first document on the new system.
Pages 76-86 have good advice on merging databases. We’ll be doing that too and I was glad to see that we came up with the same plans and same problems as Husnu.
His last advice is the best: “You never know how long this is going to take.” So true! Who could have known that we would be delayed for 3 months by an IT security group that popped up from nowhere with a requirement that we pass certain security audits that we’d never heard of. Life in a big organization can be full of surprises, so be prepared 🙂