Common mistakes in RAC installation
Posted: November 24, 2007
This was supposed to be my OpenWorld Unconference session, which I didn’t give partially due to shyness and partially because I preferred to spend my time listening and learning.
I’m probably the world’s expert on failed RAC installations. I started my career as a DBA by spending four days with a consultant failing to install RAC in our test environment. In the three years that have passed since that fateful week, I’ve probably failed at installing RAC over fifty times (I’ve succeeded quite a few times too), so I’m well qualified to tell everyone how to fail at installing RAC.
So, how do you completely screw up your RAC installation?
- Don’t use the installation guide. That’s a common mistake made by both beginners and experts. If you don’t follow your RAC installation guide closely, your RAC installation will fail. The installation is simply too complicated to do from memory or by hunches. That is the most important thing to remember. The rest of this post just lists common consequences of not following the installation guide. Also remember to match the version of the installation guide to the version of RAC you are actually installing, because some things change over time.
- Your nodes don’t see each other. Huge mistake. Your nodes should be able to connect to each other by name, IP and fully qualified domain name, over both the public IP and the interconnect IP. Verify with pings. Also make sure your host name is spelled the same everywhere – some parts of the installation are case sensitive.
- Don’t verify that all your RPMs are installed before beginning the installation. Unfortunately, this is a very easy mistake to make, because the RPM list in the installation guide is somewhat incomplete. There are Metalink articles that attempt to correct the mistakes, so look for them. Keep in mind that at least in 10g, the prerequisite check didn’t cover all the required RPMs, so if you mess up this step you will end up with a rather random error during the installation.
- Ask your network manager to configure the VIP in Linux before you install your clusterware. Don’t. Just ask him for an IP – Oracle has a VIPCA utility that will configure and manage the VIP for you. If Linux already controls the VIP, RAC installation will fail.
- Configure SSH incorrectly. SSH configuration is a somewhat tricky part. Remember that your nodes should be able to ssh to each other as user oracle without ssh asking for a password or printing anything extra. ssh remotenode date should print just the date.
- Different times for different nodes. All nodes should show exactly the same date and time.
- Bad permissions on shared storage. Verify that root on all nodes has write access to the voting disk.
That’s what I recall right now. I’m sure there are lots more.
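
A few of the checks in the list (names resolving, silent ssh, matching clocks, missing RPMs) are easy to script before you even start the installer. Here is a rough Python sketch of how that could look. It is not taken from any Oracle tool, and several things in it are my own assumptions: the function names, the "-priv" suffix for interconnect host names, and the idea of comparing clocks via `date +%s` over ssh. Adjust all of them to your own environment, and take the RPM list from your installation guide.

```python
import subprocess


def run(cmd, timeout=10):
    """Run a command and return (exit_code, stdout, stderr) without raising."""
    try:
        p = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
        return p.returncode, p.stdout, p.stderr
    except (OSError, subprocess.TimeoutExpired) as exc:
        return 1, "", str(exc)


def names_to_check(node, domain, interconnect_suffix="-priv"):
    """Names a node should answer to: short name, FQDN, interconnect name.

    The "-priv" suffix is only an assumed naming convention.
    """
    return [node, f"{node}.{domain}", f"{node}{interconnect_suffix}"]


def ping_ok(host, runner=run):
    """True if a single ping to host succeeds (Linux ping syntax)."""
    code, _, _ = runner(["ping", "-c", "1", host])
    return code == 0


def ssh_is_silent(host, runner=run):
    """True if 'ssh host date' prints exactly one line and nothing on stderr.

    BatchMode=yes makes ssh fail instead of prompting for a password.
    """
    code, out, err = runner(["ssh", "-o", "BatchMode=yes", host, "date"])
    return code == 0 and len(out.strip().splitlines()) == 1 and err.strip() == ""


def remote_epoch(host, runner=run):
    """Return the remote clock as epoch seconds, or None if ssh failed."""
    code, out, _ = runner(["ssh", "-o", "BatchMode=yes", host, "date", "+%s"])
    return int(out.strip()) if code == 0 and out.strip().isdigit() else None


def clock_skew(epochs):
    """Largest difference in seconds between any two nodes' clocks."""
    values = list(epochs.values())
    return max(values) - min(values)


def missing_rpms(packages, runner=run):
    """Return the packages that 'rpm -q' reports as not installed."""
    return [pkg for pkg in packages if runner(["rpm", "-q", pkg])[0] != 0]
```

You would run this as user oracle on each node, looping over the other nodes: ping every name from names_to_check, call ssh_is_silent for each host, collect remote_epoch for all nodes and look at clock_skew. The readings are taken one after another, so a second or two of skew may just be ssh latency. None of this replaces the installation guide, it only catches a few of the mistakes above earlier.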