I'm just a simple DBA on a complex production system

Writing about all things production. Especially Oracle databases.

Few notes on migrating from SQL Server to Oracle July 27, 2007

Filed under: Oracle,SqlServer — prodlife @ 12:36 am

Ibrahimdemir suggested I should describe how we migrated from SQL Server to Oracle. Here are few notes on the subject. Obviously, there are lots of details here – the move took anything from 2 years to 6 month, depends on where you start counting, so I’m only mentioning the highlights.

1) Motivation: The official reason for moving was that Oracle released version 10.2. It became clear that many companies are successfully deploying RAC systems and that it represents a real solution for database high availability. High availability was such an issue with SQL Server that management agreed to the project with surprising speed.
Less officially, we were five Oracle DBAs and we found ourselves managing a production environment that is based almost entirely on SQL Server. We hated that and wanted a way out. RAC was simply a bait for upper management.

2) The applications: Our largest application was supposed to support Oracle out of the box, but we needed to change JDBC drivers to one that supports RAC. During tests we found other errors related to Oracle being different than SQL Server (Lots of issues around case sensitivity, which completely slipped my mind when I wrote the previous post), but mostly the change on the application was rather painless.
Other applications needed to be changed. Anything from small modifications to the connection string, to complete and total rewrite – depending on how the application used different database features.
We had to convert around a dozen applications, and the most difficult part was hunting them all down. Just yesterday, five days before we shut down the SQL Servers, we found two more applications that were somehow forgotten.

3) Deciding to migrate: At first we didn’t plan to migrate at all. We planned to put new customers on Oracle and keep existing customers on SQL Server, with the understanding that at some point the SQL Server environment will die naturally.
We lived with both environments for about a year and around that time two interesting things happened – we discovered that a two node RAC can carry the same load as six SQL Server machines and give significantly better performance, and the data center started pressuring us to use less electricity. The decision was made – move everything to Oracle and shut down the SQL Servers.

4) How to migrate: By the time we decided to migrate, we already knew exactly how the applications should work on Oracle. The correct schema were already in place. The only question was how to move the data. Our development team took a very controversial approach (I’m still not convinced it was the right one), and they wrote two utilities for the move.

The first utility could dump an SQL Server schema to a bunch of CSV files, and then load the CSV files into a ready schema on Oracle side. The utility moved only data, not definitions (table structure, indexes, constraints, etc were not included), but it was OK because we had the right definitions on Oracle already.

The second utility could “listen” to new data coming in to the SQL Server, and copy that data to Oracle. We used sequence (and identity) numbers to get this done.

Using both utilities we could copy a customer over to Oracle, run both systems in parallel for few days until the customer was happy with Oracle and then cut off the data duplication and remove the SQL Server schema.

We moved over 200 customers this way. One by one over the course of 6 month.

5) Summary: We thought we got it all nailed and done. We were ready for the big shut down on Wednesday. However, this morning we discovered two applications that were forgotten. So, shutdown was delayed by two weeks (which are rather costly, because the data center really wants us to cut down on electricity), during this time we have to make the necessary changes in the application, test them and make the move. We hope that this time it will really be it.

 

5 ways SQL Server is different than Oracle July 25, 2007

Filed under: SqlServer — prodlife @ 1:53 am

Recently I had to move an application from SQL Server to Oracle. I knew the databases are different, but I had no idea how different. Here are few issues I ran into:

1. Dates. Thats the most obvious difference. SQL Server dates are cast into strings implicitly, developers don’t need to do anything. Date arithmetics is done differently, and date functions are completely different. If you do anything interesting with dates in your code, you are better off writing it from scratch for each DB. Trying to translate the different functions can get you into trouble.

2. Partitions. SQL Server 2005 supports partitions, but before that SQL Server developers were stuck with creating lots of identical tables and putting them in a view to simulate partitions. When you translate an application to Oracle, you probably want to use real partitions instead.

3. Views. In SQL Server, if a view has a set operation like “Union All”, you can still apply DDL commands (like update) to the view itself instead of the underlying tables. The tables will update automatically. Unfortunately, by version 10.2, Oracle still doesn’t support this functionality.

4. Functions. In SQL Server a stored procedure returns the result of the last query used in it. So basically procedures return a table, just like a query. It was quite useful. In Oracle you will need to use temporary tables for a similar functionality.

5. Wait events. SQL Server doesn’t allow you to trace event 10046. Actually it doesn’t even give the normal SQL Trace functionality that Oracle has. No TKProf either. It does have a profiler – but it is not nearly as detailed as what Oracle provides. After working in Oracle, you can no longer understand how anyone managed to do any performance tuning with SQL Server.

 

MySQL on Netapp July 21, 2007

Filed under: mysql,netapp,recovery — prodlife @ 12:23 am

One of the databases we need to support is a MySQL database, installed on a windows server. The DBA team collectively has about 3 days of experience with MySQL. The system is technically production, but it is used only by internal users. It is a rather large database (200G) and gets tons of traffic because the application does lots of automatic data collection and reports. In addition it is an ultra sensitive application, which will crash if we stop the database for more than a minute.

But the worse thing is that the developers can’t keep their hands off the database. So every day contains a new surprise – new version, new storage system, the data files move to a new directory, all our backup scripts were deleted, etc etc.

I placed the MySQL data files on a Netapp LUN (connected to windows via iScsi) and I wrote a very nice script that locked tables, flushed them to disk, took netapp snapshot and unlocked the tables. I was rather proud of this script because it was really a good use of Netapp and allowed us to take backup without upsetting our sensitive application.

Well, few weeks ago we noticed that snapshots stopped on the relevant volume. Farther investigation revealed that the backup scripts are gone. Well, I’m used to that and I immediately uploaded a new copy of the scripts.

Only today I noticed that at some point our developers changed MySQL storage from MyISAM to InnoDB. InnoDB tables are not affected by lock and flush commands, which means that we have no valid backups.

I looked into InnoDB backup possibilities a bit and there seem to be two solutions, both of them are not really compatible with Netapp snapshot capabilities, which mean that they will be really really slow and take up too much storage. I’m still checking our options with Netapp support, but we may need to resort to cold backups.

In the meantime my opinion of MySQL just got a bit more negative. It appears to give DBAs (or developers) two choices for storage:
1. MyISAM – which is quick and can be easily backed up, but doesn’t support transactions, ACID or point-in-time recovery, so it can barely be called a real DB.
2. InnoDB – which can be called a real DB, except that it is very difficult to back up.

I love Open Source, but I prefer my databases to be Oracle.

 

RAC tricks – rolling patch with a shared home July 19, 2007

Filed under: Oracle,rac — prodlife @ 1:40 am

We had to apply yet another opatch, but we are only allowed 4 hours of downtime per system per month and we already used our monthly budget, so we need a way to apply an opatch without any downtime.

On some of our systems, it is not an issue. They are RAC systems where each node has its own $ORACLE_HOME on its own server. We take one node down, apply the patch, start the node, stop the other node, apply the patch, start other node. Patch installed on both nodes, no downtime for our customers. Win-Win.

But what do we do about our other systems? The ones which share a single $ORACLE_HOME on a filer? Where we need to take both nodes down for applying the patch?

A co-worker came up with a brilliant idea:

Stop one node. Use the filer power to duplicate $ORACLE_HOME. Connect node to new home, just make the change in /etc/fstab, the database will never notice the difference.
Apply patch in new home. Start database in new home. Now stop the second node and connect it to the new home as well. Start the node in the new home. We have a patched DB with no downtime in a shared home system! We even have a built in rollback – connect one node after the other back to the old home, where we didn’t apply the patch. In my experience rollback of opatches don’t always work, so having a sure rollback plan is a great bonus.

We tested it today in a staging environment and it seems to work well. Now we just need to convince management that we should do it in production. It looks like a great solution, but in my experience management hates approving any plan that does not appear in Oracle manuals. For all their talk of innovation and “thinking outside the box” they are a very conservative bunch. I can understand the extreme risk aversion of IT management, but if you never do anything new, you can never improve, and thats also risky.

 

More on Data Pump July 18, 2007

Filed under: data pump,Oracle — prodlife @ 1:11 am

Few Data Pump limitations that I ran into and you probably want to know about:

1. If you are using Oracle 10.2, and you want to write your exports into a directory that is on an NFS mount, it must be a hard mount. In 10.1 soft mounts used to work, and the upgrade caused us a significant amount of headache.

You see, our exports directory sits on an old and slow netapp (R200), and our storage manager gives us terabytes of space there but without any promise of high availability. This is fine for exports, but you can’t afford to hard mount storage that doesn’t have high availability. Once it goes down, processes will hang with uninterruptable sleep and we will need to reboot the machine to get rid of them. So with 10.2 data pump we are stuck with either hard mounting the original directories and taking a risk, or negotiating with the storage manager for space on a high availability device. I consider using interruptible hard mount, which should give us the best of both worlds, but I didn’t test it yet. I hope Oracle plans to bring back the option of configuring data pump directories on soft mounts.

2. Importing statistics takes unreasonable amounts of time. We had to export and import one schema. Collecting statistics on this schema takes around 5 minutes. Importing the statistics using Data Pump took over 20 minutes. Oracle’s documentation fails to mention this little bit of information. I really hope this issue is fixed in 11g.

Work around: import with “exclude=statistics” flag and collect statistics later.

3. Data Pump doesn’t write errors to standard output. At one point we believed that Data Pump got completely stuck, until we checked the alert log and discovered that we ran out of space in the tablespace. Data Pump itself never mentioned this little problem.

4. Data pump can hang when another process collects statistics while Data Pump is trying to import statistics. Note that Oracle 10g has a daily job that collects statistics by default.

5. If you consider working with multiple threads when importing, you need to do your homework. Multiple Data Pump workers don’t always behave as you might expect. This article by Carol Palmer gives a good explanation of the issues involved and when using multiple workers may not have benefit and may even slow you down.

Don’t get me wrong, I love Data Pump. It is one of the best things that happened to my job this year, literally saving me hours of boring work. You just need to be aware of its few catches.

 

To upgrade or not to upgrade July 17, 2007

Filed under: 11g,Oracle — prodlife @ 1:07 am

So 11g was finally announced and it seems to have some very cool features. Database replay is the largest and most attractive new feature, maybe it will finally allow us to have an accurate staging environment. Direct NFS also sounds interesting, especially for the added instrumentation and monitor opportunities. Table Space encryption sounds good, especially if it will allow us to get rid of the scary encryption device we currently use. Aside from completely new features, Oracle promises improvements to features I already enjoy – Data Pump, Partitions, Automatic Memory Management, building indexes online and lots more.

I didn’t get to install a test instance of 11g yet, but I’m planning to get to it as I can’t wait to try it hands-on.

Contrary to what Oracle believes, we are not even thinking of installing it on anything that remotely resembles a production system until Oracle releases service pack 1 or maybe even 2. Oracle managed to teach the entire DBA community to avoid new versions like the plague and new features even more. I thought Oracle knew this, but now I’m not so sure. I opened an SR in Metalink regarding a bug I found on 10.2 and needed a fix (It was a rather horrible bug which completely prevented us from collecting statistics), and Oracle Support claimed that this bug will not be fixed in 10.2 and that we should upgrade to 11g which will be released this month. We insisted that we won’t upgrade to 11g in the next year and that we will need statistics sooner, and eventually they did release an opatch, but this was rather unpleasant experience.

While we don’t plan to upgrade to 11g soon, it does raise an interesting question about the systems that we never got around to upgrading from 9i to 10g. Do we upgrade them to 10g or wait with 9i until 11g becomes stable? I prefer to upgrade, because my work is easier when all of my systems are the same version (and you get addicted to features like regular expressions very quickly), but the customers know that 11g was released and are applying pressure toward waiting a bit and then giving them the latest.

If you are planning to install 11g on production any time soon, the following article by Dizwell will probably change your mind.  In any case, you probably want to read Arup Nanda’s 11g new features series.

 

Our new Netapp is cursed July 13, 2007

Filed under: maintenance,netapp — prodlife @ 11:45 pm

We had another spectacular failure at a routine maintenance.

Remember the last time a routine maintenance went very bad? Well, few days ago we had to move another machine to the new Netapp.

During the last month we moved dozens of machines to the new Netapp, so we had lots of practice and knew the drill: stop the database, unmount volumes, edit fstab with new filer, mount volumes, start database.

No problem. Database stopped, volumes umounted, fstab edited, volumes moun… wait. Mount backgrounding? No route to host?
Few seconds later we found out that we couldn’t reach any host. Call the network manager. Reconfigure interface, ifdown, ifup, rinse, repeat. Half an hour later it appears that we were going nowhere. The network manager is pretty sure there is a hardware issue with the network card on the machine. What a coincidence! Hardware failure right in the middle of a maintenance.

No problem though. A good DBA is always prepared. We took the standby machine, connected it to the new netapp, mount works!
Time to “startup” the database…
ORA-00202: controlfile: ‘/u01/oracle/oradata/PROD/control02.ctl’
ORA-27086: skgfglk: unable to lock file – already in use

What did I do to deserve it?

We shutdown the original machine, to make extra sure it is not locking anything. we unmount and mount, we offline the volume and online it again, we reboot the machine, and we even revert to a snapshot of the database taken just before the move. Nope, file is still locked. Even though no process is locking it, Oracle refuses to use the control file. When we changed Oracle configuration to use another control file, Oracle claims that the other file is also locked.

Finally, defeated, we connect the standby machine to the old Netapp, and finally the database opens.

Now I have to write a report about this and suggest ways to do the move better next time. What can I say? The same procedure worked successfully million times in the past. The network card was clearly very bad luck, but what about the lock? It could be said that if we had better expertise with NFS maybe we would have known how to solve it, but the experts at Netapp or Redhat don’t have any answers on what have happened either. Is it just our fate to suffer failures from time to time which we can’t explain or prevent?

 

Rounding dates July 10, 2007

Filed under: dates,Oracle,sql — prodlife @ 11:12 pm

We are shutting down most of the SQL Servers, moving our applications and customers to Oracle. This move has been going on for a while and is targeted to finish on the end of the month. One of the applications that will have to move is our SLA calculations. Thats an important and highly visible app – our executives are following the results, so we really don’t want to mess up with this one. The downside is that moving it means translating a bunch of code written in T-SQL for SQL Server to Oracle’s PL/SQL. To make things even better, the application involves a lot of date manipulations. SQL Server and Oracle have very different approaches to dates. So, this task has been tossed back and forth for a while now, because no one really wanted to get his hands dirty with this, until it firmly lended on my desk, with 2 days deadline.

So, one of the things that had to be translated was a query that groups the last day of data into 15 minute intervals. I spent few minutes trying to translate the code into Oracle in a operation by operation basis, before I gave up and decided to write it from scratch. Since my solution is both generic (will work on any minute interval, not just 15) and much cuter than the other solutions I found in various forums, I decided to post it here:

Round to previous 15 minute (i.e. 4:10 will become 4:00):
TRUNC(sysdate,'HH24') + (to_char(SYSDATE,'mi')-mod(TO_char(SYSDATE,'mi'),15))/(24*60)

Round to next 15 minutes (4:10 will become 4:15):
trunc(sysdate,'HH24') + (to_char(SYSDATE,'mi')+15-mod(TO_char(SYSDATE,'mi'),15))/(24*60)

 

Data Types and Cost Based Optimization July 10, 2007

Filed under: design,optimizer,Oracle — prodlife @ 2:39 am

Jeff Smith, my favorite SQL Server blogger, wrote a great post about the importance of data types in database design.

Jeff is absolutely right – developers have absolutely no excuse for not getting data type corret. If you let an Excel script kiddie design your database, you deserve to be screwed.

If you are running Oracle, instead of SQL Server, the issue of data types is even more important than you think. It is not just a matter of getting the sort wrong, taking more space than you should, or being more vulnerable to typos. It is a matter of Oracle getting the execution plan right and getting you the data in an efficient way.

Here’s a nice example:

A bug tracking application was designed to support multiple databases. Since different databases handle dates very differently, the developers took a shortcut and decided to use numeric field for the date, use something like ’20070301′ to signify that the bug was closed at March 1st, 2007, and write tons of application code to handle all kinds of trivial date operations. Note that they could have used Unix time, seconds since 1970, and avoid all the pain I describe here, but Unix time is less human readable. The application also supported generating reports. Number of bugs opened last week, two month ago, etc.

Suppose we have two years of bugs in our database. Thats 730 different possible closing dates. But if they are kept as numbers from 20060101 to 20071231, the optimizer will be convinced that there are 11130 different possible dates.

First scenario: We want to generate a report from January 1 to June 1 2007.
When we use real dates that means a quarter of our date range and the optimizer will correctly decide that it is faster to go do a table scan than use an index.
When you use numbers, 20070601-20070101 = 500. That is 4% of the data and indexes will be used to retrieve 25% of your data. Go grab a cup of coffee and maybe a nice book (“Cost Based Oracle Fundamentals” will be a good choice), it will be a while before you see the results of that report.

Second scenario: Suppose you want just one week of data. Say, March 1 to March 7. With both data types this is a very small percent of the data and an index will be used.
But what if you want to see the week from December 28 to January 3rd? If you use dates, this is still just 1% of the data and the same plan will be used with the same quick results. If you use the numbers, however, 20070103-20061228=8875. Thats around 50% of the data – this means a full table scan. You will see lots of managers wondering why the report became so slow “all of the sudden” and “just when we need the end of the year report very urgently”.

I hope I made my point here: Using correct data types is critical, more so in Oracle, and it is so easy you really have no reason to get it wrong.

 

How high can you go? July 6, 2007

Filed under: Linux — prodlife @ 10:59 pm

What’s the highest load average you’ve seen on a machine?

Our production databases are running on boxes with two dual core cpus, so we usually have load average of around 3. When someone is doing heavy processing on a machine, the load average goes up to 10 by the time one of the DBAs intervenes and stops the madness. When we had a huge problem tuning the memory for client processes and the machine was swapping like there is no tomorrow, we had a load average of 25 and that was the highest I’ve seen a machine go. Needless to say, at that moment response times were less than inspiring.

You can imagine my surprise when I got a phone call one Sunday morning from the application owner who said “Looks like one of our DBs has load average of 500. It could be a problem with the monitor, but can you take a look?”

Sure I can take a look, I’ve nothing better to do on Sunday mornings.

I took a look, and saw the following:
[oracle@productiondb]# uptime
11:15:58 up 162 days, 15:58, 2 users, load average: 546.06, 538.21, 522.55

Not a monitor bug then. But the machine seemed surprisingly responsive for the extravagant load average. I connected to the DB with no delays, ran few queries which returned within a reasonable time. Everything seemed fine except for the high load average numbers.

And the numbers were going up. 30 minutes later we were well over 600.

I tried running “top” – which promptly hanged; “ps” hangs too; “df” works and now I know it is not a disk space problem nor an NFS issue.

I peaked in the “/proc” directory and saw that many of the stuck processes are actually “ps” and “bash”. So the hanging ps is probably causing the high load average, not a result of the load average. We have 500 ps processes hanging there, which inflates the load average while letting the database function as usual. More and more ps processes are accumulating because we use ps in most of our automated monitors. Vmstat shows 50% iowait, but thats obviously related to the stuck processes because the DB shows no more than the usual io waits. So ps hangs while waiting for io? Interesting.

Why does ps hangs? Strace reveals the problem:
stat("/proc/538", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
open("/proc/538/stat", O_RDONLY)= 6
read(6,<unfinished ...>

So ps gets stuck on process 538. I wonder what’s this process is doing. Of course, I can’t use ps, so lets go straight into the proc directory. Except that cd /proc/538 got stuck too. No way to know what it is then. Can I kill it? kill -9 538 doesn’t hang, but doesn’t do anything either. Only a reboot will clear it.

Now is the interesting part: Should we reboot a perfectly working database just because the OS is doing something interesting? Time for management decision. I escalate to my manager, who escalates to his, who calls our system administrator and the application owner.

Meanwhile the system administrator reveals more interesting information: Few hours earlier the DB moved to a new Netapp cluster. Is this the cause of the strange behavior? How can a move to a new storage can corrupt the process memory blocks so throughly? Well, Redhat will need to answer this question.

3 hours later I get the decision – reboot the server. We took a netdump of the machine and restarted it. 5 minutes late the DB is up and running again with the normal load average and ps works.

We sent the netdump to Redhat. It is now almost a month since the incident and there is still no word on what happened to process 538 and whether we can blame the new Netapp.

 

 
Follow

Get every new post delivered to your Inbox.

Join 48 other followers