You Can’t DeDupe Oracle DB Files

One of our storage vendors has DeDupe technology. DeDupe is short for deduplication – the idea is that when you have identical blocks on the storage, it will only keep one copy of the block. This is a nice idea that saves on storage. It's especially good on shared filesystems where many users keep copies of identical files.

Our vendor loves DeDupe and managed to convince my storage manager that DeDupe will lead to amazing storage savings. Even on the DB volumes! No matter how much the DBAs protested that DBs rarely have full blocks that are identical to each other, the vendor kept insisting that many other customers have seen amazing storage savings using this technology on their data files. “Databases have many empty blocks,” the storage manager said after lunch with the vendor. “And they are all identical! Think how much space you can save by keeping just one empty block!”

We agreed to test DeDupe. As the DBAs expected, we saw about 2% space savings. Not exactly what the storage manager expected.

I wasn’t surprised. Even empty data blocks in Oracle DB files are not really identical. They have a header, which contains a relative address, which makes each empty block slightly different.
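To make the point concrete, here is a minimal sketch of what a block-level dedupe sees. It is only an illustration: the 8 KB block size and the 4-byte "address" header are assumptions I made up for the example, not the real Oracle block format, and `unique_block_count` is a hypothetical helper, not any vendor's algorithm.

```python
import hashlib

BLOCK_SIZE = 8192  # assumed block size, for illustration only


def unique_block_count(blocks):
    """Count distinct blocks by content hash, as a naive block-level dedupe would."""
    return len({hashlib.sha256(block).digest() for block in blocks})


def all_zero_blocks(n):
    """What the vendor imagines: n empty blocks that are byte-for-byte identical."""
    return [bytes(BLOCK_SIZE) for _ in range(n)]


def empty_blocks_with_address(n):
    """Empty blocks that each carry their own address in a small header.

    Hypothetical layout: a 4-byte block number followed by zeros.
    """
    return [i.to_bytes(4, "big") + bytes(BLOCK_SIZE - 4) for i in range(n)]


if __name__ == "__main__":
    n = 1000
    print("identical empty blocks kept:", unique_block_count(all_zero_blocks(n)))
    print("empty blocks with header kept:", unique_block_count(empty_blocks_with_address(n)))
```

With truly identical blocks the dedupe keeps a single copy. Once every block carries even a few bytes that are unique to it, all of the blocks hash differently and nothing can be collapsed.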

So, no DeDupe. Thought you may want to know, so you won’t have to repeat this experience. Maybe even send a link to your vendor 🙂

If your experience was different though, I’d love to know. The vendor insisted that he had many customers happily deduping their databases.

12 Comments on “You Can’t DeDupe Oracle DB Files”

  1. dombrooks says:

    In Oracle 11g, the new super lob functionality, Secure Files, has special deduplication powers… allegedly… I’ve not used it.

  2. joel garry says:

    Thank you, this made my day! I saw it coming by the second sentence. How many people has he duped :-O

    Personally, I’m a big fan of redundancy!

  3. Chris_c says:

    About the only place I can think of where de-dupe would work is a database using traditional filesystems (i.e. not ASM) with a lot of empty data blocks, but it's going to depend on the OS block size vs. the database block size. It would be pointless anyway, as eventually this space will be required, so all you are going to achieve is a temporary space saving at the cost of additional complexity and performance overhead.

  4. Noons says:

    I just looove the “many other customers using this” argument.
    Wish I had a dollar for every time I’ve heard an idiot claiming it.
    Quite frankly, any manager falling for that one fully deserves the obvious result.

    “Lots of empty blocks, all identical”, eh? That pesky block header just gets in the way every time…

    Good one, Chen.

  5. prodlife says:

    Noons,
    Keep in mind that I fell for that one 🙂

  6. Curtis Ruck says:

    Umm, dedup is great for tier 2-3 storage for backups, especially if you don’t have to deal with tape. We are “deduping” our daily RMAN leveled backups just perfectly on a dedup device and achieving about a 4x “compression” (mostly due to the archive logs going there along with tiered RMAN levels and a retention which includes 3 level zeros).

  7. TSMADMIN says:

    We recently installed dedup with our TSM backup software. The vendor (which will be nameless) swore we would see upwards of 10:1 dedup ratios. The only question they asked was “Do you do client-side compression?” We indicated we did not. Needless to say, most of our server environment is compressing files unknown to the storage admin. This adversely affected the dedup ratio, dropping it to dismal numbers on the lower side of 4:1. Moral of the story: stay away from dedup if you’re backing up TBs of data with TSM.

  8. vvangapally says:

    We have tried DataDomain product from EMC and we use FILESPERSET=1 when backing up the dbs using RMAN and we do get very good deduplication. The more backups we take the better it gets with deduplication. However, archivelogs do not get deduped, which is expected.

  9. ram says:

    I had a vendor yesterday promising the same. We have asked for a POC. Our databases are Sybase. Will keep you posted on the POC.

