Fetch as Much as You Can

In my “Everything DBAs Need to Know About TCP/IP” presentation, I discussed the issue of fetch size – tuning the number of rows your application will get from Oracle in a single “fetch” call.

The problem is interesting when you want to get a large number of rows from the DB to the application. Cary Millsap would say that no one should want to do that, but Cary obviously never had to work with statistical analysis application. These things demand large amounts of blood raw data to do their thing.

If this is too slow for your users, you should look at statistics like “round trips”. If the number of round-trips looks completely off the wall, like 200,000 round trips for 2M rows, someone might have left JDBC’s default fetch size of 10.

In my presentation, I mentioned that there is really no problem with setting a very high fetch size. I recommended setting it as high as possible while taking into account the amount of memory each row will consume.

Charles Hooper blogged last week with a new reason to use large fetch sizes: Turns out that it allows Oracle to pin the block for the length of the fetch. He references an OTN discussion where Jonathan Lewis says:

“…what it the array fetch size ? If the array fetch size is large enough the number of block visits would be similar to the number of physical block reads. If the arraysize is one then the number of block visits could be close to the number of rows fetched as a block has to be released at the end of each fetch and the reacquired for the next fetch.”

So a higher fetch size ends up with fewer consistent gets. Even if you are not tuning any specific issue that requires lowering the number of consistent gets, I hope you’ll agree that getting the same results with less work is a good thing.

One more reason to set fetch size to as many rows as your memory allows.


8 Comments on “Fetch as Much as You Can”

  1. Cary Millsap says:

    (Grin) Now, Chen, remember please that I didn’t say a human would never want to /fetch/ jillions of rows; I said that a human would never want to /see/ more than ten of them. Hence the need for good graphical visualization tools (e.g., Mathematica), or good aggregation tools (sum, min, max, average, variance, skewness, etc.), which I presume are all features available to you in your statistical analysis application.

    • prodlife says:

      Cary,
      Considering the data, sometimes two rows are too much for most humans 🙂

      • Cary Millsap says:

        I’m with you on that!

        One thing I ran into while writing the “Making Friends” paper is that when the array size setting causes the server to try to put more rows into a network packet than will fit, you end up inducing new ‘SQL*Net more data to client’ calls. I didn’t do any research into how much time those events burn compared to regular ‘SQL*Net message to client’ calls. I suspect the ‘more data’ calls are cheaper, but I haven’t tested and measured. Sounds like it’s something you’d be interested in finding out.

      • prodlife says:

        Cary,

        It is not the size of the network packet (MTU) that is causing the “more data to client”, its the size of Oracle’s SDU. Tanel Poder wrote about it a while ago and he recommends setting the SDU to its max size (32K). I’ve heard from several sources that this is good advice, but I’ve never seen the benefits of it myself.

  2. Øyvind Isene says:

    Increasing array size is a good tip. More important is to lobby some of these statisticans to move some of the processing back into the database. Fetching 9M rows for a rather simple statistical processing in SAS is not optimal (SAS scripts are not always that advanced, but I guess they love their software as much as I love mine). Oracle can compute simple aggregates like min, max, avg and some rather complex one too if you care to rewrite code. As a compromise I would see if a small code change in SAS (or whatever) allows one to pre-aggregate in the database before fetching it into the statistical software for further processing.

    • prodlife says:

      I’ll check that. We are doing some specific statistical tests that arrive as a package in SPLUS and will need to be hand coded in Oracle. So I can see why the statistics folks will be reluctant.

  3. Curtis Ruck says:

    Two things, one:

    I prefer to attempt implementing statistical functions in the database.

    Second, Upping fetch size has some negative side effects on garbage collection: http://blog.palantirtech.com/2007/02/23/oracles-jdbc-driver-garbage/

  4. prodlife says:

    Thanks for the link and warning, Curtis.

    This is good information and a great demonstration of the difference between theory and practice 🙂


Leave a comment