Small shortcut in a large report

Came back from Oracle OpenWorld to a flurry of business demands. End of Quarter is here and with it new demands for reports and data.

The main report that I’m writing is rather large, generating the data requires about 30 different queries on about 10 databases. Some run in few seconds while others, despite my best tuning efforts can take an hour or more.

So, I wrote a PL/SQL procedure that runs all the queries on all the DBs and collects the results into a single table that I’ll use in the report. That’s pretty cool, but suppose there is an error in query number 23? After I fix the error, I have to run the report again which takes hours. Quite annoying.

So, I created a small table:

CREATE TABLE PROGRESS
( "STAT_NAME" VARCHAR2(30),
"SCHEMA_NAME" VARCHAR2(100),
"TIME_DONE" DATE)

And now, before each query runs, I check in the progress table if it was already done or not. If it was already done, I don’t need to do it again. If it wasn’t, I run it and after it ran successfully, I add it to the table.

Here’s the procedure that takes care of this process:

create or replace
PROCEDURE CONDITIONAL_EXEC ( in_stat_name IN VARCHAR2,
in_schema_name IN VARCHAR2,sql_stmt in varchar2) AS

stat_status int;
BEGIN
select count(*) into stat_status from progress where schema_name=in_schema_name and stat_name=in_stat_name;
if stat_status=0 then
dbms_output.put_line ('About to run: ' || sql_stmt);
execute immediate sql_stmt;
insert into progress values (in_stat_name,in_schema_name,sysdate);
commit;
end if;
END CONDITIONAL_EXEC;

Advantages of this approach:

  1. I save time by not running the same stuff twice
  2. I can always “undo” something that already ran by deleting from that table
  3. I can watch the progress of my report in the table
  4. The table contains the finish time of each query. This means I have rough information on how long each query took to run and I can use this information to decide which queries need more work.
  5. I can also use the time information to show the business how much db time each statistic they request costs them.Maybe they’ll decide that some statistics are not that important? Especially those that are carefully crafted so we can’t use any existing aggregated data to produce them. Unlikely, but who knows?

Not bad for 5 minutes of work.

About these ads

9 Comments on “Small shortcut in a large report”

  1. Boneist says:

    Could you not add a status flag to your table, and have it update if the report failed or succeeded – then you don’t need to add rows manually. Plus you keep a record of the ones that failed.

  2. dhoogfr says:

    Could this not also be done by creating materialized views from the queries instead of using a pl/sql procedure?

    regards,

    Freek

  3. prodlife says:

    Boneist,

    You mean use the data table to also track progress?
    It is possible, but I think it is more elegant and easier to implement in a separate table.
    I know which queries failed by the output of the procedure.

  4. prodlife says:

    Boneist,

    You mean use the data table to also track progress?
    It is possible, but I think it is more elegant and easier to implement in a separate table.
    I know which queries failed by the output of the procedure.

  5. prodlife says:

    Freek,

    Yes, MV are a nice idea too :-)

  6. Boneist says:

    No, I meant add the flag to your progress table, to reduce the amount of manual interference.

    So, for example:

    Report 1 started at timeA, finished at timeB, successful
    Report 2 started at timeC, finished at timeD, failed
    Report 3 started at timeE, finished at timeF, successful

    Then you know automatically which one needs rerunning. You could also add a column for the error message, and a “date-of-report” column so that you can automate the calling procedure better.

    Eg. instead of having to rerun the reports manually, all you’d have to do is rerun the calling procedure, and it’ll automatically choose which reports need rerunning.

    That’d be what I’d do, anyway.

  7. prodlife says:

    Hi Boneist,

    Nice idea, and I could definitely modify my existing code+table to do this with almost no work.

    I didn’t feel the need to do this because If I see in my table:
    Report 1 started at time A
    Report 2 started at time B

    I’d immediately know that report 3 failed…

  8. Boneist says:

    *nods*

    I’d go for having as much information as I could about the report runs in the progress table, so that there’s some historical information. (How long each report took, was it faster or slower than yesterday, etc)

    Mind you, we run our reports concurrently, as we wouldn’t have enough time in the day to run consecutively!

    How would you know if report 2 had failed? (Does the info in the progress table only get written after the report has run?)

  9. prodlife says:

    Ah, I think I got you now.

    Yes, more information is always better. I have a feeling that this innocent little progress table may grow.

    Yes, I only write to the progress table after the report finished.

    BTW. I’m only doing this tracking for our end-of-year reports, which are annoyingly complicated, but which don’t run concurrently because we leave some CPU for every day activities. It usually takes few days to finish running all of them.

    Lucky for me, the daily reports are produced by the application that is also putting the data in there, so I don’t have much to do with them.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 3,113 other followers