Pivot – You are doing it wrong

Posted: December 2, 2008 | Author: prodlife | Filed under: rants, sql |13 Comments

Suppose you have a table like this:

drop table t_;

create table t_ (
  nm Varchar2(20),
  pr Char    ( 7),
  vl Number  
);

insert into t_ values ('company 1','2003-06', 10);
insert into t_ values ('company 1','2003-07', 29);
insert into t_ values ('company 1','2003-08', 39);
insert into t_ values ('company 1','2003-09', 41);
insert into t_ values ('company 1','2003-10', 22);

insert into t_ values ('company 2','2003-06', 13);
insert into t_ values ('company 2','2003-07', 17);
insert into t_ values ('company 2','2003-08', 61);
insert into t_ values ('company 2','2003-09', 55);
insert into t_ values ('company 2','2003-10', 71);

insert into t_ values ('company 3','2003-06', 33);
insert into t_ values ('company 3','2003-07', 18);
insert into t_ values ('company 3','2003-08', 27);
insert into t_ values ('company 3','2003-09',  5);
insert into t_ values ('company 3','2003-10', 32);

(Thanks to René Nyffenegger for the helpful create script)

And you want this output:

NM                          JUL        AUG        SEP
-------------------- ---------- ---------- ----------
company 2                    17         61         55
company 3                    18         27          5
company 1                    29         39         41

If you are using 10g or older versions, you can use Tom Kyte’s decode trick. In 11g, we have an official keyword to solve this problem.

What you should never, ever do, is this:

  select
    t1.nm,
    t1.vl jul,
    t2.vl aug,
    t3.vl sep
  from
    t_ t1,
    t_ t2,
    t_ t3
  where 1=1
  and t1.nm=t2.nm
  and t2.nm=t3.nm
  and t1.pr='2003-07'
  and t2.pr='2003-08'
  and t3.pr='2003-09'
  group by t1.nm, 
    t1.vl,
    t2.vl,
    t3.vl

And if you do this, and if t_ has more than 15 rows, and if you actually need more than 3 columns, don’t be surprised if the performance may be slightly disappointing.

13 Comments on “Pivot – You are doing it wrong”

Luiz Vaz says:

December 2, 2008 at 4:50 am

Can you tell what keyword you mean?
If it is the obvius PIVOT, can you draw how can be the best way using the same sample data?

I’m just guessing, thanks!

Reply
Laurent Schneider says:

December 2, 2008 at 9:37 am

Hi Chen,
I find the Tom Kyte utterly hard to read, to understand, to maintain and to tune.

“Never say never, never say always”, Tom always said

SQL> drop table lsc_t;
Table dropped. Elapsed: 00:00:00.67 SQL> SQL> create table lsc_t ( 2 nm Varchar2(20), 3 pr date, 4 vl Number, 5 primary key (nm,pr) 6 ); Table created. Elapsed: 00:00:00.03 SQL> SQL> insert into lsc_t 2 select nm,pr,trunc(dbms_random.value(1,100)) 3 from 4 (select add_months(date '1900-01-01',rownum) pr from all_objects) 5 cross join 6 (select 'customer '||rownum nm from all_objects where rownum SQL> commit; Commit complete. Elapsed: 00:00:00.01 SQL> SQL> SQL> select 2 t1.nm, 3 t1.vl jul, 4 t2.vl aug, 5 t3.vl sep 6 from 7 lsc_t t1, 8 lsc_t t2, 9 lsc_t t3 10 where 1=1 11 and t1.nm=t2.nm 12 and t2.nm=t3.nm 13 and t1.pr=date '2003-07-01' 14 and t2.pr=date '2003-08-01' 15 and t3.pr=date '2003-09-01' 16 group by t1.nm, 17 t1.vl, 18 t2.vl, 19 t3.vl; NM JUL AUG SEP -------------------- ---------- ---------- ---------- customer 1 39 25 29 customer 2 85 69 83 customer 3 53 70 59 Elapsed: 00:00:00.04 SQL> SQL> SQL> select nm, 2 max(decode( pr, date '2003-07-01', vl )) jul, 3 max(decode( pr, date '2003-08-01', vl )) aug, 4 max(decode( pr, date '2003-09-01', vl )) sep 5 from lsc_t 6 group by nm; NM JUL AUG SEP -------------------- ---------- ---------- ---------- customer 1 39 25 29 customer 2 85 69 83 customer 3 53 70 59
Elapsed: 00:00:00.15

What’s wrong with selecting the same table thrice? It is occasionaly faster and way easier to maintain

In a book with an amazing picture on page 302 I wrote about PIVOT, but nobody is using 11g around… Also I am not use it is much different from Tom syntax. And also, it is not equivalent

BTW, to tune up my query, simply remove the group by 🙂

SQL> select 2 t1.nm, 3 t1.vl jul, 4 t2.vl aug, 5 t3.vl sep 6 from 7 lsc_t t1, 8 lsc_t t2, 9 lsc_t t3 10 where 1=1 11 and t1.nm=t2.nm 12 and t2.nm=t3.nm 13 and t1.pr=date '2003-07-01' 14 and t2.pr=date '2003-08-01' 15 and t3.pr=date '2003-09-01';
NM JUL AUG SEP -------------------- ---------- ---------- ---------- customer 1 39 25 29 customer 2 85 69 83 customer 3 53 70 59

Reply
Laurent Schneider says:

December 2, 2008 at 9:38 am

(of course, wordpress sucks, my < just truncated my code…)

Reply
Laurent Schneider says:

December 2, 2008 at 10:02 am

PS: do not blame me for storing dates in date…

Reply
prodlife says:

December 3, 2008 at 12:19 am

Hi Luiz Vaz,

Indeed I was talking about PIVOT. Here’s how I use pivot on this example:

select * from
( SELECT nm, pr, vl FROM t_)
PIVOT
(max(vl) FOR pr IN (‘2003-07’ AS JUL, ‘2003-08’ AS AUG, ‘2003-09’ AS SEP))

Reply
prodlife says:

December 3, 2008 at 12:27 am

Laurent,

Quiet a long comment…

1. Tom is right. I shouldn’t have said never. Maybe I should have said “Please don’t do it in code that I have to look at later”.

2. I forgot to mention that in the original code the date was not indexed, and it was a char… (varchar actually). I also forgot to mention that t_ was actually a complicated inline view that was copy pasted 11 times in the query.

3. And of course, the original did contain group by, to avoid showing the duplicates that other parts of the 18 table join created. Kind of like “distinct”.

In short, my example doesn’t do justice to the headache I went through yesterday. But I still think it is a good example of what not to do unless you are Laurent Schneider.
You get special exception to run whatever you want. If all my developers were that sexy, I’d let them run whatever they wanted too 😉

Reply
Laurent Schneider says:

December 3, 2008 at 8:07 am

Chen,

You saved my day !

Actually, in my former CS team I wrote something with max(decode()) KEEP … and after I left, they added some columns and ended up with rewriting with multiple joins. They asked me how to tune it. My first tuning hint was : “Removed that stupid GROUP BY clause”. I suggested that if the java guys have a maintainable query which does not look ugly, it is probably as good as a “bullet” query that will be misused next time they add a column.

YMMV

Reply
Sokrates says:

December 3, 2008 at 1:54 pm

“Never say never, never say always”, Tom always said

right, but I also heard him saying
“never use ‘when others then null;'” (which I agree on), so himself sometimes says never

nice blog by the way

Reply
Sokrates says:

December 3, 2008 at 1:56 pm

picture on p. 302 is really amazing, indeed !

Reply
Chen Shapira says:

December 4, 2008 at 5:18 am

I’ll admit to using “when others then null” on occasion.
Usually right after “drop table t”.
I wish Oracle had “drop if exists t”, the same way it has “create or replace”.

Reply
Laurent Schneider says:

December 4, 2008 at 2:01 pm

When others then null is really ugly… I told my editor to NOT PUT the code in the book, well, blame me now!!!

Prompt drop TABLE TLS_EXPOSURE_FACTS_TMP; declare TABLE_NOT_EXISTS exception;pragma exception_init(TABLE_NOT_EXISTS,-942);begin execute immediate ' DROP TABLE TLS_EXPOSURE_FACTS_TMP CASCADE CONSTRAINTS ';exception when TABLE_NOT_EXISTS then null; end; /

Reply
prodlife says:

December 7, 2008 at 1:05 am

Thanks for the snippet, dear.
I just thought that declaring a pragma and all is a bit of an overkill if I just want to drop a table!

Reply
laurentschneider says:

December 11, 2008 at 6:43 am

DDL,

if you just want to drop that table:

DROP TABLE t;

Why bothering about exception?

In case you want to avoid having errors that are safe to ignore in your DBA log files, then trap the right exception. Trapping everything will just hide fatal errors, this is not want you want, dear ! Simply do not do what I wrote in my book and do follow Tom Kyte advice!

Reply