Pivot – You are doing it wrong
Posted: December 2, 2008 Filed under: rants, sql 13 CommentsSuppose you have a table like this:
drop table t_; create table t_ ( nm Varchar2(20), pr Char ( 7), vl Number ); insert into t_ values ('company 1','2003-06', 10); insert into t_ values ('company 1','2003-07', 29); insert into t_ values ('company 1','2003-08', 39); insert into t_ values ('company 1','2003-09', 41); insert into t_ values ('company 1','2003-10', 22); insert into t_ values ('company 2','2003-06', 13); insert into t_ values ('company 2','2003-07', 17); insert into t_ values ('company 2','2003-08', 61); insert into t_ values ('company 2','2003-09', 55); insert into t_ values ('company 2','2003-10', 71); insert into t_ values ('company 3','2003-06', 33); insert into t_ values ('company 3','2003-07', 18); insert into t_ values ('company 3','2003-08', 27); insert into t_ values ('company 3','2003-09', 5); insert into t_ values ('company 3','2003-10', 32);
(Thanks to René Nyffenegger for the helpful create script)
And you want this output:
NM JUL AUG SEP -------------------- ---------- ---------- ---------- company 2 17 61 55 company 3 18 27 5 company 1 29 39 41
If you are using 10g or older versions, you can use Tom Kyte’s decode trick. In 11g, we have an official keyword to solve this problem.
What you should never, ever do, is this:
select t1.nm, t1.vl jul, t2.vl aug, t3.vl sep from t_ t1, t_ t2, t_ t3 where 1=1 and t1.nm=t2.nm and t2.nm=t3.nm and t1.pr='2003-07' and t2.pr='2003-08' and t3.pr='2003-09' group by t1.nm, t1.vl, t2.vl, t3.vl
And if you do this, and if t_ has more than 15 rows, and if you actually need more than 3 columns, don’t be surprised if the performance may be slightly disappointing.
Can you tell what keyword you mean?
If it is the obvius PIVOT, can you draw how can be the best way using the same sample data?
I’m just guessing, thanks!
Hi Chen,
I find the Tom Kyte utterly hard to read, to understand, to maintain and to tune.
“Never say never, never say always”, Tom always said
SQL> drop table lsc_t;
Table dropped.
Elapsed: 00:00:00.67
SQL>
SQL> create table lsc_t (
2 nm Varchar2(20),
3 pr date,
4 vl Number,
5 primary key (nm,pr)
6 );
Table created.
Elapsed: 00:00:00.03
SQL>
SQL> insert into lsc_t
2 select nm,pr,trunc(dbms_random.value(1,100))
3 from
4 (select add_months(date '1900-01-01',rownum) pr from all_objects)
5 cross join
6 (select 'customer '||rownum nm from all_objects where rownum
SQL> commit;
Commit complete.
Elapsed: 00:00:00.01
SQL>
SQL>
SQL> select
2 t1.nm,
3 t1.vl jul,
4 t2.vl aug,
5 t3.vl sep
6 from
7 lsc_t t1,
8 lsc_t t2,
9 lsc_t t3
10 where 1=1
11 and t1.nm=t2.nm
12 and t2.nm=t3.nm
13 and t1.pr=date '2003-07-01'
14 and t2.pr=date '2003-08-01'
15 and t3.pr=date '2003-09-01'
16 group by t1.nm,
17 t1.vl,
18 t2.vl,
19 t3.vl;
NM JUL AUG SEP
-------------------- ---------- ---------- ----------
customer 1 39 25 29
customer 2 85 69 83
customer 3 53 70 59
Elapsed: 00:00:00.04
SQL>
SQL>
SQL> select nm,
2 max(decode( pr, date '2003-07-01', vl )) jul,
3 max(decode( pr, date '2003-08-01', vl )) aug,
4 max(decode( pr, date '2003-09-01', vl )) sep
5 from lsc_t
6 group by nm;
NM JUL AUG SEP
-------------------- ---------- ---------- ----------
customer 1 39 25 29
customer 2 85 69 83
customer 3 53 70 59
Elapsed: 00:00:00.15
What’s wrong with selecting the same table thrice? It is occasionaly faster and way easier to maintain
In a book with an amazing picture on page 302 I wrote about PIVOT, but nobody is using 11g around… Also I am not use it is much different from Tom syntax. And also, it is not equivalent
BTW, to tune up my query, simply remove the group by 🙂
SQL> select
2 t1.nm,
3 t1.vl jul,
4 t2.vl aug,
5 t3.vl sep
6 from
7 lsc_t t1,
8 lsc_t t2,
9 lsc_t t3
10 where 1=1
11 and t1.nm=t2.nm
12 and t2.nm=t3.nm
13 and t1.pr=date '2003-07-01'
14 and t2.pr=date '2003-08-01'
15 and t3.pr=date '2003-09-01';
NM JUL AUG SEP
-------------------- ---------- ---------- ----------
customer 1 39 25 29
customer 2 85 69 83
customer 3 53 70 59
(of course, wordpress sucks, my < just truncated my code…)
PS: do not blame me for storing dates in date…
Hi Luiz Vaz,
Indeed I was talking about PIVOT. Here’s how I use pivot on this example:
select * from
( SELECT nm, pr, vl FROM t_)
PIVOT
(max(vl) FOR pr IN (‘2003-07’ AS JUL, ‘2003-08’ AS AUG, ‘2003-09’ AS SEP))
Laurent,
Quiet a long comment…
1. Tom is right. I shouldn’t have said never. Maybe I should have said “Please don’t do it in code that I have to look at later”.
2. I forgot to mention that in the original code the date was not indexed, and it was a char… (varchar actually). I also forgot to mention that t_ was actually a complicated inline view that was copy pasted 11 times in the query.
3. And of course, the original did contain group by, to avoid showing the duplicates that other parts of the 18 table join created. Kind of like “distinct”.
In short, my example doesn’t do justice to the headache I went through yesterday. But I still think it is a good example of what not to do unless you are Laurent Schneider.
You get special exception to run whatever you want. If all my developers were that sexy, I’d let them run whatever they wanted too 😉
Chen,
You saved my day !
Actually, in my former CS team I wrote something with max(decode()) KEEP … and after I left, they added some columns and ended up with rewriting with multiple joins. They asked me how to tune it. My first tuning hint was : “Removed that stupid GROUP BY clause”. I suggested that if the java guys have a maintainable query which does not look ugly, it is probably as good as a “bullet” query that will be misused next time they add a column.
YMMV
“Never say never, never say always”, Tom always said
right, but I also heard him saying
“never use ‘when others then null;'” (which I agree on), so himself sometimes says never
nice blog by the way
picture on p. 302 is really amazing, indeed !
I’ll admit to using “when others then null” on occasion.
Usually right after “drop table t”.
I wish Oracle had “drop if exists t”, the same way it has “create or replace”.
When others then null is really ugly… I told my editor to NOT PUT the code in the book, well, blame me now!!!
Prompt drop TABLE TLS_EXPOSURE_FACTS_TMP;
declare TABLE_NOT_EXISTS exception;pragma exception_init(TABLE_NOT_EXISTS,-942);begin execute immediate '
DROP TABLE TLS_EXPOSURE_FACTS_TMP CASCADE CONSTRAINTS
';exception when TABLE_NOT_EXISTS then null;
end;
/
Thanks for the snippet, dear.
I just thought that declaring a pragma and all is a bit of an overkill if I just want to drop a table!
DDL,
if you just want to drop that table:
DROP TABLE t;
Why bothering about exception?
In case you want to avoid having errors that are safe to ignore in your DBA log files, then trap the right exception. Trapping everything will just hide fatal errors, this is not want you want, dear ! Simply do not do what I wrote in my book and do follow Tom Kyte advice!