Unexpected Side Effect of Table Shrinking
Posted: April 6, 2009 Filed under: maintenance, optimizer, performance, tips 10 CommentsLast year, I wrote a bit about table shrinking. It generated quite a bit of discussion and both Freek d’Hooge and Venz warned about some of the side effects involved, especially when using “enable row movement”.
I just discovered another effect of row movement. As the name implies, after enabling row movement, rows get moved around. This can impact the clustering factor of your primary key.
If you had an index on a monotonously growing column (such as “create date”), the table and the index would be ordered in the same way. This results in a low clustering factor, and it means that the index is considered more favorably by the optimizer for some operations. I’m not a big expert on the optimizer, but I assume that if the index and table are ordered in the same way, and you need to scan a portion of the table, the index can be used to find the first row and then the table itself can be scanned. This method should be about twice faster than scanning the index and going to the table for each row to get extra columns that are not in the index. (Again, this is common sense only, I did not test or even deeply studied this).
Anyway, so you have this nicely ordered table, and a nicely ordered index to match, and you probably enjoy all kinds of performance benefits from this arrangement.
Now you enable row movement and shrink. Rows get moved around, and the order of rows in the table no longer matches the order of rows in the index. So you have a smaller table (less blocks) and a higher clustering factor. Maybe scanning portions of the table will be faster since the table is smaller and you read fewer blocks, but maybe it’ll take longer because the ordered changed and the index is less usefull now. In any case, this is something you should consider before deciding to shrink the table with row movement – it can have unexpected effect on your plans.
Here’s a small test I ran to verify:
SQL> create table test (id number); Table created. SQL> INSERT INTO test SELECT rownum FROM dual CONNECT BY LEVEL <= 10000; 10000 rows created. SQL> commit; Commit complete. SQL> create index test_i on test (id); Index created. SQL> EXEC dbms_stats.gather_table_stats(ownname=>'SCOTT', tabname=>'TEST', estimate_percent=> null, cascade=> true, method_opt=>'FOR ALL COLUMNS SIZE 1'); PL/SQL procedure successfully completed. SQL> select t.table_name,i.index_name,t.blocks,t.num_rows,i.clustering_factor from user_tables t, user_indexes i where t.table_name=i.table_name and i.index_name='TEST_I'; TABLE_NAME INDEX_NAME BLOCKS NUM_ROWS CLUSTERING_FACTOR ------------------------------ ------------------------------ ---------- ---------- ----------------- TEST TEST_I 20 10000 16 SQL> delete from test where mod(id,2)=0; 5000 rows deleted. SQL> commit; Commit complete. SQL> EXEC dbms_stats.gather_table_stats(ownname=>'SCOTT', tabname=>'TEST', estimate_percent=> null, cascade=> true, method_opt=>'FOR ALL COLUMNS SIZE 1'); PL/SQL procedure successfully completed. SQL> select t.table_name,i.index_name,t.blocks,t.num_rows,i.clustering_factor from user_tables t, user_indexes i where t.table_name=i.table_name and i.index_name='TEST_I'; TABLE_NAME INDEX_NAME BLOCKS NUM_ROWS CLUSTERING_FACTOR ------------------------------ ------------------------------ ---------- ---------- ----------------- TEST TEST_I 20 5000 16 SQL> alter table test enable row movement ; Table altered. SQL> alter table test shrink space cascade; Table altered. SQL> alter table test disable row movement ; Table altered. SQL> EXEC dbms_stats.gather_table_stats(ownname=>'SCOTT', tabname=>'TEST', estimate_percent=> null, cascade=> true, method_opt=>'FOR ALL COLUMNS SIZE 1'); PL/SQL procedure successfully completed. TABLE_NAME INDEX_NAME BLOCKS NUM_ROWS CLUSTERING_FACTOR ------------------------------ ------------------------------ ---------- ---------- ----------------- TEST TEST_I 8 5000 22
So, I’ve got a 10.2.0.4 hp-ux Itanium system, with the default statistics and segment advisor jobs going. I’ve got a table with 30852977 rows in it, last analysed in September (when I originally loaded it). I’m trying to be a good DBA 2.0 and use EM. Advisor tells me I can shrink space and get back 600M IIRC. So in February and again in March I schedule a shrink space job (without cascade, but the indices have their own), which completes without error, but doesn’t seem to get any space back. Then, and it happened again April 4 and I noticed it this morning, EM segment advisor tells me I have 2,575.38M reclaimable, 2,575.38M allocated, and 0.00 space used in this table. Should I press the shrink button?
After I picked myself off the floor, I see *_tables and EM think there are 28465258 rows (from last September analysis). Select count… and general working of the db don’t seem to have any problem at all. So I hope it is just a stupid advisor bug.
Anyone know what it is about the advisor that does this? It really shakes my trust in all this stuff (not that I’ve ever fully trusted it). Where does the advisor get the count? And why only a problem on one table? There doesn’t seem to be any time pressure against maintenance windows. No clue how to replicate this WTF. (I have the table in a test db which I’ve shrank the table and now is off the advisor list. I have it in another db which I haven’t touched, should anyone have a suggestion).
Hey Joel,
Good story 🙂
I hope someone here can answer your questions. I’ve no clue on how the space advisor does its magic.
Nice one Chen,
On 10G doc it says
For an index or index-organized table, specifying ALTER [INDEX | TABLE] … SHRINK SPACE COMPACT is equivalent to specifying ALTER [INDEX | TABLE … COALESCE.
When you do alter table test shrink space COMPACT cascade;
it increases the clustering factor as well but when you do coalesce index instead of shrinking it does not change the clustering factor.
One more point shrinking compact WITHOUT cascade also increases the clustering factor
rebuild and drop create the index does not change the results as well.
Good Chen…
I like use SHRINK SPACE… on TABLES + INDEXES. (that can make online… no affect with applications)
But I don’t use “shrink space cascade” or “shrink space” often… Anyway with some tables are deleted/updated all times…(and large table) that make they are blocked!
I use “shrink space COMPACT” before, that help shrunk table+ index and then use “shrink space” again (make faster…) to compressed and the HWM is reduced.
alter table test shrink space compact;
alter table test shrink space;
By the way, “ALTER TABLE… ENABLE ROW MOVEMENT” can make some objects (procedure/ function/ package) depend on it …. Invalid 😉
but I assume that if the index and table are ordered in the same way, and you need to scan a portion of the table, the index can be used to find the first row and then the table itself can be scanned.
Chen, when the first row is found the remaining rows are not scanned from the table without accessing the index. The mechanism here is buffer pinning. When the first row is found the table block is pinned, if the second row is also in the same block (meaning the clustering factor is good) the block is accessed without doing a logical read and without using cache buffers chains latches, this lowers the number of logical reads and thus increases performance.
Dion Cho had a recent post about this here: http://dioncho.wordpress.com/2009/04/08/buffer-pinning/
@Yas
Thanks. It sounds interesting and I’ll check it out.
[…] the Oracle side, our old friend Chen Shapira discovered an unexpected side effect of table shrinking, “ . . . especially when using ‘enable row movement’. […]
“shrink” would down the HWM of object, but may not for datafile. and it would invalidate the dependent plsql/trigger definitely. so I favor in “move tablespace”. it is no online, but it is ok. as normally, only old historic partition need to reclaim the space (not only tbs level, may even to os file level).
hi chen, btw, I am a dba start with web developer as well. 🙂
@an oracle dba
Nice! I did not know triggers will be invalidated!
BTW. If you always have the old data in a seperate “historic” partition, you are in a very good situation.
I have a lot of schema with huge tables, containing all data since 1873 and no good way to clean them 😦
[…] April 13, 2009 Filed under: 11g, plsql, tips — prodlife @ 6:35 pm Commenting on my post from last week, “an oracle dba” […]