Optimizer statistics – parse time statistics aggregation

It’s getting harder and harder to find time for writing. Anyway, this post stays from a long time in my drafts so, I think is time to finish and post it. That one is the last one from the optimizer statistics series. For now, of course 🙂

After describing the horror around the way that Oracle is handling the global statistics, it’s time to take a look what happens if we don’t have global statistics at all.

I got really surprised when I saw that the optimizer is trying to aggregate the local statistics on the fly. And got even more surprised when I saw that it’s doing it quite well.
And got really disappointed when I realized that is quite hard to get rid of the global statistics… Anyway.

Generally speaking the optimizer is using local statistics only when we are accessing single partition. In all other cases it uses global statistics. In my previous post I described how Oracle is trying to aggregate/approximate global statistics when gathering in order to maintain them incrementally.
So as a result of that, if we are accessing two partitions, the optimizer will use the estimated global statistics to estimate the statistics for those two partitions… Estimation, from an estimation, from a sample… The result can be quite interesting.

Ok, lets see what will happen if we don’t have global statistics at all. Let start 10053 trace and check the optimizer behavior.

In this example I’m selecting 3 from 10 partitions and as expected the optimizer is using the global statistics:


SQL> select count(*) from ipt where part_time between to_date('15-08-2015', 'DD-MM-YYYY') and to_date('17-08-2015', 'DD-MM-YYYY') and id_c=7;

  COUNT(*)
----------
      2000

SQL>

-----------------10053 TRACE:

***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
  Table: IPT  Alias: IPT  (Using composite stats)
  (making adjustments for partition skews)
  ORIGINAL VALUES::    #Rows: 36000000  #Blks:  160146  AvgRowLen:  27.00  ChainCnt:  0.00
  PARTITIONS::
  PRUNED: 3
  ANALYZED: 3  UNANALYZED: 0
    #Rows: 36000000  #Blks:  53367  AvgRowLen:  27.00  ChainCnt:  0.00
Access path analysis for IPT
***************************************
SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for IPT[IPT]
NOTE: ignoring histogram of column (IPT.ID_C)
      used only for incremental stats maintenance
    #Rows: 36000000  #Blks:  160146  AvgRowLen:  27.00  ChainCnt:  0.00
  Column (#4): ID_C(  Part#: 0
    AvgLen: 4 NDV: 8403 Nulls: 0 Density: 0.000000 Min: 0 Max: 0
    Histogram: HtBal  #Bkts: 254  UncompBkts: 254  EndPtVals: 0
NOTE: ignoring histogram of column (IPT.ID_C)
      used only for incremental stats maintenance
  Column (#4): ID_C(
    AvgLen: 4 NDV: 8403 Nulls: 0 Density: 0.000119 Min: 2 Max: 10000

  Column (#1):
    NewDensity:0.054644, OldDensity:0.000000 BktCnt:48944, PopBktCnt:48944, PopValCnt:9, NDV:9
  Column (#1): PART_TIME(
    AvgLen: 8 NDV: 9 Nulls: 0 Density: 0.054644 Min: 2457251 Max: 2457259
    Histogram: Freq  #Bkts: 9  UncompBkts: 48944  EndPtVals: 9
  Table: IPT  Alias: IPT
    Card: Original: 36000000.000000  Rounded: 941  Computed: 941.28  Non Adjusted: 941.28
  Access Path: TableScan
    Cost:  14561.03  Resp: 14561.03  Degree: 0
      Cost_io: 14455.00  Cost_cpu: 3499260449
      Resp_io: 14455.00  Resp_cpu: 3499260449
  Best:: AccessPath: TableScan
         Cost: 14561.03  Degree: 1  Resp: 14561.03  Card: 941.28  Bytes: 0

***************************************


On row 15 we see that the optimizer is using composite statistics, on 26 and 27 we see something ridiculous /if you are following my statistics posts, you shall know from where they are coming from/ and on line 49 we see the cardinality estimation, which is not really accurate.

Now lets try without global stats:

SQL> select count(*) from ipt where part_time between to_date('15-08-2015', 'DD-MM-YYYY') and to_date('17-08-2015', 'DD-MM-YYYY') and id_c=7;

  COUNT(*)
----------
      2000

SQL>

-----------------10053 TRACE:

***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
  Table: IPT  Alias: IPT
    #Rows: 36000000  #Blks:  160147  AvgRowLen:  27.00  ChainCnt:  0.00
Access path analysis for IPT
***************************************
SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for IPT[IPT]
EXEC #18446604434618639176:c=11,e=11,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=3,plh=2239883476,tim=4153162080812
FETCH #18446604434618639176:c=11,e=12,p=0,cr=3,cu=0,mis=0,r=1,dep=1,og=3,plh=2239883476,tim=4153162080840
CLOSE #18446604434618639176:c=3,e=4,dep=1,type=3,tim=4153162080862
EXEC #18446604434618639176:c=8,e=8,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=3,plh=2239883476,tim=4153162080894
FETCH #18446604434618639176:c=8,e=8,p=0,cr=3,cu=0,mis=0,r=1,dep=1,og=3,plh=2239883476,tim=4153162080917
CLOSE #18446604434618639176:c=3,e=3,dep=1,type=3,tim=4153162080937
NOTE: ignoring histogram of column (IPT.ID_C)
      used only for incremental stats maintenance
    #Rows: 4000000  #Blks:  17789  AvgRowLen:  27.00  ChainCnt:  0.00
  Column (#4): ID_C(  Part#: 0
    AvgLen: 4 NDV: 1810 Nulls: 0 Density: 0.000772 Min: 0 Max: 0
    Histogram: HtBal  #Bkts: 254  UncompBkts: 254  EndPtVals: 0
NOTE: ignoring histogram of column (IPT.ID_C)
      used only for incremental stats maintenance
  Column (#4): ID_C(
    AvgLen: 4 NDV: 1929 Nulls: 0 Density: 0.000518 Min: 2 Max: 10000

EXEC #18446604434618639176:c=8,e=8,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=3,plh=2239883476,tim=4153162081112
FETCH #18446604434618639176:c=7,e=6,p=0,cr=3,cu=0,mis=0,r=1,dep=1,og=3,plh=2239883476,tim=4153162081134
CLOSE #18446604434618639176:c=3,e=4,dep=1,type=3,tim=4153162081155
EXEC #18446604434618639176:c=8,e=8,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=3,plh=2239883476,tim=4153162081184
FETCH #18446604434618639176:c=7,e=6,p=0,cr=3,cu=0,mis=0,r=1,dep=1,og=3,plh=2239883476,tim=4153162081205
CLOSE #18446604434618639176:c=3,e=3,dep=1,type=3,tim=4153162081225
EXEC #18446604434622397208:c=11,e=11,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=3,plh=3312420081,tim=4153162081257
FETCH #18446604434622397208:c=20,e=20,p=0,cr=3,cu=0,mis=0,r=1,dep=1,og=3,plh=3312420081,tim=4153162081292
CLOSE #18446604434622397208:c=0,e=1,dep=1,type=3,tim=4153162081335
  Column (#1): PART_TIME(  Part#: 592
    AvgLen: 8 NDV: 1 Nulls: 0 Density: 0.000000 Min: 2457251 Max: 2457251
    Histogram: Freq  #Bkts: 1  UncompBkts: 5527  EndPtVals: 1
EXEC #18446604434618639176:c=8,e=8,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=3,plh=2239883476,tim=4153162081425
FETCH #18446604434618639176:c=6,e=7,p=0,cr=3,cu=0,mis=0,r=1,dep=1,og=3,plh=2239883476,tim=4153162081447
CLOSE #18446604434618639176:c=3,e=3,dep=1,type=3,tim=4153162081467
EXEC #18446604434622397208:c=8,e=9,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=3,plh=3312420081,tim=4153162081495
FETCH #18446604434622397208:c=13,e=13,p=0,cr=3,cu=0,mis=0,r=1,dep=1,og=3,plh=3312420081,tim=4153162081523
CLOSE #18446604434622397208:c=1,e=1,dep=1,type=3,tim=4153162081544
  Column (#1): PART_TIME(  Part#: 593
    AvgLen: 8 NDV: 1 Nulls: 0 Density: 0.000000 Min: 2457252 Max: 2457252
    Histogram: Freq  #Bkts: 1  UncompBkts: 5545  EndPtVals: 1
EXEC #18446604434618639176:c=8,e=8,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=3,plh=2239883476,tim=4153162081631
FETCH #18446604434618639176:c=6,e=7,p=0,cr=3,cu=0,mis=0,r=1,dep=1,og=3,plh=2239883476,tim=4153162081653
CLOSE #18446604434618639176:c=3,e=3,dep=1,type=3,tim=4153162081672
EXEC #18446604434622397208:c=8,e=9,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=3,plh=3312420081,tim=4153162081700
FETCH #18446604434622397208:c=15,e=15,p=0,cr=3,cu=0,mis=0,r=1,dep=1,og=3,plh=3312420081,tim=4153162081730
CLOSE #18446604434622397208:c=0,e=1,dep=1,type=3,tim=4153162081751
  Column (#1): PART_TIME(  Part#: 594
    AvgLen: 8 NDV: 1 Nulls: 0 Density: 0.000000 Min: 2457253 Max: 2457253
    Histogram: Freq  #Bkts: 1  UncompBkts: 5576  EndPtVals: 1
  Column (#1): PART_TIME(
    AvgLen: 8 NDV: 9 Nulls: 0 Density: 0.111111 Min: 2457251 Max: 2457259
  Using density: 0.111111 of col #1 as selectivity of unpopular value pred
  Table: IPT  Alias: IPT
    Card: Original: 36000000.000000  Rounded: 2074  Computed: 2073.61  Non Adjusted: 2073.61
  Access Path: TableScan
    Cost:  43693.19  Resp: 43693.19  Degree: 0
      Cost_io: 43375.00  Cost_cpu: 10501514058
      Resp_io: 43375.00  Resp_cpu: 10501514058
  Best:: AccessPath: TableScan
         Cost: 43693.19  Degree: 1  Resp: 43693.19  Card: 2073.61  Bytes: 0

***************************************

The final cardinality estimation is way better. Almost exact. At line 36 we can see that Oracle estimated the partition set NDV and the Density for us on the fly. In this case it uses the synopses, which are created while we are gathering statistics with table pref INCREMENTAL => TRUE.
As we know those synopses are making problems. For example they can consume a lot of space, which is not a big issue, but the situation is getting worst when Oracle tries to delete some old synopses… it takes decades to gather stats because of this /delete on ***G table/. Of course some of those issues are rdbms version specific and can vary between the applications. So try them and if they are not making problems for you – keep them. If you have slow gathering issue because of synopses deletes – drop the statistics and gather them again /with or without synopses/. The DELETE_TABLE_STATS procedure uses truncates.

synopses, synopses…

Ok, lets see what happens if we don’t have synopses:

SQL> select count(*) from ipt where part_time between to_date('15-08-2015', 'DD-MM-YYYY') and to_date('17-08-2015', 'DD-MM-YYYY') and id_c=7;

  COUNT(*)
----------
      2000

SQL>

-----------------10053 TRACE:

***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
  Table: IPT  Alias: IPT
    #Rows: 36000000  #Blks:  160147  AvgRowLen:  27.00  ChainCnt:  0.00
Access path analysis for IPT
***************************************
SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for IPT[IPT]
EXEC #18446604434602892016:c=11,e=11,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=3,plh=2239883476,tim=4153449039812
FETCH #18446604434602892016:c=13,e=12,p=0,cr=3,cu=0,mis=0,r=1,dep=1,og=3,plh=2239883476,tim=4153449039842
CLOSE #18446604434602892016:c=2,e=2,dep=1,type=3,tim=4153449039865
EXEC #18446604434602892016:c=7,e=7,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=3,plh=2239883476,tim=4153449039898
FETCH #18446604434602892016:c=6,e=6,p=0,cr=3,cu=0,mis=0,r=1,dep=1,og=3,plh=2239883476,tim=4153449039920
CLOSE #18446604434602892016:c=2,e=2,dep=1,type=3,tim=4153449039940
  Column (#4): ID_C(
    AvgLen: 4 NDV: 1838 Nulls: 0 Density: 0.000544 Min: 2 Max: 10000

EXEC #18446604434602892016:c=7,e=7,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=3,plh=2239883476,tim=4153449040079
FETCH #18446604434602892016:c=14,e=14,p=0,cr=3,cu=0,mis=0,r=1,dep=1,og=3,plh=2239883476,tim=4153449040111
CLOSE #18446604434602892016:c=2,e=2,dep=1,type=3,tim=4153449040144
EXEC #18446604434602892016:c=7,e=7,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=3,plh=2239883476,tim=4153449040174
FETCH #18446604434602892016:c=6,e=6,p=0,cr=3,cu=0,mis=0,r=1,dep=1,og=3,plh=2239883476,tim=4153449040196
CLOSE #18446604434602892016:c=2,e=2,dep=1,type=3,tim=4153449040216
EXEC #18446604434622430168:c=11,e=11,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=3,plh=3312420081,tim=4153449040250
FETCH #18446604434622430168:c=20,e=21,p=0,cr=3,cu=0,mis=0,r=1,dep=1,og=3,plh=3312420081,tim=4153449040288
CLOSE #18446604434622430168:c=1,e=1,dep=1,type=3,tim=4153449040309
  Column (#1): PART_TIME(  Part#: 592
    AvgLen: 8 NDV: 1 Nulls: 0 Density: 0.000000 Min: 2457251 Max: 2457251
    Histogram: Freq  #Bkts: 1  UncompBkts: 5463  EndPtVals: 1
EXEC #18446604434602892016:c=7,e=7,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=3,plh=2239883476,tim=4153449040404
FETCH #18446604434602892016:c=6,e=5,p=0,cr=3,cu=0,mis=0,r=1,dep=1,og=3,plh=2239883476,tim=4153449040426
CLOSE #18446604434602892016:c=2,e=2,dep=1,type=3,tim=4153449040446
EXEC #18446604434622430168:c=8,e=8,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=3,plh=3312420081,tim=4153449040474
FETCH #18446604434622430168:c=12,e=12,p=0,cr=3,cu=0,mis=0,r=1,dep=1,og=3,plh=3312420081,tim=4153449040503
CLOSE #18446604434622430168:c=1,e=0,dep=1,type=3,tim=4153449040523
  Column (#1): PART_TIME(  Part#: 593
    AvgLen: 8 NDV: 1 Nulls: 0 Density: 0.000000 Min: 2457252 Max: 2457252
    Histogram: Freq  #Bkts: 1  UncompBkts: 5429  EndPtVals: 1
EXEC #18446604434602892016:c=7,e=7,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=3,plh=2239883476,tim=4153449040615
FETCH #18446604434602892016:c=6,e=6,p=0,cr=3,cu=0,mis=0,r=1,dep=1,og=3,plh=2239883476,tim=4153449040638
CLOSE #18446604434602892016:c=2,e=2,dep=1,type=3,tim=4153449040658
EXEC #18446604434622430168:c=7,e=7,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=3,plh=3312420081,tim=4153449040685
FETCH #18446604434622430168:c=13,e=14,p=0,cr=3,cu=0,mis=0,r=1,dep=1,og=3,plh=3312420081,tim=4153449040716
CLOSE #18446604434622430168:c=1,e=0,dep=1,type=3,tim=4153449040745
  Column (#1): PART_TIME(  Part#: 594
    AvgLen: 8 NDV: 1 Nulls: 0 Density: 0.000000 Min: 2457253 Max: 2457253
    Histogram: Freq  #Bkts: 1  UncompBkts: 5433  EndPtVals: 1
  Column (#1): PART_TIME(
    AvgLen: 8 NDV: 9 Nulls: 0 Density: 0.111111 Min: 2457251 Max: 2457259
  Using density: 0.111111 of col #1 as selectivity of unpopular value pred
  Table: IPT  Alias: IPT
    Card: Original: 36000000.000000  Rounded: 2176  Computed: 2176.28  Non Adjusted: 2176.28
  Access Path: TableScan
    Cost:  43693.20  Resp: 43693.20  Degree: 0
      Cost_io: 43375.00  Cost_cpu: 10501565391
      Resp_io: 43375.00  Resp_cpu: 10501565391
  Best:: AccessPath: TableScan
         Cost: 43693.20  Degree: 1  Resp: 43693.20  Card: 2176.28  Bytes: 0

***************************************

Almost the same situation. On line 28 we can see that the NDV and the Density is slightly different from the previous example. This is because without synopses Oracle cannot estimate the NDV well. So instead of trying to estimate them, Oracle takes the biggest NDV from hist/col stats for our partition set.
And the result is still way, way better than global stats.


At the beginning I mentioned that is quite hard to get rid of those global stats.
Oracle have some creepy automation that aggregates your local statistics while gathering statistics with granularity => ‘PARTITION’. So if you have full set of local statistics /all partitions have statistics/, after gathering the last partition statistics, DBMS_STATS will automatically aggregate and create global statistics for you. Yep that’s right.
Still I don’t have documented way to disabling this /coursing is not helping – tried many times/. So, any ideas are welcomed!

The only solution that I currently have is deleting them after every gathering. Just be careful when you are using DELETE_TABLE_STATS – all cascade_* parameters are true by default. You shall use something like this:

begin DBMS_STATS.DELETE_TABLE_STATS(ownname => '', tabname => 'IPT', no_invalidate => false, cascade_parts => false, cascade_columns => false, cascade_indexes => false); end;
/

Happy gathering!
Kovachev

Advertisements

5 thoughts on “Optimizer statistics – parse time statistics aggregation

  1. This is a great article, however it never mentions what version of Oracle you are using for your testing. Can you please share that info?

  2. so in 11.2.0.3 it doesn’t work? because I’ve tried it in 11.2.0.3 and it produced somewhat different results (for example, in the NDV estimate where there aren’t any synopsis – the result was way off).

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s