SMART RUNSTATS  - THE STATSCHK UTILITY
______________________________________

Most shops have a maintenance suite that will RUNSTATS some, most or all tables. 
Table statistics are often updated unselectively en masse, but we can be smarter than that;
and attempt to RUNSTATS just those tables where data have changed since the last RUNSTATS. 
This readme presents a number of options. Option 2C seems to be the way to go, and remaining options are presented just for completeness.
The tar file contains the code components for option 2C only.
The aaa_install_readme file contains a packing list and installation instructions.

OPTION 1 Collect less stats
---------------------------
Collect less stats for each table, e.g. sampled instead of detailed statistics, or omit distribution statistics. 
This will reduce statistics collection time but runs counter to standard practice and IBM advice.

In the same way that we use REORGCHK output to REORG only those tables that need it, the obvious solution is to RUNSTATS only those application tables that have changed since the last run. There is no built-in STATSCHK utility, so this approach gives us a further 3 options:

OPTION 2A Automatic Table Maintenance
-------------------------------------
Introduced at DB2 V8.2. This will only collect statistics on tables that have inserts, updates, deletes or inserts (UDI) since the last RUNSTATS. The evaluation checks whether or not tables require statistics to be updated, and schedules runstats for tables whose statistics are out of date at a later time.
The V9.1 algorithm is in http://www.ibm.com/developerworks/data/library/trecharticle/dm-0706tang/index.html and in brief is:
1. Check if the table is due for evaluation. No further action performed if the table is not due for evaluation. An internal table is used to track if tables are due for evaluation. 
2. Check if the table has been accessed by the current workload. 
3. Check if table has statistics. If statistics hare never been collected for this table, issue RUNSTATS on the table. No further checks performed. 
4. Check whether UDI counter is greater than 10% of the rows. If not, no action on the table. 
5. Check whether UDI counter is greater than 50% of the rows, issue RUNSTATS on the table if UDI counter is greater than 50% of the rows. 
6. Issue RUNSTATS if the table is small. 

The first evaluation occurs within two hours of database activation. Subsequent evaluations occur approximately every two hours after that, as long as the database remains active.
One reason not to use automatic statistics collection is that the evaluation algorithm only checks if the table has been accessed by the current workload. 
However it only checks every 2 hours, so could miss significant activity. 
Another is that the maintenance policy can only be updated using an often flaky DB2 Control Center graphical interface (subsequently Optim). 
There is no command line or programmatic interface. 
Third, the maintenance windows (online and offline) in the maintenance policy are just 24-hour clock times, which means runs are not event-driven,
so you get reduced flexibility with scheduling.

OPTION 2B The Health Monitor
----------------------------
If you do not already use it, you could activate the free, built-in DB2 health monitor; the health indicator in question is db.tb_runstats_req,
and you could disable as many of the remaining indicators as you wish. 
That indicator enters an attention (non-normal) state whenever a table requires statistics collection. In the case where automatic statistics collection is enabled, this means that automatic statistics collection of a table has failed. If automatic statistics collection is not enabled, one could simply use the health indicator  just to detect when tables need to be collected statistics.
That would require an instance level configuration change (UPDATE DBM CONFIG USING HEALTH_MON ON)  and housekeeping script changes to parse output or read a list of tables from either:
	> get health snapshot for db on dbalias    or
	> select * from table(HEALTH_DB_HIC('dbname',-1)) as zzzz
To disable the remaining indicators, use the UPDATE ALERT CONFIGURATION command otherwise you will be bombarded by alerts.
The biggest problem with the HM is that it is reactive and will apply to all tables, including ones you don't care about.
I also happen to object to anything in principle that can disturb your social life and sleep by sending text and mail messages.  
My experience of the Health Monitor is that it was a resource-guzzler at V8.2 on DPF systems, though it might prove a lighter touch at V9.1+ for 
single partition databases. The documentation is clear that it does not turn off and on any snapshot monitor switches. 
However a further problem is that the HM only retains history for the last 10 intervalsi, so at 2-hourly intervals, 10 collections will not span a whole day, which is what you may want on a 24*7 system.


OPTION 2C The UDI Counter
-------------------------
DB2 now maintains a UDI counter for each table, visible through db2pd -tcbstats 
TCB Table Stats:
Address            TableName          Scans      UDI        PgReorgs   NoChgUpdts Reads      FscrUpdates Inserts    Updates    Deletes    OvFlReads  OvFlCrtes
0x07800000B94AAD80 SORETHUMB          0          1          0          0          0          0           1          0          0          0          0
which we refine down to known user tablespaces.
The counter is reset after a RUNSTATS. Therefore, the STATSCHK.ksh logic just parses the db2pd utility output to construct a table list of candidates.  
Unfortunately there don't appear to be any db2pd-like table functions or administrative views like ADMINTABINFO that return the UDI counter, 
so were stuck with text processing.

There is only one caveat. The in-memory UDI counter value will not persist across a database deactivation/reactivation.
The UDI counter value is hardened to disk under a few specific circumstances.
At DB2 V9.1, the conditions are as follows:
- A new page is added to the table, e.g. through the IMPORT command. LOAD will also update the on disk UDI counter; as can a table reorg, classic reorg, and certain inplace table reorgs. 
- The automatic statistics collection process requests the UDI counter. 
- After a statistics collection resulting from either the RUNSTATS command or the automatic statistics collection process. 
At DB2 V9.5, there is one more condition:
- If a table has a ROW CHANGE TIMESTAMP column, meta data for the row change time column will get periodically written to disk. 
  When this occurs, the UDI counter value is written to disk also. 

There are additional db2pd TCB Table Stats columns at v9.5 over v9.1. These are RTSUDI, RowsComp and RowsUnComp.
V9.7 also introduced additional columns for schema name, plus CCLogReads,StoreBytes,BytesSaved.
UDI represents the update/delete/insert count since the last background stats collection or manual RUNSTATS. 
RTSUDI (Real Time Statistics UDI) is the sum of the UDI count plus real time stats gathering enabled by db cfg AUTO_STMT_STATS.
This parameter allows auto-collection of stats when required to optimize or run a query, but may revert to background mode if collection will 
take too long. AUTO_STMT_STATS can only be ON if automatic maintenance, automatic table maintenance and automatic statistics are also ON.
Otherwise RTSUDI holds the same value as UDI.

TCB Table Stats (V9.1):
Address            TableName          Scans      UDI        PgReorgs   NoChgUpdts Reads      FscrUpdates Inserts    Updates    Deletes    OvFlReads  Ov
FlCrtes
0x07800000B94AAD80 SORETHUMB          0          1          0          0          0          0           1          0          0          0          0
TCB Table Stats (V9.5):
Address            TableName          Scans      UDI        RTSUDI               PgReorgs   NoChgUpdts Reads      FscrUpdates Inserts    Updates    Deletes    OvFlReads  OvFlCrtes  RowsComp   RowsUncomp
0x00002AAB22235338 SYSTABLES          499        286100     286100               4546       0          7864309    18979       22141      100919     23351      3914       1          0          0         
TCB Table Stats (V9.7):
Address            TableName          SchemaNm Scans      UDI        RTSUDI               PgReorgs   NoChgUpdts Reads      FscrUpdates Inserts    Updates    Deletes    OvFlReads  OvFlCrtes  RowsComp   RowsUn
comp CCLogReads StoreBytes BytesSaved
     0          -          -
0x00002B856E17EA58 SYSTABLES          SYSIBM   126        35         35                   0          39         178105     0           0          280        0          11629      0          0          0
     0          -          -

On a single partition instance, catalog and application tables appear togerther under db2pd -tcbstats TbspaceID=${tbspaceid} -db ${dbname}
To extend to DPF systems add the -dbp 0,1 and -full switches. The code in this suite simply uses the min(dbpartition) for the tablespace's partition group, on the assumption that data is evenly distributed
across all participating nodes.
Alex Levy, June 2012.

There is an additional script to RUNSTATS statistical views - STATSVIEWS.ksh. This will eventually be enhanced to check the UDI counters for the base tables,
but for the time being the script will unconditionally RUNSTATS any statistical view.
This can run as a standalone, or if you prefer, you could amend The STATSCHK.ksh script to call STATSVIEWS.ksh.
In general it is probably wiser to schedule statistical view runs separately, while the statistical view updates are unconditional. 
That's because of extended runtimes - the view definitions can involve multiple table joins and reference other complex views, often hierarchically.
Alex Levy, September 2012.
