User Guide > Performance Tuning > Creating Cardinality Statistics for Cost-Based Optimization > Creating Cardinality Statistics on a Data Source
 
Creating Cardinality Statistics on a Data Source
Statistics gathering must be enabled on the parent data source. You create statistics from the Cardinality Statistics tab, which is available for any data source.
Note: The data source sign-in and password must be saved to enable statistics gathering.
To collect statistics on a data source
1. From the resource tree, open a data source.
2. Select the Cardinality Statistics tab.
3. Click Create Statistics.
4. Check Enable to allow use of the table statistics gathered to optimize execution plans.
You do not need to check the Enable check box to configure statistics gathering, but you need to check that box for the optimizer to use the statistics. To disable statistics gathering and use of existing statistics, clear the Enable check box and save.
5. For Mode, select one of the following:
Do not gather statistics—The default setting for all data sources and tables.
Gather table boundary statistics—Counts the number of rows in tables, performing a SELECT COUNT (*) on each of the specified tables.
Gather column boundary statistics—Gets table boundary statistics and gathers the minimum and maximum, and counts distinct values for numeric data types. For string data types, calculates the number of distinct values for tables with fewer than ten thousand rows. The cost of gathering statistics for the column boundary statistics is reduced by application of one or more assumptions.
Gather all statistics—Collects all the above, plus a full table scan of numeric data types, to build a histogram. For string data types, also builds a string index.
6. Optionally, set the Number of Threads.
Number of Threads specifies the number of low-priority, query-executing threads allowed on the database and on TDV.
Note: Some databases do not tolerate large numbers of threads scanning tables at the same time. TDV also has a thread capacity limit that can cause disk paging when nearing memory thresholds. You can specify a maximum number of threads (without needing a server restart) by selecting Administration > Configuration, and in that window navigating to Server > SQL Engine > Optimizations and setting Max Number of Statistics Gathering Threads In The System to whatever value works in your environment.
7. Set the statistics gathering schedule:
Manual—Use Gather Now to gather statistics right away. If your cache is controlled by a cache policy, it is probably better to use a periodic or programmatic refresh.
Exactly Once—Requires that you set a day and time in the future to run.
Periodic—Statistical collection occurs at timed intervals, such as at an off-peak hour each night.
Frequency of statistics gathering also depends on data volatility. Some tables never change and others change frequently. Gather statistics based on the expected frequency of cardinality and table boundary change.
It is best to use off-peak hours for gathering data source statistics, a process that can use significant resources. You can use multiple threads for data sources that support parallel processing.
8. In the Time-out in minutes field, specify the number of minutes permitted for gathering data from a resource.
A value of 0 or null (the default) indicates that there is no time limit.
If the timeout expires after statistics processing is already returning data, the timeout triggers an attempt to make statistical estimates from the subset of data already returned.
9. Save your selections.
10. Define cardinality statistics gathering option for the table or view that you are interested in by following the steps in Creating Cardinality Statistics for a View.
TDV supports the NTILE analytical function, which can be run for a given data source to compute statistics. TDV uses this function to generate histograms, capturing the distribution of the values of a given column, within the data source instead of pulling all the data within the TDV server to compute the histograms.
By default NTILE-based statistics collection is disabled. You can enable it by setting the property Disable NTILE-based Statistics Collection to False. This option can be accessed via Server > Configuration > Debugging > Statistics.
Refer to NTILE in the Reference Guide for the syntax and usage of this analytical function.