User Guide > TDV Caching > Overview of TDV Caching > How Does TDV Caching Work?
 
How Does TDV Caching Work?
TDV caching options and how caching works for the different TDV resources are worth understanding before choosing one over the other. This section covers the following topics:
How Does Table and View Caching Work?
What Is Procedure Caching?
What Is Transaction Result Caching for Procedures?
How Does Caching Data in a Cluster Environment Work?
How Does Table and View Caching Work?
Within TDV, it is possible to configure caching for views and tables. A SELECT * is run against the view or tables and every row returned is placed into the cache. Because of this:
The cache will be as large as the output set of the table or view.
All requests to that view or object come from the cache after its first load.
You can create a view based on the data or object that you want to cache. You could then later limit the data that is included in the cached view for performance reasons.
TDV caching is best used for materialized views under any of these conditions:
A view requires substantial execution time.
The data does not change significantly within a given period.
The data source should be protected from excessive usage.
For view caches, if a query references a cached view, and the cache is not loaded, it starts a refresh and the query is blocked until the cache is loaded.
In the following example, Sales_Table is an Oracle table, and the UK_Sales_View is a TDV view of that table. If you cache at the table level, all 10 million rows and 15 columns are selected from Sales_Table and put into the cache. However, if you cache the view, the view projects only three columns and has a WHERE clause that limits the result set to 100,000 rows.
The WHERE clause and the SQL in the UK_Sales_View are pushed to the Oracle database for processing. Only 100,000 rows and three columns are extracted from Oracle, sent over the network, and placed in the cache. This is 500 times less data for the same result. The cache is smaller, the refresh is quicker, and there are fewer loads on both the Oracle database and the network during the refresh.
In the following example, Composite_view joins a file, an Oracle table, and an SAP table. Report_proc uses Composite_view and pushes a predicate against it. Because File_1 does not support pushes, the predicate requires a full scan of File_1 and SAP_Table_1 each time, slowing performance with the joins and the fetches. If Composite_view is cached, the join is performed only once.
What Is Procedure Caching?
Tabular results from procedure-based resources can be cached by TDV. Procedures with one or more input parameters can have many variants (unique sets of input parameters) and return different result sets based on those input variants. More than one result set from the same procedure can also be cached.
To track variants, the input parameter values are converted to a string to compare for uniqueness. If this string exceeds 255 characters, the procedure call is not cached. Instead, it is executed as if caching was disabled.
For example, Proc_1 has one input parameter (A, an integer) and returns a result set. Proc_1 can be called for different values of A. The caching mechanism executes the procedure and stores the result set for each unique set of input parameters (in this case, integer A). If two variants are in the cache, one with A= 1 and another with A = 2, and Proc_1 is called with A equal to 1 or 2, there is a hit on the variant and the corresponding cached result set is returned.
If Proc_1 is called with a variant that is not currently cached, like A = 3, a “cache miss” occurs, and the procedure is executed with A = 3. The result set is directly returned to the caller, and both the variant and the result set are written to the cache. If proc_1 (3) is called again prior to expiration of the result set, where A = 3, the cached result set is returned immediately.
Because there can be many variants, it might not be possible to store them all. By default, the maximum number of stored variants is 32, but you can change the value in the Maximum number of procedure variants field in the Advanced section of the Caching panel. If the cache is full, the least recently used variant is swapped out for the latest variant.
The following procedural resources can be cached:
Java procedures
Packaged query procedures
Parameterized SQL procedures
Physical stored procedures that are introspected
SQL script procedures
Transformation procedures—basic, streaming, XSLT, and XQuery
Web service operations
The procedure caching process uses one storage table for each output cursor and an additional storage table for any scalar outputs. For example, a procedure with two INTEGER outputs and two CURSOR outputs would use three tables:
One for the pair of scalars
One for the first cursor
One for the second cursor
If a procedure has input parameters, the cached results are tracked separately for each unique set of input values. Each unique set of input parameter values is called a variant.
For example, a procedure with an INTEGER input parameter would have separate results cached for inputs of 1, 3, and 5, or whatever value. A procedure with two INTEGER input parameters would have separate results cached for inputs (1,1), (1,2), and (x,y).
Note: A procedure cache that uses non-null input parameters must be seeded with at least one variant from a client application other than Studio, for the Cache Status to change from NOT LOADED to UP. Using the Refresh Now button does not change the status of a cache that is not loaded. Even if procedure caching configuration is correct, the status does not show that it is loaded until a client seeds the cache.
When a procedure cache is refreshed, all the cached variants already in the table are refreshed. If no variants have yet to be cached, then nothing is refreshed or only the null input variant is refreshed. You can refresh a procedure cache from Studio or using the RefreshResourceCache procedure. (See the TDV Application Programming Interfaces Guide.)
What Is Transaction Result Caching for Procedures?
Transactional result caching is available for procedures. For each unique set of input parameter values, the caching data is collected one time. When transaction result caching is enabled, results are captured in memory the first time the procedure is run during any transaction; additional calls to the procedure with the same input parameters return the cached data. Transaction result caching has no refresh, clear, status, or tracking available.
Transaction result caching is useful in the following scenarios:
The TDV procedure is run against every row in a query.
The SQL Script logic inherently requires enough processing time to make the entire query’s cumulative execution time unacceptable.
The results from the TDV procedure performs lookups against a table whose data changes frequently.
Results of calls to the SQL script are stored in memory, and later calls with the same parameters return data from the cache if the data exists there, or execute the SQL script and store that variant in memory.
In Transaction Result Caching:
Data cached for the transaction persists only for the duration of the transaction.
Transactional cached information is stored in memory.
For example, consider the following two TDV resources. Each call to lookup requires a SQL lookup to a table. If the composite view returns 1,000,000 rows without Transaction Result Caching, there are 1,000,000 database queries performed by lookup. If transaction processing is enabled and field1 does not contain unique data, each result and input combination is stored temporarily in memory. When a duplicate field1 value is returned, TDV returns the previous result value in memory for that input. All the values are discarded when the transaction ends.
Composite View
Lookup SQL Script
SELECT lookup(field1) result, field2, field3
FROM largeTable
PROCEDURE lookup(IN iVal INTEGER,
OUT resVal INTEGER)
BEGIN
DECLARE c CURSOR;
OPEN c AS SELECT f1
FROM largeSlowLookupTable
WHERE lookupVal = iVal;
FETCH c INTO resVal;
CLOSE c;
END;
To enable transaction result caching, check the “Execute only once per transaction for each unique set of input values” check box in the Transaction Options section of the Info tab for any procedure.
How Does Caching Data in a Cluster Environment Work?
Running TDV server in a cluster environment, whether the TDV Active Cluster or any other clustered environment, you can cache data.
When the TDV server is running in a cluster and the cache is in Default Cache or Automatic mode, each server keeps a separate copy of cached data on its local file or database system, and no sharing of data is performed.
When the server is running in a cluster and the cache is in single table or multi-table mode with an identified data target, all servers in the cluster access the same data target for cache information. In addition, the servers cooperate to ensure that a minimum number of refreshes occur. For example, if two servers both need to refresh the data in a cache, only one of them performs the refresh, and then both servers use the updated data.
Note: The use of the File-Cache data source or other file-based data sources is not recommended when in a cluster. The use of network-mounted files is also not recommended because the file data sources do not support file locking.
In a cluster environment, any attempt to refresh a cache uses the TDV cache status table to determine if any other refreshes are in progress. If no others are in progress, the refresh starts and all other servers are notified to reread the TDV cache status table (for information on what the cache status table is, see TDV-Created Caching Objects). If a refresh is already in progress, the server contacts the server that is currently refreshing and waits for that refresh to complete.
In a cluster environment, an attempt to clear a cache marks the cache data for clearing in the TDV cache status table. The background task to actually delete the data contacts other cluster members to determine what cache keys are no longer in use so the task can safely remove only rows that no server in the cluster is accessing. After the clear task completes, all other servers are notified to reread the TDV cache status table.
A scheduled refresh in a cluster relies on the trigger system cluster feature to ensure that triggers are processed only the appropriate number of times. File cache schedules are configured to fire separately on each server. Database cache schedules are configured to fire only once per cluster. You can create trigger resources, instead of using the schedule options provided on the Caching panel, to control this behavior. (See Triggers.)