User Guide > TDV Caching > Setting Up Caching > Setting Up Native (Bulk) Caching Options
 
Setting Up Native (Bulk) Caching Options
The TDV native loading option improves cache performance when:
The cache source and cache target reside on the same TDV data source resource.
The cache data is being saved to one of the active cache targets listed in the Native Loading Option column of the table in Supported Cache Target Storage Types.
The native cache loading options are enabled by default. Native loading makes use of functionality that is inherent to the data source to improve the speed of data movement.
If TDV detects that your data source and your cache data target are on the same TDV data source resource, it determines if a direct SELECT and INSERT is the best way to move data or if one of the configurable native loading options might work faster. Most databases have proprietary data movement functionality that is optimized for that database. For example, all Oracle databases come with database link functionality. That database link functionality can be used to provide cache loading performance gains that are not possible when using the TDV JDBC connections to move data.
If both the native database loading and the TDV cache loading fail, the refresh status is set to failed.
Topics in this section
Performance Option
Uses the DB2 LOAD utility.
Uses the bcp utility (bcp.exe) for bulk import and export.
Uses Netezza external tables.
Uses database links.
LOAD_TABLE statement.
Uses Bulk Load utility.
 
Configuring TDV Native Loading Option for DB2
For DB2, the TDV native loading option makes use of DB2’s LOAD utility. The LOAD utility can quickly load or add data to a table where large amounts of data need to move. LOAD can perform significantly faster than IMPORT because LOAD writes formatted pages directly into the database while IMPORT uses SQL INSERTs. Before attempting to use this method, we recommend that you see if using JDBC batch insert options can give you acceptable performance improvements. If you choose to implement the JDBC functionality, you do not need to configure the DB2 LOAD utility, or change any TDV configuration settings.
The DB2 LOAD utility when working with TDV requires that the DB2 command-line utility be run. How it works with TDV varies by platform:
Platform
DB2 Command-Line Utility Name
Execution Details
Windows
DB2CMD.exe
When it runs, a command window pops up, requires a password, and stays active until the upload completes. While the window is open, the password might be visible.
UNIX
db2
It can be run as a background process.
Requirements
The data being loaded must be local to the server.
Requires advanced DB2 database level configuration
Limitations
This feature is not valid for:
Binary and BLOB data types
Kerberos implementations
To configure the DB2 LOAD utility to work with TDV for caching
1. Consult the IBM documentation on configuring and using the DB2 LOAD utility.
2. Install and configure all relevant parts of the DB2 LOAD utility according to IBM’s instructions for your platform.
The client drivers might need to be installed on all the machines that are part of your TDV environment.
3. Verify the full path to the DB2 command-line utility that you want to use to run the DB2 LOAD utility. For example, locate the path to your DB2 SQL command-line client.
4. Open Studio.
5. Select Administration > Configuration.
6. Locate the Enable Bulk Data Loading configuration parameter.
7. Set the value to True.
8. Click Apply.
9. Click OK.
10. Locate the DB2 Command-Line Utility Path configuration parameter.
11. For Value, type the full directory path to the DB2 command-line program. If there are spaces in the pathname, be sure to enclose the path in double quotes. For example:
“C:\Program Files\IBM\DB2\Tools\Bin\DB2CMD.exe”
 
12. Click Apply.
13. Click OK
14. Restart your TDV Server.
Some configuration parameter value changes are applied while the TDV Server is running, but many of them require a server restart to work correctly.
15. Open Studio and locate the DB2 data source for which you want to enable the bulk load feature.
16. Open the data source editor.
17. Select the Advanced tab.
18. Scroll down to locate the Database Alias (Used For DB2 Load) field.
This property is passed on to the DB2 Load to identify the cataloged database.
19. For Value, type the name of the DB2 database. For example: dvbudb21.
20. Save your changes.
21. Refresh your cache.
Temporary files used to support this feature are created in <TDV_install_dir>\...tmp\cacheloading\db2, all files, data, commands and logs are deleted by TDV after the upload process completes.
Configuring TDV Native Loading Option for Microsoft SQL Server
For Microsoft SQL Server, the TDV native loading option makes use of Microsoft’s bulk import and bulk export (bcp) function.
The bulk import and bulk export functionality available caching for Microsoft SQL Server requires the configuration of the Microsoft bcp utility. When the bulk import and bulk export functionality is used instead of the JDBC functionality, performance is improved. The bcp utility (bcp.exe) is a command-line tool that uses the Bulk Copy Program API. The Microsoft bcp utility performs the following tasks:
Bulk exports data from a SQL Server table into a data file.
Bulk exports data from a query.
Bulk imports data from a data file into a SQL Server table.
Generates format files.
If you choose to implement the JDBC functionality, you do not need to configure the bcp utility, or change any TDV configuration settings.
When the cache source and target exist on the same SQL Server data source, native loading of cache data using a direct SELECT and INSERT can provide significant performance gains. The feature requires that the value of the Enable Bulk Data Loading Studio configuration parameter be set to True.
When configuring data ship or a cache for Microsoft SQL Server, you can configure the bcp utility and then set up access permissions. Both of these processes are described below.
The configuration steps are similar for caching and data ship, for more information see Configuring Data Ship for Microsoft SQL Server.
Note: In the bcp utility, NULL values are interpreted as EMPTY and EMPTY values are interpreted as NULL. TDV passes the ‘\0’ value (which represents EMPTY) through the bcp utility to insert am EMPTY value into the cache or data ship target. Similarly, TDV passes an EMPTY string (“”) to the bcp utility to insert a NULL value into the cache or data ship target.
To configure the Microsoft bcp utility to work with TDV for caching
1. Verify that the bcp.exe utility has been installed and is located in a directory that you can access. Note the full path to the bcp.exe file.
2. Open Studio.
3. Select Administration > Configuration > Data Sources > MS SQLServer Sources.
4. Select the Microsoft BCP utility parameter.
5. For Value, type the full directory path to the bcp.exe. For example:
C:\Program Files\Microsoft SQL Server\100\Tools\Binn\bcp.exe
6. Optionally to use Windows Authentication, locate and select the DataShip BCP Threshold parameter.
The default value is 50000. After bcp utility is enabled, if the data shipping cost estimate is 50000 or above, data ship uses the bulk import and export implementation; otherwise JDBC, is used to move the data.
7. Optionally to use Windows Authentication, adjust the value of the DataShip BCP Threshold parameter to values that fit your requirements.
8. Click Apply.
9. Select Administration > Configuration.
10. Navigate to TDV Server > Configuration > Debugging > Enable Bulk Data Loading.
Or, search for ‘bulk’ using the Configuration Find field.
11. Set the value to True.
12. Click Apply.
13. Click OK.
14. Stop the TDV Server.
15. From the command line where your TDV Server runs, log in as the domain user.
16. Start the TDV Server.
17. You can now complete the setup of your cache data target.
18. Click OK.
19. Optionally, when running with Windows Authentication, on windows, open Services > TDV Server <version> and select Properties.
20. Select the Login On tab, set the This account fields to a domain user who has access to MSSQL instance.
21. Click OK.
To set up access permissions for SQL Server for caching
1. Consult your SQL Server documentation for how to set the SHOWPLAN permissions for the database.
2. For every SQL Server database that will participate in the caching, grant the SHOWPLAN permissions. For example:
GRANT SHOWPLAN
TO <database> [ ,...n ]
 
3. You can now complete the setup in Caching to a File Target.
To tune the TDV caching behavior when Microsoft SQL Server is the cache target
1. Open Studio.
2. Select Administration > Configuration from the Studio menu bar.
3. Expand the Data Sources > Common to Multiple Source Types >Data Sources Data Transfer.
4. Determine the best value for the Column Delimiter parameter.
The default is |.
5. Click Apply.
6. Click OK.
Configuring Native Caching for Netezza
For Netezza, the TDV native loading option makes use of the Netezza external tables. You must follow the Netezza documentation for how to set up the external tables on the Netezza data source.
For Netezza you have the following configuration choices:
Enabling Bulk Load for Netezza
Managing NUL Character in Strings for Netezza
When the cache source and target exist on the same Netezza data source, native loading of cache data using a direct SELECT and INSERT can provide significant performance gains. The feature requires that the value of the Enable Bulk Data Loading Studio configuration parameter be set to True. For Netezza, the parameter value must also be True to enable bulk loading of data cached from procedures.
XML, BINARY, and VARBINARY are not supported by Netezza, any data type related to these data types will not be added to the Netezza cache.
By default, TDV analyzes the table selected for caching to see if DISTRIBUTE can be used to achieve performance gains through parallel processing of the cache query. All keys and indexes specified on the resource are consulted. This is done to determine if the DISTRIBUTE clause was added to the native DDL used to create the cache target tables. For single-table caching, the cachekey column is also added to the DISTRIBUTE clause. If neither keys nor indexes are specified, DISTRIBUTE ON RANDOM is used.
Enabling Bulk Load for Netezza
To enable bulk loading for Netezza
1. Open Studio.
2. Select Administration > Configuration from the Studio menu bar.
3. To tune buffer size for Netezza, expand the Data Sources > Common to Multiple Source Types >Data Sources Data Transfer folder.
4. Determine the best value for the Buffer Flush Threshold configuration parameter.
This parameter controls how many data rows are written to the buffer file before flushing the buffer. The minimum is 1000 and the default is 10000.
5. Click Apply.
6. Click OK.
7. Select Administration > Configuration.
8. Navigate to TDV Server > Configuration > Debugging > Enable Bulk Data Loading.
Or, search for ‘bulk’ using the Configuration Find field.
9. Set the value to True.
10. Click Apply.
11. Click OK.
12. You can now complete the setup of your Netezza cache data target.
Managing NUL Character in Strings for Netezza
Optionally, if you need to move data that has NUL characters, you can use the following procedure to determine how those characters are managed by TDV.
When the Ignore Nul Characters in Strings configuration parameter is true, NUL('\0') characters get discarded by Netezza and rest of the String is loaded. When the configuration parameter is false, NUL characters in Strings are not discarded. The default value is false.
To ignore Nul characters in strings
1. Make sure you have both Modify All Config and Access Tools rights.
2. Log into Studio as the admin user.
3. From the Administration menu, choose Configuration.
4. In the tree pane, navigate to Data Sources > Netezza Sources > Ignore Nul Characters in Strings.
5. Select True.
When set to false, NUL characters in strings are not discarded.
6. Click Apply.
7. Click OK.
This Studio configuration change is not immediately propagated to other open instances of Studio connected with this server.
8. Restart the TDV Server.
Configuring Native Caching for Oracle
When you want to use Oracle as your cache target, you can finish setup of the TDV native loading option that uses Oracle’s database link functionality to provide better caching performance. Additionally, you can define the TDV parallel cache loading option. This feature is supported for Oracle cache targets.
To set up the native loading option caching data to an Oracle target
1. Make sure that the view that you want to cache:
Is a view created from a single TDV data source.
Is a view that has a pass-through query that is run (or pushed down) entirely to the data source.
The view source can be DB2, Oracle, SQL Server 2008, or Sybase 15.
2. Define database links between your data sources and the Oracle cache target. The database links must be defined using Oracle database tools to directly add the database link using SQL, or by modifying your tnsnames.ora file.
3. Open Studio.
4. Open the Oracle data source that you want to act as your cache target.
5. On the Configuration tab, select the Advanced tab in the Connection Information section.
6. Scroll down and check the Enable Oracle Database Link check box.
7. Type the database link name as you defined it in your Oracle database. The database link path must point to the path of the data that is the source for the cache. Use the Add Database Link button if you want to have more than one database link (to enable caching from multiple sources).
8. Make sure Enable Data Source is checked.
9. Click Add/Remove Resources to open the Data Source Introspection window and find, introspect, remove and view properties of data sources.
See Retrieving Data Source Metadata, for details.
10. Click Test Connection to make sure the connection works.
11. Save your changes.
Configuring Native Caching Option for Sybase IQ
Depending on the platform you are installing to, your steps might vary. These steps do not detail every step as might be required by Sybase, refer to your Sybase documentation for further details.
To configure Native Cache Loading for Sybase
1. Verify or install a SQL Anywhere Database Client. You can obtain a trial version through the Sybase download site.
If you don't have SQL Anywhere Database Client and don't want to install one on your Unix system, then you can copy the client into <TDV_install_dir>/sqlanywhere<ver>, because the SQL Anywhere database client has the drivers that are needed for TDV.
2. Locate the following files in the directory where the Sybase client is installed.
jodbc.jar
libdbjodbc<ver>.so for Unix
dbjodbc<ver>.dll for Windows
3. Copy the files to the locations described in the following table.
OS
Copy jodbc.jar Into
Copy dbjodbc<ver>.dll Into
Windows 64
<TDV_install_dir>\apps\common\lib
<TDV_install_dir>\apps\common\lib\win64
Windows 32
<TDV_install_dir>\apps\common\lib
<TDV_install_dir>\apps\common\lib
UNIX (32 and 64)
<TDV_install_dir>\apps\common\lib
<TDV_install_dir>\apps\common\lib
4. Stop your TDV Server.
5. For Unix, set the global LD_LIBRARY_PATH environment variable to point at the SQL Anywhere client libraries folder. TDV uses LD_LIBRARY_PATH to find the Sybase Client library. For example, to temporarily set the variable, type:
export LD_LIBRARY_PATH=/opt/<TDV_install_dir>/sqlanywhere<ver>/lib<ver>/
 
6. Create an ODBC data source following the guidelines in your Sybase documentation. The following instructions highlight some of the necessary steps depending on your platform.
Platform
Instructions
Unix
1. Create an odbc.ini file. We recommend that you create it under <TDV_install_dir>.
2. Create a Sybase IQ data source name (DSN) in the odbc.ini file. For work with TDV, the Driver variable is the most important setting, because it points to the SQL Anywhere client that you have installed. For example, data sources named, test1 and test2, would look similar to the following:
#####################################
SybDB@machine64:cat odbc.ini
[test1]
Driver=/opt/<TDV_install_dir>/sqlanywhere<ver>/lib<ver>/libdbodbc<ver>_r.so
host=10.5.3.73
port=2638
uid=dba
PWD=password
DatabaseName=asiqdemo
PreventNotCapable=YES
 
[test2]
Driver=/opt/<TDV_install_dir>/sqlanywhere<ver>/lib<ver>/libdbodbc<ver>_r.so
host=10.5.3.74
port=2638
uid=dba
PWD=password
DatabaseName=asiqdemo
PreventNotCapable=YES
######################################
Windows
1. Start the ODBC Administrator, select Sybase > Data Access > ODBC Data Source Administrator.
2. Click Add on the User DSN tab.
3. Select the Sybase IQ driver and click Finish.
4. The Configuration dialog box appears.
5. Type the Data Source Name in the appropriate text box. Type a Description of the data source in the Description text box if necessary. Do not click OK yet.
6. Click the Login tab. Type the username and password. If the data source is on a remote machine, type a server name and database filename (with the .DB suffix).
7. If the data source is on your local machine, type a start line and database name (without the DB suffix).
8. If the data source is on a remote system, click the Network tab. Click the check box for a protocol and type the options in the text box.
9. Click OK when you have finished defining your data source.
10. On Unix, use the following commands to verify that the DSN is set up successfully:
cd sqlanywhere<ver>/bin<ver>
./dbping -m -c "DSN=test1"
 
A “Ping server successful” message means that the DSN is working and any application can use the DSN to contact the Sybase IQ through the ODBC Driver.
11. On Unix, set the global ODBCINI environment variable to point at the SQL Anywhere odbc.ini file. For example, to temporarily set the variable, type:
export ODBCINI=/opt/<TDV_install_dir>/odbc.ini
 
12. Start TDV server.
13. Open Studio.
14. Open your Sybase IQ data source.
15. Select the Advanced tab.
16. Scroll to the end of the Advanced tab.
17. Select and type the following:
Sybase iAnywhere JDBC Driver
Select this check box to enable better performance. This option enables the Sybase IQ specific ODBC LOAD TABLE SQL tool to import data into TDV.
Sql Anywhere Data Source
Enter your ODBC DSN name.
18. Save and close your Sybase IQ data source.
Configuring Native Caching Option for Vertica
By default, TDV is configured to use Vertica’s Bulk Load utility or INSERT/SELECT as the native loading mechanism. There are no TDV configuration parameters that you need to set.
Validating that your system is configured to enable native caching
1. Open Studio.
2. Select Administration > Configuration.
3. Search for Enable Native Loading and make sure it is set to True.
4. You can now complete the setup in Caching to a File Target.