Query Options
Click the Query Options button on the StreamingDB spreadsheet to display the Query Options dialog box. Use these options to determine exactly how the exchange of data is handled between the client machine and the database on a server machine.
- General information
- Not all database environments (the database itself and the OLE DB provider or ODBC driver) will support all combinations of options on the Query Options dialog. In general, these situations are handled "silently and gracefully" by creating a cursor with possibly different options and settings than the ones specified. For example, if you specify a client side dynamic cursor, you will get a static cursor instead, since all client side cursors (which use the ADO cursor library) are by definition static. If you specify an updateable server-side static cursor, you may receive a keyset cursor if the OLE DB provider or ODBC driver can support one, or a read-only cursor if not. To determine exactly what kind of cursor you have received after executing a query, display the Query Options dialog again. The settings here will be updated based on the current cursor type and options.
- Writing information back to the database
- If you intend to write information back to the input database via the Rapid Deployment of Models options for computing predicted values and classifications (and other statistics), review the options described below carefully to select the appropriate Cursor Type and Lock Type consistent with that operation. Specifically, choose a Lock Type other than Read Only and either a Server Side Dynamic or Keyset cursor, or a Client Side Static cursor. Generally, a Lock Type of Batch Optimistic will be most efficient (updates are sent to the remote database in batches rather than one record at a time.) However, not all database environments will support this option. The Client Side cursor will always support the batch optimistic lock type. Some experimentation with various Query Options may be necessary to determine the optimal settings for your particular database environment. Note that writing back to the database requires additional processing resources, and unless this functionality is required (e.g., for deployment of predictive data mining models (via the Rapid Deployment of Models module), it is strongly recommended not to configure the connection to the database for write-access.
- Cursor Location
- The following options are in the Cursor Location group box.
- Server Side
- If this option button is selected, the cursor is maintained on the server.
- Client Side
- If this option button is selected, the cursor is maintained on the client (all data are copied onto client machine)
- Cache Size
- Specify the number of records that are cached on the client machine (applies when using a server side cursor)
- Cursor Type
- The following options are in the Cursor Type group box.
- Forward Only
- If this option button is selected, the cursor can be traversed only forward, one record at a time. If an analysis requests data that comes before the current position, the implementation must re-query and advance to the requested record. Good for analyses that require only a single pass through the data.
- Static
- A static cursor supports scrolling forward and backward through the data, thus allowing random access to the data. This cursor type provides a "snapshot" of the result of your query - records modified, added to, or removed from the database after the cursor is populated will not be visible. A server side static cursor can place considerable strain on the database system. A client-side cursor is always static. If an analysis or other usage of the Streaming Database Connector requires random access or multiple passes through the data, this type of cursor may be needed.
Writing back to the database. Generally, a server side static cursor is by definition read-only. Note however that client side static cursors can be updateable when an appropriate Lock Type is chosen, and with certain limitations. When using a client side cursor, the cursor is populated and maintained on the local client machine by Microsoft’s ADO Cursor Engine, and after all data is copied to the local machine, the data in the cursor is inherently disconnected from the database. Therefore, when you attempt to update the database through such a cursor, the cursor engine constructs and executes an action query, for example
UPDATE Customers SET Balance = 100 WHERE CustomerID = 7
Limitations of Client Side Updatable Cursors. In order for this scheme to work, the cursor engine needs to be able to get certain meta data about the tables in your query from the database. If the OLE DB provider or ODBC driver returns inaccurate data or ADO cannot gather the base table and field names, updates may fail.
ADO must be able to locate the records in the database that you want to modify. If the table has a primary key, ADO will use the primary key fields in the action queries it builds to locate the correct record, if those primary key fields were all included in your query. Retrieving all the primary key data for a given table can be expensive and there is a notable exception; the Microsoft OLE DB provider for Oracle requests the row ID and uses that row ID instead of the primary key fields in constructing the action queries, resulting in better performance.
If you use a database table that has no primary key, or does not include the primary key fields in your query, then ADO has no reliable way to locate the record you want to modify. In this case, it will use all the values for all fields in the cursor in the WHERE clause of the action query.
- Keyset
- A Keyset cursor is a server side cursor that enables you to update data (when using an appropriate Lock Type) and to see changes made by other users. It allows random access to records and multiple passes through the data. In contrast to a static cursor, which the query processor will completely populate when you submit your query, the Keyset cursor is initially populated with only the data required to locate the records in the tables that satisfy your query. Generally this data corresponds to the primary key in your tables and is referred to as the keyset.
As you (or a Statistica analysis) pass through the data in the cursor, the number of records specified in the Cache Size is copied to the local machine. When a record not in this cache is requested, the stored key data in the cursor is used to get the next batch of records from the database.
With the Keyset cursor, any record changed by another user will be visible to you the next time that record is loaded into the local cache, e.g., on the next pass through the data. Records added to the database that satisfy your query will NOT be visible; the set of key values stored in the cursor is itself static. Records deleted from the database by other users will be removed from the Keyset cursor the next time the cache is refreshed.
- Dynamic
- The Dynamic cursor "behaves" somewhat like the Keyset cursor. Initially, the query processor retrieves the key field information for the records that satisfy your query and returns the number of records requested in the Cache Size property to the local machine. With the dynamic cursor, however, the query processor will rebuild the keyset information each time the cache is refreshed. Thus records added by other users will become available in this cursor as you navigate through the records.
Note: the Keyset and Dynamic cursor types are significantly more complicated than forward only and static cursors, and therefore they are typically much slower and place a greater strain on the remote database system. In addition, unexpected or misleading results can be obtained if the number or content of records in the Streaming DB Connector changes while an analysis is in progress.
- Lock Type
- Locking is the process by which a DBMS restricts access to a row in a multi-user environment. When a row or column is exclusively locked, other users are not permitted to access the locked data until the lock is released. This ensures that two users cannot simultaneously update the same data. The Lock Type options specify how records are to be locked when updating data in the database. Most uses of the streaming DB connector do not require updating the database, and so for best performance in those circumstances, the Read Only option should be chosen. However, certain Statistica analyses and operations, such as the Rapid Deployment of Models module, may write to the data source (spreadsheet or StreamingDB spreadsheet). Therefore when performing such an operation with streaming DB connector, you must ensure that your recordset is updatable or the operation will fail.
Providers may not support all locking options. If an unsupported type is requested, a supported type will be substituted.
- Read Only
- If this option button is selected, the Streaming DB Connector is read-only. You cannot write back to the database. This setting will normally result in faster data reading, so unless you need to update the database, this is the recommended option.
- Pessimistic
- Pessimistic locking is the most aggressive locking option. When using this option, other users are unable to access records that you are modifying, and vice versa. When using streaming DB connector programmatically via the Statistica Object Model, pessimistic locking implies that the record becomes locked when you first change any field (variable value) in a record (case) and remains locked until you call the Update (or CancelUpdate) method of the streaming DB connector DBTable object.
- Optimistic
- With Optimistic locking, two users can edit the same record simultaneously. The first one to commit their changes successfully updates the database, and the second user’s attempt to update that data fails. With this option, other users are able to edit records that you are updating (possibly creating conflicts). When using streaming DB connector programmatically via the Statistica Object Model, optimistic locking implies that a record is not locked as you modify fields (variable values) for a particular record (case); they are only locked when the Update method of the streaming DB connector DBTable object is called, and remain locked only until that call returns.
- Batch Optimistic
- This option is primarily for client-side recordsets, but can be used with server side recordsets if the OLE DB provider and/or database supports having multiple records with pending changes; for example, SQL Server cursors support this functionality, while Microsoft Access cursors do not. With this option, updates to many records can be batched and submitted all at once, thus reducing network traffic and the number of database transactions.
When using a client side recordset, or if your database and OLE DB provider/ODBC driver support batch updates in this manner, and when using a Statistica analysis that writes back to the database, this is the recommended setting.
When using streaming DB connector programmatically via the Statistica Object Model, you must call the streaming DB connector DBTable method UpdateBatch to submit the batch update, or CancelBatch to cancel the pending changes.
- Asynchronous Query
- When the Asynchronous Query check box is selected, the application does not wait for the query to complete. Data will be previewed as it becomes available. An analysis can begin immediately but will block when it requests data until the data becomes available.
- Asynchronous Fetch
- Asynchronous Fetch applies when using a client side cursor. The application does not wait for all records to be transferred to the client machine. Records will be previewed as they become available. When an analysis requests data that has not been fetched yet, it will block until the data is available.
- Maximum number of records to return from query
- If supported by the database platform, this option limits the number of rows returned from a query. An entry of zero means no limit.
- Use optimized Oracle driver for updating the database
- Select this option to use Oracle's native API (OO4O) for write back operations to Oracle databases. This option is recommended for maximum performance when writing back to Oracle databases. Note that you must select the primary key from every table in your query to take advantage of this option. If the table that you are querying and writing back to does not have a primary key or you are not including that primary key in your query then you must clear this option.
Note: Statistica may alter some of the options that you set in this dialog box after the initial query if it determines that different settings may be more optimal.
Note: Performance considerations. For best performance (and for read-access only, i.e., when not writing statistics back to the database via the Rapid Deployment of Models module), follow these guidelines:
- If you are creating an analysis that requires only one forward pass through the data, use a server side forward only cursor. This is the fastest type of cursor.
- If you require random access to or will be making several passes through the data, use a client side cursor with asynchronous query and fetch.
- Select the option Use optimized Oracle driver for updating the database when working with Oracle databases.
See the Streaming Database Connector and Cursor definitions. See also, Streaming Database Connector Technology (Technical Overview) and Streaming Database Connector FAQs.