Importing Data from Cubes


When you add data tables from a data connection, you have the option to either keep the data table external or to import it into the Spotfire data engine and use it in-memory.

Note: If you are interested in all of the data from an OLAP (Online Analytical Processing) cube, it is probably a better idea to use an in-database connection, while import is mainly intended for small subsections of the cube data.

When you are importing cube data into Spotfire you will actually flatten the cube into a single data table, which means that some features that you have access to when working with an in-database cube are lost and you may see different results when aggregating your imported data in Spotfire compared to what you get when the external system does all the aggregation. Spotfire does not import aggregated data, only leaf nodes that then can be aggregated using the Spotfire aggregation methods.

If you intend to import data from a cube into Spotfire there are a number of things that you must remember. First of all, the full cube may be enormous, so it is recommended to always limit the dimensions to import to those that are absolutely necessary only, to reduce data explosion as much as possible.

For example, if a cube like the one shown to the left below is imported into Spotfire, you will get the table shown to the right in the image below. The visualizations you set up will then decide how to aggregate the measures from the resulting table.

connect_cube_import_example.png

When an imported cube data table is aggregated (for example, with the default Sum aggregation), the available leaf values are simply summarized when you show a categorical value of a higher level (e.g. the Sum(Sales) for France from the table above). However, in the in-db cube, it is the external system that decides what an aggregation means, which may include other aggregation methods, data that was not included in the import, and so on, which may produce a completely different result.

Limit the Number of Dimensions

As an example of the importance of limiting the data as much as possible prior to import, imagine a cube with the following number of values for different dimensions:

Department: 10

Product: 5000

Currency: 20

Business Unit: 1000

Scenario:  10

Account: 100

Bringing in the entire cube would result in a cross-joined table with the following number of columns: 10x5000x20x1000x10x100=1x1012

As you can see, the amount of data very quickly explodes with the number of dimensions used, especially if each dimension contains many different values. Take care not to overload the data source by combining too many dimensions.

Best Practices

Known Limitations

The TIBCO Spotfire Connectors product also includes connections to other cube data sources. Even though these steps use a Microsoft SQL Server Analysis cube they can be used as an example for import of any cube data since the process for adding data connections is quite similar for all cube sources.

  1. Select File > Add Data Tables....

  2. Response: The Add Data Tables dialog appears.

  3. Click Add > Connection To > Microsoft SQL Server Analysis Services.

  4. Response: The Microsoft SQL Server Analysis Services Connection dialog is opened.

  5. Specify the Server you want to connect to.

  6. Click Connect.

  7. Response: Spotfire will connect to the specified server, and the cubes that are available on the server will be listed in the Cube drop-down list.

  8. Select the Cube of interest.

  9. Click OK.

  10. Response: The Data Selection in Connection dialog appears.

  11. If you want to limit your import to data from a single measure group, select the measure group of interest in the drop-down list.

  12. Comment: You can also type any text in the search field to find objects containing that text only.

  13. Click on the object of interest in the left pane, then click Add >.

  14. Comment: See Icon Descriptions for information about what the different objects represent. If you select the cube icon, you will get all data in the cube. When it comes to hierarchies, you always have to add the entire hierarchy to the list, but you can use the Limit Data button to filter out higher levels of detail in the hierarchy from the actual selection at a later stage. You can repeat this step as many times you want to add more data to the selection.

    Response: The selected data is added to the Data selection in connection list.

  15. Dimension columns and hierarchies can be further limited by clicking on the selected column or hierarchy in the Data selection in connection list and then clicking on the button with a filter funnel and selecting Limit Data....

  16. Response: The Limit Data dialog appears.

  17. Clear the check boxes for categories that you want to remove from the selection.

  18. Click OK to close the Limit Data dialog.

  19. When you are done with your data selection it is recommended to click the Estimate Import Size button to get a hint on whether this selection will be possible to import or not.

  20. Comment: The calculation may take a while. No measures are taken into account in the calculation, so if you have added many measures you may be unable to import the selection even if the estimated number of rows and columns is small. The estimated import size may be of help if an import should fail. Then, the next time you try to import data from that cube, you should try to make sure that the estimated size is smaller than the previous time.

  21. When you are done with the data selection, click OK.

  22. Response: The connection is added to the Data tables list in the Add Data Tables dialog.

  23. Make sure the data table and not the connection is selected in the Data tables list and, under Load method, click Import data table.

  24. Click OK in the Add Data Tables dialog.

  25. Response: If the data selection was small enough, the data is imported.

Tip: Do all of the data selection work in a shared data connection and save it to the library instead of adding it to the analysis directly. This way you can reuse the work you have done later, or share your selections with others.

See also:

Adding Data Connections to an Analysis

How to Work with Data Connections and their Data Sources in the Library

Working With Microsoft SQL Server Analysis Services

Working With Data from SAP BW