Sometimes the data you want to analyze in Spotfire is located in different data tables. Working with visualizations combining data from multiple data tables is not very different from working with data from a single data table. You can choose the visualization that best suits your data, you can filter, mark, and drill down in your data, just as with a visualization using data from a single data table. However, a couple of concepts are important to be familiar with when setting up and working with a visualization combining data from different data tables. This topic describes key concepts, and includes a couple of examples.
In a visualization combining data from different data tables, the main data table plays an important part. A visualization always has only one main data table, which is the anchor point in the data for the visualization. It defines what a row is in an unaggregated visualization, and the columns in the main data table are the columns that can be used to group the visualization in different ways. Consequently, the main data table columns control what becomes an item (such as a marker in a scatter plot, or a bar in a bar chart) in an aggregated visualization.
When you mark items in a visualization, details will be shown for the columns in the main data table only. And the main data table is also the data table that all expressions refer to by default, unless you explicitly specify that an expression should refer to another data table in the visualization by using the qualified column name.
To get the most out of your data, it can therefore be valuable to take a moment to think through which of the data tables is best suited to be the main data table in the visualization, before you start configuring the visualization.
You can select the main data table on the Data page in the Visualization Properties dialog, or in the data table selector in the legend. In the image below, the name of the main data table is 'Sales 2011':
Additional Data Tables
Columns from other data tables than the main data table can be used on aggregating axes in the visualization, but not on axes that are grouping the visualization.
To add a column from another data table, you can use drag-and-drop from the data panel or the filters panel, or select a column from the column selector. Just open the column selector and switch to the data table of interest; the column selector will switch to show the columns in the selected data table instead. The data table selector will only be visible if you have multiple data tables in the analysis and there are column matches available, see below.
Matching Columns
Another important concept to know about is column matching. For a visualization to show data from many data tables, at least one column that you are going to use to group the visualization in some way, should match a corresponding column in the other data tables in the visualization. A column is matching if it contains the same kind of data. If columns contain values of the same data type AND have the same column name they will be matched automatically. For example, in the two data tables 'Sales 2011' and 'Sales 2012' below, the columns 'Category' and 'Type' match between the two data tables. A basic rule when setting up a visualization is that all the categories you are going to use in the visualization should exist in all the data tables. That way, matching of columns is going to be easy. However, there are exceptions to that rule. To learn more about that, see Missing_Column_Matches.
Spotfire will automatically match columns with identical names and data types, but you can also add matches manually if necessary. To learn more about when and how to match columns manually, see Column Matching.
While it is generally not necessary to set up a relation between data tables in addition to a column match, it can sometimes be useful to do so. With a relation between two data tables, marking and filtering from one data table can be propagated to the other data table. To read more about how to specify how filtering should work in related data tables, see Filtering in Related Data Tables.
Basic Example
Comparing the data in the two data tables above in a bar chart does not require any special adjustments. Just load the two data tables into Spotfire, create the bar chart, select one of the categorical columns on the category axis, and then select the two columns 'Sales 2011' and 'Sales 2012' on the value axis.
Since the columns 'Category' and 'Type' in those two data tables have identical names and contain values of the same data type, they have already been matched automatically. As the example illustrates, the main data table is 'Sales 2011', and therefore, the column used on the category axis originates in that data table. In this example, any of the two data tables could have been used as the main data table, because the categorical columns are the same in both data tables. As always when multiple columns are used on the value axis of a bar chart, (Column Names) should be used to group the visualization in order to compare the two columns on that axis. In this case the (Column Names) option is used to color by, but using it to trellis by or, as in the image below, adding it to the category axis are other possibilities.
Different Levels of Detail
In the example above, the data tables had more or less the same columns; 'Category', 'Type', and a column containing the sales figures. You can also compare data from data tables with data on different levels of detail in a visualization. For example, you may want to compare sales targets for a certain year with the actual sales so far for the year. Perhaps you have a data table containing sales targets for fruits and vegetables. For each fruit and vegetable type, one single row represents the target, as seen in the 'Targets' data table below. In another data table you may have the actual sales data for the current year, as seen in the 'Sales' data table below. In this data table each sales transaction is represented by one row, which means that for each type of fruit and vegetable, there are several rows of sales figures.
By combining data from those two data tables in a bar chart you can now see which fruits and vegetables have reached their targets this year:
Recommended Workflow
If you are unsure how to set up a visualization combining columns from different data tables, this recommended workflow can be helpful.
1. Choose the main data table
Start by having a look at the data in the different data tables, and try to answer a couple of questions. What data do they contain? What do you want to visualize based on that data? A data table containing categories you would like to group your visualization by is a good candidate for the main data table. For instance, you may want to group by region, department, salesperson, product type, or similar.
2. Set up the visualization with only the main data table
Create the visualization type you want to use, and then configure as much as you can of that visualization with columns only from the main data table. Select how and by which columns the visualization should be grouped, and if the main data table also contains columns that you want to show as aggregated, add those columns to the appropriate axes as well.
3. Add the aggregating measures
When the visualization has been configured as much as possible with only main data table, you can start adding aggregated columns from other data tables.
See also:
Working with Multiple Data Tables in One Visualization