Load methods

If your data comes from a data connection to an external system, you can choose how the data is loaded when you add it: either as in-memory data, analyzed by the internal data engine of Spotfire, or as in-database data (in-db data), where all calculations are handled by the external system.

In-memory analysis (Import)

Text files, Excel files, and information links (not available on Cloud) are always imported and analyzed in memory. With data connections, importing is optional. When you work with data in memory, you have access to all the functionality of Spotfire through the built-in data engine. The internal data engine is available to all users, either in the Windows client or, for web client users, on the server. You can use all columns as filters and perform many types of calculations. With imported data you can also combine data from different sources into a single data table using the Add rows or Add columns operations (and, using Spotfire Analyst, you can add transformations to the data).

If your data is small enough to fit in memory, importing it is usually the preferred option, because it often improves the performance of calculations.

In-database analysis (External)

If you choose to keep a data connection external, all calculations are done using the external system and not with the Spotfire data engine. This allows you to work with data volumes too large to fit into primary memory and take advantage of the power of the external system. When working with external data connections, you access only the current selection of data and all aggregations and calculations are made in database (in-db).

When a visualization uses in-db data, the visualization queries the external data source directly. Every time a change is made to the configuration of the visualization, e.g., a measure is defined on the Y-axis or a categorical column is added, a new query is sent to the external data source resulting in new, aggregated data.
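The difference between the two load methods can be sketched outside Spotfire. The following is a minimal illustration only, assuming a hypothetical sales table and using Python's sqlite3 module as a stand-in for the external system; it does not show how a Spotfire connector is implemented. The first function fetches every row and aggregates locally, as an import would; the second sends an aggregating query and receives only the result rows, as an in-db visualization would.

```python
import sqlite3

# Hypothetical stand-in for an external data source; the table and its
# values are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("East", 10.0), ("East", 15.0), ("West", 7.5), ("West", 2.5)],
)


def import_then_aggregate(conn):
    """In-memory style: fetch every row, then aggregate locally."""
    totals = {}
    for region, amount in conn.execute("SELECT region, amount FROM sales"):
        totals[region] = totals.get(region, 0.0) + amount
    return totals


def aggregate_in_db(conn):
    """In-db style: the source aggregates and returns only result rows."""
    query = "SELECT region, SUM(amount) FROM sales GROUP BY region"
    return dict(conn.execute(query))


in_memory_totals = import_then_aggregate(conn)
in_db_totals = aggregate_in_db(conn)
```

Both functions produce the same totals. What differs is where the work is done and how many rows travel from the source, which is why a configuration change in an in-db visualization results in a new query.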

When working with in-db data, the connector and the underlying data source determine which aggregation methods are available.

In-db data is usually preferred if you are working with very large data volumes that would not fit in memory, or if you want to make sure that the data is always the latest data from your external system and is always processed the way your external system processes it.

Data loading settings

For in-memory data, you can specify Data loading settings for each source in your data table, as long as the data is kept linked. These settings determine whether to use Stored data, Always new data or New data when possible. When saving an analysis, it is important to consider the data loading settings because they can determine whether other users will have access to the data in the analysis, if the analysis is shared.

The Stored data option saves the current data in the analysis. New data will only be loaded if the source is manually reloaded.
The New data when possible option will also store the current data in the analysis. In that case, data will also be available to users who do not have access to the source. However, if a user has access, then new data will be loaded when the analysis is opened.
Always new data does not store any data in the analysis file.
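The effect of the three settings on a user who opens a shared analysis can be summarized as a small decision function. This is a simplified model rather than Spotfire code: the names LoadSetting and data_on_open are hypothetical, and the outcome for Always new data without source access (no data) is inferred from the fact that nothing is stored in the analysis file.

```python
from enum import Enum


class LoadSetting(Enum):
    """The three data loading settings described above."""
    STORED = "Stored data"
    NEW_WHEN_POSSIBLE = "New data when possible"
    ALWAYS_NEW = "Always new data"


def data_on_open(setting, user_has_source_access):
    """Return which data a user sees on opening: 'stored', 'new', or None."""
    if setting is LoadSetting.STORED:
        # Stored data is used until the source is manually reloaded.
        return "stored"
    if setting is LoadSetting.NEW_WHEN_POSSIBLE:
        # Fall back to the stored copy when the source is not accessible.
        return "new" if user_has_source_access else "stored"
    # ALWAYS_NEW: nothing is stored, so the source must be accessible.
    # (Assumption: without access, no data is available.)
    return "new" if user_has_source_access else None
```

For example, data_on_open(LoadSetting.NEW_WHEN_POSSIBLE, False) returns "stored", which matches the case of a user who cannot reach the source but still sees the data saved in the analysis.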

You can change the data loading settings for applicable sources from within the Data canvas. See Storing data within the analysis for more information.

If the analysis will be used with scheduled updates, you can exclude certain data sources from that update in the data loading settings. See Reloading data for each user when using scheduled updates for more information.

On-demand (configured using installed client only)

When you add data from a non-cube data connection or an information link as a new data table in the installed client, you can either load all data at once or load data on demand only. This applies to both in-memory and in-database data (for data connections). Your analysis can benefit from on-demand loading when you have access to massive amounts of data but only need to work with some parts of the data at a time. When setting up an on-demand data table, you can specify conditions based on one or more other data tables to control what to load. An on-demand data table can also be the first (or only) data table in the analysis if its input is defined by a document property, which is a variable that you can define yourself.

On-demand loading can be regarded as a way to filter data; it is basically a WHERE-clause which dynamically limits what is read and used in calculations.
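The WHERE-clause analogy can be made concrete with a short sketch. It assumes a hypothetical orders table and uses Python's sqlite3 module in place of a real data connection. The customer_ids argument plays the role of the on-demand input, for example values marked in another data table or held in a document property; the function name load_on_demand is invented for this example.

```python
import sqlite3

# Hypothetical source table; names and values are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, item TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "bolt"), (2, "nut"), (2, "washer"), (3, "gear")],
)


def load_on_demand(conn, customer_ids):
    """Load only the rows that match the current on-demand input values."""
    if not customer_ids:
        # No input values yet, so nothing is loaded.
        return []
    placeholders = ", ".join("?" for _ in customer_ids)
    query = (
        "SELECT customer_id, item FROM orders "
        f"WHERE customer_id IN ({placeholders}) "
        "ORDER BY customer_id, item"
    )
    return conn.execute(query, list(customer_ids)).fetchall()


rows = load_on_demand(conn, [2])
```

When the input values change, the same query is issued again with new parameters, so only the rows that are currently needed are read.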

Custom queries (configured using installed client only)

When working with data connections to relational databases or other non-cube data sources, you get the option to select one or more tables from the data source in the modeling view. Here, you might also have the option to create your own custom database query, depending on your licenses. A custom query results in a custom table which in turn can be used to set up a view in the selected connection in the same way as you would do with other database tables. Queries are written in a language that the database understands. For example, for a Microsoft SQL Server database, you would write your custom query in the Microsoft SQL Server dialect of SQL.
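As a rough illustration of how a custom query behaves like a table, the sketch below defines a query and then selects from its result as if it were an ordinary table. It uses Python's sqlite3 module, so the query is written in the SQLite dialect; the orders table, the CUSTOM_QUERY text, and the query_custom_table helper are all hypothetical, and the way Spotfire itself wraps custom queries may differ.

```python
import sqlite3

# Hypothetical database table; names and values are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "Acme", 100.0), (2, "Acme", 50.0), (3, "Birch", 20.0)],
)

# The custom query, written in the dialect of the database that runs it.
CUSTOM_QUERY = """
    SELECT customer, COUNT(*) AS order_count, SUM(amount) AS total
    FROM orders
    GROUP BY customer
"""


def query_custom_table(conn, min_total):
    """Treat the custom query result as a table and filter on its columns."""
    wrapped = (
        f"SELECT customer, total FROM ({CUSTOM_QUERY}) AS custom_table "
        "WHERE total >= ? ORDER BY customer"
    )
    return conn.execute(wrapped, (min_total,)).fetchall()


big_customers = query_custom_table(conn, 100)
```

The columns produced by the custom query (customer, order_count, total) can be used in the outer query in the same way as the columns of any other database table.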

Information links also let you create custom SQL.