What are data functions?

Data functions are Spotfire's way of letting advanced analysts, statisticians, or mathematicians extend Spotfire with scripts that can perform almost any type of calculation and return the results to a Spotfire analysis. They can be created using the installed Spotfire client. If a data function is saved in the library, a Spotfire Business Author with the Execute Data Functions license feature (under the TIBCO Spotfire Advanced Analytics license) can use it when creating a new analysis. Consumer users benefit from the results of the calculations when they interact with the finished analysis.

Due to their flexibility, data functions can be used for many different things, for example:

Opening data.
Transforming data (transformations can only be added using the installed client).
Adding features to a visualization (for example, curves) by adding a new data table that is based on the first one.

In most cases, using a data function is a matter of mapping inputs and outputs: the script needs someone to specify what to base the calculations on and where to place the results in the current analysis.

Inputs can, for example, be a value, a column or a data table in your current analysis, but it is also possible to let the script import data from somewhere else, and have the data function as the source for the first data table in the analysis.
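As an illustration of the input and output mapping described here, the following is a minimal conceptual sketch in plain Python. It is not taken from the Spotfire documentation: the function name flag_high_values and the variables sales, threshold, is_high, and share_high are hypothetical, and in a real data function the input values would be supplied through the mapping in Spotfire instead of being assigned in the script.

```python
# Conceptual sketch only: the names below are hypothetical, and the inputs are
# assigned by hand to stand in for the input mapping made in Spotfire.

def flag_high_values(values, threshold):
    """Take a column input and a value input; return a column and a value."""
    # Column output: one flag per input row.
    flags = [v > threshold for v in values]
    # Single-value output: share of rows above the threshold.
    share_high = sum(flags) / len(flags) if flags else 0.0
    return flags, share_high

# Stand-ins for a column and a value from the current analysis.
sales = [120.0, 80.0, 200.0, 95.0]
threshold = 100.0

# Stand-ins for the outputs that would be mapped to a column and a value.
is_high, share_high = flag_high_values(sales, threshold)
```

In this sketch, is_high plays the role of an output mapped to a column, and share_high the role of an output mapped to a single value.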

The output can be any combination of numeric values (for example, model coefficients or forecasts), text (for example, summary diagnostics), or even R graphical objects. Each output is also mapped to a value, a column, or a data table in Spotfire. New columns can be incorporated into an existing data table, if desired. Single-value outputs can be mapped to a property and shown in a text area, if this has been configured using an installed client.
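To make the different kinds of output more concrete, here is a small sketch of a script that produces numeric values, a column, and a text summary from a single calculation. It is an assumption-laden illustration in plain Python, not Spotfire's own API: the function fit_line and all variable names are hypothetical, and the ordinary least-squares fit is just a convenient example calculation.

```python
# Conceptual sketch only: fit_line and the variable names are hypothetical.

def fit_line(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                      # numeric value (model coefficient)
    intercept = mean_y - slope * mean_x    # numeric value (model coefficient)
    fitted = [intercept + slope * xi for xi in x]  # column, one value per row
    summary = f"n={n}, slope={slope:.3f}, intercept={intercept:.3f}"  # text
    return slope, intercept, fitted, summary

# Stand-ins for two column inputs from the current analysis.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]

slope, intercept, fitted, summary = fit_line(x, y)
```

Here slope and intercept correspond to single-value outputs that could be mapped to properties, fitted to a column that could be added to an existing data table, and summary to a text output.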

To make data functions from the library easy to find and reuse, you can pin them to the f(x) flyout.

Data function definitions vs. data function instances

What is saved in the library is actually the data function definition. It contains the script itself and the author's specification of what types of inputs and outputs to expect or allow.

When you run a data function by mapping the definition to inputs and outputs in your analysis, you create an instance of that data function in the document. You can have multiple instances of the same data function in your analysis if you run it multiple times. However, you only need that when you run the data function with different inputs and want to use or keep all of the different outputs (similar to how you can create multiple calculated columns using the same function). For performance reasons, keeping a single instance of each data function definition in the document is usually the preferred option.
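The relationship between a definition and its instances can be modeled roughly as follows. This is a conceptual sketch in plain Python and does not reflect how Spotfire stores data functions internally: the classes DataFunctionDefinition and DataFunctionInstance, the script total_script, and the example inputs are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Conceptual model only: these classes are hypothetical, not a Spotfire API.

@dataclass(frozen=True)
class DataFunctionDefinition:
    """What is saved in the library: the script plus its expected inputs and outputs."""
    name: str
    input_names: List[str]
    output_names: List[str]
    script: Callable[..., Dict[str, object]]

@dataclass
class DataFunctionInstance:
    """A definition bound to concrete inputs in one analysis."""
    definition: DataFunctionDefinition
    input_mapping: Dict[str, object]

    def run(self) -> Dict[str, object]:
        missing = [n for n in self.definition.input_names
                   if n not in self.input_mapping]
        if missing:
            raise ValueError(f"Unmapped inputs: {missing}")
        outputs = self.definition.script(**self.input_mapping)
        # Keep only the outputs that the definition declares.
        return {name: outputs[name] for name in self.definition.output_names}

def total_script(values):
    return {"total": sum(values)}

definition = DataFunctionDefinition(
    name="Column total",
    input_names=["values"],
    output_names=["total"],
    script=total_script,
)

# Two instances of the same definition, each mapped to different inputs.
instance_sales = DataFunctionInstance(definition, {"values": [10, 20, 30]})
instance_costs = DataFunctionInstance(definition, {"values": [4, 6]})
results = [instance_sales.run(), instance_costs.run()]
```

Both instances share one definition but keep their own input mapping and produce their own outputs, which is the situation in which more than one instance is worth keeping.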

If you just want to refresh the data function, or tweak the parameters or the script of an already existing data function instance, you can edit it from the Data canvas, instead of adding it again.