What are Data Functions?


Data functions are the Spotfire way of letting advanced analysts, statisticians, or mathematicians extend Spotfire by creating scripts that can perform almost any type of calculation and return the results to a Spotfire analysis. If a data function is saved in the library, any author with the Execute Data Functions license can use it when creating a new analysis. Consumer users can benefit from the results of the calculations when interacting with the finished analysis.

Due to their flexibility, data functions can be used for many different purposes.

In most cases, using a data function is a matter of mapping inputs and outputs: the script needs to be told what to base its calculations on, and where in your current analysis to place the results.
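As a rough illustration of this mapping, the following Python sketch treats the script as a function with named inputs and outputs, and the mapping as two dictionaries that tie those names to items in an analysis. Everything in it (the analysis dictionary, run_data_function, and the column and property names) is hypothetical and only models the idea. It is not the Spotfire API, and Python is used purely for illustration.

```python
# Hypothetical stand-in for an analysis: named columns and document properties.
analysis = {
    "columns": {"Sales": [120.0, 80.0, 100.0]},
    "properties": {"Threshold": 90.0},
}


def script(values, cutoff):
    """The calculation itself: flag values above a cutoff and count them."""
    flags = [v > cutoff for v in values]
    return {"above": flags, "count": sum(flags)}


def run_data_function(analysis, input_map, output_map):
    """Resolve inputs from the analysis, run the script, store the outputs."""
    inputs = {
        name: analysis[kind][item] for name, (kind, item) in input_map.items()
    }
    results = script(**inputs)
    for name, (kind, item) in output_map.items():
        analysis[kind][item] = results[name]
    return analysis


# Input mapping: script parameter -> (kind of item, name in the analysis).
input_map = {
    "values": ("columns", "Sales"),
    "cutoff": ("properties", "Threshold"),
}
# Output mapping: script result -> where to place it in the analysis.
output_map = {
    "above": ("columns", "Above threshold"),
    "count": ("properties", "Count above"),
}

run_data_function(analysis, input_map, output_map)
print(analysis["columns"]["Above threshold"])  # [True, False, True]
print(analysis["properties"]["Count above"])   # 2
```

Note that the script itself never refers to "Sales" or "Threshold". Only the mappings do, which is what would let the same script be reused against other columns or properties.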

An input can be, for example, a value, a column, or a data table in your current analysis. It is also possible to let the script import data from somewhere else and use the data function as the source for the first data table in the analysis. An input value can, for example, come from a document property that the end user sets using a property control in a text area.

The output can be any combination of numeric values (e.g., model coefficients or forecasts), text (e.g., summary diagnostics), or even R graphical objects. Each output is likewise mapped to a value, a column, or a data table in Spotfire. When new columns are created, they can be added to an existing data table, if desired. Single-value outputs can be mapped to a document property and shown in a text area.
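To make the different kinds of output concrete, here is a minimal Python sketch of a calculation that produces all three: single numeric values (model coefficients), a new column (fitted values), and a text summary. The function name fit_line and the sample data are invented for illustration, and the sketch says nothing about how Spotfire itself passes values to or from a script.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Illustrative input columns.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]

# Single-value outputs: could be mapped to document properties.
intercept, slope = fit_line(x, y)

# Column output: one fitted value per input row, so it could be
# added to the data table that the inputs came from.
fitted = [intercept + slope * xi for xi in x]

# Text output: a short diagnostic that could be shown in a text area.
summary = f"y = {intercept:.2f} + {slope:.2f} * x (n = {len(x)})"
print(summary)  # y = 1.00 + 2.00 * x (n = 4)
```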

To easily find and reuse data functions from the library, they can be pinned to the f(x) flyout.

Data function definitions vs. data function instances

What is saved in the library is actually the data function definition. It contains the script itself and the author's specification of what types of inputs and outputs to expect or allow.

When you run a data function by mapping the definition to inputs and outputs in your analysis, you create an instance of that data function in the document. You can have multiple instances of the same data function in your analysis if you run it multiple times. However, you only need that when you run the data function with different inputs and want to use or keep all of the different outputs (similar to how you can create multiple calculated columns using the same function). In most cases, keeping a single instance of each data function definition in the document is the preferred option, for performance reasons.
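The relationship between a definition and its instances can be modeled loosely in Python as follows. The class names DataFunctionDefinition and DataFunctionInstance, their fields, and the "Scale column" example are all hypothetical. They only mirror the description above and do not correspond to actual Spotfire classes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class DataFunctionDefinition:
    """What would be saved in the library: the script plus its declared inputs and outputs."""
    name: str
    script: Callable[..., dict]
    input_names: List[str]
    output_names: List[str]


@dataclass
class DataFunctionInstance:
    """What would live in the document: a definition bound to concrete inputs."""
    definition: DataFunctionDefinition
    input_values: Dict[str, object]
    outputs: Dict[str, object] = field(default_factory=dict)

    def run(self):
        # Refuse to run until every declared input has been mapped.
        missing = [n for n in self.definition.input_names
                   if n not in self.input_values]
        if missing:
            raise ValueError(f"unmapped inputs: {missing}")
        result = self.definition.script(**self.input_values)
        # Keep only the outputs that the definition declares.
        self.outputs = {n: result[n] for n in self.definition.output_names}
        return self.outputs


# One definition...
scale = DataFunctionDefinition(
    name="Scale column",
    script=lambda values, factor: {"scaled": [v * factor for v in values]},
    input_names=["values", "factor"],
    output_names=["scaled"],
)

# ...and two instances, justified here because both outputs are kept.
doubled = DataFunctionInstance(scale, {"values": [1, 2, 3], "factor": 2})
tripled = DataFunctionInstance(scale, {"values": [1, 2, 3], "factor": 3})
print(doubled.run())  # {'scaled': [2, 4, 6]}
print(tripled.run())  # {'scaled': [3, 6, 9]}
```

In this model, changing the factor of an existing instance and running it again would replace its output rather than add a second result, which loosely corresponds to editing an existing instance instead of adding the data function again.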

If you just want to refresh the data function, or tweak the parameters or the script of an already existing data function instance, you can edit it from the Data Canvas or from the Data Function Properties dialog, instead of adding it again.

Similarity to expression functions

By saving a script as an expression function (only possible with scripts based on Spotfire® Enterprise Runtime for R, also known as TERR™), you can use a statistical calculation directly in the Spotfire expression language, like any other function. This lets you extend the expression language with your own script-based functions. See How to Create an Expression Function for more information. In contrast to data functions, expression functions are always defined in the context of an analysis and cannot be shared between analyses (except by copying the script from one analysis and saving it in another). Because data function definitions can be saved in the library, they are much easier for others to find and reuse.

See also:

How to Use a Data Function

Introduction for Data Function Authors

Data Functions in the Data Canvas