Terminology

Before using TIBCO Clarity, familiarize yourself with the following terms.

dataset

A collection of raw data from one or more data sources. A dataset can contain more than one project.

project

A project contains either an entire dataset or a portion of a dataset. Various validation and transformation rules can be applied to a project.

See dataset.

rows/records

Modes that determine how your source data is organized. In rows mode, each data row is treated as an independent piece of data. In records mode, each object is treated as an independent piece of data, which means a single object can span more than one row.
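
The difference between the two modes can be sketched in plain Python. The grouping rule used here (a row with an empty value in the key column belongs to the record above it) is an assumption made for illustration, as are the names `group_into_records`, `name`, and `phone`; the definition above does not state how TIBCO Clarity decides which rows form one object.

```python
# Sample data: the second row has no name, so under the assumed rule
# it continues the record started by the first row.
rows = [
    {"name": "Ann", "phone": "555-0100"},
    {"name": "", "phone": "555-0101"},
    {"name": "Bob", "phone": "555-0200"},
]


def group_into_records(rows, key_column):
    """Group rows into records (assumed rule, not the product's own).

    A row with a non-empty key column starts a new record; a row with an
    empty key column is attached to the most recent record.
    """
    records = []
    for row in rows:
        if row[key_column] or not records:
            records.append([row])
        else:
            records[-1].append(row)
    return records


# Rows mode: every row counts on its own.
row_count = len(rows)

# Records mode: rows are counted per object.
records = group_into_records(rows, "name")
record_count = len(records)
```

With this sample, rows mode sees three independent rows, while records mode sees two objects, the first of which spans two rows.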

undo/redo

TIBCO Clarity saves all the operations performed on a project. The Undo/Redo function allows you to revert data to a previous state, or to reapply steps that were undone.

See project.

predefined data type

A data type that TIBCO Clarity defines based on basic data types, such as String and Integer.

custom data type

A data type that is defined by adding extra constraints to a predefined data type.
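
As a rough illustration of the idea, a custom data type can be thought of as a base type check plus additional constraints. The sketch below is plain Python, not the TIBCO Clarity API; the function names and the 0 to 150 range are hypothetical.

```python
import re


def is_integer(value):
    """Stand-in for a predefined Integer type: optional sign, then digits."""
    return re.fullmatch(r"[+-]?\d+", value) is not None


def is_age(value):
    """Hypothetical custom type: an Integer constrained to the range 0..150."""
    return is_integer(value) and 0 <= int(value) <= 150
```

A value such as "42" passes both checks, while "200" is a valid Integer but fails the extra constraint of the custom type.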

facet

A facet is a single defining aspect that helps determine the set of values for a simple type. By applying facets on a particular column, you can filter down to a subset of rows and understand data in greater detail.
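
The filtering behavior described above can be approximated in a few lines of Python. This is a minimal sketch under assumed names (`facet_counts`, `apply_facet`, and the sample columns are invented for illustration) and does not reflect how TIBCO Clarity implements facets.

```python
from collections import Counter

rows = [
    {"city": "Paris", "status": "active"},
    {"city": "Berlin", "status": "inactive"},
    {"city": "Paris", "status": "inactive"},
]


def facet_counts(rows, column):
    """Count how many rows carry each distinct value in the given column."""
    return Counter(row[column] for row in rows)


def apply_facet(rows, column, value):
    """Keep only the rows whose column equals the selected facet value."""
    return [row for row in rows if row[column] == value]


city_counts = facet_counts(rows, "city")
paris_rows = apply_facet(rows, "city", "Paris")
```

The counts show which values occur in the column, and selecting one of them narrows the view to the matching subset of rows.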

cluster

switchable groups

A group that merges several data columns to detect duplicates.

look-up table

A table that is defined to help transform source data into the desired format.
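
Conceptually, a look-up table behaves like a mapping from source values to target values. The following sketch uses a Python dictionary with invented sample entries; leaving unmatched values unchanged is an assumption of this example, not documented product behavior.

```python
# Hypothetical look-up table: source spellings mapped to a target code.
COUNTRY_LOOKUP = {
    "USA": "US",
    "United States": "US",
    "U.K.": "GB",
}


def transform(value, lookup):
    """Return the mapped value, or the original value when no entry matches."""
    return lookup.get(value, value)


cleaned = [transform(v, COUNTRY_LOOKUP) for v in ["USA", "U.K.", "France"]]
```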

data profile

A process that assesses the current state of the data and reports information about the errors that the data contains.

data transformation

A process where source data is changed from its given format into the format expected by an appropriate application.

dedup

A process to find duplicate or similar records in data. It is short for deduplication.
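
One simple way to picture finding duplicate or similar records is to normalize each record and group records that normalize to the same key. The sketch below swaps in that basic technique (lowercasing and collapsing white space) purely for illustration; it is not the matching method TIBCO Clarity uses, and all names are hypothetical.

```python
from collections import defaultdict


def normalize(record):
    """Lowercase each field and collapse runs of white space."""
    return tuple(" ".join(str(field).lower().split()) for field in record)


def find_duplicates(records):
    """Return lists of record indexes that share the same normalized key."""
    groups = defaultdict(list)
    for index, record in enumerate(records):
        groups[normalize(record)].append(index)
    return [indexes for indexes in groups.values() if len(indexes) > 1]


records = [("Ann Lee", "NY"), ("ann  lee", "ny"), ("Bob Ray", "LA")]
duplicates = find_duplicates(records)
```

Here the first two records differ only in case and spacing, so they fall into the same group.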

dependency check

A process to explore dependencies among data columns. TIBCO Clarity allows you to designate one group of data columns as the Key and another group as the Value, and then checks whether the Key columns uniquely determine the Value columns.
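
The check described above amounts to testing whether every Key combination maps to exactly one Value combination. A minimal Python sketch of that test follows; the function name, column names, and sample rows are assumptions made for the example.

```python
def find_violations(rows, key_cols, value_cols):
    """Return the Key tuples that map to more than one Value tuple.

    An empty result means the Key columns uniquely determine the
    Value columns for the given rows.
    """
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in key_cols)
        value = tuple(row[col] for col in value_cols)
        seen.setdefault(key, set()).add(value)
    return {key: values for key, values in seen.items() if len(values) > 1}


rows = [
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "NYC"},
    {"zip": "94105", "city": "San Francisco"},
]
violations = find_violations(rows, ["zip"], ["city"])
```

In this sample, the `zip` value "10001" appears with two different `city` values, so `zip` does not uniquely determine `city`.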

batch processing

A process that applies the data management operations performed on one project to the whole dataset.

See project and dataset.

null/empty/blank

null: A field without any value.

empty: A field that contains no characters, or that contains only one or more white spaces. For example, "", " ", or "  ".

blank: A field without any value (null), or an empty field without a white space ("").
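
The three definitions overlap, which a short Python sketch can make concrete. It assumes that null corresponds to Python's `None` and that "white space" covers any character removed by `str.strip()`; both are assumptions of this example rather than statements about TIBCO Clarity.

```python
def is_null(field):
    """null: a field without any value (modeled here as None)."""
    return field is None


def is_empty(field):
    """empty: a string with no characters, or with white space only."""
    return isinstance(field, str) and field.strip() == ""


def is_blank(field):
    """blank: a null field, or an empty string with no white space at all."""
    return field is None or field == ""
```

Under these definitions, "" is both empty and blank, " " is empty but not blank, and a null field is blank but not empty.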