TIBCO EBX®
Information Governance Add-on Documentation > User Guide

Repository key concepts

Repository structure

Each data asset registered with EBX® (table, field, complex data type, etc.) is linked to an ISO-IEC 11179 item, also called an 'Administered Item'. These ISO items represent the structure of the repository and support the logical and semantic descriptions: identification, definition, synonyms, examples, classification, properties in the logical modeling, etc. The descriptions are localized by language and context. This architecture is what constitutes the "governance repository".

The figure below highlights the links between the ISO items and the data assets known by EBX®.

[Figure: links between the ISO items and the data assets known by EBX®]

The relationships between the data assets and the ISO items rely on these rules:

Types of data assets (Logical data types)

A data asset is registered with EBX® through a data model. It is represented by a data structure definition such as a table, a field, a domain, etc. But it can also represent a data value, either to get examples of data when a concept is documented, or to define a value domain (enumeration of data). The data model used to obtain information to apply governance to is either:

The types of data assets that are governed are described in the table below. They are also known as 'Logical data types'.

| Type of data asset (Logical data type) | Definition |
|---|---|
| Table | A table as defined in SQL. With the semantic data management mode in EBX®, the data structure of a table can be a complex data type that includes multi-occurrence fields. |
| Domain | A group of tables. A data model can be arranged into domains and sub-domains of tables. |
| Field | Atomic data with a simple data type (integer, string, etc.) or a complex data type. |
| Group of fields | With a 'Group of fields' you can arrange fields around topics. For example, 'Address' with its related fields. |
| Complex data type | A reusable data structure built with fields. A complex data type can be used by tables and groups of fields. |
| Data type | Atomic data type (Integer, String, etc.) with facets for data validation (length, enumeration, etc.). |
| Association | Provides an abstraction over an existing relationship in the data model and enables an easy, model-driven integration of associated objects in the user interface and in data services. |
| Data value | Actual data value in EBX®. |

Special notation:


The 'Workflow' and 'Rule' types of data assets are not managed in the current version of the add-on.

Types of ISO items

The ISO items, also called 'Administered Items', represent the foundation of the governance repository. If needed, you can edit the descriptive name of these elements. To perform these edits:

[Figure: Type of ISO item]

[Figure: Type of ISO item (continued)]

The table below describes the following items: 'Data Element', 'Data Element Concept' (D.E.C.), 'Object Class', 'Property', 'Value Domain', 'Context'.

| ISO-IEC 11179 item (Administered Item) | Definition |
|---|---|
| Data Element | Actual value of a data element, or unit of data. A 'Data Element' is used as an example to enrich the definition of an ISO item. A 'Data Element' exists as the link between a 'Data Element Concept' (D.E.C.) and a 'Value Domain'. |
| Data Element Concept (D.E.C.) | A 'Data Element Concept' (D.E.C.) is the association between an 'Object Class' and a 'Property'. This is not similar to a field in a table because a property can be associated with many tables. For instance, the 'Employee', 'Client' and 'Company' tables can each have an 'age' field. In this case, an 'age' property can be linked to the 'age' fields in each of the three tables. |
| Object Class | An 'Object Class' is a representation of a composite data structure such as a business object or a group of related information. |
| Property | A 'Property' is an atomic item. The association of a property with an 'Object Class' represents a 'Data Element Concept'. A property can be associated with many fields with the same name in different tables. For instance, the 'Name' property can be linked to the 'name' fields that are located in the 'Employee', 'Supplier' and 'Partner' tables. |
| Value Domain | A 'Value Domain' is either an atomic data type (integer, string, etc.) or an enumeration that provides a list of possible values for a 'Property'. |
| Context | A 'Context' lets you declare representational spaces in which the ISO item definition is provided. The most basic context is one related to languages and geographical zones. However, any type of context can be defined such as: organization, level of maturity, user profile, sources, etc. |
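The relationships between the ISO items described above can be pictured with a small data model. This is only an illustrative sketch in Python, not the add-on's API or storage format: the class and attribute names are assumptions chosen to mirror the table, and the 'Object Class-property' label format is borrowed from the 'Client-age' example used later in this guide.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ObjectClass:
    name: str  # composite structure, e.g. a business object


@dataclass(frozen=True)
class Property:
    name: str  # atomic item, e.g. 'age'


@dataclass(frozen=True)
class ValueDomain:
    name: str
    # None stands for an atomic data type; a tuple stands for an enumeration.
    allowed_values: Optional[Tuple[str, ...]] = None


@dataclass(frozen=True)
class DataElementConcept:
    # A D.E.C. is the association between an Object Class and a Property.
    object_class: ObjectClass
    prop: Property

    @property
    def label(self) -> str:
        return f"{self.object_class.name}-{self.prop.name}"


@dataclass(frozen=True)
class DataElement:
    # A Data Element links a D.E.C. to a Value Domain.
    concept: DataElementConcept
    value_domain: ValueDomain


# One shared 'age' property, associated with two different object classes.
age = Property("age")
integer = ValueDomain("Integer")
employee_age = DataElement(DataElementConcept(ObjectClass("Employee"), age), integer)
client_age = DataElement(DataElementConcept(ObjectClass("Client"), age), integer)
```

The same `Property` instance is reused by both concepts, which is the reason a D.E.C. is not equivalent to a single field in a single table.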

Relation between data asset types and ISO items

The governance repository maintains a bridge between the logical data management layer (Logical data type) and the semantic data management layer (ISO item). The ability to manage the links between the two levels of data management ensures the metadata retains a high quality over time. This also ensures that you can easily catch and fix any de-synchronization between the two data layers.

Moreover, when data duplication is identified at the logical layer level, a de-duplication policy can be executed, for example by allowing the definition of a shared concept to be applied to several data assets in the logical layer.

The table below highlights the rules that the add-on applies to manage the links between ISO items and types of data assets.

| ISO item (Administered Item) | Data asset types (Logical data types) | Relation between the data asset types and the ISO items | De-duplication policy |
|---|---|---|---|
| Data Element | Data value | Unique | N/A |
| Data Element Concept (D.E.C.) | N/A | N/A | N/A |
| Object Class | Table | Unique | No de-duplication |
| Object Class | Group of fields | Unique | No de-duplication |
| Object Class | Complex Type | Multiple. Used in one to many tables and groups of fields | Based on the logical name of the Complex type (exact matching) |
| Object Class | Domain | Unique | No de-duplication |
| Property | Field or Association | Multiple. Used in one to many tables and groups of fields | Based on the logical name of the Property (exact matching) |
| Value Domain | Data type | Multiple. Used in one to many fields | Based on the logical name of the Value Domain (exact matching) |
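For readers who prefer to see these rules as data, the table above can be transcribed into a simple lookup. This is a hypothetical Python sketch for illustration only: `LinkRule`, `LINK_RULES` and `is_deduplicated` are made-up names, and the add-on does not necessarily expose its rules in this form.

```python
from typing import Dict, NamedTuple


class LinkRule(NamedTuple):
    iso_item: str       # ISO item the data asset type is linked to
    relation: str       # 'Unique' or 'Multiple'
    deduplication: str  # de-duplication policy from the table


EXACT = "Exact matching on the logical name"

# Keyed by data asset type (logical data type).
LINK_RULES: Dict[str, LinkRule] = {
    "Data value": LinkRule("Data Element", "Unique", "N/A"),
    "Table": LinkRule("Object Class", "Unique", "No de-duplication"),
    "Group of fields": LinkRule("Object Class", "Unique", "No de-duplication"),
    "Complex Type": LinkRule("Object Class", "Multiple", EXACT),
    "Domain": LinkRule("Object Class", "Unique", "No de-duplication"),
    "Field": LinkRule("Property", "Multiple", EXACT),
    "Association": LinkRule("Property", "Multiple", EXACT),
    "Data type": LinkRule("Value Domain", "Multiple", EXACT),
}


def is_deduplicated(asset_type: str) -> bool:
    """Return True when several assets of this type can share one ISO item."""
    return LINK_RULES[asset_type].relation == "Multiple"
```

For example, `is_deduplicated("Field")` is true, whereas `is_deduplicated("Table")` is false because each table keeps its own 'Object Class'.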

It is important to understand at which levels the add-on applies a data asset de-duplication process to enable definition unification when needed (see next section).

Data asset de-duplication process

When the de-duplication process is applied to the field assets defined in a data model, two fields with the same logical name generate only a single 'Property'.

The de-duplication process uses an exact matching execution that is case sensitive. For example, the fields 'age' and 'Age' are considered different fields. But if an 'age' field is located in two tables ('Employee' and 'Company') and each field name is an exact match, then only one 'age' property is created in the repository. This property is then linked to the two fields located in the two tables.
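The matching behavior described above can be sketched in a few lines of Python. This is a simplified illustration, not the add-on's implementation: the function name, the `(table, field)` tuples and the `table.field` key used when de-duplication is off are all assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Field = Tuple[str, str]  # (table name, logical field name)


def build_properties(fields: List[Field], deduplicate: bool) -> Dict[str, List[Field]]:
    """Group fields under the property they would be linked to.

    With de-duplication, the key is the logical field name compared exactly
    (case sensitive). Without it, every field gets its own property, keyed
    here by a hypothetical 'table.field' string.
    """
    properties: Dict[str, List[Field]] = defaultdict(list)
    for table, name in fields:
        key = name if deduplicate else f"{table}.{name}"
        properties[key].append((table, name))
    return dict(properties)


fields = [("Employee", "age"), ("Company", "age"), ("Client", "Age")]
merged = build_properties(fields, deduplicate=True)
separate = build_properties(fields, deduplicate=False)
```

With de-duplication, `merged` holds two properties: 'age', linked to the fields in 'Employee' and 'Company', and 'Age', linked only to the field in 'Client'. Without it, `separate` holds three properties, one per field.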

Special notation:


By default, de-duplication is not active. The decision on whether or not to activate the de-duplication process is taken when the data model is configured in the repository. This decision is definitive for the related data model. The 'De-duplication is active' configuration property becomes read-only once the data model is declared in the repository.

[Figure: Data asset de-duplication process]

When de-duplication is activated, D.E.C. (Data Element Concept) types of 'Administered Items' can be used to collect the definition related to the association between a field (Property) and a table (Object class). For instance, if an 'age' field exists in two tables ('Product' and 'Client') then the de-duplication process generates the following Administered Items:

The generic definition of 'age' is applied to the property and the additional definitions that are specific to the 'Client' and 'Product' tables are attached to the related D.E.C.s.
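The 'age' example can be made concrete with a short Python sketch. The exact set of items the add-on generates is not listed in this section, so the output below is inferred from the surrounding text (one shared property, one Object Class per table, one D.E.C. per table and field pair); the function name and the 'Table-field' label format are assumptions.

```python
from typing import Dict, List, Tuple


def generate_items(fields: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Derive illustrative Administered Items from (table, field) pairs."""
    object_classes: List[str] = []
    properties: List[str] = []
    concepts: List[str] = []
    for table, field in fields:
        if table not in object_classes:
            object_classes.append(table)      # one Object Class per table
        if field not in properties:
            properties.append(field)          # exact-match de-duplication
        concepts.append(f"{table}-{field}")   # one D.E.C. per association
    return {
        "Object Class": object_classes,
        "Property": properties,
        "Data Element Concept": concepts,
    }


items = generate_items([("Product", "age"), ("Client", "age")])
```

Here the generic definition would be attached to the single 'age' entry under `items["Property"]`, and the table-specific definitions to the 'Product-age' and 'Client-age' entries under `items["Data Element Concept"]`.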

De-duplication of other ISO items depends on the governance processes an organization needs to enforce. For example, the de-duplication of tables cannot be done systematically, because it is common to declare tables with the same name in different data model domains. At the governance repository level, either each of these tables is considered an 'Object Class' in its own right, or de-duplication is applied in an attempt to unify the definition through a single 'Object Class'.

Special notation:


The de-duplication process currently does not support:

  • Application to 'Field' and 'Complex types' using fuzzy matching.

  • Application to 'Table', 'Group of fields', and 'Domain' types.

  • Configuration of whether to execute de-duplication by asset type.

  • A scope of multiple data models.

  • Application based on Item definitions.

When you disable de-duplication at the data model level, it applies only to fields.

Data staging applied to the add-on repository

Data spaces are created to manage add-on repositories by contexts of use. For instance, a child data space can be dedicated to manage the 'Administered items' related to a single data model or to prepare a future version of the metadata definitions.

Special notation:


The data spaces are also used to track the history of data modifications applied to the repository. For instance, it is possible to create a 'Future version' child data space and then use a comparison of data spaces to review the differences between the current data space and the future version in preparation.

[Figure: Data staging applied to the EBX® Information Governance repository]
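Conceptually, comparing a child data space with its parent amounts to a diff between two sets of item definitions. The Python sketch below only illustrates that idea; EBX® provides its own data space comparison feature, and the function name and the dictionary representation (item name mapped to definition text) are assumptions.

```python
from typing import Dict, List


def compare_data_spaces(current: Dict[str, str], future: Dict[str, str]) -> Dict[str, List[str]]:
    """Report which item definitions were added, removed or changed."""
    added = sorted(name for name in future if name not in current)
    removed = sorted(name for name in current if name not in future)
    changed = sorted(
        name for name in current
        if name in future and current[name] != future[name]
    )
    return {"added": added, "removed": removed, "changed": changed}


current = {"age": "Age in years", "name": "Full name"}
future = {"age": "Age in completed years", "email": "Contact email"}
diff = compare_data_spaces(current, future)
```

In this example, 'email' is reported as added, 'name' as removed, and 'age' as changed in the 'Future version' data space.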

Metadata views

On the 'Item' table, several data views are available to display the metadata, either in the simplest way ('View Object class') or through the full list of metadata types ('View all metadata').

[Figure: Metadata views]

When the de-duplication process is not applied, the simplest data view, 'View Object class', is sufficient to manage the metadata in most cases. The Data Element Concept becomes useful when the de-duplication process is active and duplicate properties are detected. In this case, the duplicate properties are defined in a unified way (a single property with a single definition), and more specific definitions are then provided in the context of each related Object Class, through the associated Data Element Concepts (each a link between the Property and an Object Class). For instance, the 'age' property is related to two different fields in the 'Client' and 'Product' tables. This property is defined once by itself, and then again for the two related Data Element Concepts, 'Client-age' and 'Product-age'.

The labels of the data hierarchy views are based on the logical labels of the items that come from the logical data model. As illustrated later in this user guide, you can provide a more meaningful label by using a 'Universal name'. This universal name is then displayed in parentheses after the logical label.
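The resulting label format can be illustrated with a small Python helper. This is a hypothetical function written for this guide, not part of the add-on, and the sample labels are invented.

```python
from typing import Optional


def view_label(logical_label: str, universal_name: Optional[str] = None) -> str:
    """Build a hierarchy view label: the logical label, followed by the
    universal name in parentheses when one has been provided."""
    if universal_name:
        return f"{logical_label} ({universal_name})"
    return logical_label
```

For instance, `view_label("cust_nm", "Customer name")` gives `cust_nm (Customer name)`, while `view_label("age")` gives `age` unchanged.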