The main principles of the TIBCO EBX® technical architecture are the following:
A Java process (JVM) that runs EBX® is limited to a single EBX® repository. This repository is physically persisted in a supported relational database instance, accessed through a configured data source.
A repository cannot be shared by multiple JVMs at any given time. However, a failover architecture may be used. These aspects are detailed in the sections Single JVM per repository and Failover with hot-standby. Furthermore, to achieve horizontal scalability, an alternative is to deploy a distributed data delivery (D3) environment.
A single relational database instance can support multiple EBX® repositories (used by distinct JVMs). In that case, each repository must specify a distinct table prefix using the property ebx.persistence.table.prefix.
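For example, two repositories sharing the same database instance might declare distinct prefixes in their respective ebx.properties files (the prefix values below are illustrative):

    # Repository A
    ebx.persistence.table.prefix=EBXA_

    # Repository B
    ebx.persistence.table.prefix=EBXB_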
In order to guarantee the integrity of persisted master data, it is strictly forbidden to perform direct SQL writes to the database.
It is required for the database user specified by the configured data source to have the 'create/alter' privileges on tables, indexes and sequences. This allows for automatic repository installation and upgrades.
A repository cannot be shared by multiple JVMs. If such a situation were to occur, it would lead to unpredictable behavior and potentially even corruption of data in the repository.
EBX® performs checks to enforce this restriction. Before the repository becomes available, it must first acquire exclusive ownership of the relational database. Once started, the JVM periodically checks that it still holds ownership of the repository.
These checks are performed by repeatedly tagging a technical table in the relational database. The shutdown command for the application server ensures that the tag on this technical table is removed. If the server shuts down unexpectedly, the tag may be left in the table. If this occurs, the server must wait several additional seconds upon restart to ensure that the table is not being updated by another live process.
To avoid an additional wait period at the next start up, it is recommended to always properly shut down the application server.
The exclusion mechanism described above is compatible with failover architectures, where only one server is active at any given time in an active/passive cluster. To ensure that this is the case, the main server must declare the property ebx.repository.ownership.mode=failovermain. It then claims ownership of the repository database, as in the case of a single server.
A backup server can still start up, but it will not have access to the repository. It must declare the property ebx.repository.ownership.mode=failoverstandby to act as the backup server. It is not recommended to share the EBX® root directory, defined by the ebx.repository.directory property, between the active and standby EBX® instances. This means that, when the standby EBX® takes control, it starts on an empty directory and the indexes have to be rebuilt, slowing down the first access to table data. To avoid this, you can replicate the EBX® repository folders. On Linux, running a timer on the failover node that executes rsync every few minutes to copy the EBX® repository folder from the active node to the failover node has been successfully tested. It is important to stop this synchronization before the failover node is started, to prevent any concurrent modification of the indexes.
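A minimal sketch of such a synchronization job, assuming hypothetical host names and directories (the actual paths depend on the ebx.repository.directory property):

    # Cron entry on the failover node: pull the repository folder from the
    # active node every 5 minutes. Disable this job before starting the
    # failover node, to avoid concurrent modification of the indexes.
    */5 * * * * rsync -a --delete ebx-active:/opt/ebx/repository/ /opt/ebx/repository/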
On Kubernetes, it is recommended not to use EBX® failover with hot standby and to rely instead on Kubernetes itself, which can automatically replace a failing pod with a new one that takes over the disks of the previous pod. Sample Helm charts are available on GitHub.
Once started, the backup server is registered in the connection log. Its status can be retrieved using the Java API or through an HTTP request, as described in the section Repository status information and logs below.
In order to activate the backup server and transfer exclusive ownership of the repository to it, a specific request must be issued, either over HTTP or using the Java API:
Using HTTP, the request must include the parameter activationKeyFromStandbyMode, and the value of this parameter must be equal to the value declared for the entry ebx.repository.ownership.activationkey in the EBX® main configuration file. See Configuring failover.
The format of the request URL must be:
http[s]://<host>[:<port>]/ebx?activationKeyFromStandbyMode={value}
Using the Java API, call the method RepositoryStatus.wakeFromStandby.
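As an illustration, the HTTP activation request can be issued with the standard java.net.http client; the host, port, and key below are placeholder values (the exact Java API signature of RepositoryStatus.wakeFromStandby should be checked against the Java API reference):

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class ActivateStandby {
        public static void main(String[] args) throws Exception {
            // The key must equal the value declared for
            // ebx.repository.ownership.activationkey on the main server.
            String url = "https://standby.example.com:8443/ebx"
                    + "?activationKeyFromStandbyMode=changeit";
            HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode() + " " + response.body());
        }
    }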
If the main server is still up and accessing the database, the following applies: the backup server marks the ownership table in the database, requesting a clean shutdown for the main server (yet allowing any running transactions to finish). Only after the main server has returned ownership can the backup server start using the repository.
A log of all attempted Java process connections to the repository is available in the Administration area under 'History and logs' > 'Repository connection log'.
The status of the repository may be retrieved using the methods of the RepositoryStatus API.
It is also possible to get the repository status information using an HTTP request that includes the parameter repositoryInformationRequest with one of the supported values. Depending on the value, the request returns:
The state of the repository in terms of ownership registration.
The number of times that the repository has made contact since associating with the database.
Detailed information for the end user regarding the repository's registration status. The format of this information may change in the future without explicit warning.
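Following the same URL format as the activation request described above, such a status request takes the form:

    http[s]://<host>[:<port>]/ebx?repositoryInformationRequest={value}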
Several technical tables can be accessed in the 'Administration' area of the EBX® user interface. These tables are for internal use only and their content should not be edited manually, except to remove obsolete or erroneous data. Among these technical tables are:
Auto-increments | Lists all auto-increment fields in the repository. |
By complying with the Rules for the database access and user privileges, the repository installation or upgrade is done automatically.
EBX® provides a way to export the full content of a repository to another database. The export includes all dataspaces, configuration datasets, and mapped tables. To perform this migration, the following guidelines must be respected:
The source repository must be shut down: no EBX® server process may be accessing it. Failing to comply strictly with this requirement can lead to a corrupted target repository.
A new EBX® server process must be launched on the target repository, which must be empty. In addition to the classic Java system property -Debx.properties, this process must also specify ebx.migration.source.properties: the location of an EBX® properties file specifying the source repository. (Distinct table prefixes may be used for the source and the target.) A launch sketch is shown after these guidelines.
The migration process will then take place automatically. Please note, however, that this process is not transactional: should it fail halfway, it will be necessary to delete the created objects in the target database before starting over.
After the migration is complete, an exception is thrown to force a restart of the EBX® server process accessing the target repository.
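For instance, the target server launch might look as follows, with illustrative file locations (all other launch arguments remain the usual ones for your application server):

    java -Debx.properties=/opt/ebx/target/ebx.properties \
         -Debx.migration.source.properties=/opt/ebx/source/ebx.properties \
         <usual EBX® server launch arguments>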
Limitations:
The names of the database objects representing the mapped tables (history, replication) may have to be altered when migrated to the target database, to comply with the limitations of its database engine (maximum length, reserved words, ...). Such alterations will be logged during the migration process.
As a consequence, the names specified for replicated tables in the data model will no longer be consistent with the adapted names in the database. The first recompilation of this data model will force this inconsistency to be corrected.
Due to different representations of numeric types, values of xs:decimal types might get rounded if the target database engine offers less precision than the source. For example, a value of 10000000.1234567890123456789 in Oracle will get rounded to 10000000.123456789012345679 in SQL Server.
A global backup of the EBX® repository must be delegated to the underlying RDBMS. The database administrator must use the standard backup procedures of the underlying database.
Archives are stored in a sub-directory called archives within the ebx.repository.directory (see configuration). This directory is automatically created during the first export from EBX®.
As specified in the security best practices, access to this directory must be carefully protected. Also, if manually creating this directory, make sure that the EBX® process has read-write access to it. Furthermore, the administrator is responsible for cleaning this directory, as EBX® does not maintain it.
The transfer of files between two EBX® environments must be performed using tools such as FTP, or simple file copies over a network share.
A repository has the following attributes:
repositoryId | Uniquely identifies a repository within the scope of the company. It is 48 bits (6 bytes) and is usually represented as 12 hexadecimal digits. This information is used for generating UUIDs (Universally Unique Identifiers) for entities created in the repository, as well as transactions logged in history tables or in the XML audit trail. This identifier acts as the 'UUID node' part, as specified by RFC 4122. |
repository label | Provides a user-friendly label that identifies the purpose and context of the repository. For example: "Production environment". |
store format | Identifies the underlying persistence system, including the current version of its structure. |
An issue with indexes can occur whereby records in the same table are duplicated. EBX® provides a built-in service to resolve this issue; see the steps to run the service below.
While the service executes, you cannot access the EBX® repository through the UI, REST, or SOAP requests. Once completed, the service automatically shuts down the repository.
To remove duplicate records:
In the ebx.properties configuration file, set the ebx.persistence.boot.checkRecordDuplicates property to true.
Start the repository and wait for the service to automatically shut down the repository.
Set the ebx.persistence.boot.checkRecordDuplicates property to false.
Start your repository.
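The first and third steps amount to toggling a single line in the ebx.properties file:

    # Enable the duplicate-record check for the next repository startup,
    # then set it back to false once the service has shut the repository down.
    ebx.persistence.boot.checkRecordDuplicates=true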
Some entities accumulate during the execution of EBX®.
It is the administrator's responsibility to monitor and clean up these entities.
The persistence data source of the repository must be monitored through RDBMS monitoring.
The following EBX® tables ensure data persistence in the database:
The table {ebx.persistence.table.prefix}G_DSP, in which each record represents a dataspace or a snapshot (its name is EBXG_DSP if the property ebx.persistence.table.prefix is unset).
The table {ebx.persistence.table.prefix}G_DST, which holds dataset references.
The table {ebx.persistence.table.prefix}G_BLK, where the data of EBX® tables is segmented into blocks of up to 256 EBX® records.
The tables {ebx.persistence.table.prefix}G_DTR, {ebx.persistence.table.prefix}G_TRV and {ebx.persistence.table.prefix}G_SHR, defining which blocks belong to a given EBX® table in a given dataspace or snapshot.
The tables {ebx.persistence.table.prefix}G_TLR and {ebx.persistence.table.prefix}G_SCH, which hold information about schemas and the tables they define.
The performance of requests executed by EBX® requires that the database has computed up-to-date statistics on its tables. Since database engines regularly schedule statistics updates, this is usually not an issue. Yet, it could be necessary to explicitly update the statistics in cases where tables are heavily modified over a short period of time (e.g. by an import creating many records).
For history tables, some UI components use statistics to adapt their behavior in order to prevent users from executing costly requests unwillingly.
For example, the combo box will not automatically search on user input if the table contains a large volume of records. This behavior may also occur if the database's statistics are not up to date, because a table may be considered to contain a large volume of records even when that is not actually the case.
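As an illustration of an explicit statistics update after a massive import, on PostgreSQL (one of the supported database engines, and assuming the default EBX® table prefix) the block table's statistics could be refreshed with:

    ANALYZE EBXG_BLK;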
A full cleanup of dataspaces, snapshots, and history from the repository involves several stages:
Closing unused dataspaces and snapshots to keep the cache to a minimal size.
Deleting dataspaces, snapshots, and history.
Purging the remaining entities associated with the deleted dataspaces, snapshots, and history from the repository.
In order to keep the cache and the repository to a reasonable size, it is recommended to close any dataspaces and snapshots that are no longer required. This can be done in the following ways:
Through the user interface, in the 'Dataspaces' area.
From the 'Dataspaces / Snapshots' table under 'Dataspaces' in the 'Administration' area, using the Actions menu in the workspace. The action can be used on a filtered view of the table.
Through the Java API, using the method Repository.closeHome (see the sketch after this list).
Using the data service "close dataspace" and "close snapshot" operations. See Closing a dataspace or snapshot for more information.
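For the Java API route, a minimal sketch; the dataspace name is hypothetical, the session is assumed to carry sufficient administrative permissions, and exact signatures should be checked against the Java API reference:

    import com.onwbp.adaptation.AdaptationHome;
    import com.orchestranetworks.instance.HomeKey;
    import com.orchestranetworks.instance.Repository;
    import com.orchestranetworks.service.Session;

    public class CloseDataspace {
        public static void closeIfOpen(Repository repository, Session session) {
            // Look up the dataspace by name (hypothetical key).
            AdaptationHome home = repository.lookupHome(HomeKey.forBranchName("obsoleteDataspace"));
            if (home != null && home.isOpen()) {
                repository.closeHome(home, session);
            }
        }
    }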
Once the dataspaces and snapshots have been closed, the data can be safely removed from the repository.
Closed dataspaces and snapshots can be reopened in the 'Administration' area, under 'Dataspaces'.
Dataspaces, associated history and snapshots can be permanently deleted from the repository. However, the deletion of a dataspace does not necessarily imply the deletion of its history. The two operations are independent and can be performed at different times.
The deletion of a dataspace, a snapshot, or of the history associated with them is recursive. The deletion operation will be performed on every descendant of the selected dataspace.
After the deletion of a dataspace or snapshot, some entities will remain until a repository-wide purge of obsolete data is performed.
In particular, the complete history of a dataspace remains visible until a repository-wide purge is performed. Both steps, the deletion and the repository-wide purge, must be completed in order to totally remove the data and history. The process has been divided into two steps for performance reasons. As the total clean-up of the repository can be time-intensive, this allows the purge execution to be initiated during off-peak periods on the server.
The process of deleting the history of a dataspace takes into account all history transactions recorded up until the deletion is submitted or until a date specified by the user. Any subsequent historized operations will not be included when the purge operation is executed. To delete new transactions, the history of the dataspace must be deleted again.
It is not possible to set a deletion date in the future; if such a date is specified, it is ignored and the current date is used instead.
The deletion of dataspaces, snapshots, and history can be performed in a number of different ways:
From the 'Dataspaces/Snapshots' table under 'Dataspaces' in the 'Administration' area, using the Actions menu button in the workspace. The action can be used on a filtered view of the table.
Using the Java API, and more specifically the methods Repository.deleteHome and RepositoryPurge.markHomeForHistoryPurge.
At the end of the data service "close dataspace" operation, using the parameters deleteDataOnClose and deleteHistoryOnClose, or at the end of a "merge dataspace" operation, using the parameters deleteDataOnMerge and deleteHistoryOnMerge.
Once items have been deleted, a purge can be executed to clean up remaining data from all deletions performed until that point. A purge can be initiated in the following ways:
Through the user interface, by selecting in the 'Administration' area Actions > Execute purge in the navigation pane.
Using the Java API, specifically the method RepositoryPurge.purgeAll (see the sketch after this list).
Using the task scheduler. See Task scheduler for more information.
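A minimal sketch combining deletion and purge through the Java API; the method signatures shown are assumptions and should be checked against the RepositoryPurge Javadoc:

    import com.onwbp.adaptation.AdaptationHome;
    import com.orchestranetworks.instance.Repository;
    import com.orchestranetworks.service.Session;
    // The RepositoryPurge import is omitted; its package depends on the
    // EBX® Java API version (assumption).

    public class DeleteAndPurge {
        public static void deleteThenPurge(Repository repository, AdaptationHome home, Session session) {
            // Recursively delete the dataspace and its descendants.
            repository.deleteHome(home, session);
            // Mark the associated history for purge (assumed signature).
            RepositoryPurge.markHomeForHistoryPurge(home, session);
            // Purge all remaining entities from prior deletions (assumed signature).
            RepositoryPurge.purgeAll(session);
        }
    }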
The purge process is logged in the directory ${ebx.repository.directory}/db.purge/.
It is the administrator's responsibility to monitor and regularly clean up the following entities.
A purge can be executed to clean up the remaining data from all deletions performed up until that point, that is, deleted dataspaces, snapshots, and history. This includes the dataspaces and snapshots created for persistent validation reports that have become obsolete. A purge can be initiated by selecting, in the 'Administration' area, Actions > Execute purge in the navigation pane.
Task scheduler execution reports are persisted in the 'executions report' table, in the 'Task scheduler' section of the 'Administration' area. Scheduled tasks constantly add to this table as they are executed. Even when an execution terminates normally, the records are not automatically deleted. It is thus recommended to delete old records regularly.
User interactions are used by the EBX® component as a reliable means for an application to initiate and get the result of a service execution. They are persisted in the ebx-interactions administration section. It is recommended to regularly monitor the user interactions table, as well as to clean it, if needed.
The workflow events are persisted in the workflow history table, in the 'Workflow' section of the 'Administration' area. Data workflows constantly add to this table as they are executed. Even when an execution terminates normally, the records are not automatically deleted. It is thus recommended to delete old records regularly.
The steps to clean the workflow history are the following:
Make sure the process executions are removed. This can be done by selecting, in the 'Workflows' section of the 'Administration' area, Actions > Terminate and clean this workflow or Actions > Clean from a date in the navigation pane.
Clean the main processes in the history. This can be done by selecting, in the 'Workflows history' section of the 'Administration' area, Actions > Clean from a date or Actions > Clean from selected workflows in the navigation pane.
Purge the remaining entities in the workflow history using the standard EBX® purge.
In order to guarantee the correct operation of EBX®, the disk usage and disk availability of the following directories must be supervised by the administrator, as EBX® does not perform any clean up, except for Lucene indexes:
Lucene indexes: ${ebx.repository.directory}/indexes-(...)/
Lucene indexes: indexes can require a lot of disk space and are critical to the correct functioning of EBX®. In nominal usage, they must not be deleted or modified directly. However, there are cases where deleting these indexes may be necessary:
If the repository is recreated from scratch while the directory ${ebx.repository.directory}/ is preserved; to ensure consistency of data, it is then required to delete the root directory of the indexes.
More generally, if the indexes have become inconsistent with the repository data (this could happen in rare cases of bugs).
After deletion, the content of the indexes will be lazily recomputed per table, derived from the content of the repository. The deletion must happen at the root folder of the indexes: if a single directory is deleted at a lower level, the global structure of the index will become inconsistent. This operation, however, has a cost, and should generally be avoided.
XML audit trail: ${ebx.repository.directory}/History/
Archives: ${ebx.repository.directory}/archives/
Logs: ebx.logs.directory
Temporary directory: ebx.temp.directory
For XML audit trail, if large transactions are executed with full update details activated (contrary to the default setting), the required disk space can increase.
For pagination in the data services getChanges operation, a persistent store is used in the Temporary directory. Large changes may require a large amount of disk space.
Some dataspace administrative tasks can be performed from the 'Administration' area of EBX® by selecting 'Dataspaces'.
This table lists all the existing dataspaces and snapshots in the repository, whether open or closed. You can view and modify the information of dataspaces included in this table.
From this section, it is also possible to close open dataspaces, reopen previously closed dataspaces, as well as delete and purge open or closed dataspaces, associated history, and snapshots.
This table lists all the existing permission rules defined on all the dataspaces in the repository. You can view the permission rules and modify their information.
The table 'Deleted dataspaces/snapshots' lists all the dataspaces that have already been purged from the repository.
From this section, it is also possible to delete the history of purged dataspaces.