This appendix describes the collection of built-in services provided by TIBCO DQ, which can be exposed as Rules.
Name
analyze_text_sentiment
Description
Analyze text sentiment and categorize it as Positive, Neutral, or Negative.
Inputs
|
in_text |
Input text to be analyzed. |
Outputs
|
out_score |
Compound score is a metric computed by summing the valence scores of each word in the lexicon, adjusted according to the rules, and then normalized to be between -1 (most extreme negative) and +1 (most extreme positive). |
|
positive_score |
Ratios for proportions of text that fall in POSITIVE category. |
|
neutral_score |
Ratios for proportions of text that fall in NEUTRAL category. |
|
negative_score |
Ratios for proportions of text that fall in NEGATIVE category. |
|
tag_value |
POSITIVE if out_score >= 0.05, NEGATIVE if out_score <=-0.05, NEUTRAL if out_score is between -0.05 and 0.05 |
|
tag_category |
Always set to CLEANSED if a score was successfully computed. |
Rule Parameters
None
Tags
|
EMPTY_OR_WHITESPACES |
Input value is an empty string or contains all whitespaces. |
|
POSITIVE |
Input text has Positive sentiment. |
|
NEUTRAL |
Input text has Neutral sentiment. |
|
NEGATIVE |
Input text has Negative sentiment. |
Name
cleanse_address_essentials_loqate
Description
Standardizes, verifies, cleanses, and geocodes address data using Loqate.
Inputs
|
in_address_line1 |
Input flat, building, or house number and street name. |
|
in_address_line2 |
Additional information for the address (e.g., suite number, unit, or other number). |
|
in_address_city |
Input name of the city, town, or locality. |
|
in_address_state |
Input name of the state / administrative area. |
|
in_address_postal |
Input value of the complete postal code for a particular delivery point. |
Outputs
|
out_address_line1 |
The alphanumeric code identifies an individual location along with the most common street or block data element within a country. |
|
tag_value |
Secondary identifiers for a particular delivery point. For instance, FLAT 1 or SUITE 212. |
|
out_address_city |
The most common population center data element within a country such as USA city. |
|
out_address_county |
Smallest geographic data element within a country such as USA county. |
|
out_address_state |
Most common geographic data element within a country such as USA State. |
|
out_address_country_name |
ISO 3166 official country name. |
|
out_address_country_code |
ISO 3166 2-character country code. |
|
out_address_postal |
Complete postal code for a particular delivery point. |
|
out_address_latitude |
WGS 84 latitude in decimal degrees format. |
|
out_address_longitude |
WGS 84 longitude in decimal degrees format. |
|
out_loqate_avc |
Loqate Address Verification Code. |
|
tag_value |
Tag value that provides explanation for malformed, unexpected or missing data. |
|
tag_category |
Tag category that categorizes tags as Missing Data, Cleansed Data or Invalid Data. |
Rule Parameters
|
country |
Country code |
Tags
|
EMPTY_OR_WHITESPACES |
Input address contains empty string or contains all whitespaces. |
|
LQT_PARTIALLY_VERIFIED |
Input address could only be partially verified. |
|
LQT_AMBIGUOUS |
Input address had more than one close reference data match. |
|
LQT_VERIFIED |
Input address had an exact match in the reference data. |
|
LQT_UNVERIFIED |
Input address did not match the reference data. |
Name
cleanse_aus_bank_acc
Description
Cleanse Australian bank account number.
Inputs
|
in_account |
Input bank account number. |
Outputs
|
out_account |
Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID. |
|
tag_value |
Tag value that provides explanation for malformed, unexpected, or missing data. |
|
tag_category |
Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data. |
Rule Parameters
|
default_account |
Default value when tag_category is MISSING or INVALID. |
Tags
|
EMPTY_OR_WHITESPACES |
Input value is an empty string or contains all whitespaces. |
|
INCORRECT_LENGTH_OR_NOT_AN_ABAN |
Input value has incorrect length or does not represent an AUS Bank Account Number. |
|
VALID_BANK_ACCOUNT |
Input value is Valid. |
|
CLEANSED_BANK_ACCOUNT |
Input value was reformatted or cleansed. |
Name
cleanse_aus_business_number
Description
Cleanse Australian business number.
Inputs
|
in_abn |
Input business number. |
Outputs
|
out_abn |
Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID. |
|
tag_value |
Tag value that provides explanation for malformed, unexpected, or missing data. |
|
tag_category |
Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data. |
Rule Parameters
|
default_abn |
Default value when tag_category is MISSING or INVALID. |
Tags
|
EMPTY_OR_WHITESPACES |
Input value is an empty string or contains all whitespaces. |
|
INCORRECT_LENGTH_OR_NOT_AN_ABN |
Input value has incorrect length or does not represent an AUS Business Number. |
|
FAILED_CHECKSUM |
Input value failed Modulus 10 checksum. |
|
VALID_ABN |
Input value is Valid. |
|
STANDARDIZED_ABN |
Input value was reformatted or cleansed. |
Name
cleanse_aus_phone
Description
Cleanse Australian phone number.
Inputs
|
in_phone |
Value to be cleansed or verified. |
Outputs
|
out_phone_local |
Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID. |
|
out_phone_intl |
Standardized phone in international format. |
|
tag_value |
Tag value that provides explanation for malformed, unexpected, or missing data. |
|
tag_category |
Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data. |
Rule Parameters
|
default_phone_local |
Default value when tag_category is MISSING or INVALID. |
|
default_phone_intl |
Default value when tag_category is MISSING or INVALID. |
Tags
|
EMPTY_OR_WHITESPACES |
Input value is an empty string or contains all whitespaces. |
|
FAILED_REGEX_CHECK |
Input value failed regular expression check. |
|
VALID_PHONE |
Input value is Valid. |
|
STANDARDIZED_PHONE |
Input value was reformatted or cleansed. |
Name
cleanse_date
Description
Cleanse date.
Inputs
|
in_date |
Value to be cleansed or verified. |
Outputs
|
out_date |
Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID. |
|
date_format |
Expected output date format. |
|
tag_value |
Tag value that provides explanation for malformed, unexpected, or missing data. |
|
tag_category |
Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data. |
Rule Parameters
|
date_format |
Expected output date format. |
|
default_date |
Default value when tag_category is MISSING or INVALID. |
Tags
|
EMPTY_OR_WHITESPACES |
Input value is an empty string or contains all whitespaces. |
|
FAILED_TO_PARSE_DATE |
Input value is either not a date value or is not a valid calendar date. |
|
VALID_DATE_IN_CORRECT_FORMAT |
Input value is Valid. |
|
VALID_DATE_BUT_INVALID_OUTPUT_FORMAT |
Input value is a Valid date, but the date format is Invalid. |
|
REFORMATTED_DATE |
Input value was reformatted or cleansed. |
Name
cleanse_email
Description
Cleanse email addresses.
Inputs
|
in_email |
Value to be cleansed or verified. |
Outputs
|
out_email |
Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID. |
|
tag_value |
Tag value that provides explanation for malformed, unexpected, or missing data. |
|
tag_category |
Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data. |
Rule Parameters
|
default_email |
Default value when tag_category is MISSING or INVALID. |
Tags
|
EMPTY_OR_WHITESPACES |
Input value is an empty string or contains all whitespaces. |
|
INVALID_EMAIL_ADDRESS |
Value does not represent a valid email address. |
|
VALID_EMAIL |
Input value is Valid. |
|
NORMALIZED_EMAIL |
Input value was reformatted or cleansed. |
Name
cleanse_email_loqate
Description
Cleanse email addresses with Loqate’s online email validation service.
Inputs
|
in_email |
Value to be cleansed or verified. |
Outputs
|
out_email |
Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID. |
|
tag_value |
Tag value that provides explanation for malformed, unexpected, or missing data. |
|
tag_category |
Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data. |
Rule Parameters
|
default_email |
Default value when tag_category is MISSING or INVALID. |
Tags
|
EMPTY_OR_WHITESPACES |
Input value is an empty string or contains all whitespaces. |
|
COULD_NOT_VERIFY |
Value does not represent a valid email address. |
|
DISPOSABLE_TEMPORARY_COMPLAINER |
The email address provided is a disposable mailbox. |
|
VALID_EMAIL |
Input value is Valid. |
|
NORMALIZED_EMAIL |
Input value was reformatted or cleansed. |
Name
cleanse_payment_card
Description
Cleanse payment card number (credit card, debit card, cash card, etc.).
Inputs
|
in_card |
Value to be cleansed or verified. |
Outputs
|
out_card |
Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID. |
|
out_card_issuer |
Name of the card issuer. |
|
tag_value |
Tag value that provides explanation for malformed, unexpected, or missing data. |
|
tag_category |
Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data. |
Rule Parameters
|
default_card |
Default value when tag_category is MISSING or INVALID. |
Tags
|
EMPTY_OR_WHITESPACES |
Input value is an empty string or contains all whitespaces. |
|
COULD_NOT_EXECUTE_LUHN_CHECK |
LUHN check could not execute successfully. |
|
FAILED_LUHN_CHECK |
Input value failed LUHN check. |
|
VALID_CARD |
Input value is Valid. |
|
STANDARDIZED_CARD |
Input value was reformatted or cleansed. |
Name
cleanse_phone_loqate
Description
Cleanse phone number using Loqate’s online validation service.
Inputs
|
in_phone |
Value to be cleansed or verified. |
Outputs
|
out_phone_local |
Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID. |
|
out_phone_intl |
Standardized phone in international format. |
|
out_phone_country |
ISO 2-digit country code. |
|
out_phone_country_code |
International dial code for the country |
|
out_phone_type |
Type of phone number (MOBILE, LANDLINE, OR VOIP) |
|
network_name |
Name of the current operator serving the supplied number. |
|
tag_value |
Tag value that provides explanation for malformed, unexpected, or missing data. |
|
tag_category |
Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data. |
Rule Parameters
|
default_phone |
Default value when tag_category is MISSING or INVALID. |
|
country |
ISO 2-digit country code. |
Tags
|
EMPTY_OR_WHITESPACES |
Input value is an empty string or contains all whitespaces. |
|
COULD_NOT_VERIFY |
Input value is not a valid phone number. |
|
VALID_PHONE |
Input value is Valid. |
|
NORMALIZED_PHONE |
Input value was reformatted or cleansed. |
Name
cleanse_usa_dea
Description
Cleanse USA Drug Enforcement Agency assigned prescriber identifier.
Inputs
|
in_dea |
Value to be cleansed or verified. |
|
in_last_name |
Last name of the prescriber. |
Outputs
|
out_dea |
Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID. |
|
tag_value |
Tag value that provides explanation for malformed, unexpected, or missing data. |
|
tag_category |
Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data. |
Rule Parameters
|
default_dea |
Default value when tag_category is MISSING or INVALID. |
Tags
|
EMPTY_OR_WHITESPACES |
Input value is an empty string or contains all whitespaces. |
|
INCORRECT_LENGTH_OR_NOT_A_DEA |
Input value has incorrect length or does not represent a US DEA value. |
|
INVALID_1ST_LETTER |
First character does not belong to the list of allowed characters. |
|
INVALID_2ND_LETTER |
Second character does not match the first character of provider's last name. |
|
INVALID_DEA |
Input value does not represent a DEA number. |
|
FAILED_DEA_CHECKSUM |
Input value failed DEQ checksum. |
|
VALID_DEA |
Input value is Valid. |
|
STANDARDIZED_DEA |
Input value was reformatted or cleansed. |
Name
cleanse_usa_npi
Description
Cleanse USA National Provider Identifier.
Inputs
|
in_npi |
Value to be cleansed or verified. |
Outputs
|
out_npi |
Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID. |
|
tag_value |
Tag value that provides explanation for malformed, unexpected, or missing data. |
|
tag_category |
Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data. |
Rule Parameters
|
default_npi |
Default value when tag_category is MISSING or INVALID. |
Tags
|
EMPTY_OR_WHITESPACES |
Input value is an empty string or contains all whitespaces. |
|
INCORRECT_LENGTH_OR_NOT_A_NPI |
Input value has incorrect length or does not represent a US NPI value. |
|
COULD_NOT_EXECUTE_LUHN_CHECK |
The regular expression for US NPI could not be compiled. |
|
FAILED_LUHN_CHECK |
Input value failed LUHN check. |
|
VALID_CARD |
Input value is Valid. |
|
STANDARDIZED_CARD |
Input value was reformatted or cleansed. |
Name
cleanse_usa_phone
Description
Cleanse USA phone number.
Inputs
|
in_phone |
Value to be cleansed or verified. |
Outputs
|
out_phone_local |
Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID. |
|
out_phone_intl |
Standardized phone in international format. |
|
tag_value |
Tag value that provides explanation for malformed, unexpected, or missing data. |
|
tag_category |
Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data. |
Rule Parameters
|
default_phone_local |
Default value when tag_category is MISSING or INVALID. |
|
coundefault_phone_intl |
Default value when tag_category is MISSING or INVALID. |
Tags
|
EMPTY_OR_WHITESPACES |
Input value is an empty string or contains all whitespaces. |
|
FAILED_REGEX_CHECK |
Input value failed regular expression check. |
|
VALID_PHONE |
Input value is Valid. |
|
NORMALIZED_PHONE |
Input value was reformatted or cleansed. |
Name
cleanse_usa_ssn
Description
Cleanse USA Social Security Number.
Inputs
|
in_ssn |
Value to be cleansed or verified. |
Outputs
|
out_ssn |
Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID. |
|
tag_value |
Tag value that provides explanation for malformed, unexpected, or missing data. |
|
tag_category |
Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data. |
Rule Parameters
|
default_ssn |
Default value when tag_category is MISSING or INVALID. |
Tags
|
EMPTY_OR_WHITESPACES |
Input value is an empty string or contains all whitespaces. |
|
INCORRECT_LENGTH_OR_NOT_A_SSN |
Input value has incorrect length or does not represent a US SSN value. |
|
FAILED_SSN_REGEX_CHECK |
Input value failed regular expression check. |
|
INVALID_AREA_CODE |
Input value contains invalid area code. |
|
INVALID_GROUP_CODE |
Input value contains invalid group code. |
|
INVALID_SERIAL |
Input value contains invalid serial number. |
|
BLACKLISTED_SSN |
Input value is blacklisted. |
|
VALID_SSN |
Input value is Valid. |
|
STANDARDIZED_SSN |
Input value was reformatted or cleansed. |
Name
compare_pair_values
Description
Compare two values using a comparison operator.
Inputs
|
input_value_a |
Input value A. |
|
input_value_b |
Input value B. |
Outputs
|
out_value_a |
Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID. |
|
comp_operator |
Comparison operator. |
|
data_type |
Expected data type, expected values are str, int, float, complex. |
|
tag_value |
Tag value that provides explanation for malformed, unexpected, or missing data. |
|
tag_category |
Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data. |
Rule Parameters
|
comp_operator |
Comparison operator - expected operators are <, >, <=, >=, ==, != |
|
default_value |
Default value when tag_category is MISSING or INVALID. |
|
data_type |
Expected data type, expected values are str, int, float, complex. |
Tags
|
COMPARISON_PASSED |
in_value_a passed comparison against in_value_b. |
|
COMPARISON_FAILED |
in_value_a failed comparison against in_value_b. |
|
COMPARISON_FAILED_INCOMPATIBLE_TYPES |
Input values could not be compared because they are of incompatible types. |
Name
detect_outlier_iqr
Description
Detects outliers using interquartile range.
Inputs
|
in_value |
Input value. |
Outputs
|
out_value |
Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID |
|
q1 |
first quartile |
|
q3 |
third quartile |
|
iqr |
interquartile range |
|
upper |
upper limit |
|
lower |
lower limit |
|
tag_value |
Tag value that provides explanation for malformed, unexpected, or missing data. |
|
tag_category |
Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data. |
Rule Parameters
|
default_value |
Default value when tag_category is MISSING or INVALID. |
Tags
|
NORMAL |
Input value is within normal range. |
|
OUTLIER |
Input value is abnormal or an anomaly. |
Name
detect_outlier_zscore
Description
Detects outliers using Z-score.
Inputs
|
in_value |
Input value. |
Outputs
|
out_value |
Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID |
|
z_score |
z score |
|
z_threshold |
z-score threshold |
|
tag_value |
Tag value that provides explanation for malformed, unexpected, or missing data. |
|
tag_category |
Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data. |
Rule Parameters
|
default_value |
Default value when tag_category is MISSING or INVALID. |
|
z_threshold |
z-score threshold |
Tags
|
NORMAL |
Input value is within normal range. |
|
OUTLIER |
Input value is abnormal or an anomaly. |
Name
impute_missing_numeric_values
Description
Identify missing numeric values and replace them with default value using statistical methods.
Inputs
|
in_value |
Input value. |
Outputs
|
out_value |
Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID |
|
tag_value |
Tag value that provides explanation for malformed, unexpected, or missing data. |
|
tag_category |
Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data. |
Rule Parameters
|
method |
Statistical method to use - max, min, mean, median, pctl |
|
pctl |
Percentile value if selected method = pctl |
Tags
|
IMPUTED |
Input value is missing and output value has been imputed. |
|
ORIGINAL |
Output value is same as input value. |
Name
verify_age
Description
Calculate and verify age from birth date.
Inputs
|
in_birth_date |
Input value to be cleansed or verified. |
Outputs
|
out_age |
Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID. |
|
tag_value |
Tag value that provides explanation for malformed, unexpected, or missing data. |
|
tag_category |
Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data. |
Rule Parameters
|
max_age |
Maximum expected age. |
Tags
|
EMPTY_OR_WHITESPACES |
Input value is an empty string or contains all whitespaces. |
|
BIRTHDATE_IN_FUTURE |
Birth date is in future. |
|
GREATER_THAN_MAX_AGE |
Calculated age is greater than max expected age. |
|
CALCULATED_AGE_FROM_BIRTHDATE |
Successfully parsed date and calculated age. |
|
FAILED_TO_PARSE_BIRTHDATE |
Invalid birth date, failed to parse date value. |
Name
verify_value_in_range
Description
Verify value is in a defined range.
Inputs
|
in_value |
Input value to be cleansed or verified. |
Outputs
|
out_value |
Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID. |
|
min_value |
Lower value of the range. |
|
max_value |
Upper value of the range. |
|
data_type |
Expected data type, expected values are str, int, float, complex. |
|
tag_value |
Tag value that provides explanation for malformed, unexpected, or missing data. |
|
tag_category |
Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data. |
Rule Parameters
|
min_value |
Lower value of the range. |
|
max_value |
Upper value of the range. |
|
default_value |
Default value when tag_category is MISSING or INVALID. |
|
data_type |
Expected data type, expected values are str, int, float, complex. |
Tags
|
VALUE_IN_RANGE |
Input value is within desired range. |
|
VALUE_OUT_OF_RANGE |
Input value is outside desired range. |
|
COMPARISON_FAILED_INCOMPATIBLE_TYPES |
Input value data type is incompatible with desired range. |
Name
verify_value_of_datatype
Description
Verify value against a known data type.
Inputs
|
in_value |
Input value to be cleansed or verified. |
Outputs
|
data_type |
Expected data type. |
|
out_value |
Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID. |
|
tag_value |
Tag value that provides explanation for malformed, unexpected, or missing data. |
|
tag_category |
Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data. |
Rule Parameters
|
default_value |
Default value when tag_category is MISSING or INVALID. |
|
data_type |
Expected data type, expected values are str, int, float, complex. |
Tags
|
VALID_DATA_TYPE |
Input value is of expected data type |
|
INVALID_DATA_TYPE |
Input value does not match expected data type. |
Name
verify_with_regex
Description
Verify value by matching with a regular expression.
Inputs
|
in_value |
Input value to be cleansed or verified. |
Outputs
|
regex |
Regular expression. |
|
out_value |
Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID. |
|
tag_value |
Tag value that provides explanation for malformed, unexpected, or missing data. |
|
tag_category |
Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data. |
Rule Parameters
|
default_value |
Default value when tag_category is MISSING or INVALID. |
|
regex |
Regular expression. |
Tags
|
EMPTY_OR_WHITESPACES |
Input value is an empty string or contains all whitespaces. |
|
DID_NOT_MATCH_REGEX |
Input value failed regular expression check. |
|
MATCHED_REGEX |
Input value is Valid because it matched regular expression. |
|
TRIMMED_WHITESPACES |
Input value was reformatted or cleansed. |