Packaged DQ Services

This appendix describes the default services packaged with ibi Data Quality in the DQCore workspace. Users can define Rules against these services but cannot edit or replace these services.

Analyze Text Sentiment

Name

analyze_text_sentiment

Description

Analyze text sentiment and categorize it as Positive, Neutral, or Negative.

Inputs

in_text

Input text to be analyzed.

Outputs

out_score

Compound score is a metric computed by summing the valence scores of each word in the lexicon, adjusted according to the rules, and then normalized to be between -1 (most extreme negative) and +1 (most extreme positive).

positive_score

Ratios for proportions of text that fall in POSITIVE category.

neutral_score

Ratios for proportions of text that fall in NEUTRAL category.

negative_score

Ratios for proportions of text that fall in NEGATIVE category.

tag_value

POSITIVE if out_score >= 0.05, NEGATIVE if out_score <=-0.05, NEUTRAL if out_score is between -0.05 and 0.05

tag_category

Always set to CLEANSED if a score was successfully computed.

Rule Parameters

None

Tags

EMPTY_OR_WHITESPACES

Input value is an empty string or contains all whitespaces.

POSITIVE

Input text has Positive sentiment.

NEUTRAL

Input text has Neutral sentiment.

NEGATIVE

Input text has Negative sentiment.

Cleanse Address Using Loqate

Name

cleanse_address_essentials_loqate

Description

Standardizes, verifies, cleanses, and geocodes address data using Loqate.

Inputs

in_address_line1

Input flat, building, or house number and street name.

in_address_line2

Additional information for the address (e.g., suite number, unit, or other number).

in_address_city

Input name of the city, town, or locality.

in_address_state

Input name of the state / administrative area.

in_address_postal

Input value of the complete postal code for a particular delivery point.

Outputs

out_address_line1

The alphanumeric code identifies an individual location along with the most common street or block data element within a country.

tag_value

Secondary identifiers for a particular delivery point. For instance, FLAT 1 or SUITE 212.

out_address_city

The most common population center data element within a country such as USA city.

out_address_county

Smallest geographic data element within a country such as USA county.

out_address_state

Most common geographic data element within a country such as USA State.

out_address_country_name

ISO 3166 official country name.

out_address_country_code

ISO 3166 2-character country code.

out_address_postal

Complete postal code for a particular delivery point.

out_address_latitude

WGS 84 latitude in decimal degrees format.

out_address_longitude

WGS 84 longitude in decimal degrees format.

out_loqate_avc

Loqate Address Verification Code.

tag_value

Tag value that provides explanation for malformed, unexpected or missing data.

tag_category

Tag category that categorizes tags as Missing Data, Cleansed Data or Invalid Data.

Rule Parameters

country

Country code

Tags

EMPTY_OR_WHITESPACES

Input address contains empty string or contains all whitespaces.

LQT_PARTIALLY_VERIFIED

Input address could only be partially verified.

LQT_AMBIGUOUS

Input address had more than one close reference data match.

LQT_VERIFIED

Input address had an exact match in the reference data.

LQT_UNVERIFIED

Input address did not match the reference data.

Cleanse Global Addresses Using Loqate API

Name

cleanse_address_loqate_api

Description

Standardizes, verifies, cleanses and geocodes worldwide address data using Loqate's online address verification service..

Special Instructions

In order to use this service, you will have to purchase an API access key from Loqate and request your DQ Admin to configure the API access key in the environment file as specified in Loqate API Access Keys configuration section.

Inputs

in_address_line1

Input flat, building, or house number and street name.

in_address_line2

Additional information for the address (e.g., suite number, unit, or other number).

in_address_city

Input name of the city, town, or locality.

in_address_state

Input name of the state / administrative area.

in_address_postal

Input value of the complete postal code for a particular delivery point.

in_address_country Input value of the 2 digit ISO country code

Outputs

out_address_line1

The alphanumeric code identifies an individual location along with the most common street or block data element within a country.

out_address_line2 Secondary identifiers for a particular delivery point. For instance, FLAT 1 or SUITE 212.
out_address_city The most common population center data element within a country such as USA city.
out_address_county Smallest geographic data element within a country such as USA county.
out_address_state Most common geographic data element within a country such as USA State.
out_address_country_name ISO 3166 official country name.
out_address_country_code ISO 3166 2-character country code.
out_address_postal Complete postal code for a particular delivery point
out_address_latitude WGS 84 latitude in decimal degrees format
out_address_longitude WGS 84 longitude in decimal degrees format
out_loqate_avc Loqate Address Verification Code
tag_value Tag value that provides explanation for malformed, unexpected or missing data
tag_category Tag category that categorizes tags as Missing Data, Cleansed Data or Invalid Data

Rule Parameters

country

Country code

Tags

EMPTY_OR_WHITESPACES

Input address contains empty string or contains all whitespaces.

LQT_PARTIALLY_VERIFIED

Input address could only be partially verified.

LQT_AMBIGUOUS

Input address had more than one close reference data match.

LQT_VERIFIED

Input address had an exact match in the reference data.

LQT_UNVERIFIED

Input address did not match the reference data.

LQT_REVERTED Input address could not be verified to the specified minimum acceptable level.

Cleanse Australian Bank Account Number

Name

cleanse_aus_bank_acc

Description

Cleanse Australian bank account number.

Inputs

in_account

Input bank account number.

Outputs

out_account

Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID.

tag_value

Tag value that provides explanation for malformed, unexpected, or missing data.

tag_category

Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data.

Rule Parameters

default_account

Default value when tag_category is MISSING or INVALID.

Tags

EMPTY_OR_WHITESPACES

Input value is an empty string or contains all whitespaces.

INCORRECT_LENGTH_OR_NOT_AN_ABAN

Input value has incorrect length or does not represent an AUS Bank Account Number.

VALID_BANK_ACCOUNT

Input value is Valid.

CLEANSED_BANK_ACCOUNT

Input value was reformatted or cleansed.

Cleanse Australian Business Number

Name

cleanse_aus_business_number

Description

Cleanse Australian business number.

Inputs

in_abn

Input business number.

Outputs

out_abn

Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID.

tag_value

Tag value that provides explanation for malformed, unexpected, or missing data.

tag_category

Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data.

Rule Parameters

default_abn

Default value when tag_category is MISSING or INVALID.

Tags

EMPTY_OR_WHITESPACES

Input value is an empty string or contains all whitespaces.

INCORRECT_LENGTH_OR_NOT_AN_ABN

Input value has incorrect length or does not represent an AUS Business Number.

FAILED_CHECKSUM

Input value failed Modulus 10 checksum.

VALID_ABN

Input value is Valid.

STANDARDIZED_ABN

Input value was reformatted or cleansed.

Cleanse Australian Phone Number

Name

cleanse_aus_phone

Description

Cleanse Australian phone number.

Inputs

in_phone

Value to be cleansed or verified.

Outputs

out_phone_local

Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID.

out_phone_intl

Standardized phone in international format.

tag_value

Tag value that provides explanation for malformed, unexpected, or missing data.

tag_category

Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data.

Rule Parameters

default_phone_local

Default value when tag_category is MISSING or INVALID.

default_phone_intl

Default value when tag_category is MISSING or INVALID.

Tags

EMPTY_OR_WHITESPACES

Input value is an empty string or contains all whitespaces.

FAILED_REGEX_CHECK

Input value failed regular expression check.

VALID_PHONE

Input value is Valid.

STANDARDIZED_PHONE

Input value was reformatted or cleansed.

Cleanse Date

Name

cleanse_date

Description

Cleanse date.

Inputs

in_date

Value to be cleansed or verified.

Outputs

out_date

Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID.

date_format

Expected output date format.

tag_value

Tag value that provides explanation for malformed, unexpected, or missing data.

tag_category

Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data.

Rule Parameters

date_format

Expected output date format.

default_date

Default value when tag_category is MISSING or INVALID.

Tags

EMPTY_OR_WHITESPACES

Input value is an empty string or contains all whitespaces.

FAILED_TO_PARSE_DATE

Input value is either not a date value or is not a valid calendar date.

VALID_DATE_IN_CORRECT_FORMAT

Input value is Valid.

VALID_DATE_BUT_INVALID_OUTPUT_FORMAT

Input value is a Valid date, but the date format is Invalid.

REFORMATTED_DATE

Input value was reformatted or cleansed.

Cleanse Email

Name

cleanse_email

Description

Cleanse email addresses.

Inputs

in_email

Value to be cleansed or verified.

Outputs

out_email

Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID.

tag_value

Tag value that provides explanation for malformed, unexpected, or missing data.

tag_category

Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data.

Rule Parameters

default_email

Default value when tag_category is MISSING or INVALID.

Tags

EMPTY_OR_WHITESPACES

Input value is an empty string or contains all whitespaces.

INVALID_EMAIL_ADDRESS

Value does not represent a valid email address.

VALID_EMAIL

Input value is Valid.

NORMALIZED_EMAIL

Input value was reformatted or cleansed.

Cleanse Email With Loqate

Name

cleanse_email_loqate

Description

Cleanse email addresses with Loqate’s online email validation service.

Special Instructions:

In order to use this service, you will have to purchase an API access key from Loqate and request your DQ Admin to configure the API access key in the environment file as specified in Loqate API Access Keys configuration section.

Inputs

in_email

Value to be cleansed or verified.

Outputs

out_email

Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID.

tag_value

Tag value that provides explanation for malformed, unexpected, or missing data.

tag_category

Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data.

Rule Parameters

default_email

Default value when tag_category is MISSING or INVALID.

Tags

EMPTY_OR_WHITESPACES

Input value is an empty string or contains all whitespaces.

COULD_NOT_VERIFY

Value does not represent a valid email address.

DISPOSABLE_TEMPORARY_COMPLAINER

The email address provided is a disposable mailbox.

VALID_EMAIL

Input value is Valid.

NORMALIZED_EMAIL

Input value was reformatted or cleansed.

Cleanse Payment Card

Name

cleanse_payment_card

Description

Cleanse payment card number (credit card, debit card, cash card, etc.).

Inputs

in_card

Value to be cleansed or verified.

Outputs

out_card

Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID.

out_card_issuer

Name of the card issuer.

tag_value

Tag value that provides explanation for malformed, unexpected, or missing data.

tag_category

Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data.

Rule Parameters

default_card

Default value when tag_category is MISSING or INVALID.

Tags

EMPTY_OR_WHITESPACES

Input value is an empty string or contains all whitespaces.

COULD_NOT_EXECUTE_LUHN_CHECK

LUHN check could not execute successfully.

FAILED_LUHN_CHECK

Input value failed LUHN check.

VALID_CARD

Input value is Valid.

STANDARDIZED_CARD

Input value was reformatted or cleansed.

Cleanse Person Name

Name

cleanse_person_name

Description

Standardizes person names by updating the first, middle and last name to camel case and concatenating the three parts together to create the full name.

Inputs

in_first_name First name of the person
in_middle_name Middle name of the person
in_last_name Last name of the person
in_full_name Full name of the person

Outputs

out_first_name Standardized first name of the person
out_middle_name Standardized middle name of the person
out_last_name Standardized last name of the person
out_full_name Standardized full name of the person
tag_value Tag value that provides explanation for malformed, unexpected or missing data
tag_category Tag category that categorizes tags as Missing Data, Cleansed Data or Invalid Data

Tags

EMPTY_OR_WHITESPACES

Input value is an empty string or contains all whitespaces.

MISSING_DATA One or more input name part (first or last name) is missing.
STANDARDIZED_NAME Input value was standardized.

Cleanse Phone Using Loqate

Name

cleanse_phone_loqate

Description

Cleanse phone number using Loqate’s online validation service.

Special Instructions

In order to use this service, you will have to purchase an API access key from Loqate and request your DQ Admin to configure the API access key in the environment file as specified in Loqate API Access Keys configuration section.

Inputs

in_phone

Value to be cleansed or verified.

Outputs

out_phone_local

Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID.

out_phone_intl

Standardized phone in international format.

out_phone_country

ISO 2-digit country code.

out_phone_country_code

International dial code for the country

out_phone_type

Type of phone number (MOBILE, LANDLINE, OR VOIP)

network_name

Name of the current operator serving the supplied number.

tag_value

Tag value that provides explanation for malformed, unexpected, or missing data.

tag_category

Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data.

Rule Parameters

default_phone

Default value when tag_category is MISSING or INVALID.

country

ISO 2-digit country code.

Tags

EMPTY_OR_WHITESPACES

Input value is an empty string or contains all whitespaces.

COULD_NOT_VERIFY

Input value is not a valid phone number.

VALID_PHONE

Input value is Valid.

NORMALIZED_PHONE

Input value was reformatted or cleansed.

Cleanse USA DEA

Name

cleanse_usa_dea

Description

Cleanse USA Drug Enforcement Agency assigned prescriber identifier.

Inputs

in_dea

Value to be cleansed or verified.

in_last_name

Last name of the prescriber.

Outputs

out_dea

Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID.

tag_value

Tag value that provides explanation for malformed, unexpected, or missing data.

tag_category

Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data.

Rule Parameters

default_dea

Default value when tag_category is MISSING or INVALID.

Tags

EMPTY_OR_WHITESPACES

Input value is an empty string or contains all whitespaces.

INCORRECT_LENGTH_OR_NOT_A_DEA

Input value has incorrect length or does not represent a US DEA value.

INVALID_1ST_LETTER

First character does not belong to the list of allowed characters.

INVALID_2ND_LETTER

Second character does not match the first character of provider's last name.

INVALID_DEA

Input value does not represent a DEA number.

FAILED_DEA_CHECKSUM

Input value failed DEQ checksum.

VALID_DEA

Input value is Valid.

STANDARDIZED_DEA

Input value was reformatted or cleansed.

Cleanse USA NPI

Name

cleanse_usa_npi

Description

Cleanse USA National Provider Identifier.

Inputs

in_npi

Value to be cleansed or verified.

Outputs

out_npi

Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID.

tag_value

Tag value that provides explanation for malformed, unexpected, or missing data.

tag_category

Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data.

Rule Parameters

default_npi

Default value when tag_category is MISSING or INVALID.

Tags

EMPTY_OR_WHITESPACES

Input value is an empty string or contains all whitespaces.

INCORRECT_LENGTH_OR_NOT_A_NPI

Input value has incorrect length or does not represent a US NPI value.

COULD_NOT_EXECUTE_LUHN_CHECK

The regular expression for US NPI could not be compiled.

FAILED_LUHN_CHECK

Input value failed LUHN check.

VALID_CARD

Input value is Valid.

STANDARDIZED_CARD

Input value was reformatted or cleansed.

Cleanse USA Phone

Name

cleanse_usa_phone

Description

Cleanse USA phone number.

Inputs

in_phone

Value to be cleansed or verified.

Outputs

out_phone_local

Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID.

out_phone_intl

Standardized phone in international format.

tag_value

Tag value that provides explanation for malformed, unexpected, or missing data.

tag_category

Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data.

Rule Parameters

default_phone_local

Default value when tag_category is MISSING or INVALID.

coundefault_phone_intl

Default value when tag_category is MISSING or INVALID.

Tags

EMPTY_OR_WHITESPACES

Input value is an empty string or contains all whitespaces.

FAILED_REGEX_CHECK

Input value failed regular expression check.

VALID_PHONE

Input value is Valid.

NORMALIZED_PHONE

Input value was reformatted or cleansed.

Cleanse USA SSN

Name

cleanse_usa_ssn

Description

Cleanse USA Social Security Number.

Inputs

in_ssn

Value to be cleansed or verified.

Outputs

out_ssn

Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID.

tag_value

Tag value that provides explanation for malformed, unexpected, or missing data.

tag_category

Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data.

Rule Parameters

default_ssn

Default value when tag_category is MISSING or INVALID.

Tags

EMPTY_OR_WHITESPACES

Input value is an empty string or contains all whitespaces.

INCORRECT_LENGTH_OR_NOT_A_SSN

Input value has incorrect length or does not represent a US SSN value.

FAILED_SSN_REGEX_CHECK

Input value failed regular expression check.

INVALID_AREA_CODE

Input value contains invalid area code.

INVALID_GROUP_CODE

Input value contains invalid group code.

INVALID_SERIAL

Input value contains invalid serial number.

BLACKLISTED_SSN

Input value is blacklisted.

VALID_SSN

Input value is Valid.

STANDARDIZED_SSN

Input value was reformatted or cleansed.

Compare Pair Values

Name

compare_pair_values

Description

Compare two values using a comparison operator.

Inputs

input_value_a

Input value A.

input_value_b

Input value B.

Outputs

out_value_a

Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID.

comp_operator

Comparison operator.

data_type

Expected data type, expected values are str, int, float, complex.

tag_value

Tag value that provides explanation for malformed, unexpected, or missing data.

tag_category

Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data.

Rule Parameters

comp_operator

Comparison operator - expected operators are <, >, <=, >=, ==, !=

default_value

Default value when tag_category is MISSING or INVALID.

data_type

Expected data type, expected values are str, int, float, complex.

Tags

COMPARISON_PASSED

in_value_a passed comparison against in_value_b.

COMPARISON_FAILED

in_value_a failed comparison against in_value_b.

COMPARISON_FAILED_INCOMPATIBLE_TYPES

Input values could not be compared because they are of incompatible types.

Detect Outliers Using Interquartile Range

Name

detect_outlier_iqr

Description

Detects outliers using interquartile range.

Inputs

in_value

Input value.

Outputs

out_value

Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID

q1

first quartile

q3

third quartile

iqr

interquartile range

upper

upper limit

lower

lower limit

tag_value

Tag value that provides explanation for malformed, unexpected, or missing data.

tag_category

Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data.

Rule Parameters

default_value

Default value when tag_category is MISSING or INVALID.

Tags

NORMAL

Input value is within normal range.

OUTLIER

Input value is abnormal or an anomaly.

Detect Outliers Using Z-Score

Name

detect_outlier_zscore

Description

Detects outliers using Z-score.

Inputs

in_value

Input value.

Outputs

out_value

Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID

z_score

z score

z_threshold

z-score threshold

tag_value

Tag value that provides explanation for malformed, unexpected, or missing data.

tag_category

Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data.

Rule Parameters

default_value

Default value when tag_category is MISSING or INVALID.

z_threshold

z-score threshold

Tags

NORMAL

Input value is within normal range.

OUTLIER

Input value is abnormal or an anomaly.

Impute Missing Numeric Values

Name

impute_missing_numeric_values

Description

Identify missing numeric values and replace them with default value using statistical methods.

Inputs

in_value

Input value.

Outputs

out_value

Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID

tag_value

Tag value that provides explanation for malformed, unexpected, or missing data.

tag_category

Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data.

Rule Parameters

method

Statistical method to use - max, min, mean, median, pctl

pctl

Percentile value if selected method = pctl

Tags

IMPUTED

Input value is missing and output value has been imputed.

ORIGINAL

Output value is same as input value.

Verify Age

Name

verify_age

Description

Calculate and verify age from birth date.

Inputs

in_birth_date

Input value to be cleansed or verified.

Outputs

out_age

Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID.

tag_value

Tag value that provides explanation for malformed, unexpected, or missing data.

tag_category

Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data.

Rule Parameters

max_age

Maximum expected age.

Tags

EMPTY_OR_WHITESPACES

Input value is an empty string or contains all whitespaces.

BIRTHDATE_IN_FUTURE

Birth date is in future.

GREATER_THAN_MAX_AGE

Calculated age is greater than max expected age.

CALCULATED_AGE_FROM_BIRTHDATE

Successfully parsed date and calculated age.

FAILED_TO_PARSE_BIRTHDATE

Invalid birth date, failed to parse date value.

Verify Value in Range

Name

verify_value_in_range

Description

Verify value is in a defined range.

Inputs

in_value

Input value to be cleansed or verified.

Outputs

out_value

Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID.

min_value

Lower value of the range.

max_value

Upper value of the range.

data_type

Expected data type, expected values are str, int, float, complex.

tag_value

Tag value that provides explanation for malformed, unexpected, or missing data.

tag_category

Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data.

Rule Parameters

min_value

Lower value of the range.

max_value

Upper value of the range.

default_value

Default value when tag_category is MISSING or INVALID.

data_type

Expected data type, expected values are str, int, float, complex.

Tags

VALUE_IN_RANGE

Input value is within desired range.

VALUE_OUT_OF_RANGE

Input value is outside desired range.

COMPARISON_FAILED_INCOMPATIBLE_TYPES

Input value data type is incompatible with desired range.

Verify Value of Data Type

Name

verify_value_of_datatype

Description

Verify value against a known data type.

Inputs

in_value

Input value to be cleansed or verified.

Outputs

data_type

Expected data type.

out_value

Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID.

tag_value

Tag value that provides explanation for malformed, unexpected, or missing data.

tag_category

Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data.

Rule Parameters

default_value

Default value when tag_category is MISSING or INVALID.

data_type

Expected data type, expected values are str, int, float, complex.

Tags

VALID_DATA_TYPE

Input value is of expected data type

INVALID_DATA_TYPE

Input value does not match expected data type.

Verify with Lookup

Name

lookup_values

Description

Verify input values against a data set.

Inputs

in_value

Input value to be cleansed or verified.

Outputs

out_value

Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID.

tag_value

Tag value that provides explanation for malformed, unexpected, or missing data.

tag_category

Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data.

Rule Parameters

Look up file

Upload a new lookup file (csv with one column of values without a header row) or select an existing lookup file.

Tags

EMPTY_OR_WHITESPACES

Input value is an empty string or contains all whitespaces.

VALID_VALUE

Input value is a valid value that FOUND a match in the lookup dataset.

INVALID_VALUE

Input value is invalid because it was NOT FOUND in the lookup dataset.

CLEANSED_VALUE

Input value was cleansed and cleansed value FOUND a match in the lookup dataset.

Verify With Regular Expression

Name

verify_with_regex

Description

Verify value by matching with a regular expression.

Inputs

in_value

Input value to be cleansed or verified.

Outputs

regex

Regular expression.

out_value

Input value when tag_category is VALID, cleansed value when tag_category is CLEANSED, default value when tag_category is MISSING or INVALID.

tag_value

Tag value that provides explanation for malformed, unexpected, or missing data.

tag_category

Tag category that categorizes tags as Missing Data, Cleansed Data, or Invalid Data.

Rule Parameters

default_value

Default value when tag_category is MISSING or INVALID.

regex

Regular expression.

Tags

EMPTY_OR_WHITESPACES

Input value is an empty string or contains all whitespaces.

DID_NOT_MATCH_REGEX

Input value failed regular expression check.

MATCHED_REGEX

Input value is Valid because it matched regular expression.

TRIMMED_WHITESPACES

Input value was reformatted or cleansed.