What Is and Is Not Discovered

At times, it might not be obvious why some relationships are discovered and others are not. The list below explains some of the reasons:

• The number of matches between two columns is fewer than three, so the relationship is not discovered.

For example, if two columns contain the same kind of data but have only two possible values, Discovery does not discover the relationship. This threshold cannot be changed.

• Columns with STRING data types have dissimilar names, but the data has at least three matches, so the relationship is discovered. Relationships involving columns with STRING data types put more emphasis on the data than on the column names. See Adjusting the Weights of the RPS Factors for the weights applied to STRING data types.

• Column names begin and end with the same characters, but the characters between do not match. For example, if CUSTOMER_ID and CUSTOMER_ACCOUNT_ID have matching data, the relationship might not be found or might have a low score because the column names are different. The more characters in the column names that do not match, the lower the score.

• Two columns of an INTEGER data type might not be found or might have a low score. For example, even if all the integers in two BIGINT columns match, the score can still be low, because for integer data types Discovery puts more emphasis on the column name comparison.

• If the Min Score value in the model Diagram is set too high, relationships might be found but not displayed. The default value is 75.

• Matches might not be found because capitalization is different. You can control this using the Use Case Sensitivity for Discovery configuration parameter. See Configuring Case Sensitivity.
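
The three-match threshold, the Min Score filter, and the case-sensitivity setting described above can be pictured with a small sketch. This is only an illustration and not Discovery's actual algorithm: the function names are hypothetical, and the sketch assumes that a "match" means a distinct value present in both columns and that a relationship is displayed when its score is at least Min Score.

```python
# Illustrative sketch only. These helpers are hypothetical and are not
# part of Discovery; they mirror the rules listed above under stated
# assumptions.

MIN_MATCHES = 3          # fixed threshold; cannot be changed
DEFAULT_MIN_SCORE = 75   # default Min Score in the model Diagram


def count_matches(left, right, case_sensitive=True):
    """Count distinct values that appear in both columns (assumed meaning of 'match')."""
    if not case_sensitive:
        left = [value.lower() for value in left]
        right = [value.lower() for value in right]
    return len(set(left) & set(right))


def is_candidate(left, right, case_sensitive=True):
    """A column pair is considered only if it has at least three matches."""
    return count_matches(left, right, case_sensitive) >= MIN_MATCHES


def is_displayed(score, min_score=DEFAULT_MIN_SCORE):
    """A found relationship is shown only if its score reaches Min Score (assumed inclusive)."""
    return score >= min_score


# Two flag columns share only two possible values, so no relationship.
flags_a = ["Y", "N", "Y", "N"]
flags_b = ["N", "Y", "Y"]
print(is_candidate(flags_a, flags_b))                           # False

# Same values with different capitalization.
cities_a = ["Boston", "Denver", "Austin"]
cities_b = ["BOSTON", "DENVER", "AUSTIN"]
print(is_candidate(cities_a, cities_b))                         # False
print(is_candidate(cities_a, cities_b, case_sensitive=False))   # True

# A relationship scored 60 is found but hidden at the default Min Score.
print(is_displayed(60))                                         # False
print(is_displayed(60, min_score=50))                           # True
```

In this sketch, lowering min_score or turning off case sensitivity changes which relationships appear, which corresponds to adjusting Min Score in the model Diagram or the Use Case Sensitivity for Discovery parameter.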