Advanced Tab
To connect to the Azure Data Lake Storage adapter, set the following properties on the Advanced tab of the New Data Source connection window:
 
Field
Description
Concurrent Request Limit
This property accepts a value between 0 and 65536. It specifies the concurrency limit imposed on the underlying data source.
Default String Length
The default VARCHAR length.
Detect Partition During Introspection
Select this option to automatically detect any partitions the files might have.
Note that if partitions are not detected correctly, both usability and performance are adversely affected.
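Partitioned data sets are typically laid out as key=value directory segments in the file path. The following is a hypothetical sketch of how such partitions might be detected; the function name and the example paths are assumptions for illustration, not the adapter's actual implementation.

```python
from typing import Dict, List

def detect_partitions(paths: List[str]) -> Dict[str, set]:
    """Collect partition column names and values from key=value path segments."""
    partitions: Dict[str, set] = {}
    for path in paths:
        for segment in path.split("/"):
            if "=" in segment:
                key, _, value = segment.partition("=")
                partitions.setdefault(key, set()).add(value)
    return partitions

# Illustrative hive-style layout, not a real storage listing.
paths = [
    "sales/year=2021/month=01/part-0.parquet",
    "sales/year=2021/month=02/part-0.parquet",
    "sales/year=2022/month=01/part-0.parquet",
]
detect_partitions(paths)
# year -> {2021, 2022}, month -> {01, 02}
```

When detection succeeds, the partition keys surface as queryable columns, which is why misdetected partitions hurt both usability and query performance.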
CSV Options
 
Include CSV Files
Check this option to include delimited files from the storage area.
Character Set
The character set used by the data source.
Delimiter
Indicates the file delimiter character.
Text Qualifier
Indicates the type of qualifier that is used in the file to enclose a string field.
Has Header Row
Indicates whether or not the file has a header row.
Infer Schema
Choosing this option enables the parser to infer the schema and the data type of each column from the data in the file.
Note: If this option is selected, it is recommended to provide a sampling ratio while introspecting the data source, so that only a sample of the data is read when inferring the schema. Providing a sampling ratio reduces overhead because the parser does not have to read every row to infer the schema. Parquet files do not require schema inference because their schema is encoded in their metadata.
CSV Escape Character
Indicates the escape character, which tells the parser to treat the following character (for example, a delimiter or qualifier inside a field) literally.
CSV Parser Lib
The library used to parse the delimited files. The currently supported libraries are commons (default) and uniVocity.
CSV Parsing Mode
The parsing mode used by the data source. Allowed values are PERMISSIVE (include malformed rows), DROPMALFORMED (drop malformed rows), and FAILFAST (fail the introspection when a malformed row is encountered).
CSV Comment Character
Indicates the character that marks a comment line in the file.
CSV Null Value
Indicates the value that is treated as NULL in a row.
CSV File Name Filters
Indicates the file name extensions that are valid.
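The three CSV parsing modes above can be illustrated with a minimal sketch. This toy parser and its behaviour for rows with the wrong column count are assumptions for illustration only, not the adapter's internal CSV engine; the mode names match the ones documented above.

```python
import csv
import io
from typing import List, Optional, Tuple

def parse_csv(text: str, mode: str = "PERMISSIVE") -> Tuple[List[str], List[List[Optional[str]]]]:
    """Parse CSV text with a header row, handling malformed rows per mode."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    rows: List[List[Optional[str]]] = []
    for row in reader:
        if len(row) != len(header):  # malformed: wrong column count
            if mode == "FAILFAST":
                raise ValueError(f"Malformed row: {row}")
            if mode == "DROPMALFORMED":
                continue  # silently drop the bad row
            # PERMISSIVE: keep the row, padding missing columns with None
            row = row + [None] * (len(header) - len(row))
        rows.append(row)
    return header, rows

data = "id,name\n1,alice\n2\n3,carol"      # the second data row is malformed
_, kept = parse_csv(data, "PERMISSIVE")    # 3 rows; row 2 padded with None
_, dropped = parse_csv(data, "DROPMALFORMED")  # 2 well-formed rows
```

FAILFAST on the same input raises an error as soon as the malformed row is read, which is why it aborts the whole introspection.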
Parquet Options
 
Include Parquet Files
Check this option to include Parquet files from the storage area.
Binary as String
Check this option to read binary values as strings.
INT96 as Timestamp
Check this option to read INT96 values as Timestamps.
Compression Codec
Parquet files are typically compressed. This setting controls the compression algorithm used to process them. For more information about the different options, see https://spark.apache.org/docs/2.4.3/sql-data-sources-parquet.html
Filter Push-Down
Controls whether a predicate specified in a WHERE clause in a SQL query will be pushed down to the Cloud File System data source.
Merge Schema
For partitioned files, choosing this option merges the data and creates a single schema that includes the columns from all partitions.
Parquet File Name Filters
Indicates the file name extensions that are valid.
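The Merge Schema behaviour can be sketched as taking the union of the columns seen across all partitions. The sketch below, including the sample partition schemas, is an illustrative assumption and not the adapter's actual merge logic.

```python
from typing import Dict, List

def merge_schemas(partition_schemas: List[Dict[str, str]]) -> Dict[str, str]:
    """Union the (column -> type) mappings of every partition; first type seen wins."""
    merged: Dict[str, str] = {}
    for schema in partition_schemas:
        for column, col_type in schema.items():
            merged.setdefault(column, col_type)
    return merged

# Illustrative partition schemas: a later partition adds an "email" column.
schemas = [
    {"id": "INT", "name": "VARCHAR"},
    {"id": "INT", "name": "VARCHAR", "email": "VARCHAR"},
]
merge_schemas(schemas)
# {'id': 'INT', 'name': 'VARCHAR', 'email': 'VARCHAR'}
```

Without schema merging, only the schema of the first partition read would be used, and columns present only in other partitions would be invisible.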