Indexing Options
|
These options are available only when the text source on the Quick tab is set to Files or Spreadsheet.
|
Document upload
|
- Backoff retries: Use this option to specify the number of times the node retries indexing a document if the server is too busy.
- Backoff time (sec): Use this option to specify how long the node waits before retrying to index a document if the server is too busy.
- Max. deg. of Parallelism: Use this option to specify the maximum number of bulk requests active at a time.
- Bulk request size: Use this option to specify the number of documents to send per bulk request.
Note: Indexing performance depends mainly on the total size of the documents being indexed at a time. The general recommendation for good performance is to keep this total at about 20 MB. For example, if your documents are about 1 MB each, you can use 5 for Bulk request size and 4 for Max. deg. of Parallelism, so that the combination leads to 20 MB of documents indexed at a time.
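The sizing arithmetic and the retry behavior described above can be illustrated with a minimal Python sketch. It is not the node's implementation: the function names, the `send` callable, and the fixed wait between attempts are assumptions made for illustration only.

```python
import time
from typing import Callable

TARGET_IN_FLIGHT_MB = 20.0  # guideline taken from the note above


def in_flight_mb(avg_doc_mb: float, bulk_request_size: int, parallelism: int) -> float:
    """Approximate total size of documents being indexed at one time."""
    return avg_doc_mb * bulk_request_size * parallelism


def suggest_bulk_request_size(avg_doc_mb: float, parallelism: int,
                              target_mb: float = TARGET_IN_FLIGHT_MB) -> int:
    """Largest bulk request size that keeps the in-flight total within target_mb."""
    return max(1, int(target_mb // (avg_doc_mb * parallelism)))


def send_with_backoff(send: Callable[[], bool], backoff_retries: int,
                      backoff_time_sec: float,
                      sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call send() once, then retry up to backoff_retries times.

    `send` is a hypothetical callable that returns True on success and
    False when the server is too busy. Waiting a fixed backoff_time_sec
    between attempts is an assumption about how the two options combine.
    """
    for attempt in range(backoff_retries + 1):
        if send():
            return True
        if attempt < backoff_retries:
            sleep(backoff_time_sec)
    return False


# The example from the note: 1 MB documents, 5 per request, 4 requests at a time.
print(in_flight_mb(1.0, 5, 4))            # 20.0
print(suggest_bulk_request_size(1.0, 4))  # 5
```

With larger documents, the same calculation suggests lowering Bulk request size, Max. deg. of Parallelism, or both, so that their product with the average document size stays near 20 MB.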
|
Overwrite existing index
|
Select this option to overwrite the index if an index with the same name (specified on the Connection tab) already exists on the Elasticsearch server.
|
Delete Index on completion
|
Select this option to delete the index created by the node at the end of the analysis.
|
|
Elasticsearch request options
|
Http request timeout (sec)
|
Use this option to specify how long, in seconds, the server waits before failing a request.
|