Custom Analyzeredit
An analyzer of type custom that allows to combine a Tokenizer with
zero or more Token Filters, and zero or more Char Filters. The
custom analyzer accepts a logical/registered name of the tokenizer to
use, and a list of logical/registered names of token filters.
The name of the custom analyzer must not start with "_".
The following are settings that can be set for a custom analyzer type:
| Setting | Description |
|---|---|
| The logical / registered name of the tokenizer to use. |
| An optional list of logical / registered name of token filters. |
| An optional list of logical / registered name of char filters. |
| An optional number of positions to increment between each field value of a field using this analyzer. Defaults to 100. 100 was chosen because it prevents phrase queries with reasonably large slops (less than 100) from matching terms across field values. |
Here is an example:
index :
analysis :
analyzer :
myAnalyzer2 :
type : custom
tokenizer : myTokenizer1
filter : [myTokenFilter1, myTokenFilter2]
char_filter : [my_html]
position_increment_gap: 256
tokenizer :
myTokenizer1 :
type : standard
max_token_length : 900
filter :
myTokenFilter1 :
type : stop
stopwords : [stop1, stop2, stop3, stop4]
myTokenFilter2 :
type : length
min : 0
max : 2000
char_filter :
my_html :
type : html_strip
escaped_tags : [xxx, yyy]
read_ahead : 1024