Custom Analyzer | Elasticsearch Reference [1.7]



Custom Analyzer

An analyzer of type `custom` that combines a Tokenizer with zero or more Token Filters and zero or more Char Filters. The custom analyzer accepts the logical/registered name of the tokenizer to use, a list of logical/registered names of token filters, and a list of logical/registered names of char filters. The name of the custom analyzer must not start with "_".

The following settings can be set for a `custom` analyzer:

Setting	Description
`tokenizer`	The logical / registered name of the tokenizer to use.
`filter`	An optional list of logical / registered names of token filters.
`char_filter`	An optional list of logical / registered names of char filters.
`position_offset_gap`	An optional number of positions to increment between each value of a field that uses this analyzer.

Here is an example:

index :
    analysis :
        analyzer :
            myAnalyzer2 :
                type : custom
                tokenizer : myTokenizer1
                filter : [myTokenFilter1, myTokenFilter2]
                char_filter : [my_html]
                position_offset_gap : 256
        tokenizer :
            myTokenizer1 :
                type : standard
                max_token_length : 900
        filter :
            myTokenFilter1 :
                type : stop
                stopwords : [stop1, stop2, stop3, stop4]
            myTokenFilter2 :
                type : length
                min : 0
                max : 2000
        char_filter :
            my_html :
                type : html_strip
                escaped_tags : [xxx, yyy]
                read_ahead : 1024
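
The example above defines every component under its own name before referencing it. Because the settings take logical / registered names, a custom analyzer can also refer directly to components that are already registered. The following is a minimal sketch under that assumption: the analyzer name `my_minimal_analyzer` is hypothetical, and `standard`, `lowercase`, `stop`, and `html_strip` are assumed to be the built-in registered names of the standard tokenizer, the lowercase and stop token filters, and the HTML strip char filter.

```yaml
index :
    analysis :
        analyzer :
            # Hypothetical analyzer name; it must not start with "_".
            my_minimal_analyzer :
                type : custom
                # Assumed built-in tokenizer, referenced by its registered name.
                tokenizer : standard
                # Assumed built-in token filters, listed in the analyzer's filter setting.
                filter : [lowercase, stop]
                # Assumed built-in char filter.
                char_filter : [html_strip]
```

Only `tokenizer` is set explicitly here alongside the two optional lists; `position_offset_gap` is omitted, since the table above describes it as optional.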
