---
title: Using the annotated-text field
description: The annotated-text tokenizes text content as per the more common text field (see "limitations" below) but also injects any marked-up annotation tokens...
url: https://www.elastic.co/docs/reference/elasticsearch/plugins/mapper-annotated-text-usage
---

# Using the annotated-text field

The `annotated-text` field tokenizes text content in the same way as the more common [`text`](https://www.elastic.co/docs/reference/elasticsearch/mapping-reference/text) field (see "limitations" below) but also injects any marked-up annotation tokens directly into the search index:

```json
{
  "mappings": {
    "properties": {
      "my_field": {
        "type": "annotated_text"
      }
    }
  }
}
```

Such a mapping allows marked-up text, e.g. Wikipedia articles, to be indexed as both text and structured tokens. The annotations use a markdown-like syntax: the annotated text appears in square brackets, followed in parentheses by one or more URL-encoded values separated by the `&` symbol.

We can use the `_analyze` API to test how an example annotation would be stored as tokens in the search index:

```js
GET my-index-000001/_analyze
{
  "field": "my_field",
  "text": "Investors in [Apple](Apple+Inc.) rejoiced."
}
```

Response:

```js
{
  "tokens": [
    {
      "token": "investors",
      "start_offset": 0,
      "end_offset": 9,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "in",
      "start_offset": 10,
      "end_offset": 12,
      "type": "<ALPHANUM>",
      "position": 1
    },
    {
      "token": "Apple Inc.",
      "start_offset": 13,
      "end_offset": 18,
      "type": "annotation",
      "position": 2
    },
    {
      "token": "apple",
      "start_offset": 13,
      "end_offset": 18,
      "type": "<ALPHANUM>",
      "position": 2
    },
    {
      "token": "rejoiced",
      "start_offset": 19,
      "end_offset": 27,
      "type": "<ALPHANUM>",
      "position": 3
    }
  ]
}
```

We can now perform searches for annotations using regular `term` queries that don’t tokenize the provided search values.
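
The annotation syntax described above (text in square brackets, URL-encoded values in parentheses, `&` between values) can be generated programmatically before indexing. The following is a minimal Python sketch, not part of the plugin: the helper names `annotate` and `annotation_values` are hypothetical, and rejecting any value that contains `=` is a conservative assumption based on the parse-failure rule for equals signs.

```python
from typing import List
from urllib.parse import quote_plus, unquote_plus


def annotate(text: str, *values: str) -> str:
    """Wrap `text` in annotated-text markup: [text](value1&value2).

    Hypothetical helper for illustration only.
    """
    if not values:
        raise ValueError("at least one annotation value is required")
    for value in values:
        # Assumption: refuse '=' up front, because documents whose
        # annotation values contain '=' are rejected at index time.
        if "=" in value:
            raise ValueError(f"annotation value must not contain '=': {value!r}")
    # quote_plus URL-encodes each value (a space becomes '+');
    # '&' separates multiple values for the same piece of text.
    encoded = "&".join(quote_plus(value) for value in values)
    return f"[{text}]({encoded})"


def annotation_values(encoded: str) -> List[str]:
    """Decode the part between the parentheses back into its values."""
    return [unquote_plus(part) for part in encoded.split("&")]


print(annotate("Apple", "Apple Inc."))
# [Apple](Apple+Inc.)
print(annotate("Jeff Beck", "Jeff Beck", "Guitarist"))
# [Jeff Beck](Jeff+Beck&Guitarist)
print(annotation_values("Jeff+Beck&Guitarist"))
# ['Jeff Beck', 'Guitarist']
```

The decoded values are the tokens a `term` query has to match exactly, for example `Apple Inc.` rather than `apple`.
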
Annotations are a more precise way of matching, as can be seen in this example where a search for `Beck` will not match `Jeff Beck`:

```json
# Example documents
{
  "my_field": "[Beck](Beck) announced a new tour"
}

{
  "my_field": "[Jeff Beck](Jeff+Beck&Guitarist) plays a strat"
}

# Example search
{
  "query": {
    "term": {
      "my_field": "Beck"
    }
  }
}
```

Any use of `=` signs in annotation values, e.g. `[Prince](person=Prince)`, will cause the document to be rejected with a parse failure. We hope to have a use for the equals sign in the future, so documents that contain it are actively rejected today.

## Synthetic `_source`

Synthetic `_source` is Generally Available only for TSDB indices (indices that have `index.mode` set to `time_series`). For other indices, synthetic `_source` is in technical preview. Features in technical preview may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.

If a `keyword` sub-field is used, the values are sorted in the same way as a `keyword` field’s values are sorted. By default, that means sorted with duplicates removed. So:

```json
{
  "settings": {
    "index": {
      "mapping": {
        "source": {
          "mode": "synthetic"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "text": {
        "type": "annotated_text",
        "fields": {
          "raw": {
            "type": "keyword"
          }
        }
      }
    }
  }
}

{
  "text": [
    "the quick brown fox",
    "the quick brown fox",
    "jumped over the lazy dog"
  ]
}
```

Will become:

```json
{
  "text": [
    "jumped over the lazy dog",
    "the quick brown fox"
  ]
}
```

Reordering text fields can have an effect on [phrase](https://www.elastic.co/docs/reference/query-languages/query-dsl/query-dsl-match-query-phrase) and [span](https://www.elastic.co/docs/reference/query-languages/query-dsl/span-queries) queries. See the discussion about [`position_increment_gap`](https://www.elastic.co/docs/reference/elasticsearch/mapping-reference/position-increment-gap) for more detail.
You can avoid this by making sure the `slop` parameter on the phrase queries is lower than the `position_increment_gap`. This is the default.

If the `annotated_text` field sets `store` to true then order and duplicates are preserved.

```json
{
  "settings": {
    "index": {
      "mapping": {
        "source": {
          "mode": "synthetic"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "text": {
        "type": "annotated_text",
        "store": true
      }
    }
  }
}

{
  "text": [
    "the quick brown fox",
    "the quick brown fox",
    "jumped over the lazy dog"
  ]
}
```

Will become:

```json
{
  "text": [
    "the quick brown fox",
    "the quick brown fox",
    "jumped over the lazy dog"
  ]
}
```
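
The two outcomes described for synthetic `_source` can be mimicked in a few lines of Python to make the rule concrete. This is only an illustrative sketch, not how Elasticsearch builds synthetic `_source`: the function name `synthetic_text_values` and its `stored` flag are hypothetical, and plain Python string ordering stands in for the `keyword` sort order, an assumption that holds for the ASCII values used here.

```python
from typing import Iterable, List


def synthetic_text_values(values: Iterable[str], stored: bool = False) -> List[str]:
    """Emulate the value order returned by synthetic _source.

    Hypothetical helper for illustration only.
    """
    if stored:
        # "store": true -> original order and duplicates are preserved.
        return list(values)
    # keyword sub-field default -> sorted with duplicates removed.
    # Assumption: Python's default string ordering approximates keyword sorting.
    return sorted(set(values))


doc_values = [
    "the quick brown fox",
    "the quick brown fox",
    "jumped over the lazy dog",
]

print(synthetic_text_values(doc_values))
# ['jumped over the lazy dog', 'the quick brown fox']
print(synthetic_text_values(doc_values, stored=True))
# ['the quick brown fox', 'the quick brown fox', 'jumped over the lazy dog']
```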