
Elasticsearch was working fine until today.

Issue:

Some documents are failing to index with this error:

u'Limit of total fields [1000] in index [mintegral_incent] has been exceeded' 

Error:

"BulkIndexError: (u'14 document(s) failed to index.', [{u'index': {u'status': 400, u'_type': u'mintegral_incent', u'_id': u'168108082', u'error': {u'reason': u'Limit of total fields [1000] in index [mintegral_incent] has been exceeded', u'type': u'illegal_argument_exception'}

Using Amazon Elasticsearch Service

Elasticsearch Version 5.1

ES setup:

from elasticsearch import Elasticsearch
from elasticsearch import helpers
es_repo = Elasticsearch(hosts=[settings.ES_INDEX_URL],
                        verify_certs=True)

Code giving issue:

def bulk_index_offers(index_name, id_field, docs):
    actions = []
    for doc in docs:
        action = {
            "_index": index_name,
            "_type": index_name,
            "_id": doc.get(id_field),
            "_source": doc
        }
        actions.append(action)
    # The error is raised at the following line.
    resp = helpers.bulk(es_repo, actions)
    return resp

What I have tried:

I have tried using a smaller chunk size and increasing the request timeout from the default 10 to 30, like this: resp = helpers.bulk(es_repo, actions, chunk_size=500, request_timeout=30)

But I am still facing the same issue.

Please help.

2 Answers


By default, a mapping type is only allowed to contain up to 1000 fields, and it seems you are exceeding that limit. In order to increase that threshold, you can run this command:

PUT mintegral_incent/_settings
{ 
  "index": {
    "mapping": {
      "total_fields": {
        "limit": "2000"
      }
    }
  }
}

Using curl, it'd look like this:

curl -XPUT http://<your.amazon.host>/mintegral_incent/_settings -d '{ 
  "index": {
    "mapping": {
      "total_fields": {
        "limit": "2000"
      }
    }
  }
}'

Then you can run your bulk script again and it should work.
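If you can't easily run curl against the Amazon endpoint, you can apply the same setting through the Python client you are already using. Here is a minimal sketch, assuming es_repo is the Elasticsearch instance from the question:

# Assumes: es_repo = Elasticsearch(hosts=[settings.ES_INDEX_URL], verify_certs=True)
# index.mapping.total_fields.limit is a dynamic setting, so it can be
# changed on the live index without a restart.
resp = es_repo.indices.put_settings(
    index="mintegral_incent",
    body={"index": {"mapping": {"total_fields": {"limit": 2000}}}}
)
print(resp)  # expect {'acknowledged': True} on success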


Comments

Sorry, where do I put these settings?
You can run this command on your live ES cluster; there is no need to change the configuration and restart it.
OK, we are using Amazon and we have an endpoint, but we're not sure how to run commands against it.
Still getting the same error. Even though the limit is 2000 now, it shows 1000 in the logs. :(
Then the limit was not increased. What was the output of the command when you ran it?

In case you want to work from Python, try:

import requests

headers = {
    'Content-Type': 'application/json',
}

resp = requests.put('http://localhost:9200/your_index/_settings',
                    headers=headers,
                    data='{"index": {"mapping": {"total_fields": {"limit": "2000"}}}}')

print(f'\nHTTP code: {resp.status_code} -- response: {resp}\n')
print(f'Response text\n{resp.text}')

You may also use a terminal as indicated above, although I had to add a header, -H 'Content-Type: application/json':

curl -XPUT http://localhost:9200/your_index/_settings -d '{"index": {"mapping": {"total_fields": {"limit": "2000"}}}}' -H'Content-Type: application/json'
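If the bulk job still reports the old 1000-field limit (as in the comments on the other answer), it is worth reading the settings back to confirm the change was acknowledged. A quick sanity check with the same requests library:

import requests

# Fetch the index settings to verify the new limit is in place.
resp = requests.get('http://localhost:9200/your_index/_settings')
print(resp.status_code)
print(resp.text)  # on success, this should contain "total_fields": {"limit": "2000"}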

If you need to make curl-style requests (GET, PUT, POST) from Python, this guide is very helpful (it is the source for my answer) and even provides code for a nice method to handle this.

