
Elasticsearch was working fine until today.

Issue:

Some documents are failing to index with this error:

u'Limit of total fields [1000] in index [mintegral_incent] has been exceeded' 

Error:

"BulkIndexError: (u'14 document(s) failed to index.', [{u'index': {u'status': 400, u'_type': u'mintegral_incent', u'_id': u'168108082', u'error': {u'reason': u'Limit of total fields [1000] in index [mintegral_incent] has been exceeded', u'type': u'illegal_argument_exception'}

Using Amazon Elasticsearch Service

Elasticsearch Version 5.1

ES setup:

from elasticsearch import Elasticsearch
from elasticsearch import helpers
es_repo = Elasticsearch(hosts=[settings.ES_INDEX_URL],
                        verify_certs=True)

Code giving issue:

def bulk_index_offers(index_name, id_field, docs):
    actions = []
    for doc in docs:
        action = {
            "_index": index_name,
            "_type": index_name,
            "_id": doc.get(id_field),
            "_source": doc
        }
        actions.append(action)
    # The error is raised at the following line.
    resp = helpers.bulk(es_repo, actions)
    return resp

What I have tried:

I have tried using a smaller chunk size and increasing the request timeout from the default 10 to 30, like this: resp = helpers.bulk(es_repo, actions, chunk_size=500, request_timeout=30)

But I am still facing the same issue.

Please help.

2 Answers


By default, a mapping type is only allowed to contain up to 1000 fields, and it seems you are exceeding that limit. In order to increase that threshold, you can run this command:

PUT mintegral_incent/_settings
{ 
  "index": {
    "mapping": {
      "total_fields": {
        "limit": "2000"
      }
    }
  }
}

Using curl, it'd look like this:

curl -XPUT http://<your.amazon.host>/mintegral_incent/_settings -d '{ 
  "index": {
    "mapping": {
      "total_fields": {
        "limit": "2000"
      }
    }
  }
}'

Then you can run your bulk script again and it should work.
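If you can't easily run curl against the Amazon endpoint, you can apply the same setting through the Python client you are already using. Here is a minimal sketch, assuming es_repo is the Elasticsearch instance from the question:

# Assumes: es_repo = Elasticsearch(hosts=[settings.ES_INDEX_URL], verify_certs=True)
# index.mapping.total_fields.limit is a dynamic setting, so it can be
# changed on the live index without a restart.
resp = es_repo.indices.put_settings(
    index="mintegral_incent",
    body={"index": {"mapping": {"total_fields": {"limit": 2000}}}}
)
print(resp)  # expect {'acknowledged': True} on success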


Comments

Sorry, where do I put these settings?
You can run this command on your live ES cluster; there is no need to change the configuration and restart it.
OK, we are using Amazon and we have an endpoint, but we're not sure how to run commands against it.
Still getting the same error. Even though the limit is 2000 now, it shows 1000 in the logs. :(
Then the limit was not increased. What was the output of the command when you ran it?

In case you want to work from Python, try:

import requests

headers = {
    'Content-Type': 'application/json',
}

resp = requests.put('http://localhost:9200/your_index/_settings',
                    headers=headers,
                    data='{"index": {"mapping": {"total_fields": {"limit": "2000"}}}}')

print(f'\nHTTP code: {resp.status_code} -- response: {resp}\n')
print(f'Response text\n{resp.text}')

You may also use a terminal as indicated above, although I had to add a header, -H 'Content-Type: application/json':

curl -XPUT http://localhost:9200/your_index/_settings -d '{"index": {"mapping": {"total_fields": {"limit": "2000"}}}}' -H'Content-Type: application/json'
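If the bulk job still reports the old 1000-field limit (as in the comments on the other answer), it is worth reading the settings back to confirm the change was acknowledged. A quick sanity check with the same requests library:

import requests

# Fetch the index settings to verify the new limit is in place.
resp = requests.get('http://localhost:9200/your_index/_settings')
print(resp.status_code)
print(resp.text)  # on success, this should contain "total_fields": {"limit": "2000"}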

If you need to make curl-style requests (GET, PUT, POST) from Python, this guide is very helpful (it is the source for my answer) and even provides code for a nice method to handle this.

