1

I have data stored in MongoDB in the following format.

{
    "_id" : ObjectId("570b487fb5360dd1e5ef840c"),
    "internal_id" : 1,
    "created_at" : ISODate("2015-07-14T10:08:38.994Z"),
    "updated_at" : ISODate("2016-01-10T00:35:19.748Z"),
    "ad_account_id" : 1,
    "updated_time" : "2013-08-05T04:48:49-0700",
    "created_time" : "2013-08-05T04:46:35-0700",
    "name" : "Sale1",
    "daily" : [
        {"clicks" : 5000, "date" : "2015-04-16"},
        {"clicks" : 5100, "date" : "2015-04-17"},
        {"clicks" : 5030, "date" : "2015-04-20"}
    ],
    "custom_tags" : {
        "Event" : {
            "name" : "Clicks"
        },
        "Objective" : {
            "name" : "Sale"
        },
        "Image" : {
            "name" : "43c3fe7b262cde5f476ed303e472c65a"
        },
        "Goal" : {
            "name" : "10"
        },
        "Type" : {
            "name" : "None"
        },
        "Call To Action" : {
            "name" : "None"
        },
        "Landing Pages" : {
            "name" : "www.google.com"
        }
    }
}

I am trying to group documents by internal_id and compute the total number of clicks over a date range (say, 2015-04-15 to 2015-04-21) using the aggregate method.
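For reference, the aggregation described above could be sketched roughly as follows. This is only an illustration built from the sample document: the helper name build_clicks_pipeline and the output field total_clicks are made up here, and the sketch assumes that every "date" value is a "YYYY-MM-DD" string, so that plain string comparison orders the dates correctly.

```python
def build_clicks_pipeline(start_date, end_date):
    """Build a pipeline that sums daily clicks per internal_id.

    Hypothetical helper: assumes the document layout shown in the
    question and "YYYY-MM-DD" date strings.
    """
    return [
        # Emit one document per element of the "daily" array.
        {"$unwind": "$daily"},
        # Keep only the entries inside the inclusive date range.
        {"$match": {"daily.date": {"$gte": start_date, "$lte": end_date}}},
        # Sum the clicks of the remaining entries for each internal_id.
        {"$group": {
            "_id": "$internal_id",
            "total_clicks": {"$sum": "$daily.clicks"},
        }},
    ]


pipeline = build_clicks_pipeline("2015-04-15", "2015-04-21")
```

Keeping only the fields that are actually needed in the $group stage may also reduce how much memory that stage uses.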

In pymongo, when I try to do an aggregate using just $project on internal_id, I get the results, but when I try to $project custom_tags fields, I get the following error:

OperationFailure: Exceeded memory limit for $group, but didn't allow external sort.
Pass allowDiskUse:true to opt in.

Following the answer here, I changed my aggregate call to list(collection._get_collection().aggregate(mongo_query["pipeline"], allowDiskUse=True)), but it still throws the same error.

2 Answers

3

Take a look at this link: Can't get allowDiskUse:True to work with pymongo

This works for me:

someSampleList = db.collectionName.aggregate(pipeline, allowDiskUse=True)

where

pipeline = [
    {'$sort': {'sortField': 1}},
    {'$group': {'_id': '$distinctField'}}, 
    {'$limit': 20000}]
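
As a minimal sketch of the same call wrapped in a function: run_aggregate is a hypothetical helper name, and collection is assumed to be a pymongo Collection (for example, whatever collection._get_collection() returns in the question). The point is only that allowDiskUse is passed as a Python keyword argument.

```python
def run_aggregate(collection, pipeline):
    """Run an aggregation with disk use enabled and return all results.

    Hypothetical helper: `collection` is assumed to expose pymongo's
    Collection.aggregate(pipeline, **kwargs) method.
    """
    # allowDiskUse=True lets stages such as $group and $sort spill to
    # temporary files instead of failing at the in-memory limit.
    cursor = collection.aggregate(pipeline, allowDiskUse=True)
    return list(cursor)
```

It would be called as run_aggregate(db.collectionName, pipeline).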

-1

Try this:

list(collection._get_collection().aggregate(mongo_query["pipeline"], {allowDiskUse : true}))

1 Comment

Your answer is not valid Python: allowDiskUse and true are not defined.
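
To illustrate the comment, here is a small sketch of what Python does with the JavaScript-style literal, next to a form Python accepts. The variable names are illustrative only.

```python
error_name = None
try:
    # JavaScript-style options object: in Python, allowDiskUse and true
    # are looked up as variables, and neither one exists.
    options = {allowDiskUse: true}
except NameError as exc:
    error_name = type(exc).__name__

# Python spelling: a string key and the built-in constant True.
options = {"allowDiskUse": True}
```

With pymongo, such a dict would be unpacked into keyword arguments, as in collection.aggregate(pipeline, **options), which is equivalent to passing allowDiskUse=True directly.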
