1

Here's what my records look like:

I have obtained this by using collection.find().limit(1)

[
  {
    "_id": {"$oid": "..."},
    "husband.firstName": "John",
    "husband.secondName": "Smith",
    "wife.firstName": "Alice",
    "wife.secondName": "Watson",
    "...": "...",
  }
]

The husband and wife fields only contains firstName and secondName

I want to count how common each husband and wife name combos are.

I imagine the results in the form of something like:

[
  {
    "husband.firstName": "John",
    "husband.secondName": "Smith",
    "wife.firstName": "Alice",
    "wife.secondName": "Watson",
    "count": "456",
  },
  {
    "husband.firstName": "Jack",
    "husband.secondName": "Smith",
    "wife.firstName": "Alice",
    "wife.secondName": "Watson",
    "count": "123",
  }
]

I'm using Python and pymongo so I have tried the following:

pipeline = [
    {
        "$group": {
            "_id": {
                "husband": "$husband",
                "wife": "$wife"
            },
            "count": {"$sum": 1}
        }
    },
    {
        "$sort": {"count": -1}
    },
]

However this returns an empty result of:

[
  {
    "_id": "{}",
    "count": 47553
  }
]

I tried also grouping by the fields separately but the result was the same.

3

1 Answer 1

2

As pointed out in the comments by @cmgchess, the main difficulty for your case is the dot in your field name. You may want to refactor your schema to something like below:

{
    "husband": {
      "firstName": "John",
      "secondName": "Smith"
    },
    "wife": {
      "firstName": "Alice",
      "secondName": "Watson"
    }
  }

Nevertheless, for your current schema, you may workaround it through usage of $getField.

db.collection.aggregate([
  {
    "$group": {
      "_id": {
        "husband": {
          "firstName": {
            "$getField": {
              "field": "husband.firstName",
              "input": "$$ROOT"
            }
          },
          "secondName": {
            "$getField": {
              "field": "husband.secondName",
              "input": "$$ROOT"
            }
          }
        },
        "wife": {
          "firstName": {
            "$getField": {
              "field": "wife.firstName",
              "input": "$$ROOT"
            }
          },
          "secondName": {
            "$getField": {
              "field": "wife.secondName",
              "input": "$$ROOT"
            }
          }
        }
      },
      "count": {
        "$sum": 1
      }
    }
  },
  {
    "$sort": {
      "count": -1
    }
  }
])

Mongo Playground

Sign up to request clarification or add additional context in comments.

1 Comment

can even shorten the getField a bit mongoplayground.net/p/ABFRPQ4_S7A

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.