How To Find Documents With Exactly The Same Array Entries As In A Query
Solution 1:
There are several very useful cases here where trying to create a "unique hash" over the array content actually gets in the way of problems that can otherwise be solved easily.
Finding Common to "Me"
If, for example, you take "user 1" from the sample provided, and consider that you already have that data loaded and want to find "those in common with me" by matching the "itemsIds" values of the current user object, then there are two simple query approaches:
Find "exactly" the same: This is where you inspect other user data to find the users with exactly the same interests. It is a simple, "unordered" use of the $all query operator:
db.collection.find({ "itemsIds": { "$all": [399957190, 366369952] }, "userId": { "$ne": 1 } })
This is going to return "user 3", since they are the one with "both" common "itemsIds" entries. Order is not important here: it is a match in any order, as long as both values are present. This is effectively a shorter form of $and as query arguments. Note that $all also matches arrays holding further entries beyond those listed, so combine it with a $size condition where nothing extra is allowed.
Find "similar" in common to me: This is basically asking "do you have something that is the same?". For that you can use the $in query operator. It will match if "either" of the specified values is present:
db.collection.find({ "itemsIds": { "$in": [399957190, 366369952] }, "userId": { "$ne": 1 } })
In this case both "user 2" and "user 3" are going to match, as they share "at least one" of the specified values, which means they have "something in common" with the source data of the query.
This is in fact another form of the $or query operator, and just like before it is a lot simpler and more concise to write this way given the conditions to be applied.
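To make the difference between the two operators concrete, here is a minimal plain JavaScript sketch that imitates their matching rules in memory, with no MongoDB connection involved. The three documents are hypothetical stand-ins for the sample data, and the helper names `matchesAll`, `matchesIn` and `exactSetQuery` are made up for illustration. The last helper only builds a query document that adds `$size` to `$all`, which should work as an "exactly these entries" test on the assumption that neither the stored array nor the query list contains duplicates.

```javascript
// Hypothetical documents standing in for the sample data.
const sampleUsers = [
  { userId: 1, itemsIds: [399957190, 366369952] },
  { userId: 2, itemsIds: [366369952, 12345] },
  { userId: 3, itemsIds: [366369952, 399957190] }
];

// Imitates { itemsIds: { $all: wanted } }: every wanted value must be present.
function matchesAll(doc, wanted) {
  return wanted.every(v => doc.itemsIds.includes(v));
}

// Imitates { itemsIds: { $in: wanted } }: at least one wanted value is present.
function matchesIn(doc, wanted) {
  return wanted.some(v => doc.itemsIds.includes(v));
}

// Builds (but does not run) a query that requires all wanted values and
// the same array length. Assumes the values in `wanted` are distinct.
function exactSetQuery(wanted, excludeUserId) {
  return {
    itemsIds: { $all: wanted, $size: wanted.length },
    userId: { $ne: excludeUserId }
  };
}

const me = sampleUsers[0];
const others = sampleUsers.filter(u => u.userId !== me.userId);

// Users holding every one of my items, in any order.
const sameAsMe = others.filter(u => matchesAll(u, me.itemsIds)).map(u => u.userId);

// Users holding at least one of my items.
const similarToMe = others.filter(u => matchesIn(u, me.itemsIds)).map(u => u.userId);

console.log(sameAsMe, similarToMe);
console.log(JSON.stringify(exactSetQuery(me.itemsIds, me.userId)));
```

With these made-up documents, only "user 3" passes the `$all` style check, while both "user 2" and "user 3" pass the `$in` style check.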
Finding Common "Things"
There are also cases where you might want to find things "in common" without having a base "user" to start from. So how do you tell that "user 1" and "user 2" share the same "itemsIds", or that several users each share an individual "itemsIds" value, and who are they?
Get the exact matches: This is of course where you look at the "itemsIds" values and $group them together. Generally the "order is important" here, so ideally you keep them "pre-ordered" consistently, which makes this as simple as:
db.collection.aggregate([
    { "$group": { "_id": "$itemsIds", "common": { "$push": "$userId" } }}
])
And that is all there really is to it, as long as the order is already there. If not, then you can use a slightly longer-winded form that does the "ordering" first, but the same could be said of generating a "hash":
db.collection.aggregate([
    { "$unwind": "$itemsIds" },
    { "$sort": { "_id": 1, "itemsIds": 1 } },
    { "$group": {
        "_id": "$_id",
        "userId": { "$first": "$userId" },
        "itemsIds": { "$push": "$itemsIds" }
    }},
    { "$group": { "_id": "$itemsIds", "common": { "$push": "$userId" } }}
])
This is not "super" performant, but it makes the point of why you should always keep array entries ordered as they are added, which is a very simple process.
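As a rough illustration of why a consistent order matters, the following plain JavaScript sketch groups users by a sorted copy of their array, which mirrors in memory what the longer pipeline does with $unwind, $sort and $group. The documents and the helper names `groupByItems` and `sortedPushUpdate` are hypothetical. The second helper only builds an update document using `$push` with the `$each` and `$sort` modifiers, which is one possible way to keep the array ordered as entries are added. Whether that suits a given schema is an assumption to check.

```javascript
// Hypothetical documents; note that users 1 and 3 hold the same values
// in a different order.
const unorderedUsers = [
  { userId: 1, itemsIds: [399957190, 366369952] },
  { userId: 2, itemsIds: [366369952, 12345] },
  { userId: 3, itemsIds: [366369952, 399957190] }
];

// Groups userId values by the sorted content of itemsIds.
function groupByItems(docs) {
  const groups = new Map();
  for (const doc of docs) {
    // Sort a copy numerically so that order no longer affects the key.
    const sorted = [...doc.itemsIds].sort((a, b) => a - b);
    const key = JSON.stringify(sorted);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(doc.userId);
  }
  // Shape the result like the output of the $group stage.
  return [...groups].map(([key, common]) => ({ _id: JSON.parse(key), common }));
}

// Builds (but does not run) an update that appends one value and then
// re-sorts the stored array in ascending order.
function sortedPushUpdate(newItemId) {
  return { $push: { itemsIds: { $each: [newItemId], $sort: 1 } } };
}

const grouped = groupByItems(unorderedUsers);
console.log(JSON.stringify(grouped));
console.log(JSON.stringify(sortedPushUpdate(42)));
```

If the arrays are already stored in order, the sorting step becomes unnecessary and the single $group stage is enough.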
Common "user" to "items": This is another simple process building on the above, "breaking down" the array with $unwind and then basically grouping back:
db.collection.aggregate([
    { "$unwind": "$itemsIds" },
    { "$group": { "_id": "$itemsIds", "users": { "$addToSet": "$userId" } }}
])
And again, the simple grouping accumulator $addToSet does the job and collects the distinct "userId" values for each "itemsIds" value.
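The same "unwind then group" idea can be sketched in plain JavaScript to show the shape of the result. This is only an in-memory imitation under assumed data: the documents and the helper name `usersByItem` are hypothetical, with the inner loop playing the role of $unwind and the `Set` playing the role of $addToSet.

```javascript
// Hypothetical documents standing in for the collection.
const itemOwners = [
  { userId: 1, itemsIds: [399957190, 366369952] },
  { userId: 2, itemsIds: [366369952, 12345] },
  { userId: 3, itemsIds: [366369952, 399957190] }
];

// Maps each item value to the set of distinct users holding it.
function usersByItem(docs) {
  const index = new Map();
  for (const doc of docs) {
    // One iteration per array entry, like one document per entry after $unwind.
    for (const itemId of doc.itemsIds) {
      if (!index.has(itemId)) index.set(itemId, new Set());
      // A Set ignores repeated values, like $addToSet.
      index.get(itemId).add(doc.userId);
    }
  }
  return index;
}

const byItem = usersByItem(itemOwners);
for (const [itemId, userSet] of byItem) {
  console.log(itemId, [...userSet]);
}
```

Filtering the result down to entries whose set has more than one member would leave only the items that are genuinely shared.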
These are all basic solutions, and I could go on with "set intersections" and what not, but this is the "primer".
Don't try to compute a "hash"; MongoDB has a good "arsenal" for matching the entries anyway. Use it and "abuse it" as well, until it breaks. Then try harder.