javascript - Trouble Pivoting data with Map Reduce
I'm having trouble pivoting my dataset with map reduce. I've been using the MongoDB cookbook for help, but I'm getting weird errors. I want to take the collection below and pivot it so that each user has a list of all of their review ratings.
My collection looks like this:
{ 'type': 'review', 'business_id': (encrypted business id), 'user_id': (encrypted user id), 'stars': (star rating), 'text': (review text), }
Map function (wrapped in Python):
from bson.code import Code
map = Code(""" function(){ key = {user : this.user_id}; value = {ratings: [this.business_id, this.stars]}; emit(key, value); } """)
The map function should return an array of values associated with each key... Reduce function (wrapped in Python):
reduce = Code(""" function(key, values){ var result = { value: [] }; temp = []; for (var i = 0; i < values.length; i++){ temp.push(values[i].ratings); } result.value = temp; return result; } """)
However, the results come back with one less rating than the total. In fact, some users have None returned, which can't happen. Some entries look like the following:
{u'_id': {u'user': u'zwzytzniayfoqveg8xcvxw'}, u'value': [None, [u'e9nn4xxjdhj4qtkcopq_vg', 3.0], None, [...]...]
I can't pinpoint what in my code is causing this. If there are 3 reviews, they all have business IDs and ratings in the document. Plus, using 'values.length + 1' in the loop condition breaks values[i] for some reason.
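One likely cause can be sketched in plain JavaScript, outside MongoDB: map reduce may call reduce again on the output of an earlier reduce call. The reducer above returns `{value: [...]}`, which has no `ratings` field, so on a second pass `values[i].ratings` is `undefined`, and that would plausibly surface as None in Python. The key and business IDs below are made up for illustration.

```javascript
// The reducer from the question, with the same behavior.
function reduce(key, values) {
  var result = { value: [] };
  var temp = [];
  for (var i = 0; i < values.length; i++) {
    temp.push(values[i].ratings);
  }
  result.value = temp;
  return result;
}

// Values shaped like the map output; the IDs are hypothetical.
var a = { ratings: ["biz_a", 3.0] };
var b = { ratings: ["biz_b", 4.0] };
var c = { ratings: ["biz_c", 5.0] };

// First pass over part of the values, then a re-reduce that mixes the
// earlier output with a value that has not been reduced yet.
var partial = reduce("user_1", [a, b]);
var rereduced = reduce("user_1", [partial, c]);

// The first slot is undefined, which JSON serializes as null.
console.log(JSON.stringify(rereduced)); // prints {"value":[null,["biz_c",5]]}
```

The two ratings from the first pass are lost and replaced by a single empty slot, which would also explain a total that comes up short.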
Edit 1
I've embraced the fact that reduce gets called multiple times on its own output, so below is the new reducer. It returns an array of [business, rating, business, rating]. Any idea how to output [business, rating] arrays instead of one giant array?
function(key, values){ var result = { ratings: [] }; var temp = []; values.forEach(function(value){ value.ratings.forEach(function(rating){ if(temp.indexOf(rating) == -1){ temp.push(rating); } }); }); result.ratings = temp; return result; }
Here's a test example:
1) Add sample data:
db.test.drop(); db.test.insert( [{ 'type': 'review', 'business_id': 1, 'user_id': 1, 'stars': 1, }, { 'type': 'review', 'business_id': 2, 'user_id': 1, 'stars': 2, }, { 'type': 'review', 'business_id': 2, 'user_id': 2, 'stars': 3, }] );
2) Map function
var map = function() { emit(this.user_id, [[this.business_id, this.stars]]); };
Here we set the results up the way we want them at the end of the process. Why? Because if there is only ever a single review for a user (the key we are grouping by), the results won't go through the reduce phase.
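To illustrate the point about single-review users, here is a minimal in-memory sketch, not MongoDB itself. `runMapReduce` is a hypothetical helper that mimics one behavior: reduce is not called for a key with only one emitted value, so that key's output is whatever map emitted. Unlike the shell, `emit` is passed in as an argument here, and the grouping keys come back as strings.

```javascript
// Hypothetical in-memory stand-in for a map-reduce run (not a MongoDB API).
function runMapReduce(docs, mapFn, reduceFn) {
  var groups = {};
  docs.forEach(function (doc) {
    // Call mapFn with `this` bound to the document, as in the shell.
    mapFn.call(doc, function emit(key, value) {
      (groups[key] = groups[key] || []).push(value);
    });
  });
  return Object.keys(groups).map(function (key) {
    var values = groups[key];
    // A key with a single value skips reduce entirely.
    var value = values.length === 1 ? values[0] : reduceFn(key, values);
    return { _id: key, value: value };
  });
}

var docs = [
  { business_id: 1, user_id: 1, stars: 1 },
  { business_id: 2, user_id: 1, stars: 2 },
  { business_id: 2, user_id: 2, stars: 3 }
];

var reducedKeys = []; // records which keys actually reached reduce
var results = runMapReduce(
  docs,
  function (emit) { emit(this.user_id, [[this.business_id, this.stars]]); },
  function (key, values) {
    reducedKeys.push(key);
    // Join the nested pairs into one array of [business_id, stars] pairs.
    return values.reduce(function (acc, v) { return acc.concat(v); }, []);
  }
);

console.log(JSON.stringify(results));
console.log(JSON.stringify(reducedKeys)); // prints ["1"]
```

Only user 1 reaches reduce. User 2 keeps the value exactly as map emitted it, which is why map already nests the pair in the final shape.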
3) Reduce function
var reduce = function(key, values) { var result = { ratings: [] }; values.forEach(function(value){ result.ratings.push(value[0]); }); return result; };
Here we collect all the values, remembering that we nested them in the map method, so we can pick out the first value from each set of results.
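Because reduce can be called again on its own output, one way to keep the result stable is to return the same shape that map emits. The sketch below is a variation on the reducer above, not the answer's own code: it concatenates the nested arrays instead of wrapping them in a `ratings` object, then checks in plain JavaScript that re-reducing gives the same result as a single pass. The sample pairs are illustrative.

```javascript
// Variant reducer: input values and output are both arrays of
// [business_id, stars] pairs, so the output can be reduced again.
var reduce = function (key, values) {
  var ratings = [];
  values.forEach(function (value) {
    value.forEach(function (pair) {
      ratings.push(pair);
    });
  });
  return ratings;
};

// Values as the map function would emit them for one user.
var a = [[1, 1]];
var b = [[2, 2]];
var c = [[3, 5]];

var onePass = reduce(1, [a, b, c]);
var twoPass = reduce(1, [reduce(1, [a, b]), c]);

console.log(JSON.stringify(onePass)); // prints [[1,1],[2,2],[3,5]]
console.log(JSON.stringify(onePass) === JSON.stringify(twoPass)); // prints true
```

With a reducer like this, every key ends up with a plain array of pairs, whether or not reduce ran for it.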
4) Run map reduce:
db.test.mapReduce(map, reduce, {finalize: final, out: { inline: 1 }});
Alternative - use the aggregation framework:
db.test.aggregate({ $group: { _id: "$user_id", ratings: {$addToSet: {business_id: "$business_id", stars: "$stars"}} } });
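To see roughly what shape that pipeline produces for the sample data, here is a plain-JavaScript equivalent of the grouping that runs without MongoDB. `groupRatings` is a hypothetical helper, and it returns groups and ratings in insertion order, whereas MongoDB does not guarantee the order of `$addToSet` elements.

```javascript
// Hypothetical helper: groups reviews by user_id and collects unique
// {business_id, stars} entries, mirroring $group with $addToSet.
function groupRatings(docs) {
  var byUser = new Map();
  docs.forEach(function (doc) {
    if (!byUser.has(doc.user_id)) {
      byUser.set(doc.user_id, { _id: doc.user_id, ratings: [], seen: new Set() });
    }
    var group = byUser.get(doc.user_id);
    // Set semantics: skip a rating already collected for this user.
    var signature = JSON.stringify([doc.business_id, doc.stars]);
    if (!group.seen.has(signature)) {
      group.seen.add(signature);
      group.ratings.push({ business_id: doc.business_id, stars: doc.stars });
    }
  });
  // Drop the bookkeeping set before returning the groups.
  return Array.from(byUser.values()).map(function (g) {
    return { _id: g._id, ratings: g.ratings };
  });
}

var grouped = groupRatings([
  { type: "review", business_id: 1, user_id: 1, stars: 1 },
  { type: "review", business_id: 2, user_id: 1, stars: 2 },
  { type: "review", business_id: 2, user_id: 2, stars: 3 }
]);

console.log(JSON.stringify(grouped, null, 2));
```

Each user gets one document whose `ratings` array holds one object per distinct business and star rating, so there is no flattening to undo afterwards.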