聚合框架的基本思想是:采用多个构建来创建一个管道,用于对一连串的文档进行处理。
这些构建包括:筛选(filtering),投影(projecting),分组(grouping),排序(sorting),限制(limiting)和跳过(skipping)。
生产测试数据
for(var i=0;i<100;i++){ for(var j=0;j<4;j++){ db.scores.insert({"studentId":"s"+i,"course":"课程"+j,"score":Math.random()*100}); } } 将产生400条数据,一个学生有四门课程的成绩: { "_id" : ObjectId("597ffbc1d8c0f25813bcc58b"), "studentId" : "s0", "course" : "课程0", "score" : 5.999703989474325 } { "_id" : ObjectId("597ffbc1d8c0f25813bcc58c"), "studentId" : "s0", "course" : "课程1", "score" : 5.040777429297738 } { "_id" : ObjectId("597ffbc1d8c0f25813bcc58d"), "studentId" : "s0", "course" : "课程2", "score" : 63.74926111710963 } { "_id" : ObjectId("597ffbc1d8c0f25813bcc58e"), "studentId" : "s0", "course" : "课程3", "score" : 66.50186953829127 } { "_id" : ObjectId("597ffbc1d8c0f25813bcc58f"), "studentId" : "s1", "course" : "课程0", "score" : 84.74255359132857 } { "_id" : ObjectId("597ffbc1d8c0f25813bcc590"), "studentId" : "s1", "course" : "课程1", "score" : 85.7428949944855 } { "_id" : ObjectId("597ffbc1d8c0f25813bcc591"), "studentId" : "s1", "course" : "课程2", "score" : 33.198227427842056 } { "_id" : ObjectId("597ffbc1d8c0f25813bcc592"), "studentId" : "s1", "course" : "课程3", "score" : 5.346516072417174 } { "_id" : ObjectId("597ffbc1d8c0f25813bcc593"), "studentId" : "s2", "course" : "课程0", "score" : 97.1040803312415 } { "_id" : ObjectId("597ffbc1d8c0f25813bcc594"), "studentId" : "s2", "course" : "课程1", "score" : 20.47611488352149 } ...
查找score大于90的三条数据
db.scores.aggregate({"$match":{"score":{$gt:90}}},{"$limit":3}) { "_id" : ObjectId("597ffbc1d8c0f25813bcc593"), "studentId" : "s2", "course" : "课程0", "score" : 97.1040803312415 } { "_id" : ObjectId("597ffbc1d8c0f25813bcc5a6"), "studentId" : "s6", "course" : "课程3", "score" : 95.37347295013626 } { "_id" : ObjectId("597ffbc1d8c0f25813bcc5ab"), "studentId" : "s8", "course" : "课程0", "score" : 90.49825247672626 }只显示studentId score
db.scores.aggregate({"$match":{"score":{$gt:90}}},{"$project":{studentId:1,score:1,_id:0}},{"$limit":3}) { "studentId" : "s2", "score" : 97.1040803312415 } { "studentId" : "s6", "score" : 95.37347295013626 } { "studentId" : "s8", "score" : 90.49825247672626 }把 studentId 重命名为 sid $studentId 表示取值
db.scores.aggregate({"$match":{"score":{$gt:90}}},{"$project":{"sid":"$studentId",score:1,_id:0}},{"$limit":3}) { "score" : 97.1040803312415, "sid" : "s2" } { "score" : 95.37347295013626, "sid" : "s6" } { "score" : 90.49825247672626, "sid" : "s8" }管道操作符还可以使用表达式,以满足更复杂的需求。
1:$add : [expr1[,expr2,…exprn]]
2:$subtract:[expr1,expr2]
3:$multiply:[expr1[,expr2,…exprn]]
4:$divice:[expr1,expr2] 5:$mod:[expr1,expr2]
问题一:查询出分数大于90分的人员分数加20分
db.scores.aggregate({"$match":{"score":{$gt:90}}},{"$project":{"sid":"$studentId","score":1,_id:0,"newScore":{$add:["$score",20]}}},{"$limit":3}) { "score" : 97.1040803312415, "sid" : "s2", "newScore" : 117.1040803312415 } { "score" : 95.37347295013626, "sid" : "s6", "newScore" : 115.37347295013626 } { "score" : 90.49825247672626, "sid" : "s8", "newScore" : 110.49825247672626 }
合框架包含了一些用于提取日期信息的表达式,如下: $year、$month、$week、$dayOfMonth、$dayOfWeek、$dayOfYear、$hour、$minute、 $second 。
插入一条日期的文档
db.userdatas.insert({"date":new Date()}) { "_id" : ObjectId("5983f5c88eec53fbcd56a7ca"), "date" : ISODate("2017-08-04T04:19:20.693Z") } db.userdatas.aggregate({"$match":{"_id":ObjectId("5983f5c88eec53fbcd56a7ca")}}, {"$project":{"year":{"$year":"$date"},"month":{"$month":"$date"}}}) { "_id" : ObjectId("5983f5c88eec53fbcd56a7ca"), "year" : 2017, "month" : 8 }
1:$substr : [expr,开始位置,要取的字节个数]
2:$concat:[expr1[,expr2,…exprn]]
3:$toLower:expr
4:$toUpper:expr
db.scores.aggregate({"$project":{"sid":{"$concat":["$studentId","--"]}}},{"$limit":2}) { "_id" : ObjectId("597ffbc1d8c0f25813bcc58b"), "sid" : "s0--" } { "_id" : ObjectId("597ffbc1d8c0f25813bcc58c"), "sid" : "s0--" }
1:$cmp:[expr1,expr2] :比较两个表达式,0表示相等,正数前面的大,负数后面的大 。
2:$strcasecmp:[string1,string2] :比较两个字符串,区分大小写,只对由罗马字符组成的字符串有效。
3:$eq、$ne、$gt、$gte、$lt、$lte :[expr1,expr2]
4:$and、\$or、\$not
5:$cond:[booleanExpr,trueExpr,falseExpr]:如果boolean表达式为true,返回true表达式,否则返回false表达式
6:$ifNull:[expr,otherExpr]:如果expr为null,返回otherExpr,否则返回expr
问题一:字段studentId 与字符”s3”比较大小
db.scores.aggregate({"$project":{"result":{"$strcasecmp":["$studentId","s3"]},"studentId":1}},{"$skip":8},{"$limit":12}) { "_id" : ObjectId("597ffbc1d8c0f25813bcc593"), "studentId" : "s2", "result" : -1 } { "_id" : ObjectId("597ffbc1d8c0f25813bcc594"), "studentId" : "s2", "result" : -1 } { "_id" : ObjectId("597ffbc1d8c0f25813bcc595"), "studentId" : "s2", "result" : -1 } { "_id" : ObjectId("597ffbc1d8c0f25813bcc596"), "studentId" : "s2", "result" : -1 } { "_id" : ObjectId("597ffbc1d8c0f25813bcc597"), "studentId" : "s3", "result" : 0 } { "_id" : ObjectId("597ffbc1d8c0f25813bcc598"), "studentId" : "s3", "result" : 0 } { "_id" : ObjectId("597ffbc1d8c0f25813bcc599"), "studentId" : "s3", "result" : 0 } { "_id" : ObjectId("597ffbc1d8c0f25813bcc59a"), "studentId" : "s3", "result" : 0 } { "_id" : ObjectId("597ffbc1d8c0f25813bcc59b"), "studentId" : "s4", "result" : 1 } { "_id" : ObjectId("597ffbc1d8c0f25813bcc59c"), "studentId" : "s4", "result" : 1 } { "_id" : ObjectId("597ffbc1d8c0f25813bcc59d"), "studentId" : "s4", "result" : 1 } { "_id" : ObjectId("597ffbc1d8c0f25813bcc59e"), "studentId" : "s4", "result" : 1 }
1:$sum:value :对于每个文档,将value与计算结果相加
2:$avg:value :返回每个分组的平均值
3:$max:expr :返回分组内的最大值
4:$min:expr :返回分组内的最小值
5:$first:expr :返回分组的第一个值,忽略其他的值,一般只有排序后,明确知道数据顺 序的时候,这个操作才有意义
6:$last:expr :与上面一个相反,返回分组的最后一个值
7:$addToSet:expr :如果当前数组中不包含expr,那就将它加入到数组中
8:$push:expr:把expr加入到数组中
问题一:按照人进行分组
db.scores.aggregate({$group:{"_id":"$studentId"}})问题二:算出每个人的平均分前十名的学生
db.scores.aggregate({$group:{"_id":"$studentId","avgscore":{"$avg":"$score"}}},{"$project":{"score":1,"avgscore":1}},{"$sort":{"avgscore":-1}},{"$limit":10}) { "_id" : "s68", "avgscore" : 83.18088197080301 } { "_id" : "s29", "avgscore" : 81.47034353997472 } { "_id" : "s87", "avgscore" : 80.55368613639939 } { "_id" : "s14", "avgscore" : 78.15436971363175 } { "_id" : "s33", "avgscore" : 76.56089285481109 } { "_id" : "s35", "avgscore" : 74.60393122282694 } { "_id" : "s99", "avgscore" : 73.79357383668147 } { "_id" : "s6", "avgscore" : 72.401925493081 } { "_id" : "s89", "avgscore" : 71.40932193466372 } { "_id" : "s41", "avgscore" : 70.28421418013374 }构件1:{$group:{“_id”:”$studentId”,”avgscore”:{“$avg”:”$score”}}} 按照学生分组算出平均成绩
构件2:{“$project”:{“score”:1,”avgscore”:1}} :显示score avgscore字段 _id 会自动显示
构件3:{“$sort”:{“avgscore”:-1}}: 按照avgscore 降序排序
构件4:{“$limit”:10} 显示10条
问题三:算出每个人的总分最高的10个人。
db.scores.aggregate({"$group":{"_id":"$studentId","count":{"$sum":"$score"}}},{"$sort":{"count":-1}},{"$limit":10}) { "_id" : "s68", "count" : 332.72352788321206 } { "_id" : "s29", "count" : 325.8813741598989 } { "_id" : "s87", "count" : 322.21474454559757 } { "_id" : "s14", "count" : 312.617478854527 } { "_id" : "s33", "count" : 306.24357141924435 } { "_id" : "s35", "count" : 298.4157248913078 } { "_id" : "s99", "count" : 295.1742953467259 } { "_id" : "s6", "count" : 289.607701972324 } { "_id" : "s89", "count" : 285.6372877386549 } { "_id" : "s41", "count" : 281.136856720535 }问题四:每个最高的得分的科目。
db.scores.aggregate({"$group":{"_id":{"sid":"$studentId","course":"$course"},"max":{"$max":"$score"}}},{"$project":{"max":1,"course":1}},{"$sort":{"max":-1}},{"$limit":10}) { "_id" : { "sid" : "s63", "course" : "课程1" }, "max" : 99.90599299936622 } { "_id" : { "sid" : "s35", "course" : "课程3" }, "max" : 99.30405860298266 } { "_id" : { "sid" : "s64", "course" : "课程0" }, "max" : 99.1537696673095 } { "_id" : { "sid" : "s50", "course" : "课程2" }, "max" : 98.52420500572934 } { "_id" : { "sid" : "s20", "course" : "课程0" }, "max" : 98.28681013083249 } { "_id" : { "sid" : "s40", "course" : "课程0" }, "max" : 98.2529822556756 } { "_id" : { "sid" : "s30", "course" : "课程2" }, "max" : 98.2195611010153 } { "_id" : { "sid" : "s65", "course" : "课程3" }, "max" : 97.93186806691125 } { "_id" : { "sid" : "s10", "course" : "课程1" }, "max" : 97.87209639127002 } { "_id" : { "sid" : "s23", "course" : "课程3" }, "max" : 97.36489703312259 }
问题一:把score 字段数组拆分。
db.userdatas.find({"name":"u4"}) { "_id" : ObjectId("597f357a09c84cf58880e411"), "name" : "u4", "age" : 30, "score" : [ 7, 4, 2, 0 ] } db.userdatas.aggregate({"$match":{"name":"u4"}},{"$unwind":"$score"}) { "_id" : ObjectId("597f357a09c84cf58880e411"), "name" : "u4", "age" : 30, "score" : 7 } { "_id" : ObjectId("597f357a09c84cf58880e411"), "name" : "u4", "age" : 30, "score" : 4 } { "_id" : ObjectId("597f357a09c84cf58880e411"), "name" : "u4", "age" : 30, "score" : 2 } { "_id" : ObjectId("597f357a09c84cf58880e411"), "name" : "u4", "age" : 30, "score" : 0 }构件1: {“$match”:{“name”:”u4”}} 查询出 name=u4 的文档,然后把结果给 构件2.
构件2:{“ unwind":" score”} 把score 拆分成一个个独立的文档。
问题一:按照年龄字段排序
db.userdatas.find() { "_id" : ObjectId("59789a56bc629e73c4f09e1c"), "name" : "wang wu", "age" : 45 } { "_id" : ObjectId("59789a74bc629e73c4f09e1e"), "name" : "wang wu", "age" : 8 } { "_id" : ObjectId("59789ac0bc629e73c4f09e20"), "name" : "wang wu", "age" : 33 } { "_id" : ObjectId("597f357a09c84cf58880e40e"), "name" : "u1", "age" : 37 } { "_id" : ObjectId("597f357a09c84cf58880e40f"), "name" : "u1", "age" : 37 } { "_id" : ObjectId("597f357a09c84cf58880e410"), "name" : "u5", "age" : 78 } { "_id" : ObjectId("597f357a09c84cf58880e412"), "name" : "u3", "age" : 32 } { "_id" : ObjectId("597f357a09c84cf58880e411"), "name" : "u4", "age" : 30, "score" : [ 7, 4, 2, 0 ] } { "_id" : ObjectId("597fcc0f411f2b2fd30d0b3f"), "age" : 20, "score" : [ 7, 4, 2, 0, 10, 9, 8, 7 ], "name" : "lihao" } { "_id" : ObjectId("597f357a09c84cf58880e413"), "name" : "u2", "age" : 33, "wendang" : { "yw" : 80, "xw" : 90 } } { "_id" : ObjectId("5983f5c88eec53fbcd56a7ca"), "date" : ISODate("2017-08-04T04:19:20.693Z") } db.userdatas.aggregate({"$sort":{"age":-1}},{"$match":{"age":{"$exists":1}}},{"$project":{"name":1,"age":1,"_id":0}}) { "name" : "u5", "age" : 78 } { "name" : "wang wu", "age" : 45 } { "name" : "u1", "age" : 37 } { "name" : "u1", "age" : 37 } { "name" : "wang wu", "age" : 33 } { "name" : "u2", "age" : 33 } { "name" : "u3", "age" : 32 } { "name" : "u4", "age" : 30 } { "age" : 20, "name" : "lihao" } { "name" : "wang wu", "age" : 8 }构件1:{“$sort”:{“age”:-1}} 按照age字段降序,把数据给构件2.
构件2:{“$match”:{“age”:{“$exists”:1}}} 筛选出 age存在的字段,然后把数据给构件3,
构件3:{“$project”:{“name”:1,”age”:1,”_id”:0}} 投影 只显示 name, age 字段。