Enabling sharding on the cluster:
To shard a collection, you must first enable sharding on the collection's database:

    sh.enableSharding("test")

Once a database is enabled for sharding, its configuration is stored in the databases collection of the config database on the config servers.

Shard key: the shard key is a key of the collection that MongoDB uses to split the data, for example username. Before enabling sharding, create an index on the key you intend to use as the shard key:

    db.users.ensureIndex({"username": 1})

Once the users collection is configured for sharding, its configuration is stored in the collections collection of the config database on the config servers. Now shard the collection:

    sh.shardCollection("test.users", {"username": 1})

The mongos router stores no configuration of its own (which is also why it needs no dbpath), but it caches the configuration held on the config servers. In the background, mongos balances the shards until each holds an equal number of chunks. The balancer can be confined to a scheduled window, and it is best to let it run during off-peak hours.
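For example, the balancer window can be set through the config database. A minimal sketch, run from a mongos shell; the 23:00-06:00 window here is just an assumed off-peak period:

    use config
    db.settings.update(
        { _id: "balancer" },
        { $set: { activeWindow: { start: "23:00", stop: "06:00" } } },  // balance only between 23:00 and 06:00
        { upsert: true }
    )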
Now let's run an experiment on how MongoDB balances databases across the shards (the case where the databases themselves are not sharded):
mongos> use mydb1
switched to db mydb1
mongos> db.mytest.insert({name:"zhang"});
WriteResult({ "nInserted" : 1 })
mongos> sh.status();
--- Sharding Status ---
  sharding version: {
      "_id" : 1,
      "minCompatibleVersion" : 5,
      "currentVersion" : 6,
      "clusterId" : ObjectId("57146d38f71635a4755aa902")
  }
  shards:
      { "_id" : "rs1", "host" : "rs1/mongodb01:10001,mongodb02:10001" }
      { "_id" : "rs2", "host" : "rs2/mongodb02:10002,mongodb03:10002" }
      { "_id" : "rs3", "host" : "rs3/mongodb01:10003,mongodb03:10003" }
      { "_id" : "rs4", "host" : "rs4/mongodb01:10004,mongodb02:10004" }
  active mongoses:
      "3.2.5" : 3
  balancer:
      Currently enabled: yes
      Currently running: no
      Failed balancer rounds in last 5 attempts: 5
      Last reported error: could not find host matching read preference { mode: "primary" } for set rs4
      Time of Reported error: Mon Apr 18 2016 16:54:52 GMT+0800 (CST)
      Migration Results for the last 24 hours:
          No recent migrations
  databases:
      { "_id" : "mydb", "primary" : "rs2", "partitioned" : true }
          mydb.mycol1
              shard key: { "id" : 1 }
              unique: false
              balancing: true
              chunks:
                  rs2  1
              { "id" : { "$minKey" : 1 } } -->> { "id" : { "$maxKey" : 1 } } on : rs2 Timestamp(1, 0)
      { "_id" : "sisyphus", "primary" : "rs2", "partitioned" : true }
      { "_id" : "mydb1", "primary" : "rs1", "partitioned" : false }

mongos> use mydb2
switched to db mydb2
mongos> db.mytest.insert({name:"zhang"});
WriteResult({ "nInserted" : 1 })
mongos> sh.status();
--- Sharding Status ---
  sharding version: {
      "_id" : 1,
      "minCompatibleVersion" : 5,
      "currentVersion" : 6,
      "clusterId" : ObjectId("57146d38f71635a4755aa902")
  }
  shards:
      { "_id" : "rs1", "host" : "rs1/mongodb01:10001,mongodb02:10001" }
      { "_id" : "rs2", "host" : "rs2/mongodb02:10002,mongodb03:10002" }
      { "_id" : "rs3", "host" : "rs3/mongodb01:10003,mongodb03:10003" }
      { "_id" : "rs4", "host" : "rs4/mongodb01:10004,mongodb02:10004" }
  active mongoses:
      "3.2.5" : 3
  balancer:
      Currently enabled: yes
      Currently running: no
      Failed balancer rounds in last 5 attempts: 5
      Last reported error: could not find host matching read preference { mode: "primary" } for set rs4
      Time of Reported error: Mon Apr 18 2016 16:54:52 GMT+0800 (CST)
      Migration Results for the last 24 hours:
          No recent migrations
  databases:
      { "_id" : "mydb", "primary" : "rs2", "partitioned" : true }
          mydb.mycol1
              shard key: { "id" : 1 }
              unique: false
              balancing: true
              chunks:
                  rs2  1
              { "id" : { "$minKey" : 1 } } -->> { "id" : { "$maxKey" : 1 } } on : rs2 Timestamp(1, 0)
      { "_id" : "sisyphus", "primary" : "rs2", "partitioned" : true }
      { "_id" : "mydb1", "primary" : "rs1", "partitioned" : false }
      { "_id" : "mydb2", "primary" : "rs3", "partitioned" : false }

Notice the databases that are not sharded: mydb1 was assigned primary shard rs1 and mydb2 was assigned rs3. Databases we create at will can thus land on different shards; MongoDB uses this mechanism to spread primary shards evenly and keep whole databases balanced across the cluster.
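If you want to inspect or change these assignments directly, each database's placement can be read from the config database, and an unsharded database's primary shard can be relocated with the movePrimary command. A sketch, reusing the shard and database names from the transcript above:

    use config
    db.databases.find({}, { primary: 1, partitioned: 1 })  // where each database lives
    use admin
    db.runCommand({ movePrimary: "mydb1", to: "rs2" })     // relocate mydb1's primary shard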
To expand on this: in practice two strategies are commonly used, ranged sharding and hashed sharding, and the choice depends entirely on the workload. Also note that a shard key cannot be changed after the fact; short of dropping the database or collection, the only way out is to stop the application and reload the data with mongodump/mongorestore, which is far too costly. So choose the shard key carefully.
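For reference, that reload path looks roughly like this; the mongos host/port, dump directory, and namespace below are placeholders:

    # dump the collection through a mongos
    mongodump --host mongodb01:27017 --db mydb --collection users --out /backup
    # drop the collection, re-create it sharded on the new key
    # (sh.shardCollection with the new key), then reload:
    mongorestore --host mongodb01:27017 --db mydb --collection users /backup/mydb/users.bson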
Ranged sharding:
1. Well suited to ordinary range queries; a hot query can be targeted at a single shard and its data read in one place, instead of being scattered and pulled back over the network (see the sketch after this list).
2. It is best to build an index on the shard key; in practice, looking data up through the index is much faster than scanning every chunk on every shard.
3. The drawback: if the shard key trends clearly upward (or downward), most newly inserted documents land in the same chunk, so write capacity cannot scale out.
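A minimal sketch of ranged sharding; the mydb.orders namespace and customerId key are illustrative, not from the cluster above:

    sh.enableSharding("mydb")
    use mydb
    db.orders.ensureIndex({ customerId: 1 })
    sh.shardCollection("mydb.orders", { customerId: 1 })
    // a range query on the shard key is routed only to the shards
    // that own the matching chunks:
    db.orders.find({ customerId: { $gte: 1000, $lt: 2000 } })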
Hashed sharding:
1. Well suited to large-scale inserts: because the hash spreads documents evenly, the full performance of the cluster can be used.
2. Well suited to high concurrency: every shard in the cluster shares the load evenly.
3. Cannot serve range queries efficiently: every range query must be dispatched to all backend shards to find the matching documents (see the sketch after this list).
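A minimal sketch of hashed sharding, again with illustrative names and assuming sharding is already enabled on mydb:

    use mydb
    db.events.ensureIndex({ _id: "hashed" })
    sh.shardCollection("mydb.events", { _id: "hashed" })
    // inserts spread evenly across the shards by the hash of _id,
    // but any range query on _id has to be broadcast to every shard
    for (var i = 0; i < 1000; i++) { db.events.insert({ n: i }) }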