MongoDB

2025-11-11
作者 Sirius
~33.26K 字
次阅读
条评论

1. MongoDB学习笔记
2. 核心知识总结
3. 参考资料

MongoDB is a source-available cross-platform document-oriented database program. Classified as a NoSQL database program, MongoDB uses JSON-like documents with optional schemas.

MongoDB学习笔记

一、MongoDB基础概念

1.1 什么是MongoDB

MongoDB是一个基于分布式文件存储的文档型数据库，由C++语言编写。旨在为WEB应用提供可扩展的高性能数据存储解决方案。

MongoDB是一个介于关系数据库和非关系数据库之间的产品，是非关系数据库当中功能最丰富，最像关系数据库的。

核心特点：

面向文档存储（Document-Oriented）
高性能（High Performance）
高可用性（High Availability）
易扩展性（Easy Scalability）
丰富的查询语言（Rich Query Language）

1.2 MongoDB与关系型数据库对比

SQL术语/概念	MongoDB术语/概念	解释/说明
database	database	数据库
table	collection	数据库表/集合
row	document	数据记录行/文档
column	field	数据字段/域
index	index	索引
table joins	embedded documents	表连接（MongoDB不支持）/嵌入文档
primary key	primary key	主键，MongoDB自动将_id字段设置为主键

示例对比：

-- MySQL
CREATE TABLE users (
    id INT PRIMARY KEY AUTO_INCREMENT,
    name VARCHAR(100),
    age INT,
    email VARCHAR(100)
);

INSERT INTO users (name, age, email) VALUES ('张三', 25, 'zhangsan@example.com');

// MongoDB
db.users.insertOne({
    name: "张三",
    age: 25,
    email: "zhangsan@example.com"
})

1.3 MongoDB数据类型

数据类型	描述	示例
String	字符串，UTF-8编码	“hello world”
Integer	整型数值	123, -456
Double	双精度浮点值	3.14, -2.5
Boolean	布尔值	true, false
Array	数组或列表	[1, 2, 3]
Object	嵌入式文档	{name: “张三”}
Null	空值	null
Date	日期时间	ISODate(“2024-01-01”)
ObjectId	对象ID，文档的唯一标识	ObjectId(“507f1f77bcf86cd799439011”)
Binary Data	二进制数据	图片、文件等
Code	JavaScript代码	function() { return 1; }
Regular Expression	正则表达式	/pattern/i

二、MongoDB安装与配置

2.1 Docker方式安装（推荐）

# 拉取MongoDB镜像
docker pull mongo:latest

# 启动MongoDB容器
docker run -d \
  --name mongodb \
  -p 27017:27017 \
  -v /data/mongodb:/data/db \
  -e MONGO_INITDB_ROOT_USERNAME=admin \
  -e MONGO_INITDB_ROOT_PASSWORD=password123 \
  mongo:latest

# 进入MongoDB容器
docker exec -it mongodb mongosh -u admin -p password123

2.2 Linux系统安装

# Ubuntu/Debian
wget -qO - https://www.mongodb.org/static/pgp/server-7.0.asc | sudo apt-key add -
echo "deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu jammy/mongodb-org/7.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-7.0.list
sudo apt-get update
sudo apt-get install -y mongodb-org

# CentOS/RHEL
cat <<EOF > /etc/yum.repos.d/mongodb-org-7.0.repo
[mongodb-org-7.0]
name=MongoDB Repository
baseurl=https://repo.mongodb.org/yum/redhat/8/mongodb-org/7.0/x86_64/
gpgcheck=1
enabled=1
gpgkey=https://www.mongodb.org/static/pgp/server-7.0.asc
EOF

sudo yum install -y mongodb-org

# 启动MongoDB
sudo systemctl start mongod
sudo systemctl enable mongod
sudo systemctl status mongod

2.3 MongoDB配置文件

# /etc/mongod.conf
storage:
  dbPath: /var/lib/mongodb        # 数据存储路径
  journal:
    enabled: true                 # 启用日志

systemLog:
  destination: file
  path: /var/log/mongodb/mongod.log  # 日志文件路径
  logAppend: true

net:
  port: 27017                     # 端口
  bindIp: 0.0.0.0                 # 绑定IP（生产环境应限制）

security:
  authorization: enabled          # 启用权限验证

replication:
  replSetName: rs0                # 副本集名称

sharding:
  clusterRole: shardsvr           # 分片角色

三、MongoDB基本操作（CRUD）

3.1 数据库操作

// 查看所有数据库
show dbs
show databases

// 切换/创建数据库（如果不存在则创建）
use mydb

// 查看当前数据库
db

// 删除当前数据库
db.dropDatabase()

// 查看数据库统计信息
db.stats()

3.2 集合操作

// 创建集合
db.createCollection("users")
db.createCollection("users", {
    capped: true,        // 固定大小集合
    size: 6142800,       // 字节
    max: 10000           // 最大文档数
})

// 查看所有集合
show collections
show tables

// 删除集合
db.users.drop()

// 重命名集合
db.users.renameCollection("members")

3.3 文档插入（Create）

// 插入单个文档
db.users.insertOne({
    name: "张三",
    age: 25,
    email: "zhangsan@example.com",
    hobbies: ["读书", "旅游"],
    address: {
        city: "北京",
        district: "朝阳区"
    },
    createdAt: new Date()
})

// 插入多个文档
db.users.insertMany([
    {name: "李四", age: 30, email: "lisi@example.com"},
    {name: "王五", age: 28, email: "wangwu@example.com"},
    {name: "赵六", age: 35, email: "zhaoliu@example.com"}
])

// insert方法（可插入单个或多个）
db.users.insert({name: "钱七", age: 22})

3.4 文档查询（Read）

// 查询所有文档
db.users.find()
db.users.find().pretty()  // 格式化输出

// 条件查询
db.users.find({name: "张三"})                    // 等于
db.users.find({age: {$gt: 25}})                  // 大于
db.users.find({age: {$gte: 25}})                 // 大于等于
db.users.find({age: {$lt: 30}})                  // 小于
db.users.find({age: {$lte: 30}})                 // 小于等于
db.users.find({age: {$ne: 25}})                  // 不等于
db.users.find({age: {$in: [25, 30, 35]}})        // 在数组中
db.users.find({age: {$nin: [25, 30]}})           // 不在数组中

// AND查询
db.users.find({name: "张三", age: 25})
db.users.find({$and: [{name: "张三"}, {age: 25}]})

// OR查询
db.users.find({$or: [{name: "张三"}, {age: 30}]})

// NOT查询
db.users.find({age: {$not: {$gt: 25}}})

// 正则表达式查询
db.users.find({name: /张/})                      // 包含"张"
db.users.find({name: /^张/})                     // 以"张"开头
db.users.find({name: /三$/})                     // 以"三"结尾

// 嵌套文档查询
db.users.find({"address.city": "北京"})

// 数组查询
db.users.find({hobbies: "读书"})                 // 数组包含"读书"
db.users.find({hobbies: {$all: ["读书", "旅游"]}})  // 包含所有元素
db.users.find({hobbies: {$size: 2}})             // 数组长度为2

// 查询单个文档
db.users.findOne({name: "张三"})

// 限制返回字段（投影）
db.users.find({}, {name: 1, age: 1})             // 只返回name和age（_id默认返回）
db.users.find({}, {email: 0})                    // 排除email字段

// 排序
db.users.find().sort({age: 1})                   // 按age升序
db.users.find().sort({age: -1})                  // 按age降序
db.users.find().sort({age: -1, name: 1})         // 多字段排序

// 分页
db.users.find().limit(10)                        // 限制10条
db.users.find().skip(20).limit(10)               // 跳过20条，返回10条

// 统计
db.users.countDocuments()                        // 文档总数
db.users.countDocuments({age: {$gt: 25}})        // 条件统计

// 去重
db.users.distinct("age")                         // 返回所有不同的age值

3.5 文档更新（Update）

// 更新单个文档
db.users.updateOne(
    {name: "张三"},                              // 查询条件
    {$set: {age: 26, email: "zhangsan_new@example.com"}}  // 更新内容
)

// 更新多个文档
db.users.updateMany(
    {age: {$lt: 30}},
    {$set: {status: "active"}}
)

// 替换文档
db.users.replaceOne(
    {name: "张三"},
    {name: "张三", age: 27, email: "new@example.com"}
)

// 更新操作符
db.users.updateOne({name: "张三"}, {$inc: {age: 1}})           // age加1
db.users.updateOne({name: "张三"}, {$mul: {age: 2}})           // age乘2
db.users.updateOne({name: "张三"}, {$min: {age: 20}})          // 如果age>20，设为20
db.users.updateOne({name: "张三"}, {$max: {age: 30}})          // 如果age<30，设为30
db.users.updateOne({name: "张三"}, {$rename: {age: "userAge"}}) // 重命名字段

// 数组更新
db.users.updateOne({name: "张三"}, {$push: {hobbies: "游泳"}})  // 添加元素
db.users.updateOne({name: "张三"}, {$pull: {hobbies: "游泳"}})  // 删除元素
db.users.updateOne({name: "张三"}, {$addToSet: {hobbies: "跑步"}})  // 添加不重复元素
db.users.updateOne({name: "张三"}, {$pop: {hobbies: 1}})       // 删除最后一个元素
db.users.updateOne({name: "张三"}, {$pop: {hobbies: -1}})      // 删除第一个元素

// upsert（不存在则插入）
db.users.updateOne(
    {name: "孙八"},
    {$set: {age: 24, email: "sunba@example.com"}},
    {upsert: true}
)

// 更新时间戳
db.users.updateOne({name: "张三"}, {$currentDate: {lastModified: true}})

3.6 文档删除（Delete）

// 删除单个文档
db.users.deleteOne({name: "张三"})

// 删除多个文档
db.users.deleteMany({age: {$lt: 25}})

// 删除所有文档
db.users.deleteMany({})

// remove方法（旧版本）
db.users.remove({name: "张三"})              // 删除所有匹配的文档
db.users.remove({name: "张三"}, {justOne: true})  // 只删除一个

四、MongoDB高级查询

4.1 聚合查询（Aggregation）

// 基础聚合
db.orders.aggregate([
    // 阶段1：筛选
    {$match: {status: "completed"}},

    // 阶段2：分组统计
    {$group: {
        _id: "$customerId",
        totalAmount: {$sum: "$amount"},
        orderCount: {$sum: 1},
        avgAmount: {$avg: "$amount"}
    }},

    // 阶段3：排序
    {$sort: {totalAmount: -1}},

    // 阶段4：限制结果
    {$limit: 10}
])

// 常用聚合操作符
// $sum - 求和
db.sales.aggregate([
    {$group: {_id: "$product", total: {$sum: "$quantity"}}}
])

// $avg - 平均值
db.sales.aggregate([
    {$group: {_id: "$product", avgPrice: {$avg: "$price"}}}
])

// $min / $max - 最小值/最大值
db.sales.aggregate([
    {$group: {_id: "$product", minPrice: {$min: "$price"}, maxPrice: {$max: "$price"}}}
])

// $first / $last - 第一个/最后一个
db.sales.aggregate([
    {$sort: {date: 1}},
    {$group: {_id: "$product", firstSale: {$first: "$date"}}}
])

// $push - 将值添加到数组
db.sales.aggregate([
    {$group: {_id: "$product", customers: {$push: "$customer"}}}
])

// $addToSet - 添加不重复的值
db.sales.aggregate([
    {$group: {_id: "$product", uniqueCustomers: {$addToSet: "$customer"}}}
])

// $project - 字段投影和转换
db.users.aggregate([
    {$project: {
        name: 1,
        age: 1,
        birthYear: {$subtract: [2024, "$age"]},
        fullName: {$concat: ["$firstName", " ", "$lastName"]}
    }}
])

// $unwind - 展开数组
db.articles.aggregate([
    {$unwind: "$tags"},
    {$group: {_id: "$tags", count: {$sum: 1}}}
])

// $lookup - 关联查询（类似JOIN）
db.orders.aggregate([
    {$lookup: {
        from: "products",
        localField: "productId",
        foreignField: "_id",
        as: "productInfo"
    }}
])

// $out - 将结果输出到新集合
db.sales.aggregate([
    {$group: {_id: "$product", total: {$sum: "$amount"}}},
    {$out: "salesSummary"}
])

4.2 索引优化

// 创建索引
db.users.createIndex({name: 1})                    // 单字段升序索引
db.users.createIndex({age: -1})                    // 单字段降序索引
db.users.createIndex({name: 1, age: -1})           // 复合索引
db.users.createIndex({email: 1}, {unique: true})   // 唯一索引
db.users.createIndex({description: "text"})        // 文本索引
db.locations.createIndex({location: "2dsphere"})   // 地理空间索引

// 查看索引
db.users.getIndexes()

// 删除索引
db.users.dropIndex("name_1")
db.users.dropIndex({name: 1})

// 删除所有索引（_id除外）
db.users.dropIndexes()

// 查看查询计划
db.users.find({name: "张三"}).explain("executionStats")

// 索引性能分析
db.users.find({age: {$gt: 25}}).hint({age: 1}).explain()

// 后台创建索引（不阻塞）
db.users.createIndex({email: 1}, {background: true})

// TTL索引（自动过期）
db.logs.createIndex({createdAt: 1}, {expireAfterSeconds: 3600})  // 1小时后过期

// 部分索引
db.users.createIndex(
    {email: 1},
    {partialFilterExpression: {age: {$gte: 18}}}
)

五、MongoDB副本集（Replica Set）

5.1 副本集概念

副本集是一组维护相同数据集的MongoDB服务器。副本集提供了数据冗余和高可用性，是生产部署的基础。

副本集角色：

Primary（主节点）：接收所有写操作
Secondary（从节点）：复制主节点的操作日志（oplog）
Arbiter（仲裁节点）：不存储数据，仅参与选举投票

5.2 副本集配置

# 1. 启动三个MongoDB实例
mongod --replSet rs0 --port 27017 --dbpath /data/mongo1 --bind_ip 0.0.0.0
mongod --replSet rs0 --port 27018 --dbpath /data/mongo2 --bind_ip 0.0.0.0
mongod --replSet rs0 --port 27019 --dbpath /data/mongo3 --bind_ip 0.0.0.0

# 2. 连接主节点并初始化副本集
mongosh --port 27017

// 初始化副本集
rs.initiate({
    _id: "rs0",
    members: [
        {_id: 0, host: "localhost:27017", priority: 2},
        {_id: 1, host: "localhost:27018", priority: 1},
        {_id: 2, host: "localhost:27019", priority: 1}
    ]
})

// 查看副本集状态
rs.status()

// 查看副本集配置
rs.conf()

// 添加节点
rs.add("localhost:27020")

// 添加仲裁节点
rs.addArb("localhost:27021")

// 删除节点
rs.remove("localhost:27020")

// 查看主节点
db.isMaster()

// 设置读取优先级（从节点也可读）
db.getMongo().setReadPref("secondaryPreferred")

// 副本集维护
rs.stepDown()      // 主节点降级
rs.reconfig()      // 重新配置副本集

5.3 副本集故障转移测试

# 模拟主节点宕机
# 在主节点执行
db.adminCommand({shutdown: 1})

# 观察其他节点自动选举新主节点
# 连接到从节点
mongosh --port 27018
rs.status()  # 查看新的主节点

六、MongoDB分片集群（Sharding）

6.1 分片概念

分片是MongoDB的数据分布式存储方案，用于处理大量数据和高吞吐量操作。

分片集群组件：

Config Servers（配置服务器）：存储集群元数据
Shard（分片服务器）：存储实际数据
Mongos（路由服务器）：查询路由器

6.2 分片集群搭建

# 1. 启动配置服务器（至少3个）
mongod --configsvr --replSet configRS --port 27019 --dbpath /data/config1
mongod --configsvr --replSet configRS --port 27020 --dbpath /data/config2
mongod --configsvr --replSet configRS --port 27021 --dbpath /data/config3

# 初始化配置服务器副本集
mongosh --port 27019
rs.initiate({
    _id: "configRS",
    configsvr: true,
    members: [
        {_id: 0, host: "localhost:27019"},
        {_id: 1, host: "localhost:27020"},
        {_id: 2, host: "localhost:27021"}
    ]
})

# 2. 启动分片服务器（每个分片是一个副本集）
# Shard 1
mongod --shardsvr --replSet shard1RS --port 27022 --dbpath /data/shard1-1
mongod --shardsvr --replSet shard1RS --port 27023 --dbpath /data/shard1-2

# Shard 2
mongod --shardsvr --replSet shard2RS --port 27024 --dbpath /data/shard2-1
mongod --shardsvr --replSet shard2RS --port 27025 --dbpath /data/shard2-2

# 初始化每个分片副本集
mongosh --port 27022
rs.initiate({
    _id: "shard1RS",
    members: [{_id: 0, host: "localhost:27022"}, {_id: 1, host: "localhost:27023"}]
})

# 3. 启动mongos路由
mongos --configdb configRS/localhost:27019,localhost:27020,localhost:27021 --port 27017

# 4. 添加分片
mongosh --port 27017
sh.addShard("shard1RS/localhost:27022,localhost:27023")
sh.addShard("shard2RS/localhost:27024,localhost:27025")

# 5. 启用数据库分片
sh.enableSharding("mydb")

# 6. 对集合进行分片
sh.shardCollection("mydb.users", {userId: "hashed"})  // 哈希分片
sh.shardCollection("mydb.orders", {customerId: 1})    // 范围分片

# 查看分片状态
sh.status()

七、MongoDB安全配置

7.1 用户权限管理

// 创建管理员用户
use admin
db.createUser({
    user: "admin",
    pwd: "password123",
    roles: [{role: "userAdminAnyDatabase", db: "admin"}]
})

// 创建数据库用户
use mydb
db.createUser({
    user: "appuser",
    pwd: "app123",
    roles: [
        {role: "readWrite", db: "mydb"},
        {role: "read", db: "analytics"}
    ]
})

// 内置角色
// read - 读取数据
// readWrite - 读写数据
// dbAdmin - 数据库管理
// userAdmin - 用户管理
// clusterAdmin - 集群管理
// root - 超级管理员

// 查看用户
db.getUsers()
show users

// 删除用户
db.dropUser("appuser")

// 修改用户密码
db.changeUserPassword("appuser", "newpassword")

// 授予角色
db.grantRolesToUser("appuser", [{role: "dbAdmin", db: "mydb"}])

// 撤销角色
db.revokeRolesFromUser("appuser", [{role: "dbAdmin", db: "mydb"}])

7.2 网络安全配置

# mongod.conf
net:
  bindIp: 127.0.0.1,192.168.1.100  # 限制绑定IP
  ssl:
    mode: requireSSL                # 强制SSL
    PEMKeyFile: /etc/ssl/mongodb.pem

security:
  authorization: enabled            # 启用认证
  keyFile: /etc/mongodb/keyfile     # 副本集密钥文件

八、MongoDB性能优化

8.1 查询优化

// 使用explain分析查询
db.users.find({age: {$gt: 25}}).explain("executionStats")

// 关键指标：
// - executionTimeMillis: 执行时间
// - totalDocsExamined: 扫描文档数
// - totalKeysExamined: 扫描索引键数
// - nReturned: 返回文档数

// 理想情况：totalDocsExamined = nReturned（无多余扫描）

// 使用投影减少返回数据
db.users.find({age: {$gt: 25}}, {name: 1, age: 1, _id: 0})

// 使用covered query（覆盖查询）
db.users.createIndex({name: 1, age: 1})
db.users.find({name: "张三"}, {name: 1, age: 1, _id: 0})  // 只从索引返回数据

// 避免全表扫描
db.users.find().limit(10)  // 加limit

8.2 写入优化

// 批量插入
db.users.insertMany([...], {ordered: false})  // 无序插入，失败不影响其他文档

// 使用bulkWrite批量操作
db.users.bulkWrite([
    {insertOne: {document: {name: "张三", age: 25}}},
    {updateOne: {filter: {name: "李四"}, update: {$set: {age: 30}}}},
    {deleteOne: {filter: {name: "王五"}}}
], {ordered: false})

// 关闭日志（仅测试环境）
db.adminCommand({setParameter: 1, journalCommitInterval: 300})

8.3 内存和存储优化

// 查看内存使用
db.serverStatus().mem

// 查看数据库统计
db.stats()

// 集合统计
db.users.stats()

// 压缩集合（回收空间）
db.runCommand({compact: "users"})

// WiredTiger引擎压缩配置
db.createCollection("logs", {
    storageEngine: {
        wiredTiger: {
            configString: "block_compressor=zlib"
        }
    }
})

核心知识总结

一、MongoDB核心概念速查表

1.1 MongoDB vs SQL术语映射

MongoDB	SQL	说明
Database	Database	数据库
Collection	Table	集合/表
Document	Row	文档/行
Field	Column	字段/列
Embedded Document	Table Join	内嵌文档/表连接
_id	Primary Key	主键
Index	Index	索引

1.2 CRUD操作速查

操作	SQL	MongoDB
插入	`INSERT INTO users VALUES (...)`	`db.users.insertOne({...})`
查询	`SELECT * FROM users WHERE age > 25`	`db.users.find({age: {$gt: 25}})`
更新	`UPDATE users SET age = 26 WHERE name = '张三'`	`db.users.updateOne({name: "张三"}, {$set: {age: 26}})`
删除	`DELETE FROM users WHERE age < 18`	`db.users.deleteMany({age: {$lt: 18}})`
统计	`SELECT COUNT(*) FROM users`	`db.users.countDocuments()`
排序	`SELECT * FROM users ORDER BY age DESC`	`db.users.find().sort({age: -1})`
分页	`SELECT * FROM users LIMIT 10 OFFSET 20`	`db.users.find().skip(20).limit(10)`

二、查询操作符完整清单

2.1 比较操作符

// 相等性
{field: value}                    // 等于
{field: {$eq: value}}             // 等于（显式）
{field: {$ne: value}}             // 不等于

// 大小比较
{field: {$gt: value}}             // 大于
{field: {$gte: value}}            // 大于等于
{field: {$lt: value}}             // 小于
{field: {$lte: value}}            // 小于等于

// 范围查询
{field: {$in: [v1, v2, v3]}}      // 在数组中
{field: {$nin: [v1, v2]}}         // 不在数组中

2.2 逻辑操作符

// AND
{$and: [{条件1}, {条件2}]}
{条件1, 条件2}                     // 隐式AND

// OR
{$or: [{条件1}, {条件2}]}

// NOT
{field: {$not: {$gt: 10}}}

// NOR（都不满足）
{$nor: [{条件1}, {条件2}]}

2.3 元素操作符

1
2
3

{field: {$exists: true}}          // 字段存在
{field: {$type: "string"}}        // 字段类型
{field: {$type: 2}}               // 类型编号

2.4 数组操作符

1
2
3

{tags: {$all: ["tag1", "tag2"]}}  // 包含所有元素
{tags: {$size: 3}}                // 数组长度为3
{tags: {$elemMatch: {$gt: 5, $lt: 10}}}  // 元素匹配条件

2.5 更新操作符

// 字段更新
{$set: {field: value}}            // 设置值
{$unset: {field: ""}}             // 删除字段
{$rename: {old: "new"}}           // 重命名字段
{$inc: {count: 1}}                // 增加/减少
{$mul: {price: 1.1}}              // 乘法
{$min: {score: 60}}               // 取最小值
{$max: {score: 100}}              // 取最大值
{$currentDate: {lastModified: true}}  // 设置当前时间

// 数组更新
{$push: {tags: "new"}}            // 添加元素
{$pull: {tags: "old"}}            // 删除元素
{$addToSet: {tags: "unique"}}     // 添加不重复元素
{$pop: {tags: 1}}                 // 删除最后一个（-1删除第一个）
{$pullAll: {tags: ["t1", "t2"]}}  // 删除多个元素

三、索引策略与最佳实践

3.1 索引类型总结

索引类型	创建语法	适用场景
单字段索引	`db.col.createIndex({field: 1})`	单字段查询、排序
复合索引	`db.col.createIndex({f1: 1, f2: -1})`	多字段组合查询
多键索引	`db.col.createIndex({tags: 1})`	数组字段查询
文本索引	`db.col.createIndex({content: "text"})`	全文搜索
哈希索引	`db.col.createIndex({field: "hashed"})`	分片键、等值查询
地理空间索引	`db.col.createIndex({loc: "2dsphere"})`	地理位置查询
TTL索引	`db.col.createIndex({date: 1}, {expireAfterSeconds: 3600})`	自动过期数据
唯一索引	`db.col.createIndex({email: 1}, {unique: true})`	唯一性约束
部分索引	`db.col.createIndex({f: 1}, {partialFilterExpression: {...}})`	条件索引
稀疏索引	`db.col.createIndex({f: 1}, {sparse: true})`	字段不总是存在

3.2 复合索引设计原则（ESR规则）

// ESR: Equality, Sort, Range
// 1. Equality（等值查询）字段放前面
// 2. Sort（排序）字段放中间
// 3. Range（范围查询）字段放最后

// 示例查询
db.users.find({status: "active", age: {$gt: 25}}).sort({name: 1})

// 最佳索引
db.users.createIndex({
    status: 1,    // Equality
    name: 1,      // Sort
    age: 1        // Range
})

3.3 索引优化建议

何时创建索引：

频繁查询的字段
排序字段
唯一性约束字段
外键字段（关联查询）

何时避免索引：

数据量很小的集合（<1000文档）
写操作远多于读操作
字段值重复率很高（如性别）
返回结果集占比很大（>30%）

索引监控：

// 查看索引使用情况
db.users.aggregate([{$indexStats: {}}])

// 查看慢查询
db.getProfilingStatus()
db.setProfilingLevel(1, {slowms: 100})  // 记录>100ms的查询
db.system.profile.find().sort({ts: -1}).limit(5)

四、聚合管道实战案例

4.1 常见聚合场景

场景1：销售统计

// 统计每个产品的总销售额和平均价格
db.sales.aggregate([
    {$match: {status: "completed"}},
    {$group: {
        _id: "$productId",
        totalSales: {$sum: {$multiply: ["$price", "$quantity"]}},
        avgPrice: {$avg: "$price"},
        count: {$sum: 1}
    }},
    {$sort: {totalSales: -1}},
    {$limit: 10}
])

场景2：用户行为分析

// 按日期统计活跃用户
db.userLogs.aggregate([
    {$match: {
        timestamp: {$gte: ISODate("2024-01-01"), $lt: ISODate("2024-02-01")}
    }},
    {$group: {
        _id: {$dateToString: {format: "%Y-%m-%d", date: "$timestamp"}},
        uniqueUsers: {$addToSet: "$userId"},
        pageViews: {$sum: 1}
    }},
    {$project: {
        date: "$_id",
        userCount: {$size: "$uniqueUsers"},
        pageViews: 1
    }},
    {$sort: {date: 1}}
])

场景3：订单关联查询

// 订单关联用户和产品信息
db.orders.aggregate([
    // 关联用户
    {$lookup: {
        from: "users",
        localField: "userId",
        foreignField: "_id",
        as: "userInfo"
    }},
    {$unwind: "$userInfo"},

    // 关联产品
    {$lookup: {
        from: "products",
        localField: "productId",
        foreignField: "_id",
        as: "productInfo"
    }},
    {$unwind: "$productInfo"},

    // 投影结果
    {$project: {
        orderNo: 1,
        userName: "$userInfo.name",
        productName: "$productInfo.name",
        totalAmount: {$multiply: ["$productInfo.price", "$quantity"]}
    }}
])

4.2 聚合管道阶段速查

阶段	说明	示例
$match	过滤文档	`{$match: {age: {$gt: 25}}}`
$group	分组聚合	`{$group: {_id: "$city", count: {$sum: 1}}}`
$project	字段投影	`{$project: {name: 1, age: 1}}`
$sort	排序	`{$sort: {age: -1}}`
$limit	限制结果数	`{$limit: 10}`
$skip	跳过文档	`{$skip: 20}`
$unwind	展开数组	`{$unwind: "$tags"}`
$lookup	关联查询	`{$lookup: {from: "users", ...}}`
$out	输出到集合	`{$out: "result"}`
$count	计数	`{$count: "total"}`
$facet	多维聚合	`{$facet: {统计1: [...], 统计2: [...]}}`

五、副本集架构详解

5.1 副本集工作原理

graph TB
    Client[客户端] --> Primary[主节点Primary]
    Primary --> |oplog复制| Secondary1[从节点1]
    Primary --> |oplog复制| Secondary2[从节点2]
    Primary --> |心跳| Secondary1
    Primary --> |心跳| Secondary2
    Secondary1 --> |心跳| Secondary2

    subgraph 故障转移
    A[主节点宕机] --> B[从节点检测]
    B --> C[发起选举]
    C --> D[多数投票]
    D --> E[新主节点产生]
    end

5.2 副本集配置参数详解

rs.initiate({
    _id: "rs0",                    // 副本集名称
    members: [
        {
            _id: 0,
            host: "server1:27017",
            priority: 2,           // 优先级（0-1000），越高越容易成为主节点
            votes: 1,              // 投票权（0或1）
            arbiterOnly: false,    // 是否为仲裁节点
            hidden: false,         // 是否隐藏（不接受读请求）
            slaveDelay: 0,         // 延迟复制（秒）
            buildIndexes: true     // 是否构建索引
        },
        {
            _id: 1,
            host: "server2:27017",
            priority: 1
        },
        {
            _id: 2,
            host: "server3:27017",
            priority: 1
        },
        {
            _id: 3,
            host: "server4:27017",
            arbiterOnly: true      // 仲裁节点
        }
    ],
    settings: {
        chainingAllowed: true,     // 允许级联复制
        heartbeatTimeoutSecs: 10,  // 心跳超时
        electionTimeoutMillis: 10000  // 选举超时
    }
})

5.3 读写策略配置

// 写关注（Write Concern）
db.users.insertOne(
    {name: "张三"},
    {
        writeConcern: {
            w: "majority",         // 写入多数节点
            j: true,               // 写入journal日志
            wtimeout: 5000         // 超时时间（毫秒）
        }
    }
)

// w选项：
// w: 1              - 写入主节点即返回（默认）
// w: "majority"     - 写入多数节点
// w: 0              - 不等待确认（不安全）
// w: 2              - 写入主节点+1个从节点

// 读偏好（Read Preference）
db.getMongo().setReadPref("secondaryPreferred")

// 读偏好选项：
// primary            - 只从主节点读（默认）
// primaryPreferred   - 优先主节点，主节点不可用时从从节点读
// secondary          - 只从从节点读
// secondaryPreferred - 优先从节点
// nearest            - 从延迟最低的节点读

六、分片集群架构实战

6.1 分片键选择策略

分片键类型	优点	缺点	适用场景
哈希分片	数据分布均匀	不支持范围查询	随机访问、写入密集
范围分片	支持范围查询	可能数据倾斜	时间序列、有序数据
复合分片	兼顾均匀和查询	配置复杂	复杂查询场景

分片键选择原则：

// 1. 高基数（Cardinality）- 唯一值多
// 好：userId（百万级别）
// 差：gender（只有2-3个值）

// 2. 低频率（Frequency）- 值分布均匀
// 好：随机ID
// 差：时间戳（新数据集中在某个分片）

// 3. 单调性（Monotonicity）- 避免单调递增
// 好：哈希ID
// 差：自增ID、时间戳

// 示例：复合分片键
sh.shardCollection("mydb.orders", {
    region: 1,      // 地区（保证范围查询）
    userId: "hashed"  // 用户ID哈希（保证均匀分布）
})

6.2 分片集群监控

// 查看分片状态
sh.status()

// 查看集合分片信息
db.users.getShardDistribution()

// 均衡器状态
sh.getBalancerState()
sh.startBalancer()    // 启动均衡器
sh.stopBalancer()     // 停止均衡器

// 查看chunk分布
use config
db.chunks.find({ns: "mydb.users"}).count()

七、MongoDB性能调优清单

7.1 Schema设计优化

反模式1：过度嵌套

// 不好：嵌套太深
{
    user: {
        profile: {
            address: {
                location: {
                    coordinates: {lat: 39.9, lon: 116.4}
                }
            }
        }
    }
}

// 好：扁平化
{
    userName: "张三",
    city: "北京",
    location: {lat: 39.9, lon: 116.4}
}

反模式2：大数组

// 不好：无限增长的数组
{
    userId: 1001,
    logs: [{...}, {...}, ...] // 可能数万条
}

// 好：引用模式
// users集合
{userId: 1001, name: "张三"}

// logs集合
{userId: 1001, action: "login", time: ISODate(...)}

7.2 查询优化技巧

// 1. 使用投影减少数据传输
db.users.find({age: {$gt: 25}}, {name: 1, email: 1, _id: 0})

// 2. 使用covered query（覆盖查询）
db.users.createIndex({name: 1, age: 1})
db.users.find({name: "张三"}, {name: 1, age: 1, _id: 0})
// explain()中看到 "indexOnly": true

// 3. 避免$where和$regex
// 慢：
db.users.find({$where: "this.age > 25"})
// 快：
db.users.find({age: {$gt: 25}})

// 4. 使用hint强制索引
db.users.find({age: {$gt: 25}}).hint({age: 1})

// 5. 批量操作
db.users.bulkWrite([...], {ordered: false})

7.3 连接池配置

// Node.js驱动示例
const { MongoClient } = require('mongodb');

const client = new MongoClient(uri, {
    maxPoolSize: 50,          // 最大连接数
    minPoolSize: 10,          // 最小连接数
    maxIdleTimeMS: 60000,     // 空闲连接保持时间
    waitQueueTimeoutMS: 5000, // 等待连接超时
    serverSelectionTimeoutMS: 5000  // 服务器选择超时
});

八、MongoDB监控指标

8.1 关键监控指标

// 1. 服务器状态
db.serverStatus()

// 关键指标：
{
    connections: {
        current: 100,         // 当前连接数
        available: 51200      // 可用连接数
    },
    opcounters: {
        insert: 1000,
        query: 5000,
        update: 800,
        delete: 200
    },
    mem: {
        resident: 2048,       // 物理内存（MB）
        virtual: 4096         // 虚拟内存（MB）
    },
    locks: {...},            // 锁状态
    globalLock: {
        currentQueue: {
            total: 0,
            readers: 0,
            writers: 0
        }
    }
}

// 2. 数据库统计
db.stats()

// 3. 集合统计
db.users.stats()

// 4. 索引使用统计
db.users.aggregate([{$indexStats: {}}])

8.2 性能分析工具

// 1. explain()分析
db.users.find({age: {$gt: 25}}).explain("executionStats")

// 关键指标：
// - executionTimeMillis: 执行时间
// - totalKeysExamined: 扫描索引数
// - totalDocsExamined: 扫描文档数
// - nReturned: 返回文档数
// 最佳状态：totalDocsExamined = nReturned

// 2. profiler（慢查询分析）
db.setProfilingLevel(1, {slowms: 100})  // 记录>100ms的查询
db.system.profile.find().sort({ts: -1}).limit(10)

// 3. mongostat（实时监控）
// 命令行执行
mongostat --host localhost:27017

// 4. mongotop（集合操作时间）
mongotop --host localhost:27017

九、生产环境最佳实践

9.1 部署架构建议

# 小规模（<100GB，<10万QPS）
架构: 单副本集（1主2从）
配置: 16GB内存，4核CPU，SSD

# 中规模（<1TB，<50万QPS）
架构: 分片集群（2分片，每分片3节点副本集）
配置: 32GB内存，8核CPU，SSD

# 大规模（>1TB，>50万QPS）
架构: 分片集群（N分片，每分片3-5节点副本集）
配置: 64GB+内存，16核+CPU，NVMe SSD

9.2 备份策略

# 1. mongodump（逻辑备份）
mongodump --host localhost:27017 --out /backup/$(date +%Y%m%d)

# 2. 文件系统快照（物理备份）
# 适用于云服务器EBS快照

# 3. oplog备份（时间点恢复）
mongodump --host localhost:27017 --oplog --out /backup/oplog

# 4. 自动化备份脚本
#!/bin/bash
BACKUP_DIR="/backup/mongodb"
DATE=$(date +%Y%m%d_%H%M%S)
mongodump --host localhost:27017 --gzip --out $BACKUP_DIR/$DATE
find $BACKUP_DIR -type d -mtime +7 -exec rm -rf {} \;  # 删除7天前备份

9.3 安全加固建议

# mongod.conf安全配置
security:
  authorization: enabled              # 启用认证
  keyFile: /etc/mongodb/keyfile       # 副本集密钥

net:
  bindIp: 127.0.0.1,192.168.1.0/24   # 限制访问IP
  ssl:
    mode: requireSSL                  # 强制SSL
    PEMKeyFile: /etc/ssl/mongodb.pem

# 其他安全措施：
# 1. 最小权限原则
# 2. 定期更新密码
# 3. 启用审计日志
# 4. 防火墙限制
# 5. 加密敏感字段

Hi, Sirius