PostgreSQL数据库 加入小组

28个成员 7个话题 创建时间:2021-01-15

max_wal_size/min_wal_size参数理解和作用测试

发表于2021-04-07 6161次查看

对max_wal_size/min_wal_size这两个参数的理解一直不是很清楚,通过查看官方文档的解释后发现描述的比较晦涩,因此做了以下测试加深理解,如有不合理之处,麻烦指正。

首先是官方文档关于这两个参数的解释:
 
max_wal_size
在自动 WAL 检查点之间允许 WAL 增长到的最大尺寸。这是一个软限制, 在特殊的情况下 WAL 尺寸可能会超过max_wal_size, 例如在重度负荷下、archive_command失败或者高的 wal_keep_segments设置。默认为 1 GB。增加这个参数 可能导致崩溃恢复所需的时间。这个参数只能在postgresql.conf 或者服务器命令行中设置。
 
min_wal_size
只要 WAL 磁盘用量保持在这个设置之下,在检查点时旧的 WAL 文件总是 被回收以便未来使用,而不是直接被删除。这可以被用来确保有足够的 WAL 空间被保留来应付 WAL 使用的高峰,例如运行大型的批处理任务。 默认是 80 MB。这个参数只能在postgresql.conf 或者服务器命令行中设置。

  

测试环境

硬件环境:CVM云服务器
硬件配置:4Core 8GB HDD 50G 
操作系统:CentOS Linux release 7.8.2003 (Core)
PG版本:PostgreSQL 12.4 RPM安装

 

1.系统设置
 
2.数据库参数设置
listen_addresses = '0.0.0.0'
port = 5432
max_connections = 500
shared_buffers = 2GB
maintenance_work_mem = 600MB
dynamic_shared_memory_type = posix
vacuum_cost_delay = 0
bgwriter_delay = 10ms
bgwriter_lru_maxpages = 1000
bgwriter_lru_multiplier = 10.0
wal_level = replica
synchronous_commit = off
full_page_writes = on
wal_buffers = 64MB
checkpoint_timeout = 30min  #延长检查点间隔时间,让数据库通过max_wal_size来触发检查点
checkpoint_completion_target = 0.9
effective_cache_size = 6GB
logging_collector = on
log_truncate_on_rotation = on
log_checkpoints = on
log_connections = on
log_disconnections = on
log_timezone = 'PRC'
autovacuum = on
default_text_search_config = 'pg_catalog.english'
shared_preload_libraries='pg_stat_statements'
--重启数据库
sudo systemctl restart postgresql-12

3.测试工具

--BenchmarkSQL压力测试工具,配置过程略
  --benchmarksql-5.0
--shell脚本
check.sh
作用:当有检查点信息输出到log日志的时候,统计当前pg_wal目录的大小、wal日志文件的个数、wal日志文件的文件名
#!/bin/bash
cd $(dirname $0)
# CKPT_CNT1=`grep 'checkpoint complete' log/postgresql-Wed.log|wc -l`
CKPT_CNT1=0
CKPT_CNT2=0
while true
do
    if [[ $CKPT_CNT2 -gt $CKPT_CNT1 ]]; then
        echo '----------------------------'
        date
        grep 'checkpoint complete' log/postgresql-Wed.log|tail -1|awk '{print $1,$2,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$26,$27}'
        echo 'wal  size:' $(du -sh pg_wal|awk '{print $1}')
        echo 'wal count:' $(ls -l pg_wal/0000000*|wc -l)
        echo 'wal files:'
        echo "$(ls pg_wal/|column -t|grep -v 'archive_status')"
        CKPT_CNT1=$CKPT_CNT2
    fi
    sleep 1s
    CKPT_CNT2=`grep 'checkpoint complete' log/postgresql-Wed.log|wc -l`
done

walcnt.sh

作用:一直输出wal日志文件的数量,因为checkpoint删除wal日志文件速度可能过快,以上脚本可能无法捕捉到

while true
do
    echo "wal count: $(ls -l pg_wal/0000000*|wc -l)"
done
 
先简单说下checkpoint触发条件
(1) 手动执行checkpoint
(2) pg_start_backup/create database/pg_ctl stop 
(3)达到checkpoint_timeout时间限制
(4)上次checkpoint以来wal_segment数量达到max_wal_size大小
 
MAX_WAL_SIZE
通过对max_wal_size设置不同值,生成检查点看日志大小和数量变化
1.第一次测试 - max_wal_size = 200MB
postgres=# show max_wal_size ;
 max_wal_size
--------------
 200MB
 
--加载数据以产生大量wal日志
[postgres@VM-0-6-centos run]$ ./runDatabaseBuild.sh props.pg
 
--check.sh采集脚本结果
sh check.sh | tee check.log
----------------------------
.........
Wed Apr  7 12:41:27 CST 2021
2021-04-07 12:41:27.046 wrote 6842 buffers (2.6%); 0 WAL file(s) added, 1 removed, 5 recycled; sync files=3,
wal  size: 193M
wal count: 12
wal files:
    00000001000000030000009E
    00000001000000030000009F
    0000000100000003000000A0
    0000000100000003000000A1
    0000000100000003000000A2
    0000000100000003000000A3
    0000000100000003000000A4
    0000000100000003000000A5
    0000000100000003000000A6
    0000000100000003000000A7
    0000000100000003000000A8
    0000000100000003000000A9
----------------------------
.........
----------------------------
Wed Apr  7 12:41:34 CST 2021
2021-04-07 12:41:34.582 wrote 11178 buffers (4.3%); 0 WAL file(s) added, 1 removed, 5 recycled; sync files=2,
wal  size: 193M
wal count: 12
wal files:
    0000000100000003000000AA
    0000000100000003000000AB
    0000000100000003000000AC
    0000000100000003000000AD
    0000000100000003000000AE
    0000000100000003000000AF
    0000000100000003000000B0
    0000000100000003000000B1
    0000000100000003000000B2
    0000000100000003000000B3
    0000000100000003000000B4
    0000000100000003000000B5
----------------------------
Wed Apr  7 12:41:39 CST 2021
2021-04-07 12:41:38.790 wrote 11809 buffers (4.5%); 0 WAL file(s) added, 1 removed, 5 recycled; sync files=7,
wal  size: 193M
wal count: 12
wal files:
    0000000100000003000000B0
    0000000100000003000000B1
    0000000100000003000000B2
    0000000100000003000000B3
    0000000100000003000000B4
    0000000100000003000000B5
    0000000100000003000000B6
    0000000100000003000000B7
    0000000100000003000000B8
    0000000100000003000000B9
    0000000100000003000000BA
    0000000100000003000000BB
----------------------------
................

--walcnt脚本结果处理后输出

[postgres@VM-0-6-centos data]$ cat walcnt.log |sort -u
wal count: 10
wal count: 11
wal count: 12
wal count: 13
wal count: 9
 
通过以上可以看出,pg_wal目录大小基本保持在193M(wal_segment_size=16MB * 12 ),接近于max_wal_size设置大小
 
2.第二次测试 - max_wal_size = 400MB
postgres=# show max_wal_size;
 max_wal_size
--------------
 400MB

 

--加载测试数据

[postgres@VM-0-6-centos run]$ ./runDatabaseBuild.sh props.pg
 
--check.sh采集脚本结果

----------------------------
Wed Apr  7 13:08:57 CST 2021
2021-04-07 13:08:25.835 wrote 0 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; sync files=0,
wal  size: 193M
wal count: 12
wal files:
    000000010000000400000099
    00000001000000040000009A
    00000001000000040000009B
    00000001000000040000009C
    00000001000000040000009D
    00000001000000040000009E
    00000001000000040000009F
    0000000100000004000000A0
    0000000100000004000000A1
    0000000100000004000000A2
    0000000100000004000000A3
    0000000100000004000000A4
----------------------------
Wed Apr  7 13:09:49 CST 2021
2021-04-07 13:09:48.261 wrote 24472 buffers (9.3%); 0 WAL file(s) added, 2 removed, 11 recycled; sync files=50,
wal  size: 401M
wal count: 25
wal files:
    0000000100000004000000A6
    0000000100000004000000A7
    0000000100000004000000A8
    0000000100000004000000A9
    0000000100000004000000AA
    0000000100000004000000AB
    0000000100000004000000AC
    0000000100000004000000AD
    0000000100000004000000AE
    0000000100000004000000AF
    0000000100000004000000B0
    0000000100000004000000B1
    0000000100000004000000B2
    0000000100000004000000B3
    0000000100000004000000B4
    0000000100000004000000B5
    0000000100000004000000B6
    0000000100000004000000B7
    0000000100000004000000B8
    0000000100000004000000B9
    0000000100000004000000BA
    0000000100000004000000BB
    0000000100000004000000BC
    0000000100000004000000BD
    0000000100000004000000BE
----------------------------
Wed Apr  7 13:09:57 CST 2021
2021-04-07 13:09:57.424 wrote 26044 buffers (9.9%); 0 WAL file(s) added, 2 removed, 11 recycled; sync files=5,
wal  size: 401M
wal count: 25
wal files:
    0000000100000004000000B3
    0000000100000004000000B4
    0000000100000004000000B5
    0000000100000004000000B6
    0000000100000004000000B7
    0000000100000004000000B8
    0000000100000004000000B9
    0000000100000004000000BA
    0000000100000004000000BB
    0000000100000004000000BC
    0000000100000004000000BD
    0000000100000004000000BE
    0000000100000004000000BF
    0000000100000004000000C0
    0000000100000004000000C1
    0000000100000004000000C2
    0000000100000004000000C3
    0000000100000004000000C4
    0000000100000004000000C5
    0000000100000004000000C6
    0000000100000004000000C7
    0000000100000004000000C8
    0000000100000004000000C9
    0000000100000004000000CA
    0000000100000004000000CB
----------------------------
...............

--walcnt脚本结果处理后输出

[postgres@VM-0-6-centos data]$ cat walcnt.log |sort -u
wal count: 12
wal count: 13
wal count: 14
wal count: 15
wal count: 16
wal count: 17
wal count: 18
wal count: 19
wal count: 20
wal count: 21
wal count: 22
wal count: 23
wal count: 24
wal count: 25
wal count: 26
wal count: 27
 
通过以上可以看出,pg_wal目录大小基本保持在401M(wal_segment_size=16MB * 25 ),同样接近于max_wal_size设置大小
 

 

整体来看:
(1)当前max_wal_size=200M,pg_wal目录大小稳定在193M,正好可以完整容纳12个wal日志文件,并且12个日志文件不断回收使用
(2)当前max_wal_size=400M,pg_wal目录大小稳定在401M,正好可以完整容纳25个wal日志文件,并且25个日志文件是不断回收使用
(3)回收后的文件会根据当前使用的wal日志文件序号依次递增,以供将来使用
(4)通过walcnt.sh统计可以看出max_wal_size并不会严格显示pg_wal目录的大小或者wal日志文件的数量,官方文档也说明了这一点,但是最终基本会稳定在floor(max_wal_size/wal_segment_size)
 
 
MIN_WAL_SIZE
按照max_wal_size的理解,反过来min_wal_size就是限制pg_wal目录的最小值,也就是限制最少的wal日志文件数量,在数据库空闲的时候,如果pg_wal目录大小低于min_wal_size,数据库将会重用wal日志文件而不是删除。
 
1.第一次测试 - min_wal_size = 50MB
--为了方便观察调整了checkpoint_completion_target以及checkpoint_timeout
postgres=# show checkpoint_completion_target;
 checkpoint_completion_target
------------------------------
 0.2

postgres=# show checkpoint_timeout;
 checkpoint_timeout
--------------------
 2min

postgres=# show min_wal_size;
 min_wal_size
--------------
 50MB
--手动切换归档日志,通过多次切换wal日志,使wal日志的数量>floor(min_wal_size/wal_segment_size)
--这里为了效果明显,切换了5次
[postgres@VM-0-6-centos data]$ psql -c "select pg_switch_wal();"

--生成检查点

[postgres@VM-0-6-centos data]$ psql -c "checkpoint;"
CHECKPOINT

--持续切换wal日志并生成检查点,观察wal日志状态

while true
do 
    sleep 1s 
    psql -c "select pg_switch_wal()
    checkpoint;" 
done

--观察wal日志变化

----------------------------
Wed Apr  7 15:19:27 CST 2021
2021-04-07 15:19:27.066 wrote 654 buffers (0.2%); 0 WAL file(s) added, 0 removed, 4 recycled; sync files=24,
wal  size: 81M
wal count: 5
wal files:
000000010000000500000051
000000010000000500000052
000000010000000500000053
000000010000000500000054
000000010000000500000055
----------------------------
..............
----------------------------
Wed Apr  7 15:31:04 CST 2021
2021-04-07 15:31:03.299 wrote 0 buffers (0.0%); 0 WAL file(s) added, 0 removed, 1 recycled; sync files=0,
wal  size: 81M
wal count: 5
wal files:
000000010000000500000057
000000010000000500000058
000000010000000500000059
00000001000000050000005A
00000001000000050000005B
----------------------------
Wed Apr  7 15:31:17 CST 2021
2021-04-07 15:31:16.912 wrote 0 buffers (0.0%); 0 WAL file(s) added, 1 removed, 0 recycled; sync files=0,
wal  size: 65M
wal count: 4
wal files:
000000010000000500000058
000000010000000500000059
00000001000000050000005A
00000001000000050000005B
----------------------------
..............
----------------------------
Wed Apr  7 15:35:35 CST 2021
2021-04-07 15:35:34.638 wrote 0 buffers (0.0%); 0 WAL file(s) added, 0 removed, 1 recycled; sync files=0,
wal  size: 65M
wal count: 4
wal files:
00000001000000050000005F
000000010000000500000060
000000010000000500000061
000000010000000500000062
----------------------------
Wed Apr  7 15:35:36 CST 2021
2021-04-07 15:35:35.646 wrote 0 buffers (0.0%); 0 WAL file(s) added, 1 removed, 0 recycled; sync files=0,
wal  size: 49M
wal count: 3
wal files:
000000010000000500000061
000000010000000500000062
000000010000000500000063
----------------------------
Wed Apr  7 15:35:37 CST 2021
2021-04-07 15:35:36.996 wrote 0 buffers (0.0%); 0 WAL file(s) added, 0 removed, 1 recycled; sync files=0,
wal  size: 49M
wal count: 3
wal files:
000000010000000500000062
000000010000000500000063
000000010000000500000064
----------------------------
..............

 

通过以上可以看出,数据库空闲状态下,pg_wal目录大小基本保持在49M(wal_segment_size=16MB * 3 ),接近于min_wal_size设置大小,持续下去pg_wal基本保持在49M
 
2.第二次测试 - min_wal_size = 100MB
postgres=# show min_wal_size ;
 min_wal_size
--------------
 100MB
--手动切换归档日志,通过多次切换wal日志,使wal日志的数量>floor(min_wal_size/wal_segment_size)
--这里为了效果明显,切换7次
[postgres@VM-0-6-centos data]$ psql -c "select pg_switch_wal();"

--生成检查点

[postgres@VM-0-6-centos data]$ psql -c "checkpoint;"
CHECKPOINT

--持续切换wal日志并生成检查点,观察wal日志状态

while true
do 
    sleep 1s 
    psql -c "select pg_switch_wal()
    checkpoint;" 
done

--观察wal日志变化

Wed Apr  7 15:52:42 CST 2021
2021-04-07 15:52:42.200 wrote 0 buffers (0.0%); 0 WAL file(s) added, 0 removed, 1 recycled; sync files=0,
wal  size: 97M
wal count: 6
wal files:
0000000100000006000000DB
0000000100000006000000DC
0000000100000006000000DD
0000000100000006000000DE
0000000100000006000000DF
0000000100000006000000E0
----------------------------
Wed Apr  7 15:52:43 CST 2021
2021-04-07 15:52:43.239 wrote 0 buffers (0.0%); 0 WAL file(s) added, 0 removed, 1 recycled; sync files=0,
wal  size: 97M
wal count: 6
wal files:
0000000100000006000000DC
0000000100000006000000DD
0000000100000006000000DE
0000000100000006000000DF
0000000100000006000000E0
0000000100000006000000E1
----------------------------
Wed Apr  7 15:52:44 CST 2021
2021-04-07 15:52:44.280 wrote 0 buffers (0.0%); 0 WAL file(s) added, 0 removed, 1 recycled; sync files=0,
wal  size: 97M
wal count: 6
wal files:
0000000100000006000000DD
0000000100000006000000DE
0000000100000006000000DF
0000000100000006000000E0
0000000100000006000000E1
0000000100000006000000E2
----------------------------
Wed Apr  7 15:52:45 CST 2021
2021-04-07 15:52:45.321 wrote 0 buffers (0.0%); 0 WAL file(s) added, 0 removed, 1 recycled; sync files=0,
wal  size: 97M
wal count: 6
wal files:
0000000100000006000000DE
0000000100000006000000DF
0000000100000006000000E0
0000000100000006000000E1
0000000100000006000000E2
0000000100000006000000E3
 
通过以上可以看出,pg_wal目录大小基本保持在97M(wal_segment_size=16MB * 6 ),接近于min_wal_size设置大小,持续下去pg_wal基本保持在97M
 
总结:
(1)max_wal_size用来限制pg_wal目录容量的上限,但仅是软限制,可能会出现超过设置值大小的情况,但是目录大小最终会稳定在max_wal_size设置值以内
(2)min_wal_size用来防止pg_wal中的wal日志文件被过多的删除,保留最少数量的wal日志文件回收以供将来使用,这样可以在一定程度上避免重建wal日志文件的造成的性能消耗
(3)通俗的将max_wal_size就是用来限制pg_wal目录可以最多容纳多少个wal日志文件
(4)min_wal_size用来保证pg_wal中至少要保留多少个wal日志文件
(5)当max_wal_size设置不合理,日志中会有相应提示HINT:  Consider increasing the configuration parameter "max_wal_size"
         建议 max_wal_size=checkpoint_timeout * data_per_min * wal_segment_size
(6)而min_wal_size不适合设置过小,过小的值会导致保留的wal日志文件过少
         建议min_wal_size=max_wal_size/4
2回复
发表回复
你还没有登录,请先 登录或 注册!