监控目标与总体思路
PostgreSQL 内置视图与关键 SQL
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;shared_preload_libraries = 'pg_stat_statements',然后重启实例。SELECT pid, usename, client_addr, state, query, wait_event_type, wait_event FROM pg_stat_activity WHERE state = 'active';SELECT datname, numbackends, tps, blk_read_time, blk_write_time, (blks_hit::float/(blks_hit+blks_read+1)) AS hit_ratio FROM pg_stat_database;SELECT now() - pg_last_xact_replay_timestamp() AS replay_lag;SELECT schemaname, tablename, seq_scan, idx_scan, n_tup_ins, n_tup_upd, n_tup_del, n_live_tup, n_dead_tup FROM pg_stat_user_tables ORDER BY n_dead_tup DESC LIMIT 20;SELECT schemaname, tablename, indexname, idx_scan FROM pg_stat_user_indexes ORDER BY idx_scan;SELECT query, calls, total_time/calls AS avg_ms, rows FROM pg_stat_statements ORDER BY total_time DESC LIMIT 10;命令行与轻量级工具
CREATE USER monitor WITH PASSWORD '***'; GRANT pg_monitor TO monitor;(PostgreSQL 9.6+ 提供 pg_monitor 角色)pgcenter top -h 127.0.0.1 -p 5432 -U monitor -d postgrespgcenter record -f /tmp/pg.dump -d postgres -U monitor 与 pgcenter report -f /tmp/pg.dump --format=csv -o /tmp/pg.csvtop/vmstat/iostat/free/netstat 观察 CPU、内存、I/O、网络与连接状态,与数据库指标交叉验证瓶颈所在。图形化与平台化监控
告警阈值与优化方向
pg_stat_database 评估 shared_buffers 与工作负载匹配度。pg_stat_bgwriter 观察检查点与后台写入统计,必要时调整 checkpoint_timeout、checkpoint_completion_target。work_mem 与 SQL 写法优化(避免大排序/哈希聚合溢出)。pg_stat_activity 与锁视图定位持有/等待会话,优化事务边界与隔离级别。shared_buffers(常见为内存的 1/4 左右)、work_mem、maintenance_work_mem、effective_cache_size、checkpoint_timeout/checkpoint_completion_target。pg_size_pretty(pg_database_size('db'))、pg_total_relation_size('schema.tbl');pg_waldump 分析异常写入与回放行为。