A Practical Guide to Tomcat Monitoring and Alerting on Debian
1 Monitoring Architecture and Component Selection
Use tail -f to follow logs in real time.
2 Metrics Monitoring and Alerting in Practice
Metrics collection: Prometheus with the JMX Exporter.
Enable remote JMX. Authentication and SSL are disabled in this example, so the port should only be reachable from a trusted network:

    CATALINA_OPTS="$CATALINA_OPTS -Dcom.sun.management.jmxremote \
      -Dcom.sun.management.jmxremote.port=1099 \
      -Dcom.sun.management.jmxremote.rmi.port=1099 \
      -Djava.rmi.server.hostname=<server_ip> \
      -Dcom.sun.management.jmxremote.authenticate=false \
      -Dcom.sun.management.jmxremote.ssl=false"

Then connect with jconsole.

Attach the JMX Exporter as a Java agent:

    export CATALINA_OPTS="$CATALINA_OPTS -javaagent:$CATALINA_HOME/lib/jmx_prometheus_javaagent-<version>.jar=<exporter_port>:/etc/tomcat/jmx-exporter.yml"

Exporter rules in /etc/tomcat/jmx-exporter.yml:

    rules:
      - pattern: "Catalina<type=GlobalRequestProcessor, name=\"http-nio-8080\">"
        name: tomcat_global_request_processor
        labels: { connector: "http-nio-8080" }
        help: Tomcat global request processor
      - pattern: "Catalina<type=ThreadPool, name=\"http-nio-8080\">"
        name: tomcat_threads
        labels: { connector: "http-nio-8080" }
        help: Tomcat thread pool
      - pattern: "java.lang<type=Memory>"
        name: jvm_memory
        help: JVM memory pools
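Once the agent is attached, it can be useful to confirm that the exporter really exposes the metric names the rest of the setup relies on. The following is a minimal sketch, not part of the exporter itself: it parses Prometheus text-format output and reports which expected names are absent. The sample text, the function names `metric_names` and `missing_metrics`, and the simplified line regex (it assumes label values contain no `}`) are all illustrative assumptions.

```python
import re

# Hypothetical sample of what the exporter's /metrics endpoint might return.
SAMPLE = """\
# HELP tomcat_threads Tomcat thread pool
# TYPE tomcat_threads untyped
tomcat_threads{connector="http-nio-8080"} 12.0
jvm_memory_used_bytes{area="heap"} 5.24288e+08
jvm_memory_max_bytes{area="heap"} 1.073741824e+09
"""

# metric name, optional {labels}, then the sample value (simplified grammar).
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{[^}]*\})?\s+(\S+)')


def metric_names(text):
    """Return the set of metric names found in Prometheus text-format output."""
    names = set()
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if match:
            names.add(match.group(1))
    return names


def missing_metrics(text, expected):
    """Return the expected metric names that do not appear in the output, sorted."""
    return sorted(set(expected) - metric_names(text))


if __name__ == "__main__":
    wanted = ["tomcat_threads", "jvm_memory_used_bytes", "tomcat_global_request_processor"]
    print(missing_metrics(SAMPLE, wanted))
```

In practice the text would come from the exporter's HTTP endpoint on `<exporter_port>` rather than from an inline sample; any name reported as missing points to a rule whose pattern or name needs adjusting.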
Prometheus scrape configuration:

    scrape_configs:
      - job_name: 'tomcat'
        static_configs:
          - targets: ['<IP>:<exporter_port>']
      - job_name: 'node'
        static_configs:
          - targets: ['<IP>:9100']
Alert rules. The metric names in the expressions must match what the exporter actually exposes, so check its /metrics output before relying on them:

    groups:
      - name: tomcat_alerts
        rules:
          - alert: HighHeapMemoryUsage
            expr: jvm_memory_used_bytes{area="heap"} / jvm_memory_max_bytes{area="heap"} > 0.90
            for: 5m
            labels: { severity: critical }
            annotations:
              summary: "Tomcat heap memory usage is too high"
              description: "Heap usage is {{ $value | humanizePercentage }}"
          - alert: HighErrorRate
            expr: |
              sum(rate(tomcat_global_request_processor_error_count_total{connector=~".+"}[5m]))
              / sum(rate(tomcat_global_request_processor_request_count_total{connector=~".+"}[5m])) > 0.05
            for: 2m
            labels: { severity: critical }
            annotations:
              summary: "Tomcat request error rate is too high"
              description: "5-minute error rate is {{ $value | humanizePercentage }}"
          - alert: ThreadPoolBusyHigh
            expr: tomcat_threads_current_threads_busy / tomcat_threads_max_threads > 0.90
            for: 3m
            labels: { severity: warning }
            annotations:
              summary: "Tomcat thread pool utilization is too high"
              description: "Connector {{ $labels.connector }} busy ratio is {{ $value | humanizePercentage }}"
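To make the arithmetic behind these three rules concrete, here is a small sketch that mirrors the expressions in plain Python. The function names and sample numbers are illustrative assumptions. Prometheus evaluates the real rules, and this sketch does not model the `for:` duration or extrapolation inside `rate()`; it only shows the ratios and thresholds (0.90, 0.05, 0.90) taken from the rules above.

```python
def counter_rate(prev, curr, seconds):
    """Per-second increase of a counter between two samples.

    A drop in the counter is treated as a reset (for example a Tomcat
    restart), so the increase is counted from zero, which is a simplified
    version of how rate() handles resets.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    increase = curr - prev if curr >= prev else curr
    return increase / seconds


def error_ratio(req_prev, req_curr, err_prev, err_curr, seconds):
    """Error rate divided by request rate over the same window."""
    req_rate = counter_rate(req_prev, req_curr, seconds)
    if req_rate == 0:
        return 0.0  # no traffic, so no meaningful error ratio
    return counter_rate(err_prev, err_curr, seconds) / req_rate


def firing_alerts(heap_used, heap_max, err_ratio, busy_threads, max_threads):
    """Return the names of the alerts whose threshold is currently exceeded."""
    alerts = []
    # The JVM may report a non-positive max when it is undefined; skip then.
    if heap_max > 0 and heap_used / heap_max > 0.90:
        alerts.append("HighHeapMemoryUsage")
    if err_ratio > 0.05:
        alerts.append("HighErrorRate")
    if max_threads > 0 and busy_threads / max_threads > 0.90:
        alerts.append("ThreadPoolBusyHigh")
    return alerts


if __name__ == "__main__":
    # 3000 requests and 180 errors over a 300-second window.
    ratio = error_ratio(1000, 4000, 10, 190, 300)
    print(round(ratio, 4))
    print(firing_alerts(950, 1000, ratio, 50, 200))
```

Working through numbers like these before deployment is one way to sanity-check that a threshold fires where intended, for instance that 180 errors in 3000 requests crosses the 5% line.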
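Common metric calculations:

    rate(tomcat_global_request_processor_error_count_total[5m]) / rate(tomcat_global_request_processor_request_count_total[5m])
    tomcat_threads_current_threads_busy / tomcat_threads_max_threads

Use rate() or irate() to compute rates and percentile latencies.
3 Log Monitoring and Alerting in Practice
Log collection and search: ELK with Kibana.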
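Log level and retention in logging.properties:

    1catalina.org.apache.juli.AsyncFileHandler.level=INFO
    1catalina.org.apache.juli.AsyncFileHandler.maxDays=30

Access log Valve in server.xml (quotes inside the XML attribute are written as &quot;):

    <Valve className="org.apache.catalina.valves.AccessLogValve" directory="logs" prefix="localhost_access_log" suffix=".txt" pattern="%h %l %u %t &quot;%r&quot; %s %b %D %T %I" />

Watch the logs in real time with tail -f /var/log/tomcat/catalina.out and tail -f /var/log/tomcat/localhost_access_log.*.
4 Availability and Basic Resource Monitoring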
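One rough way to estimate availability without extra tooling is to derive it from the access log written by the Valve configured in section 3. The sketch below assumes the field order of that pattern (`%h %l %u %t "%r" %s %b %D %T %I`) and uses `%T` (seconds) for slow-request detection. The sample lines, the 1-second slow threshold, and the function name `summarize` are illustrative assumptions, and "availability" here simply means the share of responses that were not 5xx.

```python
import re

# Matches lines written with: %h %l %u %t "%r" %s %b %D %T %I
ACCESS_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\S+) (?P<elapsed_d>\S+) '
    r'(?P<elapsed_s>\S+) (?P<thread>.+)$'
)

# Hypothetical sample lines; the last one is deliberately malformed.
SAMPLE_LINES = [
    '192.0.2.10 - - [10/Oct/2024:13:55:36 +0000] "GET /app/health HTTP/1.1" 200 512 35 0.035 http-nio-8080-exec-3',
    '192.0.2.11 - - [10/Oct/2024:13:55:37 +0000] "POST /app/order HTTP/1.1" 500 1024 1250 1.250 http-nio-8080-exec-4',
    '192.0.2.12 - - [10/Oct/2024:13:55:38 +0000] "GET /missing HTTP/1.1" 404 - 4 0.004 http-nio-8080-exec-5',
    'not an access log line',
]


def summarize(lines, slow_seconds=1.0):
    """Count parsed requests, 5xx responses, and slow requests."""
    total = server_errors = slow = 0
    for line in lines:
        match = ACCESS_RE.match(line.strip())
        if not match:
            continue  # ignore lines that do not follow the expected pattern
        total += 1
        if match.group("status").startswith("5"):
            server_errors += 1
        try:
            if float(match.group("elapsed_s")) >= slow_seconds:
                slow += 1
        except ValueError:
            pass  # elapsed time not numeric; skip the slow check
    availability = None if total == 0 else 1 - server_errors / total
    return {
        "total": total,
        "server_errors": server_errors,
        "slow": slow,
        "availability": availability,
    }


if __name__ == "__main__":
    print(summarize(SAMPLE_LINES))
```

A log-derived figure like this complements, rather than replaces, an external probe: it only sees requests that actually reached Tomcat.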
5 Key Alert Thresholds and Rollout Recommendations