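1. Prerequisites
Before setting up GitLab monitoring on a Linux system (such as CentOS or Ubuntu), make sure GitLab is installed and its services are running. On package (Omnibus) installations, check the status with gitlab-ctl status, or use systemctl status gitlab-runsvdir to inspect the systemd unit that supervises the GitLab services.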
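The status check can also be scripted. The following is a minimal sketch, assuming the runit-style lines that gitlab-ctl status prints (for example "run: nginx: (pid 1234) 100s; run: log: (pid 567) 100s"); the helper names down_services and gitlab_status_output are hypothetical and not part of GitLab.

```python
import subprocess


def down_services(status_output: str) -> list[str]:
    """Return the names of services whose status line does not start with 'run:'.

    Assumes one service per line in the form '<state>: <name>: <details>'.
    """
    down = []
    for line in status_output.splitlines():
        parts = line.split(":", 2)
        if len(parts) < 3:
            continue  # skip blank or unexpected lines
        state, name = parts[0].strip(), parts[1].strip()
        if state != "run":
            down.append(name)
    return down


def gitlab_status_output() -> str:
    """Run `gitlab-ctl status` and return its stdout (usually requires root)."""
    result = subprocess.run(
        ["gitlab-ctl", "status"], capture_output=True, text=True, check=False
    )
    return result.stdout
```

On the GitLab host, down_services(gitlab_status_output()) returns an empty list when every service reports "run".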
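2. Enable GitLab self-monitoring
GitLab has built-in support for exporting Prometheus metrics, which is turned on in its configuration file.
Edit the GitLab configuration file (/etc/gitlab/gitlab.rb) and add the following settings. Key names differ between GitLab versions (the Omnibus template documents the bundled Prometheus under prometheus['enable'] and prometheus['listen_address']), so confirm them against the gitlab.rb shipped with your release:
gitlab_rails['prometheus_enable'] = true
gitlab_rails['prometheus_port'] = 9090  # metrics port (default 9090, configurable)
gitlab_rails['prometheus_address'] = '0.0.0.0'  # listen on all IP addresses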
Apply the configuration and restart GitLab:
sudo gitlab-ctl reconfigure
sudo gitlab-ctl restart
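After the restart, the metrics endpoint on port 9090 can be checked from a script as well as from a browser. This is a minimal sketch, assuming the endpoint serves the standard Prometheus text exposition format; the function names are hypothetical, and the metric names to look for are whatever your GitLab version actually exports.

```python
import urllib.request


def metric_names(exposition_text: str) -> set[str]:
    """Collect the distinct metric names found in Prometheus text-format output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The metric name ends at the first '{' (labels) or space (value).
        name = line.split("{", 1)[0].split(" ", 1)[0]
        names.add(name)
    return names


def missing_metrics(exposition_text: str, expected: list[str]) -> list[str]:
    """Return the expected metric names that do not appear in the output."""
    return sorted(set(expected) - metric_names(exposition_text))


def fetch_metrics(url: str, timeout: float = 5.0) -> str:
    """Download the metrics page, e.g. http://<GitLab server IP>:9090/metrics."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

An empty list from missing_metrics(fetch_metrics(url), [...]) suggests the exporter is serving the metrics you expect.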
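Open http://<GitLab server IP>:9090/metrics in a browser. If GitLab-related metrics appear (for example gitlab_rails_database_queries_seconds or sidekiq_queue_size), self-monitoring is enabled.
3. Install and configure Prometheus to collect GitLab metrics
Prometheus is an open-source monitoring system with a built-in time-series database, used here to collect and store GitLab metrics. The steps are as follows: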
Download a Prometheus release (for example prometheus-2.40.0.linux-amd64.tar.gz), extract it, and enter the directory:
wget https://github.com/prometheus/prometheus/releases/download/v2.40.0/prometheus-2.40.0.linux-amd64.tar.gz
tar xvfz prometheus-2.40.0.linux-amd64.tar.gz
cd prometheus-2.40.0.linux-amd64
Start Prometheus:
./prometheus --config.file=prometheus.yml
Edit prometheus.yml and add a scrape_configs entry for GitLab:
scrape_configs:
  - job_name: 'gitlab'
    static_configs:
      - targets: ['<GitLab server IP>:9090']  # replace with the GitLab server IP
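Once Prometheus has scraped the target at least once, collection can be confirmed through the Prometheus HTTP API (the /api/v1/query endpoint) instead of the web UI. The sketch below assumes the status metric is named gitlab_rails_up, as this guide uses; the helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def query_url(prometheus_base: str, expr: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    query = urllib.parse.urlencode({"query": expr})
    return f"{prometheus_base.rstrip('/')}/api/v1/query?{query}"


def instant_values(payload: dict) -> dict[str, float]:
    """Map each returned series' 'instance' label to its sample value."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    values = {}
    for series in payload["data"]["result"]:
        instance = series["metric"].get("instance", "")
        _timestamp, raw = series["value"]  # the value arrives as a string
        values[instance] = float(raw)
    return values


def run_query(prometheus_base: str, expr: str, timeout: float = 5.0) -> dict:
    """Execute the query against a live Prometheus server."""
    with urllib.request.urlopen(query_url(prometheus_base, expr), timeout=timeout) as resp:
        return json.load(resp)
```

For example, instant_values(run_query("http://<Prometheus server IP>:9090", "gitlab_rails_up")) should map the GitLab target to 1.0 when scraping works.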
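If GitLab has authentication enabled, add a basic_auth or bearer_token_file setting to the scrape job.
Open the Prometheus web UI (http://<Prometheus server IP>:9090) and enter gitlab_rails_up (the GitLab status metric) on the "Graph" page. A returned value of 1 means collection is working.
4. Configure Grafana for visual monitoring
Grafana is an open-source visualization tool that integrates with Prometheus to build monitoring dashboards: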
Download a Grafana release (for example grafana-10.0.0.linux-amd64.tar.gz), extract it, and enter the directory:
wget https://dl.grafana.com/oss/release/grafana-10.0.0.linux-amd64.tar.gz
tar -zxvf grafana-10.0.0.linux-amd64.tar.gz
cd grafana-10.0.0
Start Grafana:
./bin/grafana-server
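Once Grafana is running, a Prometheus data source can be registered either in the web UI or through Grafana's HTTP API (POST /api/datasources). The sketch below only builds the request; it assumes basic authentication with an admin account, and the helper names and credentials are placeholders rather than anything prescribed by Grafana.

```python
import base64
import json
import urllib.request


def prometheus_datasource(name: str, prometheus_url: str) -> dict:
    """Describe a Prometheus data source that Grafana queries server-side."""
    return {
        "name": name,
        "type": "prometheus",
        "url": prometheus_url,
        "access": "proxy",
        "isDefault": True,
    }


def datasource_request(
    grafana_base: str, user: str, password: str, payload: dict
) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates the data source."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        f"{grafana_base.rstrip('/')}/api/datasources",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen sends the request; Grafana responds with an error status if a data source with the same name already exists.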
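Grafana can also be managed with systemd (see the official Grafana documentation).
Open the Grafana web UI (http://<Grafana server IP>:3000) and log in (default account admin, password admin; the password must be changed at first login).
Add a Prometheus data source, set its URL to the Prometheus address (http://<Prometheus server IP>:9090), click "Save & test", and confirm that the connection succeeds.
Import a GitLab dashboard by its ID (for example 4385, which covers CPU, memory, job, and other metrics): click "Load", select the Prometheus data source, and click "Import" to view the panels.
5. Set up Alertmanager alerting
Alertmanager handles the alerts that Prometheus fires from its alerting rules and notifies administrators by email, Slack, and other channels: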
Download an Alertmanager release (for example alertmanager-0.26.0.linux-amd64.tar.gz), extract it, and enter the directory:
wget https://github.com/prometheus/alertmanager/releases/download/v0.26.0/alertmanager-0.26.0.linux-amd64.tar.gz
tar xvfz alertmanager-0.26.0.linux-amd64.tar.gz
cd alertmanager-0.26.0.linux-amd64
Start Alertmanager:
./alertmanager --config.file=alertmanager.yml
Edit alertmanager.yml and configure how alerts are delivered (email in this example):
global:
  smtp_smarthost: 'smtp.example.com:587'
  smtp_from: 'alertmanager@example.com'
  smtp_auth_username: 'username'
  smtp_auth_password: 'password'
route:
  receiver: 'email'
receivers:
  - name: 'email'
    email_configs:
      - to: 'admin@example.com'
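Email delivery can be exercised without waiting for a real rule to fire by posting a test alert straight to Alertmanager's v2 API (POST /api/v2/alerts). This is a minimal sketch, assuming Alertmanager listens on its default port 9093; the alert name, label values, and helper names are made up for illustration.

```python
import json
import urllib.request


def sample_alert(alertname: str = "GitLabTestAlert", severity: str = "warning") -> list[dict]:
    """Build a one-element alert list in the shape the v2 API accepts."""
    return [
        {
            "labels": {"alertname": alertname, "severity": severity},
            "annotations": {"summary": "Test alert to verify email delivery"},
        }
    ]


def alert_request(alertmanager_base: str, alerts: list[dict]) -> urllib.request.Request:
    """Build (but do not send) the POST request that submits the alerts."""
    return urllib.request.Request(
        f"{alertmanager_base.rstrip('/')}/api/v2/alerts",
        data=json.dumps(alerts).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending alert_request("http://<Alertmanager server IP>:9093", sample_alert()) with urllib.request.urlopen should route the alert to the 'email' receiver configured above.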
Create an alerts.yml file and add GitLab alerting rules (for example CPU usage above 80%, or memory running low):
groups:
  - name: gitlab_alerts
    rules:
      - alert: GitLabHighCPU
        # rate() turns the cumulative CPU-seconds counter into CPU usage per second;
        # "by (instance)" keeps the label used in the annotations below.
        expr: avg by (instance) (rate(gitlab_rails_process_cpu_seconds_total[5m])) > 0.8
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "GitLab CPU usage is high (instance {{ $labels.instance }})"
          description: "GitLab CPU usage has exceeded 80% for 5 minutes."
      - alert: GitLabHighMemory
        expr: sum by (instance) (gitlab_rails_process_resident_memory_bytes) / 1024 / 1024 > 2048  # memory above 2 GB
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "GitLab memory usage is high (instance {{ $labels.instance }})"
          description: "GitLab memory usage has exceeded 2GB for 5 minutes."
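A metric ending in _cpu_seconds_total is a counter of accumulated CPU seconds, so CPU utilization comes from the change between two samples rather than from the raw value. The arithmetic behind both thresholds can be sketched as follows; this is a simplified illustration with hypothetical function names, not how Prometheus itself evaluates expressions.

```python
def cpu_utilization(earlier: tuple[float, float], later: tuple[float, float]) -> float:
    """Average CPU cores used between two (unix_seconds, cpu_seconds_total) samples."""
    (t0, c0), (t1, c1) = earlier, later
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    if c1 < c0:
        c0 = 0.0  # the counter restarted, so count only what accrued since the reset
    return (c1 - c0) / (t1 - t0)


def memory_mib(resident_bytes: float) -> float:
    """Convert bytes to MiB, matching the '/ 1024 / 1024' in the memory rule."""
    return resident_bytes / 1024 / 1024


# 270 CPU-seconds consumed over a 300-second window is 0.9 cores: above 0.8.
high_cpu = cpu_utilization((0.0, 100.0), (300.0, 370.0)) > 0.8
# 3 GiB resident is 3072 MiB: above the 2048 MiB threshold.
high_memory = memory_mib(3 * 1024**3) > 2048
```

Read this way, the 0.8 threshold means "0.8 of one CPU core on average", which is worth keeping in mind when tuning it for multi-core servers.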
Load the alerting rules in Prometheus's prometheus.yml:
rule_files:
  - "/path/to/alerts.yml"
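After Prometheus has been restarted with the rule file in place, its HTTP API (GET /api/v1/rules) reports which rules were loaded and their current state. The sketch below parses that response; the helper name is hypothetical and the sample payload in the usage note is illustrative only.

```python
import json
import urllib.request


def alert_states(payload: dict) -> dict[str, str]:
    """Map each alerting rule name to its state (inactive, pending, or firing)."""
    states = {}
    for group in payload["data"]["groups"]:
        for rule in group["rules"]:
            if rule.get("type") == "alerting":  # ignore recording rules
                states[rule["name"]] = rule.get("state", "unknown")
    return states


def fetch_rules(prometheus_base: str, timeout: float = 5.0) -> dict:
    """Download the rule listing from a live Prometheus server."""
    url = f"{prometheus_base.rstrip('/')}/api/v1/rules"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

If alert_states(fetch_rules("http://<Prometheus server IP>:9090")) does not list GitLabHighCPU and GitLabHighMemory, the rule_files path is probably wrong or the file failed to parse.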
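Restart Prometheus so that the rules take effect.
To test alerting, temporarily lower a threshold (for example change 0.8 to 0.1) to trigger an alert, then check whether the email notification arrives.
6. Optional: use GitLab's built-in monitoring panels
The GitLab Admin Area provides built-in monitoring panels that require no additional tools.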
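7. System-level auxiliary monitoring
In addition to the tools above, standard Linux utilities can help monitor the overall state of the server that hosts GitLab:
top (press P to sort by CPU, M to sort by memory).
vmstat 1 5 (refresh every second, 5 times in total).
free -h (display in human-readable format).
iostat -x 1 (show detailed disk statistics).
ss -tulnp (show listening ports).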
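For periodic checks from a cron job or script, the same kind of information can be gathered with the Python standard library. This is a minimal sketch, assuming a Linux kernel whose /proc/meminfo includes a MemAvailable field; the function names and the layout of the returned dictionary are hypothetical.

```python
import os
import shutil


def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo content into {field: value}; most values are in kB."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # not a 'Key: value' line
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def memory_used_fraction(info: dict[str, int]) -> float:
    """Fraction of RAM in use, based on MemAvailable rather than MemFree."""
    return 1.0 - info["MemAvailable"] / info["MemTotal"]


def snapshot(path: str = "/") -> dict[str, float]:
    """Collect load average, memory use, and disk use for the given mount point."""
    with open("/proc/meminfo", encoding="ascii") as handle:
        info = parse_meminfo(handle.read())
    load1, _load5, _load15 = os.getloadavg()
    disk = shutil.disk_usage(path)
    return {
        "load_1m": load1,
        "memory_used_fraction": memory_used_fraction(info),
        "disk_used_fraction": disk.used / disk.total,
    }
```

Calling snapshot() on the GitLab server gives a quick numeric summary that can be logged or compared against thresholds of your choosing.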