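# A Simple Walkthrough: Drawing Manhattan Plots with the R Package ggplot2
## Abstract
The Manhattan plot is a classic visualization for genome-wide association study (GWAS) results in genomics. This article shows how to draw one with the R package `ggplot2`, covering data preparation, basic plotting, advanced customization, and interpretation of the result. By the end, readers should have the core skills needed to visualize GWAS results in R.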
---
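## 1. Introduction to Manhattan Plots
The Manhattan plot is named for its resemblance to the Manhattan skyline in New York. It is mainly used to:

- Show the significance of each SNP in a GWAS as -log10(p-value)
- Identify genomic regions that are significantly associated with a phenotype
- Present association signals across the whole genome at a glance

Typical features:

- X-axis: chromosomal position
- Y-axis: association significance (usually -log10 transformed)
- Threshold line: marks a significance level (for example 5×10⁻⁸)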
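
To make the y-axis transform concrete, here is a minimal sketch that converts a few p-values with `-log10()` and computes the height of the 5×10⁻⁸ threshold line. The p-values are made up for illustration and do not come from any dataset.

```r
# Illustrative p-values (made up for this example)
p_values <- c(0.05, 1e-5, 5e-8, 1e-10)

# Smaller p-values map to taller points on the plot
neg_log_p <- -log10(p_values)

# Height of the genome-wide significance line (about 7.3)
threshold <- -log10(5e-8)

# Which of the example p-values reach the threshold?
significant <- p_values <= 5e-8
print(data.frame(p_values, neg_log_p, significant))
```

A point sits on or above the threshold line exactly when its p-value is at or below 5×10⁻⁸, which is why the transformed axis makes the strongest signals the tallest.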
---
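## 2. Preparation
### 2.1 Install the required R packages
```r
# ggrepel is included because later examples use it to label SNPs
install.packages(c("ggplot2", "qqman", "dplyr", "ggrepel"))
library(ggplot2)
library(dplyr)
```

Use the GWAS results that ship with the `qqman` package:

```r
# qqman is not attached, so name the package explicitly
data(gwasResults, package = "qqman")
head(gwasResults)
```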

The data should contain the following columns:

- `CHR`: chromosome number
- `BP`: base-pair position
- `P`: p-value
- `SNP`: SNP identifier (optional)
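
Before plotting your own data, it can help to confirm that it matches this layout. The sketch below uses a hypothetical helper, `check_gwas_columns()`, which is not part of `ggplot2` or `qqman`, and a made-up three-row data frame.

```r
# Hypothetical helper: stop early if a required column is missing or P is invalid
check_gwas_columns <- function(df, required = c("CHR", "BP", "P")) {
  missing_cols <- setdiff(required, names(df))
  if (length(missing_cols) > 0) {
    stop("Missing required columns: ", paste(missing_cols, collapse = ", "))
  }
  # p-values must lie in (0, 1] so that -log10(P) is finite and non-negative
  if (any(is.na(df$P)) || any(df$P <= 0 | df$P > 1)) {
    stop("Column P must contain values in (0, 1] with no missing entries")
  }
  invisible(TRUE)
}

# Small made-up data frame in the expected layout
toy_gwas <- data.frame(
  SNP = c("rs1", "rs2", "rs3"),
  CHR = c(1, 1, 2),
  BP  = c(100, 200, 150),
  P   = c(0.2, 1e-6, 0.03)
)
check_gwas_columns(toy_gwas)
```

The check on `P` matters because a p-value of exactly 0 would turn into `Inf` after the -log10 transform and distort the y-axis.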
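
A basic plot maps `BP` to the x-axis and `-log10(P)` to the y-axis, colored by chromosome. Because `BP` restarts on every chromosome, the chromosomes overlap on the x-axis in this first version:

```r
ggplot(gwasResults, aes(x = BP, y = -log10(P), color = factor(CHR))) +
  geom_point(alpha = 0.6) +
  # Alternate two colors across the chromosomes present in the data
  scale_color_manual(values = rep(c("skyblue", "orange"),
                                  length.out = n_distinct(gwasResults$CHR))) +
  labs(x = "Chromosomal Position", y = "-log10(p-value)") +
  theme_minimal()
```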

| Parameter | Purpose |
|---|---|
| `alpha` | Point transparency (0 to 1) |
| `size` | Point size |
| `scale_color_manual` | Alternates colors between chromosomes |
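
Add significance threshold lines with `geom_hline()`:

```r
ggplot(gwasResults) +
  # alpha is a fixed setting, so it belongs outside aes()
  geom_point(aes(x = BP, y = -log10(P)), alpha = 0.6) +
  # Genome-wide significance threshold (5e-8)
  geom_hline(yintercept = -log10(5e-8), color = "red", linetype = "dashed") +
  # Suggestive threshold (1e-5)
  geom_hline(yintercept = -log10(1e-5), color = "blue", linetype = "dashed")
```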
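
To lay the chromosomes side by side, convert each position to a cumulative coordinate by adding the total length of all preceding chromosomes, then label the axis at the center of each chromosome:

```r
# Offset each chromosome by the total length of the chromosomes before it
chr_offsets <- gwasResults %>%
  group_by(CHR) %>%
  summarize(chr_len = max(as.numeric(BP))) %>%
  mutate(offset = cumsum(chr_len) - chr_len)
gwasResults <- gwasResults %>%
  left_join(chr_offsets, by = "CHR") %>%
  mutate(BP_cum = BP + offset)
axis_df <- gwasResults %>% group_by(CHR) %>% summarize(center = mean(BP_cum))
ggplot(gwasResults, aes(x = BP_cum, y = -log10(P), color = factor(CHR))) +
  geom_point() +
  scale_x_continuous(labels = axis_df$CHR, breaks = axis_df$center)
```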
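
The arithmetic behind the cumulative coordinate is easier to follow on a tiny example. This base-R sketch uses a made-up table with three chromosomes and approximates each chromosome's length by its largest observed position, which is an assumption of this sketch and not a true chromosome length.

```r
# Made-up example: three chromosomes with positions restarting at each one
toy <- data.frame(
  CHR = c(1, 1, 1, 2, 2, 3, 3),
  BP  = c(10, 50, 100, 20, 80, 5, 60)
)

# Length of each chromosome, approximated by its largest observed position
chr_len <- sapply(split(toy$BP, toy$CHR), max)

# Offset = total length of all preceding chromosomes (0, 100, 180 here)
offset <- cumsum(chr_len) - chr_len

# Cumulative position = within-chromosome position + chromosome offset
toy$BP_cum <- unname(toy$BP + offset[as.character(toy$CHR)])
print(toy)
# BP_cum is 10, 50, 100, 120, 180, 185, 240
```

Positions now increase monotonically across chromosomes, so each chromosome occupies its own stretch of the x-axis.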
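
Highlight and label the SNPs that pass the genome-wide threshold:

```r
# SNPs below the genome-wide significance threshold
significant_snps <- gwasResults %>% filter(P < 5e-8)
ggplot(gwasResults, aes(x = BP, y = -log10(P))) +
  geom_point(aes(color = factor(CHR))) +
  # Overlay the significant SNPs in red and label them without overlap
  geom_point(data = significant_snps, color = "red", size = 2) +
  ggrepel::geom_text_repel(data = significant_snps, aes(label = SNP), size = 3)
```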

Complete example:

```r
library(ggplot2)
library(dplyr)
library(ggrepel)

# Start from a fresh copy of the example data
data(gwasResults, package = "qqman")

# Data processing: shift each chromosome by the length of those before it
chr_offsets <- gwasResults %>%
  group_by(CHR) %>%
  summarize(chr_len = max(as.numeric(BP))) %>%
  mutate(offset = cumsum(chr_len) - chr_len)

gwasResults <- gwasResults %>%
  left_join(chr_offsets, by = "CHR") %>%
  mutate(BP_cum = BP + offset)

# Center of each chromosome, used for the x-axis labels
axis_df <- gwasResults %>%
  group_by(CHR) %>%
  summarize(center = mean(BP_cum))

# Significant SNPs, selected after BP_cum exists so they can be overlaid
significant_snps <- gwasResults %>% filter(P < 5e-8)

# Plot
manhattan_plot <- ggplot(gwasResults, aes(x = BP_cum, y = -log10(P),
                                          color = factor(CHR %% 2))) +
  geom_point(alpha = 0.75) +
  geom_hline(yintercept = -log10(5e-8), color = "red", linetype = "dashed") +
  scale_x_continuous(labels = axis_df$CHR, breaks = axis_df$center) +
  scale_y_continuous(expand = c(0, 0.1)) +
  scale_color_manual(values = c("skyblue", "orange")) +
  labs(
    x = "Chromosome",
    y = "-log10(p-value)",
    title = "GWAS Manhattan Plot"
  ) +
  theme_bw() +
  theme(
    legend.position = "none",
    panel.grid.major.x = element_blank(),
    panel.grid.minor.x = element_blank()
  )

# Mark significant SNPs, if there are any
if (nrow(significant_snps) > 0) {
  manhattan_plot <- manhattan_plot +
    geom_point(data = significant_snps, color = "red") +
    ggrepel::geom_text_repel(
      data = significant_snps,
      aes(label = SNP),
      size = 3,
      box.padding = 0.5
    )
}

print(manhattan_plot)
```
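
Practical tips:

- To reduce overplotting, lower the `alpha` parameter or use `geom_hex()`.
- To rotate the x-axis labels, add `theme(axis.text.x = element_text(angle = 45, hjust = 1))`.
- For large datasets, process the data with `data.table` or subsample it before plotting.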
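
One possible way to subsample before plotting is to keep every SNP with a small p-value and only a random fraction of the rest, since the dense band of weak signals at the bottom of the plot carries little information. The sketch below is base R on simulated data. The cutoff `p_cutoff` and the fraction `keep_fraction` are assumptions chosen for illustration, not recommended values.

```r
set.seed(42)  # reproducible sampling

# Simulated stand-in for a large GWAS table (random values, not real results)
n_snps <- 100000
big_gwas <- data.frame(
  CHR = sample(1:22, n_snps, replace = TRUE),
  BP  = sample.int(1e6, n_snps, replace = TRUE),
  P   = runif(n_snps)
)

# Assumed settings: keep every SNP with P below the cutoff,
# plus a 10% random sample of the remaining SNPs
p_cutoff <- 1e-3
keep_fraction <- 0.1

is_strong <- big_gwas$P < p_cutoff
weak_rows <- which(!is_strong)
sampled_weak <- sample(weak_rows, size = floor(length(weak_rows) * keep_fraction))

# Combine both sets and restore the original row order
thinned_gwas <- big_gwas[sort(c(which(is_strong), sampled_weak)), ]
nrow(thinned_gwas)
```

The thinned table can be passed to the plotting code in place of the full data. The strongest signals are all retained, but the density of the low band no longer reflects the true number of SNPs.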

Note: the code in this article was tested with R 4.2.0 and ggplot2 3.4.0. Adjust the parameters to suit your own data.