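# A Simple Walkthrough of Drawing Manhattan Plots with the R Package ggplot2
## Abstract
The Manhattan plot is the classic visualization for genome-wide association study (GWAS) results in genomics. This article shows how to draw one with the R package `ggplot2`, covering data preparation, a basic plot, advanced customization, and interpretation of the results. By the end, readers should be able to visualize GWAS results in R.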
---
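## 1. About the Manhattan plot
The Manhattan plot is named for its resemblance to the Manhattan skyline in New York. It is mainly used to:

- show the significance of each SNP in a GWAS as -log10(p-value)
- identify genomic regions that are significantly associated with a phenotype
- give an at-a-glance view of association signals across the whole genome

Typical features:

- X axis: chromosomal position
- Y axis: association significance (usually -log10 transformed)
- Threshold line: marks a significance level (e.g. 5×10⁻⁸)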
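
To make the y-axis transform concrete, the short base-R sketch below converts a few p-values to the -log10 scale and compares them with the 5×10⁻⁸ threshold. The p-values are made up for illustration. The threshold itself equals a Bonferroni correction of 0.05 over one million tests, which the last line checks.

```r
# Illustrative p-values (made up for this example)
p_values <- c(0.5, 1e-3, 1e-5, 5e-8, 1e-10)

# Smaller p-values become taller points on the plot
neg_log_p <- -log10(p_values)

# Genome-wide threshold on the same scale (about 7.3)
threshold <- -log10(5e-8)

# TRUE only for p-values strictly below 5e-8
is_significant <- p_values < 5e-8

print(data.frame(p = p_values,
                 neg_log10_p = round(neg_log_p, 2),
                 significant = is_significant))

# 0.05 spread over one million tests gives the same threshold
bonferroni <- 0.05 / 1e6
print(isTRUE(all.equal(bonferroni, 5e-8)))
```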
---
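## 2. Preparation
### 2.1 Installing the required R packages
```r
# qqman supplies the example data; ggrepel is used later for SNP labels
install.packages(c("ggplot2", "qqman", "dplyr", "ggrepel"))
library(ggplot2)
library(dplyr)
```
The examples use the GWAS results dataset that ships with the qqman package:
```r
library(qqman)      # attach qqman so its example dataset can be found
data(gwasResults)
head(gwasResults)
```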
The data should contain the following columns:

- `CHR`: chromosome number
- `BP`: base-pair position
- `P`: p-value
- `SNP`: SNP identifier (optional)
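
If you plan to plot your own results rather than the qqman example, a quick structural check can catch problems before plotting. The sketch below is one possible check, not part of any package: `check_gwas_columns` and `toy_gwas` are hypothetical names introduced here, and the toy values are made up.

```r
# Hypothetical helper: verify that a data frame has the columns used in this article
check_gwas_columns <- function(df) {
  required <- c("CHR", "BP", "P")
  missing_cols <- setdiff(required, names(df))
  if (length(missing_cols) > 0) {
    stop("Missing required columns: ", paste(missing_cols, collapse = ", "))
  }
  # p-values must lie in (0, 1] so that -log10(P) is finite and non-negative
  if (any(is.na(df$P)) || any(df$P <= 0 | df$P > 1)) {
    stop("P must contain values in (0, 1]")
  }
  invisible(TRUE)
}

# Made-up toy data with the expected layout
toy_gwas <- data.frame(
  SNP = paste0("rs", 1:6),
  CHR = c(1, 1, 1, 2, 2, 2),
  BP  = c(100, 200, 300, 100, 200, 300),
  P   = c(0.2, 1e-9, 0.04, 0.7, 3e-6, 0.5)
)

check_gwas_columns(toy_gwas)
```

The `SNP` column is left out of the required set because it is only needed when labeling points.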
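A basic plot maps position to the x axis and -log10(P) to the y axis, alternating two colors across chromosomes. Because `BP` is a within-chromosome position, the chromosomes overlap on the x axis in this first version:
```r
# Basic version: position on x, -log10(P) on y, color by chromosome
ggplot(gwasResults, aes(x = BP, y = -log10(P), color = factor(CHR))) +
  geom_point(alpha = 0.6) +
  # Two alternating colors, repeated to cover all 22 chromosomes
  scale_color_manual(values = rep(c("skyblue", "orange"), 11)) +
  labs(x = "Chromosomal Position", y = "-log10(p-value)") +
  theme_minimal()
```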
| Parameter | Purpose |
|---|---|
| `alpha` | Point transparency (0-1) |
| `size` | Point size |
| `scale_color_manual` | Alternating chromosome colors |
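Significance thresholds can be added as horizontal lines. Here the red line marks the genome-wide level (5e-8) and the blue line a suggestive level (1e-5):
```r
ggplot(gwasResults) +
  # alpha is a fixed setting, so it goes outside aes()
  geom_point(aes(x = BP, y = -log10(P)), alpha = 0.6) +
  geom_hline(yintercept = -log10(5e-8), color = "red", linetype = "dashed") +
  geom_hline(yintercept = -log10(1e-5), color = "blue", linetype = "dashed")
```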
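To lay the chromosomes side by side, convert each position to a cumulative genome-wide coordinate and label the axis at each chromosome's center:
```r
# Shift each chromosome by the combined length (largest BP) of those before it
offsets <- gwasResults %>%
  group_by(CHR) %>%
  summarize(chr_len = max(as.numeric(BP))) %>%
  mutate(offset = cumsum(chr_len) - chr_len)
gwas_cum <- gwasResults %>%
  inner_join(offsets, by = "CHR") %>%
  mutate(BP_cum = BP + offset)
axis_df <- gwas_cum %>% group_by(CHR) %>% summarize(center = mean(BP_cum))
ggplot(gwas_cum, aes(x = BP_cum, y = -log10(P), color = factor(CHR))) +
  geom_point() +
  scale_x_continuous(labels = axis_df$CHR, breaks = axis_df$center)
```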
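
The offset arithmetic behind a cumulative x axis is easy to verify by hand on a tiny dataset. The base-R sketch below uses made-up positions and treats the largest observed position on each chromosome as that chromosome's length, which is an approximation assumed here rather than a value taken from a reference genome.

```r
# Toy data: three chromosomes with made-up positions
toy <- data.frame(
  CHR = c(1, 1, 2, 2, 3),
  BP  = c(10, 50, 5, 40, 20)
)

# Approximate each chromosome's length by its largest observed position
chr_len <- tapply(toy$BP, toy$CHR, max)

# Offset = total length of all preceding chromosomes (0 for the first one)
offset <- c(0, head(cumsum(as.vector(chr_len)), -1))
names(offset) <- names(chr_len)

# Cumulative position = within-chromosome position + chromosome offset
toy$BP_cum <- toy$BP + unname(offset[as.character(toy$CHR)])

print(toy)
```

With these numbers, chromosome 2 is shifted by 50 and chromosome 3 by 90, so positions never overlap between chromosomes.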
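Significant SNPs can be highlighted and labeled:
```r
# SNPs below the genome-wide significance threshold
significant_snps <- gwasResults %>% filter(P < 5e-8)
ggplot(gwasResults, aes(x = BP, y = -log10(P))) +
  geom_point(aes(color = factor(CHR))) +
  # Redraw significant SNPs in red and add non-overlapping labels
  geom_point(data = significant_snps, color = "red", size = 2) +
  ggrepel::geom_text_repel(data = significant_snps, aes(label = SNP), size = 3)
```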
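Putting the pieces together, a complete example:
```r
library(ggplot2)
library(dplyr)
library(ggrepel)

# Start from a fresh copy of the qqman example data
gwasResults <- qqman::gwasResults

# Data processing: shift each chromosome by the combined length
# (largest BP) of the chromosomes before it
chr_offsets <- gwasResults %>%
  group_by(CHR) %>%
  summarize(chr_len = max(as.numeric(BP))) %>%
  arrange(CHR) %>%
  mutate(offset = cumsum(chr_len) - chr_len) %>%
  select(CHR, offset)

gwas_plot_df <- gwasResults %>%
  inner_join(chr_offsets, by = "CHR") %>%
  mutate(BP_cum = BP + offset)

# Chromosome center positions for the x-axis labels
axis_df <- gwas_plot_df %>%
  group_by(CHR) %>%
  summarize(center = mean(BP_cum))

# Genome-wide significant SNPs (filtered after BP_cum exists)
significant_snps <- gwas_plot_df %>% filter(P < 5e-8)

# Plot
manhattan_plot <- ggplot(gwas_plot_df, aes(x = BP_cum, y = -log10(P),
                         color = factor(CHR %% 2))) +
  geom_point(alpha = 0.75) +
  geom_hline(yintercept = -log10(5e-8), color = "red", linetype = "dashed") +
  scale_x_continuous(labels = axis_df$CHR, breaks = axis_df$center) +
  scale_y_continuous(expand = c(0, 0.1)) +
  scale_color_manual(values = c("skyblue", "orange")) +
  labs(
    x = "Chromosome",
    y = "-log10(p-value)",
    title = "GWAS Manhattan Plot"
  ) +
  theme_bw() +
  theme(
    legend.position = "none",
    panel.grid.major.x = element_blank(),
    panel.grid.minor.x = element_blank()
  )

# Mark significant SNPs, if there are any
if (nrow(significant_snps) > 0) {
  manhattan_plot <- manhattan_plot +
    geom_point(data = significant_snps, color = "red") +
    ggrepel::geom_text_repel(
      data = significant_snps,
      aes(label = SNP),
      size = 3,
      box.padding = 0.5
    )
}

print(manhattan_plot)
```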
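Common adjustments:

- Overlapping points: adjust the `alpha` parameter or use `geom_hex()`.
- Crowded x-axis labels: rotate them with `theme(axis.text.x = element_text(angle = 45, hjust = 1))`.
- Very large datasets: process the data with `data.table`, or subsample before plotting.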
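
One way to subsample before plotting is to keep every SNP that could matter visually and randomly thin the rest. The base-R sketch below illustrates the idea under assumed settings: the helper name `thin_gwas`, the 0.01 cutoff, and the 10% keep fraction are arbitrary choices for this example, and the data are simulated.

```r
# Hypothetical helper: keep all SNPs with P below `p_cutoff`,
# plus a random fraction `keep_frac` of the remaining SNPs
thin_gwas <- function(df, p_cutoff = 0.01, keep_frac = 0.1) {
  strong <- df$P < p_cutoff
  weak_idx <- which(!strong)
  n_keep <- floor(length(weak_idx) * keep_frac)
  kept_weak <- weak_idx[sample.int(length(weak_idx), n_keep)]
  # Preserve the original row order
  df[sort(c(which(strong), kept_weak)), , drop = FALSE]
}

# Simulated null p-values on two chromosomes (not real GWAS data)
set.seed(42)
sim <- data.frame(
  CHR = rep(1:2, each = 5000),
  BP  = rep(seq_len(5000), times = 2),
  P   = runif(10000)
)

thinned <- thin_gwas(sim)
print(c(before = nrow(sim), after = nrow(thinned)))
```

Thinning only changes the density of the low-significance band near the bottom of the plot; every point below the cutoff is retained, so peaks keep their shape.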

Note: the code in this article was tested with R 4.2.0 and ggplot2 3.4.0. Adjust the parameters to suit your own data.