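# How to Draw a Butterfly Chart with ggplot2

## I. What Is a Butterfly Chart

A butterfly chart is a data visualization named for its left-right symmetric shape, which resembles a butterfly's wings: two sets of bars share a central axis and extend in opposite directions. It is typically used to:

1. Compare two related groups of categorical data (e.g., male vs. female share, A/B test results)
2. Show how the same metric changes between different points in time
3. Compare data distributions under two different conditions

Typical use cases include:

- Gender-by-age distributions in demographics
- User preference comparisons across product features
- Metric comparisons between experimental and control groups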
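
## II. Preparation

### 1. Install the required tools

```r
# Install tidyverse if it is not installed yet
install.packages("tidyverse")
# Load the required libraries
library(ggplot2)
library(dplyr)
library(tidyr)
library(scales)  # for percentage formatting
```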
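
We use simulated data on the gender-by-age distribution of e-commerce users. Male proportions are stored as negative values so that their bars extend to the left of zero:

```r
set.seed(123)
butterfly_data <- data.frame(
  age_group = rep(c("18-24", "25-34", "35-44", "45-54", "55-64", "65+"), 2),
  gender = rep(c("Male", "Female"), each = 6),
  proportion = c(
    abs(rnorm(6, 0.3, 0.1)),  # male proportions
    # female proportions (assumption: this line is missing from the source,
    # so the male parameters are reused here)
    abs(rnorm(6, 0.3, 0.1))
  ),
  stringsAsFactors = FALSE
) %>%
  mutate(
    proportion = ifelse(gender == "Male", -proportion, proportion),  # male bars point left
    label_pos = proportion * 1.1  # label position, slightly beyond the bar end
  )
```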
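
Real data usually arrives as raw counts rather than as ready-made signed proportions. The following is a minimal base-R sketch of that conversion, assuming a hypothetical wide table `counts` with one column of counts per gender. The table name, the column layout, and the numbers are made up for illustration and are not part of the article's dataset.

```r
# Hypothetical raw counts per age group (illustrative numbers only)
counts <- data.frame(
  age_group = c("18-24", "25-34", "35-44"),
  Male = c(120, 200, 80),
  Female = c(150, 180, 70),
  stringsAsFactors = FALSE
)

# Reshape to long format: one row per age group and gender
long <- data.frame(
  age_group = rep(counts$age_group, 2),
  gender = rep(c("Male", "Female"), each = nrow(counts)),
  n = c(counts$Male, counts$Female),
  stringsAsFactors = FALSE
)

# Share of each age group within its gender
long$proportion <- long$n / ave(long$n, long$gender, FUN = sum)

# Negate one group so its bars extend to the left of zero
long$proportion <- ifelse(long$gender == "Male", -long$proportion, long$proportion)

print(long)
```

The resulting `long` table has the same shape as `butterfly_data` (minus `label_pos`), so it could be passed to the same plotting code.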

The basic butterfly chart is a horizontal bar chart of these signed values:

```r
ggplot(butterfly_data,
       aes(x = age_group, y = proportion, fill = gender)) +
  geom_bar(stat = "identity", width = 0.7) +
  coord_flip() +  # draw the bars horizontally
  scale_y_continuous(labels = function(x) paste0(abs(x) * 100, "%")) +
  labs(title = "User age-gender distribution",
       x = "Age group",
       y = "Percentage") +
  theme_minimal()
```
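
- `coord_flip()`: draws the bars horizontally
- `scale_y_continuous()`: customizes the y-axis labels so that they show absolute values
- `width`: controls the bar width (between 0 and 1)

Emphasizing the zero line and forcing a symmetric axis range makes the two sides directly comparable:

```r
ggplot(butterfly_data) +
  geom_bar(aes(x = age_group, y = proportion, fill = gender),
           stat = "identity", width = 0.7) +
  # emphasize the zero line (`linewidth` replaces `size` for lines since ggplot2 3.4.0)
  geom_hline(yintercept = 0, color = "black", linewidth = 0.5) +
  coord_flip() +
  scale_y_continuous(
    breaks = seq(-0.5, 0.5, 0.1),
    labels = function(x) paste0(abs(x) * 100, "%"),
    limits = c(-0.5, 0.5)  # force a symmetric range
  )
```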
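
The hard-coded `limits = c(-0.5, 0.5)` only works while every value stays inside that range, because values outside the scale limits are dropped from the plot. One possible alternative, sketched here with a hypothetical helper `symmetric_limits()` (not part of ggplot2 or scales), derives a symmetric range from the data and rounds it outward to the next break step.

```r
# Hypothetical helper: symmetric axis limits that cover all values in x,
# rounded outward to a multiple of `step`
symmetric_limits <- function(x, step = 0.1) {
  bound <- ceiling(max(abs(x), na.rm = TRUE) / step) * step
  c(-bound, bound)
}

# Example with made-up signed proportions
values <- c(-0.23, -0.47, 0.31, 0.12)
lims <- symmetric_limits(values)
print(lims)
```

For the article's data, this could be passed to the scale as `limits = symmetric_limits(butterfly_data$proportion)`.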

Data labels can be added with `geom_text()`, placed at `label_pos` just beyond the end of each bar:

```r
ggplot(butterfly_data) +
  geom_bar(aes(x = age_group, y = proportion, fill = gender),
           stat = "identity") +
  geom_text(aes(x = age_group, y = label_pos,
                label = paste0(abs(round(proportion * 100)), "%"),
                color = gender),
            size = 3.5) +
  scale_color_manual(values = c("Male" = "navy", "Female" = "darkred")) +
  coord_flip()
```
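
Because `label_pos` is computed as `proportion * 1.1`, the gap between a bar and its label grows with the length of the bar. If a constant gap is preferred, one option is to shift each value outward by a fixed amount in the direction of its sign. The sketch below uses made-up values and an assumed offset of 0.02; both are illustrative only.

```r
# Made-up signed proportions (negative = left side, positive = right side)
proportion <- c(-0.25, -0.40, 0.30, 0.10)
offset <- 0.02  # assumed gap in axis units

# sign() returns -1, 0, or 1, so non-zero values move away from the zero line
label_pos <- proportion + sign(proportion) * offset
print(label_pos)
```

In the `mutate()` call that builds `butterfly_data`, this expression could take the place of `proportion * 1.1`.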

Colors and theme elements can be customized:

```r
custom_colors <- c("Male" = "#3498db", "Female" = "#e74c3c")
ggplot(butterfly_data) +
  geom_bar(aes(x = age_group, y = proportion, fill = gender),
           stat = "identity", alpha = 0.8) +
  scale_fill_manual(values = custom_colors) +
  theme(
    panel.background = element_rect(fill = "grey97"),
    plot.title = element_text(face = "bold", size = 16),
    legend.position = "top"
  )
```
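
Example: comparing feature usage between the experimental and control groups of an A/B test.

```r
feature_data <- data.frame(
  feature = rep(c("Search", "Recommendations", "Favorites", "Sharing"), 2),
  group = rep(c("Experimental", "Control"), each = 4),
  # control-group rates are negated so that their bars point left
  usage_rate = c(0.45, 0.32, 0.18, 0.12, -0.38, -0.29, -0.15, -0.08)
)
ggplot(feature_data, aes(x = reorder(feature, abs(usage_rate)),
                         y = usage_rate, fill = group)) +
  geom_col(width = 0.6) +
  coord_flip() +
  labs(title = "A/B test feature usage comparison", x = "Feature", y = "Usage rate") +
  scale_y_continuous(labels = function(x) paste0(abs(x) * 100, "%"))
```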

Example: change in sales channels over time.

```r
trend_data <- data.frame(
  year = rep(2015:2020, 2),
  category = rep(c("Online", "Offline"), each = 6),
  value = c(seq(0.1, 0.35, 0.05), -seq(0.3, 0.05, -0.05))
)
ggplot(trend_data, aes(x = year, y = value, fill = category)) +
  geom_col(position = "identity", alpha = 0.7) +
  # `linewidth` replaces `size` for lines since ggplot2 3.4.0
  geom_line(aes(color = category), linewidth = 1, show.legend = FALSE) +
  scale_x_continuous(breaks = 2015:2020) +
  labs(title = "Sales channel trend")
```
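
Sorting the categories by absolute value:

```r
# Sort by absolute value; fct_reorder() comes from forcats,
# which is installed with tidyverse but not loaded above
butterfly_data %>%
  mutate(age_group = forcats::fct_reorder(age_group, abs(proportion))) %>%
  ggplot(aes(x = age_group, y = proportion)) +
  geom_col()
```

Showing the sign in the data labels (a layer to add to an existing plot):

```r
# Round before pasting so that labels do not carry long decimals
geom_text(aes(label = ifelse(proportion < 0,
                             paste0("-", round(abs(proportion) * 100), "%"),
                             paste0("+", round(proportion * 100), "%"))))
```

Placing the legend inside the plot area:

```r
# From ggplot2 3.5.0 on, use legend.position = "inside" together with
# legend.position.inside = c(0.85, 0.15) instead of a numeric legend.position
theme(legend.position = c(0.85, 0.15),
      legend.background = element_rect(fill = "white", color = "grey"))
```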

A population pyramid uses the same technique:

```r
pyramid_data <- data.frame(
  age = rep(seq(10, 70, 10), 2),
  gender = rep(c("Male", "Female"), each = 7),
  population = c(seq(5, 8, length.out = 7), -seq(6, 3, length.out = 7))
)
ggplot(pyramid_data, aes(x = age, y = population, fill = gender)) +
  geom_col() +
  coord_flip() +
  scale_y_continuous(labels = abs)  # show absolute values on the axis
```

Error bars can be layered on top of the bars:

```r
ggplot(butterfly_data, aes(x = age_group, y = proportion)) +
  geom_col(aes(fill = gender), width = 0.5) +
  # illustrative band of +/-10% around each value, not a computed error
  geom_errorbar(aes(ymin = proportion * 0.9, ymax = proportion * 1.1),
                width = 0.2)
```
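
Key points for drawing butterfly charts with ggplot2:

- `coord_flip()` for horizontal display
- `scale_y_continuous()` to control the axis range
- `geom_hline()` to emphasize the zero line

Advanced techniques:

- Add reference lines: `geom_vline(xintercept = ...)`
- Facet the display: `facet_wrap(~variable)`
- Interactive version: convert the plot to a plotly object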
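
As a rough illustration of the first two techniques, the sketch below facets a butterfly chart by a made-up `region` column and adds a dashed reference line between the third and fourth age groups. The `regional` data frame, its `region` variable, and the random values are assumptions introduced only for this example.

```r
library(ggplot2)

# Hypothetical data: the same age-by-gender layout, repeated for two made-up regions
set.seed(42)
regional <- expand.grid(
  age_group = c("18-24", "25-34", "35-44", "45-54", "55-64", "65+"),
  gender = c("Male", "Female"),
  region = c("North", "South"),  # assumed facet variable
  stringsAsFactors = FALSE
)
regional$proportion <- runif(nrow(regional), 0.1, 0.4)
# Male bars point left, as in the main example
regional$proportion <- ifelse(regional$gender == "Male",
                              -regional$proportion, regional$proportion)

p <- ggplot(regional, aes(x = age_group, y = proportion, fill = gender)) +
  geom_col(width = 0.7) +
  geom_hline(yintercept = 0, color = "black") +
  # Discrete categories sit at positions 1, 2, 3, ..., so 3.5 falls
  # between the third and fourth age group
  geom_vline(xintercept = 3.5, linetype = "dashed", color = "grey40") +
  coord_flip() +
  facet_wrap(~region) +  # one panel per region
  scale_y_continuous(labels = function(x) paste0(abs(x) * 100, "%")) +
  theme_minimal()

print(p)
```

If the plotly package is installed, `plotly::ggplotly(p)` converts a plot like this into an interactive version.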
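
Complete code template:

```r
butterfly_template <- function(data, x_var, y_var, fill_var,
                               title = "",
                               color_palette = c("#1f77b4", "#ff7f0e")) {
  # {{ }} passes unquoted column names through to aes()
  ggplot(data, aes(x = {{x_var}}, y = {{y_var}}, fill = {{fill_var}})) +
    geom_col(width = 0.7, alpha = 0.8) +
    coord_flip() +
    geom_hline(yintercept = 0, color = "black") +
    scale_y_continuous(labels = function(x) scales::percent(abs(x))) +
    scale_fill_manual(values = color_palette) +
    labs(title = title) +
    theme_minimal() +
    theme(legend.position = "top")
}
# Example call:
# butterfly_template(butterfly_data, age_group, proportion, gender,
#                    title = "User age-gender distribution")
```

By combining these techniques flexibly, you can build professional-looking butterfly charts that present comparative data effectively.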

Original link: https://my.oschina.net/u/3335309/blog/4391702