How to Sort and Filter Data with Pandas

Published: 2025-11-14 00:56:00 Author: 小樊
Source: 亿速云 Views: 86

Pandas is a Python data analysis library that provides many methods for sorting and filtering data. The most commonly used ones are shown below with examples:

Sorting data

sort_values(): sorts a DataFrame by the values in one or more columns.

import pandas as pd

# Create a sample DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 28],
        'Salary': [50000, 60000, 70000, 55000]}
df = pd.DataFrame(data)

# Sort by the 'Age' column in ascending order (the default)
sorted_df = df.sort_values(by='Age')
print(sorted_df)

# Sort by the 'Salary' column in descending order
sorted_df = df.sort_values(by='Salary', ascending=False)
print(sorted_df)
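
When two rows share the same value in the sort column, a second column can break the tie. The following is a minimal sketch of sorting by several columns with a different direction for each; the data is a hypothetical variant of the sample above in which two people have the same age.

```python
import pandas as pd

# Hypothetical data with a tie in 'Age' so the second sort key matters
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 30, 28],
    'Salary': [50000, 60000, 70000, 55000],
})

# Sort by 'Age' ascending, then by 'Salary' descending within equal ages
sorted_df = df.sort_values(by=['Age', 'Salary'], ascending=[True, False])
print(sorted_df)

# sort_values() returns a new DataFrame; df keeps its original row order
print(df)
```

Passing a list to `ascending` requires one entry per column in `by`.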

sort_index(): sorts a DataFrame by its index labels.

# Create a DataFrame with a custom index
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 28],
        'Salary': [50000, 60000, 70000, 55000]}
df = pd.DataFrame(data, index=['D', 'B', 'A', 'C'])

# Sort by index in ascending order
sorted_df = df.sort_index()
print(sorted_df)

# Sort by index in descending order
sorted_df = df.sort_index(ascending=False)
print(sorted_df)
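
sort_index() can also reorder columns rather than rows. As a small sketch, using a hypothetical two-row DataFrame, the `axis` parameter chooses which set of labels is sorted:

```python
import pandas as pd

# Hypothetical two-row DataFrame with a custom row index
df = pd.DataFrame(
    {'Name': ['Alice', 'Bob'], 'Age': [25, 30], 'Salary': [50000, 60000]},
    index=['B', 'A'],
)

# axis=1 sorts the column labels instead of the row index
by_columns = df.sort_index(axis=1)
print(by_columns.columns.tolist())

# The default, axis=0, sorts the rows by index label
by_rows = df.sort_index()
print(by_rows.index.tolist())
```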

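Filtering data

loc[]: label-based indexing for selecting rows and columns. It also accepts a boolean mask as the row selector.

# Select all rows where 'Age' is greater than 30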
filtered_df = df.loc[df['Age'] > 30]
print(filtered_df)

# Select the 'Name' and 'Age' columns
filtered_df = df.loc[:, ['Name', 'Age']]
print(filtered_df)
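
Row filtering and column selection can be done in one loc[] call, and several conditions can be combined. The sketch below reuses the sample data from this article; the variable names `mask` and `result` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
     'Age': [25, 30, 35, 28],
     'Salary': [50000, 60000, 70000, 55000]},
    index=['D', 'B', 'A', 'C'],
)

# Each condition needs its own parentheses; & means "and", | means "or"
mask = (df['Age'] >= 28) & (df['Salary'] < 70000)

# Filter rows with the mask and keep only two columns, in one loc call
result = df.loc[mask, ['Name', 'Salary']]
print(result)
```

Python's `and` and `or` keywords do not work element-wise on Series, which is why `&` and `|` are used here.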

iloc[]: integer-position-based indexing for selecting rows and columns.

# Select the first two rows and the first two columns
filtered_df = df.iloc[:2, :2]
print(filtered_df)

# Select the rows at positions 1 and 3 (the second and fourth rows), regardless of their index labels
filtered_df = df.iloc[[1, 3]]
print(filtered_df)
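
The difference between loc[] and iloc[] is easiest to see when the index labels are not the default 0, 1, 2, and so on. The following sketch, built on a trimmed copy of the sample data, contrasts the two after sorting by index:

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
     'Age': [25, 30, 35, 28]},
    index=['D', 'B', 'A', 'C'],
)
sorted_df = df.sort_index()  # row order is now A, B, C, D

# iloc uses position: the first row after sorting has label 'A'
first_by_position = sorted_df.iloc[0]

# loc uses the label: the row labeled 'D', wherever it sits
row_by_label = sorted_df.loc['D']

# Position slices exclude the end point; label slices include it
two_rows = sorted_df.iloc[0:2]       # positions 0 and 1
three_rows = sorted_df.loc['A':'C']  # labels A, B, and C

print(first_by_position['Name'], row_by_label['Name'])
print(len(two_rows), len(three_rows))
```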

query(): filters rows using a query string.

# Select all rows where 'Age' is greater than 30
filtered_df = df.query('Age > 30')
print(filtered_df)

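# Select all rows where 'Name' starts with 'A'
# engine='python' is passed because string methods may fail under the numexpr engine in some pandas versions
filtered_df = df.query('Name.str.startswith("A")', engine='python')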
print(filtered_df)
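
A query string can also refer to Python variables by prefixing them with `@`, and can join conditions with `and` / `or`. This is a minimal sketch on the sample data; `min_age` is a hypothetical threshold chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 28],
    'Salary': [50000, 60000, 70000, 55000],
})

# Hypothetical threshold held in an ordinary Python variable
min_age = 28

# @min_age pulls the variable into the query expression
result = df.query('Age >= @min_age and Salary < 70000')
print(result)
```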

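isin(): tests whether each value is contained in a given collection; the resulting boolean mask can be used to filter rows.

# Select all rows where 'Name' is 'Alice' or 'David'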
filtered_df = df[df['Name'].isin(['Alice', 'David'])]
print(filtered_df)
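
To keep the rows that are not in the collection, the mask returned by isin() can be inverted with `~`. A short sketch on the sample data, with `excluded` as an illustrative name:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 28],
    'Salary': [50000, 60000, 70000, 55000],
})

excluded = ['Alice', 'David']

# ~ flips True/False, so this keeps every row whose name is NOT in the list
result = df[~df['Name'].isin(excluded)]
print(result)
```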

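These methods and examples should help you get started with sorting and filtering data in Pandas. Depending on your needs, you can combine them to build more complex data operations.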
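
As one possible way to combine the methods above, the sketch below filters first, then sorts, then keeps the top rows. It is only an illustration on the sample data, not the only way to chain these calls.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 28],
    'Salary': [50000, 60000, 70000, 55000],
})

# Keep people aged 28 or older, rank them by salary, and take the top two
top_earners = (
    df[df['Age'] >= 28]
    .sort_values(by='Salary', ascending=False)
    .head(2)
    .reset_index(drop=True)  # renumber the rows 0, 1 after filtering
)
print(top_earners)
```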
