How to Use re.findall() in Python

Published: 2022-07-28 10:31:06 Author: iii
Source: 亿速云


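Python's re module provides regular expression operations. re.findall() is one of its most commonly used functions: it scans a string for all non-overlapping matches of a regular expression and returns them as a list. This article walks through how to use re.findall(), with example code for each use case.

1. Basic usage of `re.findall()`

The syntax of re.findall() is: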

re.findall(pattern, string, flags=0)

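pattern: the regular expression pattern.
string: the string to search.
flags: optional; controls how the pattern is matched (for example, case-insensitive or multiline matching).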

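re.findall() returns a list of all non-overlapping matches of the pattern in the string, in the order they are found when scanning left to right. When the pattern contains no capturing groups, each item is the full matched substring. If there are no matches, it returns an empty list.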
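To make the return value concrete, here is a minimal sketch. The sample strings and variable names are invented for illustration. It shows that a pattern with no matches yields an empty list rather than None, and that matches never overlap.

```python
import re

# No digits in the string, so the result is an empty list (not None).
no_match = re.findall(r'\d+', "no digits here")
print(no_match)  # []

# Matches do not overlap: "aaaa" contains two non-overlapping "aa", not three.
non_overlapping = re.findall(r'aa', "aaaa")
print(non_overlapping)  # ['aa', 'aa']
```

Because the result is always a list, it can be used directly in a truth test or a for loop without first checking for None.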

Example 1: Finding all matching numbers

import re

text = "The price of the product is $19.99, and the discount is $5.00."
pattern = r'\d+\.\d+'
matches = re.findall(pattern, text)
print(matches)

Output:

['19.99', '5.00']

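In this example, the regular expression \d+\.\d+ matches numbers written with a decimal point (one or more digits, a literal dot, then one or more digits). re.findall() returns every such number in the text. Integers without a decimal point, such as 42, would not match this pattern.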
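If integers should be picked up as well, one possible variation is to make the fractional part optional. The sketch below uses an invented sample string, and the second pattern is only one of several reasonable ways to write this.

```python
import re

text = "Items: 3 apples at $1.50 each, total $4.50"

# The pattern from Example 1 requires a decimal point, so the integer 3 is skipped.
decimals_only = re.findall(r'\d+\.\d+', text)
print(decimals_only)  # ['1.50', '4.50']

# Making the fractional part optional also picks up integers.
# (?:...) is a non-capturing group, so findall still returns whole matches.
all_numbers = re.findall(r'\d+(?:\.\d+)?', text)
print(all_numbers)  # ['3', '1.50', '4.50']
```

The results are strings; convert them with float() or decimal.Decimal if arithmetic is needed.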

Example 2: Finding all matching words

import re

text = "Hello, world! This is a test."
pattern = r'\w+'
matches = re.findall(pattern, text)
print(matches)

Output:

['Hello', 'world', 'This', 'is', 'a', 'test']

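In this example, the regular expression \w+ matches runs of word characters (letters, digits, and underscores), so spaces and punctuation act as separators. re.findall() returns every word in the text.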
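Since \w covers digits and underscores but not apostrophes, the results can differ from what a reader might call a word. The sketch below uses invented sample strings to show this, and the alternative pattern is an assumption about what counts as a word (ASCII letters with an optional apostrophe part), not a general-purpose tokenizer.

```python
import re

# Underscores and digits are word characters, so identifiers stay whole.
identifiers = re.findall(r'\w+', "user_42 logged in")
print(identifiers)  # ['user_42', 'logged', 'in']

# Apostrophes are not word characters, so contractions are split.
text = "Don't panic, it's only a test."
split_words = re.findall(r'\w+', text)
print(split_words)  # ['Don', 't', 'panic', 'it', 's', 'only', 'a', 'test']

# One possible alternative: letters, optionally followed by an apostrophe part.
whole_words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
print(whole_words)  # ["Don't", 'panic', "it's", 'only', 'a', 'test']
```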

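2. Capturing groups

re.findall() also supports capturing groups. If the pattern contains groups (written with parentheses), the result holds the group contents instead of the whole match: with a single group, re.findall() returns a list of strings, one per match; with several groups, it returns a list of tuples, where each tuple holds the text captured by each group for one match.

Example 3: Capturing group contents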

import re

text = "John has 3 apples, Mary has 5 oranges."
pattern = r'(\w+) has (\d+) (\w+)'
matches = re.findall(pattern, text)
print(matches)

Output:

[('John', '3', 'apples'), ('Mary', '5', 'oranges')]

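In this example, the regular expression (\w+) has (\d+) (\w+) contains three groups, which capture the person's name, the quantity, and the item name. re.findall() returns a list of tuples, one per match, and each tuple contains the three captured strings (not the full matched text).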
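The shape of the result depends on how many capturing groups the pattern has. The following sketch reuses the text from Example 3 with three variations of the pattern; the variable names are illustrative only.

```python
import re

text = "John has 3 apples, Mary has 5 oranges."

# No groups: each item is the whole match.
whole = re.findall(r'\w+ has \d+ \w+', text)
print(whole)  # ['John has 3 apples', 'Mary has 5 oranges']

# One group: each item is that group's text (a string, not a tuple).
names = re.findall(r'(\w+) has \d+ \w+', text)
print(names)  # ['John', 'Mary']

# Non-capturing groups (?:...) group without capturing,
# so only the single capturing group appears in the result.
counts = re.findall(r'(?:\w+) has (\d+) (?:\w+)', text)
print(counts)  # ['3', '5']
```

When parentheses are needed only for grouping or alternation, writing them as (?:...) keeps the result a plain list of full matches.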

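3. Using the `flags` parameter

The flags parameter controls how the pattern is matched. Commonly used flags include:

re.IGNORECASE (or re.I): ignore case when matching.
re.MULTILINE (or re.M): make ^ and $ match at the start and end of each line, not only at the start and end of the whole string.
re.DOTALL (or re.S): make the dot (.) match any character, including newlines.

Example 4: Case-insensitive matching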

import re

text = "Hello, World! hello, world!"
pattern = r'hello'
matches = re.findall(pattern, text, flags=re.IGNORECASE)
print(matches)

Output:

['Hello', 'hello']

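In this example, the pattern hello is matched case-insensitively, so re.findall() returns every occurrence of hello regardless of case, with each result keeping the casing found in the text.

Example 5: Multiline matching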

import re

text = """Line 1: Hello
Line 2: World
Line 3: Hello again"""
pattern = r'^Line \d+: (\w+)'
matches = re.findall(pattern, text, flags=re.MULTILINE)
print(matches)

Output:

['Hello', 'World', 'Hello']

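In this example, the regular expression ^Line \d+: (\w+) matches the first word after the Line N: prefix at the start of each line. The re.MULTILINE flag lets ^ match at the start of every line, and because the pattern contains a single capturing group, re.findall() returns only the captured words rather than the full matches.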
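The flag list in this section also mentions re.DOTALL, and flags can be combined with the bitwise OR operator (|). The sketch below illustrates both; the sample strings are made up for the purpose.

```python
import re

text = "<p>first\nsecond</p>"

# Without re.DOTALL, "." cannot cross the newline, so nothing matches.
without_dotall = re.findall(r'<p>(.*?)</p>', text)
print(without_dotall)  # []

# With re.DOTALL, "." also matches "\n".
with_dotall = re.findall(r'<p>(.*?)</p>', text, flags=re.DOTALL)
print(with_dotall)  # ['first\nsecond']

# Combine flags with |: case-insensitive and line-anchored at the same time.
log = "error: disk full\nERROR: timeout\ninfo: ok"
errors = re.findall(r'^error: (.*)$', log, flags=re.IGNORECASE | re.MULTILINE)
print(errors)  # ['disk full', 'timeout']
```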

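4. More complex matching

re.findall() can also handle more involved cases, such as non-greedy matching. Arbitrarily nested structures, however, are generally beyond what a single regular expression in the re module can match reliably.

Example 6: Non-greedy matching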

import re

text = "<div>Content 1</div><div>Content 2</div>"
pattern = r'<div>(.*?)</div>'
matches = re.findall(pattern, text)
print(matches)

Output:

['Content 1', 'Content 2']

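In this example, the regular expression <div>(.*?)</div> uses the non-greedy quantifier .*?, which matches as few characters as possible, so each match stops at the first closing </div>. re.findall() returns the content of every <div> element in the text.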
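To see why the non-greedy form matters, it helps to compare it with the greedy quantifier on the same input. This is a small sketch using the text from Example 6; the variable names are illustrative.

```python
import re

text = "<div>Content 1</div><div>Content 2</div>"

# Greedy: .* grabs as much as possible, so the match runs to the LAST </div>.
greedy = re.findall(r'<div>(.*)</div>', text)
print(greedy)  # ['Content 1</div><div>Content 2']

# Non-greedy: .*? stops at the FIRST </div> after each opening tag.
lazy = re.findall(r'<div>(.*?)</div>', text)
print(lazy)  # ['Content 1', 'Content 2']
```

This works for flat, well-formed snippets like the one above. For real HTML, especially with nested tags, a dedicated parser is usually a safer choice than a regular expression.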

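5. Summary

re.findall() finds all non-overlapping matches of a regular expression in a string and returns them as a list. With its basic usage, capturing groups, the flags parameter, and non-greedy matching, it covers a wide range of text-processing tasks.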
