Tips for using findall in Python - Q&A

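findall() is a function in re, Python's regular expression module. It finds all non-overlapping substrings of a string that match a regular expression and returns them as a list.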

Use raw strings: to avoid trouble with escape characters, write the regular expression as a raw string (prefix the string literal with r). For example:

import re

text = "The quick brown fox jumps over the lazy dog."
pattern = r'\b\w{5}\b'
result = re.findall(pattern, text)
print(result)  # Output: ['quick', 'brown', 'jumps']
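
To see why the r prefix matters, the short sketch below compares the same pattern written with and without it. The pattern \bfox\b is chosen only for illustration. In an ordinary string literal, Python turns \b into a backspace character before the regex engine ever sees it, so the word boundary is lost.

```python
import re

text = "The quick brown fox jumps over the lazy dog."

# Without the r prefix, "\b" is a backspace character (\x08), not a word boundary.
plain = re.findall('\bfox\b', text)

# With the r prefix, the regex engine receives \b and treats it as a word boundary.
raw = re.findall(r'\bfox\b', text)

print(plain)  # Output: []
print(raw)    # Output: ['fox']
```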

Specify matching flags: re.findall() takes an optional flags argument that controls how matching is performed. For example, re.IGNORECASE performs case-insensitive matching.

import re

text = "The quick brown fox jumps over the lazy dog."
pattern = r'\bthe\b'  # lowercase pattern; the flag lets it match "The" as well
result = re.findall(pattern, text, flags=re.IGNORECASE)
print(result)  # Output: ['The', 'the']
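
Flags can also be combined with the | operator. The sketch below is only an illustration on a made-up multi-line string: re.MULTILINE makes ^ match at the start of every line, and re.IGNORECASE makes the lowercase pattern match "The" too.

```python
import re

# Sample text invented for this example.
text = "The fox\nthe dog\nAnother line"

# re.MULTILINE: ^ matches at the start of each line, not just the whole string.
# re.IGNORECASE: "the" also matches "The".
result = re.findall(r'^the\b', text, flags=re.IGNORECASE | re.MULTILINE)
print(result)  # Output: ['The', 'the']
```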

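Use groups: parentheses () in a regular expression create capturing groups. When the pattern contains one group, findall() returns a list of the text matched by that group instead of the whole match; with several groups, it returns a list of tuples.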

import re

text = "The quick brown fox jumps over the lazy dog."
pattern = r'(\w{5})'
result = re.findall(pattern, text)
print(result)  # Output: ['quick', 'brown', 'jumps']
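
The effect of groups is easier to see with more than one of them. The following sketch uses a made-up "name=value" string: two capturing groups make findall() return tuples, while a non-capturing group (?:...) leaves the result as whole matches.

```python
import re

# Sample text invented for this example.
text = "width=640, height=480"

# Two capturing groups: each match becomes a (name, value) tuple.
pairs = re.findall(r'(\w+)=(\d+)', text)
print(pairs)  # Output: [('width', '640'), ('height', '480')]

# A non-capturing group (?:...) groups without capturing,
# so findall() returns the whole match as a string.
whole = re.findall(r'(?:\w+)=\d+', text)
print(whole)  # Output: ['width=640', 'height=480']
```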

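Use re.finditer(): re.finditer() is similar to re.findall(), but it returns an iterator of match objects rather than a list of strings. Matches are produced lazily, which can save memory when a large string contains many matches.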

import re

text = "The quick brown fox jumps over the lazy dog."
pattern = r'\b\w{5}\b'
result = re.finditer(pattern, text)

for match in result:
    print(match.group())  # Prints quick, brown, jumps (one per line)
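
Because re.finditer() yields match objects, it also gives access to where each match was found, which findall() does not report. A minimal sketch using the same sentence:

```python
import re

text = "The quick brown fox jumps over the lazy dog."

# Collect each matched word together with its start and end index.
positions = [(m.group(), m.start(), m.end())
             for m in re.finditer(r'\b\w{5}\b', text)]
print(positions)
# Output: [('quick', 4, 9), ('brown', 10, 15), ('jumps', 20, 25)]
```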

Use re.sub(): to replace the parts of a string that match a regular expression, use the re.sub() function.

import re

text = "The quick brown fox jumps over the lazy dog."
pattern = r'\b\w{5}\b'
result = re.sub(pattern, '<word>', text)
print(result)  # Output: The <word> <word> fox <word> over the lazy dog.
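
The replacement does not have to be a fixed string. As a sketch of two other options, re.sub() also accepts a function that receives each match object, or a template that refers back to the matched text (\g<0> stands for the whole match).

```python
import re

text = "The quick brown fox jumps over the lazy dog."

# The callable receives each match object and returns the replacement string.
result = re.sub(r'\b\w{5}\b', lambda m: m.group().upper(), text)
print(result)  # Output: The QUICK BROWN fox JUMPS over the lazy dog.

# \g<0> in the replacement template inserts the whole matched text.
tagged = re.sub(r'\b\w{5}\b', r'<\g<0>>', text)
print(tagged)  # Output: The <quick> <brown> fox <jumps> over the lazy dog.
```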
