How to Do String Processing and Text Analysis in Python

Published: 2022-03-16 17:35:10 Author: iii
Source: 亿速云

This article covers string processing and text analysis in Python. These everyday tasks raise questions for many people, so the methods below have been collected from various references as simple, practical techniques. Let's walk through them.

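Stripping Whitespace

Stripping whitespace is a basic string operation. The common methods are lstrip() (strips leading whitespace), rstrip() (strips trailing whitespace), and strip() (strips both leading and trailing whitespace).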

s = ' This is a sentence with whitespace. \n'

print('Strip leading whitespace: {}'.format(s.lstrip()))

print('Strip trailing whitespace: {}'.format(s.rstrip()))

print('Strip all whitespace: {}'.format(s.strip()))

Strip leading whitespace: This is a sentence with whitespace.

Strip trailing whitespace: This is a sentence with whitespace.

Strip all whitespace: This is a sentence with whitespace.
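
Printed output hides leading and trailing whitespace, so the difference between the three methods is easier to see with repr(). The following is a minimal sketch using the same sentence; the variable names are illustrative and not part of the original example.

```python
s = ' This is a sentence with whitespace. \n'

# repr() makes spaces and the newline visible in the output
leading_stripped = s.lstrip()
trailing_stripped = s.rstrip()
both_stripped = s.strip()

print(repr(leading_stripped))   # 'This is a sentence with whitespace. \n'
print(repr(trailing_stripped))  # ' This is a sentence with whitespace.'
print(repr(both_stripped))      # 'This is a sentence with whitespace.'
```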

These methods can do more than remove whitespace. Another common use is to pass in the characters you want stripped:

s = 'This is a sentence with unwanted characters.AAAAAAAA'

print('Strip unwanted characters: {}'.format(s.rstrip('A')))
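
One detail worth knowing: the argument to strip(), lstrip(), and rstrip() is treated as a set of characters, not as an exact prefix or suffix. A short sketch of what that means in practice follows; the second example string is made up for illustration.

```python
s = 'This is a sentence with unwanted characters.AAAAAAAA'

# Every trailing 'A' is removed, however many there are
cleaned = s.rstrip('A')
print(cleaned)  # This is a sentence with unwanted characters.

# The argument is a set of characters: any mix of 'x', 'y', 'z'
# is stripped from both ends until another character is reached
mixed = 'xyzzy-data-yxz'.strip('xyz')
print(mixed)  # -data-
```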

Splitting Strings

Splitting uses Python's split() method to break a string into a list of smaller strings.

s = 'KDnuggets is a fantastic resource'

print(s.split())

With no argument, split() splits on whitespace by default, but it can also split on a character you specify.

s = 'these,words,are,separated,by,comma'

print('\',\' separated split -> {}'.format(s.split(',')))

s = 'abacbdebfgbhhgbabddba'

print('\'b\' separated split -> {}'.format(s.split('b')))

',' separated split -> ['these', 'words', 'are', 'separated', 'by', 'comma']

'b' separated split -> ['a', 'ac', 'de', 'fg', 'hhg', 'a', 'dd', 'a']
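
The default behavior and an explicit separator differ in one way that is easy to miss: with no argument, runs of whitespace are treated as a single separator, while an explicit separator splits at every occurrence. A small sketch, using a made-up string for illustration:

```python
s = 'two  spaces\tand a tab'

# No argument: any run of whitespace counts as one separator
default_split = s.split()
print(default_split)  # ['two', 'spaces', 'and', 'a', 'tab']

# Explicit ' ': every single space is a separator, so an empty
# string appears between the two consecutive spaces
space_split = s.split(' ')
print(space_split)  # ['two', '', 'spaces\tand', 'a', 'tab']
```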

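Joining List Elements into a String

The previous section showed how to split one string into many. This one shows the reverse: combining many strings into one, using the join() method.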

s = ['KDnuggets', 'is', 'a', 'fantastic', 'resource']

print(' '.join(s))

KDnuggets is a fantastic resource
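
Two points about join() may be worth a quick illustration: the string it is called on is the separator, and every element must already be a string. The sketch below assumes a small list of numbers purely as an example.

```python
words = ['KDnuggets', 'is', 'a', 'fantastic', 'resource']

# The string that join() is called on is placed between the elements
dashed = '-'.join(words)
print(dashed)  # KDnuggets-is-a-fantastic-resource

# join() accepts only strings, so convert other types first
numbers = [1, 2, 3]
joined_numbers = ', '.join(str(n) for n in numbers)
print(joined_numbers)  # 1, 2, 3
```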

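Reversing a String

Python has no built-in string method for reversal, but a string can be treated as a sequence of characters and reversed the same way the elements of a list are reversed.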
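
A minimal sketch of that idea, shown two ways: slicing with a step of -1, and reversed() combined with join(). The variable names are illustrative.

```python
s = 'KDnuggets'

# Slice with a step of -1 to walk the string backwards
reversed_by_slice = s[::-1]
print(reversed_by_slice)  # steggunDK

# Equivalent: reversed() yields the characters in reverse order,
# and join() assembles them back into a string
reversed_by_join = ''.join(reversed(s))
print(reversed_by_join)  # steggunDK
```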

Changing Case

Case conversion in Python is straightforward. The three methods upper(), lower(), and swapcase() cover converting between uppercase and lowercase.

s = 'KDnuggets'

print('\'KDnuggets\' as uppercase: {}'.format(s.upper()))

print('\'KDnuggets\' as lowercase: {}'.format(s.lower()))

print('\'KDnuggets\' as swapped case: {}'.format(s.swapcase()))

'KDnuggets' as uppercase: KDNUGGETS

'KDnuggets' as lowercase: kdnuggets

'KDnuggets' as swapped case: kdNUGGETS

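Checking for Substring Membership

The simplest way to test whether a string contains a substring is the in operator. Its syntax reads much like natural language.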

s1 = 'perpendicular'

s2 = 'pen'

s3 = 'pep'

print('\'pen\' in \'perpendicular\' -> {}'.format(s2 in s1))

print('\'pep\' in \'perpendicular\' -> {}'.format(s3 in s1))

'pen' in 'perpendicular' -> True

'pep' in 'perpendicular' -> False

If you need the position of the substring rather than just whether it exists, use the find() method.

s = 'Does this string contain a substring?'

print('\'string\' location -> {}'.format(s.find('string')))

print('\'spring\' location -> {}'.format(s.find('spring')))

'string' location -> 10

'spring' location -> -1

By default, find() returns the index of the first character of the first occurrence of the substring, or -1 if the substring is not found.
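
find() also accepts an optional start index, which makes it possible to locate later occurrences. The helper below, find_all, is a hypothetical name introduced only for this sketch and is not a built-in; it collects the start index of each non-overlapping match.

```python
def find_all(text, sub):
    """Return the start index of every non-overlapping occurrence of sub."""
    if not sub:
        return []
    positions = []
    start = text.find(sub)
    while start != -1:
        positions.append(start)
        # Resume the search just past the match that was found
        start = text.find(sub, start + len(sub))
    return positions

s = 'Does this string contain a substring?'
all_positions = find_all(s, 'string')
print(all_positions)       # [10, 30]

# rfind() returns the index of the last occurrence instead of the first
last_position = s.rfind('string')
print(last_position)       # 30
```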

Replacing Substrings

What if you want to replace a substring once you have found it? That is what the replace() method is for.

s1 = 'The theory of data science is of the utmost importance.'

s2 = 'practice'

print('The new sentence: {}'.format(s1.replace('theory', s2)))

The new sentence: The practice of data science is of the utmost importance.

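If the same substring occurs more than once, the optional count argument sets the maximum number of replacements, which are applied from left to right.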
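
A short sketch of the count argument, using a made-up sentence for illustration. The count is passed positionally here, as the third argument to replace().

```python
s = 'one fish, two fish, red fish, blue fish'

# Without a count, every occurrence is replaced
replace_all = s.replace('fish', 'cat')
print(replace_all)    # one cat, two cat, red cat, blue cat

# With a count of 2, only the first two occurrences are replaced
replace_two = s.replace('fish', 'cat', 2)
print(replace_two)    # one cat, two cat, red fish, blue fish
```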

That concludes this overview of string processing and text analysis in Python. Combining the explanations with hands-on practice is the best way to learn, so try the examples yourself.
