In Python, you can compress scraped data with the gzip or bz2 modules from the standard library.
Compressing with the gzip module:

import gzip
import io

def compress_data(data):
    # Write the gzip stream into an in-memory buffer.
    compressed_data = io.BytesIO()
    with gzip.GzipFile(fileobj=compressed_data, mode='wb') as f:
        f.write(data)
    # Read the buffer only after the with block has closed the gzip stream,
    # so that the trailer has been written.
    return compressed_data.getvalue()

def decompress_data(compressed_data):
    # Wrap the compressed bytes in a buffer and read the original data back.
    buffer = io.BytesIO(compressed_data)
    with gzip.GzipFile(fileobj=buffer, mode='rb') as f:
        return f.read()

# Example
data = b"This is some data to compress."
compressed_data = compress_data(data)
print("Compressed data:", compressed_data)
decompressed_data = decompress_data(compressed_data)
print("Decompressed data:", decompressed_data)
Compressing with the bz2 module:

import bz2

def compress_data(data):
    # bz2.compress takes bytes and returns the compressed bytes directly.
    # It is not a context manager, so no with block or buffer is needed.
    return bz2.compress(data)

def decompress_data(compressed_data):
    # bz2.decompress likewise returns the original bytes directly.
    return bz2.decompress(compressed_data)

# Example
data = b"This is some data to compress."
compressed_data = compress_data(data)
print("Compressed data:", compressed_data)
decompressed_data = decompress_data(compressed_data)
print("Decompressed data:", decompressed_data)
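In both examples, we first define a compress_data function that takes the raw data as input and compresses it with the corresponding module (gzip or bz2). We then define a decompress_data function that takes the compressed data as input and decompresses it with the same module.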
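Both modules operate on bytes, while scraped pages often arrive as str. The following is a minimal sketch of one way to bridge that gap, assuming the text is encoded as UTF-8. The helper names compress_text and decompress_text and the sample page are hypothetical, introduced only for illustration.

```python
import gzip


def compress_text(text, encoding="utf-8"):
    """Encode a str to bytes, then gzip-compress the bytes."""
    return gzip.compress(text.encode(encoding))


def decompress_text(blob, encoding="utf-8"):
    """Reverse compress_text: gunzip the bytes, then decode back to str."""
    return gzip.decompress(blob).decode(encoding)


# Made-up sample standing in for a scraped page.
page = "<html><body>示例页面 sample page</body></html>"
blob = compress_text(page)
restored = decompress_text(blob)
print("Round trip OK:", restored == page)
```

The same pattern should work with bz2 by swapping gzip.compress and gzip.decompress for bz2.compress and bz2.decompress.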
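In the example section, we use a short bytes literal as the raw data and compress it. We then decompress the result to recover the original bytes, so that the two can be compared.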
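The comparison described above can be made explicit with a round-trip check and a size report. This is a sketch under stated assumptions: the repetitive sample input is made up to stand in for scraped markup, and actual sizes depend heavily on the data. For very short inputs, the compressed output can end up larger than the original because of format overhead.

```python
import bz2
import gzip

# Made-up, repetitive sample standing in for scraped markup.
data = b"<li class='item'>example row</li>\n" * 500

gzip_blob = gzip.compress(data)
bz2_blob = bz2.compress(data)

# Round-trip check: decompression must return the original bytes exactly.
gzip_ok = gzip.decompress(gzip_blob) == data
bz2_ok = bz2.decompress(bz2_blob) == data

print("original:", len(data), "bytes")
print("gzip:    ", len(gzip_blob), "bytes, round trip OK:", gzip_ok)
print("bz2:     ", len(bz2_blob), "bytes, round trip OK:", bz2_ok)
```

Which module produces the smaller output varies with the input, so measuring on a sample of the real scraped data is a reasonable way to choose between them.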