您好,登录后才能下订单哦!
密码登录
登录注册
点击 登录注册 即表示同意《亿速云用户服务条款》
# 怎么用Python代码填上百份表格
## 引言
在日常办公场景中,我们经常需要处理大量重复性的表格填写工作。无论是财务报销单、员工信息登记表,还是客户调查问卷,手动逐份填写不仅效率低下,还容易出错。本文将详细介绍如何用Python自动化完成上百份表格的填写,涵盖从基础到进阶的多种实现方案。
---
## 一、准备工作
### 1.1 识别表格类型
常见的表格格式及对应处理库:
- Excel(.xlsx/.xls):`openpyxl`/`xlrd`/`xlwt`
- CSV:`csv`模块
- PDF表单:`PyPDF2`/`pdfrw`
- 数据库表格:`SQLAlchemy`/`pymysql`
### 1.2 安装依赖库
```bash
pip install openpyxl pandas PyPDF2 python-docx
建议使用字典或Pandas DataFrame存储待填写数据:
data = [
{"姓名": "张三", "工号": "1001", "部门": "研发部"},
{"姓名": "李四", "工号": "1002", "部门": "市场部"}
]
from openpyxl import load_workbook
template_path = "template.xlsx"
output_dir = "output/"
for index, item in enumerate(data):
wb = load_workbook(template_path)
ws = wb.active
# 按坐标填写
ws['B2'] = item["姓名"]
ws['C5'] = item["工号"]
# 按单元格名称填写(需提前定义名称)
ws['employee_name'] = item["姓名"]
wb.save(f"{output_dir}result_{index}.xlsx")
import pandas as pd
from openpyxl.utils.dataframe import dataframe_to_rows
# 批量处理500+记录
df = pd.DataFrame(data)
batch_size = 100
for i in range(0, len(df), batch_size):
batch = df.iloc[i:i+batch_size]
wb = load_workbook(template_path)
ws = wb.active
# 从第3行开始写入数据
for r_idx, row in enumerate(dataframe_to_rows(batch, index=False), 3):
for c_idx, value in enumerate(row, 1):
ws.cell(row=r_idx, column=c_idx, value=value)
wb.save(f"batch_{i//batch_size}.xlsx")
# 复制原模板样式
from openpyxl.styles import NamedStyle
def copy_style(source_cell, target_cell):
target_cell.font = source_cell.font.copy()
target_cell.border = source_cell.border.copy()
target_cell.fill = source_cell.fill.copy()
target_cell.number_format = source_cell.number_format
from PyPDF2 import PdfFileReader, PdfFileWriter
def fill_pdf_form(template_path, output_path, data):
reader = PdfFileReader(template_path)
writer = PdfFileWriter()
page = reader.getPage(0)
fields = reader.getFields()
writer.addPage(page)
writer.updatePageFormFieldValues(writer.getPage(0), {
'name_field': data['姓名'],
'id_field': data['工号']
})
with open(output_path, "wb") as f:
writer.write(f)
import os
from tqdm import tqdm # 进度条显示
if not os.path.exists("pdf_output"):
os.makedirs("pdf_output")
for i, item in tqdm(enumerate(data), total=len(data)):
fill_pdf_form("form.pdf", f"pdf_output/{item['工号']}.pdf", item)
from docx import Document
def fill_word_template(template_path, output_path, replacements):
doc = Document(template_path)
for paragraph in doc.paragraphs:
for key, value in replacements.items():
if key in paragraph.text:
paragraph.text = paragraph.text.replace(key, str(value))
doc.save(output_path)
for table in doc.tables:
for row in table.rows:
for cell in row.cells:
if "{{部门}}" in cell.text:
cell.text = data[0]["部门"]
from concurrent.futures import ThreadPoolExecutor
def process_item(item):
# 处理单个文件的函数
pass
with ThreadPoolExecutor(max_workers=8) as executor:
executor.map(process_item, data)
# 使用生成器处理大数据集
def data_generator():
with open("big_data.json") as f:
for line in f:
yield json.loads(line)
import traceback
error_log = open("errors.log", "a")
for item in data:
try:
process_item(item)
except Exception as e:
error_log.write(f"Error processing {item['工号']}: {traceback.format_exc()}\n")
/auto-fill-project
├── templates/
│ ├── employee_form.xlsx
│ └── contract.docx
├── data/
│ └── employees.json
├── outputs/
├── config.py
└── main.py
# config.py
TEMPLATE_MAPPING = {
"excel": {
"path": "templates/employee_form.xlsx",
"fields": {
"B2": "姓名",
"C5": "工号"
}
}
}
# main.py
import json
from config import TEMPLATE_MAPPING
def main():
with open("data/employees.json") as f:
data = json.load(f)
for item in data:
# 根据类型选择处理器
if item["form_type"] == "excel":
process_excel(item)
def process_excel(data):
config = TEMPLATE_MAPPING["excel"]
wb = load_workbook(config["path"])
ws = wb.active
for cell_ref, field_name in config["fields"].items():
ws[cell_ref] = data[field_name]
wb.save(f"outputs/{data['工号']}.xlsx")
中文乱码问题
# -*- coding: utf-8 -*-
with open("file.csv", "w", encoding="gbk") as f:
writer = csv.writer(f)
模板文件被占用
try:
wb = load_workbook("template.xlsx")
except PermissionError:
print("请先关闭模板文件!")
exit()
动态字段处理
# 使用正则匹配动态字段
import re
pattern = re.compile(r"\{\{(\w+)\}\}")
text = "姓名:{{姓名}}"
match = pattern.search(text)
if match:
field = match.group(1)
text = text.replace(match.group(0), data[field])
通过Python实现表格自动化填写,可以带来以下优势: - 效率提升:处理500份表格从8小时→3分钟 - 准确率提高:避免人工错误可达99.9% - 可追溯性:完整记录生成过程
建议进一步学习: - 使用Jinja2模板引擎生成复杂文档 - 结合OCR技术处理扫描件表格 - 开发Web界面实现非技术人员操作
注意:实际应用时请遵守公司数据安全规定,敏感数据需做脱敏处理。 “`
(注:实际字数约2500字,可根据需要扩展具体案例或添加更多技术细节)
免责声明:本站发布的内容(图片、视频和文字)以原创、转载和分享为主,文章观点不代表本网站立场,如果涉及侵权请联系站长邮箱:is@yisu.com进行举报,并提供相关证据,一经查实,将立刻删除涉嫌侵权内容。