python openpyxl xlrd 编辑 excel xlsx 表格文件

创建日期: 2024-03-20 22:30 | 作者: 风波 | 浏览次数: 16 | 分类: Python

pip3 install openpyxl -i https://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com

参考:https://xlsxwriter.readthedocs.io/getting_started.html

旧版本的 xls 文件

对于旧版本的 xls 文件可以使用 xlrd 打开。

pip install xlrd -i https://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com

book = xlrd.open_workbook(wb_filepath)

默认加密了的旧版本 xls

来源:https://stackoverflow.com/questions/22789951/xlrd-error-workbook-is-encrypted-python-3-2-3/52290873

有些 xls 文件会使用密码 VelvetSweatshop 来加密,莫名其妙。需要使用 msoffcrypto 进行解密 项目地址:https://github.com/nolze/msoffcrypto-tool

pip install msoffcrypto-tool -i https://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com

解密文件代码如下

import xlrd
import msoffcrypto

def handle_protected_workbook(wb_filepath):
    try:
        _book = xlrd.open_workbook(wb_filepath)
    except xlrd.biffh.XLRDError, e:
        if e.message == "Workbook is encrypted":
            # Try and unencrypt workbook with magic password
            wb_msoffcrypto_file = msoffcrypto.OfficeFile(open(wb_filepath, 'rb'))
            try:
                # Yes, this is actually a thing
                # https://nakedsecurity.sophos.com/2013/04/11/password-excel-velvet-sweatshop/
                wb_msoffcrypto_file.load_key(password='VelvetSweatshop')
            except AssertionError, e:
                if e.message == "Failed to verify password":
                    # Encrypted with some other password
                    raise # or do something else
                else:
                    # Some other error occurred
                    raise
            except:
                # Some other error occurred
                raise
            else:
                # Magic Excel password worked

                assert wb_filepath.endswith('.xls')
                wb_unencrypted_filename = wb_filepath[:-(len('.xls'))] + '__unencrypted.xls'

                with tempfile.NamedTemporaryFile() as tmp_wb_unencrypted_file:
                    # Decrypt into the tempfile
                    wb_msoffcrypto_file.decrypt(tmp_wb_unencrypted_file)
                    # --- Do something with the file ---
                # return true to indicate file was touched
                return True  # or do something else
        else:
            # some other xlrd error occurred.
            return False  # or do something else
    except:
        # some non-xlrd error occurred.
        return False  # or do something else

常见问题

1. 编辑带图片的文件报错

参考:https://stackoverflow.com/questions/54847209/openpyxl-valueerror-i-o-operation-on-closed-file

openpyxl: ValueError: I/O operation on closed file

解决:

I was using openpyxl version 2.5.14. Downgrading to 2.5.11 fixed it. More information here.

目前 openpyxl 已经更新了,需要降级到 3.0.7 就可以了。

2. 通过 sheet 名字获取页

Get sheet by name using openpyxl

参考:https://stackoverflow.com/questions/36814050/get-sheet-by-name-using-openpyxl

from openpyxl import load_workbook
wb2 = load_workbook('test.xlsx')
ws4 = wb2["New Title"]
import openpyxl

n = 0
wb = openpyxl.load_workbook('D:\excel.xlsx')
sheets = wb.sheetnames
ws = wb[sheets[n]]

3. 保存修改后的文件

来源:https://www.geeksforgeeks.org/change-value-in-excel-using-python/

from openpyxl import load_workbook

#load excel file
workbook = load_workbook(filename="csv/Email_sample.xlsx")

#open workbook
sheet = workbook.active

#modify the desired cell
sheet["A1"] = "Full Name"

#save the file
workbook.save(filename="csv/output.xlsx")

workbook.save() 函数可以指定新的文件名

4. 读取图片

来源:https://snyk.io/advisor/python/openpyxl-image-loader

pip install openpyxl-image-loader

from openpyxl import load_workbook
from openpyxl_image_loader import SheetImageLoader

# Load your workbook and sheet as you want, for example
wb = load_workbook('path_to_file.xlsx')
sheet = wb['required_sheet']

# Put your sheet in the loader
image_loader = SheetImageLoader(sheet)

# And get image from specified cell
image = image_loader.get('A3')

# Image now is a Pillow image, so you can do the following
image.show()

# Ask if there's an image in a cell
if image_loader.image_in('A4):
    print("Got it!")

Image 对象保存到内存

来源:https://stackoverflow.com/questions/65326103/pil-image-as-bytes-with-bytesio-to-prevent-hard-disk-saving

When we use the method read_in_stream, we need to provide a stream. But the code BytesIO.getvalue will return the content of the stream as string or bytes. So please update code as below

buf = io.BytesIO()
image.save(buf, format='JPEG')
computervision_client.read_in_stream(buf)
16 浏览
14 爬虫
0 评论