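I have many subtitle files in the following format.

1

00:00:01,000 --> 00:00:02,008
some dummy text

2

00:00:02,008 --> 00:00:05,006
some dummy text
some dummy text

3

00:00:05,006 --> 00:00:08,008
some dummy text
some dummy text

I want to convert them to the form below by removing the blank line between each timestamp and the number before it.

1
00:00:01,000 --> 00:00:02,008
some dummy text

2
00:00:02,008 --> 00:00:05,006
some dummy text
some dummy text

3
00:00:05,006 --> 00:00:08,008
some dummy text
some dummy text

Since there are many files, I need code that can be applied to every file in a directory and its subdirectories. Is it possible to overwrite the existing files?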
1 Answer
蓝山帝景
Contributed 1843 experience points, received 7+ likes
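import os
import re

# Walk the folder and every subfolder beneath it.
for root, dirs, files in os.walk('C:\\Users\\User\\Desktop\\Folder\\'):
    for file in files:
        if file.endswith('.txt'):  # change to '.srt' if the subtitles use that extension
            fpath = os.path.join(root, file)
            with open(fpath, 'r') as f:
                # Collapse the newline(s) between a cue number and its timestamp into one.
                t = re.sub(r'(?<=\d)\n+(?=\d\d:\d\d:\d\d,\d\d\d)', '\n', f.read())
            # Reopen the same path for writing, which overwrites the original file.
            with open(fpath, 'w') as f:
                f.write(t)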
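Because the loop above overwrites files in place, it may be worth checking the substitution on a string first. The following is only a minimal sketch: the helper name `tighten_cues` and the sample text are made up for illustration, and it assumes the cue numbers and timestamps look like the ones in the question.

```python
import re

# One or more newlines sitting between a digit (the cue number)
# and a timestamp of the form HH:MM:SS,mmm.
GAP = re.compile(r'(?<=\d)\n+(?=\d\d:\d\d:\d\d,\d\d\d)')

def tighten_cues(text):
    """Hypothetical helper: join each cue number to its timestamp line."""
    return GAP.sub('\n', text)

# Sample input modelled on the question: a blank line after each cue number.
sample = (
    "1\n\n00:00:01,000 --> 00:00:02,008\nsome dummy text\n\n"
    "2\n\n00:00:02,008 --> 00:00:05,006\nsome dummy text\nsome dummy text\n"
)

result = tighten_cues(sample)
print(result)
```

The blank line separating one cue from the next is left alone, since the lookbehind only matches right after a digit and the lookahead only matches right before a timestamp. Running the helper a second time on its own output should change nothing, so reprocessing a folder is harmless. Copying the folder before the first real run is still a sensible precaution.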