Python: trouble saving several files processed by a Pool

米脂 2021-08-17 10:27:41
I need to process some files in parallel. I am using a Pool, but I can't save the files processed by the pool. Here is the code:

... All imports...

def extract(text_lines):
    line_tr01 = []
    line_tr02 = []
    line_tr03 = []
    line_tr03 = []
    for line in text_lines:
        treatment01 = treatment_a(line, args)
        line_tr01.append(treatment01)
        treatment02 = treatment_b(line, args)
        line_tr02.append(treatment02)
        treatment03 = treatment_c(line, args)
        line_tr03.append(treatment03)
        treatment04 = treatment_d(line, args)
        line_tr04.append(treatment04)

for file in folder:
    text_lines = read_file_into_list(file_path)
    chunk_size=len(text_lines)/6
    divided=[]
    divided.append(text_lines[0:chunk_size])
    divided.append(text_lines[chunk_size:2*chunk_size])
    divided.append(text_lines[2*chunk_size:3*chunk_size])
    divided.append(text_lines[3*chunk_size:4*chunk_size])
    divided.append(text_lines[4*chunk_size:5*chunk_size])
    divided.append(text_lines[5*chunk_size:6*chunk_size])
    lines=[]
    p = Pool(6)
    lines.extend(p.map(extract(text_lines),divided))
    p.close()
    p.join()
    p.terminate()
    line_tr01=lines[0]
    with open(pkl_filename, 'wb') as f:
        pickle.dump(line_tr01, f)
    line_tr02=lines[1]
    with open(pkl_filename, 'wb') as f:
        pickle.dump(line_tr02, f)
    line_tr03=lines[2]
    with open(pkl_filename, 'wb') as f:
        pickle.dump(line_tr03, f)
    line_tr04=lines[3]
    with open(pkl_filename, 'wb') as f:
        pickle.dump(line_tr04, f)

Any information on how to stop overwriting the files would be welcome. Thanks in advance.
1 Answer

莫回无


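The problem is that once you split the work out to a pool, the workers no longer have the common global namespace you are currently (ab)using. Lists that extract fills inside a worker process never reach the parent, so the function has to return its results. So let's rewrite it to pass things around properly.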


from multiprocessing import Pool


def extract(text_lines):
    # Runs in a worker process: build the results for one chunk of lines
    # and return them so the parent process can collect them.
    treatments = dict(tr01=[], tr02=[], tr03=[], tr04=[])
    for line in text_lines:
        treatments['tr01'].append(treatment_a(line, args))
        treatments['tr02'].append(treatment_b(line, args))
        treatments['tr03'].append(treatment_c(line, args))
        treatments['tr04'].append(treatment_d(line, args))
    return treatments


def line_gen(lines, chunk_size=1):
    # Yield successive slices of at most chunk_size lines.
    for i in range(0, len(lines), chunk_size):
        yield lines[i:i + chunk_size]


for file in folder:
    text_lines = read_file_into_list(file_path)
    treatments = dict(tr01=[], tr02=[], tr03=[], tr04=[])
    # max(1, ...) keeps the step non-zero for files with fewer than 6 lines.
    chunk_size = max(1, len(text_lines) // 6)
    with Pool(6) as p:
        # imap yields one result dict per chunk, in chunk order.
        for treat_data in p.imap(extract, line_gen(text_lines, chunk_size=chunk_size)):
            for tr, data in treat_data.items():
                treatments[tr].extend(data)

    # Do something with all your data in the treatments dict

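This should accumulate all the data into a dict named treatments as results come back from the subprocesses running extract; you can then write the data out in whatever way you like.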
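On the original question of files being overwritten: the posted code dumps every list to the same pkl_filename, so each dump replaces the previous one. A minimal sketch of one way around that is to derive a separate file name per treatment key. The helper name save_treatments, its out_dir and stem parameters, and the <stem>_<key>.pkl naming scheme are assumptions made for illustration; none of them come from the original code.

```python
import pickle
from pathlib import Path


def save_treatments(treatments, out_dir, stem):
    """Write each treatment list to its own pickle file.

    `treatments` maps a key such as 'tr01' to a list of results.
    File names follow the (assumed) pattern <stem>_<key>.pkl, so no
    two treatments share a file and nothing is overwritten.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for key, data in treatments.items():
        path = out_dir / f"{stem}_{key}.pkl"
        with open(path, "wb") as f:
            pickle.dump(data, f)
        written.append(path)
    return written
```

Inside the for file in folder loop this could be called as save_treatments(treatments, "out", Path(file_path).stem), assuming file_path is a path string; using the input file's stem also keeps results from different input files apart.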


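A follow-up on the point about workers not sharing the parent's global namespace: the rewritten extract still reads args as a global. With the fork start method workers inherit a copy of the parent's globals, but with spawn they re-import the module, so anything assigned at run time may be missing. A minimal sketch of passing args explicitly with functools.partial is below. The treatment_a and treatment_b bodies are hypothetical stand-ins (the real treatments are not shown in the question), and merge, process_lines and the layout of the args dict are assumed names.

```python
from functools import partial
from multiprocessing import Pool


def treatment_a(line, args):
    # Hypothetical stand-in for the real treatment.
    return args["prefix"] + line.upper()


def treatment_b(line, args):
    # Hypothetical stand-in for the real treatment.
    return len(line)


def extract(text_lines, args):
    """Process one chunk; everything it needs arrives as a parameter."""
    treatments = dict(tr01=[], tr02=[])
    for line in text_lines:
        treatments["tr01"].append(treatment_a(line, args))
        treatments["tr02"].append(treatment_b(line, args))
    return treatments


def line_gen(lines, chunk_size=1):
    # Yield successive slices of at most chunk_size lines.
    for i in range(0, len(lines), chunk_size):
        yield lines[i:i + chunk_size]


def merge(results):
    """Combine per-chunk dicts, in order, into one dict of lists."""
    merged = dict(tr01=[], tr02=[])
    for treat_data in results:
        for tr, data in treat_data.items():
            merged[tr].extend(data)
    return merged


def process_lines(text_lines, args, workers=6):
    chunk_size = max(1, len(text_lines) // workers)
    # partial binds args to extract; the result can be pickled and
    # sent to the workers because extract is a module-level function.
    worker = partial(extract, args=args)
    with Pool(workers) as p:
        # Results are fully consumed by merge before the pool is torn down.
        return merge(p.imap(worker, line_gen(text_lines, chunk_size)))
```

When running this as a script, call process_lines from under an if __name__ == "__main__": guard, which the spawn start method requires so that workers do not re-run the pool setup on import.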
Answered 2021-08-17