1 Answer

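So the problem is that once you split the work across a pool, the workers no longer share the common global namespace that the current code (ab)uses. So let's rewrite it to pass things around properly.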
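from functools import partial
from multiprocessing import Pool

# treatment_a..treatment_d, args, folder and read_file_into_list come from the question's code.

def extract(text_lines, args):
    # Runs in a worker process: apply every treatment to each line of one chunk.
    treatments = dict(tr01=[], tr02=[], tr03=[], tr04=[])
    for line in text_lines:
        treatments['tr01'].append(treatment_a(line, args))
        treatments['tr02'].append(treatment_b(line, args))
        treatments['tr03'].append(treatment_c(line, args))
        treatments['tr04'].append(treatment_d(line, args))
    return treatments

def line_gen(lines, chunk_size=1):
    # Yield consecutive slices of at most chunk_size lines.
    for i in range(0, len(lines), chunk_size):
        yield lines[i:i + chunk_size]

if __name__ == '__main__':
    with Pool(6) as p:
        for file_path in folder:
            text_lines = read_file_into_list(file_path)
            treatments = dict(tr01=[], tr02=[], tr03=[], tr04=[])
            # Never let the chunk size drop to 0, or range() in line_gen raises ValueError.
            chunk_size = max(1, len(text_lines) // 6)
            # args is passed explicitly instead of being read from a global.
            for treat_data in p.imap(partial(extract, args=args), line_gen(text_lines, chunk_size)):
                for tr, data in treat_data.items():
                    treatments[tr].extend(data)
            # Do something with all your data in the treatments dict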
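This should pile all your data into a single dict called treatments, because the data comes back as the return value of extract running in the child processes. You can then write the data out however you like.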
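If you want to try the chunk-and-merge pattern on its own first, here is a minimal self-contained sketch. The two toy treatments (treatment_upper, treatment_length), the suffix argument and the merge helper are made up for illustration and stand in for your real treatment functions and args; only Pool.imap and functools.partial are actual library calls.

```python
from functools import partial
from multiprocessing import Pool


def treatment_upper(line, suffix):
    # Hypothetical stand-in for one of the real treatments.
    return line.upper() + suffix


def treatment_length(line, suffix):
    # Hypothetical stand-in; ignores suffix and returns the line length.
    return len(line)


def extract(text_lines, suffix):
    # Runs in a worker: build and return a fresh dict for this chunk only.
    result = dict(tr01=[], tr02=[])
    for line in text_lines:
        result["tr01"].append(treatment_upper(line, suffix))
        result["tr02"].append(treatment_length(line, suffix))
    return result


def line_gen(lines, chunk_size=1):
    # Yield consecutive slices of at most chunk_size lines.
    for i in range(0, len(lines), chunk_size):
        yield lines[i:i + chunk_size]


def merge(partial_results):
    # Runs in the parent: fold the per-chunk dicts into one dict.
    merged = dict(tr01=[], tr02=[])
    for treat_data in partial_results:
        for tr, data in treat_data.items():
            merged[tr].extend(data)
    return merged


if __name__ == "__main__":
    lines = ["alpha", "beta", "gamma", "delta", "epsilon"]
    chunk_size = max(1, len(lines) // 2)
    with Pool(2) as pool:
        # imap yields results in input order, so the merged lists keep line order.
        merged = merge(pool.imap(partial(extract, suffix="!"), line_gen(lines, chunk_size)))
    print(merged["tr01"])
    print(merged["tr02"])
```

The point of the design is that each worker only ever touches the dict it builds and returns, and the parent is the only place where results are combined, so nothing relies on shared global state.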