As @BlueSheepToken suggested, groupby from itertools is your friend. Other pure-Python, performant solutions are implemented in the funcy and toolz packages. Here is a solution using toolz:
import csv
from operator import itemgetter

import toolz
import toolz.curried


def stream_file(fp):
    # lazily yield one dict per CSV row, with the numeric columns cast to float
    with open(fp, newline='') as file:
        for line in csv.DictReader(file):
            res = dict(line)
            res['G'] = float(res['G'])
            res['PTS'] = float(res['PTS'])
            yield res


# groups from stream: a list key makes toolz.groupby key on the tuple (Tm, Lg, Pos)
groups = toolz.groupby(['Tm', 'Lg', 'Pos'], stream_file('test_file.csv'))

# aggregation functions: get some value from list, then sum it up
pts_counter = toolz.compose_left(toolz.curried.map(itemgetter('PTS')), sum)
g_counter = toolz.compose_left(toolz.curried.map(itemgetter('G')), sum)

# apply both functions to the input; juxt returns a tuple (sum of PTS, sum of G)
aggregations = toolz.juxt(pts_counter, g_counter)

# for each group's value compute aggregations
toolz.valmap(aggregations, groups)
Output:
{('CHI', 'NBA', 'SG'): (18.3, 60.0),
('CLE', 'NBA', 'SG'): (11.2, 46.0),
('MIA', 'NBA', 'PG'): (16.2, 61.0),
('MIA', 'NBA', 'SG'): (315.4, 887.0)}
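For comparison, the same aggregation can be sketched with only the standard library, using the itertools groupby mentioned at the top. This is a minimal sketch, not the original answer's code: the helper name aggregate and the in-memory sample rows are hypothetical, and only the column names (Tm, Lg, Pos, G, PTS) come from the code above. Note that itertools.groupby only merges adjacent items, so the rows are sorted by the grouping key first.

```python
from itertools import groupby
from operator import itemgetter


def aggregate(rows, keys=('Tm', 'Lg', 'Pos'), fields=('PTS', 'G')):
    """Sum each of `fields` per unique combination of `keys` (hypothetical helper)."""
    key_func = itemgetter(*keys)
    result = {}
    # itertools.groupby only groups consecutive rows, so sort by the same key first
    for key, group in groupby(sorted(rows, key=key_func), key=key_func):
        members = list(group)  # materialize: the group iterator can only be read once
        result[key] = tuple(sum(float(row[f]) for row in members) for f in fields)
    return result


# Hypothetical sample rows, made up for illustration (not the data from the question)
sample_rows = [
    {'Tm': 'MIA', 'Lg': 'NBA', 'Pos': 'SG', 'G': '10', 'PTS': '2.5'},
    {'Tm': 'CHI', 'Lg': 'NBA', 'Pos': 'SG', 'G': '5', 'PTS': '1.5'},
    {'Tm': 'MIA', 'Lg': 'NBA', 'Pos': 'SG', 'G': '20', 'PTS': '4.0'},
]

totals = aggregate(sample_rows)
print(totals)
```

The trade-off is that sorting requires all rows in memory and costs O(n log n), whereas toolz.groupby builds its groups in a single pass without sorting.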