Compare 2 different CSV files and output all changes to a new CSV

牛魔王的故事 2023-12-29 14:31:50
I have 2 CSV files, New.csv and Old.csv, as shown below:

Old.csv

longName,shortName,eventType,number,severity
ACTAGENT201,ACAT201,RES,1,INFO
ACTAGENT202,ACAT202,RES,2,ALERT
ACODE801,AC801,ADMIN,1,MINOR
ACODE802,AC802,ADMIN,2,MINOR
ACODE102,AC102,COMM,2,CRITICAL
ACODE103,AC103,COMM,3,CRITICAL
ACODE104,AC104,COMM,4,CRITICAL
ACODE105,AC105,COMM,5,CRITICAL
ACODE106,AC106,COMM,6,CRITICAL

New.csv

longName,shortName,eventType,number,severity
ACTAGENT201,ACAT201,RES,1,INFO
ACTAGENT202,ACAT202,RES,2,ALERT
ACODE801,AC801,ADMIN,1,MINOR
ACODE802,AC802,ThisHasBeenChanged,2,MINOR
ACODE102,AC102,COMM,2,CRITICAL
ACODE103,AC103,COMM,3,CRITICAL
ACODE104,AC104,COMM,4,THISHASBEENCHANGED
ACODE105,AC105,COMM,5,CRITICAL
ACODE106,AC106,COMM,6,CRITICAL

If the data in any column of a row has been modified between Old.csv and New.csv, the whole row should be appended to changes.csv, with each column from Old.csv and the matching column from New.csv placed next to each other.

1 Answer

一只萌萌小番薯


First, read both CSV files into dictionaries, using the longName value as the key.


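import csv

# File names taken from the question; adjust the paths as needed.
old_csv_file = "Old.csv"
new_csv_file = "New.csv"

# newline="" is the setting the csv module documentation recommends for files.
with open(old_csv_file, "r", newline="") as fh:
    reader = csv.reader(fh)
    # Key each row by its first column (longName) and skip empty lines.
    # The header row is stored too, under the key "longName".
    old_csv = {row[0]: row for row in reader if row}

with open(new_csv_file, "r", newline="") as fh:
    reader = csv.reader(fh)
    new_csv = {row[0]: row for row in reader if row}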
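
One thing the dict comprehension above does not surface is a repeated longName: the later row silently replaces the earlier one. The following is a minimal sketch of a loader that raises instead, assuming longName is meant to be unique. The helper name `load_rows_by_key` and the use of `csv.DictReader` are illustrative choices, not part of the original answer, and the sample data is held in memory so the snippet runs without any files.

```python
import csv
import io


def load_rows_by_key(stream, key_column="longName"):
    """Read CSV rows from a text stream into a dict keyed by one column."""
    reader = csv.DictReader(stream)  # uses the first line as the header
    rows = {}
    for row in reader:
        key = row[key_column]
        if key in rows:
            # A plain dict comprehension would keep only the last such row.
            raise ValueError(f"duplicate key {key!r} in column {key_column!r}")
        rows[key] = row
    return rows


# Two rows copied from the question's Old.csv, kept in memory for the demo.
sample = io.StringIO(
    "longName,shortName,eventType,number,severity\n"
    "ACTAGENT201,ACAT201,RES,1,INFO\n"
    "ACODE801,AC801,ADMIN,1,MINOR\n"
)
old_rows = load_rows_by_key(sample)
print(old_rows["ACODE801"]["severity"])  # MINOR
```

With `DictReader` the header is consumed as field names rather than stored as a data row, so it never shows up among the keys.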


Then, set operations make it easy to find the newly added and removed keys.


old_longNames = set(old_csv.keys())
new_longNames = set(new_csv.keys())

# common: set intersection
common_longNames = old_longNames.intersection(new_longNames)
# removed: whatever's in old but not in new
removed_longNames = old_longNames - new_longNames
# added: whatever's in new but not in old
added_longNames = new_longNames - old_longNames
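
The sample files in the question contain the same keys, so all three sets are easier to see on a smaller example. This is a sketch on hand-written key sets: ACODE801, ACODE802 and ACODE102 come from the question, while ACODE999 is a made-up key added only to show an insertion.

```python
# Keys as they might come out of the two dictionaries.
old_keys = {"ACODE801", "ACODE802", "ACODE102"}
new_keys = {"ACODE802", "ACODE102", "ACODE999"}  # ACODE999 is hypothetical

common = old_keys & new_keys   # same result as old_keys.intersection(new_keys)
removed = old_keys - new_keys  # present only in the old file
added = new_keys - old_keys    # present only in the new file

# sorted() gives a stable order for display; sets themselves are unordered.
print(sorted(common))   # ['ACODE102', 'ACODE802']
print(sorted(removed))  # ['ACODE801']
print(sorted(added))    # ['ACODE999']
```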

Finally, iterate over the common set to find the rows that changed:


changed_longNames = []
for key in common_longNames:
    old_row = old_csv[key]
    new_row = new_csv[key]
    # if any(o != n for o, n in zip(old_row, new_row)):
    if old_row != new_row:
        # this row has at least one column changed. Do whatever
        print(f"LongName {key} has changes")
        changed_longNames.append(key)

Or, as a list comprehension:


changed_longNames = [key for key in common_longNames if old_csv[key] != new_csv[key]]
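
The comparison above only says that a row differs. If it is also useful to know which columns differ, a small helper along these lines could report them. This is a sketch: `changed_columns` is a hypothetical name, and it assumes both rows have the same number of columns as the header, since `zip` stops at the shortest input.

```python
def changed_columns(header, old_row, new_row):
    """Return (column name, old value, new value) for every differing column."""
    return [
        (name, old, new)
        for name, old, new in zip(header, old_row, new_row)
        if old != new
    ]


# The ACODE802 row from the question's Old.csv and New.csv.
header = ["longName", "shortName", "eventType", "number", "severity"]
old_row = ["ACODE802", "AC802", "ADMIN", "2", "MINOR"]
new_row = ["ACODE802", "AC802", "ThisHasBeenChanged", "2", "MINOR"]

diffs = changed_columns(header, old_row, new_row)
print(diffs)  # [('eventType', 'ADMIN', 'ThisHasBeenChanged')]
```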

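Writing everything to new CSV files is also fairly simple. Note that sets do not preserve order, so the rows may not come out in the same order as in the input files.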


# newline="" keeps the csv module from writing extra blank lines on Windows.
with open("deleted.csv", "w", newline="") as fh:
    writer = csv.writer(fh)
    for key in removed_longNames:
        writer.writerow(old_csv[key])

with open("inserted.csv", "w", newline="") as fh:
    writer = csv.writer(fh)
    for key in added_longNames:
        writer.writerow(new_csv[key])

with open("changed.csv", "w", newline="") as fh:
    writer = csv.writer(fh)
    for key in changed_longNames:
        old_row = old_csv[key]
        new_row = new_csv[key]
        # Interleave the columns: old value first, then the new value.
        merged_row = []
        for oi, ni in zip(old_row, new_row):
            merged_row.append(oi)
            merged_row.append(ni)
        writer.writerow(merged_row)
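
If the output order matters, one option is to walk the new file's dictionary directly instead of a set, because dictionaries keep insertion order in Python 3.7 and later. The sketch below does that and also writes a header row. The function name `write_changes` and the `old_`/`new_` column prefixes are illustrative assumptions, and an in-memory buffer stands in for the output file so the snippet runs on its own.

```python
import csv
import io


def write_changes(header, old_rows, new_rows, out_stream):
    """Write changed rows with old and new values side by side."""
    writer = csv.writer(out_stream, lineterminator="\n")
    # Header naming (old_<col>, new_<col>) is an illustrative choice.
    merged_header = []
    for name in header:
        merged_header += [f"old_{name}", f"new_{name}"]
    writer.writerow(merged_header)

    count = 0
    # Dicts keep insertion order, so this follows the order of the new file.
    for key, new_row in new_rows.items():
        old_row = old_rows.get(key)
        if old_row is None or old_row == new_row:
            continue  # added or unchanged rows are skipped
        merged = []
        for old_value, new_value in zip(old_row, new_row):
            merged += [old_value, new_value]
        writer.writerow(merged)
        count += 1
    return count


# Two rows from the question; only ACODE802 differs between the files.
header = ["longName", "shortName", "eventType", "number", "severity"]
old_rows = {
    "ACODE801": ["ACODE801", "AC801", "ADMIN", "1", "MINOR"],
    "ACODE802": ["ACODE802", "AC802", "ADMIN", "2", "MINOR"],
}
new_rows = {
    "ACODE801": ["ACODE801", "AC801", "ADMIN", "1", "MINOR"],
    "ACODE802": ["ACODE802", "AC802", "ThisHasBeenChanged", "2", "MINOR"],
}

buffer = io.StringIO()
changed_count = write_changes(header, old_rows, new_rows, buffer)
print(buffer.getvalue())
```

To write to disk instead, a file opened with `open("changed.csv", "w", newline="")` can be passed in place of the buffer.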


Replied 2023-12-29