How do I create a Python dictionary from unstructured text?

拉风的咖菲猫 2021-09-28 15:08:15
I have a set of broken-link checker results stored in a text file:

Getting links from: https://www.foo.com/
├───OK─── http://www.this.com/
├───OK─── http://www.is.com/
├─BROKEN─ http://www.broken.com/
├───OK─── http://www.set.com/
├───OK─── http://www.one.com/
5 links found. 0 excluded. 1 broken.

Getting links from: https://www.bar.com/
├───OK─── http://www.this.com/
├───OK─── http://www.is.com/
├─BROKEN─ http://www.broken.com/
3 links found. 0 excluded. 1 broken.

Getting links from: https://www.boo.com/
├───OK─── http://www.this.com/
├───OK─── http://www.is.com/
2 links found. 0 excluded. 0 broken.

I'm trying to write a script that reads the file and creates a list of dictionaries, with each root link as the key and its child links as the value (including the summary line). The output I'm trying to achieve looks like this:

{"Getting links from: https://www.foo.com/": ["├───OK─── http://www.this.com/", "├───OK─── http://www.is.com/", "├─BROKEN─ http://www.broken.com/", "├───OK─── http://www.set.com/", "├───OK─── http://www.one.com/", "5 links found. 0 excluded. 1 broken."],
 "Getting links from: https://www.bar.com/": ["├───OK─── http://www.this.com/", "├───OK─── http://www.is.com/", "├─BROKEN─ http://www.broken.com/", "3 links found. 0 excluded. 1 broken."],
 "Getting links from: https://www.boo.com/": ["├───OK─── http://www.this.com/", "├───OK─── http://www.is.com/", "2 links found. 0 excluded. 0 broken."]}

This is what I have so far:

result_list = []
with open('link_checker_result.txt', 'r') as f:
    temp_list = f.readlines()
    for line in temp_list:
        result_list.append(line)

This gives me the output:

['Getting links from: https://www.foo.com/', '├───OK─── http://www.this.com/', '├───OK─── http://www.is.com/', '├─BROKEN─ http://www.broken.com/', '├───OK─── http://www.set.com/', '├───OK─── http://www.one.com/', '5 links found. 0 excluded. 1 broken.', 'Getting links from: https://www.bar.com/', '├───OK─── http://www.this.com/', '├───OK─── http://www.is.com/', '...']

I realize that each of these sets shares some features: for example, there is a blank line between them, and each one begins with "Getting...". Is that something I should try to split on before writing to the dictionary? I'm new to Python, so I admit I'm not even sure I'm heading in the right direction. I'd really appreciate an expert eye. Thanks in advance!

2 Answers

繁花如伊

Contributed 2012 experience points, received more than 12 upvotes

This will produce the result you want:


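result = {}

# encoding='utf-8' is an assumption; the file contains box-drawing characters.
with open('link_checker_result.txt', 'r', encoding='utf-8') as f:
    key = ''
    value = []
    for raw_line in f:
        # Strip the trailing newline so that blank lines really are empty strings.
        line = raw_line.strip()
        if not line:
            # A blank line closes the current group, if one is open.
            if key:
                result[key] = value
            key = ''
            value = []
        elif not key:
            # The first non-blank line of a group is the root link.
            key = line
        else:
            value.append(line)

    # Store the last group when the file does not end with a blank line.
    if key:
        result[key] = value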

Answered 2021-09-28