
How do I convert a txt file to JSON using leading whitespace?


慕后森 2022-06-28 18:02:15
I have a formatted txt file like this:

Hostinfo Start
  DATE 190819 1522
  HOST midas
  DOMAIN test.de
  HW_PLATFORM x86_64
  SERVER_TYPE virtual
  CPU_INFO
    CPU_TYPE  Intel(R) Xeon(R) CPU E7-8867 v4 @ 2.40GHz
    CPU_COUNT 2
    CORE_COUNT 2
  THREAD_COUNT 8
  MEMORY       32951312 kB
  OS Start
    OS Linux
    OS_VERSION 4.9.0-6-amd64
    OS_UPTIME 536 days 21:08
    OS End
  RELEASE Debian GNU/Linux 9 (stretch)
  RELEASE_VERSION 9
  RELEASE_PATCHLEVEL
Hostinfo End

Using the count of leading spaces, I need to convert it to a JSON format similar to this:

"Hostinfo": [{
  "DATE": "190819 1522"
  "HOST": "midas"
  "DOMAIN": "test.de"
  "HW_PLATFORM": "x86_64"
  "SERVER_TYPE": "virtual"
  "CPU_INFO": {
    "CPU_TYPE": "Intel(R) Xeon(R) CPU E7-8867 v4 @ 2.40GHz"
    "CPU_COUNT": "2"
    "CORE_COUNT": "2"
    }
  "THREAD_COUNT": "8"
  "MEMORY": "32951312 kB"
  "OS": [
      {
      "OS": "Linux"
      "OS_VERSION": "4.9.0-6-amd64"
      "OS_UPTIME": "536 days 21:08"
      }
    ]
  "RELEASE": "Debian GNU/Linux 9 (stretch)"
  "RELEASE_VERSION": "9"
  "RELEASE_PATCHLEVEL" : ""
}]

I have made a start with this script, but I can't work out how to turn the lines between braces into an object of the parent dictionary (level):

#!/usr/bin/python
import json
import itertools
import string
import re

filename = 'commands.txt'

commands = {}
with open(filename) as fh:
    previous_line = 0
    mark_line = ""
    for line in fh:
        # indentation level: two leading spaces per level
        current_line = ((len(line) - len(line.lstrip()))/2)
        diff = current_line - previous_line
        if re.search(' Start$', line.strip()):
            line = line.strip().replace(' Start', ':{')
            print(line)
            mark_line = "start_line"
        elif re.search(' End$', line.strip()):
            line = line.strip().replace(' End', '')
            print("}")
            mark_line = "end_line"
        elif diff == 0:
            print(line.strip())
        elif diff > 0:
            if mark_line == "start_line" or mark_line == "end_line":
                mark_line = "0"
            else:
                print("{")
                print(line.strip())
        elif diff < 0:
            pass  # this branch was left unfinished in the posted script

Could you tell me how to solve this, or offer some suggestions? Is there perhaps a module that would simplify this script?

UPD: I added some JSON markup to my messy script. Could you advise whether I am moving in the right or the wrong direction?

1 Answer

GCT1015


You can use itertools.groupby with recursion:


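import itertools as it, json, re

# Requires Python 3.8+ (uses the := operator).
# Each row is [indent (only if the line is indented), key, *value parts];
# " Start" suffixes are stripped and lines ending in "End" are dropped.
with open('os_stuff.txt') as fh:
    data = [[*re.findall(r'^\s+', b), *re.split(r'(?<=[A-Z])\s+', i)]
            for b in fh
            if not (i := re.sub(r'^\s+|\sStart\n$', '', b)).endswith('End\n')]

def to_tree(d):
    # Split the rows into runs: rows at this level (False) and indented child rows (True).
    _d = [(a, list(b)) for a, b in it.groupby(d, key=lambda x: bool(re.findall(r'^\s+$', x[0])))]
    new_dict, _last = {}, None
    for i, [a, b] in enumerate(_d):
        if not a:
            for j, *k in b:
                if not k or (not k[0] and i < len(_d) - 2):
                    _last = j  # key without a value: parent of the indented run that follows
                else:
                    new_dict[j] = ' '.join(k).strip('\n')
        else:
            # Remove one indentation level (two spaces) from each child row and recurse.
            new_dict[_last] = [to_tree([[k[2:], *j] if k[2:] else j for k, *j in b])]
    return new_dict

print(json.dumps(to_tree(data), indent=4))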

Output:


{
    "Hostinfo": [
        {
            "DATE": "190819 1522",
            "HOST": "midas",
            "DOMAIN": "test.de",
            "HW_PLATFORM": "x86_64",
            "SERVER_TYPE": "virtual",
            "CPU_INFO": [
                {
                    "CPU_TYPE": "Intel(R) Xeon(R) CPU E7-8867 v4 @ 2.40GHz",
                    "CPU_COUNT": "2",
                    "CORE_COUNT": "2"
                }
            ],
            "THREAD_COUNT": "8",
            "MEMORY": "32951312 kB ",
            "OS": [
                {
                    "OS": "Linux",
                    "OS_VERSION": "4.9.0-6-amd64",
                    "OS_UPTIME": "536 days 21:08"
                }
            ],
            "RELEASE": "Debian GNU/Linux 9 (stretch)",
            "RELEASE_VERSION": "9",
            "RELEASE_PATCHLEVEL": ""
        }
    ]
}
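To see why groupby fits here, it may help to look at the grouping step in isolation. The following is only a small sketch with made-up rows (the `rows` list and the `is_indented` helper are illustrative names, not part of the answer): itertools.groupby collects consecutive rows that share the same key, so rows at the current level and indented child rows fall into alternating runs, which is what to_tree then walks over.

```python
import itertools as it

# Hypothetical rows in the same shape the answer builds:
# an indent string comes first only on nested lines.
rows = [
    ["DATE", "190819 1522"],
    ["CPU_INFO", ""],
    ["  ", "CPU_COUNT", "2"],
    ["  ", "CORE_COUNT", "2"],
    ["THREAD_COUNT", "8"],
]

def is_indented(row):
    # True when the first field consists only of whitespace.
    return row[0].isspace()

# groupby only merges *consecutive* rows with an equal key,
# so the original order of the lines is preserved.
runs = [(flag, list(group)) for flag, group in it.groupby(rows, key=is_indented)]

for flag, group in runs:
    print(flag, len(group))
```

Each True run is the body of the last key seen in the False run before it, which is why the answer remembers that key in _last before recursing.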

Edit: Python 2.7 solution:


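import itertools as it, re

# Pair every raw line with a copy that has the indent and any " Start" suffix removed.
new_data = [[i, re.sub(r'^\s+|\sStart\n$', '', i)] for i in open('os_stuff.txt')]
# Rows are [indent (only if the line is indented), key, value parts...]; "End" lines are dropped.
data = [re.findall(r'^\s+', a) + re.split(r'(?<=[A-Z])\s+', b) for a, b in new_data if not b.endswith('End\n')]

def to_tree(d):
    # Split the rows into runs: rows at this level (False) and indented child rows (True).
    _d = [(a, list(b)) for a, b in it.groupby(d, key=lambda x: bool(re.findall(r'^\s+$', x[0])))]
    new_dict, _last = {}, None
    for i, [a, b] in enumerate(_d):
        if not a:
            for j_k in b:
                if not j_k[1:] or (not j_k[1:][0] and i < len(_d) - 2):
                    _last = j_k[0]  # key without a value: parent of the next indented run
                else:
                    new_dict[j_k[0]] = ' '.join(j_k[1:]).strip('\n')
        else:
            # Remove one indentation level (two spaces) from each child row and recurse.
            new_dict[_last] = [to_tree([[k_j[0][2:]] + k_j[1:] if k_j[0][2:] else k_j[1:] for k_j in b])]
    return new_dict

print(to_tree(data))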
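For comparison, here is a rough sketch of a different approach in Python 3: an explicit stack of open blocks driven by the count of leading spaces. This is not the method used in the answer above. The names `parse_indented` and `indent_width` are made up for illustration, and the sketch assumes two spaces per level, no tabs, and that any line ending in " End" is only a block terminator. Under those assumptions, "Start" blocks are wrapped in a list and a bare key followed by deeper lines becomes a plain object, which is close to the shape asked for in the question.

```python
import json

def parse_indented(lines, indent_width=2):
    """Build nested dicts from 'KEY value' lines, nesting by leading spaces."""
    root = {}
    stack = [(-1, root)]  # (level of the owning line, dict collecting its children)
    last = None           # (level, dict, key) of the most recent plain "KEY value" line
    for raw in lines:
        text = raw.strip()
        if not text or text.endswith(" End"):
            continue  # assumption: block ends are implied by the indentation alone
        level = (len(raw) - len(raw.lstrip(" "))) // indent_width
        # A key without a value that is followed by a deeper line becomes a nested dict.
        if last and level > last[0] and last[1][last[2]] == "":
            child = {}
            last[1][last[2]] = child
            stack.append((last[0], child))
        # Close every open block that is not an ancestor of this line.
        while stack[-1][0] >= level:
            stack.pop()
        parent = stack[-1][1]
        if text.endswith(" Start"):
            child = {}
            parent[text[:-len(" Start")]] = [child]  # "Start" blocks are wrapped in a list
            stack.append((level, child))
            last = None
        else:
            key, _, value = text.partition(" ")
            parent[key] = value.strip()
            last = (level, parent, key)
    return root

sample = """Hostinfo Start
  HOST midas
  CPU_INFO
    CPU_COUNT 2
  OS Start
    OS Linux
    OS End
  RELEASE_PATCHLEVEL
Hostinfo End
"""

result = parse_indented(sample.splitlines())
print(json.dumps(result, indent=2))
```

The stack keeps the nesting logic in one place, so deeper levels need no extra code, but a value that itself ends in " End" or " Start" would be misread and would need a stricter check.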


2022-06-28