1 answer
Do you want all the values of Total Due? If so, you can simply do the following:
sep = "|"
value_name = "Total Due"
result = []
with open("thefile.txt", 'r') as f:
    for line in f:
        tokens = line.split(sep)
        try:
            # The value is the token right after the "Total Due" label
            ind_total_due = tokens.index(value_name) + 1
            result.append((value_name, tokens[ind_total_due]))
        except ValueError:
            # This line has no "Total Due" label, skip it
            continue
The result will be:
[('Total Due', '86,578.93'),
('Total Due', '26,451.75'),
('Total Due', '3,483.28'),
('Total Due', '983.04'),
('Total Due', '- 197,358.33')]
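The same lookup can be tried without a file. The following is only a minimal sketch: `extract_values` and the sample lines are hypothetical, and the pipe-separated layout is an assumption about what thefile.txt looks like. It also guards against the label being the last token on a line, which the loop above does not.

```python
def extract_values(lines, value_name, sep="|"):
    """Return the token that follows value_name on each line that contains it."""
    found = []
    for line in lines:
        tokens = line.strip().split(sep)
        if value_name in tokens:
            position = tokens.index(value_name) + 1
            # Skip lines where the label is the last token (no value after it)
            if position < len(tokens):
                found.append(tokens[position])
    return found


# Hypothetical sample lines; the real layout of thefile.txt is assumed.
sample_lines = [
    "600001130260|Total Due|86,578.93|Due Date|2015/09/22\n",
    "142002121343|Total Due|- 197,358.33\n",
    "a line without the label\n",
]
values = extract_values(sample_lines, "Total Due")
```

The values are still strings at this point; converting them to numbers is a separate step.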
The first token appears to be a unique identifier. If you want a CSV export and support for multiple columns, you can do this:
token_sep = "|"
csv_sep = ";"

# Helper that converts a total due string such as "- 197,358.33" to a float
def float_formatter(string):
    return float(string.replace(" ", "").replace(",", ""))

# Attributes you want to parse: (column name, formatter, offset of the
# value relative to the column name token)
col_names = (
    ("Total Due", float_formatter, 1),
    ("Due Date", None, 1),
)

# Dictionary that maps each identifier to a total due and a due date
records = {}
with open("thefile.txt", 'r') as f:
    for line in f:
        tokens = line.strip().split(token_sep)
        # We assume the first token is an identifier
        unique_id = tokens[0]
        if not unique_id:
            continue
        # For each new identifier we create a new record and initialize
        # total due and due date to an empty string
        if unique_id not in records:
            records[unique_id] = {col_name: "" for col_name, _, _ in col_names}
        # Then we look for the values we are interested in. If we find one,
        # we update that value of the record
        for col_name, formatter, index_val in col_names:
            try:
                ind_col = tokens.index(col_name) + index_val
                value = tokens[ind_col]
                if formatter:
                    value = formatter(value)
                records[unique_id][col_name] = value
            except (ValueError, IndexError):
                # Column missing on this line, or no value after its label
                continue

# For an easier CSV export we reformat the records dict to a list of rows
list_values = [
    (unique_id,) + tuple(values[col] for col, _, _ in col_names)
    for unique_id, values in records.items()
]

# We can then easily write all the records one by one
with open("mycsv.csv", "w") as f:
    f.write(csv_sep.join(["id"] + [c for c, _, _ in col_names]))
    for values in list_values:
        f.write("\n")
        f.write(csv_sep.join(map(str, values)))
mycsv.csv:
id;Total Due;Due Date
112002209769;3483.28;2015/09/23
142002121343;-197358.33;
600001130260;86578.93;2015/09/22
28002385859;26451.75;2015/09/23
100002232416;983.04;2015/09/23
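As an alternative to joining strings by hand, the export step could also go through the standard library's csv module, which takes care of quoting if a value ever contains the separator. This is only a sketch: `write_records` is a hypothetical helper, and the two sample records are copied from the output above.

```python
import csv
import io


def write_records(records, col_names, out, delimiter=";"):
    """Write a header row, then one row per record, to a file-like object."""
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    writer.writerow(["id"] + list(col_names))
    for unique_id, values in records.items():
        # Missing columns fall back to an empty field
        writer.writerow([unique_id] + [values.get(col, "") for col in col_names])


# Two records taken from the output shown above
sample_records = {
    "112002209769": {"Total Due": 3483.28, "Due Date": "2015/09/23"},
    "142002121343": {"Total Due": -197358.33, "Due Date": ""},
}
buffer = io.StringIO()
write_records(sample_records, ["Total Due", "Due Date"], buffer)
csv_text = buffer.getvalue()
```

To write to disk instead of a buffer, pass a file opened with `open("mycsv.csv", "w", newline="")` as `out`.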