
Extracting only the relevant information from a list of lists in Python

Python

鸿蒙传说 2022-04-23 21:51:54

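I have a list containing sublists of strings, like:

info = [['Price: 5000', 'In warranty', 'Weight: 8 kg'], ['Refundable', 'Price: 2800', 'Weight: 5.5 kg', 'Extra battery power'], ['Price: 9000', 'Non-exchangeable', 'Weight: 8 kg', 'High-Quality']..]

Each sublist contains some irrelevant extra strings. I only need the 5 values in each sublist that actually describe the product, and each of those 5 values has its own keyword. How can I use the keywords to extract the useful strings from the sublists and discard the rest? In the example above, I only want to keep 'Price' and 'Weight'.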


3 Answers

胡说叔叔

Contributed 1804 experience points, received 8+ likes

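The in operator can be used to check whether one string is contained in another (or whether an item is in a list). The built-in any function lets you check several keywords at once.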

info = [
    ['Price: 5000', 'In warranty', 'Weight: 8 kg'],
    ['Refundable', 'Price: 2800', 'Weight: 5.5 kg', 'Extra battery power'],
    ['Price: 9000', 'Non-exchangeable', 'Weight: 8 kg', 'High-Quality']
]

keywords = ['Price', 'Weight']

for item in info:
    # Keep only the strings that contain at least one keyword
    print([x for x in item if any(kw in x for kw in keywords)])

Output:

['Price: 5000', 'Weight: 8 kg']
['Price: 2800', 'Weight: 5.5 kg']
['Price: 9000', 'Weight: 8 kg']
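One caveat with kw in x is that it matches the keyword anywhere in the string, so a hypothetical entry such as 'Best Price guarantee' would also be kept. If every useful entry follows the 'Keyword: value' pattern seen in the question (an assumption), a stricter sketch could match on the prefix instead. The name filter_by_prefix is illustrative and not part of the original answer.

```python
info = [
    ['Price: 5000', 'In warranty', 'Weight: 8 kg'],
    ['Refundable', 'Price: 2800', 'Weight: 5.5 kg', 'Extra battery power'],
    ['Price: 9000', 'Non-exchangeable', 'Weight: 8 kg', 'High-Quality'],
]
keywords = ['Price', 'Weight']


def filter_by_prefix(rows, keywords):
    """Keep only strings that start with '<keyword>:' (assumed entry format)."""
    # str.startswith accepts a tuple and returns True if any prefix matches
    prefixes = tuple(f'{kw}:' for kw in keywords)
    return [[x for x in row if x.startswith(prefixes)] for row in rows]


filtered = filter_by_prefix(info, keywords)
for row in filtered:
    print(row)

# A hypothetical entry that merely mentions a keyword is dropped
print(filter_by_prefix([['Best Price guarantee', 'Price: 100']], keywords))
```

Whether prefix matching or substring matching is the better fit depends on how regular the real data is.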

A cleaner format for this data might be a list of dictionaries.

info = [
    {
        'Price': 5000,
        'Weight': '8 kg',
        'Attributes': ['In warranty']
    },
    {
        'Price': 2800,
        'Weight': '5.5 kg',
        'Attributes': ['Refundable', 'Extra battery power']
    },
    {
        'Price': 9000,
        'Weight': '8 kg',
        'Attributes': ['Non-exchangeable', 'High-Quality']
    }
]

keywords = ['Price', 'Weight']

# Keep only the key/value pairs whose key is one of the keywords
info_filtered = [{k: v for k, v in item.items() if k in keywords} for item in info]

print(info_filtered)

Output (wrapped across lines for readability):

[
    {'Price': 5000, 'Weight': '8 kg'},
    {'Price': 2800, 'Weight': '5.5 kg'},
    {'Price': 9000, 'Weight': '8 kg'}
]
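The dictionary version assumes the data is already structured. Starting from the list of string sublists in the question, a rough sketch of the conversion could split each 'Key: value' string on the first ': ' and collect everything else under 'Attributes'. The helper name to_record and the choice to leave values as strings are assumptions made for illustration.

```python
info = [
    ['Price: 5000', 'In warranty', 'Weight: 8 kg'],
    ['Refundable', 'Price: 2800', 'Weight: 5.5 kg', 'Extra battery power'],
    ['Price: 9000', 'Non-exchangeable', 'Weight: 8 kg', 'High-Quality'],
]
keywords = ['Price', 'Weight']


def to_record(row):
    """Turn one sublist of strings into a dict (hypothetical helper)."""
    record = {'Attributes': []}
    for text in row:
        # partition returns (before, separator, after); separator is '' if absent
        key, sep, value = text.partition(': ')
        if sep:
            record[key] = value  # value stays a string, e.g. '5000'
        else:
            record['Attributes'].append(text)
    return record


records = [to_record(row) for row in info]
filtered = [{k: v for k, v in rec.items() if k in keywords} for rec in records]
print(filtered)
```

Converting '5000' to a number or splitting '8 kg' into a value and a unit would be a further step, left out here.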

Answered 2022-04-23

潇潇雨雨

Contributed 1833 experience points, received 4+ likes

A one-liner using functional programming (map, filter, and any)

info = [
    ['Price: 5000', 'In warranty', 'Weight: 8 kg'],
    ['Refundable', 'Price: 2800', 'Weight: 5.5 kg', 'Extra battery power'],
    ['Price: 9000', 'Non-exchangeable', 'Weight: 8 kg', 'High-Quality']
]

keywords = ['Price', 'Weight']

l = map(lambda sub_list: list(filter(lambda element: any(map(lambda keyword: keyword in element, keywords)), sub_list)), info)

print(list(l))

Output:

[['Price: 5000', 'Weight: 8 kg'], ['Price: 2800', 'Weight: 5.5 kg'], ['Price: 9000', 'Weight: 8 kg']]

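Explanation of each part of the one-liner

map(lambda sub_list: list(filter(lambda element: any(map(lambda keyword: keyword in element, keywords)), sub_list)), info)

Applies the outer lambda function to every element (sublist) of info.

filter(lambda element: any(map(lambda keyword: keyword in element, keywords)), sub_list)

From all the values in sub_list, keeps only those that contain at least one keyword (the filter step).

any(map(lambda keyword: keyword in element, keywords))

Returns True if any of the keywords appears in element, and False otherwise.

Note: list() is used to materialize the lazy iterators that map and filter return in Python 3.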
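For comparison, the same logic can be written with a named helper and a nested list comprehension, which some readers find easier to follow than nested lambdas. This is a sketch of an equivalent form, not part of the original answer; contains_keyword is an illustrative name. The last line checks that both versions agree.

```python
info = [
    ['Price: 5000', 'In warranty', 'Weight: 8 kg'],
    ['Refundable', 'Price: 2800', 'Weight: 5.5 kg', 'Extra battery power'],
    ['Price: 9000', 'Non-exchangeable', 'Weight: 8 kg', 'High-Quality'],
]
keywords = ['Price', 'Weight']


def contains_keyword(element, keywords):
    """True if at least one keyword occurs as a substring of element."""
    return any(keyword in element for keyword in keywords)


# Functional version from the answer, wrapped in list() to materialize it
functional = list(map(
    lambda sub_list: list(filter(
        lambda element: any(map(lambda keyword: keyword in element, keywords)),
        sub_list)),
    info))

# Equivalent nested list comprehension
comprehension = [
    [element for element in sub_list if contains_keyword(element, keywords)]
    for sub_list in info
]

print(functional == comprehension)
```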

Answered 2022-04-23

aluckdog

Contributed 1847 experience points, received 7+ likes

One possible solution uses difflib.SequenceMatcher (doc). However, the ratio threshold may need some tuning:

from difflib import SequenceMatcher
from pprint import pprint

info = [['Price: 5000', 'In warranty', 'Weight: 8 kg'],
        ['Refundable', 'Price: 2800', 'Weight: 5.5 kg', 'Extra battery power'],
        ['Price: 9000', 'Non-exchangeable', 'Weight: 8 kg', 'High-Quality']]

keywords = ['Price', 'Weight']

out = []
for i in info:
    out.append([])
    for item in i:
        # Keep the item if it is similar enough to any keyword (case-insensitive)
        if any(SequenceMatcher(None, item.lower(), kw.lower()).ratio() > 0.5 for kw in keywords):
            out[-1].append(item)

pprint(out)

Prints:

[['Price: 5000', 'Weight: 8 kg'],
 ['Price: 2800', 'Weight: 5.5 kg'],
 ['Price: 9000', 'Weight: 8 kg']]
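To see why the threshold may need tuning, it can help to print the best ratio for each string. The sketch below uses the same SequenceMatcher call as the answer; best_ratio is an illustrative helper name. With this data, 'High-Quality' scores about 0.44 against 'Weight' (they share 'igh' and 't'), which is not far below the 0.5 cutoff.

```python
from difflib import SequenceMatcher

info = [
    ['Price: 5000', 'In warranty', 'Weight: 8 kg'],
    ['Refundable', 'Price: 2800', 'Weight: 5.5 kg', 'Extra battery power'],
    ['Price: 9000', 'Non-exchangeable', 'Weight: 8 kg', 'High-Quality'],
]
keywords = ['Price', 'Weight']


def best_ratio(item, keywords):
    """Highest similarity ratio between item and any keyword (case-insensitive)."""
    return max(
        SequenceMatcher(None, item.lower(), kw.lower()).ratio()
        for kw in keywords
    )


# List every string with its score and the decision at a 0.5 threshold
for row in info:
    for item in row:
        score = best_ratio(item, keywords)
        decision = 'keep' if score > 0.5 else 'drop'
        print(f'{score:.2f}  {decision}  {item}')
```

Because the ratio depends on string length, longer values after the keyword lower the score, so a fixed threshold may not carry over to other data.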

Answered 2022-04-23
