3 Answers

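Contributor with 1804 experience points, 8+ upvotes
The `in` operator checks whether a string contains a given substring (or whether a list contains a given item). Combine it with the built-in `any()` function to test several keywords at once.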
info = [
    ['Price: 5000', 'In warranty', 'Weight: 8 kg'],
    ['Refundable', 'Price: 2800', 'Weight: 5.5 kg', 'Extra battery power'],
    ['Price: 9000', 'Non-exchangeable', 'Weight: 8 kg', 'High-Quality']
]
keywords = ['Price', 'Weight']
for item in info:
    # Keep only the entries that contain at least one keyword
    print([x for x in item if any(kw in x for kw in keywords)])
Output:
['Price: 5000', 'Weight: 8 kg']
['Price: 2800', 'Weight: 5.5 kg']
['Price: 9000', 'Weight: 8 kg']
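One caveat with the substring test: `kw in x` matches the keyword anywhere in the entry, so an entry that merely mentions a keyword is kept too. The sketch below assumes that every wanted entry starts with its keyword; the extra entry 'Best Price guarantee' and the names `by_substring` and `by_prefix` are made up for illustration. It relies on `str.startswith`, which accepts a tuple of prefixes.

```python
# Hypothetical item with one entry that mentions a keyword mid-string.
item = ['Price: 5000', 'Best Price guarantee', 'Weight: 8 kg']
keywords = ['Price', 'Weight']

# Substring test: matches the keyword anywhere in the entry.
by_substring = [x for x in item if any(kw in x for kw in keywords)]

# Prefix test: str.startswith accepts a tuple of candidate prefixes.
by_prefix = [x for x in item if x.startswith(tuple(keywords))]

print(by_substring)  # ['Price: 5000', 'Best Price guarantee', 'Weight: 8 kg']
print(by_prefix)     # ['Price: 5000', 'Weight: 8 kg']
```

Whether the stricter prefix test is appropriate depends on how regular the real data is.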
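A cleaner format for this data might be a list of dictionaries, so that each field can be looked up by key instead of by substring.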
info = [
    {
        'Price': 5000,
        'Weight': '8 kg',
        'Attributes': ['In warranty']
    },
    {
        'Price': 2800,
        'Weight': '5.5 kg',
        'Attributes': ['Refundable', 'Extra battery power']
    },
    {
        'Price': 9000,
        'Weight': '8 kg',
        'Attributes': ['Non-exchangeable', 'High-Quality']
    }
]
keywords = ['Price', 'Weight']
# Keep only the key/value pairs whose key is one of the keywords
info_filtered = [{k: v for k, v in item.items() if k in keywords} for item in info]
print(info_filtered)
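Output (shown pretty-printed as JSON for readability; `print` itself writes the list on one line with single quotes):
[
    {
        "Price": 5000,
        "Weight": "8 kg"
    },
    {
        "Price": 2800,
        "Weight": "5.5 kg"
    },
    {
        "Price": 9000,
        "Weight": "8 kg"
    }
]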
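To get from the original lists of strings to dictionaries like these, something along the following lines might work. It is only a sketch: it assumes every keyed entry has the form 'Key: value' with a colon and a space, and `to_record` is a hypothetical helper name. Unlike the hand-written dictionaries above, values stay strings (for example '5000' rather than 5000).

```python
info = [
    ['Price: 5000', 'In warranty', 'Weight: 8 kg'],
    ['Refundable', 'Price: 2800', 'Weight: 5.5 kg', 'Extra battery power'],
]

def to_record(item):
    """Turn one list of strings into a dict (hypothetical helper)."""
    record = {'Attributes': []}
    for entry in item:
        # partition returns (before, separator, after); separator is ''
        # when ': ' does not occur in the entry.
        key, sep, value = entry.partition(': ')
        if sep:   # entry looked like 'Key: value'
            record[key] = value
        else:     # no separator: treat it as a free-form attribute
            record['Attributes'].append(entry)
    return record

records = [to_record(item) for item in info]
print(records[0])
# {'Attributes': ['In warranty'], 'Price': '5000', 'Weight': '8 kg'}
```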

Contributor with 1833 experience points, 4+ upvotes
A one-liner in functional style, using `map`, `filter`, and `any`:
info = [
    ['Price: 5000', 'In warranty', 'Weight: 8 kg'],
    ['Refundable', 'Price: 2800', 'Weight: 5.5 kg', 'Extra battery power'],
    ['Price: 9000', 'Non-exchangeable', 'Weight: 8 kg', 'High-Quality']
]
keywords = ['Price', 'Weight']
result = map(lambda sub_list: list(filter(lambda element: any(map(lambda keyword: keyword in element, keywords)), sub_list)), info)
print(list(result))
Output:
[['Price: 5000', 'Weight: 8 kg'], ['Price: 2800', 'Weight: 5.5 kg'], ['Price: 9000', 'Weight: 8 kg']]
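Explanation of each part of the one-liner
map(lambda sub_list: list(filter(lambda element: any(map(lambda keyword: keyword in element, keywords)), sub_list)), info)
Iterates over every element of info and applies the outer lambda to each sub-list.
filter(lambda element: any(map(lambda keyword: keyword in element, keywords)), sub_list)
From all the values in sub_list, keeps those that contain at least one keyword (the filter step).
any(map(lambda keyword: keyword in element, keywords))
Returns True if any of the keywords occurs in element, and False otherwise.
Note: map() and filter() return lazy iterators, so list() is used to materialize their results.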
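The reason for the `list()` calls can be seen in a small sketch: an iterator returned by `filter` (or `map`) can be consumed only once. The variable names below are chosen for illustration, and the inner `map` is written as a generator expression for brevity.

```python
keywords = ['Price', 'Weight']
sub_list = ['Price: 5000', 'In warranty', 'Weight: 8 kg']

# filter returns a lazy iterator; nothing is evaluated yet.
matches = filter(lambda element: any(kw in element for kw in keywords), sub_list)

first_pass = list(matches)   # consumes the iterator
second_pass = list(matches)  # nothing left to yield

print(first_pass)   # ['Price: 5000', 'Weight: 8 kg']
print(second_pass)  # []
```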

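Contributor with 1847 experience points, 7+ upvotes
One possible solution uses difflib.SequenceMatcher (doc) for fuzzy matching. The similarity threshold (0.5 here) may need some tuning for other data: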
from difflib import SequenceMatcher
from pprint import pprint

info = [['Price: 5000', 'In warranty', 'Weight: 8 kg'],
        ['Refundable', 'Price: 2800', 'Weight: 5.5 kg', 'Extra battery power'],
        ['Price: 9000', 'Non-exchangeable', 'Weight: 8 kg', 'High-Quality']]
keywords = ['Price', 'Weight']
out = []
for i in info:
    out.append([])
    for item in i:
        # Keep the item if it is similar enough to any keyword (case-insensitive)
        if any(SequenceMatcher(None, item.lower(), kw.lower()).ratio() > 0.5 for kw in keywords):
            out[-1].append(item)
pprint(out)
Prints:
[['Price: 5000', 'Weight: 8 kg'],
 ['Price: 2800', 'Weight: 5.5 kg'],
 ['Price: 9000', 'Weight: 8 kg']]
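When tuning the threshold, it can help to look at the actual similarity scores first. `SequenceMatcher.ratio()` returns 2*M/T, where M is the number of matching characters and T is the combined length of both strings, so longer entries score lower against a short keyword. The helper name `best_ratio` below is an assumption made for this sketch, not part of difflib.

```python
from difflib import SequenceMatcher

def best_ratio(item, keywords):
    """Highest case-insensitive similarity between item and any keyword
    (hypothetical helper for inspecting scores)."""
    return max(
        SequenceMatcher(None, item.lower(), kw.lower()).ratio()
        for kw in keywords
    )

keywords = ['Price', 'Weight']
samples = ['Price: 5000', 'In warranty', 'Weight: 8 kg', 'High-Quality']

# Print each entry with its best score to see where a cutoff could go.
for item in samples:
    print(f'{item!r}: {best_ratio(item, keywords):.3f}')
```

For example, 'Price: 5000' against 'Price' has 5 matching characters out of 16 in total, giving 2*5/16 = 0.625, which is above the 0.5 cutoff used in the answer.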