
How do I find the lists with the higher value in a nested list and return those lists?

Python

jeck猫 2023-03-01 16:54:52

I have this nested list that contains duplicate entries:

[['Coloring book moana', 'ART_AND_DESIGN', '3.9', 967, '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up'],
 ['Coloring book moana', 'FAMILY', '3.9', 974, '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up'],
 ['Gmail', 'COMMUNICATION', '4.3', 4604324, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Everyone', 'Communication', 'August 2, 2018', 'Varies with device', 'Varies with device'],
 ['Gmail', 'COMMUNICATION', '4.3', 4604483, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Everyone', 'Communication', 'August 2, 2018', 'Varies with device', 'Varies with device'],
 ['Instagram', 'SOCIAL', '4.5', 66577313, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device'],
 ['Instagram', 'SOCIAL', '4.5', 66577446, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device'],
 ['Instagram', 'SOCIAL', '4.5', 66509917, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']]

I want to filter the nested list by i[3], so that the final output looks like this:

[['Gmail', 'COMMUNICATION', '4.3', 4604483, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Everyone', 'Communication', 'August 2, 2018', 'Varies with device', 'Varies with device'],
 ['Coloring book moana', 'FAMILY', '3.9', 974, '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up'],
 ['Instagram', 'SOCIAL', '4.5', 66577446, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']]

I tried a for loop, but I can't figure out how to pick the entry with the highest value among the duplicates.


3 Answers

ITMISS

Contributed 1871 experience points, received 8+ upvotes

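This is the most Pythonic way I can think of. The approach is to first sort the list of lists by sublist[3] in descending order, so that while iterating we always reach the sublist with the highest review count before any of its duplicates. That property is what the final list is built on.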

meta_list = [['Coloring book moana', 'ART_AND_DESIGN', '3.9', 967, '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up'],

['Coloring book moana', 'FAMILY', '3.9', 974, '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up'],

['Gmail', 'COMMUNICATION', '4.3', 4604324, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Everyone', 'Communication', 'August 2, 2018', 'Varies with device', 'Varies with device'],

['Gmail', 'COMMUNICATION', '4.3', 4604483, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Everyone', 'Communication', 'August 2, 2018', 'Varies with device', 'Varies with device'],

['Instagram', 'SOCIAL', '4.5', 66577313, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device'],

['Instagram', 'SOCIAL', '4.5', 66577446, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device'],

['Instagram', 'SOCIAL', '4.5', 66509917, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']]

# Sort by review count (index 3), then by name, in descending order,
# so the entry with the most reviews for each name comes first
meta_list.sort(key=lambda x: (int(x[3]), x[0]), reverse=True)

# This is the list we'll use to store the final data
final_list = []

# Go through all the items in meta_list
for meta in meta_list:
    # If no entry with the same name (index 0) is already in final_list, add it
    if meta[0] not in [item[0] for item in final_list]:
        final_list.append(meta)

Output:

[['Instagram', 'SOCIAL', '4.5', 66577446, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device'],
 ['Gmail', 'COMMUNICATION', '4.3', 4604483, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Everyone', 'Communication', 'August 2, 2018', 'Varies with device', 'Varies with device'],
 ['Coloring book moana', 'FAMILY', '3.9', 974, '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up']]

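Essentially, it adds to final_list every entry whose name is not in there yet. Why does this work? Because the first entry you meet for each name while looping is the one with the highest review count. Once that one has been added, its duplicates can no longer be added, and we're done.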
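The membership test in the loop rebuilds a list of names on every iteration, which is fine for a handful of rows. Below is a minimal sketch of the same sort-then-keep-first idea that tracks names in a set instead. The function name `keep_highest_sorted` and the shortened four-field rows are assumptions made for illustration; they are not part of the answer.

```python
def keep_highest_sorted(rows, name_index=0, value_index=3):
    """Sort by value in descending order, then keep the first row seen per name."""
    # sorted() returns a new list, so the caller's list is left untouched
    ordered = sorted(rows, key=lambda row: int(row[value_index]), reverse=True)
    seen = set()
    result = []
    for row in ordered:
        name = row[name_index]
        if name not in seen:  # the first hit per name carries the largest value
            seen.add(name)
            result.append(row)
    return result


# Shortened rows (name, category, rating, review count) standing in for the full data
sample = [
    ['Coloring book moana', 'ART_AND_DESIGN', '3.9', 967],
    ['Coloring book moana', 'FAMILY', '3.9', 974],
    ['Gmail', 'COMMUNICATION', '4.3', 4604324],
    ['Gmail', 'COMMUNICATION', '4.3', 4604483],
    ['Instagram', 'SOCIAL', '4.5', 66577313],
    ['Instagram', 'SOCIAL', '4.5', 66577446],
    ['Instagram', 'SOCIAL', '4.5', 66509917],
]

result = keep_highest_sorted(sample)
for row in result:
    print(row)
```

As with the loop it mirrors, the result comes back ordered by review count, not in the order of the input.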

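Note: this does not preserve the original order of the entries. It only ensures that, among duplicates with the same name, only the one with the highest review count is kept.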
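If the original order matters, one possible variation is to skip the sort, remember the position of the best row for each name, and then emit the surviving rows by position. This is only a sketch under that assumption; `keep_highest_in_order` and the shortened rows are hypothetical names and data, not taken from the answer.

```python
def keep_highest_in_order(rows, name_index=0, value_index=3):
    """Keep the row with the largest value for each name, in original row order."""
    best = {}  # name -> position of the best row seen so far
    for pos, row in enumerate(rows):
        name = row[name_index]
        if name not in best or row[value_index] > rows[best[name]][value_index]:
            best[name] = pos
    # Sorting the stored positions restores the order the rows had in the input
    return [rows[pos] for pos in sorted(best.values())]


# Shortened rows (name, category, rating, review count) standing in for the full data
sample = [
    ['Coloring book moana', 'ART_AND_DESIGN', '3.9', 967],
    ['Coloring book moana', 'FAMILY', '3.9', 974],
    ['Gmail', 'COMMUNICATION', '4.3', 4604324],
    ['Gmail', 'COMMUNICATION', '4.3', 4604483],
    ['Instagram', 'SOCIAL', '4.5', 66577313],
    ['Instagram', 'SOCIAL', '4.5', 66577446],
    ['Instagram', 'SOCIAL', '4.5', 66509917],
]

in_order = keep_highest_in_order(sample)
for row in in_order:
    print(row)
```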

Answered 2023-03-01

MMTTMM

Contributed 1869 experience points, received 4+ upvotes

There may be more elegant or Pythonic solutions to this problem, but here is one possible approach:

my_list = [...]  # Nested list here

def compare_duplicates(nested_list, name_index=0, compare_index=3):
    max_values = dict()  # Used two dictionaries for readability
    final_indexes = dict()
    for i, item in enumerate(nested_list):
        name, value = item[name_index], item[compare_index]
        # First occurrence of a name, or a larger value than the one stored so far
        if name not in max_values or value > max_values[name]:
            max_values[name] = value
            final_indexes[name] = i
    return [nested_list[i] for i in final_indexes.values()]

print(compare_duplicates(my_list))
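For comparison, a more compact formulation is possible with `itertools.groupby` and `max` from the standard library. The sketch below is one way to write it, not the answerer's method; `highest_per_name` and the shortened rows are assumptions for illustration.

```python
from itertools import groupby
from operator import itemgetter


def highest_per_name(rows, name_index=0, value_index=3):
    """Group rows by name and keep the row with the largest value in each group."""
    by_name = itemgetter(name_index)
    by_value = itemgetter(value_index)
    # groupby only merges adjacent rows, so the rows are sorted by name first
    ordered = sorted(rows, key=by_name)
    return [max(group, key=by_value) for _, group in groupby(ordered, key=by_name)]


# Shortened rows (name, category, rating, review count) standing in for the full data
sample = [
    ['Instagram', 'SOCIAL', '4.5', 66577313],
    ['Gmail', 'COMMUNICATION', '4.3', 4604324],
    ['Coloring book moana', 'FAMILY', '3.9', 974],
    ['Instagram', 'SOCIAL', '4.5', 66577446],
    ['Gmail', 'COMMUNICATION', '4.3', 4604483],
    ['Coloring book moana', 'ART_AND_DESIGN', '3.9', 967],
]

grouped = highest_per_name(sample)
for row in grouped:
    print(row)
```

The trade-off is an extra sort, and the result comes back ordered by name instead of by first appearance.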

Answered 2023-03-01

忽然笑

Contributed 1806 experience points, received 5+ upvotes

Something like this:

_DATA = [

['Coloring book moana', 'ART_AND_DESIGN', '3.9', 967, '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up'],

['Coloring book moana', 'ART_AND_DESIGN', '3.9', 974, '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up'],

['Gmail', 'COMMUNICATION', '4.3', 4604324, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Everyone', 'Communication', 'August 2, 2018', 'Varies with device', 'Varies with device'],

['Gmail', 'COMMUNICATION', '4.3', 4604483, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Everyone', 'Communication', 'August 2, 2018', 'Varies with device', 'Varies with device'],

['Instagram', 'SOCIAL', '4.5', 66577313, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device'],

['Instagram', 'SOCIAL', '4.5', 66577446, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device'],

['Instagram', 'SOCIAL', '4.5', 66509917, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']

]

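def print_highest(data):
    list_map = {}
    for d in data:
        # Rows count as duplicates only if every field except index 3 matches
        key = str(d[0:3] + d[4:])
        if key not in list_map:
            list_map[key] = d
            continue
        # Keep whichever duplicate has the larger value at index 3
        if d[3] > list_map[key][3]:
            list_map[key] = d
    for row in list_map.values():
        print(row)

print_highest(_DATA)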

Output:

['Coloring book moana', 'ART_AND_DESIGN', '3.9', 974, '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up']

['Gmail', 'COMMUNICATION', '4.3', 4604483, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Everyone', 'Communication', 'August 2, 2018', 'Varies with device', 'Varies with device']

['Instagram', 'SOCIAL', '4.5', 66577446, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']
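One detail worth checking before reusing this: `print_highest` builds its key from every field except index 3, and in `_DATA` both 'Coloring book moana' rows have the category 'ART_AND_DESIGN', whereas in the question one of them is 'FAMILY'. The sketch below compares the two choices of key on shortened rows that follow the question's data. The helper `highest_by_key` and its `key_func` parameter are hypothetical, introduced only for this comparison.

```python
def highest_by_key(rows, key_func, value_index=3):
    """Keep the row with the largest value for each key produced by key_func."""
    best = {}
    for row in rows:
        key = key_func(row)
        if key not in best or row[value_index] > best[key][value_index]:
            best[key] = row
    return list(best.values())


# Shortened rows (name, category, rating, review count) following the question's data
rows = [
    ['Coloring book moana', 'ART_AND_DESIGN', '3.9', 967],
    ['Coloring book moana', 'FAMILY', '3.9', 974],
    ['Gmail', 'COMMUNICATION', '4.3', 4604324],
    ['Gmail', 'COMMUNICATION', '4.3', 4604483],
]

# Key on every field except the review count, as print_highest does
strict = highest_by_key(rows, lambda r: tuple(r[:3] + r[4:]))
# Key on the name only
by_name = highest_by_key(rows, lambda r: r[0])

print(len(strict))   # 3: the two 'Coloring book moana' rows differ in category
print(len(by_name))  # 2: one row per name
```

Which key is appropriate depends on what counts as a duplicate in the data at hand.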

Answered 2023-03-01

