
How to convert a for loop with an if statement over a datetime df into a list comprehension

Python

弑天下 2022-12-14 17:37:20

I am trying to convert the following for loop with an if statement into a list comprehension.

# Create dictionary to hold results
trip_counts = {'AM': 0, 'PM': 0}

# Loop over all trips
for trip in onebike_datetimes:
    # Check to see if the trip starts before noon
    if trip['start'].hour < 12:
        # Increment the counter for before noon
        trip_counts["AM"] += 1
    else:
        # Increment the counter for after noon
        trip_counts["PM"] += 1

I tried

[trip_counts["AM"]+=1 if trip['start'].hour <12 else trip_counts['PM']+= 1 for trip in onebike_datetimes]

but I keep getting a syntax error.


4 answers

慕娘9325324

Contributed 1783 experience points, received 5+ likes

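You can use a list comprehension (actually, just a generator expression), but not in the way you have in mind. Build a generator of "AM" and "PM" strings, then use it to construct a Counter instance.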

from collections import Counter

trip_counts = Counter(("AM" if trip['start'].hour < 12 else "PM")
                      for trip in onebike_datetimes)

A self-contained demo:

from collections import Counter
from types import SimpleNamespace

# SimpleNamespace stands in for datetime objects; only .hour is needed
onebike_datetimes = [
    {'start': SimpleNamespace(hour=9)},
    {'start': SimpleNamespace(hour=3)},
    {'start': SimpleNamespace(hour=14)},
    {'start': SimpleNamespace(hour=19)},
    {'start': SimpleNamespace(hour=7)},
]

trip_counts = Counter(("AM" if trip['start'].hour < 12 else "PM")
                      for trip in onebike_datetimes)

assert trip_counts["AM"] == 3
assert trip_counts["PM"] == 2
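The same pattern should carry over to real `datetime` objects, since they expose `.hour` directly. The sketch below uses hypothetical sample trips (the dates are made up for illustration) and also shows why no `{'AM': 0, 'PM': 0}` initialization is needed: a `Counter` reports 0 for a label it never saw.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample trips; the dates are invented for illustration
onebike_datetimes = [
    {"start": datetime(2017, 10, 1, 9, 30)},
    {"start": datetime(2017, 10, 1, 15, 0)},
    {"start": datetime(2017, 10, 2, 7, 45)},
]

# Label every trip and let Counter tally the labels
trip_counts = Counter(
    "AM" if trip["start"].hour < 12 else "PM"
    for trip in onebike_datetimes
)

# Tally only the morning trips: "PM" never appears in this Counter,
# yet looking it up gives 0 instead of raising KeyError
morning_only = Counter(
    "AM" if trip["start"].hour < 12 else "PM"
    for trip in onebike_datetimes
    if trip["start"].hour < 12
)

print(trip_counts["AM"], trip_counts["PM"], morning_only["PM"])
```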

Answered 2022-12-14

湖上湖

Contributed 2003 experience points, received 2+ likes

Keeping your for loop is much clearer.

If you really want to use a list comprehension, you can do this:

l = ["AM" if trip["start"].hour < 12 else "PM" for trip in onebike_datetimes]

am_count = l.count("AM")

trip_counts = {"AM": am_count, "PM": len(l) - am_count}

(If you use this, you don't need to initialize trip_counts beforehand.)
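As a quick sanity check, the list-based version can be run next to the original loop on a small input. The trips below are hypothetical sample data, and the name `labels` replaces `l` only for readability.

```python
from datetime import datetime

# Hypothetical sample trips; the dates are invented for illustration
onebike_datetimes = [
    {"start": datetime(2017, 10, 1, 9, 30)},
    {"start": datetime(2017, 10, 1, 15, 0)},
    {"start": datetime(2017, 10, 2, 20, 45)},
    {"start": datetime(2017, 10, 3, 7, 15)},
]

# List comprehension version: label each trip, then count the labels
labels = ["AM" if trip["start"].hour < 12 else "PM" for trip in onebike_datetimes]
am_count = labels.count("AM")
list_counts = {"AM": am_count, "PM": len(labels) - am_count}

# Original for loop, kept for comparison
loop_counts = {"AM": 0, "PM": 0}
for trip in onebike_datetimes:
    if trip["start"].hour < 12:
        loop_counts["AM"] += 1
    else:
        loop_counts["PM"] += 1

print(list_counts == loop_counts)
```

Note that the list version builds a full list of labels in memory before counting, which the loop avoids.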

Answered 2022-12-14

芜湖不芜

Contributed 1796 experience points, received 7+ likes

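If what you are working with is a pandas DataFrame, why not filter the values and count them directly?

Something like this might work (assuming trip is a DataFrame with an 'hour' column):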

trip_counts['AM'] = len(trip[trip.loc[:, 'hour'] < 12].index)

trip_counts['PM'] = len(trip[trip.loc[:, 'hour'] >= 12].index)
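If the DataFrame holds a datetime column rather than a ready-made `hour` column, the hour can be derived with the `.dt` accessor first. This is a minimal sketch under that assumption; the frame name `trips`, the column name `start`, and the timestamps are all hypothetical.

```python
import pandas as pd

# Hypothetical DataFrame with a datetime column named "start"
trips = pd.DataFrame(
    {
        "start": pd.to_datetime(
            [
                "2017-10-01 09:30",
                "2017-10-01 15:00",
                "2017-10-02 20:45",
                "2017-10-03 07:15",
            ]
        )
    }
)

# Boolean mask: True where the trip starts before noon
is_am = trips["start"].dt.hour < 12

# Summing a boolean Series counts its True values
trip_counts = {"AM": int(is_am.sum()), "PM": int((~is_am).sum())}
print(trip_counts)
```

Summing the mask avoids building two filtered copies of the frame just to take their lengths.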

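Edit: I just ran some benchmarks on the answers given here, since some people assume list comprehensions are automatically faster.

As you can see, in this case the regular for loop has more or less the best performance, matched only by the Counter approach with a comprehension mentioned in one of the other answers here.

Note that I modified my pandas implementation slightly to match how I think your data is probably structured (i.e., not in a DataFrame), so there may be extra overhead from converting your data to a DataFrame on every run.

[Figure: benchmark plot]

The code that generated this plot is as follows: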

import pandas as pd
import numpy as np
from collections import Counter
from types import SimpleNamespace
import perfplot

def gen_data(n):
    # Ten stand-in trips, repeated n times
    onebike_datetimes = [
        {'start': SimpleNamespace(hour=9)},
        {'start': SimpleNamespace(hour=3)},
        {'start': SimpleNamespace(hour=14)},
        {'start': SimpleNamespace(hour=19)},
        {'start': SimpleNamespace(hour=7)},
        {'start': SimpleNamespace(hour=14)},
        {'start': SimpleNamespace(hour=19)},
        {'start': SimpleNamespace(hour=2)},
        {'start': SimpleNamespace(hour=20)},
        {'start': SimpleNamespace(hour=12)},
    ]*n
    return onebike_datetimes

def use_vanilla_for(a):
    # onebike_datetimes = gen_data(n)
    onebike_datetimes = a
    trip_counts = {'AM': 0, 'PM': 0}
    for trip in onebike_datetimes:
        if trip['start'].hour < 12:
            trip_counts["AM"] += 1
        else:
            trip_counts["PM"] += 1
    return 1
    # return trip_counts

def use_list_comp(a):
    # onebike_datetimes = gen_data(n)
    onebike_datetimes = a
    trip_counts = {'AM': 0, 'PM': 0}
    l = ["AM" if trip["start"].hour < 12 else "PM" for trip in onebike_datetimes]
    # Iterating over l (rather than set(l)) calls l.count once per element
    trip_counts = {i: l.count(i) for i in l}
    return 1
    # return trip_counts

def use_counter(a):
    # onebike_datetimes = gen_data(n)
    onebike_datetimes = a
    trip_counts = {'AM': 0, 'PM': 0}
    trip_counts = Counter(("AM" if trip['start'].hour < 12 else "PM")
                          for trip in onebike_datetimes)
    return 1
    # return trip_counts

def use_pandas(a):
    # onebike_datetimes = gen_data(n)
    onebike_datetimes = a
    # Converting to a DataFrame is part of what gets timed here
    trip = pd.DataFrame(list(map(lambda a: a['start'].hour, onebike_datetimes)), columns=['hrs'])
    trip_counts = {'AM': 0, 'PM': 0}
    trip_counts['AM'] = len(trip[trip['hrs'] < 12].index)
    trip_counts['PM'] = len(trip[trip['hrs'] >= 12].index)
    return 1
    # return trip_counts

perfplot.show(
    setup=lambda n: gen_data(n),
    kernels=[
        lambda a: use_vanilla_for(a),
        lambda a: use_list_comp(a),
        lambda a: use_counter(a),
        lambda a: use_pandas(a),
    ],
    labels=["vanilla_for", "list_comp", "counter", "dataframe"],
    n_range=[2 ** k for k in range(10)],
    xlabel="len(a)",
)
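For readers without perfplot installed, a rough comparison can be sketched with the standard library's `timeit`. This is not the benchmark that produced the plot: the function names are hypothetical, the input size and repeat count are arbitrary, and absolute timings will vary by machine. It also separates the list-based kernel into two variants, because building the dict by iterating over every label repeats `list.count` once per element, while iterating over `set(labels)` calls it once per distinct label.

```python
import timeit
from collections import Counter
from types import SimpleNamespace

def gen_data(n):
    # Same ten stand-in hours as the benchmark data, repeated n times
    hours = [9, 3, 14, 19, 7, 14, 19, 2, 20, 12]
    return [{"start": SimpleNamespace(hour=h)} for h in hours] * n

def count_with_loop(trips):
    counts = {"AM": 0, "PM": 0}
    for trip in trips:
        if trip["start"].hour < 12:
            counts["AM"] += 1
        else:
            counts["PM"] += 1
    return counts

def count_with_counter(trips):
    return dict(Counter("AM" if t["start"].hour < 12 else "PM" for t in trips))

def count_with_list_per_element(trips):
    labels = ["AM" if t["start"].hour < 12 else "PM" for t in trips]
    # One labels.count call per element: quadratic in len(labels)
    return {label: labels.count(label) for label in labels}

def count_with_list_per_label(trips):
    labels = ["AM" if t["start"].hour < 12 else "PM" for t in trips]
    # One labels.count call per distinct label: linear in len(labels)
    return {label: labels.count(label) for label in set(labels)}

trips = gen_data(50)
candidates = (
    count_with_loop,
    count_with_counter,
    count_with_list_per_element,
    count_with_list_per_label,
)

# Time each candidate on the same input; fn=fn binds the current function
timings = {
    fn.__name__: timeit.timeit(lambda fn=fn: fn(trips), number=20)
    for fn in candidates
}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f}s")
```

All four functions return the same counts, so any timing difference comes from how the counting is done rather than from different results.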

Answered 2022-12-14

倚天杖

Contributed 1828 experience points, received 3+ likes

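Assignments are statements, and statements cannot be used inside a list comprehension. Use a loop.

You really shouldn't do this, but for the sake of completeness: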

trip_counts = {'AM': 0, 'PM': 0}

[trip_counts.__setitem__('AM', trip_counts['AM']+1) if trip['start'].hour < 12 else trip_counts.__setitem__('PM', trip_counts['PM']+1) for trip in onebike_datetimes]

print(f"With list comprehension: {trip_counts}")

OUT: With list comprehension: {'AM': 1, 'PM': 2}
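Both points can be checked in a few lines. The sketch below first asks the parser whether a comprehension containing `+=` is valid Python, then runs the `__setitem__` workaround on hypothetical sample trips (not the data behind the output above) to show its side effect: the comprehension builds a throwaway list of `None` values, one per trip. The helper `is_valid_python` is an illustrative name, not a standard function.

```python
import ast
from datetime import datetime

def is_valid_python(source):
    # Hypothetical helper: True if the source parses, False on SyntaxError
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False

# An augmented assignment is a statement, so the parser rejects it here
attempt = "[counts['AM'] += 1 for trip in trips]"
attempt_is_valid = is_valid_python(attempt)

# Hypothetical sample trips; the dates are invented for illustration
onebike_datetimes = [
    {"start": datetime(2017, 10, 1, 9, 30)},
    {"start": datetime(2017, 10, 1, 15, 0)},
    {"start": datetime(2017, 10, 2, 20, 45)},
]

trip_counts = {"AM": 0, "PM": 0}

# __setitem__ is a method call (an expression), so this parses and runs,
# but each call returns None and the resulting list is never needed
leftovers = [
    trip_counts.__setitem__("AM", trip_counts["AM"] + 1)
    if trip["start"].hour < 12
    else trip_counts.__setitem__("PM", trip_counts["PM"] + 1)
    for trip in onebike_datetimes
]

print(attempt_is_valid, trip_counts, leftovers)
```

The discarded list is the main reason a comprehension used only for its side effects is usually considered worse style than a plain loop.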

Answered 2022-12-14
