
How do I plot a KDE of unique devices per hour with seaborn?

Python

跃然一笑 2023-04-18 17:28:14

I have the following pandas df (datetime is of dtype datetime64):

   device            datetime
0   846ee 2020-03-22 14:27:29
1   0a26e 2020-03-22 15:33:31
2   8a906 2020-03-27 16:19:06
3   6bf11 2020-03-27 16:05:20
4   d3923 2020-03-23 18:58:51

I want to use the KDE feature of Seaborn's distplot. I got it working, although I don't fully understand why:

df['hour'] = df['datetime'].dt.floor('T').dt.time
df['hour'] = pd.to_timedelta(df['hour'].astype(str)) / pd.Timedelta(hours=1)

and then

sns.distplot(df['hour'], hist=False, bins=arr, label='tef')

The question is: how do I do the same thing, but counting only unique devices? I tried

df.groupby(['hour']).nunique().reset_index()
df.groupby(['hour'])[['device']].size().reset_index()

but they give me different results (same order of magnitude, but somewhat higher or lower). I think I don't understand what pd.to_timedelta(df['hour'].astype(str)) / pd.Timedelta(hours=1) is doing, and that keeps me from reasoning about uniqueness... maybe.


2 Answers

30秒到达战场

1828 experience points contributed, more than 6 upvotes received

pd.to_timedelta(df['time'].astype(str)) creates output like 0 days 01:00:00, which is a timedelta.
pd.to_timedelta(df['time'].astype(str)) / pd.Timedelta(hours=1) creates output like 1.00, which is a float number of hours.
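To make the conversion concrete, here is a minimal sketch that computes the fractional hour two ways and shows they agree. The three timestamps are taken from the question's sample; the variable names are illustrative only.

```python
import pandas as pd

# Sample timestamps borrowed from the question, used only for illustration.
df = pd.DataFrame({"datetime": pd.to_datetime([
    "2020-03-22 14:27:29",
    "2020-03-22 15:33:31",
    "2020-03-27 16:19:06",
])})

# Route 1 (as in the question): time of day -> string -> timedelta -> float hours.
time_of_day = df["datetime"].dt.floor("min").dt.time
as_timedelta = pd.to_timedelta(time_of_day.astype(str))
hours_via_timedelta = as_timedelta / pd.Timedelta(hours=1)

# Route 2: the same number built directly from the datetime components.
hours_via_components = df["datetime"].dt.hour + df["datetime"].dt.minute / 60

print(hours_via_timedelta.round(2).tolist())  # [14.45, 15.55, 16.32]
```

Dividing a timedelta by pd.Timedelta(hours=1) simply asks "how many one-hour units fit into this duration", so 14:27 becomes 14.45. The second route avoids the round trip through strings and gives the same values.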

import pandas as pd
import numpy as np  # for test data
import random  # for test data
from datetime import datetime  # needed for datetime(2020, 7, 1) below

# test data
np.random.seed(365)
random.seed(365)
rows = 40
data = {'device': [random.choice(['846ee', '0a26e', '8a906', '6bf11', 'd3923']) for _ in range(rows)],
        'datetime': pd.bdate_range(datetime(2020, 7, 1), freq='15min', periods=rows).tolist()}

# create test dataframe
df = pd.DataFrame(data)

# this date column is already in a datetime format; for the real dataframe, make sure it's converted
# df.datetime = pd.to_datetime(df.datetime)

# this extracts the time component from the datetime and is a datetime.time object
# ('min' is the current spelling of the older 'T' frequency alias)
df['time'] = df['datetime'].dt.floor('min').dt.time

# this creates a timedelta column; note its format
df['timedelta'] = pd.to_timedelta(df['time'].astype(str))

# this creates a float representing the hour and its fractional component (minutes)
df['hours'] = pd.to_timedelta(df['time'].astype(str)) / pd.Timedelta(hours=1)

# extracts just the hour
df['hour'] = df['datetime'].dt.hour

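display(df.head(15))  # display() is available in Jupyter/IPython; use print() elsewhere

This view should clarify the difference between the time extraction methods.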

   device            datetime      time        timedelta  hours  hour
0   8a906 2020-07-01 00:00:00  00:00:00  0 days 00:00:00   0.00     0
1   0a26e 2020-07-01 00:15:00  00:15:00  0 days 00:15:00   0.25     0
2   8a906 2020-07-01 00:30:00  00:30:00  0 days 00:30:00   0.50     0
3   d3923 2020-07-01 00:45:00  00:45:00  0 days 00:45:00   0.75     0
4   0a26e 2020-07-01 01:00:00  01:00:00  0 days 01:00:00   1.00     1
5   d3923 2020-07-01 01:15:00  01:15:00  0 days 01:15:00   1.25     1
6   6bf11 2020-07-01 01:30:00  01:30:00  0 days 01:30:00   1.50     1
7   d3923 2020-07-01 01:45:00  01:45:00  0 days 01:45:00   1.75     1
8   6bf11 2020-07-01 02:00:00  02:00:00  0 days 02:00:00   2.00     2
9   d3923 2020-07-01 02:15:00  02:15:00  0 days 02:15:00   2.25     2
10  0a26e 2020-07-01 02:30:00  02:30:00  0 days 02:30:00   2.50     2
11  846ee 2020-07-01 02:45:00  02:45:00  0 days 02:45:00   2.75     2
12  0a26e 2020-07-01 03:00:00  03:00:00  0 days 03:00:00   3.00     3
13  846ee 2020-07-01 03:15:00  03:15:00  0 days 03:15:00   3.25     3
14  846ee 2020-07-01 03:30:00  03:30:00  0 days 03:30:00   3.50     3

Plot the device counts per hour with seaborn.countplot

import seaborn as sns
import matplotlib.pyplot as plt

plt.figure(figsize=(8, 6))
# one bar per device within each hour
sns.countplot(x='hour', hue='device', data=df)
plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left')

//img3.sycdn.imooc.com/643e62d80001e50f05930369.jpg

Plot a `seaborn.distplot` for each device

Use seaborn.FacetGrid.
This gives the hourly distribution for each device.

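import seaborn as sns
import matplotlib.pyplot as plt

# one facet row per device
g = sns.FacetGrid(df, row='device', height=5)
# note: distplot is deprecated since seaborn 0.11; histplot with kde=True is its replacement
g.map(sns.distplot, 'hours', bins=24, kde=True)
g.set(xlim=(0, 24), xticks=range(0, 25, 1))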

//img3.sycdn.imooc.com/643e62fc0001c45703641063.jpg

//img4.sycdn.imooc.com/643e63050001c25203560730.jpg

Answered 2023-04-18

婷婷同学_

1844 experience points contributed, more than 8 upvotes received

You can try:

df['hour'] = df['datetime'].dt.strftime('%Y-%m-%d %H')
s = df.groupby('hour')['device'].value_counts()

Answered 2023-04-18
