Plotting a PDF

qq_花开花谢_0 2021-05-03 16:43:01
I tried the following manual approach (#1):

    import pandas as pd
    import matplotlib.pyplot as plt

    dict = {'id': ['a','b','c','d'], 'testers_time': [10, 30, 15, None], 'stage_1_to_2_time': [30, None, 30, None], 'activated_time' : [40, None, 45, None],'stage_2_to_3_time' : [30, None, None, None],'engaged_time' : [70, None, None, None]}
    df = pd.DataFrame(dict, columns=['id', 'testers_time', 'stage_1_to_2_time', 'activated_time', 'stage_2_to_3_time', 'engaged_time'])
    df = df.dropna(subset=['testers_time']).sort_values('testers_time')
    prob = df['testers_time'].value_counts(normalize=True)
    print(prob)
    # 0.333333, 0.333333, 0.333333
    plt.plot(df['testers_time'], prob, marker='.', linestyle='-')
    plt.show()

I also tried the following approach (#2), which I found on Stack Overflow:

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    from scipy import stats

    dict = {'id': ['a','b','c','d'], 'testers_time': [10, 30, 15, None], 'stage_1_to_2_time': [30, None, 30, None], 'activated_time' : [40, None, 45, None],'stage_2_to_3_time' : [30, None, None, None],'engaged_time' : [70, None, None, None]}
    df = pd.DataFrame(dict, columns=['id', 'testers_time', 'stage_1_to_2_time', 'activated_time', 'stage_2_to_3_time', 'engaged_time'])
    df = df.dropna(subset=['testers_time']).sort_values('testers_time')
    fit = stats.norm.pdf(df['testers_time'], np.mean(df['testers_time']), np.std(df['testers_time']))
    print(fit)
    # [0.02902547, 0.04346777, 0.01829513]
    plt.plot(df['testers_time'], fit, marker='.', linestyle='-')
    plt.hist(df['testers_time'], normed='true')
    plt.show()

As you can see, I get completely different values. The probabilities are correct for #1 but not for #2 (they also do not add up to 100%), and the histogram's y-axis (%) seems to be based on 6 bins rather than 3. Could you explain how I can get the correct probabilities for #2?
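One likely reason the numbers in #2 do not add up to 100%: `stats.norm.pdf` returns density values (heights of the fitted curve), not probabilities. A probability only comes from integrating the density over an interval. The sketch below reproduces the three values using only the standard library; `statistics.NormalDist` stands in for `scipy.stats.norm` here, and the interval edges 12.5 and 22.5 (midpoints between the observations) are an arbitrary choice made for illustration, not something from the question.

```python
from statistics import NormalDist, fmean, pstdev

# The three non-missing testers_time values from the question, sorted.
times = [10, 15, 30]

# Fit a normal distribution using the population standard deviation
# (ddof=0), which is what np.std uses by default.
dist = NormalDist(mu=fmean(times), sigma=pstdev(times))

# Density values: heights of the curve at each observation.
# These are not probabilities, so nothing forces them to sum to 1.
densities = [dist.pdf(t) for t in times]
density_sum = sum(densities)
print(densities)    # approximately 0.0290, 0.0435, 0.0183
print(density_sum)  # about 0.09, well below 1

# A probability is the area under the curve over an interval.
def interval_prob(lo, hi):
    return dist.cdf(hi) - dist.cdf(lo)

# Illustrative edges (an assumption): midpoints between the observations.
edges = [float("-inf"), 12.5, 22.5, float("inf")]
probs = [interval_prob(a, b) for a, b in zip(edges, edges[1:])]
print(probs, sum(probs))  # the interval probabilities sum to 1
```

The histogram follows the same logic. With `density=True` (the replacement for the removed `normed` argument in newer Matplotlib versions), bar heights are scaled so that the bar areas, not the heights, sum to 1, which is why its y-axis does not match the 1/3 values from #1.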