
Python: overlaying the actual data points on a boxplot


开心每一天1111 2023-06-06 15:39:15
I am using this code to draw my data as a boxplot:

import matplotlib.pyplot as plt
import numpy as np
from matplotlib.patches import Polygon

random_dists = ['Overlap', 'Non overlap']
Overlap = [6, 6, 5, 1, 3, 4, 4, 3]
non_overlap = [1, 2, 6, 6, 1, 3, 3, 3, 3, 3, 5, 2, 2]
data = [
    Overlap,
    non_overlap
]

fig, ax1 = plt.subplots(figsize=(6, 6))
# Newer Matplotlib versions no longer provide fig.canvas.set_window_title
fig.canvas.manager.set_window_title('A Boxplot Example')
fig.subplots_adjust(left=0.075, right=0.95, top=0.9, bottom=0.25)

# bp = ax1.boxplot(data, notch=0, sym='+', vert=1, whis=1.5)
bp = ax1.boxplot(data)
plt.setp(bp['boxes'], color='black')
plt.setp(bp['whiskers'], color='black')
plt.setp(bp['fliers'], color='red', marker='+')

# Add a horizontal grid to the plot, but make it very light in color
# so we can use it for reading data values but not be distracting
ax1.yaxis.grid(True, linestyle='-', which='major', color='lightgrey',
               alpha=0.5)

# Hide these grid behind plot objects
ax1.set_axisbelow(True)
ax1.set_title('overlap and non_overlap against mRS')
# ax1.set_xlabel('Distribution')
# ax1.set_ylabel('Value')

# Now fill the boxes with desired colors
box_colors = ['darkkhaki', 'royalblue']
num_boxes = len(data)
medians = np.empty(num_boxes)
for i in range(num_boxes):
    box = bp['boxes'][i]
    boxX = []
    boxY = []
    for j in range(5):
        boxX.append(box.get_xdata()[j])
        boxY.append(box.get_ydata()[j])
    box_coords = np.column_stack([boxX, boxY])
    # Alternate between Dark Khaki and Royal Blue
    ax1.add_patch(Polygon(box_coords, facecolor=box_colors[i % 2]))
    # Now draw the median lines back over what we just filled in

What I need is to overlay the data on top of the boxes as a scatter plot. I tried hard to adapt the code from the linked example and searched Stack Overflow for a solution, but I am not very good at coding. I also tried the seaborn library, but I always get the error "'list' object has no attribute 'get'" and cannot fix it. Can anyone help?

1 Answer

守着星空守着你


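Current versions of plt.boxplot() can draw most of these elements out of the box. If showmeans is set to True, the means are drawn, and their appearance can be controlled through the meanprops dictionary. With patch_artist=True, a filled box is drawn instead of just an outline, and boxprops controls how the boxes look.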


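To draw a scatter plot on top, simply call ax1.scatter. The x positions can be randomly jittered with i + np.random.uniform(-0.4, 0.4). To force the points to appear above the boxes, change their z-order.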


Since the fliers (outliers) are also part of the scatter data, it may make sense to leave them out of the boxplot (showfliers=False).


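To create a legend, you can collect the handles of all the elements you want and pass them to ax1.legend(). Note that your boxes already have labels on the x-axis, so repeating them in the legend may be somewhat redundant.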


import matplotlib.pyplot as plt
import numpy as np

random_dist_names = ['Overlap', 'Non overlap']
overlap = [6, 6, 5, 1, 3, 4, 4, 3]
non_overlap = [1, 2, 6, 6, 1, 3, 3, 3, 3, 3, 5, 2, 2]
data = [overlap, non_overlap]

fig, ax1 = plt.subplots(figsize=(6, 6))
# Newer Matplotlib versions no longer provide fig.canvas.set_window_title
fig.canvas.manager.set_window_title('A Boxplot Example')
fig.subplots_adjust(left=0.075, right=0.95, top=0.9, bottom=0.25)

box_colors = ['darkkhaki', 'royalblue']
scatter_colors = ['purple', 'crimson']
legend_handles = []
for i, (values, box_color, scatter_color) in enumerate(zip(data, box_colors, scatter_colors), start=1):
    # One boxplot call per group, so each box gets its own fill color
    bp = ax1.boxplot(values, positions=[i], showmeans=True, patch_artist=True, showfliers=False,
                     boxprops={'edgecolor': 'black', 'facecolor': box_color},
                     whiskerprops={'color': 'black'},  # flierprops={'color': 'red', 'marker': '+'},
                     medianprops={'color': 'lime', 'linewidth': 2, 'linestyle': ':'},
                     meanprops={'markerfacecolor': 'w', 'marker': '*', 'markeredgecolor': 'k', 'markersize': 10})
    if i == 1:
        # A single 'Mean' entry in the legend is enough
        legend_handles.append(bp['means'][0])
    legend_handles.append(bp['boxes'][0])
    # Jitter the x positions so coinciding points stay visible; zorder=3 draws them above the boxes
    ax1.scatter(i + np.random.uniform(-0.4, 0.4, len(values)), values, color=scatter_color, alpha=0.5, zorder=3)

ax1.yaxis.grid(True, linestyle='-', which='major', color='lightgrey', alpha=0.5)
ax1.set_axisbelow(True)
ax1.set_title('overlap and non_overlap against mRS')

ax1.set_xlim(0.5, len(data) + 0.5)
ax1.set_ylim(bottom=0)  # the older ymin= keyword is not accepted by recent Matplotlib
ax1.set_xticks(list(range(1, len(data) + 1)))  # make sure there is one tick per box
ax1.set_xticklabels(random_dist_names, rotation=0, fontsize=8)
ax1.legend(legend_handles, ['Mean'] + random_dist_names, bbox_to_anchor=[1, -0.1], loc='upper right')

plt.show()

//img1.sycdn.imooc.com//647ee2df0001058c07060537.jpg

Note that you have very few data points and all of them are integers, which is why the scattered dots line up along horizontal lines.
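If the dots sitting on the same horizontal line are hard to tell apart, one option is to add a small vertical jitter as well. This is only a sketch and an assumption on my part, not something the plot above does: the helper name jitter and the widths 0.2 and 0.1 are hypothetical choices, and vertical jitter slightly misrepresents the true integer values.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the jitter is reproducible


def jitter(values, width):
    """Return the values shifted by uniform noise in [-width, width] (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    return values + rng.uniform(-width, width, size=values.shape)


overlap = [6, 6, 5, 1, 3, 4, 4, 3]

fig, ax = plt.subplots()
ax.boxplot(overlap, positions=[1], showfliers=False)
x = jitter(np.full(len(overlap), 1.0), 0.2)  # horizontal jitter around the box position
y = jitter(overlap, 0.1)                     # small vertical jitter separates equal integers
ax.scatter(x, y, alpha=0.5, zorder=3)        # zorder=3 keeps the dots above the box
```

Call plt.show() or fig.savefig(...) afterwards to look at the result. Keeping the vertical width well below 0.5 means every dot can still be read back to its integer value.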


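PS: To create something similar with Seaborn, the data has to be organized more like a pandas DataFrame. Such a DataFrame would have one column containing all the values and one column containing the category.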
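As a rough illustration of that layout, the two lists from the question could be put into a long-form DataFrame like this. The column names group and value are my own choice, not something the question or Seaborn prescribes.

```python
import pandas as pd

overlap = [6, 6, 5, 1, 3, 4, 4, 3]
non_overlap = [1, 2, 6, 6, 1, 3, 3, 3, 3, 3, 5, 2, 2]

# One row per observation: a category column and a value column
df = pd.DataFrame({
    'group': ['Overlap'] * len(overlap) + ['Non overlap'] * len(non_overlap),
    'value': overlap + non_overlap,
})

# Quick check of what ended up in each category
print(df.groupby('group', sort=False)['value'].describe())
```

A frame shaped like this can then be handed to Seaborn by column name, for example sns.boxplot(data=df, x='group', y='value').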


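The legend can be created more automatically. To include the mean in the legend as well, a label has to be assigned to the mean via meanprops={..., 'label': 'Mean'}. Unfortunately, this creates one legend entry per box. The extra entries can be skipped by first fetching all legend entries with ax.get_legend_handles_labels() and then passing only a sub-list of the handles and labels to the legend.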


import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

random_dist_names = ['Overlap', 'Non overlap']
overlap = [6, 6, 5, 1, 3, 4, 4, 3]
non_overlap = [1, 2, 6, 6, 1, 3, 3, 3, 3, 3, 5, 2, 2]
# Long form: one category label per value
data_names = np.repeat(random_dist_names, [len(overlap), len(non_overlap)])
data_values = np.concatenate([overlap, non_overlap])

ax = sns.boxplot(x=data_names, y=data_values, hue=data_names, palette=['darkkhaki', 'royalblue'],
                 dodge=False, showfliers=False, showmeans=True,
                 meanprops={'markerfacecolor': 'w', 'marker': '*', 'markeredgecolor': 'k', 'markersize': 10, 'label': 'Mean'})
sns.stripplot(x=data_names, y=data_values, color='red', alpha=0.4)
# Each box adds its own 'Mean' entry; skip the leading entries so that,
# with the handle order assumed here, only one 'Mean' remains
handles, labels = ax.get_legend_handles_labels()
skip_pos = len(random_dist_names) - 1
ax.legend(handles[skip_pos:], labels[skip_pos:], bbox_to_anchor=(1.02, -0.05), loc='upper right')
plt.tight_layout()
plt.show()
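The positional slice above relies on the order in which the handles come back, which may differ between library versions. An alternative, offered only as a sketch and not taken from the answer, is to drop repeated entries by label. The example below uses plain Matplotlib artists to stand in for the boxes and mean markers; the dictionary name unique is hypothetical.

```python
import matplotlib.pyplot as plt
import numpy as np

groups = {
    'Overlap': [6, 6, 5, 1, 3, 4, 4, 3],
    'Non overlap': [1, 2, 6, 6, 1, 3, 3, 3, 3, 3, 5, 2, 2],
}

fig, ax = plt.subplots()
for pos, (name, values) in enumerate(groups.items(), start=1):
    ax.scatter(np.full(len(values), pos), values, alpha=0.5, label=name)
    # Every group adds its own artist labelled 'Mean', so the label repeats
    ax.plot([pos], [np.mean(values)], '*', color='k', markersize=10, label='Mean')

handles, labels = ax.get_legend_handles_labels()
# A dict keeps one handle per label, so duplicates collapse to a single entry
unique = dict(zip(labels, handles))
legend = ax.legend(list(unique.values()), list(unique.keys()))
```

The same three lines at the end should work on the axes returned by sns.boxplot, since they only use the standard Matplotlib legend calls.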


Answered 2023-06-06