I want to check whether all the data in a DataFrame is complete, from the top-left corner of the frame to the bottom-right-most element (the data should fill a rectangle). Blank columns or rows after the body of the data are fine (and there will be some). Examples of a good and a bad DataFrame:

import pandas as pd

bad_dataframe = pd.DataFrame([[1,1,1,""],["","","",""],[1,"",1,""],["","","",""]])
good_dataframe = pd.DataFrame([[1,1,1,""],[1,1,1,""],[1,1,1,""],[1,1,1,""],["","","",""]])

This is how I currently do it:

def not_rectangle_data(DataFrame):
    """
    This function will check if the data given to it is a "rectangle"
    """
    #removes all rows and columns that contain only blanks
    reduced_dataframe = DataFrame[DataFrame != ""].dropna(how="all",axis = 1).dropna(how="all",axis = 0)
    #removes all rows and columns that contain any blanks
    super_reduced_dataframe = reduced_dataframe.dropna(how="any",axis = 1).dropna(how="any",axis = 0)
    #Check that dataframe is not empty and that no column or no rows are half empty
    if not reduced_dataframe.empty and \
        super_reduced_dataframe.equals(reduced_dataframe):
        #Check that columns in remain data are still present
        if ((max(reduced_dataframe.index) + 1) == reduced_dataframe.shape[0]) and \
            ((max(reduced_dataframe.columns) + 1) == reduced_dataframe.shape[1]):
            return True
        else:
            return False
    else:
        return False

But I feel there should be a more concise way to do this.

1 Answer

哔哔one
Using numpy:
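import numpy as np

def check_rectangle(df):
    # Indices of all truthy cells; '' (and a numeric 0) count as empty
    non_zeros = np.nonzero(df.values)
    # Smallest zero-filled array, anchored at (0, 0), that contains every non-empty cell
    arr = np.zeros(np.max(non_zeros, 1)+1)
    # Add 1 at each non-empty position
    np.add.at(arr, non_zeros, 1)
    # np.all replaces np.alltrue, which was removed in NumPy 2.0
    return np.all(arr)

check_rectangle(good_dataframe)
# True
check_rectangle(bad_dataframe)
# False
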
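np.nonzero returns the indices of all non-zero elements ('' is treated as zero here).
np.zeros(np.max(non_zeros, 1)+1) creates the smallest rectangle that fits non_zeros.
np.add.at adds 1 at every non-zero position.
Finally, np.alltrue (spelled np.all in current NumPy) returns True if the rectangle is completely filled and False otherwise.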
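
The same steps can also be sketched with a boolean mask instead of np.add.at. This is only an illustrative variant, not part of the answer above: the name is_filled_rectangle is hypothetical, it assumes that only the empty string "" counts as blank (so a numeric 0 is treated as data, unlike with np.nonzero on the raw values), and it returns False for a frame with no data at all, as the function in the question does.

```python
import numpy as np
import pandas as pd


def is_filled_rectangle(df: pd.DataFrame) -> bool:
    """Hypothetical helper: True if the non-blank cells exactly fill the
    rectangle from the top-left cell to the bottom-right-most non-blank cell."""
    # Boolean mask of non-blank cells; only "" is considered blank (assumption).
    filled = (df != "").to_numpy()
    rows, cols = np.nonzero(filled)
    if rows.size == 0:
        # No data at all: report False, as the question's function does.
        return False
    # Crop to the bounding rectangle anchored at (0, 0) and
    # require every cell inside it to be filled.
    return bool(filled[: rows.max() + 1, : cols.max() + 1].all())


bad_dataframe = pd.DataFrame([[1, 1, 1, ""], ["", "", "", ""], [1, "", 1, ""], ["", "", "", ""]])
good_dataframe = pd.DataFrame([[1, 1, 1, ""], [1, 1, 1, ""], [1, 1, 1, ""], [1, 1, 1, ""], ["", "", "", ""]])

print(is_filled_rectangle(good_dataframe))  # True
print(is_filled_rectangle(bad_dataframe))   # False
```

Checking for an empty selection first avoids the ValueError that np.max raises on an empty array when the frame contains nothing but blanks.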