I am trying to model a binary classification problem with an SVM classifier using custom cross-validation folds, but cross_val_predict gives me the error **need at least one array to concatenate**. The code works fine with cv=3 in cross_val_predict, but when I use custom_cv it raises this error. Here is the code:

from sklearn.model_selection import LeavePOut
import numpy as np
from sklearn.svm import SVC
from time import *
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict,cross_val_score

clf = SVC(kernel='linear',C=25)
X = np.array([[1, 2], [3, 4], [5, 6], [7, 8],[9,10]])
y = np.array([0,1,1,0,0])
lpo = LeavePOut(2)
print(lpo.get_n_splits(X))
LeavePOut(p=2)
test_index_list=[]
train_index_list=[]
for train_index, test_index in lpo.split(X,y):
    if(y[test_index[0]]==y[test_index[1]]):
        pass
    else:
        print("TRAIN:", train_index, "TEST:", test_index)
        X_train, X_test = X[train_index], X[test_index]
        y_train, y_test = y[train_index], y[test_index]
        train_index_list.append(train_index)
        test_index_list.append(test_index)
custom_cv = zip(train_index_list, test_index_list)
scores = cross_val_score(clf, X, y, cv=custom_cv)
print(scores)
print('accuracy:',scores.mean())
predicted=cross_val_predict(clf,X,y,cv=custom_cv) # error with this line
print('Confusion matrix:',confusion_matrix(labels, predicted))

Here is the full traceback of the error:

ValueError                                Traceback (most recent call last)
<ipython-input-11-d78feac932b2> in <module>()
     31 print(scores)
     32 print('accuracy:',scores.mean())
---> 33 predicted=cross_val_predict(clf,X,y,cv=custom_cv)
     34
     35 print('Confusion matrix:',confusion_matrix(labels, predicted))

Any suggestions on how to resolve this error?
1 Answer
慕尼黑5688855
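There are 2 mistakes here:

1. If you want to reuse a zip object, turn it into a list. A zip object is an iterator and is exhausted after a single pass; here cross_val_score consumes it, so cross_val_predict receives no folds. You can fix it like this:

custom_cv = [*zip(train_index_list, test_index_list)]

2. The cross-validation list passed to cross_val_predict must be a partition of the actual array (each sample should belong to exactly one test set). In your case it is not. If you think about it, stacking the outputs over your cross-validation list gives an array of length 12 (6 folds with 2 test samples each), while the original y has length 5. You can implement a custom cross-validated prediction like this: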
from sklearn.metrics import confusion_matrix

def custom_cross_val_predict(clf, X, y, cv):
    # Collect true labels and predictions fold by fold.
    y_pred, y_true = [], []
    for tr_idx, vl_idx in cv:
        X_tr, y_tr = X[tr_idx], y[tr_idx]
        X_vl, y_vl = X[vl_idx], y[vl_idx]
        clf.fit(X_tr, y_tr)  # refit on this fold's training split
        y_true.extend(y_vl)
        y_pred.extend(clf.predict(X_vl))
    return y_true, y_pred

labels, predicted = custom_cross_val_predict(clf,X,y,cv=custom_cv)
print('Confusion matrix:',confusion_matrix(labels, predicted))
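
If it helps to see both problems in isolation, here is a minimal sketch in plain Python, without scikit-learn. It assumes that itertools.combinations enumerates the same test pairs as LeavePOut(2) does for these 5 samples; the names folds, cv_iter and test_counts are made up for illustration and do not come from the question.

```python
from itertools import combinations

# Labels from the question.
y = [0, 1, 1, 0, 0]
indices = range(len(y))

# Assumed stand-in for LeavePOut(2): every pair of samples is a test set once.
# Keep only the pairs whose two labels differ, as the question's loop does.
folds = []
for test in combinations(indices, 2):
    if y[test[0]] != y[test[1]]:
        train = tuple(i for i in indices if i not in test)
        folds.append((train, test))

# 1) A zip object is a one-shot iterator: the second pass yields nothing.
cv_iter = zip([f[0] for f in folds], [f[1] for f in folds])
first_pass = len(list(cv_iter))
second_pass = len(list(cv_iter))
print(first_pass, second_pass)  # 6 0

# 2) Count how often each sample lands in a test set.
#    A partition would give exactly 1 for every sample.
test_counts = [0] * len(y)
for _, test in folds:
    for i in test:
        test_counts[i] += 1
print(test_counts)  # [2, 3, 3, 2, 2]
print(sum(test_counts), "predictions for", len(y), "samples")  # 12 predictions for 5 samples
```

Wrapping the zip in a list only addresses the first issue. The overlapping test sets are why a hand-written loop like custom_cross_val_predict is needed here instead of cross_val_predict.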