I have a large dataset and am trying to convert "object" columns that contain only numeric data to an integer data type in Python/pandas. Every piece of code I try fails with the following error:

CODE SNIPPET (see below for options I have tried)

PATH/frame.py in __setitem__(self, key, value)
   3482             self._setitem_frame(key, value)
   3483         elif isinstance(key, (Series, np.ndarray, list, Index)):
--> 3484             self._setitem_array(key, value)
   3485         else:

PATH/frame.py in _setitem_array(self, key, value)
   3507                 raise ValueError("Columns must be same length as key")
   3508             for k1, k2 in zip(key, value.columns):
--> 3509                 self[k1] = value[k2]
   3510         else:
   3511             indexer = self.loc._convert_to_indexer(key, axis=1)

PATH/frame.py in __setitem__(self, key, value)
   3485         else:
   3486             # set column
--> 3487             self._set_item(key, value)
   3488
   3489     def _setitem_slice(self, key, value):

PATH/frame.py in _set_item(self, key, value)
   3562
   3563         self._ensure_valid_index(value)
--> 3564         value = self._sanitize_column(key, value)
   3565         NDFrame._set_item(self, key, value)

PATH/frame.py in _sanitize_column(self, key, value, broadcast)
   3778         if broadcast and key in self.columns and value.ndim == 1:
   3780             if not self.columns.is_unique or isinstance(self.columns, MultiIndex):
--> 3781                 existing_piece = self[key]
   3782                 if isinstance(existing_piece, DataFrame):
   3783                     value = np.tile(value, (len(existing_piece.columns), 1))

PATH/frame.py in __getitem__(self, key)
   2971                 if self.columns.nlevels > 1:
   2972                     return self._getitem_multilevel(key)
--> 2973                 return self._get_item_cache(key)
   2974
   2975         # Do we have a slicer (on rows)?
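
One thing the traceback itself suggests: line 3781 is only reached when `self.columns.is_unique` is false or the columns are a `MultiIndex`, so it may be worth checking for repeated column names before converting anything. The sketch below uses a made-up frame (the names `a` and `b` are placeholders, not from the original data) and shows one possible way to detect and drop duplicates. Whether this is the actual cause here is an assumption.

```python
import pandas as pd

# Hypothetical frame with a repeated column name (placeholder data).
df = pd.DataFrame([["1", "2", "3"]], columns=["a", "b", "a"])

# False when at least one column name occurs more than once.
print(df.columns.is_unique)

# Names that appear a second (or later) time.
dupes = df.columns[df.columns.duplicated()].tolist()
print(dupes)

# One option: keep only the first occurrence of each column name.
deduped = df.loc[:, ~df.columns.duplicated()]
print(list(deduped.columns))
```

With unique column names, `df[col]` returns a Series rather than a DataFrame, which is what per-column conversion code usually expects.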
2 Answers
阿波罗的战车
Try this:
for col in ["column1", "column 2", "column 3", "column 4"]:
    # df[col].reshape((1,-1))
    # Convert every value in the column to a Python int
    df[col] = [int(n) for n in df[col]]
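
A vectorized variant of the same idea, as a minimal sketch: `DataFrame.astype` accepts a mapping from column name to dtype, so the loop can be replaced by one call. The sample frame below is invented for illustration, and this assumes every value in the listed columns parses as an integer.

```python
import pandas as pd

# Hypothetical sample data: numeric values stored as strings.
df = pd.DataFrame({"column1": ["1", "2", "3"], "column 2": ["10", "20", "30"]})

cols = ["column1", "column 2"]

# astype with a {column: dtype} mapping converts only the listed columns.
# It raises ValueError if any value cannot be parsed as an integer.
df = df.astype({col: int for col in cols})
print(df.dtypes)
```

Like the list comprehension, this fails on the first unparseable value instead of silently replacing it.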
眼眸繁星
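I found the answer. The problem may be that I'm using an Oracle database connection, but I'm not sure. If anyone has a simpler way to do this in Python I'd still be happy to hear it, but this is how I did it: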
# errors='coerce' stores every non-convertible value as NA; errors='ignore' keeps the original values, so the column may end up with mixed data types.
df['column names'] = df[['column names']].apply(pd.to_numeric, errors='coerce').fillna(df)
Note that using coerce on non-numeric items may discard their data and turn them into NA. :) It works, though!
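
To see which values coercion would discard before overwriting the column, something like the following sketch may help. The column name `amount` and its contents are made up. The final step uses pandas' nullable `"Int64"` dtype so the column can be integer-typed while still holding missing values, which is an addition and not part of the answer above.

```python
import pandas as pd

# Hypothetical column mixing numeric strings with one non-numeric entry.
df = pd.DataFrame({"amount": ["1", "2", "n/a", "4"]})

# errors="coerce" turns anything unparseable into NaN.
numeric = pd.to_numeric(df["amount"], errors="coerce")

# Values that were present originally but became NaN after coercion.
lost = df.loc[numeric.isna() & df["amount"].notna(), "amount"]
print(lost.tolist())

# "Int64" (capital I) is the nullable integer dtype: missing values stay <NA>.
df["amount"] = numeric.astype("Int64")
print(df["amount"].dtype)
```

Inspecting `lost` first makes it clear whether the NA values come from genuinely bad data or from formatting that could be cleaned up instead.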