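I am trying to import a CSV file as a pandas DataFrame, where the CSV file sits inside a zip file. To make the import efficient, I want to read the header first and then load the file into a pandas DataFrame. This is what I have tried so far:

from zipfile import ZipFile
from io import TextIOWrapper
import pandas as pd

with ZipFile(zip_path, 'r') as zipfile:
    with zipfile.open(file_path, 'r') as file:
        reader = csv.reader(TextIOWrapper(file, 'utf-8', newline=''))
        headers = next(reader)
        df = pd.read_csv(file)

The problem is that calling next(reader) advances the underlying file past the header row, so the file is imported as a pandas DataFrame without its header. I would really appreciate any fix.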
1 Answer
梦里花落0921
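You can call seek() on the underlying file to rewind it after reading the header, which effectively resets the CSV iterator: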
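import csv
from io import TextIOWrapper
from zipfile import ZipFile

import pandas as pd

with ZipFile('test.zip', 'r') as zipfile:
    with zipfile.open('test.csv', 'r') as file:
        reader = csv.reader(TextIOWrapper(file, 'utf-8', newline=''))
        headers = next(reader)
        # rewind the underlying file so pandas starts at the header row again
        file.seek(0)
        df = pd.read_csv(file)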
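If rewinding is not an option (for example, if the handle turns out not to be seekable), another approach that may work is to open the zip member twice and let pandas parse the header itself with nrows=0. The following is only a sketch under assumptions: the helper name read_header_then_frame and the in-memory test.zip built here are hypothetical and exist purely to make the example self-contained.

```python
import io
from zipfile import ZipFile

import pandas as pd


def read_header_then_frame(zip_source, member, encoding="utf-8"):
    """Return (headers, df) for a CSV member inside a zip archive.

    Hypothetical helper for illustration; not part of pandas or zipfile.
    """
    with ZipFile(zip_source, "r") as archive:
        # First pass: nrows=0 makes pandas parse only the header row.
        with archive.open(member, "r") as fh:
            headers = list(pd.read_csv(fh, nrows=0, encoding=encoding).columns)
        # Second pass: a fresh handle starts at the beginning of the member,
        # so the header row is still there for the full read.
        with archive.open(member, "r") as fh:
            df = pd.read_csv(fh, encoding=encoding)
    return headers, df


# Build a small in-memory archive so the example runs without any files on disk.
buffer = io.BytesIO()
with ZipFile(buffer, "w") as archive:
    archive.writestr("test.csv", "a,b\n1,2\n3,4\n")
buffer.seek(0)

headers, df = read_header_then_frame(buffer, "test.csv")
print(headers)
print(df.shape)
```

With a real archive you would pass the path to the zip file instead of the in-memory buffer. Opening the member twice restarts decompression from the beginning, which costs about the same as seek(0) on a compressed member.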