首页猿问如何将多列字典转换为数据框？

如何将多列字典转换为数据框？

Python

尚方宝剑之说 2023-09-02 16:13:38

我正在使用 cryptocompare API 并尝试提取一堆不同代码的历史数据。从 Excel 电子表格中读取股票代码，然后通过 API 加载以获取每个股票代码的历史价格。这是我当前收到的数据框： https: //docs.google.com/spreadsheets/d/1jQ7F0H2D-voTBxjHI3QVPTRNyR4m4qKob8NE04JWVY0/edit ?usp=sharing我想知道一些事情：如何使索引成为日期和我如何访问“关闭”数据点？这是目前我的代码：import pandas as pdimport cryptocompareimport datetimedf = pd.read_excel('C:/Users/Jacob/Downloads/Crypto Tickers1.xlsx', sheet_name='Sheet1')tickers_list = df['Ticker'].tolist()data = pd.DataFrame(columns=tickers_list)for ticker in tickers_list: data[ticker] = cryptocompare.get_historical_price_day(ticker, 'USD', limit = 1000 , exchange='CCCAGG', toTs=datetime.datetime(2020,9,22)) export_excel = data.to_excel(r'C:/Users/Jacob/Downloads/Crypto Scrape df.xlsx', sheet_name='Sheet1', index= True)

查看完整描述

1 回答

慕村225694

TA贡献1880条经验获得超4个赞

对于每个股票代码，该文件由dicts, as组成的列。strings
加载您的文件pandas.read_excel。
使用 .将类型转换str为类型。dictast.literal_eval
将的每一列转换dicts为一个数据框pandas.json_normalize。

每个数据帧将被添加到dict, df_dict，其中键将是股票代码。

将股票代码作为一列添加到每个数据帧
将所有数据帧组合成一个数据帧，并带有pandas.concat.
使用 , 将列转换'time'为日期时间格式pandas.to_datetime，并设置为索引。
使用或访问该'close'列。df.closedf['close']
用于pandas.DataFrame.pivot获取'close'值，以代码作为标题和'time'索引。

import pandas as pd

from ast import literal_eval

# load the file

df = pd.read_excel('Crypto Scrape df.xlsx', sheet_name='Sheet1')

# drop the Unnamed column

df.drop(columns=['Unnamed: 0'], inplace=True)

# apply literal_eval to all columns to convert them from strings to dicts

df = df.applymap(literal_eval)

# create a dict of dataframes in a dict comprehension

df_dict = {col: pd.json_normalize(df[col]) for col in df.columns}

# add a ticker column

for k, d in df_dict.items():

df_dict[k]['ticker'] = k

# combine all the dicts into a single dataframe

df = pd.concat(df_dict.values()).reset_index(drop=True)

# convert the time column to a datetime format

df.time = pd.to_datetime(df.time, unit='s')

# set the time column as the index

df.set_index('time', inplace=True)

# to get only close values under each ticker with time as the index

dfp = df[['close', 'ticker']].pivot(columns='ticker', values='close')

# set the column and index name as None, if desired

dfp.columns.name = None

dfp.index.name = None

显示前 5 行和前 5 列dfp

# display(dfp.iloc[:5, :5])

ADA ALGO ATOM BAT BCH

2017-12-27 0.00 0.0 0.0 0.3200 2710.64

2017-12-28 0.00 0.0 0.0 0.6891 2484.96

2017-12-29 0.00 0.0 0.0 0.4013 2619.32

2017-12-30 0.59 0.0 0.0 0.4001 2209.96

2017-12-31 0.71 0.0 0.0 0.5910 2371.83

反对回复 2023-09-02

1 回答
0 关注
104 浏览

关注

添加回答

0/150

提交

取消

热搜

最近搜索清空

如何将多列字典转换为数据框？

如何将多列字典转换为数据框？

1 回答

添加回答