
Split a Pandas Series into multiple DataFrame columns by string position


Python

慕神8447489 2021-06-10 18:01:07

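Given a Pandas Series of strings, I would like to create a DataFrame with one column for each position-based section of the Series. For example, given the following input:

s = pd.Series(['abcdef', '123456'])
ind = [2, 3, 1]

Ideally, I would get this:

target_df = pd.DataFrame({
    'col1': ['ab', '12'],
    'col2': ['cde', '345'],
    'col3': ['f', '6']
})

One way is to create the columns one at a time, for example:

df['col1'] = s.str[:2]
df['col2'] = s.str[2:5]
df['col3'] = s.str[5]

But I'm guessing this is slower than a single split. I tried a regex, but I'm not sure how to parse the result:

pd.DataFrame(s.str.split(r"(^(\w{2})(\w{3})(\w{1}))"))
#                           0
# 0  [, abcdef, ab, cde, f, ]
# 1  [, 123456, 12, 345, 6, ]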


1 Answer

人到中年有点甜


Your regex is almost there (note that Series.str.extract(expand=True) returns a DataFrame):

# A raw string keeps the \w escapes intact for the regex engine.
df = s.str.extract(r"^(\w{2})(\w{3})(\w{1})", expand=True)
df.columns = ['col1', 'col2', 'col3']
#   col1 col2 col3
# 0   ab  cde    f
# 1   12  345    6
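If the column names are known up front, a small variation is to use named capture groups, which str.extract turns into column labels, so the separate df.columns assignment is not needed. This is only a sketch of the same idea; the variable name named is illustrative and not part of the answer.

```python
import pandas as pd

s = pd.Series(['abcdef', '123456'])

# Named groups (?P<name>...) become the column labels of the result,
# so no separate df.columns assignment is required.
named = s.str.extract(
    r"^(?P<col1>\w{2})(?P<col2>\w{3})(?P<col3>\w{1})",
    expand=True,
)
```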

Here is a function that generalizes this:

def split_series_by_position(s, ind, cols):
    # Build one capture group per width, e.g. [2, 3, 1] -> ^(\w{2})(\w{3})(\w{1})
    regex = r"^(\w{" + r"})(\w{".join(map(str, ind)) + "})"
    df = s.str.extract(regex, expand=True)
    df.columns = cols
    return df

# Example which will produce the result above.
split_series_by_position(s, ind, ['col1', 'col2', 'col3'])
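One caveat with the regex route is that \w only matches word characters, so a row containing spaces or punctuation would not match and would come back as NaN. A possible alternative, sketched below, is to turn the widths into cumulative boundaries and slice with Series.str. The helper split_series_by_slices is a hypothetical name introduced here, not part of the answer, and its speed relative to the regex version has not been measured.

```python
from itertools import accumulate

import pandas as pd


def split_series_by_slices(s, ind, cols):
    """Hypothetical helper: split fixed-width fields by slicing, not regex."""
    # Turn widths such as [2, 3, 1] into boundaries [0, 2, 5, 6].
    bounds = [0, *accumulate(ind)]
    # Slice one field per column; this works for any character, not only \w.
    return pd.DataFrame(
        {
            col: s.str[start:stop]
            for col, start, stop in zip(cols, bounds, bounds[1:])
        }
    )


s = pd.Series(['abcdef', '123456'])
ind = [2, 3, 1]
result = split_series_by_slices(s, ind, ['col1', 'col2', 'col3'])
```

With the example input, result holds the same three columns as target_df in the question.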

2021-06-22
