3 Answers
Contributor with 1836 experience points, more than 4 upvotes
In R, we can use rowid (from data.table) to create a column of new names, then reshape the data to 'wide' format with pivot_wider.
library(dplyr)       # mutate, %>%
library(tidyr)       # pivot_wider
library(data.table)  # rowid
library(stringr)     # str_c
x %>%
mutate(name = str_c('col_', rowid(ID))) %>%
pivot_wider(names_from = name, values_from = Group)
# A tibble: 2 x 6
# ID col_1 col_2 col_3 col_4 col_5
# <chr> <chr> <chr> <chr> <chr> <chr>
#1 samp_1 4 4.2 4.2.1 4.2.1.1 <NA>
#2 samp_2 1 1.2 1.2.1 1.2.1.2 1.2.1.2.1
Or using data.table:
library(data.table)
dcast(setDT(x), ID ~ paste0('col_', rowid(ID)), value.var = 'Group')
# ID col_1 col_2 col_3 col_4 col_5
#1: samp_1 4 4.2 4.2.1 4.2.1.1 <NA>
#2: samp_2 1 1.2 1.2.1 1.2.1.2 1.2.1.2.1
Or in base R with reshape:
reshape(transform(x, name = paste0('col_', ave(seq_along(ID), ID,
FUN = seq_along))), idvar = 'ID', direction = 'wide', timevar = 'name')
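All three variants above rely on the same step: numbering the rows within each ID (rowid(ID), or ave(seq_along(ID), ID, FUN = seq_along) in base R) and using that number as the new column name. For readers who want to see the mechanics spelled out, here is a minimal pure-Python sketch of that step. It assumes the rows already arrive in the desired order; the function name spread_by_position and the prefix parameter are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

def spread_by_position(rows, prefix="col_"):
    """Turn (id, value) pairs into one dict per id, keyed by within-id position."""
    wide = defaultdict(dict)
    for row_id, value in rows:
        # Equivalent of rowid(ID): a 1-based running count within each id.
        position = len(wide[row_id]) + 1
        wide[row_id][f"{prefix}{position}"] = value
    return dict(wide)

rows = [
    ("samp_1", "4"), ("samp_1", "4.2"), ("samp_1", "4.2.1"), ("samp_1", "4.2.1.1"),
    ("samp_2", "1"), ("samp_2", "1.2"), ("samp_2", "1.2.1"), ("samp_2", "1.2.1.2"),
    ("samp_2", "1.2.1.2.1"),
]
wide = spread_by_position(rows)
print(wide["samp_1"])
```

An id with fewer rows simply has no entry for the later columns, which is what shows up as <NA> in the R output.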
Contributor with 1847 experience points, more than 7 upvotes
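Excellent options from akrun. If the data is a bit messy (for example, the rows are out of order), you may want to try this instead, which names each column by the number of dots in Group rather than by row position: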
x %>%
mutate(temp = str_c('col_', str_count(Group, "\\."))) %>%
pivot_wider(names_from = temp, values_from = Group) %>%
select(ID, order(colnames(.)))
Data:
groups <- c("41.2","4","4.2.1","4.2.1.1", "1", "1.2", "1.2.1", "1.2.1.2","1.2.1.2.1")
x <- data.frame(ID = c(rep("samp_1", 4), rep("samp_2", 5)), Group = groups)
Result:
# A tibble: 2 x 6
ID col_0 col_1 col_2 col_3 col_4
<chr> <chr> <chr> <chr> <chr> <chr>
1 samp_1 4 41.2 4.2.1 4.2.1.1 NA
2 samp_2 1 1.2 1.2.1 1.2.1.2 1.2.1.2.1
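For comparison, the same depth-based naming can be sketched in pandas. This is only an illustration under the same assumptions as the R code: the number of dots in Group marks the level, and each ID has at most one value per level. The helper column name depth is a hypothetical choice, not part of the original answer.

```python
import pandas as pd

groups = ["41.2", "4", "4.2.1", "4.2.1.1", "1", "1.2", "1.2.1", "1.2.1.2", "1.2.1.2.1"]
x = pd.DataFrame({"ID": ["samp_1"] * 4 + ["samp_2"] * 5, "Group": groups})

# Name each target column after the number of dots in Group (hypothetical helper column).
x["depth"] = "col_" + x["Group"].str.count(r"\.").astype(str)

# pivot raises a ValueError if an ID has two values at the same depth.
wide = x.pivot(index="ID", columns="depth", values="Group")

# Sort the level columns by name and turn the ID index back into a regular column.
wide = wide.sort_index(axis=1).rename_axis(columns=None).reset_index()
print(wide)
```

Because the column name comes from the value itself, the order of the input rows does not matter, which is the point of this variant.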
Contributor with 1802 experience points, more than 4 upvotes
You can try this in Python:
import pandas as pd
import numpy as np
df = pd.DataFrame({
    'ID': np.repeat(["samp_1", "samp_2"], [4, 5]),
    'groups': ["4", "4.2", "4.2.1", "4.2.1.1", "1", "1.2", "1.2.1", "1.2.1.2", "1.2.1.2.1"],
})
# 1-based position of each row within its ID group
df['entry'] = df.groupby(['ID']).cumcount() + 1
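We number the rows within each group and store that number in the entry column. Below we pivot, just as in R, using that column to supply the column names, and finally reset the index: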
df.pivot(values='groups',columns='entry',index='ID').reset_index()
entry ID 1 2 3 4 5
0 samp_1 4 4.2 4.2.1 4.2.1.1 NaN
1 samp_2 1 1.2 1.2.1 1.2.1.2 1.2.1.2.1
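If you want column names that match the col_1 ... col_5 layout of the R answers, one possible variation is to add a prefix after pivoting. This is a sketch rather than part of the original answer; the result name wide is an illustrative choice.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "ID": np.repeat(["samp_1", "samp_2"], [4, 5]),
    "groups": ["4", "4.2", "4.2.1", "4.2.1.1", "1", "1.2", "1.2.1", "1.2.1.2", "1.2.1.2.1"],
})

# 1-based position of each row within its ID group.
df["entry"] = df.groupby("ID").cumcount() + 1

wide = (
    df.pivot(index="ID", columns="entry", values="groups")
      .add_prefix("col_")          # 1, 2, ... become col_1, col_2, ...
      .rename_axis(columns=None)   # drop the leftover "entry" label on the columns
      .reset_index()
)
print(wide)
```

Dropping the columns label with rename_axis is optional; it only removes the entry header shown in the output above.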