1 回答
![?](http://img1.sycdn.imooc.com/5458683f00017bab02200220-100-100.jpg)
TA贡献2019条经验 获得超9个赞
这是关系数据库的行为,不能仅emp_id每个组显示一次并且其所有元素每行显示一个。改变数据显示方式是前端的特权和责任,而不是数据库的。因此,请在 Python 中执行此操作。
话虽如此,Impala 具有SPLIT_PART()字符串函数,它返回由作为参数传递的分隔符分隔的字符串的第 n 个标记。
因此,与一系列连续整数交叉连接,然后应用SPLIT_PART(skills,'|',i)即可满足您的需要。
实际上,以我(从来没有)谦虚的观点来看,每当有人向您抛出格式如此不合适的文件以将其加载到数据库中时,您就应该这样做。始终使用下面的技术垂直化逗号/条/分号/或任何分隔的“值”列表,并垂直存储数据。:
WITH
-- your input
input( emp_id,skills) AS (
SELECT 1234,'python|java|sql|R|javascript'
UNION ALL SELECT 5639,'C|HTML|php|perl'
)
,
-- a big enough series of integers ..
i(i) AS (
SELECT 1
UNION ALL SELECT 2
UNION ALL SELECT 3
UNION ALL SELECT 4
UNION ALL SELECT 5
UNION ALL SELECT 6
UNION ALL SELECT 7
UNION ALL SELECT 8
)
SELECT
emp_id
, SPLIT_PART(skills,'|',i) AS skill
FROM input
CROSS JOIN i
WHERE SPLIT_PART(skills,'|',i) <> ''
ORDER BY
emp_id
, i
;
-- out emp_id | skill
-- out --------+------------
-- out 1234 | python
-- out 1234 | java
-- out 1234 | sql
-- out 1234 | R
-- out 1234 | javascript
-- out 5639 | C
-- out 5639 | HTML
-- out 5639 | php
-- out 5639 | perl
使用两个横杠/逗号分隔的列,它可能如下所示:
WITH
-- your input, enhanced
input( emp_id,skills,pubs) AS (
SELECT 1234,'python|java|sql|R|javascript','ship inn,anchor,stag'
UNION ALL SELECT 5639,'C|HTML|php|perl' ,'black horse,crown,mitre'
)
,
-- a big enough series of integers ..
i(i) AS (
SELECT 1
UNION ALL SELECT 2
UNION ALL SELECT 3
UNION ALL SELECT 4
UNION ALL SELECT 5
UNION ALL SELECT 6
UNION ALL SELECT 7
UNION ALL SELECT 8
)
,
-- another big enough series of integers ..
j(j) AS (
SELECT
i AS j
FROM i
)
SELECT
emp_id
, i AS skill_sequence
, SPLIT_PART(skills,'|',i) AS skill
, j AS pub_sequence
, SPLIT_PART(pubs,',',j) AS pub
FROM input
CROSS JOIN i
CROSS JOIN j
WHERE SPLIT_PART(skills,'|',i) <> ''
AND SPLIT_PART(pubs, ',',j) <> ''
ORDER BY
emp_id
, i
, j
;
-- out emp_id | skill_sequence | skill | pub_sequence | pub
-- out --------+----------------+------------+--------------+-------------
-- out 1234 | 1 | python | 1 | ship inn
-- out 1234 | 1 | python | 2 | anchor
-- out 1234 | 1 | python | 3 | stag
-- out 1234 | 2 | java | 1 | ship inn
-- out 1234 | 2 | java | 2 | anchor
-- out 1234 | 2 | java | 3 | stag
-- out 1234 | 3 | sql | 1 | ship inn
-- out 1234 | 3 | sql | 2 | anchor
-- out 1234 | 3 | sql | 3 | stag
-- out 1234 | 4 | R | 1 | ship inn
-- out 1234 | 4 | R | 2 | anchor
-- out 1234 | 4 | R | 3 | stag
-- out 1234 | 5 | javascript | 1 | ship inn
-- out 1234 | 5 | javascript | 2 | anchor
-- out 1234 | 5 | javascript | 3 | stag
-- out 5639 | 1 | C | 1 | black horse
-- out 5639 | 1 | C | 2 | crown
-- out 5639 | 1 | C | 3 | mitre
-- out 5639 | 2 | HTML | 1 | black horse
-- out 5639 | 2 | HTML | 2 | crown
-- out 5639 | 2 | HTML | 3 | mitre
-- out 5639 | 3 | php | 1 | black horse
-- out 5639 | 3 | php | 2 | crown
-- out 5639 | 3 | php | 3 | mitre
-- out 5639 | 4 | perl | 1 | black horse
-- out 5639 | 4 | perl | 2 | crown
-- out 5639 | 4 | perl | 3 | mitre
添加回答
举报