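I am new to Spark. I have tried to explode an array that sits inside a struct. The JSON nesting is a bit complex, as shown below.
{
  "id": 1,
  "firstfield": "abc",
  "secondfield": "zxc",
  "firststruct": {
    "secondstruct": {
      "firstarray": [{
        "firstarrayfirstfield": "asd",
        "firstarraysecondfield": "dasd",
        "secondarray": [{ "score": " 7 " }]
      }]
    }
  }
}
I am trying to access the score field under secondarray so that I can compute some metrics and produce a result for each id.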
1 Answer
LEATH
If you are using Glue, you should convert the DynamicFrame to a Spark DataFrame and then use the explode function:
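from awsglue.dynamicframe import DynamicFrame
from pyspark.sql.functions import col, explode
# Convert to a Spark DataFrame, then explode the outer and inner arrays in turn
scoresDf = (
    dynamicFrame.toDF()
    .withColumn("firstExplode", explode(col("firststruct.secondstruct.firstarray")))
    .withColumn("secondExplode", explode(col("firstExplode.secondarray")))
    .select("secondExplode.score")
)
# Convert the result back to a DynamicFrame for the rest of the Glue job
scoresDyf = DynamicFrame.fromDF(scoresDf, glueContext, "scoresDyf")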
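To see what the two explode calls produce without a Spark or Glue environment, here is a minimal plain-Python sketch that runs on the sample record from the question. The helper names explode_scores and mean_score_per_id are hypothetical, and the mean is only an assumed example of a "metric per id"; it illustrates the row semantics and is not the Spark implementation.

```python
import json
from collections import defaultdict

# Sample record copied from the question.
record = json.loads("""
{"id": 1, "firstfield": "abc", "secondfield": "zxc",
 "firststruct": {"secondstruct": {"firstarray": [
   {"firstarrayfirstfield": "asd", "firstarraysecondfield": "dasd",
    "secondarray": [{"score": " 7 "}]}]}}}
""")

def explode_scores(rec):
    """Yield one (id, score) pair per element of the innermost array."""
    # Outer loop plays the role of the first explode (firstarray).
    for first in rec["firststruct"]["secondstruct"]["firstarray"]:
        # Inner loop plays the role of the second explode (secondarray).
        for second in first["secondarray"]:
            # score is a padded string in the sample, so strip and convert it.
            yield rec["id"], int(second["score"].strip())

def mean_score_per_id(records):
    """Group the exploded scores by id and average them (assumed metric)."""
    scores = defaultdict(list)
    for rec in records:
        for rec_id, score in explode_scores(rec):
            scores[rec_id].append(score)
    return {rec_id: sum(vals) / len(vals) for rec_id, vals in scores.items()}

rows = list(explode_scores(record))
means = mean_score_per_id([record])
print(rows)
print(means)
```

Two things stand out from the sketch: the id has to travel along with each exploded row if metrics are needed per id, and the score in the sample is a string with surrounding spaces. In the Spark version this would likely mean selecting "id" alongside the score and trimming and casting the score before grouping by id.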