How to create a Spark table from a nested list

How can I use this answer (List to DataFrame in pyspark) to create a Spark table from a nested list?

lst = [
  {'sfObject': 'event', 'objID': 'Id', 'interimRun': 'True', 'numAttributes_Total': 140, 'numAttributes_Compounded': 0, 'numAttributes_nonCompounded': 140, 'chunks': 1, 'compoundStatus': 'False', 'allAttributes': ['Id', 'RecordTypeId', 'WhoId', 'Advisor_Team__c', …], 'compoundAttributes': [], 'nonCompoundAttributes': ['Id', 'RecordTypeId', 'WhoId', 'WhatId', …]},
  {'sfObject': 'fund__c', 'objID': 'Id', 'interimRun': 'False', 'numAttributes_Total': 40, 'numAttributes_Compounded': 0, 'numAttributes_nonCompounded': 40, 'chunks': 1, 'compoundStatus': 'False', 'allAttributes': ['Id', 'IsDeleted', 'Name', …], 'compoundAttributes': [], 'nonCompoundAttributes': ['Id', 'IsDeleted', 'Name', 'RecordTypeId', …]}
]

I want to store this list in a table, so it needs the structure shown in the image below, which is the table I need to create from lst:

[image: enter image description here]

The nested list contains up to 30 different items, so the answer needs to create up to 30 rows dynamically, one per item. Thanks!


1 Answer

慕神8447489


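Once you have the list of dictionaries, run the following (sc is the active SparkContext). Spark will infer the schema from the dictionaries.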

df = sc.parallelize(lst).toDF()
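Schema inference can be fragile for data shaped like the question's lst: a list that is empty in every row (compoundAttributes in the sample) gives Spark no element type to infer, and depending on the version this may raise an error. As a hedged alternative, here is a minimal sketch that supplies an explicit schema instead. The helper names (COLUMNS, to_rows, build_schema, build_dataframe) are hypothetical, the column types are assumptions read off the sample values, and spark is assumed to be an existing SparkSession.

```python
# Column order for the table; the names are the dictionary keys from the
# question's lst. COLUMNS and the helpers below are hypothetical names.
COLUMNS = [
    "sfObject", "objID", "interimRun",
    "numAttributes_Total", "numAttributes_Compounded",
    "numAttributes_nonCompounded", "chunks", "compoundStatus",
    "allAttributes", "compoundAttributes", "nonCompoundAttributes",
]

# Assumed types, based on the sample values shown in the question.
STRING_COLUMNS = {"sfObject", "objID", "interimRun", "compoundStatus"}
ARRAY_COLUMNS = {"allAttributes", "compoundAttributes", "nonCompoundAttributes"}


def to_rows(records):
    """Turn each dict into a tuple in COLUMNS order (one tuple per table row)."""
    return [tuple(rec.get(col) for col in COLUMNS) for rec in records]


def build_schema():
    """Build an explicit schema so empty lists never need to be inferred."""
    # Imported here so the pure-Python helpers work without pyspark installed.
    from pyspark.sql.types import (
        ArrayType, LongType, StringType, StructField, StructType,
    )

    fields = []
    for col in COLUMNS:
        if col in STRING_COLUMNS:
            dtype = StringType()
        elif col in ARRAY_COLUMNS:
            dtype = ArrayType(StringType())
        else:
            dtype = LongType()  # the remaining columns hold integer counts
        fields.append(StructField(col, dtype, True))
    return StructType(fields)


def build_dataframe(spark, records):
    """Create a DataFrame with one row per dictionary in records."""
    return spark.createDataFrame(to_rows(records), schema=build_schema())
```

With this sketch, df = build_dataframe(spark, lst) yields one row per dictionary, so a list of 30 items produces 30 rows without any per-item code.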

If you want to treat it as a table and run SQL queries against it, run:

df.createOrReplaceTempView("df_table")

new_df = spark.sql("SELECT * FROM df_table")
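To illustrate what the view makes possible beyond SELECT *, here is a small sketch that registers the DataFrame and queries its array column. The function name query_view and the chosen filter are hypothetical; the column names come from the question's lst, and spark and df are assumed to be an existing SparkSession and the DataFrame built above.

```python
def query_view(spark, df, view_name="df_table"):
    """Register df as a temp view and summarise the interim-run objects."""
    df.createOrReplaceTempView(view_name)
    query = (
        "SELECT sfObject, numAttributes_Total, "
        "size(allAttributes) AS listed_attributes "  # size() counts array elements
        f"FROM {view_name} "
        "WHERE interimRun = 'True'"  # interimRun is stored as a string in lst
    )
    return spark.sql(query)
```

A temp view only lives for the current Spark session. If the table has to persist, df.write.saveAsTable with a table name of your choice is the usual route, assuming a metastore or warehouse location is configured.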

Answered 2023-01-04
