I wrote a Spark program that reads a CSV file and prints the result to the console. I get an error when I run it. I am using Spark 2.2.0.

Sample file:

    EmployeeID,FirstName,LastName,DepartmentId,Salaray
    1,Gowdhaman,Dhandapani,IT,10000
    2,Shaara,Gowdhaman,IT,15000
    3,Karthiga,Gowdhaman,IT,12000
    4,Aravind,Gunasekaran,Mech,10000
    5,Padma,Dhandapani,Home,10000

Program:

    from pyspark.sql import SparkSession

    def read_csv(spark, filename):
        df = spark.read.load(filename, format='.csv', sep=',', header = 'true')
        return df

    def main():
        spark = SparkSession \
            .builder \
            .appName('Python Spark SQL Basic example') \
            .getOrCreate()
        emp = read_csv(spark, 'Employee.csv')
        emp.show()

    if __name__ == '__main__':
        main()
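The post does not include the error text, so the following is only a likely explanation, not a confirmed diagnosis. The `format` argument names a Spark data source, and the built-in CSV source is registered as `csv`, without a leading dot. Passing `'.csv'` would typically make Spark fail to find the data source. A minimal sketch of the same program with that one change, assuming `Employee.csv` is in the working directory as in the question:

```python
def read_csv(spark, filename):
    """Load a comma-separated file with a header row into a DataFrame."""
    # The data source short name is 'csv' (no leading dot).
    return spark.read.load(filename, format='csv', sep=',', header='true')


def main():
    # Imported here so read_csv can be defined and inspected without Spark.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName('Python Spark SQL Basic example')
        .getOrCreate()
    )
    emp = read_csv(spark, 'Employee.csv')
    emp.show()
    spark.stop()
```

Calling `main()` runs the job. The shorthand `spark.read.csv(filename, sep=',', header=True)` should behave the same way and avoids spelling out the format name at all.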