First, make the Title class a Java bean, i.e. write getters and setters for its fields.
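import java.io.Serializable;
import java.sql.Timestamp;

public class Title implements Serializable {
    private String txn_date;
    private Timestamp timestamp;
    private String txn_type;
    private String txn_rcvd_time;
    private String txn_ref;
    private String txn_status;

    // Encoders.bean needs a public no-argument constructor
    public Title() {}

    public Title(String data) {
        ... // set values for fields with the data
    }

    // add all getters and setters for fields
}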
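To check that a class really follows the bean convention before handing it to Encoders.bean, something like the following sketch may help. It uses only the JDK's java.beans.Introspector and assumes, as far as I know, that Spark's bean encoder relies on the same getter/setter discovery. TitleBeanCheck, TitleSketch and readWriteProperties are hypothetical names made up for this illustration, and TitleSketch is a two-field cut-down of Title, not the real class.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class TitleBeanCheck {

    // Hypothetical two-field stand-in for Title, written as a plain Java bean.
    public static class TitleSketch implements Serializable {
        private String txn_ref;
        private String txn_status;

        public TitleSketch() {}  // public no-argument constructor

        public String getTxn_ref() { return txn_ref; }
        public void setTxn_ref(String txn_ref) { this.txn_ref = txn_ref; }

        public String getTxn_status() { return txn_status; }
        public void setTxn_status(String txn_status) { this.txn_status = txn_status; }
    }

    // Returns the names of properties that have both a getter and a setter.
    public static List<String> readWriteProperties(Class<?> beanClass) {
        try {
            // Stop at Object so the built-in "class" property is not reported.
            BeanInfo info = Introspector.getBeanInfo(beanClass, Object.class);
            List<String> names = new ArrayList<>();
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                if (pd.getReadMethod() != null && pd.getWriteMethod() != null) {
                    names.add(pd.getName());
                }
            }
            return names;
        } catch (IntrospectionException e) {
            throw new IllegalStateException("Could not introspect " + beanClass, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readWriteProperties(TitleSketch.class));
    }
}
```

A field that is missing from the printed list has no matching getter/setter pair, so it is worth fixing that before building the encoder.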
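// needs org.apache.spark.api.java.function.MapFunction and FilterFunction
Dataset<Title> resultdf = df.selectExpr("CAST(value AS STRING)")
    .as(Encoders.STRING())  // Dataset<Row> -> Dataset<String>
    .map((MapFunction<String, Title>) value -> new Title(value), Encoders.bean(Title.class));
resultdf.filter((FilterFunction<Title>) title -> ...); // apply any predicate on title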
If you want to filter the data first and then apply the encoder:
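// needs static imports of col and get_json_object from org.apache.spark.sql.functions
df.selectExpr("CAST(value AS STRING)")
    .filter(get_json_object(col("value"), "$.sample_title").isNotNull())
    // for a simple filter, after .as(Encoders.STRING()) use: .filter((FilterFunction<String>) t -> t.contains("sample_title"))
    .as(Encoders.STRING())
    .map((MapFunction<String, Title>) value -> new Title(value), Encoders.bean(Title.class));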