def map[U: ClassTag](f: T => U): RDD[U] = {val cleanF = sc.clean(f)new MapPartitionsRDD[U, T](this, (context, pid, iter) => iter.map(cleanF))}RDD.scala里的这个方法里的context, pid, iter不知道从哪来的啊??https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/rdd/RDD.scala
2 回答
慕勒3428872
TA贡献1848条经验 获得超6个赞
private[spark] class MapPartitionsRDD[U: ClassTag, T: ClassTag](
prev: RDD[T],
f: (TaskContext, Int, Iterator[T]) => Iterator[U], // (TaskContext, partition index, iterator)
preservesPartitioning: Boolean = false)
extends RDD[U](prev) {
override def compute(split: Partition, context: TaskContext): Iterator[U] =
f(context, split.index, firstParent[T].iterator(split, context))
}
方法的参数列表,传入一个参数为(TaskContext, Int, Iterator[T])返回为Iterator[U]的函数作为MapPartitionsRDD的构造函数的参数f,方法compute会调用这个方法。
- 2 回答
- 0 关注
- 972 浏览
添加回答
举报
0/150
提交
取消