Spark Core: RPC Source Code (Part 1)

Tags:

Spark

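The startup flow of the Spark Master and Worker has been described elsewhere. In both cases, the first step of the main method is to build an RpcEnv, which is the core of message communication, so this article analyzes the RPC layer in detail.
First, look at the similar code that the Master and the Worker use to build the RpcEnv: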

Master:
    val rpcEnv = RpcEnv.create(SYSTEM_NAME, host, port, conf, securityMgr)
    // masterEndpoint is the reference used to send a message to the corresponding
    // [[RpcEndpoint]]; here that RpcEndpoint is the Master
    val masterEndpoint = rpcEnv.setupEndpoint(ENDPOINT_NAME,
      new Master(rpcEnv, rpcEnv.address, webUiPort, securityMgr, conf))
    val portsResponse = masterEndpoint.askWithRetry[BoundPortsResponse](BoundPortsRequest)

Worker:
    val rpcEnv = RpcEnv.create(systemName, host, port, conf, securityMgr)
    val masterAddresses = masterUrls.map(RpcAddress.fromSparkURL(_))
    rpcEnv.setupEndpoint(ENDPOINT_NAME, new Worker(rpcEnv, webUiPort, cores, memory,
      masterAddresses, ENDPOINT_NAME, workDir, conf, securityMgr))

As the code shows, two steps matter most: RpcEnv.create and rpcEnv.setupEndpoint. The rest of this article analyzes each of them in detail.

RpcEnv.create

The flow of RpcEnv.create is roughly as follows:

[Figure: RpcEnv.create flow chart]

Underneath, this starts a Netty server and opens Netty-based communication (server = transportContext.createServer(host, port, bootstraps)).

rpcEnv.setupEndpoint

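All Spark messages are actually handled by the RpcEnv, which then dispatches them to the corresponding endpoint. An RpcEndpointRef is a reference to an RpcEndpoint: to send a message to an RpcEndpoint, you first have to obtain its RpcEndpointRef.
Taking the Master as an example:

    val masterEndpoint = rpcEnv.setupEndpoint(ENDPOINT_NAME,
      new Master(rpcEnv, rpcEnv.address, webUiPort, securityMgr, conf))

The object created by new Master is an RpcEndpoint. The call goes to the setupEndpoint method of the NettyRpcEnv class: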

dispatcher.registerRpcEndpoint(name, endpoint)

From there it goes to the registerRpcEndpoint method of the Dispatcher class:

  def registerRpcEndpoint(name: String, endpoint: RpcEndpoint): NettyRpcEndpointRef = {
    // The Dispatcher holds the NettyRpcEnv, so the address is available as nettyEnv.address,
    // the address (host and port) on which this NettyRpcEnv was started.
    val addr = RpcEndpointAddress(nettyEnv.address, name)
    // Create the endpointRef. Here it should be the Master's RpcEndpointRef, which is
    // actually a NettyRpcEndpointRef object.
    val endpointRef = new NettyRpcEndpointRef(nettyEnv.conf, addr, nettyEnv)
    synchronized {
      if (stopped) {
        throw new IllegalStateException("RpcEnv has been stopped")
      }
      // Add an EndpointData under this name unless one is already registered.
      if (endpoints.putIfAbsent(name, new EndpointData(name, endpoint, endpointRef)) != null) {
        throw new IllegalArgumentException(s"There is already an RpcEndpoint called $name")
      }
      val data = endpoints.get(name)
      // Record the endpoint -> ref mapping in endpointRefs.
      endpointRefs.put(data.endpoint, data.ref)
      // Put data on the receivers queue, where the thread pool picks it up and
      // processes its messages.
      receivers.offer(data)  // for the OnStart message
    }
    endpointRef
  }
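
The duplicate-name check above depends on putIfAbsent returning null only when the key was not already present. The following is a minimal sketch of that registration step, assuming a hypothetical ToyRegistry that stores plain strings rather than Spark's EndpointData; none of these names come from Spark.

```scala
import java.util.concurrent.{ConcurrentHashMap, ConcurrentMap, LinkedBlockingQueue}

// Hypothetical stand-in for the Dispatcher's bookkeeping; not Spark's actual class.
object ToyRegistry {
  // name -> endpoint description (Spark stores an EndpointData here)
  private val endpoints: ConcurrentMap[String, String] =
    new ConcurrentHashMap[String, String]
  // names whose inboxes may hold messages, waiting for a worker thread
  private val receivers = new LinkedBlockingQueue[String]

  def register(name: String, endpoint: String): String = {
    // putIfAbsent returns null only when the key was not present before
    if (endpoints.putIfAbsent(name, endpoint) != null) {
      throw new IllegalArgumentException(s"There is already an RpcEndpoint called $name")
    }
    receivers.offer(name) // schedule the endpoint so its first message gets processed
    name
  }

  // Number of endpoints currently waiting to be picked up.
  def pending: Int = receivers.size
}
```

Registering "Master" twice throws on the second call and leaves the receivers queue unchanged, which matches the behaviour of the real method when a name is reused.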

Several fields of the Dispatcher are important:

  // endpoints is a thread-safe ConcurrentMap; the key is the name and the value is the EndpointData
  private val endpoints: ConcurrentMap[String, EndpointData] =
    new ConcurrentHashMap[String, EndpointData]
  // endpointRefs holds the one-to-one mapping between RpcEndpoint and RpcEndpointRef
  private val endpointRefs: ConcurrentMap[RpcEndpoint, RpcEndpointRef] =
    new ConcurrentHashMap[RpcEndpoint, RpcEndpointRef]

  // Track the receivers whose inboxes may contain messages.
  // receivers is a queue; the Dispatcher's thread pool consumes the entries in it
  private val receivers = new LinkedBlockingQueue[EndpointData]

EndpointData consists of a name, an RpcEndpoint, and a NettyRpcEndpointRef, and it instantiates an Inbox. When the Inbox is constructed, it adds OnStart to its messages queue as the first message, which is why the onStart() function runs right after the RpcEndpoint constructor has finished.

  private class EndpointData(
      val name: String,
      val endpoint: RpcEndpoint,
      val ref: NettyRpcEndpointRef) {
    val inbox = new Inbox(ref, endpoint)
  }
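
To illustrate why OnStart is always handled first, here is a minimal sketch of an inbox that enqueues a start message in its constructor. ToyInbox, ToyOnStart, and ToyUserMessage are hypothetical names for illustration and simplify Spark's Inbox considerably.

```scala
import scala.collection.mutable

// Hypothetical message types; a simplification of Spark's InboxMessage hierarchy.
sealed trait ToyMessage
case object ToyOnStart extends ToyMessage
final case class ToyUserMessage(content: String) extends ToyMessage

// Hypothetical inbox, not Spark's Inbox.
class ToyInbox {
  private val messages = mutable.Queue[ToyMessage]()
  // Enqueued during construction, so it is always ahead of any posted message.
  messages.enqueue(ToyOnStart)

  def post(m: ToyMessage): Unit = synchronized { messages.enqueue(m) }

  // Returns the oldest message, or None when the inbox is empty.
  def poll(): Option[ToyMessage] = synchronized {
    if (messages.isEmpty) None else Some(messages.dequeue())
  }
}
```

Because the start message is queued before the inbox becomes reachable by any sender, a consumer polling this inbox sees it before anything posted later.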

NettyRpcEnv has two methods for serialization and deserialization, because its messages have to be transmitted to remote processes:

private[netty] def serialize(content: Any): ByteBuffer = {
    javaSerializerInstance.serialize(content)
  }

  private[netty] def deserialize[T: ClassTag](client: TransportClient, bytes: ByteBuffer): T = {
    NettyRpcEnv.currentClient.withValue(client) {
      deserialize { () =>
        javaSerializerInstance.deserialize[T](bytes)
      }
    }
  }
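
As a rough illustration of what such a serialize/deserialize pair does, the sketch below round-trips an object through a ByteBuffer using plain JDK object streams. ToySerializer is a hypothetical helper, and it is an assumption that this resembles what javaSerializerInstance does internally; the real one also handles class loaders and other details omitted here.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}
import java.nio.ByteBuffer

// Hypothetical helper built on JDK serialization; not Spark's JavaSerializerInstance.
object ToySerializer {
  def serialize(content: AnyRef): ByteBuffer = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    try out.writeObject(content) finally out.close()
    ByteBuffer.wrap(bytes.toByteArray)
  }

  def deserialize[T](buffer: ByteBuffer): T = {
    // Read from a duplicate so the caller's buffer position is left untouched.
    val copy = new Array[Byte](buffer.remaining())
    buffer.duplicate().get(copy)
    val in = new ObjectInputStream(new ByteArrayInputStream(copy))
    try in.readObject().asInstanceOf[T] finally in.close()
  }
}
```

For example, ToySerializer.deserialize[String](ToySerializer.serialize("ReconnectWorker")) gives back an equal string. Any message sent this way has to be Serializable.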

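A Worker process starts the same way: its setupEndpoint call creates the mapping between the Worker and its NettyRpcEndpointRef.

RPC communication

First, look at two important methods of RpcEndpointRef: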

  /**
   * Sends a one-way asynchronous message. Fire-and-forget semantics.
   */
  def send(message: Any): Unit

  /**
   * Send a message to the corresponding [[RpcEndpoint.receiveAndReply)]] and return a [[Future]] to
   * receive the reply within the specified timeout.
   *
   * This method only sends the message once and never retries.
   */
  def ask[T: ClassTag](message: Any, timeout: RpcTimeout): Future[T]

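The first is send, which the source describes as a one-way asynchronous message; in other words, it is sent and no reply is expected.
ask differs from send in that it needs a reply: it sends a message to the specified endpoint, and the receiving endpoint processes the message and then replies. The endpoint may be local or remote.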
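
The difference between the two call styles can be sketched with a Promise: send returns nothing, while ask hands back a Future that is completed with the reply or with a failure. ToyEndpointRef below is a hypothetical in-process stand-in, not Spark's RpcEndpointRef; it delivers synchronously and ignores timeouts.

```scala
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.duration._

// Hypothetical endpoint reference that delivers in-process; not Spark's RpcEndpointRef.
class ToyEndpointRef(handler: PartialFunction[Any, Any]) {
  @volatile var received: List[Any] = Nil

  // Fire-and-forget: the message is recorded and the caller gets nothing back.
  def send(message: Any): Unit = synchronized { received = message :: received }

  // Request-reply: the caller gets a Future holding the reply or a failure.
  def ask[T](message: Any): Future[T] = {
    val reply = Promise[T]()
    if (handler.isDefinedAt(message)) reply.success(handler(message).asInstanceOf[T])
    else reply.failure(new IllegalArgumentException(s"Unsupported message $message"))
    reply.future
  }
}
```

A caller would typically wait on the result with something like Await.result(ref.ask[String]("ping"), 1.second), whereas a send call returns immediately and reports nothing about delivery.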

Let us pick a piece of code to analyze, for example this one from the Master:

    case Heartbeat(workerId, worker) =>
      idToWorker.get(workerId) match {
        case Some(workerInfo) =>
          workerInfo.lastHeartbeat = System.currentTimeMillis()
        case None =>
          if (workers.map(_.id).contains(workerId)) {
            logWarning(s"Got heartbeat from unregistered worker $workerId." +
              " Asking it to re-register.")
            worker.send(ReconnectWorker(masterUrl))
          } else {
            logWarning(s"Got heartbeat from unregistered worker $workerId." +
              " This worker was never registered, so ignoring the heartbeat.")
          }
      }

The logic is as follows: the Worker periodically sends heartbeats to the Master. If the Master cannot find the workerInfo for a workerId, it checks whether the workers set contains that workerId, and if so it tells the Worker to reconnect.
In worker.send(ReconnectWorker(masterUrl)), worker is an RpcEndpointRef, and in practice a NettyRpcEndpointRef, so the call continues in:

  override def send(message: Any): Unit = {
    require(message != null, "Message is null")
    nettyEnv.send(RequestMessage(nettyEnv.address, this, message))
    // RequestMessage(senderAddress, receiver, content)
  }

RequestMessage:

/**
 * The message that is sent from the sender to the receiver.
 */
private[netty] case class RequestMessage(
    senderAddress: RpcAddress, receiver: NettyRpcEndpointRef, content: Any)

Then it reaches:

  private[netty] def send(message: RequestMessage): Unit = {
    val remoteAddr = message.receiver.address
    if (remoteAddr == address) {
      // Message to a local RPC endpoint.
      try {
        dispatcher.postOneWayMessage(message)
      } catch {
        case e: RpcEnvStoppedException => logWarning(e.getMessage)
      }
    } else {
      // Message to a remote RPC endpoint.
      postToOutbox(message.receiver, OneWayOutboxMessage(serialize(message)))
    }
  }

That is, the address of the message's receiver is compared with the local address: if they are equal it is a local RPC, otherwise it is a remote RPC. Here the Master node is sending a message to a Worker node, so this is the remote case. The two cases are described separately below.

(1) Remote RPC:

  private def postToOutbox(receiver: NettyRpcEndpointRef, message: OutboxMessage): Unit = {
    if (receiver.client != null) {
      message.sendWith(receiver.client)
    } else {
      require(receiver.address != null,
        "Cannot send message to client endpoint with no listen address.")
      val targetOutbox = {
        val outbox = outboxes.get(receiver.address)
        if (outbox == null) {
          val newOutbox = new Outbox(this, receiver.address)
          val oldOutbox = outboxes.putIfAbsent(receiver.address, newOutbox)
          if (oldOutbox == null) {
            newOutbox
          } else {
            oldOutbox
          }
        } else {
          outbox
        }
      }
      if (stopped.get) {
        // It's possible that we put `targetOutbox` after stopping. So we need to clean it.
        outboxes.remove(receiver.address)
        targetOutbox.stop()
      } else {
        targetOutbox.send(message)
      }
    }
  }
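
The targetOutbox block above is a get-or-create lookup that stays correct when two threads race to create the outbox for the same address. A minimal sketch of just that pattern, assuming a hypothetical ToyOutbox keyed by an address string instead of Spark's Outbox and RpcAddress:

```scala
import java.util.concurrent.ConcurrentHashMap

// Hypothetical outbox keyed by an address string; not Spark's Outbox.
final class ToyOutbox(val address: String)

object ToyOutboxes {
  private val outboxes = new ConcurrentHashMap[String, ToyOutbox]()

  def outboxFor(address: String): ToyOutbox = {
    val existing = outboxes.get(address)
    if (existing != null) existing
    else {
      val fresh = new ToyOutbox(address)
      // If another thread won the race, putIfAbsent returns its outbox and ours is dropped.
      val raced = outboxes.putIfAbsent(address, fresh)
      if (raced == null) fresh else raced
    }
  }
}
```

Every caller asking for the same address ends up with the same instance, so messages to one remote address share a single outbox.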

A remote RPC is ultimately sent through the TransportClient:

/**
   * Sends an opaque message to the RpcHandler on the server-side. The callback will be invoked
   * with the server's response or upon any failure.
   *
   * @param message The message to send.
   * @param callback Callback to handle the RPC's reply.
   * @return The RPC's id.
   */
  public long sendRpc(ByteBuffer message, final RpcResponseCallback callback) {

That is, the data is sent through the Netty framework to the RpcHandler on the remote server, which handles it.

The NettyRpcHandler then receives the message and posts it to the Inbox, where the thread pool consumes it:

  /** Posts a message sent by a remote endpoint. */
  def postRemoteMessage(message: RequestMessage, callback: RpcResponseCallback): Unit = {
    val rpcCallContext =
      new RemoteNettyRpcCallContext(nettyEnv, callback, message.senderAddress)
    val rpcMessage = RpcMessage(message.senderAddress, message.content, rpcCallContext)
    postMessage(message.receiver.name, rpcMessage, (e) => callback.onFailure(e))
  }

Consumption by the thread pool:

  /** Message loop used for dispatching messages. */
  private class MessageLoop extends Runnable {
    override def run(): Unit = {
      NettyRpcEnv.rpcThreadFlag.value = true
      try {
        while (true) {
          try {
            val data = receivers.take()
            if (data == PoisonPill) {
              // Put PoisonPill back so that other MessageLoops can see it.
              receivers.offer(PoisonPill)
              return
            }
            data.inbox.process(Dispatcher.this)
          } catch {
            case NonFatal(e) => logError(e.getMessage, e)
          }
        }
      } catch {
        case ie: InterruptedException => // exit
      }
    }
  }
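
The PoisonPill handling in this loop lets a single sentinel stop every consumer thread, because each thread puts the pill back before exiting. Below is a minimal sketch of that shutdown pattern. ToyLoop, its string sentinel, and the runDemo helper are hypothetical and exist only for illustration; Spark uses a dedicated EndpointData instance as the pill.

```scala
import java.util.concurrent.{CountDownLatch, Executors, LinkedBlockingQueue, TimeUnit}
import java.util.concurrent.atomic.AtomicInteger

object ToyLoop {
  // Hypothetical sentinel value used only in this sketch.
  private val PoisonPill = "__poison__"

  // Starts `threads` consumers, feeds them `items`, then stops them all with one pill.
  // Returns how many items were processed.
  def runDemo(items: Seq[String], threads: Int): Int = {
    val queue = new LinkedBlockingQueue[String]()
    val processed = new AtomicInteger(0)
    val done = new CountDownLatch(threads)
    val pool = Executors.newFixedThreadPool(threads)
    for (_ <- 1 to threads) pool.execute(new Runnable {
      override def run(): Unit = {
        var running = true
        while (running) {
          val item = queue.take()
          if (item == PoisonPill) {
            queue.offer(PoisonPill) // put it back so the other loops also see it
            running = false
          } else processed.incrementAndGet()
        }
        done.countDown()
      }
    })
    items.foreach(i => queue.offer(i))
    queue.offer(PoisonPill) // a single pill is enough for all consumers
    done.await(5, TimeUnit.SECONDS)
    pool.shutdown()
    processed.get()
  }
}
```

Since the queue is FIFO and the pill is offered last, every real item is taken before any consumer can see the pill.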

The code that processes remote messages:

  /**
   * Process stored messages.
   */
  def process(dispatcher: Dispatcher): Unit = {
    var message: InboxMessage = null
    inbox.synchronized {
      if (!enableConcurrent && numActiveThreads != 0) {
        return
      }
      message = messages.poll()
      if (message != null) {
        numActiveThreads += 1
      } else {
        return
      }
    }
    while (true) {
      safelyCall(endpoint) {
        message match {
          case RpcMessage(_sender, content, context) =>
            try {
              endpoint.receiveAndReply(context).applyOrElse[Any, Unit](content, { msg =>
                throw new SparkException(s"Unsupported message $message from ${_sender}")
              })
            } catch {
              case NonFatal(e) =>
                context.sendFailure(e)
                // Throw the exception -- this exception will be caught by the safelyCall function.
                // The endpoint's onError function will be called.
                throw e
            }
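
The excerpt above stops partway through the method. The part it does show hinges on PartialFunction.applyOrElse: the endpoint's handler is applied when it matches the message, and a fallback that throws is used otherwise. A minimal sketch of that dispatch step, with a hypothetical ToyProcess object and string messages standing in for the endpoint and its RpcMessage content:

```scala
object ToyProcess {
  // Hypothetical endpoint behaviour: replies only to messages it recognizes.
  val receiveAndReply: PartialFunction[Any, String] = {
    case "BoundPortsRequest" => "BoundPortsResponse"
  }

  // Returns Right(reply) for a handled message, Left(error text) otherwise.
  def handle(content: Any): Either[String, String] =
    try Right(receiveAndReply.applyOrElse[Any, String](content, { msg =>
      // Fallback used only when no case of the handler matches.
      throw new IllegalStateException(s"Unsupported message $msg")
    }))
    catch { case e: IllegalStateException => Left(e.getMessage) }
}
```

In this sketch the failure is returned as a Left value; in the code above, the failure is instead reported to the caller through context.sendFailure and then rethrown.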

Author: kason_zhang
Link: https://www.jianshu.com/p/bda13682889f
