Potentially Low Throughput in Sink Implementation #267

I am investigating possible causes of low throughput to Pulsar in our application. While reading the `PulsarSinkGraphStage.scala` implementation, the following lines caught my eye:

```scala
override def preStart(): Unit = {
  producer = createFn()
  produceCallback = getAsyncCallback {
    case Success(_) =>
      pull(in)
    case Failure(e) =>
      logger.error("Failing pulsar sink stage", e)
      failStage(e)
  }
  pull(in)
}

override def onPush(): Unit = {
  try {
    val t = grab(in)
    logger.debug(s"Sending message $t")
    producer.sendAsync(t).onComplete(produceCallback.invoke)
  } catch {
    case e: Throwable =>
      logger.error("Failing pulsar sink stage", e)
      failStage(e)
  }
}
```

I haven't implemented any Akka Streams sinks myself, so my assumptions could be wrong here. But `onPush` calls `producer.sendAsync(t).onComplete(produceCallback.invoke)`, and apart from the initial `pull(in)` in `preStart`, the only other `pull(in)` sits in the `Success` branch of that callback. Doesn't that mean the next message is pulled only after the previous `sendAsync` has completed successfully? In other words, are messages effectively sent to Pulsar one at a time, so that the producer's batching settings have no effect?
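
To make the question concrete, here is a minimal sketch of what I would have expected instead: a sink that keeps several sends in flight at once. It is built from the stock Akka Streams operators `mapAsync` and `Sink.ignore` rather than a custom `GraphStage`, and it is not the library's implementation. The names `BoundedInFlightSink`, `maxInFlight` and `send` are made up for illustration; `send` stands in for something like `producer.sendAsync(_)`.

```scala
import akka.Done
import akka.stream.scaladsl.{Flow, Keep, Sink}

import scala.concurrent.Future

// Hypothetical helper, not part of any library.
object BoundedInFlightSink {

  /** Builds a sink that allows up to `maxInFlight` unfinished sends at a time.
    *
    * @param maxInFlight upper bound on concurrently outstanding sends (assumed tunable)
    * @param send        stand-in for the producer's asynchronous send function
    */
  def apply[T](maxInFlight: Int)(send: T => Future[Any]): Sink[T, Future[Done]] =
    Flow[T]
      // mapAsync keeps pulling upstream while fewer than maxInFlight futures
      // are pending, so elements are not sent strictly one by one.
      // With default supervision, a failed future fails the stream.
      .mapAsync(maxInFlight)(send)
      // Discard the send results; the materialized Future[Done] completes
      // once upstream has finished and all sends have completed.
      .toMat(Sink.ignore)(Keep.right)
}
```

With `maxInFlight = 1` this should behave like the pull-after-success pattern above, while a larger value would give the producer more than one pending message to batch. Whether that explains our throughput problem is still only my assumption.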
