Conversation

@murfel murfel commented Oct 16, 2025

No description provided.

murfel commented Oct 16, 2025

It performs unexpectedly badly; could you all take a look for any silly mistakes?

The simplest benchmark, sending X elements only (not receiving anything), is already pretty sad:

ChannelBenchmark.sendUnlimited                1000  avgt   10  0.008 ±  0.001  ms/op
ChannelBenchmark.sendUnlimited               10000  avgt   10  0.080 ±  0.001  ms/op
ChannelBenchmark.sendUnlimited              100000  avgt   10  0.800 ±  0.005  ms/op

Full output for the first three counts (4 KB, 40 KB, 400 KB of Ints); just FYI, no need to look at this, since the snippet above is already bad enough:

Benchmark                                  (count)  Mode  Cnt  Score    Error  Units
ChannelBenchmark.manySendersManyReceivers     1000  avgt   10  0.119 ±  0.001  ms/op
ChannelBenchmark.manySendersManyReceivers    10000  avgt   10  0.900 ±  0.006  ms/op
ChannelBenchmark.manySendersManyReceivers   100000  avgt   10  9.244 ±  0.418  ms/op
ChannelBenchmark.manySendersOneReceiver       1000  avgt   10  0.108 ±  0.001  ms/op
ChannelBenchmark.manySendersOneReceiver      10000  avgt   10  0.713 ±  0.042  ms/op
ChannelBenchmark.manySendersOneReceiver     100000  avgt   10  7.010 ±  0.115  ms/op
ChannelBenchmark.oneSenderManyReceivers       1000  avgt   10  0.138 ±  0.001  ms/op
ChannelBenchmark.oneSenderManyReceivers      10000  avgt   10  0.923 ±  0.003  ms/op
ChannelBenchmark.oneSenderManyReceivers     100000  avgt   10  8.411 ±  0.035  ms/op
ChannelBenchmark.sendConflated                1000  avgt   10  0.020 ±  0.001  ms/op
ChannelBenchmark.sendConflated               10000  avgt   10  0.187 ±  0.007  ms/op
ChannelBenchmark.sendConflated              100000  avgt   10  1.834 ±  0.013  ms/op
ChannelBenchmark.sendReceiveConflated         1000  avgt   10  0.039 ±  0.001  ms/op
ChannelBenchmark.sendReceiveConflated        10000  avgt   10  0.236 ±  0.009  ms/op
ChannelBenchmark.sendReceiveConflated       100000  avgt   10  1.906 ±  0.019  ms/op
ChannelBenchmark.sendReceiveRendezvous        1000  avgt   10  0.103 ±  0.001  ms/op
ChannelBenchmark.sendReceiveRendezvous       10000  avgt   10  0.866 ±  0.021  ms/op
ChannelBenchmark.sendReceiveRendezvous      100000  avgt   10  8.270 ±  0.071  ms/op
ChannelBenchmark.sendReceiveUnlimited         1000  avgt   10  0.077 ±  0.002  ms/op
ChannelBenchmark.sendReceiveUnlimited        10000  avgt   10  0.419 ±  0.005  ms/op
ChannelBenchmark.sendReceiveUnlimited       100000  avgt   10  3.443 ±  0.061  ms/op
ChannelBenchmark.sendUnlimited                1000  avgt   10  0.008 ±  0.001  ms/op
ChannelBenchmark.sendUnlimited               10000  avgt   10  0.080 ±  0.001  ms/op
ChannelBenchmark.sendUnlimited              100000  avgt   10  0.800 ±  0.005  ms/op

@murfel murfel marked this pull request as draft October 16, 2025 15:59
}

private suspend fun send(count: Int, channel: Channel<Int>) = coroutineScope {
list.take(count).forEach { channel.send(it) }

Collaborator:

list.take(count) copies count elements to a new list, allocating a lot of new memory. I'd expect it to make a noticeable contribution to the runtime.
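To illustrate the point (a sketch, not the PR's code; `copying`/`indexed` are hypothetical names): `take(count)` materialises a fresh list on every call, while iterating by index reads the existing backing list in place with no per-call allocation.

```kotlin
// A shared source list, as in the benchmark fixture.
val list = List(1_000_000) { it }

// Allocates a new list of `count` elements on every call:
fun copying(count: Int): Int {
    var sum = 0
    list.take(count).forEach { sum += it }
    return sum
}

// Allocation-free alternative with the same traversal order:
fun indexed(count: Int): Int {
    var sum = 0
    for (i in 0 until count) sum += list[i]
    return sum
}
```

Both traverse the same elements, so they produce identical results; only the allocation behaviour differs.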

@fzhinkin (Contributor):

Just for the record, we discussed benchmarks with @murfel offline and she'll rework them.

murfel commented Nov 4, 2025

Ran the following (on a freshly restarted MacBook, with no apps open except the terminal and the system monitor):
java -jar benchmarks.jar ".ChannelBenchmark." -p count=1000,100000000 -p prefill=0,1000000,100000000

# Run complete. Total time: 00:37:21

Benchmark                                    (count)  (prefill)  Mode  Cnt     Score      Error  Units
ChannelBenchmark.manySendersManyReceivers       1000          0  avgt   10     0.113 ±    0.001  ms/op
ChannelBenchmark.manySendersManyReceivers       1000    1000000  avgt   10     0.118 ±    0.007  ms/op
ChannelBenchmark.manySendersManyReceivers       1000  100000000  avgt   10     0.357 ±    0.055  ms/op
ChannelBenchmark.manySendersManyReceivers  100000000          0  avgt   10  8796.152 ±  543.292  ms/op
ChannelBenchmark.manySendersManyReceivers  100000000    1000000  avgt   10  8683.527 ±  254.436  ms/op
ChannelBenchmark.manySendersManyReceivers  100000000  100000000  avgt   10  9434.746 ±  310.576  ms/op
ChannelBenchmark.manySendersOneReceiver         1000          0  avgt   10     0.084 ±    0.002  ms/op
ChannelBenchmark.manySendersOneReceiver         1000    1000000  avgt   10     0.068 ±    0.001  ms/op
ChannelBenchmark.manySendersOneReceiver         1000  100000000  avgt   10     0.327 ±    0.043  ms/op
ChannelBenchmark.manySendersOneReceiver    100000000          0  avgt   10  6759.587 ± 1126.828  ms/op
ChannelBenchmark.manySendersOneReceiver    100000000    1000000  avgt   10  6730.408 ±  112.128  ms/op
ChannelBenchmark.manySendersOneReceiver    100000000  100000000  avgt   10  6222.171 ±  256.355  ms/op
ChannelBenchmark.oneSenderManyReceivers         1000          0  avgt   10     0.119 ±    0.003  ms/op
ChannelBenchmark.oneSenderManyReceivers         1000    1000000  avgt   10     0.121 ±    0.003  ms/op
ChannelBenchmark.oneSenderManyReceivers         1000  100000000  avgt   10     0.353 ±    0.065  ms/op
ChannelBenchmark.oneSenderManyReceivers    100000000          0  avgt   10  8785.786 ±  567.569  ms/op
ChannelBenchmark.oneSenderManyReceivers    100000000    1000000  avgt   10  8698.243 ±  517.566  ms/op
ChannelBenchmark.oneSenderManyReceivers    100000000  100000000  avgt   10  8594.145 ±  416.015  ms/op
ChannelBenchmark.sendConflated                  1000        N/A  avgt   10     0.017 ±    0.001  ms/op
ChannelBenchmark.sendConflated             100000000        N/A  avgt   10  1504.701 ±   27.829  ms/op
ChannelBenchmark.sendReceiveConflated           1000        N/A  avgt   10     0.037 ±    0.001  ms/op
ChannelBenchmark.sendReceiveConflated      100000000        N/A  avgt   10  1722.869 ±   85.603  ms/op
ChannelBenchmark.sendReceiveRendezvous          1000        N/A  avgt   10     0.122 ±    0.018  ms/op
ChannelBenchmark.sendReceiveRendezvous     100000000        N/A  avgt   10  7300.491 ±  107.318  ms/op
ChannelBenchmark.sendReceiveUnlimited           1000          0  avgt   10     0.057 ±    0.002  ms/op
ChannelBenchmark.sendReceiveUnlimited           1000    1000000  avgt   10     0.056 ±    0.003  ms/op
ChannelBenchmark.sendReceiveUnlimited           1000  100000000  avgt   10     0.314 ±    0.038  ms/op
ChannelBenchmark.sendReceiveUnlimited      100000000          0  avgt   10  3645.250 ±  658.235  ms/op
ChannelBenchmark.sendReceiveUnlimited      100000000    1000000  avgt   10  3192.487 ±  372.223  ms/op
ChannelBenchmark.sendReceiveUnlimited      100000000  100000000  avgt   10  3965.029 ±  386.913  ms/op
ChannelBenchmark.sendUnlimited                  1000        N/A  avgt   10     0.006 ±    0.001  ms/op
ChannelBenchmark.sendUnlimited             100000000        N/A  avgt   10  1157.710 ±  248.811  ms/op

@murfel murfel marked this pull request as ready for review November 4, 2025 13:24
@murfel murfel requested a review from dkhalanskyjb November 4, 2025 13:24
@murfel murfel changed the title [Draft] Add channel benchmarks Add channel benchmarks Nov 4, 2025
murfel commented Nov 4, 2025

Quick normalisation with ChatGPT

Prompt: "Produce the same table, but divide the Score column [and the Error column] by the count column and change the units to ns/op/element" (https://chatgpt.com/share/e/690a0281-e828-800b-8895-144ecc4e07f3)

(I will do a proper notebook for JSON benchmark output after we agree on the benchmark correctness. I forgot to save this run as JSON, and it takes 40 min to re-run.)
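The normalisation itself is simple arithmetic; a minimal sketch (hypothetical helper name): JMH reports avgt scores in ms/op, where one "op" processes `count` elements, so the per-element cost is score * 10^6 / count nanoseconds.

```kotlin
// Convert a JMH avgt score (ms per op) to ns per element,
// where one op processes `count` elements.
fun msPerOpToNsPerElement(scoreMs: Double, count: Long): Double =
    scoreMs * 1_000_000 / count

fun main() {
    // e.g. sendUnlimited at count=100000000: 1157.710 ms/op
    // comes out to roughly 11.577 ns/op/element, matching the table below.
    println(msPerOpToNsPerElement(1157.710, 100_000_000))
}
```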

# Run complete. Total time: 00:37:21

Benchmark                                    (count)  (prefill)  Mode  Cnt     Score          Error        Units
ChannelBenchmark.manySendersManyReceivers       1000          0  avgt   10     113.000 ±     1.000  ns/op/element
ChannelBenchmark.manySendersManyReceivers       1000    1000000  avgt   10     118.000 ±     7.000  ns/op/element
ChannelBenchmark.manySendersManyReceivers       1000  100000000  avgt   10     357.000 ±    55.000  ns/op/element
ChannelBenchmark.manySendersManyReceivers  100000000          0  avgt   10      87.962 ±     5.433  ns/op/element
ChannelBenchmark.manySendersManyReceivers  100000000    1000000  avgt   10      86.835 ±     2.544  ns/op/element
ChannelBenchmark.manySendersManyReceivers  100000000  100000000  avgt   10      94.347 ±     3.106  ns/op/element
ChannelBenchmark.manySendersOneReceiver         1000          0  avgt   10      84.000 ±     2.000  ns/op/element
ChannelBenchmark.manySendersOneReceiver         1000    1000000  avgt   10      68.000 ±     1.000  ns/op/element
ChannelBenchmark.manySendersOneReceiver         1000  100000000  avgt   10     327.000 ±    43.000  ns/op/element
ChannelBenchmark.manySendersOneReceiver    100000000          0  avgt   10      67.596 ±    11.268  ns/op/element
ChannelBenchmark.manySendersOneReceiver    100000000    1000000  avgt   10      67.304 ±     1.121  ns/op/element
ChannelBenchmark.manySendersOneReceiver    100000000  100000000  avgt   10      62.222 ±     2.564  ns/op/element
ChannelBenchmark.oneSenderManyReceivers         1000          0  avgt   10     119.000 ±     3.000  ns/op/element
ChannelBenchmark.oneSenderManyReceivers         1000    1000000  avgt   10     121.000 ±     3.000  ns/op/element
ChannelBenchmark.oneSenderManyReceivers         1000  100000000  avgt   10     353.000 ±    65.000  ns/op/element
ChannelBenchmark.oneSenderManyReceivers    100000000          0  avgt   10      87.858 ±     5.676  ns/op/element
ChannelBenchmark.oneSenderManyReceivers    100000000    1000000  avgt   10      86.982 ±     5.176  ns/op/element
ChannelBenchmark.oneSenderManyReceivers    100000000  100000000  avgt   10      85.941 ±     4.160  ns/op/element
ChannelBenchmark.sendConflated                  1000        N/A  avgt   10      17.000 ±     1.000  ns/op/element
ChannelBenchmark.sendConflated             100000000        N/A  avgt   10      15.047 ±     0.278  ns/op/element
ChannelBenchmark.sendReceiveConflated           1000        N/A  avgt   10      37.000 ±     1.000  ns/op/element
ChannelBenchmark.sendReceiveConflated      100000000        N/A  avgt   10      17.229 ±     0.856  ns/op/element
ChannelBenchmark.sendReceiveRendezvous          1000        N/A  avgt   10     122.000 ±    18.000  ns/op/element
ChannelBenchmark.sendReceiveRendezvous     100000000        N/A  avgt   10      73.005 ±     1.073  ns/op/element
ChannelBenchmark.sendReceiveUnlimited           1000          0  avgt   10      57.000 ±     2.000  ns/op/element
ChannelBenchmark.sendReceiveUnlimited           1000    1000000  avgt   10      56.000 ±     3.000  ns/op/element
ChannelBenchmark.sendReceiveUnlimited           1000  100000000  avgt   10     314.000 ±    38.000  ns/op/element
ChannelBenchmark.sendReceiveUnlimited      100000000          0  avgt   10      36.453 ±     6.582  ns/op/element
ChannelBenchmark.sendReceiveUnlimited      100000000    1000000  avgt   10      31.925 ±     3.722  ns/op/element
ChannelBenchmark.sendReceiveUnlimited      100000000  100000000  avgt   10      39.651 ±     3.869  ns/op/element
ChannelBenchmark.sendUnlimited                  1000        N/A  avgt   10       6.000 ±     1.000  ns/op/element
ChannelBenchmark.sendUnlimited             100000000        N/A  avgt   10      11.577 ±     2.488  ns/op/element

}
}

suspend fun <E> Channel<E>.forEach(action: (E) -> Unit) {

Contributor:

Suggested change:
- suspend fun <E> Channel<E>.forEach(action: (E) -> Unit) {
+ suspend inline fun <E> Channel<E>.forEach(action: (E) -> Unit) {

repeat(maxCount) { add(it) }
}

@Setup(Level.Invocation)

Contributor:

Why does this have to be done before every benchmark function invocation, and not once per trial or iteration?

JFTR, https://github.com/openjdk/jmh/blob/2a316030b509aa9874dd6ab04e21962ac92cd634/jmh-core/src/main/java/org/openjdk/jmh/annotations/Level.java#L85
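For context, a sketch of the alternative (assumed refactoring, not the PR's code): JMH's Level.Trial runs a fixture once per fork, Level.Iteration before each measurement iteration, and Level.Invocation before every single benchmark method call, which the linked Javadoc warns adds per-call timestamping overhead.

```kotlin
@State(Scope.Benchmark)
open class ChannelState {
    lateinit var channel: Channel<Int>

    // Assumption: a fresh channel per measurement iteration is enough,
    // avoiding the per-call overhead of Level.Invocation.
    @Setup(Level.Iteration) // was Level.Invocation
    fun setUp() {
        channel = Channel(Channel.UNLIMITED)
    }
}
```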

Comment on lines +131 to +136
if (receiveAll) {
channel.forEach { }
} else {
repeat(countPerReceiverAtLeast) {
channel.receive()
}

Contributor:

It makes sense to send received values into a blackhole (i.e. consume them).

@OutputTimeUnit(TimeUnit.NANOSECONDS)
@State(Scope.Benchmark)
@Fork(1)
open class ChannelBenchmark {

Contributor:

Could you please elaborate what exactly you're trying to measure using these benchmarks?
Right now, it looks like "time required to create a new channel, send N messages into it (and, optionally, receive them), and then close the channel". However, I thought the initial idea was to measure the latency of sending (and receiving) a single message into the channel.

Comment on lines +118 to +121
require(senders > 0 && receivers > 0)
// Can be used with more than num cores but needs thinking it through,
// e.g., what would it measure?
require(senders + receivers <= cores)

Contributor:

Do we really need to include it into measurements? :)

// to allow for true parallelism
val cores = 4

// 4 KB, 40 KB, 400 KB, 4 MB, 40 MB, 400 MB

Collaborator:

The comment is meant to line up with the numbers, but doesn't.


@State(Scope.Benchmark)
open class UnlimitedChannelWrapper {
// 0, 4 MB, 40 MB, 400 MB

Collaborator:

Ditto.

Comment on lines +25 to +28
val maxCount = 100000000
val list = ArrayList<Int>(maxCount).apply {
repeat(maxCount) { add(it) }
}

Collaborator:

Suggested change:
- val maxCount = 100000000
- val list = ArrayList<Int>(maxCount).apply {
-     repeat(maxCount) { add(it) }
- }
+ val list = List<Int>(100000000) { it }
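A sketch of the equivalence (hypothetical helper names): the `List` initializer produces the same contents as the explicit ArrayList-plus-repeat dance, in one expression.

```kotlin
// Original form: pre-sized ArrayList filled via repeat.
fun viaApply(n: Int): List<Int> = ArrayList<Int>(n).apply {
    repeat(n) { add(it) }
}

// Suggested form: stdlib List initializer.
fun viaInitializer(n: Int): List<Int> = List(n) { it }
```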

open class ChannelBenchmark {
// max coroutines launched per benchmark
// to allow for true parallelism
val cores = 4

Collaborator:

Runtime.getRuntime().availableProcessors() instead of 4? Otherwise, cores is misleading.
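A minimal sketch of the suggested change (hypothetical function name): derive the parallelism cap from the machine instead of hard-coding 4.

```kotlin
// Query the JVM for the number of available processors
// rather than assuming a fixed core count.
fun availableCores(): Int = Runtime.getRuntime().availableProcessors()

fun main() {
    println("benchmark coroutines capped at ${availableCores()}")
}
```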

Comment on lines +103 to +105
Channel<Int>(capacity).also {
sendManyItems(count, it)
}

Collaborator:

Minor, style.
From https://kotlinlang.org/docs/scope-functions.html#also:

When you see also in code, you can read it as " and also do the following with the object. "

In my opinion, sending many items to a channel is the main idea, not just something that you "also" do here, so I'd opt for either a form with a variable, like

val channel = Channel<Int>(capacity)
repeat(count) {
    channel.send(list[it])
}

or use let:

Channel<Int>(capacity).let {
    sendManyItems(count, it)
}

Of the two, I prefer the first one.

Comment on lines +95 to +104
private suspend fun sendManyItems(count: Int, channel: Channel<Int>) {
repeat(count) {
// NB: it is `send`, not `trySend`, on purpose, since we are testing the `send` performance here.
channel.send(list[it])
}
}

private suspend fun runSend(count: Int, capacity: Int) {
Channel<Int>(capacity).also {
sendManyItems(count, it)

Collaborator:

Minor, style: these two functions don't pass the https://wiki.haskell.org/Fairbairn_threshold for me, so I'd just inline them. Then, even the NB wouldn't be necessary, as it would be clear from the benchmark name that we are testing send.

val receiveAll = channel.isEmpty
// send almost `count` items, up to `senders - 1` items will not be sent (negligible)
val countPerSender = count / senders
// for prefilled channel only: up to `receivers - 1` items of the sent items will not be received (negligible)

Collaborator:

I don't understand this.

In total, there will be countPerSender * senders elements sent while this function is running, so wrapper.prefill + countPerSender * senders elements will ultimately have been sent to the channel. Every receiver will receive floor(countPerSender * senders / receivers) elements; that is, in total, floor(countPerSender * senders / receivers) * receivers elements will leave the channel, which can leave up to wrapper.prefill + receivers - 1 elements inside it.

For big enough values of prefill and a small enough count, none of the items sent in runSendReceive will be received.
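The accounting above can be sketched directly (hypothetical helper name; Longs used so that counts up to 100000000 with prefill stay safe):

```kotlin
// Elements left in the channel after one run:
// total sent = prefill + countPerSender * senders;
// each receiver takes floor(sent-in-run / receivers) elements.
fun leftoverAfterRun(prefill: Long, count: Long, senders: Long, receivers: Long): Long {
    val countPerSender = count / senders
    val sentInRun = countPerSender * senders
    val receivedTotal = (sentInRun / receivers) * receivers
    return prefill + sentInRun - receivedTotal
}
```

For example, with no prefill, count = 1000, 4 senders, and 3 receivers, 1000 elements are sent but only 999 received, leaving one element behind; with a large prefill and a small count, the whole prefill stays in the channel.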

// Can be used with more than num cores but needs thinking it through,
// e.g., what would it measure?
require(senders + receivers <= cores)
// if the channel is prefilled, do not receive the prefilled items

Collaborator:

The prefilled items will be received, as the channel is FIFO, so the way I'd explain the logic here is that we only want to receive as many items as were sent, which, for a non-prefilled channel, means all the items.
