Add channel benchmarks #4546
Conversation
It does perform unexpectedly badly; could you all take a look for any silly mistakes? The simplest benchmark, sending X elements only (not receiving anything), is already pretty sad.

Full output for the first three counts (4 KB, 40 KB, 400 KB) of Ints (just FYI, no reason to look at this, since the snippet above is already bad enough)
    }

    private suspend fun send(count: Int, channel: Channel<Int>) = coroutineScope {
        list.take(count).forEach { channel.send(it) }
list.take(count) copies count elements to a new list, allocating a lot of new memory. I'd expect it to make a noticeable contribution to the runtime.
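For illustration, a hedged sketch of one way to avoid the copy (it keeps the same `send` helper shape from the diff above and assumes `list` is the benchmark's prefilled field; whether this changes the results should be confirmed by re-running):

```kotlin
// Iterate by index instead of materialising a sublist with take(),
// so no extra list of `count` elements is allocated per invocation.
private suspend fun send(count: Int, channel: Channel<Int>) = coroutineScope {
    for (i in 0 until count) {
        channel.send(list[i])
    }
}
```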
Just for the record, we discussed benchmarks with @murfel offline and she'll rework them.
Ran (on a freshly restarted MacBook, with no apps open but the terminal and the system monitor)
Quick normalisation with ChatGPT
(Will do a proper notebook for the JSON benchmark output after we agree on the benchmark correctness. Forgot to save this one as JSON, and it takes 40 min to re-run.)
        }
    }

    suspend fun <E> Channel<E>.forEach(action: (E) -> Unit) {
Suggested change:

    - suspend fun <E> Channel<E>.forEach(action: (E) -> Unit) {
    + suspend inline fun <E> Channel<E>.forEach(action: (E) -> Unit) {
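For context, a minimal sketch of what the inline variant could look like (the body is assumed, not taken from the PR; `inline` avoids allocating a function object for `action` at each call site):

```kotlin
import kotlinx.coroutines.channels.Channel

// Assumed body: drain the channel, applying `action` to each received element.
suspend inline fun <E> Channel<E>.forEach(action: (E) -> Unit) {
    for (element in this) {
        action(element)
    }
}
```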
        repeat(maxCount) { add(it) }
    }

    @Setup(Level.Invocation)
Why does it have to be done before every benchmark function invocation and not once per trial / iteration?
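For reference, a sketch of the JMH setup levels in question (which one is appropriate depends on whether the benchmark method mutates the state it is given):

```kotlin
import org.openjdk.jmh.annotations.Level
import org.openjdk.jmh.annotations.Setup

// Level.Trial      - runs once per benchmark run
// Level.Iteration  - runs before each measurement iteration
// Level.Invocation - runs before every single call of the benchmark method;
//                    this is the most expensive option and can distort timings of short methods.
@Setup(Level.Trial)
fun setUp() {
    // build immutable inputs here, e.g. the prefilled list
}
```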
    if (receiveAll) {
        channel.forEach { }
    } else {
        repeat(countPerReceiverAtLeast) {
            channel.receive()
        }
It makes sense to send received values into a blackhole (i.e. consume them).
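For example, a sketch of what consuming the values could look like (the helper name is hypothetical; the JMH Blackhole is injected as a benchmark method parameter and passed down):

```kotlin
import kotlinx.coroutines.channels.Channel
import org.openjdk.jmh.infra.Blackhole

// Consuming each received value keeps the JIT from eliminating the receive path as dead code.
private suspend fun receiveMany(count: Int, channel: Channel<Int>, bh: Blackhole) {
    repeat(count) {
        bh.consume(channel.receive())
    }
}
```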
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    @State(Scope.Benchmark)
    @Fork(1)
    open class ChannelBenchmark {
Could you please elaborate on what exactly you're trying to measure with these benchmarks?
Right now, it looks like "the time required to create a new channel, send N messages into it (and, optionally, receive them), and then close the channel". However, I thought the initial idea was to measure the latency of sending (and receiving) a single message into the channel.
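For illustration, one hypothetical shape such a single-message benchmark could take (none of these names are from the PR): the channel is created once per benchmark state, and each invocation measures a single send/receive round trip on a rendezvous channel rather than channel creation plus a bulk transfer.

```kotlin
import kotlinx.coroutines.channels.Channel
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import org.openjdk.jmh.annotations.Benchmark
import org.openjdk.jmh.annotations.Scope
import org.openjdk.jmh.annotations.State

@State(Scope.Benchmark)
open class SingleMessageBenchmark {
    private val channel = Channel<Int>(Channel.RENDEZVOUS)

    @Benchmark
    fun sendReceiveOne(): Int = runBlocking {
        // The sender rendezvouses with the receiver below; one round trip per invocation.
        launch { channel.send(42) }
        channel.receive()
    }
}
```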
    require(senders > 0 && receivers > 0)
    // Can be used with more than num cores but needs thinking it through,
    // e.g., what would it measure?
    require(senders + receivers <= cores)
Do we really need to include it in the measurements? :)
    // to allow for true parallelism
    val cores = 4

    // 4 KB, 40 KB, 400 KB, 4 MB, 40 MB, 400 MB
The comment is supposed to line up with the numbers, but doesn't.
    @State(Scope.Benchmark)
    open class UnlimitedChannelWrapper {
        // 0, 4 MB, 40 MB, 400 MB
Ditto.
    val maxCount = 100000000
    val list = ArrayList<Int>(maxCount).apply {
        repeat(maxCount) { add(it) }
    }
Suggested change:

    - val maxCount = 100000000
    - val list = ArrayList<Int>(maxCount).apply {
    -     repeat(maxCount) { add(it) }
    - }
    + val list = List<Int>(100000000) { it }
    open class ChannelBenchmark {
        // max coroutines launched per benchmark
        // to allow for true parallelism
        val cores = 4
Runtime.getRuntime().availableProcessors() instead of 4? Otherwise, cores is misleading.
    Channel<Int>(capacity).also {
        sendManyItems(count, it)
    }
Minor, style.
From https://kotlinlang.org/docs/scope-functions.html#also: "When you see also in code, you can read it as 'and also do the following with the object.'"
In my opinion, sending many items to a channel is the main idea, not just something that you "also" do here, so I'd opt for either a form with a variable, like

    val channel = Channel<Int>(capacity)
    repeat(count) {
        channel.send(list[it])
    }

or use let:

    Channel<Int>(capacity).let {
        sendManyItems(count, it)
    }

Of the two, I prefer the first one.
    private suspend fun sendManyItems(count: Int, channel: Channel<Int>) {
        repeat(count) {
            // NB: it is `send`, not `trySend`, on purpose, since we are testing the `send` performance here.
            channel.send(list[it])
        }
    }

    private suspend fun runSend(count: Int, capacity: Int) {
        Channel<Int>(capacity).also {
            sendManyItems(count, it)
Minor, style: these two functions don't pass the https://wiki.haskell.org/Fairbairn_threshold for me, so I'd just inline them. Then, even the NB wouldn't be necessary, as it would be clear from the benchmark name that we are testing send.
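For illustration, a rough sketch of the inlined form (assuming `count`, `capacity`, and `list` are fields of the benchmark state, that the capacity is one for which `send` never suspends indefinitely without receivers, e.g. UNLIMITED, and that the benchmark already wraps its body in runBlocking; this is not the PR's actual code):

```kotlin
// The helper bodies moved straight into the benchmark method, so the method
// name alone documents that plain `send` (not `trySend`) is being measured.
@Benchmark
fun sendOnly() = runBlocking {
    val channel = Channel<Int>(capacity)
    repeat(count) { channel.send(list[it]) }
}
```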
    val receiveAll = channel.isEmpty
    // send almost `count` items, up to `senders - 1` items will not be sent (negligible)
    val countPerSender = count / senders
    // for prefilled channel only: up to `receivers - 1` items of the sent items will not be received (negligible)
I don't understand this.
In total, countPerSender * senders elements will be sent while this function is running, so wrapper.prefill + countPerSender * senders elements will ultimately have been sent to the channel. Every receiver will receive floor(countPerSender * senders / receivers) elements, that is, in total, floor(countPerSender * senders / receivers) * receivers elements will leave the channel, which can leave up to wrapper.prefill + receivers - 1 elements inside it.
For big enough values of prefill and a small enough count, none of the items sent in runSendReceive will be received.
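To make that concrete with hypothetical numbers: take prefill = 100, count = 8, senders = 2, receivers = 3. Then countPerSender = 4 and 8 new elements are sent; each receiver takes floor(8 / 3) = 2 elements, so 6 elements leave the channel and 100 + 8 - 6 = 102 = prefill + receivers - 1 remain. Since the channel is FIFO, all 6 received elements are prefilled ones, so none of the elements sent during the run are received.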
    // Can be used with more than num cores but needs thinking it through,
    // e.g., what would it measure?
    require(senders + receivers <= cores)
    // if the channel is prefilled, do not receive the prefilled items
The prefilled items will be received, as the channel is FIFO, so the way I'd explain the logic here is that we only want to receive as many items as were sent, which, in the case of a non-prefilled channel, means all of them.