Commit 279a5be

committed
Update Benchmarks.rst
1 parent 0b8012b

File tree

7 files changed (+68, -60 lines)


docs/ops/doc/Benchmarks.rst

Lines changed: 33 additions & 32 deletions
@@ -1,23 +1,22 @@
11
SciJava Ops Benchmarks
22
======================
33

4-
This page describes a quantitative analysis of the SciJava Ops framework, and is heavily inspired by a similar comparison of `ImgLib2 <https://imagej.net/libs/imglib2/benchmarks>`_.
4+
This page describes a quantitative analysis of the SciJava Ops framework, and is heavily inspired by a similar comparison of `ImgLib2 <https://imagej.net/libs/imglib2/benchmarks>`_. In all figures, benchmark times are displayed as bar charts of mean execution time (in microseconds), with error bars denoting the range of observed execution times.
55

66
Hardware and Software
77
---------------------
88

99
This analysis was performed with the following hardware:
1010

11-
* Dell Precision 7770
12-
* 12th Gen Intel i9-12950HX (24) @ 4.900GHz
13-
* 32 GB 4800 MT/s DDR5 RAM
11+
* 2021 Dell OptiPlex 5090 Small Form Factor
12+
* Intel(R) Core(TM) i7-10700 CPU @ 2.90GHz
13+
* 64 GB 3200 MHz DDR4 RAM
1414

1515
The following software components were used:
1616

17-
* Ubuntu 23.10
18-
* Kernel 6.5.0-26-generic
19-
* OpenJDK 64-Bit Server VM GraalVM CE 21.0.2+13.1 (build 21.0.2+13-jvmci-23.1-b30, mixed mode, sharing)
20-
* SciJava Incubator commit `906b9d08 <https://github.com/scijava/incubator/commit/906b9d08301f4aafd7947f1fd08717f5351fd40b>`_
17+
* Ubuntu 20.04.6 LTS
18+
* Java HotSpot(TM) 64-Bit Server VM Oracle GraalVM 20.0.1+9.1 (build 20.0.1+9-jvmci-23.0-b12, mixed mode, sharing)
19+
* SciJava Incubator commit `0b8012b2 <https://github.com/scijava/incubator/commit/0b8012b2b00ba84b0583ef7260fab1be8f251041>`_
2120
* ImageJ Ops version ``2.0.0``
2221

2322
All benchmarks are executed using the `Java Microbenchmark Harness <https://github.com/openjdk/jmh>`_, using the following parameters:
@@ -31,7 +30,7 @@ All benchmarks are executed using the `Java Microbenchmark Harness <https://gith
3130
Op Matching
3231
-----------
3332

34-
We first analyze the performance of executing the following static method:
33+
We first analyze the performance of executing the following static method, written to contain the *fewest* instructions possible while also avoiding code removal by the Just-In-Time (JIT) compiler:
3534

3635
.. code-block:: java
3736
@@ -45,31 +44,33 @@ We first analyze the performance of executing the following static method:
4544
data[0]++;
4645
}
4746
48-
We first benchmark the base penalty of executing this method using SciJava Ops, compared to direct execution of the static method. This method mutates a data structure in place, meaning the Ops engine can match it directly as an inplace Op, or adapt it to a function Op. Thus, we test the benchmark the following three scenarios:
47+
We first benchmark the overhead of executing this method through SciJava Ops, compared to direct execution of the static method. This method mutates a data structure in place, meaning the Ops engine can match it directly as an inplace Op, or **adapt** it to a function Op. Thus, we benchmark the following four scenarios:
4948

50-
* Direct static method invocation
49+
* Static method invocation
50+
* Output Buffer creation + static method invocation **(A)**
5151
* SciJava Ops inplace invocation
52-
* SciJava Ops adapted function invocation
52+
* SciJava Ops **function** invocation **(A)**
5353

54-
The results are shown in **Figure 1**. We find Op execution through the SciJava Ops framework adds a few milliseconds of additional overhead. A few additional milliseconds of overhead are observed when SciJava Ops is additionally tasked with creating an output buffer.
54+
The results are shown in **Figure 1**. We find that Op execution through the SciJava Ops framework adds approximately 100 microseconds of overhead. A further millisecond of overhead is observed when SciJava Ops also creates an output buffer.
5555

5656
.. chart:: ../images/BenchmarkMatching.json
5757

5858
**Figure 1:** Algorithm execution performance (lower is better)
5959

60-
Note that the above requests are benchmarked without assistance from the Op cache, i.e. they are designed to model the full matching process. As repeated Op requests will utilize the Op cache, we benchmark cached Op retrieval separately, with results shown in **Figure 2**. These benchmarks suggest Op caching helps avoid the additional overhead of Op adaptation as its performance approaches that of normal Op execution.
60+
Note that the above requests are benchmarked without assistance from the Op cache, measuring the overhead of the full matching process. As repeated Op requests will utilize the Op cache, we benchmark cached Op execution separately, with results shown in **Figure 2**. From these results, we conclude that Op matching comprises the majority of SciJava Ops overhead, and that repeated executions add only a few microseconds of overhead.
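The caching behavior described here amounts to memoizing the result of the expensive matching phase. A toy sketch of the idea (plain Python, *not* the SciJava Ops API; the op name and delay are illustrative):

```python
import functools
import time

@functools.lru_cache(maxsize=None)
def match_op(name):
    # Stand-in for the expensive full matching process.
    time.sleep(0.001)
    return name.upper()

match_op("benchmark.increment")  # first request: pays the full matching cost
match_op("benchmark.increment")  # repeated request: served from the cache
```

Only the first request per distinct name pays the matching cost; subsequent requests are a dictionary lookup, which mirrors why cached Op execution approaches static-method performance in Figure 2.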
6161

6262
.. chart:: ../images/BenchmarkCaching.json
6363

6464
**Figure 2:** Algorithm execution performance with Op caching (lower is better)
6565

6666
Finally, we benchmark the overhead of SciJava Ops parameter conversion. Suppose we instead wish to operate upon a ``double[]``; it must first be converted to a ``byte[]`` before calling our Op. We consider the following procedures:
6767

68-
* Image conversion + output buffer creation + static method invocation
69-
* output buffer creation + SciJava Ops invocation using Op conversion
70-
* SciJava Ops invocation using Op conversion and Op adaptation
68+
* Array conversion + static method invocation **(C)**
69+
* Array buffer creation + array conversion + static method invocation **(A+C)**
70+
* SciJava Ops converted inplace invocation **(C)**
71+
* SciJava Ops converted **function** invocation **(A+C)**
7172

72-
The results are shown in **Figure 3**; note the Op cache is **not** enabled. We observe overheads on the order of 10 milliseconds to perform Op conversion with and without Op adaptation.
73+
The results are shown in **Figure 3**; note the Op cache is **not** enabled. We find that parameter conversion imposes roughly 1 millisecond of additional overhead, and that when both parameter conversion and Op adaptation are used, the overheads (~2 milliseconds in total) are *additive*.
7374

7475
.. chart:: ../images/BenchmarkConversion.json
7576

@@ -78,35 +79,35 @@ The results are shown in **Figure 3**; note the Op cache is **not** enabled. We
7879
Framework Comparison
7980
--------------------
8081

81-
To validate our development efforts atop the original `ImageJ Ops <https://imagej.net/libs/imagej-ops/>`_ framework, we benchmark executions of the following method:
82+
To validate our development efforts atop the original `ImageJ Ops <https://imagej.net/libs/imagej-ops/>`_ framework, we additionally wrap the above static method within ImageJ Ops:
8283

8384
.. code-block:: java
8485
85-
/**
86-
* @param data the data to invert
87-
* @implNote op name="benchmark.invert",type=Inplace1
88-
*/
89-
public static void invertRaw(final byte[] data) {
90-
for (int i = 0; i < data.length; i++) {
91-
final int value = data[i] & 0xff;
92-
final int result = 255 - value;
93-
data[i] = (byte) result;
86+
/** Increment Op wrapper for ImageJ Ops. */
87+
@Plugin(type = Op.class, name = "benchmark.increment")
88+
public static class IncrementByteOp extends AbstractUnaryInplaceOp<byte[]>
89+
implements Op
90+
{
91+
92+
@Override
93+
public void mutate(byte[] o) {
94+
incrementByte(o);
9495
}
9596
}
9697
97-
We then benchmark the performance of executing this code using the following pathways:
98+
We then benchmark the performance of executing the static method using the following pathways:
9899

99100
* Static method invocation
100101
* SciJava Ops invocation
101-
* ImageJ Ops invocation (using a ``Class`` wrapper to make the method discoverable within ImageJ Ops)
102+
* ImageJ Ops invocation (using the above wrapper)
102103

103-
The results are shown in **Figure 4**. When algorithm matching dominates execution time, the SciJava Ops matching framework provides significant improvement in matching performance in comparison with the original ImageJ Ops framework.
104+
The results are shown in **Figure 4**. From this figure we can see that the "Op overhead" of ImageJ Ops is approximately 70x that of SciJava Ops.
104105

105106
.. chart:: ../images/BenchmarkFrameworks.json
106107

107108
**Figure 4:** Algorithm execution performance by Framework (lower is better)
108109

109-
Finally, here is a figure combining all the metrics above:
110+
We provide a final figure combining all the metrics above:
110111

111112
.. chart:: ../images/BenchmarkCombined.json
112113

docs/ops/graph_results.py

Lines changed: 30 additions & 23 deletions
@@ -8,34 +8,38 @@
88
# It expects JMH benchmark results be dumped to a file "scijava-ops-benchmark_results.json", within its directory.
99

1010
# If you'd like to add a plotly chart, add an entry to the following list.
11+
12+
A = "<b style=\"color:black\">[<b style=\"color:#009E73\">A</b>]</b>"
13+
C = "<b style=\"color:black\">[<b style=\"color:#E69F00\">C</b>]</b>"
14+
AC = "<b style=\"color:black\">[<b style=\"color:#CC79A7\">AC</b>]</b>"
1115
figures = [
1216
{
1317
"name": "BenchmarkMatching",
1418
"title": "Basic Op Matching Performance",
1519
"bars": {
1620
"noOps": "Static Method",
17-
"noOpsAdapted": "Static Method [A]",
18-
"sjOps": "Op Execution",
19-
"sjOpsAdapted": "Op Execution [A]"
21+
"noOpsAdapted": f"Static Method {A}",
22+
"sjOps": "SciJava Ops",
23+
"sjOpsAdapted": f"SciJava Ops {A}"
2024
}
2125
},
2226
{
2327
"name": "BenchmarkCaching",
2428
"title": "Caching Effects on Op Matching Performance",
2529
"bars": {
2630
"noOps": "Static Method",
27-
"sjOps": "Op Execution",
28-
"sjOpsWithCache": "Op Execution (cached)"
31+
"sjOps": "SciJava Ops",
32+
"sjOpsWithCache": "SciJava Ops (cached)"
2933
}
3034
},
3135
{
3236
"name": "BenchmarkConversion",
3337
"title": "Parameter Conversion Performance",
3438
"bars": {
35-
"noOpsConverted": "Static Method [C]",
36-
"noOpsAdaptedAndConverted": "Static Method [A+C]",
37-
"sjOpsConverted": "Op Execution [C]",
38-
"sjOpsConvertedAndAdapted": "Op Execution [A+C]"
39+
"noOpsConverted": f"Static Method {C}",
40+
"noOpsAdaptedAndConverted": f"Static Method {AC}",
41+
"sjOpsConverted": f"SciJava Ops {C}",
42+
"sjOpsConvertedAndAdapted": f"SciJava Ops {AC}"
3943
}
4044
},
4145
{
@@ -52,14 +56,14 @@
5256
"title": "Combined Performance Metrics",
5357
"bars": {
5458
"noOps": "Static Method",
55-
"noOpsAdapted": "Static Method [A]",
56-
"noOpsConverted": "Static Method [C]",
57-
"noOpsAdaptedAndConverted": "Static Method [A+C]",
58-
"sjOpsWithCache": "SciJava Ops (with caching)",
59-
"sjOps": "SciJava Ops (no caching)",
60-
"sjOpsAdapted": "SciJava Ops [A]",
61-
"sjOpsConverted": "SciJava Ops [C]",
62-
"sjOpsConvertedAndAdapted": "SciJava Ops [A+C]",
59+
"noOpsAdapted": f"Static Method {A}",
60+
"noOpsConverted": f"Static Method {C}",
61+
"noOpsAdaptedAndConverted": f"Static Method {AC}",
62+
"sjOpsWithCache": "SciJava Ops (cached)",
63+
"sjOps": "SciJava Ops",
64+
"sjOpsAdapted": f"SciJava Ops {A}",
65+
"sjOpsConverted": f"SciJava Ops {C}",
66+
"sjOpsConvertedAndAdapted": f"SciJava Ops {AC}",
6367
"ijOps": "ImageJ Ops",
6468
}
6569
}
@@ -74,8 +78,9 @@
7478
for row in data:
7579
test = row["benchmark"].split(".")[-1]
7680
score = row["primaryMetric"]["score"]
77-
error = row["primaryMetric"]["scoreError"]
78-
results[test] = {"score": score, "error": error}
81+
percentiles = row["primaryMetric"]["scorePercentiles"]
82+
minmax = [percentiles["0.0"], percentiles["100.0"]]
83+
results[test] = {"score": score, "minmax": minmax}
7984

8085
# Build charts and dump them to JSON.
8186
for figure in figures:
@@ -85,25 +90,27 @@
8590
x = []
8691
y = []
8792
error_y = []
93+
error_y_minus = []
8894

8995
# Add each benchmark in the class
9096
for test, label in figure["bars"].items():
9197
print(f".", end="")
9298
result = results[test]
9399
x.append(label)
94100
y.append(result["score"])
95-
error_y.append(result["error"])
101+
error_y.append(result["minmax"][1] - result["score"])
102+
error_y_minus.append(result["score"] - result["minmax"][0])
96103

97104
# Create a bar chart
98105
fig = go.Figure()
99106
fig.add_bar(
100107
x=x,
101108
y=y,
102-
error_y=dict(type='data', array=error_y),
109+
error_y=dict(type='data', array=error_y, arrayminus=error_y_minus),
103110
)
104111
fig.update_layout(
105-
title_text=figure["title"],
106-
yaxis_title="Performance (us/op)"
112+
title_text=figure["title"] + f"<br><sup style=\"color: gray\">{A}=Adaptation, {C}=Conversion, {AC}=Adaptation & Conversion</sup>",
113+
yaxis_title="<b>Performance (&mu;s/execution)</b>"
107114
)
108115

109116
# Convert to JSON and dump
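The result-parsing loop in the script above can be sketched as a self-contained snippet (inline JSON stands in for the ``scijava-ops-benchmark_results.json`` dump; the benchmark name and values are placeholders, with field names following JMH's JSON report format):

```python
import json

# Inline stand-in for the JMH results dump; values are placeholders.
data = json.loads("""[
  {"benchmark": "org.example.Benchmarks.noOps",
   "primaryMetric": {"score": 1.5,
                     "scorePercentiles": {"0.0": 1.2, "100.0": 2.0}}}
]""")

results = {}
for row in data:
    # Strip the package/class prefix, keeping only the method name.
    test = row["benchmark"].split(".")[-1]
    score = row["primaryMetric"]["score"]
    pct = row["primaryMetric"]["scorePercentiles"]
    results[test] = {"score": score, "minmax": [pct["0.0"], pct["100.0"]]}

print(results["noOps"])  # {'score': 1.5, 'minmax': [1.2, 2.0]}
```

Keying on the trailing method name is what lets the ``figures`` table above refer to benchmarks by short identifiers such as ``noOps`` and ``sjOps``.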

0 commit comments