Commit 279a5be

committed
Update Benchmarks.rst
1 parent 0b8012b

File tree

7 files changed (+68, -60 lines)


docs/ops/doc/Benchmarks.rst

Lines changed: 33 additions & 32 deletions
@@ -1,23 +1,22 @@
11
SciJava Ops Benchmarks
22
======================
33

4-
This page describes a quantitative analysis of the SciJava Ops framework, and is heavily inspired by a similar comparison of `ImgLib2 <https://imagej.net/libs/imglib2/benchmarks>`_.
4+
This page describes a quantitative analysis of the SciJava Ops framework, and is heavily inspired by a similar comparison of `ImgLib2 <https://imagej.net/libs/imglib2/benchmarks>`_. In all figures, benchmark times are displayed as bar charts of mean execution time (in microseconds), with error bars denoting the range of observed execution times.
55

66
Hardware and Software
77
---------------------
88

99
This analysis was performed with the following hardware:
1010

11-
* Dell Precision 7770
12-
* 12th Gen Intel i9-12950HX (24) @ 4.900GHz
13-
* 32 GB 4800 MT/s DDR5 RAM
11+
* 2021 Dell OptiPlex 5090 Small Form Factor
12+
* Intel(R) Core(TM) i7-10700 CPU @ 2.90GHz
13+
* 64 GB 3200 MHz DDR4 RAM
1414

1515
The following software components were used:
1616

17-
* Ubuntu 23.10
18-
* Kernel 6.5.0-26-generic
19-
* OpenJDK 64-Bit Server VM GraalVM CE 21.0.2+13.1 (build 21.0.2+13-jvmci-23.1-b30, mixed mode, sharing)
20-
* SciJava Incubator commit `906b9d08 <https://github.com/scijava/incubator/commit/906b9d08301f4aafd7947f1fd08717f5351fd40b>`_
17+
* Ubuntu 20.04.6 LTS
18+
* Java HotSpot(TM) 64-Bit Server VM Oracle GraalVM 20.0.1+9.1 (build 20.0.1+9-jvmci-23.0-b12, mixed mode, sharing)
19+
* SciJava Incubator commit `0b8012b2 <https://github.com/scijava/incubator/commit/0b8012b2b00ba84b0583ef7260fab1be8f251041>`_
2120
* ImageJ Ops version ``2.0.0``
2221

2322
All benchmarks are executed using the `Java Microbenchmark Harness <https://github.com/openjdk/jmh>`_, using the following parameters:
@@ -31,7 +30,7 @@ All benchmarks are executed using the `Java Microbenchmark Harness <https://gith
3130
Op Matching
3231
-----------
3332

34-
We first analyze the performance of executing the following static method:
33+
We first analyze the performance of executing the following static method, written to contain the *fewest* instructions possible while also avoiding code removal by the Just-In-Time (JIT) compiler:
3534

3635
.. code-block:: java
3736
@@ -45,31 +44,33 @@ We first analyze the performance of executing the following static method:
4544
data[0]++;
4645
}
4746
48-
We first benchmark the base penalty of executing this method using SciJava Ops, compared to direct execution of the static method. This method mutates a data structure in place, meaning the Ops engine can match it directly as an inplace Op, or adapt it to a function Op. Thus, we test the benchmark the following three scenarios:
47+
We first benchmark the overhead of executing this method through SciJava Ops, compared to direct execution of the static method. This method mutates a data structure in place, meaning the Ops engine can match it directly as an inplace Op, or **adapt** it to a function Op. Thus, we benchmark the following four scenarios:
4948

50-
* Direct static method invocation
49+
* Static method invocation
50+
* Output Buffer creation + static method invocation **(A)**
5151
* SciJava Ops inplace invocation
52-
* SciJava Ops adapted function invocation
52+
* SciJava Ops **function** invocation **(A)**
5353

54-
The results are shown in **Figure 1**. We find Op execution through the SciJava Ops framework adds a few milliseconds of additional overhead. A few additional milliseconds of overhead are observed when SciJava Ops is additionally tasked with creating an output buffer.
54+
The results are shown in **Figure 1**. We find that Op execution through the SciJava Ops framework adds approximately 100 microseconds of overhead. A further millisecond of overhead is observed when SciJava Ops also creates an output buffer.
5555

5656
.. chart:: ../images/BenchmarkMatching.json
5757

5858
**Figure 1:** Algorithm execution performance (lower is better)
5959

60-
Note that the above requests are benchmarked without assistance from the Op cache, i.e. they are designed to model the full matching process. As repeated Op requests will utilize the Op cache, we benchmark cached Op retrieval separately, with results shown in **Figure 2**. These benchmarks suggest Op caching helps avoid the additional overhead of Op adaptation as its performance approaches that of normal Op execution.
60+
Note that the above requests are benchmarked without assistance from the Op cache, measuring the overhead of the full matching process. As repeated Op requests will utilize the Op cache, we benchmark cached Op execution separately, with results shown in **Figure 2**. From these results, we conclude that Op matching comprises the majority of SciJava Ops overhead, and that repeated executions add only a few microseconds of overhead.
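The caching behavior described here amounts to memoizing the result of the expensive matching phase. A toy sketch of the idea (plain Python, *not* the SciJava Ops API; the op name and delay are illustrative):

```python
import functools
import time

@functools.lru_cache(maxsize=None)
def match_op(name):
    # Stand-in for the expensive full matching process.
    time.sleep(0.001)
    return name.upper()

match_op("benchmark.increment")  # first request: pays the full matching cost
match_op("benchmark.increment")  # repeated request: served from the cache
```

Only the first request per distinct name pays the matching cost; subsequent requests are a dictionary lookup, which mirrors why cached Op execution approaches static-method performance in Figure 2.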
6161

6262
.. chart:: ../images/BenchmarkCaching.json
6363

6464
**Figure 2:** Algorithm execution performance with Op caching (lower is better)
6565

6666
Finally, we benchmark the overhead of SciJava Ops parameter conversion. Suppose we instead wish to operate upon a ``double[]``; it must first be converted to a ``byte[]`` before calling our Op. We consider the following procedures:
6767

68-
* Image conversion + output buffer creation + static method invocation
69-
* output buffer creation + SciJava Ops invocation using Op conversion
70-
* SciJava Ops invocation using Op conversion and Op adaptation
68+
* Array conversion + static method invocation **(C)**
69+
* Array buffer creation + array conversion + static method invocation **(A+C)**
70+
* SciJava Ops converted inplace invocation **(C)**
71+
* SciJava Ops converted **function** invocation **(A+C)**
7172

72-
The results are shown in **Figure 3**; note the Op cache is **not** enabled. We observe overheads on the order of 10 milliseconds to perform Op conversion with and without Op adaptation.
73+
The results are shown in **Figure 3**; note the Op cache is **not** enabled. We find that parameter conversion imposes roughly 1 millisecond of additional overhead, and that when both parameter conversion and Op adaptation are used, the overheads (~2 milliseconds in total) are *additive*.
7374

7475
.. chart:: ../images/BenchmarkConversion.json
7576

@@ -78,35 +79,35 @@ The results are shown in **Figure 3**; note the Op cache is **not** enabled. We
7879
Framework Comparison
7980
--------------------
8081

81-
To validate our development efforts atop the original `ImageJ Ops <https://imagej.net/libs/imagej-ops/>`_ framework, we benchmark executions of the following method:
82+
To validate our development efforts atop the original `ImageJ Ops <https://imagej.net/libs/imagej-ops/>`_ framework, we additionally wrap the above static method within ImageJ Ops:
8283

8384
.. code-block:: java
8485
85-
/**
86-
* @param data the data to invert
87-
* @implNote op name="benchmark.invert",type=Inplace1
88-
*/
89-
public static void invertRaw(final byte[] data) {
90-
for (int i = 0; i < data.length; i++) {
91-
final int value = data[i] & 0xff;
92-
final int result = 255 - value;
93-
data[i] = (byte) result;
86+
/** Increment Op wrapper for ImageJ Ops. */
87+
@Plugin(type = Op.class, name = "benchmark.increment")
88+
public static class IncrementByteOp extends AbstractUnaryInplaceOp<byte[]>
89+
implements Op
90+
{
91+
92+
@Override
93+
public void mutate(byte[] o) {
94+
incrementByte(o);
9495
}
9596
}
9697
97-
We then benchmark the performance of executing this code using the following pathways:
98+
We then benchmark the performance of executing the static method using the following pathways:
9899

99100
* Static method invocation
100101
* SciJava Ops invocation
101-
* ImageJ Ops invocation (using a ``Class`` wrapper to make the method discoverable within ImageJ Ops)
102+
* ImageJ Ops invocation (using the above wrapper)
102103

103-
The results are shown in **Figure 4**. When algorithm matching dominates execution time, the SciJava Ops matching framework provides significant improvement in matching performance in comparison with the original ImageJ Ops framework.
104+
The results are shown in **Figure 4**. From this figure we can see that the "Op overhead" of ImageJ Ops is approximately 70x that of SciJava Ops.
104105

105106
.. chart:: ../images/BenchmarkFrameworks.json
106107

107108
**Figure 4:** Algorithm execution performance by Framework (lower is better)
108109

109-
Finally, here is a figure combining all the metrics above:
110+
We provide a final figure combining all the metrics above:
110111

111112
.. chart:: ../images/BenchmarkCombined.json
112113

docs/ops/graph_results.py

Lines changed: 30 additions & 23 deletions
@@ -8,34 +8,38 @@
88
# It expects JMH benchmark results be dumped to a file "scijava-ops-benchmark_results.json", within its directory.
99

1010
# If you'd like to add a plotly chart, add an entry to the following list.
11+
12+
A = "<b style=\"color:black\">[<b style=\"color:#009E73\">A</b>]</b>"
13+
C = "<b style=\"color:black\">[<b style=\"color:#E69F00\">C</b>]</b>"
14+
AC = "<b style=\"color:black\">[<b style=\"color:#CC79A7\">AC</b>]</b>"
1115
figures = [
1216
{
1317
"name": "BenchmarkMatching",
1418
"title": "Basic Op Matching Performance",
1519
"bars": {
1620
"noOps": "Static Method",
17-
"noOpsAdapted": "Static Method [A]",
18-
"sjOps": "Op Execution",
19-
"sjOpsAdapted": "Op Execution [A]"
21+
"noOpsAdapted": f"Static Method {A}",
22+
"sjOps": "SciJava Ops",
23+
"sjOpsAdapted": f"SciJava Ops {A}"
2024
}
2125
},
2226
{
2327
"name": "BenchmarkCaching",
2428
"title": "Caching Effects on Op Matching Performance",
2529
"bars": {
2630
"noOps": "Static Method",
27-
"sjOps": "Op Execution",
28-
"sjOpsWithCache": "Op Execution (cached)"
31+
"sjOps": "SciJava Ops",
32+
"sjOpsWithCache": "SciJava Ops (cached)"
2933
}
3034
},
3135
{
3236
"name": "BenchmarkConversion",
3337
"title": "Parameter Conversion Performance",
3438
"bars": {
35-
"noOpsConverted": "Static Method [C]",
36-
"noOpsAdaptedAndConverted": "Static Method [A+C]",
37-
"sjOpsConverted": "Op Execution [C]",
38-
"sjOpsConvertedAndAdapted": "Op Execution [A+C]"
39+
"noOpsConverted": f"Static Method {C}",
40+
"noOpsAdaptedAndConverted": f"Static Method {AC}",
41+
"sjOpsConverted": f"SciJava Ops {C}",
42+
"sjOpsConvertedAndAdapted": f"SciJava Ops {AC}"
3943
}
4044
},
4145
{
@@ -52,14 +56,14 @@
5256
"title": "Combined Performance Metrics",
5357
"bars": {
5458
"noOps": "Static Method",
55-
"noOpsAdapted": "Static Method [A]",
56-
"noOpsConverted": "Static Method [C]",
57-
"noOpsAdaptedAndConverted": "Static Method [A+C]",
58-
"sjOpsWithCache": "SciJava Ops (with caching)",
59-
"sjOps": "SciJava Ops (no caching)",
60-
"sjOpsAdapted": "SciJava Ops [A]",
61-
"sjOpsConverted": "SciJava Ops [C]",
62-
"sjOpsConvertedAndAdapted": "SciJava Ops [A+C]",
59+
"noOpsAdapted": f"Static Method {A}",
60+
"noOpsConverted": f"Static Method {C}",
61+
"noOpsAdaptedAndConverted": f"Static Method {AC}",
62+
"sjOpsWithCache": "SciJava Ops (cached)",
63+
"sjOps": "SciJava Ops",
64+
"sjOpsAdapted": f"SciJava Ops {A}",
65+
"sjOpsConverted": f"SciJava Ops {C}",
66+
"sjOpsConvertedAndAdapted": f"SciJava Ops {AC}",
6367
"ijOps": "ImageJ Ops",
6468
}
6569
}
@@ -74,8 +78,9 @@
7478
for row in data:
7579
test = row["benchmark"].split(".")[-1]
7680
score = row["primaryMetric"]["score"]
77-
error = row["primaryMetric"]["scoreError"]
78-
results[test] = {"score": score, "error": error}
81+
percentiles = row["primaryMetric"]["scorePercentiles"]
82+
minmax = [percentiles["0.0"], percentiles["100.0"]]
83+
results[test] = {"score": score, "minmax": minmax}
7984

8085
# Build charts and dump them to JSON.
8186
for figure in figures:
@@ -85,25 +90,27 @@
8590
x = []
8691
y = []
8792
error_y = []
93+
error_y_minus = []
8894

8995
# Add each benchmark in the class
9096
for test, label in figure["bars"].items():
9197
print(f".", end="")
9298
result = results[test]
9399
x.append(label)
94100
y.append(result["score"])
95-
error_y.append(result["error"])
101+
error_y.append(result["minmax"][1] - result["score"])
102+
error_y_minus.append(result["score"] - result["minmax"][0])
96103

97104
# Create a bar chart
98105
fig = go.Figure()
99106
fig.add_bar(
100107
x=x,
101108
y=y,
102-
error_y=dict(type='data', array=error_y),
109+
error_y=dict(type='data', array=error_y, arrayminus=error_y_minus),
103110
)
104111
fig.update_layout(
105-
title_text=figure["title"],
106-
yaxis_title="Performance (us/op)"
112+
title_text=figure["title"] + f"<br><sup style=\"color: gray\">{A}=Adaptation, {C}=Conversion, {AC}=Adaptation & Conversion</sup>",
113+
yaxis_title="<b>Performance (&mu;s/execution)</b>"
107114
)
108115

109116
# Convert to JSON and dump
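The result-parsing loop in the script above can be sketched as a self-contained snippet (inline JSON stands in for the ``scijava-ops-benchmark_results.json`` dump; the benchmark name and values are placeholders, with field names following JMH's JSON report format):

```python
import json

# Inline stand-in for the JMH results dump; values are placeholders.
data = json.loads("""[
  {"benchmark": "org.example.Benchmarks.noOps",
   "primaryMetric": {"score": 1.5,
                     "scorePercentiles": {"0.0": 1.2, "100.0": 2.0}}}
]""")

results = {}
for row in data:
    # Strip the package/class prefix, keeping only the method name.
    test = row["benchmark"].split(".")[-1]
    score = row["primaryMetric"]["score"]
    pct = row["primaryMetric"]["scorePercentiles"]
    results[test] = {"score": score, "minmax": [pct["0.0"], pct["100.0"]]}

print(results["noOps"])  # {'score': 1.5, 'minmax': [1.2, 2.0]}
```

Keying on the trailing method name is what lets the ``figures`` table above refer to benchmarks by short identifiers such as ``noOps`` and ``sjOps``.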

0 commit comments