[grid] reduce shared memory usage #2793
Conversation
mtaillefumier commented on May 23, 2023
- the coefficients are now read directly from global memory when lp_max > 4 instead of being stored in shared memory. This reduces shared memory usage for large l.
- grid_miniapp seems unhappy and triggers a race condition when the hab coefficients are calculated, so an atomicAdd was added (see the sketch after this list).
- minor changes in the CMake build system.
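A minimal sketch of the pattern described above, under stated assumptions: the kernel name, `cxyz_global`, `LP_MAX_SHARED`, and the loop bodies are all hypothetical stand-ins, not the actual CP2K kernels. It shows the two access paths (staging coefficients in shared memory for small lp_max, reading straight from global memory otherwise) and the atomicAdd used to avoid the hab race.

```cuda
#include <cuda_runtime.h>

#define LP_MAX_SHARED 4  // hypothetical cutoff matching "lp_max > 4"

__host__ __device__ static int ncoset(int l) {
  // Number of Cartesian Gaussian components up to angular momentum l.
  return (l + 1) * (l + 2) * (l + 3) / 6;
}

__global__ void collocate_sketch(const double *cxyz_global, double *hab,
                                 int lp_max, int npoints) {
  extern __shared__ double cxyz_shared[];
  const int ncoefs = ncoset(lp_max);

  // Small lp_max: stage the coefficients once per block in shared memory.
  if (lp_max <= LP_MAX_SHARED) {
    for (int i = threadIdx.x; i < ncoefs; i += blockDim.x)
      cxyz_shared[i] = cxyz_global[i];
    __syncthreads();
  }
  // Large lp_max: read straight from global memory instead.
  const double *cxyz = (lp_max <= LP_MAX_SHARED) ? cxyz_shared : cxyz_global;

  for (int p = threadIdx.x + blockIdx.x * blockDim.x; p < npoints;
       p += gridDim.x * blockDim.x) {
    double contrib = 0.0;
    for (int c = 0; c < ncoefs; c++)
      contrib += cxyz[c];  // stand-in for the real collocation work
    // Several threads may update the same hab entry, hence the atomicAdd
    // (atomicAdd on double requires compute capability >= 6.0).
    atomicAdd(&hab[p % ncoefs], contrib);
  }
}
```

The dynamic shared-memory size is only needed on the small-lp_max path, e.g. `collocate_sketch<<<grid, block, ncoset(lp_max) * sizeof(double)>>>(...)`.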
Oh this is nice! I also contemplated storing the [...] Btw, this exception in the unittest should then no longer be needed. Linking #1785 for posterity.
I only removed the cxyz coefficients from shared memory, which means that we still have to store cab and alpha there. alpha isn't an issue, but cab might be: for instance, ncoset(l = 10) x ncoset(l = 10) is 286^2 doubles, which is more than the shared memory available. One solution might be a hybrid scheme where we sort tasks into low-l and high-l bins and then use a different algorithm for each. I can certainly do this on my side (cab in global memory), since integrate/collocate are separated from the calculation of the coefficients (at the price of more global memory used). I will lift the exception asap (tomorrow). From experience, collocate/integrate will always dominate the timers, so if we lose a little with cab being in global memory, then so be it. The overall gain of treating large l on the GPU instead of the CPU is worth the effort. It would be worth triggering the HIP Pascal tests; the grid_miniapp passes but still [...]
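For reference, a back-of-the-envelope check of the arithmetic in that comment, with ncoset(l) = (l+1)(l+2)(l+3)/6 as used above (the helper is illustrative, not CP2K code):

```cuda
#include <cstdio>

static int ncoset(int l) { return (l + 1) * (l + 2) * (l + 3) / 6; }

int main() {
  for (int l = 4; l <= 10; l += 2) {
    const int n = ncoset(l);
    const double kib = (double)n * n * sizeof(double) / 1024.0;
    printf("l = %2d  ncoset = %3d  cab = %8.1f KiB\n", l, n, kib);
  }
  // l = 10: ncoset = 11*12*13/6 = 286, so cab is 286^2 doubles,
  // ~639 KiB -- far above the roughly 48-228 KiB of shared memory
  // per SM on recent NVIDIA GPUs, so cab cannot stay in shared memory.
  return 0;
}
```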
I actually did the opposite and partitioned
Indeed. I removed cab from shared memory entirely. Something is still puzzling me: I get this when I run grid_unittest (same hardware in both cases).
[GPU backend output and HIP backend output elided]
That looks strangely identical; there should be at least some numerical noise. What do the statistics at the end say?
[GPU backend and HIP backend statistics elided]
They are different, and that is not where my trouble comes from. It is more about the difference ref/GPU vs. ref/HIP.
Those differences should be OK because they are below our tolerances of 1e-12 for matrix elements and 1e-8 for forces. These "warning lines" are already printed when the diffs surpass 0.01 of our thresholds. While this was useful during development, it's now confusing; hence I've opened #2797 to fix this inconsistency.
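A sketch of the reporting behavior being described, under stated assumptions: the function and message strings are illustrative, not the actual grid_unittest code. Diffs within tolerance but above 1% of it produce the "warning lines" seen in the output.

```cuda
#include <cstdio>

static void report_diff(const char *what, double diff, double tolerance) {
  if (diff > tolerance) {
    printf("ERROR:   %s diff %.3e exceeds tolerance %.1e\n", what, diff, tolerance);
  } else if (diff > 0.01 * tolerance) {
    // Still passes, but above 1% of the threshold: print a warning line.
    printf("WARNING: %s diff %.3e (tolerance %.1e)\n", what, diff, tolerance);
  }
}

int main() {
  report_diff("matrix element", 3.0e-14, 1.0e-12);  // warning line, passes
  report_diff("force",          2.0e-8,  1.0e-8);   // error, fails
  return 0;
}
```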
Perfect, thanks for the clarification; I was searching for something wrong in the code. #2797 can be merged first and I will then update this PR; or, if there is no conflict, you can merge both. You can squash all commits into one if you wish.