ESSENSE can now run on a single GPU using OpenMP directives. Benchmark tests were carried out on Kebnekaise at HPC2N, on an Intel Xeon E5-2690v4 (Broadwell) CPU and an NVIDIA Tesla V100 GPU.
However, the OpenMP version runs about 20% slower than the OpenACC version. Compared to a single CPU, a speedup of 33x was obtained at the lower resolution (65x65x65) and about 49x at the higher resolution (129x129x129).
Resolution | CPU [s] | GPU [s] | Speedup |
65x65x65 | 3.965 | 0.120 | 33 |
129x129x129 | 30.581 | 0.623 | 49 |
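
The speedup column can be checked directly from the timings in the table, as CPU time divided by GPU time. The following is a minimal sketch of that arithmetic. The names `timings` and `speedup` are illustrative only and are not part of ESSENSE.

```python
# Wall-clock times in seconds, copied from the benchmark table above:
# grid size -> (CPU time, GPU time)
timings = {
    "65x65x65": (3.965, 0.120),
    "129x129x129": (30.581, 0.623),
}


def speedup(cpu_seconds, gpu_seconds):
    """Return how many times faster the GPU run is than the CPU run."""
    return cpu_seconds / gpu_seconds


# Hypothetical helper dictionary: grid size -> speedup factor
speedups = {grid: speedup(cpu, gpu) for grid, (cpu, gpu) in timings.items()}

for grid, factor in speedups.items():
    print(f"{grid}: {factor:.1f}x")
# prints:
# 65x65x65: 33.0x
# 129x129x129: 49.1x
```

Rounded to whole numbers, these ratios give the 33 and 49 reported in the table.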