Greetings all,
First off, let me say that k-Wave has been an invaluable tool for me, and it has been running brilliantly on my workstation for months. Since I intend to run bigger simulations (and lots of them), I'm now trying to run kspaceFirstOrder-CUDA on my department's HPC cluster using A100 GPUs. Unfortunately, it doesn't get very far, and I receive the following:
'
INFO:root:k-wave forward, cycle: 1, wavelength_index: 1, pulse: 1
INFO:root:p0 loaded in 0.09530607238411903 seconds
INFO:root:kwave forward run in 229.18575995601714 seconds
INFO:root:sensor data saved in 0.010119317099452019 seconds
INFO:root:k-wave forward, cycle: 1, wavelength_index: 2, pulse: 1
INFO:root:p0 loaded in 0.021063480526208878 seconds
┌───────────────────────────────────────────────────────────────┐
│            !!! K-Wave experienced a fatal error !!!           │
├───────────────────────────────────────────────────────────────┤
│ GPU error: an illegal memory access was encountered routine   │
│ name: cudaGetLastError() in file OutputStreams/               │
│ OutputStreamsCudaKernels.cu, line 130.                        │
├───────────────────────────────────────────────────────────────┤
│                      Execution terminated                     │
└───────────────────────────────────────────────────────────────┘
'
My script simulates time-series images, so it runs k-Wave multiple times. The first run ("cycle: 1, wavelength_index: 1, pulse: 1") works fine, and the output looks completely sensible. But as soon as it reaches the second run ("cycle: 1, wavelength_index: 2, pulse: 1"), it hits the error above. It's most probably related to the HPC environment, since I haven't encountered this on my workstation. Has anybody seen anything similar, or have any idea what might be causing it? The time reversal reconstructions, which are set to run after the forward simulations, are completely fine, although they are 2D and thus only require a fraction of the resources.
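One check that might help narrow this down is logging GPU memory just before each forward run, to see whether memory from the first run is still allocated when the second one starts. Below is a minimal sketch of such a check, assuming nvidia-smi is on the PATH on the compute node. The function names are hypothetical and not part of k-Wave, and the parser assumes plain integer MiB values in the output.

```python
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'used, total' MiB pairs from nvidia-smi CSV output (one line per GPU).

    Assumes the fields are plain integers, as produced with the
    'csv,noheader,nounits' format option.
    """
    readings = []
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        used, total = (int(field.strip()) for field in line.split(","))
        readings.append((used, total))
    return readings


def query_gpu_memory():
    """Return [(used_MiB, total_MiB), ...]; assumes nvidia-smi is on the PATH."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_memory(result.stdout)
```

Calling query_gpu_memory() right before each forward simulation and writing the result to the log would show whether the GPU is already partly occupied when the second run begins. This only reports memory usage; it would not by itself explain an illegal memory access.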