Joined on 2022-07-21
gallo pushed tag cuda-energy-working to gallo/atrip 2023-01-26 01:52:32 +01:00
gallo pushed to cuda-energy at gallo/atrip 2023-01-26 01:52:16 +01:00
af42b353c4 Use acc::maybeConjugate for cpu code
e4f326e394 Fix the reordering kernel in cuda
93cba3c3ab Implement zeroing of tensors through memcpy and cuMemcpy
gallo pushed to cuda-energy at gallo/atrip 2023-01-25 16:26:00 +01:00
4e2d1143e5 Add getSize static method to calculate the size of sources in SliceUnion
gallo pushed to newtuples at gallo/atrip 2023-01-25 16:26:00 +01:00
8f7d05efda Add Building information and building for sources on GPU
gallo pushed to cuda at gallo/atrip 2023-01-25 16:26:00 +01:00
122329eca7 Fix zeroing
58c0bf078e Zero Tijk correctly in CPU code
3fe15e5e5c Fix bs and ths error in equations
0d223e6ed9 Fix vector types for energy in cpu
c8bdc4239f Fix an odd character in the warmup
(5 of 12 commits shown)
gallo pushed to cuda-energy at gallo/atrip 2023-01-25 13:55:52 +01:00
933d556c84 Fix the reordering kernel in Equations
c7e3fa45bd Add old version of energies and only generate code for doubles
2b8b3bd421 don't copy self sufficient slices when sources on gpu
122329eca7 Fix zeroing
58c0bf078e Zero Tijk correctly in CPU code
(5 of 8 commits shown)
gallo pushed to cuda-energy at gallo/atrip 2023-01-23 14:30:31 +01:00
be96e4bf8c 1. syntax error fix 2. allocate temporary buffers only once per sim
9003c218a3 don't need to copy to separate mpi_data array on the host when sources are resident on gpu
gallo pushed to cuda-energy at gallo/atrip 2023-01-23 14:22:30 +01:00
4af47a0bb7 Initialize sources on gpus when ATRIP_SOURCES_IN_GPU
gallo pushed to cuda-energy at gallo/atrip 2023-01-23 13:50:18 +01:00
9a5a2487be Add warmup in the SliceUnion
c4ec227185 Clean getEnergyDistinct
1ceb4cf0d6 Fix maybeConjugate cuda scope
34a4e79db0 Initial compiling implementation of the energy kernel
gallo created branch cuda-energy in gallo/atrip 2023-01-23 13:50:18 +01:00
gallo pushed to openacc at gallo/atrip 2023-01-11 13:08:07 +01:00
017cf43381 Add preliminary openacc support, atrip bench not linking
gallo pushed to openacc at gallo/atrip 2023-01-05 00:07:45 +01:00
77e1aaabeb Add bureaucracy for openacc in autotools
gallo created branch openacc in gallo/atrip 2023-01-05 00:07:45 +01:00
gallo pushed to cuda at gallo/atrip 2023-01-04 15:28:27 +01:00
249f1c0b51 Add raven modules for cuda
1d96800d45 Add support for reading tensors from file in atrip bench
9087e3af19 Update workflows
418fd9d389 Add simple cuda bench configuration
gallo pushed to cuda at gallo/atrip 2022-12-06 20:39:23 +01:00
895cd02778 Add some documentation about running the benches
8efa3d911e Add --max-iterations to main bench
gallo pushed to cuda at gallo/atrip 2022-12-06 14:18:55 +01:00
0fa24404e5 Improve the documentation in the readme for benches building
8f7d05efda Add Building information and building for sources on GPU
ad542fe856 Add the slicing into the GPU
658397ebd7 Update in SliceUnion ATRIP_SOURCES_IN_GPU
26e2f2d109 Add ATRIP_SOURCES_IN_GPU and ATRIP_CUDA_AWARE_MPI defines in configure
(5 of 27 commits shown)
gallo pushed to newtuples at gallo/atrip 2022-12-05 22:10:32 +01:00
ad542fe856 Add the slicing into the GPU
658397ebd7 Update in SliceUnion ATRIP_SOURCES_IN_GPU
26e2f2d109 Add ATRIP_SOURCES_IN_GPU and ATRIP_CUDA_AWARE_MPI defines in configure
871471aae3 Fix naive-tuples experimentation bench
65a64f3f8c Test on all pushes
(5 of 13 commits shown)