Commit Graph

234 Commits

Author SHA1 Message Date
3ddd507c17 Fix a bug in building the fake abc tuples 2022-10-07 12:23:10 +02:00
ae6736fc21 Implement a faster version of naive computation 2022-10-07 03:19:24 +02:00
d5cfe31b12 Add naive tuples scratch file 2022-10-07 01:11:22 +02:00
ddb4574380 Fix DatabaseCommunicator 2022-10-06 12:27:00 +02:00
118df09128 Add tentative DatabaseCommunicator 2022-10-06 01:10:06 +02:00
1e391e3749 Update tuples-distribution script 2022-10-06 01:07:53 +02:00
7734efeb97 Add tuples distribution bench 2022-10-03 17:13:21 +02:00
fa1a29c583 Create an implementation file of the Tuples 2022-10-03 17:11:49 +02:00
2cbff5c8c9 Add bench utils 2022-10-03 17:11:33 +02:00
50896e3cd0 Add vendor infrastructure 2022-10-03 17:11:08 +02:00
b11b53aca1 Add warning for not slicing 2022-09-30 14:36:05 +02:00
5f9725d515 Add forgotten memcpy for SelfSufficient buffers 2022-09-30 14:35:28 +02:00
1ca31f4929 Fix bench building for cublas-parallel-atrip 2022-09-30 14:34:54 +02:00
399447131c Exclude clang5 from the nvidia compilation 2022-09-14 14:52:27 +02:00
da2823cf54 Exclude (gcc, cuda) from matrix 2022-09-14 14:43:38 +02:00
4092f968b7 Tell nvcc to use the correct MPICXX binary
There was a bug related to

  https://gcc.gnu.org/pipermail/gcc-help/2021-June/140490.html

where nvcc was picking up other versions of the gcc compiler
and the libstdc++ versions were mismatched, resulting in linking errors.
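A hedged sketch of the fix in Makefile terms (the NVCC_CCBIN and NVCCFLAGS variable names are illustrative, not the project's real ones; `mpicxx -show` is the MPICH spelling for printing the wrapped compiler command, Open MPI uses `--showme` instead):

```make
# Pin nvcc's host compiler (-ccbin) to the same compiler that the
# MPICXX wrapper uses, so the libstdc++ versions agree at link time.
NVCC_CCBIN ?= $(shell $(MPICXX) -show | cut -d' ' -f1)
NVCCFLAGS  += -ccbin $(NVC C_CCBIN:%=%)
NVCCFLAGS  += -ccbin $(NVCC_CCBIN)
```

nvcc's `-ccbin` flag is its documented way of selecting the host compiler used for the non-device parts of the compilation.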
2022-09-14 13:28:15 +02:00
a263c0c933 Fix cuda bench 2022-09-13 15:14:45 +02:00
2895fd25a3 Disable debug in openblas 2022-09-13 15:00:06 +02:00
8abc516b1f Use --enable-cuda flag in workflows 2022-09-13 14:51:49 +02:00
Gallo Alejandro
5c41fb65e4 Fix ctf.mk for CUDA 2022-09-13 14:15:39 +02:00
4e2de62508 Fix memory consumption in test-cublas-parallel-atrip 2022-09-12 20:19:41 +02:00
81e89274ab Fix yaml workflows syntax 2022-09-12 20:02:51 +02:00
5ca94e3238 Try cuda in github ci 2022-09-12 20:00:46 +02:00
6effcbcdc8 Add ci for cuda branch too 2022-09-12 19:58:18 +02:00
20c29ed815 Add a couple of bench scripts for cuda and cublas testing 2022-09-12 19:53:50 +02:00
a5bb8f3d9c Give a better message when out of memory in allocating SliceUnion 2022-09-12 19:39:38 +02:00
5678ac0d9c Fix AniaBug#2: Lifecycle of SelfSufficient slices was wrong (comments)
This commit shows how the lifecycle of a slice goes.
At some point, a rank gets a list of the slices it needs
in the next iteration, and classifies them according
to the characteristics of each situation.

If, for instance, we are given a slice with
an abc tuple, and we find that this tuple
was assigned to our rank, then we know that
we have to create a SelfSufficient slice.

What we do is find a blank slice in our
SliceUnion's bucket of slices. This buffer is blank
and safe for us to do whatever we want with.
Without CUDA, we just need to point this
blank slice to the correct memory address
of the data that we (the SliceUnion) own.
This is therefore the line

  blank.data = sources[from.source].data();

Of course, doing this with CUDA messes everything up,
as it did until now, since we are pointing to a host
address. Sadly, the way the casting is currently implemented,
the type checker did not catch this one, and I foolishly
forgot about this important bit.

After the creation of the slice comes, at some point
in the life cycle, its destruction, which we also
have to handle separately.
This is done every iteration in the

    void clearUnusedSlicesForNext(ABCTuple const& abc);

function. There, the SelfSufficient slice
would normally just forget about the pointer it holds, slice.data,
since this pointer is part of the original data of the tensor
distributed in the SliceUnion. In the CUDA case, however,
we gave the SelfSufficient slice a freePointer from our
SliceUnion's bucket of freePointers allocated on the GPU
(of which we have around 6 per SliceUnion type).
This pointer needs to be marked free again, for use
by a slice in the future, so it has to go back to the bucket;
we can't afford to lose it.
2022-09-12 19:29:35 +02:00
5483325626 Fix AniaBug #1: cublasCreate after context setting 2022-09-12 19:17:52 +02:00
23ad87214f Uncomment everything in Equations (to test, see comments)
For testing, comment out everything
that has REORDER, MAYBE_CONJ and zeroing.
2022-09-12 19:13:45 +02:00
3d7702d501 Minimal changes in Equations 2022-09-12 19:10:14 +02:00
c20b9e3bcb Add error checking in Blas.cxx 2022-09-12 19:07:48 +02:00
da704ad820 Add convenience _FORMAT macro 2022-09-12 18:42:36 +02:00
f1b2f37fe2 Check CUresult for mpi_data to device 2022-09-12 18:41:37 +02:00
1cd7bac187 Add macros to check CUerror and cublasStatus_t 2022-09-12 18:36:01 +02:00
68892d5dd8 Make bootstrap work from anywhere in the project 2022-09-12 18:35:30 +02:00
4277c07cc2 Add memory consumption in bench 2022-09-08 15:44:29 +02:00
0558148937 Fix small syntactic bug 2022-09-08 15:36:10 +02:00
49ff3b377c Add a chrono for mpi memcpy in cuda 2022-09-08 15:27:51 +02:00
00a28c990c Indent more conventionally 2022-09-08 15:27:38 +02:00
2c5a4620ca Lint and tidy up Equations 2022-09-08 13:51:49 +02:00
368c5619cc Fix the autotools atrip_cublas 2022-09-08 13:50:08 +02:00
0b14ac7704 Add bootstrap script 2022-09-06 15:26:28 +02:00
76a785044d Check ngcards against ranks per node 2022-08-14 15:36:22 +02:00
7241bbe9fb Implement reordering on the GPU 2022-08-12 18:32:32 +02:00
c2e9e930ba Update main Atrip.cxx using several gpus 2022-08-12 18:30:55 +02:00
b4aef4db9e Fix compilation issues and add KernelSizes 2022-08-12 18:29:21 +02:00
4651231d3b Update test bench for CUDA 2022-08-12 18:28:20 +02:00
4101c89907 Improve cuda m4 2022-08-11 13:55:52 +02:00
f06cd7f562 Fix the request free problem 2022-08-08 18:28:51 +02:00
8c04280a65 Fix blas 2022-08-08 18:26:52 +02:00