57 Commits

Author SHA1 Message Date
34a4e79db0 Initial compiling implementation of the energy kernel 2023-01-13 11:33:42 +01:00
8efa3d911e Add --max-iterations to main bench 2022-12-06 20:38:38 +01:00
ad542fe856 Add the slicing into the GPU 2022-12-05 21:16:30 +01:00
658397ebd7 Update in SliceUnion ATRIP_SOURCES_IN_GPU 2022-12-05 17:55:23 +01:00
118df09128 Add tentative DatabaseCommunicator 2022-10-06 01:10:06 +02:00
fa1a29c583 Create an implementation file of the Tuples 2022-10-03 17:11:49 +02:00
50896e3cd0 Add vendor infrastructure 2022-10-03 17:11:08 +02:00
b11b53aca1 Add warning for not slicing 2022-09-30 14:36:05 +02:00
5f9725d515 Add forgotten memcpy for SelfSufficient buffers 2022-09-30 14:35:28 +02:00
4092f968b7 Tell nvcc to use the correct MPICXX binary
There was a bug related to

  https://gcc.gnu.org/pipermail/gcc-help/2021-June/140490.html

where nvcc was picking up other versions of the gcc compiler
and the libstdc++ versions were mismatched, resulting in linking errors.
2022-09-14 13:28:15 +02:00
Gallo Alejandro
a5bb8f3d9c Give a better message when out of memory in allocating SliceUnion 2022-09-12 19:39:38 +02:00
Gallo Alejandro
5678ac0d9c Fix AniaBug#2: Lifecycle of SelfSufficient slices was wrong (comments)
This commit shows how the lifecycle of a slice goes.
At some point, a rank gets a list of slices that it needs
in the next iteration, and classifies them according
to the characteristics of every situation.

If for instance we are given a slice with
an abc tuple such that we find that this tuple
was given to our rank, then we know that
we have to create a SelfSufficient slice.

What we do is that we find a blank slice in our
SliceUnion slices bucket. This buffer is blank
and safe to do everything we want with it.
Without cuda, we just need to point this
blank slice to the correct memory address
of the data, that we (the SliceUnion) own.
This is therefore the line

  blank.data = sources[from.source].data();

Of course, doing this in CUDA will break everything,
as it did until now, since we would be pointing to a host
address. Sadly, with the way the casting fu is now implemented,
the typechecker did not catch that one and I foolishly
forgot about this important bit.

After the creation of the slice, at some point
in its lifecycle comes its destruction, which we also
have to handle separately.
This is done every iteration in the

    void clearUnusedSlicesForNext(ABCTuple const& abc);

function. There, normally the SelfSufficient slice
would just forget about the pointer it points to, slice.data,
since this pointer is part of the original data of the tensor
distributed in the SliceUnion. In the CUDA case however,
we gave the SelfSufficient slice a freePointer from our
SliceUnion's bucket of allocated freePointers in the GPU
(of which we have around 6 per SliceUnion type).
This pointer needs to be marked as free again so that a slice
can use it in the future, so it has to go back to the bucket;
we can't afford to lose it.
2022-09-12 19:29:35 +02:00
Gallo Alejandro
da704ad820 Add convenience _FORMAT macro 2022-09-12 18:42:36 +02:00
Gallo Alejandro
f1b2f37fe2 Check CUresult for mpi_data to device 2022-09-12 18:41:37 +02:00
Gallo Alejandro
1cd7bac187 Add macros to check CUerror and cublasStatus_t 2022-09-12 18:36:01 +02:00
49ff3b377c Add a chrono for mpi memcpy in cuda 2022-09-08 15:27:51 +02:00
Gallo Alejandro
b4aef4db9e Fix compilation issues and add KernelSizes 2022-08-12 18:29:21 +02:00
f06cd7f562 Fix the request free problem 2022-08-08 18:28:51 +02:00
8c04280a65 Fix blas 2022-08-08 18:26:52 +02:00
Gallo Alejandro
a5b2a74e18 Changes in source files, makes cuda run 2022-08-05 13:42:04 +02:00
ad75a3de13 Add __device__ to some functions 2022-07-26 15:12:09 +02:00
bea9c7a75e Tangle sources 2022-05-06 13:58:26 +02:00
fed19ff52c Implement deleting Vppph 2022-04-05 17:16:11 +02:00
b54cdc0573 Silence ctf errors 2022-03-15 23:30:28 +01:00
c6eb805078 Now compiling 2022-03-11 14:27:52 +01:00
88c03698db Merge branch 'complex' 2022-03-04 17:42:40 +01:00
c25d6b3c18 Add license headers 2022-03-04 16:14:50 +01:00
d226cdc43e Tangle sources 2022-03-02 19:37:48 +01:00
10a7969710 Silence the logging in group-and-sort 2022-02-22 12:09:41 +01:00
bbbfb30c6f Add tangled sources 2022-02-18 12:54:59 +01:00
c2b1c78c67 Tangle source files for complex 2022-02-07 22:29:47 +01:00
cdbad963b0 Add user printing mechanism (cherry pick) 2021-11-30 12:31:15 +01:00
54b568a669 Improve slice union not found error message 2021-11-18 20:30:11 +01:00
9f1f32e950 Increase the number of buffers for 1-d slices to 6 2021-11-18 11:05:35 +01:00
5aa10f31ad Fix number of tuples group-and-sort 2021-11-17 15:30:19 +01:00
1d6d14c398 Update sources 2021-11-08 17:33:13 +01:00
7b617930a6 Fix naive implementation, NH3 working for both 2021-11-04 16:10:23 +01:00
a5619146f0 Add group-and-sort with MPI_Scatter and not MPI_Scatterv 2021-11-03 19:46:38 +01:00
12f8c6294e Update and simplify the naive implementation, [working] 2021-11-03 18:31:02 +01:00
79a3f99cb3 Update all chronos to use the static chrono 2021-10-21 15:25:01 +02:00
da714d3b7f Add static chrono 2021-10-19 17:34:28 +02:00
03b66dac8d Tangle: Update files 2021-10-19 13:52:00 +02:00
a03ed439f2 Add an interface struct for distributions 2021-10-18 10:57:52 +02:00
65f804e637 Add group and sort algorithm, it compiles at least 2021-10-15 19:18:40 +02:00
944e93dc33 Define FAKE_TUPLE 2021-10-15 18:58:55 +02:00
3dd8af052d Comment tuples section 2021-10-14 15:23:30 +02:00
1cf6795130 Comment the Utils section 2021-10-13 18:05:54 +02:00
51c3203fda Comment most of Slice section 2021-10-13 17:47:58 +02:00
44977bd186 Update sources with chrono in input value 2021-10-05 16:58:37 +02:00
0a861618fd Fix getEnergy bug with index jj <> kk 2021-10-05 16:49:30 +02:00
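
The SelfSufficient slice lifecycle described in commit 5678ac0d9c can be sketched roughly as follows. This is a minimal, host-only illustration and not the project's actual code: the names SliceUnionSketch, makeSelfSufficient, release and freeCount are hypothetical, plain heap buffers stand in for the GPU allocations, and std::memcpy stands in for the host-to-device copy. It only shows the bookkeeping the message talks about: a blank slice borrows a pointer from the bucket of free pointers, and that pointer goes back to the bucket when the slice is cleared.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>
#include <vector>

// Hypothetical slice: only the fields needed for this sketch.
struct Slice {
  enum class Type { Blank, SelfSufficient };
  Type type = Type::Blank;
  double* data = nullptr;
};

// Hypothetical stand-in for a SliceUnion in the CUDA case.
// Heap buffers play the role of device allocations.
class SliceUnionSketch {
public:
  SliceUnionSketch(std::size_t nBuffers, std::size_t sliceSize,
                   std::vector<double> source)
    : sliceSize_(sliceSize)
    , source_(std::move(source))
    , storage_(nBuffers, std::vector<double>(sliceSize)) {
    assert(source_.size() >= sliceSize_);
    // Every preallocated buffer starts out in the bucket of free pointers.
    for (auto& buffer : storage_) freePointers_.push_back(buffer.data());
  }

  // The bucket hands out raw pointers into storage_, so forbid copies.
  SliceUnionSketch(SliceUnionSketch const&) = delete;
  SliceUnionSketch& operator=(SliceUnionSketch const&) = delete;

  // Creation: instead of pointing the blank slice at the host data,
  // borrow a free pointer and copy the data into it.
  bool makeSelfSufficient(Slice& blank) {
    if (freePointers_.empty()) return false;
    double* pointer = freePointers_.back();
    freePointers_.pop_back();
    // Stand-in for the host-to-device memcpy.
    std::memcpy(pointer, source_.data(), sliceSize_ * sizeof(double));
    blank.data = pointer;
    blank.type = Slice::Type::SelfSufficient;
    return true;
  }

  // Destruction: the borrowed pointer must go back to the bucket,
  // otherwise it is lost for future slices.
  void release(Slice& slice) {
    if (slice.type == Slice::Type::SelfSufficient && slice.data != nullptr)
      freePointers_.push_back(slice.data);
    slice.data = nullptr;
    slice.type = Slice::Type::Blank;
  }

  std::size_t freeCount() const { return freePointers_.size(); }

private:
  std::size_t sliceSize_;
  std::vector<double> source_;                // data owned by the union (host)
  std::vector<std::vector<double>> storage_;  // stand-in for GPU allocations
  std::vector<double*> freePointers_;         // bucket of free pointers
};
```

In this sketch, forgetting the push_back in release would shrink the bucket by one pointer per cleared slice, which is the kind of leak the commit message warns about.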