Commit Graph

234 Commits

Author SHA1 Message Date
3ddd507c17 Fix a bug in building the fake abc tuples 2022-10-07 12:23:10 +02:00
ae6736fc21 Implement a faster version of naive computation 2022-10-07 03:19:24 +02:00
d5cfe31b12 Add naive tuples scratch file 2022-10-07 01:11:22 +02:00
ddb4574380 Fix DatabaseCommunicator 2022-10-06 12:27:00 +02:00
118df09128 Add tentative DatabaseCommunicator 2022-10-06 01:10:06 +02:00
1e391e3749 Update tuples-distribution script 2022-10-06 01:07:53 +02:00
7734efeb97 Add tuples distribution bench 2022-10-03 17:13:21 +02:00
fa1a29c583 Create an implementation file of the Tuples 2022-10-03 17:11:49 +02:00
2cbff5c8c9 Add bench utils 2022-10-03 17:11:33 +02:00
50896e3cd0 Add vendor infrastructure 2022-10-03 17:11:08 +02:00
b11b53aca1 Add warning for not slicing 2022-09-30 14:36:05 +02:00
5f9725d515 Add forgotten memcpy for SelfSufficient buffers 2022-09-30 14:35:28 +02:00
1ca31f4929 Fix bench building for cublas-parallel-atrip 2022-09-30 14:34:54 +02:00
399447131c Exclude clang5 from the nvidia compilation 2022-09-14 14:52:27 +02:00
da2823cf54 Exclude (gcc, cuda) from matrix 2022-09-14 14:43:38 +02:00
4092f968b7 Tell nvcc to use the correct MPICXX binary
There was a bug related to

  https://gcc.gnu.org/pipermail/gcc-help/2021-June/140490.html

where nvcc was picking up other versions of the gcc compiler
and the libstdc++ versions were mismatched, resulting in linking errors.
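A hedged sketch of the fix in Makefile terms (the NVCC_CCBIN and NVCCFLAGS variable names are illustrative, not the project's real ones; `mpicxx -show` is the MPICH spelling for printing the wrapped compiler command, Open MPI uses `--showme` instead):

```make
# Pin nvcc's host compiler (-ccbin) to the same compiler that the
# MPICXX wrapper uses, so the libstdc++ versions agree at link time.
NVCC_CCBIN ?= $(shell $(MPICXX) -show | cut -d' ' -f1)
NVCCFLAGS  += -ccbin $(NVC C_CCBIN:%=%)
NVCCFLAGS  += -ccbin $(NVCC_CCBIN)
```

nvcc's `-ccbin` flag is its documented way of selecting the host compiler used for the non-device parts of the compilation.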
2022-09-14 13:28:15 +02:00
a263c0c933 Fix cuda bench 2022-09-13 15:14:45 +02:00
2895fd25a3 Disable debug in openblas 2022-09-13 15:00:06 +02:00
8abc516b1f Use --enable-cuda flag in workflows 2022-09-13 14:51:49 +02:00
Gallo Alejandro
5c41fb65e4 Fix ctf.mk for CUDA 2022-09-13 14:15:39 +02:00
4e2de62508 Fix memory consumption in test-cublas-parallel-atrip 2022-09-12 20:19:41 +02:00
81e89274ab Fix yaml workflows syntax 2022-09-12 20:02:51 +02:00
5ca94e3238 Try cuda in github ci 2022-09-12 20:00:46 +02:00
6effcbcdc8 Add ci for cuda branch too 2022-09-12 19:58:18 +02:00
20c29ed815 Add a couple of bench scripts for cuda and cublas testing 2022-09-12 19:53:50 +02:00
a5bb8f3d9c Give a better message when out of memory in allocating SliceUnion 2022-09-12 19:39:38 +02:00
5678ac0d9c Fix AniaBug#2: Lifecycle of SelfSufficient slices was wrong (comments)
This commit shows how the lifecycle of a slice goes.
At some point, a rank gets a list of the slices it needs
in the next iteration, and classifies them according
to the characteristics of each situation.

If, for instance, we are given a slice with
an abc tuple, and we find that this tuple
was assigned to our rank, then we know that
we have to create a SelfSufficient slice.

What we do is find a blank slice in our
SliceUnion's bucket of slices. This buffer is blank
and safe for us to do whatever we want with.
Without CUDA, we just need to point this
blank slice to the correct memory address
of the data that we (the SliceUnion) own.
This is therefore the line

  blank.data = sources[from.source].data();

Of course, doing this with CUDA messes everything up,
as it did until now, since we are pointing to a host
address. Sadly, the way the casting is currently implemented,
the type checker did not catch this one, and I foolishly
forgot about this important bit.

After the creation of the slice comes, at some point
in the life cycle, its destruction, which we also
have to handle separately.
This is done every iteration in the

    void clearUnusedSlicesForNext(ABCTuple const& abc);

function. There, the SelfSufficient slice
would normally just forget about the pointer it holds, slice.data,
since this pointer is part of the original data of the tensor
distributed in the SliceUnion. In the CUDA case, however,
we gave the SelfSufficient slice a freePointer from our
SliceUnion's bucket of freePointers allocated on the GPU
(of which we have around 6 per SliceUnion type).
This pointer needs to be marked free again, for use
by a slice in the future, so it has to go back to the bucket;
we can't afford to lose it.
2022-09-12 19:29:35 +02:00
5483325626 Fix AniaBug #1: cublasCreate after context setting 2022-09-12 19:17:52 +02:00
23ad87214f Uncomment everything in Equations (to test, see comments)
For testing, comment out everything
that has REORDER, MAYBE_CONJ and zeroing.
2022-09-12 19:13:45 +02:00
3d7702d501 Minimal changes in Equations 2022-09-12 19:10:14 +02:00
c20b9e3bcb Add error checking in Blas.cxx 2022-09-12 19:07:48 +02:00
da704ad820 Add convenience _FORMAT macro 2022-09-12 18:42:36 +02:00
f1b2f37fe2 Check CUresult for mpi_data to device 2022-09-12 18:41:37 +02:00
1cd7bac187 Add macros to check CUerror and cublasStatus_t 2022-09-12 18:36:01 +02:00
68892d5dd8 Make bootstrap work from anywhere in the project 2022-09-12 18:35:30 +02:00
4277c07cc2 Add memory consumption in bench 2022-09-08 15:44:29 +02:00
0558148937 Fix small syntactic bug 2022-09-08 15:36:10 +02:00
49ff3b377c Add a chrono for mpi memcpy in cuda 2022-09-08 15:27:51 +02:00
00a28c990c Indent more conventionally 2022-09-08 15:27:38 +02:00
2c5a4620ca Lint and tidy up Equations 2022-09-08 13:51:49 +02:00
368c5619cc Fix the autotools atrip_cublas 2022-09-08 13:50:08 +02:00
0b14ac7704 Add bootstrap script 2022-09-06 15:26:28 +02:00
76a785044d Check ngcards against ranks per node 2022-08-14 15:36:22 +02:00
7241bbe9fb Implement reordering on the GPU 2022-08-12 18:32:32 +02:00
c2e9e930ba Update main Atrip.cxx using several gpus 2022-08-12 18:30:55 +02:00
b4aef4db9e Fix compilation issues and add KernelSizes 2022-08-12 18:29:21 +02:00
4651231d3b Update test bench for CUDA 2022-08-12 18:28:20 +02:00
4101c89907 Improve cuda m4 2022-08-11 13:55:52 +02:00
f06cd7f562 Fix the request free problem 2022-08-08 18:28:51 +02:00
8c04280a65 Fix blas 2022-08-08 18:26:52 +02:00