Extend Readme

2023-10-05 11:19:46 +02:00 · 2023-10-05 11:19:46 +02:00 · 69350034eb
commit 69350034eb
parent 6bac8f5a22
1 changed files with 85 additions and 4 deletions
--- a/README.org
+++ b/README.org
@@ -8,7 +8,88 @@ multiplication on a single CPU core while using SYCL for both OpenMP and GPU
 parallelization. Subsequently, we will record and analyze the execution times.

 At this stage, the project showcases how to transfer and manipulate data on the
-GPU using the Unified Shared Memory (USM) model with explicit data movement.
-Unfortunately, I've encountered a hurdle as my current implementation with =hip=
-lacks a valid USM provider for my graphics card, the AMD Radeon RX 6700 XT,
-preventing me from achieving implicit data movement for demonstration 😔
+GPU using +the Unified Shared Memory (USM) model with explicit data movement+ an
+abstract view of host and device memory based on buffers and accessors. I will
+not attempt to implement these functions using Unified Shared Memory.
+
+For details on the implementation, on how specific functions are used, and on
+the reasoning behind certain design choices, please refer to the source code
+itself. Its comments generally explain what the code does and why.
+
+* Compilation
+
+Regrettably, integrating Intel's oneAPI with the AMD GPU plugin proves to be
+quite challenging on Arch Linux, primarily due to the plugin's dependency on an
+older version of ROCm than what's available in the official repositories. While
+I could have chosen to compile my own ROCm/hip version, I opted for a more
+convenient solution and turned to the [[https://github.com/AdaptiveCpp/AdaptiveCpp/tree/develop][AdaptiveCpp]] compiler, which offers both
+CPU and GPU acceleration through CUDA and ROCm support. You can find a version
+of AdaptiveCpp compatible with AMD GPUs on the AUR (Arch User Repository).
+
+If your goal is to run benchmarks on an AMD GPU with AdaptiveCpp, I
+recommend using [[https://github.com/sobc/pkgbuilds/tree/master/hipsycl-rocm-git][this]] specific PKGBUILD. Other versions that rely on ROCm might
+not build correctly at the moment. I've already raised an issue with the
+responsible maintainer of the PKGBUILDs to address this compatibility issue.
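+
+As a rough sketch, building and installing that PKGBUILD could look like the
+following. It assumes that =git= and the =base-devel= group are already
+installed, and the directory name is taken from the link above; adjust it if the
+repository layout has changed.
+
+#+BEGIN_SRC bash
+# Fetch the PKGBUILD collection linked above
+git clone https://github.com/sobc/pkgbuilds.git
+cd pkgbuilds/hipsycl-rocm-git
+
+# Build the package, pull in missing dependencies (-s) and install it (-i)
+makepkg -si
+#+END_SRC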
+
+Currently, the CMake setup only supports generating Makefiles for AdaptiveCpp.
+However, I intend to add CMake support for Intel's oneAPI as soon as I have a
+working version of that compiler.
+
+To generate Makefiles for AdaptiveCpp, you can follow these steps:
+
+#+BEGIN_SRC bash
+# Create a build directory and navigate to it
+mkdir build && cd build
+
+# Adjust the path to AdaptiveCpp and your target devices according to your system
+cmake .. -DAdaptiveCpp_DIR=/opt/AdaptiveCpp/ROCm/lib/cmake/AdaptiveCpp -DACPP_TARGETS="omp.accelerated;hip.integrated-multipass:gfx90c"
+#+END_SRC
+
+You can find more information about =ACPP_TARGETS= and the compilation process in
+the documentation [[https://github.com/AdaptiveCpp/AdaptiveCpp/blob/develop/doc/compilation.md][here]].
+
+Once your Makefiles are generated, you can build the project using the following
+command:
+
+#+BEGIN_SRC bash
+make -j$(nproc)
+#+END_SRC
+
+The compiled executable can be found in the =build/src= directory.
+
+* Data
+
+I provide 6 different matrices in 3 different sizes:
+
+- =sma*.txt= are matrices with the size of 16x16
+- =med*.txt= are matrices with the size of 2048x2048
+- =big*.txt= are matrices with the size of 8192x8192
+
+All matrices are stored in text files under =data=.
+
+*Warning*: If you're about to run the benchmark with the big matrices, please
+disable the single-core CPU benchmark, unless you want to sit and wait
+forever. Do this by calling cmake with =-DSEQ_BENCH=OFF= and recompiling the
+executable.
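+
+For example, assuming the project has already been configured in =build= as
+described under Compilation, the reconfiguration could be sketched like this.
+CMake keeps previously set cache variables, so only the changed option needs to
+be passed:
+
+#+BEGIN_SRC bash
+# Run from inside the existing build directory
+cmake .. -DSEQ_BENCH=OFF
+
+# Rebuild with the single-core benchmark disabled
+make -j$(nproc)
+#+END_SRC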
+
+Below you will find every combination of matrix multiplications and the
+corresponding checksum. Let me know if you encounter different checksums.
+
+| Matrix A   | Matrix B   | Checksum     |
+|------------+------------+--------------|
+| =sma1.txt= | =sma1.txt= | =0xe6134d8e= |
+| =sma2.txt= | =sma2.txt= | =0xf1ba0ac6= |
+| =sma1.txt= | =sma2.txt= | =0xe71fdf1e= |
+| =sma2.txt= | =sma1.txt= | =0x36b44d2c= |
+|------------+------------+--------------|
+| =med1.txt= | =med1.txt= | =0xd92eb6d6= |
+| =med2.txt= | =med2.txt= | =0x9f0e1206= |
+| =med1.txt= | =med2.txt= | =0x4cf45b91= |
+| =med2.txt= | =med1.txt= | =0xfdeb52bf= |
+|------------+------------+--------------|
+| =big1.txt= | =big1.txt= | =0xde9b4c0d= |
+| =big2.txt= | =big2.txt= | =0x5365fc1=  |
+| =big1.txt= | =big2.txt= | =0xb185e6c1= |
+| =big2.txt= | =big1.txt= | =0x59f5ffef= |