This change makes it possible to use different thread indices types
without changing the Fetch classes, decoupling those two areas.
1. This commit removes concrete specialization instantiations of
ThreadIndicesTypes in all of the Fetch's specializations.
2. It also moves the ThreadIndicesType template parameter from the Fetch
struct to a template parameter of its Load/Store methods.
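
The second point can be illustrated with a minimal, self-contained
sketch. It is not the actual VTK-m Fetch class: the two thread indices
structs, the single `PortalType` parameter, and the use of
`std::vector` as a stand-in portal are assumptions made only for this
example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Two hypothetical thread indices types (names are illustrative only).
struct ThreadIndicesBasic
{
  std::size_t Index;
  std::size_t GetInputIndex() const { return Index; }
  std::size_t GetOutputIndex() const { return Index; }
};

struct ThreadIndicesStrided
{
  std::size_t Index;
  std::size_t Stride;
  std::size_t GetInputIndex() const { return Index * Stride; }
  std::size_t GetOutputIndex() const { return Index; }
};

// The struct no longer names a thread indices type. Load and Store are
// member templates, so any type providing GetInputIndex/GetOutputIndex
// works without touching Fetch or its specializations.
template <typename PortalType>
struct Fetch
{
  using ValueType = typename PortalType::value_type;

  template <typename ThreadIndicesType>
  ValueType Load(const ThreadIndicesType& indices, const PortalType& portal) const
  {
    assert(indices.GetInputIndex() < portal.size());
    return portal[indices.GetInputIndex()];
  }

  template <typename ThreadIndicesType>
  void Store(const ThreadIndicesType& indices, PortalType& portal, const ValueType& value) const
  {
    assert(indices.GetOutputIndex() < portal.size());
    portal[indices.GetOutputIndex()] = value;
  }
};
```

With this layout, a single `Fetch<std::vector<int>>` object can be
called with either `ThreadIndicesBasic` or `ThreadIndicesStrided`.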
Signed-off-by: Vicente Adolfo Bolea Sanchez <vicente.bolea@kitware.com>
02ef5291f incorporate -fPIC flag in lodepng when building linux
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !2060
b380e702d only include the lodepng header when installing
c9cbd9693 fix type warnings
62fe68acd fixes to match old files
abf569288 Merge branch 'upstream-lodepng' into lodepng-in-lib
957568e36 lodepng 2020-04-16 (b51302e1)
e925d6d54 turn lodepng into a library
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !2034
This fixes the following tests, which were failing because one of the
Docker runners in the new CI sets `OMP_NUM_THREADS=3`:
1. `UnitTestOpenMPDeviceAdapter`
2. `UnitTestMeshQualityFilter`
The optimized reduction implementation for _OpenMP_ unrolls the reduce
loop so that each iteration processes four elements. The last iteration
of the loop could run past the loop's end element when that end element
is not a multiple of four.
This commit fixes this by setting the end element of the unrolled OpenMP
reduce loop to the closest multiple of four at or below the original end
element.
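
The idea can be sketched serially as below. This is not the VTK-m
implementation: `UnrolledSum` is a hypothetical helper, the OpenMP
pragmas are omitted so the example compiles without `-fopenmp`, and
finishing the leftover elements in a trailing loop is an assumption of
this sketch. The rounding is done relative to `begin`, so it also works
when the range does not start at zero.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: sums values[begin, end) with a loop unrolled by four.
inline long UnrolledSum(const std::vector<long>& values, std::size_t begin, std::size_t end)
{
  assert(begin <= end && end <= values.size());

  // Round the unrolled loop's end down so that it covers a multiple of
  // four elements. Without this, the last unrolled iteration could read
  // up to three elements past `end`.
  const std::size_t unrolledEnd = end - ((end - begin) % 4);

  long sum = 0;
  std::size_t i = begin;
  for (; i < unrolledEnd; i += 4)
  {
    sum += values[i] + values[i + 1] + values[i + 2] + values[i + 3];
  }

  // Handle the zero to three elements left over after the unrolled loop.
  for (; i < end; ++i)
  {
    sum += values[i];
  }
  return sum;
}
```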
Signed-off-by: Vicente Adolfo Bolea Sanchez <vicente.bolea@kitware.com>
- It also adds Google's benchmark compare.py script
  - It is installed to the build directory.
- It adds a wrapper script called compare-benchmarks.py which:
  - Lets you run each of the benchmarks with different devices
- It adds a README.md explaining how to run the benchmarks
- BenchmarkDeviceAdapter input size range parametrized at compile time
Signed-off-by: Vicente Adolfo Bolea Sanchez <vicente.bolea@kitware.com>
1bf808c47 Add OpenMP to our asan dashboard
07f37c814 Add lsan suppression file
b3924ef30 Add an asan to our gitlab ci suite
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Acked-by: Nick <nathompson7@protonmail.com>
Merge-request: !2051
If you gave ReduceByKey a fancy output array that decorated another
array, you could get a runtime error for using an invalid array (if the
device adapter used the generic algorithm). The problem was that
ReduceByKey creates a temporary array, and that array was given the same
storage as the output array. That might not be valid for fancy arrays,
so instead use the default storage for the temporary array.
A single gitlab-runner can run multiple test stages
concurrently on the same hardware. When 3 jobs each ask
for 100% of the CPUs, performance drops by roughly a
factor of three because of task switching.
5c16b3be2 ubuntu1604 builders now use the correct c && c++ compilers
3c80b35b8 ubuntu1604 test step needs to know where MPI install location is
889cb33dd gitlab-ci test jobs better handle false positive failures
b2823d79a ubuntu1604 gcc48 builder install test now pass
93cbea2d7 Update README with update CMake and compiler tested versions
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !2042
If you gave ScanInclusiveByKey a fancy output array that decorated
another array, you would get a runtime error for using an invalid array.
ScanInclusiveByKey creates a temporary output array and then copies the
result to the actual output array. The problem was that the temporary
output array was given the same storage as the output array, which won't
work if the output array is fancy. Instead, use the default storage for
the temporary array.
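
A rough, self-contained illustration of the pattern follows. It does not
use the VTK-m ArrayHandle API: the `Array` class, the `StorageBasic` and
`StorageDecorated` tags, and the plain `InclusiveScan` (without keys)
are all hypothetical stand-ins chosen to keep the sketch short.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct StorageBasic {};     // owns its own memory (the "default" storage)
struct StorageDecorated {}; // "fancy": writes through to another array

template <typename T, typename StorageTag>
class Array;

template <typename T>
class Array<T, StorageBasic>
{
public:
  void Allocate(std::size_t size) { this->Data.assign(size, T{}); }
  std::size_t Size() const { return this->Data.size(); }
  T Get(std::size_t index) const { return this->Data[index]; }
  void Set(std::size_t index, const T& value) { this->Data[index] = value; }

private:
  std::vector<T> Data;
};

// A decorated array is only usable once it wraps a target array, so a
// default-constructed instance is invalid.
template <typename T>
class Array<T, StorageDecorated>
{
public:
  Array() = default;
  explicit Array(Array<T, StorageBasic>* target) : Target(target) {}

  bool IsValid() const { return this->Target != nullptr; }
  void Allocate(std::size_t size)
  {
    assert(this->IsValid());
    this->Target->Allocate(size);
  }
  void Set(std::size_t index, const T& value)
  {
    assert(this->IsValid());
    this->Target->Set(index, value);
  }

private:
  Array<T, StorageBasic>* Target = nullptr;
};

template <typename T, typename OutputStorage>
void InclusiveScan(const std::vector<T>& input, Array<T, OutputStorage>& output)
{
  // Declaring the temporary as Array<T, OutputStorage> would create an
  // invalid array when OutputStorage is StorageDecorated. Using the
  // default storage avoids that.
  Array<T, StorageBasic> temp;
  temp.Allocate(input.size());

  T running{};
  for (std::size_t i = 0; i < input.size(); ++i)
  {
    running = running + input[i];
    temp.Set(i, running);
  }

  // Copy the result from the temporary to the actual output array.
  output.Allocate(temp.Size());
  for (std::size_t i = 0; i < temp.Size(); ++i)
  {
    output.Set(i, temp.Get(i));
  }
}
```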
The problem was that the build image used CMake 3.12 while the test
image used 3.13. The install test invocations aren't backwards
compatible, and therefore failed.