vtk-m2/vtkm/cont/cuda/internal/DeviceAdapterTimerImplementationCuda.h
Robert Maynard e28244f345 Re-implement DeviceAdapterRuntimeDetector to avoid ODR violations.
The previous implementation of DeviceAdapterRuntimeDetector caused
multiple differing definitions of the same class to exist and
was causing the runtime device tracker to report CUDA as disabled
when it actually was enabled.

The ODR was caused by having a default implementation for
DeviceAdapterRuntimeDetector and a specific specialization for
CUDA. If a library had both CUDA and C++ sources it would pick up
both implementations and would have undefined behavior. In general
it would think the CUDA backend was disabled.

To avoid this kind of situation in the future I have reworked VTK-m
so that each device adapter must implement DeviceAdapterRuntimeDetector
for that device.
2018-05-15 13:08:34 -04:00

70 lines
2.2 KiB
C++

//============================================================================
// Copyright (c) Kitware, Inc.
// All rights reserved.
// See LICENSE.txt for details.
// This software is distributed WITHOUT ANY WARRANTY; without even
// the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
// PURPOSE. See the above copyright notice for more information.
//
// Copyright 2014 National Technology & Engineering Solutions of Sandia, LLC (NTESS).
// Copyright 2014 UT-Battelle, LLC.
// Copyright 2014 Los Alamos National Security.
//
// Under the terms of Contract DE-NA0003525 with NTESS,
// the U.S. Government retains certain rights in this software.
//
// Under the terms of Contract DE-AC52-06NA25396 with Los Alamos National
// Laboratory (LANL), the U.S. Government retains certain rights in
// this software.
//============================================================================
#ifndef vtk_m_cont_cuda_internal_DeviceAdapterTimerImplementationCuda_h
#define vtk_m_cont_cuda_internal_DeviceAdapterTimerImplementationCuda_h
#include <vtkm/cont/vtkm_cont_export.h>
#include <vtkm/Types.h>
#include <vtkm/cont/DeviceAdapterAlgorithm.h>
#include <vtkm/cont/cuda/internal/DeviceAdapterTagCuda.h>
#include <cuda.h>
namespace vtkm
{
namespace cont
{
///
/// Specialization of DeviceAdapterTimerImplementation for CUDA
/// CUDA contains its own high resolution timer that are able
/// to track how long it takes to execute async kernels.
/// If we simply measured time on the CPU it would incorrectly
/// just capture how long it takes to launch a kernel.
template <>
class VTKM_CONT_EXPORT DeviceAdapterTimerImplementation<vtkm::cont::DeviceAdapterTagCuda>
{
public:
VTKM_CONT DeviceAdapterTimerImplementation();
VTKM_CONT ~DeviceAdapterTimerImplementation();
VTKM_CONT void Reset();
VTKM_CONT vtkm::Float64 GetElapsedTime();
private:
// Copying CUDA events is problematic.
DeviceAdapterTimerImplementation(
const DeviceAdapterTimerImplementation<vtkm::cont::DeviceAdapterTagCuda>&) = delete;
void operator=(const DeviceAdapterTimerImplementation<vtkm::cont::DeviceAdapterTagCuda>&) =
delete;
cudaEvent_t StartEvent;
cudaEvent_t EndEvent;
};
}
} // namespace vtkm::cont
#endif