diy: remove in order to reset the import

This commit is contained in:
Sujin Philip 2019-08-22 14:04:03 -04:00
parent 0586525b74
commit c0b497cc83
67 changed files with 0 additions and 15612 deletions

@@ -1,19 +0,0 @@
Copyright Notice
DIY2, Copyright (c) 2015, The Regents of the University of California, through
Lawrence Berkeley National Laboratory (subject to receipt of any required
approvals from the U.S. Dept. of Energy). All rights reserved.
If you have questions about your rights to use or distribute this software,
please contact Berkeley Lab's Technology Transfer Department at TTD@lbl.gov.
NOTICE. This software is owned by the U.S. Department of Energy. As such, the
U.S. Government has been granted for itself and others acting on its behalf a
paid-up, nonexclusive, irrevocable, worldwide license in the Software to
reproduce, prepare derivative works, and perform publicly and display publicly.
Beginning five (5) years after the date permission to assert copyright is
obtained from the U.S. Department of Energy, and subject to any subsequent five
(5) year renewals, the U.S. Government is granted for itself and others acting
on its behalf a paid-up, nonexclusive, irrevocable, worldwide license in the
Software to reproduce, prepare derivative works, distribute copies to the
public, perform publicly and display publicly, and to permit others to do so.

@@ -1,41 +0,0 @@
License Agreement
"DIY2, Copyright (c) 2015, The Regents of the University of California, through
Lawrence Berkeley National Laboratory (subject to receipt of any required
approvals from the U.S. Dept. of Energy). All rights reserved."
Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:
(1) Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
(2) Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation and/or
other materials provided with the distribution.
(3) Neither the name of the University of California, Lawrence Berkeley National
Laboratory, U.S. Dept. of Energy nor the names of its contributors may be used
to endorse or promote products derived from this software without specific prior
written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
You are under no obligation whatsoever to provide any bug fixes, patches, or
upgrades to the features, functionality or performance of the source code
("Enhancements") to anyone; however, if you choose to make your Enhancements
available either publicly, or directly to Lawrence Berkeley National Laboratory,
without imposing a separate written license agreement for such Enhancements,
then you hereby grant the following license: a non-exclusive, royalty-free
perpetual license to install, use, modify, prepare derivative works, incorporate
into other computer software, distribute, and sublicense such enhancements or
derivative works thereof, in binary and source code form.

@@ -1,85 +0,0 @@
## DIY is a block-parallel library
DIY is a block-parallel library for implementing scalable algorithms that can execute both
in-core and out-of-core. The same program can be executed with one or more threads per MPI
process, seamlessly combining distributed-memory message passing with shared-memory thread
parallelism. The abstraction enabling these capabilities is block parallelism; blocks
and their message queues are mapped onto processing elements (MPI processes or threads) and are
migrated between memory and storage by the DIY runtime. Complex communication patterns,
including neighbor exchange, merge reduction, swap reduction, and all-to-all exchange, are
possible in- and out-of-core in DIY.
## Licensing
DIY is released as open source software under a BSD-style [license](./LICENSE.txt).
## Dependencies
DIY requires an MPI installation. We recommend [MPICH](http://www.mpich.org/).
## Download, build, install
- You can clone this repository, or
- You can download the [latest tarball](https://github.com/diatomic/diy2/archive/master.tar.gz).
DIY is a header-only library. It does not need to be built; you can simply
include it in your project. The examples can be built using `cmake` from the
top-level directory.
## Documentation
[Doxygen pages](https://diatomic.github.io/diy)
## Example
A simple DIY program, shown below, consists of the following components:
- `struct`s called blocks,
- a diy object called the `master`,
- a set of callback functions performed on each block by `master.foreach()`,
- optionally, one or more message exchanges between the blocks by `master.exchange()`, and
- possibly other collectives and global reductions (not shown below).
The callback functions (`enqueue_local()` and `average()` in the example below) receive the block
pointer and a communication proxy for the message exchange between blocks. It is usual for the
callback functions to enqueue or dequeue messages from the proxy, so that information can be
received and sent during rounds of message exchange.
```cpp
// --- main program --- //

struct Block { float local, average; };             // define your block structure

Master master(world);                               // world = MPI_Comm
...                                                 // populate master with blocks
master.foreach(&enqueue_local);                     // call enqueue_local() for each block
master.exchange();                                  // exchange enqueued data between blocks
master.foreach(&average);                           // call average() for each block

// --- callback functions --- //

// enqueue block data prior to exchanging it
void enqueue_local(Block* b,                        // current block
                   const Proxy& cp)                 // communication proxy provides access to the neighbor blocks
{
    for (size_t i = 0; i < cp.link()->size(); i++)  // for all neighbor blocks
        cp.enqueue(cp.link()->target(i), b->local); // enqueue the data to be sent to this neighbor
                                                    // block in the next exchange
}

// use the received data after exchanging it, in this case compute its average
void average(Block* b,                              // current block
             const Proxy& cp)                       // communication proxy provides access to the neighbor blocks
{
    float x, average = 0;
    for (size_t i = 0; i < cp.link()->size(); i++)  // for all neighbor blocks
    {
        cp.dequeue(cp.link()->target(i).gid, x);    // dequeue the data received from this
                                                    // neighbor block in the last exchange
        average += x;
    }
    b->average = average / cp.link()->size();
}
```

@@ -1,191 +0,0 @@
#ifndef VTKMDIY_ALGORITHMS_HPP
#define VTKMDIY_ALGORITHMS_HPP
#include <vector>
#include "master.hpp"
#include "assigner.hpp"
#include "reduce.hpp"
#include "reduce-operations.hpp"
#include "partners/swap.hpp"
#include "detail/algorithms/sort.hpp"
#include "detail/algorithms/kdtree.hpp"
#include "detail/algorithms/kdtree-sampling.hpp"
#include "log.hpp"
namespace diy
{
/**
* \ingroup Algorithms
* \brief sample sort `values` of each block, store the boundaries between blocks in `samples`
*/
template<class Block, class T, class Cmp>
void sort(Master& master, //!< master object
const Assigner& assigner, //!< assigner object
std::vector<T> Block::* values, //!< all values to sort
std::vector<T> Block::* samples, //!< (output) boundaries of blocks
size_t num_samples, //!< desired number of samples
const Cmp& cmp, //!< comparison function
int k = 2, //!< k-ary reduction will be used
bool samples_only = false) //!< false: results will be all_to_all exchanged; true: only sort but don't exchange results
{
bool immediate = master.immediate();
master.set_immediate(false);
// NB: although sorter will go out of scope, its member functions sample()
// and exchange() will return functors whose copies get saved inside reduce
detail::SampleSort<Block,T,Cmp> sorter(values, samples, cmp, num_samples);
// swap-reduce to all-gather samples
RegularDecomposer<DiscreteBounds> decomposer(1, interval(0,assigner.nblocks()), assigner.nblocks());
RegularSwapPartners partners(decomposer, k);
reduce(master, assigner, partners, sorter.sample(), detail::SkipIntermediate(partners.rounds()));
// all_to_all to exchange the values
if (!samples_only)
all_to_all(master, assigner, sorter.exchange(), k);
master.set_immediate(immediate);
}
/**
* \ingroup Algorithms
* \brief sample sort `values` of each block, store the boundaries between blocks in `samples`
* a shorter version of the sort above, using the default less-than comparator for T,
* with the all_to_all exchange included
*/
template<class Block, class T>
void sort(Master& master, //!< master object
const Assigner& assigner, //!< assigner object
std::vector<T> Block::* values, //!< all values to sort
std::vector<T> Block::* samples, //!< (output) boundaries of blocks
size_t num_samples, //!< desired number of samples
int k = 2) //!< k-ary reduction will be used
{
sort(master, assigner, values, samples, num_samples, std::less<T>(), k);
}
/**
* \ingroup Algorithms
* \brief build a kd-tree and sort a set of points into it (use histograms to determine split values)
*/
template<class Block, class Point>
void kdtree(Master& master, //!< master object
const Assigner& assigner, //!< assigner object
int dim, //!< dimensionality
const ContinuousBounds& domain, //!< global data extents
std::vector<Point> Block::* points, //!< input points to sort into kd-tree
size_t bins, //!< number of histogram bins for splitting a dimension
bool wrap = false) //!< periodic boundaries in all dimensions
{
if (assigner.nblocks() & (assigner.nblocks() - 1))
throw std::runtime_error(fmt::format("KD-tree requires a number of blocks that's a power of 2, got {}", assigner.nblocks()));
typedef diy::RegularContinuousLink RCLink;
for (size_t i = 0; i < master.size(); ++i)
{
RCLink* link = static_cast<RCLink*>(master.link(i));
*link = RCLink(dim, domain, domain);
if (wrap) // set up the links to self
{
diy::BlockID self = { master.gid(i), master.communicator().rank() };
for (int j = 0; j < dim; ++j)
{
diy::Direction dir, wrap_dir;
// left
dir[j] = -1; wrap_dir[j] = -1;
link->add_neighbor(self);
link->add_bounds(domain);
link->add_direction(dir);
link->add_wrap(wrap_dir);
// right
dir[j] = 1; wrap_dir[j] = 1;
link->add_neighbor(self);
link->add_bounds(domain);
link->add_direction(dir);
link->add_wrap(wrap_dir);
}
}
}
detail::KDTreePartition<Block,Point> kdtree_partition(dim, points, bins);
detail::KDTreePartners partners(dim, assigner.nblocks(), wrap, domain);
reduce(master, assigner, partners, kdtree_partition);
// update master.expected to match the links
int expected = 0;
for (size_t i = 0; i < master.size(); ++i)
expected += master.link(i)->size_unique();
master.set_expected(expected);
}
/**
* \ingroup Algorithms
* \brief build a kd-tree and sort a set of points into it (use sampling to determine split values)
*/
template<class Block, class Point>
void kdtree_sampling
(Master& master, //!< master object
const Assigner& assigner, //!< assigner object
int dim, //!< dimensionality
const ContinuousBounds& domain, //!< global data extents
std::vector<Point> Block::* points, //!< input points to sort into kd-tree
size_t samples, //!< number of samples to take in each block
bool wrap = false) //!< periodic boundaries in all dimensions
{
if (assigner.nblocks() & (assigner.nblocks() - 1))
throw std::runtime_error(fmt::format("KD-tree requires a number of blocks that's a power of 2, got {}", assigner.nblocks()));
typedef diy::RegularContinuousLink RCLink;
for (size_t i = 0; i < master.size(); ++i)
{
RCLink* link = static_cast<RCLink*>(master.link(i));
*link = RCLink(dim, domain, domain);
if (wrap) // set up the links to self
{
diy::BlockID self = { master.gid(i), master.communicator().rank() };
for (int j = 0; j < dim; ++j)
{
diy::Direction dir, wrap_dir;
// left
dir[j] = -1; wrap_dir[j] = -1;
link->add_neighbor(self);
link->add_bounds(domain);
link->add_direction(dir);
link->add_wrap(wrap_dir);
// right
dir[j] = 1; wrap_dir[j] = 1;
link->add_neighbor(self);
link->add_bounds(domain);
link->add_direction(dir);
link->add_wrap(wrap_dir);
}
}
}
detail::KDTreeSamplingPartition<Block,Point> kdtree_partition(dim, points, samples);
detail::KDTreePartners partners(dim, assigner.nblocks(), wrap, domain);
reduce(master, assigner, partners, kdtree_partition);
// update master.expected to match the links
int expected = 0;
for (size_t i = 0; i < master.size(); ++i)
expected += master.link(i)->size_unique();
master.set_expected(expected);
}
}
#endif
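
For orientation, here is a minimal sketch of how the `sort` algorithm above might be driven. `Block`, its `values`/`samples` members, the one-block-per-rank setup, the sample count, and the include paths are illustrative assumptions, not part of this header.

```cpp
// Hypothetical driver for diy::sort (names and paths are illustrative).
#include <vector>
#include <diy/master.hpp>
#include <diy/assigner.hpp>
#include <diy/algorithms.hpp>

struct Block
{
    std::vector<float> values;   // local values to sort across all blocks
    std::vector<float> samples;  // filled by diy::sort with block boundaries
};

int main(int argc, char** argv)
{
    diy::mpi::environment  env(argc, argv);   // RAII MPI_Init/MPI_Finalize
    diy::mpi::communicator world;             // MPI_COMM_WORLD by default

    int nblocks = world.size();               // one block per rank, for simplicity
    diy::Master             master(world);
    diy::ContiguousAssigner assigner(world.size(), nblocks);
    // ... add blocks with their local values to master ...

    // sample sort with 64 samples per block and the default less-than
    // comparator; afterwards the values are globally partitioned across
    // blocks and each block's portion is sorted
    diy::sort(master, assigner, &Block::values, &Block::samples, 64);
}
```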

@@ -1,264 +0,0 @@
#ifndef VTKMDIY_ASSIGNER_HPP
#define VTKMDIY_ASSIGNER_HPP
#include <vector>
#include <tuple>
#include "mpi.hpp" // needed for DynamicAssigner
namespace diy
{
// Derived types should define
// int rank(int gid) const
// that converts a global block id to a rank that it's assigned to.
class Assigner
{
public:
/**
* \ingroup Assignment
* \brief Manages how blocks are assigned to processes
*/
Assigner(int size__, //!< total number of processes
int nblocks__ //!< total (global) number of blocks
):
size_(size__), nblocks_(nblocks__) {}
//! returns the total number of process ranks
int size() const { return size_; }
//! returns the total number of global blocks
int nblocks() const { return nblocks_; }
//! sets the total number of global blocks
virtual void set_nblocks(int nblocks__) { nblocks_ = nblocks__; }
//! returns the process rank of the block with global id gid (need not be local)
virtual int rank(int gid) const =0;
//! batch lookup; returns the process ranks of the blocks with global id in the vector gids
inline
virtual std::vector<int>
ranks(const std::vector<int>& gids) const;
private:
int size_; // total number of ranks
int nblocks_; // total number of blocks
};
class StaticAssigner: public Assigner
{
public:
/**
* \ingroup Assignment
* \brief Intermediate type to express assignment that cannot change; adds `local_gids` query method
*/
using Assigner::Assigner;
//! gets the local gids for a given process rank
virtual void local_gids(int rank, std::vector<int>& gids) const =0;
};
class ContiguousAssigner: public StaticAssigner
{
public:
/**
* \ingroup Assignment
* \brief Assigns blocks to processes in contiguous gid (block global id) order
*/
using StaticAssigner::StaticAssigner;
using StaticAssigner::size;
using StaticAssigner::nblocks;
int rank(int gid) const override
{
int div = nblocks() / size();
int mod = nblocks() % size();
int r = gid / (div + 1);
if (r < mod)
{
return r;
} else
{
return mod + (gid - (div + 1)*mod)/div;
}
}
inline
void local_gids(int rank, std::vector<int>& gids) const override;
};
class RoundRobinAssigner: public StaticAssigner
{
public:
/**
* \ingroup Assignment
* \brief Assigns blocks to processes in cyclic or round-robin gid (block global id) order
*/
using StaticAssigner::StaticAssigner;
using StaticAssigner::size;
using StaticAssigner::nblocks;
int rank(int gid) const override { return gid % size(); }
inline
void local_gids(int rank, std::vector<int>& gids) const override;
};
class DynamicAssigner: public Assigner
{
public:
DynamicAssigner(const mpi::communicator& comm, int size__, int nblocks__):
Assigner(size__, nblocks__),
comm_(comm),
div_(nblocks__ / size__ + ((nblocks__ % size__) == 0 ? 0 : 1)), // NB: same size window everywhere means the last rank may allocate extra space
rank_map_(comm_, div_) { rank_map_.lock_all(MPI_MODE_NOCHECK); }
~DynamicAssigner() { rank_map_.unlock_all(); }
inline
virtual void set_nblocks(int nblocks__) override;
inline
virtual int rank(int gid) const override;
inline
virtual std::vector<int>
ranks(const std::vector<int>& gids) const override;
inline std::tuple<bool,int>
get_rank(int& rk, int gid) const;
inline void set_rank(const int& rk, int gid, bool flush = true);
inline void set_ranks(const std::vector<std::tuple<int,int>>& rank_gids);
std::tuple<int,int>
rank_offset(int gid) const { return std::make_tuple(gid / div_, gid % div_); }
private:
mpi::communicator comm_;
int div_;
mutable mpi::window<int> rank_map_;
};
}
std::vector<int>
diy::Assigner::
ranks(const std::vector<int>& gids) const
{
std::vector<int> result(gids.size());
for (size_t i = 0; i < gids.size(); ++i)
result[i] = rank(gids[i]);
return result;
}
void
diy::ContiguousAssigner::
local_gids(int rank_, std::vector<int>& gids) const
{
int div = nblocks() / size();
int mod = nblocks() % size();
int from, to;
if (rank_ < mod)
from = rank_ * (div + 1);
else
from = mod * (div + 1) + (rank_ - mod) * div;
if (rank_ + 1 < mod)
to = (rank_ + 1) * (div + 1);
else
to = mod * (div + 1) + (rank_ + 1 - mod) * div;
for (int gid = from; gid < to; ++gid)
gids.push_back(gid);
}
void
diy::RoundRobinAssigner::
local_gids(int rank_, std::vector<int>& gids) const
{
int cur = rank_;
while (cur < nblocks())
{
gids.push_back(cur);
cur += size();
}
}
void
diy::DynamicAssigner::
set_nblocks(int nblocks__)
{
Assigner::set_nblocks(nblocks__);
div_ = nblocks() / size() + ((nblocks() % size()) == 0 ? 0 : 1);
rank_map_.unlock_all();
rank_map_ = mpi::window<int>(comm_, div_);
rank_map_.lock_all(MPI_MODE_NOCHECK);
}
std::tuple<bool,int>
diy::DynamicAssigner::
get_rank(int& rk, int gid) const
{
// TODO: check if the gid is in cache
int r,offset;
std::tie(r,offset) = rank_offset(gid);
rank_map_.get(rk, r, offset);
return std::make_tuple(false, r); // false indicates that the data wasn't read from cache
}
int
diy::DynamicAssigner::
rank(int gid) const
{
int rk;
auto cached_gidrk = get_rank(rk, gid);
int gidrk = std::get<1>(cached_gidrk);
rank_map_.flush_local(gidrk);
return rk;
}
std::vector<int>
diy::DynamicAssigner::
ranks(const std::vector<int>& gids) const
{
bool all_cached = true;
std::vector<int> result(gids.size(), -1);
for (size_t i = 0; i < gids.size(); ++i)
{
auto cached_gidrk = get_rank(result[i], gids[i]);
bool cached = std::get<0>(cached_gidrk);
all_cached &= cached;
}
if (!all_cached)
rank_map_.flush_local_all();
return result;
}
void
diy::DynamicAssigner::
set_rank(const int& rk, int gid, bool flush)
{
// TODO: update cache
int r,offset;
std::tie(r,offset) = rank_offset(gid);
rank_map_.put(rk, r, offset);
if (flush)
rank_map_.flush(r);
}
void
diy::DynamicAssigner::
set_ranks(const std::vector<std::tuple<int,int>>& rank_gids)
{
for (auto& rg : rank_gids)
set_rank(std::get<0>(rg), std::get<1>(rg), false);
rank_map_.flush_all();
}
#endif
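
A small standalone sketch (not part of the header) contrasting the two static assignment policies above, assuming 4 ranks and 10 blocks; the include path is an assumption.

```cpp
// Illustrative: compare ContiguousAssigner and RoundRobinAssigner.
#include <iostream>
#include <vector>
#include <diy/assigner.hpp>   // include path is an assumption

int main()
{
    int size = 4, nblocks = 10;
    diy::ContiguousAssigner contiguous(size, nblocks);
    diy::RoundRobinAssigner round_robin(size, nblocks);

    // contiguous:  gids 0-2 -> rank 0, 3-5 -> rank 1, 6-7 -> rank 2, 8-9 -> rank 3
    // round robin: gid % size, so gids 0,4,8 -> rank 0, gids 1,5,9 -> rank 1, ...
    for (int gid = 0; gid < nblocks; ++gid)
        std::cout << gid << " -> " << contiguous.rank(gid)
                  << " / " << round_robin.rank(gid) << '\n';

    std::vector<int> gids;
    contiguous.local_gids(0, gids);   // rank 0 owns gids {0, 1, 2}
}
```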

@@ -1,121 +0,0 @@
#ifndef VTKMDIY_COLLECTION_HPP
#define VTKMDIY_COLLECTION_HPP
#include <vector>
#include "serialization.hpp"
#include "storage.hpp"
#include "thread.hpp"
namespace diy
{
class Collection
{
public:
typedef void* Element;
typedef std::vector<Element> Elements;
typedef critical_resource<int, recursive_mutex> CInt;
typedef void* (*Create)();
typedef void (*Destroy)(void*);
typedef detail::Save Save;
typedef detail::Load Load;
public:
Collection(Create create__,
Destroy destroy__,
ExternalStorage* storage__,
Save save__,
Load load__):
create_(create__),
destroy_(destroy__),
storage_(storage__),
save_(save__),
load_(load__),
in_memory_(0) {}
size_t size() const { return elements_.size(); }
const CInt& in_memory() const { return in_memory_; }
inline void clear();
int add(Element e) { elements_.push_back(e); external_.push_back(-1); ++(*in_memory_.access()); return static_cast<int>(elements_.size()) - 1; }
void* release(int i) { void* e = get(i); elements_[static_cast<size_t>(i)] = 0; return e; }
void* find(int i) const { return elements_[static_cast<size_t>(i)]; } // possibly returns 0, if the element is unloaded
void* get(int i) { if (!find(i)) load(i); return find(i); } // loads the element first, and then returns its address
int available() const { int i = 0; for (; i < (int)size(); ++i) if (find(i) != 0) break; return i; }
inline void load(int i);
inline void unload(int i);
Create creator() const { return create_; }
Destroy destroyer() const { return destroy_; }
Load loader() const { return load_; }
Save saver() const { return save_; }
void* create() const { return create_(); }
void destroy(int i) { if (find(i)) { destroy_(find(i)); elements_[static_cast<size_t>(i)] = 0; } else if (external_[static_cast<size_t>(i)] != -1) storage_->destroy(external_[static_cast<size_t>(i)]); }
bool own() const { return destroy_ != 0; }
ExternalStorage* storage() const { return storage_; }
private:
Create create_;
Destroy destroy_;
ExternalStorage* storage_;
Save save_;
Load load_;
Elements elements_;
std::vector<int> external_;
CInt in_memory_;
};
}
void
diy::Collection::
clear()
{
if (own())
for (size_t i = 0; i < size(); ++i)
destroy(static_cast<int>(i));
elements_.clear();
external_.clear();
*in_memory_.access() = 0;
}
void
diy::Collection::
unload(int i)
{
//BinaryBuffer bb;
void* e = find(i);
//save_(e, bb);
//external_[i] = storage_->put(bb);
external_[static_cast<size_t>(i)] = storage_->put(e, save_);
destroy_(e);
elements_[static_cast<size_t>(i)] = 0;
--(*in_memory_.access());
}
void
diy::Collection::
load(int i)
{
//BinaryBuffer bb;
//storage_->get(external_[i], bb);
void* e = create_();
//load_(e, bb);
storage_->get(external_[static_cast<size_t>(i)], e, load_);
elements_[static_cast<size_t>(i)] = e;
external_[static_cast<size_t>(i)] = -1;
++(*in_memory_.access());
}
#endif
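
To make the load/unload cycle above concrete, a hedged sketch follows; the `Save`/`Load` signatures and the `FileStorage` usage are assumptions based on diy's serialization and storage interfaces, which are not shown in this hunk.

```cpp
// Illustrative sketch of Collection's in-memory/out-of-core cycle.
#include <vector>
#include <diy/collection.hpp>      // include paths are assumptions
#include <diy/storage.hpp>
#include <diy/serialization.hpp>

struct Block { std::vector<float> data; };

void* create_block()         { return new Block; }
void  destroy_block(void* b) { delete static_cast<Block*>(b); }
void  save_block(const void* b, diy::BinaryBuffer& bb)   // assumed Save signature
    { diy::save(bb, static_cast<const Block*>(b)->data); }
void  load_block(void* b, diy::BinaryBuffer& bb)         // assumed Load signature
    { diy::load(bb, static_cast<Block*>(b)->data); }

int main()
{
    diy::FileStorage storage("./DIY.XXXXXX");          // backing store for unloaded elements
    diy::Collection  collection(&create_block, &destroy_block,
                                &storage, &save_block, &load_block);

    int i = collection.add(create_block());            // element starts in memory
    collection.unload(i);                              // serialized to storage and destroyed
    Block* b = static_cast<Block*>(collection.get(i)); // recreated and loaded on demand
    (void) b;
}
```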

@@ -1,13 +0,0 @@
#ifndef VTKMDIY_COMMUNICATOR_HPP
#define VTKMDIY_COMMUNICATOR_HPP
#warning "diy::Communicator (in diy/communicator.hpp) is deprecated, use diy::mpi::communicator directly"
#include "mpi.hpp"
namespace diy
{
typedef mpi::communicator Communicator;
}
#endif

@@ -1,26 +0,0 @@
#ifndef VTKMDIY_CONSTANTS_H
#define VTKMDIY_CONSTANTS_H
// Default DIY_MAX_DIM to 4, unless provided by the user
// (used for static min/max size in various Bounds)
#ifndef DIY_MAX_DIM
#define DIY_MAX_DIM 4
#endif
enum
{
DIY_X0 = 0x01, /* minimum-side x (left) neighbor */
DIY_X1 = 0x02, /* maximum-side x (right) neighbor */
DIY_Y0 = 0x04, /* minimum-side y (bottom) neighbor */
DIY_Y1 = 0x08, /* maximum-side y (top) neighbor */
DIY_Z0 = 0x10, /* minimum-side z (back) neighbor */
DIY_Z1 = 0x20, /* maximum-side z (front) neighbor */
DIY_T0 = 0x40, /* minimum-side t (earlier) neighbor */
DIY_T1 = 0x80 /* maximum-side t (later) neighbor */
};
#ifndef DIY_UNUSED
#define DIY_UNUSED(expr) do { (void)(expr); } while (0)
#endif
#endif
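
The `DIY_*` values above are bit masks, so individual neighbor directions can be combined and tested with bitwise operators; a tiny illustration (not from the header):

```cpp
#include <cstdio>
// assumes constants.h above has been included for the DIY_* flags

int main()
{
    int face_nbrs = DIY_X0 | DIY_Y1;                            // left and top neighbors
    std::printf("has left? %d\n", (face_nbrs & DIY_X0) != 0);   // prints 1
    std::printf("has back? %d\n", (face_nbrs & DIY_Z0) != 0);   // prints 0
}
```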

@@ -1,47 +0,0 @@
#ifndef VTKMDIY_CRITICAL_RESOURCE_HPP
#define VTKMDIY_CRITICAL_RESOURCE_HPP
namespace diy
{
template<class T, class Mutex>
class resource_accessor
{
public:
resource_accessor(T& x, Mutex& m):
x_(x), lock_(m) {}
resource_accessor(resource_accessor&&) = default;
resource_accessor(const resource_accessor&) = delete;
T& operator*() { return x_; }
T* operator->() { return &x_; }
const T& operator*() const { return x_; }
const T* operator->() const { return &x_; }
private:
T& x_;
lock_guard<Mutex> lock_;
};
template<class T, class Mutex = fast_mutex>
class critical_resource
{
public:
typedef resource_accessor<T, Mutex> accessor;
typedef resource_accessor<const T, Mutex> const_accessor; // eventually, try shared locking
public:
critical_resource() {}
critical_resource(const T& x):
x_(x) {}
accessor access() { return accessor(x_, m_); }
const_accessor const_access() const { return const_accessor(x_, m_); }
private:
T x_;
mutable Mutex m_;
};
}
#endif
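
A brief sketch (not part of the header) of how `critical_resource` is meant to be used: `access()` locks the mutex and returns an accessor that holds the lock for its lifetime, so the protected object can only be reached under the lock.

```cpp
// Illustrative: critical_resource serializes concurrent access to shared
// state; assumes critical-resource.hpp above and diy's thread.hpp are included.
#include <vector>

diy::critical_resource<std::vector<int>> shared;   // default fast_mutex

void worker(int value)
{
    // the accessor returned by access() holds the lock for the duration
    // of this full expression, making the push_back thread-safe
    shared.access()->push_back(value);
}
```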

@@ -1,716 +0,0 @@
#ifndef VTKMDIY_DECOMPOSITION_HPP
#define VTKMDIY_DECOMPOSITION_HPP
#include <vector>
#include <algorithm>
#include <iostream>
#include <cmath>
#include <sstream>
#include <stdexcept>
#include "link.hpp"
#include "assigner.hpp"
#include "master.hpp"
namespace diy
{
namespace detail
{
template<class Bounds_, class Enable = void>
struct BoundsHelper;
// discrete bounds
template<class Bounds>
struct BoundsHelper<Bounds, typename std::enable_if<std::is_integral<typename Bounds::Coordinate>::value>::type>
{
using Coordinate = typename Bounds::Coordinate;
static Coordinate from(int i, int n, Coordinate min, Coordinate max, bool) { return min + (max - min + 1)/n * i; }
static Coordinate to (int i, int n, Coordinate min, Coordinate max, bool shared_face)
{
if (i == n - 1)
return max;
else
return from(i+1, n, min, max, shared_face) - (shared_face ? 0 : 1);
}
static int lower(Coordinate x, int n, Coordinate min, Coordinate max, bool shared)
{
Coordinate width = (max - min + 1)/n;
Coordinate res = (x - min)/width;
if (res >= n) res = n - 1;
if (shared && x == from(res, n, min, max, shared))
--res;
return res;
}
static int upper(Coordinate x, int n, Coordinate min, Coordinate max, bool shared)
{
Coordinate width = (max - min + 1)/n;
Coordinate res = (x - min)/width + 1;
if (shared && x == from(res, n, min, max, shared))
++res;
return res;
}
};
// continuous bounds
template<class Bounds>
struct BoundsHelper<Bounds, typename std::enable_if<std::is_floating_point<typename Bounds::Coordinate>::value>::type>
{
using Coordinate = typename Bounds::Coordinate;
static Coordinate from(int i, int n, Coordinate min, Coordinate max, bool) { return min + (max - min)/n * i; }
static Coordinate to (int i, int n, Coordinate min, Coordinate max, bool) { return min + (max - min)/n * (i+1); }
static int lower(Coordinate x, int n, Coordinate min, Coordinate max, bool) { Coordinate width = (max - min)/n; Coordinate res = std::floor((x - min)/width); if (min + res*width == x) return (res - 1); else return res; }
static int upper(Coordinate x, int n, Coordinate min, Coordinate max, bool) { Coordinate width = (max - min)/n; Coordinate res = std::ceil ((x - min)/width); if (min + res*width == x) return (res + 1); else return res; }
};
}
//! \ingroup Decomposition
//! Decomposes a regular (discrete or continuous) domain into even blocks;
//! creates Links with Bounds along the way.
template<class Bounds_>
struct RegularDecomposer
{
typedef Bounds_ Bounds;
typedef typename BoundsValue<Bounds>::type Coordinate;
typedef typename RegularLinkSelector<Bounds>::type Link;
using Creator = std::function<void(int, Bounds, Bounds, Bounds, Link)>;
using Updater = std::function<void(int, int, Bounds, Bounds, Bounds, Link)>;
typedef std::vector<bool> BoolVector;
typedef std::vector<Coordinate> CoordinateVector;
typedef std::vector<int> DivisionsVector;
/// @param dim: dimensionality of the decomposition
/// @param domain: bounds of global domain
/// @param nblocks: total number of global blocks
/// @param share_face: indicates dimensions on which to share block faces
/// @param wrap: indicates dimensions on which to wrap the boundary
/// @param ghosts: indicates how many ghosts to use in each dimension
/// @param divisions: indicates how many cuts to make along each dimension
/// (0 means "no constraint," i.e., leave it up to the algorithm)
RegularDecomposer(int dim_,
const Bounds& domain_,
int nblocks_,
BoolVector share_face_ = BoolVector(),
BoolVector wrap_ = BoolVector(),
CoordinateVector ghosts_ = CoordinateVector(),
DivisionsVector divisions_ = DivisionsVector()):
dim(dim_), domain(domain_), nblocks(nblocks_),
share_face(share_face_),
wrap(wrap_), ghosts(ghosts_), divisions(divisions_)
{
if ((int) share_face.size() < dim) share_face.resize(dim);
if ((int) wrap.size() < dim) wrap.resize(dim);
if ((int) ghosts.size() < dim) ghosts.resize(dim);
if ((int) divisions.size() < dim) divisions.resize(dim);
fill_divisions(divisions);
}
// Calls create(int gid, const Bounds& bounds, const Link& link)
void decompose(int rank, const StaticAssigner& assigner, const Creator& create);
void decompose(int rank, const StaticAssigner& assigner, Master& master, const Updater& update);
void decompose(int rank, const StaticAssigner& assigner, Master& master);
// find lowest gid that owns a particular point
template<class Point>
int lowest_gid(const Point& p) const;
void gid_to_coords(int gid, DivisionsVector& coords) const { gid_to_coords(gid, coords, divisions); }
int coords_to_gid(const DivisionsVector& coords) const { return coords_to_gid(coords, divisions); }
void fill_divisions(std::vector<int>& divisions) const;
void fill_bounds(Bounds& bounds, const DivisionsVector& coords, bool add_ghosts = false) const;
void fill_bounds(Bounds& bounds, int gid, bool add_ghosts = false) const;
static bool all(const std::vector<int>& v, int x);
static void gid_to_coords(int gid, DivisionsVector& coords, const DivisionsVector& divisions);
static int coords_to_gid(const DivisionsVector& coords, const DivisionsVector& divisions);
static void factor(std::vector<unsigned>& factors, int n);
// Point to GIDs functions
template<class Point>
void point_to_gids(std::vector<int>& gids, const Point& p) const;
//! returns gid of a block that contains the point; ignores ghosts
template<class Point>
int point_to_gid(const Point& p) const;
template<class Point>
int num_gids(const Point& p) const;
template<class Point>
void top_bottom(int& top, int& bottom, const Point& p, int axis) const;
int dim;
Bounds domain;
int nblocks;
BoolVector share_face;
BoolVector wrap;
CoordinateVector ghosts;
DivisionsVector divisions;
};
/**
* \ingroup Decomposition
* \brief Decomposes the domain into a prescribed pattern of blocks.
*
* @param dim dimension of the domain
* @param rank local rank
* @param assigner decides how processors are assigned to blocks (maps a gid to a rank)
* also communicates the total number of blocks
* @param create the callback functor
* @param wrap indicates dimensions on which to wrap the boundary
* @param ghosts indicates how many ghosts to use in each dimension
* @param divs indicates how many cuts to make along each dimension
* (0 means "no constraint," i.e., leave it up to the algorithm)
*
* `create(...)` is called with each block assigned to the local domain. See [decomposition example](#decomposition-example).
*/
template<class Bounds>
void decompose(int dim,
int rank,
const Bounds& domain,
const StaticAssigner& assigner,
const typename RegularDecomposer<Bounds>::Creator& create,
typename RegularDecomposer<Bounds>::BoolVector share_face = typename RegularDecomposer<Bounds>::BoolVector(),
typename RegularDecomposer<Bounds>::BoolVector wrap = typename RegularDecomposer<Bounds>::BoolVector(),
typename RegularDecomposer<Bounds>::CoordinateVector ghosts = typename RegularDecomposer<Bounds>::CoordinateVector(),
typename RegularDecomposer<Bounds>::DivisionsVector divs = typename RegularDecomposer<Bounds>::DivisionsVector())
{
RegularDecomposer<Bounds>(dim, domain, assigner.nblocks(), share_face, wrap, ghosts, divs).decompose(rank, assigner, create);
}
/**
* \ingroup Decomposition
* \brief Decomposes the domain into a prescribed pattern of blocks.
*
* @param dim dimension of the domain
* @param rank local rank
* @param assigner decides how processors are assigned to blocks (maps a gid to a rank)
* also communicates the total number of blocks
* @param master gets the blocks once this function returns
* @param wrap indicates dimensions on which to wrap the boundary
* @param ghosts indicates how many ghosts to use in each dimension
* @param divs indicates how many cuts to make along each dimension
* (0 means "no constraint," i.e., leave it up to the algorithm)
*
* `master` must have been supplied a create function in order for this function to work.
*/
template<class Bounds>
void decompose(int dim,
int rank,
const Bounds& domain,
const StaticAssigner& assigner,
Master& master,
typename RegularDecomposer<Bounds>::BoolVector share_face = typename RegularDecomposer<Bounds>::BoolVector(),
typename RegularDecomposer<Bounds>::BoolVector wrap = typename RegularDecomposer<Bounds>::BoolVector(),
typename RegularDecomposer<Bounds>::CoordinateVector ghosts = typename RegularDecomposer<Bounds>::CoordinateVector(),
typename RegularDecomposer<Bounds>::DivisionsVector divs = typename RegularDecomposer<Bounds>::DivisionsVector())
{
RegularDecomposer<Bounds>(dim, domain, assigner.nblocks(), share_face, wrap, ghosts, divs).decompose(rank, assigner, master);
}
/**
* \ingroup Decomposition
* \brief A "null" decompositon that simply creates the blocks and adds them to the master
*
* @param rank local rank
* @param assigner decides how processors are assigned to blocks (maps a gid to a rank)
* also communicates the total number of blocks
* @param master gets the blocks once this function returns
*/
inline
void decompose(int rank,
const StaticAssigner& assigner,
Master& master)
{
std::vector<int> local_gids;
assigner.local_gids(rank, local_gids);
for (size_t i = 0; i < local_gids.size(); ++i)
master.add(local_gids[i], master.create(), new diy::Link);
}
/**
* \ingroup Decomposition
* \brief Add a decomposition (modify links) of an existing set of blocks that were
* added to the master previously
*
* @param rank local rank
* @param assigner decides how processors are assigned to blocks (maps a gid to a rank)
* also communicates the total number of blocks
*/
template<class Bounds>
void decompose(int dim,
int rank,
const Bounds& domain,
const StaticAssigner& assigner,
Master& master,
const typename RegularDecomposer<Bounds>::Updater& update,
typename RegularDecomposer<Bounds>::BoolVector share_face =
typename RegularDecomposer<Bounds>::BoolVector(),
typename RegularDecomposer<Bounds>::BoolVector wrap =
typename RegularDecomposer<Bounds>::BoolVector(),
typename RegularDecomposer<Bounds>::CoordinateVector ghosts =
typename RegularDecomposer<Bounds>::CoordinateVector(),
typename RegularDecomposer<Bounds>::DivisionsVector divs =
typename RegularDecomposer<Bounds>::DivisionsVector())
{
RegularDecomposer<Bounds>(dim, domain, assigner.nblocks(), share_face, wrap, ghosts, divs).
decompose(rank, assigner, master, update);
}
//! Decomposition example: \example decomposition/test-decomposition.cpp
//! Direct master insertion example: \example decomposition/test-direct-master.cpp
}
// decomposes domain and adds blocks to the master
template<class Bounds>
void
diy::RegularDecomposer<Bounds>::
decompose(int rank, const StaticAssigner& assigner, Master& master)
{
decompose(rank, assigner, [&master](int gid, const Bounds&, const Bounds&, const Bounds&, const Link& link)
{
void* b = master.create();
Link* l = new Link(link);
master.add(gid, b, l);
});
}
template<class Bounds>
void
diy::RegularDecomposer<Bounds>::
decompose(int rank, const StaticAssigner& assigner, const Creator& create)
{
std::vector<int> gids;
assigner.local_gids(rank, gids);
for (int i = 0; i < (int)gids.size(); ++i)
{
int gid = gids[i];
DivisionsVector coords;
gid_to_coords(gid, coords);
Bounds core, bounds;
fill_bounds(core, coords);
fill_bounds(bounds, coords, true);
// Fill link with all the neighbors
Link link(dim, core, bounds);
std::vector<int> offsets(dim, -1);
offsets[0] = -2;
while (!all(offsets, 1))
{
// next offset
int j;
for (j = 0; j < dim; ++j)
if (offsets[j] == 1)
offsets[j] = -1;
else
break;
++offsets[j];
if (all(offsets, 0)) continue; // skip ourselves
DivisionsVector nhbr_coords(dim);
Direction dir, wrap_dir;
bool inbounds = true;
for (int k = 0; k < dim; ++k)
{
nhbr_coords[k] = coords[k] + offsets[k];
// wrap
if (nhbr_coords[k] < 0)
{
if (wrap[k])
{
nhbr_coords[k] = divisions[k] - 1;
wrap_dir[k] = -1;
}
else
inbounds = false;
}
if (nhbr_coords[k] >= divisions[k])
{
if (wrap[k])
{
nhbr_coords[k] = 0;
wrap_dir[k] = 1;
}
else
inbounds = false;
}
// NB: this needs to match the addressing scheme in dir_t (in constants.h)
if (offsets[k] == -1 || offsets[k] == 1)
dir[k] = offsets[k];
}
if (!inbounds) continue;
int nhbr_gid = coords_to_gid(nhbr_coords);
BlockID bid; bid.gid = nhbr_gid; bid.proc = assigner.rank(nhbr_gid);
link.add_neighbor(bid);
Bounds nhbr_bounds;
fill_bounds(nhbr_bounds, nhbr_coords);
link.add_bounds(nhbr_bounds);
link.add_direction(dir);
link.add_wrap(wrap_dir);
}
create(gid, core, bounds, domain, link);
}
}
// decomposes domain but does not add blocks to master, assumes they were added already
template<class Bounds>
void
diy::RegularDecomposer<Bounds>::
decompose(int rank, const StaticAssigner& assigner, Master& master, const Updater& update)
{
decompose(rank, assigner, [&master,&update](int gid, const Bounds& core, const Bounds& bounds, const Bounds& domain_, const Link& link)
{
int lid = master.lid(gid);
Link* l = new Link(link);
master.replace_link(lid, l);
update(gid, lid, core, bounds, domain_, *l);
});
}
template<class Bounds>
bool
diy::RegularDecomposer<Bounds>::
all(const std::vector<int>& v, int x)
{
for (unsigned i = 0; i < v.size(); ++i)
if (v[i] != x)
return false;
return true;
}
template<class Bounds>
void
diy::RegularDecomposer<Bounds>::
gid_to_coords(int gid, DivisionsVector& coords, const DivisionsVector& divisions)
{
int dim = static_cast<int>(divisions.size());
for (int i = 0; i < dim; ++i)
{
coords.push_back(gid % divisions[i]);
gid /= divisions[i];
}
}
template<class Bounds>
int
diy::RegularDecomposer<Bounds>::
coords_to_gid(const DivisionsVector& coords, const DivisionsVector& divisions)
{
int gid = 0;
for (int i = static_cast<int>(coords.size()) - 1; i >= 0; --i)
{
gid *= divisions[i];
gid += coords[i];
}
return gid;
}
//! \ingroup Decomposition
//! Gets the bounds, with or without ghosts, for a block specified by its block coordinates
template<class Bounds>
void
diy::RegularDecomposer<Bounds>::
fill_bounds(Bounds& bounds, //!< (output) bounds
const DivisionsVector& coords, //!< coordinates of the block in the decomposition
bool add_ghosts) //!< whether to include ghosts in the output bounds
const
{
for (int i = 0; i < dim; ++i)
{
bounds.min[i] = detail::BoundsHelper<Bounds>::from(coords[i], divisions[i], domain.min[i], domain.max[i], share_face[i]);
bounds.max[i] = detail::BoundsHelper<Bounds>::to (coords[i], divisions[i], domain.min[i], domain.max[i], share_face[i]);
}
for (int i = dim; i < DIY_MAX_DIM; ++i) // set the unused dimension to 0
{
bounds.min[i] = 0;
bounds.max[i] = 0;
}
if (!add_ghosts)
return;
for (int i = 0; i < dim; ++i)
{
if (wrap[i])
{
bounds.min[i] -= ghosts[i];
bounds.max[i] += ghosts[i];
} else
{
bounds.min[i] = std::max(domain.min[i], bounds.min[i] - ghosts[i]);
bounds.max[i] = std::min(domain.max[i], bounds.max[i] + ghosts[i]);
}
}
}
//! \ingroup Decomposition
//! Gets the bounds, with or without ghosts, for a block specified by its gid
template<class Bounds>
void
diy::RegularDecomposer<Bounds>::
fill_bounds(Bounds& bounds, //!< (output) bounds
int gid, //!< global id of the block
bool add_ghosts) //!< whether to include ghosts in the output bounds
const
{
DivisionsVector coords;
gid_to_coords(gid, coords);
if (add_ghosts)
fill_bounds(bounds, coords, true);
else
fill_bounds(bounds, coords);
}
namespace diy { namespace detail {
// current state of division in one dimension used in fill_divisions below
template<class Coordinate>
struct Div
{
int dim; // 0, 1, 2, etc. e.g. for x, y, z etc.
int nb; // number of blocks so far in this dimension
Coordinate b_size; // block size so far in this dimension
// sort on descending block size unless tied, in which case
// sort on ascending num blocks in current dim unless tied, in which case
// sort on ascending dimension
bool operator<(Div rhs) const
{
// sort on second value of the pair unless tied, in which case sort on first
if (b_size == rhs.b_size)
{
if (nb == rhs.nb)
return(dim < rhs.dim);
return(nb < rhs.nb);
}
return(b_size > rhs.b_size);
}
};
} }
template<class Bounds>
void
diy::RegularDecomposer<Bounds>::
fill_divisions(std::vector<int>& divisions_) const
{
// prod = number of blocks unconstrained by user; c = number of unconstrained dimensions
int prod = 1; int c = 0;
for (int i = 0; i < dim; ++i)
if (divisions_[i] != 0)
{
prod *= divisions_[i];
++c;
}
if (nblocks % prod != 0)
throw std::runtime_error("Total number of blocks cannot be factored into provided divs");
if (c == (int) divisions_.size()) // nothing to do; user provided all divs
return;
// factor number of blocks left in unconstrained dimensions
// factorization is sorted from smallest to largest factors
std::vector<unsigned> factors;
factor(factors, nblocks/prod);
using detail::Div;
std::vector< Div<Coordinate> > missing_divs; // pairs consisting of (dim, #divs)
// init missing_divs
for (int i = 0; i < dim; i++)
{
if (divisions_[i] == 0)
{
Div<Coordinate> div;
div.dim = i;
div.nb = 1;
div.b_size = domain.max[i] - domain.min[i];
missing_divs.push_back(div);
}
}
// iterate over factorization of number of blocks (factors are sorted smallest to largest)
// NB: using int instead of size_t because i must be able to go negative to break out of the loop
for (int i = factors.size() - 1; i >= 0; --i)
{
// fill in missing divs by dividing dimension w/ largest block size
// except when this would be illegal (resulting in bounds.max < bounds.min,
// which is only a problem for discrete bounds)
// sort on decreasing block size
std::sort(missing_divs.begin(), missing_divs.end());
// split the dimension with the largest block size (first element in vector)
Coordinate min =
detail::BoundsHelper<Bounds>::from(0,
missing_divs[0].nb * factors[i],
domain.min[missing_divs[0].dim],
domain.max[missing_divs[0].dim],
share_face[missing_divs[0].dim]);
Coordinate max =
detail::BoundsHelper<Bounds>::to(0,
missing_divs[0].nb * factors[i],
domain.min[missing_divs[0].dim],
domain.max[missing_divs[0].dim],
share_face[missing_divs[0].dim]);
if (max >= min)
{
missing_divs[0].nb *= factors[i];
missing_divs[0].b_size = max - min;
}
else
{
std::ostringstream oss;
oss << "Unable to decompose domain into " << nblocks << " blocks: " << min << " " << max;
throw std::runtime_error(oss.str());
}
}
// assign the divisions
for (size_t i = 0; i < missing_divs.size(); i++)
divisions_[missing_divs[i].dim] = missing_divs[i].nb;
}
template<class Bounds>
void
diy::RegularDecomposer<Bounds>::
factor(std::vector<unsigned>& factors, int n)
{
while (n != 1)
for (int i = 2; i <= n; ++i)
{
if (n % i == 0)
{
factors.push_back(i);
n /= i;
break;
}
}
}
// Point to GIDs
// TODO: deal with wrap correctly
// TODO: add an optional ghosts argument to ignore ghosts (if we want to find the true owners, or something like that)
template<class Bounds>
template<class Point>
void
diy::RegularDecomposer<Bounds>::
point_to_gids(std::vector<int>& gids, const Point& p) const
{
std::vector< std::pair<int, int> > ranges(dim);
for (int i = 0; i < dim; ++i)
top_bottom(ranges[i].second, ranges[i].first, p, i);
// look up gids for all combinations
DivisionsVector coords(dim), location(dim);
while(location.back() < ranges.back().second - ranges.back().first)
{
for (int i = 0; i < dim; ++i)
coords[i] = ranges[i].first + location[i];
gids.push_back(coords_to_gid(coords, divisions));
location[0]++;
unsigned i = 0;
while (i < dim-1 && location[i] == ranges[i].second - ranges[i].first)
{
location[i] = 0;
++i;
location[i]++;
}
}
}
template<class Bounds>
template<class Point>
int
diy::RegularDecomposer<Bounds>::
point_to_gid(const Point& p) const
{
int gid = 0;
for (int axis = dim - 1; axis >= 0; --axis)
{
int bottom = detail::BoundsHelper<Bounds>::lower(p[axis], divisions[axis], domain.min[axis], domain.max[axis], share_face[axis]);
bottom = std::max(0, bottom);
// coupled with coords_to_gid
gid *= divisions[axis];
gid += bottom;
}
return gid;
}
template<class Bounds>
template<class Point>
int
diy::RegularDecomposer<Bounds>::
num_gids(const Point& p) const
{
int res = 1;
for (int i = 0; i < dim; ++i)
{
int top, bottom;
top_bottom(top, bottom, p, i);
res *= top - bottom;
}
return res;
}
template<class Bounds>
template<class Point>
void
diy::RegularDecomposer<Bounds>::
top_bottom(int& top, int& bottom, const Point& p, int axis) const
{
Coordinate l = p[axis] - ghosts[axis];
Coordinate r = p[axis] + ghosts[axis];
top = detail::BoundsHelper<Bounds>::upper(r, divisions[axis], domain.min[axis], domain.max[axis], share_face[axis]);
bottom = detail::BoundsHelper<Bounds>::lower(l, divisions[axis], domain.min[axis], domain.max[axis], share_face[axis]);
if (!wrap[axis])
{
bottom = std::max(0, bottom);
top = std::min(divisions[axis], top);
}
}
// find lowest gid that owns a particular point
template<class Bounds>
template<class Point>
int
diy::RegularDecomposer<Bounds>::
lowest_gid(const Point& p) const
{
// TODO: optimize - no need to compute all gids
std::vector<int> gids;
point_to_gids(gids, p);
std::sort(gids.begin(), gids.end());
return gids[0];
}
#endif
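
A hedged sketch of driving the decomposer above through the free `decompose()` function; `Block`, the domain extents, the block counts, and the include paths are illustrative assumptions.

```cpp
// Illustrative: decompose a 3D discrete domain into 8 regular blocks.
#include <diy/master.hpp>              // include paths are assumptions
#include <diy/assigner.hpp>
#include <diy/decomposition.hpp>

struct Block { /* application state */ };

int main(int argc, char** argv)
{
    diy::mpi::environment  env(argc, argv);
    diy::mpi::communicator world;

    int dim = 3, nblocks = 8;
    diy::DiscreteBounds domain;        // static-size bounds (see DIY_MAX_DIM)
    for (int i = 0; i < dim; ++i) { domain.min[i] = 0; domain.max[i] = 255; }

    diy::Master             master(world);
    diy::ContiguousAssigner assigner(world.size(), nblocks);

    // the creator is called once per local gid with core bounds, ghosted
    // bounds, the global domain, and a fully populated link
    diy::decompose(dim, world.rank(), domain, assigner,
                   [&](int gid, const diy::DiscreteBounds&,        // core
                       const diy::DiscreteBounds&,                 // bounds w/ ghosts
                       const diy::DiscreteBounds&,                 // domain
                       const diy::RegularGridLink& link)
                   {
                       master.add(gid, new Block, new diy::RegularGridLink(link));
                   });
}
```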

@@ -1,450 +0,0 @@
#ifndef VTKMDIY_DETAIL_ALGORITHMS_KDTREE_SAMPLING_HPP
#define VTKMDIY_DETAIL_ALGORITHMS_KDTREE_SAMPLING_HPP
#include <vector>
#include <cassert>
#include "../../partners/all-reduce.hpp"
#include "../../log.hpp"
// TODO: technically, what's done now is not a perfect subsample:
// we take the same number of samples from every block; in reality this number should be selected at random,
// so that the total number of samples adds up to samples*nblocks
//
// NB: random samples are chosen using rand(), which is assumed to be seeded
// externally. Once we switch to C++11, we should use its more advanced
// random number generators (and take a generator as an external parameter)
// (TODO)
namespace diy
{
namespace detail
{
template<class Block, class Point>
struct KDTreeSamplingPartition
{
typedef diy::RegularContinuousLink RCLink;
typedef diy::ContinuousBounds Bounds;
typedef std::vector<float> Samples;
KDTreeSamplingPartition(int dim,
std::vector<Point> Block::* points,
size_t samples):
dim_(dim), points_(points), samples_(samples) {}
void operator()(Block* b, const diy::ReduceProxy& srp, const KDTreePartners& partners) const;
int divide_gid(int gid, bool lower, int round, int rounds) const;
void update_links(Block* b, const diy::ReduceProxy& srp, int dim, int round, int rounds, bool wrap, const Bounds& domain) const;
void split_to_neighbors(Block* b, const diy::ReduceProxy& srp, int dim) const;
diy::Direction
find_wrap(const Bounds& bounds, const Bounds& nbr_bounds, const Bounds& domain) const;
void compute_local_samples(Block* b, const diy::ReduceProxy& srp, int dim) const;
void add_samples(Block* b, const diy::ReduceProxy& srp, Samples& samples) const;
void receive_samples(Block* b, const diy::ReduceProxy& srp, Samples& samples) const;
void forward_samples(Block* b, const diy::ReduceProxy& srp, const Samples& samples) const;
void enqueue_exchange(Block* b, const diy::ReduceProxy& srp, int dim, const Samples& samples) const;
void dequeue_exchange(Block* b, const diy::ReduceProxy& srp, int dim) const;
void update_neighbor_bounds(Bounds& bounds, float split, int dim, bool lower) const;
bool intersects(const Bounds& x, const Bounds& y, int dim, bool wrap, const Bounds& domain) const;
float find_split(const Bounds& changed, const Bounds& original) const;
int dim_;
std::vector<Point> Block::* points_;
size_t samples_;
};
}
}
template<class Block, class Point>
void
diy::detail::KDTreeSamplingPartition<Block,Point>::
operator()(Block* b, const diy::ReduceProxy& srp, const KDTreePartners& partners) const
{
int dim;
if (srp.round() < partners.rounds())
dim = partners.dim(srp.round());
else
dim = partners.dim(srp.round() - 1);
if (srp.round() == partners.rounds())
update_links(b, srp, dim, partners.sub_round(srp.round() - 2), partners.swap_rounds(), partners.wrap, partners.domain); // -1 would be the "uninformative" link round
else if (partners.swap_round(srp.round()) && partners.sub_round(srp.round()) < 0) // link round
{
dequeue_exchange(b, srp, dim); // from the swap round
split_to_neighbors(b, srp, dim);
}
else if (partners.swap_round(srp.round()))
{
Samples samples;
receive_samples(b, srp, samples);
enqueue_exchange(b, srp, dim, samples);
} else if (partners.sub_round(srp.round()) == 0)
{
if (srp.round() > 0)
{
int prev_dim = dim - 1;
if (prev_dim < 0)
prev_dim += dim_;
update_links(b, srp, prev_dim, partners.sub_round(srp.round() - 2), partners.swap_rounds(), partners.wrap, partners.domain); // -1 would be the "uninformative" link round
}
compute_local_samples(b, srp, dim);
} else if (partners.sub_round(srp.round()) < (int) partners.histogram.rounds()/2) // we are reusing partners class, so really we are talking about the samples rounds here
{
Samples samples;
add_samples(b, srp, samples);
srp.enqueue(srp.out_link().target(0), samples);
} else
{
Samples samples;
add_samples(b, srp, samples);
if (samples.size() != 1)
{
// pick the median
std::nth_element(samples.begin(), samples.begin() + samples.size()/2, samples.end());
std::swap(samples[0], samples[samples.size()/2]);
//std::sort(samples.begin(), samples.end());
//samples[0] = (samples[samples.size()/2] + samples[samples.size()/2 + 1])/2;
samples.resize(1);
}
forward_samples(b, srp, samples);
}
}
template<class Block, class Point>
int
diy::detail::KDTreeSamplingPartition<Block,Point>::
divide_gid(int gid, bool lower, int round, int rounds) const
{
if (lower)
gid &= ~(1 << (rounds - 1 - round));
else
gid |= (1 << (rounds - 1 - round));
return gid;
}
// round here is the outer iteration of the algorithm
template<class Block, class Point>
void
diy::detail::KDTreeSamplingPartition<Block,Point>::
update_links(Block* b, const diy::ReduceProxy& srp, int dim, int round, int rounds, bool wrap, const Bounds& domain) const
{
auto log = get_logger();
int gid = srp.gid();
int lid = srp.master()->lid(gid);
RCLink* link = static_cast<RCLink*>(srp.master()->link(lid));
// (gid, dir) -> i
std::map<std::pair<int,diy::Direction>, int> link_map;
for (int i = 0; i < link->size(); ++i)
link_map[std::make_pair(link->target(i).gid, link->direction(i))] = i;
// NB: srp.enqueue(..., ...) should match the link
std::vector<float> splits(link->size());
for (int i = 0; i < link->size(); ++i)
{
float split; diy::Direction dir;
int in_gid = link->target(i).gid;
while(srp.incoming(in_gid))
{
srp.dequeue(in_gid, split);
srp.dequeue(in_gid, dir);
// reverse dir
for (int j = 0; j < dim_; ++j)
dir[j] = -dir[j];
int k = link_map[std::make_pair(in_gid, dir)];
log->trace("{} {} {} -> {}", in_gid, dir, split, k);
splits[k] = split;
}
}
RCLink new_link(dim_, link->core(), link->core());
bool lower = !(gid & (1 << (rounds - 1 - round)));
// fill out the new link
for (int i = 0; i < link->size(); ++i)
{
diy::Direction dir = link->direction(i);
//diy::Direction wrap_dir = link->wrap(i); // we don't use existing wrap, but restore it from scratch
if (dir[dim] != 0)
{
if ((dir[dim] < 0 && lower) || (dir[dim] > 0 && !lower))
{
int nbr_gid = divide_gid(link->target(i).gid, !lower, round, rounds);
diy::BlockID nbr = { nbr_gid, srp.assigner().rank(nbr_gid) };
new_link.add_neighbor(nbr);
new_link.add_direction(dir);
Bounds bounds = link->bounds(i);
update_neighbor_bounds(bounds, splits[i], dim, !lower);
new_link.add_bounds(bounds);
if (wrap)
new_link.add_wrap(find_wrap(new_link.bounds(), bounds, domain));
else
new_link.add_wrap(diy::Direction());
}
} else // non-aligned side
{
for (int j = 0; j < 2; ++j)
{
int nbr_gid = divide_gid(link->target(i).gid, j == 0, round, rounds);
Bounds bounds = link->bounds(i);
update_neighbor_bounds(bounds, splits[i], dim, j == 0);
if (intersects(bounds, new_link.bounds(), dim, wrap, domain))
{
diy::BlockID nbr = { nbr_gid, srp.assigner().rank(nbr_gid) };
new_link.add_neighbor(nbr);
new_link.add_direction(dir);
new_link.add_bounds(bounds);
if (wrap)
new_link.add_wrap(find_wrap(new_link.bounds(), bounds, domain));
else
new_link.add_wrap(diy::Direction());
}
}
}
}
// add link to the dual block
int dual_gid = divide_gid(gid, !lower, round, rounds);
diy::BlockID dual = { dual_gid, srp.assigner().rank(dual_gid) };
new_link.add_neighbor(dual);
Bounds nbr_bounds = link->bounds(); // old block bounds
update_neighbor_bounds(nbr_bounds, find_split(new_link.bounds(), nbr_bounds), dim, !lower);
new_link.add_bounds(nbr_bounds);
new_link.add_wrap(diy::Direction()); // dual block cannot be wrapped
if (lower)
{
diy::Direction right;
right[dim] = 1;
new_link.add_direction(right);
} else
{
diy::Direction left;
left[dim] = -1;
new_link.add_direction(left);
}
// update the link; notice that this won't conflict with anything since
// reduce is using its own notion of the link constructed through the
// partners
link->swap(new_link);
}
template<class Block, class Point>
void
diy::detail::KDTreeSamplingPartition<Block,Point>::
split_to_neighbors(Block* b, const diy::ReduceProxy& srp, int dim) const
{
int lid = srp.master()->lid(srp.gid());
RCLink* link = static_cast<RCLink*>(srp.master()->link(lid));
// determine split
float split = find_split(link->core(), link->bounds());
for (int i = 0; i < link->size(); ++i)
{
srp.enqueue(link->target(i), split);
srp.enqueue(link->target(i), link->direction(i));
}
}
template<class Block, class Point>
void
diy::detail::KDTreeSamplingPartition<Block,Point>::
compute_local_samples(Block* b, const diy::ReduceProxy& srp, int dim) const
{
// compute and enqueue local samples
Samples samples;
size_t points_size = (b->*points_).size();
size_t n = std::min(points_size, samples_);
samples.reserve(n);
for (size_t i = 0; i < n; ++i)
{
float x = (b->*points_)[rand() % points_size][dim];
samples.push_back(x);
}
srp.enqueue(srp.out_link().target(0), samples);
}
template<class Block, class Point>
void
diy::detail::KDTreeSamplingPartition<Block,Point>::
add_samples(Block* b, const diy::ReduceProxy& srp, Samples& samples) const
{
// dequeue and combine the samples
for (int i = 0; i < srp.in_link().size(); ++i)
{
int nbr_gid = srp.in_link().target(i).gid;
Samples smpls;
srp.dequeue(nbr_gid, smpls);
for (size_t j = 0; j < smpls.size(); ++j)
samples.push_back(smpls[j]);
}
}
template<class Block, class Point>
void
diy::detail::KDTreeSamplingPartition<Block,Point>::
receive_samples(Block* b, const diy::ReduceProxy& srp, Samples& samples) const
{
srp.dequeue(srp.in_link().target(0).gid, samples);
}
template<class Block, class Point>
void
diy::detail::KDTreeSamplingPartition<Block,Point>::
forward_samples(Block* b, const diy::ReduceProxy& srp, const Samples& samples) const
{
for (int i = 0; i < srp.out_link().size(); ++i)
srp.enqueue(srp.out_link().target(i), samples);
}
template<class Block, class Point>
void
diy::detail::KDTreeSamplingPartition<Block,Point>::
enqueue_exchange(Block* b, const diy::ReduceProxy& srp, int dim, const Samples& samples) const
{
int lid = srp.master()->lid(srp.gid());
RCLink* link = static_cast<RCLink*>(srp.master()->link(lid));
int k = srp.out_link().size();
if (k == 0) // final round; nothing needs to be sent; this is actually redundant
return;
// pick split points
float split = samples[0];
// subset and enqueue
std::vector< std::vector<Point> > out_points(srp.out_link().size());
for (size_t i = 0; i < (b->*points_).size(); ++i)
{
float x = (b->*points_)[i][dim];
int loc = x < split ? 0 : 1;
out_points[loc].push_back((b->*points_)[i]);
}
int pos = -1;
for (int i = 0; i < k; ++i)
{
if (srp.out_link().target(i).gid == srp.gid())
{
(b->*points_).swap(out_points[i]);
pos = i;
}
else
srp.enqueue(srp.out_link().target(i), out_points[i]);
}
if (pos == 0)
link->core().max[dim] = split;
else
link->core().min[dim] = split;
}
template<class Block, class Point>
void
diy::detail::KDTreeSamplingPartition<Block,Point>::
dequeue_exchange(Block* b, const diy::ReduceProxy& srp, int dim) const
{
int lid = srp.master()->lid(srp.gid());
RCLink* link = static_cast<RCLink*>(srp.master()->link(lid));
for (int i = 0; i < srp.in_link().size(); ++i)
{
int nbr_gid = srp.in_link().target(i).gid;
if (nbr_gid == srp.gid())
continue;
std::vector<Point> in_points;
srp.dequeue(nbr_gid, in_points);
for (size_t j = 0; j < in_points.size(); ++j)
{
if (in_points[j][dim] < link->core().min[dim] || in_points[j][dim] > link->core().max[dim])
throw std::runtime_error(fmt::format("Dequeued {} outside [{},{}] ({})",
in_points[j][dim], link->core().min[dim], link->core().max[dim], dim));
(b->*points_).push_back(in_points[j]);
}
}
}
template<class Block, class Point>
void
diy::detail::KDTreeSamplingPartition<Block,Point>::
update_neighbor_bounds(Bounds& bounds, float split, int dim, bool lower) const
{
if (lower)
bounds.max[dim] = split;
else
bounds.min[dim] = split;
}
template<class Block, class Point>
bool
diy::detail::KDTreeSamplingPartition<Block,Point>::
intersects(const Bounds& x, const Bounds& y, int dim, bool wrap, const Bounds& domain) const
{
if (wrap)
{
if (x.min[dim] == domain.min[dim] && y.max[dim] == domain.max[dim])
return true;
if (y.min[dim] == domain.min[dim] && x.max[dim] == domain.max[dim])
return true;
}
return x.min[dim] <= y.max[dim] && y.min[dim] <= x.max[dim];
}
template<class Block, class Point>
float
diy::detail::KDTreeSamplingPartition<Block,Point>::
find_split(const Bounds& changed, const Bounds& original) const
{
for (int i = 0; i < dim_; ++i)
{
if (changed.min[i] != original.min[i])
return changed.min[i];
if (changed.max[i] != original.max[i])
return changed.max[i];
}
assert(0);
return -1;
}
template<class Block, class Point>
diy::Direction
diy::detail::KDTreeSamplingPartition<Block,Point>::
find_wrap(const Bounds& bounds, const Bounds& nbr_bounds, const Bounds& domain) const
{
diy::Direction wrap;
for (int i = 0; i < dim_; ++i)
{
if (bounds.min[i] == domain.min[i] && nbr_bounds.max[i] == domain.max[i])
wrap[i] = -1;
if (bounds.max[i] == domain.max[i] && nbr_bounds.min[i] == domain.min[i])
wrap[i] = 1;
}
return wrap;
}
#endif

@ -1,569 +0,0 @@
#ifndef VTKMDIY_DETAIL_ALGORITHMS_KDTREE_HPP
#define VTKMDIY_DETAIL_ALGORITHMS_KDTREE_HPP
#include <vector>
#include <cassert>
#include "../../partners/all-reduce.hpp"
#include "../../log.hpp"
namespace diy
{
namespace detail
{
struct KDTreePartners;
template<class Block, class Point>
struct KDTreePartition
{
typedef diy::RegularContinuousLink RCLink;
typedef diy::ContinuousBounds Bounds;
typedef std::vector<size_t> Histogram;
KDTreePartition(int dim,
std::vector<Point> Block::* points,
size_t bins):
dim_(dim), points_(points), bins_(bins) {}
void operator()(Block* b, const diy::ReduceProxy& srp, const KDTreePartners& partners) const;
int divide_gid(int gid, bool lower, int round, int rounds) const;
void update_links(Block* b, const diy::ReduceProxy& srp, int dim, int round, int rounds, bool wrap, const Bounds& domain) const;
void split_to_neighbors(Block* b, const diy::ReduceProxy& srp, int dim) const;
diy::Direction
find_wrap(const Bounds& bounds, const Bounds& nbr_bounds, const Bounds& domain) const;
void compute_local_histogram(Block* b, const diy::ReduceProxy& srp, int dim) const;
void add_histogram(Block* b, const diy::ReduceProxy& srp, Histogram& histogram) const;
void receive_histogram(Block* b, const diy::ReduceProxy& srp, Histogram& histogram) const;
void forward_histogram(Block* b, const diy::ReduceProxy& srp, const Histogram& histogram) const;
void enqueue_exchange(Block* b, const diy::ReduceProxy& srp, int dim, const Histogram& histogram) const;
void dequeue_exchange(Block* b, const diy::ReduceProxy& srp, int dim) const;
void update_neighbor_bounds(Bounds& bounds, float split, int dim, bool lower) const;
bool intersects(const Bounds& x, const Bounds& y, int dim, bool wrap, const Bounds& domain) const;
float find_split(const Bounds& changed, const Bounds& original) const;
int dim_;
std::vector<Point> Block::* points_;
size_t bins_;
};
}
}
struct diy::detail::KDTreePartners
{
// bool = are we in a swap (vs histogram) round
// int = round within that partner
typedef std::pair<bool, int> RoundType;
typedef diy::ContinuousBounds Bounds;
KDTreePartners(int dim, int nblocks, bool wrap_, const Bounds& domain_):
decomposer(1, interval(0,nblocks-1), nblocks),
histogram(decomposer, 2),
swap(decomposer, 2, false),
wrap(wrap_),
domain(domain_)
{
for (unsigned i = 0; i < swap.rounds(); ++i)
{
// fill histogram rounds
for (unsigned j = 0; j < histogram.rounds(); ++j)
{
rounds_.push_back(std::make_pair(false, j));
dim_.push_back(i % dim);
if (j == histogram.rounds() / 2 - 1 - i)
j += 2*i; // skip the 2*i deepest all-reduce rounds: after i swap rounds the groups have halved i times
}
// fill swap round
rounds_.push_back(std::make_pair(true, i));
dim_.push_back(i % dim);
// fill link round
rounds_.push_back(std::make_pair(true, -1)); // (true, -1) signals link round
dim_.push_back(i % dim);
}
}
size_t rounds() const { return rounds_.size(); }
size_t swap_rounds() const { return swap.rounds(); }
int dim(int round) const { return dim_[round]; }
bool swap_round(int round) const { return rounds_[round].first; }
int sub_round(int round) const { return rounds_[round].second; }
inline bool active(int round, int gid, const diy::Master& m) const
{
if (round == (int) rounds())
return true;
else if (swap_round(round) && sub_round(round) < 0) // link round
return true;
else if (swap_round(round))
return swap.active(sub_round(round), gid, m);
else
return histogram.active(sub_round(round), gid, m);
}
inline void incoming(int round, int gid, std::vector<int>& partners, const diy::Master& m) const
{
if (round == (int) rounds())
link_neighbors(-1, gid, partners, m);
else if (swap_round(round) && sub_round(round) < 0) // link round
swap.incoming(sub_round(round - 1) + 1, gid, partners, m);
else if (swap_round(round))
histogram.incoming(histogram.rounds(), gid, partners, m);
else
{
if (round > 0 && sub_round(round) == 0)
link_neighbors(-1, gid, partners, m);
else if (round > 0 && sub_round(round - 1) != sub_round(round) - 1) // jump through the histogram rounds
histogram.incoming(sub_round(round - 1) + 1, gid, partners, m);
else
histogram.incoming(sub_round(round), gid, partners, m);
}
}
inline void outgoing(int round, int gid, std::vector<int>& partners, const diy::Master& m) const
{
if (round == (int) rounds())
swap.outgoing(sub_round(round-1) + 1, gid, partners, m);
else if (swap_round(round) && sub_round(round) < 0) // link round
link_neighbors(-1, gid, partners, m);
else if (swap_round(round))
swap.outgoing(sub_round(round), gid, partners, m);
else
histogram.outgoing(sub_round(round), gid, partners, m);
}
inline void link_neighbors(int, int gid, std::vector<int>& partners, const diy::Master& m) const
{
int lid = m.lid(gid);
diy::Link* link = m.link(lid);
std::set<int> result; // partners must be unique
for (int i = 0; i < link->size(); ++i)
result.insert(link->target(i).gid);
for (std::set<int>::const_iterator it = result.begin(); it != result.end(); ++it)
partners.push_back(*it);
}
// 1-D domain to feed into histogram and swap
diy::RegularDecomposer<diy::DiscreteBounds> decomposer;
diy::RegularAllReducePartners histogram;
diy::RegularSwapPartners swap;
std::vector<RoundType> rounds_;
std::vector<int> dim_;
bool wrap;
Bounds domain;
};
template<class Block, class Point>
void
diy::detail::KDTreePartition<Block,Point>::
operator()(Block* b, const diy::ReduceProxy& srp, const KDTreePartners& partners) const
{
int dim;
if (srp.round() < partners.rounds())
dim = partners.dim(srp.round());
else
dim = partners.dim(srp.round() - 1);
if (srp.round() == partners.rounds())
update_links(b, srp, dim, partners.sub_round(srp.round() - 2), partners.swap_rounds(), partners.wrap, partners.domain); // -1 would be the "uninformative" link round
else if (partners.swap_round(srp.round()) && partners.sub_round(srp.round()) < 0) // link round
{
dequeue_exchange(b, srp, dim); // from the swap round
split_to_neighbors(b, srp, dim);
}
else if (partners.swap_round(srp.round()))
{
Histogram histogram;
receive_histogram(b, srp, histogram);
enqueue_exchange(b, srp, dim, histogram);
} else if (partners.sub_round(srp.round()) == 0)
{
if (srp.round() > 0)
{
int prev_dim = dim - 1;
if (prev_dim < 0)
prev_dim += dim_;
update_links(b, srp, prev_dim, partners.sub_round(srp.round() - 2), partners.swap_rounds(), partners.wrap, partners.domain); // -1 would be the "uninformative" link round
}
compute_local_histogram(b, srp, dim);
} else if (partners.sub_round(srp.round()) < (int) partners.histogram.rounds()/2)
{
Histogram histogram(bins_);
add_histogram(b, srp, histogram);
srp.enqueue(srp.out_link().target(0), histogram);
}
else
{
Histogram histogram(bins_);
add_histogram(b, srp, histogram);
forward_histogram(b, srp, histogram);
}
}
template<class Block, class Point>
int
diy::detail::KDTreePartition<Block,Point>::
divide_gid(int gid, bool lower, int round, int rounds) const
{
if (lower)
gid &= ~(1 << (rounds - 1 - round));
else
gid |= (1 << (rounds - 1 - round));
return gid;
}
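// Bit-trick sketch (hypothetical values): with rounds = 3, round 0 toggles
// the most significant of the three gid bits. For gid = 5 (0b101):
//   divide_gid(5, /*lower=*/true,  0, 3) == 1   // 0b101 & ~0b100
//   divide_gid(5, /*lower=*/false, 0, 3) == 5   // 0b101 |  0b100
// i.e. the lower and upper halves of a kd-tree split differ in exactly one bit.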
// round here is the outer iteration of the algorithm
template<class Block, class Point>
void
diy::detail::KDTreePartition<Block,Point>::
update_links(Block* b, const diy::ReduceProxy& srp, int dim, int round, int rounds, bool wrap, const Bounds& domain) const
{
int gid = srp.gid();
int lid = srp.master()->lid(gid);
RCLink* link = static_cast<RCLink*>(srp.master()->link(lid));
// (gid, dir) -> i
std::map<std::pair<int,diy::Direction>, int> link_map;
for (int i = 0; i < link->size(); ++i)
link_map[std::make_pair(link->target(i).gid, link->direction(i))] = i;
// NB: srp.enqueue(..., ...) should match the link
std::vector<float> splits(link->size());
for (int i = 0; i < link->size(); ++i)
{
float split; diy::Direction dir;
int in_gid = link->target(i).gid;
while(srp.incoming(in_gid))
{
srp.dequeue(in_gid, split);
srp.dequeue(in_gid, dir);
// reverse dir
for (int j = 0; j < dim_; ++j)
dir[j] = -dir[j];
int k = link_map[std::make_pair(in_gid, dir)];
splits[k] = split;
}
}
RCLink new_link(dim_, link->core(), link->core());
bool lower = !(gid & (1 << (rounds - 1 - round)));
// fill out the new link
for (int i = 0; i < link->size(); ++i)
{
diy::Direction dir = link->direction(i);
//diy::Direction wrap_dir = link->wrap(i); // we don't use existing wrap, but restore it from scratch
if (dir[dim] != 0)
{
if ((dir[dim] < 0 && lower) || (dir[dim] > 0 && !lower))
{
int nbr_gid = divide_gid(link->target(i).gid, !lower, round, rounds);
diy::BlockID nbr = { nbr_gid, srp.assigner().rank(nbr_gid) };
new_link.add_neighbor(nbr);
new_link.add_direction(dir);
Bounds bounds = link->bounds(i);
update_neighbor_bounds(bounds, splits[i], dim, !lower);
new_link.add_bounds(bounds);
if (wrap)
new_link.add_wrap(find_wrap(new_link.bounds(), bounds, domain));
else
new_link.add_wrap(diy::Direction());
}
} else // non-aligned side
{
for (int j = 0; j < 2; ++j)
{
int nbr_gid = divide_gid(link->target(i).gid, j == 0, round, rounds);
Bounds bounds = link->bounds(i);
update_neighbor_bounds(bounds, splits[i], dim, j == 0);
if (intersects(bounds, new_link.bounds(), dim, wrap, domain))
{
diy::BlockID nbr = { nbr_gid, srp.assigner().rank(nbr_gid) };
new_link.add_neighbor(nbr);
new_link.add_direction(dir);
new_link.add_bounds(bounds);
if (wrap)
new_link.add_wrap(find_wrap(new_link.bounds(), bounds, domain));
else
new_link.add_wrap(diy::Direction());
}
}
}
}
// add link to the dual block
int dual_gid = divide_gid(gid, !lower, round, rounds);
diy::BlockID dual = { dual_gid, srp.assigner().rank(dual_gid) };
new_link.add_neighbor(dual);
Bounds nbr_bounds = link->bounds(); // old block bounds
update_neighbor_bounds(nbr_bounds, find_split(new_link.bounds(), nbr_bounds), dim, !lower);
new_link.add_bounds(nbr_bounds);
new_link.add_wrap(diy::Direction()); // dual block cannot be wrapped
if (lower)
{
diy::Direction right;
right[dim] = 1;
new_link.add_direction(right);
} else
{
diy::Direction left;
left[dim] = -1;
new_link.add_direction(left);
}
// update the link; notice that this won't conflict with anything since
// reduce is using its own notion of the link constructed through the
// partners
link->swap(new_link);
}
template<class Block, class Point>
void
diy::detail::KDTreePartition<Block,Point>::
split_to_neighbors(Block* b, const diy::ReduceProxy& srp, int dim) const
{
int lid = srp.master()->lid(srp.gid());
RCLink* link = static_cast<RCLink*>(srp.master()->link(lid));
// determine split
float split = find_split(link->core(), link->bounds());
for (int i = 0; i < link->size(); ++i)
{
srp.enqueue(link->target(i), split);
srp.enqueue(link->target(i), link->direction(i));
}
}
template<class Block, class Point>
void
diy::detail::KDTreePartition<Block,Point>::
compute_local_histogram(Block* b, const diy::ReduceProxy& srp, int dim) const
{
int lid = srp.master()->lid(srp.gid());
RCLink* link = static_cast<RCLink*>(srp.master()->link(lid));
// compute and enqueue local histogram
Histogram histogram(bins_);
float width = (link->core().max[dim] - link->core().min[dim])/bins_;
for (size_t i = 0; i < (b->*points_).size(); ++i)
{
float x = (b->*points_)[i][dim];
int loc = (x - link->core().min[dim]) / width;
if (loc < 0)
throw std::runtime_error(fmt::format("{} {} {}", loc, x, link->core().min[dim]));
if (loc >= (int) bins_)
loc = bins_ - 1;
++(histogram[loc]);
}
srp.enqueue(srp.out_link().target(0), histogram);
}
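// Binning sketch (assumed values): with core() = [0,10] along dim and
// bins_ = 5, width = 2, so a point at x = 7.3 lands in
// loc = (7.3 - 0) / 2 = 3 (truncated). A value that bins to loc == bins_
// (e.g. x == 10) is clamped into the last bin above.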
template<class Block, class Point>
void
diy::detail::KDTreePartition<Block,Point>::
add_histogram(Block* b, const diy::ReduceProxy& srp, Histogram& histogram) const
{
// dequeue and add up the histograms
for (int i = 0; i < srp.in_link().size(); ++i)
{
int nbr_gid = srp.in_link().target(i).gid;
Histogram hist;
srp.dequeue(nbr_gid, hist);
for (size_t j = 0; j < hist.size(); ++j)
histogram[j] += hist[j];
}
}
template<class Block, class Point>
void
diy::detail::KDTreePartition<Block,Point>::
receive_histogram(Block* b, const diy::ReduceProxy& srp, Histogram& histogram) const
{
srp.dequeue(srp.in_link().target(0).gid, histogram);
}
template<class Block, class Point>
void
diy::detail::KDTreePartition<Block,Point>::
forward_histogram(Block* b, const diy::ReduceProxy& srp, const Histogram& histogram) const
{
for (int i = 0; i < srp.out_link().size(); ++i)
srp.enqueue(srp.out_link().target(i), histogram);
}
template<class Block, class Point>
void
diy::detail::KDTreePartition<Block,Point>::
enqueue_exchange(Block* b, const diy::ReduceProxy& srp, int dim, const Histogram& histogram) const
{
auto log = get_logger();
int lid = srp.master()->lid(srp.gid());
RCLink* link = static_cast<RCLink*>(srp.master()->link(lid));
int k = srp.out_link().size();
if (k == 0) // final round; nothing needs to be sent (the check is redundant: the loops below would simply be empty)
return;
// pick the split point (approximate median from the histogram)
size_t total = 0;
for (size_t i = 0; i < histogram.size(); ++i)
total += histogram[i];
log->trace("Histogram total: {}", total);
size_t cur = 0;
float width = (link->core().max[dim] - link->core().min[dim])/bins_;
float split = 0;
for (size_t i = 0; i < histogram.size(); ++i)
{
if (cur + histogram[i] > total/2)
{
split = link->core().min[dim] + width*i;
break;
}
cur += histogram[i];
}
log->trace("Found split: {} (dim={}) in {} - {}", split, dim, link->core().min[dim], link->core().max[dim]);
// subset and enqueue
std::vector< std::vector<Point> > out_points(srp.out_link().size());
for (size_t i = 0; i < (b->*points_).size(); ++i)
{
float x = (b->*points_)[i][dim];
int loc = x < split ? 0 : 1;
out_points[loc].push_back((b->*points_)[i]);
}
int pos = -1;
for (int i = 0; i < k; ++i)
{
if (srp.out_link().target(i).gid == srp.gid())
{
(b->*points_).swap(out_points[i]);
pos = i;
}
else
srp.enqueue(srp.out_link().target(i), out_points[i]);
}
if (pos == 0)
link->core().max[dim] = split;
else
link->core().min[dim] = split;
}
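// Median-split sketch (hypothetical histogram): for histogram = {3, 5, 2, 6},
// total = 16 and total/2 = 8. The scan keeps cur at 3, then 8 (neither sum
// exceeds 8); at i = 2, cur + histogram[2] = 10 > 8, so
// split = core().min[dim] + width * 2 -- the left edge of the bin that
// straddles the median point count.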
template<class Block, class Point>
void
diy::detail::KDTreePartition<Block,Point>::
dequeue_exchange(Block* b, const diy::ReduceProxy& srp, int dim) const
{
int lid = srp.master()->lid(srp.gid());
RCLink* link = static_cast<RCLink*>(srp.master()->link(lid));
for (int i = 0; i < srp.in_link().size(); ++i)
{
int nbr_gid = srp.in_link().target(i).gid;
if (nbr_gid == srp.gid())
continue;
std::vector<Point> in_points;
srp.dequeue(nbr_gid, in_points);
for (size_t j = 0; j < in_points.size(); ++j)
{
if (in_points[j][dim] < link->core().min[dim] || in_points[j][dim] > link->core().max[dim])
throw std::runtime_error(fmt::format("Dequeued {} outside [{},{}] ({})",
in_points[j][dim], link->core().min[dim], link->core().max[dim], dim));
(b->*points_).push_back(in_points[j]);
}
}
}
template<class Block, class Point>
void
diy::detail::KDTreePartition<Block,Point>::
update_neighbor_bounds(Bounds& bounds, float split, int dim, bool lower) const
{
if (lower)
bounds.max[dim] = split;
else
bounds.min[dim] = split;
}
template<class Block, class Point>
bool
diy::detail::KDTreePartition<Block,Point>::
intersects(const Bounds& x, const Bounds& y, int dim, bool wrap, const Bounds& domain) const
{
if (wrap)
{
if (x.min[dim] == domain.min[dim] && y.max[dim] == domain.max[dim])
return true;
if (y.min[dim] == domain.min[dim] && x.max[dim] == domain.max[dim])
return true;
}
return x.min[dim] <= y.max[dim] && y.min[dim] <= x.max[dim];
}
template<class Block, class Point>
float
diy::detail::KDTreePartition<Block,Point>::
find_split(const Bounds& changed, const Bounds& original) const
{
for (int i = 0; i < dim_; ++i)
{
if (changed.min[i] != original.min[i])
return changed.min[i];
if (changed.max[i] != original.max[i])
return changed.max[i];
}
assert(0);
return -1;
}
template<class Block, class Point>
diy::Direction
diy::detail::KDTreePartition<Block,Point>::
find_wrap(const Bounds& bounds, const Bounds& nbr_bounds, const Bounds& domain) const
{
diy::Direction wrap;
for (int i = 0; i < dim_; ++i)
{
if (bounds.min[i] == domain.min[i] && nbr_bounds.max[i] == domain.max[i])
wrap[i] = -1;
if (bounds.max[i] == domain.max[i] && nbr_bounds.min[i] == domain.min[i])
wrap[i] = 1;
}
return wrap;
}
#endif

@ -1,162 +0,0 @@
#ifndef VTKMDIY_DETAIL_ALGORITHMS_SORT_HPP
#define VTKMDIY_DETAIL_ALGORITHMS_SORT_HPP
#include <functional>
#include <algorithm>
namespace diy
{
namespace detail
{
template<class Block, class T, class Cmp>
struct SampleSort
{
typedef std::vector<T> Block::*ValuesVector;
struct Sampler;
struct Exchanger;
SampleSort(ValuesVector values_, ValuesVector samples_, const Cmp& cmp_, size_t num_samples_):
values(values_), samples(samples_),
cmp(cmp_), num_samples(num_samples_) {}
Sampler sample() const { return Sampler(values, samples, cmp, num_samples); }
Exchanger exchange() const { return Exchanger(values, samples, cmp); }
static void dequeue_values(std::vector<T>& v, const ReduceProxy& rp, bool skip_self = true)
{
auto log = get_logger();
int k_in = rp.in_link().size();
log->trace("dequeue_values(): gid={}, round={}; v.size()={}", rp.gid(), rp.round(), v.size());
if (detail::is_default< Serialization<T> >::value)
{
// add up sizes
size_t sz = 0;
size_t end = v.size();
for (int i = 0; i < k_in; ++i)
{
if (skip_self && rp.in_link().target(i).gid == rp.gid()) continue;
MemoryBuffer& in = rp.incoming(rp.in_link().target(i).gid);
sz += in.size() / sizeof(T);
log->trace("  incoming size from {}: {}", rp.in_link().target(i).gid, in.size() / sizeof(T)); // log this neighbor's contribution, not the running total
}
log->trace(" incoming size: {}", sz);
v.resize(end + sz);
for (int i = 0; i < k_in; ++i)
{
if (skip_self && rp.in_link().target(i).gid == rp.gid()) continue;
MemoryBuffer& in = rp.incoming(rp.in_link().target(i).gid);
size_t incoming_sz = in.size() / sizeof(T);
T* bg = (T*) &in.buffer[0];
std::copy(bg, bg + incoming_sz, &v[end]);
end += incoming_sz;
}
} else
{
for (int i = 0; i < k_in; ++i)
{
if (skip_self && rp.in_link().target(i).gid == rp.gid()) continue;
MemoryBuffer& in = rp.incoming(rp.in_link().target(i).gid);
while(in)
{
T x;
diy::load(in, x);
v.emplace_back(std::move(x));
}
}
}
log->trace(" v.size()={}", v.size());
}
ValuesVector values;
ValuesVector samples;
Cmp cmp;
size_t num_samples;
};
template<class Block, class T, class Cmp>
struct SampleSort<Block,T,Cmp>::Sampler
{
Sampler(ValuesVector values_, ValuesVector dividers_, const Cmp& cmp_, size_t num_samples_):
values(values_), dividers(dividers_), cmp(cmp_), num_samples(num_samples_) {}
void operator()(Block* b, const ReduceProxy& srp, const RegularSwapPartners& partners) const
{
int k_in = srp.in_link().size();
int k_out = srp.out_link().size();
std::vector<T> samples;
if (k_in == 0)
{
// draw random samples
for (size_t i = 0; i < num_samples; ++i)
samples.push_back((b->*values)[std::rand() % (b->*values).size()]);
} else
dequeue_values(samples, srp, false);
if (k_out == 0)
{
// pick subsamples that separate quantiles
std::sort(samples.begin(), samples.end(), cmp);
std::vector<T> subsamples(srp.nblocks() - 1);
int step = samples.size() / srp.nblocks(); // NB: nblocks == subsamples.size() + 1
for (size_t i = 0; i < subsamples.size(); ++i)
subsamples[i] = samples[(i+1)*step];
(b->*dividers).swap(subsamples);
}
else
{
for (int i = 0; i < k_out; ++i)
{
MemoryBuffer& out = srp.outgoing(srp.out_link().target(i));
save(out, &samples[0], samples.size());
}
}
}
ValuesVector values;
ValuesVector dividers;
Cmp cmp;
size_t num_samples;
};
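// Quantile sketch (assumed values): with srp.nblocks() == 4 and 100 gathered
// samples after sorting, step = 100/4 = 25 and the dividers become
// samples[25], samples[50], samples[75] -- three cut points that partition
// the sorted values into four roughly equal buckets, one per block.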
template<class Block, class T, class Cmp>
struct SampleSort<Block,T,Cmp>::Exchanger
{
Exchanger(ValuesVector values_, ValuesVector samples_, const Cmp& cmp_):
values(values_), samples(samples_), cmp(cmp_) {}
void operator()(Block* b, const ReduceProxy& rp) const
{
if (rp.round() == 0)
{
// enqueue values to the correct locations
for (size_t i = 0; i < (b->*values).size(); ++i)
{
int to = std::lower_bound((b->*samples).begin(), (b->*samples).end(), (b->*values)[i], cmp) - (b->*samples).begin();
rp.enqueue(rp.out_link().target(to), (b->*values)[i]);
}
(b->*values).clear();
} else
{
dequeue_values((b->*values), rp, false);
std::sort((b->*values).begin(), (b->*values).end(), cmp);
}
}
ValuesVector values;
ValuesVector samples;
Cmp cmp;
};
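// Routing sketch (hypothetical dividers): with (b->*samples) == {10, 20},
// std::lower_bound sends a value of 7 to block 0, 15 to block 1, and 25 to
// block 2 -- i.e. `to` is the number of dividers strictly less than the
// value (for the default less-than comparison).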
}
}
#endif

@ -1,31 +0,0 @@
#ifndef VTKMDIY_BLOCK_TRAITS_HPP
#define VTKMDIY_BLOCK_TRAITS_HPP
#include "traits.hpp"
namespace diy
{
namespace detail
{
template<class F>
struct block_traits
{
typedef typename std::remove_pointer<typename function_traits<F>::template arg<0>::type>::type type;
};
// matches block member functions
template<class Block, class R, class... Args>
struct block_traits<R(Block::*)(Args...)>
{
typedef Block type;
};
template<class Block, class R, class... Args>
struct block_traits<R(Block::*)(Args...) const>
{
typedef Block type;
};
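// Deduction sketch (illustrative types, not part of the original source):
//   struct MyBlock {};
//   void cb(MyBlock* b);                       // plain callback
//   static_assert(std::is_same<block_traits<void(MyBlock*)>::type,
//                              MyBlock>::value, "first arg, pointer removed");
// Member-function callbacks (void (MyBlock::*)(...)) hit the specializations
// above and yield MyBlock directly.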
}
}
#endif

@ -1,131 +0,0 @@
namespace diy
{
namespace detail
{
struct CollectiveOp
{
virtual void init() =0;
virtual void update(const CollectiveOp& other) =0;
virtual void global(const mpi::communicator& comm) =0;
virtual void copy_from(const CollectiveOp& other) =0;
virtual void result_out(void* dest) const =0;
virtual ~CollectiveOp() {}
};
template<class T, class Op>
struct AllReduceOp: public CollectiveOp
{
AllReduceOp(const T& x, Op op):
in_(x), op_(op) {}
void init() { out_ = in_; }
void update(const CollectiveOp& other) { out_ = op_(out_, static_cast<const AllReduceOp&>(other).in_); }
void global(const mpi::communicator& comm) { T res; mpi::all_reduce(comm, out_, res, op_); out_ = res; }
void copy_from(const CollectiveOp& other) { out_ = static_cast<const AllReduceOp&>(other).out_; }
void result_out(void* dest) const { *reinterpret_cast<T*>(dest) = out_; }
private:
T in_, out_;
Op op_;
};
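// Contract sketch (assumed values): two local blocks contribute 3 and 4
// through AllReduceOp<int, std::plus<int>>. The master calls init() on the
// first op (out_ = 3), update() with the second (out_ = 3 + 4 = 7), and only
// then global() performs the single MPI all_reduce over the per-rank partial
// results.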
template<class T>
struct Scratch: public CollectiveOp
{
Scratch(const T& x):
x_(x) {}
void init() {}
void update(const CollectiveOp&) {}
void global(const mpi::communicator&) {}
void copy_from(const CollectiveOp&) {}
void result_out(void* dest) const { *reinterpret_cast<T*>(dest) = x_; }
private:
T x_;
};
}
struct Master::Collective
{
Collective():
cop_(0) {}
Collective(detail::CollectiveOp* cop):
cop_(cop) {}
Collective(Collective&& other):
cop_(0) { swap(other); }
~Collective() { delete cop_; }
Collective& operator=(const Collective& other) = delete;
Collective(Collective& other) = delete;
void init() { cop_->init(); }
void swap(Collective& other) { std::swap(cop_, other.cop_); }
void update(const Collective& other) { cop_->update(*other.cop_); }
void global(const mpi::communicator& c) { cop_->global(c); }
void copy_from(Collective& other) const { cop_->copy_from(*other.cop_); }
void result_out(void* x) const { cop_->result_out(x); }
detail::CollectiveOp* cop_;
};
struct Master::CollectivesList: public std::list<Collective>
{};
struct Master::CollectivesMap: public std::map<int, CollectivesList>
{};
}
diy::Master::CollectivesMap&
diy::Master::
collectives()
{
return *collectives_;
}
diy::Master::CollectivesList&
diy::Master::
collectives(int gid__)
{
return (*collectives_)[gid__];
}
void
diy::Master::
process_collectives()
{
auto scoped = prof.scoped("collectives");
DIY_UNUSED(scoped);
if (collectives().empty())
return;
using CollectivesIterator = CollectivesList::iterator;
std::vector<CollectivesIterator> iters;
std::vector<int> gids;
for (auto& x : collectives())
{
gids.push_back(x.first);
iters.push_back(x.second.begin());
}
while (iters[0] != collectives().begin()->second.end())
{
iters[0]->init();
for (unsigned j = 1; j < iters.size(); ++j)
{
// NB: this assumes that the operations are commutative
iters[0]->update(*iters[j]);
}
iters[0]->global(comm_); // do the mpi collective
for (unsigned j = 1; j < iters.size(); ++j)
{
iters[j]->copy_from(*iters[0]);
++iters[j];
}
++iters[0];
}
}

@ -1,22 +0,0 @@
namespace diy
{
struct Master::BaseCommand
{
virtual ~BaseCommand() {} // to delete derived classes
virtual void execute(void* b, const ProxyWithLink& cp) const =0;
virtual bool skip(int i, const Master& master) const =0;
};
template<class Block>
struct Master::Command: public BaseCommand
{
Command(Callback<Block> f_, const Skip& s_):
f(f_), s(s_) {}
void execute(void* b, const ProxyWithLink& cp) const override { f(static_cast<Block*>(b), cp); }
bool skip(int i, const Master& m) const override { return s(i,m); }
Callback<Block> f;
Skip s;
};
}

@ -1,202 +0,0 @@
namespace diy
{
struct Master::tags { enum { queue, piece }; };
struct Master::MessageInfo
{
int from, to;
int round;
};
struct Master::InFlightSend
{
std::shared_ptr<MemoryBuffer> message;
mpi::request request;
MessageInfo info; // for debug purposes
};
struct Master::InFlightRecv
{
MemoryBuffer message;
MessageInfo info { -1, -1, -1 };
bool done = false;
inline void recv(mpi::communicator& comm, const mpi::status& status);
inline void place(IncomingRound* in, bool unload, ExternalStorage* storage, IExchangeInfo* iexchange);
void reset() { *this = InFlightRecv(); }
};
struct Master::InFlightRecvsMap: public std::map<int, InFlightRecv>
{};
struct Master::InFlightSendsList: public std::list<InFlightSend>
{};
struct Master::GidSendOrder
{
size_t size() const { return list.size(); }
bool empty() const { return list.empty(); }
int pop() { int x = list.front(); list.pop_front(); return x; }
std::list<int> list;
size_t limit = 0;
};
struct Master::IExchangeInfo
{
IExchangeInfo():
n(0) {}
IExchangeInfo(size_t n_, mpi::communicator comm_):
n(n_),
comm(comm_),
global_work_(new mpi::window<int>(comm, 1)) { global_work_->lock_all(MPI_MODE_NOCHECK); }
~IExchangeInfo() { global_work_->unlock_all(); }
inline void not_done(int gid);
inline int global_work(); // get global work status (for debugging)
inline bool all_done(); // get global all done status
inline void reset_work(); // reset global work counter
inline int add_work(int work); // add work to global work counter
int inc_work() { return add_work(1); } // increment global work counter
int dec_work() { return add_work(-1); } // decrement global work counter
size_t n;
mpi::communicator comm;
std::unordered_map<int, bool> done; // gid -> done
std::unique_ptr<mpi::window<int>> global_work_; // global work to do
std::shared_ptr<spd::logger> log = get_logger();
};
// VectorWindow is used to send and receive subsets of a contiguous array in-place
namespace detail
{
template <typename T>
struct VectorWindow
{
T *begin;
size_t count;
};
} // namespace detail
namespace mpi
{
namespace detail
{
template<typename T> struct is_mpi_datatype< diy::detail::VectorWindow<T> > { typedef true_type type; };
template <typename T>
struct mpi_datatype< diy::detail::VectorWindow<T> >
{
using VecWin = diy::detail::VectorWindow<T>;
static MPI_Datatype datatype() { return get_mpi_datatype<T>(); }
static const void* address(const VecWin& x) { return x.begin; }
static void* address(VecWin& x) { return x.begin; }
static int count(const VecWin& x) { return static_cast<int>(x.count); }
};
}
} // namespace mpi::detail
} // namespace diy
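// Usage sketch for VectorWindow (hypothetical buffer): receive 256 bytes
// in-place at offset 512 of an existing buffer, avoiding a staging copy:
//   std::vector<char> buf(1024);
//   diy::detail::VectorWindow<char> win;
//   win.begin = &buf[512];
//   win.count = 256;
//   // comm.recv(source, tag, win);   // uses the mpi_datatype traits above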
void
diy::Master::IExchangeInfo::
not_done(int gid)
{
if (done[gid])
{
done[gid] = false;
int work = inc_work();
log->debug("[{}] Incrementing work when switching done (on receipt): work = {}\n", gid, work);
} else
log->debug("[{}] Not done, no need to increment work\n", gid);
}
diy::Master::InFlightRecv&
diy::Master::
inflight_recv(int proc)
{
return (*inflight_recvs_)[proc];
}
diy::Master::InFlightSendsList&
diy::Master::inflight_sends()
{
return *inflight_sends_;
}
// receive message described by status
void
diy::Master::InFlightRecv::
recv(mpi::communicator& comm, const mpi::status& status)
{
if (info.from == -1) // uninitialized
{
MemoryBuffer bb;
comm.recv(status.source(), status.tag(), bb.buffer);
if (status.tag() == tags::piece) // first piece is the header
{
size_t msg_size;
diy::load(bb, msg_size);
diy::load(bb, info);
message.buffer.reserve(msg_size);
}
else // tags::queue
{
diy::load_back(bb, info);
message.swap(bb);
}
}
else
{
size_t start_idx = message.buffer.size();
size_t count = status.count<char>();
message.buffer.resize(start_idx + count);
detail::VectorWindow<char> window;
window.begin = &message.buffer[start_idx];
window.count = count;
comm.recv(status.source(), status.tag(), window);
}
if (status.tag() == tags::queue)
done = true;
}
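// Wire-protocol summary (as implemented above): a message that needs
// splitting arrives as a tags::piece header carrying (msg_size, info),
// followed by payload chunks received in-place via VectorWindow; the chunk
// tagged tags::queue is the last one and marks the receive done. A message
// that fits in one piece arrives as a single tags::queue buffer with info
// serialized at its back.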
// once the InFlightRecv is done, place it either out of core or in the appropriate incoming queue
void
diy::Master::InFlightRecv::
place(IncomingRound* in, bool unload, ExternalStorage* storage, IExchangeInfo* iexchange)
{
size_t size = message.size();
int from = info.from;
int to = info.to;
int external = -1;
if (unload)
{
get_logger()->debug("Directly unloading queue {} <- {}", to, from);
external = storage->put(message); // unload directly
}
else if (!iexchange)
{
in->map[to].queues[from].swap(message);
in->map[to].queues[from].reset(); // buffer position = 0
}
else // iexchange
{
auto log = get_logger();
iexchange->not_done(to);
in->map[to].queues[from].append_binary(&message.buffer[0], message.size()); // append instead of overwriting
int work = iexchange->dec_work();
log->debug("[{}] Decrementing work after receiving: work = {}\n", to, work);
}
in->map[to].records[from] = QueueRecord(size, external);
++(in->received);
}

@ -1,157 +0,0 @@
struct diy::Master::ProcessBlock
{
ProcessBlock(Master& master_,
const std::deque<int>& blocks__,
int local_limit_,
critical_resource<int>& idx_):
master(master_),
blocks(blocks__),
local_limit(local_limit_),
idx(idx_)
{}
ProcessBlock(const ProcessBlock&) = delete;
ProcessBlock(ProcessBlock&&) = default;
void operator()()
{
master.log->debug("Processing with thread: {}", this_thread::get_id());
std::vector<int> local;
do
{
int cur = (*idx.access())++;
if ((size_t)cur >= blocks.size())
return;
int i = blocks[cur];
if (master.block(i))
{
if (local.size() == (size_t)local_limit)
master.unload(local);
local.push_back(i);
}
master.log->debug("Processing block: {}", master.gid(i));
bool skip = all_skip(i);
IncomingQueuesMap &current_incoming = master.incoming_[master.exchange_round_].map;
if (master.block(i) == 0) // block unloaded
{
if (skip)
master.load_queues(i); // even though we are skipping the block, the queues might be necessary
else
{
if (local.size() == (size_t)local_limit) // reached the local limit
master.unload(local);
master.load(i);
local.push_back(i);
}
}
for (auto& cmd : master.commands_)
{
cmd->execute(skip ? 0 : master.block(i), master.proxy(i));
// no longer need them, so get rid of them
current_incoming[master.gid(i)].queues.clear();
current_incoming[master.gid(i)].records.clear();
}
if (skip && master.block(i) == 0)
master.unload_queues(i); // even though we are skipping the block, the queues might be necessary
} while(true);
}
bool all_skip(int i) const
{
bool skip = true;
for (auto& cmd : master.commands_)
{
if (!cmd->skip(i, master))
{
skip = false;
break;
}
}
return skip;
}
Master& master;
const std::deque<int>& blocks;
int local_limit;
critical_resource<int>& idx;
};
void
diy::Master::
execute()
{
log->debug("Entered execute()");
auto scoped = prof.scoped("execute");
DIY_UNUSED(scoped);
//show_incoming_records();
// touch the outgoing and incoming queues as well as collectives to make sure they exist
for (unsigned i = 0; i < size(); ++i)
{
outgoing(gid(i));
incoming(gid(i)); // implicitly touches queue records
collectives(gid(i));
}
if (commands_.empty())
return;
// Order the blocks, so the loaded ones come first
std::deque<int> blocks;
for (unsigned i = 0; i < size(); ++i)
if (block(i) == 0)
blocks.push_back(i);
else
blocks.push_front(i);
// don't use more threads than we can have blocks in memory
int num_threads;
int blocks_per_thread;
if (limit_ == -1)
{
num_threads = threads_;
blocks_per_thread = size();
}
else
{
num_threads = std::min(threads_, limit_);
blocks_per_thread = limit_/num_threads;
}
// idx is shared
critical_resource<int> idx(0);
if (num_threads > 1)
{
// launch the threads
std::list<thread> threads;
for (unsigned i = 0; i < (unsigned)num_threads; ++i)
threads.emplace_back(ProcessBlock(*this, blocks, blocks_per_thread, idx));
for (auto& t : threads)
t.join();
} else
{
ProcessBlock(*this, blocks, blocks_per_thread, idx)();
}
// clear incoming queues
incoming_[exchange_round_].map.clear();
if (limit() != -1 && in_memory() > limit())
throw std::runtime_error(fmt::format("Fatal: {} blocks in memory, with limit {}", in_memory(), limit()));
// clear commands
commands_.clear();
}

@ -1,168 +0,0 @@
#ifndef VTKMDIY_DETAIL_ALL_TO_ALL_HPP
#define VTKMDIY_DETAIL_ALL_TO_ALL_HPP
#include "../block_traits.hpp"
namespace diy
{
namespace detail
{
template<class Op>
struct AllToAllReduce
{
using Block = typename block_traits<Op>::type;
AllToAllReduce(const Op& op_, const Assigner& assigner):
op(op_)
{
for (int gid = 0; gid < assigner.nblocks(); ++gid)
{
BlockID nbr = { gid, assigner.rank(gid) };
all_neighbors_link.add_neighbor(nbr);
}
}
void operator()(Block* b, const ReduceProxy& srp, const RegularSwapPartners& partners) const
{
int k_in = srp.in_link().size();
int k_out = srp.out_link().size();
if (k_in == 0 && k_out == 0) // special case of a single block
{
ReduceProxy all_srp_out(srp, srp.block(), 0, srp.assigner(), empty_link, all_neighbors_link);
ReduceProxy all_srp_in (srp, srp.block(), 1, srp.assigner(), all_neighbors_link, empty_link);
op(b, all_srp_out);
MemoryBuffer& in_queue = all_srp_in.incoming(all_srp_in.in_link().target(0).gid);
in_queue.swap(all_srp_out.outgoing(all_srp_out.out_link().target(0)));
in_queue.reset();
op(b, all_srp_in);
return;
}
if (k_in == 0) // initial round
{
ReduceProxy all_srp(srp, srp.block(), 0, srp.assigner(), empty_link, all_neighbors_link);
op(b, all_srp);
Master::OutgoingQueues all_queues;
all_queues.swap(*all_srp.outgoing()); // clears out the queues and stores them locally
// enqueue outgoing
int group = all_srp.out_link().size() / k_out;
for (int i = 0; i < k_out; ++i)
{
std::pair<int,int> range(i*group, (i+1)*group);
srp.enqueue(srp.out_link().target(i), range);
for (int j = i*group; j < (i+1)*group; ++j)
{
int from = srp.gid();
int to = all_srp.out_link().target(j).gid;
srp.enqueue(srp.out_link().target(i), std::make_pair(from, to));
srp.enqueue(srp.out_link().target(i), all_queues[all_srp.out_link().target(j)]);
}
}
} else if (k_out == 0) // final round
{
// dequeue incoming + reorder into the correct order
ReduceProxy all_srp(srp, srp.block(), 1, srp.assigner(), all_neighbors_link, empty_link);
Master::IncomingQueues all_incoming;
all_incoming.swap(*srp.incoming());
std::pair<int, int> range; // all the ranges should be the same
for (int i = 0; i < k_in; ++i)
{
int gid_in = srp.in_link().target(i).gid;
MemoryBuffer& in = all_incoming[gid_in];
load(in, range);
while(in)
{
std::pair<int, int> from_to;
load(in, from_to);
load(in, all_srp.incoming(from_to.first));
all_srp.incoming(from_to.first).reset();
}
}
op(b, all_srp);
} else // intermediate round: reshuffle queues
{
// add up buffer sizes
std::vector<size_t> sizes_out(k_out, sizeof(std::pair<int,int>));
std::pair<int, int> range; // all the ranges should be the same
for (int i = 0; i < k_in; ++i)
{
MemoryBuffer& in = srp.incoming(srp.in_link().target(i).gid);
load(in, range);
int group = (range.second - range.first)/k_out;
std::pair<int, int> from_to;
size_t s;
while(in)
{
diy::load(in, from_to);
diy::load(in, s);
int j = (from_to.second - range.first) / group;
sizes_out[j] += s + sizeof(size_t) + sizeof(std::pair<int,int>);
in.skip(s);
}
in.reset();
}
// reserve outgoing buffers of correct size
int group = (range.second - range.first)/k_out;
for (int i = 0; i < k_out; ++i)
{
MemoryBuffer& out = srp.outgoing(srp.out_link().target(i));
out.reserve(sizes_out[i]);
std::pair<int, int> out_range;
out_range.first = range.first + group*i;
out_range.second = range.first + group*(i+1);
save(out, out_range);
}
// re-direct the queues
for (int i = 0; i < k_in; ++i)
{
MemoryBuffer& in = srp.incoming(srp.in_link().target(i).gid);
load(in, range);
std::pair<int, int> from_to;
while(in)
{
load(in, from_to);
int j = (from_to.second - range.first) / group;
MemoryBuffer& out = srp.outgoing(srp.out_link().target(j));
save(out, from_to);
MemoryBuffer::copy(in, out);
}
}
}
}
const Op& op;
Link all_neighbors_link, empty_link;
};
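// Routing sketch (assumed sizes): with nblocks = 8 and k = 2, the initial
// round has k_out == 2 while all_srp covers all 8 targets, so
// group = 8 / 2 = 4: target 0 carries the queues destined for gids [0,4)
// and target 1 those for [4,8). In an intermediate round with range (0,4)
// and k_out = 2, group = 2 and a queue addressed to gid 3 is forwarded to
// j = (3 - 0) / 2 = 1.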
struct SkipIntermediate
{
SkipIntermediate(size_t rounds_):
rounds(rounds_) {}
bool operator()(int round, int, const Master&) const { return round != 0 && round != (int) rounds; }
size_t rounds;
};
}
}
#endif

@ -1,318 +0,0 @@
//--------------------------------------
// utils/traits: Additional type traits
//--------------------------------------
//
// Copyright kennytm (auraHT Ltd.) 2011.
// Distributed under the Boost Software License, Version 1.0.
// (See accompanying file doc/LICENSE_1_0.txt or copy at
// http://www.boost.org/LICENSE_1_0.txt)
/**
``<utils/traits.hpp>`` --- Additional type traits
=================================================
This module provides additional type traits and related functions, missing from
the standard library.
*/
#ifndef VTKMDIY_UTILS_TRAITS_HPP
#define VTKMDIY_UTILS_TRAITS_HPP
#include <cstdlib>
#include <tuple>
#include <functional>
#include <type_traits>
namespace diy
{
namespace detail {
/**
.. macro:: DECLARE_HAS_TYPE_MEMBER(member_name)
This macro declares a template ``has_member_name`` which will check whether
a type member ``member_name`` exists in a particular type.
Example::
DECLARE_HAS_TYPE_MEMBER(result_type)
...
printf("%d\n", has_result_type< std::plus<int> >::value);
// ^ prints '1' (true)
printf("%d\n", has_result_type< double(*)() >::value);
// ^ prints '0' (false)
*/
#define DECLARE_HAS_TYPE_MEMBER(member_name) \
template <typename, typename = void> \
struct has_##member_name \
{ enum { value = false }; }; \
template <typename T> \
struct has_##member_name<T, typename std::enable_if<sizeof(typename T::member_name)||true>::type> \
{ enum { value = true }; };
/**
.. type:: struct utils::function_traits<F>
Obtain compile-time information about a function object *F*.
This template currently supports the following types:
* Normal function types (``R(T...)``), function pointers (``R(*)(T...)``)
and function references (``R(&)(T...)`` and ``R(&&)(T...)``).
* Member functions (``R(C::*)(T...)``)
* ``std::function<F>``
* Type of lambda functions, and any other type that has a unique
``operator()``.
* Type of ``std::mem_fn`` (only for GCC's libstdc++ and LLVM's libc++).
Following the C++ spec, the first argument will be a raw pointer.
*/
template <typename T>
struct function_traits
: public function_traits<decltype(&T::operator())>
{};
namespace xx_impl
{
template <typename C, typename R, typename... A>
struct memfn_type
{
typedef typename std::conditional<
std::is_const<C>::value,
typename std::conditional<
std::is_volatile<C>::value,
R (C::*)(A...) const volatile,
R (C::*)(A...) const
>::type,
typename std::conditional<
std::is_volatile<C>::value,
R (C::*)(A...) volatile,
R (C::*)(A...)
>::type
>::type type;
};
}
template <typename ReturnType, typename... Args>
struct function_traits<ReturnType(Args...)>
{
/**
.. type:: type result_type
The type returned by calling an instance of the function object type *F*.
*/
typedef ReturnType result_type;
/**
.. type:: type function_type
The function type (``R(T...)``).
*/
typedef ReturnType function_type(Args...);
/**
.. type:: type member_function_type<OwnerType>
The member function type for an *OwnerType* (``R(OwnerType::*)(T...)``).
*/
template <typename OwnerType>
using member_function_type = typename xx_impl::memfn_type<
typename std::remove_pointer<typename std::remove_reference<OwnerType>::type>::type,
ReturnType, Args...
>::type;
/**
.. data:: static const size_t arity
Number of arguments the function object will take.
*/
enum { arity = sizeof...(Args) };
/**
.. type:: type arg<n>::type
The type of the *n*-th argument.
*/
template <size_t i>
struct arg
{
typedef typename std::tuple_element<i, std::tuple<Args...>>::type type;
};
};
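// Illustrative checks (assumed example, not part of the original source):
typedef function_traits<int(double, char)> example_function_traits;
static_assert(std::is_same<example_function_traits::result_type, int>::value, "result_type is the return type");
static_assert(example_function_traits::arity == 2, "arity counts the arguments");
static_assert(std::is_same<example_function_traits::arg<1>::type, char>::value, "arg<n> indexes the argument list");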
template <typename ReturnType, typename... Args>
struct function_traits<ReturnType(*)(Args...)>
: public function_traits<ReturnType(Args...)>
{};
template <typename ClassType, typename ReturnType, typename... Args>
struct function_traits<ReturnType(ClassType::*)(Args...)>
: public function_traits<ReturnType(Args...)>
{
typedef ClassType& owner_type;
};
template <typename ClassType, typename ReturnType, typename... Args>
struct function_traits<ReturnType(ClassType::*)(Args...) const>
: public function_traits<ReturnType(Args...)>
{
typedef const ClassType& owner_type;
};
template <typename ClassType, typename ReturnType, typename... Args>
struct function_traits<ReturnType(ClassType::*)(Args...) volatile>
: public function_traits<ReturnType(Args...)>
{
typedef volatile ClassType& owner_type;
};
template <typename ClassType, typename ReturnType, typename... Args>
struct function_traits<ReturnType(ClassType::*)(Args...) const volatile>
: public function_traits<ReturnType(Args...)>
{
typedef const volatile ClassType& owner_type;
};
template <typename FunctionType>
struct function_traits<std::function<FunctionType>>
: public function_traits<FunctionType>
{};
#if defined(_GLIBCXX_FUNCTIONAL)
#define MEM_FN_SYMBOL_XX0SL7G4Z0J std::_Mem_fn
#elif defined(_LIBCPP_FUNCTIONAL)
#define MEM_FN_SYMBOL_XX0SL7G4Z0J std::__mem_fn
#endif
#ifdef MEM_FN_SYMBOL_XX0SL7G4Z0J
template <typename R, typename C>
struct function_traits<MEM_FN_SYMBOL_XX0SL7G4Z0J<R C::*>>
: public function_traits<R(C*)>
{};
template <typename R, typename C, typename... A>
struct function_traits<MEM_FN_SYMBOL_XX0SL7G4Z0J<R(C::*)(A...)>>
: public function_traits<R(C*, A...)>
{};
template <typename R, typename C, typename... A>
struct function_traits<MEM_FN_SYMBOL_XX0SL7G4Z0J<R(C::*)(A...) const>>
: public function_traits<R(const C*, A...)>
{};
template <typename R, typename C, typename... A>
struct function_traits<MEM_FN_SYMBOL_XX0SL7G4Z0J<R(C::*)(A...) volatile>>
: public function_traits<R(volatile C*, A...)>
{};
template <typename R, typename C, typename... A>
struct function_traits<MEM_FN_SYMBOL_XX0SL7G4Z0J<R(C::*)(A...) const volatile>>
: public function_traits<R(const volatile C*, A...)>
{};
#undef MEM_FN_SYMBOL_XX0SL7G4Z0J
#endif
template <typename T>
struct function_traits<T&> : public function_traits<T> {};
template <typename T>
struct function_traits<const T&> : public function_traits<T> {};
template <typename T>
struct function_traits<volatile T&> : public function_traits<T> {};
template <typename T>
struct function_traits<const volatile T&> : public function_traits<T> {};
template <typename T>
struct function_traits<T&&> : public function_traits<T> {};
template <typename T>
struct function_traits<const T&&> : public function_traits<T> {};
template <typename T>
struct function_traits<volatile T&&> : public function_traits<T> {};
template <typename T>
struct function_traits<const volatile T&&> : public function_traits<T> {};
#define FORWARD_RES_8QR485JMSBT \
typename std::conditional< \
std::is_lvalue_reference<R>::value, \
T&, \
typename std::remove_reference<T>::type&& \
>::type
/**
.. function:: auto utils::forward_like<Like, T>(T&& t) noexcept
Forward the reference *t* like the type of *Like*. That means, if *Like* is
an lvalue (reference), this function will return an lvalue reference of *t*.
Otherwise, if *Like* is an rvalue, this function will return an rvalue
reference of *t*.
This is mainly used to propagate the expression category (lvalue/rvalue) of
a member of *Like*, generalizing ``std::forward``.
*/
template <typename R, typename T>
FORWARD_RES_8QR485JMSBT forward_like(T&& input) noexcept
{
return static_cast<FORWARD_RES_8QR485JMSBT>(input);
}
#undef FORWARD_RES_8QR485JMSBT
/**
.. type:: struct utils::copy_cv<From, To>
Copy the CV qualifier between the two types. For example,
``utils::copy_cv<const int, double>::type`` will become ``const double``.
*/
template <typename From, typename To>
struct copy_cv
{
private:
typedef typename std::remove_cv<To>::type raw_To;
typedef typename std::conditional<std::is_const<From>::value,
const raw_To, raw_To>::type const_raw_To;
public:
/**
.. type:: type type
Result of cv-copying.
*/
typedef typename std::conditional<std::is_volatile<From>::value,
volatile const_raw_To, const_raw_To>::type type;
};
/**
.. type:: struct utils::pointee<T>
Returns the type obtained by dereferencing an instance of *T*. This is a
generalization of ``std::remove_pointer``, in that it also works with iterators.
*/
template <typename T>
struct pointee
{
/**
.. type:: type type
Result of dereferencing.
*/
typedef typename std::remove_reference<decltype(*std::declval<T>())>::type type;
};
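// e.g. (illustrative): pointee<std::vector<int>::iterator>::type is int,
// and pointee<const char*>::type is const char -- the iterator case is what
// std::remove_pointer cannot express.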
/**
.. function:: std::add_rvalue_reference<T>::type utils::rt_val<T>() noexcept
Returns a value of type *T*. It is guaranteed to do nothing and will not
throw a compile-time error, but using the returned result will cause
undefined behavior.
*/
template <typename T>
typename std::add_rvalue_reference<T>::type rt_val() noexcept
{
return std::move(*static_cast<T*>(nullptr));
}
}
}
#endif

@ -1,535 +0,0 @@
/*
Formatting library for C++
Copyright (c) 2012 - 2016, Victor Zverovich
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
#include "format.h"
#include <string.h>
#include <cctype>
#include <cerrno>
#include <climits>
#include <cmath>
#include <cstdarg>
#include <cstddef> // for std::ptrdiff_t
#if defined(_WIN32) && defined(__MINGW32__)
# include <cstring>
#endif
#if FMT_USE_WINDOWS_H
# if !defined(FMT_HEADER_ONLY) && !defined(WIN32_LEAN_AND_MEAN)
# define WIN32_LEAN_AND_MEAN
# endif
# if defined(NOMINMAX) || defined(FMT_WIN_MINMAX)
# include <windows.h>
# else
# define NOMINMAX
# include <windows.h>
# undef NOMINMAX
# endif
#endif
#if FMT_EXCEPTIONS
# define FMT_TRY try
# define FMT_CATCH(x) catch (x)
#else
# define FMT_TRY if (true)
# define FMT_CATCH(x) if (false)
#endif
#ifdef _MSC_VER
# pragma warning(push)
# pragma warning(disable: 4127) // conditional expression is constant
# pragma warning(disable: 4702) // unreachable code
// Disable deprecation warning for strerror. The latter is not called but
// MSVC fails to detect it.
# pragma warning(disable: 4996)
#endif
// Dummy implementations of strerror_r and strerror_s called if corresponding
// system functions are not available.
static inline fmt::internal::Null<> strerror_r(int, char *, ...) {
return fmt::internal::Null<>();
}
static inline fmt::internal::Null<> strerror_s(char *, std::size_t, ...) {
return fmt::internal::Null<>();
}
namespace fmt {
FMT_FUNC internal::RuntimeError::~RuntimeError() FMT_DTOR_NOEXCEPT {}
FMT_FUNC FormatError::~FormatError() FMT_DTOR_NOEXCEPT {}
FMT_FUNC SystemError::~SystemError() FMT_DTOR_NOEXCEPT {}
namespace {
#ifndef _MSC_VER
# define FMT_SNPRINTF snprintf
#else // _MSC_VER
inline int fmt_snprintf(char *buffer, size_t size, const char *format, ...) {
va_list args;
va_start(args, format);
int result = vsnprintf_s(buffer, size, _TRUNCATE, format, args);
va_end(args);
return result;
}
# define FMT_SNPRINTF fmt_snprintf
#endif // _MSC_VER
#if defined(_WIN32) && defined(__MINGW32__) && !defined(__NO_ISOCEXT)
# define FMT_SWPRINTF snwprintf
#else
# define FMT_SWPRINTF swprintf
#endif // defined(_WIN32) && defined(__MINGW32__) && !defined(__NO_ISOCEXT)
const char RESET_COLOR[] = "\x1b[0m";
typedef void (*FormatFunc)(Writer &, int, StringRef);
// Portable thread-safe version of strerror.
// Sets buffer to point to a string describing the error code.
// This can be either a pointer to a string stored in buffer,
// or a pointer to some static immutable string.
// Returns one of the following values:
// 0 - success
// ERANGE - buffer is not large enough to store the error message
// other - failure
// Buffer should be at least of size 1.
int safe_strerror(
int error_code, char *&buffer, std::size_t buffer_size) FMT_NOEXCEPT {
FMT_ASSERT(buffer != 0 && buffer_size != 0, "invalid buffer");
class StrError {
private:
int error_code_;
char *&buffer_;
std::size_t buffer_size_;
// A noop assignment operator to avoid bogus warnings.
void operator=(const StrError &) {}
// Handle the result of XSI-compliant version of strerror_r.
int handle(int result) {
// glibc versions before 2.13 return result in errno.
return result == -1 ? errno : result;
}
// Handle the result of GNU-specific version of strerror_r.
int handle(char *message) {
// If the buffer is full then the message is probably truncated.
if (message == buffer_ && strlen(buffer_) == buffer_size_ - 1)
return ERANGE;
buffer_ = message;
return 0;
}
// Handle the case when strerror_r is not available.
int handle(internal::Null<>) {
return fallback(strerror_s(buffer_, buffer_size_, error_code_));
}
// Fallback to strerror_s when strerror_r is not available.
int fallback(int result) {
// If the buffer is full then the message is probably truncated.
return result == 0 && strlen(buffer_) == buffer_size_ - 1 ?
ERANGE : result;
}
// Fallback to strerror if strerror_r and strerror_s are not available.
int fallback(internal::Null<>) {
errno = 0;
buffer_ = strerror(error_code_);
return errno;
}
public:
StrError(int err_code, char *&buf, std::size_t buf_size)
: error_code_(err_code), buffer_(buf), buffer_size_(buf_size) {}
int run() {
// Suppress a warning about unused strerror_r.
strerror_r(0, FMT_NULL, "");
return handle(strerror_r(error_code_, buffer_, buffer_size_));
}
};
return StrError(error_code, buffer, buffer_size).run();
}
void format_error_code(Writer &out, int error_code,
StringRef message) FMT_NOEXCEPT {
// Report error code making sure that the output fits into
// INLINE_BUFFER_SIZE to avoid dynamic memory allocation and potential
// bad_alloc.
out.clear();
static const char SEP[] = ": ";
static const char ERROR_STR[] = "error ";
// Subtract 2 to account for terminating null characters in SEP and ERROR_STR.
std::size_t error_code_size = sizeof(SEP) + sizeof(ERROR_STR) - 2;
typedef internal::IntTraits<int>::MainType MainType;
MainType abs_value = static_cast<MainType>(error_code);
if (internal::is_negative(error_code)) {
abs_value = 0 - abs_value;
++error_code_size;
}
error_code_size += internal::count_digits(abs_value);
if (message.size() <= internal::INLINE_BUFFER_SIZE - error_code_size)
out << message << SEP;
out << ERROR_STR << error_code;
assert(out.size() <= internal::INLINE_BUFFER_SIZE);
}
void report_error(FormatFunc func, int error_code,
StringRef message) FMT_NOEXCEPT {
MemoryWriter full_message;
func(full_message, error_code, message);
// Use Writer::data instead of Writer::c_str to avoid potential memory
// allocation.
std::fwrite(full_message.data(), full_message.size(), 1, stderr);
std::fputc('\n', stderr);
}
} // namespace
FMT_FUNC void SystemError::init(
int err_code, CStringRef format_str, ArgList args) {
error_code_ = err_code;
MemoryWriter w;
format_system_error(w, err_code, format(format_str, args));
std::runtime_error &base = *this;
base = std::runtime_error(w.str());
}
template <typename T>
int internal::CharTraits<char>::format_float(
char *buffer, std::size_t size, const char *format,
unsigned width, int precision, T value) {
if (width == 0) {
return precision < 0 ?
FMT_SNPRINTF(buffer, size, format, value) :
FMT_SNPRINTF(buffer, size, format, precision, value);
}
return precision < 0 ?
FMT_SNPRINTF(buffer, size, format, width, value) :
FMT_SNPRINTF(buffer, size, format, width, precision, value);
}
template <typename T>
int internal::CharTraits<wchar_t>::format_float(
wchar_t *buffer, std::size_t size, const wchar_t *format,
unsigned width, int precision, T value) {
if (width == 0) {
return precision < 0 ?
FMT_SWPRINTF(buffer, size, format, value) :
FMT_SWPRINTF(buffer, size, format, precision, value);
}
return precision < 0 ?
FMT_SWPRINTF(buffer, size, format, width, value) :
FMT_SWPRINTF(buffer, size, format, width, precision, value);
}
template <typename T>
const char internal::BasicData<T>::DIGITS[] =
"0001020304050607080910111213141516171819"
"2021222324252627282930313233343536373839"
"4041424344454647484950515253545556575859"
"6061626364656667686970717273747576777879"
"8081828384858687888990919293949596979899";
#define FMT_POWERS_OF_10(factor) \
factor * 10, \
factor * 100, \
factor * 1000, \
factor * 10000, \
factor * 100000, \
factor * 1000000, \
factor * 10000000, \
factor * 100000000, \
factor * 1000000000
template <typename T>
const uint32_t internal::BasicData<T>::POWERS_OF_10_32[] = {
0, FMT_POWERS_OF_10(1)
};
template <typename T>
const uint64_t internal::BasicData<T>::POWERS_OF_10_64[] = {
0,
FMT_POWERS_OF_10(1),
FMT_POWERS_OF_10(ULongLong(1000000000)),
// Multiply several constants instead of using a single long long constant
// to avoid warnings about C++98 not supporting long long.
ULongLong(1000000000) * ULongLong(1000000000) * 10
};
FMT_FUNC void internal::report_unknown_type(char code, const char *type) {
(void)type;
if (std::isprint(static_cast<unsigned char>(code))) {
FMT_THROW(FormatError(
format("unknown format code '{}' for {}", code, type)));
}
FMT_THROW(FormatError(
format("unknown format code '\\x{:02x}' for {}",
static_cast<unsigned>(code), type)));
}
#if FMT_USE_WINDOWS_H
FMT_FUNC internal::UTF8ToUTF16::UTF8ToUTF16(StringRef s) {
static const char ERROR_MSG[] = "cannot convert string from UTF-8 to UTF-16";
if (s.size() > INT_MAX)
FMT_THROW(WindowsError(ERROR_INVALID_PARAMETER, ERROR_MSG));
int s_size = static_cast<int>(s.size());
int length = MultiByteToWideChar(
CP_UTF8, MB_ERR_INVALID_CHARS, s.data(), s_size, FMT_NULL, 0);
if (length == 0)
FMT_THROW(WindowsError(GetLastError(), ERROR_MSG));
buffer_.resize(length + 1);
length = MultiByteToWideChar(
CP_UTF8, MB_ERR_INVALID_CHARS, s.data(), s_size, &buffer_[0], length);
if (length == 0)
FMT_THROW(WindowsError(GetLastError(), ERROR_MSG));
buffer_[length] = 0;
}
FMT_FUNC internal::UTF16ToUTF8::UTF16ToUTF8(WStringRef s) {
if (int error_code = convert(s)) {
FMT_THROW(WindowsError(error_code,
"cannot convert string from UTF-16 to UTF-8"));
}
}
FMT_FUNC int internal::UTF16ToUTF8::convert(WStringRef s) {
if (s.size() > INT_MAX)
return ERROR_INVALID_PARAMETER;
int s_size = static_cast<int>(s.size());
int length = WideCharToMultiByte(
CP_UTF8, 0, s.data(), s_size, FMT_NULL, 0, FMT_NULL, FMT_NULL);
if (length == 0)
return GetLastError();
buffer_.resize(length + 1);
length = WideCharToMultiByte(
CP_UTF8, 0, s.data(), s_size, &buffer_[0], length, FMT_NULL, FMT_NULL);
if (length == 0)
return GetLastError();
buffer_[length] = 0;
return 0;
}
FMT_FUNC void WindowsError::init(
int err_code, CStringRef format_str, ArgList args) {
error_code_ = err_code;
MemoryWriter w;
internal::format_windows_error(w, err_code, format(format_str, args));
std::runtime_error &base = *this;
base = std::runtime_error(w.str());
}
FMT_FUNC void internal::format_windows_error(
Writer &out, int error_code, StringRef message) FMT_NOEXCEPT {
FMT_TRY {
MemoryBuffer<wchar_t, INLINE_BUFFER_SIZE> buffer;
buffer.resize(INLINE_BUFFER_SIZE);
for (;;) {
wchar_t *system_message = &buffer[0];
int result = FormatMessageW(
FORMAT_MESSAGE_FROM_SYSTEM | FORMAT_MESSAGE_IGNORE_INSERTS,
FMT_NULL, error_code, MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT),
system_message, static_cast<uint32_t>(buffer.size()), FMT_NULL);
if (result != 0) {
UTF16ToUTF8 utf8_message;
if (utf8_message.convert(system_message) == ERROR_SUCCESS) {
out << message << ": " << utf8_message;
return;
}
break;
}
if (GetLastError() != ERROR_INSUFFICIENT_BUFFER)
break; // Can't get error message, report error code instead.
buffer.resize(buffer.size() * 2);
}
} FMT_CATCH(...) {}
fmt::format_error_code(out, error_code, message); // 'fmt::' is for bcc32.
}
#endif // FMT_USE_WINDOWS_H
FMT_FUNC void format_system_error(
Writer &out, int error_code, StringRef message) FMT_NOEXCEPT {
FMT_TRY {
internal::MemoryBuffer<char, internal::INLINE_BUFFER_SIZE> buffer;
buffer.resize(internal::INLINE_BUFFER_SIZE);
for (;;) {
char *system_message = &buffer[0];
int result = safe_strerror(error_code, system_message, buffer.size());
if (result == 0) {
out << message << ": " << system_message;
return;
}
if (result != ERANGE)
break; // Can't get error message, report error code instead.
buffer.resize(buffer.size() * 2);
}
} FMT_CATCH(...) {}
fmt::format_error_code(out, error_code, message); // 'fmt::' is for bcc32.
}
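// Minimal usage sketch (illustrative only; ENOENT requires <cerrno> and is an
// assumed example error code, not something this file references):
//
//     fmt::MemoryWriter w;
//     fmt::format_system_error(w, ENOENT, "cannot open file 'log.txt'");
//     // w.str() == "cannot open file 'log.txt': No such file or directory"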
template <typename Char>
void internal::ArgMap<Char>::init(const ArgList &args) {
if (!map_.empty())
return;
typedef internal::NamedArg<Char> NamedArg;
const NamedArg *named_arg = FMT_NULL;
bool use_values =
args.type(ArgList::MAX_PACKED_ARGS - 1) == internal::Arg::NONE;
if (use_values) {
for (unsigned i = 0;/*nothing*/; ++i) {
internal::Arg::Type arg_type = args.type(i);
switch (arg_type) {
case internal::Arg::NONE:
return;
case internal::Arg::NAMED_ARG:
named_arg = static_cast<const NamedArg*>(args.values_[i].pointer);
map_.push_back(Pair(named_arg->name, *named_arg));
break;
default:
/*nothing*/;
}
}
return;
}
for (unsigned i = 0; i != ArgList::MAX_PACKED_ARGS; ++i) {
internal::Arg::Type arg_type = args.type(i);
if (arg_type == internal::Arg::NAMED_ARG) {
named_arg = static_cast<const NamedArg*>(args.args_[i].pointer);
map_.push_back(Pair(named_arg->name, *named_arg));
}
}
for (unsigned i = ArgList::MAX_PACKED_ARGS;/*nothing*/; ++i) {
switch (args.args_[i].type) {
case internal::Arg::NONE:
return;
case internal::Arg::NAMED_ARG:
named_arg = static_cast<const NamedArg*>(args.args_[i].pointer);
map_.push_back(Pair(named_arg->name, *named_arg));
break;
default:
/*nothing*/;
}
}
}
template <typename Char>
void internal::FixedBuffer<Char>::grow(std::size_t) {
FMT_THROW(std::runtime_error("buffer overflow"));
}
FMT_FUNC internal::Arg internal::FormatterBase::do_get_arg(
unsigned arg_index, const char *&error) {
internal::Arg arg = args_[arg_index];
switch (arg.type) {
case internal::Arg::NONE:
error = "argument index out of range";
break;
case internal::Arg::NAMED_ARG:
arg = *static_cast<const internal::Arg*>(arg.pointer);
break;
default:
/*nothing*/;
}
return arg;
}
FMT_FUNC void report_system_error(
int error_code, fmt::StringRef message) FMT_NOEXCEPT {
// 'fmt::' is for bcc32.
report_error(format_system_error, error_code, message);
}
#if FMT_USE_WINDOWS_H
FMT_FUNC void report_windows_error(
int error_code, fmt::StringRef message) FMT_NOEXCEPT {
// 'fmt::' is for bcc32.
report_error(internal::format_windows_error, error_code, message);
}
#endif
FMT_FUNC void print(std::FILE *f, CStringRef format_str, ArgList args) {
MemoryWriter w;
w.write(format_str, args);
std::fwrite(w.data(), 1, w.size(), f);
}
FMT_FUNC void print(CStringRef format_str, ArgList args) {
print(stdout, format_str, args);
}
FMT_FUNC void print_colored(Color c, CStringRef format, ArgList args) {
char escape[] = "\x1b[30m";
escape[3] = static_cast<char>('0' + c);
std::fputs(escape, stdout);
print(format, args);
std::fputs(RESET_COLOR, stdout);
}
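// Usage sketch for the printing entry points above (the arguments are
// illustrative):
//
//     fmt::print("Hello, {}!\n", "world");               // stdout
//     fmt::print(stderr, "{} errors\n", 3);              // explicit FILE*
//     fmt::print_colored(fmt::RED, "fatal: {}\n", msg);  // ANSI color escape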
#ifndef FMT_HEADER_ONLY
template struct internal::BasicData<void>;
// Explicit instantiations for char.
template void internal::FixedBuffer<char>::grow(std::size_t);
template void internal::ArgMap<char>::init(const ArgList &args);
template FMT_API int internal::CharTraits<char>::format_float(
char *buffer, std::size_t size, const char *format,
unsigned width, int precision, double value);
template FMT_API int internal::CharTraits<char>::format_float(
char *buffer, std::size_t size, const char *format,
unsigned width, int precision, long double value);
// Explicit instantiations for wchar_t.
template void internal::FixedBuffer<wchar_t>::grow(std::size_t);
template void internal::ArgMap<wchar_t>::init(const ArgList &args);
template FMT_API int internal::CharTraits<wchar_t>::format_float(
wchar_t *buffer, std::size_t size, const wchar_t *format,
unsigned width, int precision, double value);
template FMT_API int internal::CharTraits<wchar_t>::format_float(
wchar_t *buffer, std::size_t size, const wchar_t *format,
unsigned width, int precision, long double value);
#endif // FMT_HEADER_ONLY
} // namespace fmt
#ifdef _MSC_VER
# pragma warning(pop)
#endif

File diff suppressed because it is too large

@ -1,35 +0,0 @@
/*
Formatting library for C++ - std::ostream support
Copyright (c) 2012 - 2016, Victor Zverovich
All rights reserved.
For the license information refer to format.h.
*/
#include "ostream.h"
namespace fmt {
namespace internal {
FMT_FUNC void write(std::ostream &os, Writer &w) {
const char *data = w.data();
typedef internal::MakeUnsigned<std::streamsize>::Type UnsignedStreamSize;
UnsignedStreamSize size = w.size();
UnsignedStreamSize max_size =
internal::to_unsigned((std::numeric_limits<std::streamsize>::max)());
do {
UnsignedStreamSize n = size <= max_size ? size : max_size;
os.write(data, static_cast<std::streamsize>(n));
data += n;
size -= n;
} while (size != 0);
}
}
FMT_FUNC void print(std::ostream &os, CStringRef format_str, ArgList args) {
MemoryWriter w;
w.write(format_str, args);
internal::write(os, w);
}
} // namespace fmt

@ -1,105 +0,0 @@
/*
Formatting library for C++ - std::ostream support
Copyright (c) 2012 - 2016, Victor Zverovich
All rights reserved.
For the license information refer to format.h.
*/
#ifndef FMT_OSTREAM_H_
#define FMT_OSTREAM_H_
#include "format.h"
#include <ostream>
namespace fmt {
namespace internal {
template <class Char>
class FormatBuf : public std::basic_streambuf<Char> {
private:
typedef typename std::basic_streambuf<Char>::int_type int_type;
typedef typename std::basic_streambuf<Char>::traits_type traits_type;
Buffer<Char> &buffer_;
public:
FormatBuf(Buffer<Char> &buffer) : buffer_(buffer) {}
protected:
// The put-area is actually always empty. This makes the implementation
// simpler and has the advantage that the streambuf and the buffer are always
// in sync and sputc never writes into uninitialized memory. The obvious
// disadvantage is that each call to sputc always results in a (virtual) call
// to overflow. There is no disadvantage here for sputn since this always
// results in a call to xsputn.
int_type overflow(int_type ch = traits_type::eof()) FMT_OVERRIDE {
if (!traits_type::eq_int_type(ch, traits_type::eof()))
buffer_.push_back(static_cast<Char>(ch));
return ch;
}
std::streamsize xsputn(const Char *s, std::streamsize count) FMT_OVERRIDE {
buffer_.append(s, s + count);
return count;
}
};
Yes &convert(std::ostream &);
struct DummyStream : std::ostream {
DummyStream(); // Suppress a bogus warning in MSVC.
// Hide all operator<< overloads from std::ostream.
void operator<<(Null<>);
};
No &operator<<(std::ostream &, int);
template<typename T>
struct ConvertToIntImpl<T, true> {
// Convert to int only if T doesn't have an overloaded operator<<.
enum {
value = sizeof(convert(get<DummyStream>() << get<T>())) == sizeof(No)
};
};
// Write the content of w to os.
FMT_API void write(std::ostream &os, Writer &w);
} // namespace internal
// Formats a value.
template <typename Char, typename ArgFormatter_, typename T>
void format_arg(BasicFormatter<Char, ArgFormatter_> &f,
const Char *&format_str, const T &value) {
internal::MemoryBuffer<Char, internal::INLINE_BUFFER_SIZE> buffer;
internal::FormatBuf<Char> format_buf(buffer);
std::basic_ostream<Char> output(&format_buf);
output << value;
BasicStringRef<Char> str(&buffer[0], buffer.size());
typedef internal::MakeArg< BasicFormatter<Char> > MakeArg;
format_str = f.format(format_str, MakeArg(str));
}
/**
\rst
Prints formatted data to the stream *os*.
**Example**::
print(cerr, "Don't {}!", "panic");
\endrst
*/
FMT_API void print(std::ostream &os, CStringRef format_str, ArgList args);
FMT_VARIADIC(void, print, std::ostream &, CStringRef)
} // namespace fmt
#ifdef FMT_HEADER_ONLY
# include "ostream.cc"
#endif
#endif // FMT_OSTREAM_H_
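// Usage sketch: once this header is included, any type with a free
// operator<< becomes formattable. `Point3` is an illustrative type, not part
// of fmt:
//
//     #include "ostream.h"
//     struct Point3 { double x, y, z; };
//     std::ostream &operator<<(std::ostream &os, const Point3 &p) {
//       return os << '(' << p.x << ", " << p.y << ", " << p.z << ')';
//     }
//     // fmt::print("center: {}\n", Point3{1, 2, 3}); -> "center: (1, 2, 3)"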

@ -1,198 +0,0 @@
#ifndef VTKMDIY_GRID_HPP
#define VTKMDIY_GRID_HPP
#include "point.hpp"
namespace diy
{
template<class C, unsigned D>
struct Grid;
template<class C, unsigned D>
struct GridRef
{
public:
typedef C Value;
typedef Point<int, D> Vertex;
typedef size_t Index;
public:
template<class Int>
GridRef(C* data, const Point<Int,D>& shape, bool c_order = true):
data_(data), shape_(shape), c_order_(c_order) { set_stride(); }
GridRef(Grid<C,D>& g):
data_(g.data()), shape_(g.shape()),
c_order_(g.c_order()) { set_stride(); }
template<class Int>
C operator()(const Point<Int, D>& v) const { return (*this)(index(v)); }
template<class Int>
C& operator()(const Point<Int, D>& v) { return (*this)(index(v)); }
C operator()(Index i) const { return data_[i]; }
C& operator()(Index i) { return data_[i]; }
const Vertex&
shape() const { return shape_; }
const C*
data() const { return data_; }
C* data() { return data_; }
// Set every element to the given value
GridRef& operator=(C value) { Index s = size(); for (Index i = 0; i < s; ++i) data_[i] = value; return *this; }
GridRef& operator/=(C value) { Index s = size(); for (Index i = 0; i < s; ++i) data_[i] /= value; return *this; }
inline
bool operator==(const GridRef& other) const;
bool operator!=(const GridRef& other) const { return !(*this == other); }
inline
Vertex vertex(Index idx) const;
Index index(const Vertex& v) const { Index idx = 0; for (unsigned i = 0; i < D; ++i) { idx += ((Index) v[i]) * ((Index) stride_[i]); } return idx; }
Index size() const { return size(shape()); }
void swap(GridRef& other) { std::swap(data_, other.data_); std::swap(shape_, other.shape_); std::swap(stride_, other.stride_); std::swap(c_order_, other.c_order_); }
bool c_order() const { return c_order_; }
static constexpr
unsigned dimension() { return D; }
protected:
static Index
size(const Vertex& v) { Index res = 1; for (unsigned i = 0; i < D; ++i) res *= v[i]; return res; }
void set_stride()
{
Index cur = 1;
if (c_order_)
for (unsigned i = D; i > 0; --i) { stride_[i-1] = cur; cur *= shape_[i-1]; }
else
for (unsigned i = 0; i < D; ++i) { stride_[i] = cur; cur *= shape_[i]; }
}
void set_shape(const Vertex& v) { shape_ = v; set_stride(); }
void set_data(C* data) { data_ = data; }
void set_c_order(bool order) { c_order_ = order; }
private:
C* data_;
Vertex shape_;
Vertex stride_;
bool c_order_;
};
template<class C, unsigned D>
struct Grid: public GridRef<C,D>
{
public:
typedef GridRef<C,D> Parent;
typedef typename Parent::Value Value;
typedef typename Parent::Index Index;
typedef typename Parent::Vertex Vertex;
typedef Parent Reference;
template<class U>
struct rebind { typedef Grid<U,D> type; };
public:
Grid():
Parent(new C[0], Vertex::zero()) {}
template<class Int>
Grid(const Point<Int, D>& shape, bool c_order = true):
Parent(new C[size(shape)], shape, c_order)
{}
Grid(Grid&& g): Grid() { Parent::swap(g); }
Grid(const Parent& g):
Parent(new C[size(g.shape())], g.shape(),
g.c_order()) { copy_data(g.data()); }
template<class OtherGrid>
Grid(const OtherGrid& g):
Parent(new C[size(g.shape())],
g.shape(),
g.c_order()) { copy_data(g.data()); }
~Grid() { delete[] Parent::data(); }
template<class OC>
Grid& operator=(const GridRef<OC, D>& other)
{
delete[] Parent::data();
Parent::set_c_order(other.c_order()); // NB: order needs to be set before the shape, to set the stride correctly
Parent::set_shape(other.shape());
Index s = size(shape());
Parent::set_data(new C[s]);
copy_data(other.data());
return *this;
}
Grid& operator=(Grid&& g) { Parent::swap(g); return *this; }
using Parent::data;
using Parent::shape;
using Parent::operator();
using Parent::operator=;
using Parent::size;
private:
template<class OC>
void copy_data(const OC* data)
{
Index s = size(shape());
for (Index i = 0; i < s; ++i)
Parent::data()[i] = data[i];
}
};
template<class C, unsigned D>
bool GridRef<C, D>::operator==(const GridRef<C, D>& other) const
{
if (c_order() != other.c_order())
return false;
if (shape() != other.shape())
return false;
for(unsigned i = 0; i < size(); ++i)
if (data()[i] != other.data()[i])
return false;
return true;
}
template<class C, unsigned D>
typename GridRef<C, D>::Vertex
GridRef<C, D>::
vertex(typename GridRef<C, D>::Index idx) const
{
Vertex v;
if (c_order())
for (unsigned i = 0; i < D; ++i)
{
v[i] = idx / stride_[i];
idx %= stride_[i];
}
else
for (int i = D-1; i >= 0; --i)
{
v[i] = idx / stride_[i];
idx %= stride_[i];
}
return v;
}
}
#endif
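// Usage sketch (construction via element-wise assignment; an initializer-list
// constructor for diy::Point may also be available in point.hpp):
//
//     diy::Point<int, 2> shape;
//     shape[0] = 3; shape[1] = 4;
//     diy::Grid<float, 2> g(shape);  // owns 12 elements, C order by default
//     g = 0.f;                       // GridRef::operator= fills every element
//     diy::Point<int, 2> v;
//     v[0] = 1; v[1] = 2;
//     g(v) = 5.f;                    // vertex indexing ...
//     float x = g(g.index(v));       // ... equivalent to linear indexing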

@ -1,392 +0,0 @@
#ifndef VTKMDIY_IO_BLOCK_HPP
#define VTKMDIY_IO_BLOCK_HPP
#include <string>
#include <algorithm>
#include <stdexcept>
#include "../mpi.hpp"
#include "../assigner.hpp"
#include "../master.hpp"
#include "../storage.hpp"
#include "../log.hpp"
#include "utils.hpp"
// Read and write collections of blocks using MPI-IO
namespace diy
{
namespace io
{
namespace detail
{
typedef mpi::io::offset offset_t;
struct GidOffsetCount
{
GidOffsetCount(): // default constructor needed to initialize a vector of a given size
gid(-1), offset(0), count(0) {}
GidOffsetCount(int gid_, offset_t offset_, offset_t count_):
gid(gid_), offset(offset_), count(count_) {}
bool operator<(const GidOffsetCount& other) const { return gid < other.gid; }
int gid;
offset_t offset;
offset_t count;
};
}
}
// Serialize GidOffsetCount explicitly, to avoid alignment and uninitialized data issues
// (to get identical output files given the same block input)
template<>
struct Serialization<io::detail::GidOffsetCount>
{
typedef io::detail::GidOffsetCount GidOffsetCount;
static void save(BinaryBuffer& bb, const GidOffsetCount& x)
{
diy::save(bb, x.gid);
diy::save(bb, x.offset);
diy::save(bb, x.count);
}
static void load(BinaryBuffer& bb, GidOffsetCount& x)
{
diy::load(bb, x.gid);
diy::load(bb, x.offset);
diy::load(bb, x.count);
}
};
namespace io
{
/**
* \ingroup IO
* \brief Write blocks to storage collectively in one shared file
*/
inline
void
write_blocks(const std::string& outfilename, //!< output file name
const mpi::communicator& comm, //!< communicator
Master& master, //!< master object
const MemoryBuffer& extra = MemoryBuffer(),//!< user-defined metadata for file header; meaningful only on rank == 0
Master::SaveBlock save = 0) //!< block save function, in case it differs from or is undefined in the master
{
if (!save) save = master.saver(); // save is likely to be different from master.save()
typedef detail::offset_t offset_t;
typedef detail::GidOffsetCount GidOffsetCount;
unsigned size = master.size(),
max_size, min_size;
mpi::all_reduce(comm, size, max_size, mpi::maximum<unsigned>());
mpi::all_reduce(comm, size, min_size, mpi::minimum<unsigned>());
// truncate the file
if (comm.rank() == 0)
diy::io::utils::truncate(outfilename.c_str(), 0);
mpi::io::file f(comm, outfilename, mpi::io::file::wronly | mpi::io::file::create);
offset_t start = 0, shift;
std::vector<GidOffsetCount> offset_counts;
for (unsigned i = 0; i < max_size; ++i)
{
offset_t count = 0,
offset;
if (i < size)
{
// get the block from master and serialize it
const void* block = master.get(i);
MemoryBuffer bb;
LinkFactory::save(bb, master.link(i));
save(block, bb);
count = bb.buffer.size();
mpi::scan(comm, count, offset, std::plus<offset_t>());
offset += start - count;
mpi::all_reduce(comm, count, shift, std::plus<offset_t>());
start += shift;
if (i < min_size) // up to min_size, we can do collective IO
f.write_at_all(offset, bb.buffer);
else
f.write_at(offset, bb.buffer);
offset_counts.push_back(GidOffsetCount(master.gid(i), offset, count));
} else
{
// matching global operations
mpi::scan(comm, count, offset, std::plus<offset_t>());
mpi::all_reduce(comm, count, shift, std::plus<offset_t>());
// -1 indicates that there is no block written here from this rank
offset_counts.push_back(GidOffsetCount(-1, offset, count));
}
}
if (comm.rank() == 0)
{
// roundabout way of gathering a vector of vectors of GidOffsetCount, to avoid registering a new MPI datatype
std::vector< std::vector<char> > gathered_offset_count_buffers;
MemoryBuffer oc_buffer; diy::save(oc_buffer, offset_counts);
mpi::gather(comm, oc_buffer.buffer, gathered_offset_count_buffers, 0);
std::vector<GidOffsetCount> all_offset_counts;
for (unsigned i = 0; i < gathered_offset_count_buffers.size(); ++i)
{
MemoryBuffer per_rank_oc_buffer; per_rank_oc_buffer.buffer.swap(gathered_offset_count_buffers[i]);
std::vector<GidOffsetCount> per_rank_offset_counts;
diy::load(per_rank_oc_buffer, per_rank_offset_counts);
for (unsigned j = 0; j < per_rank_offset_counts.size(); ++j)
if (per_rank_offset_counts[j].gid != -1)
all_offset_counts.push_back(per_rank_offset_counts[j]);
}
std::sort(all_offset_counts.begin(), all_offset_counts.end()); // sorts by gid
MemoryBuffer bb;
diy::save(bb, all_offset_counts);
diy::save(bb, extra);
size_t footer_size = bb.size();
diy::save(bb, footer_size);
// find footer_offset as the max of (offset + count)
offset_t footer_offset = 0;
for (unsigned i = 0; i < all_offset_counts.size(); ++i)
{
offset_t end = all_offset_counts[i].offset + all_offset_counts[i].count;
if (end > footer_offset)
footer_offset = end;
}
f.write_at(footer_offset, bb.buffer);
} else
{
MemoryBuffer oc_buffer; diy::save(oc_buffer, offset_counts);
mpi::gather(comm, oc_buffer.buffer, 0);
}
}
/**
* \ingroup IO
* \brief Read blocks from storage collectively from one shared file
*/
inline
void
read_blocks(const std::string& infilename, //!< input file name
const mpi::communicator& comm, //!< communicator
StaticAssigner& assigner, //!< assigner object
Master& master, //!< master object
MemoryBuffer& extra, //!< user-defined metadata in file header
Master::LoadBlock load = 0) //!< block load function, in case it differs from or is undefined in the master
{
if (!load) load = master.loader(); // load is likely to be different from master.load()
typedef detail::offset_t offset_t;
typedef detail::GidOffsetCount GidOffsetCount;
mpi::io::file f(comm, infilename, mpi::io::file::rdonly);
offset_t footer_offset = f.size() - sizeof(size_t);
size_t footer_size;
// Read the size
f.read_at_all(footer_offset, (char*) &footer_size, sizeof(footer_size));
// Read all_offset_counts
footer_offset -= footer_size;
MemoryBuffer footer;
footer.buffer.resize(footer_size);
f.read_at_all(footer_offset, footer.buffer);
std::vector<GidOffsetCount> all_offset_counts;
diy::load(footer, all_offset_counts);
diy::load(footer, extra);
extra.reset();
// Get local gids from assigner
size_t size = all_offset_counts.size();
assigner.set_nblocks(size);
std::vector<int> gids;
assigner.local_gids(comm.rank(), gids);
for (unsigned i = 0; i < gids.size(); ++i)
{
if (gids[i] != all_offset_counts[gids[i]].gid)
get_logger()->warn("gids don't match in diy::io::read_blocks(), {} vs {}",
gids[i], all_offset_counts[gids[i]].gid);
offset_t offset = all_offset_counts[gids[i]].offset,
count = all_offset_counts[gids[i]].count;
MemoryBuffer bb;
bb.buffer.resize(count);
f.read_at(offset, bb.buffer);
Link* l = LinkFactory::load(bb);
l->fix(assigner);
void* b = master.create();
load(b, bb);
master.add(gids[i], b, l);
}
}
// Functions without the extra buffer, for compatibility with the old code
inline
void
write_blocks(const std::string& outfilename,
const mpi::communicator& comm,
Master& master,
Master::SaveBlock save)
{
MemoryBuffer extra;
write_blocks(outfilename, comm, master, extra, save);
}
inline
void
read_blocks(const std::string& infilename,
const mpi::communicator& comm,
StaticAssigner& assigner,
Master& master,
Master::LoadBlock load = 0)
{
MemoryBuffer extra; // dummy
read_blocks(infilename, comm, assigner, master, extra, load);
}
namespace split
{
/**
* \ingroup IO
* \brief Write blocks to storage independently in one file per process
*/
inline
void
write_blocks(const std::string& outfilename, //!< output file name
const mpi::communicator& comm, //!< communicator
Master& master, //!< master object
const MemoryBuffer& extra = MemoryBuffer(),//!< user-defined metadata for file header; meaningful only on rank == 0
Master::SaveBlock save = 0) //!< block save function, in case it differs from or is undefined in the master
{
if (!save) save = master.saver(); // save is likely to be different from master.save()
bool proceed = false;
size_t size = 0;
if (comm.rank() == 0)
{
if (diy::io::utils::is_directory(outfilename))
proceed = true;
else if (diy::io::utils::make_directory(outfilename))
proceed = true;
mpi::broadcast(comm, proceed, 0);
mpi::reduce(comm, (size_t) master.size(), size, 0, std::plus<size_t>());
} else
{
mpi::broadcast(comm, proceed, 0);
mpi::reduce(comm, (size_t) master.size(), 0, std::plus<size_t>());
}
if (!proceed)
throw std::runtime_error("Cannot access or create directory: " + outfilename);
for (int i = 0; i < (int)master.size(); ++i)
{
const void* block = master.get(i);
std::string filename = fmt::format("{}/{}", outfilename, master.gid(i));
::diy::detail::FileBuffer bb(fopen(filename.c_str(), "w"));
LinkFactory::save(bb, master.link(i));
save(block, bb);
fclose(bb.file);
}
if (comm.rank() == 0)
{
// save the extra buffer
std::string filename = outfilename + "/extra";
::diy::detail::FileBuffer bb(fopen(filename.c_str(), "w"));
::diy::save(bb, size);
::diy::save(bb, extra);
fclose(bb.file);
}
}
/**
* \ingroup IO
* \brief Read blocks from storage independently from one file per process
*/
inline
void
read_blocks(const std::string& infilename, //!< input file name
const mpi::communicator& comm, //!< communicator
StaticAssigner& assigner, //!< assigner object
Master& master, //!< master object
MemoryBuffer& extra, //!< user-defined metadata in file header
Master::LoadBlock load = 0) //!< block load function, in case it differs from or is undefined in the master
{
if (!load) load = master.loader(); // load is likely to be different from master.load()
// load the extra buffer and size
size_t size;
{
std::string filename = infilename + "/extra";
::diy::detail::FileBuffer bb(fopen(filename.c_str(), "r"));
::diy::load(bb, size);
::diy::load(bb, extra);
extra.reset();
fclose(bb.file);
}
// Get local gids from assigner
assigner.set_nblocks(size);
std::vector<int> gids;
assigner.local_gids(comm.rank(), gids);
// Read our blocks.
for (unsigned i = 0; i < gids.size(); ++i)
{
std::string filename = fmt::format("{}/{}", infilename, gids[i]);
::diy::detail::FileBuffer bb(fopen(filename.c_str(), "r"));
Link* l = LinkFactory::load(bb);
l->fix(assigner);
void* b = master.create();
load(b, bb);
master.add(gids[i], b, l);
fclose(bb.file);
}
}
// Functions without the extra buffer, for compatibility with the old code
inline
void
write_blocks(const std::string& outfilename,
const mpi::communicator& comm,
Master& master,
Master::SaveBlock save)
{
MemoryBuffer extra;
write_blocks(outfilename, comm, master, extra, save);
}
inline
void
read_blocks(const std::string& infilename,
const mpi::communicator& comm,
StaticAssigner& assigner,
Master& master,
Master::LoadBlock load = 0)
{
MemoryBuffer extra; // dummy
read_blocks(infilename, comm, assigner, master, extra, load);
}
} // split
} // io
} // diy
#endif
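// Usage sketch (assumes a Master already populated with blocks and a matching
// create/load/save setup; ContiguousAssigner comes from assigner.hpp):
//
//     diy::mpi::communicator world;
//     diy::Master master(world /*, threads, limit, create, destroy, ... */);
//     // ... add blocks to master ...
//     diy::io::write_blocks("blocks.bin", world, master);
//
//     diy::ContiguousAssigner assigner(world.size(), 0); // nblocks set on read
//     diy::io::read_blocks("blocks.bin", world, assigner, master);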

@ -1,181 +0,0 @@
#ifndef VTKMDIY_IO_BOV_HPP
#define VTKMDIY_IO_BOV_HPP
#include <vector>
#include <algorithm>
#include <numeric>
#include "../types.hpp"
#include "../mpi.hpp"
namespace diy
{
namespace io
{
// Reads and writes subsets of a block of values into specified block bounds
class BOV
{
public:
typedef std::vector<int> Shape;
public:
BOV(mpi::io::file& f):
f_(f), offset_(0) {}
template<class S>
BOV(mpi::io::file& f,
const S& shape = S(),
mpi::io::offset offset = 0):
f_(f), offset_(offset) { set_shape(shape); }
void set_offset(mpi::io::offset offset) { offset_ = offset; }
template<class S>
void set_shape(const S& shape)
{
shape_.clear();
stride_.clear();
for (unsigned i = 0; i < shape.size(); ++i)
{
shape_.push_back(shape[i]);
stride_.push_back(1);
}
for (int i = shape_.size() - 2; i >= 0; --i)
stride_[i] = stride_[i+1] * shape_[i+1];
}
const Shape& shape() const { return shape_; }
template<class T>
void read(const DiscreteBounds& bounds, T* buffer, bool collective = false, int chunk = 1) const;
template<class T>
void write(const DiscreteBounds& bounds, const T* buffer, bool collective = false, int chunk = 1);
template<class T>
void write(const DiscreteBounds& bounds, const T* buffer, const DiscreteBounds& core, bool collective = false, int chunk = 1);
protected:
mpi::io::file& file() { return f_; }
private:
mpi::io::file& f_;
Shape shape_;
std::vector<size_t> stride_;
size_t offset_;
};
}
}
template<class T>
void
diy::io::BOV::
read(const DiscreteBounds& bounds, T* buffer, bool collective, int chunk) const
{
#ifndef VTKM_DIY_NO_MPI
int dim = shape_.size();
int total = 1;
std::vector<int> subsizes;
for (int i = 0; i < dim; ++i)
{
subsizes.push_back(bounds.max[i] - bounds.min[i] + 1);
total *= subsizes.back();
}
MPI_Datatype T_type;
if (chunk == 1)
T_type = mpi::detail::get_mpi_datatype<T>();
else
{
// create an MPI struct of size chunk to read the data in those chunks
// (this works around MPI-IO weirdness where crucial quantities
// are ints, which is too narrow a type)
int array_of_blocklengths[] = { chunk };
MPI_Aint array_of_displacements[] = { 0 };
MPI_Datatype array_of_types[] = { mpi::detail::get_mpi_datatype<T>() };
MPI_Type_create_struct(1, array_of_blocklengths, array_of_displacements, array_of_types, &T_type);
MPI_Type_commit(&T_type);
}
MPI_Datatype fileblk;
MPI_Type_create_subarray(dim, (int*) &shape_[0], &subsizes[0], (int*) &bounds.min[0], MPI_ORDER_C, T_type, &fileblk);
MPI_Type_commit(&fileblk);
MPI_File_set_view(f_.handle(), offset_, T_type, fileblk, (char*)"native", MPI_INFO_NULL);
mpi::status s;
if (!collective)
MPI_File_read(f_.handle(), buffer, total, T_type, &s.s);
else
MPI_File_read_all(f_.handle(), buffer, total, T_type, &s.s);
if (chunk != 1)
MPI_Type_free(&T_type);
MPI_Type_free(&fileblk);
#else
(void) bounds; (void) buffer; (void) collective; (void)chunk;
DIY_UNSUPPORTED_MPI_CALL(diy::io::BOV::read);
#endif
}
template<class T>
void
diy::io::BOV::
write(const DiscreteBounds& bounds, const T* buffer, bool collective, int chunk)
{
write(bounds, buffer, bounds, collective, chunk);
}
template<class T>
void
diy::io::BOV::
write(const DiscreteBounds& bounds, const T* buffer, const DiscreteBounds& core, bool collective, int chunk)
{
#ifndef VTKM_DIY_NO_MPI
int dim = shape_.size();
std::vector<int> subsizes;
std::vector<int> buffer_shape, buffer_start;
for (int i = 0; i < dim; ++i)
{
buffer_shape.push_back(bounds.max[i] - bounds.min[i] + 1);
buffer_start.push_back(core.min[i] - bounds.min[i]);
subsizes.push_back(core.max[i] - core.min[i] + 1);
}
MPI_Datatype T_type;
if (chunk == 1)
T_type = mpi::detail::get_mpi_datatype<T>();
else
{
// assume T is a binary block and create an MPI struct of appropriate size
int array_of_blocklengths[] = { chunk };
MPI_Aint array_of_displacements[] = { 0 };
MPI_Datatype array_of_types[] = { mpi::detail::get_mpi_datatype<T>() };
MPI_Type_create_struct(1, array_of_blocklengths, array_of_displacements, array_of_types, &T_type);
MPI_Type_commit(&T_type);
}
MPI_Datatype fileblk, subbuffer;
MPI_Type_create_subarray(dim, (int*) &shape_[0], &subsizes[0], (int*) &core.min[0], MPI_ORDER_C, T_type, &fileblk);
MPI_Type_create_subarray(dim, (int*) &buffer_shape[0], &subsizes[0], (int*) &buffer_start[0], MPI_ORDER_C, T_type, &subbuffer);
MPI_Type_commit(&fileblk);
MPI_Type_commit(&subbuffer);
MPI_File_set_view(f_.handle(), offset_, T_type, fileblk, (char*)"native", MPI_INFO_NULL);
mpi::status s;
if (!collective)
MPI_File_write(f_.handle(), (void*)buffer, 1, subbuffer, &s.s);
else
MPI_File_write_all(f_.handle(), (void*)buffer, 1, subbuffer, &s.s);
if (chunk != 1)
MPI_Type_free(&T_type);
MPI_Type_free(&fileblk);
MPI_Type_free(&subbuffer);
#else
(void) bounds; (void) buffer;(void) core; (void) collective; (void) chunk;
DIY_UNSUPPORTED_MPI_CALL(diy::io::bov::write);
#endif
}
#endif
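// Usage sketch (the file name, shape, and `bounds` are illustrative; `bounds`
// would describe this rank's sub-block of the global grid):
//
//     diy::mpi::io::file in(world, "data.raw", diy::mpi::io::file::rdonly);
//     std::vector<int> shape { 64, 64, 64 };      // global extents, C order
//     diy::io::BOV bov(in, shape);
//     std::vector<float> values(local_count);
//     bov.read(bounds, values.data(), /* collective = */ true);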

@ -1,213 +0,0 @@
#ifndef VTKMDIY_IO_NMPY_HPP
#define VTKMDIY_IO_NMPY_HPP
#include <sstream>
#include <complex>
#include <stdexcept>
#include "../serialization.hpp"
#include "bov.hpp"
namespace diy
{
namespace io
{
class NumPy: public BOV
{
public:
NumPy(mpi::io::file& f):
BOV(f) {}
unsigned word_size() const { return word_size_; }
unsigned read_header()
{
BOV::Shape shape;
bool fortran;
size_t offset = parse_npy_header(shape, fortran);
if (fortran)
throw std::runtime_error("diy::io::NumPy cannot read data in fortran order");
BOV::set_offset(offset);
BOV::set_shape(shape);
return word_size_;
}
template<class T>
void write_header(int dim, const DiscreteBounds& bounds);
template<class T, class S>
void write_header(const S& shape);
private:
inline size_t parse_npy_header(BOV::Shape& shape, bool& fortran_order);
void save(diy::BinaryBuffer& bb, const std::string& s) { bb.save_binary(s.c_str(), s.size()); }
template<class T>
inline void convert_and_save(diy::BinaryBuffer& bb, const T& x)
{
std::ostringstream oss;
oss << x;
save(bb, oss.str());
}
private:
unsigned word_size_;
};
namespace detail
{
inline char big_endian();
template<class T>
char map_numpy_type();
}
}
}
// Modified from: https://github.com/rogersce/cnpy
// Copyright (C) 2011 Carl Rogers
// Released under MIT License
// license available at http://www.opensource.org/licenses/mit-license.php
size_t
diy::io::NumPy::
parse_npy_header(BOV::Shape& shape, bool& fortran_order)
{
char buffer[256];
file().read_at_all(0, buffer, 256);
std::string header(buffer, buffer + 256);
size_t nl = header.find('\n');
if (nl == std::string::npos)
throw std::runtime_error("parse_npy_header: failed to read the header");
header = header.substr(11, nl - 11 + 1);
size_t header_size = nl + 1;
int loc1, loc2;
//fortran order
loc1 = header.find("fortran_order")+16;
fortran_order = (header.substr(loc1,4) == "True" ? true : false);
//shape
unsigned ndims;
loc1 = header.find("(");
loc2 = header.find(")");
std::string str_shape = header.substr(loc1+1,loc2-loc1-1);
if(str_shape[str_shape.size()-1] == ',') ndims = 1;
else ndims = std::count(str_shape.begin(),str_shape.end(),',')+1;
shape.resize(ndims);
for(unsigned int i = 0;i < ndims;i++) {
loc1 = str_shape.find(",");
shape[i] = atoi(str_shape.substr(0,loc1).c_str());
str_shape = str_shape.substr(loc1+1);
}
//endian, word size, data type
//byte-order code '|' stands for "not applicable";
//not sure when this applies except for a byte array
loc1 = header.find("descr")+9;
//bool littleEndian = (header[loc1] == '<' || header[loc1] == '|' ? true : false);
//assert(littleEndian);
//char type = header[loc1+1];
//assert(type == map_type(T));
std::string str_ws = header.substr(loc1+2);
loc2 = str_ws.find("'");
word_size_ = atoi(str_ws.substr(0,loc2).c_str());
return header_size;
}
template<class T>
void
diy::io::NumPy::
write_header(int dim, const DiscreteBounds& bounds)
{
std::vector<int> shape;
for (int i = 0; i < dim; ++i)
shape.push_back(bounds.max[i] - bounds.min[i] + 1);
write_header< T, std::vector<int> >(shape);
}
template<class T, class S>
void
diy::io::NumPy::
write_header(const S& shape)
{
BOV::set_shape(shape);
diy::MemoryBuffer dict;
save(dict, "{'descr': '");
diy::save(dict, detail::big_endian());
diy::save(dict, detail::map_numpy_type<T>());
convert_and_save(dict, sizeof(T));
save(dict, "', 'fortran_order': False, 'shape': (");
convert_and_save(dict, shape[0]);
for (int i = 1; i < (int) shape.size(); i++)
{
save(dict, ", ");
convert_and_save(dict, shape[i]);
}
if(shape.size() == 1) save(dict, ",");
save(dict, "), }");
//pad with spaces so that preamble+dict is a multiple of 16 bytes. preamble is 10 bytes. dict needs to end with \n
int remainder = 16 - (10 + dict.position) % 16;
for (int i = 0; i < remainder - 1; ++i)
diy::save(dict, ' ');
diy::save(dict, '\n');
diy::MemoryBuffer header;
diy::save(header, (char) 0x93);
save(header, "NUMPY");
diy::save(header, (char) 0x01); // major version of numpy format
diy::save(header, (char) 0x00); // minor version of numpy format
diy::save(header, (unsigned short) dict.position);
header.save_binary(&dict.buffer[0], dict.buffer.size());
BOV::set_offset(header.position);
if (file().comm().rank() == 0)
file().write_at(0, &header.buffer[0], header.buffer.size());
}
char
diy::io::detail::big_endian()
{
unsigned char x[] = {1,0};
void* x_void = x;
short y = *static_cast<short*>(x_void);
return y == 1 ? '<' : '>';
}
namespace diy
{
namespace io
{
namespace detail
{
template<> inline char map_numpy_type<float>() { return 'f'; }
template<> inline char map_numpy_type<double>() { return 'f'; }
template<> inline char map_numpy_type<long double>() { return 'f'; }
template<> inline char map_numpy_type<int>() { return 'i'; }
template<> inline char map_numpy_type<char>() { return 'i'; }
template<> inline char map_numpy_type<short>() { return 'i'; }
template<> inline char map_numpy_type<long>() { return 'i'; }
template<> inline char map_numpy_type<long long>() { return 'i'; }
template<> inline char map_numpy_type<unsigned int>() { return 'u'; }
template<> inline char map_numpy_type<unsigned char>() { return 'u'; }
template<> inline char map_numpy_type<unsigned short>() { return 'u'; }
template<> inline char map_numpy_type<unsigned long>() { return 'u'; }
template<> inline char map_numpy_type<unsigned long long>() { return 'u'; }
template<> inline char map_numpy_type<bool>() { return 'b'; }
template<> inline char map_numpy_type< std::complex<float> >() { return 'c'; }
template<> inline char map_numpy_type< std::complex<double> >() { return 'c'; }
template<> inline char map_numpy_type< std::complex<long double> >() { return 'c'; }
}
}
}
#endif
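// Usage sketch (reading an .npy file written in C order; names like `world`,
// `bounds`, and `local_count` are illustrative):
//
//     diy::mpi::io::file f(world, "data.npy", diy::mpi::io::file::rdonly);
//     diy::io::NumPy np(f);
//     unsigned ws = np.read_header(); // parses dtype word size and shape,
//                                     // positions the BOV offset past it
//     std::vector<float> buffer(local_count);
//     np.read(bounds, buffer.data()); // inherited from BOV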

@ -1,49 +0,0 @@
#ifndef VTKMDIY_IO_SHARED_HPP
#define VTKMDIY_IO_SHARED_HPP
#include <sstream>
#include <fstream>
#include "../mpi.hpp"
namespace diy
{
namespace io
{
class SharedOutFile: public std::ostringstream
{
public:
SharedOutFile(std::string filename, diy::mpi::communicator world, int root = 0):
filename_(filename),
world_(world),
root_(root) {}
~SharedOutFile() { close(); }
void close()
{
auto str = this->str();
std::vector<char> contents(str.begin(), str.end());
if (world_.rank() == root_)
{
std::vector<std::vector<char>> all_contents;
diy::mpi::gather(world_, contents, all_contents, root_);
// write the file serially
std::ofstream out(filename_);
for (auto& contents : all_contents)
out.write(contents.data(), contents.size());
} else
diy::mpi::gather(world_, contents, root_);
}
private:
std::string filename_;
diy::mpi::communicator world_;
int root_;
};
}
}
#endif
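// Usage sketch: each rank buffers output locally; on close() (or destruction)
// the contents are gathered and written serially by the root rank:
//
//     diy::io::SharedOutFile out("trace.txt", world);
//     out << "rank " << world.rank() << " processed " << n << " blocks\n";
//     // the file is written once, in gather (rank) order, when out closes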

@ -1,162 +0,0 @@
#ifndef VTKMDIY_IO_UTILS_HPP
#define VTKMDIY_IO_UTILS_HPP
#if defined(_WIN32)
#include <direct.h>
#include <io.h>
#include <share.h>
#else
#include <unistd.h> // mkstemp() on Mac
#include <fcntl.h> // O_WRONLY, O_SYNC for mkostemp()
#include <dirent.h>
#endif
#include <cstdio> // remove()
#include <cstdlib> // mkstemp() on Linux
#include <string> // std::string in the helpers below
#include <memory> // std::unique_ptr in mkstemp()
#include <algorithm> // std::copy in mkstemp()
#include <sys/stat.h>
#include "../constants.h" // for DIY_UNUSED
namespace diy
{
namespace io
{
namespace utils
{
namespace detail
{
// internal method to split a filename into path and fullname.
inline void splitpath(const std::string& fullname, std::string& path, std::string& name)
{
auto pos = fullname.rfind('/');
if (pos != std::string::npos)
{
path = fullname.substr(0, pos);
name = fullname.substr(pos+1);
}
else
{
path = ".";
name = fullname;
}
}
} // namespace detail
/**
* returns true if the filename exists and refers to a directory.
*/
inline bool is_directory(const std::string& filename)
{
#if defined(_WIN32)
DWORD attr = GetFileAttributes(filename.c_str());
return (attr != INVALID_FILE_ATTRIBUTES) && ((attr & FILE_ATTRIBUTE_DIRECTORY) != 0);
#else
struct stat s;
return (stat(filename.c_str(), &s) == 0 && S_ISDIR(s.st_mode));
#endif
}
/**
* creates a new directory. returns true on success.
*/
inline bool make_directory(const std::string& filename)
{
#if defined(_WIN32)
return _mkdir(filename.c_str()) == 0;
#else
return mkdir(filename.c_str(), 0755) == 0;
#endif
}
/**
* truncates a file to the given length, if the file exists and can be opened
* for writing.
*/
inline void truncate(const std::string& filename, size_t length)
{
#if defined(_WIN32)
int fd = -1;
_sopen_s(&fd, filename.c_str(), _O_WRONLY | _O_BINARY, _SH_DENYNO, _S_IWRITE);
if (fd != -1)
{
_chsize_s(fd, static_cast<__int64>(length));
_close(fd);
}
#else
auto r = ::truncate(filename.c_str(), static_cast<off_t>(length));
(void) r;
#endif
}
inline int mkstemp(std::string& filename)
{
#if defined(_WIN32)
std::string path, name;
detail::splitpath(filename, path, name);
char temppath[MAX_PATH];
// note: GetTempFileName only uses the first 3 chars of the prefix (name)
// specified.
if (GetTempFileName(path.c_str(), name.c_str(), 0, temppath) == 0)
{
// failed! return invalid handle.
return -1;
}
int handle = -1;
// _sopen_s sets handle to -1 on error.
_sopen_s(&handle, temppath, _O_WRONLY | _O_CREAT | _O_BINARY, _SH_DENYNO, _S_IWRITE);
if (handle != -1)
filename = temppath;
return handle;
#else // defined(_WIN32)
std::unique_ptr<char[]> s_template(new char[filename.size() + 1]);
std::copy(filename.begin(), filename.end(), s_template.get());
s_template[filename.size()] = 0;
int handle = -1;
#if defined(__MACH__)
// TODO: figure out how to open with O_SYNC
handle = ::mkstemp(s_template.get());
#else
handle = mkostemp(s_template.get(), O_WRONLY | O_SYNC);
#endif
if (handle != -1)
filename = s_template.get();
return handle;
#endif // defined(_WIN32)
}
inline void close(int fd)
{
#if defined(_WIN32)
_close(fd);
#else
fsync(fd);
::close(fd);
#endif
}
inline void sync(int fd)
{
#if defined(_WIN32)
DIY_UNUSED(fd);
#else
fsync(fd);
#endif
}
inline bool remove(const std::string& filename)
{
#if defined(_WIN32)
return DeleteFile(filename.c_str()) != 0;
#else
return ::remove(filename.c_str()) == 0;
#endif
}
}
}
}
#endif
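// Usage sketch for the temporary-file helpers (the XXXXXX suffix is the usual
// POSIX mkstemp template; on Windows only the first 3 prefix chars are used):
//
//     std::string name = "/tmp/diy-XXXXXX";
//     int fd = diy::io::utils::mkstemp(name); // rewrites `name` on success
//     if (fd != -1)
//     {
//       diy::io::utils::sync(fd);
//       diy::io::utils::close(fd);
//       diy::io::utils::remove(name);
//     }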

@ -1,226 +0,0 @@
#ifndef VTKMDIY_COVER_HPP
#define VTKMDIY_COVER_HPP
#include <vector>
#include <map>
#include <algorithm>
#include "types.hpp"
#include "serialization.hpp"
#include "assigner.hpp"
namespace diy
{
// Local view of a distributed representation of a cover, a completely unstructured link
class Link
{
public:
using Neighbors = std::vector<BlockID>;
virtual ~Link() {} // need to be able to delete derived classes
int size() const { return static_cast<int>(neighbors_.size()); }
inline
int size_unique() const;
BlockID target(int i) const { return neighbors_[static_cast<size_t>(i)]; }
BlockID& target(int i) { return neighbors_[static_cast<size_t>(i)]; }
inline
int find(int gid) const;
void add_neighbor(const BlockID& block) { neighbors_.push_back(block); }
void fix(const Assigner& assigner) { for (unsigned i = 0; i < neighbors_.size(); ++i) { neighbors_[i].proc = assigner.rank(neighbors_[i].gid); } }
void swap(Link& other) { neighbors_.swap(other.neighbors_); }
const Neighbors&
neighbors() const { return neighbors_; }
Neighbors&
neighbors() { return neighbors_; }
virtual void save(BinaryBuffer& bb) const { diy::save(bb, neighbors_); }
virtual void load(BinaryBuffer& bb) { diy::load(bb, neighbors_); }
virtual size_t id() const { return 0; }
private:
Neighbors neighbors_;
};
template<class Bounds_>
class RegularLink;
typedef RegularLink<DiscreteBounds> RegularGridLink;
typedef RegularLink<ContinuousBounds> RegularContinuousLink;
// Selector between regular discrete and continuous links given the bounds type
template<class Bounds_>
struct RegularLinkSelector;
template<>
struct RegularLinkSelector<DiscreteBounds>
{
typedef RegularGridLink type;
static const size_t id = 1;
};
template<>
struct RegularLinkSelector<ContinuousBounds>
{
typedef RegularContinuousLink type;
static const size_t id = 2;
};
// for a regular decomposition, it makes sense to address the neighbors by direction
// and store local and neighbor bounds
template<class Bounds_>
class RegularLink: public Link
{
public:
typedef Bounds_ Bounds;
typedef std::map<Direction, int> DirMap;
typedef std::vector<Direction> DirVec;
public:
RegularLink(int dim, const Bounds& core__, const Bounds& bounds__):
dim_(dim), core_(core__), bounds_(bounds__) {}
// dimension
int dimension() const { return dim_; }
// direction
int direction(Direction dir) const; // convert direction to a neighbor (-1 if no neighbor)
Direction direction(int i) const { return dir_vec_[i]; }
void add_direction(Direction dir) { int c = dir_map_.size(); dir_map_[dir] = c; dir_vec_.push_back(dir); }
// wrap
void add_wrap(Direction dir) { wrap_.push_back(dir); }
Direction wrap(int i) const { return wrap_[i]; }
Direction& wrap(int i) { return wrap_[i]; }
// bounds
const Bounds& core() const { return core_; }
Bounds& core() { return core_; }
const Bounds& bounds() const { return bounds_; }
Bounds& bounds() { return bounds_; }
const Bounds& bounds(int i) const { return nbr_bounds_[i]; }
void add_bounds(const Bounds& bounds__) { nbr_bounds_.push_back(bounds__); }
void swap(RegularLink& other) { Link::swap(other); dir_map_.swap(other.dir_map_); dir_vec_.swap(other.dir_vec_); nbr_bounds_.swap(other.nbr_bounds_); std::swap(dim_, other.dim_); wrap_.swap(other.wrap_); std::swap(core_, other.core_); std::swap(bounds_, other.bounds_); }
void save(BinaryBuffer& bb) const
{
Link::save(bb);
diy::save(bb, dim_);
diy::save(bb, dir_map_);
diy::save(bb, dir_vec_);
diy::save(bb, core_);
diy::save(bb, bounds_);
diy::save(bb, nbr_bounds_);
diy::save(bb, wrap_);
}
void load(BinaryBuffer& bb)
{
Link::load(bb);
diy::load(bb, dim_);
diy::load(bb, dir_map_);
diy::load(bb, dir_vec_);
diy::load(bb, core_);
diy::load(bb, bounds_);
diy::load(bb, nbr_bounds_);
diy::load(bb, wrap_);
}
virtual size_t id() const { return RegularLinkSelector<Bounds>::id; }
private:
int dim_;
DirMap dir_map_;
DirVec dir_vec_;
Bounds core_;
Bounds bounds_;
std::vector<Bounds> nbr_bounds_;
std::vector<Direction> wrap_;
};
// Other cover candidates: KDTreeLink, AMRGridLink
struct LinkFactory
{
public:
static Link* create(size_t id)
{
// not pretty, but will do for now
if (id == 0)
return new Link;
else if (id == 1)
return new RegularGridLink(0, DiscreteBounds(), DiscreteBounds());
else if (id == 2)
return new RegularContinuousLink(0, ContinuousBounds(), ContinuousBounds());
else
return 0;
}
inline static void save(BinaryBuffer& bb, const Link* l);
inline static Link* load(BinaryBuffer& bb);
};
}
void
diy::LinkFactory::
save(BinaryBuffer& bb, const Link* l)
{
diy::save(bb, l->id());
l->save(bb);
}
diy::Link*
diy::LinkFactory::
load(BinaryBuffer& bb)
{
size_t id;
diy::load(bb, id);
Link* l = create(id);
l->load(bb);
return l;
}
int
diy::Link::
find(int gid) const
{
for (int i = 0; i < size(); ++i)
{
if (target(i).gid == gid)
return i;
}
return -1;
}
int
diy::Link::
size_unique() const
{
std::vector<BlockID> tmp(neighbors_.begin(), neighbors_.end());
std::sort(tmp.begin(), tmp.end());
return static_cast<int>(std::unique(tmp.begin(), tmp.end()) - tmp.begin());
}
template<class Bounds>
int
diy::RegularLink<Bounds>::
direction(Direction dir) const
{
DirMap::const_iterator it = dir_map_.find(dir);
if (it == dir_map_.end())
return -1;
else
return it->second;
}
#endif
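// Usage sketch (BlockID, from types.hpp, carries a block gid and owning rank):
//
//     diy::Link link;
//     diy::BlockID nbr;
//     nbr.gid = 1; nbr.proc = 0;
//     link.add_neighbor(nbr);
//     int i = link.find(1);         // neighbor index, or -1 if absent
//     int n = link.size_unique();   // neighbor count after deduplication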

@ -1,103 +0,0 @@
#ifndef VTKMDIY_LOG_HPP
#define VTKMDIY_LOG_HPP
#ifndef VTKMDIY_USE_SPDLOG
#include <memory>
#include "fmt/format.h"
#include "fmt/ostream.h"
namespace diy
{
namespace spd
{
struct logger
{
// logger.info(cppformat_string, arg1, arg2, arg3, ...) call style
template <typename... Args> void trace(const char*, const Args&...) {}
template <typename... Args> void debug(const char*, const Args&...) {}
template <typename... Args> void info(const char*, const Args&...) {}
template <typename... Args> void warn(const char*, const Args&...) {}
template <typename... Args> void error(const char*, const Args&...) {}
template <typename... Args> void critical(const char*, const Args&...) {}
};
}
inline
std::shared_ptr<spd::logger>
get_logger()
{
return std::make_shared<spd::logger>();
}
inline
std::shared_ptr<spd::logger>
create_logger(std::string)
{
return std::make_shared<spd::logger>();
}
template<class... Args>
std::shared_ptr<spd::logger>
set_logger(Args...)
{
return std::make_shared<spd::logger>();
}
} // diy
#else // VTKMDIY_USE_SPDLOG
#include <string>
#include <spdlog/spdlog.h>
#include <spdlog/sinks/null_sink.h>
#include <spdlog/fmt/bundled/format.h>
#include <spdlog/fmt/bundled/ostream.h>
namespace diy
{
namespace spd = ::spdlog;
inline
std::shared_ptr<spd::logger>
get_logger()
{
auto log = spd::get("diy");
if (!log)
{
auto null_sink = std::make_shared<spd::sinks::null_sink_mt> ();
log = std::make_shared<spd::logger>("null_logger", null_sink);
}
return log;
}
inline
std::shared_ptr<spd::logger>
create_logger(std::string log_level)
{
auto log = spd::stderr_logger_mt("diy");
int lvl;
for (lvl = spd::level::trace; lvl < spd::level::off; ++lvl)
if (spd::level::level_names[lvl] == log_level)
break;
log->set_level(static_cast<spd::level::level_enum>(lvl));
return log;
}
template<class... Args>
std::shared_ptr<spd::logger>
set_logger(Args... args)
{
auto log = std::make_shared<spdlog::logger>("diy", args...);
return log;
}
} // diy
#endif
#endif // VTKMDIY_LOG_HPP
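// Usage sketch: the same calls compile either way; without VTKMDIY_USE_SPDLOG
// they are no-ops, with it they route through spdlog (`nblocks` and `elapsed`
// are illustrative):
//
//     auto log = diy::get_logger();
//     log->info("read {} blocks in {} s", nblocks, elapsed);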

File diff suppressed because it is too large

@ -1,71 +0,0 @@
#ifndef VTKMDIY_MPI_HPP
#define VTKMDIY_MPI_HPP
#ifndef VTKM_DIY_NO_MPI
#include <mpi.h>
#else
#include "mpi/no-mpi.hpp"
#endif
#include "mpi/constants.hpp"
#include "mpi/datatypes.hpp"
#include "mpi/optional.hpp"
#include "mpi/status.hpp"
#include "mpi/request.hpp"
#include "mpi/point-to-point.hpp"
#include "mpi/communicator.hpp"
#include "mpi/collectives.hpp"
#include "mpi/io.hpp"
#include "mpi/window.hpp"
namespace diy
{
namespace mpi
{
//! \ingroup MPI
struct environment
{
inline environment(int threading = MPI_THREAD_FUNNELED);
inline environment(int argc, char* argv[], int threading = MPI_THREAD_FUNNELED);
inline ~environment();
int threading() const { return provided_threading; }
int provided_threading;
};
}
}
diy::mpi::environment::
environment(int threading)
{
#ifndef VTKM_DIY_NO_MPI
int argc = 0; char** argv = nullptr; // avoid passing an uninitialized pointer to MPI_Init_thread
MPI_Init_thread(&argc, &argv, threading, &provided_threading);
#else
provided_threading = threading;
#endif
}
diy::mpi::environment::
environment(int argc, char* argv[], int threading)
{
#ifndef VTKM_DIY_NO_MPI
MPI_Init_thread(&argc, &argv, threading, &provided_threading);
#else
(void) argc; (void) argv;
provided_threading = threading;
#endif
}
diy::mpi::environment::
~environment()
{
#ifndef VTKM_DIY_NO_MPI
MPI_Finalize();
#endif
}
#endif
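// Usage sketch: environment is an RAII wrapper pairing MPI_Init_thread with
// MPI_Finalize (both become no-ops when VTKM_DIY_NO_MPI is defined):
//
//     int main(int argc, char* argv[])
//     {
//       diy::mpi::environment env(argc, argv);
//       diy::mpi::communicator world;  // wraps MPI_COMM_WORLD by default
//       // ... all diy/MPI work happens while env is alive ...
//     }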

@ -1,380 +0,0 @@
#include <vector>
#include "../constants.h" // for DIY_UNUSED.
#include "operations.hpp"
namespace diy
{
namespace mpi
{
//!\addtogroup MPI
//!@{
template<class T, class Op>
struct Collectives
{
static void broadcast(const communicator& comm, T& x, int root)
{
#ifndef VTKM_DIY_NO_MPI
MPI_Bcast(address(x), count(x), datatype(x), root, comm);
#else
DIY_UNUSED(comm);
DIY_UNUSED(x);
DIY_UNUSED(root);
#endif
}
static void broadcast(const communicator& comm, std::vector<T>& x, int root)
{
#ifndef VTKM_DIY_NO_MPI
size_t sz = x.size();
Collectives<size_t, void*>::broadcast(comm, sz, root);
if (comm.rank() != root)
x.resize(sz);
MPI_Bcast(address(x), count(x), datatype(x), root, comm);
#else
DIY_UNUSED(comm);
DIY_UNUSED(x);
DIY_UNUSED(root);
#endif
}
static request ibroadcast(const communicator& comm, T& x, int root)
{
#ifndef VTKM_DIY_NO_MPI
request r;
MPI_Ibcast(address(x), count(x), datatype(x), root, comm, &r.r);
return r;
#else
DIY_UNUSED(comm);
DIY_UNUSED(x);
DIY_UNUSED(root);
DIY_UNSUPPORTED_MPI_CALL(MPI_Ibcast);
#endif
}
static void gather(const communicator& comm, const T& in, std::vector<T>& out, int root)
{
out.resize(comm.size());
#ifndef VTKM_DIY_NO_MPI
MPI_Gather(address(in), count(in), datatype(in), address(out), count(in), datatype(out), root, comm);
#else
DIY_UNUSED(comm);
DIY_UNUSED(root);
out[0] = in;
#endif
}
static void gather(const communicator& comm, const std::vector<T>& in, std::vector< std::vector<T> >& out, int root)
{
#ifndef VTKM_DIY_NO_MPI
std::vector<int> counts(comm.size());
Collectives<int,void*>::gather(comm, count(in), counts, root);
std::vector<int> offsets(comm.size(), 0);
for (unsigned i = 1; i < offsets.size(); ++i)
offsets[i] = offsets[i-1] + counts[i-1];
int elem_size = count(in[0]); // size of 1 vector element in units of mpi datatype
std::vector<T> buffer((offsets.back() + counts.back()) / elem_size);
MPI_Gatherv(address(in), count(in), datatype(in),
address(buffer),
&counts[0],
&offsets[0],
datatype(buffer),
root, comm);
out.resize(comm.size());
size_t cur = 0;
for (unsigned i = 0; i < (unsigned)comm.size(); ++i)
{
out[i].reserve(counts[i] / elem_size);
for (unsigned j = 0; j < (unsigned)(counts[i] / elem_size); ++j)
out[i].push_back(buffer[cur++]);
}
#else
DIY_UNUSED(comm);
DIY_UNUSED(root);
out.resize(1);
out[0] = in;
#endif
}
static void gather(const communicator& comm, const T& in, int root)
{
#ifndef VTKM_DIY_NO_MPI
MPI_Gather(address(in), count(in), datatype(in), address(in), count(in), datatype(in), root, comm);
#else
DIY_UNUSED(comm);
DIY_UNUSED(in);
DIY_UNUSED(root);
DIY_UNSUPPORTED_MPI_CALL("MPI_Gather");
#endif
}
static void gather(const communicator& comm, const std::vector<T>& in, int root)
{
#ifndef VTKM_DIY_NO_MPI
Collectives<int,void*>::gather(comm, count(in), root);
MPI_Gatherv(address(in), count(in), datatype(in),
0, 0, 0,
datatype(in),
root, comm);
#else
DIY_UNUSED(comm);
DIY_UNUSED(in);
DIY_UNUSED(root);
DIY_UNSUPPORTED_MPI_CALL("MPI_Gatherv");
#endif
}
static void all_gather(const communicator& comm, const T& in, std::vector<T>& out)
{
out.resize(comm.size());
#ifndef VTKM_DIY_NO_MPI
MPI_Allgather(address(in), count(in), datatype(in),
address(out), count(in), datatype(in),
comm);
#else
DIY_UNUSED(comm);
out[0] = in;
#endif
}
static void all_gather(const communicator& comm, const std::vector<T>& in, std::vector< std::vector<T> >& out)
{
#ifndef VTKM_DIY_NO_MPI
std::vector<int> counts(comm.size());
Collectives<int,void*>::all_gather(comm, count(in), counts);
std::vector<int> offsets(comm.size(), 0);
for (unsigned i = 1; i < offsets.size(); ++i)
offsets[i] = offsets[i-1] + counts[i-1];
int elem_size = count(in[0]); // size of 1 vector element in units of mpi datatype
std::vector<T> buffer((offsets.back() + counts.back()) / elem_size);
MPI_Allgatherv(address(in), count(in), datatype(in),
address(buffer),
&counts[0],
&offsets[0],
datatype(buffer),
comm);
out.resize(comm.size());
size_t cur = 0;
for (int i = 0; i < comm.size(); ++i)
{
out[i].reserve(counts[i] / elem_size);
for (int j = 0; j < (int)(counts[i] / elem_size); ++j)
out[i].push_back(buffer[cur++]);
}
#else
DIY_UNUSED(comm);
out.resize(1);
out[0] = in;
#endif
}
static void reduce(const communicator& comm, const T& in, T& out, int root, const Op&)
{
#ifndef VTKM_DIY_NO_MPI
MPI_Reduce(address(in), address(out), count(in), datatype(in),
detail::mpi_op<Op>::get(),
root, comm);
#else
DIY_UNUSED(comm);
DIY_UNUSED(root);
out = in;
#endif
}
static void reduce(const communicator& comm, const T& in, int root, const Op&)
{
#ifndef VTKM_DIY_NO_MPI
MPI_Reduce(address(in), address(in), count(in), datatype(in),
detail::mpi_op<Op>::get(),
root, comm);
#else
DIY_UNUSED(comm);
DIY_UNUSED(in);
DIY_UNUSED(root);
DIY_UNSUPPORTED_MPI_CALL("MPI_Reduce");
#endif
}
static void all_reduce(const communicator& comm, const T& in, T& out, const Op&)
{
#ifndef VTKM_DIY_NO_MPI
MPI_Allreduce(address(in), address(out), count(in), datatype(in),
detail::mpi_op<Op>::get(),
comm);
#else
DIY_UNUSED(comm);
out = in;
#endif
}
static void all_reduce(const communicator& comm, const std::vector<T>& in, std::vector<T>& out, const Op&)
{
#ifndef VTKM_DIY_NO_MPI
out.resize(in.size());
MPI_Allreduce(address(in), address(out), count(in),
datatype(in),
detail::mpi_op<Op>::get(),
comm);
#else
DIY_UNUSED(comm);
out = in;
#endif
}
static void scan(const communicator& comm, const T& in, T& out, const Op&)
{
#ifndef VTKM_DIY_NO_MPI
MPI_Scan(address(in), address(out), count(in), datatype(in),
detail::mpi_op<Op>::get(),
comm);
#else
DIY_UNUSED(comm);
out = in;
#endif
}
static void all_to_all(const communicator& comm, const std::vector<T>& in, std::vector<T>& out, int n = 1)
{
#ifndef VTKM_DIY_NO_MPI
// n specifies how many elements go to/from every process from every process;
// the sizes of in and out are expected to be n * comm.size()
int elem_size = count(in[0]); // size of 1 vector element in units of mpi datatype
// NB: this will fail if T is a vector
MPI_Alltoall(address(in),
elem_size * n,
datatype(in),
address(out),
elem_size * n,
datatype(out),
comm);
#else
DIY_UNUSED(comm);
DIY_UNUSED(n);
out = in;
#endif
}
};
//! Broadcast to all processes in `comm`.
template<class T>
void broadcast(const communicator& comm, T& x, int root)
{
Collectives<T,void*>::broadcast(comm, x, root);
}
//! Broadcast for vectors
template<class T>
void broadcast(const communicator& comm, std::vector<T>& x, int root)
{
Collectives<T,void*>::broadcast(comm, x, root);
}
//! iBroadcast to all processes in `comm`.
template<class T>
request ibroadcast(const communicator& comm, T& x, int root)
{
return Collectives<T,void*>::ibroadcast(comm, x, root);
}
//! Gather from all processes in `comm`.
//! On `root` process, `out` is resized to `comm.size()` and filled with
//! elements from the respective ranks.
template<class T>
void gather(const communicator& comm, const T& in, std::vector<T>& out, int root)
{
Collectives<T,void*>::gather(comm, in, out, root);
}
//! Same as above, but for vectors.
template<class T>
void gather(const communicator& comm, const std::vector<T>& in, std::vector< std::vector<T> >& out, int root)
{
Collectives<T,void*>::gather(comm, in, out, root);
}
//! Simplified version (without `out`) for use on non-root processes.
template<class T>
void gather(const communicator& comm, const T& in, int root)
{
Collectives<T,void*>::gather(comm, in, root);
}
//! Simplified version (without `out`) for use on non-root processes.
template<class T>
void gather(const communicator& comm, const std::vector<T>& in, int root)
{
Collectives<T,void*>::gather(comm, in, root);
}
//! all_gather from all processes in `comm`.
//! `out` is resized to `comm.size()` and filled with
//! elements from the respective ranks.
template<class T>
void all_gather(const communicator& comm, const T& in, std::vector<T>& out)
{
Collectives<T,void*>::all_gather(comm, in, out);
}
//! Same as above, but for vectors.
template<class T>
void all_gather(const communicator& comm, const std::vector<T>& in, std::vector< std::vector<T> >& out)
{
Collectives<T,void*>::all_gather(comm, in, out);
}
//! reduce
template<class T, class Op>
void reduce(const communicator& comm, const T& in, T& out, int root, const Op& op)
{
Collectives<T, Op>::reduce(comm, in, out, root, op);
}
//! Simplified version (without `out`) for use on non-root processes.
template<class T, class Op>
void reduce(const communicator& comm, const T& in, int root, const Op& op)
{
Collectives<T, Op>::reduce(comm, in, root, op);
}
//! all_reduce
template<class T, class Op>
void all_reduce(const communicator& comm, const T& in, T& out, const Op& op)
{
Collectives<T, Op>::all_reduce(comm, in, out, op);
}
//! Same as above, but for vectors.
template<class T, class Op>
void all_reduce(const communicator& comm, const std::vector<T>& in, std::vector<T>& out, const Op& op)
{
Collectives<T, Op>::all_reduce(comm, in, out, op);
}
//! scan
template<class T, class Op>
void scan(const communicator& comm, const T& in, T& out, const Op& op)
{
Collectives<T, Op>::scan(comm, in, out, op);
}
//! all_to_all
template<class T>
void all_to_all(const communicator& comm, const std::vector<T>& in, std::vector<T>& out, int n = 1)
{
Collectives<T, void*>::all_to_all(comm, in, out, n);
}
//!@}
}
}
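// Usage sketch (std::plus comes from <functional>; diy::mpi::maximum is
// defined in operations.hpp):
//
//     int local = world.rank() + 1, sum = 0, max = 0;
//     diy::mpi::all_reduce(world, local, sum, std::plus<int>());
//     diy::mpi::all_reduce(world, local, max, diy::mpi::maximum<int>());
//     diy::mpi::broadcast(world, sum, /* root = */ 0);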

@ -1,227 +0,0 @@
namespace diy
{
namespace mpi
{
//! \ingroup MPI
//! Simple wrapper around `MPI_Comm`.
class communicator
{
public:
inline
communicator(MPI_Comm comm = MPI_COMM_WORLD, bool owner = false);
~communicator() { destroy(); }
communicator(const communicator& other):
comm_(other.comm_),
rank_(other.rank_),
size_(other.size_),
owner_(false) {}
communicator(communicator&& other):
comm_(other.comm_),
rank_(other.rank_),
size_(other.size_),
owner_(other.owner_) { other.owner_ = false; }
communicator&
operator=(const communicator& other) { destroy(); comm_ = other.comm_; rank_ = other.rank_; size_ = other.size_; owner_ = false; return *this; }
communicator&
operator=(communicator&& other) { destroy(); comm_ = other.comm_; rank_ = other.rank_; size_ = other.size_; owner_ = other.owner_; other.owner_ = false; return *this; }
int rank() const { return rank_; }
int size() const { return size_; }
//! Send `x` to processor `dest` using `tag` (blocking).
template<class T>
void send(int dest, int tag, const T& x) const { detail::send<T>()(comm_, dest, tag, x); }
//! Receive `x` from `dest` using `tag` (blocking).
//! If `T` is an `std::vector<...>`, `recv` will resize it to fit exactly the sent number of values.
template <class T>
status recv(int source, int tag, T &x) const
{
#if defined(VTKM_DIY_NO_MPI) && defined(__CUDACC_VER_MAJOR__) && __CUDACC_VER_MAJOR__ < 8 // CUDA 7.5 workaround
(void) source; (void)tag; (void)x;
DIY_UNSUPPORTED_MPI_CALL(MPI_Recv);
#else
return detail::recv<T>{}(comm_, source, tag, x);
#endif
}
//! Non-blocking version of `send()`.
template <class T>
request isend(int dest, int tag, const T &x) const
{
#if defined(VTKM_DIY_NO_MPI) && defined(__CUDACC_VER_MAJOR__) && __CUDACC_VER_MAJOR__ < 8 // CUDA 7.5 workaround
(void) dest; (void)tag; (void)x;
DIY_UNSUPPORTED_MPI_CALL(MPI_Send);
#else
return detail::isend<T>{}(comm_, dest, tag, x);
#endif
}
//! Non-blocking version of `ssend()`.
template<class T>
request issend(int dest, int tag, const T& x) const { return detail::issend<T>()(comm_, dest, tag, x); }
//! Non-blocking version of `recv()`.
//! If `T` is an `std::vector<...>`, its size must be big enough to accommodate the sent values.
template <class T>
request irecv(int source, int tag, T &x) const
{
#if defined(VTKM_DIY_NO_MPI) && defined(__CUDACC_VER_MAJOR__) && __CUDACC_VER_MAJOR__ < 8 // CUDA 7.5 workaround
(void)source; (void)tag; (void)x;
DIY_UNSUPPORTED_MPI_CALL(MPI_Irecv);
#else
return detail::irecv<T>()(comm_, source, tag, x);
#endif
}
//! probe
inline
status probe(int source, int tag) const;
//! iprobe
inline
optional<status>
iprobe(int source, int tag) const;
//! barrier
inline
void barrier() const;
//! Nonblocking version of barrier
inline
request ibarrier() const;
operator MPI_Comm() const { return comm_; }
//! split
//! When keys are the same, the ties are broken by the rank in the original comm.
inline
communicator
split(int color, int key = 0) const;
//! duplicate
inline
void duplicate(const communicator& other);
private:
inline
void destroy();
private:
MPI_Comm comm_;
int rank_;
int size_;
bool owner_;
};
}
}
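// Usage sketch (hypothetical two-rank exchange; assumes world.size() >= 2):
//
//   diy::mpi::communicator world;
//   const int tag = 0;
//   if (world.rank() == 0)
//   {
//     std::vector<float> v { 1.f, 2.f, 3.f };
//     world.send(1, tag, v);
//   }
//   else if (world.rank() == 1)
//   {
//     std::vector<float> v;     // recv resizes it to the sent size
//     world.recv(0, tag, v);
//   }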
diy::mpi::communicator::
communicator(MPI_Comm comm, bool owner):
comm_(comm), rank_(0), size_(1), owner_(owner)
{
#ifndef VTKM_DIY_NO_MPI
if (comm != MPI_COMM_NULL)
{
MPI_Comm_rank(comm_, &rank_);
MPI_Comm_size(comm_, &size_);
}
#endif
}
void
diy::mpi::communicator::
destroy()
{
#ifndef VTKM_DIY_NO_MPI
if (owner_)
MPI_Comm_free(&comm_);
#endif
}
diy::mpi::status
diy::mpi::communicator::
probe(int source, int tag) const
{
(void) source;
(void) tag;
#ifndef VTKM_DIY_NO_MPI
status s;
MPI_Probe(source, tag, comm_, &s.s);
return s;
#else
DIY_UNSUPPORTED_MPI_CALL(MPI_Probe);
#endif
}
diy::mpi::optional<diy::mpi::status>
diy::mpi::communicator::
iprobe(int source, int tag) const
{
(void) source;
(void) tag;
#ifndef VTKM_DIY_NO_MPI
status s;
int flag;
MPI_Iprobe(source, tag, comm_, &flag, &s.s);
if (flag)
return s;
#endif
return optional<status>();
}
void
diy::mpi::communicator::
barrier() const
{
#ifndef VTKM_DIY_NO_MPI
MPI_Barrier(comm_);
#endif
}
diy::mpi::communicator
diy::mpi::communicator::
split(int color, int key) const
{
#ifndef VTKM_DIY_NO_MPI
MPI_Comm newcomm;
MPI_Comm_split(comm_, color, key, &newcomm);
return communicator(newcomm, true);
#else
return communicator();
#endif
}
diy::mpi::request
diy::mpi::communicator::
ibarrier() const
{
#ifndef VTKM_DIY_NO_MPI
request r_;
MPI_Ibarrier(comm_, &r_.r);
return r_;
#else
// this is not the ideal fix; in principle we should return a request
// that immediately tests true, but that requires redesigning some parts
// of our no-mpi handling
DIY_UNSUPPORTED_MPI_CALL(MPI_Ibarrier);
#endif
}
void
diy::mpi::communicator::
duplicate(const communicator& other)
{
#ifndef VTKM_DIY_NO_MPI
MPI_Comm newcomm;
MPI_Comm_dup(other.comm_, &newcomm);
(*this) = communicator(newcomm,true);
#endif
}
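// Usage sketch (illustrative): split the world into two halves by color;
// the returned communicator owns the underlying MPI_Comm and frees it on
// destruction:
//
//   diy::mpi::communicator world;
//   int color = world.rank() < world.size() / 2 ? 0 : 1;
//   diy::mpi::communicator half = world.split(color);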

@ -1,13 +0,0 @@
#ifndef VTKMDIY_MPI_CONSTANTS_HPP
#define VTKMDIY_MPI_CONSTANTS_HPP
namespace diy
{
namespace mpi
{
const int any_source = MPI_ANY_SOURCE;
const int any_tag = MPI_ANY_TAG;
}
}
#endif

@ -1,93 +0,0 @@
#ifndef VTKMDIY_MPI_DATATYPES_HPP
#define VTKMDIY_MPI_DATATYPES_HPP
#include <vector>
namespace diy
{
namespace mpi
{
namespace detail
{
template<class T> MPI_Datatype get_mpi_datatype();
struct true_type {};
struct false_type {};
/* is_mpi_datatype */
template<class T>
struct is_mpi_datatype { typedef false_type type; };
#define VTKMDIY_MPI_DATATYPE_MAP(cpp_type, mpi_type) \
template<> inline MPI_Datatype get_mpi_datatype<cpp_type>() { return mpi_type; } \
template<> struct is_mpi_datatype<cpp_type> { typedef true_type type; }; \
template<> struct is_mpi_datatype< std::vector<cpp_type> > { typedef true_type type; };
VTKMDIY_MPI_DATATYPE_MAP(char, MPI_BYTE);
VTKMDIY_MPI_DATATYPE_MAP(unsigned char, MPI_BYTE);
VTKMDIY_MPI_DATATYPE_MAP(bool, MPI_BYTE);
VTKMDIY_MPI_DATATYPE_MAP(int, MPI_INT);
VTKMDIY_MPI_DATATYPE_MAP(unsigned, MPI_UNSIGNED);
VTKMDIY_MPI_DATATYPE_MAP(long, MPI_LONG);
VTKMDIY_MPI_DATATYPE_MAP(unsigned long, MPI_UNSIGNED_LONG);
VTKMDIY_MPI_DATATYPE_MAP(long long, MPI_LONG_LONG_INT);
VTKMDIY_MPI_DATATYPE_MAP(unsigned long long, MPI_UNSIGNED_LONG_LONG);
VTKMDIY_MPI_DATATYPE_MAP(float, MPI_FLOAT);
VTKMDIY_MPI_DATATYPE_MAP(double, MPI_DOUBLE);
/* mpi_datatype: helper routines, specialized for std::vector<...> */
template<class T>
struct mpi_datatype
{
static MPI_Datatype datatype() { return get_mpi_datatype<T>(); }
static const void* address(const T& x) { return &x; }
static void* address(T& x) { return &x; }
static int count(const T&) { return 1; }
};
template<class U>
struct mpi_datatype< std::vector<U> >
{
typedef std::vector<U> VecU;
static MPI_Datatype datatype() { return mpi_datatype<U>::datatype(); }
static const void* address(const VecU& x) { return x.data(); }
static void* address(VecU& x) { return x.data(); }
static int count(const VecU& x) { return x.empty() ? 0 : (static_cast<int>(x.size()) * mpi_datatype<U>::count(x[0])); }
};
} // detail
template<class U>
static MPI_Datatype datatype(const U&)
{
using Datatype = detail::mpi_datatype<U>;
return Datatype::datatype();
}
template<class U>
static void* address(const U& x)
{
using Datatype = detail::mpi_datatype<U>;
return const_cast<void*>(Datatype::address(x));
}
template<class U>
static void* address(U& x)
{
using Datatype = detail::mpi_datatype<U>;
return Datatype::address(x);
}
template<class U>
static int count(const U& x)
{
using Datatype = detail::mpi_datatype<U>;
return Datatype::count(x);
}
} // mpi
} // diy
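// Usage sketch (illustrative): the helpers flatten a value into the
// (address, count, datatype) triple that raw MPI calls expect:
//
//   std::vector<double> v(10, 1.0);
//   void*        buf = diy::mpi::address(v);   // v.data()
//   int          n   = diy::mpi::count(v);     // 10
//   MPI_Datatype dt  = diy::mpi::datatype(v);  // MPI_DOUBLE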
#endif

@ -1,215 +0,0 @@
#ifndef VTKMDIY_MPI_IO_HPP
#define VTKMDIY_MPI_IO_HPP
#include "../constants.h"
#include <vector>
#include <string>
namespace diy
{
namespace mpi
{
namespace io
{
typedef MPI_Offset offset;
//! Wraps MPI file IO. \ingroup MPI
class file
{
public:
enum
{
rdonly = MPI_MODE_RDONLY,
rdwr = MPI_MODE_RDWR,
wronly = MPI_MODE_WRONLY,
create = MPI_MODE_CREATE,
exclusive = MPI_MODE_EXCL,
delete_on_close = MPI_MODE_DELETE_ON_CLOSE,
unique_open = MPI_MODE_UNIQUE_OPEN,
sequential = MPI_MODE_SEQUENTIAL,
append = MPI_MODE_APPEND
};
public:
inline file(const communicator& comm, const std::string& filename, int mode);
~file() { close(); }
inline void close();
inline offset size() const;
inline void resize(offset size);
inline void read_at(offset o, char* buffer, size_t size);
inline void read_at_all(offset o, char* buffer, size_t size);
inline void write_at(offset o, const char* buffer, size_t size);
inline void write_at_all(offset o, const char* buffer, size_t size);
template<class T>
inline void read_at(offset o, std::vector<T>& data);
template<class T>
inline void read_at_all(offset o, std::vector<T>& data);
template<class T>
inline void write_at(offset o, const std::vector<T>& data);
template<class T>
inline void write_at_all(offset o, const std::vector<T>& data);
const communicator&
comm() const { return comm_; }
MPI_File& handle() { return fh; }
private:
const communicator& comm_;
MPI_File fh;
};
}
}
}
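// Usage sketch (hypothetical collective write; each rank writes a disjoint
// 1 KB slice of "out.bin"):
//
//   diy::mpi::communicator world;
//   diy::mpi::io::file f(world, "out.bin", diy::mpi::io::file::wronly |
//                                          diy::mpi::io::file::create);
//   std::vector<char> chunk(1024, 'x');
//   f.write_at_all(static_cast<diy::mpi::io::offset>(chunk.size()) * world.rank(), chunk);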
diy::mpi::io::file::
file(const communicator& comm__, const std::string& filename, int mode)
: comm_(comm__)
{
#ifndef VTKM_DIY_NO_MPI
int ret = MPI_File_open(comm__, const_cast<char*>(filename.c_str()), mode, MPI_INFO_NULL, &fh);
if (ret)
throw std::runtime_error("DIY cannot open file: " + filename);
#else
DIY_UNUSED(comm__);
DIY_UNUSED(filename);
DIY_UNUSED(mode);
DIY_UNSUPPORTED_MPI_CALL(MPI_File_open);
#endif
}
void
diy::mpi::io::file::
close()
{
#ifndef VTKM_DIY_NO_MPI
if (fh != MPI_FILE_NULL)
MPI_File_close(&fh);
#endif
}
diy::mpi::io::offset
diy::mpi::io::file::
size() const
{
#ifndef VTKM_DIY_NO_MPI
offset sz;
MPI_File_get_size(fh, &sz);
return sz;
#else
DIY_UNSUPPORTED_MPI_CALL(MPI_File_get_size);
#endif
}
void
diy::mpi::io::file::
resize(diy::mpi::io::offset size_)
{
#ifndef VTKM_DIY_NO_MPI
MPI_File_set_size(fh, size_);
#else
DIY_UNUSED(size_);
DIY_UNSUPPORTED_MPI_CALL(MPI_File_set_size);
#endif
}
void
diy::mpi::io::file::
read_at(offset o, char* buffer, size_t size_)
{
#ifndef VTKM_DIY_NO_MPI
status s;
MPI_File_read_at(fh, o, buffer, static_cast<int>(size_), detail::get_mpi_datatype<char>(), &s.s);
#else
DIY_UNUSED(o);
DIY_UNUSED(buffer);
DIY_UNUSED(size_);
DIY_UNSUPPORTED_MPI_CALL(MPI_File_read_at);
#endif
}
template<class T>
void
diy::mpi::io::file::
read_at(offset o, std::vector<T>& data)
{
read_at(o, (char*) &data[0], data.size()*sizeof(T));
}
void
diy::mpi::io::file::
read_at_all(offset o, char* buffer, size_t size_)
{
#ifndef VTKM_DIY_NO_MPI
status s;
MPI_File_read_at_all(fh, o, buffer, static_cast<int>(size_), detail::get_mpi_datatype<char>(), &s.s);
#else
DIY_UNUSED(o);
DIY_UNUSED(buffer);
DIY_UNUSED(size_);
DIY_UNSUPPORTED_MPI_CALL(MPI_File_read_at_all);
#endif
}
template<class T>
void
diy::mpi::io::file::
read_at_all(offset o, std::vector<T>& data)
{
read_at_all(o, (char*) &data[0], data.size()*sizeof(T));
}
void
diy::mpi::io::file::
write_at(offset o, const char* buffer, size_t size_)
{
#ifndef VTKM_DIY_NO_MPI
status s;
MPI_File_write_at(fh, o, (void *)buffer, static_cast<int>(size_), detail::get_mpi_datatype<char>(), &s.s);
#else
DIY_UNUSED(o);
DIY_UNUSED(buffer);
DIY_UNUSED(size_);
DIY_UNSUPPORTED_MPI_CALL(MPI_File_write_at);
#endif
}
template<class T>
void
diy::mpi::io::file::
write_at(offset o, const std::vector<T>& data)
{
write_at(o, (const char*) &data[0], data.size()*sizeof(T));
}
void
diy::mpi::io::file::
write_at_all(offset o, const char* buffer, size_t size_)
{
#ifndef VTKM_DIY_NO_MPI
status s;
MPI_File_write_at_all(fh, o, (void *)buffer, static_cast<int>(size_), detail::get_mpi_datatype<char>(), &s.s);
#else
DIY_UNUSED(o);
DIY_UNUSED(buffer);
DIY_UNUSED(size_);
DIY_UNSUPPORTED_MPI_CALL(MPI_File_write_at_all);
#endif
}
template<class T>
void
diy::mpi::io::file::
write_at_all(offset o, const std::vector<T>& data)
{
write_at_all(o, (const char*) &data[0], data.size()*sizeof(T));
}
#endif

@ -1,92 +0,0 @@
#ifndef VTKMDIY_MPI_NO_MPI_HPP
#define VTKMDIY_MPI_NO_MPI_HPP
#include <stdexcept> // std::runtime_error
static const int MPI_SUCCESS = 0;
static const int MPI_ANY_SOURCE = -1;
static const int MPI_ANY_TAG = -1;
/* define communicator type and constants */
using MPI_Comm = int;
static const MPI_Comm MPI_COMM_NULL = 0;
static const MPI_Comm MPI_COMM_WORLD = 1;
/* MPI threading modes */
static const int MPI_THREAD_SINGLE = 0;
static const int MPI_THREAD_FUNNELED = 1;
static const int MPI_THREAD_SERIALIZED = 2;
static const int MPI_THREAD_MULTIPLE = 3;
/* define datatypes */
using MPI_Datatype = size_t;
#define VTKM_DIY_NO_MPI_DATATYPE(cpp_type, mpi_type) \
static const MPI_Datatype mpi_type = sizeof(cpp_type);
VTKM_DIY_NO_MPI_DATATYPE(char, MPI_BYTE);
VTKM_DIY_NO_MPI_DATATYPE(int, MPI_INT);
VTKM_DIY_NO_MPI_DATATYPE(unsigned, MPI_UNSIGNED);
VTKM_DIY_NO_MPI_DATATYPE(long, MPI_LONG);
VTKM_DIY_NO_MPI_DATATYPE(unsigned long, MPI_UNSIGNED_LONG);
VTKM_DIY_NO_MPI_DATATYPE(long long, MPI_LONG_LONG_INT);
VTKM_DIY_NO_MPI_DATATYPE(unsigned long long, MPI_UNSIGNED_LONG_LONG);
VTKM_DIY_NO_MPI_DATATYPE(float, MPI_FLOAT);
VTKM_DIY_NO_MPI_DATATYPE(double, MPI_DOUBLE);
/* status type */
struct MPI_Status
{
/* These fields are publicly defined in the MPI specification.
User applications may freely read from these fields. */
int MPI_SOURCE;
int MPI_TAG;
int MPI_ERROR;
};
/* define MPI_Request */
using MPI_Request = int;
#ifndef DIY_UNSUPPORTED_MPI_CALL
#define DIY_UNSUPPORTED_MPI_CALL(name) \
throw std::runtime_error("`" #name "` not supported when VTKM_DIY_NO_MPI is defined.");
#endif
/* define operations */
using MPI_Op = int;
static const MPI_Op MPI_MAX = 0;
static const MPI_Op MPI_MIN = 0;
static const MPI_Op MPI_SUM = 0;
static const MPI_Op MPI_PROD = 0;
static const MPI_Op MPI_LAND = 0;
static const MPI_Op MPI_LOR = 0;
/* mpi i/o stuff */
using MPI_Offset = size_t;
using MPI_File = int;
static const MPI_File MPI_FILE_NULL = 0;
static const int MPI_MODE_CREATE = 1;
static const int MPI_MODE_RDONLY = 2;
static const int MPI_MODE_WRONLY = 4;
static const int MPI_MODE_RDWR = 8;
static const int MPI_MODE_DELETE_ON_CLOSE = 16;
static const int MPI_MODE_UNIQUE_OPEN = 32;
static const int MPI_MODE_EXCL = 64;
static const int MPI_MODE_APPEND = 128;
static const int MPI_MODE_SEQUENTIAL = 256;
/* define window type */
using MPI_Win = int;
/* window fence assertions */
static const int MPI_MODE_NOSTORE = 1;
static const int MPI_MODE_NOPUT = 2;
static const int MPI_MODE_NOPRECEDE = 4;
static const int MPI_MODE_NOSUCCEED = 8;
static const int MPI_MODE_NOCHECK = 16;
/* window lock types */
static const int MPI_LOCK_SHARED = 1;
static const int MPI_LOCK_EXCLUSIVE = 2;
#endif

@ -1,27 +0,0 @@
#include <algorithm> // for std::min/max
#include <functional>
namespace diy
{
namespace mpi
{
//! \addtogroup MPI
//!@{
template<class U>
struct maximum { const U& operator()(const U& x, const U& y) const { return std::max(x,y); } };
template<class U>
struct minimum { const U& operator()(const U& x, const U& y) const { return std::min(x,y); } };
//!@}
namespace detail
{
template<class T> struct mpi_op { static MPI_Op get(); };
template<class U> struct mpi_op< maximum<U> > { static MPI_Op get() { return MPI_MAX; } };
template<class U> struct mpi_op< minimum<U> > { static MPI_Op get() { return MPI_MIN; } };
template<class U> struct mpi_op< std::plus<U> > { static MPI_Op get() { return MPI_SUM; } };
template<class U> struct mpi_op< std::multiplies<U> > { static MPI_Op get() { return MPI_PROD; } };
template<class U> struct mpi_op< std::logical_and<U> > { static MPI_Op get() { return MPI_LAND; } };
template<class U> struct mpi_op< std::logical_or<U> > { static MPI_Op get() { return MPI_LOR; } };
}
}
}
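// Illustrative mapping (this trait is what lets the collectives accept
// ordinary function objects as reduction operators):
//
//   MPI_Op mx = diy::mpi::detail::mpi_op< diy::mpi::maximum<int> >::get(); // MPI_MAX
//   MPI_Op sm = diy::mpi::detail::mpi_op< std::plus<double> >::get();      // MPI_SUM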

@ -1,55 +0,0 @@
namespace diy
{
namespace mpi
{
template<class T>
struct optional
{
optional():
init_(false) {}
optional(const T& v):
init_(true) { new(buf_) T(v); }
optional(const optional& o):
init_(o.init_) { if (init_) new(buf_) T(*o); }
~optional() { if (init_) clear(); }
inline
optional& operator=(const optional& o);
operator bool() const { return init_; }
T& operator*() { return *static_cast<T*>(address()); }
const T& operator*() const { return *static_cast<const T*>(address()); }
T* operator->() { return &(operator*()); }
const T* operator->() const { return &(operator*()); }
private:
void clear() { static_cast<T*>(address())->~T(); }
void* address() { return buf_; }
const void* address() const { return buf_; }
private:
bool init_;
char buf_[sizeof(T)];
};
}
}
template<class T>
diy::mpi::optional<T>&
diy::mpi::optional<T>::
operator=(const optional& o)
{
if (init_)
clear();
init_ = o.init_;
if (init_)
new (buf_) T(*o);
return *this;
}
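// Usage sketch (hypothetical): iprobe returns an engaged optional only when
// a message is actually pending:
//
//   diy::mpi::communicator world;
//   diy::mpi::optional<diy::mpi::status> s = world.iprobe(diy::mpi::any_source, diy::mpi::any_tag);
//   if (s)
//   {
//     int from = s->source();   // safe: the optional is engaged
//   }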

@ -1,147 +0,0 @@
#include <vector>
namespace diy
{
namespace mpi
{
namespace detail
{
// send
template< class T, class is_mpi_datatype_ = typename is_mpi_datatype<T>::type >
struct send;
template<class T>
struct send<T, true_type>
{
void operator()(MPI_Comm comm, int dest, int tag, const T& x) const
{
#ifndef VTKM_DIY_NO_MPI
typedef mpi_datatype<T> Datatype;
MPI_Send((void*) Datatype::address(x),
Datatype::count(x),
Datatype::datatype(),
dest, tag, comm);
#else
(void) comm; (void) dest; (void) tag; (void) x;
DIY_UNSUPPORTED_MPI_CALL(MPI_Send);
#endif
}
};
// recv
template< class T, class is_mpi_datatype_ = typename is_mpi_datatype<T>::type >
struct recv;
template<class T>
struct recv<T, true_type>
{
status operator()(MPI_Comm comm, int source, int tag, T& x) const
{
#ifndef VTKM_DIY_NO_MPI
typedef mpi_datatype<T> Datatype;
status s;
MPI_Recv((void*) Datatype::address(x),
Datatype::count(x),
Datatype::datatype(),
source, tag, comm, &s.s);
return s;
#else
(void) comm; (void) source; (void) tag; (void) x;
DIY_UNSUPPORTED_MPI_CALL(MPI_Recv);
#endif
}
};
template<class U>
struct recv<std::vector<U>, true_type>
{
status operator()(MPI_Comm comm, int source, int tag, std::vector<U>& x) const
{
#ifndef VTKM_DIY_NO_MPI
status s;
MPI_Probe(source, tag, comm, &s.s);
x.resize(s.count<U>());
MPI_Recv(&x[0], static_cast<int>(x.size()), get_mpi_datatype<U>(), source, tag, comm, &s.s);
return s;
#else
(void) comm; (void) source; (void) tag; (void) x;
DIY_UNSUPPORTED_MPI_CALL(MPI_Recv);
#endif
}
};
// isend
template< class T, class is_mpi_datatype_ = typename is_mpi_datatype<T>::type >
struct isend;
template<class T>
struct isend<T, true_type>
{
request operator()(MPI_Comm comm, int dest, int tag, const T& x) const
{
#ifndef VTKM_DIY_NO_MPI
request r;
typedef mpi_datatype<T> Datatype;
MPI_Isend((void*) Datatype::address(x),
Datatype::count(x),
Datatype::datatype(),
dest, tag, comm, &r.r);
return r;
#else
(void) comm; (void) dest; (void) tag; (void) x;
DIY_UNSUPPORTED_MPI_CALL(MPI_Isend);
#endif
}
};
// issend
template< class T, class is_mpi_datatype_ = typename is_mpi_datatype<T>::type >
struct issend;
template<class T>
struct issend<T, true_type>
{
request operator()(MPI_Comm comm, int dest, int tag, const T& x) const
{
#ifndef VTKM_DIY_NO_MPI
request r;
typedef mpi_datatype<T> Datatype;
MPI_Issend((void*) Datatype::address(x),
Datatype::count(x),
Datatype::datatype(),
dest, tag, comm, &r.r);
return r;
#else
(void) comm; (void) dest; (void) tag; (void) x;
DIY_UNSUPPORTED_MPI_CALL(MPI_Issend);
#endif
}
};
// irecv
template< class T, class is_mpi_datatype_ = typename is_mpi_datatype<T>::type >
struct irecv;
template<class T>
struct irecv<T, true_type>
{
request operator()(MPI_Comm comm, int source, int tag, T& x) const
{
#ifndef VTKM_DIY_NO_MPI
request r;
typedef mpi_datatype<T> Datatype;
MPI_Irecv(Datatype::address(x),
Datatype::count(x),
Datatype::datatype(),
source, tag, comm, &r.r);
return r;
#else
(void) comm; (void) source; (void) tag; (void) x;
DIY_UNSUPPORTED_MPI_CALL(MPI_Irecv);
#endif
}
};
}
}
}

@ -1,50 +0,0 @@
namespace diy
{
namespace mpi
{
struct request
{
inline
status wait();
inline
optional<status> test();
inline
void cancel();
MPI_Request r;
};
}
}
diy::mpi::status
diy::mpi::request::wait()
{
#ifndef VTKM_DIY_NO_MPI
status s;
MPI_Wait(&r, &s.s);
return s;
#else
DIY_UNSUPPORTED_MPI_CALL(diy::mpi::request::wait);
#endif
}
diy::mpi::optional<diy::mpi::status>
diy::mpi::request::test()
{
#ifndef VTKM_DIY_NO_MPI
status s;
int flag;
MPI_Test(&r, &flag, &s.s);
if (flag)
return s;
#endif
return optional<status>();
}
void
diy::mpi::request::cancel()
{
#ifndef VTKM_DIY_NO_MPI
MPI_Cancel(&r);
#endif
}
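// Usage sketch (hypothetical; assumes a matching irecv is posted elsewhere):
//
//   diy::mpi::communicator world;
//   std::vector<int> data(64, world.rank());
//   diy::mpi::request r = world.isend((world.rank() + 1) % world.size(), 0, data);
//   while (!r.test())
//     ;                         // overlap useful work here instead of spinning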

@ -1,49 +0,0 @@
namespace diy
{
namespace mpi
{
struct status
{
int source() const { return s.MPI_SOURCE; }
int tag() const { return s.MPI_TAG; }
int error() const { return s.MPI_ERROR; }
inline
bool cancelled() const;
template<class T>
int count() const;
operator MPI_Status&() { return s; }
operator const MPI_Status&() const { return s; }
MPI_Status s;
};
}
}
bool
diy::mpi::status::cancelled() const
{
#ifndef VTKM_DIY_NO_MPI
int flag;
MPI_Test_cancelled(const_cast<MPI_Status*>(&s), &flag);
return flag;
#else
DIY_UNSUPPORTED_MPI_CALL(diy::mpi::status::cancelled);
#endif
}
template<class T>
int
diy::mpi::status::count() const
{
#ifndef VTKM_DIY_NO_MPI
int c;
MPI_Get_count(const_cast<MPI_Status*>(&s), detail::get_mpi_datatype<T>(), &c);
return c;
#else
DIY_UNSUPPORTED_MPI_CALL(diy::mpi::status::count);
#endif
}
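// Usage sketch (hypothetical): probe for a pending message and size the
// receive buffer from the status before receiving:
//
//   diy::mpi::communicator world;
//   diy::mpi::status s = world.probe(diy::mpi::any_source, 0);
//   std::vector<float> buf(s.count<float>());
//   world.recv(s.source(), 0, buf);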

@ -1,282 +0,0 @@
#include <type_traits>
namespace diy
{
namespace mpi
{
//! \ingroup MPI
//! Simple wrapper around MPI window functions.
template<class T>
class window
{
static_assert(std::is_same<typename detail::is_mpi_datatype<T>::type, detail::true_type>::value, "Only MPI datatypes are allowed in windows");
public:
inline window(const communicator& comm, unsigned size);
inline ~window();
// moving is Ok
window(window&&) = default;
window& operator=(window&&) = default;
// cannot copy because of the buffer_
window(const window&) = delete;
window& operator=(const window&) = delete;
inline void put(const T& x, int rank, unsigned offset);
inline void put(const std::vector<T>& x, int rank, unsigned offset);
inline void get(T& x, int rank, unsigned offset);
inline void get(std::vector<T>& x, int rank, unsigned offset);
inline void fence(int assert);
inline void lock(int lock_type, int rank, int assert = 0);
inline void unlock(int rank);
inline void lock_all(int assert = 0);
inline void unlock_all();
inline void fetch_and_op(const T* origin, T* result, int rank, unsigned offset, MPI_Op op);
inline void fetch(T& result, int rank, unsigned offset);
inline void replace(const T& value, int rank, unsigned offset);
inline void sync();
inline void flush(int rank);
inline void flush_all();
inline void flush_local(int rank);
inline void flush_local_all();
private:
std::vector<T> buffer_;
int rank_;
#ifndef VTKM_DIY_NO_MPI
MPI_Win window_;
#endif
};
} // mpi
} // diy
template<class T>
diy::mpi::window<T>::
window(const communicator& comm, unsigned size):
buffer_(size), rank_(comm.rank())
{
#ifndef VTKM_DIY_NO_MPI
MPI_Win_create(buffer_.data(), buffer_.size()*sizeof(T), sizeof(T), MPI_INFO_NULL, comm, &window_);
#endif
}
template<class T>
diy::mpi::window<T>::
~window()
{
#ifndef VTKM_DIY_NO_MPI
MPI_Win_free(&window_);
#endif
}
template<class T>
void
diy::mpi::window<T>::
put(const T& x, int rank, unsigned offset)
{
#ifndef VTKM_DIY_NO_MPI
MPI_Put(address(x), count(x), datatype(x),
rank,
offset,
count(x), datatype(x),
window_);
#else
buffer_[offset] = x;
#endif
}
template<class T>
void
diy::mpi::window<T>::
put(const std::vector<T>& x, int rank, unsigned offset)
{
#ifndef VTKM_DIY_NO_MPI
MPI_Put(address(x), count(x), datatype(x),
rank,
offset,
count(x), datatype(x),
window_);
#else
for (size_t i = 0; i < x.size(); ++i)
buffer_[offset + i] = x[i];
#endif
}
template<class T>
void
diy::mpi::window<T>::
get(T& x, int rank, unsigned offset)
{
#ifndef VTKM_DIY_NO_MPI
MPI_Get(address(x), count(x), datatype(x),
rank,
offset,
count(x), datatype(x),
window_);
#else
x = buffer_[offset];
#endif
}
template<class T>
void
diy::mpi::window<T>::
get(std::vector<T>& x, int rank, unsigned offset)
{
#ifndef VTKM_DIY_NO_MPI
MPI_Get(address(x), count(x), datatype(x),
rank,
offset,
count(x), datatype(x),
window_);
#else
for (size_t i = 0; i < x.size(); ++i)
x[i] = buffer_[offset + i];
#endif
}
template<class T>
void
diy::mpi::window<T>::
fence(int assert)
{
#ifndef VTKM_DIY_NO_MPI
MPI_Win_fence(assert, window_);
#endif
}
template<class T>
void
diy::mpi::window<T>::
lock(int lock_type, int rank, int assert)
{
#ifndef VTKM_DIY_NO_MPI
MPI_Win_lock(lock_type, rank, assert, window_);
#endif
}
template<class T>
void
diy::mpi::window<T>::
unlock(int rank)
{
#ifndef VTKM_DIY_NO_MPI
MPI_Win_unlock(rank, window_);
#endif
}
template<class T>
void
diy::mpi::window<T>::
lock_all(int assert)
{
#ifndef VTKM_DIY_NO_MPI
MPI_Win_lock_all(assert, window_);
#endif
}
template<class T>
void
diy::mpi::window<T>::
unlock_all()
{
#ifndef VTKM_DIY_NO_MPI
MPI_Win_unlock_all(window_);
#endif
}
template<class T>
void
diy::mpi::window<T>::
fetch_and_op(const T* origin, T* result, int rank, unsigned offset, MPI_Op op)
{
#ifndef VTKM_DIY_NO_MPI
MPI_Fetch_and_op(origin, result, datatype(*origin), rank, offset, op, window_);
#else
DIY_UNSUPPORTED_MPI_CALL(MPI_Fetch_and_op);
#endif
}
template<class T>
void
diy::mpi::window<T>::
fetch(T& result, int rank, unsigned offset)
{
#ifndef VTKM_DIY_NO_MPI
T unused;
fetch_and_op(&unused, &result, rank, offset, MPI_NO_OP);
#else
result = buffer_[offset];
#endif
}
template<class T>
void
diy::mpi::window<T>::
replace(const T& value, int rank, unsigned offset)
{
#ifndef VTKM_DIY_NO_MPI
T unused;
fetch_and_op(&value, &unused, rank, offset, MPI_REPLACE);
#else
buffer_[offset] = value;
#endif
}
template<class T>
void
diy::mpi::window<T>::
sync()
{
#ifndef VTKM_DIY_NO_MPI
MPI_Win_sync(window_);
#endif
}
template<class T>
void
diy::mpi::window<T>::
flush(int rank)
{
#ifndef VTKM_DIY_NO_MPI
MPI_Win_flush(rank, window_);
#endif
}
template<class T>
void
diy::mpi::window<T>::
flush_all()
{
#ifndef VTKM_DIY_NO_MPI
MPI_Win_flush_all(window_);
#endif
}
template<class T>
void
diy::mpi::window<T>::
flush_local(int rank)
{
#ifndef VTKM_DIY_NO_MPI
MPI_Win_flush_local(rank, window_);
#endif
}
template<class T>
void
diy::mpi::window<T>::
flush_local_all()
{
#ifndef VTKM_DIY_NO_MPI
MPI_Win_flush_local_all(window_);
#endif
}
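// Usage sketch (hypothetical one-sided exchange: each rank deposits its rank
// id into slot 0 of its right neighbor's window, bracketed by fences):
//
//   diy::mpi::communicator world;
//   diy::mpi::window<int> win(world, 1);
//   win.fence(0);
//   win.put(world.rank(), (world.rank() + 1) % world.size(), 0);
//   win.fence(0);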

@ -1,42 +0,0 @@
#ifndef VTKMDIY_NO_THREAD_HPP
#define VTKMDIY_NO_THREAD_HPP
#include <utility>
#include <functional>
#include <type_traits>
// replicates only the parts of the threading interface that we use
// executes everything in a single thread
namespace diy
{
struct thread
{
thread() {}
thread(thread&&) = default;
thread(const thread&) = delete;
template<class Function, class... Args>
explicit thread(Function&& f, Args&&... args) { f(args...); } // not ideal, since it doesn't support member functions
void join() {}
static unsigned hardware_concurrency() { return 1; }
};
struct mutex {};
struct fast_mutex {};
struct recursive_mutex {};
template<class T>
struct lock_guard
{
lock_guard(T&) {}
};
namespace this_thread
{
inline unsigned long int get_id() { return 0; }
}
}
#endif

@ -1,72 +0,0 @@
#ifndef VTKMDIY_PARTNERS_ALL_REDUCE_HPP
#define VTKMDIY_PARTNERS_ALL_REDUCE_HPP
#include "merge.hpp"
namespace diy
{
class Master;
//! Allreduce (reduction with results broadcast to all blocks) is
//! implemented as two merge reductions, with incoming and outgoing items swapped in the second one.
//! I.e., it follows the merge reduction up and then back down the merge tree.
/**
* \ingroup Communication
* \brief Partners for all-reduce
*
*/
struct RegularAllReducePartners: public RegularMergePartners
{
typedef RegularMergePartners Parent; //!< base class merge reduction
//! contiguous parameter indicates whether to match partners contiguously or in a round-robin fashion;
//! contiguous is useful when data needs to be united;
//! round-robin is useful for vector-"halving"
template<class Decomposer>
RegularAllReducePartners(const Decomposer& decomposer, //!< domain decomposition
int k, //!< target k value
bool contiguous = true //!< distance doubling (true) or halving (false)
):
Parent(decomposer, k, contiguous) {}
RegularAllReducePartners(const DivisionVector& divs,//!< explicit division vector
const KVSVector& kvs, //!< explicit k vector
bool contiguous = true //!< distance doubling (true) or halving (false)
):
Parent(divs, kvs, contiguous) {}
//! returns total number of rounds
size_t rounds() const { return 2*Parent::rounds(); }
//! returns size of a group of partners in a given round
int size(int round) const { return Parent::size(parent_round(round)); }
//! returns dimension (direction of partners in a regular grid) in a given round
int dim(int round) const { return Parent::dim(parent_round(round)); }
//! returns whether a given block in a given round has dropped out of the merge yet or not
inline bool active(int round, int gid, const Master& m) const { return Parent::active(parent_round(round), gid, m); }
//! returns what the current round would be in the first or second parent merge reduction
int parent_round(int round) const { return round < (int) Parent::rounds() ? round : static_cast<int>(rounds()) - round; }
// incoming is only valid for an active gid; it will only be called with an active gid
inline void incoming(int round, int gid, std::vector<int>& partners, const Master& m) const
{
if (round <= (int) Parent::rounds())
Parent::incoming(round, gid, partners, m);
else
Parent::outgoing(parent_round(round), gid, partners, m);
}
inline void outgoing(int round, int gid, std::vector<int>& partners, const Master& m) const
{
if (round < (int) Parent::rounds())
Parent::outgoing(round, gid, partners, m);
else
Parent::incoming(parent_round(round), gid, partners, m);
}
};
} // diy
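// Usage sketch (hypothetical; `master`, `assigner`, `domain`, `nblocks`, and
// the reduction callback are user-supplied):
//
//   diy::RegularDecomposer<diy::DiscreteBounds> decomposer(1, domain, nblocks);
//   diy::RegularAllReducePartners partners(decomposer, /* k = */ 2);
//   diy::reduce(master, assigner, partners, &allreduce_callback);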
#endif

@ -1,62 +0,0 @@
#ifndef VTKMDIY_PARTNERS_BROADCAST_HPP
#define VTKMDIY_PARTNERS_BROADCAST_HPP
#include "merge.hpp"
namespace diy
{
class Master;
/**
* \ingroup Communication
* \brief Partners for broadcast
*
*/
struct RegularBroadcastPartners: public RegularMergePartners
{
typedef RegularMergePartners Parent; //!< base class merge reduction
//! contiguous parameter indicates whether to match partners contiguously or in a round-robin fashion;
//! contiguous is useful when data needs to be united;
//! round-robin is useful for vector-"halving"
template<class Decomposer>
RegularBroadcastPartners(const Decomposer& decomposer, //!< domain decomposition
int k, //!< target k value
bool contiguous = true //!< distance doubling (true) or halving (false)
):
Parent(decomposer, k, contiguous) {}
RegularBroadcastPartners(const DivisionVector& divs,//!< explicit division vector
const KVSVector& kvs, //!< explicit k vector
bool contiguous = true //!< distance doubling (true) or halving (false)
):
Parent(divs, kvs, contiguous) {}
//! returns total number of rounds
size_t rounds() const { return Parent::rounds(); }
//! returns size of a group of partners in a given round
int size(int round) const { return Parent::size(parent_round(round)); }
//! returns dimension (direction of partners in a regular grid) in a given round
int dim(int round) const { return Parent::dim(parent_round(round)); }
//! returns whether a given block in a given round has dropped out of the merge yet or not
inline bool active(int round, int gid, const Master& m) const { return Parent::active(parent_round(round), gid, m); }
//! returns what the current round would be in the first or second parent merge reduction
int parent_round(int round) const { return static_cast<int>(rounds()) - round; }
// incoming is only valid for an active gid; it will only be called with an active gid
inline void incoming(int round, int gid, std::vector<int>& partners, const Master& m) const
{
Parent::outgoing(parent_round(round), gid, partners, m);
}
inline void outgoing(int round, int gid, std::vector<int>& partners, const Master& m) const
{
Parent::incoming(parent_round(round), gid, partners, m);
}
};
} // diy
#endif

@ -1,204 +0,0 @@
#ifndef VTKMDIY_PARTNERS_COMMON_HPP
#define VTKMDIY_PARTNERS_COMMON_HPP
#include "../decomposition.hpp"
#include "../types.hpp"
namespace diy
{
struct RegularPartners
{
// The record of group size per round in a dimension
struct DimK
{
DimK(int dim_, int k_):
dim(dim_), size(k_) {}
int dim;
int size; // group size
};
typedef std::vector<int> CoordVector;
typedef std::vector<int> DivisionVector;
typedef std::vector<DimK> KVSVector;
// The part of RegularDecomposer that we need works the same with either Bounds (so we fix them arbitrarily)
typedef DiscreteBounds Bounds;
typedef RegularDecomposer<Bounds> Decomposer;
template<class Decomposer_>
RegularPartners(const Decomposer_& decomposer, int k, bool contiguous = true):
divisions_(decomposer.divisions),
contiguous_(contiguous) { factor(k, divisions_, kvs_); fill_steps(); }
RegularPartners(const DivisionVector& divs,
const KVSVector& kvs,
bool contiguous = true):
divisions_(divs), kvs_(kvs),
contiguous_(contiguous) { fill_steps(); }
size_t rounds() const { return kvs_.size(); }
int size(int round) const { return kvs_[round].size; }
int dim(int round) const { return kvs_[round].dim; }
int step(int round) const { return steps_[round]; }
const DivisionVector& divisions() const { return divisions_; }
const KVSVector& kvs() const { return kvs_; }
bool contiguous() const { return contiguous_; }
static
inline void factor(int k, const DivisionVector& divisions, KVSVector& kvs);
inline void fill(int round, int gid, std::vector<int>& partners) const;
inline int group_position(int round, int c, int step) const;
private:
inline void fill_steps();
static
inline void factor(int k, int tot_b, std::vector<int>& kvs);
DivisionVector divisions_;
KVSVector kvs_;
bool contiguous_;
std::vector<int> steps_;
};
}
void
diy::RegularPartners::
fill_steps()
{
if (contiguous_)
{
std::vector<int> cur_steps(divisions().size(), 1);
for (size_t r = 0; r < rounds(); ++r)
{
steps_.push_back(cur_steps[kvs_[r].dim]);
cur_steps[kvs_[r].dim] *= kvs_[r].size;
}
} else
{
std::vector<int> cur_steps(divisions().begin(), divisions().end());
for (size_t r = 0; r < rounds(); ++r)
{
cur_steps[kvs_[r].dim] /= kvs_[r].size;
steps_.push_back(cur_steps[kvs_[r].dim]);
}
}
}
void
diy::RegularPartners::
fill(int round, int gid, std::vector<int>& partners) const
{
const DimK& kv = kvs_[round];
partners.reserve(kv.size);
int step = this->step(round); // gids jump by this much in the current round
CoordVector coords;
Decomposer::gid_to_coords(gid, coords, divisions_);
int c = coords[kv.dim];
int pos = group_position(round, c, step);
int partner = c - pos * step;
coords[kv.dim] = partner;
int partner_gid = Decomposer::coords_to_gid(coords, divisions_);
partners.push_back(partner_gid);
for (int k = 1; k < kv.size; ++k)
{
partner += step;
coords[kv.dim] = partner;
partner_gid = Decomposer::coords_to_gid(coords, divisions_);
partners.push_back(partner_gid);
}
}
// Tom's GetGrpPos
int
diy::RegularPartners::
group_position(int round, int c, int step) const
{
// the second term in the following expression does not simplify to
// (gid - start_b) / kv[r]
// because the division gid / (step * kv[r]) is integer and truncates
// this is exactly what we want
int g = c % step + c / (step * kvs_[round].size) * step;
int p = c / step % kvs_[round].size;
static_cast<void>(g); // suppress the unused-variable warning
// g: group number (output)
// p: position number within the group (output)
return p;
}
void
diy::RegularPartners::
factor(int k, const DivisionVector& divisions, KVSVector& kvs)
{
// factor in each dimension
std::vector< std::vector<int> > tmp_kvs(divisions.size());
for (unsigned i = 0; i < divisions.size(); ++i)
factor(k, divisions[i], tmp_kvs[i]);
// interleave the dimensions
std::vector<int> round_per_dim(divisions.size(), 0);
while(true)
{
// TODO: not the most efficient way to do this
bool changed = false;
for (unsigned i = 0; i < divisions.size(); ++i)
{
if (round_per_dim[i] == (int) tmp_kvs[i].size())
continue;
kvs.push_back(DimK(i, tmp_kvs[i][round_per_dim[i]++]));
changed = true;
}
if (!changed)
break;
}
}
// Tom's FactorK
void
diy::RegularPartners::
factor(int k, int tot_b, std::vector<int>& kv)
{
int rem = tot_b; // unfactored remaining portion of tot_b
int j;
while (rem > 1)
{
// remainder is divisible by k
if (rem % k == 0)
{
kv.push_back(k);
rem /= k;
}
// if not, start at k and linearly look for smaller factors down to 2
else
{
for (j = k - 1; j > 1; j--)
{
if (rem % j == 0)
{
kv.push_back(j);
rem /= j;
break;
}
}
if (j == 1)
{
kv.push_back(rem);
rem = 1;
}
} // else
} // while
}
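// Worked example: factor(k = 4, tot_b = 24) produces kv = {4, 3, 2}:
// 24 % 4 == 0, push 4, rem = 6; 6 % 4 != 0 and the largest j <= 3 dividing 6
// is 3, push 3, rem = 2; 2 % 4 != 0 and j = 2 divides 2, push 2, rem = 1.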
#endif

@ -1,60 +0,0 @@
#ifndef VTKMDIY_PARTNERS_MERGE_HPP
#define VTKMDIY_PARTNERS_MERGE_HPP
#include "common.hpp"
namespace diy
{
class Master;
/**
* \ingroup Communication
* \brief Partners for merge-reduce
*
*/
struct RegularMergePartners: public RegularPartners
{
typedef RegularPartners Parent;
// contiguous parameter indicates whether to match partners contiguously or in a round-robin fashion;
// contiguous is useful when data needs to be united;
// round-robin is useful for vector-"halving"
template<class Decomposer>
RegularMergePartners(const Decomposer& decomposer, //!< domain decomposition
int k, //!< target k value
bool contiguous = true //!< distance doubling (true) or halving (false)
):
Parent(decomposer, k, contiguous) {}
RegularMergePartners(const DivisionVector& divs, //!< explicit division vector
const KVSVector& kvs, //!< explicit k vector
bool contiguous = true //!< distance doubling (true) or halving (false)
):
Parent(divs, kvs, contiguous) {}
inline bool active(int round, int gid, const Master&) const;
// incoming is only valid for an active gid; it will only be called with an active gid
inline void incoming(int round, int gid, std::vector<int>& partners, const Master&) const { Parent::fill(round - 1, gid, partners); }
// this is a lazy implementation of outgoing, but it reuses the existing code
inline void outgoing(int round, int gid, std::vector<int>& partners, const Master&) const { std::vector<int> tmp; Parent::fill(round, gid, tmp); partners.push_back(tmp[0]); }
};
} // diy
bool
diy::RegularMergePartners::
active(int round, int gid, const Master&) const
{
CoordVector coords;
Decomposer::gid_to_coords(gid, coords, divisions());
for (int r = 0; r < round; ++r)
if (Parent::group_position(r, coords[kvs()[r].dim], step(r)) != 0)
return false;
return true;
}
#endif

@ -1,43 +0,0 @@
#ifndef VTKMDIY_PARTNERS_SWAP_HPP
#define VTKMDIY_PARTNERS_SWAP_HPP
#include "common.hpp"
namespace diy
{
class Master;
/**
* \ingroup Communication
* \brief Partners for swap-reduce
*
*/
struct RegularSwapPartners: public RegularPartners
{
typedef RegularPartners Parent;
// contiguous parameter indicates whether to match partners contiguously or in a round-robin fashion;
// contiguous is useful when data needs to be united;
// round-robin is useful for vector-"halving"
template<class Decomposer>
RegularSwapPartners(const Decomposer& decomposer, //!< domain decomposition
int k, //!< target k value
bool contiguous = true //!< distance halving (true) or doubling (false)
):
Parent(decomposer, k, contiguous) {}
RegularSwapPartners(const DivisionVector& divs, //!< explicit division vector
const KVSVector& kvs, //!< explicit k vector
bool contiguous = true //!< distance halving (true) or doubling (false)
):
Parent(divs, kvs, contiguous) {}
bool active(int, int, const Master&) const { return true; } // in swap-reduce every block is always active
void incoming(int round, int gid, std::vector<int>& partners, const Master&) const { Parent::fill(round - 1, gid, partners); }
void outgoing(int round, int gid, std::vector<int>& partners, const Master&) const { Parent::fill(round, gid, partners); }
};
} // diy
#endif

@ -1,137 +0,0 @@
#ifndef VTKMDIY_PICK_HPP
#define VTKMDIY_PICK_HPP
#include "link.hpp"
namespace diy
{
template<class Bounds, class Point, class OutIter>
void near(const RegularLink<Bounds>& link, const Point& p, float r, OutIter out,
const Bounds& domain);
template<class Bounds, class Point, class OutIter>
void in(const RegularLink<Bounds>& link, const Point& p, OutIter out, const Bounds& domain);
template<class Point, class Bounds>
float distance(int dim, const Bounds& bounds, const Point& p);
template<class Bounds>
inline
float distance(int dim, const Bounds& bounds1, const Bounds& bounds2);
template<class Bounds>
void wrap_bounds(Bounds& bounds, Direction wrap_dir, const Bounds& domain, int dim);
}
//! Finds the neighbors within radius r of a target point.
template<class Bounds, class Point, class OutIter>
void
diy::
near(const RegularLink<Bounds>& link, //!< neighbors
const Point& p, //!< target point (must be in current block)
float r, //!< target radius (>= 0.0)
OutIter out, //!< insert iterator for output set of neighbors
const Bounds& domain) //!< global domain bounds
{
Bounds neigh_bounds; // neighbor block bounds
// for all neighbors of this block
for (int n = 0; n < link.size(); n++)
{
// wrap neighbor bounds, if necessary, otherwise bounds will be unchanged
neigh_bounds = link.bounds(n);
wrap_bounds(neigh_bounds, link.wrap(n), domain, link.dimension());
if (distance(link.dimension(), neigh_bounds, p) <= r)
*out++ = n;
} // for all neighbors
}
//! Find the distance between point `p` and box `bounds`.
template<class Point, class Bounds>
float
diy::
distance(int dim, const Bounds& bounds, const Point& p)
{
float res = 0;
for (int i = 0; i < dim; ++i)
{
// avoids all the annoying case logic by finding
// diff = max(bounds.min[i] - p[i], 0, p[i] - bounds.max[i])
float diff = 0, d;
d = bounds.min[i] - p[i];
if (d > diff) diff = d;
d = p[i] - bounds.max[i];
if (d > diff) diff = d;
res += diff*diff;
}
return std::sqrt(res);
}
template<class Bounds>
float
diy::
distance(int dim, const Bounds& bounds1, const Bounds& bounds2)
{
float res = 0;
for (int i = 0; i < dim; ++i)
{
float diff = 0, d;
float d1 = bounds1.max[i] - bounds2.min[i];
float d2 = bounds2.max[i] - bounds1.min[i];
if (d1 > 0 && d2 > 0)
diff = 0;
else if (d1 <= 0)
diff = -d1;
else if (d2 <= 0)
diff = -d2;
res += diff*diff;
}
return std::sqrt(res);
}
//! Finds the neighbor(s) containing the target point.
template<class Bounds, class Point, class OutIter>
void
diy::
in(const RegularLink<Bounds>& link, //!< neighbors
const Point& p, //!< target point
OutIter out, //!< insert iterator for output set of neighbors
const Bounds& domain) //!< global domain bounds
{
Bounds neigh_bounds; // neighbor block bounds
// for all neighbors of this block
for (int n = 0; n < link.size(); n++)
{
// wrap neighbor bounds, if necessary, otherwise bounds will be unchanged
neigh_bounds = link.bounds(n);
wrap_bounds(neigh_bounds, link.wrap(n), domain, link.dimension());
if (distance(link.dimension(), neigh_bounds, p) == 0)
*out++ = n;
} // for all neighbors
}
// wraps block bounds
// wrap dir is the wrapping direction from original block to wrapped neighbor block
// overall domain bounds and dimensionality are also needed
template<class Bounds>
void
diy::
wrap_bounds(Bounds& bounds, Direction wrap_dir, const Bounds& domain, int dim)
{
for (int i = 0; i < dim; ++i)
{
bounds.min[i] += wrap_dir[i] * (domain.max[i] - domain.min[i]);
bounds.max[i] += wrap_dir[i] * (domain.max[i] - domain.min[i]);
}
}
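// Usage sketch (hypothetical; `link` and `domain` come from the block's
// RegularLink and the decomposition; requires <iterator> for back_inserter):
//
//   std::vector<int> nbrs;                 // indices into the link
//   float p[3] = { 0.f, 0.f, 0.f };
//   diy::near(*link, p, 1.0f, std::back_inserter(nbrs), domain);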
#endif

@ -1,120 +0,0 @@
#ifndef VTKMDIY_POINT_HPP
#define VTKMDIY_POINT_HPP
#include <iostream>
#include <vector>
#include <string>
#include <sstream>
#include <array>
namespace diy
{
template<class Coordinate_, unsigned D>
class Point: public std::array<Coordinate_, D>
{
public:
typedef Coordinate_ Coordinate;
typedef std::array<Coordinate, D> ArrayParent;
typedef Point<Coordinate, D-1> LPoint;
typedef Point<Coordinate, D+1> UPoint;
template<class U>
struct rebind { typedef Point<U,D> type; };
public:
Point() { for (unsigned i = 0; i < D; ++i) (*this)[i] = 0; }
Point(const ArrayParent& a):
ArrayParent(a) {}
template<class T> Point(const Point<T, D>& p) { for (size_t i = 0; i < D; ++i) (*this)[i] = p[i]; }
template<class T> Point(const T* a) { for (unsigned i = 0; i < D; ++i) (*this)[i] = a[i]; }
template<class T> Point(const std::vector<T>& a) { for (unsigned i = 0; i < D; ++i) (*this)[i] = a[i]; }
Point(std::initializer_list<Coordinate> lst) { unsigned i = 0; for (Coordinate x : lst) (*this)[i++] = x; }
Point(Point&&) =default;
Point(const Point&) =default;
Point& operator=(const Point&) =default;
static constexpr
unsigned dimension() { return D; }
static Point zero() { return Point(); }
static Point one() { Point p; for (unsigned i = 0; i < D; ++i) p[i] = 1; return p; }
LPoint drop(int dim) const { LPoint p; unsigned c = 0; for (unsigned i = 0; i < D; ++i) { if (i == dim) continue; p[c++] = (*this)[i]; } return p; }
UPoint lift(int dim, Coordinate x) const { UPoint p; for (unsigned i = 0; i < D+1; ++i) { if (i < dim) p[i] = (*this)[i]; else if (i == dim) p[i] = x; else if (i > dim) p[i] = (*this)[i-1]; } return p; }
using ArrayParent::operator[];
Point& operator+=(const Point& y) { for (unsigned i = 0; i < D; ++i) (*this)[i] += y[i]; return *this; }
Point& operator-=(const Point& y) { for (unsigned i = 0; i < D; ++i) (*this)[i] -= y[i]; return *this; }
Point& operator*=(Coordinate a) { for (unsigned i = 0; i < D; ++i) (*this)[i] *= a; return *this; }
Point& operator/=(Coordinate a) { for (unsigned i = 0; i < D; ++i) (*this)[i] /= a; return *this; }
Coordinate norm() const { return (*this)*(*this); }
std::ostream& operator<<(std::ostream& out) const { out << (*this)[0]; for (unsigned i = 1; i < D; ++i) out << " " << (*this)[i]; return out; }
std::istream& operator>>(std::istream& in);
friend
Point operator+(Point x, const Point& y) { x += y; return x; }
friend
Point operator-(Point x, const Point& y) { x -= y; return x; }
friend
Point operator/(Point x, Coordinate y) { x /= y; return x; }
friend
Point operator*(Point x, Coordinate y) { x *= y; return x; }
friend
Point operator*(Coordinate y, Point x) { x *= y; return x; }
friend
Coordinate operator*(const Point& x, const Point& y) { Coordinate n = 0; for (size_t i = 0; i < D; ++i) n += x[i] * y[i]; return n; }
template<class T>
friend
Coordinate operator*(const Point<T,D>& x, const Point& y) { Coordinate n = 0; for (size_t i = 0; i < D; ++i) n += x[i] * y[i]; return n; }
};
template<class C, unsigned D>
std::istream&
Point<C,D>::
operator>>(std::istream& in)
{
std::string point_str;
in >> point_str; // read until ' '
std::stringstream ps(point_str);
char x;
for (unsigned i = 0; i < dimension(); ++i)
{
ps >> (*this)[i];
ps >> x;
}
return in;
}
template<class Coordinate, unsigned D>
Coordinate norm2(const Point<Coordinate,D>& p)
{ Coordinate res = 0; for (unsigned i = 0; i < D; ++i) res += p[i]*p[i]; return res; }
template<class C, unsigned D>
std::ostream&
operator<<(std::ostream& out, const Point<C,D>& p)
{ return p.operator<<(out); }
template<class C, unsigned D>
std::istream&
operator>>(std::istream& in, Point<C,D>& p)
{ return p.operator>>(in); }
}
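// Usage sketch (illustrative):
//
//   diy::Point<float,3> a { 1.f, 2.f, 3.f };
//   diy::Point<float,3> b = diy::Point<float,3>::one();
//   float dot = (a + b) * b;   // component-wise sum {2,3,4}, then inner product: 9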
#endif // VTKMDIY_POINT_HPP

@ -1,293 +0,0 @@
#ifndef VTKMDIY_PROXY_HPP
#define VTKMDIY_PROXY_HPP
namespace diy
{
//! Communication proxy, used for enqueueing and dequeueing items for future exchange.
struct Master::Proxy
{
template <class T>
struct EnqueueIterator;
Proxy(Master* master__, int gid__):
gid_(gid__),
master_(master__),
incoming_(&master__->incoming(gid__)),
outgoing_(&master__->outgoing(gid__)),
collectives_(&master__->collectives(gid__)) {}
int gid() const { return gid_; }
//! Enqueue data whose size can be determined automatically, e.g., an STL vector.
template<class T>
void enqueue(const BlockID& to, //!< target block (gid,proc)
const T& x, //!< data (eg. STL vector)
void (*save)(BinaryBuffer&, const T&) = &::diy::save<T> //!< optional serialization function
) const
{ OutgoingQueues& out = *outgoing_; save(out[to], x); }
//! Enqueue data whose size is given explicitly by the user, e.g., an array.
template<class T>
void enqueue(const BlockID& to, //!< target block (gid,proc)
const T* x, //!< pointer to the data (eg. address of start of vector)
size_t n, //!< size in data elements (eg. ints)
void (*save)(BinaryBuffer&, const T&) = &::diy::save<T> //!< optional serialization function
) const;
//! Dequeue data whose size can be determined automatically (e.g., STL vector) and that was
//! previously enqueued so that diy knows its size when it is received.
//! In this case, diy will allocate the receive buffer; the user does not need to do so.
template<class T>
void dequeue(int from, //!< target block gid
T& x, //!< data (eg. STL vector)
void (*load)(BinaryBuffer&, T&) = &::diy::load<T> //!< optional serialization function
) const
{ IncomingQueues& in = *incoming_; load(in[from], x); }
//! Dequeue an array of data whose size is given explicitly by the user.
//! In this case, the user needs to allocate the receive buffer prior to calling dequeue.
template<class T>
void dequeue(int from, //!< target block gid
T* x, //!< pointer to the data (eg. address of start of vector)
size_t n, //!< size in data elements (eg. ints)
void (*load)(BinaryBuffer&, T&) = &::diy::load<T> //!< optional serialization function
) const;
//! Dequeue data whose size can be determined automatically (e.g., STL vector) and that was
//! previously enqueued so that diy knows its size when it is received.
//! In this case, diy will allocate the receive buffer; the user does not need to do so.
template<class T>
void dequeue(const BlockID& from, //!< target block (gid,proc)
T& x, //!< data (eg. STL vector)
void (*load)(BinaryBuffer&, T&) = &::diy::load<T> //!< optional serialization function
) const { dequeue(from.gid, x, load); }
//! Dequeue an array of data whose size is given explicitly by the user.
//! In this case, the user needs to allocate the receive buffer prior to calling dequeue.
template<class T>
void dequeue(const BlockID& from, //!< target block (gid,proc)
T* x, //!< pointer to the data (eg. address of start of vector)
size_t n, //!< size in data elements (eg. ints)
void (*load)(BinaryBuffer&, T&) = &::diy::load<T> //!< optional serialization function
) const { dequeue(from.gid, x, n, load); }
template<class T>
EnqueueIterator<T> enqueuer(const T& x,
void (*save)(BinaryBuffer&, const T&) = &::diy::save<T>) const
{ return EnqueueIterator<T>(this, x, save); }
IncomingQueues* incoming() const { return incoming_; }
MemoryBuffer& incoming(int from) const { return (*incoming_)[from]; }
inline void incoming(std::vector<int>& v) const; // fill v with every gid from which we have a message
OutgoingQueues* outgoing() const { return outgoing_; }
MemoryBuffer& outgoing(const BlockID& to) const { return (*outgoing_)[to]; }
/**
* \ingroup Communication
* \brief Post an all-reduce collective using an existing communication proxy.
* Available operators are:
* maximum<T>, minimum<T>, std::plus<T>, std::multiplies<T>, std::logical_and<T>, and
* std::logical_or<T>.
*/
template<class T, class Op>
inline void all_reduce(const T& in, //!< local value being reduced
Op op //!< operator
) const;
/**
* \ingroup Communication
* \brief Return the result of a proxy collective without popping it off the collectives list (same result would be returned multiple times). The list can be cleared with collectives()->clear().
*/
template<class T>
inline T read() const;
/**
* \ingroup Communication
* \brief Return the result of a proxy collective; result is popped off the collectives list.
*/
template<class T>
inline T get() const;
template<class T>
inline void scratch(const T& in) const;
/**
* \ingroup Communication
* \brief Return the list of proxy collectives (values and operations)
*/
CollectivesList* collectives() const { return collectives_; }
Master* master() const { return master_; }
private:
int gid_;
Master* master_;
IncomingQueues* incoming_;
OutgoingQueues* outgoing_;
CollectivesList* collectives_;
};
template<class T>
struct Master::Proxy::EnqueueIterator:
public std::iterator<std::output_iterator_tag, void, void, void, void>
{
typedef void (*SaveT)(BinaryBuffer&, const T&);
EnqueueIterator(const Proxy* proxy, const T& x,
SaveT save = &::diy::save<T>):
proxy_(proxy), x_(x), save_(save) {}
EnqueueIterator& operator=(const BlockID& to) { proxy_->enqueue(to, x_, save_); return *this; }
EnqueueIterator& operator*() { return *this; }
EnqueueIterator& operator++() { return *this; }
EnqueueIterator& operator++(int) { return *this; }
private:
const Proxy* proxy_;
const T& x_;
SaveT save_;
};
struct Master::ProxyWithLink: public Master::Proxy
{
ProxyWithLink(const Proxy& proxy,
void* block__,
Link* link__,
IExchangeInfo* iexchange__ = 0):
Proxy(proxy),
block_(block__),
link_(link__),
iexchange_(iexchange__) {}
Link* link() const { return link_; }
void* block() const { return block_; }
private:
void* block_;
Link* link_;
IExchangeInfo* iexchange_; // not used for iexchange presently, but later could trigger some special behavior
public:
template<class T>
void enqueue(const BlockID& to,
const T& x,
void (*save)(BinaryBuffer&, const T&) = &::diy::save<T>) const
{
diy::Master::Proxy::enqueue(to, x, save);
if (iexchange_)
master()->icommunicate(iexchange_);
}
template<class T>
void enqueue(const BlockID& to,
const T* x,
size_t n,
void (*save)(BinaryBuffer&, const T&) = &::diy::save<T>) const
{
diy::Master::Proxy::enqueue(to, x, n, save);
if (iexchange_)
master()->icommunicate(iexchange_);
}
template<class T>
void dequeue(int from,
T& x,
void (*load)(BinaryBuffer&, T&) = &::diy::load<T>) const
{
// TODO: uncomment if necessary, try first without icommunicating on dequeue
// if (iexchange_)
// master()->icommunicate(iexchange_);
diy::Master::Proxy::dequeue(from, x, load);
}
template<class T>
void dequeue(int from,
T* x,
size_t n,
void (*load)(BinaryBuffer&, T&) = &::diy::load<T>) const
{
// TODO: uncomment if necessary, try first without icommunicating on dequeue
// if (iexchange_)
// master()->icommunicate(iexchange_);
diy::Master::Proxy::dequeue(from, x, n, load);
}
};
} // diy namespace
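// Usage sketch (hypothetical callbacks for master.foreach(); `Block` and its
// `values` member are user-defined; a master.exchange() separates the rounds):
//
//   void send(Block* b, const diy::Master::ProxyWithLink& cp)
//   {
//     for (int i = 0; i < cp.link()->size(); ++i)
//       cp.enqueue(cp.link()->target(i), b->values);
//   }
//   void recv(Block* b, const diy::Master::ProxyWithLink& cp)
//   {
//     std::vector<int> in;
//     cp.incoming(in);                     // gids with pending messages
//     for (size_t i = 0; i < in.size(); ++i)
//       cp.dequeue(in[i], b->values);
//   }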
void
diy::Master::Proxy::
incoming(std::vector<int>& v) const
{
for (IncomingQueues::const_iterator it = incoming_->begin(); it != incoming_->end(); ++it)
v.push_back(it->first);
}
template<class T, class Op>
void
diy::Master::Proxy::
all_reduce(const T& in, Op op) const
{
collectives_->push_back(Collective(new detail::AllReduceOp<T,Op>(in, op)));
}
template<class T>
T
diy::Master::Proxy::
read() const
{
T res;
collectives_->front().result_out(&res);
return res;
}
template<class T>
T
diy::Master::Proxy::
get() const
{
T res = read<T>();
collectives_->pop_front();
return res;
}
template<class T>
void
diy::Master::Proxy::
scratch(const T& in) const
{
collectives_->push_back(Collective(new detail::Scratch<T>(in)));
}
template<class T>
void
diy::Master::Proxy::
enqueue(const BlockID& to, const T* x, size_t n,
void (*save)(BinaryBuffer&, const T&)) const
{
OutgoingQueues& out = *outgoing_;
BinaryBuffer& bb = out[to];
if (save == (void (*)(BinaryBuffer&, const T&)) &::diy::save<T>)
diy::save(bb, x, n); // optimized for unspecialized types
else
for (size_t i = 0; i < n; ++i)
save(bb, x[i]);
}
template<class T>
void
diy::Master::Proxy::
dequeue(int from, T* x, size_t n,
void (*load)(BinaryBuffer&, T&)) const
{
IncomingQueues& in = *incoming_;
BinaryBuffer& bb = in[from];
if (load == (void (*)(BinaryBuffer&, T&)) &::diy::load<T>)
diy::load(bb, x, n); // optimized for unspecialized types
else
for (size_t i = 0; i < n; ++i)
load(bb, x[i]);
}
#endif

@ -1,33 +0,0 @@
#ifndef VTKMDIY_REDUCE_OPERATIONS_HPP
#define VTKMDIY_REDUCE_OPERATIONS_HPP
#include "reduce.hpp"
#include "partners/swap.hpp"
#include "detail/reduce/all-to-all.hpp"
namespace diy
{
/**
* \ingroup Communication
* \brief all to all reduction
*
*/
template<class Op>
void
all_to_all(Master& master, //!< block owner
const Assigner& assigner, //!< global block locator (maps gid to proc)
const Op& op, //!< user-defined operation called to enqueue and dequeue items
int k = 2 //!< reduction fanout
)
{
auto scoped = master.prof.scoped("all_to_all");
(void)scoped;
RegularDecomposer<DiscreteBounds> decomposer(1, interval(0,assigner.nblocks()-1), assigner.nblocks());
RegularSwapPartners partners(decomposer, k, false);
reduce(master, assigner, partners, detail::AllToAllReduce<Op>(op, assigner), detail::SkipIntermediate(partners.rounds()));
}
}
#endif

@ -1,216 +0,0 @@
#ifndef VTKMDIY_REDUCE_HPP
#define VTKMDIY_REDUCE_HPP
#include <vector>
#include "master.hpp"
#include "assigner.hpp"
#include "detail/block_traits.hpp"
#include "log.hpp"
namespace diy
{
//! Enables communication within a group during a reduction.
//! DIY creates the ReduceProxy for you in diy::reduce()
//! and provides a reference to ReduceProxy each time the user's reduction function is called
struct ReduceProxy: public Master::Proxy
{
typedef std::vector<int> GIDVector;
ReduceProxy(const Master::Proxy& proxy, //!< parent proxy
void* block, //!< diy block
unsigned round, //!< current round
const Assigner& assigner, //!< assigner
const GIDVector& incoming_gids, //!< incoming gids in this group
const GIDVector& outgoing_gids): //!< outgoing gids in this group
Master::Proxy(proxy),
block_(block),
round_(round),
assigner_(assigner)
{
// setup in_link
for (unsigned i = 0; i < incoming_gids.size(); ++i)
{
BlockID nbr;
nbr.gid = incoming_gids[i];
nbr.proc = assigner.rank(nbr.gid);
in_link_.add_neighbor(nbr);
}
// setup out_link
for (unsigned i = 0; i < outgoing_gids.size(); ++i)
{
BlockID nbr;
nbr.gid = outgoing_gids[i];
nbr.proc = assigner.rank(nbr.gid);
out_link_.add_neighbor(nbr);
}
}
ReduceProxy(const Master::Proxy& proxy, //!< parent proxy
void* block, //!< diy block
unsigned round, //!< current round
const Assigner& assigner,
const Link& in_link,
const Link& out_link):
Master::Proxy(proxy),
block_(block),
round_(round),
assigner_(assigner),
in_link_(in_link),
out_link_(out_link)
{}
//! returns pointer to block
void* block() const { return block_; }
//! returns current round number
unsigned round() const { return round_; }
//! returns incoming link
const Link& in_link() const { return in_link_; }
//! returns outgoing link
const Link& out_link() const { return out_link_; }
//! returns total number of blocks
int nblocks() const { return assigner_.nblocks(); }
//! returns the assigner
const Assigner& assigner() const { return assigner_; }
//! advanced: change current round number
void set_round(unsigned r) { round_ = r; }
private:
void* block_;
unsigned round_;
const Assigner& assigner_;
Link in_link_;
Link out_link_;
};
namespace detail
{
template<class Block, class Partners>
struct ReductionFunctor;
template<class Partners, class Skip>
struct SkipInactiveOr;
struct ReduceNeverSkip
{
bool operator()(int, int, const Master&) const { return false; }
};
}
/**
* \ingroup Communication
* \brief Implementation of the reduce communication pattern (includes
* swap-reduce, merge-reduce, and any other global communication).
*
*/
template<class Reduce, class Partners, class Skip>
void reduce(Master& master, //!< master object
const Assigner& assigner, //!< assigner object
const Partners& partners, //!< partners object
const Reduce& reduce, //!< reduction callback function
const Skip& skip) //!< object determining whether a block should be skipped
{
auto log = get_logger();
int original_expected = master.expected();
using Block = typename detail::block_traits<Reduce>::type;
unsigned round;
for (round = 0; round < partners.rounds(); ++round)
{
log->debug("Round {}", round);
master.foreach(detail::ReductionFunctor<Block,Partners>(round, reduce, partners, assigner),
detail::SkipInactiveOr<Partners,Skip>(round, partners, skip));
master.execute();
int expected = 0;
for (unsigned i = 0; i < master.size(); ++i)
{
if (partners.active(round + 1, master.gid(i), master))
{
std::vector<int> incoming_gids;
partners.incoming(round + 1, master.gid(i), incoming_gids, master);
expected += static_cast<int>(incoming_gids.size());
master.incoming(master.gid(i)).clear();
}
}
master.set_expected(expected);
master.flush();
}
// final round
log->debug("Round {}", round);
master.foreach(detail::ReductionFunctor<Block,Partners>(round, reduce, partners, assigner),
detail::SkipInactiveOr<Partners,Skip>(round, partners, skip));
master.set_expected(original_expected);
}
/**
* \ingroup Communication
* \brief Implementation of the reduce communication pattern (includes
* swap-reduce, merge-reduce, and any other global communication).
*
*/
template<class Reduce, class Partners>
void reduce(Master& master, //!< master object
const Assigner& assigner, //!< assigner object
const Partners& partners, //!< partners object
const Reduce& reducer) //!< reduction callback function
{
reduce(master, assigner, partners, reducer, detail::ReduceNeverSkip());
}
namespace detail
{
template<class Block, class Partners>
struct ReductionFunctor
{
using Callback = std::function<void(Block*, const ReduceProxy&, const Partners&)>;
ReductionFunctor(unsigned round_, const Callback& reduce_, const Partners& partners_, const Assigner& assigner_):
round(round_), reduce(reduce_), partners(partners_), assigner(assigner_) {}
void operator()(Block* b, const Master::ProxyWithLink& cp) const
{
if (!partners.active(round, cp.gid(), *cp.master())) return;
std::vector<int> incoming_gids, outgoing_gids;
if (round > 0)
partners.incoming(round, cp.gid(), incoming_gids, *cp.master()); // receive from the previous round
if (round < partners.rounds())
partners.outgoing(round, cp.gid(), outgoing_gids, *cp.master()); // send to the next round
ReduceProxy rp(cp, b, round, assigner, incoming_gids, outgoing_gids);
reduce(b, rp, partners);
// touch the outgoing queues to make sure they exist
Master::OutgoingQueues& outgoing = *cp.outgoing();
if (outgoing.size() < (size_t) rp.out_link().size())
for (int j = 0; j < rp.out_link().size(); ++j)
outgoing[rp.out_link().target(j)]; // touch the outgoing queue, creating it if necessary
}
unsigned round;
Callback reduce;
Partners partners;
const Assigner& assigner;
};
template<class Partners, class Skip>
struct SkipInactiveOr
{
SkipInactiveOr(int round_, const Partners& partners_, const Skip& skip_):
round(round_), partners(partners_), skip(skip_) {}
bool operator()(int i, const Master& master) const { return !partners.active(round, master.gid(i), master) || skip(round, i, master); }
int round;
const Partners& partners;
Skip skip;
};
}
} // diy
#endif // VTKMDIY_REDUCE_HPP

@ -1,72 +0,0 @@
#ifndef VTKMDIY_RESOLVE_HPP
#define VTKMDIY_RESOLVE_HPP
#include "master.hpp"
#include "assigner.hpp"
namespace diy
{
// record master gids in the assigner and then look up the procs for all gids in the links
inline void fix_links(diy::Master& master, diy::DynamicAssigner& assigner);
// auxiliary functions; could stick them into detail namespace, but they might be useful on their own
inline void record_local_gids(const diy::Master& master, diy::DynamicAssigner& assigner);
inline void update_links(diy::Master& master, const diy::DynamicAssigner& assigner);
}
void
diy::
record_local_gids(const diy::Master& master, diy::DynamicAssigner& assigner)
{
// record the (rank, gid) pair for every local block
std::vector<std::tuple<int,int>> local_gids;
for (int i = 0; i < master.size(); ++i)
local_gids.emplace_back(std::make_tuple(master.communicator().rank(), master.gid(i)));
assigner.set_ranks(local_gids);
}
void
diy::
update_links(diy::Master& master, const diy::DynamicAssigner& assigner)
{
// figure out all the gids we need
std::vector<int> nbr_gids;
for (int i = 0; i < master.size(); ++i)
{
auto* link = master.link(i);
for (auto blockid : link->neighbors())
nbr_gids.emplace_back(blockid.gid);
}
// keep only unique gids to avoid unnecessary lookups
std::sort(nbr_gids.begin(), nbr_gids.end());
nbr_gids.resize(std::unique(nbr_gids.begin(), nbr_gids.end()) - nbr_gids.begin());
// get the neighbor ranks
std::vector<int> nbr_procs = assigner.ranks(nbr_gids);
// convert to a map
std::unordered_map<int, int> gid_to_proc;
for (size_t i = 0; i < nbr_gids.size(); ++i)
gid_to_proc[nbr_gids[i]] = nbr_procs[i];
// fix the procs in links
for (int i = 0; i < master.size(); ++i)
{
auto* link = master.link(i);
for (auto& blockid : link->neighbors())
blockid.proc = gid_to_proc[blockid.gid];
}
}
void
diy::
fix_links(diy::Master& master, diy::DynamicAssigner& assigner)
{
record_local_gids(master, assigner);
master.communicator().barrier(); // make sure everyone has set ranks
update_links(master, assigner);
}
#endif

@ -1,509 +0,0 @@
#ifndef VTKMDIY_SERIALIZATION_HPP
#define VTKMDIY_SERIALIZATION_HPP
#include <algorithm> // std::copy_n, used by MemoryBuffer
#include <vector>
#include <valarray>
#include <map>
#include <set>
#include <string>
#include <fstream>
#include <tuple>
#include <unordered_map>
#include <unordered_set>
#include <type_traits> // this is used for a safety check for default serialization
namespace diy
{
//! A serialization buffer. \ingroup Serialization
struct BinaryBuffer
{
virtual ~BinaryBuffer() =default;
virtual void save_binary(const char* x, size_t count) =0; //!< copy `count` bytes from `x` into the buffer
virtual inline void append_binary(const char* x, size_t count) =0; //!< append `count` bytes from `x` to end of buffer
virtual void load_binary(char* x, size_t count) =0; //!< copy `count` bytes into `x` from the buffer
virtual void load_binary_back(char* x, size_t count) =0; //!< copy `count` bytes into `x` from the back of the buffer
};
struct MemoryBuffer: public BinaryBuffer
{
MemoryBuffer(size_t position_ = 0):
position(position_) {}
virtual inline void save_binary(const char* x, size_t count) override; //!< copy `count` bytes from `x` into the buffer
virtual inline void append_binary(const char* x, size_t count) override; //!< append `count` bytes from `x` to end of buffer
virtual inline void load_binary(char* x, size_t count) override; //!< copy `count` bytes into `x` from the buffer
virtual inline void load_binary_back(char* x, size_t count) override; //!< copy `count` bytes into `x` from the back of the buffer
void clear() { buffer.clear(); reset(); }
void wipe() { std::vector<char>().swap(buffer); reset(); }
void reset() { position = 0; }
void skip(size_t s) { position += s; }
void swap(MemoryBuffer& o) { std::swap(position, o.position); buffer.swap(o.buffer); }
bool empty() const { return buffer.empty(); }
size_t size() const { return buffer.size(); }
void reserve(size_t s) { buffer.reserve(s); }
operator bool() const { return position < buffer.size(); }
//! copy a memory buffer from one buffer to another, bypassing making a temporary copy first
inline static void copy(MemoryBuffer& from, MemoryBuffer& to);
//! multiplier used for the geometric growth of the container
static float growth_multiplier() { return 1.5; }
// simple file IO
void write(const std::string& fn) const { std::ofstream out(fn.c_str(), std::ios::binary); out.write(buffer.data(), static_cast<std::streamsize>(size())); }
void read(const std::string& fn)
{
std::ifstream in(fn.c_str(), std::ios::binary | std::ios::ate);
buffer.resize(static_cast<size_t>(in.tellg()));
in.seekg(0);
in.read(&buffer[0], static_cast<std::streamsize>(size()));
position = 0;
}
size_t position;
std::vector<char> buffer;
};
namespace detail
{
struct Default {};
}
//!\addtogroup Serialization
//!@{
/**
* \brief Main interface to serialization, meant to be specialized for the
* types that require special handling. `diy::save()` and `diy::load()` call
* the static member functions of this class.
*
* The default (unspecialized) version copies
* `sizeof(T)` bytes from `&x` to or from `bb` via
* its `diy::BinaryBuffer::save_binary()` and `diy::BinaryBuffer::load_binary()`
* functions. This works out perfectly for plain old data (e.g., simple structs).
* To save a more complicated type, one has to specialize
* `diy::Serialization<T>` for that type. Specializations are already provided for
* `std::vector<T>`, `std::map<K,V>`, and `std::pair<T,U>`.
 * These can serve as models for quickly writing a specialization of one's own.
*
*/
template<class T>
struct Serialization: public detail::Default
{
#if (defined(__clang__) && !defined(__ppc64__)) || (defined(__GNUC__) && __GNUC__ >= 5)
//exempt power-pc clang variants due to: https://gitlab.kitware.com/vtk/vtk-m/issues/201
static_assert(std::is_trivially_copyable<T>::value, "Default serialization works only for trivially copyable types");
#endif
static void save(BinaryBuffer& bb, const T& x) { bb.save_binary((const char*) &x, sizeof(T)); }
static void load(BinaryBuffer& bb, T& x) { bb.load_binary((char*) &x, sizeof(T)); }
static size_t size(const T& x) { return sizeof(T); }
};
//! Saves `x` to `bb` by calling `diy::Serialization<T>::save(bb,x)`.
template<class T>
void save(BinaryBuffer& bb, const T& x) { Serialization<T>::save(bb, x); }
//! Loads `x` from `bb` by calling `diy::Serialization<T>::load(bb,x)`.
template<class T>
void load(BinaryBuffer& bb, T& x) { Serialization<T>::load(bb, x); }
//! Optimization for arrays. If `diy::Serialization` is not specialized for `T`,
//! the array will be copied all at once. Otherwise, it's copied element by element.
template<class T>
void save(BinaryBuffer& bb, const T* x, size_t n);
//! Optimization for arrays. If `diy::Serialization` is not specialized for `T`,
//! the array will be filled all at once. Otherwise, it's filled element by element.
template<class T>
void load(BinaryBuffer& bb, T* x, size_t n);
//! Supports only binary data copying (meant for simple footers).
template<class T>
void load_back(BinaryBuffer& bb, T& x) { bb.load_binary_back((char*) &x, sizeof(T)); }
//!@}
namespace detail
{
template<typename T>
struct is_default
{
typedef char yes;
typedef int no;
static yes test(Default*);
static no test(...);
enum { value = (sizeof(test((T*) 0)) == sizeof(yes)) };
};
}
template<class T>
void save(BinaryBuffer& bb, const T* x, size_t n)
{
if (!detail::is_default< Serialization<T> >::value)
for (size_t i = 0; i < n; ++i)
diy::save(bb, x[i]);
else // if Serialization is not specialized for T, just save the binary data
bb.save_binary((const char*) &x[0], sizeof(T)*n);
}
template<class T>
void load(BinaryBuffer& bb, T* x, size_t n)
{
if (!detail::is_default< Serialization<T> >::value)
for (size_t i = 0; i < n; ++i)
diy::load(bb, x[i]);
else // if Serialization is not specialized for T, just load the binary data
bb.load_binary((char*) &x[0], sizeof(T)*n);
}
// save/load for MemoryBuffer
template<>
struct Serialization< MemoryBuffer >
{
static void save(BinaryBuffer& bb, const MemoryBuffer& x)
{
diy::save(bb, x.position);
diy::save(bb, &x.buffer[0], x.position);
}
static void load(BinaryBuffer& bb, MemoryBuffer& x)
{
diy::load(bb, x.position);
x.buffer.resize(x.position);
diy::load(bb, &x.buffer[0], x.position);
}
static size_t size(const MemoryBuffer& x)
{
return sizeof(x.position) + x.position;
}
};
// save/load for std::vector<U>
template<class U>
struct Serialization< std::vector<U> >
{
typedef std::vector<U> Vector;
static void save(BinaryBuffer& bb, const Vector& v)
{
size_t s = v.size();
diy::save(bb, s);
if (s > 0)
diy::save(bb, &v[0], v.size());
}
static void load(BinaryBuffer& bb, Vector& v)
{
size_t s;
diy::load(bb, s);
v.resize(s);
if (s > 0)
diy::load(bb, &v[0], s);
}
};
template<class U>
struct Serialization< std::valarray<U> >
{
typedef std::valarray<U> ValArray;
static void save(BinaryBuffer& bb, const ValArray& v)
{
size_t s = v.size();
diy::save(bb, s);
if (s > 0)
diy::save(bb, &v[0], v.size());
}
static void load(BinaryBuffer& bb, ValArray& v)
{
size_t s;
diy::load(bb, s);
v.resize(s);
if (s > 0)
diy::load(bb, &v[0], s);
}
};
// save/load for std::string
template<>
struct Serialization< std::string >
{
typedef std::string String;
static void save(BinaryBuffer& bb, const String& s)
{
size_t sz = s.size();
diy::save(bb, sz);
diy::save(bb, s.c_str(), sz);
}
static void load(BinaryBuffer& bb, String& s)
{
size_t sz;
diy::load(bb, sz);
s.resize(sz);
for (size_t i = 0; i < sz; ++i)
{
char c;
diy::load(bb, c);
s[i] = c;
}
}
};
// save/load for std::pair<X,Y>
template<class X, class Y>
struct Serialization< std::pair<X,Y> >
{
typedef std::pair<X,Y> Pair;
static void save(BinaryBuffer& bb, const Pair& p)
{
diy::save(bb, p.first);
diy::save(bb, p.second);
}
static void load(BinaryBuffer& bb, Pair& p)
{
diy::load(bb, p.first);
diy::load(bb, p.second);
}
};
// save/load for std::map<K,V>
template<class K, class V>
struct Serialization< std::map<K,V> >
{
typedef std::map<K,V> Map;
static void save(BinaryBuffer& bb, const Map& m)
{
size_t s = m.size();
diy::save(bb, s);
for (typename std::map<K,V>::const_iterator it = m.begin(); it != m.end(); ++it)
diy::save(bb, *it);
}
static void load(BinaryBuffer& bb, Map& m)
{
size_t s;
diy::load(bb, s);
for (size_t i = 0; i < s; ++i)
{
K k;
diy::load(bb, k);
diy::load(bb, m[k]);
}
}
};
// save/load for std::set<T>
template<class T>
struct Serialization< std::set<T> >
{
typedef std::set<T> Set;
static void save(BinaryBuffer& bb, const Set& m)
{
size_t s = m.size();
diy::save(bb, s);
for (typename std::set<T>::const_iterator it = m.begin(); it != m.end(); ++it)
diy::save(bb, *it);
}
static void load(BinaryBuffer& bb, Set& m)
{
size_t s;
diy::load(bb, s);
for (size_t i = 0; i < s; ++i)
{
T p;
diy::load(bb, p);
m.insert(p);
}
}
};
// save/load for std::unordered_map<K,V,H,E,A>
template<class K, class V, class H, class E, class A>
struct Serialization< std::unordered_map<K,V,H,E,A> >
{
typedef std::unordered_map<K,V,H,E,A> Map;
static void save(BinaryBuffer& bb, const Map& m)
{
size_t s = m.size();
diy::save(bb, s);
for (auto& x : m)
diy::save(bb, x);
}
static void load(BinaryBuffer& bb, Map& m)
{
size_t s;
diy::load(bb, s);
for (size_t i = 0; i < s; ++i)
{
std::pair<K,V> p;
diy::load(bb, p);
m.emplace(std::move(p));
}
}
};
// save/load for std::unordered_set<T,H,E,A>
template<class T, class H, class E, class A>
struct Serialization< std::unordered_set<T,H,E,A> >
{
typedef std::unordered_set<T,H,E,A> Set;
static void save(BinaryBuffer& bb, const Set& m)
{
size_t s = m.size();
diy::save(bb, s);
for (auto& x : m)
diy::save(bb, x);
}
static void load(BinaryBuffer& bb, Set& m)
{
size_t s;
diy::load(bb, s);
for (size_t i = 0; i < s; ++i)
{
T p;
diy::load(bb, p);
m.emplace(std::move(p));
}
}
};
// save/load for std::tuple<...>
// TODO: this ought to be default (copying) serialization
// if all arguments are default
template<class... Args>
struct Serialization< std::tuple<Args...> >
{
typedef std::tuple<Args...> Tuple;
static void save(BinaryBuffer& bb, const Tuple& t) { save<0>(bb, t); }
template<std::size_t I = 0>
static
typename std::enable_if<I == sizeof...(Args), void>::type
save(BinaryBuffer&, const Tuple&) {}
template<std::size_t I = 0>
static
typename std::enable_if<I < sizeof...(Args), void>::type
save(BinaryBuffer& bb, const Tuple& t) { diy::save(bb, std::get<I>(t)); save<I+1>(bb, t); }
static void load(BinaryBuffer& bb, Tuple& t) { load<0>(bb, t); }
template<std::size_t I = 0>
static
typename std::enable_if<I == sizeof...(Args), void>::type
load(BinaryBuffer&, Tuple&) {}
template<std::size_t I = 0>
static
typename std::enable_if<I < sizeof...(Args), void>::type
load(BinaryBuffer& bb, Tuple& t) { diy::load(bb, std::get<I>(t)); load<I+1>(bb, t); }
};
}
void
diy::MemoryBuffer::
save_binary(const char* x, size_t count)
{
if (position + count > buffer.capacity())
{
double newsize = static_cast<double>(position + count) * growth_multiplier(); // if we have to grow, grow geometrically
buffer.reserve(static_cast<size_t>(newsize));
}
if (position + count > buffer.size())
buffer.resize(position + count);
std::copy_n(x, count, &buffer[position]);
position += count;
}
void
diy::MemoryBuffer::
append_binary(const char* x, size_t count)
{
if (buffer.size() + count > buffer.capacity()) // growth/copying will be triggered
{
size_t cur_size = buffer.size() - position;
size_t new_size = cur_size + count;
if (new_size * growth_multiplier() <= buffer.capacity()) // we have enough space in this buffer, copy in place
{
// copy the data to the beginning of the buffer and reduce its size
for (size_t i = 0; i < cur_size; ++i)
buffer[i] = buffer[position++];
buffer.resize(cur_size);
position = 0;
} else
{
std::vector<char> tmp;
tmp.reserve(static_cast<size_t>(static_cast<float>(new_size) * growth_multiplier())); // cast after multiplying, so the growth factor isn't truncated to 1
tmp.resize(cur_size);
for (size_t i = 0; i < tmp.size(); ++i)
tmp[i] = buffer[position++];
buffer.swap(tmp);
position = 0;
}
}
size_t temp_pos = position;
position = size();
save_binary(x, count);
position = temp_pos;
}
void
diy::MemoryBuffer::
load_binary(char* x, size_t count)
{
std::copy_n(&buffer[position], count, x);
position += count;
}
void
diy::MemoryBuffer::
load_binary_back(char* x, size_t count)
{
std::copy_n(&buffer[buffer.size() - count], count, x);
buffer.resize(buffer.size() - count);
}
void
diy::MemoryBuffer::
copy(MemoryBuffer& from, MemoryBuffer& to)
{
size_t sz;
diy::load(from, sz);
from.position -= sizeof(size_t);
size_t total = sizeof(size_t) + sz;
to.buffer.resize(to.position + total);
std::copy_n(&from.buffer[from.position], total, &to.buffer[to.position]);
to.position += total;
from.position += total;
}
#endif

@ -1,119 +0,0 @@
#ifndef VTKMDIY_STATS_HPP
#define VTKMDIY_STATS_HPP
#include <chrono>
#include <string>
#include <vector>
#include "log.hpp" // need this for format
namespace diy
{
namespace stats
{
#if defined(DIY_PROFILE)
struct Profiler
{
using Clock = std::chrono::high_resolution_clock;
using Time = Clock::time_point;
struct Event
{
Event(const std::string& name_, bool begin_):
name(name_),
begin(begin_),
stamp(Clock::now())
{}
std::string name;
bool begin;
Time stamp;
};
using EventsVector = std::vector<Event>;
struct Scoped
{
Scoped(Profiler& prof_, std::string name_):
prof(prof_), name(name_), active(true) { prof << name; }
~Scoped() { if (active) prof >> name; }
Scoped(Scoped&& other):
prof(other.prof),
name(other.name),
active(other.active) { other.active = false; }
Scoped&
operator=(Scoped&& other) = delete;
Scoped(const Scoped&) = delete;
Scoped&
operator=(const Scoped&) = delete;
Profiler& prof;
std::string name;
bool active;
};
Profiler() { reset_time(); }
void reset_time() { start = Clock::now(); }
void operator<<(std::string name) { enter(name); }
void operator>>(std::string name) { exit(name); }
void enter(std::string name) { events.push_back(Event(name, true)); }
void exit(std::string name) { events.push_back(Event(name, false)); }
void output(std::ostream& out, std::string prefix = "")
{
if (!prefix.empty())
prefix += " ";
for (size_t i = 0; i < events.size(); ++i)
{
const Event& e = events[i];
auto time = std::chrono::duration_cast<std::chrono::microseconds>(e.stamp - start).count();
fmt::print(out, "{}{:02d}:{:02d}:{:02d}.{:06d} {}{}\n",
prefix,
time/1000000/60/60,
time/1000000/60 % 60,
time/1000000 % 60,
time % 1000000,
(e.begin ? '<' : '>'),
e.name);
}
}
Scoped scoped(std::string name) { return Scoped(*this, name); }
void clear() { events.clear(); }
private:
Time start;
EventsVector events;
};
#else
struct Profiler
{
struct Scoped {};
void reset_time() {}
void operator<<(std::string) {}
void operator>>(std::string) {}
void enter(const std::string&) {}
void exit(const std::string&) {}
void output(std::ostream&, std::string = "") {}
void clear() {}
Scoped scoped(std::string) { return Scoped(); }
};
#endif
}
}
#endif

@ -1,246 +0,0 @@
#ifndef VTKMDIY_STORAGE_HPP
#define VTKMDIY_STORAGE_HPP
#include <string>
#include <map>
#include <fstream>
#include <cstdlib> // std::rand, used by open_random()
#include <fcntl.h>
#include "serialization.hpp"
#include "thread.hpp"
#include "log.hpp"
#include "io/utils.hpp"
namespace diy
{
namespace detail
{
typedef void (*Save)(const void*, BinaryBuffer& buf);
typedef void (*Load)(void*, BinaryBuffer& buf);
struct FileBuffer: public BinaryBuffer
{
FileBuffer(FILE* file_): file(file_), head(0), tail(0) {}
// TODO: add error checking
virtual inline void save_binary(const char* x, size_t count) override { fwrite(x, 1, count, file); head += count; }
virtual inline void append_binary(const char* x, size_t count) override
{
size_t temp_pos = static_cast<size_t>(ftell(file));
fseek(file, static_cast<long>(tail), SEEK_END);
fwrite(x, 1, count, file);
tail += count;
fseek(file, static_cast<long>(temp_pos), SEEK_SET);
}
virtual inline void load_binary(char* x, size_t count) override { auto n = fread(x, 1, count, file); (void) n;}
virtual inline void load_binary_back(char* x, size_t count) override { fseek(file, static_cast<long>(tail), SEEK_END); auto n = fread(x, 1, count, file); tail += count; fseek(file, static_cast<long>(head), SEEK_SET); (void) n;}
size_t size() const { return head; }
FILE* file;
size_t head, tail; // tail is used to support reading from the back;
// the mechanism is a little awkward and unused, but should work if needed
};
}
class ExternalStorage
{
public:
virtual int put(MemoryBuffer& bb) =0;
virtual int put(const void* x, detail::Save save) =0;
virtual void get(int i, MemoryBuffer& bb, size_t extra = 0) =0;
virtual void get(int i, void* x, detail::Load load) =0;
virtual void destroy(int i) =0;
};
class FileStorage: public ExternalStorage
{
private:
struct FileRecord
{
size_t size;
std::string name;
};
public:
FileStorage(const std::string& filename_template = "/tmp/DIY.XXXXXX"):
filename_templates_(1, filename_template),
count_(0), current_size_(0), max_size_(0) {}
FileStorage(const std::vector<std::string>& filename_templates):
filename_templates_(filename_templates),
count_(0), current_size_(0), max_size_(0) {}
virtual int put(MemoryBuffer& bb) override
{
auto log = get_logger();
std::string filename;
int fh = open_random(filename);
log->debug("FileStorage::put(): {}; buffer size: {}", filename, bb.size());
size_t sz = bb.buffer.size();
#if defined(_WIN32)
using r_type = int;
r_type written = _write(fh, &bb.buffer[0], static_cast<unsigned int>(sz));
#else
using r_type = ssize_t;
r_type written = write(fh, &bb.buffer[0], sz);
#endif
if (written < static_cast<r_type>(sz) || written == r_type(-1))
log->warn("Could not write the full buffer to {}: written = {}; size = {}", filename, written, sz);
io::utils::close(fh);
bb.wipe();
#if 0 // double-check the written file size: only for extreme debugging
FILE* fp = fopen(filename.c_str(), "r");
fseek(fp, 0L, SEEK_END);
int fsz = ftell(fp);
if (static_cast<size_t>(fsz) != sz)
log->warn("file size doesn't match the buffer size, {} vs {}", fsz, sz);
fclose(fp);
#endif
return make_file_record(filename, sz);
}
virtual int put(const void* x, detail::Save save) override
{
std::string filename;
int fh = open_random(filename);
#if defined(_WIN32)
detail::FileBuffer fb(_fdopen(fh, "wb"));
#else
detail::FileBuffer fb(fdopen(fh, "w"));
#endif
save(x, fb);
size_t sz = fb.size();
fclose(fb.file);
io::utils::sync(fh);
return make_file_record(filename, sz);
}
virtual void get(int i, MemoryBuffer& bb, size_t extra) override
{
FileRecord fr = extract_file_record(i);
get_logger()->debug("FileStorage::get(): {}", fr.name);
bb.buffer.reserve(fr.size + extra);
bb.buffer.resize(fr.size);
#if defined(_WIN32)
int fh = -1;
_sopen_s(&fh, fr.name.c_str(), _O_RDONLY | _O_BINARY, _SH_DENYNO, _S_IREAD);
_read(fh, &bb.buffer[0], static_cast<unsigned int>(fr.size));
#else
int fh = open(fr.name.c_str(), O_RDONLY | O_SYNC, 0600);
auto n = read(fh, &bb.buffer[0], fr.size); (void) n;
#endif
io::utils::close(fh);
remove_file(fr);
}
virtual void get(int i, void* x, detail::Load load) override
{
FileRecord fr = extract_file_record(i);
#if defined(_WIN32)
int fh = -1;
_sopen_s(&fh, fr.name.c_str(), _O_RDONLY | _O_BINARY, _SH_DENYNO, _S_IREAD);
detail::FileBuffer fb(_fdopen(fh, "rb"));
#else
int fh = open(fr.name.c_str(), O_RDONLY, 0600);
detail::FileBuffer fb(fdopen(fh, "r"));
#endif
load(x, fb);
fclose(fb.file);
remove_file(fr);
}
virtual void destroy(int i) override
{
FileRecord fr;
{
CriticalMapAccessor accessor = filenames_.access();
fr = (*accessor)[i];
accessor->erase(i);
}
io::utils::remove(fr.name);
(*current_size_.access()) -= fr.size;
}
int count() const { return (*count_.const_access()); }
size_t current_size() const { return (*current_size_.const_access()); }
size_t max_size() const { return (*max_size_.const_access()); }
~FileStorage()
{
for (FileRecordMap::const_iterator it = filenames_.const_access()->begin();
it != filenames_.const_access()->end();
++it)
{
io::utils::remove(it->second.name);
}
}
private:
int open_random(std::string& filename) const
{
if (filename_templates_.size() == 1)
filename = filename_templates_[0];
else
{
// pick a template at random (very basic load balancing mechanism)
filename = filename_templates_[static_cast<size_t>(std::rand()) % filename_templates_.size()];
}
int fh = diy::io::utils::mkstemp(filename);
return fh;
}
int make_file_record(const std::string& filename, size_t sz)
{
int res = (*count_.access())++;
FileRecord fr = { sz, filename };
(*filenames_.access())[res] = fr;
// keep track of sizes
critical_resource<size_t>::accessor cur = current_size_.access();
*cur += sz;
critical_resource<size_t>::accessor max = max_size_.access();
if (*cur > *max)
*max = *cur;
return res;
}
FileRecord extract_file_record(int i)
{
CriticalMapAccessor accessor = filenames_.access();
FileRecord fr = (*accessor)[i];
accessor->erase(i);
return fr;
}
void remove_file(const FileRecord& fr)
{
io::utils::remove(fr.name);
(*current_size_.access()) -= fr.size;
}
private:
typedef std::map<int, FileRecord> FileRecordMap;
typedef critical_resource<FileRecordMap> CriticalMap;
typedef CriticalMap::accessor CriticalMapAccessor;
private:
std::vector<std::string> filename_templates_;
CriticalMap filenames_;
critical_resource<int> count_;
critical_resource<size_t> current_size_, max_size_;
};
}
#endif

@ -1,31 +0,0 @@
#ifndef VTKMDIY_THREAD_H
#define VTKMDIY_THREAD_H
#ifdef VTKM_DIY_NO_THREADS
#include "no-thread.hpp"
#else
#include "thread/fast_mutex.h"
#include <thread>
#include <mutex>
namespace diy
{
using std::thread;
using std::mutex;
using std::recursive_mutex;
namespace this_thread = std::this_thread;
// TODO: replace with our own implementation using std::atomic_flag
using fast_mutex = tthread::fast_mutex;
template<class Mutex>
using lock_guard = std::unique_lock<Mutex>;
}
#endif
#include "critical-resource.hpp"
#endif

@ -1,248 +0,0 @@
/* -*- mode: c++; tab-width: 2; indent-tabs-mode: nil; -*-
Copyright (c) 2010-2012 Marcus Geelnard
This software is provided 'as-is', without any express or implied
warranty. In no event will the authors be held liable for any damages
arising from the use of this software.
Permission is granted to anyone to use this software for any purpose,
including commercial applications, and to alter it and redistribute it
freely, subject to the following restrictions:
1. The origin of this software must not be misrepresented; you must not
claim that you wrote the original software. If you use this software
in a product, an acknowledgment in the product documentation would be
appreciated but is not required.
2. Altered source versions must be plainly marked as such, and must not be
misrepresented as being the original software.
3. This notice may not be removed or altered from any source
distribution.
*/
#ifndef _FAST_MUTEX_H_
#define _FAST_MUTEX_H_
/// @file
// Which platform are we on?
#if !defined(_TTHREAD_PLATFORM_DEFINED_)
#if defined(_WIN32) || defined(__WIN32__) || defined(__WINDOWS__)
#define _TTHREAD_WIN32_
#else
#define _TTHREAD_POSIX_
#endif
#define _TTHREAD_PLATFORM_DEFINED_
#endif
// Check if we can support the assembly language level implementation (otherwise
// revert to the system API)
#if (defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))) || \
(defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))) || \
(defined(__GNUC__) && (defined(__ppc__)))
#define _FAST_MUTEX_ASM_
#else
#define _FAST_MUTEX_SYS_
#endif
#if defined(_TTHREAD_WIN32_)
#ifndef WIN32_LEAN_AND_MEAN
#define WIN32_LEAN_AND_MEAN
#define __UNDEF_LEAN_AND_MEAN
#endif
#include <windows.h>
#ifdef __UNDEF_LEAN_AND_MEAN
#undef WIN32_LEAN_AND_MEAN
#undef __UNDEF_LEAN_AND_MEAN
#endif
#else
#ifdef _FAST_MUTEX_ASM_
#include <sched.h>
#else
#include <pthread.h>
#endif
#endif
namespace tthread {
/// Fast mutex class.
/// This is a mutual exclusion object for synchronizing access to shared
/// memory areas for several threads. It is similar to the tthread::mutex class,
/// but instead of using system level functions, it is implemented as an atomic
/// spin lock with very low CPU overhead.
///
/// The \c fast_mutex class is NOT compatible with the \c condition_variable
/// class (however, it IS compatible with the \c lock_guard class). It should
/// also be noted that the \c fast_mutex class typically does not provide
/// as accurate thread scheduling as the standard \c mutex class does.
///
/// Because of the limitations of the class, it should only be used in
/// situations where the mutex needs to be locked/unlocked very frequently.
///
/// @note The "fast" version of this class relies on inline assembler language,
/// which is currently only supported for 32/64-bit Intel x86/AMD64 and
/// PowerPC architectures on a limited number of compilers (GNU g++ and MS
/// Visual C++).
/// For other architectures/compilers, system functions are used instead.
class fast_mutex {
public:
/// Constructor.
#if defined(_FAST_MUTEX_ASM_)
fast_mutex() : mLock(0) {}
#else
fast_mutex()
{
#if defined(_TTHREAD_WIN32_)
InitializeCriticalSection(&mHandle);
#elif defined(_TTHREAD_POSIX_)
pthread_mutex_init(&mHandle, NULL);
#endif
}
#endif
#if !defined(_FAST_MUTEX_ASM_)
/// Destructor.
~fast_mutex()
{
#if defined(_TTHREAD_WIN32_)
DeleteCriticalSection(&mHandle);
#elif defined(_TTHREAD_POSIX_)
pthread_mutex_destroy(&mHandle);
#endif
}
#endif
/// Lock the mutex.
/// The method will block the calling thread until a lock on the mutex can
/// be obtained. The mutex remains locked until \c unlock() is called.
/// @see lock_guard
inline void lock()
{
#if defined(_FAST_MUTEX_ASM_)
bool gotLock;
do {
gotLock = try_lock();
if(!gotLock)
{
#if defined(_TTHREAD_WIN32_)
Sleep(0);
#elif defined(_TTHREAD_POSIX_)
sched_yield();
#endif
}
} while(!gotLock);
#else
#if defined(_TTHREAD_WIN32_)
EnterCriticalSection(&mHandle);
#elif defined(_TTHREAD_POSIX_)
pthread_mutex_lock(&mHandle);
#endif
#endif
}
/// Try to lock the mutex.
/// The method will try to lock the mutex. If it fails, the function will
/// return immediately (non-blocking).
/// @return \c true if the lock was acquired, or \c false if the lock could
/// not be acquired.
inline bool try_lock()
{
#if defined(_FAST_MUTEX_ASM_)
int oldLock;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
asm volatile (
"movl $1,%%eax\n\t"
"xchg %%eax,%0\n\t"
"movl %%eax,%1\n\t"
: "=m" (mLock), "=m" (oldLock)
:
: "%eax", "memory"
);
#elif defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
int *ptrLock = &mLock;
__asm {
mov eax,1
mov ecx,ptrLock
xchg eax,[ecx]
mov oldLock,eax
}
#elif defined(__GNUC__) && (defined(__ppc__))
int newLock = 1;
asm volatile (
"\n1:\n\t"
"lwarx %0,0,%1\n\t"
"cmpwi 0,%0,0\n\t"
"bne- 2f\n\t"
"stwcx. %2,0,%1\n\t"
"bne- 1b\n\t"
"isync\n"
"2:\n\t"
: "=&r" (oldLock)
: "r" (&mLock), "r" (newLock)
: "cr0", "memory"
);
#endif
return (oldLock == 0);
#else
#if defined(_TTHREAD_WIN32_)
return TryEnterCriticalSection(&mHandle) ? true : false;
#elif defined(_TTHREAD_POSIX_)
return (pthread_mutex_trylock(&mHandle) == 0) ? true : false;
#endif
#endif
}
/// Unlock the mutex.
/// If any threads are waiting for the lock on this mutex, one of them will
/// be unblocked.
inline void unlock()
{
#if defined(_FAST_MUTEX_ASM_)
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
asm volatile (
"movl $0,%%eax\n\t"
"xchg %%eax,%0\n\t"
: "=m" (mLock)
:
: "%eax", "memory"
);
#elif defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
int *ptrLock = &mLock;
__asm {
mov eax,0
mov ecx,ptrLock
xchg eax,[ecx]
}
#elif defined(__GNUC__) && (defined(__ppc__))
asm volatile (
"sync\n\t" // Replace with lwsync where possible?
: : : "memory"
);
mLock = 0;
#endif
#else
#if defined(_TTHREAD_WIN32_)
LeaveCriticalSection(&mHandle);
#elif defined(_TTHREAD_POSIX_)
pthread_mutex_unlock(&mHandle);
#endif
#endif
}
private:
#if defined(_FAST_MUTEX_ASM_)
int mLock;
#else
#if defined(_TTHREAD_WIN32_)
CRITICAL_SECTION mHandle;
#elif defined(_TTHREAD_POSIX_)
pthread_mutex_t mHandle;
#endif
#endif
};
}
#endif // _FAST_MUTEX_H_

@ -1,43 +0,0 @@
#ifndef VTKMDIY_TIME_HPP
#define VTKMDIY_TIME_HPP
#ifndef _WIN32
#include <sys/time.h>
#ifdef __MACH__
#include <mach/clock.h>
#include <mach/mach.h>
#endif // __MACH__
#else
#include <windows.h> // GetSystemTimeAsFileTime
#endif // _WIN32
namespace diy
{
typedef unsigned long time_type;
inline time_type get_time()
{
#ifdef __MACH__ // OS X does not have clock_gettime, use clock_get_time
clock_serv_t cclock;
mach_timespec_t ts;
host_get_clock_service(mach_host_self(), CALENDAR_CLOCK, &cclock);
clock_get_time(cclock, &ts);
mach_port_deallocate(mach_task_self(), cclock);
return ts.tv_sec*1000 + static_cast<unsigned int>(ts.tv_nsec/1000000);
#elif defined(_WIN32)
// SOURCE: http://stackoverflow.com/questions/5404277/porting-clock-gettime-to-windows
__int64 wintime;
GetSystemTimeAsFileTime((FILETIME*)&wintime);
wintime -= 116444736000000000LL; // offset from 1 Jan 1601 to 1 Jan 1970
long tv_sec = static_cast<long>(wintime / 10000000LL); // seconds
long tv_nsec = static_cast<long>(wintime % 10000000LL * 100); // nanoseconds
return static_cast<time_type>(tv_sec*1000 + tv_nsec/1000000);
#else
timespec ts;
clock_gettime(CLOCK_REALTIME, &ts);
return ts.tv_sec*1000 + ts.tv_nsec/1000000;
#endif
}
}
#endif

@ -1,94 +0,0 @@
#ifndef VTKMDIY_TYPES_HPP
#define VTKMDIY_TYPES_HPP
#include <iostream>
#include "constants.h"
#include "point.hpp"
namespace diy
{
struct BlockID
{
int gid, proc;
BlockID() = default;
BlockID(int _gid, int _proc) : gid(_gid), proc(_proc) {}
};
template<class Coordinate_>
struct Bounds
{
using Coordinate = Coordinate_;
using Point = diy::Point<Coordinate, DIY_MAX_DIM>;
Point min, max;
Bounds() = default;
Bounds(const Point& _min, const Point& _max) : min(_min), max(_max) {}
};
using DiscreteBounds = Bounds<int>;
using ContinuousBounds = Bounds<float>;
//! Helper to create a 1-dimensional discrete domain with the specified extents
inline
diy::DiscreteBounds
interval(int from, int to) { DiscreteBounds domain; domain.min[0] = from; domain.max[0] = to; return domain; }
struct Direction: public Point<int,DIY_MAX_DIM>
{
Direction() { for (size_t i = 0; i < DIY_MAX_DIM; ++i) (*this)[i] = 0; }
Direction(std::initializer_list<int> lst):
Direction() { size_t i = 0; for(int x : lst) (*this)[i++] = x; }
Direction(int dir)
{
for (size_t i = 0; i < DIY_MAX_DIM; ++i) (*this)[i] = 0;
if (dir & DIY_X0) (*this)[0] -= 1;
if (dir & DIY_X1) (*this)[0] += 1;
if (dir & DIY_Y0) (*this)[1] -= 1;
if (dir & DIY_Y1) (*this)[1] += 1;
if (dir & DIY_Z0) (*this)[2] -= 1;
if (dir & DIY_Z1) (*this)[2] += 1;
if (dir & DIY_T0) (*this)[3] -= 1;
if (dir & DIY_T1) (*this)[3] += 1;
}
bool
operator==(const diy::Direction& y) const
{
for (size_t i = 0; i < DIY_MAX_DIM; ++i)
if ((*this)[i] != y[i]) return false;
return true;
}
// lexicographic comparison
bool
operator<(const diy::Direction& y) const
{
for (size_t i = 0; i < DIY_MAX_DIM; ++i)
{
if ((*this)[i] < y[i]) return true;
if ((*this)[i] > y[i]) return false;
}
return false;
}
};
// Selector of bounds value type
template<class Bounds_>
struct BoundsValue
{
using type = typename Bounds_::Coordinate;
};
inline
bool
operator<(const diy::BlockID& x, const diy::BlockID& y)
{ return x.gid < y.gid; }
inline
bool
operator==(const diy::BlockID& x, const diy::BlockID& y)
{ return x.gid == y.gid; }
}
#endif

@ -1,8 +0,0 @@
#ifndef VTKMDIY_VERSION_HPP
#define VTKMDIY_VERSION_HPP
#define VTKMDIY_VERSION_MAJOR 3
#define VTKMDIY_VERSION_MINOR 5
#define VTKMDIY_VERSION_PATCH dev1
#endif

@ -1,54 +0,0 @@
#ifndef VTKMDIY_VERTICES_HPP
#define VTKMDIY_VERTICES_HPP
#include <iterator>
namespace diy
{
namespace detail
{
template<class Vertex, size_t I>
struct IsLast
{
static constexpr bool value = (Vertex::dimension() - 1 == I);
};
template<class Vertex, class Callback, size_t I, bool P>
struct ForEach
{
void operator()(Vertex& pos, const Vertex& from, const Vertex& to, const Callback& callback) const
{
for (pos[I] = from[I]; pos[I] <= to[I]; ++pos[I])
ForEach<Vertex, Callback, I+1, IsLast<Vertex,I+1>::value>()(pos, from, to, callback);
}
};
template<class Vertex, class Callback, size_t I>
struct ForEach<Vertex,Callback,I,true>
{
void operator()(Vertex& pos, const Vertex& from, const Vertex& to, const Callback& callback) const
{
for (pos[I] = from[I]; pos[I] <= to[I]; ++pos[I])
callback(pos);
}
};
}
template<class Vertex, class Callback>
void for_each(const Vertex& from, const Vertex& to, const Callback& callback)
{
Vertex pos;
diy::detail::ForEach<Vertex, Callback, 0, detail::IsLast<Vertex,0>::value>()(pos, from, to, callback);
}
template<class Vertex, class Callback>
void for_each(const Vertex& shape, const Callback& callback)
{
// qualify the call with the diy namespace to disambiguate it from std::for_each(...)
diy::for_each(Vertex::zero(), shape - Vertex::one(), callback);
}
}
#endif