Utilities
constant
-
template<typename T = double>
struct constant
constant<T> represents a constant value of type T.
The object has the property that for any binary operation involving a constant<T> and a value of type U, the constant is automatically cast to also be of type U.
For example:
float a = 5;
constant<double> b = 3;
auto c = a + b; // The result will be of type `float`
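The casting rule can be illustrated without the library. The sketch below is a minimal stand-in, not the library's implementation: `toy_constant`, its two `operator+` overloads, and `toy_constant_demo` are hypothetical names introduced here, and only addition is modelled. It assumes C++14 or later.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for constant<T>; it only stores a value.
template<typename T = double>
struct toy_constant {
    T value;
    constexpr toy_constant(T v) : value(v) {}
};

// value + constant: the constant is cast to the type of the other operand.
template<typename U, typename T,
         typename = std::enable_if_t<std::is_arithmetic<U>::value>>
constexpr U operator+(U lhs, toy_constant<T> rhs) {
    return lhs + static_cast<U>(rhs.value);
}

// constant + value: same rule with the operands swapped.
template<typename T, typename U,
         typename = std::enable_if_t<std::is_arithmetic<U>::value>>
constexpr U operator+(toy_constant<T> lhs, U rhs) {
    return static_cast<U>(lhs.value) + rhs;
}

// Mirrors the example above: float + constant<double> stays float.
inline float toy_constant_demo() {
    float a = 5;
    toy_constant<double> b = 3;
    auto c = a + b;
    static_assert(std::is_same<decltype(c), float>::value,
                  "the result follows the non-constant operand");
    assert(c == 8.0f);
    return c;
}
```

The design point is that the result type is decided by the non-constant operand alone, so a literal-like constant never silently promotes a `float` computation to `double`.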
tiling
-
template<typename TileDim, typename BlockDim, typename Distributions = distributions<>, typename IndexType = int>
struct tiling
Represents a tiling where the elements given by TileDim are distributed over the threads given by BlockDim according to the distributions given by Distributions.
The template parameters should be the following:
TileDim: Should be an instance of tile_size<...>. For example, tile_size<16, 16> represents a 2-dimensional 16x16 tile.
BlockDim: Should be an instance of block_dim<...>. For example, block_dim<16, 4> represents a thread block having X-dimension 16 and Y-dimension 4 for a total of 64 threads per block.
Distributions: Should be an instance of distributions<...>. For example, distributions<dist::cyclic, dist::blocked> will distribute elements in cyclic fashion along the X-axis and blocked fashion along the Y-axis.
IndexType: The type used for index values (int by default).
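To make the two distribution names concrete, here is a small stand-alone sketch of how a cyclic and a blocked layout could map a thread's local item number to a coordinate along one axis. It assumes the conventional meaning of the terms (cyclic: neighbouring elements go to neighbouring threads; blocked: each thread owns one contiguous chunk); the `toy_*` functions are hypothetical and are not taken from the library.

```cpp
#include <cstddef>

// Items stored per thread along one axis: ceil(tile / block).
constexpr std::size_t toy_items_per_thread(std::size_t tile, std::size_t block) {
    return (tile + block - 1) / block;
}

// Cyclic: neighbouring elements go to neighbouring threads.
constexpr std::size_t toy_cyclic_coord(std::size_t thread, std::size_t item,
                                       std::size_t block) {
    return thread + item * block;
}

// Blocked: each thread owns one contiguous chunk of elements.
constexpr std::size_t toy_blocked_coord(std::size_t thread, std::size_t item,
                                        std::size_t tile, std::size_t block) {
    return thread * toy_items_per_thread(tile, block) + item;
}

// 16 elements over 4 threads: thread 1 owns {1, 5, 9, 13} under the cyclic
// layout and {4, 5, 6, 7} under the blocked layout.
static_assert(toy_cyclic_coord(1, 2, 4) == 9, "third cyclic item of thread 1");
static_assert(toy_blocked_coord(1, 2, 16, 4) == 6, "third blocked item of thread 1");
```

On a GPU, a cyclic layout along the X-axis typically means that adjacent threads touch adjacent elements, which is the usual reason for choosing it on the fastest-varying axis.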
Public Functions
-
inline bool is_present(size_t item) const
Checks if a specific item is present for the current thread based on the distribution strategy. The number of items stored per thread is not always equal to the number of items owned by each thread (for example, if the tile size is not divisible by the block size). In this case, is_present will return false for certain items.
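A worked one-dimensional case shows where absent items come from. The sketch assumes a cyclic layout, where local item i of thread t maps to element t + i * block; `toy_slots_per_thread` and `toy_is_present` are hypothetical helpers, not library functions.

```cpp
#include <cstddef>

// Slots reserved per thread: ceil(tile / block).
constexpr std::size_t toy_slots_per_thread(std::size_t tile, std::size_t block) {
    return (tile + block - 1) / block;
}

// A slot holds a real element only if its coordinate falls inside the tile.
constexpr bool toy_is_present(std::size_t thread, std::size_t item,
                              std::size_t tile, std::size_t block) {
    return thread + item * block < tile;
}

// 10 elements over 4 threads: 3 slots per thread, 12 slots in total,
// so 2 slots have no element behind them.
static_assert(toy_slots_per_thread(10, 4) == 3, "3 slots per thread");
static_assert(toy_is_present(1, 2, 10, 4), "thread 1, slot 2 maps to element 9");
static_assert(!toy_is_present(2, 2, 10, 4), "thread 2, slot 2 maps to element 10");
static_assert(!toy_is_present(3, 2, 10, 4), "thread 3, slot 2 maps to element 11");
```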
-
inline vector<index_type, extent<rank>> at(size_t item) const
Returns the global coordinates of a specific item for the current thread.
-
inline index_type at(size_t item, size_t axis) const
Returns the global coordinates of a specific item along a specified axis for the current thread.
-
inline vector<index_type, extent<rank>> operator[](size_t item) const
Returns the global coordinates of a specific item for the current thread (alias of at).
-
inline vector<vector<index_type, extent<rank>>, extent<num_locals>> local_points() const
Returns a vector of global coordinates of all items present for the current thread.
-
inline vector<index_type, extent<num_locals>> local_points(size_t axis) const
Returns a vector of coordinate values along a specified axis for all items present for the current thread.
-
inline vector<bool, extent<num_locals>> local_mask() const
Returns a vector of boolean values holding the result of is_present for each item of the current thread.
-
inline index_type thread_index(size_t axis) const
Returns the thread index (position) along a specified axis for the current thread.
-
inline index_type tile_offset(size_t axis) const
Returns the offset of the tile along a specified axis.
-
inline vector<index_type, extent<rank>> thread_index() const
Returns a vector of thread indices for all axes.
-
inline vector<index_type, extent<rank>> tile_offset() const
Returns the offset of the tile for all axes.
Public Static Functions
-
static inline constexpr size_t size()
Returns the number of items per thread in the tiling.
Note that this method is constexpr and can thus be called at compile-time.
-
static inline constexpr bool all_present()
Checks if the tiling is exhaustive, meaning all items are always present for all threads. If this returns true, then is_present will always return true for any given index.
Note that this method is constexpr and can thus be called at compile-time.
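Because the check is available at compile time, code can decide up front whether per-item presence checks are needed at all. The sketch below models this in one dimension, assuming that a tiling is exhaustive exactly when the block size divides the tile size; `toy_tiling_1d` and `toy_checks_needed` are hypothetical and only mimic the shape of the static functions described here.

```cpp
#include <cstddef>

// Hypothetical 1-D model of the static interface.
template<std::size_t Tile, std::size_t Block>
struct toy_tiling_1d {
    // Slots per thread: ceil(Tile / Block).
    static constexpr std::size_t size() { return (Tile + Block - 1) / Block; }
    // Every slot is used only if Block divides Tile.
    static constexpr bool all_present() { return Tile % Block == 0; }
};

// Number of per-item presence checks a loop over the tiling needs.
template<typename Tiling>
constexpr std::size_t toy_checks_needed() {
    return Tiling::all_present() ? 0 : Tiling::size();
}

static_assert(toy_checks_needed<toy_tiling_1d<16, 4>>() == 0,
              "16 is divisible by 4: no checks needed");
static_assert(toy_checks_needed<toy_tiling_1d<10, 4>>() == 3,
              "10 is not divisible by 4: check all 3 slots");
```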
-
static inline constexpr index_type block_size(size_t axis)
Returns the size of the block (number of threads) along a specified axis.
Note that this method is constexpr and can thus be called at compile-time.
-
static inline constexpr index_type tile_size(size_t axis)
Returns the size of the tile along a specified axis.
Note that this method is constexpr and can thus be called at compile-time.
Friends
-
inline friend tiling operator+(const tiling &self, const vector<index_type, extent<rank>> &offset)
Adds offset to all points of this tiling and returns a new tiling.
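The effect on a single point is a plain axis-by-axis addition. The sketch below shows this with `std::array` standing in for the library's vector type; `toy_shift` and `toy_shift_demo` are hypothetical names, and the offset values are made up for illustration (in practice the offset might be the origin of the tile handled by the current thread block, which is an assumption, not something the text above states).

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper: shift one point by an offset, axis by axis.
template<std::size_t Rank>
std::array<int, Rank> toy_shift(std::array<int, Rank> point,
                                const std::array<int, Rank>& offset) {
    for (std::size_t axis = 0; axis < Rank; axis++) {
        point[axis] += offset[axis];
    }
    return point;
}

// A point at (3, 7) inside a tile whose origin is (16, 32).
inline std::array<int, 2> toy_shift_demo() {
    std::array<int, 2> point = {3, 7};
    std::array<int, 2> offset = {16, 32};
    std::array<int, 2> shifted = toy_shift(point, offset);
    assert(shifted[0] == 19 && shifted[1] == 39);
    return shifted;
}
```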
KERNEL_FLOAT_TILING_FOR
-
KERNEL_FLOAT_TILING_FOR(...)
Iterate over the points in a tiling<...> using a for loop.
There are two ways to use this macro. Using the one-variable form:
auto t = tiling<tile_size<16, 16>, block_size<4, 4>>();
KERNEL_FLOAT_TILING_FOR(t, auto point) {
    printf("%d,%d\n", point[0], point[1]);
}
Or using the two-variable form:
auto t = tiling<tile_size<16, 16>, block_size<4, 4>>();
KERNEL_FLOAT_TILING_FOR(t, auto index, auto point) {
    printf("%d] %d,%d\n", index, point[0], point[1]);
}
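For comparison, here is what a hand-written loop over a tiling could look like. This is a sketch under two assumptions that the text above does not confirm: that the macro visits item indices from 0 to size() - 1, and that it skips items for which is_present returns false. `toy_tiling` is a hypothetical one-dimensional, single-thread model with a cyclic layout, not the library type.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 1-D tiling as seen by one thread (cyclic layout).
struct toy_tiling {
    std::size_t thread, tile, block;
    std::size_t size() const { return (tile + block - 1) / block; }
    std::size_t at(std::size_t item) const { return thread + item * block; }
    bool is_present(std::size_t item) const { return at(item) < tile; }
};

// Visit every item index and collect the points of the items that are present.
inline std::vector<std::size_t> toy_visit(const toy_tiling& t) {
    std::vector<std::size_t> points;
    for (std::size_t index = 0; index < t.size(); index++) {
        if (!t.is_present(index)) {
            continue;  // slot has no element behind it
        }
        points.push_back(t.at(index));
    }
    return points;
}

// Thread 2 of 4 in a tile of 10 elements visits elements 2 and 6 only.
inline void toy_visit_demo() {
    toy_tiling t{2, 10, 4};
    std::vector<std::size_t> points = toy_visit(t);
    assert(points.size() == 2);
    assert(points[0] == 2 && points[1] == 6);
}
```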