Utilities

constant

template<typename T = double>
struct constant

constant<T> represents a constant value of type T.

In any binary operation between a constant<T> and a value of type U, the constant is automatically cast to U, so the result also has type U.

For example:

float a = 5;
constant<double> b = 3;

auto c = a + b; // The result will be of type `float`
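The casting rule can be illustrated with a minimal, self-contained sketch. The type `constant_sketch` and its two operators are hypothetical stand-ins written for this illustration; they are not the library's implementation and cover only `operator+` with arithmetic operands.

```cpp
#include <type_traits>

// Hypothetical stand-in for illustration only; not the library's `constant`.
template<typename T = double>
struct constant_sketch {
    constexpr constant_sketch(T value = {}) : value_(value) {}
    constexpr T get() const { return value_; }

  private:
    T value_;
};

// `U + constant`: cast the constant to U, so the result keeps type U.
template<typename U, typename T,
         std::enable_if_t<std::is_arithmetic<U>::value, int> = 0>
constexpr U operator+(U lhs, constant_sketch<T> rhs) {
    return lhs + static_cast<U>(rhs.get());
}

// `constant + U`: same rule with the operands swapped.
template<typename T, typename U,
         std::enable_if_t<std::is_arithmetic<U>::value, int> = 0>
constexpr U operator+(constant_sketch<T> lhs, U rhs) {
    return static_cast<U>(lhs.get()) + rhs;
}
```

With this sketch, adding a `float` and a `constant_sketch<double>` yields a `float`, whereas adding a `float` and a plain `double` would be promoted to `double`.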

Public Functions

inline constexpr constant(T value = {})

Create a new constant from the given value.

template<typename R>
inline explicit constexpr constant(const constant<R> &that)

Create a new constant from another constant of type R.

inline constexpr T get() const

Return the value of the constant.

tiling

template<typename TileDim, typename BlockDim, typename Distributions = distributions<>, typename IndexType = int>
struct tiling

Represents a tiling where the elements given by TileDim are distributed over the threads given by BlockDim according to the distributions given by Distributions.

The template parameters should be the following:

  • TileDim: Should be an instance of tile_size<...>. For example, tile_size<16, 16> represents a 2-dimensional 16x16 tile.

  • BlockDim: Should be an instance of block_dim<...>. For example, block_dim<16, 4> represents a thread block with an X-dimension of 16 and a Y-dimension of 4, for a total of 64 threads per block.

  • Distributions: Should be an instance of distributions<...>. For example, distributions<dist::cyclic, dist::blocked> will distribute elements in cyclic fashion along the X-axis and blocked fashion along the Y-axis.

  • IndexType: The type used for index values (int by default).
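The difference between the two distribution kinds can be sketched for a single axis. The sketch assumes the conventional meaning of the terms: cyclic hands neighbouring coordinates to neighbouring threads, and blocked gives each thread one contiguous run. The source does not spell out the exact mapping, so the library may differ in detail, and the helper names are hypothetical.

```cpp
#include <cstddef>

// Hypothetical one-axis helpers. `tile` is the tile extent, `block` the number
// of threads, `thread` the thread index, `item` the per-thread item number.

// Number of items each thread stores along the axis: ceil(tile / block).
constexpr std::size_t items_per_thread(std::size_t tile, std::size_t block) {
    return (tile + block - 1) / block;
}

// Cyclic (assumed): consecutive coordinates go to consecutive threads.
constexpr std::size_t cyclic_coord(std::size_t thread, std::size_t item,
                                   std::size_t block) {
    return thread + item * block;
}

// Blocked (assumed): each thread owns one contiguous run of coordinates.
constexpr std::size_t blocked_coord(std::size_t thread, std::size_t item,
                                    std::size_t tile, std::size_t block) {
    return thread * items_per_thread(tile, block) + item;
}
```

Under these assumptions, with a tile extent of 16 and 4 threads, thread 1 would hold coordinates 1, 5, 9, 13 in the cyclic case and 4, 5, 6, 7 in the blocked case.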

Public Functions

inline bool is_present(size_t item) const

Checks whether a specific item is present for the current thread, based on the distribution strategy. The number of items stored per thread is not always equal to the number of items owned by each thread (for example, when the tile size is not divisible by the block size). In that case, is_present returns false for the items that a thread stores but does not own.
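The non-divisible case can be made concrete with a small one-axis model. It assumes a cyclic distribution and that each thread stores ceil(tile / block) slots; both are assumptions made for illustration, and the function names are hypothetical.

```cpp
#include <cstddef>

// Hypothetical one-axis model; the cyclic mapping is an assumption,
// not the library's confirmed layout.

// Slots stored per thread: ceil(tile / block).
constexpr std::size_t slots_per_thread(std::size_t tile, std::size_t block) {
    return (tile + block - 1) / block;
}

// A slot is present only if its coordinate falls inside the tile.
constexpr bool slot_is_present(std::size_t thread, std::size_t slot,
                               std::size_t tile, std::size_t block) {
    return thread + slot * block < tile;
}
```

For a tile extent of 10 and 4 threads, every thread stores 3 slots (12 in total), but only 10 coordinates exist. In this model the last slot of threads 2 and 3 would map to coordinates 10 and 11, which lie outside the tile and are therefore reported as not present.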

inline vector<index_type, extent<rank>> at(size_t item) const

Returns the global coordinates of a specific item for the current thread.

inline index_type at(size_t item, size_t axis) const

Returns the global coordinate of a specific item along a specified axis for the current thread.

inline vector<index_type, extent<rank>> operator[](size_t item) const

Returns the global coordinates of a specific item for the current thread (alias of at).

inline vector<vector<index_type, extent<rank>>, extent<num_locals>> local_points() const

Returns a vector of global coordinates of all items present for the current thread.

inline vector<index_type, extent<num_locals>> local_points(size_t axis) const

Returns a vector of coordinate values along a specified axis for all items present for the current thread.

inline vector<bool, extent<num_locals>> local_mask() const

Returns a vector of boolean values representing the result of is_present of the items for the current thread.

inline index_type thread_index(size_t axis) const

Returns the thread index (position) along a specified axis for the current thread.

inline index_type tile_offset(size_t axis) const

Returns the offset of the tile along a specified axis.

inline vector<index_type, extent<rank>> thread_index() const

Returns a vector of thread indices for all axes.

inline vector<index_type, extent<rank>> tile_offset() const

Returns the offset of the tile for all axes.

inline tiling_iterator<tiling> begin() const

Returns an iterator pointing to the beginning of the tiling.

inline tiling_iterator<tiling> end() const

Returns an iterator pointing to the end of the tiling.

template<typename F>
inline void for_each(F fun) const

Applies a provided function to each item present in the tiling for the current thread. The function should take an index and a vector of global coordinates as arguments.
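The callback contract (an item index plus its coordinates, invoked only for present items) can be sketched with a one-dimensional stand-in. The type `tiling_1d_sketch` is hypothetical, assumes a cyclic distribution, and passes a single coordinate instead of a vector; it is not the library's tiling.

```cpp
#include <cstddef>

// Hypothetical one-dimensional stand-in for a tiling (cyclic layout assumed).
template<std::size_t Tile, std::size_t Block>
struct tiling_1d_sketch {
    std::size_t thread;  // index of the current thread, 0 <= thread < Block

    // Items stored per thread: ceil(Tile / Block).
    static constexpr std::size_t size() {
        return (Tile + Block - 1) / Block;
    }

    // Global coordinate of the given item for this thread.
    std::size_t at(std::size_t item) const {
        return thread + item * Block;
    }

    // An item is present if its coordinate lies inside the tile.
    bool is_present(std::size_t item) const {
        return at(item) < Tile;
    }

    // Calls fun(item, coordinate) for every item that is present.
    template<typename F>
    void for_each(F fun) const {
        for (std::size_t item = 0; item < size(); item++) {
            if (is_present(item)) {
                fun(item, at(item));
            }
        }
    }
};
```

In this sketch, thread 3 of a `tiling_1d_sketch<10, 4>` stores three items but the callback runs only twice, for coordinates 3 and 7, because the third item falls outside the tile.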

Public Static Functions

static inline constexpr size_t size()

Returns the number of items per thread in the tiling.

Note that this method is constexpr and can thus be called at compile-time.

static inline constexpr bool all_present()

Checks if the tiling is exhaustive, meaning all items are always present for all threads. If this returns true, then is_present will always return true for any given index.

Note that this method is constexpr and can thus be called at compile-time.
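Because the check is constexpr, it can drive compile-time decisions such as a static_assert. The sketch below uses a hypothetical one-axis type and assumes that a tiling is exhaustive exactly when the tile extent is divisible by the number of threads; the library's actual condition may be more general.

```cpp
#include <cstddef>

// Hypothetical one-axis stand-in; divisibility as the exhaustiveness
// criterion is an assumption made for this illustration.
template<std::size_t Tile, std::size_t Block>
struct exhaustive_sketch {
    static constexpr bool all_present() {
        return Tile % Block == 0;
    }
};

// Both checks are evaluated by the compiler, not at run time.
static_assert(exhaustive_sketch<16, 4>::all_present(),
              "16 items divide evenly over 4 threads");
static_assert(!exhaustive_sketch<10, 4>::all_present(),
              "10 items leave some slots of 4 threads empty");
```

A kernel could use such a check to skip per-item presence tests entirely when the tiling is known to be exhaustive.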

static inline constexpr index_type block_size(size_t axis)

Returns the size of the block (number of threads) along a specified axis.

Note that this method is constexpr and can thus be called at compile-time.

static inline constexpr index_type tile_size(size_t axis)

Returns the size of the tile along a specified axis.

Note that this method is constexpr and can thus be called at compile-time.

static inline vector<index_type, extent<rank>> block_size()

Returns a vector of block sizes for all axes.

static inline vector<index_type, extent<rank>> tile_size()

Returns a vector of tile sizes for all axes.

Friends

inline friend tiling operator+(const tiling &self, const vector<index_type, extent<rank>> &offset)

Adds offset to all points of this tiling and returns a new tiling.

inline friend tiling operator+(const vector<index_type, extent<rank>> &offset, const tiling &self)

Adds offset to all points of this tiling and returns a new tiling.

inline friend tiling &operator+=(tiling &self, const vector<index_type, extent<rank>> &offset)

Adds offset to all points of this tiling.

KERNEL_FLOAT_TILING_FOR

KERNEL_FLOAT_TILING_FOR(...)

Iterate over the points in a tiling<...> using a for loop.

There are two ways to use this macro. The one-variable form yields only the point:

auto t = tiling<tile_size<16, 16>, block_size<4, 4>>();

KERNEL_FLOAT_TILING_FOR(t, auto point) {
 printf("%d,%d\n", point[0], point[1]);
}

The two-variable form yields both the item index and the point:

auto t = tiling<tile_size<16, 16>, block_size<4, 4>>();

KERNEL_FLOAT_TILING_FOR(t, auto index, auto point) {
 printf("%d] %d,%d\n", index, point[0], point[1]);
}