Memory read/write

read

template<typename T, typename I, typename M = bool, typename E = broadcast_vector_extent_type<I, M>>
inline vector<T, E> kernel_float::read(const T *ptr, const I &indices, const M &mask = true)

Load the elements from the buffer ptr at the locations specified by indices.

The mask should be a vector of booleans where true indicates that the value should be loaded and false indicates that the value should be skipped. This can be used to prevent reading out of bounds.

// Load 2 elements at data[0] and data[8], skip data[2] and data[4]
vec<T, 4> values = read(data, make_vec(0, 2, 4, 8), make_vec(true, false, false, true));
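
The lane-by-lane behaviour of the masked load can be modelled in plain C++. This is only a sketch, not the library's implementation: masked_read_model is a hypothetical name, std::array stands in for vec, and filling skipped lanes with T{} is an assumption, since the description above does not say what skipped lanes contain.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for a masked gather; NOT kernel_float's implementation.
template<typename T, std::size_t N>
std::array<T, N> masked_read_model(
    const T* ptr,
    const std::array<std::size_t, N>& indices,
    const std::array<bool, N>& mask) {
    // Assumption of this model: skipped lanes are value-initialized (T{}).
    std::array<T, N> result{};
    for (std::size_t i = 0; i < N; ++i) {
        if (mask[i]) {
            // Memory is only touched for lanes whose mask entry is true,
            // so an out-of-bounds index is harmless when its lane is masked off.
            result[i] = ptr[indices[i]];
        }
    }
    return result;
}
```

Because a masked-off lane never dereferences its index, the mask can guard the tail of a buffer whose length is not a multiple of the vector size.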

write

template<typename T, typename V, typename I, typename M = bool, typename E = broadcast_vector_extent_type<V, I, M>>
inline void kernel_float::write(T *ptr, const I &indices, const V &values, const M &mask = true)

Store the elements from the vector values in the buffer ptr at the locations specified by indices.

The mask should be a vector of booleans where true indicates that the value should be stored and false indicates that the value should be skipped. This can be used to prevent writing out of bounds.

// Store 2 elements at data[0] and data[8], skip data[2] and data[4]
auto values = make_vec(42, 13, 87, 12);
auto mask = make_vec(true, false, false, true);
write(data, make_vec(0, 2, 4, 8), values, mask);
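
A plain C++ model of the masked store may help clarify which buffer locations are modified. It is a sketch under stated assumptions: masked_write_model is a hypothetical name, std::array stands in for vec, and the loop is not the library's actual code.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for a masked scatter; NOT kernel_float's implementation.
template<typename T, std::size_t N>
void masked_write_model(
    T* ptr,
    const std::array<std::size_t, N>& indices,
    const std::array<T, N>& values,
    const std::array<bool, N>& mask) {
    for (std::size_t i = 0; i < N; ++i) {
        if (mask[i]) {
            // Only enabled lanes write; masked-off locations keep their old value.
            ptr[indices[i]] = values[i];
        }
    }
}
```

With the indices, values, and mask from the example above, only data[0] and data[8] change in this model; data[2] and data[4] are left untouched.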

read

template<size_t N, typename T>
inline vector<T, extent<N>> kernel_float::read(const T *ptr)

Load N elements at the locations ptr[0], ptr[1], ptr[2], ....

// Load 4 elements at locations data[0], data[1], data[2], data[3]
vec<T, 4> values = read<4>(data);

// Load 4 elements at locations data[10], data[11], data[12], data[13]
vec<T, 4> values2 = read<4>(data + 10);

write

template<typename V, typename T>
inline void kernel_float::write(T *ptr, const V &values)

Store the N elements of values at the locations ptr[0], ptr[1], ptr[2], ..., where N is the number of elements in values.

// Store 4 elements at locations data[0], data[1], data[2], data[3]
vec<float, 4> values = {1.0f, 2.0f, 3.0f, 4.0f};
write(data, values);

// Store 4 elements at locations data[10], data[11], data[12], data[13]
write(data + 10, values);

read_aligned

template<size_t Align, size_t N = Align, typename T>
inline vector<T, extent<N>> kernel_float::read_aligned(const T *ptr)

Load N elements at the locations ptr[0], ptr[1], ptr[2], ....

It is assumed that ptr is aligned such that all N elements can be loaded at once using a vector operation. If the pointer is not aligned, the behavior is undefined.

// Load 4 elements at locations data[0], data[1], data[2], data[3]
vec<T, 4> values = read_aligned<4>(data);

// Load 4 elements at locations data[12], data[13], data[14], data[15]
vec<T, 4> values2 = read_aligned<4>(data + 12);
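
Since a misaligned pointer leads to undefined behavior, it can be useful to check alignment in debug builds before calling the aligned variants. The helper below is a hypothetical sketch, not part of the library. It assumes that Align is counted in elements, so the required byte alignment is Align * sizeof(T).

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical debug helper (not a kernel_float function).
// Assumption: Align is expressed in elements of T, so the pointer must be
// a multiple of Align * sizeof(T) bytes.
template<std::size_t Align, typename T>
bool is_aligned_for(const T* ptr) {
    const std::uintptr_t address = reinterpret_cast<std::uintptr_t>(ptr);
    return address % (Align * sizeof(T)) == 0;
}
```

Under that assumption, and for a float buffer that itself starts on a 16-byte boundary, data and data + 12 pass the check for Align = 4, while data + 10 does not.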

write_aligned

template<size_t Align, typename V, typename T>
inline void kernel_float::write_aligned(T *ptr, const V &values)

Store the N elements of values at the locations ptr[0], ptr[1], ptr[2], ..., where N is the number of elements in values.

It is assumed that ptr is aligned such that all N elements can be stored at once using a vector operation. If the pointer is not aligned, the behavior is undefined.

// Store 4 elements at locations data[0], data[1], data[2], data[3]
// (Align cannot be deduced from the arguments, so it is passed explicitly)
vec<float, 4> values = {1.0f, 2.0f, 3.0f, 4.0f};
write_aligned<4>(data, values);

// Store 4 elements at locations data[12], data[13], data[14], data[15]
write_aligned<4>(data + 12, values);

make_vec_ptr

template<typename T, size_t N = 1, typename U>
inline vector_ptr<T, N, access_policy<U, N * sizeof(U)>> kernel_float::make_vec_ptr(U *ptr)

Creates a vector_ptr from a raw pointer U*.

This name resolves to one of four overloads depending on how it is called:

  1. No template arguments: make_vec_ptr(ptr). Returns vec_ptr<U, 1, U, KERNEL_FLOAT_MAX_ALIGNMENT>. The pointer is assumed to be aligned to KERNEL_FLOAT_MAX_ALIGNMENT.

  2. Integer template argument: make_vec_ptr<N>(ptr). Returns vec_ptr<U, N, U, N>. The pointer is assumed to be aligned for N consecutive elements.

  3. Type template argument: make_vec_ptr<T>(ptr). Returns vector_ptr<T, 1, U, 1>. Elements are stored as U but viewed as T, with element-aligned access.

  4. Type and size template arguments: make_vec_ptr<T, N>(ptr). Returns vector_ptr<T, N, U, N>. Elements are stored as U but viewed as T, assuming alignment for N elements.

Template Parameters:
  • T – The type of the elements as viewed by the user.

  • N – The vector size in number of elements.

  • U – The type of the elements pointed to by the raw pointer.

vector_ptr

template<typename T, size_t N, typename Policy = access_policy<T, sizeof(T) * N>, access_mode = Policy::mode>
struct vector_ptr : private kernel_float::access_policy<T, sizeof(T) * N>

A wrapper for a pointer that enables vectorized access and supports type conversions.

The vector_ptr<T, N, U> type is designed to function as if it were a vec<T, N>* pointer, allowing vec<T, N> elements to be read and written. However, the underlying storage is a pointer of type U*, and values are converted automatically between T and U when items are read or written.

For example, a vector_ptr<double, N, half> is useful where the data is stored in low precision (here 16-bit) but should be accessed as if it were in a higher-precision format (here 64-bit).
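
The idea of storing one type and viewing another can be sketched in standard C++. The struct below is a hypothetical model, not the library's vector_ptr. It uses float storage viewed as double because half is not a standard C++ type, and it assumes that index addresses groups of N elements.

```cpp
#include <array>
#include <cstddef>

// Hypothetical model of a converting pointer: storage is U*, the user sees T.
// NOT kernel_float's vector_ptr; no alignment or policy handling is modelled.
template<typename T, std::size_t N, typename U>
struct converting_ptr_model {
    U* data;

    // Read the index-th group of N stored elements, converting U -> T.
    // Assumption: index counts groups of N elements.
    std::array<T, N> read(std::size_t index = 0) const {
        std::array<T, N> out{};
        for (std::size_t i = 0; i < N; ++i) {
            out[i] = static_cast<T>(data[index * N + i]);
        }
        return out;
    }

    // Write N values to the index-th group, converting T -> U.
    void write(std::size_t index, const std::array<T, N>& values) const {
        for (std::size_t i = 0; i < N; ++i) {
            data[index * N + i] = static_cast<U>(values[i]);
        }
    }
};
```

The caller works entirely in T, while the buffer keeps the smaller footprint of U. The conversion on write may lose precision, which is the trade-off of the low-precision storage.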

The access policy is stored as a (privately inherited) subobject. For the stateless default policy this base is empty, so sizeof(vector_ptr) equals sizeof(pointer) (EBCO). The stored policy is forwarded to the vector_ref returned by operator*.
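
The size claim relies on the empty base class optimization, which can be demonstrated with a small stand-alone model. All names below are hypothetical and the policies are placeholders. The size checks hold on mainstream ABIs but are assumptions about the compiler rather than guarantees made by this documentation.

```cpp
#include <cstddef>

// Placeholder policies for illustration only (not kernel_float types).
struct stateless_policy_model {};           // no data members
struct stateful_policy_model { int tag; };  // carries state

// Hypothetical pointer wrapper that stores its policy as a private base.
template<typename T, typename Policy>
struct ptr_with_policy_model : private Policy {
    T* data = nullptr;

    // Expose the stored policy, similar in spirit to policy() described above.
    const Policy& policy() const { return *this; }
};

// With an empty policy, the base occupies no storage.
static_assert(
    sizeof(ptr_with_policy_model<float, stateless_policy_model>) == sizeof(float*),
    "empty policy base adds no storage");

// With a stateful policy, the wrapper must be larger than a bare pointer.
static_assert(
    sizeof(ptr_with_policy_model<float, stateful_policy_model>) > sizeof(float*),
    "stateful policy is stored alongside the pointer");
```

Had the policy been a regular data member instead of a base, even an empty policy would occupy at least one byte, and padding would typically grow the wrapper beyond the size of a pointer.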

Template Parameters:
  • T – The type of the elements as viewed by the user.

  • N – The alignment of T in number of elements.

  • Policy – The access policy, which also determines the underlying storage type and alignment.

Public Functions

inline vector_ptr()

Default constructor sets the pointer to NULL.

template<typename V = storage_type, enable_if_t<alignment != alignof(V), int> = 0>
inline explicit vector_ptr(pointer_type p, policy_type policy = {})

Constructor from a given pointer. It is up to the user to ensure that the pointer is aligned to Alignment.

template<typename V = storage_type, enable_if_t<alignment == alignof(V), int> = 0>
inline vector_ptr(pointer_type p, policy_type policy = {})

Constructor from a given pointer. This assumes that the alignment of the pointer equals Alignment.

template<typename T2, size_t N2, typename P2, enable_if_t<detail::is_policy_convertible<policy_type, P2>::value, int> = 0>
inline vector_ptr(vector_ptr<T2, N2, P2> p)

Constructs a vector_ptr from another vector_ptr with potentially different alignment and type. This constructor only allows conversion if the alignment of the source is greater than or equal to the alignment of the target. The target policy is default-constructed.
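
The rule that alignment may be weakened but never strengthened by a conversion can be sketched with a constrained converting constructor. This is a hypothetical model: aligned_ptr_model and its byte-based AlignBytes parameter are illustrative, whereas the library expresses the constraint through detail::is_policy_convertible.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical model of a pointer tagged with a promised alignment in bytes.
// NOT kernel_float's vector_ptr; the tag is a promise and is not checked.
template<typename T, std::size_t AlignBytes>
struct aligned_ptr_model {
    T* data = nullptr;

    aligned_ptr_model() = default;
    explicit aligned_ptr_model(T* p) : data(p) {}

    // Converting constructor: participates in overload resolution only when
    // the source promises at least as much alignment as the target requires.
    template<
        std::size_t OtherAlign,
        typename std::enable_if<(OtherAlign >= AlignBytes), int>::type = 0>
    aligned_ptr_model(aligned_ptr_model<T, OtherAlign> other) : data(other.data) {}
};
```

In this model a 16-byte-aligned pointer converts implicitly to a 4-byte-aligned one, while the reverse conversion does not compile, because a weaker alignment promise cannot satisfy a stronger requirement.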

inline vector_ptr<value_type, N, offset_policy_type> offset(size_t index) const

Returns a vector_ptr where the pointer has been offset by index * N elements.

inline const vector_ref<value_type, N, policy_type> operator*() const

Shorthand for at(0). The stored policy is forwarded to the resulting reference.

template<size_t K = N>
inline vector_ref<value_type, K, offset_policy_type> at(size_t index) const

Accesses a reference to a vector at a specific index with optional alignment considerations.

Template Parameters:

K – The number of elements in the vector to access, defaults to N.

Parameters:

index – The index at which to access the vector.

inline vector_ref<value_type, N, offset_policy_type> operator[](size_t index) const

Shorthand for at(index).

template<size_t K = N>
inline vector<value_type, extent<K>> read(size_t index = 0) const

Reads a vector of K elements at a specific index.

Template Parameters:

K – The number of elements to read, defaults to N.

Parameters:

index – The index from which to read the data.

template<size_t K = N, typename V>
inline void write(size_t index, const V &values) const

Writes data to a specific index.

Template Parameters:
  • K – The number of elements to write, defaults to N.

  • V – The type of the values being written.

Parameters:
  • index – The index at which to write the data.

  • values – The vector of values to write.

inline pointer_type get() const

Gets the raw data pointer managed by this vector_ptr.

inline const policy_type &policy() const

Returns a reference to the access policy stored within this vector_ptr.