Type Promotion
For operations that involve two (or more) input arguments, kernel_float will first convert the inputs into a common type before applying the operation. For example, when adding vec<int, N> to vec<float, N>, both arguments must first be converted into a vec<float, N>.
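The example above can be illustrated with a small, self-contained sketch. The simple_vec struct and the convert and add helpers are hypothetical stand-ins written for this illustration; they are not part of kernel_float's API and only mimic the "convert first, then operate" idea.

```cpp
#include <cstddef>

// Hypothetical fixed-length vector used only for this illustration;
// it is not kernel_float's vec type.
template <typename T, std::size_t N>
struct simple_vec {
    T data[N];
};

// Convert every element of v to the element type R.
template <typename R, typename T, std::size_t N>
constexpr simple_vec<R, N> convert(const simple_vec<T, N>& v) {
    simple_vec<R, N> out{};
    for (std::size_t i = 0; i < N; ++i) {
        out.data[i] = static_cast<R>(v.data[i]);
    }
    return out;
}

// Add an int vector to a float vector: the int operand is first converted
// to float, then the addition is performed element-wise on float values.
template <std::size_t N>
constexpr simple_vec<float, N> add(const simple_vec<int, N>& a,
                                   const simple_vec<float, N>& b) {
    simple_vec<float, N> lhs = convert<float>(a);
    simple_vec<float, N> out{};
    for (std::size_t i = 0; i < N; ++i) {
        out.data[i] = lhs.data[i] + b.data[i];
    }
    return out;
}
```

Calling add on a simple_vec<int, 2> and a simple_vec<float, 2> returns a simple_vec<float, 2>, mirroring the vec<int, N> plus vec<float, N> case described above.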
This procedure is called “type promotion” and is implemented as follows.
- Initially, every argument is transformed into a vector using the into_vec function.
- Next, all arguments must have length N or length 1, where vectors of length 1 are repeated to become length N.
- Finally, the vector element types are promoted into a shared type.
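The length-matching step can be sketched in plain C++ as shown below. This is a minimal illustration, not kernel_float's implementation: simple_vec, as_vec, and broadcast_to are hypothetical names, and the sketch assumes that a scalar is treated as a vector of length 1.

```cpp
#include <cstddef>

// Hypothetical fixed-length vector; not kernel_float's vec type.
template <typename T, std::size_t N>
struct simple_vec {
    T data[N];
};

// Assumption for this sketch: a scalar becomes a vector of length 1.
template <typename T>
constexpr simple_vec<T, 1> as_vec(T value) {
    return simple_vec<T, 1>{{value}};
}

// Bring a vector of length M to length N. Only M == N (pass-through) and
// M == 1 (repeat the single element N times) are accepted.
template <std::size_t N, typename T, std::size_t M>
constexpr simple_vec<T, N> broadcast_to(const simple_vec<T, M>& v) {
    static_assert(M == N || M == 1,
                  "each argument must have length N or length 1");
    simple_vec<T, N> out{};
    for (std::size_t i = 0; i < N; ++i) {
        out.data[i] = v.data[M == 1 ? 0 : i];
    }
    return out;
}
```

For instance, broadcast_to<3>(as_vec(7)) yields a length-3 vector whose elements are all 7, while a length-3 input is returned unchanged. Any other length is rejected at compile time.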
The rules for element type promotion in kernel_float differ slightly from those of regular C++.
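For comparison, the behaviour of regular C++ can be checked at compile time. The assertions below rely only on the language's integer promotion and usual arithmetic conversions and do not use kernel_float. By contrast, the table in the Overview section lists i8 for two i8 operands, b for two booleans, and marks signed/unsigned combinations as not allowed.

```cpp
#include <cstdint>
#include <type_traits>

// Built-in C++ arithmetic applies integer promotion and the usual arithmetic
// conversions, so small or mixed-sign operands do not keep their own type.
static_assert(std::is_same<decltype(std::int8_t{1} + std::int8_t{1}), int>::value,
              "two 8-bit integers are added as int");
static_assert(std::is_same<decltype(true + true), int>::value,
              "bool + bool yields int");
static_assert(std::is_same<decltype(1 + 1u), unsigned int>::value,
              "int + unsigned int is allowed and yields unsigned int");
static_assert(std::is_same<decltype(1 + 1.0f), float>::value,
              "int + float yields float");
```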
In a nutshell, for two element types, the promotion rules can be summarized as follows:
- If one of the types is bool, the result is the other type.
- If one type is a floating-point type and the other is an integer (signed or unsigned), the outcome is the floating-point type.
- If both are floating-point types, the larger of the two is chosen. An exception is combining half and bfloat16, which results in float.
- If both types are integer types of the same signedness, the larger of the two is chosen.
- Combining a signed integer and unsigned integer type is not allowed.
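The rules above can be written down as a small type trait. This is a sketch under stated assumptions rather than kernel_float's actual implementation: promote_sketch, kind_of, half_tag, and bfloat16_tag are hypothetical names, the two tag structs merely stand in for real 16-bit floating-point types, and "larger" is decided by sizeof (the first type wins on a tie).

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical 2-byte placeholders for half and bfloat16.
struct half_tag { std::uint16_t bits; };
struct bfloat16_tag { std::uint16_t bits; };

enum class kind { boolean, signed_int, unsigned_int, floating };

template <typename T>
struct is_float_like : std::is_floating_point<T> {};
template <> struct is_float_like<half_tag> : std::true_type {};
template <> struct is_float_like<bfloat16_tag> : std::true_type {};

// Classify an element type into one of the four categories.
template <typename T>
constexpr kind kind_of() {
    return std::is_same<T, bool>::value ? kind::boolean
         : is_float_like<T>::value      ? kind::floating
         : std::is_signed<T>::value     ? kind::signed_int
                                        : kind::unsigned_int;
}

// The larger of two types by size in bytes.
template <typename A, typename B>
using larger_t = std::conditional_t<(sizeof(A) >= sizeof(B)), A, B>;

template <typename A, typename B>
struct is_half_bfloat_pair
    : std::integral_constant<bool,
          (std::is_same<A, half_tag>::value && std::is_same<B, bfloat16_tag>::value) ||
          (std::is_same<A, bfloat16_tag>::value && std::is_same<B, half_tag>::value)> {};

template <typename A, typename B>
struct promote_sketch {
    static constexpr kind ka = kind_of<A>();
    static constexpr kind kb = kind_of<B>();

    // Rule: signed and unsigned integers cannot be combined.
    static_assert(!(ka == kind::signed_int && kb == kind::unsigned_int) &&
                  !(ka == kind::unsigned_int && kb == kind::signed_int),
                  "combining signed and unsigned integer types is not allowed");

    using type =
        std::conditional_t<ka == kind::boolean, B,      // bool: take the other type
        std::conditional_t<kb == kind::boolean, A,
        std::conditional_t<ka != kb,                    // float with integer: take the float
            std::conditional_t<ka == kind::floating, A, B>,
        std::conditional_t<is_half_bfloat_pair<A, B>::value, float,  // half + bfloat16
            larger_t<A, B>>>>>;                         // same category: take the larger
};

template <typename A, typename B>
using promote_sketch_t = typename promote_sketch<A, B>::type;

static_assert(std::is_same<promote_sketch_t<bool, int>, int>::value, "bool + int");
static_assert(std::is_same<promote_sketch_t<half_tag, bfloat16_tag>, float>::value,
              "half + bfloat16");
```

Instantiating promote_sketch with a signed and an unsigned integer type, such as int and unsigned int, fails to compile, which mirrors the last rule in the list.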
Overview
The type promotion rules are shown in the table below. The labels are as follows:
- b: boolean
- iN: signed integer of N bits (e.g., int, long)
- uN: unsigned integer of N bits (e.g., unsigned int, size_t)
- fN: floating-point type of N bits (e.g., float, double)
- bf16: bfloat16 floating-point format
- x: the combination is not allowed (signed integer combined with unsigned integer)
|      | b    | i8   | i16  | i32  | i64  | u8   | u16  | u32  | u64  | f8   | f16  | bf16 | f32  | f64  |
|------|------|------|------|------|------|------|------|------|------|------|------|------|------|------|
| b    | b    | i8   | i16  | i32  | i64  | u8   | u16  | u32  | u64  | f8   | f16  | bf16 | f32  | f64  |
| i8   | i8   | i8   | i16  | i32  | i64  | x    | x    | x    | x    | f8   | f16  | bf16 | f32  | f64  |
| i16  | i16  | i16  | i16  | i32  | i64  | x    | x    | x    | x    | f8   | f16  | bf16 | f32  | f64  |
| i32  | i32  | i32  | i32  | i32  | i64  | x    | x    | x    | x    | f8   | f16  | bf16 | f32  | f64  |
| i64  | i64  | i64  | i64  | i64  | i64  | x    | x    | x    | x    | f8   | f16  | bf16 | f32  | f64  |
| u8   | u8   | x    | x    | x    | x    | u8   | u16  | u32  | u64  | f8   | f16  | bf16 | f32  | f64  |
| u16  | u16  | x    | x    | x    | x    | u16  | u16  | u32  | u64  | f8   | f16  | bf16 | f32  | f64  |
| u32  | u32  | x    | x    | x    | x    | u32  | u32  | u32  | u64  | f8   | f16  | bf16 | f32  | f64  |
| u64  | u64  | x    | x    | x    | x    | u64  | u64  | u64  | u64  | f8   | f16  | bf16 | f32  | f64  |
| f8   | f8   | f8   | f8   | f8   | f8   | f8   | f8   | f8   | f8   | f8   | f16  | bf16 | f32  | f64  |
| f16  | f16  | f16  | f16  | f16  | f16  | f16  | f16  | f16  | f16  | f16  | f16  | f32  | f32  | f64  |
| bf16 | bf16 | bf16 | bf16 | bf16 | bf16 | bf16 | bf16 | bf16 | bf16 | bf16 | f32  | bf16 | f32  | f64  |
| f32  | f32  | f32  | f32  | f32  | f32  | f32  | f32  | f32  | f32  | f32  | f32  | f32  | f32  | f64  |
| f64  | f64  | f64  | f64  | f64  | f64  | f64  | f64  | f64  | f64  | f64  | f64  | f64  | f64  | f64  |