At present, the SIMD support makes use of a subset of SSE up to SSE4.1. The subset used depends on the current CPU type.
SSE1 only supports single-precision SIMD (float-4).
SSE2 introduces double-precision SIMD (double-2) and integer SIMD (all types). Integer SIMD is missing a few features; in particular, the vmin and vmax operations only work on uchar-16 and short-8.
SSE3 introduces horizontal adds, which add adjacent components within a vector register so that repeated application sums all components of a single register. This is useful for computing dot products. Where available, SSE3 operations are used to speed up sum, vdot, norm-sq, norm, and distance.
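As a small illustration, two of the words listed above can be applied to float-4 values directly. This is only a sketch: it assumes a Factor version in which float-4{ literals are provided by math.vectors.simd, and the input values are made up for the example.

```factor
USING: math.vectors math.vectors.simd prettyprint ;

! Dot product of two single-precision SIMD vectors.
float-4{ 1.0 2.0 3.0 4.0 } float-4{ 4.0 3.0 2.0 1.0 } vdot .

! Squared norm of a single vector.
float-4{ 1.0 2.0 3.0 4.0 } norm-sq .
```

The calling code is the same whether or not SSE3 is present; only the instructions chosen for the reduction differ.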
SSSE3 introduces vabs for char-16, short-8 and int-4.
SSE4.1 introduces vmin and vmax for all remaining integer types, a faster instruction for vdot, and a few other things.
On PowerPC, or older x86 chips without SSE, software fallbacks are used for all high-level vector operations. SIMD code can run with no loss in functionality, just decreased performance.
The primitives in the math.vectors.simd.intrinsics vocabulary do not have software fallbacks, but they should not be called directly in any case.
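Because the fallbacks apply to the high-level operations, portable code would normally use only the generic words from math.vectors and leave instruction selection to the compiler. A minimal sketch, again assuming SIMD literals from math.vectors.simd and using illustrative values:

```factor
USING: math.vectors math.vectors.simd prettyprint ;

! Component-wise minimum of two 4-element integer vectors.
! On CPUs without SSE4.1 this is expected to take the fallback path.
int-4{ 1 5 3 7 } int-4{ 4 2 6 0 } vmin .

! Component-wise addition of two double-precision vectors.
double-2{ 1.5 2.5 } double-2{ 0.5 0.5 } v+ .
```

Nothing in this code refers to math.vectors.simd.intrinsics, so it should behave the same on every supported CPU, differing only in speed.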