# Vector processing via SSE/AVX

FTN95 /64 creates machine code that makes some use of the SSE and AVX instruction sets (see https://en.wikipedia.org/wiki/Streaming_SIMD_Extensions).

Note that code compiled using /OPTIMISE will not load on a machine that does not support AVX instructions.

Users can also write SSE/AVX instructions directly by placing them in CODE/EDOC (inline assembler) blocks in their code.

Four BLAS-style library routines (DOT_PRODUCT8@, DOT_PRODUCT4@, AXPY8@ and AXPY4@) are also provided, and these make direct use of the SSE/AVX instruction sets. In addition, the library function USE_AVX@ can be called to instruct these routines to use AVX rather than SSE when the CPU and operating system make this possible.

REAL*8 FUNCTION DOT_PRODUCT8@(x,y,n)
REAL*8 x(n),y(n)
INTEGER*8 n

REAL*4 FUNCTION DOT_PRODUCT4@(x,y,n)
REAL*4 x(n),y(n)
INTEGER*8 n

SUBROUTINE AXPY8@(y,x,n,a)
REAL*8 x(n),y(n),a
INTEGER*8 n
(Y = Y + A*X)

SUBROUTINE AXPY4@(y,x,n,a)
REAL*4 x(n),y(n),a
INTEGER*8 n
(Y = Y + A*X)

INTEGER FUNCTION USE_AVX@(level)
INTEGER level
(Set level = 0 for SSE or level = 1 for AVX. The function returns the level that will actually be used on the current CPU/OS. The default level is 1, which means that AVX is used when available and SSE otherwise. If USE_AVX@(1) is called before an ALLOCATE statement, the resulting addresses are 32-byte aligned. The USE_AVX@ level must be the same at the corresponding DEALLOCATE.)

For example:

```
  INTEGER(4),PARAMETER::n=100
  REAL(2) DOT_PRODUCT8@,prod,x(n),y(n)
  INTEGER USE_AVX@,level
  ! x = ...; y = ...
  level = USE_AVX@(1)
  prod = DOT_PRODUCT8@(x,y,n)
```
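For comparison, the arithmetic performed by DOT_PRODUCT8@ and AXPY8@ can be written as plain loops in standard Fortran. The following is only a sketch of the semantics, not the library's implementation: the module name `blas_reference` and the routine names `ref_dot8` and `ref_axpy8` are hypothetical, and the argument order simply mirrors the interfaces listed above. A vectorised routine may sum the terms in a different order, so its result may differ from the sequential loop in the last few bits.

```fortran
! Reference semantics only: plain-loop equivalents of the FTN95 routines.
! The names below are hypothetical and are not part of the FTN95 library.
module blas_reference
  implicit none
  integer, parameter :: dp = selected_real_kind(15, 307)  ! 64-bit real (REAL*8)
  integer, parameter :: i8 = selected_int_kind(18)        ! 64-bit integer (INTEGER*8)
contains

  ! Mirrors DOT_PRODUCT8@(x,y,n): returns the sum of x(i)*y(i) for i = 1..n
  function ref_dot8(x, y, n) result(s)
    integer(i8), intent(in) :: n
    real(dp), intent(in) :: x(n), y(n)
    real(dp) :: s
    integer(i8) :: i
    s = 0.0_dp
    do i = 1, n
      s = s + x(i)*y(i)
    end do
  end function ref_dot8

  ! Mirrors AXPY8@(y,x,n,a): computes y = y + a*x, updating y in place
  subroutine ref_axpy8(y, x, n, a)
    integer(i8), intent(in) :: n
    real(dp), intent(inout) :: y(n)
    real(dp), intent(in) :: x(n), a
    integer(i8) :: i
    do i = 1, n
      y(i) = y(i) + a*x(i)
    end do
  end subroutine ref_axpy8

end module blas_reference
```

A loop of this kind can serve as a cross-check when validating results from the library routines, using a small tolerance rather than exact equality. The single-precision routines follow the same pattern with 32-bit reals.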