Using NEON and VFPv3 on Cortex-A8

The TI ARM compiler provides two options that control NEON and VFPv3 code generation:

--float_support=VFPv3 --neon

The --float_support=VFPv3 option instructs the compiler to generate code that uses the VFPv3 coprocessor for both single and double precision floating point operations. The same option enables the assembler to accept VFPv3 instructions in assembly source. To enable VFPv3, EABI mode must also be enabled through the --abi=eabi option. This is necessary because the calling convention for floating point parameters changes when VFPv3 is enabled, and that convention is only supported in EABI mode.

The --neon option instructs the compiler to automatically vectorize loops using NEON instructions. To benefit from this option, compile with --opt_level=2 or higher and optimize for performance by setting --opt_for_speed to a value from 3 to 5.
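As a rough illustration, the loop below is the kind of code the vectorizer is meant to target. This is only a sketch: the function name scale_add is hypothetical, and the executable name armcl and the -mv7A8 target switch shown in the comment are assumptions about the tool chain, not taken from this article. Whether a given loop is actually vectorized depends on the compiler version and should be confirmed in the generated assembly.

```c
#include <stddef.h>

/*
 * Assumed build line. The executable name (armcl) and the target switch
 * (-mv7A8) are assumptions; the other options are the ones described above.
 *
 *   armcl -mv7A8 --abi=eabi --float_support=VFPv3 --neon
 *         --opt_level=2 --opt_for_speed=5 scale_add.c
 */

/*
 * Hypothetical example: out[i] = a * x[i] + y[i] for n elements.
 * The restrict qualifier (C99) promises that the arrays do not overlap,
 * which removes one common obstacle to vectorization.
 */
void scale_add(float *restrict out, const float *restrict x,
               const float *restrict y, float a, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        out[i] = a * x[i] + y[i];
    }
}
```

Because the loop operates on single precision data, it would only be a candidate for NEON when VFPv3 is enabled as well.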

Combining options

The TI ARM compiler supports four modes related to Cortex-A8, NEON, and VFPv3. By default neither NEON nor VFPv3 is enabled. In addition to the default, the following three modes are supported:

VFP enabled without NEON

In this mode the compiler will generate VFPv3 instructions for single and double precision floating point operations.

NEON enabled without VFP

In this mode the compiler will generate NEON instructions for SIMD integer operations. It will not generate NEON instructions to vectorize floating point operations. Floating point NEON instructions are not allowed without VFP because an integer-only variant of NEON can be implemented: for the NEON unit to support floating point operations, the VFPv3 coprocessor must be present.

NEON enabled and VFP enabled

In this mode the compiler will generate a mix of NEON and VFP instructions. The NEON instructions can be either integer or floating point.
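The difference between the modes can be sketched with two loops. The function names below are hypothetical, and the comments only restate the behavior described above; they are not a guarantee about what any particular compiler release emits.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Integer loop. By the description above, this is a candidate for NEON
 * vectorization whenever --neon is given, with or without
 * --float_support=VFPv3.
 */
void add_i16(int16_t *restrict out, const int16_t *restrict a,
             const int16_t *restrict b, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        out[i] = (int16_t)(a[i] + b[i]);
    }
}

/*
 * Floating point loop. By the description above, this is a candidate for
 * NEON vectorization only when both --neon and --float_support=VFPv3
 * (and therefore --abi=eabi) are given. With VFP alone it would be
 * compiled to scalar VFPv3 instructions.
 */
void mul_f32(float *restrict out, const float *restrict a,
             const float *restrict b, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        out[i] = a[i] * b[i];
    }
}
```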

VFPv3 vs. NEON performance

A common question about the TI ARM compiler's NEON support is how to get more floating point operations to execute on the NEON unit instead of the VFPv3 coprocessor. This is desirable because on the Cortex-A8 the VFPv3 coprocessor is not pipelined, whereas the NEON unit is. The compiler will always use VFP instructions for scalar floating point operations, even if the --neon option is used. However, the hardware can issue VFP instructions on the NEON coprocessor if all of the following conditions are met:

The instruction must be a single precision data processing instruction.
The processor must be in flush-to-zero mode. In this mode the processor will treat all denormalized numbers as zero.
The processor must be in default NaN mode. In this mode any operation that involves a NaN operand or produces a NaN returns the default NaN regardless of the input, whereas in full-compliance mode the returned NaN follows the rules in the ARM Architecture Reference Manual.
The FPEXC.EX bit must be set to 0. This tells the processor that there is no additional state that must be handled by a context switch.
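The register-related conditions above can be expressed as bit tests. The sketch below assumes the ARMv7 register layout (FPSCR.FZ in bit 24, FPSCR.DN in bit 25, FPEXC.EX in bit 31). The function names are hypothetical, and the code only manipulates register values passed in as plain integers. On a target, the values would be moved to and from the coprocessor with the VMRS and VMSR instructions, which is not shown here. The single precision condition is a property of each instruction and cannot be checked this way.

```c
#include <stdint.h>

/* Bit positions assumed from the ARMv7 FPSCR and FPEXC layouts. */
#define FPSCR_FZ (UINT32_C(1) << 24) /* flush-to-zero mode */
#define FPSCR_DN (UINT32_C(1) << 25) /* default NaN mode */
#define FPEXC_EX (UINT32_C(1) << 31) /* additional exception state present */

/* Return the given FPSCR value with flush-to-zero and default NaN enabled. */
uint32_t fpscr_with_fz_dn(uint32_t fpscr)
{
    return fpscr | FPSCR_FZ | FPSCR_DN;
}

/*
 * Return 1 if the register state satisfies the conditions listed above
 * (FZ set, DN set, EX clear), and 0 otherwise.
 */
int vfp_state_allows_neon_issue(uint32_t fpscr, uint32_t fpexc)
{
    return (fpscr & FPSCR_FZ) != 0
        && (fpscr & FPSCR_DN) != 0
        && (fpexc & FPEXC_EX) == 0;
}
```

Keeping the bit logic separate from the register access makes it possible to check it on a host machine before it is wired into startup code.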
