Newer ARM processors have their own flavor of SIMD instructions called NEON. In my little Android application Arashi, NEON is used a lot to speed up the simulation of particles.

Here is a table explaining some of the NEON intrinsics that are used. Each one works on four 32-bit floats at once, so the pseudocode applies to every lane separately:

[table caption="C++ NEON functions" width="500" colwidth="20|100|50" colalign="left|left"]

NEON,Explanation,Pseudocode

vdupq_n_f32(a),New NEON value with the scalar copied into all four lanes,a

vsubq_f32(a\, b),Subtract,a - b

vaddq_f32(a\, b),Add,a + b

vmulq_f32(a\, b),Multiply,a * b

vmlaq_f32(a\, b\, c),Multiply and add,a + (b * c)

vmlsq_f32(a\, b\, c),Multiply and subtract,a - (b * c)

vrsqrteq_f32(a),Reciprocal square root estimate (approximate),1 / sqrt(a)

vcgtq_f32(a\, b),Compare greater than (returns a bit mask),a > b ? 0xFFFFFFFF : 0

vcltq_f32(a\, b),Compare less than (returns a bit mask),a < b ? 0xFFFFFFFF : 0

vbslq_f32(mask\, a\, b),Select by mask (bit by bit),mask bit set ? a : b

vminq_f32(a\, b),Get minimum,a < b ? a : b

vmaxq_f32(a\, b),Get maximum,a > b ? a : b

[/table]
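
To show how these intrinsics combine, here is a minimal sketch of a particle step. It is not the code from Arashi: `step_particles` and its parameters are hypothetical names made up for this illustration. It also uses a few intrinsics that are not in the table (`vld1q_f32` and `vst1q_f32` to load and store four floats, `vorrq_u32` to OR two masks, `vnegq_f32` to negate), and it assumes the particle count is a multiple of 4. The scalar branch is only there so the sketch still compiles on a machine without NEON.

```cpp
#include <algorithm>
#include <cstddef>
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

// Hypothetical helper: move particles by vel * dt, flip the velocity of any
// particle that leaves [lo, hi], and clamp its position back into the range.
// Assumes count is a multiple of 4.
void step_particles(float* pos, float* vel, std::size_t count,
                    float dt, float lo, float hi) {
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    const float32x4_t vdt = vdupq_n_f32(dt);  // dt in all four lanes
    const float32x4_t vlo = vdupq_n_f32(lo);
    const float32x4_t vhi = vdupq_n_f32(hi);
    for (std::size_t i = 0; i < count; i += 4) {
        float32x4_t p = vld1q_f32(pos + i);
        float32x4_t v = vld1q_f32(vel + i);
        p = vmlaq_f32(p, v, vdt);  // p + (v * dt)
        // Lanes that left the range have all bits set in the mask.
        const uint32x4_t outside =
            vorrq_u32(vcltq_f32(p, vlo), vcgtq_f32(p, vhi));
        v = vbslq_f32(outside, vnegq_f32(v), v);  // bounce only those lanes
        p = vmaxq_f32(vminq_f32(p, vhi), vlo);    // clamp to [lo, hi]
        vst1q_f32(pos + i, p);
        vst1q_f32(vel + i, v);
    }
#else
    // Scalar fallback with the same per-lane behavior.
    for (std::size_t i = 0; i < count; ++i) {
        const float p = pos[i] + vel[i] * dt;
        if (p < lo || p > hi) vel[i] = -vel[i];
        pos[i] = std::max(std::min(p, hi), lo);
    }
#endif
}
```

Calling `step_particles(pos, vel, 4, 0.5f, 0.0f, 4.0f)` on four particles updates all of them in a single loop iteration on NEON hardware. The compare and select pair replaces the per-particle `if`, which is what keeps the loop free of branches.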

Gal Hai

Slight typo:

vcgtq_f32(a, b) Compare less than a < b ? 1 : 0

Guess you meant vcltq_f32.

Kevin (post author)

You’re right, thanks for letting me know! I’ve updated the post.