I’m trying to optimize some code for an instrument on Solar Orbiter. We need to compute some square roots. While reading the generated assembly to see the effect of changes in the C code, and looking for FPU instructions, I found this pattern for sqrt:
4001a9ac: 97 a0 05 2f  fsqrts %f15, %f11
4001a9b0: 81 aa ca 2b  fcmps %f11, %f11
4001a9b4: 01 00 00 00  nop
4001a9b8: 13 80 00 11  fbe 4001a9fc <compute_BP1+0x144>
4001a9bc: db 06 20 44  ld [ %i0 + 0x44 ], %f13
4001a9c0: df 27 bf c8  st %f15, [ %fp + -56 ]
4001a9c4: d1 27 bf d4  st %f8, [ %fp + -44 ]
4001a9c8: d3 27 bf d0  st %f9, [ %fp + -48 ]
4001a9cc: d5 27 bf d8  st %f10, [ %fp + -40 ]
4001a9d0: d9 27 bf dc  st %f12, [ %fp + -36 ]
4001a9d4: db 27 bf e0  st %f13, [ %fp + -32 ]
4001a9d8: 40 00 02 49  call 4001b2fc <sqrtf>
My understanding is that fsqrts might fail, in which case the computation is repeated with newlib’s software implementation. Am I right? Under which circumstances could it fail? Can we avoid the fallback?