Sqrt ASM code with rcc 1.2.21

I’m trying to optimize some code for an instrument on Solar Orbiter. We have to compute some square roots, so while reading the generated ASM to see the effect of C code changes and hunting for FPU instructions, I found this pattern for sqrt:

4001a9ac:	97 a0 05 2f 	fsqrts  %f15, %f11
4001a9b0:	81 aa ca 2b 	fcmps  %f11, %f11
4001a9b4:	01 00 00 00 	nop 
4001a9b8:	13 80 00 11 	fbe  4001a9fc <compute_BP1+0x144>
4001a9bc:	db 06 20 44 	ld  [ %i0 + 0x44 ], %f13
4001a9c0:	df 27 bf c8 	st  %f15, [ %fp + -56 ]
4001a9c4:	d1 27 bf d4 	st  %f8, [ %fp + -44 ]
4001a9c8:	d3 27 bf d0 	st  %f9, [ %fp + -48 ]
4001a9cc:	d5 27 bf d8 	st  %f10, [ %fp + -40 ]
4001a9d0:	d9 27 bf dc 	st  %f12, [ %fp + -36 ]
4001a9d4:	db 27 bf e0 	st  %f13, [ %fp + -32 ]
4001a9d8:	40 00 02 49 	call  4001b2fc <sqrtf>

My understanding is that fsqrts might fail, in which case the computation is repeated with newlib’s software implementation. Am I right? Under which circumstances could it fail? Can we avoid it?

Best regards,

Hi! Are you using our binary rcc distribution, or did you build the toolchain yourself? Are you using any build flags (-mtune etc.)? Is this generated from a single sqrtf() function call in C?

Just from looking at the disassembly, this code appears to check for a NaN result from fsqrts (NaN is not equal to itself by definition), which should happen if the input to the square root is less than 0 or is itself NaN.
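To make the self-comparison concrete, here is a minimal C sketch of the condition that the fcmps/fbe pair appears to test. The function names are hypothetical and chosen only for illustration; this is not the compiler's actual expansion, just the same predicate written in C.

```c
#include <assert.h>
#include <math.h>   /* NAN macro only; no libm function is called */

/* Hypothetical helper mirroring "fcmps %f11, %f11" followed by "fbe":
 * r stands for the value fsqrts produced. Returns 1 when r is NaN,
 * i.e. when the branch is not taken and the sqrtf call would run. */
static int needs_library_fallback(float r)
{
    return r != r;   /* true only for NaN */
}

/* Small usage example: an ordinary result skips the call, a NaN does not. */
static void fallback_condition_examples(void)
{
    assert(!needs_library_fallback(2.0f));
    assert(needs_library_fallback(NAN));
}
```

For ordinary non-negative inputs the comparison succeeds, so only the fsqrts instruction and the compare are executed and the call is skipped.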

Hi, I use a self-built rcc since we had to modify the spw init for dual link mode:
teamcity-docker-SolarOrbiter-LFR-agent/Dockerfile at master · jeandet/teamcity-docker-SolarOrbiter-LFR-agent · GitHub
Regarding CPU-related build flags, nothing except -mfix-b2bst: LFR_Flight_Software/meson.build at R3.3 · LaboratoryOfPlasmaPhysics/LFR_Flight_Software · GitHub

A NaN result was my guess, but I wondered whether it could also happen with allowed values (>= 0), because I would expect libc’s sqrt to return the same result (NaN) for anything < 0.

Maybe it is called so that errno gets set?

Yes, it was errno :)
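For readers landing here later, a small C model of why the call is there: the hardware instruction already yields NaN for a negative input, so the only extra observable effect of the library path is that errno becomes EDOM. The function below is a hypothetical model written for illustration, not newlib's code; hw_result stands for whatever fsqrts produced.

```c
#include <errno.h>
#include <math.h>   /* NAN macro only; no libm function is called */

/* Hypothetical model of the slow path. The result is unchanged; the only
 * side effect is errno = EDOM when a non-NaN input produced a NaN result,
 * which for a square root means the input was negative. */
static float sqrt_slow_path_model(float x, float hw_result)
{
    int result_is_nan = (hw_result != hw_result);
    int input_is_nan  = (x != x);

    if (result_is_nan && !input_is_nan) {
        errno = EDOM;   /* domain error: square root of a negative number */
    }
    return hw_result;
}
```

If errno is never inspected after the square root, GCC's -fno-math-errno option tells the compiler it does not have to preserve this behaviour, which should let it drop the compare and the call. I have not checked this against this particular rcc build, so treat it as something to verify in the disassembly.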

Good find, thanks for reporting back!