Reduce fdivs into fmuls

Provides a small speedup on microarchitectures where floating-point divide is slower than floating-point multiply.
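
For illustration only, here is a minimal C sketch of the general idea: replace a per-element divide with a multiply by a precomputed reciprocal. The function names `scale_div` and `scale_mul` are hypothetical and do not come from this change, which may apply the rewrite at a different level (for example, on compiler IR).

```c
#include <stddef.h>

/* Before: one floating-point divide per element. */
void scale_div(float *v, size_t n, float d) {
    for (size_t i = 0; i < n; ++i)
        v[i] = v[i] / d;
}

/* After: one divide up front, then one multiply per element.
 * Results are guaranteed bit-identical when 1/d is exactly
 * representable (e.g. d is a power of two); for other divisors
 * the two versions may differ in the last bit. */
void scale_mul(float *v, size_t n, float d) {
    const float inv = 1.0f / d;   /* hoisted reciprocal */
    for (size_t i = 0; i < n; ++i)
        v[i] = v[i] * inv;
}
```

Because the rewrite can change rounding for divisors whose reciprocal is not exact, compilers generally perform it automatically only when the reciprocal is exact or when relaxed floating-point semantics are enabled.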