# Floating-point operators

### February 8, 2024

#### Floating-point operators

For floating-point and complex numbers, `+x` is the same as `x`, while `-x` is the negation of `x`. The result of a floating-point or complex division by zero is not specified beyond the IEEE-754 standard; whether a run-time panic occurs is implementation-specific.

I find this to be quite interesting. An implementation may choose to panic, or not, if you attempt to divide a floating-point or complex number by zero. And indeed, the standard implementation does not panic, as you can see by running the following code in the Playground or on your own machine:

``````go
package main

import "fmt"

func main() {
	var x float64 = 1.23
	fmt.Println(x / 0)
}
``````

Outputs:

``````
+Inf
``````
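The other IEEE-754 special cases behave the same way. Here is a minimal sketch, assuming the standard `gc` toolchain; `divide` and `classify` are hypothetical helper names I made up for illustration. The operands are passed as variables because a constant division by zero is rejected at compile time.

```go
package main

import (
	"fmt"
	"math"
)

// divide performs the division at run time. Both operands are variables,
// so the compiler cannot reject it as a constant division by zero.
func divide(a, b float64) float64 {
	return a / b
}

// classify names the IEEE-754 special value held by f, if any.
func classify(f float64) string {
	switch {
	case math.IsNaN(f):
		return "NaN"
	case math.IsInf(f, 1):
		return "+Inf"
	case math.IsInf(f, -1):
		return "-Inf"
	default:
		return "finite"
	}
}

func main() {
	fmt.Println(classify(divide(1.23, 0)))  // positive / zero
	fmt.Println(classify(divide(-1.23, 0))) // negative / zero
	fmt.Println(classify(divide(0, 0)))     // zero / zero
}
```

With the standard implementation this should report `+Inf`, `-Inf`, and `NaN` in that order. Since the spec leaves a run-time panic open as an option, another implementation could behave differently.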

But wait, that’s not all… there are other implementation-dependent details!

An implementation may combine multiple floating-point operations into a single fused operation, possibly across statements, and produce a result that differs from the value obtained by executing and rounding the instructions individually.

So as a simple example, if you have code that looks like:

``````go
x := float32(0.0000000001)
y := x
x = x * 12.3456789
x = x / 12.3456789
fmt.Println(y, x) // 1e-10 9.9999994e-11
``````

An implementation has the freedom to “fuse” the multiplication and the subsequent division, effectively eliminating both, for a more accurate result.
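To get a feel for what per-step rounding costs, here is a rough sketch that compares rounding to `float32` after every operation with carrying the intermediate value in `float64` and rounding only once. This is an analogy for fusion, not what a compiler actually does, and `stepwise` and `roundOnce` are hypothetical names chosen for this example.

```go
package main

import "fmt"

// c is the factor from the example above. As an untyped constant it
// takes the type of the expression it appears in.
const c = 12.3456789

// stepwise rounds to float32 after each operation. The explicit
// conversions ensure the two operations cannot be fused.
func stepwise(x float32) float32 {
	t := float32(x * c)
	return float32(t / c)
}

// roundOnce carries the intermediate value in float64 and rounds to
// float32 only at the very end.
func roundOnce(x float32) float32 {
	w := float64(x)
	return float32(w * c / c)
}

func main() {
	x := float32(0.0000000001)
	fmt.Println(x, stepwise(x), roundOnce(x))
}
```

The `roundOnce` result matches the original value, because the tiny `float64` rounding errors disappear when the result is rounded back to `float32`. The `stepwise` result may be off by a unit or so in the last place.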

… An explicit floating-point type conversion rounds to the precision of the target type, preventing fusion that would discard that rounding.

For instance, some architectures provide a “fused multiply and add” (FMA) instruction that computes `x*y + z` without rounding the intermediate result `x*y`. These examples show when a Go implementation can use that instruction:

``````go
// FMA allowed for computing r, because x*y is not explicitly rounded:
r  = x*y + z
r  = z;   r += x*y
t  = x*y; r = t + z
*p = x*y; r = *p + z
r  = x*y + float64(z)

// FMA disallowed for computing r, because it would omit rounding of x*y:
r  = float64(x*y) + z
r  = z; r += float64(x*y)
t  = float64(x*y); r = t + z
``````
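The difference between the two groups can be made visible with carefully chosen inputs. The following is a sketch, not a definitive test: `rounded`, `maybeFused`, and `fused` are hypothetical helper names, and `math.FMA` (in the standard library since Go 1.14) stands in for what a fused instruction computes. The inputs are picked so that the exact product `x*y` is `1 - 2^-60`, which rounds to `1` in `float64`.

```go
package main

import (
	"fmt"
	"math"
)

// rounded forces x*y to be rounded to float64 before the addition, so the
// compiler may not replace the expression with a fused multiply-add.
func rounded(x, y, z float64) float64 {
	return float64(x*y) + z
}

// maybeFused leaves the compiler free to use an FMA instruction. Whether
// it does depends on the target architecture and the compiler.
func maybeFused(x, y, z float64) float64 {
	return x*y + z
}

// fused always computes x*y + z with a single rounding at the end.
func fused(x, y, z float64) float64 {
	return math.FMA(x, y, z)
}

func main() {
	const eps = 1.0 / (1 << 30) // 2^-30, exactly representable
	x, y, z := 1+eps, 1-eps, -1.0

	fmt.Println(rounded(x, y, z))    // product rounds to 1, so the sum is 0
	fmt.Println(fused(x, y, z))      // keeps the exact product: -2^-60
	fmt.Println(maybeFused(x, y, z)) // one of the two values above
}
```

The values are passed through function parameters on purpose. Written directly as a constant expression, `x*y + z` would be evaluated exactly at compile time and the rounding difference would never show up.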

So if you want your implementation/platform to (possibly) fuse floating-point operations, avoid explicit conversions to floating-point types around intermediate results, since each such conversion forces rounding and makes fusion impossible. In reality: If in doubt, don’t worry about this level of micro-optimization.

Quotes from The Go Programming Language Specification Version of August 2, 2023