What is Floating Point
Floating point is how computers represent non-integer numbers in binary. While two's complement handles integers, floating point handles values like 3.14, -0.001, and 6.022 x 10^23 — numbers with fractional parts or extreme magnitudes.
How it works
A floating-point number has three parts, similar to scientific notation:
Scientific: -2.5 × 10^3
  sign: -   mantissa: 2.5   exponent: 3
IEEE 754 (32-bit float):
[1 bit sign] [8 bits exponent] [23 bits mantissa]
- Sign — one bit. 0 is positive, 1 is negative.
- Exponent — determines the magnitude (how far the decimal point "floats").
- Mantissa (also called significand) — holds the significant digits. A leading 1 bit is implicit for normal numbers, so 23 stored bits give 24 bits of precision.
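The three fields can be inspected directly. Here is a minimal Python sketch using the standard struct module: Python's own float is 64-bit, so we pack the value into the 32-bit format first, then slice the bits apart with shifts and masks.

```python
import struct

def decompose(x: float) -> tuple[int, int, int]:
    # Reinterpret the 32-bit float's bit pattern as an unsigned integer.
    bits = struct.unpack('>I', struct.pack('>f', x))[0]
    sign = bits >> 31             # top bit
    exponent = (bits >> 23) & 0xFF  # next 8 bits (biased by 127)
    mantissa = bits & 0x7FFFFF    # low 23 bits (implicit leading 1 not stored)
    return sign, exponent, mantissa

# -2.5 = (-1)^1 * 1.25 * 2^1 -> sign 1, biased exponent 127 + 1 = 128
print(decompose(-2.5))  # (1, 128, 2097152)
```

The exponent comes out as 128 rather than 1 because IEEE 754 stores it with a bias of 127, which lets exponents be compared as unsigned integers.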
The IEEE 754 standard defines two common sizes: 32-bit (float, f32) with about 7 decimal digits of precision, and 64-bit (double, f64) with about 15-16 digits.
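The precision gap between the two sizes is easy to see by round-tripping a value through the 32-bit format, again with the stdlib struct module:

```python
import struct

def to_f32(x: float) -> float:
    # Round a Python float (an f64) to the nearest f32 and widen it back.
    return struct.unpack('f', struct.pack('f', x))[0]

print(1/3)          # 0.3333333333333333  (f64: ~16 correct digits)
print(to_f32(1/3))  # 0.3333333432674408  (f32: only ~7 correct digits)
```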
The infamous precision issue: 0.1 + 0.2 != 0.3 in most languages. This happens because 0.1 has no exact binary representation — just like 1/3 has no exact decimal representation. The stored value is the closest approximation the format can express.
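In Python, for example:

```python
import math

print(0.1 + 0.2)         # 0.30000000000000004
print(0.1 + 0.2 == 0.3)  # False

# The stored value of 0.1 is only the nearest representable double:
print(f"{0.1:.20f}")     # 0.10000000000000000555

# Compare with a tolerance instead of exact equality:
print(math.isclose(0.1 + 0.2, 0.3))  # True
```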
Other precision traps:
- Large numbers lose small details — a 32-bit float can represent 16,777,216 and 16,777,218, but not 16,777,217. The gaps between representable values grow as the magnitude increases.
- Catastrophic cancellation — subtracting two nearly equal numbers can destroy all significant digits.
- Special values — IEEE 754 defines Infinity, -Infinity, and NaN (Not a Number). Dividing 1.0 by 0.0 gives Infinity. The square root of -1.0 gives NaN.
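All three traps can be demonstrated in a few lines of Python. One caveat: Python raises ZeroDivisionError on 1.0 / 0.0 rather than returning Infinity as raw IEEE hardware does, so the sketch builds the special values from the math module instead.

```python
import math
import struct

def to_f32(x: float) -> float:
    # Round a Python float (an f64) to the nearest f32 and widen it back.
    return struct.unpack('f', struct.pack('f', x))[0]

# Gaps grow with magnitude: 16,777,217 (2^24 + 1) is unrepresentable in f32.
print(to_f32(16_777_216.0))  # 16777216.0
print(to_f32(16_777_217.0))  # 16777216.0 -- rounded to the nearest neighbor

# Catastrophic cancellation: the true difference is 1e-8, but subtracting
# two nearly equal doubles leaves only a few correct digits.
a, b = 1.0 + 1e-8, 1.0
print(a - b)  # close to 1e-08, but most digits are rounding noise

# Special values behave unlike ordinary numbers.
print(math.inf + 1 == math.inf)  # True
print(math.nan == math.nan)      # False -- NaN is unequal even to itself
```

The NaN self-inequality is deliberate in the standard: it makes `x != x` a portable NaN test, which is what functions like `math.isnan` build on.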
Why it matters
Nearly every program deals with floating-point numbers. Physics simulations, financial calculations, graphics, machine learning — they all use floats. Knowing that floats are approximations, not exact values, prevents real bugs. Never compare floats for exact equality. Never use floats for money (use integer cents instead). Know your precision budget: f32 for graphics, f64 for science.
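The money advice above is easy to check: summing a dime ten times with floats drifts off one dollar, while integer cents stay exact.

```python
total = sum(0.1 for _ in range(10))
print(total)         # 0.9999999999999999
print(total == 1.0)  # False -- ten dimes fail to make a dollar

cents = sum(10 for _ in range(10))
print(cents)         # 100 -- integer arithmetic is exact
```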
See How Binary Works for the full picture of how numbers — integers and decimals — are encoded in binary.