A collection of bonus content and elaborations on my recent RustWeek talk
Rose Peck
2025-05-13
I recently had the pleasure of giving a talk at RustWeek on Floating Point Hashing.1
(The talk was recorded, but, at the time of writing, the recording had yet to be posted. Update: you can find the stream recording here!)
In it, I went into detail about all the reasons why hashing a floating point value is difficult, unintuitive, highly error prone, and a bad idea all around.
Then I did it anyway!
Wow! Did it work?
Thanks for asking, Crane! No! It did not :)
But I still had a lot of fun talking about it, and I have no issue making a slight fool of myself on stage to prove a point. That being said, there were a handful of things that I wanted to put in the talk, but ultimately left out (either for time, or to keep the talk flowing smoothly). But, I still wanted to share them somewhere, so please enjoy the following miscellaneous addenda.
In the talk, I give the following example to show some unintuitive behavior with float math:
let a: f32 = 0.1 + 0.2;
println!("{}", a);
Which outputs:
0.3
This is very odd, but for a non-obvious reason.
0.1 + 0.2 is actually a classic example of floating point being wrong.
In the talk, I explain that we see this seemingly correct behavior because we're using f32, not f64, and the classic 0.30000000000000004 error only shows up with f64.
This is true, but it glosses over an odd detail: the f32 version is still wrong, but in a more subtle way. The exact value of the f32 version of the expression is 0.300000011920928955078125, not 0.3.
However, we don't see this in the console because Rust's Display impl rounds: it prints the shortest decimal string that still round-trips back to the same float.
In order to show the full precision, you have to write this:
let a: f32 = 0.1 + 0.2;
println!("{:.55}", a);
0.3000000119209289550781250000000000000000000000000000000
If you play with this a little more, you can see that 0.1 and 0.2 are also not stored precisely, although 0.5 is:2
println!("{:.55}", 0.1);
println!("{:.55}", 0.2);
println!("{:.55}", 0.5);
0.1000000000000000055511151231257827021181583404541015625
0.2000000000000000111022302462515654042363166809082031250
0.5000000000000000000000000000000000000000000000000000000
It's just another layer of strangeness that you have to peel back, and another way that you can easily think you're hashing one thing, but are actually hashing another. Not only can you not trust your math, sometimes you can't even trust your console.
This is one of the edge cases that I briefly mentioned, but didn't have time to talk about. The float inaccuracy caused by transcendental functions is bizarre, fascinating, and (from what I've found) not well covered elsewhere, so I wanted to share it here.
What's a transcendental function? Not that I don't know, of course. Just, y'know, for anyone reading who might not know.
Good question!
If you're anything like me, then when you first heard this term, you were so intimidated by it that you didn't even try googling it to find out what it means.
Luckily for us, it's not too complicated to wrap your head around.
For our purposes today, a transcendental function is any function which isn't a "normal polynomial" (or a ratio of polynomials).3
So, things like sin(x), ln(x), e^x, etc.
Have you ever thought about how computers actually calculate these? There are several ways, but the short answer is that they're usually4 approximated using a Taylor series5.
Approximation you say? I have a bad feeling about this...
Indeed. A Taylor series lets you approximate the value of any infinitely differentiable6 function to arbitrary precision, using only a finite number of polynomial terms. However, this polynomial approximation has three very important quirks:
- It's still an approximation: with only a finite number of terms, there is always some error.7
- How much error there is depends on how many terms the implementation keeps (and on the input value).
- The standard doesn't pin down exactly how these functions must be implemented, so different platforms ship different approximations.
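To make that a bit more concrete, here's a tiny sketch (my own illustration, not how real math libraries do it) of approximating sin(x) with the first few terms of its Taylor series and comparing the result against the standard library:

fn taylor_sin(x: f64, terms: u32) -> f64 {
    // sin(x) = x - x^3/3! + x^5/5! - ...
    let mut sum = 0.0;
    let mut term = x; // first term: x
    for n in 0..terms {
        sum += term;
        // next term: multiply by -x^2 / ((2n+2)(2n+3))
        let k = (2 * n + 2) as f64;
        term *= -x * x / (k * (k + 1.0));
    }
    sum
}

fn main() {
    let x = 1.0_f64;
    println!("{}", taylor_sin(x, 3));  // 0.8416666666666667 (slightly off)
    println!("{}", taylor_sin(x, 10)); // much closer
    println!("{}", x.sin());           // 0.8414709848078965
}

Real implementations are far more sophisticated (and accurate) than this, but the underlying idea is the same: a finite polynomial standing in for an infinite one.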
This means that, if your program ever calls one of these functions, there is an unknown amount of approximation error that can be introduced, and the amount of error will actually be platform-specific, since different platforms will have different software implementations of the approximating series. Tying this back to hashing, this means that the exact value that you're trying to hash can end up varying non-trivially based on what OS8 the program was compiled for.
That's neat and all, but this is such a specific edge case. There's no way someone would hit this in the real world, right?
I'm so sorry.
:(
This actually happened.
A user submitted an issue to Bevy, an open-source game engine written in Rust, complaining that some recently added color space tests were failing on their machine, despite the tests passing in CI and passing on the machines of the original devs who did the color space work.
As was foreshadowed, transcendental function approximation was the culprit.
The color math used several transcendental functions (powi, powf, sin, cos, and cbrt, just to name a few) in its calculations, and this math was being evaluated slightly differently on different platforms.
Then, the tests were checking color results using exact float equality, leading to test failures, but only sometimes.
Bevy's eventual solution to this is fairly clever.
Inside of their math crate, they provide re-exports for all the transcendental float math operations that are present in the standard library.
Based on whether users compile with the libm feature, they can choose between either the std implementations (which vary by platform) or the libm implementations (which are consistent across platforms).
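Roughly speaking (this is my own sketch of the pattern, not Bevy's actual code), a feature-gated re-export like that could look something like this:

// With the feature enabled, route through the libm crate for
// bit-identical results everywhere.
#[cfg(feature = "libm")]
pub fn sin(x: f32) -> f32 {
    libm::sinf(x)
}

// Otherwise, fall back to the standard library, which may differ
// slightly from platform to platform.
#[cfg(not(feature = "libm"))]
pub fn sin(x: f32) -> f32 {
    x.sin()
}

Callers then use the re-exported sin everywhere instead of calling f32::sin directly.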
They even have a lint for it!
Good stuff!
(This section will probably not make much sense if you have not seen the original talk.)
Here's a question: knowing what you know now, would you change the hash impl that you wrote?
Yes, but primarily in the sense that I wouldn't write one in the first place. I mentioned this in the talk, but I'll say it again: I don't think the current impl is bad. Given the knowledge we have of the problem domain (and the testing that I did, both on my own and later in production with early adopters), I'm pretty confident that it gets correct answers, despite all of floating point's quirks. The main mistake I made at the time was a lack of proper justification, not a lack of proper code.
That all said though, no, I would not do this again. This is partially because, even now, I'm still not confident that I can cover every possible edge case, but the larger reason is that there is a better way: integer quantization.9
The strategy here is pretty straightforward. Instead of storing the point coordinates using float values, we use integer coordinates on a regular grid:
- With i16 coordinates, each grid cell would have a side length equal to 1/(2^16)th of the length of the AABB
- For most domains, getting 2^16 divisions on your space is enough precision for what you need to do, and if it isn't, you can always jump up to i32 or i64.
Then, you convert back to floating point once the math is done.
This way, you can get nice, consistent, hashable values that also maintain enough (fixed) precision for what you need to do.
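For illustration, a one-axis version of that quantization might look something like this (the bounds and names here are made up for the example, not code from the talk):

// Map a float coordinate in [min, max] onto an i16 grid, and back.
fn quantize(value: f32, min: f32, max: f32) -> i16 {
    let t = (value - min) / (max - min);             // normalized to 0.0..=1.0
    let scaled = t * (u16::MAX as f32);              // spread over 2^16 cells
    (scaled.round() as i32 + i16::MIN as i32) as i16
}

fn dequantize(cell: i16, min: f32, max: f32) -> f32 {
    let t = (cell as i32 - i16::MIN as i32) as f32 / (u16::MAX as f32);
    min + t * (max - min)
}

fn main() {
    let (min, max) = (-10.0, 10.0);
    let cell = quantize(0.1 + 0.2, min, max);
    // Integers hash and compare consistently, no matter which platform
    // produced them; we only go back to float once the math is done.
    println!("cell = {}, as float = {}", cell, dequantize(cell, min, max));
}

The real thing would be three-dimensional and scaled to the actual AABB, but the core idea is the same: do the hashing (and the sensitive comparisons) in integer space, and only convert back to floats at the end.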
After all this, it would be easy to assume that I hate floating point, but I really don't.
But when I say this, I say this all full of admiration for the jungle. It is not that I hate it, I love it. I love it very much. But I love it against my better judgment.
- Werner Herzog
A key thing to keep in mind is that floating point was designed for working with real-world engineering data. Data which is collected by humans, which has error, for which the outputs can only be meaningfully used to a certain precision, and so on. And, with this in mind, a lot of its design decisions actually make a lot of sense:
- NaN has a ton of weird inconsistencies, but real measurements don't bring in NaNs, and having error handling encoded into the format is an interesting and non-terrible11 idea

And, after going through the exercise of the original talk, we can see this stuff in practice. Yes, hashing floats is a bad idea, and you definitely should not do it, but when you're working in a real-world engineering context, most of the quirks just kinda come out in the wash.
And that really is the beauty of it. Looking through floating point, picking it apart, seeing its mystic face and its ugly scars, you can really feel the tradeoffs being made. She's not perfect, but she's imperfect because we're imperfect, and I think there's something poetic in that.
The proof of this fact is left as an exercise for the reader. Hint: remember that floating point always multiplies the mantissa by a power of 2. ↩
The actual definition is a bit more subtle, but this simplification (read: lie) will get us the right idea. ↩
I think. I don't work this low-level, so I'm not 100% sure on this. Feel free to write in and correct me! ↩
I solemnly swear that this is not a math blog. I solemnly swear that this is not a math blog. I solemnly swear... ↩
I SOLEMNLY SWEAR THAT THIS IS NOT A MATH BLOG. ↩
As the number of polynomial terms approaches infinity, the polynomial approximation approaches the original function. ↩
And, confusingly enough, it will not vary based on the architecture. x86_64-apple-darwin and aarch64-apple-darwin will get the same answer, but x86_64-apple-darwin and x86_64-unknown-linux-gnu will not. ↩
This was originally suggested to me by my coworker, Camillo Talero, when I first gave this talk internally at work. He very politely asked why I didn't use integer quantization to solve the problem, to which I, with mild embarrassment, replied that I hadn't thought of it at the time. ↩
If I were to actually do this, I would probably shift/scale the integer grid to align with the uniform voxels as well, so that we can make sure that those always line up exactly in the end. Details details. ↩
I don't personally like the inclusion of NaN, but I also don't consider myself knowledgeable enough to claim that it was a bad design decision. ↩