Comparing Floating Point Numbers in C/C++
Comparing floating point numbers for equality can be problematic. It’s difficult because we are often comparing small or large numbers that are not represented exactly. There are also rounding errors caused by not being able to represent an exact value.
Rather than doing a strict value comparison ( == ), we treat two values as equal if their values are very close to each other. So what does “very close” mean? To answer that, we have to take a look at how the numbers are represented in memory.
Here is the layout of a 32-bit float: 1 sign bit, 8 exponent bits, and 23 fraction bits.
And here is the layout of a double precision (64-bit) float: 1 sign bit, 11 exponent bits, and 52 fraction bits.
The number of bits in the fraction can be thought of as the number of significant bits (or accuracy) of the number. We do not want to use all of the fraction bits, otherwise we would be doing a strict comparison, but we will use most. For both float sizes I will use 4 fewer significant bits: 19 of the 23 fraction bits for a 32-bit float, and 48 of the 52 for a 64-bit double.
We will use this to calculate the epsilon (the small difference that is still considered equal). However, we have to be careful that the number of bits we use to calculate the epsilon is based on the lowest-precision value in the comparison. For example, if we compare a 32-bit float with a 64-bit float, we must use only the precision of the 32-bit float.
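The idea can be sketched roughly as follows. The names `epsilon_for` and `nearly_equal` are hypothetical, and scaling the epsilon by the larger magnitude is an assumption of this sketch; the text above only fixes how many fraction bits are used.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

// Relative epsilon that ignores the lowest `ignored_bits` fraction bits of T.
// std::numeric_limits<T>::digits counts the implicit leading bit, so the
// stored fraction has digits - 1 bits (23 for float, 52 for double).
template <typename T>
double epsilon_for(int ignored_bits = 4) {
    const int fraction_bits = std::numeric_limits<T>::digits - 1;
    return std::ldexp(1.0, -(fraction_bits - ignored_bits));
}

// Compare a float with a double using the precision of the narrower type.
// Assumption: the epsilon is applied relative to the larger magnitude.
inline bool nearly_equal(float a, double b) {
    const double eps = std::max(epsilon_for<float>(), epsilon_for<double>());
    const double scale = std::max(std::fabs(static_cast<double>(a)), std::fabs(b));
    return std::fabs(a - b) <= eps * scale;
}
```

With these definitions, `nearly_equal(0.1f, 0.1)` holds even though the float and the double store slightly different approximations of 0.1.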
What is the most effective way for float and double comparison?
What would be the most efficient way to compare two double or two float values?
Simply doing this is not correct:
But something like:
Seems to waste processing.
Does anyone know a smarter float comparer?
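The question's original snippets are not preserved here. As a rough sketch of the two approaches it describes, assuming hypothetical function names and a caller-supplied `epsilon`:

```cpp
// Strict comparison: true only when both operands hold exactly the same value.
inline bool compare_strict(double a, double b) {
    return a == b;
}

// Absolute-tolerance comparison written as two one-sided checks.
inline bool compare_abs(double a, double b, double epsilon) {
    const double diff = a - b;
    return (diff < epsilon) && (-diff < epsilon);
}
```

On a typical IEEE 754 platform, `compare_strict(0.1 + 0.2, 0.3)` is false because the sum rounds to a neighbouring double, while `compare_abs(0.1 + 0.2, 0.3, 1e-9)` is true.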
Be extremely careful using any of the other suggestions. It all depends on context.
I have spent a long time tracing bugs in a system that presumed a==b if |a-b|<epsilon. The underlying problems were:
- The implicit presumption in an algorithm that if a==b and b==c then a==c.
- Using the same epsilon for lines measured in inches and lines measured in mils (.001 inch). That is, a==b but 1000a!=1000b. (This is why AlmostEqual2sComplement asks for the epsilon or max ULPS.)
- The use of the same epsilon for both the cosine of angles and the length of lines!
- Using such a compare function to sort items in a collection. (In this case using the builtin C++ operator == for doubles produced correct results.)
Like I said: it all depends on context and the expected size of a and b .
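The first two problems are easy to reproduce. The helper below is a hypothetical fixed-epsilon comparison, written only to illustrate them:

```cpp
#include <cmath>

// Fixed absolute-epsilon comparison of the kind described above.
inline bool close_abs(double a, double b, double epsilon) {
    return std::fabs(a - b) < epsilon;
}
```

With an epsilon of 1.0, `close_abs(0.0, 0.75, 1.0)` and `close_abs(0.75, 1.5, 1.0)` both hold, yet `close_abs(0.0, 1.5, 1.0)` does not, so the relation is not transitive. Likewise, 1.0 and 1.0005 are "equal" under an epsilon of 0.001, but 1000.0 and 1000.5 are not.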
By the way, std::numeric_limits<double>::epsilon() is the "machine epsilon". It is the difference between 1.0 and the next value representable by a double. I guess that it could be used in the compare function, but only if the expected values are less than 1. (This is in response to @cdv’s answer.)
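A small sketch makes the definition concrete. `gap_above` is a hypothetical helper; `std::nextafter` and `std::numeric_limits` are standard library facilities:

```cpp
#include <cmath>
#include <limits>

// Gap between x and the next representable double above it.
inline double gap_above(double x) {
    return std::nextafter(x, std::numeric_limits<double>::infinity()) - x;
}
```

`gap_above(1.0)` equals `std::numeric_limits<double>::epsilon()`. The gap is larger for bigger values such as 1e9 and smaller for values below 1, which is why machine epsilon only makes sense as a tolerance near that magnitude.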
Also, if you basically have int arithmetic in doubles (here we use doubles to hold int values in certain cases) your arithmetic will be correct. For example 4.0/2.0 will be the same as 1.0+1.0 . This is as long as you do not do things that result in fractions ( 4.0/3.0 ) or do not go outside of the size of an int.
2^103 is an epsilon for some values; or is this referring to a minimum epsilon?
The comparison with an epsilon value is what most people do (even in game programming).
You should change your implementation a little though:
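The answer's snippet is missing here. A common form of the epsilon comparison looks like the sketch below; the name `are_same` and the value of `kEpsilon` are assumptions for illustration, not the author's code:

```cpp
#include <cmath>

// Hypothetical tolerance; the right value depends on the application.
constexpr double kEpsilon = 1e-9;

// Treat a and b as equal when their absolute difference is below kEpsilon.
inline bool are_same(double a, double b) {
    return std::fabs(a - b) < kEpsilon;
}
```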
Edit: Christer has added a stack of great info on this topic on a recent blog post. Enjoy.
Comparing floating point numbers depends on the context. Since even changing the order of operations can produce different results, it is important to know how "equal" you want the numbers to be.
Comparing floating point numbers by Bruce Dawson is a good place to start when looking at floating point comparison.
Of course, choosing epsilon depends on the context, and determines how equal you want the numbers to be.
Another method of comparing floating point numbers is to look at the ULP (units in last place) of the numbers. While not dealing specifically with comparisons, the paper What Every Computer Scientist Should Know About Floating-Point Arithmetic is a good resource for understanding how floating point works and what the pitfalls are, including what ULP is.
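As a minimal sketch of a ULP-based comparison (not the code from any of the articles mentioned here), assuming 32-bit IEEE 754 floats and hypothetical function names:

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "assumes 32-bit float");

// Map a float's bit pattern to an integer that increases monotonically with
// the float's value, so adjacent floats map to adjacent integers.
inline std::int64_t ordered_bits(float x) {
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);  // well-defined way to read the bits
    const std::int64_t magnitude = bits & 0x7FFFFFFFu;
    return (bits & 0x80000000u) ? -magnitude : magnitude;
}

// True when a and b are at most max_ulps representable floats apart.
inline bool almost_equal_ulps(float a, float b, std::int64_t max_ulps) {
    if (std::isnan(a) || std::isnan(b)) return false;  // NaN equals nothing
    const std::int64_t distance = ordered_bits(a) - ordered_bits(b);
    return (distance < 0 ? -distance : distance) <= max_ulps;
}
```

In this sketch +0.0f and -0.0f are zero ULPs apart, and a value is one ULP away from its immediate neighbour.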
I found that the Google C++ Testing Framework contains a nice cross-platform template-based implementation of AlmostEqual2sComplement which works on both doubles and floats. Given that it is released under the BSD license, using it in your own code should be no problem, as long as you retain the license. I extracted the below code from https://github.com/google/googletest/blob/master/googletest/include/gtest/internal/gtest-internal.h (previously hosted at http://code.google.com/p/googletest/source/browse/trunk/include/gtest/internal/gtest-internal.h) and added the license on top.
Be sure to #define GTEST_OS_WINDOWS to some value (or to change the code where it’s used to something that fits your codebase — it’s BSD licensed after all).
EDIT: This post is 4 years old. It’s probably still valid, and the code is nice, but some people found improvements. Best go get the latest version of AlmostEquals right from the Google Test source code, and not the one I pasted up here.
For a more in depth approach read Comparing floating point numbers. Here is the code snippet from that link:
I realize this is an old thread, but this article is one of the most straightforward ones I have found on comparing floating point numbers. If you want to explore more, it has more detailed references as well, and the main site covers a complete range of issues dealing with floating point numbers: The Floating-Point Guide: Comparison.
A somewhat more practical article is Floating-point tolerances revisited, which notes that there is an absolute tolerance test, which boils down to this in C++:
and relative tolerance test:
The article notes that the absolute test fails when x and y are large, and the relative test fails when they are small. Assuming the absolute and relative tolerance are the same, a combined test would look like this:
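The three tests can be sketched as follows. The function names and the single `tol` parameter are choices made for this sketch:

```cpp
#include <algorithm>
#include <cmath>

// Absolute tolerance test.
inline bool abs_close(double x, double y, double tol) {
    return std::fabs(x - y) <= tol;
}

// Relative tolerance test: the tolerance scales with the larger magnitude.
inline bool rel_close(double x, double y, double tol) {
    return std::fabs(x - y) <= tol * std::max(std::fabs(x), std::fabs(y));
}

// Combined test: behaves like the absolute test for magnitudes below 1
// and like the relative test for larger magnitudes.
inline bool combined_close(double x, double y, double tol) {
    return std::fabs(x - y) <= tol * std::max({1.0, std::fabs(x), std::fabs(y)});
}
```

For example, with a tolerance of 1e-9 the absolute test rejects 1e15 and 1e15 + 1 while the relative test accepts them, and the relative test rejects 1e-12 and 2e-12 while the absolute test accepts them. The combined test accepts both pairs.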
I ended up spending quite some time going through material in this great thread. I doubt everyone wants to spend so much time so I would highlight the summary of what I learned and the solution I implemented.
Quick Summary
- Is 1e-8 approximately the same as 1e-16? If you are looking at noisy sensor data then probably yes, but if you are doing molecular simulation then maybe not! Bottom line: you always need to think of the tolerance value in the context of the specific function call, and not make it a generic app-wide hard-coded constant.
- For general library functions, it’s still nice to have a parameter with a default tolerance. A typical choice is numeric_limits<float>::epsilon(), which is the same as FLT_EPSILON in float.h. This is problematic, however, because the epsilon for comparing values like 1.0 is not the same as the epsilon for values like 1E9. FLT_EPSILON is defined for 1.0.
- The obvious implementation to check whether a number is within tolerance is fabs(a-b) <= epsilon. However, this doesn’t work because the default epsilon is defined for 1.0. We need to scale epsilon up or down in terms of a and b.
- There are two solutions to this problem: either you set epsilon proportional to max(a,b), or you get the next representable numbers around a and then see if b falls into that range. The former is called the "relative" method and the latter is called the ULP method.
- Both methods actually fail anyway when comparing with 0. In this case, the application must supply the correct tolerance.
Utility Functions Implementation (C++11)
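The answer's original utility code is not preserved here. The following is a minimal sketch of the relative and ULP methods from the summary, with hypothetical function names and default arguments:

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

// Relative method: scale the tolerance by the larger magnitude.
template <typename T>
bool approximately_equal(T a, T b,
                         T tolerance = std::numeric_limits<T>::epsilon()) {
    return std::fabs(a - b) <= tolerance * std::max(std::fabs(a), std::fabs(b));
}

// Comparing against zero needs an absolute tolerance chosen by the caller.
template <typename T>
bool approximately_zero(T a, T tolerance) {
    return std::fabs(a) <= tolerance;
}

// ULP method: is b within `steps` representable values of a?
template <typename T>
bool within_ulps(T a, T b, unsigned steps = 1) {
    T lo = a, hi = a;
    for (unsigned i = 0; i < steps; ++i) {
        lo = std::nextafter(lo, std::numeric_limits<T>::lowest());
        hi = std::nextafter(hi, std::numeric_limits<T>::max());
    }
    return lo <= b && b <= hi;
}
```

Note that `approximately_equal(0.0, 1e-300)` is false in this sketch, which is the zero problem the last bullet describes; `approximately_zero` exists for that case.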
The portable way to get epsilon in C++ is
Then the comparison function becomes
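The two snippets are missing from this answer. A sketch of what it describes, assuming a hypothetical function name, could be:

```cpp
#include <cmath>
#include <limits>

// Absolute comparison using the machine epsilon of T as the tolerance.
template <typename T>
bool are_same_eps(T a, T b) {
    return std::fabs(a - b) < std::numeric_limits<T>::epsilon();
}
```

Because machine epsilon is the spacing of values near 1.0, this form treats even adjacent doubles near 1e9 as different, so it is only meaningful for operands of roughly unit magnitude.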
The code you wrote has a bug:
The correct code would be:
(... and yes, this is different)
I wonder if fabs would make you lose lazy evaluation in some cases. I would say it depends on the compiler. You might want to try both. If they are equivalent on average, take the implementation with fabs.
If you have some information about which of the two floats is more likely to be bigger than the other, you can play with the order of the comparison to take better advantage of lazy evaluation.
Finally you might get better result by inlining this function. Not likely to improve much though.
Edit: OJ, thanks for correcting your code. I erased my comment accordingly
This is fine if:
- the order of magnitude of your inputs doesn’t change much
- very small numbers of opposite signs can be treated as equal
But otherwise it’ll lead you into trouble. Double precision numbers have a resolution of about 16 decimal places. If the two numbers you are comparing are larger in magnitude than EPSILON*1.0E16, then you might as well be saying:
I’ll examine a different approach that assumes you need to worry about the first issue and assumes the second is fine for your application. A solution would be something like:
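The answer's own code is not preserved, so the following is only one possible sketch of a magnitude-aware comparison, not the author's implementation. It assumes a hypothetical name and scales the tolerance by the binary exponent of the larger operand, while keeping a plain absolute tolerance for magnitudes below 1 so that tiny values of opposite sign still compare equal:

```cpp
#include <algorithm>
#include <cmath>

// Tolerance grows with the binary exponent of the larger operand, but never
// shrinks below `epsilon` itself for operands with magnitude under 1.
inline bool nearly_equal_scaled(double a, double b, double epsilon) {
    int exponent = 0;
    // frexp reports e such that the larger magnitude lies in [2^(e-1), 2^e).
    std::frexp(std::max(std::fabs(a), std::fabs(b)), &exponent);
    const double tolerance = std::ldexp(epsilon, std::max(exponent, 0));
    return std::fabs(a - b) <= tolerance;
}
```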
This is expensive computationally, but it is sometimes what is called for. This is what we have to do at my company because we deal with an engineering library and inputs can vary by a few dozen orders of magnitude.
Anyway, the point is this (and applies to practically every programming problem): Evaluate what your needs are, then come up with a solution to address your needs — don’t assume the easy answer will address your needs. If after your evaluation you find that fabs(a-b) < EPSILON will suffice, perfect — use it! But be aware of its shortcomings and other possible solutions too.
As others have pointed out, using a fixed-exponent epsilon (such as 0.0000001) will be useless for values away from the epsilon value. For example, if your two values are 10000.000977 and 10000, then there are NO 32-bit floating-point values between these two numbers — 10000 and 10000.000977 are as close as you can possibly get without being bit-for-bit identical. Here, an epsilon of less than 0.0009 is meaningless; you might as well use the straight equality operator.
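The spacing claim can be checked directly. `float_gap_above` is a hypothetical helper built on the standard `std::nextafter`, assuming 32-bit IEEE 754 floats:

```cpp
#include <cmath>
#include <limits>

// Distance from x to the next representable 32-bit float above it.
inline float float_gap_above(float x) {
    return std::nextafter(x, std::numeric_limits<float>::infinity()) - x;
}
```

`float_gap_above(10000.0f)` is 0.0009765625 (2^-10), so with an epsilon of 0.0000001 the test `fabs(a - b) < epsilon` can only succeed near 10000 when a and b are the same float.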
Likewise, as the two values approach epsilon in size, the relative error grows to 100%.
Thus, trying to mix a fixed point number such as 0.00001 with floating-point values (where the exponent is arbitrary) is a pointless exercise. This will only ever work if you can be assured that the operand values lie within a narrow domain (that is, close to some specific exponent), and if you properly select an epsilon value for that specific test. If you pull a number out of the air ("Hey! 0.00001 is small, so that must be good!"), you’re doomed to numerical errors. I’ve spent plenty of time debugging bad numerical code where some poor schmuck tosses in random epsilon values to make yet another test case work.
If you do numerical programming of any kind and believe you need to reach for fixed-point epsilons, READ BRUCE’S ARTICLE ON COMPARING FLOATING-POINT NUMBERS.
C# Tip – How to check if two double values are equal
If you’ve worked with variables whose data type is “double”, you may have seen a problem when you check if two doubles are equal.
The problem is the way that a double (a double-precision floating-point value) is stored. Doubles sometimes lose accuracy.
So, a variable you think holds “1” actually holds “0.9999999999987423” – or something like that.
Here are two extension methods you can use to compare two double variables, and see if they are “equal enough”.
These functions subtract the second value from the first, get the absolute value (converting negative differences to a positive number), and check if the difference between the two numbers is less than a value you consider to be acceptable for declaring the variables “equal”.
Here is how you would use it in your code:
The default value has always worked for the situations I’ve encountered, but you may see something different in your programs, and want to use your own level of accuracy.
What is the most effective way for float and double comparison in C/C++?
Here we will see how to compare two float values or two double values using C or C++. Floating point comparison is not like integer comparison.
To compare two float or double values, we have to take the precision into account. For example, if two numbers are 3.1428 and 3.1415, then they are the same up to the precision 0.01, but at a finer precision such as 0.001 they are not the same.
To compare using this criterion, we find the absolute value of the difference between the two numbers, then check whether the result is less than the precision value. From this we can decide whether they are equivalent or not.
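A minimal sketch of this check, using a hypothetical function name:

```cpp
#include <cmath>

// True when a and b differ by less than the given precision.
inline bool equal_within(double a, double b, double precision) {
    return std::fabs(a - b) < precision;
}
```

With the numbers from the example, `equal_within(3.1428, 3.1415, 0.01)` is true and `equal_within(3.1428, 3.1415, 0.001)` is false, since the difference is about 0.0013.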