Tuesday, 21 August 2012

What kind of data type you should use?


Cited from http://www.programmersheaven.com/user/pheaven/blog/191-Float-Double-and-Decimal-do-you-know-the-differences/

Float, Double and Decimal: do you know the differences?


There are a number of data types available in the .Net framework for storing numbers with fractional parts. They are each appropriate for different situations, and using the wrong one can lead to errors in calculations.

Thinking About Accuracy

In some applications, you require that calculations involving numbers with fractional parts are exact. Examples of this are in financial applications, where losing the odd cent here and there in a calculation is unacceptable. Customers expect their accounts to be completely accurate, not to mention the tax man.

In other applications, you don't care about exact results, but you are interested in having an answer that is correct to a certain number of significant digits. This is the case in experimental physics, for example. When you make a measurement of something, you can only measure it as accurately as your equipment will allow you. Therefore, you have an inherent error in that value already, which is going to cascade through any calculations you do with it. This means that while your computed answer may be 5.1826, you may know that the values this answer was calculated from were only accurate to three significant digits, and so the last two digits in this result (the 0.0026) don't matter. If there is a computational error in them, so be it - it's not going to hurt us.

Number Representations

There is more than one way to represent a number with fractional parts. Decimal represents the number as an integer, then also stores an integer power of ten (the exponent) that states where the decimal point is. The .Net decimal type allows you to shift it between 0 and 28 places to the left, so the most fractional places you can have is 28. Since both the value and the position of the point are represented as integer types, and because (with enough bits) we can represent any integer value exactly in binary, we can store the number exactly.

Single and Double (also known as float and double, depending on your language) work differently. Glossing over a few details (but if you want them, look up the IEEE 754 standard), they are stored with both a mantissa (which specifies the digits making up the number) and an exponent, which specifies where the binary point goes. That is, we're working in base two for everything. Further, the mantissa is normalized, so your number is always stored as something of the form 1.10010101 (with only a single bit to the left of the point) followed by binary columns that represent a half, a quarter, and eighth and so forth.

The important thing to realize here is that many decimal numbers are not exactly representable in binary. Consider decimal 0.3, for example. If we try to represent it as the sum of powers of two, we end up with a sequence like 1/4 + 1/32 + 1/64 + 1/512 + ... - in fact, we never actually get to an exact value. We'd need infinitely many bits. This is not unique to binary; in decimal, for example, we can not specify 1/3 exactly. However, it does mean that we have the potential to lose data.

Range

The decimal data type uses 96 bits to store the number itself. There is a special bit for storing the sign, meaning that you get a range of -79,228,162,514,264,337,593,543,950,335 to 79,228,162,514,264,337,593,543,950,335, if you have no fractional parts. This is 29 digits, meaning that you can move decimal point all the way to just after the initial digit. This is, with the maximum exponent you get a range of -7.9228162514264337593543950335 to 7.9228162514264337593543950335.

With floating point types (float and double), you store the mantissa in a normalized form, meaning you can shift the binary point in either direction to get really small or really large numbers. However, remember that unlike the decimal data type, you are not able to represent every value in the range. You get it accurate to a certain number of significant bits.

With a float, you can store numbers from -3.402823e38 to 3.402823e38, where e38 means "10 raised to the power of 38" - a bit of a wider range than with decimal. However, also interesting is the smallest positive or smallest negative number you can represent, which is around 1.4e-45 - really very tiny.

With a double, the range is -1.79769313486232e308 to 1.79769313486232e308, and the smallest is around 5e-324! While the range is greater, the main advantage of the double type isn't so much the extra range of values - due to a small increase in exponent size - but a big increase in precision by having a much larger mantissa. That is, you get a lot more significant digits.

Size In Memory

A float (Single) takes 32 bits of memory, a double takes 64 bits of memory and a decimal takes 128 bits of memory. Note that the reason a decimal is so much larger is because it can store a huge number of significant digits, giving a great deal of accuracy/precision. Note that a 32-bit float can only represent as many values as a 32-bit integer; you are trading in precision to get an increased range. It's worth noting that about half of the values you can represent with a floating point number are between -1.0 and 1.0.

Performance

In any modern PC, your CPU will have a dedicated Floating Point Unit, which is a chunk of hardware that performs operations on floating point numbers (floats and doubles). However, in embedded environments you may not have an FPU available. If you do have an FPU, operations on floats and doubles will be fast (and the non-floating point execution units can also be used to do other integer computation in parallel, thanks to in-hardware optimization). If you don't and it is being emulated in software, it will be much slower.

The exact performance differences in operations on floats and doubles is highly platform and application specific. We can certainly say that floats require double the memory, and thus twice as much CPU cache space, but how much that actually impacts performance is specific to how your application uses memory. Due to the fact that cache lines store multiple words anyway, there's not an obvious answer.

The decimal type does integer operations. Normally that would be faster than floating point ones, apart from there isn't dedicated hardware to deal with the 96-bit values, not to mention handling differences in the exponent, so it can wind up being much slower. If you didn't have an FPU, fixed point would likely beat floating point. It's the hardware support that makes the big difference here.

Conclusions

Always remember that floating point numbers can't represent every value in the range they cover, while decimal numbers can (though the range is somewhat smaller).

The inability of floating point numbers to represent every possible value in the range they cover can have important consequences in application design. For example, adding a bunch of numbers smallest first to largest first can end up with a different result to if you had done the largest first and finished up adding on the smallest one. The reasons behind that are complex and for another post, but it's good to be aware that there are a lot of subtleties to worry about when working with floating point. The other big thing to avoid in most circumstances is comparing floating point values exactly, rather than checking that the difference between them doesn't lie in an acceptably small range.

No comments:

Post a Comment