Real to Float conversion with no loss of data - sql-server

I had a table with two columns that coordinates were stored in. These columns were of the REAL datatype, and I noticed that my application was only showing 5 decimals for the coordinates, and the positions were not accurate enough.
I decided to change the datatype to FLOAT so I could use more decimals. To my pleasant surprise, when I changed the column data type, the extra decimals suddenly appeared without me having to store all the coordinates again.
Can anyone tell me why this happens? What happens to the decimal precision of the REAL datatype? Isn't the data rounded or truncated when it is inserted? Why did the precision come back with no loss of data when I changed the datatype?
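A minimal repro of the scenario (the coordinate values here are made up):
CREATE TABLE #coords (lat real, lon real);
INSERT INTO #coords VALUES (5.9395772, 51.9851034);
SELECT lat, lon FROM #coords;    -- shows roughly 7 significant digits per value
ALTER TABLE #coords ALTER COLUMN lat float;
ALTER TABLE #coords ALTER COLUMN lon float;
SELECT lat, lon FROM #coords;    -- extra decimals suddenly appear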

You want to use a Decimal data-type.
Floating point values are stored as a significand and an exponent. This allows you to store representations of huge numbers in small amounts of memory. It also means that you don't always get exactly the number you're looking for, just something very, very close. This is why, when you compare floating point values, you compare them within a certain tolerance.
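For example, in T-SQL (a sketch; the tolerance of 1e-9 is an arbitrary illustrative choice):
DECLARE @tenth float = 0.1;       -- 0.1 has no exact binary representation
DECLARE @expected float = 0.3;
SELECT CASE WHEN @tenth * 3 = @expected THEN 'equal' ELSE 'not equal' END AS exact_compare,                -- 'not equal'
       CASE WHEN ABS(@tenth * 3 - @expected) < 1e-9 THEN 'equal' ELSE 'not equal' END AS tolerant_compare; -- 'equal'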
To my pleasant surprise, when I changed the column data type, the extra decimals suddenly appeared without me having to store all the coordinates again.
Be careful: this doesn't mean that the value that was filled in is the accurate value you're looking for. If you truncated your original calculation, you need to get those numbers again without cutting off any precision. The digits that appear when you convert from REAL to FLOAT aren't the rest of what you truncated; they are entirely new digits that show up when the stored REAL value is widened to the higher-precision FLOAT representation.
Here is a good thread that explains the difference in data-types in SQL:
Difference between numeric, float and decimal in SQL Server
Another helpful link:
Bad habits to kick : choosing the wrong data type

Related

Limit number of positions after decimal point in MariaDB

I am currently making a small MariaDB database and ran into the following problem:
I want to save a floating-point number with only 2 positions after the decimal point, but everything before the decimal point should be unaffected.
For example: 1.11; 56789.12; 9999.00; 999999999999.01 etc.
I have done some research and this is what I am using right now:
CREATE TABLE mytable (
mynumber DOUBLE(10, 2)
)
The problem with this solution is that I also have to limit the number of positions before the decimal point, which I don't want to do.
So is there a possibility to limit the number of positions after the decimal point without affecting the positions before the decimal point or is there a "default number" I can use for the positions before the decimal point?
Don't use (m,n) with FLOAT or DOUBLE. It does nothing useful; it only causes an extra rounding step.
DECIMAL(10,2) is possible; that will store numbers precisely (to 2 decimal places).
See also ROUND() and FORMAT() for controlling the rounding for specific values.
You had a mistake -- 999999999999.01 won't fit in DOUBLE(10,2), nor DECIMAL(10,2). It can handle only 8 (=10-2) digits to the left of the decimal point.
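For illustration, a small MariaDB sketch of the DECIMAL route and of rounding/formatting specific values, reusing the table and column names from the question (the sample value is made up):
CREATE TABLE mytable (
  mynumber DECIMAL(10,2)   -- exact, 2 decimal places, up to 8 digits before the point
);
INSERT INTO mytable VALUES (56789.126);                       -- stored as 56789.13 (rounded on insert)
SELECT ROUND(mynumber, 2), FORMAT(mynumber, 2) FROM mytable;  -- FORMAT returns a string with thousands separators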
You can create a trigger that intercepts INSERT and UPDATE statements and truncates their value to 2 decimal places. Note, however, that due to how floating point numbers work at machine level, the actual number may be different.
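A sketch of such triggers, using the table and column names from the question (MariaDB accepts a single SET statement as the trigger body; TRUNCATE() cuts off the extra decimals, use ROUND() if you prefer rounding):
CREATE TRIGGER mytable_trunc_ins BEFORE INSERT ON mytable
FOR EACH ROW SET NEW.mynumber = TRUNCATE(NEW.mynumber, 2);
CREATE TRIGGER mytable_trunc_upd BEFORE UPDATE ON mytable
FOR EACH ROW SET NEW.mynumber = TRUNCATE(NEW.mynumber, 2);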
Double precision numbers are accurate to roughly 15 significant figures, not to a certain number of decimal places. Realistically, you need to determine the biggest value you might ever want to store. Once you have done that, the DECIMAL type may be more appropriate for what you are trying to do.
See here for more details:
https://dev.mysql.com/doc/refman/8.0/en/precision-math-decimal-characteristics.html

How to round Double's to exact bit-representation?

I'm doing geography calculations, and ultimately end up with a latitude and longitude to store in a Geography::Point object.
Both latitude and longitude have at most 7 decimal places (which gives precision down to about 11 mm, which is plenty).
The problem is: if the value of a field cannot be represented exactly in a Double, MS SQL rounds it to the nearest value that can be, and that value shows up with a bunch of extra digits.
=> e.g. 5.9395772 is stored as 5.9395771999999996
The problem this creates is that [Position].ToString() then exceeds the maximum number of characters allowed for that column (and no, I can't increase that limit).
Since we're dealing with Latitude, Longitude, Altitude and Accuracy, there's space for exactly 11 characters for Latitude and Longitude each:
String.Format(CultureInfo.InvariantCulture, "{0:##0.0######}", num)
I've tried simply Math.Round()ing to 6 digits, but then other numbers (e.g. 6.098163 becoming 6.0981629999999996) run into the same problem.
How do I Math.Round towards the nearest 7-digit valid bit representation?
EDIT/ADD
Public Function ToString_LatLon(ByVal num As Double) As String
num = Math.Round(num, 7, MidpointRounding.AwayFromZero)
Return String.Format(CultureInfo.InvariantCulture, "{0:##0.0######}", num)
End Function 'IN = 5.9395772, OUT = 5.9395772
The above code receives a Double and correctly returns the String representation. I've checked it; this is correct even for the troubling numbers.
It's stored in SQL Server through the framework we use. I think the problem occurs when the value is stored.
When I retrieve the value, I get an error in VB, saying the value is wider than the framework allows (max of 50 characters).
If I run a query in SSMS, I find e.g. POINT (X.0981629999999996 XX.664725 NULL 15602.707) (51 characters, anonymized).
EDIT 2
I've done some more research and some calculations. It seems that the stored value 5.9395772 is converted to binary and returned as 5.9395771999999996, which is stored as a double inside the database (in a binary Geography::Point object, not to worry.) Convert the binary 0 10000000001 0111110000100010000010000110100010000100010011011101 back to decimal, and you get 5.93957719999999955717839839053340256214141845703125, but abbreviated at 16 decimals - whereas I would like it abbreviated at 7 decimals.
Solutions:
1. Round the value down/up to the nearest value where everything from the 8th decimal onward is 0 (or where there are enough zeroes before another nonzero digit is found).
2. Query for only so many decimals.
3. Query the actual (hexadecimal) value, and convert that (instead of the string representation).
4. Keep the string representation, but round the values before storing and after retrieving to the required number of decimals.
Discussions:
1. Both in the office and here (see #RobertBaron's answer): this is quite tricky, might cause a huge decrease in precision, and is basically a lot of work.
2. Perhaps this is possible, I don't know.
3. This would be the cleanest solution, as my colleagues and I agree, but it is a lot of work to develop and test.
4. Instead of caring about the value in memory being equal to the value in the database, we simply don't care (too much) about the value in the database.
In the end, after quite some whiteboard bit-calculations and a lengthy discussion, we've gone with option 4. After we retrieve the [Position].ToString() (for which we've increased the string limit) from the database, we convert it as we're already doing, and as an additional step before using it anywhere we round the value to the required number of decimals. When returning the value to the database, we once again round it to that number of decimals, and don't care what the database really does with it.
Essentially, this is option 2, but on the program side instead of the database side.
This is only a partial answer.
If by valid bit representation you mean exact bit representation, then this is possible. The decimal numbers that have exact bit representation are 1/2, 1/4, 3/4, 1/8, 3/8, 5/8, 7/8, 1/16, 3/16, ...
The challenge is to characterize, among these dyadic fractions, those whose base-10 representation has 7 digits or fewer, and then to round any base-10 number to the closest of these numbers.
I am posting this in the hope that it may get you one step further toward a solution.
If you cannot change the data type into a DECIMAL for whatever reasons, you have to cast it into a DECIMAL every time you need the value. It's that simple. And you can either do it on the SQL Server side or in VB.NET, but you need a DECIMAL. DOUBLEs are imprecise.
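For illustration, a minimal T-SQL sketch of the server-side cast; decimal(10,7) is an assumed precision chosen to give the 7 decimal places wanted here:
DECLARE @lat float = 5.9395772e0;
SELECT @lat                        AS raw_float,    -- may display with trailing binary noise
       CAST(@lat AS decimal(10,7)) AS as_decimal;   -- 5.9395772, exactly 7 decimal places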
By the way, it is not SQL Server that rounds towards the nearest number it recognizes by adding a bunch of digits - it's the processor that does it. That's also why you may get slightly different DOUBLE values after restoring your database on another server.
And never, ever even think of using them as an ID: I know an application that uses FLOAT values containing a timestamp (<creation day since whatever>.<time as a fraction of the day>) as part of the primary key (of nearly every table!). Every 10000th record or so cannot be addressed directly by its ID because the value differs between the client that sends the query and the server by some nanoseconds, although the number looks exactly the same in SSMS on the client and the server.

Convert float to nvarchar keeping decimal variable part

I have a problem converting float to string: T-SQL returns it in scientific format.
Example:
declare @amount float
set @amount=5223421341.23
print cast(@amount as nchar(20))
returns "5.22342e+009"
Well, I tried the STR function, but here is the problem: I don't know how many decimals the float could have, and I don't want to round it.
Is there a way to return the float as nchar with the same precision as the float was declared with?
Thanks a lot.
There is a problem with the decimal places. The code below shows that the decimal value is distorted, and I can't find a way around it. If there were a way to determine the precision of the float, then the result could be rounded to a correct value.
With an unknown precision, I have yet to find a solution.
Declare @Amount float
Set @Amount=5223421341.23
Select LTrim(RTrim(Str(@Amount, 1000, 1000)))
Produces : 5223421341.2299995000000000
Based on what I am reading, floats are only approximations by definition, so this may be the best you can get.
https://msdn.microsoft.com/en-us/library/ms173773.aspx
http://www.webopedia.com/TERM/F/floating_point_number.html
Note that most floating-point numbers a computer can represent are just approximations. One of the challenges in programming with floating-point values is ensuring that the approximations lead to reasonable results. If the programmer is not careful, small discrepancies in the approximations can snowball to the point where the final results become meaningless.
According to Microsoft, the best method in SQL is likely to use the STR function.
declare @amount float
set @amount=5223421341.23
print str(@amount,20,6)
print convert(VARCHAR(50),rtrim(ltrim(str(@amount,20,6))))
This seems like it would cover most scenarios, but you need to find out the max and min values in your data set and also the max precision.
According to Microsoft SQL Server documentation, (https://msdn.microsoft.com/en-us/library/ms173773.aspx) float is a synonym for float(53), and a float(53) has a precision of 15 digits.
But the 15 digits of precision say nothing about the exponent. The exponent can be around 300, which means that floats are capable of representing numbers containing about 300 zeros in them. That's why scientific notation is indispensable.
If scientific notation were not a problem for you, 21 characters would be enough to hold 15 digits with a period among them, plus a trailing five-character exponent such as e+300.
But since scientific notation is an issue for you, you have to have a text field large enough to hold numbers of the form
0.0000000000000000000000000000000...00000000000000000000000000723237435435644
(that's almost 300 zeros followed by 15 digits)
and
377733453453453000000000000000000...00000000000000000000000000000000000000.0
(that's 15 digits followed by almost 300 zeros.)
So clearly, you are going to need some pretty huge text fields for what you are trying to do.
So, bottom line is, instead of looking for a solution to the problem as stated, I would recommend revising what you are trying to accomplish here.

"Round half up" on floating point values

We are stuck with a database that (unfortunately) uses floats instead of decimal values. This makes rounding a bit difficult. Consider the following example (SQL Server T-SQL):
SELECT ROUND(6.925e0, 2) --> returns 6.92
ROUND does round half up, but since floating point numbers cannot accurately represent decimal numbers, the "wrong" result (from the point of view of the end-user) is displayed. I understand why this happens.
I already came up with two possible solutions (both returning a float, which is, unfortunately, also a requirement):
Convert to a decimal data type before rounding: SELECT CONVERT(float, ROUND(CONVERT(decimal(29,14), 6.925e0), 2))
Multiply until the third digit is on the left-hand side of the decimal point (i.e. accurately represented), and then do the rounding: SELECT ROUND(6.925e0 * 1000, -1) / 1000
Which one should I choose? Is there some better solution? (Unfortunately, we cannot change the field types in the database due to some legacy applications accessing the same DB.)
Is there a well-established best practice solution for this (common?) problem?
(Obviously, the common technique "rounding twice" will not help here since 6.925 is already rounded to three decimal places -- as far as this is possible in a float.)
Your first solution seems safer, and also seems like a conceptually closer fit to the problem: convert as soon as possible from float to decimal, do all relevant calculations within the decimal type, and then do a last minute conversion back to float before writing to the DB.
Edit: You'll likely still need to do an extra round (e.g. to 3 decimal places, or whatever's appropriate for your application) immediately after retrieving the float value and converting to decimal, to make sure that you end up with the decimal value that was actually intended. 6.925e0 converted to decimal would again be likely (assuming that the decimal format has > 16 digits of precision) to give something that's very close to, but not exactly equal to, 6.925; an extra round would take care of this.
The second solution doesn't look reliable to me: what if the stored value for 6.925e0 happens to be, due to the usual binary floating-point issues, a tiny amount too small? Then after multiplication by 1000, the result may still be a touch under 6925, so that the rounding step rounds down instead of up. If you know your value always has at most 3 digits after the point, you could fix this by doing an extra round after multiplying by 1000, something like ROUND(ROUND(x * 1000, 0), -1).
(Disclaimer: while I have plenty of experience dealing with float and decimal issues in other contexts, I know next to nothing about SQL.)
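Putting the two approaches next to each other in T-SQL (a sketch; the commented results are what these expressions return for this particular input):
DECLARE @x float = 6.925e0;   -- internally stored slightly below 6.925
-- Solution 1: round inside an exact decimal type, then convert back to float
SELECT CONVERT(float, ROUND(CONVERT(decimal(29,14), @x), 2)) AS via_decimal;   -- 6.93
-- Solution 2 with the extra inner round suggested above
SELECT ROUND(ROUND(@x * 1000, 0), -1) / 1000 AS via_scaling;                   -- 6.93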
Old question, but I am surprised that the normal practice is not mentioned here, so I just add it.
Normally, you would add a small amount that you know is much smaller than the accuracy of the numbers you are working with, e.g. like this:
SELECT ROUND(6.925e0 + 1e-7, 2)
Of course, the added amount must be larger than the representation error of the floating point type that is used.
Use an exact decimal format such as DECIMAL. That way you can leave it to the language to get it right (or wrong, as the case may be).
I managed to round the float column correctly using the following command:
SELECT CONVERT(float, ROUND(ROUND(CONVERT(decimal(38,14),float_column_name),3),2))

Is there a good reason for storing percentages that are less than 1 as numbers greater than 1?

I inherited a project that uses SQL Server 200x, wherein a column that stores a value that is always considered a percentage in the problem domain is stored as a number greater than 1 rather than as its decimal equivalent. For example, 70% (literally 0.7) is stored as 70, 100% as 100, etc. Aside from the need to remember to * 0.01 on retrieved values and * 100 before persisting values, it doesn't seem to be a problem in and of itself. It does make my head explode though... so is there a good reason for it that I'm missing? Are there compelling reasons to fix it, given that there is a fair amount of code written to work with the pseudo-percentages?
There are a few cases where greater than 100% occurs, but I don't see why the value wouldn't just be stored as 1.05, for example, in those cases.
EDIT: Head feeling better, and slightly smarter. Thanks for all the insights.
There are actually four good reasons I can think of that you might want to store—and calculate with—whole-number percentage values rather than floating-point equivalents:
Depending on the data types chosen, the integer value may take up less space.
Depending on the data type, the floating-point value may lose precision (remember that not all languages have a data type equivalent to SQL Server's decimal type).
If the value will be input from or output to the user very frequently, it may be more convenient to keep it in a more user-friendly format (decision between convert when you display and convert when you calculate ... but see the next point).
If the principal values are also integers, then
principal * integerPercentage / 100
which uses all-integer arithmetic, is usually faster than its floating-point equivalent (likely significantly faster in the case of a floating-point type equivalent to T-SQL's decimal type); see the sketch below.
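A minimal T-SQL sketch of that last point (the names and values are made up):
DECLARE @principal int = 25000;
DECLARE @pctWhole int = 70;                          -- stored as 70, meaning 70%
SELECT @principal * @pctWhole / 100 AS discounted;   -- 17500, all-integer arithmetic (note that integer division truncates)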
If it's a byte field then it takes up less room in the db than floating point numbers, but unless you have millions and millions of records, you'll hardly see a difference.
Since floating-point values can't be compared for equality, an integer may have been used to make the SQL simpler.
For example
(0.3==3*.1)
is usually False.
However
abs( 0.3 - 3*.1 )
is a tiny number (about 5.55e-17). But it's a pain to have to do everything with (column - SomeValue) BETWEEN -0.0001 AND 0.0001 or ABS(column - SomeValue) < 0.0001. You'd rather do column = SomeValue in your WHERE clause.
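For example (a sketch; the table and column names are made up):
SELECT * FROM Rates WHERE PercentWhole = 70;                  -- exact match on an integer column
SELECT * FROM Rates WHERE ABS(PercentFloat - 0.7) < 0.0001;   -- the float version needs a tolerance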
Floating point numbers are prone to rounding errors and, therefore, can act "funny" in comparisons. If you always want to deal with it as fixed decimal, you could either choose a decimal type, say decimal(5,2), or do the convert and store as int thing that your db does. I'd probably go the decimal route, even though the int would take up less space.
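For example, a sketch of the decimal route (the table and column names are made up):
CREATE TABLE Discounts (
    PercentValue decimal(5,2) NOT NULL   -- e.g. 70.00 for 70%, 105.00 for 105%
);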
A good guess is that anything you do with integers (storing, calculating, stuffing into an edit field for a user, etc.) is marginally easier and more efficient than doing the same with floating point numbers. And the rounding issues aren't so obvious when you look at the data.
If these are numbers that end users are likely to see and interact with, percentages are easier to understand than decimals.
This is one of those situations where a notation aid can help; in the program, be consistent in using a prefix (Hungarian) or postfix to specify values that are percentages vs. those that are decimal. If you can extend a naming convention to the database fields themselves, so much the better.
And to add to the data storage issue: if you can use integer arithmetic for whatever processing you are doing, the performance is much better than when doing floating point arithmetic... so storing the percentages as integer values may allow the processing logic to utilize integer arithmetic.
If you're actually using them as a coefficient (or expect users of the database to do this sort of thing in reports), there's a case for storing them as a coefficient - particularly if there's a reason to do calculations involving more than one.
However, if you do this you should be consistent - either all percentages or all coefficients.
