PostGIS accuracy of ST_DWithin results

I'm testing a function in a PostGIS spatial database: ST_DWithin. On edge cases I sometimes get true and sometimes false.
SELECT ST_DWithin(
    ST_GeomFromText('POINT(-90.01 30)', 4326),
    ST_GeomFromText('POINT(-90 30)', 4326),
    0.01
);
st_dwithin -> false
and
SELECT ST_DWithin(
    ST_GeomFromText('POINT(-90.1 30)', 4326),
    ST_GeomFromText('POINT(-90 30)', 4326),
    0.1
);
st_dwithin -> true
Shouldn't both be either true or false? Can anybody explain the results to me?

This is caused by the fact that the underlying computations use double precision rather than an exact numeric type. As a result, the computed distance between the points is accurate to only about 15 significant digits.
SELECT ST_Distance(ST_GeomFromText('POINT(-90.01 30)', 4326),
                   ST_GeomFromText('POINT(-90 30)', 4326)) AS d1,
       ST_Distance(ST_GeomFromText('POINT(-90.1 30)', 4326),
                   ST_GeomFromText('POINT(-90 30)', 4326)) AS d2;
==>
d1 | d2
--------------------+--------------------
0.0100000000000051 | 0.0999999999999943
(1 row)
We can see that both computed distances are slightly inexact. Moreover, a floating-point comparison should always take this limited precision into account; if it doesn't, we get unexpected results (like d1 > 0.01 and d2 < 0.1 here).
You can read the PostgreSQL documentation on floating-point types and search for material on floating-point comparison.
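If you need stable behavior at the boundary, one option is to build a small tolerance into the comparison. Here is a minimal sketch, assuming a tolerance of 1e-9 degrees chosen purely as an example:
SELECT ST_DWithin(
    ST_GeomFromText('POINT(-90.01 30)', 4326),
    ST_GeomFromText('POINT(-90 30)', 4326),
    0.01 + 1e-9   -- widen the threshold by a small epsilon
) AS within_with_tolerance,
ST_Distance(
    ST_GeomFromText('POINT(-90.01 30)', 4326),
    ST_GeomFromText('POINT(-90 30)', 4326)
) <= 0.01 + 1e-9 AS explicit_distance_check;  -- the equivalent comparison written out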

Related

SQL Server Decimal Operation is Not Accurate

When I run this simple operation in SQL Server:
SELECT 800.0 / 30.0
I get the value 26.666666, but even if it were rounded to 6 decimal places it should be 26.666667.
How can I get the calculation to be accurate? I searched online and found a solution that casts each operand to a high-precision decimal before the operation, but that is not convenient for me because I have many long, complex calculations. I think there must be a better solution.
When using division in SQL Server, any digits beyond the resulting scale are truncated, not rounded. For your expression you have a decimal(4,1) divided by a decimal(3,1), which results in a decimal(10,6):
Precision = p1 - s1 + s2 + max(6, s1 + p2 + 1)
Scale = max(6, s1 + p2 + 1)
As a result, 26.66666666666666~ is truncated to 26.666666.
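You can confirm the inferred result type of the division with SQL_VARIANT_PROPERTY; for 800.0 / 30.0 it should report a numeric with precision 10 and scale 6:
SELECT SQL_VARIANT_PROPERTY(800.0 / 30.0, 'BaseType')  AS base_type, -- numeric
       SQL_VARIANT_PROPERTY(800.0 / 30.0, 'Precision') AS prec,      -- 10
       SQL_VARIANT_PROPERTY(800.0 / 30.0, 'Scale')     AS scale;     -- 6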
You can get around this by increasing the precision and scale, and then using CONVERT to get back to your required precision and scale. For example, increase the precision and scale of the decimal(3,1) to decimal(5,3) and convert the result back to a decimal(10,6):
SELECT CONVERT(decimal(10,6),800.0 / CONVERT(decimal(5,3),30.0));
This returns 26.666667.
This might be helpful:
Use ROUND (Transact-SQL)
SELECT ROUND(800.0 /30.0, 5) AS RoundValue;
Result:
RoundValue
26.666670
I believe it's because SQL Server treats your numbers as decimal values (which are exact, e.g., 6.6666 and 6.6667 mean exactly those values, not 6 and two-thirds) rather than float values (which work with approximate numbers).
If you explicitly cast/convert to a float at the start, your calculations should run smoothly.
Here are some examples to demonstrate the difference between int, decimal, and float calculations:
Dividing 20 by 3
Dividing 20 by 3, then multiplying by 3 again (which mathematically should be 20).
SELECT (20/3) AS int_calc,
(20/3) * 3 AS int_calc_x3,
(CAST(20 AS decimal(10,3)) /3) AS dec_calc,
(CAST(20 AS decimal(10,3)) /3) * 3 AS dec_calc_x3,
(CAST(20 AS float) /3) AS float_calc,
(CAST(20 AS float) /3) * 3 AS float_calc_x3
with the following results
int_calc int_calc_x3 dec_calc dec_calc_x3 float_calc float_calc_x3
6 18 6.666666 19.999998 6.66666666666667 20
In your case, you can use
Select CAST(800.0 AS float) /30.0
which results in 26.6666666666667
Note that if you then multiply back by 30, you get the correct result, e.g.,
Select (CAST(800.0 AS float) /30.0) * 30
results in 800. Solutions based on decimals will not round-trip like this.
Note also that once you have it as a float, it stays a float until converted back to a decimal or an int somehow (e.g., saved in a table as an int). So...
SELECT A.Num / 30
FROM (Select ((CAST(800.0 AS float) /30.0) * 30) AS Num) AS A
will still result in 26.6666666666667
This will hopefully help you in your long complex calculations.

Why does a FLOAT give me a more accurate result than a DECIMAL?

I am looking for a division result that is extremely accurate.
This SQL returns the following results:
SELECT (CAST(297282.26 AS DECIMAL(38, 30)) / CAST(495470.44 AS DECIMAL(38, 30))) AS ResultDecimal
SELECT (CAST(297282.26 AS FLOAT) / CAST(495470.44 AS FLOAT)) AS ResultFloat
Here is the accurate result from WolframAlpha:
http://www.wolframalpha.com/input/?i=297282.26%2F495470.44
I was under the impression that DECIMAL would be more accurate than FLOAT:
"Because of the approximate nature of the float and real data types, do not use these data types when exact numeric behavior is required, such as in financial applications, in operations involving rounding, or in equality checks. Instead, use the integer, decimal, money, or smallmoney data types."
https://technet.microsoft.com/en-us/library/ms187912(v=sql.105).aspx
Why does the FLOAT calculation give me a result more accurate than when using DECIMAL?
I found the best precision to be when you use:
SELECT (CAST(297282.26 AS DECIMAL(15, 9)) / CAST(495470.44 AS DECIMAL(24, 2))) AS ResultDecimal
This gives a result of
0.599999991926864496699338915153
I think the actual value (to 100 digits) is:
0.5999999919268644966993389151530412187657451370862810705720405842980259326873264124495499670979362562...
Please bear in mind SQL Server defines the maximum precision and scale for division as:
max precision = (p1 - s1 + s2) + MAX(6, s1 + p2 + 1) -- up to 38
max scale = MAX(6, s1 + p2 + 1)
Where p1 & p2 are the precision of the two numbers and s1 & s2 are the scale of the numbers.
In this case the maximum precision is (15-9+2) + MAX(6, 9+24+1) = 8 + 34 = 42.
However, SQL Server only allows a maximum precision of 38, so the precision is capped at 38 and the scale is reduced by the same 4 places.
The maximum scale = MAX(6, 9+24+1) = 34, which after the reduction becomes 30, matching the 30 digits after the decimal point in the result above.
Hopefully you already understand that just because the FLOAT version presents more digits after the decimal point, that doesn't necessarily mean those digits are correct. This is about precision, not accuracy.
It is the CAST function itself that causes this loss of precision, not the difference between the FLOAT and DECIMAL data types.
To demonstrate this, compare your previous results to the result of this:
SELECT 297282.26 / 495470.44 AS ResultNoCast
In my version of the query, the presence of a decimal point in the literal numbers tells SQL Server to treat the values as the DECIMAL datatype, with precision and scale as determined by the server. The result is more precise than when you CAST explicitly to DECIMAL.
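To see how the server types those literals, you can inspect them with SQL_VARIANT_PROPERTY; 297282.26, for example, should come back as a numeric with precision 8 and scale 2:
SELECT SQL_VARIANT_PROPERTY(297282.26, 'BaseType')  AS base_type, -- numeric
       SQL_VARIANT_PROPERTY(297282.26, 'Precision') AS prec,      -- 8
       SQL_VARIANT_PROPERTY(297282.26, 'Scale')     AS scale;     -- 2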
A clue to the reason for this can be found hidden in the official documentation of the CAST function, under Truncating and Rounding Results:
When you convert data types that differ in decimal places, sometimes the result value is truncated and at other times it is rounded. The following table shows the behavior.
From | To | Behavior
numeric | numeric | Round
So the fact that each separate literal value is treated as a NUMERIC (the same thing as DECIMAL) on the way in, and is then cast to NUMERIC, causes rounding.
Anticipating your next question a little, if you want a more precise result from the NUMERIC/DECIMAL datatype, you just need to tell SQL Server that each component of the calculation is more precise:
SELECT 297282.26000000 / 495470.44000000 AS ResultSuperPrecise
This appears (from experimentation) to be the most precise I can get: either adding or removing a 0 from either the numerator or denominator makes the result less precise. I'm at a loss to explain why that is, because the result is only 23 digits to the right of the decimal point.
It doesn't give you a more accurate result. I say that because the value is approximate and not all values can be stored exactly in a float. On the other side of that coin, though, float has the possibility of a lot more precision. The maximum precision of a decimal/numeric is 38. https://msdn.microsoft.com/en-us/library/ms187746.aspx
For float the documented maximum is 53, though that is the number of mantissa bits (float(53)), which corresponds to roughly 15 significant decimal digits. https://msdn.microsoft.com/en-us/library/ms173773.aspx
Okay, here is what I think is going on.
@philosophicles - I think you are right in that the CAST is causing the problem, but not because I am trying to "convert data types that differ in decimal places".
When I execute the following statement
SELECT CAST((297282.26 / 495470.44) AS DECIMAL(38, 30)) AS ResultDecimal
The accurate result for the calculation (shown to 100 digits earlier in this thread) has way more than 30 digits after the decimal point, and my data type has its scale set to 30. So the CAST rounds the value, then just adds zeros to the end until there are 30 digits.
So the interesting question is: how does the CAST determine how many decimals to round or truncate the output to? I am not sure, but as @philosophicles pointed out, the scale of the input affects the rounding applied to the output.
SELECT CAST(((297282.26/10000) / (495470.44/10000)) AS DECIMAL(38, 30)) AS ResultDecimal
Thoughts?
Also interesting:
However, in simple terms, precision is lost when the input scales are high because the result scales need to be dropped to 38 with a matching precision drop.
https://dba.stackexchange.com/questions/41743/automatic-decimal-rounding-issue
The precision and scale of the numeric data types besides decimal are fixed.
https://dba.stackexchange.com/questions/41743/automatic-decimal-rounding-issue

How do I control the datatype of a computed column?

Using SQL Server 2012...
I have two columns:
Price [decimal(28,12)]
OustandingShares [decimal(38,3)] -- The 38 is overkill but alas, not my call.
When I do an ALTER TABLE I get a resulting computed column as a [decimal(38,6)]. I need the datatype to be [decimal(28,12)].
ALTER TABLE [xyz].MyTable
ADD Mv AS OustandingShares * Price
How can I effectively get 12 decimals of scale on this computed column? I've tried doing convert on the OutstandingShares to 12 decimal places as well as wrapping a convert around the OutstandingShares * Price. The only thing I get is a computed field at [decimal(28,12)] with six trailing zeros.
Thoughts?
The Fix
This does what you want:
CONVERT(DECIMAL(28,12), (
    CONVERT(DECIMAL(15, 3), [OustandingShares])
    * CONVERT(DECIMAL(24, 12), [Price])
))
Test with this:
SELECT CONVERT(DECIMAL(28,12),
(CONVERT(DECIMAL(24,12), 5304.987781883689)
* CONVERT(DECIMAL(15,3), 3510.88)));
Result:
18625175.503659806036
The Reason
The computation is being truncated due to SQL Server's rules for how to handle Precision and Scale across various operations. These rules are detailed in the MSDN page for Precision, Scale, and Length. The details we are interested in for this case are:
Operation: e1 * e2
Result precision: p1 + p2 + 1
Result scale *: s1 + s2
Here the datatypes in play are:
DECIMAL(28, 12)
DECIMAL(38, 3)
This should result in:
Precision = (28 + 38 + 1) = 67
Scale = 15
But the maximum precision of the DECIMAL type is 38. So what gives? We now need to notice that there is a footnote attached to the "Result scale" calculation:
* The result precision and scale have an absolute maximum of 38. When a result precision is greater than 38, the corresponding scale is reduced to prevent the integral part of a result from being truncated.
So it seems that in order to get the precision back down to 38, SQL Server chopped off 9 decimal places, taking the scale from 15 down to the 6 you are seeing.
And this is why my proposed fix works. I kept the "Scale" values the same, since we don't want to truncate going in, and expanding them serves no purpose because SQL Server will expand the scale as appropriate. The key is reducing the precision so that the truncation is non-existent, or at least minimal.
With DECIMAL(15, 3) and DECIMAL(24, 12) we should get:
Precision = (15 + 24 + 1) = 40
Scale = 15
40 is over the limit, so it is reduced by 2 to get down to 38, which means the scale is reduced by 2 as well, leaving us with a true "result scale" of 13, which is one more than the 12 we need and will actually see.
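Applied to the original computed column (a sketch assuming the same [xyz].MyTable, OustandingShares and Price columns from the question), the fix would look something like:
ALTER TABLE [xyz].MyTable
ADD Mv AS CONVERT(DECIMAL(28,12),
        CONVERT(DECIMAL(15, 3), [OustandingShares])
        * CONVERT(DECIMAL(24, 12), [Price])
    );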
Use cast() or convert(). Something like:
ALTER TABLE [xyz].MyTable ADD Mv AS cast(OustandingShares * Price as decimal(12, 6))
or whatever type you want it to be.
EDIT:
Oh, I think I'm getting the idea. The problem is the calculation itself. In that case, do the conversion before the multiplication, so you don't have to depend on SQL Server's (arcane) rules for conforming decimal types.
ALTER TABLE [xyz].MyTable
ADD Mv AS cast(OustandingShares as decimal(28, 12)) * cast(Price as decimal(28, 12))
I believe what is happening in your case is that the maximum precision on the calculated result exceeds the allowed thresholds, so the scale is reduced accordingly. This is explained at the bottom of this page.

Types and rounding in SQL Server

I have the following query:
DECLARE @A as numeric(36,14) = 480
DECLARE @B as numeric(36,14) = 1
select @B/@A
select cast(@B as decimal)/cast(@A as decimal)
Why does the first calculation return 0.002083 and the second one return 0.00208333333333333?
Isn't numeric(36,14) good enough to have good precision (just like the second query)?
If I use only numeric, instead of numeric(36,14), I have a good precision again:
select cast(@B as numeric)/cast(@A as numeric)
You can calculate the precision and scale yourself using the documentation from SQL Server Books Online.
I tried to calculate the precision and scale for your case (operation = division, p = 36, s = 14) and I got a pretty strange result...
precision of the result: [p1 - s1 + s2 + max(6, s1 + p2 + 1)] -> 36-14+14+max(6,14+36+1)=36+51=87
scale of the result : [max(6, s1 + p2 + 1)] -> max(6,14+36+1)=51
In this situation precision is greater than 38 and in this case (as stated in the documentation)
*The result precision and scale have an absolute maximum of 38. When a result precision is greater than 38, the corresponding scale is
reduced to prevent the integral part of a result from being truncated.
scale must be reduced by (87-38=) 49, that is (51-49=) 2 ...
I think the minimum scale is 6 (because of the expression scale = [max(6, s1 + p2 + 1)]), and it can't be reduced below 6 - which is what we get as the result (0.002083).
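You can check the reduced result type directly; with the declarations from the question, the division should report precision 38 and scale 6:
DECLARE @A as numeric(36,14) = 480
DECLARE @B as numeric(36,14) = 1
SELECT SQL_VARIANT_PROPERTY(@B / @A, 'BaseType')  AS base_type, -- numeric
       SQL_VARIANT_PROPERTY(@B / @A, 'Precision') AS prec,      -- 38
       SQL_VARIANT_PROPERTY(@B / @A, 'Scale')     AS scale;     -- 6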
Just to contribute to the understanding of the problem (going deeper on @Andrey's answer): things can get tricky depending on the order of the calculations.
Consider the variables:
DECLARE @A as NUMERIC(36,19) = 100
DECLARE @B as NUMERIC(36,19) = 480
DECLARE @C as NUMERIC(36,19) = 100
Calculating A/B*C
If you want to calculate A/B*C, using the formulas, we have:
A/B is of type NUMERIC(38,6) --> as calculated by @Andrey
The result will be 0.208333 (with scale of 6)
Multiplying by 100, we will get 20.833300
Calculating A*C/B
The result of A*C is 10000, of type NUMERIC(38,6). Dividing by B, the result will be 20.833333, of type NUMERIC(38,6).
Then, the result may vary depending on the order of calculation (the same problem is pointed out in https://dba.stackexchange.com/questions/77664/how-does-sql-server-determine-precision-scale).
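A minimal script reproducing the two orderings described above, using the same variables:
DECLARE @A as NUMERIC(36,19) = 100
DECLARE @B as NUMERIC(36,19) = 480
DECLARE @C as NUMERIC(36,19) = 100
SELECT @A / @B * @C AS div_then_mul, -- 20.833300: the intermediate A/B is already reduced to scale 6
       @A * @C / @B AS mul_then_div  -- 20.833333: the intermediate A*C is exact, so only the final division is reduced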

REAL column holding values outside documented range

According to MSDN, the range for REAL values is -3.40E+38 to -1.18E-38, 0, and 1.18E-38 to 3.40E+38. However, I have quite a few values beyond that range in my table.
The following query returns lots of very small values and no very large ones:
SELECT MyColumn, *
FROM data.MyTable
WHERE MyColumn <> 0
  AND ( MyColumn < CONVERT(REAL, 1.18E-38)
        OR MyColumn > CONVERT(REAL, 3.40E+38) )
  AND ( MyColumn < CONVERT(REAL, -3.40E+38)
        OR MyColumn > CONVERT(REAL, -1.18E-38) );
It is easy to show how these values end up in the table. I cannot insert them directly:
CREATE TABLE a(r REAL NULL);
GO
INSERT INTO a(r) VALUES(4.330473E-39);
GO
SELECT r FROM a
GO
DROP TABLE a;
----
0.0
But I can divide two columns and get an out-of-range value:
CREATE TABLE a
(
r1 REAL NULL ,
r2 REAL NULL ,
r3 REAL NULL
) ;
GO
INSERT INTO a
( r1, r2 )
VALUES ( 4.330473E-38, 1000 ) ;
GO
UPDATE a
SET r3 = r1 / r2 ;
SELECT r1 ,
r2 ,
r3
FROM a
r1 r2 r3
------------- ------------- -------------
4.330473E-38 1000 4.330433E-41
So I guess MSDN gives wrong ranges of valid data, correct?
Am I missing anything?
Several people suggested that this is a bug.
What part of this behavior exactly is the bug? Is it:
Wrong constants documented in MSDN and used in DBCC, as well as a wrong threshold for rounding down?
UPDATE being able to save out-of-range values?
Books Online documents only the normal range for single- and double-precision floating point numbers. The IEEE 754 rules also specify floating-point numbers closer to zero than the smallest non-zero normal value, known variously as denormalized, denormal, and subnormal numbers. From that last link:
Denormal numbers provide the guarantee that addition and subtraction of floating-point numbers never underflows; two nearby floating-point numbers always have a representable non-zero difference. Without gradual underflow, the subtraction a−b can underflow and produce zero even though the values are not equal. This can, in turn, lead to division by zero errors that cannot occur when gradual underflow is used.
SQL Server is following the rules for single-precision floating point calculations in the examples posted. The bug may be that DBCC checks only for normal values, and throws an incorrect error message when it encounters a stored denormal value.
Example producing a denormal single-precision value:
DECLARE
    @v1 real = 14e-39,
    @v2 real = 1e+07;
-- 1.4013e-045
SELECT @v1 / @v2;
Example showing a stored float denormal passes DBCC checks:
CREATE TABLE dbo.b (v1 float PRIMARY KEY);
INSERT b VALUES (POWER(2e0, -1075));
SELECT v1 FROM b; -- 4.94065645841247E-324
DBCC CHECKTABLE(b) WITH DATA_PURITY; -- No errors or warnings
DROP TABLE dbo.b;
This is a bug in SQL Server. The last script you posted is a nice repro. Add one line to it at the end:
DBCC CHECKDB WITH data_purity
This fails with:
Msg 2570, Level 16, State 3, Line 1 Page (1:313), slot 0 in object ID
357576312, index ID 0, partition ID 1801439851932155904, alloc unit ID
2017612634169999360 (type "In-row data"). Column "r3" value is out of
range for data type "real". Update column to a legal value.
This proves it is a bug. I suggest you file a bug with Microsoft Connect for SQL Server.
