TSQL - get closest coordinate on linestring to a point - sql-server

Consider the overly simplistic example: POINT(0 0) and LINESTRING (1 -10, 1 10)
The closest point on the line to the POINT would be 1, 0.
How would one determine this in TSQL? My simple, not entirely accurate, approach was to build a linestring from two points (POINT, POINT) and keep extending the X coordinate of the second point until the two linestrings intersected.
So:
linestring (0 0, 0.25 0) (no intersect)
linestring (0 0, 0.5 0) (no intersect)
linestring (0 0, 0.75 0) (no intersect)
linestring (0 0, 1 0) (intersection, so 1 0 is the point closest to POINT)
This more or less worked, but it doesn't seem to be the best or most performant way of accomplishing this.
For example, one inefficiency is that I move it one direction (positive increments), and if there was no match (after x attempts), then I would start over, but with negative increments.
To optimize, I tried moving in larger steps, then when intersected (probably went past the point), I backed off 1 increment and started from there with a smaller increment. I did this a couple of times - instead of going in tiny tiny increments so as not to overshoot by too much.
One acceptable assumption, based on my processing, is that the POINT will be next to (left/right of) the LINESTRING.
Another acceptable assumption is that the LINESTRING will be fairly "perpendicular" to the POINT.

I think you can do this mathematically rather than with a brute-force iterative algorithm.
There is a post to get closest point to a line that describes the method.
I converted this method to SQL which returns the correct value (1,0). Your 'trivial' example is actually a bit of an edge case (vertical line with infinite slope) so it seems robust.
I also tested the source code with this example: https://www.desmos.com/calculator/iz07az84f5 and using the input for the line of (-1,2) (3,0) and a point at (2,2) got the correct answer (1.4, 0.8).
SQL code (also in SQL Fiddle at http://sqlfiddle.com/#!6/d87aa/15)
DECLARE @x int, @y int, @x1 int, @y1 int, @x2 int, @y2 int
DECLARE @atb2 float, @atp_dot_atb float
DECLARE @t float
--SELECT @x=0, @y=0
--SELECT @x1=1, @y1=10, @x2=1, @y2=-10
SELECT @x=2, @y=2
SELECT @x1=-1, @y1=2, @x2=3, @y2=0
SELECT @atb2 = SQUARE(@x2-@x1) + SQUARE(@y2-@y1) -- Basically finding the squared magnitude of a_to_b
SELECT @atp_dot_atb = (@x-@x1)*(@x2-@x1) + (@y-@y1)*(@y2-@y1) -- The dot product of a_to_p and a_to_b
SELECT @t = @atp_dot_atb / @atb2 -- The normalized "distance" from a to your closest point
SELECT @x1 + (@x2-@x1)*@t, @y1 + (@y2-@y1)*@t --Add the distance to A, moving towards B
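For checking the arithmetic outside SQL Server, here is a minimal Python sketch of the same projection. The function name and the clamp_to_segment flag are my own additions for illustration, not part of the SQL above. The SQL projects onto the infinite line through the two endpoints, and the optional clamp only shows how one might restrict the answer to the segment itself.

```python
import math

def closest_point_on_line(p, a, b, clamp_to_segment=False):
    """Project point p onto the line through a and b (hypothetical helper)."""
    px, py = p
    ax, ay = a
    bx, by = b
    abx, aby = bx - ax, by - ay                 # vector a_to_b
    ab2 = abx * abx + aby * aby                 # squared magnitude of a_to_b
    if ab2 == 0:                                # a and b coincide: no line to project onto
        return (float(ax), float(ay))
    t = ((px - ax) * abx + (py - ay) * aby) / ab2   # normalized distance from a
    if clamp_to_segment:                        # assumption: stay between a and b
        t = max(0.0, min(1.0, t))
    return (ax + abx * t, ay + aby * t)

# The two cases from the answer: the vertical line and the Desmos example.
vertical = closest_point_on_line((0, 0), (1, -10), (1, 10))
sloped = closest_point_on_line((2, 2), (-1, 2), (3, 0))
print(vertical, sloped)
```

With a multi-segment LINESTRING, one would presumably run this per segment with clamping on and keep the candidate with the smallest distance.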

Related

SQL Server Decimal Operation is Not Accurate

When I run this simple operation in SQL server:
Select 800.0 /30.0
I get the value 26.666666, where even if it rounds for 6 digits it should be 26.666667.
How can I get the calculation to be accurate? I searched online and found a solution where I cast each operand to a high-precision decimal before the operation, but this is not convenient for me because I have many long, complex calculations. I think there must be a better solution.
When using division in SQL Server, any digits after the resulting scale are truncated, not rounded. For your expression you have a decimal(4,1) and a decimal(3,1), which results in a decimal(10,6):
Precision = p1 - s1 + s2 + max(6, s1 + p2 + 1)
Scale = max(6, s1 + p2 + 1)
As a result, 26.66666666666666~ is truncated to 26.666666.
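The difference between truncating and rounding at scale 6 can be reproduced outside SQL Server. This is only a sketch using Python's decimal module to model the behavior described above. It does not call SQL Server, and the variable names are mine.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP, getcontext

getcontext().prec = 50                      # plenty of working digits for the quotient
exact = Decimal("800.0") / Decimal("30.0")  # 26.666666...

scale6 = Decimal("0.000001")
scale7 = Decimal("0.0000001")

truncated = exact.quantize(scale6, rounding=ROUND_DOWN)     # what the division yields
rounded = exact.quantize(scale6, rounding=ROUND_HALF_UP)    # what the asker expected

# Carrying one extra digit first and then rounding back to scale 6 recovers
# the expected value, which is the idea behind widening an operand.
widened = exact.quantize(scale7, rounding=ROUND_DOWN).quantize(scale6, rounding=ROUND_HALF_UP)

print(truncated, rounded, widened)
```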
You can get around this by increasing the precision and scale of an operand, and then using CONVERT to get back to your required precision and scale. For example, increase the precision and scale of the decimal(3,1) to decimal(5,3) and convert the result back to a decimal(10,6):
SELECT CONVERT(decimal(10,6),800.0 / CONVERT(decimal(5,3),30.0));
This returns 26.666667.
This might be helpful:
Use ROUND (Transact-SQL)
SELECT ROUND(800.0 /30.0, 5) AS RoundValue;
Result:
RoundValue
26.666670
I believe it's because SQL Server takes your numbers as decimal values (which are exact e.g., 6.6666 and 6.6667 means exactly those values, not 6 and two-thirds) rather than float values (which can work with approximate numbers).
If you explicitly cast/convert it to a float at the start, you should get your calculations running smoothly.
Here are some examples to demonstrate the difference between int, decimal, and float calculations:
Dividing 20 by 3
Dividing 20 by 3, then multiplying by 3 again (which mathematically should be 20).
SELECT (20/3) AS int_calc,
(20/3) * 3 AS int_calc_x3,
(CAST(20 AS decimal(10,3)) /3) AS dec_calc,
(CAST(20 AS decimal(10,3)) /3) * 3 AS dec_calc_x3,
(CAST(20 AS float) /3) AS float_calc,
(CAST(20 AS float) /3) * 3 AS float_calc_x3
with the following results
int_calc int_calc_x3 dec_calc dec_calc_x3 float_calc float_calc_x3
6 18 6.666666 19.999998 6.66666666666667 20
In your case, you can use
Select CAST(800.0 AS float) /30.0
which results in 26.6666666666667
Note if you then multiply back by 30, it gets the correct result e.g.,
Select (CAST(800.0 AS float) /30.0) * 30
results in 800. Solutions that stay in decimal will not round-trip like this.
Note also that once you have it as a float, then it should stay a float until converted back to a decimal or an int somehow (e.g., saved in a table as an int). So...
SELECT A.Num / 30
FROM (Select ((CAST(800.0 AS float) /30.0) * 30) AS Num) AS A
will still result in 26.6666666666667
This will hopefully help you in your long complex calculations.

T-SQL Rounding, First Truncates to LENGTH + 2

In using the T-SQL ROUND function I noticed what seems like weird behavior. It looks like the ROUND function only looks at the first digit to the right of the digit to be rounded. If I round -6.146 to one decimal I get -6.1. I would have thought it would start at the right and round each digit as it works its way to the left, like this: -6.146 -> -6.15 -> -6.2
I've observed the same behavior with Excel’s round function too.
The query below illustrates what I am describing. I may simply use the nested ROUND functions as shown below but I'm curious if there’s a better way and which approach is considered mathematically correct.
DECLARE @Num AS FLOAT
SET @Num = -6.1463
SELECT @Num [OriginalVal], ROUND(@Num, 1, 0) [SingleRound]
, ROUND(ROUND(ROUND(@Num, 3, 0), 2, 0), 1, 0) [NestedRound]
Results
OriginalVal | SingleRound | NestedRound
-6.1463 | -6.1 | -6.2
I think the basic rule of thumb is, in rounding, you look at the 1 digit immediately to the right of the place you are rounding to. You do not extend it all the way to the very end of the right of the decimal.
http://math.about.com/od/arithmetic/a/Rounding.htm
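To see why the nested version drifts, here is a small Python sketch of the two approaches. It assumes exact decimal arithmetic with ties rounded away from zero, whereas the query above uses FLOAT, and the helper name is mine. Each intermediate rounding throws away information, so a value that is closer to -6.1 can be pushed step by step onto a tie and then over it.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_half_away(value: Decimal, places: int) -> Decimal:
    """Round to the given number of decimals, ties away from zero (hypothetical helper)."""
    return value.quantize(Decimal(1).scaleb(-places), rounding=ROUND_HALF_UP)

num = Decimal("-6.1463")

single = round_half_away(num, 1)        # looks only at the digit after the first decimal

step3 = round_half_away(num, 3)         # -6.146
step2 = round_half_away(step3, 2)       # -6.15, now an artificial tie
nested = round_half_away(step2, 1)      # the tie rounds away from zero

print(single, step3, step2, nested)
```

The single rounding is the one that returns the nearest one-decimal value to the original number: -6.1463 is 0.0463 away from -6.1 and 0.0537 away from -6.2.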

Types and rounds in SQL Server

I have the following query:
DECLARE @A as numeric(36,14) = 480
DECLARE @B as numeric(36,14) = 1
select @B/@A
select cast(@B as decimal)/cast(@A as decimal)
Why does the first calculation return 0.002083 while the second one returns 0.00208333333333333?
Isn't numeric(36,14) enough to give good precision (just as in the second query)?
If I use plain numeric instead of numeric(36,14), I get good precision again:
select cast(@B as numeric)/cast(@A as numeric)
You can calculate precision and scale by yourself using this documentation from SQL Server Books online.
I tried to calculate the precision and scale for your case (operation=division, p=36, s=14) and I got some pretty strange results...
precision of the result: [p1 - s1 + s2 + max(6, s1 + p2 + 1)] -> 36-14+14+max(6,14+36+1)=36+51=87
scale of the result : [max(6, s1 + p2 + 1)] -> max(6,14+36+1)=51
In this situation the precision is greater than 38, and in that case (as stated in the documentation):
"The result precision and scale have an absolute maximum of 38. When a result precision is greater than 38, the corresponding scale is reduced to prevent the integral part of a result from being truncated."
scale must be reduced by (87-38=) 49, that is (51-49=) 2 ...
I think the minimum scale is 6 (because of the expression scale = max(6, s1 + p2 + 1)), so it can't be reduced below 6, which is what we see in the result (0.002083).
Just to add to the understanding of the problem (going deeper into @Andrey's answer): things can get tricky depending on the order of calculations.
Consider the variables:
DECLARE @A as NUMERIC(36,19) = 100
DECLARE @B as NUMERIC(36,19) = 480
DECLARE @C as NUMERIC(36,19) = 100
Calculating A/B*C
If you want to calculate A/B*C, using the formulas, we have:
A/B is of type NUMERIC(38,6) --> as calculated by @Andrey
The result will be 0.208333 (with a scale of 6)
Multiplying by 100, we will get 20.833300
Calculating A*C/B
The result of A*C is 10000 of type NUMERIC(38,6). Dividing by B, the result will be 20.833333 of type NUMERIC(38,6)
So the result may vary depending on the order of calculation (the same problem was pointed out in https://dba.stackexchange.com/questions/77664/how-does-sql-server-determine-precision-scale).
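The two orderings can be simulated outside SQL Server. This Python sketch assumes every intermediate result is cut to a scale of 6 by truncation, as in the values quoted above. It uses Python's decimal module rather than SQL Server, and the helper name is mine.

```python
from decimal import Decimal, ROUND_DOWN, getcontext

getcontext().prec = 60
SCALE6 = Decimal("0.000001")

def to_scale6(value: Decimal) -> Decimal:
    """Model an intermediate NUMERIC(38,6) result by truncating to six decimals."""
    return value.quantize(SCALE6, rounding=ROUND_DOWN)

A, B, C = Decimal(100), Decimal(480), Decimal(100)

divide_first = to_scale6(to_scale6(A / B) * C)      # A/B*C: precision is lost before scaling up
multiply_first = to_scale6(to_scale6(A * C) / B)    # A*C/B: the division happens last

print(divide_first, multiply_first)
```

Multiplying before dividing keeps the truncation as the last step, so its effect is not magnified by the later multiplication.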

SQL Server Spatial Query: where condition behaving «oddly»

I've written this «silly» spatial query to find all the points that lie within 5 km of a center.
Source table holds +150K rows.
Here the query:
DECLARE @position geography = geography::Parse('POINT(9.123 45.123)')
DECLARE @circle geography = @position.STBuffer(5000) -- A circle with a radius of 5 km
SELECT
g.Coordinate.STDistance(@position), g.Coordinate.Filter(@circle)
FROM
[DB_NAME].[SCHEMA].[TABLE] AS g WITH (nolock)
WHERE
g.Coordinate.Filter(@circle) = 1
Oddly, the WHERE condition doesn't seem to work: I get back more than 600 points for which the condition returns 0.
Any suggestions?
For the sake of clarity table schema was
[DB_NAME].[SCHEMA].[TABLE](Coordinate geography NOT NULL)
Official documentation states: «Returns 1 if a geography instance potentially intersects another geography instance. This method may produce a false-positive return, and the exact result may be plan-dependent. Returns an accurate 0 value (true negative return) if there is no intersection of geography instances found.»
So I take it that a 0 is always reliable, while a 1 may be approximate (IMHO this behaviour is absolutely reasonable).
By the way, @Damien's observation led me to a simple workaround:
DECLARE @position geography = geography::Parse('POINT(9.123 45.123)')
DECLARE @circle geography = @position.STBuffer(5000) -- A circle with a radius of 5 km
SELECT * FROM
(SELECT
g.Coordinate.Filter(@circle) filter, g.Coordinate Coord
FROM [DB_NAME].[SCHEMA].[TABLE] AS g WITH (nolock)
WHERE
g.Coordinate.Filter(@circle) = 1
) t
WHERE t.filter = 1
which reminds me of the esoteric «Double Check Pattern», although in this case the motivation is clear.
One point that could be investigated further is the conversion of the return value. Many years ago I stumbled upon a similar issue where, in a server farm, an implicit conversion of a boolean true to int led to -1 (0xFFFFFFFF) instead of 1 (0x00000001)… COM ages…
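Since the documentation quoted above allows Filter to return false positives, a common pattern is a cheap coarse test followed by an exact check (in the query above, the STDistance value could play the role of the exact check). The Python sketch below only illustrates that two-phase idea. The bounding box is my stand-in for an index-based filter, the haversine formula on a spherical Earth is an approximation, and none of the names come from SQL Server.

```python
import math

EARTH_RADIUS_M = 6371000.0  # assumption: spherical Earth, good enough for a sketch

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def coarse_filter(center, point, radius_m):
    """Bounding-box test: never rejects a true match, but may accept false positives."""
    dlat = radius_m / 111320.0
    dlon = dlat / math.cos(math.radians(center[0]))
    return abs(point[0] - center[0]) <= dlat and abs(point[1] - center[1]) <= dlon

def within_radius(center, point, radius_m):
    """Coarse test first, exact distance only for the survivors."""
    return coarse_filter(center, point, radius_m) and haversine_m(*center, *point) <= radius_m

center = (45.123, 9.123)        # (lat, lon); the WKT POINT above is written lon lat
near = (45.130, 9.130)          # well inside the circle
corner = (45.163, 9.183)        # inside the box, outside the circle
far = (46.0, 10.0)              # rejected by the coarse test already

for p in (near, corner, far):
    print(p, coarse_filter(center, p, 5000), within_radius(center, p, 5000))
```

The corner point is the interesting case: it passes the coarse test but fails the exact one, which is the kind of row a filter-only WHERE clause can let through.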

How to get the count of digits after the decimal point in a float column in ms sql?

I have to count the digits after the decimal point in a database hosted by a MS Sql Server (2005 or 2008 does not matter), in order to correct some errors made by users.
I have the same problem on an Oracle database, but there things are less complicated.
Bottom line is on Oracle the select is:
select length( substr(to_char(MY_FIELD), instr(to_char(MY_FIELD),'.',1,1)+1, length(to_char(MY_FIELD)))) as digits_length
from MY_TABLE
where the field MY_FIELD is float(38).
On MS SQL Server I tried to use:
select LEN(SUBSTRING(CAST(MY_FIELD AS VARCHAR), CHARINDEX('.',CAST(MY_FIELD AS VARCHAR),1)+1, LEN(CAST(MY_FIELD AS VARCHAR)))) as digits_length
from MY_TABLE
The problem is that on MS SQL Server, when I cast MY_FIELD as varchar, the float is truncated to only 2 decimals, so the digit count is wrong.
Can someone give me any hints?
Best regards.
SELECT
LEN(CAST(REVERSE(SUBSTRING(STR(MY_FIELD, 13, 11), CHARINDEX('.', STR(MY_FIELD, 13, 11)) + 1, 20)) AS decimal))
from TABLE
I have received from my friend a very simple solution which is just great. So I will post the workaround in order to help others in the same position as me.
First, make function:
create FUNCTION dbo.countDigits(@A float) RETURNS tinyint AS
BEGIN
declare @R tinyint
IF @A IS NULL
RETURN NULL
set @R = 0
while @A - str(@A, 18 + @R, @R) <> 0
begin
SET @R = @R + 1
end
RETURN @R
END
GO
Second:
select MY_FIELD,
dbo.countDigits(MY_FIELD)
from MY_TABLE
Using the function will get you the exact number of digits after the decimal point.
The first thing is to switch to using CONVERT rather than CAST. The difference is, with CONVERT, you can specify a format code. CAST uses whatever the default format code is:
When expression is float or real, style can be one of the values shown in the following table. Other values are processed as 0.
None of the formats are particularly appealing, but I think the best for you to use would be 2. So it would be:
CONVERT(varchar(25),MY_FIELD,2)
This will, unfortunately, give you the value in scientific notation and always with 16 digits e.g. 1.234567890123456e+000. To get the number of "real" digits, you need to split this number apart, work out the number of digits in the decimal portion, and offset it by the number provided in the exponent.
And, of course, insert usual caveats/warnings about trying to talk about digits when dealing with a number which has a defined binary representation. The number of "digits" of a particular float may vary depending on how it was calculated.
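The caveat about binary representation is easy to demonstrate outside the database. The Python sketch below counts decimals from the shortest string that round-trips the float. This is one possible definition of "digits", not what any of the SQL above computes. The function name is mine, and it assumes finite inputs.

```python
from decimal import Decimal

def count_decimal_digits(x: float) -> int:
    """Digits after the decimal point in the shortest repr of a finite float (sketch)."""
    d = Decimal(repr(x)).normalize()      # normalize() strips trailing zeros
    return max(0, -d.as_tuple().exponent)

samples = [1.25, 3.0, 0.00001, 0.1, 0.1 + 0.2]
for value in samples:
    print(repr(value), count_decimal_digits(value))
```

The last sample shows the point made above: 0.1 + 0.2 is stored as 0.30000000000000004, so the same "value" reports a very different digit count depending on how it was calculated.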
I'm not sure about the speed or the elegance of this code. It was written for some ad-hoc testing to find the first non-zero decimal place, but it could easily be changed to loop through all the decimals and find the last position where a digit was greater than zero.
DECLARE @NoOfDecimals int = 0
Declare @ROUNDINGPRECISION numeric(32,16) = -.00001000
select @ROUNDINGPRECISION = ABS(@ROUNDINGPRECISION)
select @ROUNDINGPRECISION = @ROUNDINGPRECISION - floor(@ROUNDINGPRECISION)
while @ROUNDINGPRECISION < 1
Begin
select @NoOfDecimals = @NoOfDecimals +1
select @ROUNDINGPRECISION = @ROUNDINGPRECISION * 10
end;
select @NoOfDecimals

Resources