Averaging data for points in close proximity with SQL Server 2008 - sql-server

I have an application which receives GPS data from a mobile device. As well as co-ordinate data, the device also provides signal strength from the GSM network.
I am trying to plot the points on a map to display areas of good signal strength and areas of poor signal strength.
When I have a few points it all works well, the points are retrieved from the database and a square is built around the point with the top left corner 0.5km from the point. I then display the square shapes on the VE map using colour coding for signal strength.
The problem is that there may be thousands and thousands of readings and I need a way to average out those readings that are less than 0.5km from each other or I need to build the square (or circle perhaps) in SQL Server and average out the intersections.
I have no idea where to begin with this so any pointers to decent articles or some tips would be much appreciated.
Thanks.

One simple and somewhat inaccurate way to do this would be to decrease the granularity of your data. It might not even be inaccurate, depending on how accurate your x, y measurements are.
Let's say we have the following data:
x y signal_strength
10.2 5.1 10
10.1 5.3 12
10.3 5.5 8
If we floor the x and y values, we get:
x y signal_strength
10 5 10
10 5 12
10 5 8
Then we can average those values by the floored x and y to show that we have average signal strength in the rectangle (10, 5) to (11, 6).
Here's the SQL:
select
floor(x) as rectangle_xmin,
floor(y) as rectangle_ymin,
floor(x) + 1 as rectangle_xmax,
floor(y) + 1 as rectangle_ymax,
avg(signal_strength) as signal_strength
from table
group by floor(x), floor(y);
Now, admittedly, you'd ideally want to group data points by distance from point to point, and this groups them by a maximum distance that varies from 1 to square_root(2) =~ 1.41, flooring them into rectangular blocks. So it's less than ideal. But it may work well enough for you, especially if the flooring/grouping is less than the error in your measurement of position.
If floor() is not granular enough, you can use floor(x * someweight) / someweight to adjust it to the granularity you want. And of course you can use ceiling() or round() to do the same thing.
The whole point is to collapse a bunch of nearby measurements to one "measurement", and then take the average of the collapsed values.
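As a rough illustration of the same collapse-then-average idea outside the database, here is a minimal Python sketch. It is not T-SQL; the function name `average_by_cell` and the parameter `cells_per_unit` (which plays the role of `someweight` above) are made up for this example.

```python
import math
from collections import defaultdict


def average_by_cell(readings, cells_per_unit=1):
    """Average signal strength per grid cell.

    readings: iterable of (x, y, signal_strength) tuples.
    cells_per_unit: hypothetical stand-in for `someweight`;
        1 gives unit cells, 100 gives cells of 0.01 units.
    Returns {(cell_xmin, cell_ymin): average_strength}.
    """
    totals = defaultdict(lambda: [0.0, 0])  # cell -> [sum, count]
    for x, y, strength in readings:
        # Same trick as floor(x * someweight) in the SQL
        cell = (math.floor(x * cells_per_unit), math.floor(y * cells_per_unit))
        totals[cell][0] += strength
        totals[cell][1] += 1
    return {
        (cx / cells_per_unit, cy / cells_per_unit): total / count
        for (cx, cy), (total, count) in totals.items()
    }


readings = [(10.2, 5.1, 10), (10.1, 5.3, 12), (10.3, 5.5, 8)]
print(average_by_cell(readings))  # {(10.0, 5.0): 10.0}
```

The three sample readings all land in the cell whose lower corner is (10, 5), so they collapse to a single averaged value, just like the GROUP BY in the query.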

You might want to look into Delaunay Triangulation, where you can plot X,Y,Z coordinates into a graph. It might be possible, not knowing exactly what you have for points, to use X,Y for the location and then plot the Z as signal strength and create a spike graph. I've only seen C++ examples (a CodePlex sample), but it might be something you can write a SQL function for.

SELECT
geography::STPointFromText('POINT(' + CONVERT(varchar, AvgSignalReadings.lng / 100) + ' ' + CONVERT(varchar, AvgSignalReadings.lat / 100) + ')', 4326) as Location,
AvgSignalReadings.lat / 100 as Latitude,
AvgSignalReadings.lng / 100 as Longitude,
AvgSignalReadings.SignalStrength
FROM
(
SELECT
FLOOR(l.Latitude * 100) as lat,
FLOOR(l.Longitude * 100) as lng,
AVG(l.SignalStrength) as SignalStrength,
COUNT(*) as NumberOfReadings
FROM SignalLog l
WHERE l.SignalStrength IS NOT NULL AND l.SignalStrength <> 0 AND l.Location IS NOT NULL
AND l.[Timestamp] > DATEADD(month, -1, GETDATE())
GROUP BY FLOOR(l.Latitude * 100), FLOOR(l.Longitude * 100))
AS AvgSignalReadings

Related

SQL Server strange ROUND() behaviour

I have a huge table of product rows... I only need a small portion of its data, more specifically the prices of the products (regular price - for which I have to choose between two fields in the sense that if one is present, I pick it, otherwise I pick the other; and sale price - which for many products is stored as a float with three decimals, because it was calculated as a percentage of the regular price). So I crafted the appropriate query to achieve what I want, and noticed a very strange behavior for the ROUND() function.
In some cases, when the third decimal digit is 5, a value (e.g. .165) is truncated to .16, and in others it's rounded up to .17, and of course this happens for any other number with 5 at the third decimal place as well! How can that be possible? Here is the query:
SELECT CODE, FWHSPRICE, RTLPRICE, CASE WHEN ISNULL(FWHSPRICE, 0) = 0 THEN RTLPRICE ELSE FWHSPRICE END AS REGULAR, ROUND(FLDFLOAT3, 2) AS SALE
FROM MATERIAL
WHERE COMID = 12
AND FLTID1 = 1
And here is a screenshot of a comparison between the two recordsets, on the left without ROUND() in the query, and on the right with ROUND()
PS: If you want me to export data for replication, can you please explain to me how to create the appropriate INSERT statements for you? The whole table has so many fields - and rows, and I don't know how to set SSMS to do that. I'm coming from MySQL, so this "realm" of SQL Server is so new to me... Thank you in advance.
Yeah, you're mixing two things that have their own sets of quirky behavior (IMHO). I would honestly just not use float unless I needed the specific properties of float, but if you're stuck with this data type...
I would first convert from float to decimal with an extra decimal place (or maybe even 2), then use another convert to round instead of round itself. For example:
DECLARE @x TABLE(x float);
INSERT @x(x) VALUES(0.615),(0.165),(0.415),(0.414);
SELECT
x,
bad = ROUND(x, 2),
better = CONVERT(decimal(10,2), CONVERT(decimal(10,3), x))
FROM @x;
Results:
x bad better
0.615 0.61 0.62
0.165 0.17 0.17
0.415 0.41 0.42
0.414 0.41 0.41
Example db<>fiddle
If you have values like 0.4149, you can see how an extra decimal place will prevent that from rounding up (unless that's the behavior you want):
DECLARE @f float = 0.4149;
SELECT source = @f,
round_up = CONVERT(decimal(10,2), CONVERT(decimal(10,3), @f)),
round_down = CONVERT(decimal(10,2), CONVERT(decimal(10,4), @f));
Results:
source round_up round_down
0.4149 0.42 0.41
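To see why the float values behave this way, it can help to look at what is actually stored. The following Python sketch (not T-SQL) uses the standard `decimal` module to expose the exact binary value behind each float and then mimics the two approaches. It assumes that SQL Server's float-to-decimal conversion rounds to the target scale, which matches the results shown above; the helper names `round_direct` and `round_two_step` are invented for this illustration.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_direct(x: float, places: int = 2) -> Decimal:
    """Round the exact binary value of x straight to `places` decimals."""
    return Decimal(x).quantize(Decimal(10) ** -places, rounding=ROUND_HALF_UP)


def round_two_step(x: float, mid_places: int = 3, places: int = 2) -> Decimal:
    """Round to `mid_places` first (cleaning up binary noise), then to `places`."""
    first = Decimal(x).quantize(Decimal(10) ** -mid_places, rounding=ROUND_HALF_UP)
    return first.quantize(Decimal(10) ** -places, rounding=ROUND_HALF_UP)


for value in (0.615, 0.165, 0.415, 0.414):
    # Decimal(value) shows the exact value the float really holds
    print(value, Decimal(value), round_direct(value), round_two_step(value))
```

The float written as 0.615 is stored as a value slightly below 0.615, so rounding it directly gives 0.61, while 0.165 is stored slightly above 0.165 and gives 0.17. Rounding to three places first removes that noise before the final rounding.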

Fastest way to calculate distances between two coordinates?

We currently use the Geography type to calculate distance between a current location and the coordinates in our tsql table. Our code is based on this sqlauthority.com example.
Is there a faster way to retrieve the distance between two points? These calls will be done by a mobile phone app, so they should ideally be very fast.
I tested both approaches with a distance I know, looping 100 times per batch and running the batch 15 times so that the 10 runs SSMS stores in its client statistics are cycled past the initial query plan generation and it doesn't skew the results. Here are the averages of the remaining runs. The calculation method seems to be twice as fast as the geography option.
The difference in distance returned between the two methods is 0.0000000020044 miles.
Calculation script used (returned miles: 41.9013152732833)
set nocount on;
declare
@lat1 float = 45.489614
,@lon1 float = -122.650021
,@lat2 float = 44.94404
,@lon2 float = -123.025739
select 3959.1825574 * acos(sin(@lat1/57.295779513082323) * sin(@lat2/57.295779513082323) + cos(@lat1/57.295779513082323) * cos(@lat2/57.295779513082323) * cos((@lon2-@lon1)/57.295779513082323)) distance_in_miles
GO 100
Geography script used (returned miles: 41.9013152752877)
set nocount on;
declare
@g geography = geography::Point(45.489614, -122.650021, 4326)
,@h geography = geography::Point(44.94404, -123.025739, 4326)
select @h.STDistance(@g) / 1609.344 distance_in_miles -- 1609.344 is meters in a mile. STDistance = meters.
GO 100
Fair warning, doing it in a non-system function will still have unpredictable performance. I would recommend doing it inline for calculation.
Here's a raw calculation example.
Working example of inline syntax for miles. It is the easiest, most accurate and shortest syntax I could find.
adjusted for accuracy
if object_id('tempdb..#LatLongInfo','U') is not null
begin
drop table #LatLongInfo;
end;
create table #LatLongInfo (
lat1 float,
lon1 float,
lat2 float,
lon2 float
);
insert into #LatLongInfo
values (21, -76, 23, -72);
select
3959.1825574 * acos(sin(lat1/57.295779513082323) * sin(lat2/57.295779513082323) + cos(lat1/57.295779513082323) * cos(lat2/57.295779513082323) * cos((lon2-lon1)/57.295779513082323)) distance_in_miles
from #LatLongInfo;
Hope this helps. I used something like this to find the doctors within a given range for patients back when sql2000 was released, it's been a while. Google was a newborn, no maps, nothing but a search box and one button. You have me all nostalgic now...I remember reading this when I coded that the first time.
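For anyone who wants to check the formula outside SQL Server, here is a small Python sketch of the same spherical law of cosines calculation. The radius constant is copied from the T-SQL above; the function name `distance_in_miles` and the clamping step are additions for this sketch (rounding error can push the cosine a hair outside [-1, 1], which would make `acos` fail).

```python
import math

EARTH_RADIUS_MILES = 3959.1825574  # constant taken from the T-SQL above


def distance_in_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance via the spherical law of cosines (degrees in)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    delta_lon = math.radians(lon2 - lon1)
    cos_angle = (math.sin(phi1) * math.sin(phi2)
                 + math.cos(phi1) * math.cos(phi2) * math.cos(delta_lon))
    # Added safeguard, not in the original SQL: keep acos() in its domain
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return EARTH_RADIUS_MILES * math.acos(cos_angle)


# Same pair of points as the benchmark scripts above
print(distance_in_miles(45.489614, -122.650021, 44.94404, -123.025739))
```

For the benchmark points this should come out at roughly 41.9 miles, in line with the figures reported for both scripts.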

TSQL - get closest coordinate on linestring to a point

Consider the overly simplistic example: POINT(0 0) and LINESTRING (1 -10, 1 10)
The closest point on the line to the POINT would be 1, 0.
How would one determine this in TSQL? My simple, not entirely accurate, approach was to make a linestring (POINT POINT) and extend out the X coord of one of its points until the two linestrings intersected.
So:
linestring (0 0, 0.25 0) (no intersect)
linestring (0 0, 0.5 0) (no intersect)
linestring (0 0, 0.75 0) (no intersect)
linestring (0 0, 1 0) (intersection - so 1 0 is the point closest to POINT)
This quasi worked, but doesn't seem to be the best or most performant way of accomplishing this.
For example, one inefficiency is that I move it one direction (positive increments), and if there was no match (after x attempts), then I would start over, but with negative increments.
To optimize, I tried moving in larger steps, then when intersected (probably went past the point), I backed off 1 increment and started from there with a smaller increment. I did this a couple of times - instead of going in tiny tiny increments so as not to overshoot by too much.
One acceptable assumption based on my processing is that the POINT will be next to (left/right of) the LINESTRING.
Another acceptable assumption is that the LINESTRING will be fairly "perpendicular" to the POINT.
I think you can do this mathematically rather than with a brute-force iterative algorithm.
There is a post to get closest point to a line that describes the method.
I converted this method to SQL which returns the correct value (1,0). Your 'trivial' example is actually a bit of an edge case (vertical line with infinite slope) so it seems robust.
I also tested the source code with this example: https://www.desmos.com/calculator/iz07az84f5 and using the input for the line of (-1,2) (3,0) and a point at (2,2) got the correct answer (1.4, 0.8).
SQL code (also in SQL Fiddle at http://sqlfiddle.com/#!6/d87aa/15)
DECLARE @x int, @y int, @x1 int, @y1 int, @x2 int, @y2 int
DECLARE @atb2 float, @atp_dot_atb float
DECLARE @t float
--SELECT @x=0, @y=0
--SELECT @x1=1, @y1=10, @x2=1, @y2=-10
SELECT @x=2, @y=2
SELECT @x1=-1, @y1=2, @x2=3, @y2=0
SELECT @atb2 = SQUARE(@x2-@x1) + SQUARE(@y2-@y1) -- Basically finding the squared magnitude of a_to_b
SELECT @atp_dot_atb = (@x-@x1)*(@x2-@x1) + (@y-@y1)*(@y2-@y1) -- The dot product of a_to_p and a_to_b
SELECT @t = @atp_dot_atb / @atb2 -- The normalized "distance" from a to your closest point
SELECT @x1 + (@x2-@x1)*@t, @y1 + (@y2-@y1)*@t --Add the distance to A, moving towards B
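The same projection can be sketched in Python to make the steps easier to test. This is an illustration rather than a port of the fiddle: the function name `closest_point_on_segment` is invented, and two things are added that the SQL above does not do, namely clamping t to [0, 1] so the result stays on the segment rather than on the infinite line, and guarding against a zero-length segment.

```python
def closest_point_on_segment(px, py, x1, y1, x2, y2):
    """Closest point to P=(px, py) on the segment A=(x1, y1) to B=(x2, y2)."""
    dx, dy = x2 - x1, y2 - y1
    seg_len_sq = dx * dx + dy * dy          # squared magnitude of a_to_b
    if seg_len_sq == 0:                     # A and B coincide (added guard)
        return (float(x1), float(y1))
    # Dot product of a_to_p and a_to_b, normalised by |a_to_b|^2
    t = ((px - x1) * dx + (py - y1) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))               # added: stay within the segment
    return (x1 + dx * t, y1 + dy * t)


print(closest_point_on_segment(0, 0, 1, 10, 1, -10))  # (1.0, 0.0)
print(closest_point_on_segment(2, 2, -1, 2, 3, 0))    # approximately (1.4, 0.8)
```

For a LINESTRING with more than two points, one option would be to run this per segment and keep the candidate with the smallest distance to the point.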

How do I control the datatype of a computed column?

Using SQL Server 2012...
I have two columns:
Price [decimal(28,12)]
OustandingShares [decimal(38,3)] -- The 38 is overkill but alas, not my call.
When I do an ALTER TABLE I get a resulting computed column as a [decimal(38,6)]. I need the datatype to be [decimal(28,12)].
ALTER TABLE [xyz].MyTable
ADD Mv AS OustandingShares * Price
How can I effectively get 12 decimals of scale on this computed column? I've tried doing convert on the OutstandingShares to 12 decimal places as well as wrapping a convert around the OutstandingShares * Price. The only thing I get is a computed field at [decimal(28,12)] with six trailing zeros.
Thoughts?
The Fix
This does what you want:
CONVERT(DECIMAL(28,12), (
CONVERT(DECIMAL(15, 3), [OustandingShares])
* CONVERT(DECIMAL(24, 12), [Price])
)
)
Test with this:
SELECT CONVERT(DECIMAL(28,12),
(CONVERT(DECIMAL(24,12), 5304.987781883689)
* CONVERT(DECIMAL(15,3), 3510.88)));
Result:
18625175.503659806036
The Reason
The computation is being truncated due to SQL Server's rules for how to handle Precision and Scale across various operations. These rules are detailed in the MSDN page for Precision, Scale, and Length. The details we are interested in for this case are:
Operation: e1 * e2
Result precision: p1 + p2 + 1
Result scale *: s1 + s2
Here the datatypes in play are:
DECIMAL(28, 12)
DECIMAL(38, 3)
This should result in:
Precision = (28 + 38 + 1) = 67
Scale = 15
But the max length of the DECIMAL type is 38. So what gives? We now need to notice that there was a footnote attached to the "Result scale" calculation, being:
* The result precision and scale have an absolute maximum of 38. When a result precision is greater than 38, the corresponding scale is reduced to prevent the integral part of a result from being truncated.
So it seems that in order to get the Precision back down to 38 it chopped off 9 decimal places, taking the Scale from 15 down to the 6 you are seeing.
And this is why my proposed fix works. I kept the "Scale" values the same as we don't want to truncate going in and expanding them serves no purpose as SQL Server will expand the Scale as appropriate. The key is in reducing the Precision so that the truncation would be non-existent or at least minimal.
With DECIMAL(15, 3) and DECIMAL(24, 12) we should get:
Precision = (15 + 24 + 1) = 40
Scale = 15
40 is over the limit, so reduce by 2 to get down to 38, which means reducing the Scale by 2, leaving us with a true "Result Scale" of 13, which is 1 more than we need or will even see.
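The arithmetic above can be checked with a short Python sketch of the multiplication rule. It encodes my reading of the documented behaviour: when the raw precision exceeds 38, the scale is cut just enough to keep the integral digits, but never below 6 (unless it was already smaller). The function name `multiply_type` is made up for this example, and the scale floor of 6 is an assumption taken from the documentation rather than from the text above.

```python
MAX_PRECISION = 38
MIN_SCALE = 6  # assumed floor: scale is not reduced below 6 by this rule


def multiply_type(p1, s1, p2, s2):
    """Result (precision, scale) for decimal(p1,s1) * decimal(p2,s2)."""
    precision = p1 + p2 + 1
    scale = s1 + s2
    if precision > MAX_PRECISION:
        integral_digits = precision - scale
        # Give up scale to protect the integral part, down to MIN_SCALE
        scale = min(scale, max(MIN_SCALE, MAX_PRECISION - integral_digits))
        precision = MAX_PRECISION
    return precision, scale


print(multiply_type(28, 12, 38, 3))  # (38, 6)  the original columns
print(multiply_type(24, 12, 15, 3))  # (38, 13) the narrowed operands
```

With the original column types the sketch lands on decimal(38,6), which is what the question reports, and with the narrowed operands it lands on a scale of 13.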
Use cast() or convert(). Something like:
ALTER TABLE [xyz].MyTable ADD Mv AS cast(OustandingShares * Price as decimal(12, 6))
or whatever type you want it to be.
EDIT:
Oh, I think I'm getting the idea. The problem is the calculation itself. In that case, do the conversion before the multiplication, so you don't have to depend on SQL Server's (arcane) rules for conforming decimal types.
ALTER TABLE [xyz].MyTable
ADD Mv AS cast(OustandingShares as decimal(28, 12)) * cast(Price as decimal(28, 12))
I believe what is happening in your case is that the maximum precision on the calculated result exceeds the allowed thresholds, so the scale is reduced accordingly. This is explained at the bottom of this page.

Types and rounds in SQL Server

I have the following query:
DECLARE @A as numeric(36,14) = 480
DECLARE @B as numeric(36,14) = 1
select @B/@A
select cast(@B as decimal)/cast(@A as decimal)
Why does the first calculation return 0.002083 and the second one return 0.00208333333333333?
Isn't numeric(36,14) good enough to have a good precision (just as the second query)?
If I use only numeric, instead of numeric(36,14), I have a good precision again:
select cast(@B as numeric)/cast(@A as numeric)
You can calculate precision and scale by yourself using this documentation from SQL Server Books online.
I tried to calculate precision and scale for your case (operation=division, p=36, s=14) and I got some pretty strange results...
precision of the result: [p1 - s1 + s2 + max(6, s1 + p2 + 1)] -> 36-14+14+max(6,14+36+1)=36+51=87
scale of the result : [max(6, s1 + p2 + 1)] -> max(6,14+36+1)=51
In this situation precision is greater than 38 and in this case (as stated in the documentation)
* The result precision and scale have an absolute maximum of 38. When a result precision is greater than 38, the corresponding scale is reduced to prevent the integral part of a result from being truncated.
scale must be reduced by (87-38=) 49, that is (51-49=) 2 ...
I think the minimum scale is 6 (because of the expression scale = max(6, s1 + p2 + 1)) and it can't be reduced below 6, which is what we see in the result (0.002083).
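Here is a small Python sketch of the division rule as I understand it from the documentation, including the floor of 6 on the reduced scale that the paragraph above suspects. The function name `divide_type` is invented for the example, and the floor is an assumption, so treat the output as a plausibility check rather than a specification.

```python
MAX_PRECISION = 38
MIN_SCALE = 6  # assumed floor on the reduced scale


def divide_type(p1, s1, p2, s2):
    """Result (precision, scale) for decimal(p1,s1) / decimal(p2,s2)."""
    scale = max(6, s1 + p2 + 1)
    precision = p1 - s1 + s2 + scale
    if precision > MAX_PRECISION:
        integral_digits = precision - scale
        # Reduce scale to protect the integral part, but not below MIN_SCALE
        scale = min(scale, max(MIN_SCALE, MAX_PRECISION - integral_digits))
        precision = MAX_PRECISION
    return precision, scale


print(divide_type(36, 14, 36, 14))  # (38, 6)   numeric(36,14) operands
print(divide_type(18, 0, 18, 0))    # (37, 19)  plain decimal, i.e. decimal(18,0)
```

Under these assumptions the numeric(36,14) division ends up with only 6 decimal places, while the plain decimal (18,0) operands stay under the 38-digit limit and keep 19, which is consistent with the two results in the question.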
Just to add to the understanding of the problem (going deeper on @Andrey's answer): things can get tricky depending on the order of calculations.
Consider the variables:
DECLARE #A as NUMERIC(36,19) = 100
DECLARE #B as NUMERIC(36,19) = 480
DECLARE #C as NUMERIC(36,19) = 100
Calculating A/B*C
If you want to calculate A/B*C using the formulas, we have:
A/B is of type NUMERIC(38,6) --> as calculated by @Andrey
The result will be 0.208333 (with scale of 6)
Multiplying by 100, we will get 20.833300
Calculating A*C/B
The result of A*C is 10000, of type NUMERIC(38,6). Dividing by B, the result will be 20.833333, of type NUMERIC(38,6).
Then, the result may vary depending on the order of calculation (the same problem was pointed in https://dba.stackexchange.com/questions/77664/how-does-sql-server-determine-precision-scale).
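The effect of the ordering can be reproduced with Python's `decimal` module by forcing every intermediate result to 6 decimal places. This is only a model of the behaviour described above: the helper `to_scale_6` is hypothetical, and it truncates, which is an assumption (for these particular numbers rounding would give the same digits).

```python
from decimal import Decimal, ROUND_DOWN

SIX_PLACES = Decimal("0.000001")


def to_scale_6(value: Decimal) -> Decimal:
    """Model an intermediate result being stored as NUMERIC(38,6)."""
    return value.quantize(SIX_PLACES, rounding=ROUND_DOWN)


a, b, c = Decimal(100), Decimal(480), Decimal(100)

divide_first = to_scale_6(to_scale_6(a / b) * c)    # A/B*C
multiply_first = to_scale_6(to_scale_6(a * c) / b)  # A*C/B

print(divide_first)    # 20.833300
print(multiply_first)  # 20.833333
```

Dividing first throws away digits that the later multiplication then magnifies, so doing the multiplication before the division keeps more of the result.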
