Apply STDistance to all rows of table? - sql-server

I'm trying to return all the rows from [Store] that are less than 10 miles from a given point. Table [Store] has a column of type geography.
I understand how to find the distance between two specific points, something like this:
declare @origin geography
select @origin = geography::STPointFromText('POINT(' + CAST(-73.935242 AS VARCHAR(20)) + ' ' + CAST(40.730610 AS VARCHAR(20)) + ')', 4326)
declare @destination geography
select @destination = geography::STPointFromText('POINT(' + CAST(-93.732666 AS VARCHAR(20)) + ' ' + CAST(30.274096 AS VARCHAR(20)) + ')', 4326)
select @origin.STDistance(@destination) / 1609.344 as 'distance in miles'
I'm having trouble applying this logic to a SELECT statement. Instead of getting the distance between @origin and @destination, I would like to get the distance in miles between @origin and Store.Geolocation for all rows.

The STDistance method, called on one geography instance and passed another, returns the distance between the two (in meters for SRID 4326). It can be used with variables, e.g. @Origin.STDistance( @Destination ), with columns, or with a combination of the two, e.g. to find all of the stores within 10 miles of a particular @Origin:
select *
from Store
where @Origin.STDistance( Store.Geolocation ) < 1609.344 * 10.0;
Note: As BenThul pointed out, spatial index handling is a bit fickle. An STDistance compared to a constant is SARGable: @Origin.STDistance( Store.Geolocation ) < 1609.344 * 10.0, but this mathematically equivalent expression is not: @Origin.STDistance( Store.Geolocation ) / 1609.344 < 10.0. This "feature" is documented here.

Related

STDistance – Calculates distance between two objects

I'm trying to calculate the distance between two objects.
declare @p1 geography
declare @p2 geography
SELECT @p1 = WKT from tbl_1 where loc = "school"
SELECT @p2 = WKT from tbl_2 where loc = "school"
select round(@p1.STDistance(@p2)/1000,0) Distance_KM
But I get an error for the column loc:
Invalid column name
This column exists and the data type is geography.
Column WKT is populated using:
UPDATE [dbo].[lbl_1]
SET [WKT] = geography::STPointFromText('POINT(' + CAST([Longitude] AS VARCHAR(20)) + ' ' + CAST([Latitude] AS VARCHAR(20)) + ')', 4326)
GO
What's wrong?
Your string literal is incorrect.
In SQL, string literals take single quotes, in other words 'school' and not "school".
With double quotes, SQL Server treats "school" as a column name (a quoted identifier) rather than a string literal, which is why it reports an invalid column name.

Issue calculating distances in google maps only in Panama

I have encountered a problem that is driving me crazy.
A few years ago I developed a browser app that calculates the distance from one given point (latitude and longitude coordinates) to other given points.
Everything worked fine until a few days ago, when a client from Panama started working with us. The same SQL procedure that has worked for years is giving us wrong measurements.
This is the SQL formula:
(Acos(Sin((Ofd.Latitud * PI()) / 180) * Sin((@Longitud * PI()) / 180) + Cos((Ofd.Latitud * PI()) / 180) * Cos((@Longitud * PI()) / 180) * Cos((Ofd.Logitud * PI() / 180) - (@Latitud * PI()) / 180)) * 6371 * 1000) AS Distance
I tried to calculate the distance using the geography method available since SQL Server 2008:
DECLARE @Latitude float = 8.9749377
DECLARE @Longitude float = -79.5060562
DECLARE @TLatitude float = 8.9868425
DECLARE @TLongitude float = -79.5012872
DECLARE @Source geography
DECLARE @Target geography
SET @Source = geography::STPointFromText('POINT(' + CAST(@Latitude as varchar(20)) + ' ' + CAST(@Longitude as varchar(20)) + ')',4326)
SET @Target = geography::STPointFromText('POINT(' + CAST(@TLatitude as varchar(20)) + ' ' + CAST(@TLongitude as varchar(20)) + ')',4326)
SELECT @Source.STDistance(@Target)
The difference between the two methods is negligible, a few meters. The distance the method returns is ~500 m.
The problem is that the real distance is almost 1,500 meters; I measured it in Google Maps. The funny thing is that this problem only happens in Panama. With the clients in Spain we have no problem calculating the distance.
Have I found the Bermuda Triangle?
You have the latitude and longitude reversed. WKT POINT coordinates are ordered X, Y (longitude, latitude).
DECLARE @Latitude float = 8.9749377
DECLARE @Longitude float = -79.5060562
DECLARE @TLatitude float = 8.9868425
DECLARE @TLongitude float = -79.5012872
DECLARE @Source geography
DECLARE @Target geography
SET @Source = geography::STPointFromText('POINT(' + CAST(@Longitude as varchar(20)) + ' ' + CAST(@Latitude as varchar(20)) + ')',4326)
SET @Target = geography::STPointFromText('POINT(' + CAST(@TLongitude as varchar(20)) + ' ' + CAST(@TLatitude as varchar(20)) + ')',4326)
SELECT @Source.STDistance(@Target)
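To see how much the coordinate order matters for these two points, here is a minimal Python sketch. It is an illustration only: it uses a spherical haversine formula with the 6371 km radius from the question's formula, not the ellipsoidal calculation that STDistance performs, and the function name haversine_m is made up for this example.

```python
import math

EARTH_RADIUS_M = 6371 * 1000  # same mean radius as the formula in the question


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


source = (8.9749377, -79.5060562)  # (latitude, longitude) from the question
target = (8.9868425, -79.5012872)

# Correct order: latitude and longitude passed in their proper slots.
correct = haversine_m(source[0], source[1], target[0], target[1])
# Swapped order: longitude treated as latitude and vice versa, as in POINT(lat lon).
swapped = haversine_m(source[1], source[0], target[1], target[0])

print(f"correct order: {correct:.0f} m")
print(f"swapped order: {swapped:.0f} m")
```

With the values swapped, -79.5 is read as a latitude, which places both points near the South Pole where degrees of longitude are much shorter. That is consistent with the much smaller distance reported in the question.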

SQL Server: Calculate the Radius of a Lat/Long?

Say I have the latitude and longitude of a city and I need to find all the airports that are within 100 miles of this location. How would I accomplish this? My data resides in SQL Server. One table has all the city info with lat and long, and the other has the airport info with lat and long.
First, convert the city's coordinates to a geography point:
DECLARE @point geography;
SELECT @point = geography::STPointFromText('POINT(' + CAST(@lon AS VARCHAR(20)) + ' ' +
CAST(@lat AS VARCHAR(20)) + ')', 4326);
where @lat and @lon are the latitude and longitude of the city in question. Note that WKT expects longitude first, then latitude.
Then you can query the table:
SELECT [column1],[column2],[etc]
FROM [table]
WHERE @point.STBuffer(160934.4).STIntersects(geography::STPointFromText(
'POINT(' + CAST([lon] AS VARCHAR(20)) + ' ' +
CAST([lat] AS VARCHAR(20)) + ')', 4326) ) = 1;
where 160934.4 is the number of meters in 100 miles.
This will be slow, though. If you wanted to do even more spatial work, you could add a persisted computed column (because lat and lon points aren't really going to change) and then use a spatial index.
ALTER TABLE [table]
ADD geo_point AS geography::STPointFromText('POINT(' + CAST([lon] AS VARCHAR(20))
+ ' ' + CAST([lat] AS VARCHAR(20)) + ')', 4326) PERSISTED;
CREATE SPATIAL INDEX spix_table_geopt
ON [table](geo_point); -- BOUNDING_BOX applies only to geometry columns, so it is omitted for geography
I used/wrote this several years ago, and it was close enough for what I needed. Part of the formula takes into account the curvature of the earth if I remember correctly, but it has been a while. I used zip codes, but you could easily adapt for cities instead - same logic.
ALTER PROCEDURE [dbo].[sp_StoresByZipArea] (@zip nvarchar(5), @Radius float) AS
DECLARE @LatRange float
DECLARE @LongRange float
DECLARE @LowLatitude float
DECLARE @HighLatitude float
DECLARE @LowLongitude float
DECLARE @HighLongitude float
DECLARE @iStartLat float
DECLARE @iStartLong float
SELECT @iStartLat = Latitude, @iStartLong = Longitude FROM ZipCodes WHERE ZipCode = @zip
-- (6076 ft per nautical mile / 5280 ft per mile) * 60 nautical miles per degree = miles per degree of latitude
-- the decimal points avoid integer division (6076 / 5280 would evaluate to 1)
SELECT @LatRange = @Radius / ((6076. / 5280.) * 60)
SELECT @LongRange = @Radius / (((cos((@iStartLat * 3.141592653589 / 180)) * 6076.) / 5280.) * 60)
SELECT @LowLatitude = @iStartLat - @LatRange
SELECT @HighLatitude = @iStartLat + @LatRange
SELECT @LowLongitude = @iStartLong - @LongRange
SELECT @HighLongitude = @iStartLong + @LongRange
/** Now you can create a SQL statement which limits the recordset of cities in this manner: **/
SELECT * FROM ZipCodes
WHERE (Latitude <= @HighLatitude) AND (Latitude >= @LowLatitude) AND (Longitude >= @LowLongitude) AND (Longitude <= @HighLongitude)
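The arithmetic in the procedure above can be checked outside the database. The following Python sketch mirrors the same bounding-box calculation; the function name bounding_box and the sample coordinates are assumptions made for illustration. Like the procedure, it returns a rectangle rather than a true circle, and it does not handle the poles or the antimeridian.

```python
import math

# (6076 ft per nautical mile / 5280 ft per mile) * 60 nautical miles per degree
MILES_PER_DEGREE_LAT = (6076.0 / 5280.0) * 60.0  # about 69.05 miles


def bounding_box(lat, lon, radius_miles):
    """Return (low_lat, high_lat, low_lon, high_lon) for a box around (lat, lon)."""
    lat_range = radius_miles / MILES_PER_DEGREE_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    lon_range = radius_miles / (math.cos(math.radians(lat)) * MILES_PER_DEGREE_LAT)
    return (lat - lat_range, lat + lat_range, lon - lon_range, lon + lon_range)


# Hypothetical center point at latitude 40, longitude -75, with a 100-mile radius.
low_lat, high_lat, low_lon, high_lon = bounding_box(40.0, -75.0, 100.0)
print(low_lat, high_lat, low_lon, high_lon)
```

Rows that fall inside the box may still be slightly farther than the radius (the corners of the box), so an exact distance check on the filtered rows is a reasonable second step.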

Making this spatial query more efficient

I have 2 tables:
tZipCodeNoCity with ZipCode and PointGeography
and MBLPosition with Latitude and Longitude
In this query I'm finding the closest ZipCode to each of my positions. It's "poor man's" geocoding :)
How do I write this query so I don't have to do this SELECT TOP 1 inline?
It's pretty slow even with 150 devices (about 20 seconds).
I had to add a 150-mile radius to this subselect to make it faster.
SELECT LastPositions.DeviceId, P.Description, P.Latitude, P.Longitude, P.Speed, P.DeviceTime,
(
SELECT TOP 1 ZC.ZipCode
FROM dbo.tZipCodeNoCity ZC
WHERE ZC.PointGeography.STDistance(geography::STPointFromText('POINT(' + CAST(P.Longitude AS VARCHAR(20)) + ' ' + CAST(P.Latitude AS VARCHAR(20)) + ')', 4326)) < 150 * 1609.344
ORDER BY ZC.PointGeography.STDistance(geography::STPointFromText('POINT(' + CAST(P.Longitude AS VARCHAR(20)) + ' ' + CAST(P.Latitude AS VARCHAR(20)) + ')', 4326))
)
FROM dbo.MBLPosition P
INNER JOIN
(
SELECT D.DeviceId, MAX(P.PositionKey) LastPositionKey
FROM dbo.MBLPosition P
INNER JOIN IDATTApplication.dbo.MBLDevice D ON P.DeviceKey = D.DeviceKey
GROUP BY D.DeviceId
) LastPositions ON P.PositionKey = LastPositions.LastPositionKey
In a project I worked on about 12 years ago, I ran a query along these lines to reduce the list of possibilities before doing the actual distance calculation:
WHERE zip.lat < my.lat + 0.5 AND zip.lat > my.lat - 0.5
AND zip.long < my.long + 0.5 AND zip.long > my.long - 0.5
From that subset, I calculate the actual distance between the two points and sort on it. You'll have to adjust the "0.5" portion as appropriate to get a big enough box to be sure you're going to get a hit.
And I would imagine that there's a better way than STPointFromText to create your point object. Could you use STPointFromWKB? Could you convert to the geography type once?
See this page for an example of creating your point via SET.
-- @Latitude and @Longitude are placeholders for the coordinates of one position
DECLARE @Latitude float, @Longitude float;
DECLARE @p geography;
SET @p = geography::STGeomFromText('POINT(' + CAST(@Longitude AS VARCHAR(20)) + ' ' + CAST(@Latitude AS VARCHAR(20)) + ')', 4326);
SELECT TOP 1 ZC.ZipCode
FROM dbo.tZipCodeNoCity ZC
WHERE ZC.PointGeography.STDistance(@p) < 150 * 1609.344
ORDER BY ZC.PointGeography.STDistance(@p)
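The "cheap box first, exact distance second" idea from this answer can be sketched in Python as follows. This is only an illustration of the two-step approach: the function names, the labels A/B/C, and their coordinates are made up, and the distance uses a spherical haversine formula rather than SQL Server's STDistance.

```python
import math

EARTH_RADIUS_MILES = 6371.0 / 1.609344  # mean Earth radius converted to miles


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def nearest_zip(position, zip_points, box_degrees=0.5):
    """Return the code of the nearest point inside the box, or None if the box is empty."""
    lat, lon = position
    # Step 1: cheap prefilter on plain latitude/longitude comparisons.
    candidates = [
        (code, zlat, zlon)
        for code, zlat, zlon in zip_points
        if abs(zlat - lat) < box_degrees and abs(zlon - lon) < box_degrees
    ]
    if not candidates:
        return None
    # Step 2: exact distance only for the few rows that survived the prefilter.
    return min(candidates, key=lambda z: haversine_miles(lat, lon, z[1], z[2]))[0]


# Hypothetical (code, latitude, longitude) rows, for illustration only.
points = [("A", 40.75, -74.00), ("B", 40.60, -73.99), ("C", 34.09, -118.41)]
print(nearest_zip((40.73, -73.93), points))
```

If the box comes back empty, the caller has to retry with a larger box_degrees, which is the trade-off the answer mentions when it says the "0.5" may need adjusting.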

How do I easily find IDENTITY columns in danger of overflowing?

My database is getting old, and one of my biggest INT IDENTITY columns has a value around 1.3 billion. This will overflow around 2.1 billion. I plan on increasing its size, but I don't want to do it too soon because of the number of records in the database. I may replace my database hardware before I increase the column size, which could offset any performance problems this could cause. I also want to keep an eye on all the other columns in my databases that are more than 50% full. It's a lot of tables, and checking each one manually is not practical.
This is how I am getting the value now (I know the value returned may be slightly out of date, but it's good enough for my purposes):
PRINT IDENT_CURRENT('MyDatabase.dbo.MyTable')
Can I use the INFORMATION_SCHEMA to get this information?
You can consult the sys.identity_columns system catalog view:
SELECT
name,
seed_value, increment_value, last_value
FROM sys.identity_columns
This gives you the name, seed, increment and last value for each column. The view also contains the data type, so you can easily figure out which identity columns might be running out of numbers soonish...
I created a stored procedure to solve this problem. It uses the INFORMATION_SCHEMA to find the IDENTITY columns, and then uses IDENT_CURRENT and the column's DATA_TYPE to calculate the percent full. Specify the database as the first parameter, and then optionally the minimum percent and data type.
EXEC master.dbo.CheckIdentityColumns 'MyDatabase' --all
EXEC master.dbo.CheckIdentityColumns 'MyDatabase', 50 --columns 50% full or greater
EXEC master.dbo.CheckIdentityColumns 'MyDatabase', 50, 'int' --only int columns
Example output:
Table Column Type Percent Full Remaining
------------------------- ------------------ ------- ------------ ---------------
MyDatabase.dbo.Table1 Table1ID int 9 1,937,868,393
MyDatabase.dbo.Table2 Table2ID int 5 2,019,944,894
MyDatabase.dbo.Table3 Table3ID int 9 1,943,793,775
I created a reminder to check all my databases once per month, and I log this information in a spreadsheet.
CheckIdentityColumns Procedure
USE master
GO
CREATE PROCEDURE dbo.CheckIdentityColumns
(
@Database AS NVARCHAR(128),
@PercentFull AS TINYINT = 0,
@Type AS VARCHAR(8) = NULL
)
AS
--this procedure assumes you are not using negative numbers in your identity columns
DECLARE @Sql NVARCHAR(MAX)
SET @Sql =
'USE ' + @Database + '
SELECT
[Column].TABLE_CATALOG + ''.'' +
[Column].TABLE_SCHEMA + ''.'' +
[Table].TABLE_NAME AS [Table],
[Column].COLUMN_NAME AS [Column],
[Column].DATA_TYPE AS [Type],
CAST((
CASE LOWER([Column].DATA_TYPE)
WHEN ''tinyint''
THEN (IDENT_CURRENT([Table].TABLE_NAME) / 255)
WHEN ''smallint''
THEN (IDENT_CURRENT([Table].TABLE_NAME) / 32767)
WHEN ''int''
THEN (IDENT_CURRENT([Table].TABLE_NAME) / 2147483647)
WHEN ''bigint''
THEN (IDENT_CURRENT([Table].TABLE_NAME) / 9223372036854775807)
WHEN ''decimal''
THEN (IDENT_CURRENT([Table].TABLE_NAME) / (POWER(CAST(10 AS FLOAT), [Column].NUMERIC_PRECISION) - 1))
END * 100) AS INT) AS [Percent Full],
REPLACE(CONVERT(VARCHAR(19), CAST(
CASE LOWER([Column].DATA_TYPE)
WHEN ''tinyint''
THEN (255 - IDENT_CURRENT([Table].TABLE_NAME))
WHEN ''smallint''
THEN (32767 - IDENT_CURRENT([Table].TABLE_NAME))
WHEN ''int''
THEN (2147483647 - IDENT_CURRENT([Table].TABLE_NAME))
WHEN ''bigint''
THEN (9223372036854775807 - IDENT_CURRENT([Table].TABLE_NAME))
WHEN ''decimal''
THEN ((POWER(CAST(10 AS FLOAT), [Column].NUMERIC_PRECISION) - 1) - IDENT_CURRENT([Table].TABLE_NAME))
END
AS MONEY) , 1), ''.00'', '''') AS Remaining
FROM
INFORMATION_SCHEMA.COLUMNS AS [Column]
INNER JOIN
INFORMATION_SCHEMA.TABLES AS [Table]
ON [Table].TABLE_NAME = [Column].TABLE_NAME
WHERE
COLUMNPROPERTY(
OBJECT_ID([Column].TABLE_NAME),
[Column].COLUMN_NAME, ''IsIdentity'') = 1 --true
AND [Table].TABLE_TYPE = ''Base Table''
AND [Table].TABLE_NAME NOT LIKE ''dt%''
AND [Table].TABLE_NAME NOT LIKE ''MS%''
AND [Table].TABLE_NAME NOT LIKE ''syncobj_%''
AND CAST(
(
CASE LOWER([Column].DATA_TYPE)
WHEN ''tinyint''
THEN (IDENT_CURRENT([Table].TABLE_NAME) / 255)
WHEN ''smallint''
THEN (IDENT_CURRENT([Table].TABLE_NAME) / 32767)
WHEN ''int''
THEN (IDENT_CURRENT([Table].TABLE_NAME) / 2147483647)
WHEN ''bigint''
THEN (IDENT_CURRENT([Table].TABLE_NAME) / 9223372036854775807)
WHEN ''decimal''
THEN (IDENT_CURRENT([Table].TABLE_NAME) / (POWER(CAST(10 AS FLOAT), [Column].NUMERIC_PRECISION) - 1))
END * 100
) AS INT) >= ' + CAST(@PercentFull AS VARCHAR(4))
IF (@Type IS NOT NULL)
SET @Sql = @Sql + ' AND LOWER([Column].DATA_TYPE) = ''' + LOWER(@Type) + ''''
SET @Sql = @Sql + '
ORDER BY
[Column].TABLE_CATALOG + ''.'' +
[Column].TABLE_SCHEMA + ''.'' +
[Table].TABLE_NAME,
[Column].COLUMN_NAME'
EXECUTE sp_executesql @Sql
GO
GO
Keith Walton has a very comprehensive query that is very good. Here's a little simpler one that is based on the assumption that the identity columns are all integers:
SELECT sys.tables.name AS [Table Name],
last_value AS [Last Value],
MAX_LENGTH,
CAST(cast(last_value as int) / 2147483647.0 * 100.0 AS DECIMAL(5,2))
AS [Percentage of ID's Used],
2147483647 - cast(last_value as int) AS Remaining
FROM sys.identity_columns
INNER JOIN sys.tables
ON sys.identity_columns.object_id = sys.tables.object_id
ORDER BY last_value DESC
The results will look like this:
Table Name Last Value MAX_LENGTH Percentage of ID's Used Remaining
My_Table 49181800 4 2.29 2098301847
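The percentage and remaining-value arithmetic used by both queries can be sketched in Python as below. This is a minimal illustration under stated assumptions: the function name identity_usage is made up, identity values are assumed to be non-negative, and decimal/numeric columns are assumed to have scale 0 so that their maximum is 10^precision - 1.

```python
# Largest positive value each integer identity type can hold in SQL Server.
MAX_VALUES = {
    "tinyint": 255,
    "smallint": 32767,
    "int": 2147483647,
    "bigint": 9223372036854775807,
}


def identity_usage(last_value, data_type, precision=None):
    """Return (percent_used, remaining) for a non-negative identity value."""
    data_type = data_type.lower()
    if data_type in ("decimal", "numeric"):
        if precision is None:
            raise ValueError("precision is required for decimal/numeric")
        # A decimal(p, 0) column tops out at p nines, e.g. 99999 for p = 5.
        max_value = 10 ** precision - 1
    else:
        max_value = MAX_VALUES[data_type]
    return last_value / max_value * 100, max_value - last_value


# Last value taken from the sample result row above.
percent, remaining = identity_usage(49181800, "int")
print(f"{percent:.2f}% used, {remaining:,} remaining")
```

Feeding it the last_value from sys.identity_columns together with the column's type gives the same two numbers as the sample row, which makes it easy to sanity-check a monitoring query.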
Checking Integer Identity Columns
While crafting a solution for this problem, we found this thread both informative and interesting (we also wrote a detailed post about this and described how our tool works).
In our solution we query the information_schema to get a list of all columns (the query below targets MySQL). Then we wrote a program that goes through each of them and computes the maximum and minimum (we account for both overflow and underflow).
SELECT
b.COLUMN_NAME,
b.COLUMN_TYPE,
b.DATA_TYPE,
b.signed,
a.TABLE_NAME,
a.TABLE_SCHEMA
FROM (
-- get all tables
SELECT
TABLE_NAME, TABLE_SCHEMA
FROM information_schema.tables
WHERE
TABLE_TYPE IN ('BASE TABLE', 'VIEW') AND
TABLE_SCHEMA NOT IN ('mysql', 'performance_schema')
) a
JOIN (
-- get information about columns types
SELECT
TABLE_NAME,
COLUMN_NAME,
COLUMN_TYPE,
TABLE_SCHEMA,
DATA_TYPE,
(!(LOWER(COLUMN_TYPE) REGEXP '.*unsigned.*')) AS signed
FROM information_schema.columns
) b ON a.TABLE_NAME = b.TABLE_NAME AND a.TABLE_SCHEMA = b.TABLE_SCHEMA
ORDER BY a.TABLE_SCHEMA DESC;

Resources