Storing 'Point' column from ShapeFile - sql-server

I have a Shapefile (*.shp) which I am loading into the database. I have a column called "Point" which stores the data as shapes. For example:
POLYGON ((1543297.7815 5169880.9468, 1543236.7046 5169848.3834,
1543195.0218 5169930.2767, 1543104.4989 5170101.6818,
1543056.805 5170191.9835, 1542969.1187 5170358.1396,
1542820.9656 5170638.8525, 1542820.6605 5170639.7223,
1542816.1912 5170647.8707, 1543158.2618 5170829.6437,
1543318.4126 5170915.6562, 1543559.2078 5171043.8001,
1543840.2014 5171192.4698, 1544108.917 5171336.1306,
1544271.7972 5171422.313, 1544357.0262 5171263.5454,
1544447.9779 5171091.3804, 1544468.04 5171054.3179,
1544529.7931 5170936.192, 1544583.3416 5170837.5321,
1544658.3376 5170696.5608, 1544699.0638 5170622.0859,
1543985.6169 5170245.4526, 1543618.4129 5170050.7422,
1543297.7815 5169880.9468))
The data type of the Column "Point" is nvarchar(max).
The problem is when the size of Polygon exceeds , the column truncates and does not store all the values. I can't convert Points into Geometry as I want to convert Points into Lat/long from Polygon.

I'd suggest storing the whole polygon as a geometry type. If/when you need to "convert" it to geography, use the STNumPoints and STPointN methods to extract the individual points in sequence and convert them as appropriate.
Speaking of the conversion, what format are your data in now? I'm not seeing lat/long info there, but perhaps I'm missing something.
Edit: Here's a solution that I just coded.
use tempdb;
-- Tally (numbers) table holding 1..65536
create table tally (i int not null);
with
a as (select 1 as [i] union select 0),
b as (select 1 as [i] from a as [a1] cross join a as [a2]),
c as (select 1 as [i] from b as [a1] cross join b as [a2]),
d as (select 1 as [i] from c as [a1] cross join c as [a2]),
e as (select 1 as [i] from d as [a1] cross join d as [a2])
insert into tally
select row_number() over (order by i) from e;
create unique clustered index [CI_Tally] on tally (i);
-- Store the whole polygon as geometry
create table ace (g geometry);
insert into ace (g)
values (geometry::STGeomFromText(<<your polygon string here>>, 0));
-- One row per vertex; STPointN is 1-based
select i, g.STPointN(t.i), g.STPointN(t.i).STAsText()
from ace as [a]
cross join tally as [t]
where t.i <= g.STNumPoints();
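If the per-point conversion ends up happening outside the database, the same enumeration can be sketched in Python. This is only an illustration: the helper name polygon_points is made up, it assumes a POLYGON with a single ring, and the sample string is a shortened triangle built from the first three vertices of the polygon above, not the full shape.

```python
import re

def polygon_points(wkt):
    """Return the vertices of a single-ring WKT POLYGON as (x, y) floats."""
    match = re.fullmatch(
        r"\s*POLYGON\s*\(\(\s*(.*?)\s*\)\)\s*",
        wkt,
        flags=re.IGNORECASE | re.DOTALL,
    )
    if match is None:
        raise ValueError("expected a POLYGON with exactly one ring")
    points = []
    for pair in match.group(1).split(","):
        x, y = pair.split()  # coordinates are separated by whitespace
        points.append((float(x), float(y)))
    return points

# Shortened sample (assumption): first three vertices of the polygon, then closed.
wkt = """POLYGON ((1543297.7815 5169880.9468, 1543236.7046 5169848.3834,
1543195.0218 5169930.2767, 1543297.7815 5169880.9468))"""

points = polygon_points(wkt)
# Number the points from 1, as STPointN does.
for i, (x, y) in enumerate(points, start=1):
    print(i, x, y)
```

Each (x, y) pair can then be passed to whatever projection-to-lat/long routine fits the source coordinate system, which the question does not specify.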

Related

Rows to columns without PIVOT in SQL Server

I have 3 tables which contain this data:
Table 1:
Table 2:
Table 3:
Output:
I have tried using Pivot but it has to have an aggregate function in it.
SELECT
project_code, project_name, fk_prj_project_id,
[A], [B], [C], [D]
FROM
(SELECT
project_code, project_name, employee_name,
fk_prj_project_id, fk_prj_project_id AS nm,
activity_details
FROM
PRJ_MST_PROJECT AS a
LEFT JOIN
PRJ_TNS_DAILY_SUMMARY AS b ON a.pk_prj_project_id = b.fk_prj_project_id
LEFT JOIN
HRM_EMP_MST_EMPLOYEE AS c ON b.fk_hrm_emp_employee_id = c.pk_hrm_emp_employee_id
WHERE
a.project_status = 0
AND b.transaction_status = 1
AND CONVERT(date, b.transaction_date, 103) = CONVERT(date, '15/04/2021', 103)) x
PIVOT
(MAX(nm)
FOR nm IN ([A], [B], [C], [D])
) p
The problem is you set your PIVOT to look for values of nm in A, B, C, and D, but nm is an alias for fk_prj_project_id, which has possible values of 1, 2, 3, 4, and 5. So there are no A, B, C, or D values to be had. I don't even see a name for the column that holds A, B, C, and D, but whatever column that is needs to be what you put in the "FOR ___ IN" section of your pivot.
Test your query by commenting out the reference to the pivot columns in the SELECT and comment out the word PIVOT and everything after it and re-run your query. You should see some column with values A, B, C, D. If you don't, fix your query so you do. Once you do, that column is what you PIVOT on (put it between FOR and IN in the pivot block).
Oh, and if you provide data in a usable format, people might run your query and give you directly usable results. It's a lot to ask people to type in your data before they can help, so meet them halfway. A link to sqlfiddle is ideal, but even just a bunch of DECLARE @T1 and INSERT INTO @T1 VALUES statements is usually enough to get significantly better help.
EDIT:
Nice job with the Fiddle!
OK, so using your data, we can test out actual queries. For PIVOT to work, we need a column to look up (employee_name), a column to aggregate (activity_details), and some columns that will be constant across the rows produced (the project's name and code). You're working with text, not numbers, so your aggregation can't be mathematical, leaving you with pretty much just MAX or MIN. To make sure you get the right (newest) one, I first built a table of comments and numbered them by how new they were, then I picked just the newest comment for each (project, user) pair. cteCommentsNewest is the result of that.
Now with a clean (and verified) table to pivot, the actual pivot syntax is simple. Well, as simple as PIVOT can be. It's inherently pretty confusing IMHO, but structuring it this way keeps the actual PIVOT as clean as possible.
Note that the query appears twice. I tested it as a static query before converting it to dynamic, because a static query is much easier to troubleshoot, and then left it in, in case you want to experiment with it. You don't need it for the final solution to work.
Here's the final code, fully tested and producing the specified output:
DECLARE @cols3 AS NVARCHAR(MAX)
DECLARE @query3 AS NVARCHAR(MAX)=''
DECLARE @dt varchar(100)='14/04/2021'
select @cols3 = STUFF((SELECT ',' + QUOTENAME(employee_name)
from dbo.HRM_EMP_MST_EMPLOYEE
order by employee_name
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
--SELECT @cols3 --Test column list for dynamic query
--Test the core functions of pivot before making dynamic
;with cteCommentsAll as (
SELECT P.project_code , P.project_name, C.activity_details , E.employee_name
, ROW_NUMBER () over (PARTITION BY P.project_code , E.employee_name ORDER BY C.transaction_date DESC) as Newness
FROM dbo.PRJ_MST_PROJECT as P --Projects
LEFT OUTER JOIN dbo.PRJ_TNS_DAILY_SUMMARY as C --Comments on projects
ON P.pk_prj_project_id = C.fk_prj_project_id --Get all projects, then all comments for each project
LEFT OUTER JOIN dbo.HRM_EMP_MST_EMPLOYEE as E --Employees who commented
on E.pk_hrm_emp_employee_id = C.fk_hrm_emp_employee_id
), cteCommentsNewest as (
SELECT project_code , project_name, activity_details , employee_name
FROM cteCommentsAll WHERE Newness = 1 --Only one comment per user per project of CROSS problems
)
SELECT *
FROM cteCommentsNewest as N --TEST up to this point to see the raw table
PIVOT (MAX(activity_details) FOR employee_name IN (A, B, C) ) as P
--Put the working query, modified for dynamic columns, into a variable
set @query3 = N'
;with cteCommentsAll as (
SELECT P.project_code , P.project_name, C.activity_details , E.employee_name
, ROW_NUMBER () over (PARTITION BY P.project_code , E.employee_name ORDER BY C.transaction_date DESC) as Newness
FROM dbo.PRJ_MST_PROJECT as P --Projects
LEFT OUTER JOIN dbo.PRJ_TNS_DAILY_SUMMARY as C --Comments on projects
ON P.pk_prj_project_id = C.fk_prj_project_id --Get all projects, then all comments for each project
LEFT OUTER JOIN dbo.HRM_EMP_MST_EMPLOYEE as E --Employees who commented
on E.pk_hrm_emp_employee_id = C.fk_hrm_emp_employee_id
), cteCommentsNewest as (
SELECT project_code , project_name, activity_details , employee_name
FROM cteCommentsAll WHERE Newness = 1 --Only one comment per user per project of CROSS problems
)SELECT *
FROM cteCommentsNewest as N
PIVOT (MAX(activity_details) FOR employee_name IN (' + @cols3 + ') ) as P
'
exec sp_executesql @query3
which produces the following output:
project_code | project_name | A | B | C
MOA20171 | Project A | some remark By Employee A on 14 | NULL | some remark By Employee C on 14
MOA20172 | Project B | NULL | NULL | some remark By Employee C on 15
MOA20173 | Project C | NULL | NULL | NULL
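For anyone who finds PIVOT hard to picture, the two steps described above (keep the newest comment per project and employee, then spread employees into columns) can be sketched in plain Python. The input rows here are hypothetical, invented only so that the result matches the output shown; they are not taken from the Fiddle.

```python
from datetime import date

# Hypothetical input rows (assumption):
# (project_code, employee_name, transaction_date, activity_details)
comments = [
    ("MOA20171", "A", date(2021, 4, 13), "some remark By Employee A on 13"),
    ("MOA20171", "A", date(2021, 4, 14), "some remark By Employee A on 14"),
    ("MOA20171", "C", date(2021, 4, 14), "some remark By Employee C on 14"),
    ("MOA20172", "C", date(2021, 4, 15), "some remark By Employee C on 15"),
]
projects = ["MOA20171", "MOA20172", "MOA20173"]  # MOA20173 has no comments
employees = ["A", "B", "C"]                      # the pivot column list

# Step 1: keep only the newest comment per (project, employee),
# which is what the "Newness = 1" filter does.
newest = {}
for project, employee, day, text in comments:
    key = (project, employee)
    if key not in newest or day > newest[key][0]:
        newest[key] = (day, text)

# Step 2: pivot. One row per project, one column per employee,
# None (NULL) where that employee left no comment.
pivoted = {
    project: {
        employee: newest.get((project, employee), (None, None))[1]
        for employee in employees
    }
    for project in projects
}

for project, row in pivoted.items():
    print(project, row)
```

Because step 1 already leaves at most one value per cell, the MAX in the SQL pivot has nothing left to choose between; it is only there because PIVOT requires an aggregate.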

Error when joining a polygon table to a point table based on point being within a given polygon

The overall goal that I am attempting to achieve is to join two tables together based on a point (created from xy coordinates in feet) from one table falling within a polygon from another table. The expected result is given records and the name of the polygon it falls within. When the query is executed, the below error is returned, in summary:
A .NET Framework error occurred during execution of user-defined routine or aggregate "geometry":
System.FormatException: 24114: The label 395 in the input well-known text (WKT) is not valid.
Due to not being familiar with geometry data in SQL, I am not fully certain whether I am even on the right path, so any advice would be appreciated.
The polygon table that I am attempting to join is a temporary table and the polygons are formatted as geometry. Running the below code in isolation executes as expected by creating a spatial output.
My apologies, it appears code formatting is not working on my phone and my work computer’s browser is not supported.
DECLARE @Hex1 TABLE (PolyGeom geometry, Hex varchar(6))
INSERT INTO @Hex1
Values
(geometry::STPolyFromText('Polygon((7598795.05553838 734372.656,7598217.70526919 735372.656,7597063.00473081 735372.656,7596485.65446162 734372.656,7597063.00473081 733372.656,7598217.70526919 733372.656,7598795.05553838 734372.656))',0),1),
(geometry::STPolyFromText('Polygon((7602259.15715352 734372.656,7601681.80688433 735372.656,7600527.10634595 735372.656,7599949.75607676 734372.656,7600527.10634595 733372.656,7601681.80688433 733372.656,7602259.15715352 734372.656))',0),2),
(geometry::STPolyFromText('Polygon((7605723.25876865 734372.656,7605145.90849947 735372.656,7603991.20796109 735372.656,7603413.8576919 734372.656,7603991.20796109 733372.656,7605145.90849947 733372.656,7605723.25876865 734372.656))',0),3)
The table that the polygon table is joined to does not have points, so I have created a table that has a field with the calculated points. Running the below code in isolation works as expected by returning records with a point.
INSERT INTO #Points (Primary_Key, geom)
select a.rID, geometry::STGeomFromText('POINT('+convert(varchar(20),a.x_coordinate)+' '+convert(varchar(20),a.y_coordinate)+')',0) as geom
from data_a a
The tables are joined as shown below
WITH CTE1 AS --Due to the number of polygons exceeding insert limits, multiple tables are created and unioned in a CTE
( Select*
From @Hex1
UNION ALL
Select*
From @Hex2
UNION ALL
Select*
From @Hex3
UNION ALL
Select*
From @Hex4)
select a.rID, C.Hex
from data_a a --Existing table with x y coordinates
left join #points p --Joins the point created in points table to the same case in go_data
on a.rID = p.Primary_key
left join CTE1 C --Joins the hexagon to the point if hex contains the point
on p.geom.STIntersects (c.hex) =1
Below is the full code, with the number of polygons in each table trimmed down
IF OBJECT_ID('tempdb..#Points') IS NOT NULL DROP TABLE #Points
create table #Points (Primary_key numeric identity not null, geom geometry)
SET IDENTITY_INSERT #Points ON
INSERT INTO #Points (Primary_Key, geom)
select g.rin, geometry::STGeomFromText('POINT('+convert(varchar(20),a.x_coordinate)+' '+convert(varchar(20),a.y_coordinate)+')',0) as geom
from data_a a
;
DECLARE @Hex1 TABLE
(PolyGeom geometry, Hex varchar(6))
INSERT INTO @Hex1
Values
(geometry::STPolyFromText('Polygon((7598795.05553838 734372.656,7598217.70526919 735372.656,7597063.00473081 735372.656,7596485.65446162 734372.656,7597063.00473081 733372.656,7598217.70526919 733372.656,7598795.05553838 734372.656))',0),1),
(geometry::STPolyFromText('Polygon((7602259.15715352 734372.656,7601681.80688433 735372.656,7600527.10634595 735372.656,7599949.75607676 734372.656,7600527.10634595 733372.656,7601681.80688433 733372.656,7602259.15715352 734372.656))',0),2),
(geometry::STPolyFromText('Polygon((7605723.25876865 734372.656,7605145.90849947 735372.656,7603991.20796109 735372.656,7603413.8576919 734372.656,7603991.20796109 733372.656,7605145.90849947 733372.656,7605723.25876865 734372.656))',0),3)
DECLARE @Hex2 TABLE
(PolyGeom geometry, Hex varchar(6))
INSERT INTO @Hex2
Values
(geometry::STPolyFromText('Polygon((7680201.44349411 721372.656,7679624.09322492 722372.656,7678469.39268654 722372.656,7677892.04241735 721372.656,7678469.39268654 720372.656,7679624.09322492 720372.656,7680201.44349411 721372.656))',0),1000),
(geometry::STPolyFromText('Polygon((7683665.54510925 721372.656,7683088.19484006 722372.656,7681933.49430168 722372.656,7681356.14403249 721372.656,7681933.49430168 720372.656,7683088.19484006 720372.656,7683665.54510925 721372.656))',0),1001),
(geometry::STPolyFromText('Polygon((7687129.64672438 721372.656,7686552.29645519 722372.656,7685397.59591681 722372.656,7684820.24564763 721372.656,7685397.59591681 720372.656,7686552.29645519 720372.656,7687129.64672438 721372.656))',0),1002)
DECLARE @Hex3 TABLE
(PolyGeom geometry, Hex varchar(6))
INSERT INTO @Hex3
Values
(geometry::STPolyFromText('Polygon((7765071.93306498 708372.656,7764494.58279579 709372.656,7763339.88225741 709372.656,7762762.53198822 708372.656,7763339.88225741 707372.656,7764494.58279579 707372.656,7765071.93306498 708372.656))',0),1999),
(geometry::STPolyFromText('Polygon((7768536.03468011 708372.656,7767958.68441092 709372.656,7766803.98387254 709372.656,7766226.63360335 708372.656,7766803.98387254 707372.656,7767958.68441092 707372.656,7768536.03468011 708372.656))',0),2000),
(geometry::STPolyFromText('Polygon((7772000.13629525 708372.656,7771422.78602606 709372.656,7770268.08548768 709372.656,7769690.73521849 708372.656,7770268.08548768 707372.656,7771422.78602606 707372.656,7772000.13629525 708372.656))',0),2001)
;WITH CTE1 AS
( Select*
From @Hex1
UNION ALL
Select*
From @Hex2
UNION ALL
Select*
From @Hex3)
select a.rID, C.Hex
from data_a a
left join #points p --Joins the point created in points table to the same case in go_data
on g.rin = p.Primary_key
left join CTE1 C --Joins the hexagon to the point if hex containts the point
on p.geom.STIntersects (c.hex) =1
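Isn't it just a typo? You should intersect the point geometry with the geometry of the polygon (PolyGeom), not the Hex column. Hex is a varchar, so passing it to STIntersects most likely makes SQL Server try to parse its value (395, for example) as WKT, which would explain the error message.
left join CTE1 C --Joins the hexagon to the point if hex contains the point
on p.geom.STIntersects (c.PolyGeom) =1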
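To see what the corrected join condition is testing, here is a rough Python sketch of a point-in-polygon join. It uses a plain ray-casting test, which is a stand-in and not how SQL Server implements STIntersects (in particular, points exactly on a boundary are not handled reliably here). The two hexagons are copied from the question; the point IDs and coordinates are made up.

```python
def point_in_ring(x, y, ring):
    """Ray-casting test: True if (x, y) lies inside the closed ring."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Hexagons 1 and 2 from the question, keyed by their Hex label.
hexagons = {
    "1": [(7598795.05553838, 734372.656), (7598217.70526919, 735372.656),
          (7597063.00473081, 735372.656), (7596485.65446162, 734372.656),
          (7597063.00473081, 733372.656), (7598217.70526919, 733372.656),
          (7598795.05553838, 734372.656)],
    "2": [(7602259.15715352, 734372.656), (7601681.80688433, 735372.656),
          (7600527.10634595, 735372.656), (7599949.75607676, 734372.656),
          (7600527.10634595, 733372.656), (7601681.80688433, 733372.656),
          (7602259.15715352, 734372.656)],
}

# Hypothetical points (assumption): record id -> (x, y) in the same units.
points = {101: (7597640.0, 734500.0), 102: (7601104.0, 734500.0), 103: (0.0, 0.0)}

# The "left join": every point is kept, with the labels of the hexagons containing it.
matches = {
    rid: [label for label, ring in hexagons.items() if point_in_ring(x, y, ring)]
    for rid, (x, y) in points.items()
}
print(matches)
```

The comparison is always geometry against geometry; the Hex label is only carried along as the value to return.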

PostGIS minimum distance between two large sets of points

I have two tables of points in PostGIS, say A and B, and I want to know, for every point in A, what is the distance to the closest point in B. I am able to solve this for small sets of points with the following query:
SELECT a.id, MIN(ST_Distance_Sphere(a.geom, b.geom))
FROM table_a a, table_b b
GROUP BY a.id;
However, I have a couple million points in each table and this query runs indefinitely. Is there a more efficient way to approach this? I am open to getting an approximate distance rather than an exact one.
Edit: A slight modification to the answer provided by JGH to return distances in meters rather than degrees if points are unprojected.
SELECT
a.id, nn.id AS id_nn,
a.geom, nn.geom_closest,
ST_Distance_Sphere(a.geom, nn.geom_closest) AS min_dist
FROM
table_a AS a
CROSS JOIN LATERAL
(SELECT
b.id,
b.geom AS geom_closest
FROM table_b b
ORDER BY a.geom <-> b.geom
LIMIT 1) AS nn;
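For readers wondering why the result differs between degrees and meters: on unprojected lon/lat points, a planar distance is measured in degrees, while a spherical distance is measured along the surface of the Earth. A minimal Python sketch of the spherical (haversine) calculation follows. The function name and the radius value are assumptions; PostGIS may use a slightly different radius internally.

```python
import math

EARTH_RADIUS_M = 6371008.8  # assumed mean Earth radius in meters

def sphere_distance_m(lon1, lat1, lon2, lat2):
    """Great-circle (haversine) distance in meters between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))

# Two points one degree of latitude apart.
planar_degrees = math.hypot(0.0 - 0.0, 1.0 - 0.0)    # planar distance in degrees
meters = sphere_distance_m(0.0, 0.0, 0.0, 1.0)       # distance along the sphere
print(planar_degrees, meters)
```

One degree of longitude covers fewer meters the further the points are from the equator, which is why a distance in degrees cannot simply be scaled by a constant.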
Your query is slow because it computes the distance between every pair of points without using any index. You could rewrite it to use the <-> operator, which can use the spatial index when it appears in the ORDER BY clause.
select a.id,closest_pt.id, closest_pt.dist
from tablea a
CROSS JOIN LATERAL
(SELECT
id ,
a.geom <-> b.geom as dist
FROM tableb b
ORDER BY a.geom <-> b.geom
LIMIT 1) AS closest_pt;

When we go for cross apply and when we go for inner join in SQL Server 2012

I have a small question about SQL Server. When do we use cross apply, and when do we use inner join? Why use cross apply at all in SQL Server?
I have emp, dept tables; based on those two tables, I write an inner join and cross apply query like this:
----using cross apply
SELECT *
FROM Department D
CROSS APPLY
(SELECT *
FROM Employee E
WHERE E.DepartmentID = D.DepartmentID) A
----using inner join
SELECT *
FROM Department D
INNER JOIN Employee E ON D.DepartmentID = E.DepartmentID
Both queries return the same result.
Here why is cross apply needed in SQL Server? Is there performance difference? Can you please tell me?
When will we use cross apply and when inner join? Any performance difference between these queries? Please tell me which is the best way to write this query in SQL Server.
INNER JOIN and CROSS APPLY (likewise LEFT JOIN and OUTER APPLY) are very closely related. In your example I'd assume that the engine will find the same execution plan.
A JOIN is a link between two sets over a condition.
An APPLY is a row-wise sub-call.
But, as mentioned above, the optimizer is very smart and will, at least in such easy cases, understand that both come down to the same thing.
The JOIN will try to collect the sub-set and link it over the specified condition.
The APPLY will try to call the related result with the current row's values over and over.
The differences show up when calling table-valued functions (these should use inline syntax!), with the XML method .nodes(), and in more complex scenarios.
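The distinction between "link two sets over a condition" and "row-wise sub-call" can be pictured with a small Python analogy. This is only a conceptual sketch with made-up department and employee rows; it says nothing about how the optimizer actually executes either form.

```python
# Hypothetical sample rows (assumption).
departments = [
    {"DepartmentID": 1, "Name": "Sales"},
    {"DepartmentID": 2, "Name": "IT"},
    {"DepartmentID": 3, "Name": "Empty"},   # no employees
]
employees = [
    {"DepartmentID": 1, "Name": "Ann"},
    {"DepartmentID": 2, "Name": "Bob"},
    {"DepartmentID": 1, "Name": "Cid"},
]

def inner_join(departments, employees):
    """JOIN: collect one set, then link both sets over the condition."""
    by_department = {}
    for e in employees:
        by_department.setdefault(e["DepartmentID"], []).append(e)
    return [(d["Name"], e["Name"])
            for d in departments
            for e in by_department.get(d["DepartmentID"], [])]

def employees_of(department):
    """The sub-call: evaluated again for each outer row, using that row's values."""
    return [e for e in employees if e["DepartmentID"] == department["DepartmentID"]]

def cross_apply(departments, sub_call):
    """APPLY: call sub_call once per outer row; rows with no result drop out."""
    return [(d["Name"], e["Name"]) for d in departments for e in sub_call(d)]

joined = inner_join(departments, employees)
applied = cross_apply(departments, employees_of)
print(joined == applied)
```

The sub-call can be any function of the outer row, not just an equality filter, and that is the part a plain JOIN condition cannot express.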
One example how one could use APPLY to simulate variables
...to use the result of a row-wise calculation like you'd use a variable:
DECLARE @dummy TABLE(ID INT IDENTITY, SomeString VARCHAR(100));
INSERT INTO @dummy VALUES('Want to split/this at the two/slashes.'),('And/this/also');
SELECT d.ID
,d.SomeString
,pos1
,pos2
,LEFT(d.SomeString,pos1-1)
,SUBSTRING(d.SomeString,pos1+1,pos2-pos1-1)
,SUBSTRING(d.SomeString,pos2+1,1000)
FROM @dummy AS d
CROSS APPLY(SELECT CHARINDEX('/',d.SomeString) AS pos1) AS x
CROSS APPLY(SELECT CHARINDEX('/',d.SomeString,x.pos1+1) AS pos2) AS y
This is the same as the following, but much easier to read (and type):
SELECT d.ID
,d.SomeString
,LEFT(d.SomeString,CHARINDEX('/',d.SomeString)-1)
,SUBSTRING(d.SomeString,CHARINDEX('/',d.SomeString)+1,CHARINDEX('/',d.SomeString,(CHARINDEX('/',d.SomeString)+1))-(CHARINDEX('/',d.SomeString)+1))
,SUBSTRING(d.SomeString,CHARINDEX('/',d.SomeString,(CHARINDEX('/',d.SomeString)+1))+1,1000)
FROM @dummy AS d
One example with XML-method .nodes()
DECLARE @dummy TABLE(SomeXML XML)
INSERT INTO @dummy VALUES
(N'<root>
<a>a1</a>
<a>a2</a>
<a>a3</a>
<b>Here is b!</b>
</root>');
SELECT All_a_nodes.value(N'.',N'nvarchar(max)')
FROM @dummy
CROSS APPLY SomeXML.nodes(N'/root/a') AS A(All_a_nodes);
The result
a1
a2
a3
And one example for an inlined function call
CREATE FUNCTION dbo.TestProduceRows(@i INT)
RETURNS TABLE
AS
RETURN
SELECT TOP(@i) ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS Nr FROM master..spt_values
GO
CREATE TABLE dbo.TestData(ID INT IDENTITY, SomeString VARCHAR(100),Number INT);
INSERT INTO dbo.TestData VALUES
('Show me once',1)
,('Show me twice',2)
,('Me five times!',5);
SELECT *
FROM TestData
CROSS APPLY dbo.TestProduceRows(Number) AS x;
GO
DROP TABLE dbo.TestData;
DROP FUNCTION dbo.TestProduceRows;
The result
1 Show me once 1 1
2 Show me twice 2 1
2 Show me twice 2 2
3 Me five times! 5 1
3 Me five times! 5 2
3 Me five times! 5 3
3 Me five times! 5 4
3 Me five times! 5 5
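The row multiplication in that result is easier to follow when written as a nested loop. Here is a small Python sketch of the same idea; produce_rows is a hypothetical stand-in for the inline function, and the input rows are the three rows inserted above.

```python
def produce_rows(n):
    """Stand-in for dbo.TestProduceRows: yields Nr = 1..n."""
    return range(1, n + 1)

# (ID, SomeString, Number), the rows inserted into dbo.TestData
test_data = [
    (1, "Show me once", 1),
    (2, "Show me twice", 2),
    (3, "Me five times!", 5),
]

# CROSS APPLY: call the function once per outer row, passing that row's Number,
# and emit one output row per value the function returns.
result = [
    (row_id, some_string, number, nr)
    for (row_id, some_string, number) in test_data
    for nr in produce_rows(number)
]

for row in result:
    print(*row)
```

The argument changes with every outer row, which is exactly what an ordinary join against a fixed table cannot do.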

Should I use a cursor for this?

I have a table with three fields. Group number, X-coord and Y-coord. There can be from 0 to about 10 rows within each group number.
What I want to do is calculate the maximum and minimum distance between points within each group. Obviously, this will only give you a value if there are 2 or more rows within that group.
Output should consist of fields: group number, minDistance, maxDistance.
Is a cursor a good solution for this?
(Coordinates are in WGS84 and I have a working formula for calculating distances)
My reasoning for using a cursor is that I cannot avoid doing a cross join for each group and then applying the formula for each result of the cross join.
I wouldn't use a cursor in your situation. I would prefer an inline table-valued user-defined function that takes the group number as its argument and calculates the minimum and maximum distance for that group inside the function.
Please note the calculation algorithm inside the function is much simpler than what you may have.
create table dist (groupId int, X int, Y int)
insert into dist(groupid, x, y) values (1,14,20),(1,11,20),(1,10,22),(1,12,24),(1,11,28),(1,19,78)
insert into dist(groupid, x, y) values (2,10,20),(2,11,20),(2,10,22),(2,12,24),(2,11,28),(2,17,52)
go
create function dbo.getMinMaxDistanceForGroup (@groupId int)
returns table as return (
select MIN(SQRT(SQUARE(b.X - a.X) + SQUARE(b.Y - a.Y))) MinDistance,
MAX(SQRT(SQUARE(b.X - a.X) + SQUARE(b.Y - a.Y))) MaxDistance
from dist a cross join dist b
where a.groupId = @groupId and b.groupId = @groupId
and (a.X <> b.X or a.Y <> b.Y) -- skip a point paired with itself, otherwise MinDistance is always 0
)
go
select groupId, MinDistance, MaxDistance
from dist OUTER APPLY dbo.getMinMaxDistanceForGroup(groupId)
group by groupid, MinDistance, MaxDistance
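As a cross-check on what the set-based query should return, the same calculation can be sketched in Python with the sample rows above. Like the SQL, it uses the simplified planar distance rather than a WGS84 formula, and the function name is made up for the illustration.

```python
from itertools import combinations
from math import hypot

# (groupId, X, Y), the sample rows inserted into dist
rows = [
    (1, 14, 20), (1, 11, 20), (1, 10, 22), (1, 12, 24), (1, 11, 28), (1, 19, 78),
    (2, 10, 20), (2, 11, 20), (2, 10, 22), (2, 12, 24), (2, 11, 28), (2, 17, 52),
]

# Collect the points of each group.
groups = {}
for group_id, x, y in rows:
    groups.setdefault(group_id, []).append((x, y))

def min_max_distance(points):
    """Min and max distance over all pairs of distinct rows; (None, None) if fewer than 2."""
    distances = [hypot(bx - ax, by - ay)
                 for (ax, ay), (bx, by) in combinations(points, 2)]
    if not distances:
        return None, None
    return min(distances), max(distances)

result = {group_id: min_max_distance(points) for group_id, points in groups.items()}
for group_id, (min_d, max_d) in result.items():
    print(group_id, min_d, max_d)
```

With at most about 10 rows per group there are at most 45 pairs per group, so pairing every point with every other point is cheap and no cursor is needed.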
