Implementing geometry_columns view in MS SQL Server - sql-server

(We're using MSSQL Server 2014 as far as I know)
I have never seen a good solution for maintaining a geometry_columns table in MSSQL Server. https://gis.stackexchange.com/questions/71558 never got figured out, and even if it did, the PostGIS approach of using a view (rather than a table) is a much better solution.
With that said, I can't seem to figure out how to implement the basics of how this might work.
The basic schema of the geometry_columns view - from PostGIS is:
(the DDL is a bit more complicated, but can be provided if need be)
MS SQL Server will allow you to query your information_schema table to show tables with a 'geometry' data type:
select *
FROM information_schema.columns
where data_type = 'geometry'
I'm imagining the geometry_columns view could be defined with something similar to the following, but I can't figure out how to get the information about the geometry columns to populate in the query:
SELECT
TABLE_CATALOG as f_table_catalog
, TABLE_SCHEMA as f_table_schema
, table_name as f_table_name
, COLUMN_NAME as f_geometry_column
/*how to deal with these in view?
, geometry_column.STDimension() as coord_dimension
, geometry_column.STSrid as srid
, geometry_column.STGeometryType() as type
*/
FROM information_schema.columns where data_type = 'geometry'
I'm hung up as to how the three ST operators can dynamically report the dimension, srid, and geometry type in the view when trying to query from the information_schema table. Perhaps this is a SQL problem more than anything, but I can't wrap my head around it for some reason.
Here's what the PostGIS geometry columns table looks like:
Also please let me know if this question a) could be asked differently because it is a general SQL question and/or b) it belongs on another forum (GIS.SE didn't have an answer, as I believe this is more on the database side than spatial/GIS)

Based on a little reading, it seems that PostGIS - as befits a dedicated GIS system - is a little more clever than SQL Server, when it comes to geometry columns. It looks like in PostGIS you can say that a particular geometry column will only ever contain, say, a POINT, or a LINESTRING. This is how the geometry_columns view can then be more specific about the columns it is describing.
I don't believe it is possible to readily constrain a SQL Server geometry in this way (triggers or constraints might allow it, but would be messy). PostGIS can also have a general geometry column with no further restriction, so the metadata is not always specific there either. Let's suppose you're happy for your SQL Server geometry_columns view to return the dimension, SRID, and type based on an arbitrary row of data.
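For what it's worth, here is a minimal sketch of the constraint route mentioned above, in case it is useful. The table dbo.Parcel, the constraint name, and the SRID value are all hypothetical, and I'm assuming the geometry methods are accepted inside a CHECK constraint, which is my understanding but worth verifying on your version:

```sql
-- Hypothetical table; the constraint name and the SRID value are illustrative only.
CREATE TABLE dbo.Parcel (
    Id int NOT NULL PRIMARY KEY
    , Boundary geometry NULL
    , CONSTRAINT CK_Parcel_Boundary_PolygonOnly
        CHECK ( Boundary.STGeometryType() = 'Polygon' AND Boundary.STSrid = 4236 )
);

-- Expected to be accepted: a polygon with the required SRID.
INSERT dbo.Parcel VALUES ( 1, geometry::STGeomFromText('POLYGON((1 1, 3 3, 3 1, 1 1))', 4236) );

-- Expected to be rejected by the CHECK constraint: a point is not a polygon.
INSERT dbo.Parcel VALUES ( 2, geometry::STGeomFromText('POINT(1 1)', 4236) );
```

A NULL Boundary should still be allowed, because the methods return NULL for a NULL instance and a CHECK constraint only rejects rows where the condition is false. Even with such a constraint in place, the catalog views know nothing about it, so the metadata would still have to be read from the data.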
We can get the column metadata out of the catalog views, but I think the only way to do the necessary querying to also get the geometry metadata is with dynamic SQL. This rules out views and functions. I can do you a stored procedure though:
CREATE PROCEDURE GetGeometryColumns
AS
BEGIN
    DECLARE @sql nvarchar(max);

    -- Build one SELECT per geometry column and join them with UNION ALL.
    SET @sql = ( SELECT
        STUFF((
            SELECT ' UNION ALL ' + Query
            FROM
            (   SELECT
                    'SELECT ''' + s.name + ''' SchemaName'
                    + ', ''' + t.name + ''' TableName'
                    + ', ''' + c.name + ''' ColumnName'
                    -- Metadata is read from an arbitrary row; an empty table yields NULLs.
                    + ', ( SELECT TOP (1) ' + c.name + '.STDimension() FROM ' + s.name + '.' + t.name + ') Dimension'
                    + ', ( SELECT TOP (1) ' + c.name + '.STSrid FROM ' + s.name + '.' + t.name + ') SRID'
                    + ', ( SELECT TOP (1) ' + c.name + '.STGeometryType() FROM ' + s.name + '.' + t.name + ') GeometryType'
                    AS Query
                FROM
                    sys.schemas s
                    INNER JOIN sys.tables t ON s.schema_id = t.schema_id
                    INNER JOIN sys.columns c ON t.object_id = c.object_id
                WHERE
                    c.system_type_id = 240                      -- shared by the CLR system types
                    AND TYPE_NAME(c.user_type_id) = N'geometry' -- so exclude geography and hierarchyid
            ) GeometryColumn
            FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 10, '')
        );

    -- Run the generated statement and return its result set.
    EXEC ( @sql );
END
This builds a SQL statement which is a UNION ALL of SELECTs, one for each geometry column defined in the database. Note that I'm using the sys. catalog views, which for SQL Server are better than using INFORMATION_SCHEMA.
Each of the individual SELECTs that this builds will return the name of the column, plus metadata from the value in the first row (arbitrarily picked).
The sproc then executes the statement it has built, and returns.
To use:
CREATE TABLE T1 (
Id int NOT NULL PRIMARY KEY
, Region geometry
)
;
CREATE TABLE T2 (
Id int NOT NULL PRIMARY KEY
, Source geometry
, Destination geometry
)
;
INSERT T1 VALUES ( 1, geometry::STGeomFromText('POLYGON((1 1, 3 3, 3 1, 1 1))', 4236)) ;
INSERT T2 VALUES ( 10
, geometry::STGeomFromText('POINT(1.3 2.4)', 4236)
, geometry::STGeomFromText('POINT(2.6 2.5)', 4236)) ;
then simply
EXEC GetGeometryColumns;
to get
SchemaName TableName ColumnName Dimension SRID GeometryType
---------- --------- ----------- ----------- ----------- ----------------------
dbo T1 Region 2 4236 Polygon
dbo T2 Source 0 4236 Point
dbo T2 Destination 0 4236 Point
If you want the results in a table, you can for example:
DECLARE @geometryColumn TABLE
(
    SchemaName sysname
    , TableName sysname
    , ColumnName sysname
    , Dimension int
    , SRID int
    , GeometryType nvarchar(100)
);
-- Capture the procedure's result set in the table variable.
INSERT @geometryColumn EXEC GetGeometryColumns;
SELECT * FROM @geometryColumn;
I'd be interested to see if anyone can get the necessary logic into an actual VIEW...
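As a partial step in that direction, the catalog-only columns can at least go into a plain view. This is only a sketch: the view name dbo.geometry_columns and the column aliases are borrowed from the question, and it deliberately leaves out coord_dimension, srid, and type, which still depend on the data and hence on dynamic SQL:

```sql
-- Sketch only: exposes the metadata that the catalog views can provide.
-- CREATE VIEW has to be the only statement in its batch.
CREATE VIEW dbo.geometry_columns
AS
SELECT
    DB_NAME() AS f_table_catalog
    , s.name AS f_table_schema
    , t.name AS f_table_name
    , c.name AS f_geometry_column
FROM
    sys.schemas s
    INNER JOIN sys.tables t ON s.schema_id = t.schema_id
    INNER JOIN sys.columns c ON t.object_id = c.object_id
WHERE
    TYPE_NAME(c.user_type_id) = N'geometry';
```

Client software that only needs to discover which tables have geometry columns may be able to work from this, with the stored procedure filling in the rest when required.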

Related

Select unique values from every column in every table

Is there a way to get a count of distinct values from every table and column in SQL Server?
I've tried using a cursor for this, but that seems insufficient.
I've got to agree with Sean and say that this is going to be horrifically slow, but if you really want to do it, then I'm not going to stop you.
Something like this could be used as a starting point if you specifically don't want to use a cursor. This took just under a minute to look at a small database I've got with 10 tables in it. The largest table has just a few million rows in it. No matter what, you're going to be doing some sort of iteration, whether that's a cursor or explicitly reading against the table for each column.
Also, if you want to do something like this, you'll likely need to accommodate for things... like you're not going to be able to use COUNT on xml columns. Like I said, it's a starting point.
DECLARE @cmd VARCHAR(MAX)
-- Build one COUNT(DISTINCT ...) query per column of every non-empty table.
SELECT @cmd =
STUFF (
    (
    SELECT
    ' union SELECT ''['+ SCHEMA_NAME(st.schema_id) + '].[' + st.name +']'' as [Object], ''[' + sc.name + ']'' as [Column], COUNT(distinct [' + sc.name + ']) as [Count] FROM [' + SCHEMA_NAME(st.schema_id) + '].[' + st.name + ']'
    FROM sys.tables st
    JOIN sys.columns sc
        ON sc.object_id = st.object_id
    JOIN sys.dm_db_partition_stats ddps
        ON ddps.object_id = sc.object_id
    WHERE
        ddps.row_count > 0
    FOR XML PATH('')
    ),1,6,'' -- strip the leading ' union'
)
EXECUTE (@cmd)
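To accommodate the types that can't be counted, one option is to filter them out up front. A rough sketch follows; the list of excluded type names is my assumption of the usual offenders (the old LOB types, xml, and the spatial types) and may need extending, and alias types built on those base types would slip through:

```sql
-- List the columns that COUNT(DISTINCT ...) should be able to handle,
-- skipping types that are not comparable.
SELECT
    SCHEMA_NAME(st.schema_id) AS SchemaName
    , st.name AS TableName
    , sc.name AS ColumnName
    , ty.name AS TypeName
FROM sys.tables st
JOIN sys.columns sc
    ON sc.object_id = st.object_id
JOIN sys.types ty
    ON ty.user_type_id = sc.user_type_id
WHERE
    ty.name NOT IN ( N'xml', N'text', N'ntext', N'image', N'geometry', N'geography' )
ORDER BY SchemaName, TableName, sc.column_id;
```

The same join to sys.types and the same WHERE condition could be added to the query that builds the UNION statement.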

SSRS Temporary table issue, data from stored procedure which include dynamic pivot. (SQL 2008/Visual Studio 2008)

I have a stored procedure which runs fine from SQL, but when I try to add it as a dataset in SSRS there is an issue with the temp table. I know that SSRS has a problem with temp tables (in my case a global temp table), but how can I resolve this problem? I saw a few solutions that suggest creating a #table, but how can I do that with data from a dynamic pivot, and I am not sure whether it really works?
Here is example of my dynamic pivot:
DECLARE @PmtCols_pwyk AS NVARCHAR(MAX), @query_pwyk AS NVARCHAR(MAX)
-- Build the comma-separated, bracket-quoted list of pivot columns.
select @PmtCols_pwyk = STUFF((SELECT ',' + QUOTENAME(przerwa)
from #przerwa_wyk
group by przerwa
order by przerwa
FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'),1,1,'')
if OBJECT_ID ('tempdb..##pwyk_po_pivocie') is not null drop table ##pwyk_po_pivocie
-- Pivot into a global temp table so the result outlives the dynamic batch.
set @query_pwyk = 'SELECT Id_Pracownika,' + @PmtCols_pwyk + ' into ##pwyk_po_pivocie from
(select przerwa, Id_Pracownika, przerwa_wyk from #przerwa_wyk ) as x
PIVOT
(max(przerwa_wyk) for przerwa in (' + @PmtCols_pwyk + ')) as p'
execute(@query_pwyk) ;

Return all rows where at least one value in any of the columns is null

I have just completed the process of loading new tables with data. I'm currently trying to validate the data. The way I have designed my database there really shouldn't be any values anywhere that are NULL, so I'm trying to find all rows with any NULL value.
Is there a quick and easy way to do this instead of writing a lengthy WHERE clause with OR statements checking each column?
UPDATE: A little more detail... NULL values are valid initially as sometimes the data is missing. It just helps me find out what data I need to hunt down elsewhere. Some of my tables have over 50 columns so writing out the whole WHERE clause is not convenient.
Write a query against Information_Schema.Columns (documentation) that outputs the SQL for your very long where clause.
Here's something to get you started:
select 'OR ([' + TABLE_SCHEMA + '].[' + TABLE_NAME + '].[' + COLUMN_NAME + '] IS NULL)'
from mydatabase.Information_Schema.Columns
order by TABLE_NAME, ORDINAL_POSITION
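To take that a step further, the generated fragments can be stitched into a complete statement for a single table. This is a sketch under assumptions: dbo.MyTable is a placeholder for one of your tables, and only nullable columns are checked:

```sql
-- Placeholder schema and table names; substitute your own.
DECLARE @schema sysname = N'dbo', @table sysname = N'MyTable';
DECLARE @where nvarchar(max), @sql nvarchar(max);

-- Concatenate ' OR [col] IS NULL' for every nullable column, then strip the first ' OR '.
SELECT @where = STUFF((
    SELECT ' OR ' + QUOTENAME(COLUMN_NAME) + ' IS NULL'
    FROM INFORMATION_SCHEMA.COLUMNS
    WHERE TABLE_SCHEMA = @schema
      AND TABLE_NAME = @table
      AND IS_NULLABLE = 'YES'
    ORDER BY ORDINAL_POSITION
    FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'), 1, 4, '');

SET @sql = N'SELECT * FROM ' + QUOTENAME(@schema) + N'.' + QUOTENAME(@table)
         + N' WHERE ' + @where;

-- Inspect the statement first; run it once it looks right.
PRINT @sql;
-- EXEC sys.sp_executesql @sql;
```

If the table has no nullable columns, @where (and therefore @sql) ends up NULL and nothing is printed, which is the correct outcome since there is nothing to check.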
The short version of the answer: use SET CONCAT_NULL_YIELDS_NULL ON, bung the whole row together as a string, and check that for NULL (once). That way any NULL will propagate through to make the whole row comparison NULL.
Here's some silly sample code to demo the principle; it's up to you if you want to wrap that in an auto-generating schema script (to only check nullable columns and do all the appropriate conversions). Efficient it ain't, but almost any way you cut it you will need to do a table scan anyway.
CREATE TABLE dbo.Example
(
PK INT PRIMARY KEY CLUSTERED IDENTITY(1,1),
A nchar(10) NULL,
B int NULL,
C nvarchar(50) NULL
) ON [PRIMARY]
GO
INSERT dbo.Example(A, B, C)
VALUES('Your Name', 1, 'Not blank'),
('My Name', 3, NULL),
('His Name', NULL, 'Not blank'),
(NULL, 5, 'It''s blank');
SET CONCAT_NULL_YIELDS_NULL ON
SELECT E.PK
FROM dbo.Example E
WHERE (E.A + CONVERT(VARCHAR(32), E.B) + E.C) IS NULL
SET CONCAT_NULL_YIELDS_NULL OFF
As mentioned in a comment, if you really expect columns to not be null, then put NOT NULL constraints on them. That said...
Here's a slightly different approach, using INFORMATION_SCHEMA:
DECLARE @sql NVARCHAR(max) = '';
-- Append one SELECT per nullable column; each returns the PK of rows where that column is NULL.
SELECT @sql = @sql + 'UNION ALL SELECT ''' + cnull.TABLE_NAME + ''' as TableName, '''
    + cnull.COLUMN_NAME + ''' as NullColumnName, '''
    + pk.COLUMN_NAME + ''' as PkColumnName,'
    + 'CAST(' + pk.COLUMN_NAME + ' AS VARCHAR(500)) as PkValue '
    + ' FROM ' + cnull.TABLE_SCHEMA + '.' + cnull.TABLE_NAME
    + ' WHERE ' + cnull.COLUMN_NAME + ' IS NULL '
FROM INFORMATION_SCHEMA.COLUMNS cnull
INNER JOIN (SELECT Col.Column_Name, col.TABLE_NAME, col.TABLE_SCHEMA
    from INFORMATION_SCHEMA.TABLE_CONSTRAINTS Tab
    INNER JOIN INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE Col
        ON Col.Constraint_Name = Tab.Constraint_Name AND Col.Table_Name = Tab.Table_Name
    WHERE CONSTRAINT_TYPE = 'PRIMARY KEY') pk
    ON pk.TABLE_NAME = cnull.TABLE_NAME AND cnull.TABLE_SCHEMA = pk.TABLE_SCHEMA
WHERE cnull.IS_NULLABLE = 'YES'
set @sql = SUBSTRING(@sql, 11, LEN(@sql)) -- remove the initial 'UNION ALL '
exec(@sql)
Rather than a huge WHERE clause, this will tell you the primary key of each row in each table where any field is NULL. Note that I'm CASTing all primary key values to avoid operand clashes if you have some that are int/varchar/uniqueidentifier etc. If you have a PK that doesn't fit into a VARCHAR(500) you probably have other problems....
This would probably need some tweaking if you have any tables with composite primary keys - as it is, I'm pretty sure it would just output separate rows for each member of the key instead of concatenating them, and wouldn't necessarily group them together the way you'd want.
One other thought would be to just SELECT * from every table and save the output to a format (Excel, plain text csv) you can easily search for the string NULL.

Dynamically generate Sql for variable definition and assignment for a row in a table?

For example, I have a table
create table T (
A int,
B numeric(10,3),
C nvarchar(10),
D datetime,
E varbinary(8)
)
Update: This is just one of the example table. Any table can be used as input for generating the SQL string.
Is there an easy way to dynamically generate the following Sql for a row? (Any built-in function to make the Quotes, prefix easier?)
'declare
@A int = 1,
@B numeric(10,3) = 0.01,
@C nvarchar(10) = N''abcd'',
@D = ''10/1/2013'',
@E = 0x9123'
No, there isn't. The closest you might get, but which still will require manual changes, is by using SQL Server Management Studio. Expand the database and the table.
Right-click the table, then select Script table As, Insert To, and then selecting a new query window. This will generate output that will give you a starting point, but it's not generating variables. You'd have to either script that yourself, or edit the generated INSERT statement.
Example code:
INSERT INTO [MyDB].[dbo].[Table1]
([A]
,[B]
,[C])
VALUES
(<A, int,>
,<B, float,>
,<C, nvarchar(10),>
)
GO
Not sure what specifically you are trying to achieve… There is no built in function for something like this but you can try to create one easily using query similar to this one…
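-- Non-string columns must be converted before they can be concatenated.
select 'DECLARE @A int = ' + CAST(TableA.A AS varchar(20)) +
', @B numeric(10,3) = ' + CAST(TableA.B AS varchar(20)) +
-- Double any embedded single quotes so the generated literal stays valid.
', @C nvarchar(10) = N''' + REPLACE(TableA.C, '''', '''''') +
-- Style 126 (ISO 8601) keeps the date literal unambiguous; the datetime type is added so the DECLARE is valid.
''', @D datetime = ''' + CONVERT(varchar(30), TableA.D, 126) + ''''
from TableA
WHERE PrimaryKeyColumn = some_value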
Just cleanup the query above and convert it into a function that returns nvarchar
If you want to dynamically generate table definitions too that’s possible too but you’ll have to use system views to create this for any given table.
Try something like this and work your way from here
select T.name, C.name, TY.name, C.column_id
from sys.tables T
inner join sys.columns C on T.object_id = C.object_id
-- Join on user_type_id; system_type_id is shared by several types (e.g. nvarchar and sysname) and duplicates rows.
inner join sys.types TY on TY.user_type_id = C.user_type_id
where T.name = 'TableName'
order by C.column_id asc
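Working from there, a rough sketch of turning the column metadata into variable declarations could look like the following. It assumes the table dbo.T from the question, and it only formats the length, precision, and scale for the common types; anything else (datetime2 precision, alias types, and so on) would need more CASE branches:

```sql
-- Placeholder table name; any table can be substituted.
DECLARE @table nvarchar(300) = N'dbo.T';

SELECT
    N'@' + c.name + N' ' + ty.name
    + CASE
        -- max_length is in bytes, so halve it for the Unicode types; -1 means (max).
        WHEN ty.name IN ( N'nvarchar', N'nchar' )
            THEN N'(' + CASE WHEN c.max_length = -1 THEN N'max'
                             ELSE CAST(c.max_length / 2 AS nvarchar(10)) END + N')'
        WHEN ty.name IN ( N'varchar', N'char', N'varbinary', N'binary' )
            THEN N'(' + CASE WHEN c.max_length = -1 THEN N'max'
                             ELSE CAST(c.max_length AS nvarchar(10)) END + N')'
        WHEN ty.name IN ( N'numeric', N'decimal' )
            THEN N'(' + CAST(c.[precision] AS nvarchar(10)) + N','
                      + CAST(c.scale AS nvarchar(10)) + N')'
        ELSE N''
      END AS VariableDeclaration
FROM sys.columns c
INNER JOIN sys.types ty ON ty.user_type_id = c.user_type_id
WHERE c.object_id = OBJECT_ID(@table)
ORDER BY c.column_id;
```

This only produces the declaration part (one row per column); the value assignment for a given row would still have to be concatenated per column with the appropriate conversion, as in the query further up.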

Database Tuning Advisor recommends to create an existing index

When I run SQL Server 2005 Database Tuning Advisor, it recommends creating an index on a column which already has an index on it. Why does it recommend creating the same index again?
Here is my SQL:
SELECT t.name AS 'affected_table'
, 'Create NonClustered Index IX_' + t.name + '_'
+ CAST(ddmid.index_handle AS VARCHAR(10))
+ ' On ' + ddmid.STATEMENT
+ ' (' + IsNull(ddmid.equality_columns,'')
+ CASE
WHEN ddmid.equality_columns IS NOT NULL
AND ddmid.inequality_columns IS NOT NULL
THEN ','
ELSE ''
END
+ ISNULL(ddmid.inequality_columns, '')
+ ')'
+ ISNULL(' Include (' + ddmid.included_columns + ');', ';')
AS sql_statement
, ddmigs.user_seeks
, ddmigs.user_scans
, CAST((ddmigs.user_seeks + ddmigs.user_scans)
* ddmigs.avg_user_impact AS INT) AS 'est_impact'
, ddmigs.last_user_seek
FROM
sys.dm_db_missing_index_groups AS ddmig
INNER JOIN sys.dm_db_missing_index_group_stats AS ddmigs
ON ddmigs.group_handle = ddmig.index_group_handle
INNER JOIN sys.dm_db_missing_index_details AS ddmid
ON ddmig.index_handle = ddmid.index_handle
INNER Join sys.tables AS t
ON ddmid.OBJECT_ID = t.OBJECT_ID
WHERE
ddmid.database_id = DB_ID()
AND CAST((ddmigs.user_seeks + ddmigs.user_scans)
* ddmigs.avg_user_impact AS INT) > 100
ORDER BY
CAST((ddmigs.user_seeks + ddmigs.user_scans)
* ddmigs.avg_user_impact AS INT) DESC;
Perhaps try "DESC" to order a different way?
This worked in another similar SO question... Why does SQL Server 2005 Dynamic Management View report a missing index when it is not?
You may need to run your queries with a hint that forces the index that is already there.
SELECT * FROM table WITH (INDEX(IX_INDEX_SHOULD_BE_USED)) WHERE x = y
The index that is there might not be considered useful by SQL Server. Run the query that is suggesting the need for the index, look at the execution plan in SQL Server, and then build any other indexes that are needed.
Can you please list the full missing index warning message? Generally, it's asking you to create an index on the table that returns only certain fields, rather than an index on the table that returns all fields by default.
Go ahead and script out the details of your current index structure and then compare this to the recommendations made by the DTA.
I suspect that you will find there are structural differences in the results.
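For the comparison, something along these lines should list the key and included columns of the existing indexes. It is a sketch only, with dbo.MyTable standing in for the table named in the recommendation:

```sql
-- Placeholder table name; substitute the table from the DTA recommendation.
SELECT
    i.name AS index_name
    , i.type_desc
    , c.name AS column_name
    , ic.key_ordinal            -- position within the index key (0 for included columns)
    , ic.is_included_column     -- 1 when the column is only in the INCLUDE list
FROM sys.indexes i
INNER JOIN sys.index_columns ic
    ON ic.object_id = i.object_id AND ic.index_id = i.index_id
INNER JOIN sys.columns c
    ON c.object_id = ic.object_id AND c.column_id = ic.column_id
WHERE i.object_id = OBJECT_ID(N'dbo.MyTable')
ORDER BY i.index_id, ic.is_included_column, ic.key_ordinal;
```

An index on the same leading column can still differ from the recommended one in key order or in its included columns, which is the kind of structural difference to look for.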
