Design change not to use Dynamic SQL? - sql-server

There is a sporadic performance problem with a batch process (nested sprocs) in SQL Server 2012. Sometimes, it takes much longer than usual.
The process rebuilds certain tables based on input parameters. The RULE table has a statement column holding 70+ different statements (involving 20+ different columns) that are used as the WHERE clause of the dynamic SQL.
So at each run the DELETE has not only different parameter values but also a different number and mix of columns in its WHERE clause.
What would you recommend other than managing stats and index tuning? The dev team is open to code and schema changes.
SELECT rul.SQL_STATEMENT
FROM APP_RULE rul
LEFT JOIN APP_RULE_EXCEPTION exc
    ON rul.RULE_ID = exc.RULE_ID
WHERE rul.APP_ID = @AppId
  AND (exc.RULE_ID IS NULL OR exc.RULE_ID NOT IN (
      SELECT RULE_ID FROM APP_RULE_EXCEPTION))

SET @SQLStatement =
    'DELETE FROM EMPLOYEE ' +
    'WHERE APP_ID = ' + CAST(@AppId AS VARCHAR(10)) +
    ' AND EMPLOYEE_ID NOT IN (' +
    'SELECT EMPLOYEE_ID FROM EMPLOYEE_NEW ' +
    'WHERE ' + @SQLStatement + ')'
EXEC (@SQLStatement)

SET @SQLStatement =
    'DELETE FROM ' + @TableName + ' ' +
    'WHERE EMPLOYEE_ID NOT IN (' +
    'SELECT EMPLOYEE_ID FROM EMPLOYEE_STG ' +
    'WHERE APP_ID = ' + CAST(@AppId AS VARCHAR(10)) + ')'
EXEC (@SQLStatement)

SET @SQLStatement =
    'INSERT INTO ' + @DestinationTable + ' ([APP_ID], ' + @AttributeList + ') ' +
    'SELECT ' + CAST(@AppId AS VARCHAR(10)) + ', ' + @ExposedAttributeList + ' ' +
    'FROM ' + @SourceTable + ' ' +
    'WHERE [DATE] = ''' + @TDate + ''''
EXEC (@SQLStatement)
The following are two sample DELETEs generated by the dynamic SQL.
DELETE FROM EMPLOYEE WHERE APP_ID = 103 AND APP_TYPE = 'IE' AND EMPLOYEE_ID NOT IN (
SELECT EMPLOYEE_ID FROM EMPLOYEE_NEW WHERE APP_TYPE = 'IE' )
DELETE FROM EMPLOYEE WHERE APP_ID = 103 AND APP_TYPE = 'IE' AND COUNTRY='USA' AND EMPLOYEE_ID NOT IN (
SELECT EMPLOYEE_ID FROM EMPLOYEE_NEW WHERE APP_TYPE = 'IE' AND EMPLOYEE_ID IN (
SELECT EMPLOYEE_ID FROM EMPLOYEE_EMAIL WHERE ISNULL(EMAIL_ADDRESS,'')<>'' and EType='OFFICE' ))
Thanks,
Kuzey

The first thing you should do is understand what the problem actually is. When something sporadically takes much longer than usual, it is usually a case of getting a non-optimal query plan, assuming you have already checked that no blocking is happening at the same time.
You should look at the plan cache and compare the plans, and their CPU and I/O measurements, between the fast and slow cases.
Here's a short SQL query you can use to look at the plan cache:
select top 100
SUBSTRING(t.text, (s.statement_start_offset/2)+1,
((CASE s.statement_end_offset WHEN -1 THEN DATALENGTH(t.text) ELSE s.statement_end_offset END
- s.statement_start_offset)/2) + 1) as statement_text,
t.text,
s.total_logical_reads, s.total_logical_reads / s.execution_count as avg_logical_reads,
s.total_worker_time, s.total_worker_time / s.execution_count as avg_worker_time,
s.execution_count,
max_logical_reads,
creation_time,
last_execution_time
--,cast(p.query_plan as xml) as query_plan
from sys.dm_exec_query_stats s
cross apply sys.dm_exec_sql_text (sql_handle) t
--cross apply sys.dm_exec_text_query_plan (plan_handle, statement_start_offset, statement_end_offset) p
order by s.total_logical_reads desc
That will show you the measurements collected for all plans still in cache. When a plan is dropped from the cache, its measurements are deleted with it. The measurements cover every execution of the plan since it was created.
The commented-out part returns the plan for each statement; from there you can see the operators and estimated row counts. Don't trust the percentages shown in the plans: they are based on estimates and can be totally wrong.
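If the cache shows the DELETE compiling a fresh plan for every literal combination, one option (a sketch against the tables in the question, not tested) is to pass the per-run values as parameters through sp_executesql, so each of the 70+ statement shapes compiles once and gets reused:
DECLARE @sql NVARCHAR(MAX) =
    N'DELETE FROM EMPLOYEE ' +
    N'WHERE APP_ID = @AppId AND EMPLOYEE_ID NOT IN (' +
    N'SELECT EMPLOYEE_ID FROM EMPLOYEE_NEW WHERE ' + @SQLStatement + N')'

-- @AppId is now a parameter, not an inlined literal, so the plan is reusable
EXEC sp_executesql @sql, N'@AppId INT', @AppId = @AppId
If reuse itself turns out to cause the sporadic slowness (parameter sniffing on skewed data), appending OPTION (RECOMPILE) to the statement trades compile time for a plan tailored to each run's values.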

Related

How to find the list of tables with newly impacted records in SQL?

I am working on exporting data from one environment to another. I want to select the list of tables that have records newly inserted or modified.
The database has around 200 tables; if only 10 of them have had records impacted since yesterday, I want to filter down to just those tables. Some of these tables do not have a create-date column, so it is hard to identify the differences with a plain SELECT against the table.
How can I find the list of tables with newly impacted records, and, if possible, get only those newly impacted records from the identified tables?
I tried this query, but it does not return the actual tables:
select * from sysobjects where id in (
select object_id
FROM sys.dm_db_index_usage_stats
WHERE last_user_update > getdate() - 1 )
If you haven't got a timestamp or something else that identifies newly changed records, such as auditing, triggers, or Change Data Capture enabled on those tables, it is quite impossible to do.
However, reading your scenario: is it not possible to ignore what has changed and simply export all 200 tables from one environment to the other, overwriting them at the destination?
If not, then you might really be interested in comparing data rather than identifying newly changed records, to find which tables do not match. You can do that using EXCEPT.
The example below compares two databases with the same table names and schemas: it builds a dynamic SQL statement column using EXCEPT across both databases on the fly, runs each statement in a WHILE loop, and inserts the name of each affected table into a temp table.
DECLARE @Counter AS INT
      , @Query AS NVARCHAR(MAX)

IF OBJECT_ID('tempdb..#CompareRecords') IS NOT NULL DROP TABLE #CompareRecords
IF OBJECT_ID('tempdb..#TablesNotMatched') IS NOT NULL DROP TABLE #TablesNotMatched

CREATE TABLE #TablesNotMatched (ObjectName NVARCHAR(200))

SELECT
    ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS RowNr
  , t.TABLE_CATALOG
  , t.TABLE_SCHEMA
  , t.TABLE_NAME
  , Query = 'IF' + CHAR(13)
      + '(' + CHAR(13)
      + '    SELECT' + CHAR(13)
      + '        COUNT(*) + 1' + CHAR(13)
      + '    FROM' + CHAR(13)
      + '    (' + CHAR(13)
      + '        SELECT ' + QUOTENAME(t.TABLE_NAME, '''') + ' AS TableName, * FROM ' + QUOTENAME(t.TABLE_CATALOG) + '.' + QUOTENAME(t.TABLE_SCHEMA) + '.' + QUOTENAME(t.TABLE_NAME) + CHAR(13)
      + '        EXCEPT' + CHAR(13)
      + '        SELECT ' + QUOTENAME(t.TABLE_NAME, '''') + ' AS TableName, * FROM ' + QUOTENAME(t2.TABLE_CATALOG) + '.' + QUOTENAME(t.TABLE_SCHEMA) + '.' + QUOTENAME(t.TABLE_NAME) + CHAR(13)
      + '    ) AS sq' + CHAR(13)
      + ') > 1' + CHAR(13)
      + 'SELECT ' + QUOTENAME(QUOTENAME(t.TABLE_CATALOG) + '.' + QUOTENAME(t.TABLE_SCHEMA) + '.' + QUOTENAME(t.TABLE_NAME), '''') + ' AS TableNameRecordsNotMatched'
INTO #CompareRecords
FROM <UAT_DATABASE>.INFORMATION_SCHEMA.TABLES AS t
LEFT JOIN <PROD_DATABASE>.INFORMATION_SCHEMA.TABLES AS t2 ON t.TABLE_SCHEMA = t2.TABLE_SCHEMA
    AND t.TABLE_NAME = t2.TABLE_NAME
WHERE t.TABLE_TYPE = 'BASE TABLE'

SET @Counter = (SELECT MAX(RowNr) FROM #CompareRecords)

WHILE @Counter > 0
BEGIN
    SET @Query = (SELECT cr.Query FROM #CompareRecords AS cr WHERE cr.RowNr = @Counter)
    INSERT INTO #TablesNotMatched
    EXECUTE sp_executesql @Query
    SET @Counter = @Counter - 1
END

SELECT *
FROM #TablesNotMatched
Note that when using EXCEPT, both queries must return the same number of columns with compatible types.
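A minimal illustration with two hypothetical tables:
-- Both SELECTs must return the same number of columns with compatible types.
SELECT EmployeeId, Email FROM dbo.EmployeesUAT
EXCEPT -- rows present in EmployeesUAT but missing from EmployeesProd
SELECT EmployeeId, Email FROM dbo.EmployeesProd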
I hope this helps.

Mapping columns without knowing what's in the table

I have an odd situation where I will have data coming from various sources (all flat files, none of which are in my control, and no matter how many times I ask for a standard format, I get different column headers and different column orders). We do not have the manpower to manually go through these files to determine which columns are important. Each flat file will have between two and six "identification" columns. However, some of the columns, individually, are not unique, but their combinations can form unique keys. All told, each flat file can have somewhere around one hundred columns.
So, initially, I planned to load the data into a temp table and ask the user to identify which columns contained which data. Once I know that, I can process the file without issue. I would have the two to six columns to identify matches with existing records and the additional data that I am supposed to gather (all identified by the user).
I was then asked to add in the ability for the system to "recommend" which data columns are which. For that, my plan was a count. I would count how many nonempty values each column has and then count how many of those nonempty values match each of the six possible columns. From there, I can take a simple ratio to determine the likelihood that the data contained is of that particular type. There would be some overvaluing of columns that are not unique, but in general, it is working nicely. The problem is that it is very slow.
I created a metadata table that I am calling UploadedTableColumn which contains every column header of the source file and which column it maps to in the database. Here is my stored procedure to update the counts:
CREATE PROCEDURE stored_Procedure
    @FileLoadID INT
AS
BEGIN
    DECLARE @SqlCommand NVARCHAR(MAX)

    DECLARE the_cursor CURSOR FAST_FORWARD FOR
    SELECT N'UPDATE UploadedTableColumn SET NumberNonemptyRows = (SELECT COUNT(*) FROM ' + DestinationTableName + N' WHERE ISNULL(' + DestinationColumnName + N','''') <> ''''),' + CHAR(13)
        + N'NumberID1Rows = (SELECT COUNT(*) FROM ' + DestinationTableName + N' WHERE ISNULL(' + DestinationColumnName + N','''') IN (SELECT ID1 FROM ID1Table) AND ISNULL(' + DestinationColumnName + N','''') <> ''''),' + CHAR(13)
        + N'NumberID2Rows = (SELECT COUNT(*) FROM ' + DestinationTableName + N' WHERE ISNULL(' + DestinationColumnName + N','''') IN (SELECT ID2 FROM ID2Table) AND ISNULL(' + DestinationColumnName + N','''') <> ''''),' + CHAR(13)
        + N'NumberIDDateRows = (SELECT COUNT(*) FROM ' + DestinationTableName + N' WHERE IIF(ISDATE(' + DestinationColumnName + N')=1,IIF(CAST(' + DestinationColumnName + N' AS DATE) IN (SELECT IDDate FROM IDDateTable),1,0),0) = 1 AND ISNULL(' + DestinationColumnName + N','''') <> ''''),' + CHAR(13)
        + N'NumberID4Rows = (SELECT COUNT(*) FROM ' + DestinationTableName + N' WHERE ISNULL(' + DestinationColumnName + N','''') IN (SELECT ID4 FROM ID4Table) AND ISNULL(' + DestinationColumnName + N','''') <> ''''),' + CHAR(13)
        + N'NumberID5Rows = (SELECT COUNT(*) FROM ' + DestinationTableName + N' WHERE ISNULL(' + DestinationColumnName + N','''') IN (SELECT ID5 FROM ID5Table) AND ISNULL(' + DestinationColumnName + N','''') <> ''''),' + CHAR(13)
        + N'NumberID6Rows = (SELECT COUNT(*) FROM ' + DestinationTableName + N' WHERE ISNULL(' + DestinationColumnName + N','''') IN (SELECT ID6 FROM ID6Table) AND ISNULL(' + DestinationColumnName + N','''') <> '''')' + CHAR(13)
        + N'WHERE DestinationTableName = ''' + DestinationTableName + N''' AND DestinationColumnName = ''' + DestinationColumnName + N''' AND FileLoadID = ' + CAST(@FileLoadID AS NVARCHAR) + N';' + CHAR(13) AS SqlCommand
    FROM UploadedTableColumn
    WHERE FileLoadID = @FileLoadID

    OPEN the_cursor
    FETCH NEXT FROM the_cursor INTO @SqlCommand

    WHILE @@FETCH_STATUS = 0
    BEGIN
        EXECUTE (@SqlCommand)
        FETCH NEXT FROM the_cursor INTO @SqlCommand
    END

    CLOSE the_cursor
    DEALLOCATE the_cursor
END
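Once the counts are updated, the ratio described above can be read out like this (a sketch; the likelihood column is hypothetical, the rest comes from my UploadedTableColumn metadata table):
-- Likelihood that a given source column holds ID1 values,
-- based on the counts maintained by the procedure above.
SELECT DestinationTableName,
       DestinationColumnName,
       CAST(NumberID1Rows AS DECIMAL(18, 4))
           / NULLIF(NumberNonemptyRows, 0) AS ID1Likelihood
FROM UploadedTableColumn
WHERE FileLoadID = @FileLoadID -- same parameter as in the procedure
ORDER BY ID1Likelihood DESC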
Is there a faster approach?
One small change that might help.
You say you're holding every column of the source table in UploadedTableColumn. You don't need to do that; your cursor is looping through a lot of unnecessary columns. You can eliminate most of them with a pre-emptive column-name match.
So get a combined list of all possible ID column names from your ID1Table, ID2Table, etc., and only pull into UploadedTableColumn the ones that actually match a column in DestinationTableName.
On the basis that there are probably no more than six columns in your source data with a matching ID column name, you're now only checking those rather than all 100+.
Of course, this doesn't help if people are sending data without headers and with no agreed format.
Pseudocode to get the desired columns:
SELECT name
FROM sys.columns
WHERE [object_id] = OBJECT_ID('DestinationTableName')
AND Name IN
(
SELECT ID1 AS IDColumn FROM ID1Table
UNION ALL
SELECT ID2 AS IDColumn FROM ID2Table
...
)
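To apply that filter, a sketch (reusing the question's names; one UNION branch per ID table, as in the pseudocode above) that prunes UploadedTableColumn before the cursor runs:
-- Remove metadata rows whose header can never match a known ID column name,
-- so the counting cursor only processes candidate columns.
DELETE utc
FROM UploadedTableColumn AS utc
WHERE utc.FileLoadID = @FileLoadID -- same parameter as in the original procedure
  AND utc.DestinationColumnName NOT IN
      (
          SELECT ID1 FROM ID1Table
          UNION
          SELECT ID2 FROM ID2Table
      )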

SQL Server: relationship between columns in different tables

I would like to find a way to understand if there is a relationship between two columns present in two different tables.
For example in the table [Sales].[SalesOrderHeader], I have a column SalesOrderID and in another table [Person].[EmailAddress], there is BusinessEntityID.
How can I check to see if there is a table that creates a relationship between these 2 columns? Or how can I be sure that there is not a relationship between these 2 columns?
INFORMATION_SCHEMA is what you are looking for. You can see whether or not a given column is used in a constraint by executing:
SELECT * FROM INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE WHERE TABLE_NAME = 'EmailAddress' AND COLUMN_NAME = 'BusinessEntityID'
You will have to do some additional work to focus on your specific solution, but this is where to start.
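For foreign keys specifically, the catalog views expose both ends of every relationship; a sketch using the two columns from the question:
-- List every foreign-key column pair that touches either column of interest.
SELECT fk.name AS constraint_name,
       OBJECT_NAME(fkc.parent_object_id) AS referencing_table,
       COL_NAME(fkc.parent_object_id, fkc.parent_column_id) AS referencing_column,
       OBJECT_NAME(fkc.referenced_object_id) AS referenced_table,
       COL_NAME(fkc.referenced_object_id, fkc.referenced_column_id) AS referenced_column
FROM sys.foreign_keys AS fk
JOIN sys.foreign_key_columns AS fkc
    ON fkc.constraint_object_id = fk.object_id
WHERE COL_NAME(fkc.parent_object_id, fkc.parent_column_id) IN ('SalesOrderID', 'BusinessEntityID')
   OR COL_NAME(fkc.referenced_object_id, fkc.referenced_column_id) IN ('SalesOrderID', 'BusinessEntityID')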
You could do one of the following to find the tables that reference [Sales].[SalesOrderHeader]:
EXEC sp_fkeys @pktable_name = N'SalesOrderHeader', @pktable_owner = N'Sales';
I apologize beforehand for what follows:
create table #rels (rel_name varchar(max), matches int)

declare @sql varchar(max) = ''

-- Build one INSERT per pair of same-typed columns, brute-force matching values
select @sql += char(10)
    + 'insert into #rels select ''' + tbl_a + '.' + col_a + '.' + tbl_b + '.' + col_b + ''', count(*)'
    + ' from ' + tbl_a + ' join ' + tbl_b
    + ' on cast(' + col_a + ' as varchar(max)) = cast(' + col_b + ' as varchar(max))'
from (
    select a.name col_a, object_name(a.object_id) tbl_a,
           b.name col_b, object_name(b.object_id) tbl_b
    from sys.columns a
    cross join sys.columns b
    where a.name <> b.name
      and a.system_type_id = b.system_type_id
) cols

exec (@sql)

select * from #rels where matches > 0 order by matches desc

drop table #rels

Exporting to Excel from SQL Server

I am stuck at a problem for which I cannot find any reason or solution.
I am running a SQL script to export some data to an Excel sheet. There is an application running on the other end which reads and processes the Excel sheet.
Problem: The column headers are being displayed at the bottom and the application is expecting them to be on the top row. I cannot change the functioning of the application.
This was working fine in SQL Server 2005, but we recently upgraded to SQL Server 2012 and this started happening.
I have not found anything on the internet that solves this issue.
This is the SQL script that I am executing:
SELECT
    @columnNames = COALESCE(@columnNames + ',', '') + '[' + column_name + ']',
    @columnConvert = COALESCE(@columnConvert + ',', '') + 'convert(nvarchar(4000),'
        + '[' + column_name + ']' +
        case
            when data_type in ('datetime', 'smalldatetime') then ',121'
            when data_type in ('numeric', 'decimal') then ',128'
            when data_type in ('float', 'real', 'money', 'smallmoney') then ',2'
            when data_type in ('datetime', 'smalldatetime') then ',120'
            else ''
        end + ') as ' + '[' + column_name + ']'
FROM tempdb.INFORMATION_SCHEMA.Columns
WHERE table_name = '##TempExportData'

-- execute select query to insert data and column names into new temp table
SELECT @sql = 'select ' + @columnNames + ' into ##TempExportData2 from (select ' + @columnConvert + ', ''2'' as [temp##SortID] from ##TempExportData union all select ''' + replace(replace(replace(@columnNames, ',', ''', '''), '[', ''), ']', '') + ''', ''1'') t order by [temp##SortID]'
exec (@sql)

-- build full BCP query
DECLARE @bcpCommand VARCHAR(8000)
SET @bcpCommand = 'bcp " SELECT * from ##TempExportData2" queryout'
SET @bcpCommand = @bcpCommand + ' ' + @fullFileName + ' -T -w -S' + @serverInstance
EXEC master..xp_cmdshell @bcpCommand
where ##TempExportData2 holds the data along with the column headers.
I think I understand the problem: you are using the ORDER BY in the SELECT ... INTO instead of in the final SELECT statement.
You should know that data inside tables is considered unordered, and SQL Server (like any other RDBMS I know of) does not guarantee the order of rows if the SELECT statement does not contain an ORDER BY clause.
Therefore, you should add the [temp##SortID] column to your ##TempExportData2 table and use it to sort the final SELECT statement:
SET @bcpCommand = 'bcp " SELECT * from ##TempExportData2 ORDER BY [temp##SortID]" queryout'
Since you don't need that column in the output, you might want to specify the column names explicitly in that SELECT statement. However, if it's not causing problems for the application that reads the Excel file or for the data it produces, I would suggest keeping the SELECT * to keep the query readable.
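If you do decide to drop the sort column from the output, the @columnNames list built earlier in your script could be reused, for example (untested sketch):
-- Reuse @columnNames (built from INFORMATION_SCHEMA.Columns earlier)
-- so [temp##SortID] is excluded from the exported result.
SET @bcpCommand = 'bcp "SELECT ' + @columnNames + ' FROM ##TempExportData2 ORDER BY [temp##SortID]" queryout'
SET @bcpCommand = @bcpCommand + ' ' + @fullFileName + ' -T -w -S' + @serverInstance
EXEC master..xp_cmdshell @bcpCommand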

SQL Server equivalent to DBMS_METADATA.GET_DDL

I was wondering if there is an equivalent in SQL Server 2008 to Oracle's DBMS_METADATA.GET_DDL function. You can pass that function a table name and it returns the DDL for the table, so you can use it to build a script for a schema.
I know I can go into SSMS and generate scripts there, but I would prefer a T-SQL script that generates the DDL for me.
Thanks,
S
I use this query to generate the CREATE TABLE statement, though it only works for one table at a time:
declare @vsSQL varchar(8000)
declare @vsTableName varchar(50)

select @vsTableName = 'Customers'

select @vsSQL = 'CREATE TABLE ' + @vsTableName + char(10) + '(' + char(10)

select @vsSQL = @vsSQL + ' ' + sc.Name + ' ' +
    st.Name +
    case when st.Name in ('varchar','nvarchar','char','nchar') then '(' + cast(sc.Length as varchar) + ') ' else ' ' end +
    case when sc.IsNullable = 1 then 'NULL' else 'NOT NULL' end + ',' + char(10)
from sysobjects so
join syscolumns sc on sc.id = so.id
join systypes st on st.xusertype = sc.xusertype
where so.name = @vsTableName
order by sc.ColID

select substring(@vsSQL, 1, len(@vsSQL) - 2) + char(10) + ')'
If you are looking for a T-SQL solution, it is quite verbose, as [this example]¹ shows.
A shorter alternative would be to use the SMO library (example).
¹ The link for this example has been deleted: the Internet Archive Wayback Machine displayed an error saying it could not show the content, and following the link to the original site led somewhere malicious (fake error messages, instructions to call a phone number, etc.).
