I didn't find an appropriate solution to my problem, so I want to ask here if someone could help me.
I have a stored procedure named spImportWord which downloads Word files from a file location on another server to a local folder and saves values from each Word file to a table in the database. For each file I call a console application that saves the file to the local folder. For the download I'm using a while loop; before that I used a cursor, but as far as I know you should stay away from cursors.
Since I changed from the cursor to the while loop I can't even alter my stored procedure; it takes an eternity to finish. Is there any way to improve my stored procedure? (Note: with the cursor the SP could be altered, but execution reported that the subquery returned more than one value.)
My code so far:
ALTER PROCEDURE spImportWord
    @fileId INT
AS
DECLARE @cmd VARCHAR(200)
DECLARE @filename VARCHAR(200)
DECLARE @foldername VARCHAR(200)
DECLARE @year VARCHAR(200)
DECLARE @remotePath VARCHAR(MAX)
DECLARE @localPath VARCHAR(MAX)
DECLARE @organisationId INT
DECLARE @folderId INT
DECLARE @import TABLE(ImportId INT)
DECLARE @counter INT

SELECT @filename = Files.FileName
     , @foldername = Files.FolderName
FROM Files WHERE Files.FileId = @fileId
CREATE TABLE #organisation ([OrganisationId] INT, [FolderId] INT, [Year] VARCHAR(4) NULL)
INSERT INTO #organisation
SELECT [tabInstitution].[OrganisationId]
, [tabInstitution].[FolderId]
, [tabInstitution].[UploadDate] AS [Year]
FROM [tabInstitution]
SET @organisationId = 0
SET @counter = 0
WHILE (@counter <= (SELECT COUNT(*) FROM #organisation))
BEGIN
    SELECT @organisationId = MIN([#organisation].[OrganisationId])
         , @folderId = [#organisation].[FolderId]
         , @year = [#organisation].[Year]
    FROM [#organisation]
    WHERE [#organisation].[OrganisationId] > @organisationId
    GROUP BY [FolderId], [Year]

    SET @remotePath = '\\somepath.path.com\somefolder\' + @organisationId + '\' + @folderId + '\' + @filename + '.docx'
    SET @localPath = 'C:\Files\' + @year + '\' + @foldername + '\' + @organisationId + '.docx'
    SET @cmd = 'C:\App\ImportWordFiles.exe --SourcePath ' + @remotePath + ' --TargetPath ' + @localPath
    EXEC xp_cmdshell @cmd, no_output

    -- Log into database
    INSERT INTO WordImport
    OUTPUT Inserted.ImportId INTO @import
    VALUES(GETDATE())

    INSERT INTO WordImportItem
    VALUES((SELECT ImportId FROM @import), @organisationId, @folderId, @localPath)

    SET @counter = @counter + 1
END
Instead of the while loop I had before:
DECLARE MY_CURSOR CURSOR LOCAL STATIC READ_ONLY FORWARD_ONLY
FOR
SELECT [#organisation].[OrganisationId]
, [#organisation].[FolderId]
, [#organisation].[Year]
FROM [#organisation]
OPEN MY_CURSOR
FETCH NEXT FROM MY_CURSOR INTO @organisationId, @folderId, @year
WHILE @@FETCH_STATUS = 0
BEGIN
    [...]
    FETCH NEXT FROM MY_CURSOR INTO @organisationId, @folderId, @year
END
I hope I explained my question understandably.
You know, you should really understand the reasons why cursors are considered bad. In your case there's no reason not to use one, and your while solution is much uglier anyway.
As for your "subquery problem", (SELECT ImportId FROM @import) is to blame: you keep adding rows to @import, and the subquery returns all of the ImportIds, not just the last one. That fails as soon as @import holds more than a single row. I assume you're just trying to get the last inserted ID, and there's no point in maintaining a whole table of all the import IDs. Just delete the data from @import after you read it.
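A minimal sketch of that suggestion, reusing the question's table names (the loop variables @organisationId, @folderId, and @localPath are assumed to be declared and populated as in the procedure): read the new ImportId into a scalar variable, then empty the table variable so the subquery issue can never recur.

```sql
-- @import collects the ID generated by the insert; it is emptied
-- immediately after reading, so it never holds more than one row.
DECLARE @import TABLE (ImportId INT);
DECLARE @importId INT;

INSERT INTO WordImport
OUTPUT Inserted.ImportId INTO @import
VALUES (GETDATE());

SELECT @importId = ImportId FROM @import;
DELETE FROM @import;

INSERT INTO WordImportItem
VALUES (@importId, @organisationId, @folderId, @localPath);
```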
Related
I have a fairly large script, shrunk and simplified for this question. The overall principle is that I have some code that needs to run several times with only small adjustments on each iteration. The script is built as a major loop with several subloops. Today the whole select statement is hard-coded in the loops. My thought was that I could write the select statement once and let only the parts that need to change per iteration vary inside the loop. The purpose is easier maintenance.
Example of the script:
declare
    @i1 int,
    @i2 int,
    @t nvarchar(50),
    @v nvarchar(50),
    @s nvarchar(max)

set @i1 = 1
while @i1 < 3
begin
    if @i1 = 1
    begin
        set @i2 = 1
        set @t = 'Ansokningsomgang'
        set @s = '
            select ' + @v + '_Ar, count(*) as N
            from (
                select left(' + @v + ', 4) as ' + @v + '_Ar
                from Vinnova_' + @t + '
            ) a
            group by ' + @v + '_Ar
            order by ' + @v + '_Ar
        '
        while @i2 < 4
        begin
            if @i2 = 1
            begin
                set @v = 'diarienummer'
                exec sp_executesql
                    @stmt = @s,
                    @params = N'@tab as nvarchar(50), @var as nvarchar(50)',
                    @tab = @t, @var = @v
            end
            else if @i2 = 2
            begin
                set @v = 'utlysning_diarienummer'
                exec sp_executesql
                    @stmt = @s,
                    @params = N'@tab as nvarchar(50), @var as nvarchar(50)',
                    @tab = @t, @var = @v
            end
            else if @i2 = 3
            begin
                set @v = 'utlysning_program_diarienummer'
                exec sp_executesql
                    @stmt = @s,
                    @params = N'@tab as nvarchar(50), @var as nvarchar(50)',
                    @tab = @t, @var = @v
            end
            set @i2 = @i2 + 1
        end
    end
    else
        print('Nr: ' + cast(@i1 as char))
    set @i1 = @i1 + 1
end
This script doesn't work: it runs through but produces no output. If I declare @v above the declaration of @s it works, but then I would have to re-declare @s every time I change the value of @v, and then there is no point in doing this.
@i1 iterates far more times than shown here.
The else branch of the "if @i1" doesn't exist in the real script; it stands in for a bunch of subloops that run for every value allowed for @i1 in this example.
I also tried to just execute @s like:
exec(@s)
in every loop. Same result.
So what am I missing?
Database engine is MS SQL Server.
Your parallel-structured tables are not 'normalized' to any degree, and you are now suffering the consequence. Typically, the best approach is to go ahead and make the data more normalized before you take any other action.

Dynamic SQL could make this task easier, and it is okay as long as it's an ad-hoc task that you hopefully use to begin building permanent tables that make your various parallel tables obsolete. It is not okay as part of a regular process, because someone could enter malicious code into one of your table values and do some damage. This is particularly true in your case, because your use of left functions implies that your columns are character-based.

Here's some code to put your data in more normal form. It can be made more normal after this, so it is only a first step, but it gets you to the point where using it for your purpose is far easier, and so hopefully will motivate you to redesign.
-- plug in the parallel tables you want to normalize
declare @tablesToNormalize table (id int identity(1,1), tbl sysname);
insert @tablesToNormalize values ('Ansokningsomgang'), ('Ansokningsomgang2');
-- create a table that will hold the restructured data
create table ##normalized (
tbl sysname,
rowKey int, -- optional, but needed if restructure is permanent
col sysname,
category varchar(50),
value varchar(50)
);
-- create template code to restructure and insert a table's data
-- into the normalized table (notice the use of @tbl as a string,
-- not as a variable)
declare @templateSql nvarchar(max) = '
    insert ##normalized
    select tbl = ''Vinnova_@tbl'',
           rowKey = t.somePrimaryKey, -- optional, but needed if restructure is permanent
           ap.col,
           category = left(ap.value, 4),
           ap.value
    from Vinnova_@tbl t
    cross apply (values
        (''diarienummer'', diarienummer),
        (''utlysning_diarienummer'', utlysning_diarienummer),
        (''utlysning_program_diarienummer'', utlysning_program_diarienummer)
        -- ... and so on (much better than writing a nested loop for every row)
    ) ap (col, value)
';
-- loop the table names and run the template (notice the 'replace' function)
declare @id int = 1;
while @id <= (select max(id) from @tablesToNormalize)
begin
    declare @tbl sysname = (select tbl from @tablesToNormalize where id = @id);
    declare @sql nvarchar(max) = replace(@templateSql, '@tbl', @tbl);
    exec (@sql);
    set @id = @id + 1;
end
Now that your data is in a more normal form, code for your purpose
is much simpler, and the output far cleaner.
select tbl, col, category, n = count(value)
from ##normalized
group by tbl, col, category
order by tbl, col, category;
I have the below transaction:
DECLARE @strsql1 NVARCHAR(MAX);
DECLARE @rows INT, @count INT;
DECLARE @MySource NVARCHAR(30);
DECLARE @Myfield NVARCHAR(30);
SET @rows = 1;
SET @count = 0;
SELECT @strsql1 =
'WHILE (' + @rows + ') > 0
BEGIN
    BEGIN TRAN
    delete top (10000) from ' + @MySource + '
    where ' + @Myfield + ' = ''value''
    SET ' + @rows + ' = ' + @@ROWCOUNT + '
    SET ' + @count + ' = ' + @count + @rows + '
    RAISERROR(''COUNT %d'', 0, 1, ' + @count + ') WITH NOWAIT
    COMMIT TRAN
END;'
PRINT @strsql1
EXEC sp_executesql @strsql1
but I get this error message:
Conversion failed when converting the varchar value ' WHILE (' to data type int.
I have tried to use CAST with the two variables (@count and @rows), but the problem persists.
Could you please suggest some solution?
Thank you.
First thing, when adding the value of an integer variable to a string, you should CAST it, for example:
SELECT @strsql1 = 'WHILE (' + CAST(@rows AS varchar) + ') > 0'
However, you don't REALLY want to do that either... because you want to test the variable against the constant zero, it should be more like:
SELECT @strsql1 = 'WHILE (@rows > 0)'
But this line right here is just completely wrong... the result would be SET 2 = 3 or something like that, which makes no sense:
SET ' + @rows + ' =' + @@ROWCOUNT + '
So you have a LOT of problems here.
The key to dynamic SQL is not rushing ahead and executing it. Try just PRINTING the string first, look at the result, and see what's wrong with it (it should be obvious: copy/paste it into a new SSMS query window and you can see right away). Then iterate, fixing your dynamic SQL generation until the generated SQL is correct, and only THEN execute it.
Basically, you need to leave the variable alone in the dynamic SQL and pass it in via the optional arguments to sp_executesql; there are many examples online of how to pass arguments in. But the declaration of @rows should probably go inside your dynamic SQL as well. Remember that dynamic SQL executes in its OWN context, not the context of the calling code, so it can't see any variables declared outside. And when building dynamic SQL, you only want to use the things that change PER CALL outside the string.
Maybe something like this will get you started.
declare @MySource sysname = 'YourTableName'
      , @MyField sysname = 'YourFieldName'
      , @strsql1 NVARCHAR(MAX)

SELECT @strsql1 =
'DECLARE @rows INT = 1;
WHILE @rows > 0
BEGIN
    delete top (10000) from ' + QUOTENAME(@MySource)
    + ' where ' + QUOTENAME(@MyField) + ' = ''value'';
    SET @rows = @@ROWCOUNT;
    print convert(varchar(10), @rows) + '' row(s) affected'';
END;'
PRINT @strsql1
--EXEC sp_executesql @strsql1
I would like to ask for your assistance regarding this matter.
I need a variable that would update a column on one of my tables.
The first problem that I have encountered is I need to retrieve the data from series of columns, checking if these columns are numeric or not.
To solve this, I used dynamic T-SQL and it works great. Now here is where I seem to hit a dead end: I need to retrieve the result of this procedure. I tried converting it to a function, but after many trials (and some Google searches) it seems dynamic T-SQL cannot be used in a function, as stated here, so I am sticking with a stored procedure. But how do I retrieve the result? I tried to use OUTPUT parameters, but I get hit with this error:
The formal parameter "@R" was not declared as an OUTPUT parameter, but the actual parameter passed in requested output.
Even though I declared @R as an output parameter, and I also declared @R to output the result in my sp_executesql call, I still get the error. What am I doing wrong?
The stored procedure itself works fine; I just need the output. Thank you.
ALTER procedure [dbo].[SaveRinHead]
    @SumNo as nvarchar(15)
    ,@R as decimal(18,3) output
as
declare @cursor CURSOR
declare @colname as integer
declare @top as integer
declare @query as nvarchar(MAX)
declare @TSQL as nvarchar(MAX)
declare @topass as nvarchar(MAX) = ''
declare @DimItem as nvarchar(10)

set @DimItem = (select distinct dimitem from SumNo)

SET @cursor = CURSOR FOR
    (select cast([Name] as decimal(18,0)) from sys.columns
     where object_id in (select object_id from sys.tables where [name] = 'ADetails')
       and [Name] in ('1','2','3','4','5','6','7','8','9','10')) order by [Name] asc

OPEN @cursor
FETCH NEXT
FROM @cursor INTO @colname
WHILE @@FETCH_STATUS = 0
BEGIN
    set @top = (select CASE WHEN Isnumeric(@colname) = 1
                            THEN CONVERT(int, @colname)
                            ELSE 0 END AS COLUMNA)
    if @top <= '5'
    BEGIN
        set @query = '([' + cast(@top as nvarchar(10)) + ']),'
        set @topass = rtrim(ltrim(@topass)) + ' ' + rtrim(ltrim(@query))
    END
    FETCH NEXT
    FROM @cursor INTO @colname
END
CLOSE @cursor
DEALLOCATE @cursor

set @topass = (SELECT SUBSTRING(@topass, 1, len(@topass) - 1))

begin
    set @TSQL = '
        SELECT @R = (MAX(MaxValue) - MIN(MinValue)) FROM ADetails
        CROSS APPLY (SELECT MIN(d) MinValue FROM (VALUES ' + @topass + ' ) AS a(d)) X
        CROSS APPLY (SELECT MAX(d) MaxValue FROM (VALUES ' + @topass + ' ) AS a(d)) Y
        where SumNo = @SumNo'
    exec sp_executesql @TSQL, N'@DimItem nvarchar(10), @R decimal(18,3), @SumNo nvarchar(15)', @DimItem, @R output, @SumNo
    update ADetails set R = @R where SumNo = @SumNo
end
As per docs, you need to specify the output keyword in both the parameter declaration and parameter list when calling sp_executesql:
exec sp_executesql @TSQL, N'@DimItem nvarchar(10), @R decimal(18,3) output, @SumNo nvarchar(15)',
    @DimItem, @R output, @SumNo;
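For reference, here is a small self-contained example of the pattern (the query is illustrative, querying sys.objects only so it runs anywhere): OUTPUT must appear both in the parameter definition string and in the argument list.

```sql
DECLARE @sql NVARCHAR(MAX) = N'SELECT @R = COUNT(*) FROM sys.objects;';
DECLARE @result DECIMAL(18, 3);

EXEC sp_executesql
    @sql,
    N'@R decimal(18,3) OUTPUT',  -- declared as OUTPUT in the parameter definition...
    @R = @result OUTPUT;         -- ...and marked OUTPUT when passed

PRINT @result;  -- the value assigned inside the dynamic batch is visible here
```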
I have a table called raw_data that contains a column with a large string of data fields formatted as fixed-length substrings. I also have a table table_1 that specifies the column name and the data range in the string for each value. I need to create a SQL INSERT statement to move data from raw_data into a table called table_2 with all the columns. table_1 has about 600 rows, so I am wondering if I can loop through each record to create the SQL statement that inserts the data into table_2.
Table_1
Name Start Length
AAA 1 2
BBB 3 3
CCC 6 1
I haven't learned how to use cursors, so the query below could be incorrect. There will be three tables involved in this task: table_1 to look up the name, start, and length values; table_2, the table I need to insert the data into; and raw_data, which has the column with the substrings of each needed value.
DECLARE @SQL VARCHAR(200)
DECLARE @NAME VARCHAR(200)
DECLARE @START VARCHAR(200)
DECLARE @LENGTH VARCHAR(200)
SET @NAME = ''
DECLARE Col_Cursor CURSOR FOR
SELECT Name, Start, Length FROM ODS_SIEMENS_LAYOUT WHERE RecordType = '1'
OPEN Col_Cursor
FETCH NEXT FROM Col_Cursor INTO @NAME, @START, @LENGTH
WHILE @@FETCH_STATUS = 0
BEGIN
    SET @SQL = @NAME + '=' + 'SUBSTRING(RAW_DATA,' + @START + ',' + @LENGTH + ')'
    FETCH NEXT FROM Col_Cursor INTO @NAME, @START, @LENGTH
END
CLOSE Col_Cursor
DEALLOCATE Col_Cursor
I need to generate something like the below query:
INSERT INTO TABLE_2
'AAA' = SUBSTRING(RAW_DATA,1,2)
'BBB' = SUBSTRING(RAW_DATA,3,3)
'CCC' = SUBSTRING(RAW_DATA,5,2)
........
Can I loop through each column to form the SQL Statement instead of manually coding 600 columns?
At the risk of sounding like Clippy... it looks like you're trying to import a flat file. Is your RAW_DATA coming from a flat file somewhere? If so you might look into using bulk insert:
Use a Format File to Bulk Import Data
If you are just asking how can you build your sql statement using the data from your column definition table... then the code you have is very close. You want something like this:
DECLARE @COLUMNS varchar(max)
DECLARE @SUBCOLUMNS varchar(max)
DECLARE @NAME VARCHAR(200)
DECLARE @START VARCHAR(200)
DECLARE @LENGTH VARCHAR(200)
SET @NAME = ''
DECLARE Col_Cursor CURSOR FOR
SELECT Name, Start, Length FROM ODS_SIEMENS_LAYOUT WHERE RecordType = '1'
OPEN Col_Cursor
FETCH NEXT FROM Col_Cursor INTO @NAME, @START, @LENGTH
set @SUBCOLUMNS = ''
set @COLUMNS = ''
WHILE @@FETCH_STATUS = 0
BEGIN
    SET @COLUMNS = @COLUMNS + @NAME + ','
    SET @SUBCOLUMNS = @SUBCOLUMNS + 'SUBSTRING(RAW_DATA,' + @START + ',' + @LENGTH + '),'
    FETCH NEXT FROM Col_Cursor INTO @NAME, @START, @LENGTH
END
CLOSE Col_Cursor
DEALLOCATE Col_Cursor
set @COLUMNS = LEFT(@COLUMNS, len(@COLUMNS) - 1) --get rid of last comma
set @SUBCOLUMNS = LEFT(@SUBCOLUMNS, len(@SUBCOLUMNS) - 1) --get rid of last comma
print 'INSERT INTO TABLE_2 (' + @COLUMNS + ') SELECT ' + @SUBCOLUMNS + ' FROM RawDataTable'
You can take the text that prints and insert that SQL statement into your procedure that does the actual inserts.
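Alternatively, if the printed statement checks out, the same string can be executed directly rather than pasted by hand. A sketch, assuming @COLUMNS and @SUBCOLUMNS were built as in the code above, and with RawDataTable as a placeholder name for your actual raw-data table:

```sql
-- Build the full INSERT ... SELECT into a variable and run it.
-- RawDataTable is a placeholder; substitute your actual table name.
DECLARE @SQL varchar(max) =
    'INSERT INTO TABLE_2 (' + @COLUMNS + ') SELECT ' + @SUBCOLUMNS
    + ' FROM RawDataTable';
EXEC (@SQL);
```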
Ahh, I think I am beginning to unravel what you are trying to do. There is no need for a cursor or dynamic SQL here at all; you just need to use a select statement as the values for your insert. Something like this, maybe:
INSERT INTO TABLE_2(AAA, BBB, CCC)
SELECT SUBSTRING(RAW_DATA,1,2)
, SUBSTRING(RAW_DATA,3,3)
, SUBSTRING(RAW_DATA,5,2)
FROM ODS_SIEMENS_LAYOUT
WHERE RecordType = '1'
So, I'm trying to rewrite some of the stored procedures we're using to use better set-based logic and reduce or eliminate the use of cursors due to performance issues. However, I can't come up with a more efficient way to do the below without resorting to a cursor.
Presently, I'm basically selecting an initial result set into a temporary table, something like:
INSERT INTO #tmptable
SELECT stuff.id
,stuff.datapoint
,stuff.database
,'' AS missingdata
FROM STUFF
Which usually returns anywhere from 250-500 rows of information. The '' is a datapoint that lives in any one of several hundred other databases - the name of which is specified by stuff.database. Despite there being hundreds of possible options, there's usually only three or four unique databases in each result set. As a result, what I'm currently doing is:
DECLARE @dbname VARCHAR(255)
DECLARE @SQL NVARCHAR(MAX)
DECLARE a_cursor CURSOR LOCAL
FOR
SELECT DISTINCT [database]
FROM #tmptable
OPEN a_cursor
FETCH NEXT
FROM a_cursor
INTO @dbname
WHILE @@FETCH_STATUS = 0
BEGIN
    SET @SQL = 'UPDATE #tmptable
        SET missingdata = bin.dataaggregate
        FROM ((SELECT pd.id id
                    ,STUFF((SELECT '','' + pdd.bin
                            FROM server.[' + @dbname + '].dbo.proddetails pdd
                            WHERE pdd.id = pd.id
                            GROUP BY pdd.id, pdd.bin
                            FOR XML PATH(''''), TYPE).value(''.'', ''VARCHAR(max)''), 1, 1, '''') dataaggregate
               FROM server.[' + @dbname + '].dbo.proddetails pd) bin
        INNER JOIN #tmptable tir ON tir.id = bin.id)'
    EXEC sp_executesql @SQL
    FETCH NEXT
    FROM a_cursor
    INTO @dbname
END
CLOSE a_cursor
DEALLOCATE a_cursor
Since there are usually only a handful of databases needed in each result set, the cursor has to loop only a handful of times and the performance hit isn't awful. Still, I don't like using them and feel like there has to be a more efficient way to do this. Any ideas?
Use this:
INSERT INTO #tmptable
SELECT stuff.id
,stuff.datapoint
,stuff.[database]
,'' AS missingdata
,row_number() over (order by stuff.id) as rn
FROM STUFF
declare @i int = 1;
declare @max int = (select max(rn) from #tmptable)
while @i <= @max
begin
    declare @dbname sysname
    select @dbname = [database] from #tmptable where rn = @i
    -- exec your dynamic sql here
    set @i += 1
end