Pulling Values From Temp Table After An Iteration - sql-server

I'm querying for the total sizes of recent data in certain databases.
I create a table containing the DBs to be queried, then iterate over it to get the DB names and the total number of iterations to run.
I then create a temp table into which the needed data will be inserted.
I run the iteration to grab the information and push it into the temp table for each database.
After the iteration finishes I'm not able to pull the values from this newly created table.
I wrote a little comment next to each portion of code explaining what I'm trying to do and what I expect to happen.
/*check if the #databases table is already present and then drop it*/
IF OBJECT_ID('tempdb..#databases', 'U') IS NOT NULL
begin
    drop table #databases;
end

select ArtifactID into #databases from edds.eddsdbo.[Case]
where name like '%Review%'

/*Once this first statement has been run there will now be a
number column that is associated with the artifactID. Each database has an area that is
titled [EDDS'artifactID']. So if the artifactID = 1111111 then the DB would
be accessed at [EDDS1111111]*/

declare @runs int = 1; /*this will track the number of times iterated over the result set*/
declare @max int = 0; /*this will be the limit*/
declare @databasename sysname = '' /*this will allow the population of each database name*/

/*check if your temp table exists and drop if necessary*/
IF OBJECT_ID('tempdb..#temptable', 'U') IS NOT NULL
begin
    drop table #temptable;
end

/*create the temp table outside the loop*/
create table #temptable(
    fileSize dec,
    extractedTextSize dec
)

while @runs <= @max
begin
    select @max = count(*) from #databases;
    /*the @max is now the number of databases inserted into this table*/

    /*This select statement pulls the information that will be placed
    into the temptable. This second statement should be inside the loop, one time
    for each DB that appeared in the first query's results.*/

    /*begin the loop by assigning your database name, I don't know what the
    column is called so I have just called it databasename for now*/
    select top 1 @databasename = ArtifactID from #databases;

    /*generate your sql using the @databasename variable, if you want to make
    the database and table names dynamic too then you can use the same formula*/
    insert into #temptable
    select SUM(fileSize)/1024/1024/1024, SUM(extractedTextSize)/1024/1024
    FROM [EDDS'+cast(@databasename as nvarchar(128))+'].[EDDSDBO].[Document] ed
    where ed.CreatedDate >= (select CONVERT(varchar,dateadd(d,-(day(getdate())),getdate()),106))'

    /*remove that row from the databases table so the row won't be redone.
    This will take the @max and lessen it by one*/
    delete from #databases where ArtifactID = @databasename;

    /* Once the @max is less than 1 then end the loop*/
end

/* Query the final values in the temp table after the iteration is complete*/
select fileSize + extractedTextSize as Gigs from #temptable
When that final select statement runs to pull values from #temptable, the response is a single Gigs column (as expected) but the table itself is empty.
Something is happening to clear the data out of the table and I'm stuck.
I'm not sure if my error is in syntax or a general error of logic, but any help would be greatly appreciated.

I made a few tweaks to formatting, but the main issue is that your loop would never run.
You have @runs <= @max, but @runs = 1 and @max = 0 at the start, so it will never loop.
To fix this you could do a couple of different things, but I set @max before the loop and just add 1 to @runs on each pass: since you know @max before the loop runs, you simply increment the run count each time and compare against it.
One NOTE, though: there are much better ways to do this than the way you have it. Put an IDENTITY column on your #databases table, and in your loop just do WHERE ID = loopCount (then you don't have to delete from the table). See the second answer below.
--check if the #databases table is already present and then drop it
IF OBJECT_ID('tempdb..#databases', 'U') IS NOT NULL
    drop table #databases;

--Once this first statement has been run there will now be a number column that is associated with the artifactID. Each database has an area that is
--titled [EDDS'artifactID']. So if the artifactID = 1111111 then the DB would be accessed at [EDDS1111111]
select ArtifactID
INTO #databases
FROM edds.eddsdbo.[Case]
where name like '%Review%'

-- set to 0 to start
DECLARE @runs int = 0;
--this will be the limit
DECLARE @max int = 0;
--this will allow the population of each database name
DECLARE @databasename sysname = ''

--check if your temp table exists and drop if necessary
IF OBJECT_ID('tempdb..#temptable', 'U') IS NOT NULL
    drop table #temptable;

--create the temp table outside the loop
create table #temptable(
    fileSize dec,
    extractedTextSize dec
)

-- ***********************************************
-- Need to set the value you're looping on before you get to your loop; that way, if there are no rows, you won't enter the loop at all
-- ***********************************************
--the @max is now the number of databases inserted into this table
select @max = COUNT(*)
FROM #databases;

-- @runs starts at 0, so compare with < (using <= here would run one extra, empty iteration)
while @runs < @max
BEGIN
    /*This select statement pulls the information that will be placed
    into the temptable. This second statement should be inside the loop, one time
    for each DB that appeared in the first query's results.*/
    /*begin the loop by assigning your database name, I don't know what the
    column is called so I have just called it databasename for now*/
    select top 1 @databasename = ArtifactID from #databases;

    /*generate your sql using the @databasename variable, if you want to make
    the database and table names dynamic too then you can use the same formula*/
    insert into #temptable
    select SUM(fileSize)/1024/1024/1024, SUM(extractedTextSize)/1024/1024
    FROM [EDDS'+cast(@databasename as nvarchar(128))+'].[EDDSDBO].[Document] ed
    where ed.CreatedDate >= (select CONVERT(varchar,dateadd(d,-(day(getdate())),getdate()),106))

    --remove that row from the databases table so the row won't be redone
    delete from #databases where ArtifactID = @databasename;

    -- ***********************************************
    -- no need to select from the table and change your max value, just add one to @runs for each pass
    -- ***********************************************
    select @runs = @runs + 1
end

-- Query the final values in the temp table after the iteration is complete
select fileSize + extractedTextSize as Gigs from #temptable

This is a second answer, but it's the alternative I mentioned above; it's cleaner to post it separately to keep the two approaches distinct.
This is a better way to do your looping (not fully tested out yet, so you will have to verify).
Instead of deleting from your table, just add an ID to it and loop through it using that ID. Way fewer steps and much cleaner.
--check if the #databases table is already present and then drop it
IF OBJECT_ID('tempdb..#databases', 'U') IS NOT NULL
    drop table #databases;

--create the temp table outside the loop
create table #databases(
    ID INT IDENTITY,
    ArtifactID VARCHAR(20) -- not sure of this ID's data type
)

--check if your temp table exists and drop if necessary
IF OBJECT_ID('tempdb..#temptable', 'U') IS NOT NULL
    drop table #temptable;

--create the temp table outside the loop
create table #temptable(
    fileSize dec,
    extractedTextSize dec
)

--this will allow the population of each database name
DECLARE @databasename sysname = ''
-- initialize to 1 so it matches the first record in the temp table
DECLARE @LoopOn int = 1;
--this will be the max count from the table
DECLARE @MaxCount int = 0;

--Once this first statement has been run there will now be a number column that is associated with the artifactID. Each database has an area that is
--titled [EDDS'artifactID']. So if the artifactID = 1111111 then the DB would be accessed at [EDDS1111111]
-- do the insert here so it adds the ID column
INSERT INTO #databases(
    ArtifactID
)
SELECT ArtifactID
FROM edds.eddsdbo.[Case]
where name like '%Review%'

-- sets the max number of loops we are going to do
select @MaxCount = COUNT(*)
FROM #databases;

while @LoopOn <= @MaxCount
BEGIN
    -- your table has IDENTITY so select the one for the loop you're on (initialized to 1)
    select @databasename = ArtifactID
    FROM #databases
    WHERE ID = @LoopOn;

    --generate your sql using the @databasename variable, if you want to make
    --the database and table names dynamic too then you can use the same formula
    insert into #temptable
    select SUM(fileSize)/1024/1024/1024, SUM(extractedTextSize)/1024/1024
    -- don't know/think this will work like this? If not you have to use dynamic SQL
    FROM [EDDS'+cast(@databasename as nvarchar(128))+'].[EDDSDBO].[Document] ed
    where ed.CreatedDate >= (select CONVERT(varchar,dateadd(d,-(day(getdate())),getdate()),106))

    -- remove all the deletes/etc and just add one to @LoopOn; the next row will be selected above based off the ID
    select @LoopOn += 1
end

-- Query the final values in the temp table after the iteration is complete
select fileSize + extractedTextSize as Gigs
FROM #temptable
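Note that both answers above leave the original FROM [EDDS'+...+'] line in place, and as the comment inside the loop says, that string concatenation will not compile in a static query: the database name has to be spliced in with dynamic SQL. Here is a minimal sketch of what the body of the loop could look like with sp_executesql (untested; it also simplifies the CreatedDate comparison to a plain datetime expression instead of the varchar CONVERT used above):

DECLARE @sql nvarchar(max);

-- build the per-database query; QUOTENAME guards the generated database name
SET @sql = N'insert into #temptable (fileSize, extractedTextSize)
    select SUM(fileSize)/1024/1024/1024, SUM(extractedTextSize)/1024/1024
    FROM ' + QUOTENAME(N'EDDS' + cast(@databasename as nvarchar(128))) + N'.EDDSDBO.[Document] ed
    where ed.CreatedDate >= dateadd(d, -day(getdate()), getdate());';

-- #temptable was created in this session, so it is visible inside the dynamic batch
EXEC sp_executesql @sql;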

Related

I am searching for a loop query over multiple databases that inserts the result into an existing table in one database to collect all data

I am searching for a loop query over multiple databases that inserts the result into an existing table in one database to collect all data.
There are 28 existing databases at the moment, but when I start the query below it says the table already exists at the second database.
When this works I want to loop a much larger query than this.
I also tried executing a UNION ALL, but if a new database is added it must be collected automatically.
See the example I've tried below:
--drop table if exists [hulptabellen].dbo.HIdatabases
declare @dbList table (dbName varchar(128), indx int)

insert into @dbList
select dbName = dbname, row_number() over (order by dbname)
from [hulptabellen].dbo.HIdatabases

--declare variables for use in the while loop
declare @index int = 1
declare @totalDBs int = (select count(*) from @dbList)
declare @currentDB varchar(128)
declare @cmd varchar(300)

--define the command which will be used on each database
declare @cmdTemplate varchar(300) =
'
use {dbName};
select * into [hulptabellen].dbo.cladrloc from {dbName}.dbo.cladrloc
'

--loop through each database and execute the command
while @index <= @totalDBs
begin
    set @currentDB = (select dbName from @dbList where indx = @index)
    set @cmd = replace(@cmdTemplate, '{dbName}', @currentDB)
    execute(@cmd)
    set @index += 1
end
Create the table outside your loop and insert into the table this way:
INSERT INTO [hulptabellen].dbo.cladrloc (col1,col2)
SELECT col1,col2
FROM {dbname}.dbo.cladrloc
FYI: When you use the following syntax, a new table is created, so it can be executed only once.
SELECT *
INTO [hulptabellen].dbo.cladrloc
FROM {dbname}.dbo.cladrloc
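Applied to the loop in the question, the command template then becomes a plain INSERT (a sketch; cladrloc's actual column list isn't shown in the question, so col1 and col2 are placeholders):

declare @cmdTemplate varchar(300) =
'
INSERT INTO [hulptabellen].dbo.cladrloc (col1, col2)
SELECT col1, col2
FROM {dbName}.dbo.cladrloc
'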

How to optimize cursor in a stored procedure

I'm having problems with a stored procedure that iterates over a table. It works fine with a few hundred rows, but when the table grows into the thousands it saturates the memory and crashes.
The procedure should iterate row by row and fill a column with a value that is calculated from another column in the row. I suspect the cursor is what crashes the procedure; other questions I've read suggest using a while loop instead, but I'm no expert in SQL and the examples I tried from those answers didn't work.
CREATE PROCEDURE [dbo].[GenerateNewHashes]
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @module BIGINT = 382449983

    IF EXISTS(SELECT 1 FROM dbo.telephoneSource WHERE Hash IS NULL)
    BEGIN
        DECLARE hash_cursor CURSOR FOR
            SELECT a.telephone, a.Hash
            FROM dbo.telephoneSource AS a

        OPEN hash_cursor
        FETCH FROM hash_cursor

        WHILE @@FETCH_STATUS = 0
        BEGIN
            UPDATE dbo.telephoneSource
            SET Hash = CAST(telephone AS BIGINT) % @module
            WHERE CURRENT OF hash_cursor

            FETCH NEXT FROM hash_cursor
        END

        CLOSE hash_cursor
        DEALLOCATE hash_cursor
    END
END
Basically the stored procedure is intended to fill a new column called Hash that was added to the existing table. When the script that updates the table ends, the new column is filled with NULL values, and this stored procedure is then supposed to fill each NULL value with the result of telephone number (which is a bigint) % the module variable (also a bigint).
Is there anything, besides changing to a while loop, that I can do to make it use less memory or just not crash? Thanks in advance.
You could do the following:
WHILE 1=1
BEGIN
    UPDATE TOP (10000) dbo.telephoneSource
    SET Hash = CAST(telephone AS BIGINT) % @module
    WHERE Hash IS NULL;

    IF @@ROWCOUNT = 0
    BEGIN
        BREAK;
    END;
END;
This will update Hash as long as there are NULL values and will exit once there have been no records updated.
Adding a filtered index could be useful as well:
CREATE NONCLUSTERED INDEX IX_telephoneSource_Hash_telephone
ON dbo.telephoneSource (Hash)
INCLUDE (telephone)
WHERE Hash IS NULL;
It will speed up the lookups for the update, though it might not be needed.
Here is an example of the looping approach from my comment above, without using a cursor. If you add a condition to the inner loop so you only touch rows where the field you are updating is still NULL, it won't redo rows that were already done (useful in case you need to restart the process).
I didn't include your specific tables in there, but if you need me to I can add them.
DECLARE @PerBatchCount as int
DECLARE @MAXID as bigint
DECLARE @WorkingOnID as bigint

Set @PerBatchCount = 1000

--Find the range of IDs to process using your table name
SELECT @WorkingOnID = MIN(ID), @MAXID = MAX(ID)
FROM YourTableHere WITH (NOLOCK)

WHILE @WorkingOnID <= @MAXID
BEGIN
    -- do an update on all the ones that exist in the offer table NOW
    --DO YOUR UPDATE HERE
    -- include this where clause, where ID is the PK you are looping through
    WHERE ID BETWEEN @WorkingOnID AND (@WorkingOnID + @PerBatchCount - 1)

    set @WorkingOnID = @WorkingOnID + @PerBatchCount
END
SET NOCOUNT OFF;
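Filled in for the telephoneSource table from the question, that template could look like this (a sketch; it assumes the table has a numeric primary key column named ID, which the question doesn't show):

DECLARE @PerBatchCount int = 1000
DECLARE @MAXID bigint
DECLARE @WorkingOnID bigint

-- find the range of IDs to process
SELECT @WorkingOnID = MIN(ID), @MAXID = MAX(ID)
FROM dbo.telephoneSource

WHILE @WorkingOnID <= @MAXID
BEGIN
    UPDATE dbo.telephoneSource
    SET Hash = CAST(telephone AS BIGINT) % 382449983
    WHERE ID BETWEEN @WorkingOnID AND (@WorkingOnID + @PerBatchCount - 1)
      AND Hash IS NULL   -- skip rows already done, so the process can be restarted

    SET @WorkingOnID = @WorkingOnID + @PerBatchCount
END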
I would simply add a computed column:
ALTER TABLE dbo.telephoneSource
ADD Hash AS (CAST(telephone AS BIGINT)%382449983) PERSISTED;

Truncate some rows in table SQL Server Management Studio [duplicate]

Let's say we have a table Sales with 30 columns and 500,000 rows. I would like to delete 400,000 rows from the table (those where "toDelete='1'").
But I have a few constraints:
the table is read/written "often" and I would not like a long DELETE to take a long time and lock the table
I need to skip the transaction log (like with a TRUNCATE) while still doing a "DELETE ... WHERE ..." (I need to apply a condition), but I haven't found any way to do this...
Any advice would be welcome to transform a
DELETE FROM Sales WHERE toDelete='1'
into something more partitioned and possibly transaction-log free.
Calling DELETE FROM TableName will do the entire delete in one large transaction. This is expensive.
Here is another option, which will delete rows in batches:
deleteMore:
DELETE TOP(10000) Sales WHERE toDelete='1'
IF @@ROWCOUNT != 0
    goto deleteMore
I'll leave my answer here, since I was able to test different approaches for mass delete and update (I had to update and then delete 125+ million rows; the server had 16 GB of RAM and a Xeon E5-2680 @ 2.7 GHz, running SQL Server 2012).
TL;DR: always update/delete by primary key, never by any other condition. If you can't use the PK directly, create a temp table, fill it with PK values, and update/delete your table using that table. Use indexes for this.
I started with the solution from above (by @Kevin Aenmey), but this approach turned out to be inappropriate, since my database was live and handles a couple of hundred transactions per second, and there was some blocking involved (there was an index for all three fields from the condition; using WITH(ROWLOCK) didn't change anything).
So, I added a WAITFOR statement, which allowed the database to process other transactions.
deleteMore:
WAITFOR DELAY '00:00:01'
DELETE TOP(1000) FROM MyTable WHERE Column1 = @Criteria1 AND Column2 = @Criteria2 AND Column3 = @Criteria3
IF @@ROWCOUNT != 0
    goto deleteMore
This approach was able to process ~1.6 million rows/hour for updating and ~0.2 million rows/hour for deleting.
Turning to temp tables changed things quite a lot.
deleteMore:
SELECT TOP 10000 Id /* Id is the PK */
INTO #Temp
FROM MyTable WHERE Column1 = @Criteria1 AND Column2 = @Criteria2 AND Column3 = @Criteria3

DELETE MT
FROM MyTable MT
JOIN #Temp T ON T.Id = MT.Id

/* you can use the IN operator instead, it doesn't change anything
DELETE FROM MyTable WHERE Id IN (SELECT Id FROM #Temp)
*/

IF @@ROWCOUNT > 0 BEGIN
    DROP TABLE #Temp
    WAITFOR DELAY '00:00:01'
    goto deleteMore
END ELSE BEGIN
    DROP TABLE #Temp
    PRINT 'This is the end, my friend'
END
This solution processed ~25 million rows/hour for updating (15x faster) and ~2.2 million rows/hour for deleting (11x faster).
What you want is batch processing.
While (select Count(*) from sales where toDelete = 1) > 0
BEGIN
    Delete from sales where SalesID in
        (select top 1000 salesId from sales where toDelete = 1)
END
Of course you can experiment to find the best batch size; I've used from 500 to 50,000 depending on the table. If you use cascade delete, you will probably need a smaller number, as you have those child records to delete too.
One way I have had to do this in the past is to have a stored procedure or script that deletes n records; repeat until done.
DELETE TOP (1000) FROM Sales WHERE toDelete='1'
You should try to give it a ROWLOCK hint so it will not lock the entire table. However, if you delete a lot of rows, lock escalation will occur.
Also, make sure you have a non-clustered filtered index (only for '1' values) on the toDelete column. If possible make it a bit column, not varchar (or whatever it is now).
DELETE FROM Sales WITH(ROWLOCK) WHERE toDelete='1'
Ultimately, you can try to iterate over the table and delete in chunks.
Updated
Since while loops and chunk deletes are the new pink here, I'll throw in my version too (combined with my previous answer):
SET ROWCOUNT 100
DELETE FROM Sales WITH(ROWLOCK) WHERE toDelete='1'
WHILE @@ROWCOUNT > 0
BEGIN
    SET ROWCOUNT 100
    DELETE FROM Sales WITH(ROWLOCK) WHERE toDelete='1'
END
My own take on this functionality would be as follows.
This way there is no repeated code and you can manage your chunk size.
DECLARE @DeleteChunk INT = 10000
DECLARE @rowcount INT = 1

WHILE @rowcount > 0
BEGIN
    DELETE TOP (@DeleteChunk) FROM Sales WITH(ROWLOCK)
    WHERE toDelete = '1'   -- without this condition the loop would empty the whole table

    SELECT @rowcount = @@ROWCOUNT
END
I have used the below to delete around 50 million records -
BEGIN TRANSACTION
DeleteOperation:
DELETE TOP (BatchSize)
FROM [database_name].[database_schema].[database_table]
IF @@ROWCOUNT > 0
    GOTO DeleteOperation
COMMIT TRANSACTION
Please note that keeping the BatchSize < 5000 is less expensive on resources: SQL Server attempts lock escalation once a statement holds roughly 5,000 locks, so smaller batches avoid escalating to a table lock.
In my view, the best way to delete a huge number of records is to delete them by primary key. (What a primary key is, see here.)
So you have to generate a T-SQL script that contains the whole list of lines to delete, and then execute that script.
For example, the code below will generate that file:
GO
SET NOCOUNT ON
SELECT 'DELETE FROM DATA_ACTION WHERE ID = ' + CAST(ID AS VARCHAR(50)) + ';' + CHAR(13) + CHAR(10) + 'GO'
FROM DATA_ACTION
WHERE YEAR(AtTime) = 2014
The output file will have records like
DELETE FROM DATA_ACTION WHERE ID = 123;
GO
DELETE FROM DATA_ACTION WHERE ID = 124;
GO
DELETE FROM DATA_ACTION WHERE ID = 125;
GO
And now you have to use the SQLCMD utility in order to execute this script.
sqlcmd -S [Instance Name] -E -d [Database] -i [Script]
You can find this approach explained here: https://www.mssqltips.com/sqlservertip/3566/deleting-historical-data-from-a-large-highly-concurrent-sql-server-database-table/
Here's how I do it when I know approximately how many iterations I need:
delete from Activities with (rowlock)
where Id in (select top 999 Id from Activities with (nolock)
             where description like 'financial data update date%'
               and len(description) = 87 and User_Id = 2);
waitfor delay '00:00:02'
GO 20
Edit: This worked better and faster for me than selecting TOP:
declare @counter int = 1
declare @msg varchar(max)
declare @batch int = 499

while (@counter <= 37600)
begin
    set @msg = ('Iteration count = ' + convert(varchar, @counter))
    raiserror(@msg, 0, 1) with nowait

    delete Activities with (rowlock)
    where Id in (select Id from Activities with (nolock)
                 where description like 'financial data update date%'
                   and len(description) = 87 and User_Id = 2
                 order by Id asc offset 1 rows fetch next @batch rows only)

    set @counter = @counter + 1
    waitfor delay '00:00:02'
end
Declare @Counter INT
Set @Counter = 10 -- (you can always obtain the number of rows to be deleted and set the counter to that value)

While @Counter > 0
Begin
    Delete TOP (4000) from <Tablename>
    where ID in (Select ID from <sametablename> with (NOLOCK) where DateField < '2021-01-04') -- or opt for GetDate() - 1

    Set @Counter = @Counter - 1 -- or set @Counter = @Counter - 4000 if you know the number of rows to be deleted
End

Cursor to auto insert Roll Numbers in student table

How do I go about auto-populating the students' roll number column in ascending order (1, 2, 3, and so on) when the Assign button is clicked on the form?
(table screenshot not reproduced)
How will the cursor be invoked?
I'm using stored procedures for all database operations.
SAMPLE CODE
declare @studID int

declare rollCursor CURSOR FOR
    select * from TESTING

OPEN rollCursor
If you want a single StudentId, just write in your procedure:
-- Where @StudentId is a parameter to your stored procedure
SELECT * FROM TESTING
WHERE StudId = @StudentId
Here is how you can use a CURSOR. But note, CURSORs have performance issues, so use them sparingly.
DECLARE @StudId INT
DECLARE @FName VARCHAR(50)
DECLARE @ROLL INT

-- Here you declare all the columns you need to loop over in the cursor
DECLARE rollCursor CURSOR FOR
    select * from TESTING
    WHERE StudId = @StudentId
    ORDER BY StudId;

OPEN rollCursor

-- Loop starts from here
FETCH NEXT FROM rollCursor
INTO @StudId, @FName, @ROLL

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Select student id one by one from the table
    SELECT * FROM TESTING
    WHERE StudId = @StudId

    -- Fetches the next record and increments the loop
    FETCH NEXT FROM rollCursor
    INTO @StudId, @FName, @ROLL
END

CLOSE rollCursor;
DEALLOCATE rollCursor;
EDIT : 1 (Getting Row number for table)
If you need the result ordered by roll, then use the query below:
-- This will bring you the records with roll number in ascending order
-- If you want descending order just change ASC to DESC
SELECT StudId, Fname, ROW_NUMBER() OVER(ORDER BY roll ASC) roll
FROM TESTING
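Since the question asks to actually fill the roll column rather than just select row numbers, a set-based alternative (a sketch against the TESTING table shown above; no cursor needed) is to update through a CTE:

;WITH numbered AS (
    SELECT roll, ROW_NUMBER() OVER (ORDER BY StudId ASC) AS rn
    FROM TESTING
)
UPDATE numbered
SET roll = rn;   -- assigns 1, 2, 3, ... in StudId order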
EDIT : 2 (Create identity field)
You need to set the roll number column as an identity field, i.e. a column with an integer value that automatically increments on new insertions.
Once you set the roll number column as an identity field, try the inserts below:
INSERT INTO TESTING(StudId,Fname)VALUES(10,'A')
INSERT INTO TESTING(StudId,Fname)VALUES(10,'A')
You do not select or include the roll number column in the insert. It will automatically increment.
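A sketch of what that table definition could look like (column names guessed from the answer; note that an existing column can't simply be altered into an IDENTITY column, so the table has to be created this way or rebuilt):

CREATE TABLE TESTING (
    StudId int,
    Fname  varchar(50),
    roll   int IDENTITY(1,1)   -- auto-increments 1, 2, 3, ... on each insert
);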

Duplicate Auto Numbers generated in SQL Server

Be gentle, I'm a SQL newbie. I have a table named autonumber_settings like this:
Prefix | AutoNumber
SO     | 112320
CA     | 3542
Whenever a new sales line is created, a stored procedure is called that reads the current autonumber value from the 'SO' row, increments the number, updates that same row, and returns the number from the stored procedure. The stored procedure is below:
ALTER PROCEDURE [dbo].[GetAutoNumber]
(
    @type nvarchar(50),
    @out nvarchar(50) = '' OUTPUT
)
as
set nocount on

declare @currentvalue nvarchar(50)
declare @prefix nvarchar(10)

if exists (select * from autonumber_settings where lower(autonumber_type) = lower(@type))
begin
    select @prefix = isnull(autonumber_prefix, ''), @currentvalue = autonumber_currentvalue
    from autonumber_settings
    where lower(autonumber_type) = lower(@type)

    set @currentvalue = @currentvalue + 1

    update dbo.autonumber_settings set autonumber_currentvalue = @currentvalue where lower(autonumber_type) = lower(@type)

    set @out = cast(@prefix as nvarchar(10)) + cast(@currentvalue as nvarchar(50))

    select @out as value
end
else
    select '' as value
Now, there is another procedure that accesses the same table and duplicates orders, copying both the header and the lines. On occasion, the duplication results in duplicate line numbers. Here is a piece of that procedure:
BEGIN TRAN

IF exists
(
    SELECT *
    FROM autonumber_settings
    WHERE autonumber_type = 'SalesOrderDetail'
)
BEGIN
    SELECT
        @prefix = ISNULL(autonumber_prefix, '')
        ,@current_value = CAST(autonumber_currentvalue AS INTEGER)
    FROM autonumber_settings
    WHERE autonumber_type = 'SalesOrderDetail'

    SET @new_auto_number = @current_value + @number_of_lines

    UPDATE dbo.autonumber_settings
    SET autonumber_currentvalue = @new_auto_number
    WHERE autonumber_type = 'SalesOrderDetail'
END

COMMIT TRAN
Any ideas why the two procedures don't seem to play well together, occasionally giving lines created from scratch the same numbers as lines created by duplication?
This is a race condition in your autonumber assignment. Two executions can read out the same value before a new one is written back to the database.
The best way to fix this is to use an identity column and let SQL server handle the autonumber assignments.
Barring that you could use sp_getapplock to serialize your access to autonumber_settings.
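A sketch of what serializing with sp_getapplock could look like inside GetAutoNumber (the lock resource name here is arbitrary; with @LockOwner = 'Transaction' the call must run inside a transaction):

BEGIN TRAN;

-- take an exclusive application lock; a second caller blocks here until we commit
EXEC sp_getapplock @Resource = 'autonumber_settings_lock',
                   @LockMode = 'Exclusive',
                   @LockOwner = 'Transaction';

-- the read-increment-update sequence is now safe from concurrent callers
SELECT @currentvalue = autonumber_currentvalue
FROM autonumber_settings
WHERE lower(autonumber_type) = lower(@type);

UPDATE dbo.autonumber_settings
SET autonumber_currentvalue = @currentvalue + 1
WHERE lower(autonumber_type) = lower(@type);

COMMIT TRAN;  -- releases the application lock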
You could use REPEATABLEREAD on the SELECTs. That will hold a lock on the row until you update the value and commit.
Add WITH (REPEATABLEREAD, ROWLOCK) after the FROM clause of each SELECT.
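For example, the read inside GetAutoNumber would become the following (one caveat: two sessions holding REPEATABLEREAD share locks on the same row can still deadlock when both try to update, so an UPDLOCK hint is the more usual choice for a read-then-update pattern):

select @prefix = isnull(autonumber_prefix, ''),
       @currentvalue = autonumber_currentvalue
from autonumber_settings WITH (REPEATABLEREAD, ROWLOCK)
where lower(autonumber_type) = lower(@type)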
