SQL Server - Loop records insert 100 at a time - sql-server

I have an INSERT query inside a stored procedure that creates a set of parcels monthly named "MONTHLY_SET".
Sometimes the set gets too big. I need to keep running the same insert query, but rather than inserting, for example, 5000 records into a single set named "MONTHLY_SET", I need to end up with 5 sets of 1000 records each, named "MONTHLY_SET1", "MONTHLY_SET2", "MONTHLY_SET3", "MONTHLY_SET4", and "MONTHLY_SET5".
I do not know how this can be achieved. I am not familiar with cursors and loops in T-SQL, and I do not know whether those are the only available options.
Could someone help me understand how I can split a single set into smaller sets?
The INSERT query that needs to be inside a loop, currently looks like:
DECLARE @SETSEQ AS INT;
SET @SETSEQ = (SELECT MAX(SET_SEQ_NBR) FROM SETDETAILS);
INSERT INTO SETDETAILS
    ( SERV_PROV_CODE
    , SET_SEQ_NBR
    , SET_ID
    , REC_DATE
    , REC_FUL_NAM
    , REC_STATUS
    , SOURCE_SEQ_NBR
    , L1_PARCEL_NBR
    )
SELECT TOP 10
      'STRING'
    , ROW_NUMBER() OVER (ORDER BY ParcelNumber ASC) + @SETSEQ
    , 'MONTHLY_SET'
    , GETDATE()
    , 'USR'
    , 'A'
    , '155'
    , ParcelNumber
FROM
    dbo.Parcels
WHERE
    [Create] = 1; -- CREATE is a reserved word, so the column name is bracketed
Thank you for your help.
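One way this could be approached without a cursor or a loop, sketched under the assumption that the batches only need to differ in their SET_ID: number the rows once, then derive the set name from the row number with integer division. @BatchSize is a hypothetical variable that is not part of the original procedure, and the TOP 10 in the original query is assumed to be a test limit and is left out.

```sql
DECLARE @SETSEQ AS INT;
DECLARE @BatchSize AS INT = 1000;  -- hypothetical: number of rows per set

-- ISNULL covers the case where SETDETAILS is still empty
SET @SETSEQ = ISNULL((SELECT MAX(SET_SEQ_NBR) FROM SETDETAILS), 0);

WITH numbered AS
(
    SELECT ParcelNumber
         , ROW_NUMBER() OVER (ORDER BY ParcelNumber ASC) AS rn
    FROM dbo.Parcels
    WHERE [Create] = 1
)
INSERT INTO SETDETAILS
    ( SERV_PROV_CODE, SET_SEQ_NBR, SET_ID, REC_DATE
    , REC_FUL_NAM, REC_STATUS, SOURCE_SEQ_NBR, L1_PARCEL_NBR )
SELECT 'STRING'
     , rn + @SETSEQ
       -- rows 1..1000 get suffix 1, rows 1001..2000 get suffix 2, and so on
     , 'MONTHLY_SET' + CAST((rn - 1) / @BatchSize + 1 AS VARCHAR(10))
     , GETDATE()
     , 'USR'
     , 'A'
     , '155'
     , ParcelNumber
FROM numbered;
```

Because the whole insert is still a single statement, a set with fewer than 1000 rows simply ends up as "MONTHLY_SET1". If the unsuffixed name "MONTHLY_SET" has to be kept for small sets, that would need an extra CASE expression.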

Related

(SOLVED) - First iteration of WHILE loop runs out of memory despite manual reconstruction of query succeeding

Environment: SQL Server 2019 (v15).
I have a large query that uses too much space when run as a single SELECT statement. When I try to run it, I get the following error:
Could not allocate a new page for database 'TEMPDB' because of insufficient disk space in filegroup 'DEFAULT'.
However, the problem breaks down naturally into a dozen or so pieces, so I wrote a WHILE loop to iterate through each piece and insert into a results table. Unfortunately, the first iteration of the WHILE loop also returns the same memory error. All the WHILE loop is doing is changing a few values in the WHERE clause.
The key thing confusing me here is that when I manually run one iteration of the INSERT statement, absent all looping logic, it works perfectly.
Manually coding the first iteration to use the first institution_name just works, so I don't think the joins here are going wrong and causing the memory error.
WITH my_cte AS
(
SELECT [columns]
FROM mytable a
INNER JOIN bigtable b ON a.institution_name = b.institution_name
AND a.personID = b.personID
WHERE a.institution_name = 'ABC'
AND b.institution_name = 'ABC'
)
INSERT INTO results (personID, institution_name, ...)
SELECT personID, institution_name, [some aggregations]
FROM my_cte
GROUP BY personID, institution_name;
The version with the WHILE loop fails. I need to run the query with different values for institution_name.
Here I show three different values but even just the first iteration fails.
DECLARE @INSTITUTION varchar(10)
DECLARE @COUNTER int
SET @COUNTER = 0
DECLARE @LOOKUP table (temp_val varchar(10), temp_id int)
INSERT INTO @LOOKUP (temp_val, temp_id)
VALUES ('ABC', 1), ('DEF', 2), ('GHI', 3)
WHILE @COUNTER < 3
BEGIN
    SET @COUNTER = @COUNTER + 1
    SELECT @INSTITUTION = temp_val
    FROM @LOOKUP
    WHERE temp_id = @COUNTER;
    WITH my_cte AS
    (
        SELECT [columns]
        FROM mytable a
        INNER JOIN bigtable b ON a.institution_name = b.institution_name
            AND a.personID = b.personID
        WHERE a.institution_name = @INSTITUTION
            AND b.institution_name = @INSTITUTION
    )
    INSERT INTO results (personID, institution_name, ...)
    SELECT personID, institution_name, [some aggregations]
    FROM my_cte
    GROUP BY personID, institution_name
END
As I write this question, I have quite literally just copy-pasted the insert statement a dozen times, changed the relevant WHERE clause, and run it without errors. Could it be some kind of datatype issue where the query can properly subset if a string literal is put in the WHERE column, but the lookup on my temporary table is failing due to the datatype? I notice that mytable.institution_name is varchar(10) while bigtable.institution_name is nvarchar(10). Setting the temp table to use nvarchar(10) didn't fix it either.
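The fix that resolved this is not shown here. One thing that might be worth testing, on the assumption that the difference comes from the plan chosen for a local variable versus a string literal: the optimizer cannot see the value of @INSTITUTION when it compiles the statement, so it may estimate the row counts differently than it does for 'ABC'. OPTION (RECOMPILE) asks SQL Server to compile the statement on each execution using the current variable value. The temporary tables and the amount column below are hypothetical stand-ins so that the sketch runs on its own.

```sql
-- Hypothetical stand-ins for mytable, bigtable and results
CREATE TABLE #mytable  (personID int, institution_name varchar(10));
CREATE TABLE #bigtable (personID int, institution_name nvarchar(10), amount int);
CREATE TABLE #results  (personID int, institution_name varchar(10), total_amount int);

INSERT INTO #mytable  VALUES (1, 'ABC'), (2, 'DEF');
INSERT INTO #bigtable VALUES (1, N'ABC', 10), (1, N'ABC', 5), (2, N'DEF', 7);

DECLARE @LOOKUP table (temp_val varchar(10), temp_id int);
INSERT INTO @LOOKUP (temp_val, temp_id)
VALUES ('ABC', 1), ('DEF', 2), ('GHI', 3);

DECLARE @INSTITUTION varchar(10);
DECLARE @COUNTER int = 0;

WHILE @COUNTER < 3
BEGIN
    SET @COUNTER = @COUNTER + 1;

    SELECT @INSTITUTION = temp_val
    FROM @LOOKUP
    WHERE temp_id = @COUNTER;

    INSERT INTO #results (personID, institution_name, total_amount)
    SELECT a.personID, a.institution_name, SUM(b.amount)
    FROM #mytable a
    INNER JOIN #bigtable b
        ON a.institution_name = b.institution_name
       AND a.personID = b.personID
    WHERE a.institution_name = @INSTITUTION
      AND b.institution_name = @INSTITUTION
    GROUP BY a.personID, a.institution_name
    OPTION (RECOMPILE);  -- compile with the current value of @INSTITUTION
END

SELECT personID, institution_name, total_amount FROM #results;

DROP TABLE #mytable, #bigtable, #results;
```

Comparing the estimated plans of the literal version and the variable version, with and without the hint, would show whether the plan really differs.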

Stored Procedure SELECT UPDATE Incorrect Values

I am creating a stored procedure, which I intend to run via a job every 24 hours. I am able to run the procedure successfully, but for some reason the values don't seem to make sense. See below.
This is my table and what it looks like prior to the running of the procedure, using the following statement:
SELECT HardwareAssetDailyAccumulatedDepreciationValue,
HardwareAssetAccumulatedDepreciationValue FROM HardwareAsset
I then run the following procedure (with the intention of basically copying the value in DailyDepreciationValue to DepreciationValue):
BEGIN
SELECT HardwareAssetID, HardwareAssetDailyAccumulatedDepreciationValue,
HardwareAssetAccumulatedDepreciationValue FROM HardwareAsset
WHERE HardwareAssetDailyAccumulatedDepreciationValue IS NOT NULL
UPDATE HardwareAsset SET HardwareAssetAccumulatedDepreciationValue = CASE WHEN
(HardwareAssetAccumulatedDepreciationValue IS NULL) THEN
CONVERT(DECIMAL(7,2),HardwareAssetDailyAccumulatedDepreciationValue) ELSE
CONVERT(DECIMAL(7,2),(HardwareAssetAccumulatedDepreciationValue + HardwareAssetDailyAccumulatedDepreciationValue))
END
END
But when I re-run the select statement the results are as follows:
It really doesn't make any sense to me at all. Any ideas?
I am not able to replicate. We need more detail on the table structure and data. This is what I used to attempt to replicate. Feel free to modify as needed:
create table #t (
AccD1 decimal(7,2)
, AccD2 decimal(7,2)
, AccDaily as AccD1 + AccD2
, AccTotal decimal(7,2)
)
insert #t values
(100, 7.87, null)
, (300, 36.99, null)
, (400, 49.32, null)
, (100, 50.00, 100)
select * from #t
update #t set
AccTotal = isnull(AccTotal, 0) + AccDaily
, AccD1 = 0
, AccD2 = 0
select * from #t
drop table #t

Performance issue with larger resultsets MSSQL

I currently have a stored procedure in MSSQL where I execute a SELECT-statement multiple times based on the variables I give the stored procedure. The stored procedure counts how many results are going to be returned for every filter a user can enable.
The stored procedure isn't the issue. I transformed the select statement from the stored procedure into a regular select statement, which looks like:
DECLARE @contentRootId int = 900589
DECLARE @RealtorIdList varchar(2000) = ';880;884;1000;881;885;'
DECLARE @publishSoldOrRentedSinceDate int = 8
DECLARE @isForSale BIT= 1
DECLARE @isForRent BIT= 0
DECLARE @isResidential BIT= 1
--...(another 55 variables)...
--Table to be returned
DECLARE @resultTable TABLE
(
    variableName varchar(100),
    [value] varchar(200)
)
-- Create table based on input variable. Example: turns ';18;118;' into a table containing two ints, 18 and 118
DECLARE @RealtorIdTable table(RealtorId int)
INSERT INTO @RealtorIdTable SELECT * FROM dbo.Split(@RealtorIdList,';') option (maxrecursion 150)
INSERT INTO @resultTable ([value], variableName)
SELECT [Value], VariableName FROM(
    Select count(*) as TotalCount,
    ISNULL(SUM(CASE WHEN reps.ForRecreation = 1 THEN 1 else 0 end), 0) as ForRecreation,
    ISNULL(SUM(CASE WHEN reps.IsQualifiedForSeniors = 1 THEN 1 else 0 end), 0) as IsQualifiedForSeniors,
    --...(A whole bunch more SUM(CASE)...
    FROM TABLE1 reps
    LEFT JOIN temp t on
        t.ContentRootID = @contentRootId
        AND t.RealEstatePropertyID = reps.ID
    WHERE
        (EXISTS(select 1 from @RealtorIdTable where RealtorId = reps.RealtorID))
        AND (@SelectedGroupIds IS NULL OR EXISTS(select 1 from @SelectedGroupIdtable where GroupId = t.RealEstatePropertyGroupID))
        AND (ISNULL(reps.IsForSale,0) = ISNULL(@isForSale,0))
        AND (ISNULL(reps.IsForRent, 0) = ISNULL(@isForRent,0))
        AND (ISNULL(reps.IsResidential, 0) = ISNULL(@isResidential,0))
        AND (ISNULL(reps.IsCommercial, 0) = ISNULL(@isCommercial,0))
        AND (ISNULL(reps.IsInvestment, 0) = ISNULL(@isInvestment,0))
        AND (ISNULL(reps.IsAgricultural, 0) = ISNULL(@isAgricultural,0))
        --...(Around 50 more of these WHERE-statements)...
) as tbl
UNPIVOT (
    [Value]
    FOR [VariableName] IN(
        [TotalCount],
        [ForRecreation],
        [IsQualifiedForSeniors],
        --...(All the other things i selected in above query)...
    )
) as d
select * from @resultTable
The combination of a realtor ID and a content ID gives me a default set of X records. When I choose a combination that gives me ~4600 records, the execution time is around 250ms. When I execute the statement with a combination that gives me ~600 records, the execution time is about 20ms.
I would like to know why this is happening. I tried removing all SUM(CASE in the select, I tried removing almost everything from the WHERE-clause, and I tried removing the JOIN. But I keep seeing the huge difference between the resultset of 4600 and 600.
Table variables can perform worse when the number of records is large. Consider using a temporary table instead. See When should I use a table variable vs temporary table in sql server?
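As a rough illustration of that swap, here is a minimal sketch that replaces the @RealtorIdTable table variable with a temporary table. The #reps table and its rows are hypothetical stand-ins for TABLE1, and the VALUES list stands in for the output of dbo.Split(@RealtorIdList, ';').

```sql
-- Temporary table instead of DECLARE @RealtorIdTable table (RealtorId int).
-- Unlike a table variable, it gets column statistics and can be indexed freely.
CREATE TABLE #RealtorIdTable (RealtorId int PRIMARY KEY);

-- Stand-in for: INSERT ... SELECT * FROM dbo.Split(@RealtorIdList, ';')
INSERT INTO #RealtorIdTable (RealtorId)
VALUES (880), (884), (1000), (881), (885);

-- Hypothetical stand-in for TABLE1
CREATE TABLE #reps (ID int, RealtorID int);
INSERT INTO #reps (ID, RealtorID)
VALUES (1, 880), (2, 999), (3, 1000);

-- The filter is written exactly as it was against the table variable
SELECT COUNT(*) AS TotalCount
FROM #reps reps
WHERE EXISTS (SELECT 1 FROM #RealtorIdTable WHERE RealtorId = reps.RealtorID);

DROP TABLE #RealtorIdTable, #reps;
```

Whether this changes the timings depends on the plans involved, so it is worth measuring both versions against the real data.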
Also, consider replacing the UNPIVOT with alternative SQL code. Writing your own T-SQL code gives you more control and may even improve performance. See for example PIVOT, UNPIVOT and performance
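One common hand-written alternative to UNPIVOT is CROSS APPLY with a VALUES list, which turns each aggregated column into its own row. The sketch below assumes a hypothetical #reps table in place of TABLE1 and only carries three of the counted columns.

```sql
-- Hypothetical sample rows standing in for TABLE1
CREATE TABLE #reps (ID int, ForRecreation bit, IsQualifiedForSeniors bit);
INSERT INTO #reps (ID, ForRecreation, IsQualifiedForSeniors)
VALUES (1, 1, 0), (2, 1, 1), (3, 0, NULL);

SELECT v.VariableName, v.[Value]
FROM
(
    -- Same aggregation shape as the original inner query
    SELECT COUNT(*) AS TotalCount
         , ISNULL(SUM(CASE WHEN ForRecreation = 1 THEN 1 ELSE 0 END), 0) AS ForRecreation
         , ISNULL(SUM(CASE WHEN IsQualifiedForSeniors = 1 THEN 1 ELSE 0 END), 0) AS IsQualifiedForSeniors
    FROM #reps
) AS tbl
CROSS APPLY
(
    -- One output row per aggregated column
    VALUES ('TotalCount', tbl.TotalCount)
         , ('ForRecreation', tbl.ForRecreation)
         , ('IsQualifiedForSeniors', tbl.IsQualifiedForSeniors)
) AS v (VariableName, [Value]);

DROP TABLE #reps;
```

All values in the VALUES list need compatible types, which holds here because every column is an integer count.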

SCD type 2 using SQL Server MERGE, how to capture counts?

New to SQL Server and MERGE.
I am working on a MERGE statement to populate a slowly changing dimension table. My example includes both type 1 and type 2 attributes. I see examples of how to use OUTPUT to capture counts of actions, and I understand how to use OUTPUT to pass values out to an INSERT statement. What I would like to do is take the following code and somehow capture the count of UPDATE and INSERT actions for audit/logging purposes.
Very confused reading articles on OUTPUT and OUTPUT INTO, but from what I can tell, I don't think I can do what I want to do, at least not using OUTPUT.
Is there a way to capture the ACTION counts from the below statement? Is there a better way to accomplish this?
Thank you
BEGIN
MERGE dbo.dimTable tgt
USING dbo.stgTable src
ON tgt.NaturalKey = src.NaturalKey
AND tgt.IsActiveRow = 'Y'
WHEN MATCHED
AND EXISTS
(SELECT src.SCD1Field
EXCEPT
SELECT tgt.SCD1Field
)
THEN
UPDATE SET
tgt.SCD1Field = src.SCD1Field ;
INSERT dbo.dimTable (
tgt.NaturalKey
, tgt.SCD1Field
, tgt.SCD2Field
, tgt.RowStartDate
, tgt.RowEndDate
, tgt.IsActiveRow
)
SELECT
NaturalKey
, SCD1Field
, SCD2Field
, RowStartDate
, RowEndDate
, IsActiveRow
FROM (
MERGE dbo.dimTable tgt
USING dbo.stgTable src
ON tgt.NaturalKey = src.NaturalKey
WHEN NOT MATCHED BY TARGET
THEN
INSERT (
NaturalKey
, SCD1Field
, SCD2Field
, RowStartDate
, RowEndDate
, IsActiveRow
)
VALUES (
src.NaturalKey
, src.SCD1Field
, src.SCD2Field
, GETDATE()
, NULL
, 'Y'
)
WHEN MATCHED
AND tgt.IsActiveRow = 'Y'
AND EXISTS
(
SELECT src.SCD2Field
EXCEPT
SELECT tgt.SCD2Field
)
THEN
UPDATE
SET IsActiveRow = 'N'
, RowEndDate = DATEADD(dd,-1,GETDATE())
OUTPUT $ACTION Action_Out
, src.NaturalKey
, src.SCD1Field
, src.SCD2Field
, GETDATE() RowStartDate
, NULL RowEndDate
, 'Y' IsActiveRow
)m
WHERE m.Action_Out = 'UPDATE'
END ;
Unfortunately, the MERGE command itself does not provide a way to capture the ACTION counts.
What you can do, however, is to add another step of counting the ACTION results from the table you are logging them to, assuming that you have some sort of audit key or date column you can use to separate the results from the most recent execution.
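The extra counting step could look something like the sketch below, which writes $action into a table variable and then groups by it. The temporary tables are hypothetical, cut-down stand-ins for dbo.dimTable and dbo.stgTable, @MergeActions is an assumed name, and the sketch uses a standalone MERGE rather than the nested INSERT ... FROM (MERGE ...) form from the question.

```sql
-- Hypothetical, cut-down stand-ins for dbo.dimTable and dbo.stgTable
CREATE TABLE #dimTable (NaturalKey int PRIMARY KEY, SCD1Field varchar(20));
CREATE TABLE #stgTable (NaturalKey int PRIMARY KEY, SCD1Field varchar(20));

INSERT INTO #dimTable (NaturalKey, SCD1Field) VALUES (1, 'old'), (2, 'same');
INSERT INTO #stgTable (NaturalKey, SCD1Field) VALUES (1, 'new'), (2, 'same'), (3, 'added');

-- Receives one row per row touched by the MERGE
DECLARE @MergeActions table (Action_Out nvarchar(10));

MERGE #dimTable tgt
USING #stgTable src
    ON tgt.NaturalKey = src.NaturalKey
WHEN MATCHED
    AND EXISTS (SELECT src.SCD1Field EXCEPT SELECT tgt.SCD1Field)
    THEN UPDATE SET tgt.SCD1Field = src.SCD1Field
WHEN NOT MATCHED BY TARGET
    THEN INSERT (NaturalKey, SCD1Field)
         VALUES (src.NaturalKey, src.SCD1Field)
OUTPUT $action INTO @MergeActions (Action_Out);

-- Counts per action, ready to be written to an audit table
SELECT Action_Out, COUNT(*) AS RowsAffected
FROM @MergeActions
GROUP BY Action_Out;

DROP TABLE #dimTable, #stgTable;
```

With the sample rows above, key 1 is updated, key 2 is left alone because nothing changed, and key 3 is inserted, so the result has one UPDATE row and one INSERT row.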

Indexing in SQLServer

Hi, I am using SQL Server 2008. I want to know what an index is in SQL Server and how I can use it.
This is part of my query. How can I add an index? Many thanks.
DECLARE @TableMember TABLE
(
BrokerId INT ,
RankId INT ,
MemberId INT ,
InstallmentId INT ,
PlanId INT ,
IntroducerId INT ,
Date DATETIME ,
SelfAmount DECIMAL(18, 2) ,
UnitAmount DECIMAL(18, 2) ,
SpotAmount DECIMAL(18, 2) ,
ORBPercentageSelf DECIMAL(18, 2) ,
ORBPercentageUnit DECIMAL(18, 2) ,
ORBAmountSelf DECIMAL(18, 2) ,
ORBAmountUnit DECIMAL(18, 2) ,
IsSelfBusiness BIT ,
Mode VARCHAR(50) ,
InstallmentNo INT ,
PlanType VARCHAR(50) ,
PlanName VARCHAR(50) ,
CompanyId INT ,
CscId INT ,
Year VARCHAR(50) ,
CreateDate DATETIME ,
ModifideDate DATETIME
)
INSERT INTO @TableMember
( BrokerId ,
RankId ,
MemberId ,
InstallmentId ,
PlanId ,
IntroducerId ,
Date ,
SelfAmount ,
UnitAmount ,
SpotAmount ,
ORBPercentageSelf ,
ORBPercentageUnit ,
ORBAmountSelf ,
ORBAmountUnit ,
IsSelfBusiness ,
Mode ,
InstallmentNo ,
PlanType ,
PlanName ,
CompanyId ,
CscId ,
Year ,
CreateDate ,
ModifideDate
)
( SELECT BrokerId ,
RankId ,
MemberId ,
InstallmentId ,
PlanId ,
IntroducerId ,
Date ,
SelfAmount ,
UnitAmount ,
SpotAmount ,
ORBPercentageSelf ,
ORBPercentageUnit ,
ORBAmountSelf ,
ORBAmountUnit ,
IsSelfBusiness ,
Mode ,
InstallmentNo ,
PlanType ,
PlanName ,
CompanyId ,
CscId ,
Year ,
CreateDate ,
ModifideDate
FROM dbo.MemberBusiness AS mb
WHERE ( @CscId = 0
OR mb.CscId = @CscId
)
AND mb.Date >= @StartDate
AND mb.Date <= @EndDate
AND mb.RankId >= @FromRankId
AND mb.RankId <= @ToRankId
)
Your index should be built depending on how your data is used. I would suggest reading this Indexing Best Practices as a start.
An index can be created in a table to find data more quickly and efficiently.
The users cannot see the indexes, they are just used to speed up searches/queries.
Note: Updating a table with indexes takes more time than updating a table without (because the indexes also need an update). So you should only create indexes on columns (and tables) that will be frequently searched against.
SQL CREATE INDEX Syntax
Creates an index on a table. Duplicate values are allowed:
CREATE INDEX index_name
ON table_name (column_name)
SQL CREATE UNIQUE INDEX Syntax
Creates a unique index on a table. Duplicate values are not allowed:
CREATE UNIQUE INDEX index_name
ON table_name (column_name)
Note: The syntax for creating indexes varies amongst different databases. Therefore: Check the syntax for creating indexes in your database.
CREATE INDEX Example
The SQL statement below creates an index named "PIndex" on the "LastName" column in the "Persons" table:
CREATE INDEX PIndex
ON Persons (LastName)
If you want to create an index on a combination of columns, you can list the column names within the parentheses, separated by commas:
CREATE INDEX PIndex
ON Persons (LastName, FirstName)
First and foremost, an index allows a query to return results quickly. Most indexes provide a tree structure of some kind that allows a query to skip a lot of comparisons. Instead of checking each and every table row, the query checks whether a target value is greater or less than an index root value. If it is bigger, the query checks a bigger index entry; if it is smaller, it checks a smaller one, and so on. The beauty of this is that the query doesn't have to check many index entries before it finds out whether the target value exists and, if so, where an occurrence is in the data table.
It's sort of a "divide and conquer" strategy.
A developer or DBA tries to anticipate which table columns will be used in a lot of searches and creates indexes on those columns. The DB SW maintains the index. The DB adds and removes index entries as the underlying table is changed. The only thing the user should be aware of is faster response.
A simple index creation example would be
CREATE INDEX IX_EmployeeName
ON EMPLOYEE(NAME);
Complete index creation syntax for SQL Server 2008 R2 is available at
http://msdn.microsoft.com/en-us/library/ms188783(v=sql.105).aspx
The example you have provided involves a "table variable". That means you are creating a Transact-SQL variable that can be used like a table in a SQL statement. You often don't need indexes on this sort of table because they are often small. But, if you do need an index, they can be created implicitly as the example shows below. You cannot create them explicitly. You can create a temporary table, however, and index that like any other table.
DECLARE @Employee TABLE
(
ID INT PRIMARY KEY,
NAME VARCHAR(50),
UNIQUE (NAME,ID) -- ID is included to make the indexed value unique even if NAME is not
)
I like this article http://www.mssqltips.com/sqlservertip/1206/understanding-sql-server-indexing/
I understood what is the difference between clustered and non-clustered from here
I would split this into two queries. You don't want to pull all the data and the data for a single ID in the same stored procedure.
( @CscId = 0 OR mb.CscId = @CscId)
The primary reason is that you probably want a non-clustered index on CscId if you are looking for just, say, CscId = 104256, but if you are looking for all CscId values you probably want a non-clustered index on the Date column. I would also make sure you actually need a table variable. From what you have posted, there doesn't seem to be a good reason to use one.
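A minimal sketch of that split, assuming a cut-down temporary table in place of dbo.MemberBusiness. The index names, the sample parameter values and the choice to put both queries behind one IF are all assumptions; the two queries could just as well live in separate procedures.

```sql
-- Hypothetical, cut-down stand-in for dbo.MemberBusiness
CREATE TABLE #MemberBusiness (MemberId int, CscId int, RankId int, [Date] datetime);

-- One index per access pattern (names are assumptions)
CREATE NONCLUSTERED INDEX IX_MemberBusiness_CscId_Date ON #MemberBusiness (CscId, [Date]);
CREATE NONCLUSTERED INDEX IX_MemberBusiness_Date ON #MemberBusiness ([Date]);

DECLARE @CscId int = 104256;
DECLARE @StartDate datetime = '20120101';
DECLARE @EndDate datetime = '20121231';
DECLARE @FromRankId int = 1;
DECLARE @ToRankId int = 10;

IF @CscId = 0
    -- All CscId values: only the date range and rank range filter the rows
    SELECT MemberId, CscId, RankId, [Date]
    FROM #MemberBusiness AS mb
    WHERE mb.[Date] >= @StartDate
      AND mb.[Date] <= @EndDate
      AND mb.RankId >= @FromRankId
      AND mb.RankId <= @ToRankId;
ELSE
    -- A single CscId: the (CscId, Date) index can be used for a seek
    SELECT MemberId, CscId, RankId, [Date]
    FROM #MemberBusiness AS mb
    WHERE mb.CscId = @CscId
      AND mb.[Date] >= @StartDate
      AND mb.[Date] <= @EndDate
      AND mb.RankId >= @FromRankId
      AND mb.RankId <= @ToRankId;

DROP TABLE #MemberBusiness;
```

Each branch has a simple predicate without the OR, which gives the optimizer a better chance of picking the matching index.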

Resources