I have a bookings table (table_b) with around 1.3M rows. A second table (table_s) records when these rows need to be accessed by a separate application.
Triggers currently create the records in table_s, but that doesn't help with the existing data.
I believe I need a query that selects the rows that exist in table_b but not in table_s and then inserts a row for each of them.
Here is my current attempt, but I don't think it is formed correctly:
DECLARE @b_id [INT] = 0;
WHILE(1 = 1)
BEGIN
SELECT TOP 10
@b_id = MIN([b].[b_id])
FROM
[table_b] AS [b]
LEFT JOIN
[table_s] AS [s] ON [b].[b_id] = [s].[b_id]
WHERE
[s].[b_id] IS NULL;
IF @b_id IS NULL
BREAK;
INSERT INTO [table_s] ([b_id], [processed])
VALUES (@b_id, 0);
END;
Syntactically everything is fine, but the query rests on a couple of misconceptions.
select top 10 @b_id = MIN(b.b_id)
A variable can hold only one value, so even with TOP 10 a single value is assigned to the variable (MIN also collapses the result to one row, so TOP 10 has no effect). Your current approach therefore loops once for every missing record.
I don't think an insert of around 1 million records needs to be split into batches. Try it this way:
INSERT INTO table_s
(b_id,
processed)
SELECT b_id,
0
FROM table_b AS b
WHERE NOT EXISTS (SELECT 1
FROM table_s AS s
WHERE b.b_id = s.b_id)
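If the single statement does turn out to be too heavy (for example because of log growth or blocking), a batched variant is possible. This is only a sketch under assumptions: the variable @batch_size and its value of 10000 are illustrative and not from the question, and the loop relies on @@ROWCOUNT reporting the rows inserted by the preceding statement.

```sql
-- Assumed batch size; tune for your environment
DECLARE @batch_size INT = 10000;

WHILE 1 = 1
BEGIN
    -- Insert the next chunk of bookings that have no row in table_s yet
    INSERT INTO table_s (b_id, processed)
    SELECT TOP (@batch_size) b.b_id, 0
    FROM table_b AS b
    WHERE NOT EXISTS (SELECT 1
                      FROM table_s AS s
                      WHERE s.b_id = b.b_id)
    ORDER BY b.b_id;

    -- A short (or empty) batch means nothing is left to backfill
    IF @@ROWCOUNT < @batch_size
        BREAK;
END;
```

Unlike the original loop, each iteration here inserts a whole set of rows, so the number of iterations is the row count divided by the batch size rather than one per row.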
I have a dimension table I'm trying to create that would require records with NULLs to be overwritten by a value when all other non-null fields match.
This logic works and shows what I mean by "null=Value evaluates to TRUE":
UPDATE A
SET
A.SSN = COALESCE(A.SSN, B.SSN)
,A.DOB = COALESCE(A.DOB, B.DOB)
,A.ID_1 = COALESCE(A.ID_1, B.ID_1)
,A.ID_2 = COALESCE(A.ID_2, B.ID_2)
,A.ID_3 = COALESCE(A.ID_3, B.ID_3)
,A.ID_4 = COALESCE(A.ID_4, B.ID_4)
FROM #TESTED1 A
INNER JOIN #TESTED1 B
ON (A.SSN = B.SSN
OR A.SSN IS NULL
OR B.SSN IS NULL)
AND (A.DOB = B.DOB
OR A.DOB IS NULL
OR B.DOB IS NULL)
AND (A.ID_1 = B.ID_1
OR A.ID_1 IS NULL
OR B.ID_1 IS NULL)
AND (A.ID_2 = B.ID_2
OR A.ID_2 IS NULL
OR B.ID_2 IS NULL)
AND (A.ID_3 = B.ID_3
OR A.ID_3 IS NULL
OR B.ID_3 IS NULL)
AND (A.ID_4 = B.ID_4
OR A.ID_4 IS NULL
OR B.ID_4 IS NULL)
WHERE A.ArbitraryTableID <> B.ArbitraryTableID
but the run time grows far faster than the number of records evaluated: 10k records take 9 sec, 100k records take 9 min, etc. I'm trying to do an initial load of around 30 million records, and then I will have to evaluate the entire table in a MERGE operation with another 10k records every day.
For example I would need the following three rows (that all exist on the same table) to combine into two rows with all values populated:
Just like this:
Unfortunately members can have multiple IDs so I can't count on any one of these IDs to be unique or even exist at all to cut down on my join conditions.
For the performance of this query, make sure you have an index covering all the columns you are joining on.
I did a quick example of what you described:
declare @test table (
row_name NVARCHAR(50),
id1 int null,
id2 int null,
id3 int null
)
insert into @test values('row1', 1,2,3), ('row2',1,4,5), ('row3',11,null,null), ('row4',null,4,null), ('row5',3,6,5), ('row6',3,null,null)
select *
from @test t1
inner join @test t2 on (
(t1.id1 = t2.id1
or t1.id1 is null
or t2.id1 is null)
and (
t1.id2 = t2.id2
or t1.id2 is null
or t2.id2 is null)
and (
t1.id3 = t2.id3
or t1.id3 is null
or t2.id3 is null)
)
where t1.row_name <> t2.row_name
order by t1.row_name
I see a couple of possible problems in my test output:
row3 and row4 in my example match even though they share no IDs, because every comparison between them involves a NULL. I'm guessing this is not desired, but if you really have several independent systems with different keys, is it possible that a lot of your rows fall into this scenario? Every row with only id1 set will match every row with only id2 set.
row1 and row4 do not match even though they should through transitivity (row1.id1 -> row2.id1, row2.id2 -> row4.id2).
Based on your response to my comment, I suggest the following solution:
a master record identifying the member/customer
child records for each master record storing the respective IDs
Replace your UPDATE statement with
INSERTs into the master table for all records in table A that are guaranteed to be unique (e.g. SSN).
INSERTs into the child table for all records in table A with not-NULL ID attributes
mark records in table A as processed by UPDATEing a foreign key column referencing the master records IDENTITY primary key
INSERT into the child table all records from A that you can safely assign to existing master records, and again set the FK
This solution would resolve the performance issues resulting from a 5-way JOIN, and also mark processed source records as processed.
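A minimal sketch of what such a master/child layout could look like follows. Every name here (dbo.MemberMaster, dbo.MemberIdentifier, IdType, IdValue) and every data type is an assumption made for illustration, not something taken from the question; the sketch also assumes ID_1 through ID_4 are all VARCHAR(50).

```sql
-- Hypothetical master table: one row per member/customer
CREATE TABLE dbo.MemberMaster (
    MemberID INT IDENTITY(1,1) PRIMARY KEY,
    SSN      CHAR(9) NULL,   -- assumed type
    DOB      DATE    NULL
);

-- Hypothetical child table: one row per known identifier of a member
CREATE TABLE dbo.MemberIdentifier (
    MemberID INT         NOT NULL REFERENCES dbo.MemberMaster (MemberID),
    IdType   TINYINT     NOT NULL,  -- 1..4 stands for ID_1..ID_4
    IdValue  VARCHAR(50) NOT NULL,  -- assumed type
    PRIMARY KEY (IdType, IdValue, MemberID)
);

-- Match source rows to existing members through any shared identifier.
-- The join is a plain equality on indexed columns, with no OR ... IS NULL branches.
SELECT DISTINCT a.ArbitraryTableID, i.MemberID
FROM #TESTED1 AS a
CROSS APPLY (VALUES (1, a.ID_1), (2, a.ID_2),
                    (3, a.ID_3), (4, a.ID_4)) AS v (IdType, IdValue)
JOIN dbo.MemberIdentifier AS i
  ON i.IdType = v.IdType
 AND i.IdValue = v.IdValue
WHERE v.IdValue IS NOT NULL;
```

Because NULL identifiers are simply not stored as child rows, the "NULL matches anything" rule no longer has to be expressed in the join condition.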
I'm working in SQL Server 2016 and have a confusing SQL problem. I have a TEMP table that contains unique rows. I have to insert 5 PRODUCTID values for each row in this temp table, based on another column value, AgentNo. The 5 PRODUCTID values come from another table, but there is no relationship between the tables. So my question is: how do I insert a row for each ProductID into this temp table for each unique row that is currently in it?
Here is a pic of the TEMP table that requires 5 rows for each:
Here is a pic of what I'm needing to come away with:
Here is my SQL code for both TEMP tables:
IF OBJECT_ID('tempdb..#tempTarget') IS NOT NULL DROP TABLE #tempTarget
SELECT 0 as ProductID, 1 as [Status], a.AgentNo, u.UserID, u.[Password], 'N' as AdminID, tel.LocationSysID --, tel.OwnerID, tel.LocationName, a.OwnerSysID, a.AgentName
INTO #tempTarget
FROM dbo.TEST_EvalLocations tel
INNER JOIN dbo.AGT_Agent a
ON tel.LocationName = a.AgentName
INNER JOIN dbo.IW_User u
ON a.AgentNo = u.UserID
WHERE tel.OwnerID = 13313
AND tel.LocationSysID <> 15434;
SELECT * FROM #tempTarget WHERE LocationSysID NOT IN (15425, 15434);
GO
-- Create source table
IF OBJECT_ID('tempdb..#tempSource') IS NOT NULL DROP TABLE #tempSource
SELECT DISTINCT lpr.ProductID
INTO #tempSource
FROM dbo.Eval_LocationProductRelationship lpr
WHERE lpr.ProductID IN (16, 15, 13, 14, 12) --BETWEEN 15435 AND 15595
Sorry, I could not get this into a DDL file as these are TEMP tables. Any help/direction would be appreciated. Thanks.
CROSS JOIN will be the best solution for your case.
If you only want 5 rows for each row in the first table, simply use the cross join query below.
SELECT B.ProductID,
A.[Status],
A.AgentNo,
A.UserID,
A.[Password] AS Value,
A.AdminID,
A.LocationSysID
FROM #tempTarget A
CROSS JOIN #tempSource B
If you want an additional row with 0, insert a 0 into your second temp table and use the same query.
INSERT INTO #tempSource SELECT 0
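Since the question asks for the expanded rows to end up in a temp table, the cross join result can be materialized with SELECT ... INTO. This is a sketch only: #tempResult is a hypothetical name, and the column list simply mirrors the columns created in #tempTarget above.

```sql
-- #tempResult is a hypothetical target name, not from the question
IF OBJECT_ID('tempdb..#tempResult') IS NOT NULL DROP TABLE #tempResult;

SELECT s.ProductID,
       t.[Status],
       t.AgentNo,
       t.UserID,
       t.[Password],
       t.AdminID,
       t.LocationSysID
INTO #tempResult
FROM #tempTarget AS t
CROSS JOIN #tempSource AS s;   -- every target row paired with every ProductID

SELECT * FROM #tempResult ORDER BY AgentNo, ProductID;
```

With 5 distinct ProductID values in #tempSource, each row of #tempTarget yields 5 rows in #tempResult.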
If I understand correctly, the scenario is as follows.
One temp table has all the content:
select * from #withoutProducts
and a second one is the product table:
select * from #products
Then the following is the query you are looking for:
select a.ProductID,[Status],AgentNo,UserID,[value]
from #products a cross join #withoutProducts b
order by AgentNO,a.productID
I was using the query below in SQL Server to update the table "TABLE" from itself. In SQL Server it works fine, but in DB2 it fails. I'm not sure whether I need to change anything in this query to make it work in DB2.
The error I am getting in DB2 is
ExampleExceptionFormatter: exception message was: DB2 SQL Error:
SQLCODE=-204, SQLSTATE=42704
This is my input data; you can see that ENO 679 is repeated in both round 3 and round 4.
My expected output is given below. Here I am taking the ID and round value from round 4 and updating row number 3 with the values from row number 4.
My requirement is to find the ENOs that exist in both round 3 and round 4 and update the values accordingly.
UPDATE TGT
SET TGT.ROUND = SRC.ROUND,
TGT.ID = SRC.ID
FROM TABLE TGT INNER JOIN TABLE SRC
ON TGT.ROUND='3' and SRC.ROUND='4' and TGT.ENO = SRC.ENO
Could someone help here, please? I tried something like this, but it's not working:
UPDATE TABLE
SET ID = (SELECT t.ID
FROM TABLE t, TABLE t2
WHERE t.ENO = t2.ENO AND t.ROUND = '4' AND t2.ROUND = '3'
) ,
ROUND = (SELECT t.ROUND
FROM TABLE t, TABLE t2
WHERE t.ENO = t2.ENO AND t.ROUND = '4' AND t2.ROUND = '3')
where ROUND='3'
You may try this. I think the issue is that you are not relating your inner subquery to the outer main table:
UPDATE TABLE TB
SET TB.ID = (SELECT t.ID
FROM TABLE t, TABLE t2
WHERE TB.ENO = t.ENO ---- added this
and t.ENO = t2.ENO AND t.ROUND = '4' AND t2.ROUND = '3'
) ,
TB.ROUND = (SELECT t.ROUND
FROM TABLE t, TABLE t2
WHERE TB.ENO = t.ENO --- added this
and t.ENO = t2.ENO AND t.ROUND = '4' AND t2.ROUND = '3')
where TB.ROUND='3'
Try this:
UPDATE MY_SAMPLE TGT
SET (ID, ROUND) = (SELECT ID, ROUND FROM MY_SAMPLE WHERE ENO = TGT.ENO AND ROUND = 4)
WHERE ROUND = 3 AND EXISTS (SELECT 1 FROM MY_SAMPLE WHERE ENO = TGT.ENO AND ROUND = 4);
The difference from yours is that the correlated subquery has to be a row-subselect: it must return zero or one row (and it assigns NULLs if it returns zero rows). The EXISTS subquery excludes the rows for which the correlated subquery would return no row.
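To make the roles of the row-subselect and the EXISTS filter concrete, here is a small DB2 sketch. The three sample rows are hypothetical (the original input data was posted as an image), and the column types are assumed to be integers; the UPDATE targets the round 3 rows and reads from the round 4 rows.

```sql
-- Hypothetical sample table and rows, for illustration only
CREATE TABLE MY_SAMPLE (ENO INT NOT NULL, ROUND INT, ID INT);

INSERT INTO MY_SAMPLE (ENO, ROUND, ID) VALUES
  (679, 3, 10),   -- round 3 row that has a round 4 counterpart
  (679, 4, 20),   -- round 4 row supplying the new values
  (680, 3, 30);   -- round 3 row with no round 4 counterpart

UPDATE MY_SAMPLE TGT
SET (ID, ROUND) = (SELECT ID, ROUND
                   FROM MY_SAMPLE
                   WHERE ENO = TGT.ENO AND ROUND = 4)   -- at most one row per ENO here
WHERE ROUND = 3
  AND EXISTS (SELECT 1
              FROM MY_SAMPLE
              WHERE ENO = TGT.ENO AND ROUND = 4);       -- skips ENO 680
```

The intent is that the (679, 3, 10) row takes ID 20 and round 4, while the ENO 680 row is left alone; without the EXISTS filter, that row's ID and ROUND would be set to NULL. If an ENO had more than one round 4 row, the row-subselect would return several rows and the statement would fail.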
I have a simple problem. How can I add a unique constraint to a table that ignores which column each value is stored in? For example, I have this table
ID_A ID_B
----------
1 2
... ...
In that example, I have the record (1,2). For me, (1,2) = (2,1), so I don't want to allow my database to store both. I know I can accomplish this using triggers, or checks and functions, but I was wondering if there is an instruction like
CREATE UNIQUE CONSTRAINT AS A SET_CONSTRAINT
You could write a view like this:
select 1 as Dummy
from T t1
join T t2 on t1.ID1 = t2.ID2 AND t1.ID2 = t2.ID1 --join to corresponding row
cross join TwoRows
And create a unique index on Dummy. TwoRows is a table that contains two rows with arbitrary contents. It is supposed to make the unique index fail if there ever is a row in it. Any row in this view indicates a uniqueness violation.
You can do this using Instead of Insert trigger.
Demo
Table Schema
CREATE TABLE te(ID_A INT,ID_B INT)
INSERT te VALUES ( 1,2)
Trigger
Go
CREATE TRIGGER trg_name
ON te
instead OF INSERT
AS
BEGIN
IF EXISTS (SELECT 1
FROM inserted a
WHERE EXISTS (SELECT 1
FROM te b
WHERE ( ( a.id_a = b.id_b
AND a.id_b = b.id_a )
OR ( a.id_a = b.id_a
AND a.id_b = b.id_b ) )))
BEGIN
PRINT 'duplicate record'
ROLLBACK
END
ELSE
INSERT INTO te
SELECT Id_a,id_b
FROM inserted
END
GO
SELECT * FROM te
Insert Script
INSERT INTO te VALUES (2,1) -- Duplicate
INSERT INTO te VALUES (1,2) --Duplicate
INSERT INTO te VALUES (3,2) --Will work
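A declarative alternative, not taken from either answer above, is to normalize each pair into a (smaller, larger) form with computed columns and put an ordinary unique constraint on that. This is a sketch only; the table name te_pairs and the column names lo_id and hi_id are hypothetical.

```sql
-- Hypothetical table: lo_id/hi_id hold the pair in sorted order
CREATE TABLE te_pairs (
    ID_A  INT NOT NULL,
    ID_B  INT NOT NULL,
    lo_id AS (CASE WHEN ID_A <= ID_B THEN ID_A ELSE ID_B END) PERSISTED,
    hi_id AS (CASE WHEN ID_A <= ID_B THEN ID_B ELSE ID_A END) PERSISTED,
    CONSTRAINT UQ_te_pairs UNIQUE (lo_id, hi_id)
);

INSERT INTO te_pairs (ID_A, ID_B) VALUES (1, 2);  -- accepted
INSERT INTO te_pairs (ID_A, ID_B) VALUES (2, 1);  -- rejected: same (lo_id, hi_id) as (1,2)
INSERT INTO te_pairs (ID_A, ID_B) VALUES (3, 2);  -- accepted
```

Because (1,2) and (2,1) both normalize to lo_id = 1, hi_id = 2, the unique constraint treats them as the same pair, and it also covers duplicates arriving within a single multi-row INSERT.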
First attempt at a cursor, so take it easy =P The cursor is supposed to grab a list of company ids that are all under an umbrella group, then target a specific company and copy its workflow records to the companies in the cursor.
It infinitely inserts these workflow records into all the companies ... what is the issue here?
Where is the n00b mistake?
DECLARE @GroupId int = 36;
DECLARE @CompanyToCopy int = 190
DECLARE @NextId int;
Declare @Companies CURSOR;
SET @Companies = CURSOR FOR
SELECT CompanyId
FROM Company C
INNER JOIN [Group] G
ON C.GroupID = G.GroupID
WHERE C.CompanyID != 190
AND
G.GroupId = @GroupId
AND
C.CompanyID != 0
OPEN @Companies
FETCH NEXT
FROM @Companies INTO @NextId
WHILE (@@FETCH_STATUS = 0)
BEGIN
INSERT INTO COI.Workflow(CompanyID, EndOfWorkflowAction, LetterType, Name)
(SELECT
@NextId,
W.EndOfWorkflowAction,
W.LetterType,
W.Name
FROM COI.Workflow W)
FETCH NEXT
FROM @Companies INTO @NextId
END
CLOSE @Companies;
DEALLOCATE @Companies;
Edit:
I decided to attempt making this set based because, after being told to do it, I realized I didn't quite know how to do it as a set based query.
Thanks for all the help everyone. I'll post the set based version for posterity.
INSERT INTO COI.Workflow(CompanyID, EndOfWorkflowAction, LetterType, Name)
(
SELECT
CG.CompanyId,
W.EndOfWorkflowAction,
W.LetterType,
W.Name
FROM COI.Workflow W
CROSS JOIN (SELECT C.CompanyID
FROM Company C
INNER JOIN [Group] G
ON G.GroupID = C.GroupID
WHERE C.CompanyID != 190
AND
C.CompanyID != 0
AND
G.GroupID = 36
) AS CG
WHERE W.CompanyID = 190
)
You have no WHERE condition on this:
SELECT
@NextId,
W.EndOfWorkflowAction,
W.LetterType,
W.Name
FROM COI.Workflow W
-- WHERE CompanyID = @CompanyToCopy -- This should be here
So you are getting a kind of doubling effect.
initial state, company 190, seed row (0)
pass one, company 2, copy of seed row (1)
now 2 rows
pass two, company 3, copy of seed row (0) - call this (2)
pass two, company 3, copy of copy of seed row (1) - call this (3)
now 4 rows
then 8 rows, etc
You are inserting a new copy of all workflow records in the workflow table on each iteration, so it doubles in size each time. If, for example, you have 30 items in your cursor, you will end up with a workflow table with 1073741824 (2^30) times as many records as it had before.
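For comparison, here is a sketch of the cursor loop with the missing filter applied, so that each pass copies only the source company's rows. The cursor name company_cursor and the LOCAL FAST_FORWARD options are choices made for this sketch, not part of the original code; the table and column names are those from the question.

```sql
DECLARE @GroupId int = 36;
DECLARE @CompanyToCopy int = 190;
DECLARE @NextId int;

-- company_cursor is an illustrative name; FAST_FORWARD makes it read-only, forward-only
DECLARE company_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT C.CompanyID
    FROM Company C
    INNER JOIN [Group] G
        ON C.GroupID = G.GroupID
    WHERE G.GroupID = @GroupId
      AND C.CompanyID NOT IN (0, @CompanyToCopy);

OPEN company_cursor;
FETCH NEXT FROM company_cursor INTO @NextId;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Copy only the template company's workflow rows to the current company
    INSERT INTO COI.Workflow (CompanyID, EndOfWorkflowAction, LetterType, Name)
    SELECT @NextId, W.EndOfWorkflowAction, W.LetterType, W.Name
    FROM COI.Workflow W
    WHERE W.CompanyID = @CompanyToCopy;

    FETCH NEXT FROM company_cursor INTO @NextId;
END;

CLOSE company_cursor;
DEALLOCATE company_cursor;
```

With the filter in place, the number of rows inserted per pass stays constant (the row count of company 190) instead of doubling.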
I believe your logic is wrong (it's somewhat hidden because of the use of a cursor!).
Your posted code attempts to insert a row into COI.Workflow for every row in COI.Workflow, once for each company matching your first SELECT's conditions. (Notice that your INSERT's SELECT statement has no condition: you are selecting the whole table.) Each time through the loop, you double the number of rows in COI.Workflow.
So, it's not infinite but it could well be very, very long!
I suggest you rewrite as a set based statement and the logic will become clearer.
Your first use of a cursor is OK; all the problems are in the INSERT ... SELECT logic.
I cannot understand what you need to insert into the COI.Workflow table.
I agree with the previous commenters that your SELECT, having no WHERE condition, doubles the records, but I cannot believe that you want to insert the full, doubled set of records for each company each time.
So I think you need something like
INSERT INTO COI.Workflow(CompanyID, EndOfWorkflowAction, LetterType, Name)
(SELECT TOP 1
@NextId,
W.EndOfWorkflowAction,
W.LetterType,
W.Name
FROM COI.Workflow W)
Or we need to know more about your logic for inserting the records.