How to conditionally update a table - database

I have an address table (k_id as the primary key, plus the address) and an add_hist log table (k_id, address, change date), which holds every address per ID and the date on which the address changed.
I want an update query that sets the address column in the address table to the latest address from the add_hist table. My query is almost done, and it fetches the correct result. However, if a row in the address table is already up to date, I don't want to update it. Here is my query; please review and correct it to get the desired result.
update address a set k_add =
(select kad from (
select h.k_id kid, h.k_add kad, h.chg_dt from add_hist h,
(select k_id, max(chg_dt) ch from add_hist
group by k_id
) h1
where h1.k_id = h.k_id
and h1.ch=h.chg_dt
) h2
where h2.kid = a.k_id)
;

You could use a merge instead of an update:
merge into address a
using (
select k_id, max(k_add) keep (dense_rank last order by chg_dt) as k_add
from add_hist
group by k_id
) h
on (a.k_id = h.k_id)
when matched then
update set a.k_add = h.k_add
where (a.k_add is null and h.k_add is not null)
or (a.k_add is not null and h.k_add is null)
or a.k_add != h.k_add;
The query in the using clause finds the most recent address for each ID from the history table. When a matching ID exists in the main table, that row is updated, but only if the value is different, because of the where clause.
With some dummy data:
create table address (k_id number primary key, k_add varchar2(20));
create table add_hist (k_id number, k_add varchar2(20), chg_dt date);
insert into address (k_id, k_add) values (1, 'Address 1');
insert into address (k_id, k_add) values (2, 'Address 2');
insert into address (k_id, k_add) values (3, null);
insert into address (k_id, k_add) values (4, null);
insert into add_hist (k_id, k_add, chg_dt) values (1, 'Address 1', date '2017-01-01');
insert into add_hist (k_id, k_add, chg_dt) values (1, 'Address 2', date '2017-01-02');
insert into add_hist (k_id, k_add, chg_dt) values (1, 'Address 1', date '2017-01-03');
insert into add_hist (k_id, k_add, chg_dt) values (2, 'Address 1', date '2017-01-01');
insert into add_hist (k_id, k_add, chg_dt) values (2, 'Address 2', date '2017-01-02');
insert into add_hist (k_id, k_add, chg_dt) values (2, 'Address 3', date '2017-01-03');
insert into add_hist (k_id, k_add, chg_dt) values (3, 'Address 1', date '2017-01-01');
insert into add_hist (k_id, k_add, chg_dt) values (3, null, date '2017-01-02');
insert into add_hist (k_id, k_add, chg_dt) values (4, 'Address 1', date '2017-01-01');
commit;
Running your update statement gets:
4 rows updated.
select * from address;
K_ID K_ADD
---------- --------------------
1 Address 1
2 Address 3
3
4 Address 1
After rolling back to the starting state, running the merge gets:
2 rows merged.
select * from address;
K_ID K_ADD
---------- --------------------
1 Address 1
2 Address 3
3
4 Address 1
Same final result, but 2 rows merged rather than 4 rows updated.
(If you run the merge without the where clause, all four rows are still affected; without the null checks, only the row with ID 2 is updated.)
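The row counts above can be cross-checked outside Oracle. What follows is only a rough sketch in Python with SQLite, which has neither MERGE nor keep (dense_rank ...). It assumes a correlated subquery as a stand-in for the aggregate and SQLite's null-safe "is not" comparison as a stand-in for the three-part where clause, and it stores the dates as ISO text. The names build and latest are made up for the illustration.

```python
import sqlite3

def build():
    """Create an in-memory copy of the dummy data used above."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table address (k_id integer primary key, k_add text);
        create table add_hist (k_id integer, k_add text, chg_dt text);
        insert into address values
            (1, 'Address 1'), (2, 'Address 2'), (3, null), (4, null);
        insert into add_hist values
            (1, 'Address 1', '2017-01-01'), (1, 'Address 2', '2017-01-02'),
            (1, 'Address 1', '2017-01-03'), (2, 'Address 1', '2017-01-01'),
            (2, 'Address 2', '2017-01-02'), (2, 'Address 3', '2017-01-03'),
            (3, 'Address 1', '2017-01-01'), (3, null, '2017-01-02'),
            (4, 'Address 1', '2017-01-01');
    """)
    return conn

# Latest address per ID (stand-in for the "keep (dense_rank last ...)" aggregate).
latest = """(select h.k_add from add_hist h
             where h.k_id = address.k_id
             order by h.chg_dt desc limit 1)"""

# Unconditional update: every row is touched, changed or not.
conn = build()
updated_all = conn.execute(f"update address set k_add = {latest}").rowcount

# Conditional update: "is not" treats NULL as an ordinary value, so it
# plays the role of the null checks in the merge's where clause.
conn = build()
updated_changed = conn.execute(
    f"update address set k_add = {latest} where k_add is not {latest}"
).rowcount

final_rows = conn.execute(
    "select k_id, k_add from address order by k_id"
).fetchall()
print(updated_all, updated_changed)  # 4 2
print(final_rows)
```

In Oracle itself the merge shown earlier remains the natural form; the sketch only confirms that two of the four rows actually need changing.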

You can achieve the desired result with an UPDATE statement. Specifically, you need to "update through a join" (update with joins). The syntax has to be precise, though.
Using the same setup as in Alex's answer, the following update statement will update two rows (IDs 2 and 4), the only ones whose address differs from the latest history entry.
EDIT: See Alex Poole's comments below this Answer. The solution proposed here will work only in Oracle 12.1 and above. The problem is not the "update through join" concept, but the source rowset being the result of an aggregation. It has to do with the way in which Oracle knows, at compile time, that the "join" column in the source rowset is unique (it has no duplicates). In older versions of Oracle, an explicit unique or primary key constraint or index was required. Of course, when we GROUP BY <col>, the <col> will be unique in the result set of an aggregation, but it will not have a unique constraint or index on it. It seems Oracle recognized this situation, and since 12.1 it allows update through join where the source table is the result of an aggregation, as shown in this example.
update
( select a.k_add as current_address, q.new_address
from (
select k_id,
min(k_add) keep (dense_rank last order by chg_dt) as new_address
from add_hist
group by k_id
) q
join
address a on a.k_id = q.k_id
)
set current_address = new_address
where current_address != new_address
or current_address is null and new_address is not null
or current_address is not null and new_address is null
;

Related

Select all Main table rows with detail table column constraints with GROUP BY

I have two tables, tblMain and tblDetail, on SQL Server, linked by tblMain.id=tblDetail.OrderID and used for orders. I have not found exactly the same situation on Stack Overflow.
Here below is the sample table design:
/* create and populate tblMain: */
CREATE TABLE tblMain (
ID int IDENTITY(1,1) NOT NULL,
DateOrder datetime NULL,
CONSTRAINT PK_tblMain PRIMARY KEY
(
ID ASC
)
)
GO
INSERT INTO tblMain (DateOrder) VALUES('2021-05-20T12:12:10');
INSERT INTO tblMain (DateOrder) VALUES('2021-05-21T09:13:13');
INSERT INTO tblMain (DateOrder) VALUES('2021-05-22T21:30:28');
GO
/* create and populate tblDetail: */
CREATE TABLE tblDetail (
ID int IDENTITY(1,1) NOT NULL,
OrderID int NULL,
Gencod VARCHAR(255),
Quantity float,
Price float,
CONSTRAINT PK_tblDetail PRIMARY KEY
(
ID ASC
)
)
GO
INSERT INTO tblDetail (OrderID, Gencod, Quantity, Price) VALUES(1, '1234567890123', 8, 12.30);
INSERT INTO tblDetail (OrderID, Gencod, Quantity, Price) VALUES(1, '5825867890321', 2, 2.88);
INSERT INTO tblDetail (OrderID, Gencod, Quantity, Price) VALUES(3, '7788997890333', 1, 1.77);
INSERT INTO tblDetail (OrderID, Gencod, Quantity, Price) VALUES(3, '9882254656215', 3, 5.66);
INSERT INTO tblDetail (OrderID, Gencod, Quantity, Price) VALUES(3, '9665464654654', 4, 10.64);
GO
Here is my SELECT with grouping:
SELECT tblMain.id,SUM(tblDetail.Quantity*tblDetail.Price) AS TotalPrice
FROM tblMain LEFT JOIN tblDetail ON tblMain.id=tblDetail.orderid
WHERE (tblDetail.Quantity<>0) GROUP BY tblMain.id;
GO
This gives rows for id = 1 and id = 3 only.
The desired output also includes a row for id = 2.
We see that id = 2 is not shown even with the LEFT JOIN, as there are no records with OrderID = 2 in tblDetail.
How can I design a new query that shows tblMain.id = 2? Meanwhile, I must keep the WHERE (tblDetail.Quantity<>0) constraint. Many thanks.
EDIT:
The above query serves as a CTE (Common Table Expression) for a main query that also takes the payments table, tblPayments, into account.
After testing, both solutions work.
In my case, the main table has 15K records, while the detail table has some millions. With (tblDetail.Quantity<>0 OR tblDetail.Quantity IS NULL) AND tblDetail.IsActive=1 added to the JOIN ON clause, the query takes 37s to run, while with @pwilcox's first solution, where the condition is added to the WHERE clause, it finishes in 29s. So a time gain of about 20%.
The tblDetail.IsActive column lets me temporarily ignore detail rows by setting it to false.
So for me the solution is @pwilcox's answer:
where (tblDetail.quantity <> 0 or tblDetail.quantity is null)
Change
WHERE (tblDetail.Quantity<>0)
to
where (tblDetail.quantity <> 0 or tblDetail.quantity is null)
as the former will omit id = 2 because the corresponding quantity would be null in a left join.
And as HABO mentions, you can also make the condition part of your join logic rather than your where clause, avoiding the need for the 'or' condition.
select m.id,
totalPrice = sum(d.quantity * d.price)
from tblMain m
left join tblDetail d
on m.id = d.orderid
and d.quantity <> 0
group by m.id;
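To see the three variants side by side, here is a small sketch in Python with SQLite rather than SQL Server, using the sample rows from the question. The column types are simplified, and the helper run and the variable names are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table tblMain (id integer primary key, DateOrder text);
    create table tblDetail (id integer primary key, OrderID integer,
                            Gencod text, Quantity real, Price real);
    insert into tblMain (DateOrder) values
        ('2021-05-20T12:12:10'), ('2021-05-21T09:13:13'), ('2021-05-22T21:30:28');
    insert into tblDetail (OrderID, Gencod, Quantity, Price) values
        (1, '1234567890123', 8, 12.30), (1, '5825867890321', 2, 2.88),
        (3, '7788997890333', 1, 1.77), (3, '9882254656215', 3, 5.66),
        (3, '9665464654654', 4, 10.64);
""")

base = """select m.id, sum(d.Quantity * d.Price) as TotalPrice
          from tblMain m left join tblDetail d on m.id = d.OrderID {extra}
          group by m.id order by m.id"""

def run(extra):
    """Run the base query with an extra WHERE clause or extra ON condition."""
    return conn.execute(base.format(extra=extra)).fetchall()

# Original query: the WHERE filter discards the NULL-extended row for id 2.
where_only = run("where d.Quantity <> 0")
# First fix: let the NULL-extended row through the WHERE clause.
where_or_null = run("where (d.Quantity <> 0 or d.Quantity is null)")
# Second fix: filter the detail rows inside the join condition instead.
on_clause = run("and d.Quantity <> 0")

print([row[0] for row in where_only])     # [1, 3]
print([row[0] for row in where_or_null])  # [1, 2, 3]
print([row[0] for row in on_clause])      # [1, 2, 3]
```

The two fixes agree on this data, but they are not equivalent in general: an order whose detail rows all have quantity 0 is dropped by the where version and kept (with a NULL total) by the join version.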

Prevent Grouping rows by NULL value

According to this article:
When grouping with a column in a GROUP BY statement that contains NULLs, they will be put into one group in your result set:
However, what I want is to prevent grouping rows by NULL value.
The following code gives me one row:
IF(OBJECT_ID('tempdb..#TestTable') IS NOT NULL)
DROP TABLE #TestTable
GO
CREATE TABLE #TestTable ( ID INT, Value INT )
INSERT INTO #TestTable(ID, Value) VALUES
(NULL, 70),
(NULL, 70)
SELECT
ID
, Value
FROM #TestTable
GROUP BY ID, Value
The output is:
ID Value
NULL 70
However, I would like to have two rows. My desired result looks like this:
NULL 70
NULL 70
Is it possible to have two rows with GROUP BY?
UPDATE:
What I need is to count those rows:
SELECT
COUNT(1) AS rows
FROM (SELECT 1 AS foo
FROM #TestTable
GROUP BY ID, Value
)q
OUTPUT: 1
But, actually, there are two rows. I need output to have 2.
What you need is a way to make NULL values in Id unique. Using the following code will make the values unique, but continue to group the non-NULL value by virtue of the default value for a case expression being NULL:
group by Id, case when Id is NULL then NewId() end, Value
Assuming you want this behavior because you do want to group by the values of the nullable column (Id in your example), you can add a row_number when the id column is null using a common table expression to create an artificial difference between duplicate groups - like this:
-- Adding some more rows to the table
INSERT INTO #TestTable(ID, Value) VALUES
(NULL, 70),
(NULL, 70),
(1, 70),
(1, 70),
(2, 70);
The query, with the cte:
WITH CTE AS
(
SELECT Id, Value, IIF(Id IS NULL, ROW_NUMBER() OVER(ORDER BY Id), NULL) As Surrogate
FROM #TestTable
)
SELECT
ID
, Value
FROM CTE
GROUP BY ID, Surrogate, Value
Results:
ID Value
NULL 70
NULL 70
1 70
2 70
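The same idea can be tried without SQL Server. The sketch below uses Python with SQLite and assumes the table holds only the five rows from the insert above. In place of NewId() or ROW_NUMBER() it uses SQLite's built-in rowid, which is specific to SQLite, and a regular table named TestTable stands in for the temp table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table TestTable (ID integer, Value integer);
    insert into TestTable (ID, Value) values
        (null, 70), (null, 70), (1, 70), (1, 70), (2, 70);
""")

# Plain grouping: both NULL rows collapse into one group.
plain = conn.execute(
    "select ID, Value from TestTable group by ID, Value order by ID"
).fetchall()

# rowid is unique per row, so each NULL ID forms its own group; for
# non-NULL IDs the case expression yields NULL and grouping is unchanged.
split_nulls = conn.execute("""
    select ID, Value from TestTable
    group by ID, case when ID is null then rowid end, Value
    order by ID
""").fetchall()

print(len(plain), len(split_nulls))  # 3 4
```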

Can I pull Max and Min values in SQL without using group by for non aggregate values?

I have a table of user data for when they enroll in a program. The fields include a user ID, start date, end date, entry reason, exit reason and program type. For each year the user is enrolled in a specific program they will have an entry and exit date for that year along with an entry reason. They only get an exit reason when they are exited from the program completely. Here is an example of the data in the table.
Data Table
Desired Result
I need to pull one line for each user that has their original start date in the program, most recent start date, and most recent end date. I also need to pull the exit reason if one exists and entry reason associated with the most recent start date and this is what is getting me hung up. I’m assuming the problem is related to having to group by the entry reason. Is there any way around using an aggregate function to get the min/max dates?
My query is:
Select
Table1.userID,
CAST(Min(table2.startdate) as date) as Originalstartdate,
CAST(Max(table2.startdate) as date) as Maxstartdate,
CAST(Max(table2.enddate) as date) as ExitDate,
CASE
WHEN table2.exitreason = NULL then ''
ELSE table2.exitreason
END as Exitcode,
Table2.entryreason
From
Table1 left outer join
Table2 on Table1.userID = Table2.userID
Where
Table1.status = 'active' and Table2.programID = 'Program1' and (Table2.exitreason <> 'NULL' or Table2.entryreason <> 'NULL')
Group By
Table1.userID, Table2.exitreason, Table2.entryreason
I used the below sample code in order to generate this.
The idea here is to utilize the userID as the anchor (you want one row per user, right?), aggregating the rest of the information but with the situation you requested.
CREATE TABLE SCRIPT:
CREATE TABLE table1
(
userID INT IDENTITY(1, 1) PRIMARY KEY,
name VARCHAR(200),
stat CHAR(1) NOT NULL
DEFAULT 'A');
CREATE TABLE table2
(
t2ID INT IDENTITY(1, 1),
StartDate DATE,
UserID INT FOREIGN KEY REFERENCES table1(userID),
ProgramID VARCHAR(150) DEFAULT 'Program1',
EndDate DATE,
EntryReason VARCHAR(2000),
ExitReason VARCHAR(2000));
INSERT INTO Table1
(name)
SELECT *
FROM(VALUES
(
'First name'),
(
'Second name'),
(
'Third name')) x("name");
INSERT INTO Table2
SELECT *
FROM(VALUES
(
'20180101', 1, 'Program1', '20181231', 11, NULL),
(
'20190101', 1, 'Program1', '20191231', 12, NULL),
(
'20200101', 1, 'Program1', NULL, 11, NULL),
(
'20170101', 2, 'Program1', '20171231', 11, NULL),
(
'20180101', 2, 'Program1', '20171231', 14, 2),
(
'20200101', 3, 'Program1', NULL, 11, NULL)
) x(StartDate, UserID, ProgramID, EndDate, EntryReason, ExitReason);
QUERY:
SELECT t1.userID,
CAST(MIN(t2.StartDate) AS DATE)
AS OriginalStartDate, -- This uses your logic to grab the earliest date
CAST(MAX(t2.StartDate) AS DATE)
AS RecentStartDate, -- This utilizes your logic to grab the last start date
CAST(MAX(t2.enddate) AS DATE)
AS ExitDate,
-- This works because we know an ExitDate must be populated due to the where
-- criteria (which prevents people who haven't exited yet from showing up)
ISNULL(MAX(t2.exitreason), '')
AS ExitCode, -- This is just a cleaner way to handle nulls.
STUFF(
(
SELECT CONCAT(',', EntryReason)
FROM Table2
WHERE Table2.UserID=t1.UserID FOR XML PATH('')
), 1, 1, '')
AS EntryReasonList
-- this solution creates a list of entry reasons; we could pick a best winner
-- (e.g. first entry code, last entry code..) but I created a list because
-- I didn't understand your intent.
FROM Table1
AS t1
LEFT JOIN
Table2
AS t2
ON T1.userID=T2.userID
WHERE t1.stat='A' -- you would use status= 'active'
AND t2.programID='Program1' -- same as before
AND NOT EXISTS
-- a not exists clause will do what you want to filter graduates out
(
SELECT 1
FROM Table2
AS t2self
WHERE t2.userID=t2self.userID
AND t2self.exitreason IS NOT NULL
)
GROUP BY t1.userID;
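If a single "winner" is wanted instead of the list, one option is to pick the entry reason that belongs to the most recent start date, which is what the question describes. The following is a sketch of that variant in Python with SQLite, not the query above. It assumes a correlated subquery for the lookup, leaves out Table1, the status filter and the NOT EXISTS filter, and stores the sample dates as ISO text. The column alias RecentEntryReason is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table table2 (StartDate text, UserID integer, ProgramID text,
                         EndDate text, EntryReason text, ExitReason text);
    insert into table2 values
        ('2018-01-01', 1, 'Program1', '2018-12-31', '11', null),
        ('2019-01-01', 1, 'Program1', '2019-12-31', '12', null),
        ('2020-01-01', 1, 'Program1', null,         '11', null),
        ('2017-01-01', 2, 'Program1', '2017-12-31', '11', null),
        ('2018-01-01', 2, 'Program1', '2017-12-31', '14', '2'),
        ('2020-01-01', 3, 'Program1', null,         '11', null);
""")

rows = conn.execute("""
    select t2.UserID,
           min(t2.StartDate) as OriginalStartDate,
           max(t2.StartDate) as RecentStartDate,
           max(t2.EndDate)   as ExitDate,
           coalesce(max(t2.ExitReason), '') as ExitCode,
           -- entry reason of the row with the latest start date for this user
           (select r.EntryReason from table2 r
            where r.UserID = t2.UserID and r.ProgramID = 'Program1'
            order by r.StartDate desc limit 1) as RecentEntryReason
    from table2 t2
    where t2.ProgramID = 'Program1'
    group by t2.UserID
    order by t2.UserID
""").fetchall()

for row in rows:
    print(row)
```

Because the entry reason comes from a subquery rather than from the grouped columns, the outer query can still group by the user ID alone.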

sql server TOP command with order by clause

CREATE DATABASE TEST
USE TEST
CREATE TABLE TBL_TEMP
(
ID INT,
NAME VARCHAR(100),
CREATED_ON DATETIME
)
INSERT INTO TBL_TEMP VALUES (1, 'A', NULL)
INSERT INTO TBL_TEMP VALUES (2, 'B', NULL)
INSERT INTO TBL_TEMP VALUES (3, 'C', NULL)
INSERT INTO TBL_TEMP VALUES (4, 'D', NULL)
SELECT TOP 1 *
FROM TBL_TEMP
ORDER BY CREATED_ON
Result:
ID NAME CREATED_ON
------------------
2 B NULL
SELECT TOP 1 * FROM TBL_TEMP
Result:
ID NAME CREATED_ON
--------------------
1 A NULL
Why does TOP 1 give two different results? Is it that when an ORDER BY clause is used it picks a random row, and when it is not used it gives the proper top record?
Is it a kind of bug in SQL Server 2008?
SQL does not guarantee an order unless you specify an ORDER BY clause, so in the second example you get the first-inserted row by good fortune.
If you specify an ORDER BY clause, the order is not defined if the values to sort on are identical. SQL could have selected any one of the four.
This is not a bug, but defined behaviour in SQL.
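If a repeatable result is needed, a common approach is to add a unique column as a tie-breaker to the ORDER BY. The sketch below illustrates this in Python with SQLite, which uses LIMIT 1 where SQL Server uses TOP 1; the T-SQL counterpart would be SELECT TOP 1 * FROM TBL_TEMP ORDER BY CREATED_ON, ID. It assumes ID is unique, as it is in the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table TBL_TEMP (ID integer, NAME text, CREATED_ON text);
    insert into TBL_TEMP values (1, 'A', null), (2, 'B', null),
                                (3, 'C', null), (4, 'D', null);
""")

# All CREATED_ON values tie, so ID is added as a unique tie-breaker.
first = conn.execute(
    "select ID, NAME, CREATED_ON from TBL_TEMP order by CREATED_ON, ID limit 1"
).fetchone()
print(first)  # (1, 'A', None)
```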

Multiple Insert for each of the Account IDs?

How do I effectively insert multiple rows without using loop for all of the Account-ID values?
INSERT INTO Table1
(AccountID, ShowColumns, GroupColumns, AvgColumnsFlag)
VALUES
(1, 'foo1', 'foo2', 'foo3'),
(1, 'abc1', 'abc2', 'abc3'),
(1, 'xyz1', 'xyz1', 'xyz1')
In this case, I have over 20,000 account IDs. I can use another table that holds the unique account IDs and do some kind of join to get them, then use them in place of the example Account-ID of "1".
I don't know how you would do multiple inserts for each Account-ID.
Thanks...
[Edit]
I recently found a way to insert using data from another table, but unfortunately I can only insert 1 row per account, not multiple rows. :-( See the code below... Is it possible to consolidate the 3 statements into 1 instead?
INSERT INTO tblDealerSavedDataMyInventorySavedBuilds
(AccountId, LoadDefault, BuildName, ColumnShowAndSortOrderValues, ColumnGroupByValues, ColumnSortAverageValues)
SELECT DISTINCT tblaAccounts.AccountID, 0, 'My Inventory by Count', 'ImportStatus|StockNumber|Vin|Year|Make ASC|Model ASC|Trim|Mileage|PurchasePrice|StockDate|RepairCost|TotalCost|DaysInInventory|InventoryTrackerLocation|Category', 'Make|Model', 'MyInventoryCount-SortOrderByCount'
FROM tblaAccounts
ORDER BY tblaAccounts.AccountID ASC
INSERT INTO tblDealerSavedDataMyInventorySavedBuilds
(AccountId, LoadDefault, BuildName, ColumnShowAndSortOrderValues, ColumnGroupByValues, ColumnSortAverageValues)
SELECT DISTINCT tblaAccounts.AccountID, 0, 'My Inventory by Make', 'ImportStatus|StockNumber|Vin|Year|Make ASC|Model ASC|Trim|Mileage|PurchasePrice|StockDate|RepairCost|TotalCost|DaysInInventory|InventoryTrackerLocation|Category', 'Make|Model', 'MyInventoryCount-SortOrderByMake'
FROM tblaAccounts
ORDER BY tblaAccounts.AccountID ASC
INSERT INTO tblDealerSavedDataMyInventorySavedBuilds
(AccountId, LoadDefault, BuildName, ColumnShowAndSortOrderValues, ColumnGroupByValues, ColumnSortAverageValues)
SELECT DISTINCT tblaAccounts.AccountID, 0, 'My Inventory by Purchase Price', 'ImportStatus|StockNumber|Vin|Year|Make ASC|Model ASC|Trim|Mileage|PurchasePrice|StockDate|RepairCost|TotalCost|DaysInInventory|InventoryTrackerLocation|Category', 'Make|Model', 'MyInventoryCount-SortOrderByCost'
FROM tblaAccounts
ORDER BY tblaAccounts.AccountID ASC
First insert into #SourceTable all your values.
Then use this statement:
INSERT INTO Table1
SELECT *
FROM #SourceTable
It may look the same, but it's different, since you are actually addressing the table once instead of 20,000 times.
You can also do it this way:
INSERT INTO Table1
SELECT 1, 'foo1', 'foo2', 'f003'
UNION ALL
SELECT 2, 'abc11', 'abc2', 'abc3'
UNION ALL
...
To insert multiple rows with hard-coded values use
insert into table (col1, col2, col3, col4)
select 1, 'foo1', 'foo2', 'f003'
union all
select 2, 'abc11', 'abc2', 'abc3'
etc.
To insert from existing data
insert into table (col1, col2, col3)
select srccol1, srccol22, srccol33
from TableOrView
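For the "several rows per account" case in the question, the two ideas above can be combined: cross join the account list with a small set of hard-coded template rows. This is a different technique from the answers above, shown only as a sketch in Python with SQLite. The Accounts table and its three IDs are hypothetical stand-ins for the table of unique account IDs, and the aliases c1, c2 and c3 are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table Accounts (AccountID integer primary key);  -- hypothetical
    create table Table1 (AccountID integer, ShowColumns text,
                         GroupColumns text, AvgColumnsFlag text);
    insert into Accounts values (1), (2), (3);
""")

# One statement: every account is paired with every template row.
conn.execute("""
    insert into Table1 (AccountID, ShowColumns, GroupColumns, AvgColumnsFlag)
    select a.AccountID, t.c1, t.c2, t.c3
    from Accounts a
    cross join (
        select 'foo1' as c1, 'foo2' as c2, 'foo3' as c3
        union all select 'abc1', 'abc2', 'abc3'
        union all select 'xyz1', 'xyz1', 'xyz1'
    ) t
""")

count = conn.execute("select count(*) from Table1").fetchone()[0]
per_account = conn.execute(
    "select AccountID, count(*) from Table1 group by AccountID order by AccountID"
).fetchall()
print(count)        # 9
print(per_account)  # [(1, 3), (2, 3), (3, 3)]
```

With 20,000 accounts and three template rows, the same statement would produce 60,000 rows in a single insert.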
