T-SQL: Two Level Aggregation in Same Query

I have a query that joins a master and a detail table. Master table records are duplicated in the results, as expected. I get an aggregation on the detail table and it works fine. But I also need another aggregation on the master table in the same query, and because the master rows are duplicated by the join, that aggregation is duplicated too.
I want to demonstrate the situation as below:
If Object_Id('tempdb..#data') Is Not Null Drop Table #data
Create Table #data (Id int, GroupId int, Value int)
If Object_Id('tempdb..#groups') Is Not Null Drop Table #groups
Create Table #groups (Id int, Value int)
/* insert groups */
Insert #groups (Id, Value)
Values (1,100), (2,200), (3, 200)
/* insert data */
Insert #data (Id, GroupId, Value)
Values (1,1,10),
(2,1,20),
(3,2,50),
(4,2,60),
(5,2,70),
(6,3,90)
My select query is
Select Sum(data.Value) As Data_Value,
Sum(groups.Value) As Group_Value
From #data data
Inner Join #groups groups On groups.Id = data.GroupId
The result is:
Data_Value Group_Value
300 1000
The expected result is:
Data_Value Group_Value
300 500
Please note that a derived table or sub-query is not an option. Also, Sum(Distinct groups.Value) is not suitable for my case.
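To see where the extra 500 comes from, a quick per-group breakdown (a sketch against the sample tables above) shows each group's value being summed once per matching detail row, i.e. 100*2 + 200*3 + 200*1 = 1000:
/* Sketch: per-group duplication check against the sample tables above */
Select groups.Id,
Count(*) As DetailRows,
Sum(data.Value) As Data_Value,
Sum(groups.Value) As Repeated_Group_Value
From #data data
Inner Join #groups groups On groups.Id = data.GroupId
Group By groups.Id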

If I am not wrong, you just want to sum the Value column of both tables and show the totals in a single row. In that case you don't need to join them; just select each sum as a column, like:
SELECT (SELECT SUM(Value) FROM #data) AS Data_Value,
(SELECT SUM(Value) FROM #groups) AS Group_Value

SELECT
(
Select Sum(d.Value) From #data d
WHERE EXISTS (SELECT 1 FROM #groups WHERE Id = d.GroupId )
) AS Data_Value
,(
SELECT Sum( g.Value) FROM #groups g
WHERE EXISTS (SELECT 1 FROM #data WHERE GroupId = g.Id)
) AS Group_Value

I'm not sure exactly what you are looking for, but it seems like you want each group's value together with the aggregated detail value that represents that group in the data table.
In that case I would suggest something like this.
select Sum(t.Data_Value) as Data_Value, Sum(t.Group_Value) as Group_Value
from
(select Sum(data.Value) As Data_Value, groups.Value As Group_Value
from #data data
inner join #groups groups on groups.Id = data.GroupId
group by groups.Id, groups.Value)
as t
The edit should do the trick for you.
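Against the sample tables above, the derived table yields (30, 100), (180, 200) and (90, 200) per group, so the outer query should return Data_Value = 300 and Group_Value = 500 as expected.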

Related

Consolidation of two data rows in a loop for n occurrences

We are running a table that holds some information for order of new products.
From time to time we receive new orders from a 3rd party system and insert them into our DB.
Sometimes, however, for a specific order there is already an entry in our table.
So instead of checking whether there already IS an order, the colleagues just insert new data sets into our table.
Now that the process of inserting is streamlined, I am supposed to consolidate the existing duplicates in the table.
The table looks like this:
I have 138 of these pairs where the PreOrderNumber occurs twice. I'd like to copy the FK_VehicleFile number and the CommissionNumber to the row where the FK_Checklist is set and delete the duplicate with the missing FK_Checklist after that.
My idea is to write a transact script that looks like this:
First I store all the PreOrderNumbers that have duplicates in their own table:
DECLARE @ResultSet TABLE ( PK_OrderNumber int,
FK_Checklist int,
FK_VehicleFile int,
PreOrderNumbers varchar(20))
INSERT INTO @ResultSet
SELECT PK_OrderNumber, PreOrderNumber
FROM [LUX_WEB_SAM].[dbo].[OrderNumbers]
GROUP BY PreOrderNumber
HAVING (COUNT(PreOrderNumber) > 1)
And that's it so far.
I'm very new to this kind of SQL script.
I think I need to use some kind of loop over all entries in the @ResultSet table to grab the FK_VehicleFile and CommissionNumber from the first data set and store them in the second data set.
Or do you have any suggestions for solving this problem in an easier way?
This response uses a CTE:
WITH [MergedOrders] AS
(
Select
ROW_NUMBER() OVER(PARTITION BY row1.PreOrderNumber ORDER BY row1.PK_OrderNumber) AS Instance
,row1.PK_OrderNumber AS PK_OrderNumber
,ISNULL(row1.FK_Checklist,row2.FK_Checklist) AS FK_Checklist
,ISNULL(row1.FK_VehicleFile,row2.FK_VehicleFile) AS FK_VehicleFile
,ISNULL(row1.PreOrderNumber,row2.PreOrderNumber) AS PreOrderNumber
,ISNULL(row1.CommissionNumber,row2.CommissionNumber) AS CommissionNumber
FROM [LUX_WEB_SAM].[dbo].[OrderNumbers] AS row1
INNER JOIN [LUX_WEB_SAM].[dbo].[OrderNumbers] AS row2
ON row1.PreOrderNumber = row2.PreOrderNumber
AND row1.PK_OrderNumber <> row2.PK_OrderNumber
)
SELECT
[PK_OrderNumber]
,[FK_Checklist]
,[FK_VehicleFile]
,[PreOrderNumber]
,[CommissionNumber]
FROM [MergedOrders]
WHERE Instance = 1 /* If we were to maintain Order Number of second instance, use 2 */
Here's the explanation:
A Common Table Expression (CTE) acts as a named result set for the query, which we use to extract all rows that are repeated (NB: the INNER JOIN ensures that only rows whose PreOrderNumber occurs more than once are selected). We use ISNULL to switch out values where one or the other is NULL, then select the output for our destination table.
You can use the following scripts as a starting point for your UPDATE and DELETE actions.
Please keep in mind that both UPDATE and DELETE are risky operations, so test them against test data first.
CREATE TABLE #T(
Col1 VARCHAR(100),
Col2 VARCHAR(100),
Col3 VARCHAR(100),
Col4 VARCHAR(100),
Col5 VARCHAR(100)
)
INSERT INTO #T(Col1,Col2,Col3,Col4,Col5)
VALUES(30,NULL,222,00000002222,096),
(25,163,NULL,00000002222,NULL),
(30,163,NULL,00000002230,NULL)
SELECT * FROM #T
UPDATE A
SET A.Col3 = B.Col3, A.Col5 = B.Col5
FROM #T A
INNER JOIN #T B ON A.Col4 = B.Col4
WHERE A.Col2 IS NOT NULL AND B.Col2 IS NULL
DELETE FROM #T
WHERE Col4 IN (
SELECT Col4 FROM #T
GROUP BY Col4
HAVING COUNT(*) = 2
)
AND Col2 IS NULL
SELECT * FROM #T

Filling the ID column of a table NOT using a cursor

Tables have been created and used without an ID column, but an ID column is now needed (classic).
I heard everything could be done without cursors. I just need every row to contain a different int value, so I was looking for some kind of row number function:
How do I use ROW_NUMBER()?
I can't tell exactly how to use it, even with these examples.
UPDATE [TableA]
SET [id] = (select ROW_NUMBER() over (order by id) from [TableA])
Subquery returned more than 1 value.
So... yes, of course it returns more than one value. Then how do I mix UPDATE and ROW_NUMBER() to get that column filled?
PS. I don't need a precise order, just unique values. I also wonder if ROW_NUMBER() is appropriate in this situation...
You can use a CTE for the update
Example
Declare @TableA table (ID int,SomeCol varchar(50))
Insert Into @TableA values
(null,'Dog')
,(null,'Cat')
,(null,'Monkey')
;with cte as (
Select *
,RN = Row_Number() over(Order by (Select null))
From @TableA
)
Update cte set ID=RN
Select * from @TableA
Updated Table
ID SomeCol
1 Dog
2 Cat
3 Monkey
You can use a subquery too, as in:
Declare @TableA table (ID int,SomeCol varchar(50))
Insert Into @TableA values
(null,'Dog')
,(null,'Cat')
,(null,'Monkey');
UPDATE T1
SET T1.ID = T2.RN
FROM @TableA T1 JOIN
(
SELECT ROW_NUMBER()OVER(ORDER BY (SELECT 1)) RN,
*
FROM @TableA
) T2
ON T1.SomeCol = T2.SomeCol;
Select * from @TableA

Is there a way to retrieve inserted identity as well as some values from the query in an INSERT SELECT?

I have a situation in which I need to insert some values from a query into a table that has an identity PK. For some of the records, I also need to insert values into another table which has a 1-to-1 (partial) relationship:
CREATE TABLE A (
Id int identity primary key clustered,
Somevalue varchar(100),
SomeOtherValue int)
CREATE TABLE B (Id int primary key clustered,
SomeFlag bit)
DECLARE @inserted TABLE(NewId int, OldId int)
INSERT INTO A (Somevalue)
OUTPUT Inserted.Id into @inserted(NewId)
SELECT SomeValue
FROM A
WHERE <certain condition>
INSERT INTO B (Id, SomeFlag)
SELECT
i.NewId, B.SomeFlag
FROM @inserted i
JOIN A ON <some condition>
JOIN B ON A.Id = B.Id
The problem is that the query from A in the first INSERT/SELECT returns records that can only be differentiated by the Id, which I cannot insert. Unfortunately I cannot change the structure of the A table to store the "previous" Id, which would solve my problem.
Any idea that could lead to a solution?
With INSERT ... OUTPUT ... SELECT ... you can't output columns that are not in the target table. You can try MERGE instead:
MERGE INTO A as tgt
USING (SELECT Id, SomeValue FROM A WHERE <your conditions>) AS src
ON 0 = 1
WHEN NOT MATCHED THEN
INSERT (SomeValue)
VALUES (src.SomeValue)
OUTPUT inserted.Id, src.Id -- this is your new Id / old Id mapping
INTO @inserted (NewId, OldId)
;
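If B already has rows for the original Ids (as the JOIN in the question's second insert suggests), the captured mapping could then drive that insert, roughly like this (a sketch, assuming @inserted holds NewId and OldId as declared above):
/* Sketch: copy B's flag from each old row to its newly inserted counterpart */
INSERT INTO B (Id, SomeFlag)
SELECT i.NewId, b.SomeFlag
FROM @inserted i
JOIN B b ON b.Id = i.OldId;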
SCOPE_IDENTITY() returns the last identity value generated by the current session and current scope. You could stick that into a #table and use that to insert into B
SELECT SCOPE_IDENTITY() as newid into #c
Though, your INSERT INTO B join conditions imply to me that the value in B is already known?

Updating a primary key with another column of scrambled unique values

Disclaimer: this change is not generally a useful thing to do to a properly normalized database, but I have business reasons for it.
I have a table of data with a primary key of numeric values. This key is used as a foreign key reference in multiple other tables. There is also a column of numeric values that can be updated to reflect the desired order for the rows. The order and PK columns contain the same numbers, but ordering the table by either column scrambles the other one.
What I'm trying to do is to update the primary key to follow the same order as the order column, but SSMS gives me the error "Violation of PRIMARY KEY constraint 'PK_Constraint'. Cannot insert duplicate key in object 'tbl'. The duplicate key value is <value>."
My update statement looks like this:
update tbl set tbl.[key] = tbl.[order] where tbl.[key] <> tbl.[order]
I already know how to update the foreign key references in the other tables, so I just need to know how I can update the key in this situation.
Check to make sure that there are no duplicate values in tbl.Order. If there are, you must resolve the duplicates before you can update the PK column with those values.
SELECT
[order], COUNT([order]) as NumDupes
FROM tbl
GROUP BY [order]
HAVING COUNT([order]) > 1
I eventually figured out enough of the issue that I could solve this using a cursor. I'm putting my solution here for reference. If someone wants to simplify/modify this to use set-based queries, I'll accept that answer.
Step 1
Using a query from this answer, I found that there were a few order/ID "chains" where one end would produce a duplicate under a simple set-based update:
with parents as
(
select 1 idx, ID, [Order], Name from tbl where ID <> [Order]
union all
select idx+1, p.ID, v.[Order], p.Name from parents p inner join tbl v on p.[Order] = v.ID and idx < 100
)
select parents from (
select distinct parents from (
select *, parents = stuff
( ( select ', ' + cast(p.[Order] as varchar(100)) from parents p
where t.ID = p.ID for xml path('')
) , 1, 2, '') from parents t ) x ) y
order by len(parents) desc
Step 2
I manually looked through the result set to find the longest row that ended with a given value. I then put the values from one chain into a temp table in the order given:
create table #tmp (id int identity(1,1), val int)
insert into #tmp (val) values <list of values>
Step 3
Next I ran through the temp table with a cursor and updated each row (and foreign key references) individually:
declare @val int
declare @old int
declare val cursor for select val from #tmp order by id desc
open val
fetch next from val into @val
while @@fetch_status = 0
begin
set @old = (select ID from tbl where [Order] = @val)
insert into tbl(ID, <other columns>)
select @val, <other columns> from tbl where ID = @old
update <other tables> set FK_ID = @val where FK_ID = @old
delete from tbl where ID = @old
fetch next from val into @val
end
close val
deallocate val
Step 4
I repeated steps 2 and 3 for each "chain". At the end, my table had the primary key in the same order as the Order field.

Bulk insert with SQL Server where one column has many values and all other columns take on preset values

I have a table (Table1) that has five columns (ID1, ID2, Percent, Time, Expired). I want to insert a bunch of new rows into that table where ID1 is taken from another SQL query I have and all the other columns are set to some specified values.
So I have my query:
SELECT someID FROM other_tables WHERE <other conditions>
And essentially what I want to do is
FOR v in <above query>
Insert New row into Table1 (v, some second id, some percent, some time, some expired value)
EDIT: I'm not opposed to avoiding a loop; I just don't know the best way to insert the data.
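If a loop is not required, the most direct shape is probably a single INSERT ... SELECT in which the preset columns are plain literals (a sketch; the constant values are placeholders):
/* Sketch: set-based insert; the literals below are placeholder values */
INSERT INTO Table1 (ID1, ID2, [Percent], [Time], Expired)
SELECT someID, 2, 50, '2016-01-01 14:55:22', 0
FROM other_tables
WHERE <other conditions>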
I think you can use a cursor and FETCH for what you are trying to accomplish. Here is a shell for you...
WITH CURSOR
DECLARE c CURSOR FOR
SELECT DISTINCT colName FROM Table1 JOIN Table2 ON <stuff> WHERE <other_stuff>
DECLARE @ID VARCHAR(4) --or whatever is needed
OPEN c
FETCH NEXT FROM c INTO @ID
WHILE @@FETCH_STATUS = 0
BEGIN
INSERT INTO Table1 (ID, ID2, [Percent], [Time], Expired)
VALUES (@ID, some second id, some percent, some time, some expired value)
FETCH NEXT FROM c INTO @ID
END
CLOSE c
DEALLOCATE c
WITH CROSS APPLY (DUMMY DATA)
if object_id('tempdb..#ids') is not null drop table #ids
if object_id('tempdb..#idDetails') is not null drop table #idDetails
create table #ids (id int)
insert into #ids (id) values
(1),(2),(3)
select i.*, d.*
into #idDetails
from #ids i cross apply (select 2 as id2 ,2.0 as per,'1/1/2016' as dt,'x' as x) d
select * from #idDetails
WITH CROSS APPLY (EXAMPLE WITH YOUR TABLES)
select i.someID, d.*
into #idDetails
from other_tables i
cross apply (select 'some second id' as id2 ,'some percent' as [Percent],'1/1/2016 14:55:22' as [SomeTime],'SomeExpiredVal' as [ExpiredVal]) d
select * from #idDetails
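From there, the staged rows would presumably be loaded into Table1 with a plain INSERT ... SELECT (a sketch; target columns follow the question, source aliases follow the cross apply above):
/* Sketch: load the staged rows into Table1 */
INSERT INTO Table1 (ID1, ID2, [Percent], [Time], Expired)
SELECT someID, id2, [Percent], [SomeTime], [ExpiredVal]
FROM #idDetails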
Maybe I am missing something, but you need a table-valued function which returns the desired row for each row in Table1.
create function fn_get_new_recs(@id int)
RETURNS @results TABLE (Id INT,<other columns you need>)
AS
BEGIN
--Query here to return new records for a single id
RETURN
END
then use CROSS APPLY
INSERT INTO Table1(Id,Col1,Col2,Col3)
SELECT ST.Id,ST.Col1,ST.Col2,ST.Col3
FROM Table1 T
cross apply fn_get_new_recs(T.Id) ST
