SQL Query for Outer Join with Group By

I have a Months table with MonthName and MonthNumber. The fiscal year starts in July, so I have numbered the months like this:
MonthName=July and MonthNumber=1
MonthName=August and MonthNumber=2.
I have another domain table, BudgetCategory, with BudgetCategoryId and BudgetCategoryName.
The PurchaseOrder table has OrderID, PurchaseMonth, and BudgetCategoryId.
I want a query that returns the monthly purchases, SUM(TotalCost), for every BudgetCategory. If a BudgetCategoryId has no purchases, I want the report to show zero.
Schema of the tables:
CREATE TABLE [dbo].[BudgetCategory](
[BudgetCategoryId] [numeric](18, 0) NOT NULL,
[BudgetCategoryName] [varchar](50) NULL,
[TotalBudget] [nvarchar](50) NULL)
CREATE TABLE [dbo].[PurchaseOrder](
[OrderId] [bigint] NOT NULL,
[BudgetCategoryId] [bigint] NULL,
[PurchaseMonth] [nvarchar](50) NULL,
[QTY] [bigint] NULL,
[CostPerItem] [decimal](10, 2) NULL,
[TotalCost] [decimal](10, 2) NULL)
CREATE TABLE [dbo].[MonthTable](
[MonthNumber] [bigint] NULL,
[MonthName] [nvarchar](30) NULL)

Try this:
select a.BudgetCategoryName,
ISNULL(c.MonthName,'No purchase') as Month,
sum(ISNULL(TotalCost,0)) as TotalCost
from #BudgetCategory a left join #PurchaseOrder b on a.BudgetCategoryId = b.BudgetCategoryId
left join #MonthTable c on b.PurchaseMonth = c.[MonthName]
group by a.BudgetCategoryName,c.MonthName
order by a.BudgetCategoryName
Tested with this data
INSERT #BudgetCategory
VALUES (1,'CategoryA',1000),
(2,'CategoryB',2000),
(3,'CategoryC',1500),
(4,'CategoryD',2000)
INSERT #PurchaseOrder (OrderId,BudgetCategoryId,TotalCost,PurchaseMonth)
VALUES (1,1,550,'July'),
(2,1,700,'July'),
(3,2,600,'August')
INSERT #MonthTable
VALUES
(1,'July'),
(2,'August')
It will produce these results:
Let me know if this helps.

SELECT b.*, m.MonthNumber, q.[BudgetCategoryId], q.[PurchaseMonth], ISNULL(q.[TotalCost],0) AS [TotalCost]
FROM [dbo].[BudgetCategory] b
LEFT JOIN
(
SELECT [BudgetCategoryId], [PurchaseMonth], sum([TotalCost]) [TotalCost]
FROM [dbo].[PurchaseOrder] p
GROUP BY p.[BudgetCategoryId], [PurchaseMonth]
) q ON b.BudgetCategoryId = q.BudgetCategoryId
LEFT JOIN [dbo].[MonthTable] m ON q.[PurchaseMonth] = m.[MonthName]
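
The question asks for a zero for every category and month that has no purchases. One common way to get that is to CROSS JOIN the categories with the months first and then LEFT JOIN the purchases onto that grid. The following is a minimal sketch of the idea, not a tested SQL Server script: it assumes Python's built-in sqlite3 module (SQLite instead of SQL Server), simplified column types, and three sample categories. In T-SQL the same query shape should apply, with ISNULL or COALESCE for the zero.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for SQL Server; types are simplified.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE BudgetCategory (BudgetCategoryId INTEGER, BudgetCategoryName TEXT);
CREATE TABLE MonthTable (MonthNumber INTEGER, MonthName TEXT);
CREATE TABLE PurchaseOrder (OrderId INTEGER, BudgetCategoryId INTEGER,
                            PurchaseMonth TEXT, TotalCost NUMERIC);
INSERT INTO BudgetCategory VALUES (1,'CategoryA'),(2,'CategoryB'),(3,'CategoryC');
INSERT INTO MonthTable VALUES (1,'July'),(2,'August');
INSERT INTO PurchaseOrder VALUES (1,1,'July',550),(2,1,'July',700),(3,2,'August',600);
""")

# Build every (category, month) pair first, then attach purchases where they exist.
query = """
SELECT b.BudgetCategoryName,
       m.MonthName,
       COALESCE(SUM(p.TotalCost), 0) AS TotalCost
FROM BudgetCategory b
CROSS JOIN MonthTable m
LEFT JOIN PurchaseOrder p
       ON p.BudgetCategoryId = b.BudgetCategoryId
      AND p.PurchaseMonth = m.MonthName
GROUP BY b.BudgetCategoryName, m.MonthNumber, m.MonthName
ORDER BY b.BudgetCategoryName, m.MonthNumber
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Ordering by MonthNumber keeps the months in fiscal-year order (July first), which is what the MonthTable numbering in the question is for.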

Related

Select only the most recent datarows [duplicate]

I have a table that takes multiple entries for specific products, you can create a sample like this:
CREATE TABLE test(
[coltimestamp] [datetime] NOT NULL,
[col2] [int] NOT NULL,
[col3] [int] NULL,
[col4] [int] NULL,
[col5] [int] NULL)
GO
Insert Into test
values ('2021-12-06 12:31:59.000',1,8,5321,1234),
('2021-12-06 12:31:59.000',7,8,4047,1111),
('2021-12-06 14:38:07.000',7,8,3521,1111),
('2021-12-06 12:31:59.000',10,8,3239,1234),
('2021-12-06 12:31:59.000',27,8,3804,1234),
('2021-12-06 14:38:07.000',27,8,3957,1234)
You can think of col2 as a product number if you like.
What I need is a query that returns one row per col2 value; where col2 is not unique, it must pick the row with the most recent timestamp.
In other words, I need the most recent entry for each product.
So for the sample, the result has two fewer rows: the rows with the older timestamp for col2 = 7 and col2 = 27 are removed.
Thanks in advance for your help.

Assign a row number with ROW_NUMBER() within each col2 value, in descending order of timestamp, and keep only the rows numbered 1.
;with cte as(
Select rn=row_number() over(partition by col2 order by coltimestamp desc),*
From test
)
Select * from cte
Where rn=1;
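
To check the ROW_NUMBER() approach against the sample rows from the question, here is a small sketch. It assumes Python's built-in sqlite3 module with SQLite 3.25 or later (for window functions) rather than SQL Server, and it stores the timestamps as text, which sorts correctly for this fixed-width format.

```python
import sqlite3

# Assumption: SQLite (>= 3.25) stands in for SQL Server; timestamps are plain text.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE test (coltimestamp TEXT NOT NULL, col2 INTEGER NOT NULL,
                   col3 INTEGER, col4 INTEGER, col5 INTEGER);
INSERT INTO test VALUES
 ('2021-12-06 12:31:59.000',1,8,5321,1234),
 ('2021-12-06 12:31:59.000',7,8,4047,1111),
 ('2021-12-06 14:38:07.000',7,8,3521,1111),
 ('2021-12-06 12:31:59.000',10,8,3239,1234),
 ('2021-12-06 12:31:59.000',27,8,3804,1234),
 ('2021-12-06 14:38:07.000',27,8,3957,1234);
""")

# Number the rows of each col2 group from newest to oldest, then keep number 1.
query = """
WITH cte AS (
    SELECT ROW_NUMBER() OVER (PARTITION BY col2 ORDER BY coltimestamp DESC) AS rn, *
    FROM test
)
SELECT coltimestamp, col2, col3, col4, col5
FROM cte
WHERE rn = 1
ORDER BY col2
"""
latest = conn.execute(query).fetchall()
for row in latest:
    print(row)
```

If two rows of the same product share the newest timestamp, ROW_NUMBER() picks one of them arbitrarily; adding a tie-breaker column to the ORDER BY makes the choice deterministic.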

Selecting 17 million records in SQL Server is very slow

I am trying to select from a table with 17 million records.
It takes about 10 minutes.
Here you can see the live execution plan.
Here is my table structure:
CREATE TABLE [bas].[GatewayReceipt](
[Id] [INT] IDENTITY(1,1) NOT NULL,
[CustomerId] [INT] NULL,
[UserId] [INT] NOT NULL,
[RefNumber] [NVARCHAR](200) NULL,
[ResNumber] [NVARCHAR](200) NULL,
[Price] [DECIMAL](18, 5) NOT NULL,
[GatewayChannelId] [INT] NOT NULL,
[StatusId] [INT] NOT NULL,
[EntryDate] [DATETIME] NOT NULL,
[ModifyDate] [DATETIME] NULL,
[RowVersion] [TIMESTAMP] NOT NULL,
CONSTRAINT [PK_Bas_GatewayReceipt] PRIMARY KEY CLUSTERED
(
[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [FG_ATS]
) ON [FG_ATS]
GO
Note that I have 3 non-clustered indexes, on:
1:CustomerId
2:customerIdAndUserId
3:gatewaychannelId
My query:
select * from bas.GatewayReceipt where gatewaychannelId in (1,2,3)
Why is my query slow?

The results are slow primarily because of the time the client application (SSMS) needs to render the large 17 million row result.
To wit, it takes SSMS about 70 seconds to display the 10 million row result of this query in a grid on my PC, and Task Manager shows SSMS is completely CPU bound during execution:
WITH
t10 AS (SELECT n FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) t(n))
,t1k AS (SELECT 0 AS n FROM t10 AS a CROSS JOIN t10 AS b CROSS JOIN t10 AS c)
,t10m AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS num FROM t1k AS a CROSS JOIN t1k AS b CROSS JOIN t10 AS c)
SELECT num
FROM t10m;
Repeating the same query without the full rendering (Query-->Options-->Grid-->Discard results after execution), it takes only 12 seconds to retrieve the rows without displaying them.
Keep in mind that end-to-end response time measures both client and server time.

The table needs an INDEX starting with gatewaychannelId.
However, if most or all the rows in the table are values 1, 2, or 3, the index would actually slow down the query. In this case, it is faster to simply read the table, filtering out the few rows that do not apply.

How to fetch self referenced column value in sql

I have two tables with these columns:
CREATE TABLE Dept(
[ID] [int] IDENTITY(1,1) NOT NULL,
[Next_ID] [int] NULL,
[Name] [varchar](50) NOT NULL,
[Bundle_ID] [int] NULL
)
And
CREATE TABLE Bundle(
[Bundle_ID] [int] NOT NULL,
[Bundle_Name] [varchar](40) NOT NULL
)
I would like to fetch the name of the next bundle, so I tried:
SELECT dept.ID, dept.Next_ID, currentbundle.Bundle_Name CurrentBundleName
FROM Dept dept
join Bundle currentbundle on currentbundle.Bundle_ID = dept.Bundle_ID
join dept dept1 on dept1.Next_ID=dept.ID
With this, I only get CurrentBundleName. How do I fetch NextBundleName?
I would like output like this:
ID  NextID  CurrentBundleName  NextBundleName
********************************************************
1   3       template           excel
3   4       excel              word
4   NULL    word               NULL

CREATE TABLE #Dept(
[ID] [int] IDENTITY(1,1) NOT NULL,
[Next_ID] [int] NULL,
[Name] [varchar](50) NOT NULL,
[Bundle_ID] [int] NULL
)
CREATE TABLE #Bundle(
[Bundle_ID] [int] NOT NULL,
[Bundle_Name] [varchar](40) NOT NULL
)
INSERT INTO #Bundle
( Bundle_ID, Bundle_Name )
VALUES
( 1, 'One' ),
( 2, 'Two' ),
( 3, 'Three' ),
( 4, 'Four' )
INSERT INTO #Dept
( Next_ID, Name, Bundle_ID )
VALUES
( NULL, 'First', 1),
( 1, 'Second', 2),
( 2, 'Third', 3),
( 3, 'Fouth', 4)
select
d.ID,
d.Name AS DeptName,
d2.ID AS NextDeptId,
d2.Name AS NextDeptName,
b.Bundle_Name AS BundleName,
b2.Bundle_Name AS NextBundleName
FROM #Dept d
JOIN #Bundle b ON b.Bundle_ID = d.Bundle_ID
LEFT JOIN #Dept d2 ON d2.id=d.Next_ID
LEFT JOIN #Bundle b2 ON b2.Bundle_ID = d2.Bundle_ID
DROP TABLE #Bundle
DROP TABLE #Dept
Corrected Results:
ID  DeptName  NextDeptId  NextDeptName  BundleName  NextBundleName
1   First     NULL        NULL          One         NULL
2   Second    1           First         Two         One
3   Third     2           Second        Three       Two
4   Fouth     3           Third         Four        Three

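Please try this. Next_ID refers to another Dept row, so join Dept to itself first and then look up that row's bundle:
SELECT d.id,
d.next_id,
b.Bundle_Name current_bundle_name,
b1.Bundle_Name AS next_bundle_name
FROM dept d
INNER JOIN bundle b ON b.Bundle_ID = d.Bundle_ID
LEFT JOIN dept d1 ON d1.ID = d.Next_ID
LEFT JOIN bundle b1 ON b1.Bundle_ID = d1.Bundle_ID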
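
The self-join can be checked against the output the asker wants (IDs 1, 3, 4 with bundles template, excel, word). This is a sketch under assumptions: it uses Python's built-in sqlite3 module instead of SQL Server, and the Bundle_ID values (10, 20, 30) and department names are made up for illustration, since the question does not show its data.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; Bundle_ID values and Dept names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Bundle (Bundle_ID INTEGER NOT NULL, Bundle_Name TEXT NOT NULL);
CREATE TABLE Dept (ID INTEGER PRIMARY KEY, Next_ID INTEGER,
                   Name TEXT NOT NULL, Bundle_ID INTEGER);
INSERT INTO Bundle VALUES (10,'template'),(20,'excel'),(30,'word');
INSERT INTO Dept VALUES (1,3,'A',10),(3,4,'B',20),(4,NULL,'C',30);
""")

# Follow Next_ID to the next Dept row (nd), then look up that row's bundle (nb).
query = """
SELECT d.ID,
       d.Next_ID,
       b.Bundle_Name  AS CurrentBundleName,
       nb.Bundle_Name AS NextBundleName
FROM Dept d
JOIN Bundle b       ON b.Bundle_ID = d.Bundle_ID
LEFT JOIN Dept nd   ON nd.ID = d.Next_ID
LEFT JOIN Bundle nb ON nb.Bundle_ID = nd.Bundle_ID
ORDER BY d.ID
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Both joins toward the next row are LEFT JOINs so that the last department, whose Next_ID is NULL, still appears with a NULL next bundle.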

Performance loss in sql left join when using int

I have a question about left join performance.
Suppose I have these 2 tables:
CREATE TABLE "Jobs" (
"Id" INT NOT NULL,
"Department" VARCHAR(25) NULL,
"Job" VARCHAR(50) NOT NULL,
PRIMARY KEY ("Id", "JobName")
);
CREATE TABLE "Workers" (
"Id" INT NOT NULL,
"JobID" INT NOT NULL,
"WorkerName" VARCHAR(25) NULL,
"WorkerSurname" VARCHAR(25) NULL,
"JobName" VARCHAR(50) NOT NULL,
PRIMARY KEY ("Id")
);
Now, I want to left join both tables to get all the jobs in a specific department for a specific worker, even if the worker doesn't have that job.
select t1.job, t1.department, t2.WorkerName, t2.WorkerSurname
from (SELECT distinct job, department FROM Jobs WHERE Department = @depto) t1
left join dbo.Workers t2 on t1.id=t2.Jobid and t2.WorkerName=@NAME
This query takes about 0,171 ms.
But if I join on "Job" instead of "Id":
select t1.job, t1.department, t2.WorkerName, t2.WorkerSurname
from (SELECT distinct job, department FROM Jobs WHERE Department = @depto) t1
left join dbo.Workers t2 on t1.Job=t2.JobName and t2.WorkerName=@NAME
it takes about 0,030 ms.
Can anyone explain why this happens? I thought integer joins were faster than varchar ones.
Thanks

Copy Distinct Records Based on 3 Cols

I have loads of data in a table called Temp. The data contains duplicates.
The rows are not entirely identical, but they have the same values in 3 columns: HouseNo, DateofYear, TimeOfDay.
I want to copy only the distinct rows from "Temp" into another table, "ThermData."
Basically, I want to copy all the rows from Temp to ThermData that are distinct on those 3 columns, something like distinct(HouseNo,DateofYear,TimeOfDay).
I know that syntax doesn't exist, so I need an alternative way to do it.
Please help me out. I have tried lots of things but haven't solved it.
Sample data. Repeated values look like this:
I want to remove the duplicate rows based on the values of HouseNo, DateofYear, TimeOfDay.
HouseNo  DateofYear  TimeOfDay    Count
102      10/1/2009   0:00:02 AM   2
102      10/1/2009   1:00:02 AM   2
102      10/1/2009   10:00:02 AM  2

Here is a Northwind example based on the Orders table.
There are duplicates based on the (EmployeeID , ShipCity , ShipCountry) columns.
If you only execute the code between these 2 lines:
/* Run everything below this line to show crux of the fix */
/* Run everything above this line to show crux of the fix */
you'll see how it works. Basically:
(1) You run a GROUP BY on the 3 columns of interest. (derived1Duplicates)
(2) Then you join back to the table using these 3 columns. (on ords.EmployeeID = derived1Duplicates.EmployeeID and ords.ShipCity = derived1Duplicates.ShipCity and ords.ShipCountry = derived1Duplicates.ShipCountry)
(3) Then for each group, you tag them with Cardinal numbers (1,2,3,4,etc) (using ROW_NUMBER())
(4) Then you keep the row in each group that has the cardinal number of "1". (where derived2DuplicatedEliminated.RowIDByGroupBy = 1)
Use Northwind
GO
declare @DestinationVariableTable table (
NotNeededButForFunRowIDByGroupBy int not null ,
NotNeededButForFunDuplicateCount int not null ,
[OrderID] [int] NOT NULL,
[CustomerID] [nchar](5) NULL,
[EmployeeID] [int] NULL,
[OrderDate] [datetime] NULL,
[RequiredDate] [datetime] NULL,
[ShippedDate] [datetime] NULL,
[ShipVia] [int] NULL,
[Freight] [money] NULL,
[ShipName] [nvarchar](40) NULL,
[ShipAddress] [nvarchar](60) NULL,
[ShipCity] [nvarchar](15) NULL,
[ShipRegion] [nvarchar](15) NULL,
[ShipPostalCode] [nvarchar](10) NULL,
[ShipCountry] [nvarchar](15) NULL
)
INSERT INTO @DestinationVariableTable (NotNeededButForFunRowIDByGroupBy , NotNeededButForFunDuplicateCount , OrderID,CustomerID,EmployeeID,OrderDate,RequiredDate,ShippedDate,ShipVia,Freight,ShipName,ShipAddress,ShipCity,ShipRegion,ShipPostalCode,ShipCountry )
Select RowIDByGroupBy , MyDuplicateCount , OrderID,CustomerID,EmployeeID,OrderDate,RequiredDate,ShippedDate,ShipVia,Freight,ShipName,ShipAddress,ShipCity,ShipRegion,ShipPostalCode,ShipCountry
From
(
/* Run everything below this line to show crux of the fix */
Select
RowIDByGroupBy = ROW_NUMBER() OVER(PARTITION BY ords.EmployeeID , ords.ShipCity , ords.ShipCountry ORDER BY ords.OrderID )
, derived1Duplicates.MyDuplicateCount
, ords.*
from
[dbo].[Orders] ords
join
(
select EmployeeID , ShipCity , ShipCountry , COUNT(*) as MyDuplicateCount from [dbo].[Orders] GROUP BY EmployeeID , ShipCity , ShipCountry /*HAVING COUNT(*) > 1*/
) as derived1Duplicates
on ords.EmployeeID = derived1Duplicates.EmployeeID and ords.ShipCity = derived1Duplicates.ShipCity and ords.ShipCountry = derived1Duplicates.ShipCountry
/* Run everything above this line to show crux of the fix */
)
as derived2DuplicatedEliminated
where derived2DuplicatedEliminated.RowIDByGroupBy = 1
select * from @DestinationVariableTable
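
Applied to the tables named in the question, the same ROW_NUMBER() idea can be sketched as follows. The assumptions are considerable: Python's built-in sqlite3 module (SQLite 3.25 or later) stands in for SQL Server, the Reading column and all sample values are hypothetical, and SQLite's implicit rowid is used to decide which duplicate is kept. On SQL Server a real column, such as an identity or a timestamp, would have to play that role.

```python
import sqlite3

# Assumption: SQLite (>= 3.25) stands in for SQL Server; Reading is a made-up payload column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Temp (HouseNo INTEGER, DateofYear TEXT, TimeOfDay TEXT, Reading REAL);
CREATE TABLE ThermData (HouseNo INTEGER, DateofYear TEXT, TimeOfDay TEXT, Reading REAL);
INSERT INTO Temp VALUES
 (102,'10/1/2009','0:00:02 AM',20.5),
 (102,'10/1/2009','0:00:02 AM',20.7),
 (102,'10/1/2009','1:00:02 AM',21.0),
 (102,'10/1/2009','1:00:02 AM',21.0),
 (102,'10/1/2009','10:00:02 AM',22.1);
""")

# Number the rows inside each (HouseNo, DateofYear, TimeOfDay) group and copy only row 1.
conn.execute("""
INSERT INTO ThermData (HouseNo, DateofYear, TimeOfDay, Reading)
SELECT HouseNo, DateofYear, TimeOfDay, Reading
FROM (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY HouseNo, DateofYear, TimeOfDay
                              ORDER BY rowid) AS rn
    FROM Temp
) AS numbered
WHERE rn = 1
""")

copied = conn.execute(
    "SELECT HouseNo, DateofYear, TimeOfDay, Reading FROM ThermData"
).fetchall()
for row in copied:
    print(row)
```

Unlike the Northwind script, this sketch skips the GROUP BY join-back, because that step only supplies the duplicate count and is not needed to pick one row per group.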