Sum duplicated rows in SQL Server

Sum repeated values in a DataTable (VB.NET) or in SQL Server (whichever is the better solution)
I have a database table that contains repeated rows:
Description | Price | Q.ty | Tax
AAAAAAAAAAA | 10.00 | 20.0 | 5
AAAAAAAAAAA | 10.00 | 12.0 | 5
BBBBBBBBBBB | 18.00 | 09.0 | 5
BBBBBBBBBBB | 18.00 | 12.0 | 5
CCCCCCCCCCC | 13.00 | 15.0 | 5
AAAAAAAAAAA | 17.0 | 19.0 | 5
And I want to obtain something like this:
Description | Price | Q.ty | Tax
AAAAAAAAAAA | 10.00 | 51.0 | 5
BBBBBBBBBBB | 18.00 | 21.0 | 5
CCCCCCCCCCC | 13.00 | 15.0 | 5
I created a DataTable in VB.NET and tried to sum the values in it and then show the summed rows in a DataGridView, but without success. Then I tried to do this with SQL Server (2019), again without success.
Can someone help me, please?
My code is:
Dim conn as New SqlConnection("*****")
Dim cmd2 As New SqlCommand("SELECT Description, Price, SUM(Q.ty), Tax, FROM Prodotti GROUP BY Descrizione", conn)
Dim da As New SqlDataAdapter
da.SelectCommand = cmd2
Dim dt As New DataTable
dt.Clear()
da.Fill(dt)
DataGridView1.SuspendLayout()
DataGridView1.DataSource = dt
DataGridView1.ResumeLayout()
conn.Close()
I'd like to achieve this either in VB.NET or directly in SQL Server.

Given this table and data:
USE [Testing]
GO
/****** Object: Table [dbo].[SO71561140] Script Date: 26/03/2022 19:01:14 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[SO71561140](
[Description] [nchar](10) NOT NULL,
[Price] [decimal](19, 4) NOT NULL,
[Quantity] [decimal](19, 4) NOT NULL,
[Tax] [decimal](19, 4) NOT NULL
) ON [PRIMARY]
GO
INSERT [dbo].[SO71561140] ([Description], [Price], [Quantity], [Tax]) VALUES (N'A ', CAST(10 AS Decimal(19, 4)), CAST(20 AS Decimal(19, 4)), CAST(5 AS Decimal(19, 4)))
GO
INSERT [dbo].[SO71561140] ([Description], [Price], [Quantity], [Tax]) VALUES (N'A ', CAST(10 AS Decimal(19, 4)), CAST(12 AS Decimal(19, 4)), CAST(5 AS Decimal(19, 4)))
GO
INSERT [dbo].[SO71561140] ([Description], [Price], [Quantity], [Tax]) VALUES (N'B ', CAST(18 AS Decimal(19, 4)), CAST(9 AS Decimal(19, 4)), CAST(5 AS Decimal(19, 4)))
GO
INSERT [dbo].[SO71561140] ([Description], [Price], [Quantity], [Tax]) VALUES (N'B ', CAST(18 AS Decimal(19, 4)), CAST(12 AS Decimal(19, 4)), CAST(5 AS Decimal(19, 4)))
GO
INSERT [dbo].[SO71561140] ([Description], [Price], [Quantity], [Tax]) VALUES (N'C ', CAST(13 AS Decimal(19, 4)), CAST(15 AS Decimal(19, 4)), CAST(5 AS Decimal(19, 4)))
GO
INSERT [dbo].[SO71561140] ([Description], [Price], [Quantity], [Tax]) VALUES (N'A ', CAST(17 AS Decimal(19, 4)), CAST(19 AS Decimal(19, 4)), CAST(5 AS Decimal(19, 4)))
GO
and this small program:
Imports System.Data.SqlClient
Public Class Form1
Sub X()
Dim csb As New SqlConnectionStringBuilder() With {.DataSource = ".\SQLEXPRESS",
.InitialCatalog = "Testing",
.IntegratedSecurity = True}
Dim sql = "
SELECT [Description], SUM([Quantity]) AS 'Quantity', SUM([Quantity] * [Price]) AS 'Total Price', [Tax]
FROM SO71561140
GROUP BY [Description], [Tax]
"
Dim dt As New DataTable()
Using conn = New SqlConnection(csb.ConnectionString),
cmd = New SqlCommand(sql, conn),
da = New SqlDataAdapter(cmd)
da.Fill(dt)
End Using
DataGridView1.DataSource = dt
End Sub
Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
X()
End Sub
End Class
I got this result (shown as a screenshot in the original post):
You would need to include the tax rate in the total price per item to be able to remove the tax rate from the displayed data, but that may or may not be a good idea depending on what is needed by whoever pays the bills.
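The grouping used in the answer can be tried without a SQL Server instance. The following is only a minimal sketch that uses Python's built-in sqlite3 module as a stand-in; the table name products and the lower-case column names are assumptions made for this illustration, not part of the original post. It loads the sample rows and groups them by description and tax, summing the quantity and the per-row totals.

```python
import sqlite3


def summed_rows():
    """Group the sample rows by description and tax, summing quantity and line totals."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table mirroring the columns of the sample data.
    conn.execute(
        "CREATE TABLE products (description TEXT, price REAL, quantity REAL, tax REAL)"
    )
    conn.executemany(
        "INSERT INTO products VALUES (?, ?, ?, ?)",
        [
            ("A", 10, 20, 5),
            ("A", 10, 12, 5),
            ("B", 18, 9, 5),
            ("B", 18, 12, 5),
            ("C", 13, 15, 5),
            ("A", 17, 19, 5),
        ],
    )
    rows = conn.execute(
        """
        SELECT description, SUM(quantity), SUM(quantity * price), tax
        FROM products
        GROUP BY description, tax
        ORDER BY description
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in summed_rows():
        print(row)
```

Every non-aggregated column in the SELECT list (description and tax here) appears in GROUP BY, which is what SQL Server requires. Price is left out as a plain column because the rows for "A" carry two different prices.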

Related

I have two tables in SQL Server database - users and vacation

I want to select all users from the users table together with their vacation start and end dates from the vacation table, so that I can compute their vacation balance. If a user has taken no vacation, the query should return 0.
Here is my query:
SELECT
dbo.USERINFO.USERID,
dbo.USERINFO.NAME AS إسم_الموظف,
SUM(DATEDIFF(d, dbo.vacation.STARTSDAY, dbo.vacation.ENDSDAY) + 1) AS عددالايام,
dbo.userinfo.balance - (SUM(DATEDIFF(d, dbo.vacation.STARTSDAY, dbo.vacation.ENDSDAY) + 1)) AS الرصيد
FROM
dbo.USERINFO
LEFT JOIN
dbo.Vacation ON dbo.Vacation.UserID = dbo.USERINFO.USERID
WHERE
dbo.USERINFO.DEFAULTDEPTID <> 7
AND dbo.vacation.DATEID = 2
AND year(dbo.vacation.STARTSDAY) = 2020
AND month(dbo.vacation.STARTSDAY) BETWEEN 1
AND 12
GROUP BY
dbo.USERINFO.userid,
dbo.USERINFO.NAME,
dbo.userinfo.balance

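You need to move the conditions on the vacation table out of the WHERE clause and into a CASE WHEN expression inside the SUM. Filtering on columns of the left-joined table in WHERE discards the rows where those columns are NULL, which removes the users who have no matching vacation: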
SELECT U.USERID,
U.NAME AS إسم_الموظف,
ISNULL(SUM(CASE WHEN V.DATEID = 2
AND year(V.STARTSDAY) = 2020
THEN DATEDIFF(d, V.STARTSDAY, V.ENDSDAY)
ELSE 0
END), 0) AS عددالايام,
ISNULL(U.balance, 0) - ISNULL(SUM(CASE WHEN V.DATEID = 2
AND year(V.STARTSDAY) = 2020
THEN DATEDIFF(d, V.STARTSDAY, V.ENDSDAY)
ELSE 0
END), 0) AS الرصيد
FROM dbo.USERINFO AS U
LEFT JOIN dbo.Vacation AS V ON V.UserID = U.USERID
WHERE U.DEFAULTDEPTID <> 7
GROUP BY U.userid,
U.NAME,
U.balance
Results:
+--------+------------+-----------+--------+
| USERID | إسم_الموظف | عددالايام | الرصيد |
+--------+------------+-----------+--------+
| 1 | NAME 1 | 8 | 2 |
+--------+------------+-----------+--------+
| 2 | NAME 2 | 6 | 4 |
+--------+------------+-----------+--------+
| 4 | NAME 4 | 0 | 10 |
+--------+------------+-----------+--------+
The script used to generate the dummy data is:
CREATE TABLE USERINFO (USERID int, NAME VARCHAR(20), BALANCE INT, DEFAULTDEPTID INT)
INSERT INTO USERINFO VALUES
(1, 'NAME 1', 10, 5),
(2, 'NAME 2', 10, 4),
(3, 'NAME 3', 10, 7),
(4, 'NAME 4', 10, 5)
CREATE TABLE Vacation (UserID INT, STARTSDAY DATETIME, ENDSDAY DATETIME, DATEID INT)
INSERT INTO Vacation VALUES
(1, '2020-01-12', '2020-01-20', 2),
(1, '2020-01-22', '2020-01-24', 3),
(2, '2020-01-27', '2020-01-31', 2),
(2, '2020-03-27', '2020-03-29', 2),
(7, '2020-03-27', '2020-03-29', 2)
Note that UserId 3 is retired (DEFAULTDEPTID = 7), so it is not included in the result.
Note that the vacation with DATEID 3 is not included when calculating the sum.
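The difference between filtering in WHERE and filtering inside the aggregate can be reproduced with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, so the date functions (julianday, strftime) are SQLite's rather than DATEDIFF and YEAR, and the lower-case table and column names are assumptions for this illustration.

```python
import sqlite3

SETUP = """
CREATE TABLE userinfo (userid INT, name TEXT, balance INT, defaultdeptid INT);
INSERT INTO userinfo VALUES
    (1, 'NAME 1', 10, 5), (2, 'NAME 2', 10, 4),
    (3, 'NAME 3', 10, 7), (4, 'NAME 4', 10, 5);
CREATE TABLE vacation (userid INT, startsday TEXT, endsday TEXT, dateid INT);
INSERT INTO vacation VALUES
    (1, '2020-01-12', '2020-01-20', 2),
    (1, '2020-01-22', '2020-01-24', 3),
    (2, '2020-01-27', '2020-01-31', 2),
    (2, '2020-03-27', '2020-03-29', 2),
    (7, '2020-03-27', '2020-03-29', 2);
"""

# Whole days between start and end (SQLite equivalent of DATEDIFF(d, ...)).
DAYS = "CAST(julianday(v.endsday) - julianday(v.startsday) AS INTEGER)"

# Conditions on the joined table live inside the aggregate: every user survives.
FILTER_IN_CASE = f"""
SELECT u.userid,
       SUM(CASE WHEN v.dateid = 2 AND strftime('%Y', v.startsday) = '2020'
                THEN {DAYS} ELSE 0 END) AS days_taken
FROM userinfo AS u
LEFT JOIN vacation AS v ON v.userid = u.userid
WHERE u.defaultdeptid <> 7
GROUP BY u.userid
ORDER BY u.userid
"""

# Same conditions in WHERE: users without a matching vacation row disappear.
FILTER_IN_WHERE = f"""
SELECT u.userid, SUM({DAYS}) AS days_taken
FROM userinfo AS u
LEFT JOIN vacation AS v ON v.userid = u.userid
WHERE u.defaultdeptid <> 7
  AND v.dateid = 2 AND strftime('%Y', v.startsday) = '2020'
GROUP BY u.userid
ORDER BY u.userid
"""


def run(query):
    """Run one of the queries against a fresh in-memory copy of the sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SETUP)
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(run(FILTER_IN_CASE))
    print(run(FILTER_IN_WHERE))
```

With the conditions inside CASE, user 4 is returned with 0 days; with the same conditions in WHERE, user 4 is missing because the NULL vacation columns fail the comparison.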

STRING_AGG with CASE WHEN

The schema
CREATE TABLE person
(
[first_name] VARCHAR(10),
[surname] VARCHAR(10),
[dob] DATE,
[person_id] INT
);
INSERT INTO person ([first_name], [surname], [dob] ,[person_id])
VALUES
('Alice', 'AA', '1/1/1960', 1),
('Bob' , 'AA', '1/1/1980', 2),
('Carol', 'AA', '1/1/2018', 3),
('Dave' , 'BB', '1/1/1960', 4),
('Elsa', ' BB', '1/1/1980', 5),
('Fred' , 'BB', '1/1/1990', 6),
('Gina' , 'BB', '1/1/2018', 7);
CREATE TABLE person_membership
(
[person_id] INT,
[personstatus] VARCHAR(1),
[membership_id] INT,
[relationship] INT
);
INSERT INTO person_membership ([person_id], [personstatus], [membership_id], [relationship])
VALUES
(1, 'A', 10, 1),
(2, 'A', 10, 2),
(3, 'A', 10, 3),
(4, 'A', 20, 1),
(5, 'A', 20, 2),
(6, 'A', 20, 4),
(7, 'A', 20, 5);
In this simplified schema, the person with relationship set to 1 is the main policy holder, while other numbers show how the remaining people are related to the main policy holder (spouse, children, etc.).
The problem
Show all dependants for each main policy holder and group them within arbitrarily chosen age groups.
The desired output:
person_id|membership_id|first_name|dependants under 10|dependants over 10
---------+-------------+----------+-------------------+-------------------
1 | 10 | Alice | Carol | Bob
4 | 20 | Dave | Gina | Elsa, Fred
8 | 30 | Helen | Ida, Joe, Ken | NULL
My efforts so far:
SELECT
sub.person_id, sub.membership_id, sub.first_name,
STRING_AGG (sub.dependant, ',')
FROM
(SELECT
person.person_id, person_membership.membership_id,
person.first_name, p.first_name AS 'dependant',
DATEDIFF(yy, CONVERT(DATETIME, p.dob), GETDATE()) AS 'age'
FROM
person
LEFT JOIN
person_membership ON person.person_id = person_membership.person_id
LEFT JOIN
memship ON person_membership.membership_id = memship.membership_id
LEFT JOIN
person_membership pm ON person_membership.membership_id = pm.membership_id AND pm.relationship > 1
LEFT JOIN
person p ON pm.person_id = p.person_id
WHERE
person_membership.relationship = 1) as sub
GROUP BY
sub.person_id, sub.membership_id, sub.first_name
I can't figure out how to use CASE WHEN with STRING_AGG.
When I try something like
"CASE WHEN age < 10 THEN STRING_AGG (sub.dependant, ',') ELSE NULL END as 'Under 10'"
the server rightly protests that the column is not contained in either an aggregate function or the GROUP BY clause,
but of course grouping by it doesn't solve the problem either, so there is a trick that I am missing. I'm also sure the main query itself can be written more simply.
Edit - solution
As @Gserg rightly pointed out, and as I realised moments after posting the question, the solution is very simple: use CASE WHEN inside STRING_AGG, not the other way around. Doh.
string_agg(case when age < 10 then sub.dependant else null end, ', ') as 'Under 10'
I'm still looking for suggestions and ideas on how to improve my original query.

You can shorten this by using the IIF function when there is a single condition.
SELECT sub.person_id, sub.membership_id, sub.first_name,
STRING_AGG (iif(age < 10, sub.dependant, null), ',') 'Under 10'
FROM (SELECT person.person_id, person_membership.membership_id, person.first_name, p.first_name AS 'dependant',
DATEDIFF(yy,CONVERT(DATETIME, p.dob),GETDATE()) AS 'age'
FROM person
LEFT JOIN person_membership ON person.person_id = person_membership.person_id
LEFT JOIN person_membership memship ON person_membership.membership_id = memship.membership_id
LEFT JOIN person_membership pm ON person_membership.membership_id = pm.membership_id AND pm.relationship > 1
LEFT JOIN person p ON pm.person_id = p.person_id
WHERE person_membership.relationship = 1) as sub
GROUP BY sub.person_id, sub.membership_id, sub.first_name
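The idea of putting the condition inside the aggregate, once per age group, can be sketched outside SQL Server as well. The example below is an assumption-laden illustration: it uses Python's sqlite3 module, where group_concat plays the role of STRING_AGG, ISO-formatted dates instead of the '1/1/1960' literals, and a fixed reference_date parameter (hypothetical, chosen so the result does not depend on the day it is run) instead of GETDATE().

```python
import sqlite3

SETUP = """
CREATE TABLE person (first_name TEXT, surname TEXT, dob TEXT, person_id INT);
INSERT INTO person VALUES
    ('Alice', 'AA', '1960-01-01', 1), ('Bob',  'AA', '1980-01-01', 2),
    ('Carol', 'AA', '2018-01-01', 3), ('Dave', 'BB', '1960-01-01', 4),
    ('Elsa',  'BB', '1980-01-01', 5), ('Fred', 'BB', '1990-01-01', 6),
    ('Gina',  'BB', '2018-01-01', 7);
CREATE TABLE person_membership
    (person_id INT, personstatus TEXT, membership_id INT, relationship INT);
INSERT INTO person_membership VALUES
    (1, 'A', 10, 1), (2, 'A', 10, 2), (3, 'A', 10, 3), (4, 'A', 20, 1),
    (5, 'A', 20, 2), (6, 'A', 20, 4), (7, 'A', 20, 5);
"""

QUERY = """
SELECT holder.person_id, main.membership_id, holder.first_name,
       -- CASE without ELSE yields NULL, and the aggregate skips NULLs
       group_concat(CASE WHEN dep.age < 10 THEN dep.first_name END, ', ') AS under_10,
       group_concat(CASE WHEN dep.age >= 10 THEN dep.first_name END, ', ') AS over_10
FROM person_membership AS main
JOIN person AS holder ON holder.person_id = main.person_id
LEFT JOIN (
    SELECT pm.membership_id, p.first_name,
           -- year difference, comparable to DATEDIFF(yy, dob, reference date)
           CAST(strftime('%Y', :ref) AS INTEGER)
           - CAST(strftime('%Y', p.dob) AS INTEGER) AS age
    FROM person_membership AS pm
    JOIN person AS p ON p.person_id = pm.person_id
    WHERE pm.relationship > 1
) AS dep ON dep.membership_id = main.membership_id
WHERE main.relationship = 1
GROUP BY holder.person_id, main.membership_id, holder.first_name
ORDER BY holder.person_id
"""


def dependants_by_age_group(reference_date="2024-01-01"):
    """Return one row per main policy holder with two aggregated name lists."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SETUP)
    rows = conn.execute(QUERY, {"ref": reference_date}).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in dependants_by_age_group():
        print(row)
```

The order of names inside each aggregated list is not guaranteed here; in SQL Server, STRING_AGG accepts a WITHIN GROUP (ORDER BY ...) clause when a fixed order is needed.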

SQL Server - How to convert nvarchar values to a datetime?

I have two nvarchar columns.
Sample data:
column1='20180402', column2='134259'
My goal is to create a datetime column by combining these two columns,
like this: '2018-04-02 13:42:59.000'
How can I do that?
Please help me.

You can either convert both strings to a datetime value and add them together to get your combined datetime or combine the strings and convert the result.
Note the time will need two : characters added to correctly parse:
select cast(d as datetime) + cast(stuff(stuff(t,5,0,':'),3,0,':') as datetime) as dt1
,cast(d + ' ' + stuff(stuff(t,5,0,':'),3,0,':') as datetime) as dt2
from (values('20180402','134259')) as v(d,t);
Output
+-------------------------+-------------------------+
| dt1 | dt2 |
+-------------------------+-------------------------+
| 2018-04-02 13:42:59.000 | 2018-04-02 13:42:59.000 |
+-------------------------+-------------------------+
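The string manipulation behind the second expression can be traced step by step. This is only an illustrative sketch in Python: the helper stuff is a hypothetical re-implementation of what T-SQL STUFF does with a 1-based start position, and combine is likewise a name invented for this example.

```python
from datetime import datetime


def stuff(text, start, length, insert):
    """Mimic T-SQL STUFF: 1-based start, remove `length` characters, insert `insert`."""
    i = start - 1
    return text[:i] + insert + text[i + length:]


def combine(date_text, time_text):
    """Turn 'yyyymmdd' and 'hhmmss' strings into one datetime value."""
    # Insert the colon at position 5 first, so position 3 is still valid afterwards.
    time_with_colons = stuff(stuff(time_text, 5, 0, ":"), 3, 0, ":")
    return datetime.strptime(f"{date_text} {time_with_colons}", "%Y%m%d %H:%M:%S")


if __name__ == "__main__":
    print(stuff(stuff("134259", 5, 0, ":"), 3, 0, ":"))
    print(combine("20180402", "134259"))
```

Working from the right-hand position to the left matters: inserting at position 3 first would shift the remaining characters and the second colon would land in the wrong place.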

You can use DATETIMEFROMPARTS:
SELECT DATETIMEFROMPARTS(
SUBSTRING(column1, 1, 4),
SUBSTRING(column1, 5, 2),
SUBSTRING(column1, 7, 2),
SUBSTRING(column2, 1, 2),
SUBSTRING(column2, 3, 2),
SUBSTRING(column2, 5, 2),
0
)
FROM (VALUES
('20180402', '134259')
) v(column1, column2)

declare @d varchar(50)='20180402',
@t varchar(50)='134259'
select convert(varchar(50),convert(date,@d)) + ' '+ convert(varchar(50),dateadd(hour, (@t / 10000) % 100,
dateadd(minute, (@t / 100) % 100,
dateadd(second, (@t / 1) % 100,
cast('00:00:00' as time(3))))) )

How to rewrite a WHILE loop using a CTE

I have two tables, one with Events, the other with episodes.
An Episode has a start date and end date, the event has a single date.
Both Episodes and Events have one of six Types.
Currently I'm using some fuzzy logic to run an update script on the Events table to set its ID field to the matching Episode. It does this by checking for the Event date between the Episode start and end, both having the same Type, as well as some other links like same User etc.
Since the Events can sit outside of the Episode, or have a different Type, what I do is loop through a sequence of expanding date ranges (StartDate-1, -2 etc.) and also cycle through each Type looking for a match.
I've been reading that WHILE loops aren't very efficient, so I was wondering whether there is a way to rewrite this nested loop as a CTE.
I'm using SQL Server 2012.
Event List is just a temp table that has all the possible Types with an order to loop through.
My loop currently is:
WHILE @CurrentBefore <= @Before and @CurrentAfter <= @After
BEGIN
SET @Row = 0
WHILE @Row <= @MaxRow
BEGIN
UPDATE P
SET P.ID = E.ID
FROM Event P
OUTER APPLY (SELECT TOP 1 E.Id, E.Type
FROM Episode E
WHERE E.User = P.User AND
E.Type = CASE WHEN @Row=0 THEN P.Event ELSE (SELECT Event FROM #EventList WHERE RN = @Row) END AND
P.Date BETWEEN E.StartDate-@CurrentBefore AND E.EndDate+@CurrentAfter
ORDER BY P.Date) E
WHERE P.ID = 0
-- increment @Row here
END
-- increment @CurrentBefore / @CurrentAfter here
END
Sample Data:
IF OBJECT_ID('tempdb..#EventList') IS NOT NULL
DROP TABLE #EventList
CREATE TABLE #EventList(Event Varchar(50), RN INT);
INSERT INTO #EventList SELECT 'A', 1
INSERT INTO #EventList SELECT 'B', 2
INSERT INTO #EventList SELECT 'C', 3
INSERT INTO #EventList SELECT 'D', 4
INSERT INTO #EventList SELECT 'E', 5
INSERT INTO #EventList SELECT 'F', 6
CREATE TABLE dbo.Episode ([ID] INT, [Start] DateTime, [End] DateTime, [Type] varchar(1), [User] INT)
INSERT INTO [dbo].Episode ([ID], [Start], [End], [Type],[User])
VALUES
(1, '2018-07-01 10:00', '2018-07-02 14:00', 'A',10),
(2, '2018-07-05 6:00', '2018-07-06 13:00', 'A',11),
(3, '2018-07-03 9:00', '2018-07-04 8:00', 'B',10),
(4, '2018-07-02 15:00', '2018-07-03 7:00', 'B',12),
(5, '2018-07-01 1:00', '2018-07-02 8:00', 'C',13),
(6, '2018-07-01 6:00', '2018-07-01 8:00', 'D',11)
CREATE TABLE dbo.Event ([ID] INT, [Date] DateTime, [Type] varchar(1), [User] INT)
INSERT INTO [dbo].Event ([ID], [Date], [Type],[User])
VALUES
(0, '2018-07-01 12:00', 'A',10),
(0, '2018-07-05 15:00', 'A',11),
(0, '2018-07-03 13:00', 'C',10),
(0, '2018-07-10 9:00', 'B',12),
(0, '2018-07-01 5:00', 'C',10),
(0, '2018-07-01 10:00', 'D',11)
Expected result, Event now looks like this:
1 2018-07-01 12:00:00.000 A 10
2 2018-07-05 15:00:00.000 A 11
3 2018-07-03 13:00:00.000 C 10
0 2018-07-10 09:00:00.000 B 12
1 2018-07-01 05:00:00.000 C 10
6 2018-07-01 10:00:00.000 D 11

I don't know if I fully got the logic, but this might help to get you running:
USE master;
GO
CREATE DATABASE TestDB
GO
USE TestDB;
GO
CREATE TABLE dbo.Episode ([ID] INT, [Start] DateTime, [End] DateTime, [Type] varchar(1), [User] INT)
INSERT INTO [dbo].Episode ([ID], [Start], [End], [Type],[User])
VALUES
(1, '2018-07-01 10:00', '2018-07-02 14:00', 'A',10),
(2, '2018-07-05 6:00', '2018-07-06 13:00', 'A',11),
(3, '2018-07-03 9:00', '2018-07-04 8:00', 'B',10),
(4, '2018-07-02 15:00', '2018-07-03 7:00', 'B',12),
(5, '2018-07-01 1:00', '2018-07-02 8:00', 'C',13),
(6, '2018-07-01 6:00', '2018-07-01 8:00', 'D',11)
CREATE TABLE dbo.[Event] ([ID] INT, [Date] DateTime, [Type] varchar(1), [User] INT)
INSERT INTO [dbo].[Event] ([ID], [Date], [Type],[User])
VALUES
(0, '2018-07-01 12:00', 'A',10),
(0, '2018-07-05 15:00', 'A',11),
(0, '2018-07-03 13:00', 'C',10),
(0, '2018-07-10 9:00', 'B',12),
(0, '2018-07-01 5:00', 'C',10),
(0, '2018-07-01 10:00', 'D',11)
GO
CREATE TABLE #EventList(Event Varchar(50), RN INT);
INSERT INTO #EventList VALUES ('A', 1),('B', 2),('C', 3),('D', 4),('E', 5),('F', 6);
WITH mathingEpisodes AS
(
SELECT ev.ID AS evID
,ev.[Date] AS evDate
,ev.[Type] AS evType
,ev.[User] AS evUser
,e1.RN AS evRN
,ep.ID AS epID
,ep.[Type] AS epType
,e2.RN AS epRN
FROM [Event] ev
LEFT JOIN Episode ep ON ev.[User]=ep.[User] AND ev.[Date] >= ep.[Start] AND ev.[Date] < ep.[End]
LEFT JOIN #EventList e1 ON ev.[Type]=e1.[Event]
LEFT JOIN #EventList e2 ON ep.[Type]=e2.[Event]
)
SELECT COALESCE(epID,Closest.ID) AS FittingEpisodeID
,me.evDate
,evType
,evUser
FROM mathingEpisodes me
OUTER APPLY(SELECT TOP 1 *
FROM Episode ep
CROSS APPLY(SELECT ABS(DATEDIFF(SECOND,me.evDate,ep.[Start])) AS DiffToStart
,ABS(DATEDIFF(SECOND,me.evDate,ep.[End])) AS DiffToEnd) Diffs
CROSS APPLY(SELECT CASE WHEN DiffToStart<DiffToEnd THEN DiffToStart ELSE DiffToEnd END AS Smaller) Diffs2
WHERE ep.[User] = me.evUser
AND me.epID IS NULL
ORDER BY Diffs2.Smaller
) Closest
ORDER BY evDate;
GO
USE master;
GO
DROP DATABASE TestDB;
GO
DROP TABLE #EventList
GO
The result
1 2018-01-07 05:00:00.000 C 10
6 2018-01-07 10:00:00.000 D 11
1 2018-01-07 12:00:00.000 A 10
3 2018-03-07 13:00:00.000 C 10
2 2018-05-07 15:00:00.000 A 11
4 2018-10-07 09:00:00.000 B 12
Some explanation
In the CTE I try to find fitting episodes (same user and date within range).
The OUTER APPLY then computes the closest episode for the same user in all cases where the CTE did not find one.
The only difference for this sample is the event for user 12. My logic binds it to the closest episode of this user (ID=4), while your expected output shows a zero in this place.
Anyway, my solution is fully set-based, therefore likely faster than a loop, and should be rather close to your needs. Try to adapt it...
UPDATE Some more thoughts...
I did not get the gist of your #EventList... I bound its values into the set (you can make them visible by using SELECT * instead of the explicit column list). But this is presumably not what you meant...
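The two-step matching described in the explanation (an episode that contains the event, otherwise the nearest episode of the same user) can be sketched procedurally to check the logic against the sample data. This is a plain Python illustration, not a replacement for the set-based query; the names EPISODES, EVENTS and fitting_episode_id are made up for this example.

```python
from datetime import datetime


def at(text):
    """Parse a 'YYYY-MM-DD HH:MM' timestamp."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M")


# (id, start, end, type, user), copied from the sample data
EPISODES = [
    (1, at("2018-07-01 10:00"), at("2018-07-02 14:00"), "A", 10),
    (2, at("2018-07-05 06:00"), at("2018-07-06 13:00"), "A", 11),
    (3, at("2018-07-03 09:00"), at("2018-07-04 08:00"), "B", 10),
    (4, at("2018-07-02 15:00"), at("2018-07-03 07:00"), "B", 12),
    (5, at("2018-07-01 01:00"), at("2018-07-02 08:00"), "C", 13),
    (6, at("2018-07-01 06:00"), at("2018-07-01 08:00"), "D", 11),
]

# (date, type, user), copied from the sample data
EVENTS = [
    (at("2018-07-01 12:00"), "A", 10),
    (at("2018-07-05 15:00"), "A", 11),
    (at("2018-07-03 13:00"), "C", 10),
    (at("2018-07-10 09:00"), "B", 12),
    (at("2018-07-01 05:00"), "C", 10),
    (at("2018-07-01 10:00"), "D", 11),
]


def fitting_episode_id(event_date, user):
    """Return the containing episode's ID, else the closest one for the same user."""
    candidates = [ep for ep in EPISODES if ep[4] == user]
    # Step 1: an episode whose range contains the event (start <= date < end).
    for ep_id, start, end, _type, _user in candidates:
        if start <= event_date < end:
            return ep_id
    if not candidates:
        return None
    # Step 2: fall back to the episode whose start or end is nearest to the event.
    closest = min(
        candidates,
        key=lambda ep: min(abs(event_date - ep[1]), abs(event_date - ep[2])),
    )
    return closest[0]


if __name__ == "__main__":
    for date, _type, user in EVENTS:
        print(fitting_episode_id(date, user), date, user)
```

As in the query, the event type is not used for matching, and the event of user 12 is bound to episode 4 rather than left at 0.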

SQL Pivot Using three tables

I am running into some trouble trying to pivot some data out of SQL.
I have three tables that will comprise the data.
Table 1: (Clause)
-Clause
-ClauseName
Table 2: (Process)
-Id
-ProcessName
Table 3: (RELProcessClauses)
-ProcessId
-Clause
-WeightedValue
Ultimately, I am looking to have a matrix of data that is Clause, ClauseName down the left, ProcessName across the top and the Weighted value to correspond between Process and Clause.
Not sure if this will make much sense.

Join the three tables and use PIVOT on it. You can run the following query:
SELECT * FROM (
SELECT
c.Clause,
c.ClauseName,
p.ProcessName,
pc.WeightedValue
from RELProcessClauses pc
JOIN Clause c on pc.clause = c.clause
JOIN Process p on pc.ProcessId = p.id
) x
PIVOT (
SUM(WeightedValue)
FOR ProcessName IN ([ProcessName1], [ProcessName2], [ProcessName3])
) as pvt
Output table:
+--------+-------------+--------------+--------------+--------------+
| Clause | ClauseName | ProcessName1 | ProcessName2 | ProcessName3 |
+--------+-------------+--------------+--------------+--------------+
| 1 | ClauseName1 | 10 | 15 | 30 |
| 2 | ClauseName2 | 15 | 20 | 30 |
| 3 | ClauseName3 | 20 | 20 | 30 |
+--------+-------------+--------------+--------------+--------------+
The query/output works on the demo tables created using the query below:
CREATE TABLE Clause (
Clause int,
ClauseName varchar(255)
);
CREATE TABLE Process (
Id int,
ProcessName varchar(255)
);
CREATE TABLE RELProcessClauses (
ProcessId int,
Clause int,
WeightedValue int
);
INSERT INTO Clause VALUES
(1, 'ClauseName1'),
(2, 'ClauseName2'),
(3, 'ClauseName3');
INSERT INTO Process VALUES
(1, 'ProcessName1'),
(2, 'ProcessName2'),
(3, 'ProcessName3');
INSERT INTO RELProcessClauses VALUES
(1, 1, 10),
(1, 2, 15),
(1, 3, 20),
(2, 1, 15),
(2, 2, 20),
(2, 3, 20),
(3, 1, 30),
(3, 2, 30),
(3, 3, 30);
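What PIVOT computes here can also be written as one conditional SUM per process name, which is a portable way to check the expected matrix. The sketch below runs that form with Python's built-in sqlite3 module (SQLite has no PIVOT operator); the function name pivot_weighted_values is an assumption for this illustration.

```python
import sqlite3

SETUP = """
CREATE TABLE Clause (Clause INT, ClauseName TEXT);
CREATE TABLE Process (Id INT, ProcessName TEXT);
CREATE TABLE RELProcessClauses (ProcessId INT, Clause INT, WeightedValue INT);
INSERT INTO Clause VALUES (1, 'ClauseName1'), (2, 'ClauseName2'), (3, 'ClauseName3');
INSERT INTO Process VALUES (1, 'ProcessName1'), (2, 'ProcessName2'), (3, 'ProcessName3');
INSERT INTO RELProcessClauses VALUES
    (1, 1, 10), (1, 2, 15), (1, 3, 20),
    (2, 1, 15), (2, 2, 20), (2, 3, 20),
    (3, 1, 30), (3, 2, 30), (3, 3, 30);
"""

# One SUM(CASE ...) column per process name, grouped by clause.
QUERY = """
SELECT c.Clause, c.ClauseName,
       SUM(CASE WHEN p.ProcessName = 'ProcessName1' THEN pc.WeightedValue END) AS ProcessName1,
       SUM(CASE WHEN p.ProcessName = 'ProcessName2' THEN pc.WeightedValue END) AS ProcessName2,
       SUM(CASE WHEN p.ProcessName = 'ProcessName3' THEN pc.WeightedValue END) AS ProcessName3
FROM RELProcessClauses AS pc
JOIN Clause AS c ON pc.Clause = c.Clause
JOIN Process AS p ON pc.ProcessId = p.Id
GROUP BY c.Clause, c.ClauseName
ORDER BY c.Clause
"""


def pivot_weighted_values():
    """Return one row per clause with one column per process name."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SETUP)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in pivot_weighted_values():
        print(row)
```

Both forms need the process names listed explicitly in the query text; if the names are not known in advance, the column list has to be built dynamically.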