Remove duplication from sql query - sql-server

I'm trying to select the total deliveries for each store where the status is 100. In some cases one repair number has two rows with status 100 (delivery). How can I exclude every duplicated repair from the count entirely? I don't want to keep even one copy: if a repair number is duplicated, it should not be counted at all. This is the query I have so far:
SELECT UL.StoreName, COUNT(DISTINCT JT.REPAIRNO) AS TotalDelivery
FROM DataDetails AS UL LEFT OUTER JOIN
JOBTRACKING AS JT ON UL.storeID = JT.store_code
WHERE (CAST(JT.created_Date AS date)='2017-03-08')
AND JT.JOBSTATUS=100
GROUP BY UL.StoreName
For example, the query currently returns:
Name TotalDelivery
ABC 4
XYZ 4
which comes from this data:
RepairNo Store Status CreatedDate
1000 ABC 100 3/8/2017
1001 ABC 100 3/8/2017
1001 ABC 100 3/8/2017
1008 ABC 100 3/8/2017
1009 ABC 100 3/8/2017
1011 XYZ 100 3/8/2017
1011 XYZ 100 3/8/2017
1013 XYZ 100 3/8/2017
1014 XYZ 100 3/8/2017
1015 XYZ 100 3/8/2017
1015 XYZ 100 3/8/2017
I need the result to be:
Name TotalDelivery
ABC 3
XYZ 2
COUNT(DISTINCT ...) removes the duplication but still counts one row from each duplicated repair number. I want to drop that one as well and count only the repair numbers that have no duplicates. Thanks in advance.

If you want only the non-duplicated repair numbers, filter the duplicated ones out with a correlated subquery that keeps a REPAIRNO only when it occurs exactly once. Try the query below.
Updated
SELECT UL.StoreName, COUNT(1) AS TotalDelivery
FROM DataDetails AS UL
LEFT OUTER JOIN JOBTRACKING AS JT
ON UL.storeID = JT.store_code
WHERE CAST(JT.created_Date AS date)='2017-03-08'
AND JT.JOBSTATUS=100
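AND JT.REPAIRNO IN (SELECT j.REPAIRNO FROM JOBTRACKING j
WHERE j.store_code = UL.storeID AND j.JOBSTATUS = 100
AND CAST(j.created_Date AS date) = '2017-03-08' -- count only the rows being reported
GROUP BY j.REPAIRNO HAVING COUNT(1) = 1)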
GROUP BY UL.StoreName, UL.storeID
Test Script
CREATE TABLE #DataDetails
(
StoreName CHAR(3), storeID int
)
CREATE TABLE #JOBTRACKING
(
store_code int, REPAIRNO INT, JOBSTATUS INT, created_Date DATE
)
INSERT #DataDetails VALUES( 'ABC', 1), ('XYZ', 2)
INSERT #JOBTRACKING VALUES (1, 1000, 100, '2017-03-08'), (1, 1001, 100, '2017-03-08'), (1, 1001, 100, '2017-03-08'), (1, 1008, 100, '2017-03-08'), (1, 1009, 100, '2017-03-08')
,(2, 1011, 100, '2017-03-08'), (2, 1011, 100, '2017-03-08'), (2, 1013, 100, '2017-03-08'), (2, 1014, 100, '2017-03-08'), (2, 1015, 100, '2017-03-08'), (2, 1015, 100, '2017-03-08')
SELECT UL.StoreName, COUNT(1) AS TotalDelivery
FROM #DataDetails AS UL
LEFT OUTER JOIN #JOBTRACKING AS JT
ON UL.storeID = JT.store_code
WHERE CAST(JT.created_Date AS date)='2017-03-08'
AND JT.JOBSTATUS=100
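AND JT.REPAIRNO IN (SELECT j.REPAIRNO FROM #JOBTRACKING j
WHERE j.store_code = UL.storeID AND j.JOBSTATUS = 100
AND CAST(j.created_Date AS date) = '2017-03-08' -- count only the rows being reported
GROUP BY j.REPAIRNO HAVING COUNT(1) = 1)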
GROUP BY UL.StoreName, UL.storeID
Results
+-----------+---------------+
| StoreName | TotalDelivery |
+-----------+---------------+
| ABC | 3 |
| XYZ | 2 |
+-----------+---------------+

I would need more details, but as I understand it, you should first prepare the de-duplicated row set. In an inner SELECT, apply the WHERE criteria and group the data to eliminate the duplicated repair numbers; then, in the outer SELECT, GROUP BY UL.StoreName to get the correct count. Do not use DISTINCT!
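The inner-select idea can be tried outside SQL Server. Below is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in (an assumption made only so the example runs anywhere; the dates are stored as plain text, so no CAST is needed). The derived table keeps a repair number only when it has exactly one status-100 row on that date, and the outer query counts what is left per store.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for portability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DataDetails (StoreName TEXT, storeID INTEGER);
CREATE TABLE JOBTRACKING (store_code INTEGER, REPAIRNO INTEGER,
                          JOBSTATUS INTEGER, created_Date TEXT);
INSERT INTO DataDetails VALUES ('ABC', 1), ('XYZ', 2);
INSERT INTO JOBTRACKING VALUES
  (1, 1000, 100, '2017-03-08'), (1, 1001, 100, '2017-03-08'),
  (1, 1001, 100, '2017-03-08'), (1, 1008, 100, '2017-03-08'),
  (1, 1009, 100, '2017-03-08'),
  (2, 1011, 100, '2017-03-08'), (2, 1011, 100, '2017-03-08'),
  (2, 1013, 100, '2017-03-08'), (2, 1014, 100, '2017-03-08'),
  (2, 1015, 100, '2017-03-08'), (2, 1015, 100, '2017-03-08');
""")

query = """
SELECT UL.StoreName, COUNT(*) AS TotalDelivery
FROM DataDetails AS UL
JOIN (
    -- inner select: one row per repair number that is NOT duplicated
    SELECT store_code, REPAIRNO
    FROM JOBTRACKING
    WHERE JOBSTATUS = 100 AND created_Date = '2017-03-08'
    GROUP BY store_code, REPAIRNO
    HAVING COUNT(*) = 1
) AS JT ON JT.store_code = UL.storeID
GROUP BY UL.StoreName
ORDER BY UL.StoreName
"""
rows = conn.execute(query).fetchall()
for store_name, total in rows:
    print(store_name, total)
```

Because the status and date filters sit inside the derived table, rows with other statuses for the same repair number do not affect the duplicate check.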

Related

Distribute multiple payments to invoice lines

I'm having a problem allocating payments to invoice lines.
Data looks like this:
Invoice lines table (sales):
lineId invoiceId value
1 1 100
2 1 -50
3 1 40
4 2 500
Payments table (payments):
paymentId invoiceId amount
1 1 50
2 1 40
3 2 300
Now, I want to know for each invoice line the payment details. The payments shall be allocated first to the smallest values (i.e. line 2, -50)
The output should look like this:
lineId invoiceId value paymentId valuePaid valueUnpaid
2 1 -50 1 -50 0
3 1 40 1 40 0
1 1 100 1 60 40
1 1 100 2 40 0
4 2 500 3 300 200
The problem is solved in the post below, but the solution does not work if you have negative invoice values or if you have to split an invoice line in two payments.
https://dba.stackexchange.com/questions/58474/how-can-i-use-running-total-aggregates-in-a-query-to-output-financial-accumulati/219925?noredirect=1#comment431486_219925
This is what I've done so far based on the article above:
drop table dbo.#sales
drop table dbo.#payments
CREATE TABLE dbo.#sales
( lineId int primary key, -- unique line id
invoiceId int not null , -- InvoiceId foreign key
itemValue money not null ) -- value of invoice line.
CREATE TABLE dbo.#payments
( paymentId int primary key, -- Unique payment id
InvoiceId int not null, -- InvoiceId foreign key
PayAmount money not null
)
-- Example invoice, id #1, with 3 lines, total amount = 90; id #2, with one line, value 500
INSERT dbo.#sales VALUES
(1, 1, 100),
(2, 1, -50),
(3, 1, 40),
(4, 2, 500) ;
-- Two payments paid towards invoice id#1, 50+40 = 90
-- One payment paid towards invoice id#2, 300
INSERT dbo.#Payments
VALUES (1, 1, 50),
(2, 1, 40),
(3, 2, 300);
-- Expected output should be as follows, for reporting purposes.
/* lineId, invoiceId, value, paymentId, valuePaid, valueUnpaid
2, 1, -50, 1, -50, 0
3, 1, 40, 1, 40, 0
1, 1, 100, 1, 60, 40
1, 1, 100, 2, 40, 0
4, 2, 500, 3, 300, 200 */
WITH inv AS
( SELECT lineId, invoiceId,
itemValue,
SumItemValue = SUM(itemValue) OVER
(PARTITION BY InvoiceId
ORDER BY ItemValue Asc
ROWS BETWEEN UNBOUNDED PRECEDING
AND CURRENT ROW)
FROM dbo.#Sales
)
, pay AS
( SELECT
PaymentId, InvoiceId, PayAmount as PayAmt,
SumPayAmt = SUM(PayAmount) OVER
(PARTITION BY InvoiceId
ORDER BY PaymentId
ROWS BETWEEN UNBOUNDED PRECEDING
AND CURRENT ROW)
FROM dbo.#payments
)
SELECT
inv.lineId,
inv.InvoiceId,
inv.itemValue,
pay.PaymentId,
PaymentAllocated =
CASE WHEN SumPayAmt <= SumItemValue - itemValue
OR SumItemValue <= SumPayAmt - PayAmt
THEN 0
ELSE
CASE WHEN SumPayAmt <= SumItemValue THEN SumPayAmt
ELSE SumItemValue END
- CASE WHEN SumPayAmt-PayAmt <= SumItemValue-itemValue
THEN SumItemValue-itemValue
ELSE SumPayAmt-PayAmt END
END
FROM inv JOIN pay
ON inv.InvoiceId = pay.InvoiceId
ORDER BY
inv.InvoiceId,
pay.PaymentId;
The current output is:
lineId InvoiceId itemValue PaymentId PaymentAllocated
2 1 -50.00 1 0.00
3 1 40.00 1 0.00
1 1 100.00 1 50.00
2 1 -50.00 2 0.00
3 1 40.00 2 0.00
1 1 100.00 2 40.00
4 2 500.00 3 300.00
Any direction will be appreciated. Thank you.
More info on the allocation rules:
Allocating the first payment to the smallest sale (i.e. -50) was just a
convention to ensure that all sales lines get payments. If I allocated
arbitrarily or by another rule, and line 1 (value 100) got the first
payment, I'd use the whole payment for this line and the rest of the
invoice would remain unallocated.
As I said, it's just a convention. If someone else comes up with a
different rule that works, that's OK. Actually, the structure is
simplified compared with the production tables: payments also have a
payment date, type, … and a correct distribution should tell us which
invoice lines were paid at each payment time.
Payments are restricted by the logic of the system to be smaller than
the sum of the invoice lines. There is one case where payments can be
greater: when the invoice total is negative (e.g. -100). In this case
we can insert amounts in the range -100 to 0 into the payments table,
and the total of the payments is restricted to -100.

In the end I found quite a simple and natural solution: allocate each payment across the invoice lines in proportion to that payment's share of the invoice total.
drop table dbo.#sales
drop table dbo.#payments
CREATE TABLE dbo.#sales
( lineId int primary key, -- unique line id
invoiceId int not null , -- InvoiceId foreign key
itemValue money not null ) -- value of invoice line.
CREATE TABLE dbo.#payments
( paymentId int primary key, -- Unique payment id
InvoiceId int not null, -- InvoiceId foreign key
PayAmount money not null
)
-- Example invoice, id #1, with 3 lines, total amount = 90; id #2, with one line, value 500
INSERT dbo.#sales VALUES
(1, 1, 100),
(2, 1, -50),
(3, 1, 40),
(4, 2, 500) ;
-- Two payments paid towards invoice id#1, 50+40 = 90
-- One payment paid towards invoice id#2, 300
INSERT dbo.#Payments
VALUES (1, 1, 50),
(2, 1, 40),
(3, 2, 300);
SELECT
s.lineId,
s.InvoiceId,
s.itemValue,
p.PayAmount,
p.PaymentId,
round(p.PayAmount / ts.SumItemValue,3) as PaymentPercent,
s.ItemValue * round(p.PayAmount / ts.SumItemValue,3) as AllocatedPayment
FROM dbo.#sales s
LEFT JOIN dbo.#payments p
ON s.InvoiceId = p.InvoiceId
LEFT JOIN (SELECT invoiceId, sum(itemValue) as SumItemValue FROM dbo.#sales GROUP BY invoiceId) ts
ON s.invoiceId = ts.invoiceId
ORDER BY
s.InvoiceId,
p.PaymentId;
And the result looks like this:
lineId InvoiceId itemValue PayAmount PaymentId PaymentPercent AllocatedPayment
1 1 100.00 50.00 1 0.556 55.60
2 1 -50.00 50.00 1 0.556 -27.80
3 1 40.00 50.00 1 0.556 22.24
3 1 40.00 40.00 2 0.444 17.76
2 1 -50.00 40.00 2 0.444 -22.20
1 1 100.00 40.00 2 0.444 44.40
4 2 500.00 300.00 3 0.60 300.00
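
The proportional split above does not reproduce the line-by-line output the question originally asked for. For comparison, here is a rough procedural sketch of that sequential rule, written in Python rather than T-SQL. The function name allocate and the tuple layouts are hypothetical, chosen for illustration. It assumes that lines are taken smallest value first, payments in paymentId order, that a negative line is booked against the current payment and enlarges what that payment can still cover, and that payment amounts are positive.

```python
from collections import defaultdict

def allocate(sales, payments):
    """sales: (lineId, invoiceId, value); payments: (paymentId, invoiceId, amount).
    Returns (lineId, invoiceId, value, paymentId, valuePaid, valueUnpaid) rows."""
    payments_by_invoice = defaultdict(list)
    for payment_id, invoice_id, amount in sorted(payments):
        # [paymentId, remaining amount]; the remaining amount is updated in place
        payments_by_invoice[invoice_id].append([payment_id, amount])

    lines_by_invoice = defaultdict(list)
    for line_id, invoice_id, value in sales:
        lines_by_invoice[invoice_id].append((value, line_id))

    rows = []
    for invoice_id in sorted(lines_by_invoice):
        queue = payments_by_invoice[invoice_id]
        current = 0  # index of the payment currently being consumed
        for value, line_id in sorted(lines_by_invoice[invoice_id]):
            if value < 0 and current < len(queue):
                # A credit line is settled in full and frees up that much payment.
                queue[current][1] -= value
                rows.append((line_id, invoice_id, value, queue[current][0], value, 0))
                continue
            unpaid = value
            while unpaid > 0 and current < len(queue):
                payment_id, remaining = queue[current]
                if remaining <= 0:
                    current += 1  # payment exhausted, move on to the next one
                    continue
                paid = min(unpaid, remaining)
                unpaid -= paid
                queue[current][1] -= paid
                rows.append((line_id, invoice_id, value, payment_id, paid, unpaid))
    return rows

sales = [(1, 1, 100), (2, 1, -50), (3, 1, 40), (4, 2, 500)]
payments = [(1, 1, 50), (2, 1, 40), (3, 2, 300)]
for row in allocate(sales, payments):
    print(row)
```

On the sample data this yields the five rows listed as the expected output in the question. Lines that receive no payment at all produce no row in this sketch, and negative payments are not handled.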

Insert multiple rows at once with a calculated column from prior inserts into SQL Server

I'm trying to figure out how to do a multi-row insert as one statement in SQL Server, where one of the columns is computed from the data as it stands after each inserted row.
Let's say I run this simple query and get back 3 records:
SELECT *
FROM event_courses
WHERE event_id = 100
Results:
id | event_id | course_id | course_priority
---+----------+-----------+----------------
10 | 100 | 501 | 1
11 | 100 | 502 | 2
12 | 100 | 503 | 3
Now I want to insert 3 more records into this table, except I need to be able to calculate the priority for each record. The priority should be the count of all courses in this event. But if I run a sub-query, I get the same priority for all new courses:
INSERT INTO event_courses (event_id, course_id, course_priority)
VALUES (100, 500,
(SELECT COUNT (id) + 1 AS cnt_event_courses
FROM event_courses
WHERE event_id = 100)),
(100, 501,
(SELECT COUNT (id) + 1 AS cnt_event_courses
FROM event_courses
WHERE event_id = 100))
Results:
id | event_id | course_id | course_priority
---+----------+-----------+-----------------
10 | 100 | 501 | 1
11 | 100 | 502 | 2
12 | 100 | 503 | 3
13 | 100 | 504 | 4
14 | 100 | 505 | 4
15 | 100 | 506 | 4
Now I know I could easily do this in a loop outside of SQL and just run a bunch of INSERT statements, but that's not very efficient. There has to be a way to calculate the priority on the fly during a multi-row insert.
Big thanks to @Sean Lange for the answer. I was able to simplify it even further for my application. Great lead! Learned 2 new syntax tricks today ;)
DECLARE @eventid int = 100
INSERT event_courses
SELECT @eventid AS event_id,
course_id,
course_priority = existingEventCourses.prioritySeed + ROW_NUMBER() OVER(ORDER BY tempid)
FROM (VALUES
(1, 501),
(2, 502),
(3, 503)
) courseInserts (tempid, course_id) -- This basically creates a temp table in memory at run-time
CROSS APPLY (
SELECT COUNT(id) AS prioritySeed
FROM event_courses
WHERE event_id = @eventid
) existingEventCourses
SELECT *
FROM event_courses
WHERE event_id = @eventid

Here is an example of how you might be able to do this. I have no idea where your new row values are coming from, so I just tossed them into a derived table. I doubt your final solution would look like this, but it demonstrates how you can leverage ROW_NUMBER to accomplish this type of thing.
declare @EventCourse table
(
id int identity
, event_id int
, course_id int
, course_priority int
)
insert @EventCourse values
(100, 501, 1)
,(100, 502, 2)
,(100, 503, 3)
select *
from @EventCourse
insert @EventCourse
(
event_id
, course_id
, course_priority
)
select x.eventID
, x.courseID
, NewPriority = y.MaxPriority + ROW_NUMBER() over(partition by x.eventID order by x.courseID)
from
(
values(100, 504)
,(100, 505)
,(100, 506)
)x(eventID, courseID)
cross apply
(
select max(course_priority) as MaxPriority
from @EventCourse ec
where ec.event_id = x.eventID
) y
select *
from @EventCourse
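
The same seed-plus-ROW_NUMBER idea can be checked quickly without a SQL Server instance. This is a hedged sketch using Python's built-in sqlite3 module as a stand-in (an assumption; it needs SQLite 3.25 or later for window functions). SQLite has no CROSS APPLY, so the seed is read with an uncorrelated scalar subquery instead, and the CTE name new_courses is made up for the example.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for portability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE event_courses (
    id INTEGER PRIMARY KEY,      -- auto-assigned, like an identity column
    event_id INTEGER,
    course_id INTEGER,
    course_priority INTEGER
);
INSERT INTO event_courses (id, event_id, course_id, course_priority) VALUES
    (10, 100, 501, 1), (11, 100, 502, 2), (12, 100, 503, 3);
""")

# One statement inserts all new rows; each gets seed + its row number.
conn.execute("""
WITH new_courses(course_id) AS (VALUES (504), (505), (506))
INSERT INTO event_courses (event_id, course_id, course_priority)
SELECT :event_id,
       course_id,
       (SELECT COALESCE(MAX(course_priority), 0)
        FROM event_courses
        WHERE event_id = :event_id)
       + ROW_NUMBER() OVER (ORDER BY course_id)
FROM new_courses
""", {"event_id": 100})

rows = conn.execute(
    "SELECT id, event_id, course_id, course_priority "
    "FROM event_courses ORDER BY id"
).fetchall()
for row in rows:
    print(row)
```

The seed is read once, before any of the new rows exist, which is why the priorities continue from the current maximum instead of all repeating the same value.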

Filling missing records with previous existing records

I have an existing database where some of the logic is implemented in the front-end application.
Now I have to build reports from that database, and I'm facing a problem with missing records: the front end fills them in record by record, but they cause issues in the report.
Given the following tables:
create table #T (id int, id1 int, label varchar(50))
create table #T1 (id int, T_id1 int, A int, B int, C int)
With the values:
insert into #T values (10, 1, 'label1'), (10, 2, 'label2'), (10, 3, 'label3'), (10, 15, 'label15'), (10, 16, 'label16'), (20, 100, 'label100'), (20, 101, 'label101')
insert into #T1 values (10, 1, 100, 200, 300), (10, 15, 150, 250, 350), (20, 100, 151, 251, 351), (20, 101, 151, 251, 351)
If I run a report, we can see some missing records:
select #T.id, #T.id1, #T1.A, #T1.B, #T1.C
from #T left join #T1 on #T.id1 = #T1.T_id1
result:
id id1 A B C
10 1 100 200 300
10 2 NULL NULL NULL
10 3 NULL NULL NULL
10 15 150 250 350
10 16 NULL NULL NULL
20 100 151 251 351
20 101 151 251 351
Expected result would be:
id id1 A B C
10 1 100 200 300
10 2 100 200 300
10 3 100 200 300
10 15 150 250 350
10 16 150 250 350
20 100 151 251 351
20 101 151 251 351
As you can see, the missing data is filled from the nearest previous existing record (in id, id1 order) for the same id. For a given id there can be any number of "missing" records, and any number of existing records can follow the missing ones.
I can do this with a cursor but I'm looking for a solution without cursor

You can use a CTE (to find groups of rows that share the same values) plus a window function:
WITH Grouped AS (
SELECT #T.id, #T.id1, #T1.A, #T1.B, #T1.C,
-- running count of existing records per id; a missing row inherits the group of the previous existing row
GroupN = SUM(CASE WHEN #T1.A IS NULL THEN 0 ELSE 1 END) OVER(PARTITION BY #T.id ORDER BY #T.id1 ROWS UNBOUNDED PRECEDING)
FROM #T
LEFT JOIN #T1 ON #T.id1 = #T1.T_id1
)
SELECT Grouped.id, Grouped.id1,
A = MAX(A) OVER(PARTITION BY id, GroupN),
B = MAX(B) OVER(PARTITION BY id, GroupN),
C = MAX(C) OVER(PARTITION BY id, GroupN)
FROM Grouped

You can use the SQL below for the desired output:
with cte (id, id1, A, B, C)
as
(
select #T.id, #T.id1, #T1.A, #T1.B, #T1.C
from #T left join #T1 on #T.id1 = #T1.T_id1
)
select cte.id, cte.id1,
coalesce(cte.A,TT.A) A,
coalesce(cte.B,TT.B) B,
coalesce(cte.C,TT.C) C
from cte
left join
(
select p.id1,max(q.id1) id1_max
from cte p
inner join cte q on p.id1 > q.id1 and p.a is null and q.a is not null
group by p.id1
) T on cte.id1 = T.id1
left join cte TT on T.id1_max = TT.id1
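
To see the "running group number" technique from the first answer in action without SQL Server, here is a small sketch using Python's built-in sqlite3 module as a stand-in (an assumption; window functions need SQLite 3.25 or later). It also assumes that a fill must never cross from one id into another, so both window functions are partitioned by id and the join matches on id as well as id1.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for portability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE T  (id INTEGER, id1 INTEGER, label TEXT);
CREATE TABLE T1 (id INTEGER, T_id1 INTEGER, A INTEGER, B INTEGER, C INTEGER);
INSERT INTO T VALUES
  (10, 1, 'label1'), (10, 2, 'label2'), (10, 3, 'label3'),
  (10, 15, 'label15'), (10, 16, 'label16'),
  (20, 100, 'label100'), (20, 101, 'label101');
INSERT INTO T1 VALUES
  (10, 1, 100, 200, 300), (10, 15, 150, 250, 350),
  (20, 100, 151, 251, 351), (20, 101, 151, 251, 351);
""")

query = """
WITH grouped AS (
    SELECT T.id, T.id1, T1.A, T1.B, T1.C,
           -- COUNT ignores NULLs, so the counter only advances on existing rows
           COUNT(T1.T_id1) OVER (
               PARTITION BY T.id ORDER BY T.id1
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS grp
    FROM T
    LEFT JOIN T1 ON T1.id = T.id AND T1.T_id1 = T.id1
)
SELECT id, id1,
       MAX(A) OVER (PARTITION BY id, grp) AS A,
       MAX(B) OVER (PARTITION BY id, grp) AS B,
       MAX(C) OVER (PARTITION BY id, grp) AS C
FROM grouped
ORDER BY id, id1
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each group contains exactly one existing record followed by the missing rows that come after it, so MAX simply picks up that record's values.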

Using SQL unpivot on two groups

I have a table as below:
Name ValueA1 ValueA2 ValueA3 ValueB1 ValueB2 ValueB3 QtyA1 QtyA2 QtyA3 QtyB1 QtyB2 QtyB3
John 1 2 3 4 5 6 100 200 300 150 250 350
Dave 11 12 13 14 15 16 100 200 300 150 250 350
I am able to use unpivot to get the values:
select [Name]
, Replace(u.[Period],'Value','') as [Period]
, u.[Value]
from Table1
unpivot
(
[Value]
for [Period] in ([ValueA1], [ValueA2], [ValueA3], [ValueB1], [ValueB2], [ValueB3])
) u;
However I'm trying to get both the Value and Qty columns on a single row, what I want to end up with is:
Name Number Value Qty
John A1 1 100
John A2 2 200
John A3 3 300
John B1 4 150
John B2 5 250
John B3 6 350
Dave A1 11 100
Dave A2 12 200
Dave A3 13 300
Dave B1 14 150
Dave B2 15 250
Dave B3 16 350
What I have so far is (which doesn't work at all):
select [Name]
, Replace(u.[Period],'Value','') as [Period]
, u.[Value]
, u2.[Value]
from Table1
unpivot
(
[Value]
for [Period] in ([ValueA1], [ValueA2], [ValueA3], [ValueB1], [ValueB2], [ValueB3])
) u
unpivot
(
[Qty]
for [Period] in ([QtyA1], [QtyA2], [QtyA3], [QtyB1], [QtyB2], [QtyB3])
) u2;
Is what I am trying to do even possible with unpivot?

You can use a simple apply for this by specifying the pairs in a values clause:
declare @table table (Name varchar(10), ValueA1 int, ValueA2 int, QtyA1 int, QtyA2 int);
insert into @table
select 'John', 1, 2, 100, 200 union all
select 'Dave', 11, 12, 100, 200;
select Name, Number, Value, Qty
from @table
cross
apply ( values
('A1', ValueA1, QtyA1),
('A2', ValueA2, QtyA2)
) c (number, value, qty);
If you're using an older version of SQL Server, you might need to use this
instead of the values clause above:
cross
apply ( select 'A1', ValueA1, QtyA1 union all
select 'A2', ValueA2, QtyA2
) c (number, value, qty);
Returns:
Name Number Value Qty
John A1 1 100
John A2 2 200
Dave A1 11 100
Dave A2 12 200

Left join get all data in both table

I have two tables, Q and A.
The records in A are:
QID UserID Value
1 100 A
2 100 B
3 100 C
1 101 AA
2 101 BB
3 101 CC
1 102 AAA
2 102 BBB
As you can see, there is no record for user 102 for QID 3. There is this another table Q.
QID Value
1 Name
2 Email
3 Site
What I want is, for each user, whether or not they have answered a question (that is, whether or not an entry exists in the A table), all questions together with their answers. Something like this:
QID QValue UserID Value
1 Name 100 A
2 Email 100 B
3 Site 100 C
1 Name 101 AA
2 Email 101 BB
3 Site 101 CC
1 Name 102 AAA
2 Email 102 BBB
The problem is that one row is missing from the desired output, namely
3 Site 102 NULL
because there is no entry in the A table for user 102 and question 3. I tried LEFT JOIN, but it doesn't give the desired result, since every row of the left table already has a match. INNER JOIN doesn't work either.
It is also completely possible for the answers table (table A) to have data like this:
QID QValue UserID Value
1 Name 100 A
2 Email 101 BB
3 Site 102 CCC
Say each user has filled in just one record; in this case the desired output is something like this:
QID QValue UserID Value
1 Name 100 A
2 Email 100 NULL
3 Site 100 NULL
1 Name 101 NULL
2 Email 101 BB
3 Site 101 NULL
1 Name 102 NULL
2 Email 102 NULL
3 Site 102 CCC
If I do a LEFT JOIN on QID, it doesn't work. Please suggest what should be done.

Try this:
declare @A table(QID int, UserID int, Value varchar(10))
declare @Q table(QID int, Value varchar(10))
insert into @A values (1, 100, 'A')
insert into @A values (2, 100, 'B')
insert into @A values (3, 100, 'C')
insert into @A values (1, 101, 'AA')
insert into @A values (2, 101, 'BB')
insert into @A values (3, 101, 'CC')
insert into @A values (1, 102, 'AAA')
insert into @A values (2, 102, 'BBB')
insert into @Q values (1, 'Name')
insert into @Q values (2, 'Email')
insert into @Q values (3, 'Site')
select
U.UserID,
Q.QID,
Q.Value as QValue,
A.Value
from
(select distinct UserID from @A) U -- all Users
cross join @Q Q -- all Questions
left outer join @A A on A.UserID = U.UserID and A.QID = Q.QID
So basically you do a cross join between all questions and all users first to get all combinations. Then you take this result and do a left join with all the answers. Missing answers will have NULL values in the Value (the real answer) field.
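
The cross-join-then-left-join pattern is plain SQL, so it can be verified with Python's built-in sqlite3 module standing in for SQL Server (an assumption made only to keep the example runnable anywhere). The sketch loads the question's first data set and checks that the missing row for user 102 and question 3 comes back with a NULL answer.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for portability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (QID INTEGER, UserID INTEGER, Value TEXT);
CREATE TABLE Q (QID INTEGER, Value TEXT);
INSERT INTO A VALUES
  (1, 100, 'A'),   (2, 100, 'B'),   (3, 100, 'C'),
  (1, 101, 'AA'),  (2, 101, 'BB'),  (3, 101, 'CC'),
  (1, 102, 'AAA'), (2, 102, 'BBB');
INSERT INTO Q VALUES (1, 'Name'), (2, 'Email'), (3, 'Site');
""")

query = """
SELECT U.UserID, Q.QID, Q.Value AS QValue, A.Value
FROM (SELECT DISTINCT UserID FROM A) AS U   -- all users
CROSS JOIN Q                                -- every user paired with every question
LEFT JOIN A ON A.UserID = U.UserID AND A.QID = Q.QID
ORDER BY U.UserID, Q.QID
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Three users times three questions gives nine rows; the unanswered combination appears with None (NULL) in the last column.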
