I have the following data:
+----------------+--------------+-----+
| StgDescription | ID | Amt |
+----------------+--------------+-----+
| A | OA17 | 11 |
| A | OA17 | 11 |
| A | OA17 | 11 |
| A | OA17 | 11 |
| B | ZA47/ A | 12 |
| B | ZA47/ A | 12 |
| B | ZA47/ B | 10 |
| B | ZA47/ B | 10 |
| B | ZA48/ A | 14 |
| B | ZA48/ F | 10 |
| B | ZA48 /G | 13 |
| B | ZA48 /H | 10 |
| B | ZA48/ I | 15 |
| B | ZA48/ J | 10 |
| B | ZA48/ K | 16 |
| B | ZA48/ L | 10 |
| c | FA01LM100340 | 10 |
| c | PA53 AE | 10 |
+----------------+--------------+-----+
I want to generate a report in the following format. The amount should be summed per StgDescription, counting each distinct ID only once.
+----------------+-----+
| StgDescription | Amt |
+----------------+-----+
| a | 11 |
| b | 120 |
| c | 20 |
+----------------+-----+
I've written the following query to get this result:
WITH CTE AS(
SELECT
distinct
s.StgDescription
,p.ID
,Amt
FROM [DinDb].[dbo].[tblTvlTransaction] t
JOIN tblstgmaster s on t.StgId=s.StgId
JOIN tblProjDocSt p on t.TDocID=p.DocId
JOIN [PdasDb].[dbo].[tblIDmaster] f ON p.ID=f.ID
where OptAuthoDateTime between '2015-07-27 00:00:00' and '2015-09-01 00:00:00')
select StgDescription,sum(AMT) from cte group by StgDescription
Is there a more efficient alternative?
First remove the duplicates in a CTE, then GROUP BY:
WITH cte AS (
SELECT DISTINCT StgDescription, ID, Amt
FROM your_tab
)
SELECT
StgDescription,
Amt = SUM(Amt)
FROM cte
GROUP BY StgDescription;
OR:
WITH cte AS (
SELECT StgDescription, ID, Amt
FROM your_tab
GROUP BY StgDescription, ID, Amt
)
SELECT
StgDescription,
Amt = SUM(Amt)
FROM cte
GROUP BY StgDescription;
I hope that you get the data from a query, not from a table. It would not be good to store data this redundantly. And it would not be good to name a column ID when it is not the unique identifier for a row in the table.
Your problem with the data is that you have duplicates, which prevents you from getting the sum directly. So use DISTINCT to make your data unique first.
If this data is from a query then simply add DISTINCT after the SELECT keyword. If not, use a derived table (i.e. a subquery) where you select distinct records from the table.
select stgdescription, sum(amt)
from
(
select distinct stgdescription, id, amt
from mydata
) distinct_data
group by stgdescription;
You may want to replace stgdescription with lower(stgdescription), though, if stgdescription can be 'A' or 'a' and you want to treat them the same.
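As a quick check of the derived-table approach, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database through Python's sqlite3 module. SQLite and the table name mydata are stand-ins for the real SQL Server tables, so treat this as an illustration of the logic rather than a performance test.

```python
import sqlite3

# Sample rows from the question: (StgDescription, ID, Amt).
rows = [
    ("A", "OA17", 11), ("A", "OA17", 11), ("A", "OA17", 11), ("A", "OA17", 11),
    ("B", "ZA47/ A", 12), ("B", "ZA47/ A", 12),
    ("B", "ZA47/ B", 10), ("B", "ZA47/ B", 10),
    ("B", "ZA48/ A", 14), ("B", "ZA48/ F", 10), ("B", "ZA48 /G", 13),
    ("B", "ZA48 /H", 10), ("B", "ZA48/ I", 15), ("B", "ZA48/ J", 10),
    ("B", "ZA48/ K", 16), ("B", "ZA48/ L", 10),
    ("c", "FA01LM100340", 10), ("c", "PA53 AE", 10),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mydata (stgdescription TEXT, id TEXT, amt INTEGER)")
conn.executemany("INSERT INTO mydata VALUES (?, ?, ?)", rows)

# Deduplicate first, then sum per lower-cased description.
totals = conn.execute(
    """
    SELECT stgdescription, SUM(amt)
    FROM (
        SELECT DISTINCT LOWER(stgdescription) AS stgdescription, id, amt
        FROM mydata
    ) AS distinct_data
    GROUP BY stgdescription
    ORDER BY stgdescription
    """
).fetchall()

print(totals)
```

With the sample data this yields the totals 11, 120 and 20 for a, b and c, matching the report in the question.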
I'd keep it as simple as possible, like this:
select StgDescription, sum(Amt) from
(
select distinct StgDescription, ID, Amt from tablename
) a
group by StgDescription
Hope it helps!
I suspect your duplicates are coming from [tblTvlTransaction], therefore, I would remove this table as a JOIN and use EXISTS to just check a record is there. So essentially the only tables in the FROM clause are those you actually need data from:
SELECT s.StgDescription, p.ID, s.Amt
FROM tblstgmaster AS s
INNER JOIN tblProjDocSt p on
t.TDocID = p.DocId
INNER JOIN [PdasDb].[dbo].[tblIDmaster] AS f
ON p.ID = f.ID
WHERE EXISTS
( SELECT 1
FROM [DinDb].[dbo].[tblTvlTransaction] AS t
WHERE t.OptAuthoDateTime BETWEEN '2015-07-27 00:00:00' AND '2015-09-01 00:00:00'
AND t.StgId = s.StgId
);
The advantage of EXISTS is that it can use a semi-join, which essentially means rather than pulling back all the rows from the transaction table, it will stop the seek/scan as soon as it finds one matching record. This should leave you without duplicates so you can do the SUM directly:
SELECT s.StgDescription, Amount = SUM(s.Amt)
FROM tblstgmaster AS s
INNER JOIN tblProjDocSt p on
t.TDocID = p.DocId
INNER JOIN [PdasDb].[dbo].[tblIDmaster] AS f
ON p.ID = f.ID
WHERE EXISTS
( SELECT 1
FROM [DinDb].[dbo].[tblTvlTransaction] AS t
WHERE t.OptAuthoDateTime BETWEEN '2015-07-27 00:00:00' AND '2015-09-01 00:00:00'
AND t.StgId = s.StgId
)
GROUP BY s.StgDescription;
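The difference between joining to the transaction table and only testing for a matching row can be seen in a small sketch. The schema below (stage, txn, AuthAt) is hypothetical and much simpler than the real tables, and it runs on SQLite through Python's sqlite3 module rather than on SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one amount per stage, many transactions per stage.
conn.executescript(
    """
    CREATE TABLE stage (StgId INTEGER, StgDescription TEXT, Amt INTEGER);
    CREATE TABLE txn (StgId INTEGER, AuthAt TEXT);
    INSERT INTO stage VALUES (1, 'A', 11), (2, 'B', 12), (3, 'C', 5);
    INSERT INTO txn VALUES
        (1, '2015-08-01'), (1, '2015-08-02'), (1, '2015-08-03'),
        (2, '2015-08-10'), (2, '2015-08-11'),
        (3, '2015-10-01');
    """
)
date_range = ("2015-07-27", "2015-09-01")

# JOIN repeats each stage row once per matching transaction, inflating the sum.
joined = conn.execute(
    """
    SELECT s.StgDescription, SUM(s.Amt)
    FROM stage AS s
    JOIN txn AS t ON t.StgId = s.StgId
    WHERE t.AuthAt BETWEEN ? AND ?
    GROUP BY s.StgDescription
    ORDER BY s.StgDescription
    """,
    date_range,
).fetchall()

# EXISTS only asks whether at least one transaction matches, so no duplication.
semi = conn.execute(
    """
    SELECT s.StgDescription, SUM(s.Amt)
    FROM stage AS s
    WHERE EXISTS (
        SELECT 1 FROM txn AS t
        WHERE t.StgId = s.StgId AND t.AuthAt BETWEEN ? AND ?
    )
    GROUP BY s.StgDescription
    ORDER BY s.StgDescription
    """,
    date_range,
).fetchall()

print(joined, semi)
```

Stage C has no transaction in the date range, so both queries leave it out; only the JOIN version multiplies the amounts of A and B.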
I'm having trouble figuring out how I can check if records on a table are in a specific order. The simplified table design is essentially this:
+------------+----------------+--------+
| ID (GUID) | StartDate | NumCol |
+------------+----------------+--------+
| CEE8C17... | 8/17/2019 3:11 | 22 |
| BB22001... | 8/17/2019 3:33 | 21 |
| 4D40B12... | 8/17/2019 3:47 | 21 |
| 3655125... | 8/17/2019 4:06 | 20 |
| 3456CD1... | 8/17/2019 4:22 | 20 |
| 38BAF92... | 8/17/2019 4:40 | 19 |
| E60CBE8... | 8/17/2019 5:09 | 19 |
| 5F2756B... | 8/17/2019 5:24 | 18 |
+------------+----------------+--------+
The ID column is a non-sequential GUID. The table is sorted by default on the StartDate when data is entered. However I am trying to flag instances where the NumCol values are out of descending order. The NumCol values can be identical on adjacent records, but ultimately they must be descending.
+--------+
| NumCol |
+--------+
| 22 |
| *20 | <- I want the ID where this occurs
| 21 |
| 20 |
| 20 |
| 19 |
| 19 |
| 18 |
+--------+
I've tried a LEFT JOIN of this table to itself, but can't seem to come up with an ON clause that gives the right results:
ON a.ID <> b.ID AND a.NumCol > b.NumCol
I also thought I could use OFFSET n ROWS to compare the default sorted table against one with an ORDER BY NumCol performed on it. I can't come up with anything that works.
I need a solution that will work for both SQL Server and SQL Compact.
With EXISTS:
select t.* from tablename t
where exists (
select 1 from tablename
where numcol > t.numcol and startdate > t.startdate
)
Or with row_number() window function:
select t.id, t.startdate, t.numcol
from (
select *,
row_number() over (order by startdate desc) rn1,
row_number() over (order by numcol) rn2
from tablename
) t
where rn1 > rn2
This might be easiest:
select * from T t1
where NumCol < (select max(NumCol) from T t2 where t2.StartDate > t1.StartDate);
The exists version is probably better to optimize though.
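Both the EXISTS form and the MAX subquery can be tried against the out-of-order example from the question. This is a minimal sketch on SQLite through Python's sqlite3 module; the table name readings, the short IDs r1 to r8 and the ISO-formatted timestamps are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (ID TEXT, StartDate TEXT, NumCol INTEGER)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [
        ("r1", "2019-08-17 03:11", 22),
        ("r2", "2019-08-17 03:33", 20),  # out of order: a later row has 21
        ("r3", "2019-08-17 03:47", 21),
        ("r4", "2019-08-17 04:06", 20),
        ("r5", "2019-08-17 04:22", 20),
        ("r6", "2019-08-17 04:40", 19),
        ("r7", "2019-08-17 05:09", 19),
        ("r8", "2019-08-17 05:24", 18),
    ],
)

# Flag rows for which some later row has a larger NumCol.
exists_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT t.ID FROM readings AS t
        WHERE EXISTS (
            SELECT 1 FROM readings AS later
            WHERE later.NumCol > t.NumCol AND later.StartDate > t.StartDate
        )
        ORDER BY t.StartDate
        """
    )
]

# Same idea, comparing against the largest NumCol among later rows.
max_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT t1.ID FROM readings AS t1
        WHERE t1.NumCol < (
            SELECT MAX(t2.NumCol) FROM readings AS t2
            WHERE t2.StartDate > t1.StartDate
        )
        ORDER BY t1.StartDate
        """
    )
]

print(exists_ids, max_ids)
```

Both queries flag only r2, the row holding the early 20. Equal values on adjacent rows are not flagged because the comparisons are strict.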
Using analytic functions you could try this approach which finds breaks in the monotonicity of consecutive rows. It might not return all the rows you're interested in seeing:
with data as (
select *, lag(NumCol) over (order by StartDate desc) as prevNumCol
from T
)
select * from data where prevNumCol > NumCol;
Here's a better solution that's probably not available in both of your environments:
with data as (
select *,
max(NumCol) over (
order by StartDate desc
rows between unbounded preceding and current row
) as prevMax
from T
)
select * from data where prevMax > NumCol;
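To see why the LAG version can miss rows that the running MAX version catches, the two checks can be mirrored in plain Python. This is a sketch of the logic only, not SQL, and the sample values are made up; the list is in StartDate order, earliest first.

```python
def adjacent_breaks(values):
    """LAG-style check: index i is flagged if the very next (later) value is larger."""
    return [i for i in range(len(values) - 1) if values[i + 1] > values[i]]


def running_max_breaks(values):
    """Running-MAX check: index i is flagged if any later value is larger."""
    flagged = []
    later_max = None
    # Walk from the latest row back to the earliest, like ORDER BY StartDate DESC.
    for i in range(len(values) - 1, -1, -1):
        if later_max is not None and later_max > values[i]:
            flagged.append(i)
        later_max = values[i] if later_max is None else max(later_max, values[i])
    return sorted(flagged)


num_col = [22, 19, 19, 21, 18]
lag_result = adjacent_breaks(num_col)
max_result = running_max_breaks(num_col)
print(lag_result, max_result)
```

Here the adjacent check flags only index 2 (the second 19, directly followed by 21), while the running maximum also flags index 1, because that row is followed by a 21 further down.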
Let's say I have a table with many columns like col1, col2, col3, id, variantId, col4, col5 etc
However I am only interested in id, variantId which look like this:
+----------+-----------+
| id | variantId |
+----------+-----------+
| a | 11 |
| a | 12 |
| b | 31 |
| c | 41 |
| c | 54 |
| d | abc |
| e | xyz |
| e | xyz |
+----------+-----------+
I need the distinct ids that have more than one distinct variantId.
In this case I would only get a and c
You can use group by and having:
select id
from t
group by id
having min(variantId) <> max(variantId);
You can also use:
having count(distinct variantId) > 1
Try GROUP BY with a HAVING clause:
select id
from t
group by id
having count(distinct variantId) > 1
You can do it more efficiently with EXISTS:
select distinct t.id
from tablename t
where exists (
select 1 from tablename
where id = t.id and variantid <> t.variantid
)
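The three variants above can be compared on the sample data from the question. A minimal sketch using SQLite through Python's sqlite3 module; the table name variants is an assumption, and variantId is stored as text because the sample mixes numbers and strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE variants (id TEXT, variantId TEXT)")
conn.executemany(
    "INSERT INTO variants VALUES (?, ?)",
    [("a", "11"), ("a", "12"), ("b", "31"), ("c", "41"), ("c", "54"),
     ("d", "abc"), ("e", "xyz"), ("e", "xyz")],
)


def ids(sql):
    """Run a query and return its first column as a list."""
    return [row[0] for row in conn.execute(sql)]


# MIN and MAX differ only when a group holds at least two distinct values.
by_min_max = ids(
    "SELECT id FROM variants GROUP BY id "
    "HAVING MIN(variantId) <> MAX(variantId) ORDER BY id"
)

by_count = ids(
    "SELECT id FROM variants GROUP BY id "
    "HAVING COUNT(DISTINCT variantId) > 1 ORDER BY id"
)

by_exists = ids(
    "SELECT DISTINCT t.id FROM variants AS t WHERE EXISTS ("
    "SELECT 1 FROM variants AS v "
    "WHERE v.id = t.id AND v.variantId <> t.variantId) ORDER BY t.id"
)

print(by_min_max, by_count, by_exists)
```

All three return a and c. The id e is excluded because its two rows carry the same variantId.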
I have a table dbo.X with a DateTime column lastUpdated and a product code column CodeProd. It may have hundreds of records, with CodeProd duplicated, because the table is used as a "stock history".
My stored procedure has a parameter @Date. For each CodeProd I want the row nearest to that date, so for example if I have:
+----------+--------------+--------+
| CODEPROD | lastUpdated | STATUS |
+----------+--------------+--------+
| 10 | 2-1-2019 | C1 |
| 10 | 1-1-2019 | C2 |
| 10 | 31-12-2019 | C1 |
| 11 | 31-12-2018 | C1 |
| 11 | 30-12-2018 | C1 |
| 12 | 30-8-2018 | C3 |
+----------+--------------+--------+
and @Date = '1-1-2019'
I want to get:
+----+--------------+------+
| 10 | 1-1-2019 | C2 |
| 11 | 31-12-2018 | C1 |
| 12 | 30-8-2018 | C3 |
+----+--------------+------+
How to find it?
You can use TOP(1) WITH TIES to get, for each CODEPROD, the one row with the latest date that is on or before the provided date.
Try the following code:
SELECT TOP(1) WITH TIES *
FROM [YourTableName]
WHERE lastupdated <= @date
ORDER BY Row_number()
OVER (
partition BY [CODEPROD]
ORDER BY lastupdated DESC);
You can use APPLY:
select distinct t.CODEPROD, t1.lastUpdated, t1.STATUS
from dbo.X t cross apply
( select top (1) t1.*
from dbo.X t1
where t1.CODEPROD = t.CODEPROD and t1.lastUpdated <= @date
order by t1.lastUpdated desc
) t1;
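The "latest row on or before the date" rule can be checked on the sample data with a small sketch. SQLite has neither TOP ... WITH TIES nor CROSS APPLY, so this version swaps in a correlated MAX subquery; it runs through Python's sqlite3 module. The table name stock and the ISO date strings are assumptions for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock (CodeProd INTEGER, lastUpdated TEXT, Status TEXT)")
conn.executemany(
    "INSERT INTO stock VALUES (?, ?, ?)",
    [
        (10, "2019-01-02", "C1"),
        (10, "2019-01-01", "C2"),
        (10, "2019-12-31", "C1"),
        (11, "2018-12-31", "C1"),
        (11, "2018-12-30", "C1"),
        (12, "2018-08-30", "C3"),
    ],
)

as_of = "2019-01-01"  # plays the role of the @Date parameter

# For each product, keep the row whose date is the latest one not after as_of.
nearest = conn.execute(
    """
    SELECT s.CodeProd, s.lastUpdated, s.Status
    FROM stock AS s
    WHERE s.lastUpdated = (
        SELECT MAX(x.lastUpdated)
        FROM stock AS x
        WHERE x.CodeProd = s.CodeProd AND x.lastUpdated <= ?
    )
    ORDER BY s.CodeProd
    """,
    (as_of,),
).fetchall()

print(nearest)
```

If two rows for the same product share that latest date, this form returns both of them, much like WITH TIES would.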
Let's say we want to see all tasks that haven't been done yet, plus an additional column showing how many open tasks are left for that customer.
I have a table like this in my database:
+------------+--------------------------+-------+
| CustomerID | Task | Done |
+------------+--------------------------+-------+
| 1 | CleanRoom | False |
| 1 | Cleandishes | True |
| 1 | WashClothes | False |
| 2 | TakeDogsOut | True |
| 2 | PlayWithKids | True |
| 3 | HaveFunWithMrSamplesWife | True |
| 3 | CleanMrSamplesCar | False |
+------------+--------------------------+-------+
I need this as returned table:
+------------+-------------------+-------------+
| CustomerID | Task | DoneOverAll |
+------------+-------------------+-------------+
| 1 | CleanRoom | 2 |
| 1 | WashClothes | 2 |
| 3 | CleanMrSamplesCar | 1 |
+------------+-------------------+-------------+
The ideal result would look like the table below, but I can build that myself once I have the one above.
One question about it: this will probably involve string concatenation. Should I do that in the SELECT statement, or would it be more advisable to do it in the final application on the client computer?
+------------+-------------------+-------------+
| CustomerID | Task | DoneOverAll |
+------------+-------------------+-------------+
| 1 | CleanRoom | 1/3 |
| 1 | WashClothes | 1/3 |
| 3 | CleanMrSamplesCar | 1/2 |
+------------+-------------------+-------------+
I know I could do it like this:
SELECT
a.CustomerID,
a.Task,
(
Select count(*) from myTable where
customerID = a.CustomerID and
done = False
) as DoneOverAll
FROM myTable as a
WHERE Done = False
But I think that this is very inefficient, since it would execute a SELECT COUNT for each row in my table. Is there a way to achieve this with a JOIN using GROUP BY or something? I'm not familiar with GROUP BY yet.
Okay, I should have tried first. I came up with the following:
Select count(*), CustomerID from myTable where Done = False group by CustomerID
All I need to do now is to get this into a join.
Okay, got it. Sorry again for not trying first!
SELECT
a.CustomerID,
a.Task,
b.cnt
FROM myTable as a
LEFT JOIN (select count(*) AS cnt, CustomerID FROM myTable WHERE Done = False GROUP BY CustomerID) as b on a.CustomerID = b.CustomerID
WHERE a.Done = False
One question remains: whether the combined "1/3" column shown above should be built in the SELECT statement or in the client application.
I'm not sure why Done = False, but this is your logic. :-)
Here's what I would do, without the LEFT JOIN.
SELECT
a.CustomerID,
a.Task,
SUM(CASE WHEN a.Done = 'False' THEN 1 ELSE 0 END) DoneOverAll,
SUM(Case WHEN a.Done = 'True' THEN 1 ELSE 0 END) NotDone
FROM myTable as a
Group By a.CustomerID, a.Task
Calculate the two counts separately, then combine them.
;with tempfalse as (
SELECT
a.CustomerID,
a.Task,
count(*) as DoneOverAll
FROM myTable as a
WHERE Done = False
group by a.CustomerID, a.Task
)
, temptrue as (
SELECT
a.CustomerID,
a.Task,
count(*) as total
FROM myTable as a
group by a.CustomerID, a.Task
)
SELECT
a.CustomerID,
a.Task,
cast(NULLIF(b.DoneOverAll,0) as varchar (10) ) + '/' + cast(NULLIF(a.total,0) as varchar (10) )
from temptrue as a left join tempfalse b
on a.CustomerID = b.CustomerID and
a.Task = b.Task
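For the per-customer counts and the "done/total" string, one possible shape is to join each open task to a single grouped subquery that computes both numbers per customer. This is a minimal sketch on SQLite through Python's sqlite3 module, not SQL Server syntax: Done is stored as 0/1, the table name tasks and the aliases open_cnt, total and progress are assumptions, and the string is built with SQLite's || operator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (CustomerID INTEGER, Task TEXT, Done INTEGER)")
conn.executemany(
    "INSERT INTO tasks VALUES (?, ?, ?)",
    [
        (1, "CleanRoom", 0), (1, "Cleandishes", 1), (1, "WashClothes", 0),
        (2, "TakeDogsOut", 1), (2, "PlayWithKids", 1),
        (3, "HaveFunWithMrSamplesWife", 1), (3, "CleanMrSamplesCar", 0),
    ],
)

# One grouped pass per customer yields both the open count and the total;
# each open task is then joined to its customer's numbers.
open_tasks = conn.execute(
    """
    SELECT a.CustomerID,
           a.Task,
           b.open_cnt,
           (b.total - b.open_cnt) || '/' || b.total AS progress
    FROM tasks AS a
    JOIN (
        SELECT CustomerID,
               SUM(CASE WHEN Done = 0 THEN 1 ELSE 0 END) AS open_cnt,
               COUNT(*) AS total
        FROM tasks
        GROUP BY CustomerID
    ) AS b ON b.CustomerID = a.CustomerID
    WHERE a.Done = 0
    ORDER BY a.CustomerID, a.Task
    """
).fetchall()

print(open_tasks)
```

Returning open_cnt and total as separate numeric columns and formatting "1/3" in the client would work just as well; building the string in SQL mainly saves a step when the result is displayed as is.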
Hello I have a temp table (#tempResult) that contains results like the following...
-----------------------------------------
| DrugAliasID | Dosage1 | Unit1 | rowID |
-----------------------------------------
| 322 | 10 | MG | 1 |
| 322 | 50 | ML | 2 |
| 441 | 20 | ML | 3 |
| 443 | 15 | ML | 4 |
-----------------------------------------
I'm looking to get the results to be like the following, pivoting the rows that have the same DrugAliasID.
--------------------------------------------------
| DrugAliasID | Dosage1 | Unit1 | Dosage2 | Unit2 |
--------------------------------------------------
| 322 | 10 | MG | 50 | ML |
| 441 | 20 | ML | NULL | NULL |
| 443 | 15 | ML | NULL | NULL |
--------------------------------------------------
So far I have a solution that isn't using pivot. I'm not too good with pivot and was wondering if anyone knew how to use it in this scenario. Or solve it some other way. Thanks
SELECT
tr.drugAliasID,
MIN(trmin.dosage1) AS dosage1,
MIN(trmin.unit1) AS unit1,
MIN(trmax.dosage1) AS dosage2,
MIN(trmax.unit1) AS unit2
FROM
#tempResult tr
JOIN
#tempResult trmin ON trmin.RowID = tr.rowid AND trmin.drugAliasID = tr.drugAliasID
JOIN
#tempResult trmax ON trmax.RowID = tr.rowid AND trmax.drugAliasID = tr.drugAliasID
JOIN
(SELECT
MIN(RowID) AS rowid,
drugAliasID
FROM
#tempResult
GROUP BY
drugAliasID) tr1 ON tr1.rowid = trmin.RowID
JOIN
(SELECT
MAX(RowID) AS rowid,
drugAliasID
FROM
#tempResult
GROUP BY
drugAliasID) tr2 ON tr2.rowid = tr.RowID
GROUP BY
tr.drugAliasID
HAVING
count(tr.drugAliasID) > 1
Assuming your version of SQL Server supports the use of CTEs, you can simplify your query thus:
;with cte as
(select *, row_number() over (partition by drugaliasid order by rowid) rn
from #tempResult
)
select c.drugaliasid, c.dosage1, c.unit1, c2.dosage1 as dosage2, c2.unit1 as unit2
from cte c
left join cte c2 on c.drugaliasid = c2.drugaliasid and c.rn = 1 and c2.rn = 2
where c.rn = 1
This will give you the desired result, without having to use the pivot keyword.
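The ROW_NUMBER approach can be tried on the sample rows with a short sketch. It runs on SQLite through Python's sqlite3 module, which needs SQLite 3.25 or later for window functions. The table name tempResult stands in for #tempResult, and the rowID column is renamed SeqID here to avoid any confusion with SQLite's built-in rowid; both are assumptions for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tempResult "
    "(DrugAliasID INTEGER, Dosage1 INTEGER, Unit1 TEXT, SeqID INTEGER)"
)
conn.executemany(
    "INSERT INTO tempResult VALUES (?, ?, ?, ?)",
    [(322, 10, "MG", 1), (322, 50, "ML", 2), (441, 20, "ML", 3), (443, 15, "ML", 4)],
)

# Number the rows within each drug, then put row 2 next to row 1.
pivoted = conn.execute(
    """
    WITH cte AS (
        SELECT *,
               ROW_NUMBER() OVER (PARTITION BY DrugAliasID ORDER BY SeqID) AS rn
        FROM tempResult
    )
    SELECT c.DrugAliasID, c.Dosage1, c.Unit1,
           c2.Dosage1 AS Dosage2, c2.Unit1 AS Unit2
    FROM cte AS c
    LEFT JOIN cte AS c2
           ON c2.DrugAliasID = c.DrugAliasID AND c2.rn = 2
    WHERE c.rn = 1
    ORDER BY c.DrugAliasID
    """
).fetchall()

print(pivoted)
```

Drugs with a single row come back with None (NULL) in the Dosage2 and Unit2 positions. A drug with three or more rows would need further joins or a different pivot, since this sketch only looks at the first two.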