How to define a field & group by within Select Query in SQL - sql-server

At present I have a column with status types, and a separate column with dates. What I would like to do is create a new column for each status type. I have created the column names using CASE WHEN expressions, but cannot then group by those.
At present, the table kicks out the following:
Reference | Status | Date
----- | ----------- | -----
1 | Approve | 1/1/2017
1 | In Progress | 1/2/2017
2 | Approve | 1/1/2017
2 | In Progress | 1/2/2017
2 | Close | 1/3/2017
Would like to take this and make:
Reference | Approve | In Progress | Close
--------- | -------- | ----------- | -----
1 | 1/1/2017 | 1/2/2017 |
2 | 1/1/2017 | 1/2/2017 | 1/3/2017
I have a lot of other selects, and the intention is to export to Excel and run automatically, so I am trying to avoid temp tables.
I don't know whether CASE WHEN is appropriate, but am struggling to find a better solution.

You can use PIVOT in SQL Server:
select * from #yourStatus
pivot( max(date) for [Status] in ([Approve], [In Progress],[Close])) p
Your table:
create table #yourStatus (Reference int, Status varchar(20), Date date)
insert into #yourStatus (Reference, Status, Date) values
 (1, 'Approve', '1/1/2017')
,(1, 'In Progress', '1/2/2017')
,(2, 'Approve', '1/1/2017')
,(2, 'In Progress', '1/2/2017')
,(2, 'Close', '1/3/2017')

Something like this should work based on the table you've provided (I've also handled null values to get you the output you need):
select
    Reference,
    isnull(convert(varchar(10),[Approve],103),'') [Approve],
    isnull(convert(varchar(10),[In Progress],103),'') [In Progress],
    isnull(convert(varchar(10),[Close],103),'') [Close]
from
(
    select
        Reference,
        Status,
        Date
    from YourTableName
) d
pivot (min([Date]) for Status in ([Approve],[In Progress],[Close])) p
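Since the question started from CASE WHEN, the same result can also be sketched with conditional aggregation instead of PIVOT: wrap each CASE expression in an aggregate, so that only Reference has to appear in the GROUP BY. This is a sketch that assumes the table is called YourTableName, as in the query above, and that each Reference has at most one date per status (otherwise MAX keeps the latest one).

```sql
-- One aggregated CASE expression per status. Rows with a different status
-- contribute NULL, which MAX ignores.
SELECT
    Reference,
    MAX(CASE WHEN Status = 'Approve'     THEN [Date] END) AS [Approve],
    MAX(CASE WHEN Status = 'In Progress' THEN [Date] END) AS [In Progress],
    MAX(CASE WHEN Status = 'Close'       THEN [Date] END) AS [Close]
FROM YourTableName
GROUP BY Reference
ORDER BY Reference;
```

A status that never occurs for a Reference comes back as NULL, so the ISNULL/CONVERT wrapping shown above can be applied to each column in the same way if blanks are preferred.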

Related

Group by a value if it exists otherwise group by another value of the same column

I have a table like this
| Id | ExternalId | Type | Date | StatusCode |
-------------------------------------------------------
| 1 | 123 | 25 | 2020-01-01 | A |
| 2 | 123 | 25 | 2020-01-02 | A |
| 5 | 125 | 25 | 2020-01-01 | A |
| 6 | 125 | 25 | 2020-01-02 | B |
| 3 | 124 | 25 | 2020-01-01 | B |
| 4 | 124 | 25 | 2020-01-02 | A |
I need to take just one row for each ExternalId having the Max(Date) and having the StatusCode = B if B exists, otherwise the StatusCode = A
So, the expected result is
| Id | ExternalId | Type | Date | StatusCode |
-------------------------------------------------------
| 2 | 123 | 25 | 2020-01-02 | A | <--I take Max Date and the StatusCode of the same row
| 6 | 125 | 25 | 2020-01-02 | B | <--I take Max Date and the StatusCode of the same row
| 3 | 124 | 25 | 2020-01-02 | B | <--I take Max Date and B, even if the Status code of the Max Date is A
Here the query I have tried to write:
SELECT ExternalId, Type, EntityType, Max(Date) as Date
From MyTable
group by ExternalId, Type, EntityType
But I cannot finish it.
If I understand your requirements, this could be what you want:
SELECT ExternalId, Type, MAX(Date) AS Date, MAX(StatusCode) AS StatusCode
FROM MyTable
GROUP BY ExternalId, Type
Explanation:
You want the Max of StatusCode, because B is greater than A. You want the Max of Date, no matter which StatusCode is shown. And you want both for each ExternalId, so you have to group by ExternalId.
Furthermore, you also need the Type shown, and as it is not inside an aggregate function, the query has to be grouped by Type as well. That's no problem though, because Type is dependent on ExternalId (or at least it is in your example data).
As far as I understand from your sql, you also need to group by Type and EntityType. If it’s correct, you can write max with condition for 'B' and another max for all rows and use those results in isnull or coalesce function like this:
Select
t.ExternalId
,t.Type
,t.EntityType
,isnull(
max(iif(t.StatusCode='B', t.Date, null))
,max(t.Date)
) as Date
From MyTable t
Group by
t.ExternalId
,t.Type
,t.EntityType
You want to filter instead of aggregate. One solution is to use row_number():
select *
from (
select
t.*,
row_number() over(partition by ExternalId order by StatusCode desc, Date desc) rn
from mytable t
) t
where rn = 1
The order by clause of row_number() puts rows with StatusCode = 'B' first, and then orders by descending date.
This works because StatusCode has only two values, and because 'B'> 'A'. If your real data has different values (or more than 2 values), then you would need something more explicit, like:
order by case when StatusCode = 'B' then 0 else 1 end, Date desc
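To also reproduce the third row of the expected result (the Id of the 'B' row, but the latest Date for that ExternalId), the explicit ordering above can be combined with a windowed MAX. This is only a sketch: it assumes the table is MyTable with the columns shown in the question, and MaxDate is a name introduced here for illustration.

```sql
SELECT x.Id, x.ExternalId, x.Type, x.MaxDate AS [Date], x.StatusCode
FROM (
    SELECT
        t.*,
        -- latest date per ExternalId, regardless of status
        MAX(t.[Date]) OVER (PARTITION BY t.ExternalId) AS MaxDate,
        -- prefer a 'B' row if one exists, then the most recent row
        ROW_NUMBER() OVER (
            PARTITION BY t.ExternalId
            ORDER BY CASE WHEN t.StatusCode = 'B' THEN 0 ELSE 1 END,
                     t.[Date] DESC
        ) AS rn
    FROM MyTable t
) x
WHERE x.rn = 1;
```

On the sample rows this should pick Id 2 for ExternalId 123, Id 6 for 125 and Id 3 for 124, each reported with the date 2020-01-02.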
Here is a query which can help you:
SELECT ExternalId, MAX([Date]) AS 'Date', MAX(StatusCode) AS 'StatusCode' FROM MyTable GROUP BY ExternalId
Your expected result includes the Id column, which cannot be added here, because the values in each output row may come from different source rows.
Result will be
|123|2020-01-02|A|
|124|2020-01-02|B|
|125|2020-01-02|B|

SQL Server find sum of values based on criteria within another table

I have a table consisting of ID, Year, Value
---------------------------------------
| ID | Year | Value |
---------------------------------------
| 1 | 2006 | 100 |
| 1 | 2007 | 200 |
| 1 | 2008 | 150 |
| 1 | 2009 | 250 |
| 2 | 2005 | 50 |
| 2 | 2006 | 75 |
| 2 | 2007 | 65 |
---------------------------------------
I then create a derived, aggregated table consisting of an ID, MinYear, and MaxYear
---------------------------------------
| ID | MinYear | MaxYear |
---------------------------------------
| 1 | 2006 | 2009 |
| 2 | 2005 | 2007 |
---------------------------------------
I then want to find the sum of Values between the MinYear and MaxYear for each ID in the aggregated table, but I am having trouble determining a proper query.
The final table should look something like this
----------------------------------------------------
| ID | MinYear | MaxYear | SumVal |
----------------------------------------------------
| 1 | 2006 | 2009 | 700 |
| 2 | 2005 | 2007 | 190 |
----------------------------------------------------
Right now I can perform all the joins to create the second table. But then I use a fast forward cursor to iterate through each record of the second table with the code inside the for loop looking like the following
DECLARE @curMin int
DECLARE @curMax int
DECLARE @curID int
FETCH NEXT FROM fastCursor INTO @curID, @curMin, @curMax
WHILE @@FETCH_STATUS = 0
BEGIN
    SELECT SUM(Value) FROM ValTable WHERE Year >= @curMin AND Year <= @curMax AND ID = @curID
    GROUP BY ID
    FETCH NEXT FROM fastCursor INTO @curID, @curMin, @curMax
END
Having found the sum of values between the specified years, I can connect it back to the second table and I wind up with the desired result (the third table).
However, the second table in reality is roughly 4 million rows, so this iteration is extremely time consuming (generating roughly 300 results a minute) and presumably not the best solution.
My question is, is there a way to generate the third table's results without having to use a cursor/for loop?
During a group by the sum will only be for the ID in question -- since the min year and max year is for the ID itself then you don't need to double query. The query below should give you exactly what you need. If you have a different requirement let me know.
SELECT ID, MIN(YEAR) as MinYear, MAX(YEAR) as MaxYear, SUM(VALUE) as SUMVALUE
FROM tablenameyoudidnotsay
GROUP BY ID
You could use a query as below.
TableA is your first table, and TableB is the second one:
SELECT *,
    (SELECT SUM(Value) FROM TableA
     WHERE TableA.ID = TableB.ID
       AND TableA.Year BETWEEN TableB.MinYear AND TableB.MaxYear) AS SumValue
FROM TableB
You can put your criteria into a join and obtain the result all as one set which should be faster:
SELECT b.Id, b.MinYear, b.MaxYear, sum(a.Value)
FROM Table2 b
JOIN Table1 a ON a.Id=b.Id AND b.MinYear <= a.Year AND b.MaxYear >= a.Year
GROUP BY b.Id, b.MinYear, b.MaxYear
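To try the join approach against the sample data from the question, here is a self-contained sketch. The table variable @ValTable and the CTE name Ranges are assumptions made for illustration; in the real setup the CTE would be replaced by the existing derived table. The multi-row VALUES syntax needs SQL Server 2008 or later.

```sql
-- Sample data from the question, held in a table variable (name assumed).
DECLARE @ValTable TABLE (ID int, [Year] int, Value int);
INSERT INTO @ValTable (ID, [Year], Value) VALUES
    (1, 2006, 100), (1, 2007, 200), (1, 2008, 150), (1, 2009, 250),
    (2, 2005, 50),  (2, 2006, 75),  (2, 2007, 65);

-- Stand-in for the derived MinYear/MaxYear table.
WITH Ranges AS (
    SELECT ID, MIN([Year]) AS MinYear, MAX([Year]) AS MaxYear
    FROM @ValTable
    GROUP BY ID
)
-- Set-based replacement for the cursor: one join, one GROUP BY.
SELECT r.ID, r.MinYear, r.MaxYear, SUM(v.Value) AS SumVal
FROM Ranges r
JOIN @ValTable v
  ON v.ID = r.ID
 AND v.[Year] BETWEEN r.MinYear AND r.MaxYear
GROUP BY r.ID, r.MinYear, r.MaxYear
ORDER BY r.ID;
```

With these rows the sums come out as 700 for ID 1 and 190 for ID 2, matching the third table in the question.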

SQL apply functions to multiple id rows

I'm using SQL Server 2008, and trying to gather individual customer data appearing over multiple rows in my table, an example of my database is as follows:
custID | status | type | value
-------------------------
1 | 1 | A | 150
1 | 0 | B | 100
1 | 0 | A | 153
1 | 0 | A | 126
2 | 0 | A | 152
2 | 0 | B | 101
2 | 0 | B | 103
For each custID, my task is to find a flag if status=1 for any row, if type=B for any row, and the average of value in all cases where type=B. So my solution should look like:
custID | statusFlag | typeFlag | valueAv
-------------------------------------------
1 | 1 | 1 | 100
2 | 0 | 1 | 102
I can get answers for this using lots of row_number() over (partition by .. ), to create ids, and creating subtables for each column selecting the desired id. My issue is this method is awkward and time consuming, as I have many more columns than shown above to do this over, and many tables to repeat it for. My ideal solution would be to define my own aggregate() function so I could just do:
select custID, ag1(statusFlag), ag2(typeFlag)
group by custID
but as far as I can tell custom aggregates can't be defined in SQL server. Is there a nicer general approach to this problem, which doesn't require defining lots of id's ?
Use CASE WHEN to evaluate the value and apply the aggregate function accordingly:
select custID,
statusFlag = max(status),
typeFlag = max(case when type = 'B' then 1 else 0 end),
valueAv = avg(case when type = 'B' then value end)
from samples
group by custID
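Two data type details can trip up the query above, depending on how the columns are declared (the question does not say). If status is a bit column, MAX will not accept it directly, and if value is an int, AVG returns an integer and drops any fraction. A hedged variant with explicit casts, assuming the same samples table:

```sql
SELECT
    custID,
    -- bit is not a valid operand for MAX, so cast it first
    MAX(CAST(status AS int)) AS statusFlag,
    MAX(CASE WHEN type = 'B' THEN 1 ELSE 0 END) AS typeFlag,
    -- cast so the average keeps its decimals; non-B rows yield NULL
    -- and are ignored by AVG
    AVG(CASE WHEN type = 'B' THEN CAST(value AS decimal(10, 2)) END) AS valueAv
FROM samples
GROUP BY custID;
```

The same pattern extends to further columns: one aggregated CASE expression per flag or per filtered average, with no extra row numbering needed.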

how to update only 1 item in a duplicate field in sql 2005

I have duplicate rows and I need to update only one of them. How do I do this with SQL Server 2005?
My table is shown below:
+----------------+-------+-------------+------------------+---------------+
| Transaction_no | User | Check-In | Check-Out | barcode |
+----------------+-------+-------------+------------------+---------------+
| 01-2013 | User1 | --/--/-- | 12/28/2013 11:10 | APH009300L030 |
| 01-2013 | User1 | --/--/-- | 12/28/2013 11:10 | APH009300L030 |
| 01-2013 | User1 | --/--/-- | 12/28/2013 11:10 | APH009300L030 |
| 01-2013 | User1 | --/--/-- | 12/28/2013 11:10 | APH009300L030 |
+----------------+-------+-------------+------------------+---------------+
Try it like this:
;WITH NumberedTbl AS
(
    -- Column names containing a hyphen, and the reserved word User, must be bracketed.
    SELECT ROW_NUMBER() OVER (
               PARTITION BY [User], [Check-In], [Check-Out], barcode, Transaction_no
               ORDER BY [User], [Check-In], [Check-Out], barcode, Transaction_no
           ) AS RowNumber,
           [User], [Check-In], [Check-Out], barcode, Transaction_no
    FROM tbl
)
UPDATE T
SET [Check-In] = GETDATE()
FROM NumberedTbl T
WHERE T.RowNumber = 1

SELECT * FROM tbl
What you need in your table is a primary key column. You can just add an identity field. Then you'll be able to easily write queries like the one you need.
Ranking functions like rank and dense_rank won't help, as all the field values are the same and applying ORDER BY on any column after partitioning will generate the same rank for all rows. (row_number still numbers the rows 1 to n, but which duplicate gets which number is arbitrary.)
Instead, add another column having unique values, e.g. an identity column, after which you can use this column in the ORDER BY clause.
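The identity-column suggestion could look roughly like this. It is a sketch under assumptions: RowId is a hypothetical column name, the table is called tbl as in the earlier answer, and the filter values are taken from the sample rows in the question.

```sql
-- Add a surrogate key so each duplicate row becomes addressable.
-- RowId is a hypothetical column name.
ALTER TABLE tbl ADD RowId int IDENTITY(1,1) NOT NULL;
GO

-- Update only the duplicate with the lowest RowId.
UPDATE tbl
SET [Check-In] = GETDATE()
WHERE RowId = (
    SELECT MIN(RowId)
    FROM tbl
    WHERE Transaction_no = '01-2013'
      AND barcode = 'APH009300L030'
);
```

GO is the batch separator used by SSMS and sqlcmd; it is there so the UPDATE is compiled after the new column exists.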

Get latest records for a table that stores the delta of data?

I have a table Actions, schema below:
[Actions]
ActionID
Date
Status <--Nullable, a delta column, only stores value when status changes
Now I want to retrieve the latest record. However, it is very likely that Status for that record is null, so I want to get its last status change (ranked by Date).
Here is an example:
ActionID | Date | Status
------------------------
1 | 04/12| 'Bon'
2 | 04/13| NULL
3 | 04/14| NULL
4 | 04/15| NULL
and my latest record should look like: ActionID: 4, Date: 04/15, Status: 'Bon'
I know it's possible to do with nested select statements, but in my real table, I have about 10 of these columns, it will drastically affect the performance when a lot of queries like these are made. I wonder if there is a simpler way to do it?
Not sure if I understood your rules, but try this:
SELECT TOP 1
ActionID,
Date,
(SELECT TOP 1 Status FROM Actions WHERE Status IS NOT NULL ORDER BY ActionID DESC) AS Status
FROM Actions
ORDER BY ActionID DESC
You said you have 10 columns. How does that work?
If it is like scenario A)
ActionID | Date | StatusA | StatusB | StatusC
1 | 04/11 | DELTA_A | NULL | DELTA_C
2 | 04/12 | NULL | DELTA_B | DELTA_C
3 | 04/13 | DELTA_A | NULL | NULL
..then multiple SELECT TOP 1 subqueries are still the best choice, I think..
but if it's like scenario B)
ActionID | Date | StatusA | StatusB | StatusC
1 | 04/11 | NULL | NULL | NULL
2 | 04/12 | DELTA_A | DELTA_B | DELTA_C
3 | 04/13 | NULL | NULL | NULL
..then you may "reverse" your query like this:
SELECT TOP 1
    (SELECT TOP 1 ActionID FROM Actions ORDER BY ActionID DESC) AS ActionID, -- latest row, whatever its status
    (SELECT TOP 1 Date FROM Actions ORDER BY ActionID DESC) AS Date,
    StatusA,
    StatusB,
    StatusC
FROM Actions
WHERE StatusA IS NOT NULL -- then StatusB and StatusC are also NOT NULL
ORDER BY ActionID DESC
..but be aware you may get an empty result if there is no row where StatusA IS NOT NULL.
Maybe this helps you, although I'm not sure whether I've understood your requirement:
WITH RankedActions AS(
    SELECT ROW_NUMBER() OVER(ORDER BY Date DESC) AS DateRank
    , ActionID
    , Date
    , Status
    FROM [Actions]
)
SELECT TOP 1 a1.ActionID, a1.Date, a2.Status
FROM RankedActions a1 INNER JOIN RankedActions a2
ON a1.DateRank <= a2.DateRank AND a2.Status IS NOT NULL
WHERE a1.DateRank = 1
ORDER BY a2.DateRank -- most recent row that has a status
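For several delta columns (scenario A above), one more option is an OUTER APPLY per column, each looking up the most recent non-null value. This is a sketch only: the column names StatusA and StatusB follow the scenario tables above, and whether it performs better than nested selects depends on indexing and is worth measuring.

```sql
SELECT TOP 1
    a.ActionID,
    a.[Date],
    sa.StatusA,
    sb.StatusB
FROM Actions a
-- latest non-null StatusA on or before the row's date
OUTER APPLY (
    SELECT TOP 1 x.StatusA
    FROM Actions x
    WHERE x.StatusA IS NOT NULL AND x.[Date] <= a.[Date]
    ORDER BY x.[Date] DESC
) sa
-- same lookup for StatusB; repeat once per delta column
OUTER APPLY (
    SELECT TOP 1 x.StatusB
    FROM Actions x
    WHERE x.StatusB IS NOT NULL AND x.[Date] <= a.[Date]
    ORDER BY x.[Date] DESC
) sb
ORDER BY a.[Date] DESC;
```

Because OUTER APPLY keeps the outer row even when a lookup finds nothing, the latest record is still returned with NULL for any column that has never been set.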
