T-SQL group by month - sql-server

I'm trying to group rows by month, based on a datetime column.
I run the query below:
select cf.flow_name as 'Process', COUNT(c.case_ID) as 'Case', CONVERT(VARCHAR(10),c.xdate,104) as 'Date'
from cases c inner join case_flow cf on c.case_flow_ID=cf.CF_ID
where project_ID=1 and c.subject_ID=1
group by cf.flow_name,c.xdate
The column data types are as follows:
flow_name varchar(100)
case_ID int
xdate datetime
The query above returns the following result:
Process - Case - Date
Test 1 30.01.2015
Test 1 30.01.2015
analysis 1 19.03.2015
analysis 1 30.03.2015
analysis 1 13.04.2015
analysis 1 16.04.2015
Question:
I need to group by the month of xdate instead.
The correct result should be as below:
Process - Case - Date
Test 2 30.01.2015 (because Test has 2 rows in month 01)
analysis 2 19.03.2015 (because analysis has 2 rows in month 03)
analysis 2 13.04.2015 (because analysis has 2 rows in month 04)
How can I change my query so that all results are grouped by month like this?
I hope my English is understandable. Thanks.

SELECT cf.flow_name,
       Count(*),
       Min(c.xdate)
FROM cases c
INNER JOIN case_flow cf
        ON c.case_flow_id = cf.cf_id
WHERE project_id = 1
  AND c.subject_id = 1
GROUP BY cf.flow_name,
         -- truncates xdate to the first day of its month
         Dateadd(month, Datediff(month, 0, c.xdate), 0)
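To see why that GROUP BY expression collapses every date in a month onto one value, here is a minimal sketch using the six dates from the question. The table variable @cases is a hypothetical stand-in for the joined cases/case_flow rows (an assumption for illustration only), and the multi-row VALUES syntax assumes SQL Server 2008 or later.

```sql
-- Hypothetical stand-in for the joined cases/case_flow rows from the question.
DECLARE @cases TABLE (flow_name varchar(100), case_ID int, xdate datetime);

INSERT @cases (flow_name, case_ID, xdate) VALUES
    ('Test',     1, '20150130'),
    ('Test',     2, '20150130'),
    ('analysis', 3, '20150319'),
    ('analysis', 4, '20150330'),
    ('analysis', 5, '20150413'),
    ('analysis', 6, '20150416');

-- DATEDIFF(month, 0, xdate) counts the month boundaries crossed since date 0
-- (1900-01-01); adding that many months back to date 0 gives the first day
-- of xdate's month, so all dates in one month share a single group key.
SELECT flow_name                              AS [Process],
       COUNT(case_ID)                         AS [Case],
       CONVERT(varchar(10), MIN(xdate), 104)  AS [Date]
FROM @cases
GROUP BY flow_name,
         DATEADD(month, DATEDIFF(month, 0, xdate), 0)
ORDER BY MIN(xdate);
```

With these sample rows the query should return one row per process and month, each with a count of 2 and the earliest date in that month, which is the shape the question asks for.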

Related

Create a select statement that returns a record for each day after a given created date

I have a Dimension table containing machines.
Each machine has a date created value.
I would like a SELECT statement that returns, for each day after a certain start date, the number of available machines. A machine is available from its created date onwards.
As I only have read access to the database, I am not able to create a physical calendar table.
I hope somebody can help me solve this issue.
I assume this is what you want. Based on this sample table:
USE tempdb;
GO
CREATE TABLE dbo.Machines
(
MachineID int,
CreatedDate date
);
INSERT dbo.Machines VALUES(1,'20200104'),(2,'20200202'),(3,'20200214');
Then say you wanted the number of active machines starting on January 1st:
DECLARE @StartDate date = '20200101';
;WITH x AS
(
SELECT n = 0 UNION ALL SELECT n + 1 FROM x
WHERE n < DATEDIFF(DAY, @StartDate, GETDATE())
),
days(d) AS
(
SELECT DATEADD(DAY, x.n, @StartDate) FROM x
)
SELECT days.d, MachineCount = COUNT(m.MachineID)
FROM days
LEFT OUTER JOIN dbo.Machines AS m
ON days.d >= m.CreatedDate
GROUP BY days.d
ORDER BY days.d
OPTION (MAXRECURSION 0);
Results:
d MachineCount
---------- ------------
2020-01-01 0
2020-01-02 0
2020-01-03 0
2020-01-04 1
2020-01-05 1
...
2020-01-31 1
2020-02-01 1
2020-02-02 2
2020-02-03 2
...
2020-02-12 2
2020-02-13 2
2020-02-14 3
2020-02-15 3
Clean up:
DROP TABLE dbo.Machines;
(Yes, some people hiss at recursive CTEs. You can replace it with any number of set generation techniques, some I talk about here, here, and here.)
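As one example of a non-recursive replacement, the sketch below builds the day sequence by cross-joining a ten-row digits list. It is only an illustration under a few assumptions: the names digits and numbers are made up here, the end date is fixed (instead of GETDATE()) so the output is repeatable, the sample rows live in a table variable, and three digit positions cap the range at 1000 days.

```sql
DECLARE @StartDate date = '20200101';
DECLARE @EndDate   date = '20200215';  -- fixed end date (assumption) instead of GETDATE()

-- Same sample rows as the answer above, held in a table variable.
DECLARE @Machines TABLE (MachineID int, CreatedDate date);
INSERT @Machines VALUES (1,'20200104'),(2,'20200202'),(3,'20200214');

WITH digits(d) AS
(
  SELECT d FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS v(d)
),
numbers(n) AS
(
  -- Three cross-joined digit positions yield 0..999 without recursion.
  SELECT a.d + 10 * b.d + 100 * c.d
  FROM digits AS a CROSS JOIN digits AS b CROSS JOIN digits AS c
),
days(d) AS
(
  SELECT DATEADD(DAY, n, @StartDate)
  FROM numbers
  WHERE n <= DATEDIFF(DAY, @StartDate, @EndDate)
)
SELECT days.d, MachineCount = COUNT(m.MachineID)
FROM days
LEFT OUTER JOIN @Machines AS m
  ON days.d >= m.CreatedDate
GROUP BY days.d
ORDER BY days.d;
```

No MAXRECURSION hint is needed in this form. For ranges longer than 1000 days, a fourth digit position could be added to the cross join.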

Is there an optimal way to create logical columns from physical column using SQL query statement?

I'm writing a SQL query against a single table. I need to generate two logical columns from one physical column, each under its own conditions. How can I produce both logical columns in the final result set?
So far I have tried using sub-queries to derive those logical columns, but the sub-query returns an error when I incorporate it as a column in the main query.
Eventually, other tables will be joined in to derive further columns.
Columns:
CarrierName NVARCHAR(10)
MonthDate DATETIME
Stage INT
Scenario:
In my SQL Server table there is a column called Stage of type int that contains values like 1, 2, 3, 4.
Now, I have two date criteria to apply on above column to derive two logical columns in final result set.
Criteria #1:
Get carriers from the 12 months prior to the past month-end date whose stage is greater than "CurrentStage" (that is, "CurrentStage" should be the lesser value), and derive "PriorStage" from them
Example:
Current month is: March 2019 (2019-03-25) or any given date
Past latest month end date would be: 2019-02-28
12 months prior to above past latest month would be:
From 2018-02-01 To 2019-01-31
Criteria #2:
Get Carriers from past latest month end date and derive "CurrentStage"
When I write two independent SQL SELECT statements, I get my desired results.
My challenge is integrating them into one SELECT statement.
When I try, I get this error:
Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression
Code:
DECLARE @DATE DATETIME
SET @DATE = '2018-08-25';
--QUERY 1 - RECORDS WITH PREVIOUS MONTH END DATE
SELECT
T1.CarrierName AS 'Carrier_Number',
T1.Stage AS 'Monitoring_Stage–Current'
FROM
table1 T1
WHERE
T1.Stage IS NOT NULL AND
CONVERT(DATE, T1.MonthDate) = CONVERT(DATE, DATEADD(D, -(DAY(@DATE)), @DATE))
--QUERY 2 - RECORDS FROM PAST 12 MONTHS PRIOR PREVIOUS MONTH END DATE
SELECT
T2.CarrierName,
T2.Stage AS 'Monitoring_Stage–Prior'
FROM
table2 T2
WHERE
T2.Stage IS NOT NULL AND
CONVERT(DATE, T2.MonthDate) BETWEEN CONVERT(DATE, DATEADD(M, -12, DATEADD(D, -(DAY(@DATE)), @DATE)))
AND CONVERT(DATE, DATEADD(D, -(DAY(@DATE) + (DAY(DATEADD(D, -(DAY(@DATE)), @DATE)))), @DATE))
AND T2.Stage > (SELECT DISTINCT MAX(m.Stage)
FROM table1 m
WHERE CONVERT(DATE, m.MonthDate) = CONVERT(DATE, DATEADD(D, -(DAY(@DATE)), @DATE))
AND T2.CarrierName = m.CarrierName)
My final expected result set should contain below columns.
Where CurrentStage value is less than PriorStage value.
Expected Results
CarrierName | CurrentStage | PriorStage
--------------+--------------+-------------
C11122 | 1 | 2
C32233 | 3 | 4
Actual Result
I am looking for alternatives. I.e. CTE, Union, temp table etc.
Something like:
SELECT
CarrierName,
Query 1 Result As 'CurrentStage',
Query 2 Result As 'PriorStage'
FROM
table1
To improve this post, I am adding my own attempt here. The resolution below is still under evaluation, so I am not posting it as a final answer, but it did help me make progress.
RESOLUTION:
SELECT
DISTINCT M.CarrierName, A.[CurrentStage], B.[PriorStage]
FROM
--QUERY 1 - RECORDS WITH CURRENT MONTH END DATE
(SELECT M.CarrierName, M.CarrierID
, Stage AS 'CurrentStage'
FROM table1 M
WHERE M.Stage IS NOT NULL AND
CONVERT(date, M.MonthDate) = CONVERT(date, DATEADD(D,-(DAY(@DATE)), @DATE))
)
A INNER JOIN
(
--QUERY 2 - RECORDS FROM PAST 12 MONTHS PRIOR CURRENT MONTH END DATE
SELECT M2.CarrierName, M2.CarrierID
, Stage AS 'PriorStage'
FROM table1 M2
WHERE M2.Stage IS NOT NULL AND
CONVERT(date, M2.MonthDate) BETWEEN CONVERT(date, DATEADD(M, -12, DATEADD(D,-(DAY(@DATE)), @DATE)))
AND CONVERT(date, DATEADD(D,-(DAY(@DATE)+(DAY(DATEADD(D,-(DAY(@DATE)), @DATE)))), @DATE))
AND M2.Stage > (SELECT DISTINCT max(m.Stage)
FROM table1 m
WHERE CONVERT(date, m.MonthDate) = CONVERT(date, DATEADD(D,-(DAY(@DATE)), @DATE)) AND
M2.CarrierName = m.CarrierName
)
) B ON B.CarrierName = A.CarrierName
INNER JOIN table1 M ON A.CarrierID = M.CarrierID AND B.CarrierID = M.CarrierID
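The month boundaries described in the question can be checked in isolation before they are embedded in the larger query. The sketch below uses the example date from the question (2019-03-25); the column aliases PastMonthEnd, PriorRangeStart and PriorRangeEnd are names chosen here for illustration, and the DATEADD/DATEDIFF month-truncation idiom is one possible way to compute them, not necessarily what the poster settled on.

```sql
DECLARE @DATE datetime;
SET @DATE = '20190325';

-- DATEADD(MONTH, DATEDIFF(MONTH, 0, @DATE), 0) is the first day of @DATE's month.
SELECT
    -- Last day of the previous month: one day before the first of this month.
    DATEADD(DAY, -1, DATEADD(MONTH, DATEDIFF(MONTH, 0, @DATE), 0))      AS PastMonthEnd,
    -- First day of the month 13 months back: start of the 12-month window.
    DATEADD(MONTH, DATEDIFF(MONTH, 0, @DATE) - 13, 0)                   AS PriorRangeStart,
    -- Last day of the month before the previous one: end of the window.
    DATEADD(DAY, -1, DATEADD(MONTH, DATEDIFF(MONTH, 0, @DATE) - 1, 0))  AS PriorRangeEnd;
```

For 2019-03-25 this should give 2019-02-28, 2018-02-01 and 2019-01-31, matching the example in the question. By comparison, DATEADD(M, -12, ...) applied to 2019-02-28 yields 2018-02-28, so a BETWEEN built on it starts at the end of February 2018 rather than the first day.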

update stats from nested query based on isnull and aggregate values

I have a parent system table called lt_program_data. It contains customer data, the percents tracked for each customer, and a year field, since those percents are tracked on a yearly basis.
The percents are populated from a localized table based on some criteria, then the parent table lt_program_data is updated based on the year and customer values.
However, in some cases we only have past data. The user is requesting that, where we have customer data but no percents for the current season, we use the value from the max season we do have.
Our logic currently looks like this:
update lt_program_data
set percent = ( select percent
from #percent b
where b.year = a.fyear and b.customer = a.customer)
from lt_program_data
This works great, but now we need logic along these lines:
if b.year is null, select the max year for that customer from the data we have.
select *
From #lt_program_data a
join #percent b on b.fyear= isnull(a.fyear,max(a.fyear)) and b.customer = a.customer
I tried to write a select and then an update but get the following message:
Msg 1015, Level 15, State 1, Line 3
An aggregate cannot appear in an ON clause unless it is in a subquery contained in a HAVING clause or select list, and the column being aggregated is an outer reference.
Please help sort this out.
Here is a sample of our output
lt_program_data
customer year .. percent .. Other columns
1 2016 ..
2 2016
1 2017
2 2017
3 2017
etc.
percent table looks like this
customer year percent
1 2016 40
2 2016 64
3 2016 11
The expected result will take lt_program_data for
customer year percent
1 2016 40
1 2017 40
2 2016 64
2 2017 64
3 2017 NULL
It matches on customer number and takes the percent for the given year if that year exists in the percent table (so the value for customer 1 becomes 40 and for customer 2 becomes 64). Since no data exists for those customers for the 2017 season, it uses the max existing data for the respective customers, which is from the 2016 season. In the case of customer 3, since there is nothing, it is left NULL.
The percent table only has data for 2016, so since 2016 is the max year we have for our customers, we want to populate the 2017 value in lt_program_data for customer 1 with the 2016 value of 40.
I hope this query will work for you.
update a
set percent_poverty = case
when b.year is null -- no matching row in #percent for this year and customer
then (select top 1 poverty_percent -- fall back to the latest year available for the customer
from #percent
where customer = a.customer
order by year desc)
else b.poverty_percent -- otherwise use the matching year's percent
end
from lt_program_data a -- left join both tables on year and customer
left join #percent b on b.year = a.fyear and b.customer = a.customer

SQL Server: How to get a rolling sum over 3 days for different customers within same table

This is the input table:
Customer_ID Date Amount
1 4/11/2014 20
1 4/13/2014 10
1 4/14/2014 30
1 4/18/2014 25
2 5/15/2014 15
2 6/21/2014 25
2 6/22/2014 35
2 6/23/2014 10
There is information pertaining to multiple customers and I want to get a rolling sum across a 3 day window for each customer.
The solution should be as below:
Customer_ID Date Amount Rolling_3_Day_Sum
1 4/11/2014 20 20
1 4/13/2014 10 30
1 4/14/2014 30 40
1 4/18/2014 25 25
2 5/15/2014 15 15
2 6/21/2014 25 25
2 6/22/2014 35 60
2 6/23/2014 10 70
The biggest issue is that I don't have transactions for every day, so partitioning by row number doesn't work.
The closest example I found on SO was:
SQL Query for 7 Day Rolling Average in SQL Server
but even in that case there were transactions every day, which accommodated the ROW_NUMBER()-based solutions.
The ROW_NUMBER() query is as follows:
select customer_id, Date, Amount,
Rolling_3_day_sum = CASE WHEN ROW_NUMBER() OVER (partition by customer_id ORDER BY Date) > 2
THEN SUM(Amount) OVER (partition by customer_id ORDER BY Date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
END
from #tmp_taml9
order by customer_id
I was wondering if there is a way to replace "BETWEEN 2 PRECEDING AND CURRENT ROW" with "BETWEEN [DATE - 2] and [DATE]".
One option would be to use a calendar table (or something similar) to get the complete range of dates and left join your table with that and use the row_number based solution.
Another option that might work (not sure about performance) would be to use an apply query like this:
select customer_id, Date, Amount, coalesce(Rolling_3_day_sum, Amount) Rolling_3_day_sum
from #tmp_taml9 t1
cross apply (
select sum(amount) Rolling_3_day_sum
from #tmp_taml9
where Customer_ID = t1.Customer_ID
and datediff(day, date, t1.date) <= 2 -- current day plus the two days before it
and t1.Date >= date
) o
order by customer_id;
I suspect performance might not be great though.
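For anyone who wants to try the APPLY approach without the original table, here is a self-contained sketch on the sample rows from the question. The table variable @t is a hypothetical stand-in for #tmp_taml9, and the window is written as a direct date range (two days back through the current day) in place of DATEDIFF, which is an assumption about what "3 day window" means, chosen because it reproduces the expected output in the question.

```sql
-- Sample rows from the question, held in a table variable (hypothetical name @t).
DECLARE @t TABLE (Customer_ID int, [Date] date, Amount int);

INSERT @t (Customer_ID, [Date], Amount) VALUES
    (1, '20140411', 20), (1, '20140413', 10), (1, '20140414', 30), (1, '20140418', 25),
    (2, '20140515', 15), (2, '20140621', 25), (2, '20140622', 35), (2, '20140623', 10);

SELECT t1.Customer_ID, t1.[Date], t1.Amount, o.Rolling_3_Day_Sum
FROM @t AS t1
CROSS APPLY (
    -- Sum this customer's rows dated from two days earlier up to the current row.
    SELECT SUM(t2.Amount) AS Rolling_3_Day_Sum
    FROM @t AS t2
    WHERE t2.Customer_ID = t1.Customer_ID
      AND t2.[Date] <= t1.[Date]
      AND t2.[Date] >= DATEADD(DAY, -2, t1.[Date])
) AS o
ORDER BY t1.Customer_ID, t1.[Date];
```

Because the inner query always includes the current row, the sum is never NULL here. Comparing the date column directly against a computed range, instead of wrapping it in DATEDIFF, may also let an index on (Customer_ID, Date) be used, though that depends on the real table.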

Cumulative Addition in SQL Server 2008

Sample data in tblData:
RowID SID Staken DateTaken
---------------------------------------------
1 1 1 2014-09-15 14:18:11.997
2 1 1 2014-09-16 14:18:11.997
3 1 1 2014-09-17 14:18:11.997
I would like to get the daywise count of SIDs and also a cumulative sum like
Date ThisDayCount TotalCount
-----------------------------------
2014-09-15 1 1
2014-09-16 10 11
2014-09-17 30 41
This is what I have now in my stored procedure with the start & end date parameters. Is there a more elegant way to do this?
;WITH TBL AS
(
SELECT
CONVERT(date, asu.DateTaken) AS Date,
COUNT(*) AS 'ThisDayCount'
FROM
tblData asu
WHERE
asu.SID = 1
AND asu.STaken = 1
AND asu.DateTaken >= @StartDate
AND asu.DateTaken <= @EndDate
GROUP BY
CONVERT(date, asu.DateTaken)
)
SELECT
t1.Date, t1.ThisDayCount, SUM(t2.ThisDayCount) AS 'TotalCount'
FROM
TBL t1
INNER JOIN
TBL t2 ON t1.date >= t2.date
GROUP BY
t1.Date, t1.ThisDayCount
I am not aware of a more elegant way to do that, other than perhaps with a subquery for your running total. What you have is pretty elegant by T-SQL standards.
But, depending on how many records you have to process and what your indexes look like, this could be very slow. You don't say what the destination of this information is, but if it's any kind of report or web page, I'd consider doing the running total as part of the processing at the destination rather than in the database.
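For reference, the subquery form of the running total mentioned above might look like the following sketch. It assumes the per-day counts have already been computed; the table variable @daily and its three rows are hypothetical, chosen to mirror the desired output in the question, and the syntax is limited to features available in SQL Server 2008.

```sql
-- Hypothetical per-day counts matching the desired output in the question.
DECLARE @daily TABLE ([Date] date PRIMARY KEY, ThisDayCount int);

INSERT @daily ([Date], ThisDayCount) VALUES
    ('20140915', 1), ('20140916', 10), ('20140917', 30);

SELECT d.[Date],
       d.ThisDayCount,
       -- Correlated subquery: add up every day on or before the current one.
       (SELECT SUM(p.ThisDayCount)
        FROM @daily AS p
        WHERE p.[Date] <= d.[Date]) AS TotalCount
FROM @daily AS d
ORDER BY d.[Date];
```

With these rows the TotalCount column should come out as 1, 11 and 41. Like the self-join, this re-reads earlier rows for every day, so the performance caveat above still applies. On SQL Server 2012 and later, SUM(...) OVER (ORDER BY ...) is an option, but it is not available in 2008.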

Resources