How can we count new distinct values grouped by date in SQL?


I need to count the distinct values in a column and group the counts by date. The catch is that a value should not be counted if it has already occurred on an earlier date.
For example, consider the table tblcust:
Date     Customers
March 1  Mike
March 1  Yusuf
March 1  John
March 2  Ajay
March 2  Mike
March 2  Anna
The result should be:
Date     Customer_count
March 1  3
March 2  2
If I use
select date, count(distinct(customer)) as customer_count
from tblcust
group by date
the result I get is:
Date     customer_count
March 1  3
March 2  3
The customer Mike visited on both days, so he should not be counted as a new customer on March 2.

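You can achieve this with the SQL Server ROW_NUMBER function: number each customer's visits by date and count only the first one.
-- Sample data. DtDate is a varchar, so ORDER BY DtDate sorts it as text;
-- use a real date column for data beyond this small sample.
create table tblCust (DtDate varchar(20), Customers varchar(20));
insert into tblCust values
  ('March 1', 'Mike'),
  ('March 1', 'Yusuf'),
  ('March 1', 'John'),
  ('March 2', 'Ajay'),
  ('March 2', 'Mike'),
  ('March 2', 'Anna');
-- Step 1: SrNo = 1 marks each customer's first visit.
select DtDate
     , Customers
     , ROW_NUMBER() OVER (PARTITION BY Customers ORDER BY DtDate) as SrNo
from tblCust
order by DtDate, SrNo;
-- Step 2: keep only the first visits and count them per date.
select DtDate,
       count(distinct Customers) as customer_count
from (select DtDate
           , Customers
           , ROW_NUMBER() OVER (PARTITION BY Customers ORDER BY DtDate) as SrNo
      from tblCust
     ) a
where SrNo = 1
group by DtDate;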
Here is the live db<>fiddle demo
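The first-visit idea can also be checked end to end with a small script. This is only a sketch: it assumes Python's built-in sqlite3 module (SQLite 3.25 or later, for window functions) instead of SQL Server, and the table name visits, the column names and the ISO dates with the year 2024 are made up for illustration.

```python
import sqlite3

# Hypothetical sample data: the dates from the question with an assumed year.
SAMPLE_VISITS = [
    ("2024-03-01", "Mike"), ("2024-03-01", "Yusuf"), ("2024-03-01", "John"),
    ("2024-03-02", "Ajay"), ("2024-03-02", "Mike"), ("2024-03-02", "Anna"),
]

def new_customers_per_day(rows):
    """Count each customer only on the date of their first visit."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE visits (visit_date TEXT, customer TEXT)")
    con.executemany("INSERT INTO visits VALUES (?, ?)", rows)
    query = """
        SELECT visit_date, COUNT(*) AS customer_count
        FROM (
            SELECT visit_date, customer,
                   ROW_NUMBER() OVER (PARTITION BY customer
                                      ORDER BY visit_date) AS sr_no
            FROM visits
        ) AS firsts
        WHERE sr_no = 1          -- keep only each customer's first visit
        GROUP BY visit_date
        ORDER BY visit_date
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

print(new_customers_per_day(SAMPLE_VISITS))
```

Mike's second visit gets sr_no = 2 and is filtered out, so March 2 counts only Ajay and Anna.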

Related

Group row values into columns without hardcoding PL SQL

I have a table that looks like this:
MONTH  NAME
July   Ally
July   Don
July   Ken
March  Lee
March  Froyo
March  Denise
April  Kram
I want it to look like this:
July  March   April
Ally  Lee     Kram
Don   Froyo
Ken   Denise
Here is a solution I have tried:
SELECT
(SELECT name FROM mytable WHERE month='July') AS July,
(SELECT name FROM mytable WHERE month='March') AS March,
(SELECT name FROM mytable WHERE month='April') AS April
FROM DUAL;
However, I do not want to hardcode the month values, as in month='July'.
Is there any way to do this?

It seems you want a conditional aggregation query, with the month columns sorted in calendar order, such as
SELECT MAX(CASE WHEN month = 'March' THEN name END) AS March,
       MAX(CASE WHEN month = 'April' THEN name END) AS April,
       MAX(CASE WHEN month = 'July'  THEN name END) AS July
  FROM (SELECT t.*,
               ROW_NUMBER() OVER (PARTITION BY month ORDER BY name) AS rn
          FROM t) -- "t" represents your mentioned table
 GROUP BY rn
 ORDER BY rn
and, to make it dynamic, you can generate the same query inside a stored function, such as
CREATE OR REPLACE FUNCTION Get_People_By_Months RETURN SYS_REFCURSOR IS
  v_recordset SYS_REFCURSOR;
  v_sql       VARCHAR2(32767);
  v_cols      VARCHAR2(32767);
BEGIN
  -- Build one MAX(CASE ...) column per month present in the table, in calendar order
  SELECT LISTAGG('MAX(CASE WHEN month = '''||month||''' THEN name END) AS '||month, ',')
           WITHIN GROUP (ORDER BY month_nr)
    INTO v_cols
    FROM (SELECT DISTINCT t.month, m.month_nr
            FROM t
            JOIN (SELECT TO_CHAR(TO_DATE(level, 'mm'),
                                 'Month',
                                 'NLS_DATE_LANGUAGE=English') AS month,
                         level AS month_nr
                    FROM dual
                 CONNECT BY level <= 12) m
              ON t.month = TRIM(m.month));
  v_sql := 'SELECT '|| v_cols ||'
              FROM (SELECT t.*, ROW_NUMBER() OVER (PARTITION BY month ORDER BY name) AS rn
                      FROM t)
             GROUP BY rn
             ORDER BY rn';
  OPEN v_recordset FOR v_sql;
  DBMS_OUTPUT.PUT_LINE(v_sql);
  RETURN v_recordset;
END;
/
then invoke it from SQL Developer's console, using a bind variable to hold the cursor, as
SQL> VARIABLE result REFCURSOR
SQL> BEGIN
       :result := Get_People_By_Months;
     END;
     /
SQL> PRINT result;
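The same build-the-column-list-then-run-it approach can be sketched outside PL/SQL. This is only an illustration, assuming Python's built-in sqlite3 module (SQLite 3.25 or later); the function name pivot_names_by_month and the sample list are hypothetical, and the table t mirrors the one in the answer.

```python
import sqlite3

MONTH_ORDER = ["January", "February", "March", "April", "May", "June", "July",
               "August", "September", "October", "November", "December"]
COLUMN_TEMPLATE = "MAX(CASE WHEN month = '{0}' THEN name END) AS \"{0}\""

SAMPLE_ROWS = [("July", "Ally"), ("July", "Don"), ("July", "Ken"),
               ("March", "Lee"), ("March", "Froyo"), ("March", "Denise"),
               ("April", "Kram")]

def pivot_names_by_month(rows):
    """Return (month columns in calendar order, pivoted rows)."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (month TEXT, name TEXT)")
    con.executemany("INSERT INTO t VALUES (?, ?)", rows)
    present = {m for (m,) in con.execute("SELECT DISTINCT month FROM t")}
    # Only names from the fixed MONTH_ORDER whitelist are interpolated into the SQL.
    months = [m for m in MONTH_ORDER if m in present]
    if not months:
        con.close()
        return [], []
    cols = ", ".join(COLUMN_TEMPLATE.format(m) for m in months)
    sql = (
        "SELECT " + cols + " FROM ("
        " SELECT month, name,"
        " ROW_NUMBER() OVER (PARTITION BY month ORDER BY name) AS rn"
        " FROM t) AS numbered"
        " GROUP BY rn ORDER BY rn"
    )
    data = con.execute(sql).fetchall()
    con.close()
    return months, data

months, data = pivot_names_by_month(SAMPLE_ROWS)
print(months)
for row in data:
    print(row)
```

Names are sorted alphabetically within each month, as in the ROW_NUMBER ordering of the answer, and months with fewer names leave NULL (None) in the lower rows.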

Split a date column, and calculate an amount based on the result

In the sales table I have a column that contains the full date of each sale. I need to split it and add two separate columns, month and year, to the sales table, plus a column that contains the sum of all sales for each month.
This is the table I have-
Sales table
customer_id  date        Quantity
123          01/01/2020  6
124          01/02/2020  7
123          01/03/2021  5
123          15/01/2020  4
Here's what I wrote -
ALTER TABLE SALES ADD SELECT DATEPART (year, date) as year FROM SALES;
ALTER TABLE SALES ADD SELECT DATEPART (month, date) as month FROM SALES;
ALTER TABLE SALES ADD SUM_Quantity AS SUM() (Here I was stuck in a query...)
Is this a valid query, and how to write it correctly? Thanks!

One of the problems you're going to have here is that the outcome of the computed columns you have isn't compatible with the data being stored.
For example you can't build a sum of the quantity for all of the rows in January for each row in January. You need to group by the date and aggregate the quantity.
As such I think this might be an ideal candidate for an indexed view. This will allow you to store the calculated data, whilst preserving the data in the original table.
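/* NOT NULL on quantity: an indexed view cannot SUM a nullable expression */
create table SalesTable (customer_id int not null, date date not null, quantity smallint not null);
insert SalesTable (customer_id, date, quantity)
select 123, '2020-01-01', 6
union all
select 124, '2020-02-01', 7
union all
select 123, '2021-03-01', 5
union all
select 123, '2020-01-15', 4;
select * from SalesTable order by date; /*original data set*/
GO
CREATE VIEW dbo.vwSalesTable /*view to be indexed*/
WITH SCHEMABINDING
AS
SELECT customer_id,
       DATEPART(year, date) as year,
       DATEPART(month, date) as month,
       SUM(quantity) as sum_quantity,
       COUNT_BIG(*) as row_count /*required when an indexed view uses GROUP BY*/
from dbo.SalesTable
group by customer_id, DATEPART(year, date), DATEPART(month, date)
GO
/*the unique clustered index is what actually stores the view's data*/
CREATE UNIQUE CLUSTERED INDEX IX_vwSalesTable ON dbo.vwSalesTable (customer_id, year, month);
GO
select * from dbo.vwSalesTable; /*data from your indexed view*/
drop view dbo.vwSalesTable;
drop table SalesTable;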

Sort by max date per ID, then other records by same ID, then other IDs by their max date

I have data with multiple rows per ID. I would like to first sort by maximum date, but keep the other rows with the same ID together by date descending, then the next group of IDs with the next max date, and so on.
For example, this data
create table #tbl (id int, dt date);
insert into #tbl (id, dt)
values (1, '2020-07-01')
, (1, '2020-07-17')
, (1, '2020-07-31')
, (2, '2020-07-07')
, (2, '2020-07-14')
, (2, '2020-07-16')
, (3, '2020-07-02')
, (3, '2020-07-20')
;
would output as
id dt
1 7/31/2020
1 7/17/2020
1 7/1/2020
3 7/20/2020
3 7/2/2020
2 7/16/2020
2 7/14/2020
2 7/7/2020
So id 1 has the greatest date, then the other id 1 rows by date descending. Next, id 3 has the greatest date of the remaining rows, then the other id 3 rows by date descending, and so on.
I can get max dates and row numbers but it is sorted by dates then IDs and doesn't keep the IDs grouped together.
Version: Microsoft SQL Azure (RTM) - 12.0.2000.8 Jul 31 2020 08:26:29 Copyright (C) 2019 Microsoft Corporation

Use MAX() window function in the ORDER BY clause:
select *
from #tbl
order by max(dt) over (partition by id) desc,
id, -- just in case 2 ids have the same max dt
dt desc
See the demo.
Results:
> id | dt
> -: | :---------
> 1 | 2020-07-31
> 1 | 2020-07-17
> 1 | 2020-07-01
> 3 | 2020-07-20
> 3 | 2020-07-02
> 2 | 2020-07-16
> 2 | 2020-07-14
> 2 | 2020-07-07
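To try the ordering without a SQL Server instance, here is a minimal sketch assuming Python's built-in sqlite3 module (SQLite 3.25 or later). The table name tbl stands in for #tbl, and the per-id maximum is computed in a subquery as max_dt purely to keep the ORDER BY easy to read.

```python
import sqlite3

SAMPLE_ROWS = [
    (1, "2020-07-01"), (1, "2020-07-17"), (1, "2020-07-31"),
    (2, "2020-07-07"), (2, "2020-07-14"), (2, "2020-07-16"),
    (3, "2020-07-02"), (3, "2020-07-20"),
]

def sort_by_group_max(rows):
    """Order ids by their latest date, keeping each id's rows together."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE tbl (id INTEGER, dt TEXT)")
    con.executemany("INSERT INTO tbl VALUES (?, ?)", rows)
    query = """
        SELECT id, dt
        FROM (
            SELECT id, dt, MAX(dt) OVER (PARTITION BY id) AS max_dt
            FROM tbl
        ) AS ranked
        ORDER BY max_dt DESC,  -- groups with the latest date first
                 id,           -- tie-breaker if two ids share a max date
                 dt DESC       -- newest first within each id
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

for row in sort_by_group_max(SAMPLE_ROWS):
    print(row)
```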

Maybe not the most efficient, but this seems to give you the output you desire:
-- a table variable must be named with @, not #
declare @tempTbl table (id int, dataorder int)
insert into @tempTbl
select
    id,
    ROW_NUMBER() over (order by max(dt) desc)
from
    #tbl
group by id
select
    tbl2.id,
    tbl2.dt
from
    @tempTbl tbl1 left join #tbl tbl2 on tbl1.id = tbl2.id
order by
    tbl1.dataorder,
    tbl2.dt desc

Date Comparison of Two Tables in SQL SERVER

I had this Data,
Table One :
EmpID Date Absent
1 01/01/2018 1
1 01/02/2018 1
1 02/05/2018 1
1 03/25/2018 1
1 04/01/2018 0
1 05/02/2018 1
1 06/03/2018 1
Table Two
ID Amount DateEffective
1 5.00 02/06/2018
2 3.00 05/02/2018
3 10.00 06/03/2018
Desired Output
EmpID Month Year Absent Penalty
1 January 2018 2 5.00
1 February 2018 1 5.00
1 March 2018 1 3.00
1 April 2018 0 3.00
1 May 2018 1 13.00
1 June 2018 1 10.00
This is my Code
SELECT { fn MONTHNAME(one.Date) } AS MonthName, YEAR(one.Date) AS Year, SUM(one.Absent) AS Absent,
(
SELECT top 1 two.DailyRate
FROM table_two as two
WHERE EmpID = '1'
AND one.Date <= two.EffectivityDate
)
FROM table_one as one
WHERE EmpID = '1'
GROUP BY { fn MONTHNAME(one.Date) }, MONTH(one.Date), YEAR(one.DTRDate)
ORDER BY Year(one.Date),month(one.Date)
and it shows an error :
Column 'one.Date' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause
please help for this issue...
Thanks

Try this :
SELECT
one.EmpID
,DATENAME(MONTH,one.Date) AS [MonthName]
,YEAR(one.Date) AS [Year]
,SUM(one.Absent) AS [Absent]
,(SELECT top 1 two.Amount
FROM table_two as two
WHERE two.ID = one.EmpID
AND YEAR(two.DateEffective) >= YEAR(one.Date)
AND MONTH(two.DateEffective) >=MONTH(one.Date)
) AS [Penalty]
FROM table_one as one
WHERE
one.EmpID = '1'
GROUP BY one.EmpID,DATENAME(MONTH,one.Date), MONTH(one.Date), YEAR(one.Date)
ORDER BY Year(one.Date),month(one.Date)

As I understand it, you can do this:
select e.EmpID
,datename(month,e.Date)[month]
,year(e.Date) [year]
,sum(e.Absent) as [Abscount]
,a.Amount
from
empl e left join abs a
on datename(month,e.Date)=DATENAME(month,a.DateEffective)
group by e.EmpID,DATENAME(MONTH,e.Date), MONTH(e.Date), YEAR(e.Date) , a.Amount
order by Abscount desc
Let me know if anything needs clarifying.

Is this helpful?
Create Table #TabOne(EmpID int,[Date] Date,[Absent] Bit)
Create Table #TabTwo(ID int,Amount float,DateEffective Date)
Insert into #TabOne
SELECT 1,'01/01/2018',1 Union All
SELECT 1,'01/02/2018',1 Union All
SELECT 1,'02/05/2018',1 Union All
SELECT 1,'03/25/2018',1 Union All
SELECT 1,'04/01/2018',0 Union All
SELECT 1,'05/02/2018',1 Union All
SELECT 1,'06/03/2018',1
Insert into #TabTwo
Select 1,5.00 ,'02/06/2018' Union All
Select 2,3.00 ,'05/02/2018' Union All
Select 3,10.00,'06/03/2018'
;with cte1
As
(
Select One.EmpID,MONTH(one.[Date]) As [mon],YEAR(one.[Date]) As [Year],two.Amount,one.[Absent],
ROW_NUMBER() OVER(partition by One.EmpID,One.[Date] order by DATEDIFF(dd,two.DateEffective,one.[Date]) desc) as rn
from #TabOne one
LEFT JOIN #TabTwo two on one.[Date]<=two.DateEffective
)
Select EmpID,DATENAME(month, DATEADD(month, [mon]-1, CAST('2008-01-01' AS datetime))) As [Month],
[Year],SUM(CASE WHEN [Absent]=0 then 0 ELSE 1 END) As [Absent] ,MAX(Amount) As Penalty
from cte1
where rn=1
Group by EmpID,[Year],[mon]
order by EmpID,[Year],[mon]
Drop Table #TabOne
Drop Table #TabTwo

If Value is present in two consecutive months , display only one month in sql

I want to check IDs across consecutive months: if the same ID is present in two consecutive months, count that ID only for the first month.
If the IDs are not in consecutive months, show the distinct IDs grouped by start date month (only the start date is considered).
For example, ID 1 is present in start date months January and February, so it is counted once, in January. IDs 2 and 3, however, are present in January and March, and in February and May, respectively,
so each of them should be counted in both of its months.
Current Data
Table1:
ID StartDate EndDate
1 2017-01-12 2017-01-28
1 2017-01-19 2017-01-28
1 2017-01-29 2017-02-11
1 2017-02-01 2017-02-11
1 2017-02-19 2017-02-24
2 2017-01-12 2017-01-28
2 2017-01-19 2017-01-28
2 2017-03-09 2017-03-20
3 2017-02-12 2017-02-28
3 2017-02-19 2017-02-28
3 2017-05-05 2017-05-29
3 2017-05-09 2017-05-29
I tried the logic below, but I know I am missing something here.
select t.* from Table1 t
join Table1 t t1
on t1.ID=t.ID
and datepart(mm,t.StartDate)<> datepart(mm,t1.StartDate)+1
Expected Result:
DistinctCount StartDateMonth(In Numbers)
1 1(Jan)
2 1(Jan)
2 3(March)
3 2(Feb)
3 5(May)
Any help is appreciated!

Here's my solution. The thinking for this is:
1) Round all the dates to the first of the month, then work with the distinct dataset of (ID, StartDateRounded). From your dataset, the result should look like this:
ID StartDateRounded
1 2017-01-01
1 2017-02-01
2 2017-01-01
2 2017-03-01
3 2017-02-01
3 2017-05-01
2) From this consolidated dataset, find all records by ID that do not have a record for the previous month (which means it's not a consecutive month and thus is a beginning of a new data point). This is your final dataset
with DatesTable AS
(
SELECT DISTINCT ID
,DATEADD(month,DateDiff(month,0,StartDate),0) StartDateRounded
,DATEADD(month,DateDiff(month,0,StartDate)+1,0) StartDateRoundedPlusOne
FROM Table1
)
SELECT t1.ID, DatePart(month,t1.StartDateRounded) AS StartDateMonth
FROM DatesTable t1
LEFT JOIN DatesTable t2
ON t1.ID = t2.ID
AND t1.StartDateRounded = t2.StartDateRoundedPlusOne
WHERE t2.ID IS NULL; --Verify no record exists for prior month
sqlfiddler for reference. Let me know if this helps
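The two steps above can be tried out with a short script. It is a sketch under assumptions: Python's built-in sqlite3 module replaces SQL Server, so the rounding to the first of the month is swapped for a running month index (year * 12 + month, an illustrative choice), and the table and column names are lowercase stand-ins for Table1.

```python
import sqlite3

SAMPLE_ROWS = [
    (1, "2017-01-12"), (1, "2017-01-19"), (1, "2017-01-29"),
    (1, "2017-02-01"), (1, "2017-02-19"),
    (2, "2017-01-12"), (2, "2017-01-19"), (2, "2017-03-09"),
    (3, "2017-02-12"), (3, "2017-02-19"), (3, "2017-05-05"), (3, "2017-05-09"),
]

def first_months_of_runs(rows):
    """Return (id, month) pairs that have no row in the previous month."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE table1 (id INTEGER, start_date TEXT)")
    con.executemany("INSERT INTO table1 VALUES (?, ?)", rows)
    query = """
        WITH months AS (
            -- Step 1: one row per (id, month), as a running month index
            SELECT DISTINCT id,
                   CAST(strftime('%Y', start_date) AS INTEGER) * 12
                 + CAST(strftime('%m', start_date) AS INTEGER) AS month_index
            FROM table1
        )
        -- Step 2: keep months with no match in the month before
        SELECT cur.id, (cur.month_index - 1) % 12 + 1 AS start_month
        FROM months AS cur
        LEFT JOIN months AS prev
               ON prev.id = cur.id
              AND prev.month_index = cur.month_index - 1
        WHERE prev.id IS NULL
        ORDER BY cur.id, cur.month_index
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

print(first_months_of_runs(SAMPLE_ROWS))
```

Because the index includes the year, a December followed by a January is treated as consecutive, which a comparison on the month number alone would miss.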

You just need to use LAG in the inner query to compare values between rows, apply the logic in question in the middle query, and then do a final select.
/*SAMPLE DATA*/
create table #table1
(
ID int not null
, StartDate date not null
, EndDate date null
)
insert into #table1
values (1, '2017-01-12', '2017-01-28')
, (1, '2017-01-19', '2017-01-28')
, (1, '2017-01-29', '2017-02-11')
, (1, '2017-02-01', '2017-02-11')
, (1, '2017-02-19', '2017-02-24')
, (2, '2017-01-12', '2017-01-28')
, (2, '2017-01-19', '2017-01-28')
, (2, '2017-03-09', '2017-03-20')
, (3, '2017-02-12', '2017-02-28')
, (3, '2017-02-19', '2017-02-28')
, (3, '2017-05-05', '2017-05-29')
, (3, '2017-05-09', '2017-05-29')
/*ANSWER*/
--Final Select
select c.ID
, c.StartDateMonth
from (
--Compare record values to rule a record in/out based on OP's logic
select b.ID
, b.StartDateMonth
, case when b.StartDateMonth = b.StartDateMonthPrev then 0 --still the same month?
when b.StartDateMonth = b.StartDateMonthPrev + 1 then 0 --immediately prior month?
when b.StartDateMonth = 1 and b.StartDateMonthPrev = 12 then 0 --Dec/Jan combo
else 1
end as IncludeFlag
from (
--pull StartDateMonth of previous record into current record
select a.ID
, datepart(mm, a.StartDate) as StartDateMonth
, lag(datepart(mm, a.StartDate), 1, NULL) over (partition by a.ID order by a.StartDate asc) as StartDateMonthPrev
from #table1 as a
) as b
) as c
where 1=1
and c.IncludeFlag = 1
Output:
+----+----------------+
| ID | StartDateMonth |
+----+----------------+
| 1 | 1 |
| 2 | 1 |
| 2 | 3 |
| 3 | 2 |
| 3 | 5 |
+----+----------------+

Try the query below:
SELECT ID,MIN(YEARMONTH) AS YEARMONTH
FROM (
SELECT ID
,YEAR([StartDate])*100+MONTH([StartDate]) AS YEARMONTH
,LAG(YEAR([StartDate])*100+MONTH([StartDate]))
OVER(ORDER BY ID) AS PREVYEARMONTH
,ROW_NUMBER() OVER(ORDER BY ID) AS ROW_NO
FROM #Table1
GROUP BY ID,((YEAR([StartDate])*100)+MONTH([StartDate]))
) AS T
GROUP BY ID
,(CASE WHEN YEARMONTH - PREVYEARMONTH > 1 THEN ROW_NO ELSE 0 END)
ORDER BY ID
Output:
ID YEARMONTH
1 201701
2 201701
2 201703
3 201702
3 201705

Thank you all. Most of the logic seemed to work, but I tried just the query below and it was good enough for me.
SELECT t1.ID, DatePart(month,t1.Startdate) AS StartDateMonth
FROM DatesTable t1
LEFT JOIN DatesTable t2
ON t1.ID = t2.ID
AND DatePart(month,t1.Startdate) = DatePart(month,t2.Startdate)+1
WHERE t2.ID IS NULL;
Thanks again

OK, I wrote my first query without checking it, believing it would work correctly. This is my updated version, which should be faster than the other solutions:
select
    id
    , (min(st) - 1) % 12 + 1 --start month (1 to 12, so December is not returned as 0)
    , (min(st) - 1) / 12 + 1 --start year, in case you need it
from (
    select
        id, st, gr = st - row_number() over (partition by ID order by st)
    from (
        select distinct
            ID, st = (year(StartDate) - 1) * 12 + month(StartDate)
        from
            #table2
    ) t
) t
group by id, gr
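The grouping trick here (month number minus row number stays constant within a run of consecutive months) can be checked with a small script. This is a sketch assuming Python's built-in sqlite3 module (SQLite 3.25 or later); the table name table1 and the extra ID 4 row pair spanning December and January are illustrative additions, not part of the question's data.

```python
import sqlite3

SAMPLE_ROWS = [
    (1, "2017-01-12"), (1, "2017-01-29"), (1, "2017-02-19"),
    (2, "2017-01-12"), (2, "2017-03-09"),
    (3, "2017-02-12"), (3, "2017-05-05"),
    (4, "2017-12-15"), (4, "2018-01-03"),   # assumed extra case: year boundary
]

def run_starts(rows):
    """Return (id, start_month, start_year) for each run of consecutive months."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE table1 (id INTEGER, start_date TEXT)")
    con.executemany("INSERT INTO table1 VALUES (?, ?)", rows)
    query = """
        WITH months AS (
            SELECT DISTINCT id,
                   (CAST(strftime('%Y', start_date) AS INTEGER) - 1) * 12
                 + CAST(strftime('%m', start_date) AS INTEGER) AS st
            FROM table1
        ),
        grouped AS (
            -- st minus its row number is the same for every month in one run
            SELECT id, st,
                   st - ROW_NUMBER() OVER (PARTITION BY id ORDER BY st) AS gr
            FROM months
        )
        SELECT id,
               (MIN(st) - 1) % 12 + 1 AS start_month,  -- December stays 12
               (MIN(st) - 1) / 12 + 1 AS start_year
        FROM grouped
        GROUP BY id, gr
        ORDER BY id, MIN(st)
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

for row in run_starts(SAMPLE_ROWS):
    print(row)
```

Subtracting 1 before the modulo and the division keeps December as month 12 of its own year instead of month 0 of the next one.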
