I want to check IDs across consecutive months. If the same ID is present in two consecutive months, count that ID only in the first month.
If an ID's months are not consecutive, show the distinct IDs grouped by start-date month. (Only the start date is considered.)
For example, ID 1 is present in start-date months January and February, so the distinct count of this ID will be 1 in January. However, IDs 2 and 3 are
present in January and March, and in February and May, respectively, so I would like to see ID 2 counted in both January and March, and ID 3 in both February and May.
Current Data
Table1:
ID StartDate EndDate
1 2017-01-12 2017-01-28
1 2017-01-19 2017-01-28
1 2017-01-29 2017-02-11
1 2017-02-01 2017-02-11
1 2017-02-19 2017-02-24
2 2017-01-12 2017-01-28
2 2017-01-19 2017-01-28
2 2017-03-09 2017-03-20
3 2017-02-12 2017-02-28
3 2017-02-19 2017-02-28
3 2017-05-05 2017-05-29
3 2017-05-09 2017-05-29
I tried the logic below, but I know I am missing something here.
select t.* from Table1 t
join Table1 t1
on t1.ID=t.ID
and datepart(mm,t.StartDate)<> datepart(mm,t1.StartDate)+1
Expected Result:
DistinctCount StartDateMonth(In Numbers)
1 1(Jan)
2 1(Jan)
2 3(March)
3 2(Feb)
3 5(May)
Any help is appreciated!
Here's my solution. The thinking for this is:
1) Round all the dates to the first of the month, then work with the distinct dataset of (ID, StartDateRounded). From your dataset, the result should look like this:
ID StartDateRounded
1 2017-01-01
1 2017-02-01
2 2017-01-01
2 2017-03-01
3 2017-02-01
3 2017-05-01
2) From this consolidated dataset, find all records by ID that do not have a record for the previous month (which means the month is not consecutive and is therefore the beginning of a new data point). This is your final dataset:
with DatesTable AS
(
SELECT DISTINCT ID
,DATEADD(month,DateDiff(month,0,StartDate),0) StartDateRounded
,DATEADD(month,DateDiff(month,0,StartDate)+1,0) StartDateRoundedPlusOne
FROM Table1
)
SELECT t1.ID, DatePart(month,t1.StartDateRounded) AS StartDateMonth
FROM DatesTable t1
LEFT JOIN DatesTable t2
ON t1.ID = t2.ID
AND t1.StartDateRounded = t2.StartDateRoundedPlusOne
WHERE t2.ID IS NULL; --Verify no record exists for prior month
SQL Fiddle for reference. Let me know if this helps.
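The two steps above can also be traced outside the database. Here is a minimal Python sketch, assuming the sample rows from the question are held as (ID, StartDate) tuples; the helper name `run_starts` is hypothetical and not part of the answer. It uses a year-aware month index so that December followed by January still counts as consecutive.

```python
from datetime import date

# Sample (ID, StartDate) pairs from the question; EndDate is not needed here.
rows = [
    (1, date(2017, 1, 12)), (1, date(2017, 1, 19)), (1, date(2017, 1, 29)),
    (1, date(2017, 2, 1)), (1, date(2017, 2, 19)),
    (2, date(2017, 1, 12)), (2, date(2017, 1, 19)), (2, date(2017, 3, 9)),
    (3, date(2017, 2, 12)), (3, date(2017, 2, 19)),
    (3, date(2017, 5, 5)), (3, date(2017, 5, 9)),
]


def run_starts(rows):
    """Return (ID, month) for each month whose previous month has no row for that ID."""
    # Step 1: reduce every date to a month index and de-duplicate per ID.
    months = {(id_, d.year * 12 + d.month - 1) for id_, d in rows}
    # Step 2: keep a month only if the same ID has no entry for the month before it.
    starts = [(id_, m) for id_, m in months if (id_, m - 1) not in months]
    # Convert the month index back to a calendar month number (1..12).
    return sorted((id_, m % 12 + 1) for id_, m in starts)


print(run_starts(rows))  # [(1, 1), (2, 1), (2, 3), (3, 2), (3, 5)]
```

The set lookup plays the role of the LEFT JOIN ... WHERE t2.ID IS NULL anti-join in the query.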
You just need to use LAG in the inner query to compare values between rows, apply the logic in question in the middle query, and then do a final select.
/*SAMPLE DATA*/
create table #table1
(
ID int not null
, StartDate date not null
, EndDate date null
)
insert into #table1
values (1, '2017-01-12', '2017-01-28')
, (1, '2017-01-19', '2017-01-28')
, (1, '2017-01-29', '2017-02-11')
, (1, '2017-02-01', '2017-02-11')
, (1, '2017-02-19', '2017-02-24')
, (2, '2017-01-12', '2017-01-28')
, (2, '2017-01-19', '2017-01-28')
, (2, '2017-03-09', '2017-03-20')
, (3, '2017-02-12', '2017-02-28')
, (3, '2017-02-19', '2017-02-28')
, (3, '2017-05-05', '2017-05-29')
, (3, '2017-05-09', '2017-05-29')
/*ANSWER*/
--Final Select
select c.ID
, c.StartDateMonth
from (
--Compare record values to rule a record in/out based on OP's logic
select b.ID
, b.StartDateMonth
, case when b.StartDateMonth = b.StartDateMonthPrev then 0 --still the same month?
when b.StartDateMonth = b.StartDateMonthPrev + 1 then 0 --immediately prior month?
when b.StartDateMonth = 1 and b.StartDateMonthPrev = 12 then 0 --Dec/Jan combo
else 1
end as IncludeFlag
from (
--pull StartDateMonth of previous record into current record
select a.ID
, datepart(mm, a.StartDate) as StartDateMonth
, lag(datepart(mm, a.StartDate), 1, NULL) over (partition by a.ID order by a.StartDate asc) as StartDateMonthPrev
from #table1 as a
) as b
) as c
where 1=1
and c.IncludeFlag = 1
Output:
+----+----------------+
| ID | StartDateMonth |
+----+----------------+
| 1 | 1 |
| 2 | 1 |
| 2 | 3 |
| 3 | 2 |
| 3 | 5 |
+----+----------------+
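Note that comparing month numbers alone treats, say, January 2017 and February 2018 as adjacent. Below is a rough Python sketch of the same previous-row comparison, assuming a year-aware month index (`year * 12 + month`) instead of the bare month number; the function name `flag_run_starts` is made up for illustration.

```python
from datetime import date
from itertools import groupby


def flag_run_starts(rows):
    """Mimic LAG(...) OVER (PARTITION BY ID ORDER BY StartDate) on (ID, StartDate) tuples."""
    result = []
    # Sorting by (ID, StartDate) gives the partition and ordering the window would use.
    for id_, group in groupby(sorted(rows), key=lambda r: r[0]):
        prev = None  # LAG yields NULL for the first row of each partition
        for _, start in group:
            cur = start.year * 12 + start.month
            # A row starts a new run when there is no previous row,
            # or when the previous row is more than one month back.
            if prev is None or cur - prev > 1:
                result.append((id_, start.year, start.month))
            prev = cur
    return result


print(flag_run_starts([(1, date(2017, 1, 12)), (1, date(2018, 2, 3))]))
# [(1, 2017, 1), (1, 2018, 2)]
```

With the month index, the separate Dec/Jan case is no longer needed, because December and the following January differ by exactly 1.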
Try the query below:
SELECT ID,MIN(YEARMONTH) AS YEARMONTH
FROM (
SELECT ID
,YEAR([StartDate])*100+MONTH([StartDate]) AS YEARMONTH
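,LAG(YEAR([StartDate])*100+MONTH([StartDate]))
OVER(PARTITION BY ID ORDER BY YEAR([StartDate])*100+MONTH([StartDate])) AS PREVYEARMONTH --previous month of the same ID
,ROW_NUMBER() OVER(ORDER BY ID, YEAR([StartDate])*100+MONTH([StartDate])) AS ROW_NO --deterministic numbering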
FROM #Table1
GROUP BY ID,((YEAR([StartDate])*100)+MONTH([StartDate]))
) AS T
GROUP BY ID
,(CASE WHEN YEARMONTH - PREVYEARMONTH > 1 THEN ROW_NO ELSE 0 END)
ORDER BY ID
Output:
ID YEARMONTH
1 201701
2 201701
2 201703
3 201702
3 201705
Thank you all. Most of the logic seemed to work, but I tried just the one below and was happy with it.
SELECT t1.ID, DatePart(month,t1.Startdate) AS StartDateMonth
FROM DatesTable t1
LEFT JOIN DatesTable t2
ON t1.ID = t2.ID
AND DatePart(month,t1.Startdate) = DatePart(month,t2.Startdate)+1
WHERE t2.ID IS NULL;
Thanks again
OK, I wrote my first query without checking, believing it would work correctly. This is my updated version, which should be faster than the other solutions:
select
id
, (min(st) - 1) % 12 + 1 --this will return start month (1-12, including December)
, (min(st) - 1) / 12 + 1 --this will return year, just in case you need it
from (
select
id, st, gr = st - row_number() over (partition by ID order by st)
from (
select
distinct ID, st = (year(StartDate) - 1) * 12 + month(StartDate)
from
#table2
) t
) t
group by id, gr
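The query relies on a property that is easy to check by hand: within a run of consecutive months, the month index minus its row number stays constant, so that difference can serve as a group key. A small Python sketch of the idea for a single ID, assuming the same encoding `st = (year - 1) * 12 + month`; the function names are illustrative only.

```python
from itertools import groupby


def islands(month_indexes):
    """Split distinct month indexes into runs of consecutive values."""
    ordered = sorted(set(month_indexes))
    # Inside a run of consecutive months, value - position is constant.
    keyed = [(m - pos, m) for pos, m in enumerate(ordered, start=1)]
    return [[m for _, m in grp] for _, grp in groupby(keyed, key=lambda k: k[0])]


def to_year_month(st):
    """Invert st = (year - 1) * 12 + month; subtracting 1 first keeps December at 12."""
    return (st - 1) // 12 + 1, (st - 1) % 12 + 1


# Nov 2016, Dec 2016, Jan 2017 form one run; Mar 2017 starts another.
st_values = [24191, 24192, 24193, 24195]
print([to_year_month(run[0]) for run in islands(st_values)])
# [(2016, 11), (2017, 3)]
```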
I have a table that looks like the below
Date | ID | Period | ArchivedBy | ArchivedFlag | Value
2018-01-20 12:23 |23344 | Q1 | NULL | NULL | 200
2018-01-20 12:20 |23344 | NULL | P.Tills | 1 | NULL
2018-01-20 12:19 |23344 | NULL | NULL | 1 | NULL
This table represents all edits made to an agreement (each new edit gets its own row). If a value hasn't been changed at all, it will be NULL.
So ideally the above would look like the following:
Date | ID | Period | ArchivedBy | ArchivedFlag | Value
2018-01-20 |23344 | Q1 | P.Tills | 1 | 200
The returned row should show the latest state of the agreement for the given date. For the date in my example (2018-01-20), a single row would be returned that combines all changes made throughout the day and shows how the agreement looks after them.
I hope this makes sense.
Thank you!
Here is one way using Row_Number and Group by
SELECT [Date] = Cast([Date] AS DATE),
ID,
Period = Max(Period),
ArchivedBy = Max(ArchivedBy),
ArchivedFlag = Max(ArchivedFlag),
[Value] = Max(CASE WHEN rn = 1 THEN [Value] END) -- Value taken from the latest row of the day
FROM (SELECT *,
Rn = Row_number()OVER(partition BY Cast([Date] AS DATE), ID ORDER BY [Date] DESC)
FROM Yourtable)a
GROUP BY Cast([Date] AS DATE),
ID
I would propose 2 solutions.
Simple
For each day select top 1 NOT NULL value:
SELECT G.ID, G.GD Date, Period.*, ArchivedBy.*, Value.* FROM
(SELECT DISTINCT ID, CAST(Date AS Date) GD FROM T) G
-- T.ID = G.ID keeps each lookup within the same agreement
CROSS APPLY (SELECT TOP 1 Period FROM T WHERE Period IS NOT NULL AND T.ID=G.ID AND CAST(Date AS Date)=G.GD ORDER BY Date DESC) Period
CROSS APPLY (SELECT TOP 1 ArchivedBy FROM T WHERE ArchivedBy IS NOT NULL AND T.ID=G.ID AND CAST(Date AS Date)=G.GD ORDER BY Date DESC) ArchivedBy
CROSS APPLY (SELECT TOP 1 Value FROM T WHERE Value IS NOT NULL AND T.ID=G.ID AND CAST(Date AS Date)=G.GD ORDER BY Date DESC) Value
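To make the intent of "latest non-NULL value per column" concrete, here is a hedged Python sketch over the three sample rows from the question. The tuple layout and the function name `collapse` are assumptions made for illustration.

```python
from datetime import date, datetime

# (Date, ID, Period, ArchivedBy, ArchivedFlag, Value), as in the question.
edits = [
    (datetime(2018, 1, 20, 12, 23), 23344, "Q1", None, None, 200),
    (datetime(2018, 1, 20, 12, 20), 23344, None, "P.Tills", 1, None),
    (datetime(2018, 1, 20, 12, 19), 23344, None, None, 1, None),
]


def collapse(edits):
    """Fold all edits of one (day, ID) into a single row of latest non-NULL values."""
    state = {}
    # Apply edits oldest first so that later non-NULL values overwrite earlier ones.
    for ts, id_, *cols in sorted(edits, key=lambda e: e[0]):
        current = state.setdefault((ts.date(), id_), [None] * len(cols))
        for i, value in enumerate(cols):
            if value is not None:
                current[i] = value
    return {key: tuple(values) for key, values in state.items()}


print(collapse(edits))
# {(datetime.date(2018, 1, 20), 23344): ('Q1', 'P.Tills', 1, 200)}
```

Each CROSS APPLY ... TOP 1 ... ORDER BY Date DESC in the query answers the same question for one column at a time.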
Optimized (intuitively; not tested)
Use varbinary sorting rules and aggregation, manually order NULLs:
SELECT CAST(Date AS Date), ID,
CAST(SUBSTRING(MAX(Arch),9, LEN(MAX(Arch))) AS varchar(10)) ArchivedBy --unbox
--other columns
FROM
(
SELECT Date, ID,
CAST(CASE WHEN ArchivedBy IS NOT NULL THEN ROW_NUMBER() OVER (PARTITION BY CAST(Date AS Date) ORDER BY Date) ELSE 0 END AS varbinary(MAX))+CAST(ArchivedBy AS varbinary(MAX)) Arch --box
--other columns
FROM T
) Tab
GROUP BY ID, CAST(Date AS Date)
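The box/unbox trick may be easier to follow in isolation: prefix each non-NULL value with a fixed-width sort key, take the maximum, then strip the prefix. A minimal Python sketch of that idea, assuming an 8-byte big-endian row number as the prefix (mirroring a bigint cast to varbinary); `latest_non_null` is a hypothetical helper name.

```python
def latest_non_null(values_in_date_order):
    """Return the last non-None string using a 'boxed' maximum instead of a sort."""
    boxed = [
        n.to_bytes(8, "big") + v.encode("utf-8")  # box: 8-byte position + payload
        for n, v in enumerate(values_in_date_order, start=1)
        if v is not None  # NULL rows never take part in the MAX
    ]
    if not boxed:
        return None
    # Byte strings compare left to right, so the largest prefix (latest row) wins.
    return max(boxed)[8:].decode("utf-8")  # unbox: drop the 8-byte prefix


print(latest_non_null([None, "P.Tills", None]))  # P.Tills
print(latest_non_null(["Z.Last", "A.First"]))    # A.First
```

The second call shows that the position decides the winner, not the alphabetical order of the payload.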
I am trying to pivot the table into the layout shown in the following image, but I failed. The final result I need is as follows:
The query I used is:
SELECT COUNT(T.TTOutID) AS Currenct, TTOutTargetTrxnCount AS Targets,
B.BranchCode ,
ROW_NUMBER() OVER (Order BY COUNT(T.TTOutID) DESC) AS RANK,
CAST((CAST (COUNT(T.TTOutID) AS DECIMAL)/CAST (TTOutTargetTrxnCount AS decimal))*100 AS decimal (4,2)) AS Percentages ,
TTOutTargetTrxnCount-COUNT(T.TTOutID) as Difference
FROM ALX_SalesTargets S
LEFT JOIN ALX_TTOut T ON S.BranchID=T.BranchID
LEFT JOIN ALX_Branches B ON S.BranchID=B.BranchID
Group BY S.TTOutTargetTrxnCount,B.BranchCode
I also made a temp table to help:
CREATE TABLE #Tempsample
(
currenct int ,
targets bigint,
branchcode nvarchar(128),
rank int,
percentage decimal(5,2), -- plain decimal defaults to scale 0 and would round 4.94 to 5
difference int
);
INSERT INTO #Tempsample
(currenct, targets,branchcode,rank,percentage,difference)
VALUES
('131', '2650','EXB', '1','4.94', '2519'),
('25', '3500','MHQ', '2','0.71', '3475'),
('3', '850','MNM', '3','0.35', '847')
Use cross apply(values ...) to unpivot your data, and conditional aggregation to re-pivot it:
select
u.Attribute
, EXB = max(case when branchcode = 'EXB' then value end)
, MHQ = max(case when branchcode = 'MHQ' then value end)
, MNM = max(case when branchcode = 'MNM' then value end)
, Totals = sum(case when u.Attribute = 'rank' then null else value end)
from #tempsample t
cross apply (values
('currenct',currenct)
, ('targets',targets)
, ('difference',difference)
, ('rank',rank)
) u (attribute,value)
group by u.Attribute
order by (case when u.Attribute='rank' then 1 else 0 end)
rextester demo: http://rextester.com/WLES86332
returns:
+------------+------+------+-----+--------+
| Attribute | EXB | MHQ | MNM | Totals |
+------------+------+------+-----+--------+
| currenct | 131 | 25 | 3 | 159 |
| difference | 2519 | 3475 | 847 | 6841 |
| targets | 2650 | 3500 | 850 | 7000 |
| rank | 1 | 2 | 3 | NULL |
+------------+------+------+-----+--------+
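For readers who want to see the unpivot and re-pivot steps separately, here is a small Python sketch using the same sample values. The dictionary layout is an assumption chosen for readability, not something taken from the answer.

```python
sample = [
    {"branchcode": "EXB", "currenct": 131, "targets": 2650, "rank": 1, "difference": 2519},
    {"branchcode": "MHQ", "currenct": 25, "targets": 3500, "rank": 2, "difference": 3475},
    {"branchcode": "MNM", "currenct": 3, "targets": 850, "rank": 3, "difference": 847},
]
attributes = ["currenct", "targets", "difference", "rank"]

# Unpivot: one (attribute, branchcode, value) triple per cell,
# which is what cross apply (values ...) produces.
long_rows = [(a, row["branchcode"], row[a]) for row in sample for a in attributes]

# Re-pivot: one output row per attribute, one key per branch code.
pivoted = {a: {} for a in attributes}
for attribute, branch, value in long_rows:
    pivoted[attribute][branch] = value

for attribute, cells in pivoted.items():
    # Summing ranks is meaningless, so the total stays empty for that row.
    cells["Totals"] = None if attribute == "rank" else sum(cells.values())

for attribute, cells in pivoted.items():
    print(attribute, cells)
```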
You can try pivot like this:
select 'Targets' as [Header], * from (
select targets, branchcode from #Tempsample ) a
pivot (sum(targets) for branchcode in ([EXB], [MHQ], [MNM])) p
union all
select 'Current' as [Header],* from (
select currenct, branchcode from #Tempsample ) a
pivot (sum(currenct) for branchcode in ([EXB], [MHQ], [MNM])) p
union all
select 'Difference' as [Header], * from (
select difference, branchcode from #Tempsample ) a
pivot (sum(difference) for branchcode in ([EXB], [MHQ], [MNM])) p
union all
select 'Rank' as [Header], * from (
select Rank, branchcode from #Tempsample ) a
pivot (MAX(rank) for branchcode in ([EXB], [MHQ], [MNM])) p
I have this table:
lnumber | lname | bez_gem
---------+----------------+------------------------------
1 | name1 | Berg b.Neumarkt i.d.OPf.
1 | name1 | Altdorf b.Nürnberg
2 | name2 | Berg b.Neumarkt i.d.OPf.
2 | name2 | Altdorf b.Nürnberg
3 | name3 | Mainleus
3 | name3 | Weismain
4 | name4 | Weismain
4 | name4 | Mainleus
The code for the query is:
WITH double AS (
SELECT
partnumber,
bez_gem
FROM accumulation a, municipality b
WHERE ST_Intersects(a.geom, b.geom)
AND EXISTS (
SELECT
lnumber
FROM mun_more_than_once c
WHERE a.partnumber=c.lnumber)
ORDER BY partnumber)
SELECT
landslide.lnumber,
lname,
bez_gem
FROM double, landslide
WHERE double.partnumber=landslide.lnumber
ORDER BY lnumber
I want to transpose it into this format:
lnumber | lname | bez_gem1 | bez_gem2
---------+----------------+------------------------------------------------
1 | name1 | Berg b.Neumarkt i.d.OPf. | Altdorf b.Nürnberg
2 | name2 | Berg b.Neumarkt i.d.OPf. | Altdorf b.Nürnberg
It depends. If you always have two bez_gem per lnumber, you can simply use:
SELECT lnumber, lname
, min(bez_gem) AS bez_gem1
, max(bez_gem) AS bez_gem2
FROM test
GROUP BY 1,2
ORDER BY 1;
SQL Fiddle.
Note that the order of peers is undefined in your question. Collation rules (alphabetical order) decide in my example.
For an actual cross tabulation you would use the crosstab() function from the additional module tablefunc. But your table is missing a category name (no indication which row holds bez_gem1 and which bez_gem2). Explanation, details and links:
PostgreSQL Crosstab Query
SQLFiddle
Data
-- drop table if exists test;
create table test (lnumber int, lname varchar, bez_gem varchar);
insert into test values
(1 , 'name1' , 'Berg b.Neumarkt i.d.OPf.'),
(1 , 'name1' , 'Altdorf b.Nürnberg'),
(2 , 'name2' , 'Berg b.Neumarkt i.d.OPf.'),
(2 , 'name2' , 'Altdorf b.Nürnberg'),
(3 , 'name3' , 'Mainleus'),
(3 , 'name3' , 'Weismain'),
(4 , 'name4' , 'Weismain'),
(4 , 'name4' , 'Mainleus'),
(4 , 'name4' , 'XXMainleus')
;
Query
select
lnumber,
lname,
max(case when rn = 1 then bez_gem end) as bez_gem1,
max(case when rn = 2 then bez_gem end) as bez_gem2,
max(case when rn = 3 then bez_gem end) as bez_gem3
from
(
select
*,
row_number() over(partition by lnumber, lname) rn -- no ORDER BY, so numbering within a group is arbitrary
from
test
) a
group by
lnumber,
lname
Result
1;name1;Berg b.Neumarkt i.d.OPf.;Altdorf b.Nürnberg;
2;name2;Berg b.Neumarkt i.d.OPf.;Altdorf b.Nürnberg;
3;name3;Mainleus;Weismain;
4;name4;Weismain;Mainleus;XXMainleus
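The row_number() plus conditional aggregation pattern amounts to numbering the values inside each group and placing value n into column n. A short Python sketch of that, assuming rows arrive as (lnumber, lname, bez_gem) tuples; `pivot_numbered` is an illustrative name. Unlike the SQL, where the numbering is arbitrary without an ORDER BY, this sketch keeps the input order.

```python
from collections import defaultdict

rows = [
    (1, "name1", "Berg b.Neumarkt i.d.OPf."), (1, "name1", "Altdorf b.Nürnberg"),
    (2, "name2", "Berg b.Neumarkt i.d.OPf."), (2, "name2", "Altdorf b.Nürnberg"),
    (3, "name3", "Mainleus"), (3, "name3", "Weismain"),
    (4, "name4", "Weismain"), (4, "name4", "Mainleus"), (4, "name4", "XXMainleus"),
]


def pivot_numbered(rows, width):
    """Spread each group's values over `width` columns, padding with None."""
    groups = defaultdict(list)
    for lnumber, lname, bez_gem in rows:
        # The position in the list plays the role of rn.
        groups[(lnumber, lname)].append(bez_gem)
    result = []
    for key, values in sorted(groups.items()):
        kept = values[:width]  # values beyond `width` are dropped, as in the query
        result.append(key + tuple(kept) + (None,) * (width - len(kept)))
    return result


for line in pivot_numbered(rows, 3):
    print(line)
```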
Old Answer
If you have only two possible rows for every lnumber (you should add this important info to your question), you can simply use min and max:
WITH double AS (
SELECT
partnumber,
bez_gem
FROM accumulation a, municipality b
WHERE ST_Intersects(a.geom, b.geom)
AND EXISTS (
SELECT
lnumber
FROM mun_more_than_once c
WHERE a.partnumber=c.lnumber)
ORDER BY partnumber)
SELECT
landslide.lnumber,
lname,
min(bez_gem) as bez_gem1,
max(bez_gem) as bez_gem2
FROM double, landslide
WHERE double.partnumber=landslide.lnumber
group by
landslide.lnumber,
lname
ORDER BY lnumber
If you may have more than two rows for every lnumber and you really need crosstab, there are many questions about crosstab in PostgreSQL on SO (example). As an alternative, you can try the following approach.
Because this is a one-time analysis, you can easily get the maximum number of unique bez_gem values:
select
landslide.lnumber,
count(distinct bez_gem) cnt
from
<<some_data>>
group by
landslide.lnumber
order by
cnt desc limit 1
Then you can use:
select
lnumber,
lname,
max(case when rn=1 then bez_gem end) as bez_gem1,
max(case when rn=2 then bez_gem end) as bez_gem2,
max(case when rn=3 then bez_gem end) as bez_gem3,
max(case when rn=4 then bez_gem end) as bez_gem4,
max(case when rn=5 then bez_gem end) as bez_gem5,
... up to cnt ...
from(
select
landslide.lnumber,
lname,
bez_gem,
row_number() over(partition by landslide.lnumber) rn
from
<<some_data>>
) a
-- outside the subquery the columns belong to alias a, not to landslide
group by
lnumber,
lname
For your data and 5 possible values it would look like:
WITH double AS (
SELECT
partnumber, bez_gem
FROM
accumulation a, municipality b
WHERE
ST_Intersects(a.geom, b.geom)
AND EXISTS (
SELECT lnumber
FROM mun_more_than_once c
WHERE a.partnumber=c.lnumber)
ORDER BY
partnumber
)
select
lnumber,
lname,
max(case when rn=1 then bez_gem end) as bez_gem1,
max(case when rn=2 then bez_gem end) as bez_gem2,
max(case when rn=3 then bez_gem end) as bez_gem3,
max(case when rn=4 then bez_gem end) as bez_gem4,
max(case when rn=5 then bez_gem end) as bez_gem5
from (
select
landslide.lnumber,
lname,
bez_gem,
row_number() over(partition by landslide.lnumber) rn
from
double, landslide
where
double.partnumber=landslide.lnumber
) a
-- outside the subquery the columns belong to alias a, not to landslide
group by
lnumber,
lname
I have this current select:
SELECT MIN(CONVERT(DateTime,SUBSTRING(NameAndDate,14,8))),
SUBSTRING(NameAndDate,1,12)
FROM MyData
WHERE pName IN (SELECT SUBSTRING(NameAndDate,1,12)
FROM MyData
GROUP BY SUBSTRING(NameAndDate,1,12)
HAVING COUNT(*) > 1)
GROUP BY SUBSTRING(NameAndDate,1,12)
Here SUBSTRING(NameAndDate,1,12) is the person's ID name and SUBSTRING(NameAndDate,14,8) is the date they came in.
There are many times where this data shows up multiple times in the table which is why I want to select the MIN date.
The problem is that there is another column in the table (ID) that I need to be added, but I don't want to group by it because it adds duplicates to the Person's ID.
Is there a way I can do the following:
SELECT ID,
MIN(CONVERT(DateTime,SUBSTRING(NameAndDate,14,8))),
SUBSTRING(NameAndDate,1,12)
FROM MyData
WHERE pName IN (SELECT SUBSTRING(NameAndDate,1,12)
FROM MyData
GROUP BY SUBSTRING(NameAndDate,1,12)
HAVING COUNT(*) > 1)
GROUP BY SUBSTRING(NameAndDate,1,12)
EDIT:
There could be multiple times a person comes through:
ID | NameAndDate
----+-----------------------
1 | J60047238486 08162013
2 | J60047238486 08182013
3 | J60047238486 08242013
4 | J60047238486 09032013
5 | J60047238486 10102013
6 | C40049872351 05302013
7 | C40049872351 07212013
8 | C40049872351 07252013
My current select pulls:
Name | Date
--------------+---------------------
J60047238486 | 08/16/2013 00:00:00
C40049872351 | 05/30/2013 00:00:00
But I want to add the ID column for those specific rows:
ID | Name | Date
----+--------------+---------------------
1 | J60047238486 | 08/16/2013 00:00:00
6 | C40049872351 | 05/30/2013 00:00:00
Try this
SELECT * FROM (
SELECT id,
CONVERT(DateTime,right (SUBSTRING(NameAndDate,14,8),4)
+ SUBSTRING(NameAndDate,14,4)) D,
SUBSTRING(NameAndDate,1,12) N,
COUNT(*) OVER (PARTITION BY SUBSTRING(NameAndDate,1,12)) Cnt,
ROW_NUMBER() OVER (PARTITION BY SUBSTRING(NameAndDate,1,12)
ORDER By right (SUBSTRING(NameAndDate,14,8),4) + SUBSTRING(NameAndDate,14,4)) rn -- order by yyyymmdd, not the raw mmddyyyy text
FROM
mydata
) v WHERE CNT > 1 and rn = 1;
SQL DEMO HERE
You can do this, but it ain't pretty. You have to run your original query to get the minimum date for each name, and then join that back to your MyData table. It's particularly ugly because of how you store the data. Converting your MMDDYYYY string to a date was really fun.
SQL Fiddle
select
MyData.[ID],
t1.theName,
t1.theDate
from
Mydata
inner join
(
select
SUBSTRING(NameAndDate,1,12) as theName,
min (
convert(datetime,
right (SUBSTRING(NameAndDate,14,8),4) + '-' +
left (SUBSTRING(NameAndDate,14,8),2) + '-' +
SUBSTRING((SUBSTRING(NameAndDate,14,8)),3,2)
))as theDate
from
mydata
where
SUBSTRING(NameAndDate,1,12) in
(SELECT SUBSTRING(NameAndDate,1,12)
FROM MyData
GROUP BY SUBSTRING(NameAndDate,1,12)
HAVING COUNT(*) > 1)
group by
SUBSTRING(NameAndDate,1,12) ) t1
ON SUBSTRING(mydata.NameAndDate,1,12) = t1.theName
AND (convert(datetime,
right (SUBSTRING(NameAndDate,14,8),4) + '-' +
left (SUBSTRING(NameAndDate,14,8),2) + '-' +
SUBSTRING((SUBSTRING(NameAndDate,14,8)),3,2))) = t1.theDate
with cte as (
-- first cte - parsing data
select
ID,
left(NameAndDate, 12) as Name,
convert(date,
right(NameAndDate, 4) +
substring(NameAndDate, 14, 2) +
substring(NameAndDate, 16, 2),
112) as Date
from Table1
), cte2 as (
-- second cte - create row_number
select
ID, Name, Date,
row_number() over(partition by Name order by Date) as rn
from cte
)
select
ID, Name, Date
from cte2
where rn = 1
sql fiddle demo
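One detail worth checking by hand is that the MMDDYYYY text must be turned into a real date (or a yyyymmdd string) before picking the minimum, because the raw text sorts by month first. A minimal Python sketch over the sample rows from the question, assuming the fixed layout of a 12-character name, one space, and an 8-character date; `first_visits` is a made-up helper name.

```python
from collections import Counter
from datetime import date, datetime

data = [
    (1, "J60047238486 08162013"), (2, "J60047238486 08182013"),
    (3, "J60047238486 08242013"), (4, "J60047238486 09032013"),
    (5, "J60047238486 10102013"), (6, "C40049872351 05302013"),
    (7, "C40049872351 07212013"), (8, "C40049872351 07252013"),
]


def first_visits(data):
    """Return (ID, name, date) of the earliest row for every name seen more than once."""
    counts = Counter(text[:12] for _, text in data)
    best = {}
    for id_, text in data:
        name = text[:12]
        visited = datetime.strptime(text[13:21], "%m%d%Y").date()  # parse MMDDYYYY
        # Keep the row with the earliest parsed date, like rn = 1 in the queries.
        if name not in best or visited < best[name][1]:
            best[name] = (id_, visited)
    # Names that appear only once are skipped, like HAVING COUNT(*) > 1.
    return [(id_, name, d) for name, (id_, d) in best.items() if counts[name] > 1]


print(first_visits(data))
# [(1, 'J60047238486', datetime.date(2013, 8, 16)), (6, 'C40049872351', datetime.date(2013, 5, 30))]
```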