How to group data in PostgreSQL by timestamp and distinct values?

This is my current table format :
userName userId recordedAt
Deepesh 1 2021-03-29 07:20:36
Sushant 2 2021-03-29 07:27:55
Ankita 3 2021-03-29 07:30:40
Aman 4 2021-03-29 07:39:15
Ankita 3 2021-03-29 07:51:29
Suman 5 2021-03-29 07:55:19
Ankita 3 2021-03-29 08:36:55
I want to query the data so that it counts the distinct userId values and groups them by hour. Expected result:
time userLogged
07:00 5
08:00 1

When you want to group by minute, hour, day, week, etc., it's tempting to just group by your timestamp column. However, that gives you one group per distinct timestamp value (one per second here), which is likely not what you want. Instead, you need to "truncate" your timestamp to the granularity you want. The PostgreSQL function you need here is date_trunc.
select
date_trunc('minute', created_at), -- or hour, day, week, month, year
count(1)
from users
group by 1
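Applied to the table in the question, the same idea might look like the sketch below. It assumes PostgreSQL and inlines the sample rows in a CTE called logins (a made-up name) so the query can be run on its own. Grouping is done on the truncated timestamp, and to_char is only used to display the bucket as HH24:MI.

```sql
-- Sample rows from the question, inlined so the query is self-contained.
WITH logins (userName, userId, recordedAt) AS (
  VALUES
    ('Deepesh', 1, timestamp '2021-03-29 07:20:36'),
    ('Sushant', 2, timestamp '2021-03-29 07:27:55'),
    ('Ankita',  3, timestamp '2021-03-29 07:30:40'),
    ('Aman',    4, timestamp '2021-03-29 07:39:15'),
    ('Ankita',  3, timestamp '2021-03-29 07:51:29'),
    ('Suman',   5, timestamp '2021-03-29 07:55:19'),
    ('Ankita',  3, timestamp '2021-03-29 08:36:55')
)
SELECT to_char(date_trunc('hour', recordedAt), 'HH24:MI') AS "time",
       count(DISTINCT userId)                             AS userLogged
FROM logins
GROUP BY date_trunc('hour', recordedAt)   -- one bucket per calendar hour
ORDER BY date_trunc('hour', recordedAt);
-- 07:00 | 5
-- 08:00 | 1
```

Because the grouping key is the full truncated timestamp rather than the formatted text, the same hour on two different days stays in two separate buckets.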

SELECT date_trunc('hour', recordedAt) AS hour_start, COUNT(DISTINCT userId) AS userLogged FROM table_name GROUP BY 1 ORDER BY 1;

Related

Return Customer ID if all records added in the same month

I have a table with customer names, items and dates added
ID Cust_ID Item DateAdded
1 Cust_1 Handle 2022-12-05 11:51:28.973
2 Cust_1 Foot 2022-12-02 14:43:36.407
3 Cust_1 Door 2022-12-02 14:42:20.727
4 Cust_2 Handle 2022-10-10 13:07:49.640
5 Cust_2 Door 2022-09-15 12:09:13.820
6 Cust_2 Leg 2022-12-02 11:02:43.110
7 Cust_3 Handle 2022-07-01 15:31:28.547
8 Cust_3 Door 2022-12-06 10:26:56.987
I need a select statement that returns the customer name, but only where all items purchased were bought last month. For example, all purchases for Cust_1 were last month, so this customer is returned, but Cust_2 and Cust_3 had purchases in other months, so they are not returned.
Cust_ID
Cust_1
I have the date range sorted out and have tried various GROUP BY and HAVING clauses, but I'm struggling because the values are dates and not strings.
Use a HAVING clause with MIN and MAX on your DateAdded column. You can easily create the date boundaries with DATEADD and EOMONTH:
SELECT Cust_ID
FROM dbo.YourTable
GROUP BY Cust_ID
HAVING MIN(DateAdded) >= DATEADD(DAY, 1, EOMONTH(GETDATE(),-2))
AND MAX(DateAdded) < DATEADD(DAY, 1, EOMONTH(GETDATE(),-1));
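To see the boundaries at work, here is a minimal T-SQL sketch against the sample rows. The table variable @purchases and the fixed reference date @today are assumptions made for illustration; @today stands in for GETDATE() so that "last month" is December 2022, as the question implies.

```sql
-- Hypothetical fixed "today" so the result is reproducible; replace with GETDATE() in practice.
DECLARE @today date = '20230110';
DECLARE @monthStart     date = DATEADD(DAY, 1, EOMONTH(@today, -2)); -- 2022-12-01
DECLARE @nextMonthStart date = DATEADD(DAY, 1, EOMONTH(@today, -1)); -- 2023-01-01

DECLARE @purchases table (Cust_ID varchar(10), Item varchar(20), DateAdded datetime2(3));
INSERT @purchases VALUES
('Cust_1', 'Handle', '2022-12-05 11:51:28.973'),
('Cust_1', 'Foot',   '2022-12-02 14:43:36.407'),
('Cust_1', 'Door',   '2022-12-02 14:42:20.727'),
('Cust_2', 'Handle', '2022-10-10 13:07:49.640'),
('Cust_2', 'Door',   '2022-09-15 12:09:13.820'),
('Cust_2', 'Leg',    '2022-12-02 11:02:43.110'),
('Cust_3', 'Handle', '2022-07-01 15:31:28.547'),
('Cust_3', 'Door',   '2022-12-06 10:26:56.987');

-- A customer qualifies only if the earliest AND the latest purchase fall inside last month.
SELECT Cust_ID
FROM @purchases
GROUP BY Cust_ID
HAVING MIN(DateAdded) >= @monthStart
   AND MAX(DateAdded) <  @nextMonthStart;
-- Cust_1
```

Using a half-open range (>= first day of the month, < first day of the next month) avoids missing rows stamped late on the last day of the month.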

compare only hours in timestamp column postgresql

I have a table with date like that:
OBJECT TIMESTAMP_START TIMESTAMP_END
House 2020-02-20 09:33:24 2020-02-20 09:33:33
Dog 2020-02-20 18:00:03 2020-02-21 18:33:22
Cat 2020-02-11 19:00:00 2020-02-11 19:15:23
I need to extract all objects, with their start and end timestamps, whose start time of day falls between 18:00 and 09:00.
In this case that would be Dog and Cat.
How can I do that in PostgreSQL? Is there an easy way?
Thanks!
Since you exclude both bounds, this is a rare case where BETWEEN is correct:
select *
from tbl
where timestamp_start::time NOT BETWEEN time '09:00' AND time '18:00';
A single BETWEEN on the time of day cannot express this range directly, because 09:00 is always less than 18:00 and the window wraps past midnight; 09:00 to 18:00 is the part to be excluded. One alternative is to truncate the timestamp to the start of its day and add the appropriate intervals.
with the_table (object, timestamp_start,timestamp_end ) as
( values ('House', '2020-02-20 09:33:24'::timestamp, '2020-02-20 09:33:33'::timestamp)
, ('Dog', '2020-02-20 18:00:03'::timestamp, '2020-02-21 18:33:22'::timestamp)
, ('Cat', '2020-02-11 19:00:00'::timestamp, '2020-02-11 19:15:23'::timestamp)
, ('Mouse', '2020-02-11 20:00:00'::timestamp, '2020-02-12 08:00:00'::timestamp)
)
select *
from the_table
where timestamp_start >= date_trunc('day', timestamp_start) + interval '18 hours'
or timestamp_start <= date_trunc('day', timestamp_start) + interval '9 hours';
Of course, this returns all rows matching those times of day, even if they are years old. You might want to add a date filter as well. Just a suggestion.
You can cast the timestamp to time. Because the window wraps past midnight, the two conditions must be combined with OR, not AND:
select *
from the_table
where timestamp_start::time > time '18:00'
or timestamp_start::time < time '09:00';
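As a quick check of the wrap-around logic, the sketch below (PostgreSQL assumed) runs a time-of-day filter over the question's rows plus one hypothetical early-morning row, Bird, which is not in the original data. A window that crosses midnight is the union of "at or after 18:00" and "before 09:00", so the two comparisons are joined with OR.

```sql
WITH the_table (object, timestamp_start) AS (
  VALUES
    ('House', timestamp '2020-02-20 09:33:24'),
    ('Dog',   timestamp '2020-02-20 18:00:03'),
    ('Cat',   timestamp '2020-02-11 19:00:00'),
    ('Bird',  timestamp '2020-02-12 02:15:00')  -- hypothetical row, added to test the after-midnight side
)
SELECT object, timestamp_start
FROM the_table
WHERE timestamp_start::time >= time '18:00'   -- evening part of the window
   OR timestamp_start::time <  time '09:00';  -- early-morning part of the window
-- returns Dog, Cat and Bird; House (09:33:24) is excluded
```

Whether the 18:00 and 09:00 boundaries themselves should count is a choice the question leaves open; adjust the comparison operators accordingly.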

Adding days to a date in Netezza

I have a query pulling dates from field [DATE] BETWEEN '10/1/2017' AND '10/31/2017'
I want to add days to the end date in the BETWEEN criteria (10/31/2017). It seems impossible. I can add months perfectly using ADD_MONTHS, but there doesn't seem to be an ADD_DAYS function.
Your help is greatly appreciated!
add_months deals with the special cases that arise from having variable length months.
For other intervals of time, things are much simpler:
To add 5 days to the current day, use this:
SYSTEM.ADMIN(ADMIN)=> select current_date, current_date + interval '5 days';
DATE | ?COLUMN?
------------+---------------------
2017-12-19 | 2017-12-24 00:00:00
(1 row)
T2DB.ADMIN(ADMIN)=> select * from interval_test where col1 between (current_timestamp - interval '2 days') and (current_timestamp + interval '3 days');
COL1
------------
2017-12-19
(1 row)
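Applied to the BETWEEN criteria in the question, the same interval arithmetic might be written as in the sketch below. The names my_table and date_col are hypothetical placeholders for the asker's table and [DATE] field, and the literals are written in ISO format with an explicit CAST, which is an assumption about how the dates are stored.

```sql
-- my_table and date_col are placeholder names, not from the original question.
SELECT *
FROM my_table
WHERE date_col BETWEEN CAST('2017-10-01' AS DATE)
                   AND (CAST('2017-10-31' AS DATE) + INTERVAL '5 days');  -- end date pushed out by 5 days
```

As the output above shows, adding an interval to a date yields a timestamp at midnight, so the upper bound here is 2017-11-05 00:00:00.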

Populating a list of dates without a defined end date - SQL server

I have a list of accounts and their cost which changes every few days.
In this list I only have the start date every time the cost updates to a new one, but no column for the end date.
That is, I need to populate a list of dates in which the end date for a given account and cost is deduced from the start date of the next cost for the same account.
More or less like that:
Account start date cost
one 1/1/2016 100$
two 1/1/2016 150$
one 4/1/2016 200$
two 3/1/2016 200$
And the result I need would be:
Account date cost
one 1/1/2016 100$
one 2/1/2016 100$
one 3/1/2016 100$
one 4/1/2016 200$
two 1/1/2016 150$
two 2/1/2016 150$
two 3/1/2016 200$
For example, if the cost changed in the middle of the month, then the sample data will hold only two records (one per unique combination of account, start date and cost), while the result will hold 30 records, with the cost for each and every day of the month (15 for the first cost and 15 for the second one). The costs are a given (inserted manually), so there is no need to calculate them.
Note the result contains more records because the sample data shows only a start date and an updated cost for that account, as of that date. While the results show the cost for every day of the month.
Any ideas?
Solution is a bit long.
I added an extra date for test purposes:
DECLARE @t table(account varchar(10), startdate date, cost int)
INSERT @t
values
('one','1/1/2016',100),('two','1/1/2016',150),
('one','1/4/2016',200),('two','1/3/2016',200),
('two','1/6/2016',500) -- extra row
;WITH CTE as
( SELECT
row_number() over (partition by account order by startdate) rn,
*
FROM @t
),N(N)AS
(
SELECT 1 FROM(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1))M(N)
),
tally(N) AS -- tally is limited to 1000 days
(
SELECT ROW_NUMBER()OVER(ORDER BY N.N) - 1 FROM N,N a,N b
),GROUPED as
(
SELECT
cte.account, cte.startdate, cte.cost, cte2.cost cost2, cte2.startdate enddate
FROM CTE
JOIN CTE CTE2
ON CTE.account = CTE2.account
and CTE.rn = CTE2.rn - 1
)
-- used DISTINCT to avoid overlapping dates
SELECT DISTINCT
CASE WHEN datediff(d, startdate,enddate) = N THEN cost2 ELSE cost END cost,
dateadd(d, N, startdate) startdate,
account
FROM grouped
JOIN tally
ON datediff(d, startdate,enddate) >= N
Result:
cost startdate account
100 2016-01-01 one
100 2016-01-02 one
100 2016-01-03 one
150 2016-01-01 two
150 2016-01-02 two
200 2016-01-03 two
200 2016-01-04 one
200 2016-01-04 two
200 2016-01-05 two
500 2016-01-06 two
Thank you @t-clausen.dk!
It didn't solve the problem completely, but did direct me in the correct way.
Eventually I used the LEAD function to generate an end date for every cost per account, and then I was able to populate a list of dates based on that idea.
Here's how I generate the end dates:
DECLARE @t table(account varchar(10), startdate date, cost int)
INSERT @t
values
('one','1/1/2016',100),('two','1/1/2016',150),
('one','1/4/2016',200),('two','1/3/2016',200),
('two','1/6/2016',500)
select account
,[startdate]
,DATEADD(DAY, -1, LEAD([Startdate], 1,'2100-01-01') OVER (PARTITION BY account ORDER BY [Startdate] ASC)) AS enddate
,cost
from @t
It returned the expected result:
account startdate enddate cost
one 2016-01-01 2016-01-03 100
one 2016-01-04 2099-12-31 200
two 2016-01-01 2016-01-02 150
two 2016-01-03 2016-01-05 200
two 2016-01-06 2099-12-31 500
Please note that I set the end date of current costs to be some date in the far future which means (for me) that they are currently active.
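The step from end dates to a full list of dates is not shown above. One possible way to do it, sketched here in T-SQL under my own assumptions, is to join each cost period to a small tally of day offsets. Unlike the far-future default above, this sketch gives the latest cost of each account a single day so the output stays finite; the digits and tally CTEs are illustrative helpers limited to 1000 days per period.

```sql
DECLARE @t table(account varchar(10), startdate date, cost int);
INSERT @t VALUES
('one','20160101',100),('two','20160101',150),
('one','20160104',200),('two','20160103',200),
('two','20160106',500);

WITH ranges AS (
    SELECT account, startdate, cost,
           -- next start date for the same account; the last period defaults to one day (an assumption)
           LEAD(startdate, 1, DATEADD(DAY, 1, startdate))
               OVER (PARTITION BY account ORDER BY startdate) AS nextstart
    FROM @t
),
digits(n) AS (
    SELECT n FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS d(n)
),
tally(n) AS (   -- day offsets 0..999
    SELECT a.n + 10 * b.n + 100 * c.n
    FROM digits a CROSS JOIN digits b CROSS JOIN digits c
)
SELECT r.account, DATEADD(DAY, t.n, r.startdate) AS [date], r.cost
FROM ranges r
JOIN tally t
  ON t.n < DATEDIFF(DAY, r.startdate, r.nextstart)  -- one row per day before the next start date
ORDER BY r.account, [date];
```

For the sample rows this produces ten rows: account one at 100 for 2016-01-01 through 2016-01-03 and 200 on 2016-01-04, and account two at 150 for two days, 200 for three days and 500 on 2016-01-06.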

Group by on Postgresql Date Time

Hi. There are employee records in my PostgreSQL database, something like:
CODE DATE COUNT
"3443" "2009-04-02" 3
"3444" "2009-04-06" 1
"3443" "2009-04-06" 1
"3443" "2009-04-07" 7
I want to use a query "SELECT ALL CODES AND COUNT THEM THAT OCCURRED IN THE MONTH"
RESULT:
CODE DATE COUNT
"3443" "2009-04" 3
"3441" "2009-04" 13
"3442" "2009-04" 11
"3445" "2009-04" 72
I used this query:
SELECT CODE,date_part('month',DATE),count(CODE)
FROM employee
where
group by CODE,DATE
The above query runs fine, but the months in the result are shown as numbers, and it's hard to tell which year a month belongs to. In short, I want to get the result just as shown above in the RESULT section. Thanks
Try this:
SELECT CODE, to_char(DATE, 'YYYY-MM'), count(CODE)
FROM employee
group by CODE, to_char(DATE, 'YYYY-MM')
Depending on whether you want the result as text or a date, you can also write it like this:
SELECT CODE, date_trunc('month', DATE), COUNT(*)
FROM employee
GROUP BY CODE, date_trunc('month', DATE);
In your example this would return the following. DATE is still a timestamp, which can be useful if you are going to do further calculations on it, since no conversions are necessary:
CODE DATE COUNT
"3443" "2009-04-01" 3
"3441" "2009-04-01" 13
"3442" "2009-04-01" 11
"3445" "2009-04-01" 72
date_trunc() also accepts other values, for instance quarter, year etc.
See the documentation for all values
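For a quick comparison against the four sample rows in the question, here is a self-contained PostgreSQL sketch. The columns are renamed to logged_on and cnt (hypothetical names) to keep them distinct from the date type and the count function, and the rows are inlined in a CTE.

```sql
WITH employee (code, logged_on, cnt) AS (
  VALUES
    ('3443', date '2009-04-02', 3),
    ('3444', date '2009-04-06', 1),
    ('3443', date '2009-04-06', 1),
    ('3443', date '2009-04-07', 7)
)
SELECT code,
       to_char(logged_on, 'YYYY-MM') AS year_month,  -- text label that keeps the year
       count(*)                      AS occurrences  -- number of rows per code and month
FROM employee
GROUP BY code, to_char(logged_on, 'YYYY-MM')
ORDER BY code;
-- 3443 | 2009-04 | 3
-- 3444 | 2009-04 | 1
```

If the goal were to total the stored COUNT column rather than count rows, sum(cnt) would replace count(*); the question's expected output suggests counting rows.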
Alternatively, pick whichever of these month columns suits you:
SELECT CODE, count(CODE),
date_part('year', DATE) as year,
date_part('month', DATE) as month,
date_trunc('month', DATE) as date_month
FROM employee
group by CODE, date_part('year', DATE), date_part('month', DATE), date_trunc('month', DATE);
