How to do WHERE <before> an aggregate function (Postgres) - database

It's hard to explain from the title, but this is my SQL:
SELECT
SUM("payments"."amount"),
"invoices"."property_id"
FROM "payments"
JOIN "invoices"
ON "payments"."invoice_id" = "invoices"."id"
GROUP BY "property_id"
It returns the sum of all Payment records (amount column) for a particular Property (which is connected through its invoices).
In other words:
Property has_many: :invoices
Invoice has_one: :payment
I'm trying to select only the payments within a particular date range, but the filter has to happen "before" the aggregate function (so run the exact query above, but only for 2017-01-01 through 2017-02-01). The field would be generated_at on Payment.

You are looking for a WHERE clause. (WHERE is executed before aggregation; HAVING is executed after.) The suggested date literal in PostgreSQL is the ANSI-standard form DATE 'YYYY-MM-DD'. Date ranges are usually checked with >= start day and < end day + 1, so that rows on the last day are still included when the column has a time part.
SELECT
SUM(p.amount),
i.property_id
FROM payments p
JOIN invoices i ON p.invoice_id = i.id
WHERE p.generated_at >= DATE '2017-01-01'
AND p.generated_at < DATE '2017-02-02'
GROUP BY i.property_id;
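To see the effect of filtering before the aggregate, here is a minimal sketch that runs the same query shape against an in-memory SQLite database from Python. SQLite only stands in for PostgreSQL here, the sample rows are made up, and the dates are stored as ISO-8601 text (so plain string literals replace the DATE '...' literals).

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoices (id INTEGER PRIMARY KEY, property_id INTEGER);
    CREATE TABLE payments (invoice_id INTEGER, amount INTEGER, generated_at TEXT);
    INSERT INTO invoices VALUES (1, 10), (2, 10), (3, 20);
    INSERT INTO payments VALUES
        (1, 100, '2017-01-15 09:30:00'),  -- inside the range
        (2,  50, '2017-02-01 18:00:00'),  -- last day, with a time part
        (3,  70, '2017-03-05 12:00:00');  -- outside the range
""")

# WHERE removes rows before SUM() sees them; the upper bound is
# "end day + 1" so the 18:00 payment on 2017-02-01 is still counted.
rows = conn.execute("""
    SELECT i.property_id, SUM(p.amount)
    FROM payments p
    JOIN invoices i ON p.invoice_id = i.id
    WHERE p.generated_at >= '2017-01-01'
      AND p.generated_at <  '2017-02-02'
    GROUP BY i.property_id
    ORDER BY i.property_id
""").fetchall()
print(rows)
```

Property 20 does not show up at all, because its only payment is filtered out before grouping.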

Related

Binding tables with fulltext search query?

I ran into a problem. I have three tables: Product, which stores each product's price and name; Query, with attributes such as the description of the searched words and their frequency (so there are no duplicates); and UsersQuery, which stores each of the words a user searched for.
PRODUCT
id
price
name
QUERY
id
description_query
number_of_freq
USERSQUERY
id
query_id FK
user_id FK
timestamp
For each month in a given year and the subsequent years (January 2018, February 2018, …), I have to calculate the ratio between the search queries that contain a product name and those that do not. If the ratio is not defined for a given month, the output should be NULL.
Do you guys know how it would be possible?
So far I just have this:
select q.description_query,
to_char(uq.timestamp, 'YYYY-MM') as year_month
from usersquery as uq
join query as q ON q.id = uq.query_id;
But I don't really know how to bring in the product table, since the only link is its name attribute. Should I use some sort of full-text search using tsvector?
Unquoted table names are case-insensitive, so I use product, query and user_query; please refer to section 4.1 (Lexical Structure) of the manual.
I hope I understand correctly: number_of_freq is the number of times the query contains the product name, and number_of_freq = 0 means that the query doesn't contain the product keyword.
Basically, use generate_series to generate the series of months (for the later left or right join) and a count with a FILTER clause to count the rows where the frequency is 0.
Final code:
WITH cte AS (
SELECT
to_char(querytimestamp, 'YYYY-MM') AS tochar1,
count(number_of_freq) AS count_all,
count(number_of_freq) FILTER (WHERE number_of_freq = 0) AS count_0
FROM
query
JOIN user_query uq ON query.query_id = uq.query_id
WHERE
querytimestamp >= '2021-01-01 00:00' at time zone 'UTC'
AND querytimestamp <= '2022-12-31 23:59' at time zone 'UTC'
GROUP BY
1
),
cte2 (
yearmonth
) AS (
SELECT
to_char(g, 'YYYY-MM')
FROM
generate_series('2021-01-01', '2022-12-31', interval '1 month') g
)
SELECT
yearmonth,
cte.*,
round(cte.count_0::numeric / count_all, 2)
FROM
cte
RIGHT JOIN cte2 ON cte.tochar1 = yearmonth;
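The same idea (a month scaffold joined to the counts, with NULL for months that have no searches) can be tried out in a small sketch. Everything here is an assumption for illustration: the searches table and its matches_product flag are hypothetical, SQLite stands in for PostgreSQL, a recursive CTE replaces generate_series, and SUM over a comparison replaces count(...) FILTER.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table: one row per search, flagged 1 if it names a product.
    CREATE TABLE searches (ts TEXT, matches_product INTEGER);
    INSERT INTO searches VALUES
        ('2021-01-05', 1), ('2021-01-20', 0), ('2021-01-25', 0),
        ('2021-03-02', 1);
""")

rows = conn.execute("""
    WITH RECURSIVE months(m) AS (      -- stand-in for generate_series
        SELECT '2021-01-01'
        UNION ALL
        SELECT date(m, '+1 month') FROM months WHERE m < '2021-03-01'
    )
    SELECT strftime('%Y-%m', m)            AS year_month,
           SUM(s.matches_product = 0)      AS count_0,
           COUNT(s.ts)                     AS count_all,
           ROUND(SUM(s.matches_product = 0) * 1.0
                 / NULLIF(COUNT(s.ts), 0), 2) AS ratio
    FROM months
    LEFT JOIN searches s
           ON strftime('%Y-%m', s.ts) = strftime('%Y-%m', m)
    GROUP BY year_month
    ORDER BY year_month
""").fetchall()

for row in rows:
    print(row)  # February has no searches, so its ratio is None (NULL)
```

The NULLIF guard keeps the division from failing in dialects that raise an error on division by zero.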
As for counting the frequency of a word, full-text search won't help, since it will parse 'product.id' as 'product.id'.
You may need the regexp string-splitting functions to solve the word-frequency counting issue.

Modify T-SQL query

I have trouble with this query:
SELECT DISTINCT
r.uuid AS id,
r.uuid,
r.customerId
FROM
IF_reminders r
LEFT JOIN
IF_reminders_sent rs ON rs.reminderUuid = r.uuid
AND rs.event = 'eventName'
WHERE
r.eventNameEnabled = 1
AND (rs.sentAt IS NULL
OR rs.sentAt NOT BETWEEN (DATEADD(DAY, -14, '2022-05-01')) AND (DATEADD(DAY, 1, '2022-05-01')))
The date in the DATEADD function is filled in programmatically.
The table IF_reminders contains the reminders of different types defined for the users.
The table IF_reminders_sent contains a record for each time the reminder for a particular event was sent to the user.
The query must return the list of user reminders for the event that still have to be sent. If a reminder has already been sent, that user should be ignored.
The query shown above works as expected as long as IF_reminders_sent does not contain any rows from past years. If the table does contain rows from past years, the user gets a reminder every day in the specified date range.
How can I update the query so that, if the reminders for a particular event have not yet been sent in the current year, the full list is returned, but if reminders have been sent in the current year, the records from past years are ignored?
Update
Table structures. The .... entries stand for additional event columns; birthday and mothersDay show the possible structure for all of them.
IF_reminders columns
uuid, customerId, sortBy, firstName, lastName, email, phone, address, relationship, birthdayEnabled, birthdayDate, ...., mothersDayEnabled, createdAt, updatedAt, ....
IF_reminders_sent columns
id, customerId, reminderUuid, event, sentAt
The idea of the query is to filter out the user reminders that the program should send. The program fires a function to send the Mother's Day reminders 7 days before the event, and each reminder should be sent once. While IF_reminders_sent was empty, everything worked fine. But once it contains records from a past year, the query always returns the list of reminders to be sent, because sentAt for the previous year is NOT BETWEEN the dates specified in the query, and the program starts spamming users. If the mothersDay reminders for the current year have not been sent yet, the query has to output the full list of users who have this reminder active. If the reminder has been sent in the current year, it should ignore the current-year records (the NOT BETWEEN part of the query), and now it has to ignore previous years too. How do I add this condition to the query?
Sample IF_reminders_sent:
id, customerId, Uuid, event, sentAt
2, 124724, 4871a550-0d85-4391-83e0-2fff63e412ae, mothersDay, 2021-04-26 16:36:59.877
9, 124724, 4871a550-0d85-4391-83e0-2fff63e412ae, mothersDay, 2022-04-26 16:36:59.877
You can restrict the join to the current year. The check belongs in the ON clause; in the WHERE clause it would discard the rows where rs.sentAt IS NULL:
SELECT DISTINCT
r.uuid AS id,
r.uuid,
r.customerId
FROM
IF_reminders r
LEFT JOIN
IF_reminders_sent rs ON rs.reminderUuid = r.uuid
AND rs.event = 'eventName'
AND DATEPART(year, rs.sentAt) = DATEPART(year, GETDATE())
WHERE
r.eventNameEnabled = 1
AND (rs.sentAt IS NULL
OR rs.sentAt NOT BETWEEN (DATEADD(DAY, -14, '2022-05-01')) AND
(DATEADD(DAY, 1, '2022-05-01')))
You need to flip the logic around. You are looking for all reminders which do not have a sent reminder since the beginning of the year, so you need NOT EXISTS:
SELECT
r.uuid AS id,
r.uuid,
r.customerId
FROM
IF_reminders r
WHERE NOT EXISTS (SELECT 1
FROM
IF_reminders_sent rs
WHERE rs.reminderUuid = r.uuid
AND rs.event = 'eventName'
AND rs.sentAt >= DATEFROMPARTS(YEAR(GETDATE()), 1, 1)
AND rs.sentAt NOT BETWEEN DATEADD(DAY, -14, '2022-05-01') AND DATEADD(DAY, 1, '2022-05-01')
)
AND r.eventNameEnabled = 1;
Note that if you want to filter on a date range you should always use rs.sentAt >= SomeDateCalculation AND rs.sentAt < OtherDateCalculation rather than wrapping the column in functions such as YEAR(rs.sentAt) = YEAR(GETDATE()).
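The NOT EXISTS pattern can be tried out with a small sketch. It is deliberately simplified and rests on several assumptions: SQLite stands in for SQL Server, the table names are shortened, the sample rows are made up, the start of the year is passed in as a parameter instead of being computed with DATEFROMPARTS, and the only condition kept is "a reminder was sent since the start of the year".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, trimmed-down versions of the two tables.
    CREATE TABLE reminders (uuid TEXT PRIMARY KEY, customerId INTEGER);
    CREATE TABLE reminders_sent (reminderUuid TEXT, event TEXT, sentAt TEXT);
    INSERT INTO reminders VALUES ('a', 1), ('b', 2), ('c', 3);
    INSERT INTO reminders_sent VALUES
        ('a', 'mothersDay', '2021-04-26 16:36:59'),  -- last year
        ('a', 'mothersDay', '2022-04-26 16:36:59'),  -- this year
        ('b', 'mothersDay', '2021-04-26 16:36:59');  -- last year only
""")

year_start = "2022-01-01"  # assumed "current year" for the sample data

# Keep a reminder only if no matching row was sent since year_start.
rows = conn.execute("""
    SELECT r.uuid
    FROM reminders r
    WHERE NOT EXISTS (
        SELECT 1
        FROM reminders_sent rs
        WHERE rs.reminderUuid = r.uuid
          AND rs.event = ?
          AND rs.sentAt >= ?
    )
    ORDER BY r.uuid
""", ("mothersDay", year_start)).fetchall()

pending = [uuid for (uuid,) in rows]
print(pending)
```

Reminder 'a' drops out because it was already sent this year; 'b' (sent only last year) and 'c' (never sent) are still pending, so rows from past years no longer matter.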

Count by days, with all days

I need to count records by day, even for days that have no records.
Counting by day is easy, sure.
But how can I make it report that, for example, there were 0 records on 2018-01-10?
Should I use CONNECT BY LEVEL? Any help would be appreciated. I can't use PL/SQL, just Oracle SQL.
First you generate every date that you want in an inline view. I chose every date for the current year because you didn't specify. Then you left outer join on date using whichever date field you have in that table. If you count on a non-null field from the source table then it will count 0 rows on days where there is no join.
select Dates.r, count(tablename.id)
from (select trunc(sysdate,'YYYY') + level - 1 R
from dual
connect by level <= trunc(add_months(sysdate,12),'YYYY') - trunc(sysdate,'YYYY')) Dates
left join tablename
on trunc(tablename.datefield) = Dates.r
group by Dates.r
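Here is a minimal sketch of the same "generate every date, then left join" idea that can be run without an Oracle instance. The assumptions: SQLite stands in for Oracle, a recursive CTE replaces CONNECT BY LEVEL, the records table and its rows are made up, and the date range is shortened to three days.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical source table with no rows on 2018-01-10.
    CREATE TABLE records (id INTEGER PRIMARY KEY, datefield TEXT);
    INSERT INTO records (datefield) VALUES
        ('2018-01-09 08:00:00'),
        ('2018-01-09 17:45:00'),
        ('2018-01-11 12:00:00');
""")

rows = conn.execute("""
    WITH RECURSIVE dates(d) AS (       -- one row per calendar day
        SELECT '2018-01-09'
        UNION ALL
        SELECT date(d, '+1 day') FROM dates WHERE d < '2018-01-11'
    )
    SELECT dates.d, COUNT(records.id)  -- COUNT(column) skips the NULLs
    FROM dates
    LEFT JOIN records ON date(records.datefield) = dates.d
    GROUP BY dates.d
    ORDER BY dates.d
""").fetchall()

for day, n in rows:
    print(day, n)
```

Counting records.id rather than * is what makes the empty day come out as 0: the left join produces one row for that day with a NULL id, and COUNT ignores it.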

SQL determine schedule availability

I have a tbl_availability that determines when a resource is available. The table structure is:
id - running id
startdate - when this availability starts
enddate - when this availability ends
dayofweek - weekday of availability
fromtime - start time
totime - end time
There can be multiple records for the same dayofweek, for example one record for Sundays 1000-1200 and another record for Sundays 1300-1400.
I am trying to figure out how to get two things:
Check when entering a new record that there is no conflict (overlap) with an existing record
Given a startdate and enddate, find all of the available periods that apply.
To determine whether there's a conflict, this query will return any overlapping time ranges:
SELECT * FROM tbl_availability
WHERE (startDate <= @CheckStart AND endDate > @CheckStart)
OR (startDate < @CheckEnd AND endDate >= @CheckEnd)
OR (startDate >= @CheckStart AND endDate <= @CheckEnd)
The first part of the WHERE clause checks for anything that overlaps the start time. The second part checks for anything that overlaps the end time. The third part checks for anything within the range.
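The three cases can also be collapsed into one comparison: two ranges overlap exactly when each one starts before the other ends. A small sketch of that check, assuming half-open ranges (a slot ending at 12:00 does not collide with one starting at 12:00) and using made-up Sunday slots like the ones in the question:

```python
from datetime import datetime

def overlaps(start, end, check_start, check_end):
    """True if [start, end) and [check_start, check_end) share any time."""
    return start < check_end and end > check_start

# Hypothetical Sunday slots: 10:00-12:00 and 13:00-14:00.
day = (2024, 1, 7)
existing = [
    (datetime(*day, 10, 0), datetime(*day, 12, 0)),
    (datetime(*day, 13, 0), datetime(*day, 14, 0)),
]

def conflicts(new_start, new_end):
    """Return the existing slots that the new slot would collide with."""
    return [slot for slot in existing if overlaps(*slot, new_start, new_end)]

print(conflicts(datetime(*day, 11, 30), datetime(*day, 13, 30)))  # both slots
print(conflicts(datetime(*day, 12, 0), datetime(*day, 13, 0)))    # no conflict
```

In SQL the same test would be a single predicate of the form startDate < end of the new range AND endDate > start of the new range.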
To check for available time ranges of a specified duration, use this:
SELECT * FROM
(SELECT endDate AS PeriodStart, (SELECT TOP 1 startDate
FROM tbl_availability as sub
WHERE sub.startDate > tbl_availability.endDate
ORDER by sub.startDate) AS PeriodEnd
FROM tbl_availability) AS OpenPeriods
WHERE DateDiff(MINUTE, PeriodStart, PeriodEnd) >= DateDiff(MINUTE, @RangeStart, @RangeEnd)
I haven't tested these queries, so there may have to be some tweaking going on here.

SQL Server: selecting a year of account based on a specific date and a date range

I need to apportion some values to a financial year that begins on the 1st December and ends on the 30th November each year.
The rows that contain the value fields are in a table (TABLE A) that has a reference number and an incident date
Table A
ReferenceNumber, Value, IncidentDate
1, 10.00, 01/12/14
2, 15.00, 10/05/13
3, 20.00, 14/10/13
TABLE A is then joined to TABLE B, which also has the reference number and contains transactional data including a start date field. Each reference number may have several transactions with different start dates, and the aim is to select the row from TABLE B whose start date is the most recent one before the incident date from TABLE A.
TABLE B
ReferenceNumber, StartDate
1, 01/05/14
1, 01/05/15
2, 12/04/14
2, 12/04/15
3, 05/06/14
3, 04/06/15
TABLE C is a time table that apportions specific dates to financial years.
TABLE C
Date, FinancialYear
30/11/14, FY2013/14
01/12/14, FY2014/15
I am trying to construct a query which joins table A to table B on the Reference number and incident date to start date as described above and then adds the FinancialYear value based on the start date from Table B.
I am struggling to get this to return the correct financial year.
In addition, the data quality is poor so there are many examples where the Incident date from table A is greater than the scope of the financial year selected based on the start date from table B.
I need to be able to return either the appropriate financial year based on start date or, failing that, the financial year corresponding to the incident date
Here is the code I currently have:
SELECT a.ReferenceNumber,
b.StartDate,
c.FinancialYear
FROM dbo.TableA a
INNER JOIN dbo.TableB b
ON a.ReferenceNumber = b.ReferenceNumber
AND b.StartDate = (SELECT MIN(StartDate) FROM dbo.TableB WHERE a.IncidentDateTime > StartDate AND ReferenceNumber = a.ReferenceNumber)
INNER JOIN dbo.Calendar c
ON rdc.PolicyStartDate = c.[Date]
select
a.ReferenceNumber,
min(Value) as Value,
min(IncidentDate) as IncidentDate,
max(StartDate) as StartDate, /* others are dummy aggregates but this one is not */
'FY'
+ cast(year(dateadd(month, -11, min(IncidentDate))) as char(4))
+ '/'
+ cast(year(dateadd(month, -11, min(IncidentDate))) - 1999 as char(2)) as FY
from
TableA a cross apply
(
select * from TableB b
where b.ReferenceNumber = a.ReferenceNumber and b.StartDate < a.IncidentDate
) b
group by a.ReferenceNumber
Your fiscal year starts eleven months "late" so it's easy to determine where a date falls without a lookup.
year(dateadd(month, -11, <date>))
Getting it to match your "FY2013/14" format takes a little extra work but you could write little functions to do these kinds of calculations. By the way, the 1999 comes from adding 1 and subtracting 2000 to get a two-digit year value. Could use modulo 100 to make it generic beyond the year 2098 if that's important.
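The arithmetic is easy to check outside the database. This is only a sketch of the same calculation in Python (the function name is made up), using modulo 100 for the two-digit part as mentioned above:

```python
from datetime import date

def fiscal_year_label(d: date) -> str:
    """Label for a fiscal year running 1 December to 30 November."""
    # Same effect as year(dateadd(month, -11, d)): December stays in its
    # own calendar year, January to November fall back to the previous one.
    start_year = d.year if d.month == 12 else d.year - 1
    # Modulo 100 keeps the second part at two digits for any century.
    return f"FY{start_year}/{(start_year + 1) % 100:02d}"

# The two dates from TABLE C in the question.
print(fiscal_year_label(date(2014, 11, 30)))
print(fiscal_year_label(date(2014, 12, 1)))
```

The two sample dates come out as FY2013/14 and FY2014/15, which matches the rows in TABLE C.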
My assumptions going in:
IncidentDate and StartDate are datatype "DATE". This should also work if they are DATETIME with all time values set the same.
TableC contains a row for every possible date (which is what you implied). Another style would be {FinancialYear, FirstDate, LastDate}, and you'd join to this table using between in the on clause.
I didn't quite get what you meant regarding "the data quality is poor". This query will pull back the desired IncidentDate and StartDate (if available), allowing you to apply business logic to them. My sample here is "if there is no applicable StartDate, base the FinancialYear on IncidentDate". (Replace those outer joins with inner joins if the data permits it.)
Toss in parameters if you don't want this data for all ReferenceNumbers.
Check for syntax errors, I couldn't run and test this query.
(Note that "Date" is a confusing name for a column.)
WITH ctePart1 (ReferenceNumber, IncidentDate, ClosestStartDate)
as (-- Data based on the join to "most recent prior StartDate"
select
ta.ReferenceNumber
,ta.IncidentDate
,max(tb.StartDate)
from TableA ta
left outer join TableB tb
on tb.ReferenceNumber = ta.ReferenceNumber
and tb.StartDate < ta.IncidentDate
group by
ta.ReferenceNumber
,ta.IncidentDate)
select
cte.ReferenceNumber
,cte.IncidentDate
,cte.ClosestStartDate
,isnull(tcStart.FinancialYear, tcIncident.FinancialYear) FinancialYear
from ctePart1 cte
left outer join TableC tcStart
on tcStart.Date = cte.ClosestStartDate
left outer join TableC tcIncident
on tcIncident.Date = cte.IncidentDate

Resources