I'd like to compare, in a query, decimal values stored in varchar columns.
The comparison should check whether the maximum age is higher than the minimum age * 10 for each GroupID.
For example (the Age column is varchar):
ID | Name | Age | GroupID
--------------------------
1 | AAA | 10.1 | 1
2 | BBB | 11 | 1
3 | CCC | 31.2 | 1
4 | DDD | 30.4 | 2
This is what I've tried after searching for a solution:
SELECT TOP 10 * FROM Groups g
JOIN People p1 ON p1.GroupID = g.ID
JOIN People p2 ON p2.GroupID = g.ID
WHERE CONVERT(DECIMAL(8,2),ISNULL(p1.Age,0)) > (CONVERT(DECIMAL(8,2),ISNULL(p2.Age,0)) * 10)
Thanks in advance!
I think you want to use window functions for this:
select t.*
from (select p.*,
             min(case when isnumeric(Age) = 1 then cast(Age as float) end) over
                 (partition by GroupID) as minage
      from People p
     ) t
where (case when isnumeric(t.Age) = 1 then cast(t.Age as float) end) > 10 * t.minage;
The case expression helps prevent errors in the event that some ages are not numeric. Also, I figure float is a good enough data type for the age, although you can also use decimal.
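The rule from the question (maximum age higher than 10 times the minimum age per group, skipping values that are not numeric) can be sanity-checked outside the database. The following is a minimal Python sketch of that logic, not T-SQL. Rows 5 and 6 are hypothetical additions to the question's data so that one group qualifies and one value is non-numeric, and the function names are illustrative only.

```python
from decimal import Decimal, InvalidOperation

# Rows from the question (ID, Name, Age as text, GroupID).
people = [
    (1, "AAA", "10.1", 1),
    (2, "BBB", "11", 1),
    (3, "CCC", "31.2", 1),
    (4, "DDD", "30.4", 2),
    (5, "EEE", "400", 2),  # hypothetical row, not in the original data
    (6, "FFF", "n/a", 2),  # hypothetical non-numeric value
]


def to_decimal(text):
    """Return the text as a Decimal, or None when it is not a finite number."""
    try:
        value = Decimal(text)
    except (InvalidOperation, TypeError):
        return None
    return value if value.is_finite() else None


def groups_where_max_exceeds_10x_min(rows):
    """Return the sorted GroupIDs whose largest age is more than 10x the smallest."""
    ages_by_group = {}
    for _id, _name, age_text, group_id in rows:
        age = to_decimal(age_text)
        if age is not None:  # same role as the isnumeric() guard
            ages_by_group.setdefault(group_id, []).append(age)
    return sorted(
        group_id
        for group_id, ages in ages_by_group.items()
        if max(ages) > 10 * min(ages)
    )


matching_groups = groups_where_max_exceeds_10x_min(people)
print(matching_groups)
```

With only the four original rows, no group satisfies the condition, which is worth keeping in mind when testing the query against the sample data.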
I have a table with a history of assigning an Employee Type to a Work Item, as follows:
| WorkItemID | EmployeeTypeID | ValidFrom | ValidTo |
| 1 | 1 | 2017-03-01 12:19:20.000 | 2017-03-05 14:11:20.000 |
| 1 | 1 | 2017-03-10 17:00:20.000 | NULL |
| 1 | 2 | 2017-05-12 12:19:20.000 | 2017-05-29 14:11:20.000 |
| 1 | 2 | 2017-07-01 12:19:20.000 | NULL |
| 2 | 1 | 2017-01-01 15:19:20.000 | 2017-03-01 11:29:20.000 |
| 2 | 1 | 2017-04-03 16:19:20.000 | NULL |
NULL means that there's no End date for the last assignment and it is still valid.
I also have a table with a history of assigning an Employee Type to an Employee:
| EmployeeID | EmployeeTypeID | ValidFrom | ValidTo |
| 1 | 1 | 2017-01-01 12:19:20.000 | 2017-03-05 14:11:20.000 |
| 1 | 2 | 2017-03-05 14:11:20.000 | NULL |
| 2 | 1 | 2016-05-05 15:19:20.000 | 2017-03-01 11:29:20.000 |
| 2 | 2 | 2017-03-01 11:29:20.000 | NULL |
For a given EmployeeID and WorkItemID, I need to select a minimum date within these date ranges where their EmployeeTypeID matched (if there is any).
For example, for EmployeeID = 1 and WorkItemID = 1, the minimum date when their Employee Types matched is 2017-03-01 (disregard the time part).
How do I write an SQL query to join these two tables correctly and select the desired date?
The following approach appeared to be correct to me:
First, I select the minimum ValidFrom from table 1 among rows whose EmployeeTypeID matches table 2 and whose date ranges overlap:
DECLARE @MinDate1 datetime
DECLARE @MinDate2 datetime
SELECT @MinDate1 =
(SELECT MIN(t1.ValidFrom)
FROM Table1 t1
JOIN Table2 t2 ON t1.EmployeeTypeID = t2.EmployeeTypeID
WHERE t1.WorkItemID = 1 AND t2.EmployeeID = 1
AND (t1.ValidFrom <= t2.ValidTo OR t2.ValidTo IS NULL)
AND (t1.ValidTo >= t2.ValidFrom OR t1.ValidTo IS NULL))
Then I select the minimum ValidFrom from table 2 under the same matching and overlap conditions:
SELECT @MinDate2 =
(SELECT MIN(t2.ValidFrom)
FROM Table1 t1
JOIN Table2 t2 ON t1.EmployeeTypeID = t2.EmployeeTypeID
WHERE t1.WorkItemID = 1 AND t2.EmployeeID = 1
AND (t1.ValidFrom <= t2.ValidTo OR t2.ValidTo IS NULL)
AND (t1.ValidTo >= t2.ValidFrom OR t1.ValidTo IS NULL))
And finally, I select the later of the two dates, which would be the earliest date when the two ranges actually overlap and have the same EmployeeTypeID:
SELECT CASE WHEN @MinDate1 > @MinDate2 THEN @MinDate1 ELSE @MinDate2 END AS MinOverlapDate
The output would be:
| MinOverlapDate |
| 2017-03-01 12:19:20.000 |
So it should be something like this:
SELECT MIN(Date)
FROM table1 t1
JOIN table2 t2 ON t1.EmployeeTypeID = t2.EmployeeTypeID
WHERE t1.WorkItemID = givenValue AND t2.EmployeeID = givenValue
But again, if you don't know which table the result comes from, you can't write a query for that.
What you should do is use at least 3 tables, or maybe more:
1. One that contains employee information
2. One for items, jobs, dates, whatever is connected to work
3. One for the connection between them (Emp 1 has Work 2, Emp 2 has Work 4, and so on)
You CANNOT have the same values in two tables without knowing which one you want to get the data from!
Or you can put it all into one table.
Columns: WorkItem | EmployeeID | EmployeeType | Date | Date
Actually, my variant still does not work correctly. @MinDate1 and @MinDate2 should be compared for each EmployeeTypeID one by one, but there they were computed independently.
Here is a correct variant for solving this problem:
SELECT MIN(CASE WHEN t1.ValidFrom > t2.ValidFrom THEN t1.ValidFrom ELSE t2.ValidFrom END) AS MinOverlapDate
FROM Table1 t1
JOIN Table2 t2 ON t1.EmployeeTypeID = t2.EmployeeTypeID
WHERE t1.WorkItemID = 1 AND t2.EmployeeID = 1
AND (t1.ValidFrom <= t2.ValidTo OR t2.ValidTo IS NULL)
AND (t1.ValidTo >= t2.ValidFrom OR t1.ValidTo IS NULL)
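The idea behind this variant is that two matching ranges start to overlap at the later of their two start dates, and the answer is the earliest such date over all overlapping pairs. The sketch below replays that idea on the sample data using Python's built-in sqlite3 module so it can be tried without a SQL Server instance. It is an illustration only: the SQL is SQLite dialect (two-argument MAX instead of CASE, dates stored as text), and the helper name min_overlap_date is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (WorkItemID INT, EmployeeTypeID INT, ValidFrom TEXT, ValidTo TEXT);
CREATE TABLE Table2 (EmployeeID INT, EmployeeTypeID INT, ValidFrom TEXT, ValidTo TEXT);
INSERT INTO Table1 VALUES
  (1, 1, '2017-03-01 12:19:20.000', '2017-03-05 14:11:20.000'),
  (1, 1, '2017-03-10 17:00:20.000', NULL),
  (1, 2, '2017-05-12 12:19:20.000', '2017-05-29 14:11:20.000'),
  (1, 2, '2017-07-01 12:19:20.000', NULL),
  (2, 1, '2017-01-01 15:19:20.000', '2017-03-01 11:29:20.000'),
  (2, 1, '2017-04-03 16:19:20.000', NULL);
INSERT INTO Table2 VALUES
  (1, 1, '2017-01-01 12:19:20.000', '2017-03-05 14:11:20.000'),
  (1, 2, '2017-03-05 14:11:20.000', NULL),
  (2, 1, '2016-05-05 15:19:20.000', '2017-03-01 11:29:20.000'),
  (2, 2, '2017-03-01 11:29:20.000', NULL);
""")

# For every overlapping pair with the same type, the overlap begins at the
# later ValidFrom; the outer MIN picks the earliest such beginning.
OVERLAP_SQL = """
SELECT MIN(MAX(t1.ValidFrom, t2.ValidFrom))
FROM Table1 AS t1
JOIN Table2 AS t2 ON t1.EmployeeTypeID = t2.EmployeeTypeID
WHERE t1.WorkItemID = ? AND t2.EmployeeID = ?
  AND (t2.ValidTo IS NULL OR t1.ValidFrom <= t2.ValidTo)
  AND (t1.ValidTo IS NULL OR t1.ValidTo >= t2.ValidFrom)
"""


def min_overlap_date(work_item_id, employee_id):
    """Return the earliest overlap start as text, or None if nothing overlaps."""
    return conn.execute(OVERLAP_SQL, (work_item_id, employee_id)).fetchone()[0]


first_match = min_overlap_date(1, 1)
print(first_match)
```

ISO-formatted text dates sort in chronological order, which is why plain string comparison is sufficient in this sketch.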
Don't use >=, <=, = or BETWEEN when comparing datetime fields if only the date part matters, since all of those operators compare the time part as well. Instead, use DATEDIFF with the smallest interval that fits your needs:
select
Min_Overlap_Per_Section = (select MAX(ValidFrom)
FROM (VALUES (t1.ValidFrom), (t2.ValidFrom)) as ValidFrom(ValidFrom))
, Section_From = (select MAX(ValidFrom)
FROM (VALUES (t1.ValidFrom), (t2.ValidFrom)) as ValidFrom(ValidFrom))
, Section_To = (select MIN(ValidTo)
FROM (VALUES (t1.ValidTo), (t2.ValidTo)) as ValidTo(ValidTo))
from Table1 t1
JOIN Table2 t2 ON t1.EmployeeTypeID = t2.EmployeeTypeID
where (
datediff(day, t1.ValidFrom, t2.ValidTo) >= 0
or t2.ValidTo IS NULL
)
and (
datediff(day, t2.ValidFrom, t1.ValidTo) >= 0
or t1.ValidTo IS NULL
)
I can use a traditional subquery approach to count the occurrences in the last ten minutes. For example, this:
drop table if exists [dbo].[readings]
go
create table [dbo].[readings](
[server] [int] NOT NULL,
[sampled] [datetime] NOT NULL
)
go
insert into readings
values
(1,'20170101 08:00'),
(1,'20170101 08:02'),
(1,'20170101 08:05'),
(1,'20170101 08:30'),
(1,'20170101 08:31'),
(1,'20170101 08:37'),
(1,'20170101 08:40'),
(1,'20170101 08:41'),
(1,'20170101 09:07'),
(1,'20170101 09:08'),
(1,'20170101 09:09'),
(1,'20170101 09:11')
go
-- Count in the last 10 minutes - example periods 08:31 to 08:40, 09:12 to 09:21
select server,sampled,(select count(*) from readings r2 where r2.server=r1.server and r2.sampled <= r1.sampled and r2.sampled > dateadd(minute,-10,r1.sampled)) as countinlast10minutes
from readings r1
order by server,sampled
go
How can I use a window function to obtain the same result? I've tried this:
select server,sampled,
count(case when sampled <= r1.sampled and sampled > dateadd(minute,-10,r1.sampled) then 1 else null end) over (partition by server order by sampled rows between unbounded preceding and current row) as countinlast10minutes
-- count(case when currentrow.sampled <= r1.sampled and currentrow.sampled > dateadd(minute,-10,r1.sampled) then 1 else null end) over (partition by server order by sampled rows between unbounded preceding and current row) as countinlast10minutes
from readings r1
order by server,sampled
But the result is just the running count. Is there any system variable that refers to the current row, something like currentrow.sampled?
This isn't a very pleasing answer, but one possibility is to first create a helper table with all the minutes:
CREATE TABLE #DateTimes(datetime datetime primary key);
WITH E1(N) AS
(
SELECT 1 FROM (VALUES(1),(1),(1),(1),(1),
(1),(1),(1),(1),(1)) V(N)
) -- 1*10^1 or 10 rows
, E2(N) AS (SELECT 1 FROM E1 a, E1 b) -- 1*10^2 or 100 rows
, E4(N) AS (SELECT 1 FROM E2 a, E2 b) -- 1*10^4 or 10,000 rows
, E8(N) AS (SELECT 1 FROM E4 a, E4 b) -- 1*10^8 or 100,000,000 rows
,R(StartRange, EndRange)
AS (SELECT MIN(sampled),
MAX(sampled)
FROM readings)
,N(N)
AS (SELECT ROW_NUMBER()
OVER (
ORDER BY (SELECT NULL)) AS N
FROM E8)
INSERT INTO #DateTimes
SELECT TOP (SELECT 1 + DATEDIFF(MINUTE, StartRange, EndRange) FROM R) DATEADD(MINUTE, N.N - 1, StartRange)
FROM N,
R;
And then, with that in place, you could use ROWS BETWEEN 9 PRECEDING AND CURRENT ROW:
WITH T1 AS
( SELECT Server,
MIN(sampled) AS StartRange,
MAX(sampled) AS EndRange
FROM readings
GROUP BY Server )
SELECT Server,
sampled,
Cnt
FROM T1
CROSS APPLY
( SELECT r.sampled,
COUNT(r.sampled) OVER (ORDER BY N.datetime ROWS BETWEEN 9 PRECEDING AND CURRENT ROW) AS Cnt
FROM #DateTimes N
LEFT JOIN readings r
ON r.sampled = N.datetime
AND r.server = T1.server
WHERE N.datetime BETWEEN StartRange AND EndRange ) CA
WHERE CA.sampled IS NOT NULL
ORDER BY sampled
The above assumes that there is at most one sample per minute and that all the times are exact minutes. If this isn't true it would need another table expression pre-aggregating by datetimes rounded to the minute.
As far as I know, there is not a simple exact replacement for your subquery using window functions.
Window functions operate on a set of rows and allow you to work with them based on partitions and order.
What you are trying to do isn't the type of partitioning that we can work with in window functions.
Generating the partitions we would need in order to use window functions in this instance would just result in overly complicated code.
I would suggest cross apply() as an alternative to your subquery.
I am not sure if you meant to restrict your results to within 9 minutes, but with sampled > dateadd(...) that is what is happening in your original subquery.
Here is what a window function could look like based on partitioning your samples into 10 minute windows, along with a cross apply() version.
select
r.server
, r.sampled
, CrossApply = x.CountRecent
, OriginalSubquery = (
select count(*)
from readings s
where s.server=r.server
and s.sampled <= r.sampled
/* doesn't include 10 minutes ago */
and s.sampled > dateadd(minute,-10,r.sampled)
)
, Slices = count(*) over(
/* partition by server, 10 minute slices, not the same thing*/
partition by server, dateadd(minute,datediff(minute,0,sampled)/10*10,0)
order by sampled
)
from readings r
cross apply (
select CountRecent=count(*)
from readings i
where i.server=r.server
/* changed to >= */
and i.sampled >= dateadd(minute,-10,r.sampled)
and i.sampled <= r.sampled
) as x
order by server,sampled
results: http://rextester.com/BMMF46402
+--------+---------------------+------------+------------------+--------+
| server | sampled | CrossApply | OriginalSubquery | Slices |
+--------+---------------------+------------+------------------+--------+
| 1 | 01.01.2017 08:00:00 | 1 | 1 | 1 |
| 1 | 01.01.2017 08:02:00 | 2 | 2 | 2 |
| 1 | 01.01.2017 08:05:00 | 3 | 3 | 3 |
| 1 | 01.01.2017 08:30:00 | 1 | 1 | 1 |
| 1 | 01.01.2017 08:31:00 | 2 | 2 | 2 |
| 1 | 01.01.2017 08:37:00 | 3 | 3 | 3 |
| 1 | 01.01.2017 08:40:00 | 4 | 3 | 1 |
| 1 | 01.01.2017 08:41:00 | 4 | 3 | 2 |
| 1 | 01.01.2017 09:07:00 | 1 | 1 | 1 |
| 1 | 01.01.2017 09:08:00 | 2 | 2 | 2 |
| 1 | 01.01.2017 09:09:00 | 3 | 3 | 3 |
| 1 | 01.01.2017 09:11:00 | 4 | 4 | 1 |
+--------+---------------------+------------+------------------+--------+
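The OriginalSubquery column can be cross-checked outside the database. The following is a minimal Python sketch of a sliding window over the sorted timestamps of one server, counting samples in the half-open interval (t - 10 minutes, t]. It is an illustration of the counting rule only, not a T-SQL replacement; the function name is hypothetical, and it assumes timestamps are distinct within a server, as they are in the sample data.

```python
from datetime import datetime, timedelta

# The twelve readings for server 1 from the question.
samples = [
    datetime(2017, 1, 1, hour, minute)
    for hour, minute in [
        (8, 0), (8, 2), (8, 5), (8, 30), (8, 31), (8, 37),
        (8, 40), (8, 41), (9, 7), (9, 8), (9, 9), (9, 11),
    ]
]


def count_in_last_10_minutes(timestamps):
    """For each timestamp t (ascending), count samples s with t - 10 min < s <= t.

    Assumes the timestamps are distinct.
    """
    ordered = sorted(timestamps)
    window = timedelta(minutes=10)
    counts = []
    left = 0  # index of the oldest sample still inside the window
    for right, current in enumerate(ordered):
        # Drop samples that are 10 minutes old or older.
        while ordered[left] <= current - window:
            left += 1
        counts.append(right - left + 1)
    return counts


counts = count_in_last_10_minutes(samples)
print(counts)
```

Both pointers only move forward, so the pass is linear in the number of readings once they are sorted, whereas the correlated subquery re-scans a range for every row unless an index supports it.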
Thanks, Martin and SqlZim, for your answers. I'm going to raise a Connect enhancement request for something like %%currentrow that can be used in window aggregates. I'm thinking this would lead to much simpler and more natural SQL:
select count(case when sampled <= %%currentrow.sampled and sampled > dateadd(minute,-10,%%currentrow.sampled) then 1 else null end) over (...whatever the window is...)
We can already use expressions like this:
select count(case when sampled <= getdate() and sampled > dateadd(minute,-10,getdate()) then 1 else null end) over (...whatever the window is...)
so I think it would be great if we could reference a column that's in the current row.
I have 2 tables, one with the ID and its count and the other with the names belonging to the respective IDs. They need to be joined so that the end result is a table with count of 1 in each row and the respective names next to them. Note that the number of names in table 2 is less than the count in table 1 for the same ID in some cases.
Table 1
ID | Count
-----------
100 | 3
101 | 2
102 | 4
Table 2
ID | Name
----------
100 | abc
100 | def
101 | ghi
101 | jkl
102 | mno
102 | pqr
102 | stu
Result
ID | Count | Name
------------------
100 | 1 | abc
100 | 1 | def
100 | 1 |
101 | 1 | ghi
101 | 1 | jkl
102 | 1 | mno
102 | 1 | pqr
102 | 1 | stu
102 | 1 |
I'm using TSQL for this and my current query converts table 1 into multiple rows in the result table; then it inserts individual names from table 2 into the result table through a loop. I'm hoping there must be a simpler or more efficient way to do this as the current method takes considerable amount of time. If there is, please let me know.
The first thing that comes to mind for me involves using a Number table, which you could create (as a one-time task) like this:
CREATE TABLE numbers (
ID INT
)
DECLARE @CurrentNumber INT, @MaxNumber INT
SET @MaxNumber = 100 -- Choose a value here which you feel will always be greater than MAX(table1.Count)
SET @CurrentNumber = 1
WHILE @CurrentNumber <= @MaxNumber
BEGIN
    INSERT INTO numbers VALUES (@CurrentNumber)
    SET @CurrentNumber = @CurrentNumber + 1
END
Once you have a numbers table, you can solve this problem like this:
SELECT one.ID,
1 AS [Count],
ISNULL(two.Name,'') AS Name
FROM table1 one
JOIN numbers n ON n.ID <= CASE WHEN one.[Count] >= (SELECT COUNT(1) FROM table2 two WHERE one.ID = two.ID)
THEN one.[Count]
ELSE (SELECT COUNT(1) FROM table2 two WHERE one.ID = two.ID)
END
LEFT JOIN (SELECT ID,
Name,
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ID) AS RecordNo
FROM table2) two ON one.ID = two.ID
AND two.RecordNo = n.ID
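The pairing logic of this query (one output row per number from 1 up to the larger of Count and the number of names, with the n-th name attached when it exists) can be sketched in plain Python to check it against the expected Result table. This is an illustration of the logic only, not T-SQL, and the function name expand is hypothetical.

```python
from collections import defaultdict

# Sample data from the question.
table1 = [(100, 3), (101, 2), (102, 4)]
table2 = [
    (100, "abc"), (100, "def"),
    (101, "ghi"), (101, "jkl"),
    (102, "mno"), (102, "pqr"), (102, "stu"),
]


def expand(counts, names):
    """Return (ID, 1, Name) rows, padding with '' when names run out."""
    names_by_id = defaultdict(list)
    for id_, name in names:
        names_by_id[id_].append(name)

    result = []
    for id_, count in counts:
        known = names_by_id[id_]
        # Same bound as the CASE expression in the join: max(Count, number of names).
        for n in range(1, max(count, len(known)) + 1):
            # n plays the role of numbers.ID matched against RecordNo.
            name = known[n - 1] if n <= len(known) else ""
            result.append((id_, 1, name))
    return result


rows = expand(table1, table2)
for row in rows:
    print(row)
```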
I have the following tables:
Disposition Table
Dis_ID | OfferID | RequestID
------------------------------------
34564 | 123 | 9
77456 | 123 | 8
25252 | 124 | 7
46464 | 125 | 10
36464 | 125 | 6
35353 | 125 | 5
Request Table
RequestID | AccountNum |
---------------------------
5 | 548543 |
6 | 548543 |
7 | 684567 |
8 | 684567 |
9 | 684567 |
10 | 548543 |
11 | 684567 |
Rank Table
RankID | OfferId | RequestID | Score
-------------------------------------------
34564 | 123 | 11 | 1
77456 | 124 | 11 | 2
25252 | 125 | 11 | 3
Using the data above, I need a query which behaves as follows. Given a request number, look at every record in the Rank table for that request; in this example there are 3 (OfferIds 123, 124 and 125). Return the OfferId that appears the fewest times in the Disposition table for the account number joined to that request. In this example, OfferId 123 appears twice for this account number, OfferId 124 appears once, and OfferId 125 doesn't appear at all, so OfferId 125 should be returned.
The OfferId which exists in the Rank table with the fewest appearances in the Disposition table should always be returned, unless the counts are all the same; in that case, return the OfferId with the lowest value in the Score field. For example, if none of the OfferIds appeared in the Disposition table, OfferId 123 would be returned since its Score value is 1.
Resulting table would look something like this
| OfferId | Score | Dis_Occurrences
---------------------------------------------------------------
| 123 | 1 | 2
| 124 | 2 | 1
| 125 | 3 | 0 <--Return this record
This is what I have so far.
SELECT oRank.OfferId, oRank.Rank_Number, count(oRank.OfferId) AS NumDispositions
From Rank oRank
join Request req
on oRank.RequestId = req.RequestId
join Disposition dis
on oRank.OfferId = dis.OfferId
where req.Customer_Account_Number = 684567 and req.RequestId = 11 and oRank.OfferId = dis.OfferId
group by oRank.Rank_Number, oRank.OfferId
order by NumDispositions, oRank.Rank_Number
My incorrect Resulting table looks like this
| OfferId | Score | Dis_Occurrences
---------------------------------------------------------------
| 123 | 1 | 2
| 124 | 2 | 1
| 125 | 3 | 3
It is counting the total number of times the OfferId appears in the Disposition table, rather than only the appearances for this account number.
EDIT - based on author's comments, here's another version:
Example in SQLFiddle: http://sqlfiddle.com/#!6/d3f99/1/0
with RankReqMap as (
select rnk.OfferId, rnk.Score, reqAcct.AccountNum, reqReq.RequestID
from [Rank] rnk
left join Request reqAcct on reqAcct.RequestID = rnk.RequestID
left join Request reqReq on reqReq.AccountNum = reqAcct.AccountNum
where rnk.RequestID = 11 -- Put your RequestId filter here
)
select oRank.OfferId
,oRank.Score
,count(dis.RequestID) as NumDispositions
from RankReqMap oRank
left join Disposition dis on dis.OfferID = oRank.OfferId
and dis.RequestID = oRank.RequestID
group by oRank.OfferId , oRank.Score
order by NumDispositions, oRank.Score;
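The counting rule (count dispositions only for requests that share the account of the given request, then order by that count and by Score) can be replayed on the question's data. The sketch below uses Python's built-in sqlite3 module so it runs without a SQL Server instance. It is an illustration in SQLite dialect, not the answer's exact query; the alias same_acct is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Disposition (Dis_ID INT, OfferID INT, RequestID INT);
CREATE TABLE Request (RequestID INT, AccountNum INT);
CREATE TABLE "Rank" (RankID INT, OfferId INT, RequestID INT, Score INT);
INSERT INTO Disposition VALUES
  (34564, 123, 9), (77456, 123, 8), (25252, 124, 7),
  (46464, 125, 10), (36464, 125, 6), (35353, 125, 5);
INSERT INTO Request VALUES
  (5, 548543), (6, 548543), (7, 684567), (8, 684567),
  (9, 684567), (10, 548543), (11, 684567);
INSERT INTO "Rank" VALUES
  (34564, 123, 11, 1), (77456, 124, 11, 2), (25252, 125, 11, 3);
""")

# req is the given request; same_acct lists every request of the same account.
# COUNT(dis.Dis_ID) ignores the NULLs produced by the LEFT JOIN, so offers
# without dispositions for this account get 0.
ranked = conn.execute("""
SELECT rnk.OfferId, rnk.Score, COUNT(dis.Dis_ID) AS Dis_Occurrences
FROM "Rank" AS rnk
JOIN Request AS req ON req.RequestID = rnk.RequestID
JOIN Request AS same_acct ON same_acct.AccountNum = req.AccountNum
LEFT JOIN Disposition AS dis
       ON dis.OfferID = rnk.OfferId
      AND dis.RequestID = same_acct.RequestID
WHERE rnk.RequestID = ?
GROUP BY rnk.OfferId, rnk.Score
ORDER BY Dis_Occurrences, rnk.Score
""", (11,)).fetchall()

for row in ranked:
    print(row)
```

The first row of the ordered result is the offer to return; the remaining rows show how the other offers compare.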
ORIGINAL POST
Example in SQLFiddle: http://sqlfiddle.com/#!6/770a8/1/0
This query makes the assumption that you're joining Disposition to Rank based on OfferID, since the RequestIDs for those tables in your example data don't match up. You may have to tweak depending on your needs, but something like the query below should get you the record you're looking for:
-- Gather base data
with RankData as (
select rnk.RankID
,rnk.OfferID
,rnk.RequestID
,rnk.Score
,Dis_Occurrences = count(dis.OfferID)
from dbo.[Rank] rnk
left join dbo.Disposition dis on dis.OfferID = rnk.OfferId
left join dbo.Request req on req.RequestID = rnk.RequestID
group by rnk.RankID, rnk.OfferID, rnk.RequestID, rnk.Score
)
-- Rank count of Dis_Occurrences, taking lowest score into account as a tie breaker
, DispRanking as (
select rdt.*, Dis_Rank = row_number() over (order by Dis_Occurrences asc, rdt.Score asc)
from RankData rdt
)
-- Return only the value with the highest ranking
select * from DispRanking where Dis_Rank = 1
Note also that if you convert the second CTE into a naked SELECT and remove the SELECT statement at the end, you can see all of the records and how they get ranked by the row_number() function:
-- Gather base data
with RankData as (
select rnk.RankID
,rnk.OfferID
,rnk.RequestID
,rnk.Score
,Dis_Occurrences = count(dis.OfferID)
from dbo.[Rank] rnk
left join dbo.Disposition dis on dis.OfferID = rnk.OfferId
left join dbo.Request req on req.RequestID = rnk.RequestID
group by rnk.RankID, rnk.OfferID, rnk.RequestID, rnk.Score
)
-- Output all values, with rankings
select rdt.*, Dis_Rank = row_number() over (order by Dis_Occurrences asc, rdt.Score asc)
from RankData rdt
Good luck!
I think you can use a window function for this:
;with disp as(select offerid, count(*) as ocount
from dispositions group by offerid),
rnk as(select r.offerid,
row_number() over(partition by r.requestid
order by isnull(d.ocount, 0), r.score) rn
from ranks r
left join disp d on r.offerid = d.offerid)
select * from rnk where rn = 1
Can anyone help me with a query? I have this table:
vendorid, agreementid, sales
12001 1004 700
5291 1004 20576
7596 1004 1908
45 103 345
41 103 9087
What is the goal?
When agreementid > 1, show me the row where sales is the highest:
vendorid agreementid sales
5291 1004 20576
41 103 9087
Any ideas ?
Thx
Well, you could try using a CTE and ROW_NUMBER, something like:
;WITH Vals AS (
SELECT *, ROW_NUMBER() OVER(PARTITION BY AgreementID ORDER BY Sales DESC) RowID
FROM MyTable
WHERE AgreementID > 1
)
SELECT *
FROM Vals
WHERE RowID = 1
This will avoid returning multiple records with the same sales value.
If returning ties is OK, you could try something like:
SELECT *
FROM MyTable mt INNER JOIN
(
SELECT AgreementID, MAX(Sales) MaxSales
FROM MyTable
WHERE AgreementID > 1
GROUP BY AgreementID
) MaxVals ON mt.AgreementID = MaxVals.AgreementID AND mt.Sales = MaxVals.MaxSales
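The join against a per-agreement maximum can be tried on the question's data without a SQL Server instance. The sketch below runs it through Python's built-in sqlite3 module; the SQL is SQLite dialect, it assumes the derived table is grouped by agreementid so that there is one maximum per agreement, and the alias mv is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MyTable (vendorid INT, agreementid INT, sales INT);
INSERT INTO MyTable VALUES
  (12001, 1004, 700), (5291, 1004, 20576), (7596, 1004, 1908),
  (45, 103, 345), (41, 103, 9087);
""")

# The derived table holds one row per agreement with its highest sales value;
# joining back on both columns keeps only the rows that reach that value.
top_rows = conn.execute("""
SELECT mt.vendorid, mt.agreementid, mt.sales
FROM MyTable AS mt
JOIN (SELECT agreementid, MAX(sales) AS max_sales
      FROM MyTable
      WHERE agreementid > 1
      GROUP BY agreementid) AS mv
  ON mt.agreementid = mv.agreementid
 AND mt.sales = mv.max_sales
ORDER BY mt.agreementid DESC
""").fetchall()

for row in top_rows:
    print(row)
```

If two vendors tie for the highest sales within an agreement, this form returns both of them, which is the difference from the ROW_NUMBER version.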
SELECT TOP 1 WITH TIES *
FROM MyTable
ORDER BY DENSE_RANK() OVER(PARTITION BY agreementid ORDER BY SIGN (SIGN (agreementid - 2) + 1) * sales DESC)
Explanation
We break table MyTable into partitions by agreementid.
For each partition we construct a ranking of its rows.
If agreementid is greater than 1 ranking will be equal to ORDER BY sales DESC.
Otherwise ranking for every single row in partition will be the same: ORDER BY 0 DESC.
See how it looks:
SELECT *
, SIGN (SIGN (agreementid - 2) + 1) * sales AS x
, DENSE_RANK() OVER(PARTITION BY agreementid ORDER BY SIGN (SIGN (agreementid - 2) + 1) * sales DESC) AS rnk
FROM MyTable
+----------+-------------+-------+-------+-----+
| vendorid | agreementid | sales | x | rnk |
+----------+-------------+-------+-------+-----+
| 0 | 0 | 3 | 0 | 1 |
| -1 | 0 | 7 | 0 | 1 |
| 0 | 1 | 3 | 0 | 1 |
| -1 | 1 | 7 | 0 | 1 |
| 41 | 103 | 9087 | 9087 | 1 |
| 45 | 103 | 345 | 345 | 2 |
| 5291 | 1004 | 20576 | 20576 | 1 |
| 7596 | 1004 | 1908 | 1908 | 2 |
| 12001 | 1004 | 700 | 700 | 3 |
+----------+-------------+-------+-------+-----+
Then using TOP 1 WITH TIES construction we leave only rows where rnk equals 1.
You can try it like this:
SELECT TOP 1 sales FROM MyTable WHERE agreementid > 1 ORDER BY sales DESC
I really do not know the business logic behind agreement_id > 1. It looks to me you want the max sales (with ties) by agreement id regardless of vendor_id.
First, let's create a simple sample table.
-- Sample table
create table #sales
(
vendor_id int,
agreement_id int,
sales_amt money
);
-- Sample data
insert into #sales values
(12001, 1004, 700),
(5291, 1004, 20576),
(7596, 1004, 1908),
(45, 103, 345),
(41, 103, 9087);
Second, let's solve this problem using a common table expression to get a result set that has each row paired with the max sales by agreement id.
The select statement just applies the business logic to filter the data to get your answer.
-- CTE = max sales for each agreement id
;
with cte_sales as
(
select
vendor_id,
agreement_id,
sales_amt,
max(sales_amt) OVER(PARTITION BY agreement_id) AS max_sales
from
#sales
)
-- Filter by your business logic
select * from cte_sales where sales_amt = max_sales and agreement_id > 1;
The screen shot below shows the exact result you wanted.