SSRS 2008 R2 - evaluating running total only on change of group - sql-server

I have a report where I capture patient information, some of which is stored in the patient table and some of which is stored in the observations table. Taking date of birth as my example, if I count all the records for which the DOB has been supplied, I get significantly more than the total number of patients, because of the join to the observations table. How do I evaluate the running total only once for each group?
Edit: some sample data over at http://sqlfiddle.com/#!3/27b91/1/0. If I count birthdates from that query, I want 2 as the answer; same for race and ethnicity.

The following may or may not be the right approach for your specific situation, but it can be a useful technique to have at your disposal.
You can add some code to your select statement to help yourself answer questions like these 'downstream' (either via added criteria or via SSRS). See this modification of your SQL Fiddle:
select pid, firstName, lastName, dateOfBirth, obsName, obsValue, obsDate,
rowRank, CASE rowRank WHEN 1 THEN 1 ELSE 0 END AS countableRow
from
(
select Person.pid, Person.firstName, Person.lastName, Person.dateOfBirth
, Obs.obsName, Obs.obsValue, Obs.obsDate,
ROW_NUMBER() OVER (PARTITION BY Person.pid, Person.firstName, Person.lastName, Person.dateOfBirth ORDER BY Obs.obsDate) AS rowRank
from Person
join Obs on Person.pId = Obs.pId
) rankedData
The rowRank field will create a group-relative ranking number, which may or may not be useful to you downstream. The countableRow field will be either 1 or 0 such that each group will have one and only one row with a 1 in it. Doing SUM(countableRow) will give you the proper number of groups in your data.
Now, you can extend this functionality (if you wish) by dumping out actual field values instead of a constant scalar like 1 in the first row of each group. So, if you had CASE rowRank WHEN 1 THEN dateOfBirth ELSE NULL END AS countableDOB, you could then, for example, get the total number of people with each distinct birthday using just this dataset.
Of course, you can do all those things using methods like @Russell's in plain SQL anyway, so this is most relevant when you have specific downstream requirements, which may not match your situation.
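As a concrete check of the countableRow idea, here is a runnable sketch using Python's sqlite3 with an invented miniature version of the Person/Obs schema (SQLite's window functions stand in for SQL Server's):

```python
import sqlite3

# Invented miniature Person/Obs tables, just to show that the join inflates
# the row count while SUM(countableRow) still equals the number of patients.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (pid INTEGER, firstName TEXT, dateOfBirth TEXT);
CREATE TABLE Obs (pid INTEGER, obsName TEXT, obsValue TEXT, obsDate TEXT);
INSERT INTO Person VALUES (1, 'Ann', '1980-01-01'), (2, 'Bob', '1975-05-05');
INSERT INTO Obs VALUES
  (1, 'RACE', 'X', '2020-01-01'),
  (1, 'ETH',  'Y', '2020-02-01'),
  (2, 'RACE', 'Z', '2020-03-01');
""")

rows = conn.execute("""
SELECT pid, rowRank,
       CASE rowRank WHEN 1 THEN 1 ELSE 0 END AS countableRow
FROM (
  SELECT Person.pid,
         ROW_NUMBER() OVER (PARTITION BY Person.pid
                            ORDER BY Obs.obsDate) AS rowRank
  FROM Person
  JOIN Obs ON Person.pid = Obs.pid
) rankedData
""").fetchall()

# The join produces 3 rows, but summing countableRow gives 2 -- one per patient.
print(len(rows), sum(r[2] for r in rows))  # 3 2
```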
EDIT
Obviously the countableRow field there isn't a one-size-fits-all solution to the types of queries you want. I have added a few more examples of the PARTITION BY strategy to another SQL Fiddle:
select pid, firstName, lastName, dateOfBirth, obsName, obsValue, obsDate,
rowRank, CASE rowRank WHEN 1 THEN 1 ELSE 0 END AS countableRow,
valueRank, CASE valueRank WHEN 1 THEN 1 ELSE 0 END AS valueCount,
dobRank, CASE WHEN dobRank = 1 AND dateOfBirth IS NOT NULL THEN 1 ELSE 0 END AS dobCount
from
(
select Person.pid, Person.firstName, Person.lastName, Person.dateOfBirth
, Obs.obsName, Obs.obsValue, Obs.obsDate,
ROW_NUMBER() OVER (PARTITION BY Person.pid, Person.firstName, Person.lastName, Person.dateOfBirth ORDER BY Obs.obsDate) AS rowRank,
ROW_NUMBER() OVER (PARTITION BY Obs.obsName, Obs.obsValue ORDER BY Obs.obsDate) AS valueRank,
ROW_Number() OVER (PARTITION BY Person.dateOfBirth ORDER BY Person.pid) AS dobRank
from Person
join Obs on Person.pId = Obs.pId
) rankedData
Lest anyone misunderstand me as suggesting this is always appropriate: it obviously isn't. It isn't a better way to get specific answers than running additional SQL queries. What it does allow is encoding enough information in a single result set for the consuming code to answer such questions simply. That's where it can come in handy.
SECOND EDIT
Since you were wondering whether you can do this if race data is stored in more than one place, the answer is, absolutely. I have revised the code from my previous SQL Fiddle, which is now available in a new one:
select pid, firstName, lastName, dateOfBirth, obsName, obsValue, obsDate,
rowRank, CASE rowRank WHEN 1 THEN 1 ELSE 0 END AS countableRow,
valueRank, CASE valueRank WHEN 1 THEN 1 ELSE 0 END AS valueCount,
dobRank, CASE WHEN dobRank = 1 AND dateOfBirth IS NOT NULL THEN 1 ELSE 0 END AS dobCount,
raceRank, CASE WHEN raceRank = 1 AND (race IS NOT NULL OR obsName = 'RACE') THEN 1 ELSE 0 END AS raceCount
from
(
select Person.pid, Person.firstName, Person.lastName, Person.dateOfBirth, Person.[race]
, Obs.obsName, Obs.obsValue, Obs.obsDate,
ROW_NUMBER() OVER (PARTITION BY Person.pid, Person.firstName, Person.lastName, Person.dateOfBirth ORDER BY Obs.obsDate) AS rowRank,
ROW_NUMBER() OVER (PARTITION BY Obs.obsName, Obs.obsValue ORDER BY Obs.obsDate) AS valueRank,
ROW_NUMBER() OVER (PARTITION BY Person.dateOfBirth ORDER BY Person.pid) AS dobRank,
ROW_NUMBER() OVER (PARTITION BY ISNULL(Person.race, CASE Obs.obsName WHEN 'RACE' THEN Obs.obsValue ELSE NULL END) ORDER BY Person.pid) AS raceRank
from Person
left join Obs on Person.pId = Obs.pId
) rankedData
As you can see, in the new Fiddle, this correctly counts the number of races as 3, with two in the Obs table and the third in the Person table. The trick is that PARTITION BY can contain expressions, not just raw column references. Note that I changed the join to a left join here, and that we need a CASE expression to include obsValue only when obsName is 'RACE'. It is a little complicated, but not overwhelmingly so, and it handles even fairly complex cases gracefully.
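Here is a small runnable sketch of the expression-in-PARTITION-BY idea, using Python's sqlite3 with invented sample rows (SQLite's COALESCE stands in for T-SQL's ISNULL):

```python
import sqlite3

# Invented sample data: race lives in Person.race for one patient and in an
# Obs row named 'RACE' for the other two, as the answer describes.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (pid INTEGER, race TEXT);
CREATE TABLE Obs (pid INTEGER, obsName TEXT, obsValue TEXT);
INSERT INTO Person VALUES (1, NULL), (2, NULL), (3, 'White');
INSERT INTO Obs VALUES (1, 'RACE', 'Asian'), (2, 'RACE', 'Black');
""")

rows = conn.execute("""
SELECT pid, raceRank,
       CASE WHEN raceRank = 1 THEN 1 ELSE 0 END AS raceCount
FROM (
  SELECT Person.pid,
         ROW_NUMBER() OVER (
           PARTITION BY COALESCE(Person.race,
                                 CASE WHEN Obs.obsName = 'RACE'
                                      THEN Obs.obsValue END)
           ORDER BY Person.pid) AS raceRank
  FROM Person
  LEFT JOIN Obs ON Person.pid = Obs.pid
) rankedData
""").fetchall()

# Three distinct race values across the two tables -> SUM(raceCount) = 3.
print(sum(r[2] for r in rows))  # 3
```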

It turned out that Jeroen's pointer to RunningValue was more on-target than I thought. I was able to get the results I wanted with the following code:
=RunningValue(Iif(Not IsNothing(Fields!DATEOFBIRTH.Value)
, Fields!PATIENTID.Value
, Nothing)
, CountDistinct
, Nothing
)
Thanks particularly to Dominic P, whose technique I'll keep in mind for next time.

This will only pull one record per patient, unless they reported different DOBs:
SELECT P.FOO,
P.BAR,
(etc.),
O.DOB
FROM Patients P
INNER JOIN Observations O
ON P.PatientID = O.PatientID
GROUP BY P.FOO, P.BAR, (P.etc), O.DOB

Related

Need guidance in using LISTAGG with Regular Expression

SELECT
ID_Col,
lower(LISTAGG(distinct TEXT_COL, ',') WITHIN GROUP (ORDER BY TEXT_COL))
AS TEXT_COL_TXT
FROM
(SELECT
CREATE_DT,
ID_Col,
TEXT_COL,
TRY_CAST(Q_NO as INTEGER) as Q_NO
FROM db_name.schema_name.tbl_name
WHERE Flg = '0'
AND date_of_cr = '2022-02-05'
AND P_CODE NOT IN ('1','2','3','4')
AND ID_Col IN('12345','23456')
ORDER BY Q_NO)
GROUP BY 1;
When I run the above query, I'm getting results like this:
ID_COL TEXT_COL
12345 ::abcd::0,aforapple
23456 ::abcd::0,n:sometext:::empty::
I want this value to be removed in the result --> ::abcd::0,
The result should look like below:
ID_COL TEXT_COL
12345 aforapple
23456 n:sometext:::empty::
Can anyone guide me on how to produce that result?
When I use the below logic, I now see a comma in the results:
LISTAGG(distinct iff(TEXT_COL = '::abcd::0', '', TEXT_COL),',')
Result I could see is:
ID_COL TEXT_COL
12345 ,aforapple
23456 ,n:sometext:::empty::
I should not display the comma in the results.
It would be nice if you started accepting answers: https://stackoverflow.com/help/accepted-answer
It would also be nice if you posted the minimal required SQL.
So, you posted this SQL:
SELECT
ID_Col,
lower(LISTAGG(distinct TEXT_COL,',') WITHIN GROUP (ORDER BY TEXT_COL)) AS TEXT_COL_TXT
FROM
(
SELECT
CREATE_DT,
ID_Col,
TEXT_COL,
TRY_CAST(Q_NO as INTEGER) as Q_NO
FROM db_name.schema_name.tbl_name
WHERE Flg = '0'
AND date_of_cr = '2022-02-05'
AND P_CODE NOT IN ('1','2','3','4')
AND ID_Col IN('12345','23456')
ORDER BY Q_NO
)
GROUP BY 1;
So, because your filters on db_name.schema_name.tbl_name have zero impact on the LISTAGG question, they can be dropped. The ORDER BY should also be removed; SQL Server, for example, will fail on this SQL, because it doesn't make much sense to order a sub-select. Thus it can become:
SELECT
ID_Col,
lower(LISTAGG(distinct TEXT_COL,',') WITHIN GROUP (ORDER BY TEXT_COL)) AS TEXT_COL_TXT
FROM
(
SELECT
ID_Col,
TEXT_COL
FROM db_name.schema_name.tbl_name
)
GROUP BY 1;
But actually that can become:
SELECT
ID_Col,
lower(LISTAGG(distinct TEXT_COL,',') WITHIN GROUP (ORDER BY TEXT_COL)) AS TEXT_COL_TXT
FROM db_name.schema_name.tbl_name
GROUP BY 1;
Now, if you want to be friendly, you can provide some working data in a table:
ID_COL  TEXT_COL
12345   ::abcd::0
12345   aforapple
23456   ::abcd::0
23456   n:sometext:::empty::
Or you could provide the data inline in the small example query that you provide:
SELECT
column1,
lower(LISTAGG(distinct column2,',') WITHIN GROUP (ORDER BY column2)) AS TEXT_COL_TXT
FROM VALUES
(12345, '::abcd::0'),
(12345, 'aforapple'),
(23456, '::abcd::0'),
(23456, 'n:sometext:::empty::')
GROUP BY 1;
There is a huge benefit in paring your SQL down to the smallest reproducible example. Sometimes, as you remove the unneeded bits, you can see the bigger picture and notice the mistake. Sometimes, as you pull things out, you undo a part that you didn't fully understand, and you end up with smaller code that works, plus just a little more that does not; that is enough to tell you which command needs to be reread in the help to understand the interactions.
Try to apply IFF(), which is similar to CASE WHEN: https://docs.snowflake.com/en/sql-reference/functions/iff.html
LISTAGG(distinct iff(TEXT_COL = '::abcd::0', '', TEXT_COL),',')
Logic described:
If TEXT_COL = '::abcd::0' THEN use an empty string, ELSE use TEXT_COL for concatenation in LISTAGG.
You could use NULLIF, which will set the column value to NULL when it matches ::abcd::0. LISTAGG ignores NULLs in aggregation:
listagg(distinct nullif(text_col,'::abcd::0'),',') within group (order by text_col)
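For anyone who wants to verify the NULLIF approach without a Snowflake account, here is a sketch using Python's sqlite3, where GROUP_CONCAT stands in for LISTAGG (both aggregates skip NULLs); the sample rows are the ones from the question:

```python
import sqlite3

# Sample rows from the question; '::abcd::0' is the value to suppress.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id_col INTEGER, text_col TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [
    (12345, '::abcd::0'), (12345, 'aforapple'),
    (23456, '::abcd::0'), (23456, 'n:sometext:::empty::'),
])

# NULLIF turns the unwanted value into NULL; GROUP_CONCAT ignores NULLs,
# so no stray leading comma appears in the aggregated string.
rows = conn.execute("""
SELECT id_col,
       LOWER(GROUP_CONCAT(DISTINCT NULLIF(text_col, '::abcd::0')))
FROM t
GROUP BY id_col
ORDER BY id_col
""").fetchall()

print(rows)  # [(12345, 'aforapple'), (23456, 'n:sometext:::empty::')]
```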

How to get the latest not null value from multiple columns in SQL or Azure Synapse

I have data like in the below format
I want output in the below format
Please help me with the SQL code. Thanks !
Like I mentioned in the comments, you need to fix whatever it is that's inserting the data so that values aren't lost and don't become NULL in "newer" rows.
To get the results you want, you're going to have to use row numbering and conditional aggregation, which is going to get messier the more columns you have; that is why you need to fix the real problem. It will look something like this:
WITH CTE AS(
SELECT GroupingColumn,
NullableCol1,
NullableCol2,
DateColumn,
CASE WHEN NullableCol1 IS NOT NULL THEN ROW_NUMBER() OVER (PARTITION BY GroupingColumn, CASE WHEN NullableCol1 IS NULL THEN 1 ELSE 0 END ORDER BY DateColumn DESC) END AS NullableCol1RN,
CASE WHEN NullableCol2 IS NOT NULL THEN ROW_NUMBER() OVER (PARTITION BY GroupingColumn, CASE WHEN NullableCol2 IS NULL THEN 1 ELSE 0 END ORDER BY DateColumn DESC) END AS NullableCol2RN
FROM dbo.YourTable)
SELECT GroupingColumn,
MAX(CASE NullableCol1RN WHEN 1 THEN NullableCol1 END) AS NullableCol1,
MAX(CASE NullableCol2RN WHEN 1 THEN NullableCol2 END) AS NullableCol2,
MAX(DateColumn) AS DateColumn
FROM CTE
GROUP BY GroupingColumn;
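Since no sample data was posted, here is a runnable sketch of the row-numbering plus conditional-aggregation pattern with invented rows, using Python's sqlite3 (the column names follow the placeholder schema above):

```python
import sqlite3

# Invented rows: each nullable column's latest non-NULL value sits in a
# different row, which is exactly the case the pattern is meant to handle.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE YourTable (GroupingColumn TEXT, NullableCol1 TEXT,
                        NullableCol2 TEXT, DateColumn TEXT);
INSERT INTO YourTable VALUES
  ('A', 'x1', NULL, '2023-01-01'),
  ('A', NULL, 'y2', '2023-01-02'),
  ('A', 'x3', NULL, '2023-01-03');
""")

row = conn.execute("""
WITH CTE AS (
  SELECT GroupingColumn, NullableCol1, NullableCol2, DateColumn,
         ROW_NUMBER() OVER (PARTITION BY GroupingColumn,
                            CASE WHEN NullableCol1 IS NULL THEN 1 ELSE 0 END
                            ORDER BY DateColumn DESC) AS Col1RN,
         ROW_NUMBER() OVER (PARTITION BY GroupingColumn,
                            CASE WHEN NullableCol2 IS NULL THEN 1 ELSE 0 END
                            ORDER BY DateColumn DESC) AS Col2RN
  FROM YourTable)
SELECT GroupingColumn,
       MAX(CASE WHEN Col1RN = 1 AND NullableCol1 IS NOT NULL
                THEN NullableCol1 END) AS LatestCol1,
       MAX(CASE WHEN Col2RN = 1 AND NullableCol2 IS NOT NULL
                THEN NullableCol2 END) AS LatestCol2,
       MAX(DateColumn) AS LatestDate
FROM CTE
GROUP BY GroupingColumn
""").fetchone()

print(row)  # ('A', 'x3', 'y2', '2023-01-03')
```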

Transaction data aggregate

As a disclaimer, I am not entirely sure the title of the question is best, if not I apologize.
I am trying to calculate cycle times for individuals, but files are occasionally transferred out of their work queues and eventually back. There are no unique transaction IDs recorded just a date and time stamp.
I tried looking for an aggregate GROUP BY function and was told that is not a feature SQL Server has.
I started by trying to identify the first and last transaction and was going to build out the query from there but it wasn't too helpful. Any insight would be very helpful.
Changedate is when the transfer from one person to another is recorded (year, month, day, time).
select a.claimId,
a.claimincidentID,
cast(a.changeDate as date) changedate,
a.claimNum,
a.Coverage,
a.AssignedAdjID,
a.AssignedAdj,
a.AssignedUnit,
a.TransferedAdjID,
a.TransferedAdj,
a.TransferedUnit,
a.usertypeid,
a.ChangedBy,
b.Feature_Create_Date,
DATEDIFF(day, b.Feature_Create_Date, a.changedate) transfer1,
cast(FIRST_VALUE(changeDate) OVER (ORDER BY changedate ASC)as date) AS firstchangedate,
cast(LAST_VALUE(changeDate) OVER (ORDER BY a.changedate ASC)as date) AS lastchangedate
from DB1.dbo.Assign_Transfer a
left join DB2.claimslist b on a.claimid=b.claimId
group by a.claimId, a.claimincidentID, a.changeDate, a.claimNum, a.Coverage, a.AssignedAdjID, a.AssignedAdj, a.AssignedUnit, a.TransferedAdjID, a.TransferedAdj, a.TransferedUnit, a.usertypeid, a.ChangedBy, b.Feature_Create_Date
Think of each of these rows as a Start (because the most recent one hasn't ended)
We would need to generate the complement End for this person in the chain.
Then with pairs of Start/End one could create GrossDuration.
Even after we get an assignment's start and end date/time,
we will have workday (8-4, or 9-5, or noon-8, ...) considerations,
also Sat/Sun/Hol and Vacation/out-of-office.
All of which affect Duration--- For Each Person differently.
Which would need to be factored by workday/etc into AdjDuration.
Let's say we can sequence these:
Row_Number() Over (Partition by claimID Order by changeDate) as tfrNum
Assigned is the prior, and Transfered is the next
1, 2, 3, ... thru N
V
a.changeDate -- NOW()
V V
a.AssignedAdjID, | a.TransferedAdjID,
a.AssignedAdj, | a.TransferedAdj,
a.AssignedUnit, | a.TransferedUnit,
|
a.usertypeid,
a.ChangedBy,
So, is tfrNum=1 or tfrNum=N the oddball??
Let's look at pairs: each pair goes StartFrom->EndTo
1-2, 2-3, 3-4, 4-5, 5-6, 6-Now
----
From row1 we get TransferredID Start(changeDate) and
from row2 we get AssignedAdjID End (changeDate)
-- 2-3, 3-4, 4-5, etc. repeating
-- except for:
From row 6 we get TransferredID Start (changeDate) and
from default (still them) End (Now)
-- except again when TransferredUnit is "Closed"
After getting these pairs and their Start and End, we can do the Duration calc.
I need to visualize this problem before I try to run some sql. Real data would help.
Let's start with this; I would expand on it later, after you get it working and look at some data:
With cte_tfrNum (claimID, changeDate, tfrNum, tfrMax) AS
(
SELECT
a.claimId
,a.changeDate
,ROW_NUMBER() Over ( Partition By a.claimId Order By a.changeDate) as tfrNum
,b.tfrMax
FROM DB1.dbo.Assign_Transfer a
-- just for giggles, lets also get the max# of transfers for this claim
Left Join
(SELECT claimId, COUNT(*) as tfrMax
FROM DB1.dbo.Assign_Transfer
Group By claimId
) as b
On b.claimId = a.claimId
)
-- Statement using the CTE
Select
tfrTo.*
From cte_tfrNum as tfrTo
Thank you! I was able to take what you gave me and add a few things so I could look at what I needed:
select
case when abc.tfrMax > abc.tfrnum then datediff(day,lag(abc.changedate) over(partition by abc.claimID order by abc.claimId),abc.changeDate)
when abc.tfrMax = abc.tfrnum then datediff(day,lag(abc.changedate) over(partition by abc.claimID order by abc.claimId),abc.changeDate)
end as test
, abc.*
from
(
SELECT
a.claimId
,a.changeDate
,a.AssignedAdj
,a.TransferedAdj
,a.Coverage
,ROW_NUMBER() Over ( Partition By a.claimId Order By a.changeDate) as tfrNum
,b.tfrMax
FROM db1.dbo.Assign_Transfer a
Left Join
(SELECT claimId, COUNT(*) as tfrMax
FROM db1.dbo.Assign_Transfer
Group By claimId
) as b
On b.claimId = a.claimId
) abc
group by
abc.claimId
,abc.changeDate
,abc.AssignedAdj
,abc.TransferedAdj
,abc.Coverage
,abc.tfrMax
,abc.tfrNum

T-SQL - Get last as-at date SUM(Quantity) was not negative

I am trying to find a way to get the last date, by location and product, on which a sum was positive. The only way I can think to do it is with a cursor, and if that's the case I may as well just do it in code. Before I go down that route, I was hoping someone might have a better idea.
Table:
Product, Date, Location, Quantity
The scenario is; I find the quantity by location and product at a particular date, if it is negative i need to get the sum and date when the group was last positive.
select
Product,
Location,
SUM(Quantity) Qty,
SUM(Value) Value
from
ProductTransactions PT
where
Date <= @AsAtDate
group by
Product,
Location
I am looking for the last date where the sum of the transactions prior to and including it is positive.
Based on your revised question and your comment, here another solution I hope answers your question.
select Product, Location, max(Date) as Date
from (
select a.Product, a.Location, a.Date from ProductTransactions as a
join ProductTransactions as b
on a.Product = b.Product and a.Location = b.Location
where b.Date <= a.Date
group by a.Product, a.Location, a.Date
having sum(b.Value) >= 0
) as T
group by Product, Location
The subquery (table T) produces a list of {product, location, date} rows for which the sum of the values prior (and inclusive) is positive. From that set, we select the last date for each {product, location} pair.
This can be done in a set based way using windowed aggregates in order to construct the running total. Depending on the number of rows in the table this could be a bit slow but you can't really limit the time range going backwards as the last positive date is an unknown quantity.
I've used a CTE for convenience to construct the aggregated data set but converting that to a temp table should be faster. (CTEs get executed each time they are called whereas a temp table will only execute once.)
The basic theory is to construct the running totals for all of the previous days using the OVER clause to partition and order the SUM aggregates. This data set is then used and filtered to the expected date. When a row in that table has a quantity less than zero it is joined back to the aggregate data set for all previous days for that product and location where the quantity was greater than zero.
Since this may return multiple positive date rows the ROW_NUMBER() function is used to order the rows based on the date of the positive quantity day. This is done in descending order so that row number 1 is the most recent positive day. It isn't possible to use a simple MIN() here because the MIN([Date]) may not correspond to the MIN(Quantity).
WITH x AS (
SELECT [Date],
Product,
[Location],
SUM(Quantity) OVER (PARTITION BY Product, [Location] ORDER BY [Date] ASC) AS Quantity,
SUM([Value]) OVER(PARTITION BY Product, [Location] ORDER BY [Date] ASC) AS [Value]
FROM ProductTransactions
WHERE [Date] <= @AsAtDate
)
SELECT [Date], Product, [Location], Quantity, [Value], Positive_date, Positive_date_quantity
FROM (
SELECT x1.[Date], x1.Product, x1.[Location], x1.Quantity, x1.[Value],
x2.[Date] AS Positive_date, x2.[Quantity] AS Positive_date_quantity,
ROW_NUMBER() OVER (PARTITION BY x1.Product, x1.[Location] ORDER BY x2.[Date] DESC) AS Positive_date_row
FROM x AS x1
LEFT JOIN x AS x2 ON x1.Product=x2.Product AND x1.[Location]=x2.[Location]
AND x2.[Date]<x1.[Date] AND x1.Quantity<0 AND x2.Quantity>0
WHERE x1.[Date] = @AsAtDate
) AS y
WHERE Positive_date_row=1
Do you mean that you want to get the date on which the running quantity first comes back to positive within the group?
If you are using SQL Server 2012+, for example:
In the following scenario, when the date reaches 01/03/2017 the running sum of quantity comes to 1 (-10+5+6).
Is it possible for the quantity on a following date to become negative again?
;WITH tb(Product, Location,[Date],Quantity) AS(
SELECT 'A','B',CONVERT(DATETIME,'01/01/2017'),-10 UNION ALL
SELECT 'A','B','01/02/2017',5 UNION ALL
SELECT 'A','B','01/03/2017',6 UNION ALL
SELECT 'A','B','01/04/2017',2
)
SELECT t.Product,t.Location,SUM(t.Quantity) AS Qty,MIN(CASE WHEN t.CurrentSum>0 THEN t.Date ELSE NULL END ) AS LastPositiveDate
FROM (
SELECT *,SUM(tb.Quantity)OVER(ORDER BY [Date]) AS CurrentSum FROM tb
) AS t GROUP BY t.Product,t.Location
Product Location Qty LastPositiveDate
------- -------- ----------- -----------------------
A B 3 2017-01-03 00:00:00.000
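The same running-total example can be checked with Python's sqlite3 (SQLite's windowed SUM behaves like SQL Server's here); the rows are the ones from the CTE above:

```python
import sqlite3

# Sample rows from the answer's CTE: running sums are -10, -5, 1, 3, so the
# running total first turns positive on 2017-01-03.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tb (Product TEXT, Location TEXT, Date TEXT, Quantity INTEGER)")
conn.executemany("INSERT INTO tb VALUES (?,?,?,?)", [
    ('A', 'B', '2017-01-01', -10),
    ('A', 'B', '2017-01-02', 5),
    ('A', 'B', '2017-01-03', 6),
    ('A', 'B', '2017-01-04', 2),
])

row = conn.execute("""
SELECT Product, Location, SUM(Quantity) AS Qty,
       MIN(CASE WHEN CurrentSum > 0 THEN Date END) AS LastPositiveDate
FROM (
  SELECT *, SUM(Quantity) OVER (PARTITION BY Product, Location
                                ORDER BY Date) AS CurrentSum
  FROM tb
) t
GROUP BY Product, Location
""").fetchone()

print(row)  # ('A', 'B', 3, '2017-01-03')
```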

Is there a way to combine these queries?

I have begun working some of the programming problems on HackerRank as a "productive distraction".
I was working on the first few in the SQL section and came across this problem (link):
Query the two cities in STATION with the shortest and
longest CITY names, as well as their respective lengths
(i.e.: number of characters in the name). If there is
more than one smallest or largest city, choose the one
that comes first when ordered alphabetically.
Input Format
The STATION table is described as follows:
where LAT_N is the northern latitude and LONG_W is
the western longitude.
Sample Input
Let's say that CITY only has four entries:
1. DEF
2. ABC
3. PQRS
4. WXY
Sample Output
ABC 3
PQRS 4
Explanation
When ordered alphabetically, the CITY names are listed
as ABC, DEF, PQRS, and WXY, with the respective lengths
3, 3, 4 and 3. The longest-named city is obviously PQRS,
but there are options for shortest-named city; we choose
ABC, because it comes first alphabetically.
I agree that this requirement could be written much more clearly, but the basic gist is pretty easy to get, especially with the clarifying example. The question I have, though, occurred to me because the instructions given in the comments for the question read as follows:
/*
Enter your query here.
Please append a semicolon ";" at the end of the query and
enter your query in a single line to avoid error.
*/
Now, writing a query on a single line doesn't necessarily imply a single query, though that seems to be the intended thrust of the statement. However, I was able to pass the test case using the following submission (submitted on 2 lines, with a carriage return in between):
SELECT TOP 1 CITY, LEN(CITY) FROM STATION ORDER BY LEN(CITY), CITY;
SELECT TOP 1 CITY, LEN(CITY) FROM STATION ORDER BY LEN(CITY) DESC, CITY;
Again, none of this is advanced SQL. But it got me thinking. Is there a non-trivial way to combine this output into a single results set? I have some ideas in mind where the WHERE clause basically adds some sub-queries in an OR statement to combine the two queries into one. Here is another submission I had that passed the test case:
SELECT
CITY,
LEN(CITY)
FROM
STATION
WHERE
ID IN (SELECT TOP 1 ID FROM STATION ORDER BY LEN(CITY), CITY)
OR
ID IN (SELECT TOP 1 ID FROM STATION ORDER BY LEN(CITY) DESC, CITY)
ORDER BY
LEN(CITY), CITY;
And, yes, I realize that the final , CITY in the final ORDER BY clause is superfluous, but it kind of makes the point that this query hasn't really saved that much effort, especially against returning the query results separately.
Note: This isn't a true MAX and MIN situation. Given the following input, you aren't actually taking the first and last rows:
Sample Input
1. ABC
2. ABCD
3. ZYXW
Based on the requirements as written, you'd take #1 and #2, not #1 and #3.
This makes me think that my solutions actually might be the most efficient way to accomplish this, but my set-based thinking could always use some strengthening, and I'm not sure if that might play in here or not.
Here's another alternative. I think it's pretty straightforward and easy to understand what's going on. Performance is good.
Still has a couple of sub-queries though.
select
min(City), len(City)
from Station
group by
len(City)
having
len(City) = (select min(len(City)) from Station)
or
len(City) = (select max(len(City)) from Station)
Untested as well, but I don't see a reason for it not to work:
SELECT *
FROM (
SELECT TOP (1) CITY, LEN(CITY) AS CITY_LEN
FROM STATION
ORDER BY CITY_LEN, CITY
) AS T
UNION ALL
SELECT *
FROM (
SELECT TOP (1) CITY, LEN(CITY) AS CITY_LEN
FROM STATION
ORDER BY CITY_LEN DESC, CITY
) AS T2;
You can't have UNION ALL with an ORDER BY for each SELECT statement, but you can work around it by using subqueries together with a TOP (1) clause and ORDER BY.
UNTESTED:
WITH CTE AS (
Select ID, City, len(City) as CityLen, row_number() over (order by City) as AlphaRN,
row_number() over (order by len(City) desc) as LenRN
from Station)
Select * from cte
Where AlphaRN = 1 and (lenRN = (select max(lenRN) from cte) or
lenRN = (Select min(LenRN) from cte))
Here's the best I could come up with:
with Ordering as
(
select
City,
Forward = row_number() over (order by len(City), City),
Backward = row_number() over (order by len(City) desc, City)
from
Station
)
select City, len(City) from Ordering where 1 in (Forward, Backward);
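The Forward/Backward trick is easy to verify with Python's sqlite3, using the four sample cities from the problem statement (SQLite's LENGTH stands in for T-SQL's LEN):

```python
import sqlite3

# Sample cities from the HackerRank problem: shortest alphabetically-first
# is ABC (3), longest is PQRS (4).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Station (City TEXT)")
conn.executemany("INSERT INTO Station VALUES (?)",
                 [('DEF',), ('ABC',), ('PQRS',), ('WXY',)])

rows = conn.execute("""
WITH Ordering AS (
  SELECT City,
         ROW_NUMBER() OVER (ORDER BY LENGTH(City), City) AS Forward,
         ROW_NUMBER() OVER (ORDER BY LENGTH(City) DESC, City) AS Backward
)
SELECT City, LENGTH(City) FROM Ordering
WHERE 1 IN (Forward, Backward)
ORDER BY LENGTH(City)
""".replace("ROW_NUMBER() OVER (ORDER BY LENGTH(City) DESC, City) AS Backward",
            "ROW_NUMBER() OVER (ORDER BY LENGTH(City) DESC, City) AS Backward FROM Station")
).fetchall()

print(rows)  # [('ABC', 3), ('PQRS', 4)]
```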
There are definitely a lot of ways to approach this as evidenced by the variety of answers, but I don't think anything beats your original two-query solution in terms of cleanly and concisely expressing the intended behavior. Interesting question, though!
This is what I came up with. I tried to use only one query, without CTEs or sub-queries.
;WITH STATION AS ( --Dummy table
SELECT *
FROM (VALUES
(1,'DEF','EU',1,9),
(2,'ABC','EU',1,6), -- This is shortest
(3,'PQRS','EU',1,5),
(4,'WXY','EU',1,4),
(5,'FGHA','EU',1,2),
(6,'ASDFHG','EU',1,3) --This is longest
) as t(ID, CITY, [STATE], LAT_N,LONG_W)
)
SELECT TOP 1 WITH TIES CITY,
LEN(CITY) as CITY_LEN
FROM STATION
ORDER BY ROW_NUMBER() OVER(PARTITION BY LEN(CITY) ORDER BY LEN(CITY) ASC),
CASE WHEN MAX(LEN(CITY)) OVER (ORDER BY (SELECT NULL)) = LEN(CITY)
OR MIN(LEN(CITY)) OVER (ORDER BY (SELECT NULL))= LEN(CITY)
THEN 0 ELSE 1 END
Output:
CITY CITY_LEN
ABC 3
ASDFHG 6
select min(CITY), length(CITY)
from STATION
group by length(CITY)
having length(CITY) = (select min(length(CITY)) from STATION)
or length(CITY) = (select max(length(CITY)) from STATION);