Distinct on one column with additional criteria to select non-unique rows - sql-server

Okay, I know a lot of variations on this question have been asked, but here I go.
I'm starting with this query.
SELECT lprArchived, lprReportId, lprOwner
FROM ReportIndex
WHERE lprArchived = 1
In most cases, each row returned will have a unique value in the lprReportId column. However, for cases where multiple rows have the same value in lprReportId, I only want one row.
So which one? I would prefer the row where lprOwner = 'ABCD'.
Is it possible to write a query that would return unique rows and, in cases where rows were not unique, give me the one that has lprOwner = 'ABCD'?
Note: I believe that only one row will match lprOwner = 'ABCD' for a given lprReportId, but if for some reason there was more than one, I'd still only want one row returned.

Try this:
It will take one record per lprReportId and, in cases where there are multiple entries with the same lprReportId, it will prioritise ones that have lprOwner = 'ABCD'.
SELECT t.Archived, t.ReportID, t.[Owner]
FROM (
    SELECT
        ROW_NUMBER() OVER (
            PARTITION BY lprReportId
            ORDER BY lprReportId, CASE WHEN lprOwner = 'ABCD' THEN 1 ELSE 10 END
        ) AS RowNum,
        lprArchived AS Archived,
        lprReportId AS ReportID,
        lprOwner AS [Owner]
    FROM ReportIndex
    WHERE lprArchived = 1
) t
WHERE t.RowNum = 1

select top 1
lprArchived,
lprReportId,
lprOwner
from ReportIndex
where
lprArchived = 1
order by case when lprOwner = 'ABCD' then 0 else 1 end asc
This will take all matches, order them with 'ABCD' at the top, and then take the first row. If you have any other criteria for selecting rows, you could add it to the end of the order by clause (most recent, for example).

Related

Can SQLite3 SELECT (up to) some rows based on a WHERE, and in the same query SELECT other rows on another condition?

Hi, it's my first time using SQLite, and I am trying to understand whether it can count the results it gets and switch conditions to find other results. I am not sure how to describe this, so I will use an example.
From a table of players, with the same query I'd like to:
SELECT (x) players WHERE WINS>0 AND WINS<= 3 and
(y) other players WHERE WINS = Null OR WINS=0.
x and y should be integer numbers, but they could vary.
I think I can split the query in 2 queries, but I am worried about the performance since in this way I have to connect to the db twice, and I have to check on the second query that the new IDs have not been selected already in the first one (this should never happen in this example scenario, but might happen with different conditions).
If it's possible to write this all in one single query, that would be much more "simple" and straightforward.
Unfortunately I have to stick to SQLite 3.7.17 and I can't make use of all the updates until now.
mytable is defined like this one:
CREATE TABLE mytable (
ID TEXT PRIMARY KEY,
NAME TEXT,
WINS INTEGER,
LOSSES INTEGER,
NOTES TEXT
);
These are some dummy values:
INSERT INTO mytable(ID, NAME, WINS, LOSSES, NOTES) VALUES
('A001','John','1','0','blue'),
('A002','Mark','2','1','blue'),
('A003','Hubert',NULL,NULL,'red'),
('A004','Otto','0','0','green'),
('A005','Johnson','3','5','red'),
('A006','Frank',NULL,'1','green');
As an example result the query should return:
#first part (WINS>0 AND WINS<=3) x = 3
('A005','Johnson','3','5','red')
('A001','John','1','0','blue')
('A002','Mark','2','1','blue')
#second part(WINS=0 OR WINS=Null) y = 2
('A006','Frank','null','1','green')
('A004','Otto','0','0','green')
So the result should be a table like this:
('A005','Johnson','3','5','red')
('A001','John','1','0','blue')
('A002','Mark','2','1','blue')
('A006','Frank','null','1','green')
('A004','Otto','0','0','green')
Thank you for your time and knowledge. :)

Use a CTE where you filter the table for the conditions that you want to apply and with a CASE expression get an integer for the condition that is satisfied for each row.
In another CTE use ROW_NUMBER() window function to rank the rows of each condition.
Finally filter the rows with a CASE expression:
WITH
  cond AS (
    SELECT *,
      CASE
        WHEN WINS > 0 AND WINS <= 3 THEN 1
        WHEN WINS IS NULL OR WINS = 0 THEN 2 -- or WHEN COALESCE(WINS, 0) = 0 THEN 2
      END AS condition
    FROM mytable
    WHERE condition IN (1, 2)
  ),
  cte AS (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY condition ORDER BY condition, WINS) rn
    FROM cond
  )
SELECT ID, NAME, WINS, LOSSES, NOTES
FROM cte
WHERE rn <= CASE condition
  WHEN 1 THEN 3 -- x = 3
  WHEN 2 THEN 2 -- y = 2
END
ORDER BY condition, WINS DESC;
See the demo.
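One caveat: the WITH clause needs SQLite 3.8.3 or later and window functions need 3.25 or later, so the query above would not run on 3.7.17. As a possible alternative, here is a sketch that uses only subqueries with LIMIT combined by UNION ALL, driven from Python's sqlite3 module. The helper name select_two_groups, the grp column, and the tie-breaking ORDER BY on ID are my own assumptions, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (ID TEXT PRIMARY KEY, NAME TEXT, "
    "WINS INTEGER, LOSSES INTEGER, NOTES TEXT)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?, ?)",
    [
        ("A001", "John", 1, 0, "blue"),
        ("A002", "Mark", 2, 1, "blue"),
        ("A003", "Hubert", None, None, "red"),
        ("A004", "Otto", 0, 0, "green"),
        ("A005", "Johnson", 3, 5, "red"),
        ("A006", "Frank", None, 1, "green"),
    ],
)

def select_two_groups(conn, x, y):
    """Return up to x rows matching the first condition, then up to y
    rows matching the second, in a single statement."""
    query = """
        SELECT ID, NAME, WINS, LOSSES, NOTES FROM (
            SELECT *, 1 AS grp FROM (
                SELECT * FROM mytable
                WHERE WINS > 0 AND WINS <= 3
                ORDER BY WINS DESC, ID
                LIMIT ?
            )
            UNION ALL
            SELECT *, 2 AS grp FROM (
                SELECT * FROM mytable
                WHERE WINS IS NULL OR WINS = 0
                ORDER BY WINS DESC, ID
                LIMIT ?
            )
        )
        ORDER BY grp, WINS DESC, ID
    """
    return conn.execute(query, (x, y)).fetchall()

result = select_two_groups(conn, 3, 2)
for row in result:
    print(row)
```

The two conditions here cannot match the same row. If they could overlap, the second subquery would need an extra filter that excludes rows satisfying the first condition.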

Is there a way you can produce an output like this in T-SQL

I have a column whose values I translate using a CASE expression, and I get numbers like the ones below. There are multiple columns for which I need to produce a result like this; this is just one of them.
How do you produce the output as a whole, like the one below?
12 is the count of the numbers from top to bottom.
49 is the sum.
4.08 is the division 49/12 (the average).
1 is how many 1's there are in the output list above. As you can see, there is only one 1 in the output above.
8.33% is the percentage that comes from 1/12 * 100,
and so on. Is there a way to produce this output?
drop table test111
create table test111
(
Q1 nvarchar(max)
);
INSERT INTO TEST111(Q1)
VALUES('Strongly Agree')
,('Agree')
,('Disagree')
,('Strongly Disagree')
,('Strongly Agree')
,('Agree')
,('Disagree')
,('Neutral');
SELECT
CASE WHEN [Q1] = 'Strongly Agree' THEN 5
WHEN [Q1] = 'Agree' THEN 4
WHEN [Q1] = 'Neutral' THEN 3
WHEN [Q1] = 'Disagree' THEN 2
WHEN [Q1] = 'Strongly Disagree' THEN 1
END AS 'Test Q1'
FROM test111

I have to make a few assumptions here, but it looks like you want to treat an output column like a column in a spreadsheet. You have 12 numbers. You then have a blank "separator" row. Then a row with the number 12 (which is the count of how many numbers you have). Then a row with the number 49, which is the sum of those 12 numbers. Then the 4.08 row, which is roughly the average, and so on.
Some of these outputs can be provided by cube or rollup, but neither is a complete solution.
If you wanted to produce this output directly from TSQL, you would need to have multiple select statements and combine the results of all of those statements using union all. First you would have a select just to get the numbers. Then you would have a second select which outputs a "blank". Then another select which is providing a count. Then another select which is providing a sum. And so on.
You would also no longer be able to output actual numbers, since a "blank" is not a number. Visually it's best represented as an empty string. But now your output column has to be of datatype char or varchar.
You also have to make sure rows come out in the correct order for presentation. So you need a column to order by. You would have to add some kind of ordering column "manually" to each of the select statements, so when you union them all together you can tell SQL in what order the output should be provided.
So the answer to "can it be done?" is technically "yes". But if you think this seems like a whole lot of laborious and inefficient TSQL work, you'd be right.
The real solution here is to change your approach. SQL should not be concerned with "output formatting". What you should do is just return the actual data (your 12 numbers) from SQL, and then do all of the additional presentation (like adding a blank row, adding a count row, etc), in the code of the program that is calling SQL to get that data.
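As a rough illustration of that split, the sketch below has SQL return only the mapped scores and lets the calling program append the blank row and the summary rows. It uses Python with an in-memory SQLite table standing in for SQL Server. The function name build_report, the two-decimal formatting, and the reliance on rowid for insertion order are assumptions made for this example.

```python
import sqlite3

# Hypothetical setup mirroring the test111 table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test111 (Q1 TEXT)")
conn.executemany(
    "INSERT INTO test111 (Q1) VALUES (?)",
    [("Strongly Agree",), ("Agree",), ("Disagree",), ("Strongly Disagree",),
     ("Strongly Agree",), ("Agree",), ("Disagree",), ("Neutral",)],
)

def build_report(conn):
    """Fetch the mapped scores, then append the summary rows in Python."""
    rows = conn.execute("""
        SELECT CASE Q1
                   WHEN 'Strongly Agree' THEN 5
                   WHEN 'Agree' THEN 4
                   WHEN 'Neutral' THEN 3
                   WHEN 'Disagree' THEN 2
                   WHEN 'Strongly Disagree' THEN 1
               END
        FROM test111
        ORDER BY rowid  -- insertion order; a real table would use its own key
    """).fetchall()
    # Skip answers that did not map to a score.
    scores = [r[0] for r in rows if r[0] is not None]
    if not scores:
        return []
    count, total, ones = len(scores), sum(scores), scores.count(1)
    summary = [
        "",                            # blank separator row
        str(count),                    # how many scores there are
        str(total),                    # sum of the scores
        f"{total / count:.2f}",        # average
        str(ones),                     # how many 1's
        f"{ones / count * 100:.2f}%",  # share of 1's as a percentage
    ]
    return [str(s) for s in scores] + summary

report = build_report(conn)
print("\n".join(report))
```

With the eight sample rows from the question, the summary rows come out as 8, 26, 3.25, 1 and 12.50%. The query stays a plain SELECT and the layout logic lives in one place in the caller.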

I must say, this is one of the strangest T-SQL requirements I've seen, and is really best left to the presentation layer.
It is possible using GROUPING SETS though. We can use it to get an extra rollup row that aggregates the whole table.
Once you have the rollup, you need to unpivot the totalled row (identified by GROUPING() = 1) to get your final result. We can do this using CROSS APPLY.
This is impossible without a row-identifier. I have added ROW_NUMBER, but any primary or unique key will do.
WITH YourTable AS (
SELECT
ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS rn,
CASE WHEN [Q1] = 'Strongly Agree' THEN 5
WHEN [Q1] = 'Agree' THEN 4
WHEN [Q1] = 'Neutral' THEN 3
WHEN [Q1] = 'Disagree' THEN 2
WHEN [Q1] = 'Strongly Disagree' THEN 1
END AS TestQ1
FROM test111
),
RolledUp AS (
SELECT
rn,
TestQ1,
grouping = GROUPING(TestQ1),
count = COUNT(*),
sum = SUM(TestQ1),
avg = AVG(TestQ1 * 1.0),
one = COUNT(CASE WHEN TestQ1 = 1 THEN 1 END),
onePct = COUNT(CASE WHEN TestQ1 = 1 THEN 1 END) * 1.0 / COUNT(*)
FROM YourTable
GROUP BY GROUPING SETS(
(rn, TestQ1),
()
)
)
SELECT v.TestQ1
FROM RolledUp r
CROSS APPLY (
SELECT r.TestQ1, 0 AS ordering
WHERE r.grouping = 0
UNION ALL
SELECT v.value, v.ordering
FROM (VALUES
(NULL , 1),
(r.count , 2),
(r.sum , 3),
(r.avg , 4),
(r.one , 5),
(r.onePct, 6)
) v(value, ordering)
WHERE r.grouping = 1
) v
ORDER BY
v.ordering,
r.rn;

T-SQL Selecting TOP 1 In A Query With Aggregates/Groups

I'm still fairly new to SQL. This is a stripped down version of the query I'm trying to run. This query is supposed to find those customers with more than 3 cases and display either the top 1 case or all cases, but still show the overall count of cases per customer in each row in addition to all the case numbers.
The TOP 1 subquery approach didn't work but is there another way to get the results I need? Hope that makes sense.
Here's the code:
SELECT t1.StoreID, t1.CustomerID, t2.LastName, t2.FirstName
,COUNT(t1.CaseNo) AS CasesCount
,(SELECT TOP 1 t1.CaseNo)
FROM MainDatabase t1
INNER JOIN CustomerDatabase t2
ON t1.StoreID = t2.StoreID
WHERE t1.SubmittedDate >= '01/01/2017' AND t1.SubmittedDate <= '05/31/2017'
GROUP BY t1.StoreID, t1.CustomerID, t2.LastName, t2.FirstName
HAVING COUNT (t1.CaseNo) >= 3
ORDER BY t1.StoreID, t1.PatronID
I would like it to look something like this, either one row with just the most recent case and detail or several rows showing all details of each case in addition to the store id, customer id, last name, first name, and case count.
Data Example

For these I usually like to make a temp table of aggregates:
DROP TABLE IF EXISTS #tmp;
CREATE TABLE #tmp (
    CustomerID int NOT NULL DEFAULT 0,
    case_count int NOT NULL DEFAULT 0,
    case_max int NOT NULL DEFAULT 0
);
INSERT INTO #tmp
    (CustomerID, case_count, case_max)
SELECT t1.CustomerID, COUNT(t1.CaseNo), MAX(t1.CaseNo)
FROM MainDatabase t1
GROUP BY t1.CustomerID;
Then you can join this "tmp" table back to any other table you want to display the number of cases on, or the max case number on. And you can limit it to customers that have more than 3 cases with WHERE case_count > 3
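The join-back step might look roughly like the sketch below. It swaps the temp table for a derived table (a subquery in the FROM clause) and runs on SQLite through Python's sqlite3 module, so it is not the T-SQL above verbatim. The sample rows, the cut-down MainDatabase schema, and the min_cases parameter are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MainDatabase (StoreID INTEGER, CustomerID INTEGER, CaseNo INTEGER)"
)
conn.executemany(
    "INSERT INTO MainDatabase VALUES (?, ?, ?)",
    [(1, 10, 101), (1, 10, 102), (1, 10, 103), (2, 20, 201), (2, 20, 202)],
)

def cases_with_counts(conn, min_cases=3):
    """List every case next to its customer's case count and highest case number."""
    query = """
        SELECT m.CustomerID, m.CaseNo, agg.case_count, agg.case_max
        FROM MainDatabase AS m
        JOIN (
            -- one row of aggregates per customer, like the #tmp table
            SELECT CustomerID,
                   COUNT(CaseNo) AS case_count,
                   MAX(CaseNo) AS case_max
            FROM MainDatabase
            GROUP BY CustomerID
        ) AS agg ON agg.CustomerID = m.CustomerID
        WHERE agg.case_count >= ?
        ORDER BY m.CustomerID, m.CaseNo
    """
    return conn.execute(query, (min_cases,)).fetchall()

detail_rows = cases_with_counts(conn)
for row in detail_rows:
    print(row)
```

To show only the most recent case per customer instead of all of them, adding AND m.CaseNo = agg.case_max to the WHERE clause would be one option, assuming higher case numbers are newer.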

How to select Second Last Row in mySql?

I want to retrieve the 2nd last row result and I have seen this question:
How can I retrieve second last row?
but it uses ORDER BY, which does not work in my case because the Emp_Number column also contains the number of rows and a date/time stamp, and these get mixed in with the data if I use ORDER BY.
The rows 22 and 23 contain the total number of rows (excluding row 21 and 22) and the time and day it got entered respectively.
I used this query which returns the required result 21 but if this number increases it will cause an error.
SELECT TOP 1 *
FROM(
SELECT TOP 2 *
FROM DAT_History
ORDER BY Emp_Number ASC
) t
ORDER BY Emp_Number desc
Is there any way to get the 2nd last row value without using the Order By function?

There is no guarantee that the count will be returned in the one-but-last row, as there is no definite order defined. Even if those records were written in the correct order, the engine is free to return the records in any order, unless you specify an order by clause. But apparently you don't have a column to put in that clause to reproduce the intended order.
I propose these solutions:
1. Return the minimum of those values that represent positive integers
select min(Emp_Number * 1)
from DAT_history
where Emp_Number not regexp '[^0-9]'
See SQL Fiddle
This will obviously fail when the count is larger than the smallest employee number. But seeing the sample data, that would represent a number of records that is maybe not expected...
2. Count the records, ignoring the 2 aggregated records
select count(*)-2
from DAT_history
See SQL Fiddle
3. Relying on correct order without order by
As explained at the start, you cannot rely on the order, but if for some reason you still want to rely on this, you can use a variable to number the rows in a sub query, and then pick out the one that has been attributed the one-but-last number:
select Emp_Number * 1
from (select Emp_Number,
             @rn := @rn + 1 rn
      from DAT_history,
           (select @rn := 0) init
     ) numbered
where rn = @rn - 1
See SQL Fiddle
The * 1 is added to convert the text to a number data type.

This is not a perfect solution. I am making some assumptions for this. Check if this could work for you.
;WITH cte
AS (SELECT emp_number,
Row_number()
OVER (
ORDER BY emp_number ASC) AS rn
FROM dat_history
WHERE Isdate(emp_number) = 0) --Omit date entries
SELECT emp_number
FROM cte
WHERE rn = 1 -- select the minimum entry, assuming it would be the count and assuming count might not exceed the emp number range of 9888000

selecting previous and next rows in mysql - how?

I can't figure out how to select a previous/next row IF the current row does not have any numeric identifiers.
With numeric value I always use 2 queries:
SELECT min(customer_id)
FROM customers
WHERE `customer_id` < 10
GROUP BY customer_status
ORDER BY customer_name ASC
LIMIT 1;
SELECT max(customer_id)
FROM customers
WHERE `customer_id` > 10
GROUP BY customer_status
ORDER BY customer_name DESC
LIMIT 1;
However, I don't have "customer_id" anymore and only "customer_name". When I query the DB and sort by this column, I get:
Ab
Bb
Cc
Dd
Ee
Let's assume my current customer's name is "Cc". I want to be able to select "Bb" and "Dd" from the DB. How? :)

Rows in a table have no inherent order; without an ORDER BY clause, MySQL may return them in any order. You use LIMIT to grab subsets of a result set. LIMIT 10 returns rows 1 to 10, LIMIT 10, 10 returns rows 11 to 20 (the first number is the offset, the second is the row count), and so on. Row 1 is the first row of the result set; since the rows in the table are more like a "cloud", there is no order until you build a result set with an ORDER BY clause.

I'd select the previous one with...
SELECT MAX(customer_name)
FROM customers
WHERE `customer_name` < 'Cc'
LIMIT 1;
and the next one with...
SELECT MIN(customer_name)
FROM customers
WHERE `customer_name` > 'Cc'
LIMIT 1;
You were nearly there, I think.
Edit: Removed superfluous ORDER BY statements as suggested by Col. Shrapnel.
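For a quick check of the two queries, here is a small sketch that runs them against the sample names from the question. It uses SQLite through Python's sqlite3 module rather than MySQL, and the helper name neighbours is invented for this example. String comparison follows SQLite's default collation here, so results on MySQL may differ for mixed-case or accented names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?)",
    [("Ab",), ("Bb",), ("Cc",), ("Dd",), ("Ee",)],
)

def neighbours(conn, name):
    """Return (previous, next) customer names in alphabetical order.

    Either element is None when there is no such neighbour."""
    prev_name = conn.execute(
        "SELECT MAX(customer_name) FROM customers WHERE customer_name < ?",
        (name,),
    ).fetchone()[0]
    next_name = conn.execute(
        "SELECT MIN(customer_name) FROM customers WHERE customer_name > ?",
        (name,),
    ).fetchone()[0]
    return prev_name, next_name

around_cc = neighbours(conn, "Cc")
print(around_cc)
```

Because MAX and MIN without GROUP BY always return exactly one row, the LIMIT 1 in the queries above is harmless but not required.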
