DISTINCT and GROUP BY with SQL Server

I have the following table in SQL Server, and I'm looking for a query that selects the last two rows (the latest row for each type_id) with all fields:
order by created_at
group by / distinct type_id
id  type_id  some_value  created_at
1   B        mk2         2016-10-01 00:00:00.000
2   A        mbs         2016-10-01 10:02:39.077
3   B        sa          2016-10-02 10:03:08.123
4   A        xc          2016-10-02 10:03:28.777
5   B        q1          2016-10-03 10:04:20.920
6   A        tr          2016-10-03 10:04:48.533
7   A        1a          2016-09-30 10:36:26.287
In MySQL it's an easy task, but in SQL Server every selected field has to be contained in either an aggregate function or the GROUP BY clause. That results in field combinations that do not exist in the table.
Is there a way to handle this?
Thanks in advance!

Solution
Based on the comment from Andrew Deighton, I did this:
SELECT *
FROM (
    SELECT
        id,
        type_id,
        some_value,
        created_at,
        ROW_NUMBER()
            OVER (PARTITION BY type_id
                  ORDER BY created_at DESC) AS row
    FROM test_sql
) AS ts
WHERE row = 1
ORDER BY row
Conclusion: No need for GROUP BY and DISTINCT.
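
As a self-contained sketch of the same approach, the batch below loads the sample rows from the question into a table variable and keeps the newest row per type_id. The table variable @test_sql, the column types, the id DESC tie-breaker, and the final sort by type_id are assumptions made for illustration; they are not part of the original post.

```sql
-- Assumed stand-in for the real table; the column types are guesses.
DECLARE @test_sql TABLE (
    id         int          NOT NULL PRIMARY KEY,
    type_id    char(1)      NOT NULL,
    some_value varchar(10)  NOT NULL,
    created_at datetime2(3) NOT NULL
);

-- Sample rows copied from the question.
INSERT INTO @test_sql (id, type_id, some_value, created_at)
VALUES (1, 'B', 'mk2', '2016-10-01 00:00:00.000'),
       (2, 'A', 'mbs', '2016-10-01 10:02:39.077'),
       (3, 'B', 'sa',  '2016-10-02 10:03:08.123'),
       (4, 'A', 'xc',  '2016-10-02 10:03:28.777'),
       (5, 'B', 'q1',  '2016-10-03 10:04:20.920'),
       (6, 'A', 'tr',  '2016-10-03 10:04:48.533'),
       (7, 'A', '1a',  '2016-09-30 10:36:26.287');

-- Number the rows of each type_id from newest to oldest, then keep number 1.
-- id DESC is an assumed tie-breaker for rows sharing the same created_at.
WITH ranked AS (
    SELECT id, type_id, some_value, created_at,
           ROW_NUMBER() OVER (PARTITION BY type_id
                              ORDER BY created_at DESC, id DESC) AS rn
    FROM @test_sql
)
SELECT id, type_id, some_value, created_at
FROM ranked
WHERE rn = 1
ORDER BY type_id;
```

With the sample data this keeps id 6 for type A and id 5 for type B. Sorting the outer query by type_id (or created_at) is more useful than sorting by the row number, because every remaining row has the same row number.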

Related

How can I use a SQL query for the following

My data table has the sample time in one column and the sample value in another, with data like the following:
sampletime value
----------------------------
2016-03-02 08:31:14 1
2016-03-02 09:31:14 2
2016-03-02 12:31:14 3
2016-03-04 08:31:14 4
2016-03-04 09:31:14 5
2016-03-05 08:31:14 3
I need the two earliest sample times for each day. How can I group the rows?
Query
SELECT rn.sampletime AS stime
FROM rn_qos_data_0007 rn
INNER JOIN s_qos_data qos
    ON qos.table_id = rn.table_id
    AND qos.qos = 'QOS_CPU_USAGE'
    AND Substring(qos.origin, 1, 4) = 'A0C3'
    AND qos.host = '10.98.48.100'
WHERE rn.sampletime BETWEEN '2016/01/01' AND '2016/06/22'
GROUP BY rn.sampletime

You need the ROW_NUMBER window function, partitioned by the date part of sampletime:
SELECT *
FROM (
    SELECT ROW_NUMBER() OVER (PARTITION BY CAST(sampletime AS date) ORDER BY sampletime) AS RN, *
    FROM ..
) A
WHERE RN <= 2
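
To make the per-day numbering concrete, here is a minimal sketch that runs the same pattern against the sample rows from the question. The table variable @samples and its column types are assumptions for illustration only; the real query would read from the joined tables shown in the question.

```sql
-- Assumed stand-in for the real data source; the column types are guesses.
DECLARE @samples TABLE (
    sampletime datetime2(0) NOT NULL,
    [value]    int          NOT NULL
);

-- Sample rows copied from the question.
INSERT INTO @samples (sampletime, [value])
VALUES ('2016-03-02 08:31:14', 1),
       ('2016-03-02 09:31:14', 2),
       ('2016-03-02 12:31:14', 3),
       ('2016-03-04 08:31:14', 4),
       ('2016-03-04 09:31:14', 5),
       ('2016-03-05 08:31:14', 3);

-- Restart the numbering for every calendar day and keep the first two rows.
SELECT sampletime, [value]
FROM (
    SELECT sampletime, [value],
           ROW_NUMBER() OVER (PARTITION BY CAST(sampletime AS date)
                              ORDER BY sampletime) AS rn
    FROM @samples
) AS a
WHERE rn <= 2
ORDER BY sampletime;
```

With this data, only the 2016-03-02 12:31:14 row is dropped, since it is the third sample of its day; a day with a single sample, such as 2016-03-05, simply contributes one row.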

T-SQL Query to remove duplicate records in the output based on one particular column

I am running SQL Server 2014 and I have the following T-SQL query:
USE MYDATABASE
SELECT *
FROM RESERVATIONLIST
WHERE [MTH] IN ('JANUARY 2015','FEBRUARY 2015')
RESERVATIONLIST mentioned in the code above is a view. The query gives me the following output (extract):
ID NAME DOA DOD Nights Spent MTH
--------------------------------------------------------------------
251 AH 2015-01-12 2015-01-15 3 JANUARY 2015
258 JV 2015-01-28 2015-02-03 4 JANUARY 2015
258 JV 2015-01-28 2015-02-03 2 FEBRUARY 2015
The above output consists of around 12,000 records.
I need to modify my query so that it eliminates all duplicate IDs and gives me the following results:
ID NAME DOA DOD Nights Spent MTH
--------------------------------------------------------------------
251 AH 2015-01-12 2015-01-15 3 JANUARY 2015
258 JV 2015-01-28 2015-02-03 4 JANUARY 2015
I tried something like this, but it's not working:
USE MYDATABASE
SELECT *
FROM RESERVATIONLIST
WHERE [MTH] IN ('JANUARY 2015', 'FEBRUARY 2015')
GROUP BY [ID]
HAVING COUNT ([MTH]) > 1

The following query returns one row per ID:
SELECT * FROM
(
SELECT *,ROW_NUMBER() OVER (PARTITION BY ID ORDER BY (SELECT NULL)) rn FROM RESERVATIONLIST
WHERE [MTH] IN ('JANUARY 2015','FEBRUARY 2015')
) T
WHERE rn = 1
Note: this returns an arbitrary row from the rows that share the same ID. If you want a specific row, you have to define it in the ORDER BY. For example:
SELECT * FROM
(
SELECT *,ROW_NUMBER() OVER (PARTITION BY ID ORDER BY DOA DESC) rn FROM RESERVATIONLIST
WHERE [MTH] IN ('JANUARY 2015','FEBRUARY 2015')
) T
WHERE rn = 1
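This returns the row with the latest DOA for each ID. If several rows share that DOA, as the two rows for ID 258 do in the sample output, add another column to the ORDER BY as a tie-breaker.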

You are trying to use a GROUP BY, which IMHO is the right way to go. You should group by all columns that are constant and roll up the others with aggregates. Depending on the values of DOD and DOA, I can see two solutions:
SELECT ID,NAME,DOA,DOD,SUM([Nights Spent]) as Nights,
min(MTH) as firstRes, max(MTH) as lastRes
FROM RESERVATIONLIST
GROUP BY ID,NAME,DOA,DOD
OR
SELECT ID,NAME,min(DOA) as firstDOA,max(DOD) as lastDOD,SUM([Nights Spent]) as Nights,
min(MTH) as firstRes, max(MTH) as lastRes
FROM RESERVATIONLIST
GROUP BY ID,NAME

Query to SELECT non-repeating values from table

I have a table structured as below:
ID Name RunDate
10001 Item 1 12/09/2013 02:11:47
10002 Item 2 12/09/2013 01:13:25
10001 Item 1 12/09/2013 01:11:37
10007 Item 7 12/08/2013 11:02:04
10001 Item 1 12/08/2013 10:25:00
My problem is that this table will be sent to a distribution group by email, and it makes the email very big because the table has hundreds of rows. What I want to achieve is to show only one record per distinct ID, the one with the most recent RunDate.
ID Name RunDate
10001 Item 1 12/09/2013 02:11:47
10002 Item 2 12/09/2013 01:13:25
10007 Item 7 12/08/2013 11:02:04
Any idea how I can do this? I'm not very good with aggregates, and I've used DISTINCT but it always messes up my query.
Thanks!

Group by the values that should be distinct and use max() to get the most recent date:
select id, name, max(rundate) as rundate
from your_table
group by id, name

This is more flexible because it doesn't require grouping by all columns:
;WITH x AS
(
SELECT ID, Name, RunDate, /* other columns, */
rn = ROW_NUMBER() OVER (PARTITION BY ID ORDER BY RunDate DESC)
FROM dbo.TableName
)
SELECT ID, Name, RunDate /* , other columns */
FROM x
WHERE rn = 1
ORDER BY ID;
(Since Name doesn't really need to be grouped, and in fact shouldn't even be in this table, and the next follow-up question to the GROUP BY solution is almost always, "How do I add <column x> and <column y> to the output, if they have different values and can't be added to the GROUP BY?")

T-SQL select rows by oldest date and unique category

I'm using Microsoft SQL. I have a table that contains information stored by two different categories and a date. For example:
ID Cat1 Cat2 Date/Time Data
1 1 A 11:00 456
2 1 B 11:01 789
3 1 A 11:01 123
4 2 A 11:05 987
5 2 B 11:06 654
6 1 A 11:06 321
I want to extract one line for each unique combination of Cat1 and Cat2 and I need the line with the oldest date. In the above I want ID = 1, 2, 4, and 5.
Thanks

Have a look at row_number() on MSDN.
SELECT *
FROM (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY col1, col2 ORDER BY date_time, id) rn
FROM mytable
) q
WHERE rn = 1
(run the code on SQL Fiddle)

Quassnoi's answer is fine, but I'm a bit uncomfortable with how it handles dups. It seems to return based on insertion order, but I'm not sure if even that can be guaranteed? (see these two fiddles for an example where the result changes based on insertion order: dup at the end, dup at the beginning)
Plus, I kinda like staying with old-school SQL when I can, so I would do it this way (see this fiddle for how it handles dups):
select t1.*
from my_table t1
left join my_table t2
    on t1.cat1 = t2.cat1
    and t1.cat2 = t2.cat2
    and t1.datetime > t2.datetime
where t2.datetime is null
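
The left join above keeps a row only when no earlier row exists for the same category pair. The same condition can be written with NOT EXISTS, which some readers find easier to follow. This is a sketch of that alternative, not the original answer's code; the table variable @my_table and its column types are assumptions, and the rows are the sample data from the question.

```sql
-- Assumed stand-in for the real table; the column types are guesses.
DECLARE @my_table TABLE (
    id         int     NOT NULL PRIMARY KEY,
    cat1       int     NOT NULL,
    cat2       char(1) NOT NULL,
    [datetime] time(0) NOT NULL,
    data       int     NOT NULL
);

-- Sample rows copied from the question.
INSERT INTO @my_table (id, cat1, cat2, [datetime], data)
VALUES (1, 1, 'A', '11:00', 456),
       (2, 1, 'B', '11:01', 789),
       (3, 1, 'A', '11:01', 123),
       (4, 2, 'A', '11:05', 987),
       (5, 2, 'B', '11:06', 654),
       (6, 1, 'A', '11:06', 321);

-- Keep a row only if no other row in the same (cat1, cat2) group is earlier.
SELECT t1.id, t1.cat1, t1.cat2, t1.[datetime], t1.data
FROM @my_table AS t1
WHERE NOT EXISTS (
    SELECT 1
    FROM @my_table AS t2
    WHERE t2.cat1 = t1.cat1
      AND t2.cat2 = t1.cat2
      AND t2.[datetime] < t1.[datetime]
)
ORDER BY t1.id;
```

With the sample data this returns ids 1, 2, 4, and 5. Like the left join, it returns every row that ties for the earliest time in a group, whereas ROW_NUMBER() returns exactly one row per group.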

SQL Server 2008 how to select top [column value] and random record?

I'm using SQL Server 2008. I want to select random rows, where the number of rows to return depends on a column value in another table. How can I do this?
My SQL statement is something like this, but it is wrong:
select top b.number a.name, a.link_id
from A a
left join B b on b.link_id = a.link_id
order by newid()
Here are my tables and the expected result.
Table A:
name link_id
james 100
albert 100
susan 100
simon 101
tom 101
fion 101
Table B:
link_id number
100 2
101 1
Expected result:
when run 1st time, result may be:
name link_id
james 100
susan 100
fion 101
2nd time result may be:
albert 100
susan 100
simon 101
3rd time could be:
james 100
albert 100
fion 101
Explanation
In table B, link_id 100 has number 2,
meaning that 2 random records should be selected from table A for link_id = 100,
and 1 random record should be selected for link_id = 101.

You can use the ROW_NUMBER() function:
SELECT A.name, A.link_id
FROM(
SELECT name,link_id, ROW_NUMBER()OVER(PARTITION BY link_id ORDER BY NEWID()) rn
FROM dbo.tblA
) AS A
JOIN dbo.tblB AS B
ON A.link_id = B.link_id
WHERE A.rn <= B.number;
Here is a SqlFiddle to show this in action: http://sqlfiddle.com/#!3/92eac/2

Try this:
SELECT a.*
FROM b
CROSS APPLY
(
SELECT TOP (b.number) a.*
FROM a
WHERE a.link_id = b.link_id
ORDER BY
NEWID()
) a
Also see: SQLFiddle
