T-SQL (SQL Server): sorting and paging with ROW_NUMBER

I have a SQL Server database and an app that fetches data and converts it to JSON.
I wrote a T-SQL query to order the data by the userid column (DESC) and take only the first 10 rows, but it returns the wrong results.
For example, if I have the following table:
UserID
---
User1
User2
User3
...
User10
..
User25
I want UserID to be sorted DESC and to get the first ten results (then the second ten results, etc.). Simply put, I am looking for a substitute for MySQL's LIMIT in SQL Server.
My query:
SELECT * FROM
(
    SELECT
        system_users_ranks.RankName,
        system_users.userid,
        system_users.UserName,
        system_users.Email,
        system_users.LastIP,
        system_users.LastLoginDate,
        row_number() OVER (ORDER BY system_users.userid) as myrownum
    FROM
        system_users
    INNER JOIN
        system_users_ranks
        ON system_users.UserRank = system_users_ranks.rankid
) as dertbl
WHERE myrownum BETWEEN #startval AND #endval
ORDER BY userid DESC
I can't move the ORDER BY into the inner SELECT.

You don't need it in the inner SELECT.
ROW_NUMBER has its own ORDER BY, and the final presentation is defined by the outermost ORDER BY anyway.
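Your current query will work just fine once the ORDER BY inside ROW_NUMBER() uses the same direction (DESC) as the outer one; as written it numbers the rows in ascending userid order, so the first page holds the ten lowest userids.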
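A minimal sketch of paging in descending order, assuming SQL Server 2008 or later for the inline DECLARE. The ORDER BY inside ROW_NUMBER() is DESC, so row numbers 1 to 10 are the ten highest userids. The variables @startval and @endval are hypothetical stand-ins for whatever paging parameters the application supplies.

```sql
-- Hypothetical paging parameters; the application would supply these.
DECLARE @startval int = 1, @endval int = 10;

SELECT *
FROM (
    SELECT
        r.RankName,
        u.userid,
        u.UserName,
        u.Email,
        u.LastIP,
        u.LastLoginDate,
        -- Number rows from the highest userid down, so page 1 holds the top ten.
        ROW_NUMBER() OVER (ORDER BY u.userid DESC) AS myrownum
    FROM system_users AS u
    INNER JOIN system_users_ranks AS r
        ON u.UserRank = r.rankid
) AS dertbl
WHERE myrownum BETWEEN @startval AND @endval
ORDER BY userid DESC;

-- SQL Server 2012 or later: OFFSET ... FETCH is the closest match to MySQL's LIMIT.
SELECT u.userid, u.UserName
FROM system_users AS u
ORDER BY u.userid DESC
OFFSET @startval - 1 ROWS
FETCH NEXT 10 ROWS ONLY;
```

For the second page, the application would pass 11 and 20; the OFFSET form then skips the first ten rows.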

Related

SQL query with even distribution of samples

Is there a way to query SQL to get an even distribution of samples? For example, if one of my fields is a State field, I want to query the top 5000 results with 100 from each state. Or, as another example, if I have a field that says whether a client is a new client or an existing client, I want the top 500 results where 250 are new clients and 250 are existing clients.
I am trying to avoid running two different queries whose results I have to combine manually.
You can do this by using ROW_NUMBER. You partition your data on one or more columns, so the row numbering starts from 1 in every partition. You then select the top x rows and ORDER BY the row number column.
e.g.
WITH cte
AS
(
    SELECT *, ROW_NUMBER() OVER (PARTITION BY StateName ORDER BY NEWID()) AS RN
    FROM dbo.Sales
)
SELECT TOP 5 *
FROM cte
ORDER BY RN;
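If the goal is a fixed quota per group (for example 100 rows from each state), filtering on the row number is a variation worth considering, because TOP caps the total number of rows rather than the count per group. A sketch reusing the same assumed table and column names (dbo.Sales, StateName):

```sql
-- Table and column names (dbo.Sales, StateName) are taken from the example above.
WITH cte AS
(
    SELECT *,
           -- Restart numbering at 1 for each state; NEWID() randomises the order within it.
           ROW_NUMBER() OVER (PARTITION BY StateName ORDER BY NEWID()) AS RN
    FROM dbo.Sales
)
SELECT *
FROM cte
WHERE RN <= 100          -- at most 100 rows from every state
ORDER BY StateName, RN;
```

States with fewer than 100 rows simply contribute every row they have, so the total can fall short of 5000.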

TOP clause in SQL server returns more records than specified

The TOP clause in SQL Server (I tried this on the w3schools.com website) gives more records than specified when used with an ORDER BY clause. This is the query I used:
select TOP 1 * from orders left join customers on
orders.customerID=customers.customerID order by EmployeeID desc
Please visit this link for my result: https://i.stack.imgur.com/xg7sV.jpg
Instead, this query returns 6 records. Is this how it is supposed to work?
Read the ENTIRE screen. And obviously there is something terribly wrong with their site.
The JOIN part throws you off. Try something like this instead:
select TOP 1 * from orders a
outer apply (select top 1 * from customers where customerID = a.customerID) b
order by a.EmployeeID desc
Try ordering the query by name, address, order date, or any other column whose values don't repeat the way EmployeeID does. It sounds like a very dumb solution, but it worked for me in an Access database with DESC counts.
It looks like when the values in orders repeat, the SQL returns more rows than requested. I really don't know why, but it worked for me!
Try it!
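For context: on SQL Server itself, TOP 1 returns a single row unless WITH TIES is specified, whereas Access includes every row tied on the ORDER BY value. When several rows share the highest EmployeeID, which one SQL Server returns is arbitrary. A hedged sketch of the tie-breaker idea, assuming the orders table has a unique OrderID column (an assumption; substitute whatever unique column your schema has):

```sql
-- Assumes orders has a unique OrderID column (an assumption; adjust to your schema).
SELECT TOP 1 o.*, c.*
FROM orders AS o
LEFT JOIN customers AS c
    ON o.customerID = c.customerID
ORDER BY o.EmployeeID DESC,
         o.OrderID DESC;   -- unique tie-breaker makes the single returned row deterministic
```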

SQL Server : return id column using max on different column

In my table I have the columns id, userId and points. As a result of my query I would like to have the id of the record that contains the highest points, per user.
In my experience (more with MySQL than SQL Server) I would use the following query to get this result:
SELECT id, userId, max(points)
FROM table
GROUP BY userId
But SQL Server does not allow this, because all columns in the select should also be in the GROUP BY or be an aggregate function.
This is a hypothetical situation. My actual situation is a lot more complicated!
Use the ROW_NUMBER window function in SQL Server:
Select * from
(
    select Row_Number() over(partition by userId Order by points desc) Rn, *
    From yourtable
) A
Where Rn = 1
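To make the behaviour concrete, here is a small self-contained sketch with made-up sample data (the table variable @scores and its rows are hypothetical). It assumes SQL Server 2008 or later for the multi-row VALUES syntax.

```sql
-- Hypothetical sample data to illustrate the pattern.
DECLARE @scores TABLE (id int, userId int, points int);

INSERT INTO @scores (id, userId, points)
VALUES (1, 10, 50), (2, 10, 80), (3, 20, 30), (4, 20, 20);

SELECT id, userId, points
FROM (
    SELECT id, userId, points,
           -- Rn = 1 marks the highest-scoring row for each user.
           ROW_NUMBER() OVER (PARTITION BY userId ORDER BY points DESC) AS Rn
    FROM @scores
) AS ranked
WHERE Rn = 1;
-- Returns (2, 10, 80) and (3, 20, 30), in no guaranteed order.
```

If two rows for the same user tie on points, ROW_NUMBER() picks one of them arbitrarily; swapping in RANK() would return all tied rows instead.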

SQL Server Pagination w/o row_number() or nested subqueries?

I have been fighting with this all weekend and am out of ideas. In order to have pages in my search results on my website, I need to return a subset of rows from a SQL Server 2005 Express database (i.e. start at row 20 and give me the next 20 records). In MySQL you would use the "LIMIT" keyword to choose which row to start at and how many rows to return.
In SQL Server I found ROW_NUMBER()/OVER, but when I try to use it it says "Over not supported". I am thinking this is because I am using SQL Server 2005 Express (free version). Can anyone verify if this is true or if there is some other reason an OVER clause would not be supported?
Then I found the old school version similar to:
SELECT TOP X * FROM TABLE WHERE ID NOT IN (SELECT TOP Y ID FROM TABLE ORDER BY ID) ORDER BY ID
where X = number per page and Y = which record to start on.
However, my queries are a lot more complex with many outer joins and sometimes ordering by something other than what is in the main table. For example, if someone chooses to order by how many videos a user has posted, the query might need to look like this:
SELECT TOP 50 iUserID, iVideoCount
FROM MyTable
LEFT OUTER JOIN (
    SELECT count(iVideoID) AS iVideoCount, iUserID
    FROM VideoTable
    GROUP BY iUserID
) as TempVidTable ON MyTable.iUserID = TempVidTable.iUserID
WHERE iUserID NOT IN (
    SELECT TOP 100 iUserID, iVideoCount
    FROM MyTable
    LEFT OUTER JOIN (
        SELECT count(iVideoID) AS iVideoCount, iUserID
        FROM VideoTable
        GROUP BY iUserID
    ) as TempVidTable ON MyTable.iUserID = TempVidTable.iUserID
    ORDER BY iVideoCount
)
ORDER BY iVideoCount
The issue is in the subquery SELECT line: TOP 100 iUserID, iVideoCount
To use the "NOT IN" clause it seems I can only have 1 column in the subquery ("SELECT TOP 100 iUserID FROM ..."). But when I don't include iVideoCount in that subquery SELECT statement then the ORDER BY iVideoCount in the subquery doesn't order correctly so my subquery is ordered differently than my parent query, making this whole thing useless. There are about 5 more tables linked in with outer joins that can play a part in the ordering.
I am at a loss! The two methods above are the only ways I can find to get SQL Server to return a subset of rows. I am about ready to return the whole result and loop through each record in PHP, displaying only the ones I want. That is such an inefficient way to do things that it is really my last resort.
Any ideas on how I can make SQL Server mimic MySQL's LIMIT clause in the above scenario?
SQL Server 2005 and later can page with ROW_NUMBER(), and SQL Server 2012 enhances paging support with ORDER BY ... OFFSET and FETCH NEXT. If you cannot use either of these, you can fall back on a temp table:
First, create a temp table with an identity column.
Then insert the data into the temp table with an ORDER BY clause.
Finally, use the temp table's identity column value just like the ROW_NUMBER() value.
I hope it helps.
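The temp-table steps above can be sketched as follows. This is an illustration only: MyTable and iUserID are borrowed from the question, while #paged, RowNum and the page bounds are hypothetical names. SQL Server assigns identity values following the ORDER BY of an INSERT ... SELECT, which is what makes the identity column usable as a row number.

```sql
-- Sketch only: #paged, RowNum and the page bounds are illustrative names.
DECLARE @firstRow int, @lastRow int;
SET @firstRow = 21;   -- first row of the requested page
SET @lastRow  = 40;   -- last row of the requested page

CREATE TABLE #paged
(
    RowNum  int IDENTITY(1, 1) PRIMARY KEY,
    iUserID int NOT NULL
);

-- Identity values are assigned following the ORDER BY of the INSERT ... SELECT.
INSERT INTO #paged (iUserID)
SELECT iUserID
FROM MyTable
ORDER BY iUserID;

-- Page through the temp table using the identity column as the row number.
SELECT iUserID
FROM #paged
WHERE RowNum BETWEEN @firstRow AND @lastRow
ORDER BY RowNum;

DROP TABLE #paged;
```

The complex joins and the custom sort would go into the INSERT ... SELECT, so the ordering logic is written only once.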

SQL Server Selecting newest entry for each row

I have a SQL Server table like this:
id(autoincrement)
hostname(varchar)
group(varchar)
member(varchar)
datequeried(varchar)
The table is filled by a scheduled job that scans network for windows client PCs local admin group members.
The network scans have to cope with the fact that some of the stations may not be available during a scan.
The query that I'd like to write is :
"select every hostname having the latest datequeried"
This is to display the newest result (rows) of each hostname queried on network.
Is it clear?
I'm still facing some syntax issues and I'm sure it is quite easy.
Thanks in advance.
If you're on SQL Server 2005 or newer (you didn't specify...), you can use a CTE to do this:
;WITH MostCurrent AS
(
    SELECT
        id, hostname, [group],   -- group is a reserved word, so it needs brackets
        member, datequeried,
        ROW_NUMBER() OVER(PARTITION BY hostname ORDER BY datequeried DESC) AS RowNum
    FROM
        dbo.YourTable
)
SELECT *
FROM MostCurrent
WHERE RowNum = 1
The inner SELECT inside the CTE "partitions" your data by hostname, i.e. each hostname gets its own "group" of data, and it numbers the entries starting at 1 within each group. The entries are numbered by datequeried DESC, so the most recent one has RowNum = 1 in each group of data (i.e. for each hostname).
From SQL 2005 and later, you can use ROW_NUMBER() like this:
;WITH CTE AS
(
SELECT *, ROW_NUMBER() OVER(PARTITION BY hostname ORDER BY datequeried DESC) AS RowNo
FROM YourTable
)
SELECT * FROM CTE WHERE RowNo = 1
"CTE" is a Common Table Expression, basically just an alias for that first SELECT, which I can then use in the second query.
This will return 1 row for each hostname, with the row returned for each being the one with the most recent datequeried.
I can display the required results using :
select hostname, member, max(lastdatequeried)
as lastdatequeried
from members
group by hostname, member order by hostname
Thanks to all who helped.
select hostname,
max(datequeried) as datequeried
from YourTable
group by hostname
SELECT TOP 1 WITH TIES *
FROM YourTable
ORDER BY ROW_NUMBER() OVER(PARTITION BY hostname ORDER BY datequeried DESC)
Do you want to find each station's most recent scan?
Or do you want to find every station that was online (or not online) during the most recent scan?
I'd have a master list of workstations, first of all. Then I'd have a master list of scans. And then I'd have the scans table that holds the results of the scans.
To answer #1, you would use a subquery or inline view that returns, for each workstation, its id and max(scandate), and then join that subquery back to the scans table to pull out the scan row for that workstation id whose scandate matches its max(scandate).
To answer #2, you'd look for all workstations where exists a record (or where not exists a record, mutatis mutandis) in the scans table where scandate = the max(date) in the master scans list.
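Both options can be sketched against the single table from the question rather than the separate master lists suggested above (that simplification is an assumption). The sketch also assumes that datequeried sorts chronologically; it is stored as varchar in the question, so this only holds for a sortable format such as ISO 8601.

```sql
-- Option #1: every row from each hostname's most recent scan.
SELECT t.*
FROM YourTable AS t
INNER JOIN (
    -- One row per hostname with its latest scan date.
    SELECT hostname, MAX(datequeried) AS lastdatequeried
    FROM YourTable
    GROUP BY hostname
) AS latest
    ON latest.hostname = t.hostname
   AND latest.lastdatequeried = t.datequeried;

-- Option #2: hostnames that were reached in the overall most recent scan.
SELECT DISTINCT t.hostname
FROM YourTable AS t
WHERE t.datequeried = (SELECT MAX(datequeried) FROM YourTable);
```

Unlike filtering on a row number of 1, the join in option #1 keeps all rows that share the latest datequeried, which may matter if one scan records several group members per hostname.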

Resources