T-SQL get row count before TOP is applied - sql-server

I have a SELECT that can return hundreds of rows from a table (table can be ~50000 rows). My app is interested in knowing the number of rows returned, it means something important to me, but it actually uses only the top 5 of those hundreds of rows. What I want to do is limit the SELECT query to return only 5 rows, but also tell my app how many it would have returned (the hundreds). This is the original query:
SELECT id, a, b, c FROM table WHERE a < 2
Here is what I came up with - a CTE - but I don't feel comfortable with the total row count appearing in every column. Ideally I would want a result set of the TOP 5 and a returned parameter for the total row count.
WITH Everything AS
(
SELECT id, a, b, c FROM table
),
DetermineCount AS
(
SELECT COUNT(*) AS Total FROM Everything
)
SELECT TOP (5) id, a, b, c, Total
FROM Everything
CROSS JOIN DetermineCount;
Can you think of a better way?
Is there a way in T-SQl to return the affected row count of a select top query before the top was applied? ##rowcount would return 5 but I wonder if there is a ##rowcountbeforetop sort of thing.
Thanks in advance for your help.
** Update **
This is what I'm doing now and I kind of like it over the CTE although CTEs as so elegant.
-- #count is passed in as an out param to the stored procedure
CREATE TABLE dbo.#everything (id int, a int, b int, c int);
INSERT INTO #everything
SELECT id, a, b, c FROM table WHERE a < 2;
SET #count = ##rowcount;
SELECT TOP (5) id FROM #everything;
DROP TABLE #everything;

Here's a relatively efficient way to get 5 random rows and include the total count. The random element will introduce a full sort no matter where you put it.
SELECT TOP (5) id,a,b,c,total = COUNT(*) OVER()
FROM dbo.mytable
ORDER BY NEWID();

Assuming you want the top 5 ordering by id ascending, this will do it with a single pass through your table.
; WITH Everything AS
(
SELECT id
, a
, b
, c
, ROW_NUMBER() OVER (ORDER BY id ASC) AS rn_asc
, ROW_NUMBER() OVER (ORDER BY id DESC) AS rn_desc
FROM <table>
)
SELECT id
, a
, b
, c
, rn_asc + rn_desc - 1 AS total_rows
FROM Everything
WHERE rn_asc <= 5

** Update **
This is what I'm doing now and I kind of like it over the CTE although CTEs as so elegant. Let me know what you think. Thanks!
-- #count is passed in as an out param to the stored procedure
CREATE TABLE dbo.#everything (id int, a int, b int, c int);
INSERT INTO #everything
SELECT id, a, b, c FROM table WHERE a < 2;
SET #count = ##rowcount;
SELECT TOP (5) id FROM #everything;
DROP TABLE #everything;

Related

Create a SELECT with defined number of rows

I have a stored procedure that should insert some random rows in a table depending on the amount values
#amount1 INT --EligibilityID = 1
#amount2 INT --EligibilityID = 2
#amount3 INT --EligibilityID = 3
Maybe the obvious way is to use TOP(#amount) but there are a lot of amount values and the second select is much larger. So, I was looking for a way to do it in a single statement if possible.
INSERT INTO [dbo].[CaseInfo]
SELECT ([EligibilityID],[CaseNumber],[CaseMonth])
FROM (
SELECT TOP(#amount1) [EligibilityID],[CaseNumber],[CaseMonth]
FROM [dbo].[tempCases]
WHERE [EligibilityID] = 1
)
INSERT INTO [dbo].[CaseInfo]
SELECT ([EligibilityID],[CaseNumber],[CaseMonth])
FROM (
SELECT TOP(#amount2) [EligibilityID],[CaseNumber],[CaseMonth]
FROM [dbo].[tempCases]
WHERE [EligibilityID] = 2
)
INSERT INTO [dbo].[CaseInfo]
SELECT ([EligibilityID],[CaseNumber],[CaseMonth])
FROM (
SELECT TOP(#amount3) [EligibilityID],[CaseNumber],[CaseMonth]
FROM [dbo].[tempCases]
WHERE [EligibilityID] = 3
)
I would recommend to use row_number, partitioned by eligibilityID, and then compare it with a case statement to select the correct variable each time:
INSERT INTO [dbo].[CaseInfo]
SELECT ([EligibilityID],[CaseNumber],[CaseMonth])
FROM (
SELECT [EligibilityID],[CaseNumber],[CaseMonth]
,row_number() over (partition by EligibilityID order by CaseNumber) as rn -- you haven't mentioned an ORDER BY, you can change it here
FROM [dbo].[tempCases]
) as table1
where rn<=case
when EligibilityID=1 then #amount1
when EligibilityID=2 then #amount2
when EligibilityID=3 then #amount3
end

Get random data from SQL Server without performance impact

I need to select random rows from my sql table, when search this cases in google, they suggested to ORDER BY NEWID() but it reduces the performance. Since my table has more than 2'000'000 rows of data, this solution does not suit me.
I tried this code to get random data :
SELECT TOP 10 *
FROM Table1
WHERE (ABS(CAST((BINARY_CHECKSUM(*) * RAND()) AS INT)) % 100) < 10
It also drops performance sometimes.
Could you please suggest good solution for getting random data from my table, I need minimum rows from that tables like 30 rows for each request. I tried TableSAMPLE to get the data, but it returns nothing once I added my where condition because it return the data by the basis of page not basis of row.
Try to calc the random ids before to filter your big table.
since your key is not identity, you need to number records and this will affect performances..
Pay attention, I have used distinct clause to be sure to get different numbers
EDIT: I have modified the query to use an arbitrary filter on your big table
declare #n int = 30
;with
t as (
-- EXTRACT DATA AND NUMBER ROWS
select *, ROW_NUMBER() over (order by YourPrimaryKey) n
from YourBigTable t
-- SOME FILTER
WHERE 1=1 /* <-- PUT HERE YOUR COMPLEX FILTER LOGIC */
),
r as (
-- RANDOM NUMBERS BETWEEN 1 AND COUNT(*) OF FILTERED TABLE
select distinct top (#n) abs(CHECKSUM(NEWID()) % n)+1 rnd
from sysobjects s
cross join (SELECT MAX(n) n FROM t) t
)
select t.*
from t
join r on r.rnd = t.n
If your uniqueidentifier key is a random GUID (not generated with NEWSEQUENTIALID() or UuidCreateSequential), you can use the method below. This will use the clustered primary key index without sorting all rows.
SELECT t1.*
FROM (VALUES(
NEWID()),(NEWID()),(NEWID()),(NEWID()),(NEWID()),(NEWID()),(NEWID()),(NEWID()),(NEWID()),(NEWID())
,(NEWID()),(NEWID()),(NEWID()),(NEWID()),(NEWID()),(NEWID()),(NEWID()),(NEWID()),(NEWID()),(NEWID())
,(NEWID()),(NEWID()),(NEWID()),(NEWID()),(NEWID()),(NEWID()),(NEWID()),(NEWID()),(NEWID()),(NEWID())) AS ThirtyKeys(ID)
CROSS APPLY(SELECT TOP (1) * FROM dbo.Table1 WHERE ID >= ThirtyKeys.ID) AS t1;

DISTINCT Query for One Column but not any other columns

I have a small 2 column table. Lets say the columns are A and B. Column A needs to be distinct so that it does not display a repeated value. Column B needs to have everything selected in the query so if there are multiple B values for a value in A, the multiple values will display. How can I write a query that will do this for me?
While the duplicates are now gone...there is a bunch of blank space in my dropdown.
You could use a CTE to simplify it:
WITH CTE AS
(
SELECT A, B,
RN = ROW_NUMBER() OVER (PARTITION BY A ORDER BY A, B)
FROM dbo.TableName
)
SELECT A = CASE WHEN RN = 1 THEN Cast(A as varchar(50)) ELSE '' END,
B
FROM CTE

How to select Top % in T-SQL without using Top clause?

How to select Top 40% from a table without using the Top clause (or Top percent, the assignment is a little ambiguous) ? This question is for T-SQL, SQL Server 2008. I am not allowed to use Top for my assignment.
Thanks.
This is what I've tried but seems complicated. Isn't there an easier way ?
select top (convert (int, (select round (0.4*COUNT(*), 0) from MyTable))) * from MyTable
Try the NTILE function:
;WITH YourCTE AS
(
SELECT
(some columns),
percentile = NTILE(10) OVER(ORDER BY SomeColumn DESC)
FROM
dbo.YourTable
)
SELECT *
FROM YourCTE
WHERE percentile <= 4
The NTILE(10) OVER(....) creates 10 groups of percentages over your data - and thus, the top 40% are the groups no. 1, 2, 3, 4 of that result
Use NTILE
CREATE TABLE #temp(StudentID CHAR(3), Score INT)
INSERT #temp VALUES('S1',75 )
INSERT #temp VALUES('S2',83)
INSERT #temp VALUES('S3',91)
INSERT #temp VALUES('S4',83)
INSERT #temp VALUES('S5',93 )
INSERT #temp VALUES('S6',75 )
INSERT #temp VALUES('S7',83)
INSERT #temp VALUES('S8',91)
INSERT #temp VALUES('S9',83)
INSERT #temp VALUES('S10',93 )
SELECT * FROM (
SELECT NTILE(10) OVER(ORDER BY Score) AS NtileValue,*
FROM #temp) x
WHERE NtileValue <= 4
ORDER BY 1
Interesting enough I blogged about NTILE today: Does anyone use the NTILE() windowing function?
A problem with the NTILE(10) answers given so far is that if the table has 15 rows they will return 8 rows (53%) rather than the correct number to make up 40% (6).
If the number of rows is not evenly divisible by number of buckets the extra rows all go into the first buckets rather than being evenly distributed.
This alternative (borrows SQL Menace's table) avoids that issue.
WITH CTE
AS (SELECT *,
ROW_NUMBER() OVER ( ORDER BY Score) AS RN,
COUNT(*) OVER() AS Cnt
FROM #temp)
SELECT StudentID,
Score
FROM CTE
WHERE RN <= CEILING(0.4 * Cnt )
Using Top t-sql command:
select top 10 [Column_1],
[Column_2] from [Table]
order by [Column_1]
Using Paging method:
select
[Column_1],
[Column_2]
from
(Select ROW_NUMBER() Over (ORDER BY [Column_1]) AS Row,
[Column_1],
[Column_2]
FROM [Table]) as [alias]
WHERE (Row between 0 and 10)
This is finding the top 10 with order by [Column_1]...please note this is using [variable] method of documentation.
If you could provide column names and table names i could write much more beneficial t-sql, for example to find the top 40% you are going to need to do another sub-query to get count of all rows then do division, i'd likely do this as a query before i do the main query.
Calculate and set ROWCOUNT for whatever number of records.
Then execute you query for the limited set.
declare #rc as integer
select #rc = count(*)*0.40 from CTE
Set ROWCOUNT #rc
select * from CTE
ROWCOUNT is not deprecated yet - see http://msdn.microsoft.com/en-us/library/ms188774.aspx

How do I select last 5 rows in a table without sorting?

I want to select the last 5 records from a table in SQL Server without arranging the table in ascending or descending order.
This is just about the most bizarre query I've ever written, but I'm pretty sure it gets the "last 5" rows from a table without ordering:
select *
from issues
where issueid not in (
select top (
(select count(*) from issues) - 5
) issueid
from issues
)
Note that this makes use of SQL Server 2005's ability to pass a value into the "top" clause - it doesn't work on SQL Server 2000.
Suppose you have an index on id, this will be lightning fast:
SELECT * FROM [MyTable] WHERE [id] > (SELECT MAX([id]) - 5 FROM [MyTable])
The way your question is phrased makes it sound like you think you have to physically resort the data in the table in order to get it back in the order you want. If so, this is not the case, the ORDER BY clause exists for this purpose. The physical order in which the records are stored remains unchanged when using ORDER BY. The records are sorted in memory (or in temporary disk space) before they are returned.
Note that the order that records get returned is not guaranteed without using an ORDER BY clause. So, while any of the the suggestions here may work, there is no reason to think they will continue to work, nor can you prove that they work in all cases with your current database. This is by design - I am assuming it is to give the database engine the freedom do as it will with the records in order to obtain best performance in the case where there is no explicit order specified.
Assuming you wanted the last 5 records sorted by the field Name in ascending order, you could do something like this, which should work in either SQL 2000 or 2005:
select Name
from (
select top 5 Name
from MyTable
order by Name desc
) a
order by Name asc
You need to count number of rows inside table ( say we have 12 rows )
then subtract 5 rows from them ( we are now in 7 )
select * where index_column > 7
select * from users
where user_id >
( (select COUNT(*) from users) - 5)
you can order them ASC or DESC
But when using this code
select TOP 5 from users order by user_id DESC
it will not be ordered easily.
select * from table limit 5 offset (select count(*) from table) - 5;
Without an order, this is impossible. What defines the "bottom"? The following will select 5 rows according to how they are stored in the database.
SELECT TOP 5 * FROM [TableName]
Well, the "last five rows" are actually the last five rows depending on your clustered index. Your clustered index, by definition, is the way that he rows are ordered. So you really can't get the "last five rows" without some order. You can, however, get the last five rows as it pertains to the clustered index.
SELECT TOP 5 * FROM MyTable
ORDER BY MyCLusteredIndexColumn1, MyCLusteredIndexColumnq, ..., MyCLusteredIndexColumnN DESC
Search 5 records from last records you can use this,
SELECT *
FROM Table Name
WHERE ID <= IDENT_CURRENT('Table Name')
AND ID >= IDENT_CURRENT('Table Name') - 5
If you know how many rows there will be in total you can use the ROW_NUMBER() function.
Here's an examble from MSDN (http://msdn.microsoft.com/en-us/library/ms186734.aspx)
USE AdventureWorks;
GO
WITH OrderedOrders AS
(
SELECT SalesOrderID, OrderDate,
ROW_NUMBER() OVER (ORDER BY OrderDate) AS 'RowNumber'
FROM Sales.SalesOrderHeader
)
SELECT *
FROM OrderedOrders
WHERE RowNumber BETWEEN 50 AND 60;
In SQL Server 2012 you can do this :
Declare #Count1 int ;
Select #Count1 = Count(*)
FROM [Log] AS L
SELECT
*
FROM [Log] AS L
ORDER BY L.id
OFFSET #Count - 5 ROWS
FETCH NEXT 5 ROWS ONLY;
Try this, if you don't have a primary key or identical column:
select [Stu_Id],[Student_Name] ,[City] ,[Registered],
RowNum = row_number() OVER (ORDER BY (SELECT 0))
from student
ORDER BY RowNum desc
You can retrieve them from memory.
So first you get the rows in a DataSet, and then get the last 5 out of the DataSet.
There is a handy trick that works in some databases for ordering in database order,
SELECT * FROM TableName ORDER BY true
Apparently, this can work in conjunction with any of the other suggestions posted here to leave the results in "order they came out of the database" order, which in some databases, is the order they were last modified in.
select *
from table
order by empno(primary key) desc
fetch first 5 rows only
Last 5 rows retrieve in mysql
This query working perfectly
SELECT * FROM (SELECT * FROM recharge ORDER BY sno DESC LIMIT 5)sub ORDER BY sno ASC
or
select sno from(select sno from recharge order by sno desc limit 5) as t where t.sno order by t.sno asc
When number of rows in table is less than 5 the answers of Matt Hamilton and msuvajac is Incorrect.
Because a TOP N rowcount value may not be negative.
A great example can be found Here.
i am using this code:
select * from tweets where placeID = '$placeID' and id > (
(select count(*) from tweets where placeID = '$placeID')-2)
In SQL Server, it does not seem possible without using ordering in the query.
This is what I have used.
SELECT *
FROM
(
SELECT TOP 5 *
FROM [MyTable]
ORDER BY Id DESC /*Primary Key*/
) AS T
ORDER BY T.Id ASC; /*Primary Key*/
DECLARE #MYVAR NVARCHAR(100)
DECLARE #step int
SET #step = 0;
DECLARE MYTESTCURSOR CURSOR
DYNAMIC
FOR
SELECT col FROM [dbo].[table]
OPEN MYTESTCURSOR
FETCH LAST FROM MYTESTCURSOR INTO #MYVAR
print #MYVAR;
WHILE #step < 10
BEGIN
FETCH PRIOR FROM MYTESTCURSOR INTO #MYVAR
print #MYVAR;
SET #step = #step + 1;
END
CLOSE MYTESTCURSOR
DEALLOCATE MYTESTCURSOR
Thanks to #Apps Tawale , Based on his answer, here's a bit of another (my) version,
To select last 5 records without an identity column,
select top 5 *,
RowNum = row_number() OVER (ORDER BY (SELECT 0))
from [dbo].[ViewEmployeeMaster]
ORDER BY RowNum desc
Nevertheless, it has an order by, but on RowNum :)
Note(1): The above query will reverse the order of what we get when we run the main select query.
So to maintain the order, we can slightly go like:
select *, RowNum2 = row_number() OVER (ORDER BY (SELECT 0))
from (
select top 5 *, RowNum = row_number() OVER (ORDER BY (SELECT 0))
from [dbo].[ViewEmployeeMaster]
ORDER BY RowNum desc
) as t1
order by RowNum2 desc
Note(2): Without an identity column, the query takes a bit of time in case of large data
Get the count of that table
select count(*) from TABLE
select top count * from TABLE where 'primary key row' NOT IN (select top (count-5) 'primary key row' from TABLE)
If you do not want to arrange the table in ascending or descending order. Use this.
select * from table limit 5 offset (select count(*) from table) - 5;

Resources