Why is this stored procedure using TOP 100 PERCENT? - sql-server

I have the following stored procedure inside a third-party application running on SQL Server 2008 R2:
ALTER PROCEDURE [dbo].[GetContacts]
AS
BEGIN
---------
SELECT TOP (100) PERCENT .....
INTO [#temp200]
FROM dbo.Contact
ORDER BY dbo.Contact.Name
--SELECT * from #temp200
SELECT top 38 *
FROM #temp200
ORDER BY Fullname
delete top (38) FROM #temp200
SELECT top 38 *
FROM #temp200
ORDER BY Fullname
delete top (38) FROM #temp200
SELECT top 38 *
FROM #temp200
ORDER BY Fullname
delete top (38) FROM #temp200
SELECT *
FROM #temp200
ORDER BY Fullname
When I run this in SQL Server Management Studio, I get the following result tabs:
the first one contains 38 records.
the second one contains 38 records.
the third one contains 38 records.
the fourth one contains 30 records.
So in this case I get 144 records in total, and I am not sure what the purpose of the SELECT TOP (100) is, since I get 144 records anyway. As a test I changed the SELECT TOP(100) to SELECT TOP(35), in which case I got 2 results: the first one with 38 records and the second one with 17 records. Can anyone advise how the above stored procedure works?

The original author did not understand that tables have no inherent order. They tried to insert rows into the temp table in sorted order, which is not possible. The TOP 100 PERCENT trick silences the warning about that but does nothing to guarantee order.
In earlier SQL Server versions this code might well have worked by coincidence. More optimizations have been added since then, and this code is extremely brittle: DELETE TOP (38) has no ordering at all, so it removes an arbitrary 38 rows, not necessarily the 38 that were just selected. Rewrite it if you get the chance. It is a latent time bomb.
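One way such a rewrite could look is to number the rows once and select each page by row number, instead of selecting and then deleting. This is only a sketch under assumptions: ContactId is a hypothetical unique column used as a tie-breaker, and @PageSize and @Page are illustrative names that do not appear in the original procedure.

```sql
-- Sketch only: pages through #temp200 deterministically instead of
-- relying on DELETE TOP (38), which removes an arbitrary 38 rows.
-- ContactId is a hypothetical unique column used as a tie-breaker.
DECLARE @PageSize int = 38;
DECLARE @Page int = 1;  -- 1-based page number

WITH Numbered AS
(
    SELECT *,
           ROW_NUMBER() OVER (ORDER BY Fullname, ContactId) AS RowNum
    FROM #temp200
)
SELECT *
FROM Numbered
WHERE RowNum BETWEEN (@Page - 1) * @PageSize + 1 AND @Page * @PageSize
ORDER BY RowNum;
```

Running it with @Page set to 1, 2, 3 and so on returns consecutive, non-overlapping pages, because the ORDER BY inside ROW_NUMBER is a total order once the tie-breaker is included.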

Related

SELECT TOP 1 from table value function much slower than selecting all rows

I have a table-valued function that returns four rows:
SELECT *
FROM dbo.GetStuff('0D182B8B-7A80-4D45-8900-23FA01FCFE5A')
ORDER BY TurboEncabulatorID DESC
and this returns quickly (< 1s):
TurboEncabulatorID  Trunion  Casing  StatorSlots
------------------  -------  ------  -----------
4                   G        Red     19
3                   F        Pink    24
2                   H        Maroon  17
1                   G        Purple  32
But I only want the "last" row (i.e. the row with the highest TurboEncabulatorID). So I add the TOP:
SELECT TOP 1 *
FROM dbo.GetStuff('0D182B8B-7A80-4D45-8900-23FA01FCFE5A')
ORDER BY TurboEncabulatorID DESC
This query takes ~40s to run, with a huge amount of I/O, and a much worse query plan.
Obviously this is an issue with the optimizer, but how can I work around it?
I've updated all statistics.
I've rebuilt all indexes.
Bonus Reading
Will the OPTIMIZE option work in a multi-statement table function?
OPTIMIZE FOR UNKNOWN – a little known SQL Server 2008 feature
The workaround I came up with, which obviously isn't an answer, is to try to confuse the optimizer:
select top 1 rows
from top 100 percent of rows
from table valued function
In other words:
WITH optimizerWorkaround AS
(
SELECT TOP 100 PERCENT *
FROM dbo.GetStuff('0D182B8B-7A80-4D45-8900-23FA01FCFE5A')
ORDER BY TurboEncabulatorID DESC
)
SELECT TOP 1 *
FROM optimizerWorkaround
ORDER BY TurboEncabulatorID DESC
This returns quickly, as if I had no TOP in the first place.
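A commonly used alternative, not taken from the question itself, is to materialize the function's rows first and apply TOP to the stored result, so that the TOP cannot influence the plan chosen for the function. A minimal sketch, assuming the function's output is small enough to copy into tempdb; #stuff is a hypothetical temp table name:

```sql
-- Sketch only: #stuff is a hypothetical temp table name.
-- Step 1: run the function without TOP, which the question reports as fast.
SELECT *
INTO #stuff
FROM dbo.GetStuff('0D182B8B-7A80-4D45-8900-23FA01FCFE5A');

-- Step 2: apply TOP to the materialized rows only.
SELECT TOP 1 *
FROM #stuff
ORDER BY TurboEncabulatorID DESC;

DROP TABLE #stuff;
```

The cost is an extra write to tempdb, which is negligible for the four rows shown above but worth measuring for larger results.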

SQL select top 101

I have a strange situation with a select. I've noticed that when I select top 100, a record is not returned from the database, but when I select top 101 the record appears at position 41.
The query is like this:
select top 100 GroupId, count(HouseId)
from House h
group by h.GroupId
order by max([DateCreated]) desc
In every discussion about top 100 vs. top 101 that I've seen, people say that top 101 uses a different algorithm and can cause a speed problem, but my problem is not about that. With top 100 I'm missing a record that should appear at index 41. Has anybody noticed something like this?
When you use
order by max([DateCreated]) desc
the sort key is calculated before TOP is applied in the query.
Every time you include one more record, max([DateCreated]) positions that record according to its value among all the other records.
The only way this makes sense to me is if you have a lot of records with the same max([DateCreated]). What do you get when you run this:
select top 100 with ties GroupId, count(HouseId)
from House h
group by h.GroupId
order by max([DateCreated]) desc
If you get more than 100 records, then the database was just picking the first 100 rows it got to after sorting the result set. When you changed it to TOP 101, another row was added and it probably ended up at row 41 due to clustered indexes or other query engine implementation details that might affect row order when the query isn't deterministic.
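If ties on max([DateCreated]) are indeed the cause, adding a unique tie-breaker to the ORDER BY makes the result deterministic. A minimal sketch, assuming GroupId is an acceptable secondary sort key (it is unique per row here because the query groups by it); HouseCount is an illustrative alias:

```sql
-- Sketch: GroupId as a secondary sort key makes the order total,
-- so TOP 100 and TOP 101 agree on the first 100 rows.
select top 100 h.GroupId, count(h.HouseId) as HouseCount
from House h
group by h.GroupId
order by max(h.[DateCreated]) desc, h.GroupId
```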

How to SELECT LIMIT in ASE 12.5? LIMIT 10, 10 gives syntax error?

How can I LIMIT the result returned by a query in Adaptive Server IQ/12.5.0/0306?
The following gives me a generic error near LIMIT:
SELECT * FROM mytable LIMIT 10, 10;
Any idea why? This is my first time with this DBMS.
Sybase IQ uses the ROW_COUNT option to limit the number of rows returned.
Set the row count before running your statement, and decide whether the setting should be TEMPORARY or not.
ROW_COUNT
SET options
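As a rough sketch of what that looks like, assuming the ROW_COUNT database option behaves as described in the Sybase IQ documentation (it caps the number of rows returned but does not skip rows, so it covers the limit but not the offset of LIMIT 10, 10); the Id column is an assumption:

```sql
-- Sketch, assuming Sybase IQ's ROW_COUNT database option.
-- Caps queries on this connection at 10 returned rows.
SET TEMPORARY OPTION ROW_COUNT = 10;

SELECT * FROM mytable ORDER BY Id;  -- Id is an assumed column name

-- 0 removes the cap again.
SET TEMPORARY OPTION ROW_COUNT = 0;
```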
The LIMIT clause is not supported in Sybase IQ 12. I don't think there is a simple or clean solution, much as in old versions of SQL Server. But there are some approaches that work for SQL Server 2000 and should also work for Sybase IQ 12. I can't promise that the queries below will work when copied and pasted.
Subquery
SELECT TOP 10 *
FROM mytable
WHERE Id NOT IN (
SELECT TOP 10 Id FROM mytable ORDER BY OrderingColumn
)
ORDER BY OrderingColumn
Basically, it fetches 10 rows while skipping the first 10 rows. For this to work, Id must be unique and the ordering matters. If an Id can appear more than once in the results, valid rows may be filtered out.
Asc-Desc
Another workaround depends on ordering. It sorts in both directions to fetch the 10 rows of the second page. You have to take care with the last page (it does not work properly with the simple formula page * rows per page).
SELECT *
FROM
(
SELECT TOP 10 *
FROM
(
SELECT TOP 20 * -- (page * rows per page)
FROM mytable
ORDER BY Id
) AS t1
ORDER BY Id DESC
) AS t2
ORDER BY Id ASC
I've found some reports that subqueries in the FROM clause do not work in ASE 12, so this approach may not be possible.
Basic iteration
In this scenario you just iterate through the rows. Let's assume the Id of the tenth row is 15. The query then selects the next 10 rows after the tenth row. This breaks down if you order by a column other than Id; that is not possible with this approach.
SELECT TOP 10 *
FROM mytable
WHERE Id > 15
ORDER BY Id
Here is an article about other workarounds in SQL Server 2000. Some should also work in similar ways in Sybase IQ 12.
http://www.codeproject.com/Articles/6936/Paging-of-Large-Resultsets-in-ASP-NET
All of these are workarounds. If you can, migrate to a newer version.

Slow "Select" Query with varchar(max)

I have a small table with 500 rows.
This table has 10 columns including one varchar(max) column.
When I perform this query:
SELECT TOP 36 *
FROM MyTable
WHERE (Column1 = Value1)
It retrieves around 36 rows in 3 minutes.
The varchar(max) column contains 3000 characters in each row.
If I try to retrieve only one row less:
SELECT TOP 35 *
FROM MyTable
WHERE (Column1 = Value1)
Then the query retrieves 35 rows in 0 seconds.
In my clients statistics, Bytes received from server, I have:
95 292 for the query retrieving data in 0 sec
over 200 000 000 for the query retrieving data in 3 min
Do you know where this comes from?
EDIT --- Here is my real code:
select top 36 *
from Snapshots
where ExamId = 212
select top 35 *
from Snapshots
where ExamId = 212
EDIT --- More info on clients statistics
The two statistics having a huge variation are:
Bytes received from server : 66 038 Vs More than 2 000 000
TDS packets received from server 30 Vs 11000
varchar(max) can't be part of an index key. Its other major drawback is that it cannot be stored internally as one contiguous memory area, since values can grow up to 2 GB. So to improve performance you should avoid it.
Create an index on ExamId, and use select field1, field2, etc. instead of select *.
I am not sure, but try this:
select * from Snapshots where ExamId in (select top 36 ExamId from Snapshots where ExamId = 212)
Your execution time on the server should be very low; it is the fetch of the results that takes much longer.
Remove the varchar(max) from the SELECT TOP statement and only retrieve those values as you specifically need them.
Include SET STATISTICS IO ON before running the SELECT query and provide the output. Also, can you post the query plans from the 2 different queries as that will go a long way to explaining what the differences are. You can use https://www.brentozar.com/pastetheplan/ to upload it and provide the links.
Your TOP also does not have a matching ORDER BY so you cannot guarantee the ordering of the first 35 or 36 rows returned. This means that the 35 rows may not all be included in the 36 and you may be returning hugely different volumes of data.
Finally, also try in SSMS to enable Client Statistics with the query - this will show whether the delay is at the server side or all in latency in returning the result set to you.
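To check whether a single oversized row explains the jump in bytes received, one option is to list the size of the varchar(max) value per row under a fixed ordering. This is a sketch only: SnapshotId and BigText are hypothetical column names, since the table definition was not posted.

```sql
-- Sketch: SnapshotId (unique key) and BigText (the varchar(max) column)
-- are hypothetical names; substitute the real ones.
select top 36
       SnapshotId,
       datalength(BigText) as BigTextBytes
from Snapshots
where ExamId = 212
order by SnapshotId
```

With the ORDER BY on a unique key, the first 35 rows of this result are exactly what TOP 35 would return under the same ordering, so an unusually large BigTextBytes in the 36th row would point at the data rather than at the plan.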
Without the complete table description as a DDL statement (CREATE TABLE...) and indexes, it is very difficult to answer.
One important question is: do you use the "directive" TEXTIMAGE_ON when creating your table ? This will separate LOBs storage from relational data to avoid row overflow storage...
As other people have said, you should post the schema (data types and existing indexes) of the Snapshots table.
I assume ExamId has a non-unique, nonclustered index on the Snapshots table.
One ExamId has many records. The Snapshots table must also have a PK column.
The TOP clause should always be used with an ORDER BY clause. TOP without ORDER BY is non-deterministic.
On what basis would it select the top N?
So once the schema of Snapshots is known, the correct index can be decided.
ORDER BY can also be non-deterministic, but that is another discussion.
You can try this:
create table #temp(PKID int)
insert into #temp(pkid)
select top 36 pkid
from dbo.Snapshots
where ExamId = 212
Then you can do this:
select col1,col2,col3,col4
from dbo.Snapshots S
where exists(select 1 from #temp t where t.pkid=s.pkid)
Now, your main question and problem:
Why are 35 rows retrieved in 0 seconds and 36 rows in 3 minutes?
I will write that up here soon. Meanwhile, I am waiting for the complete structure of the Snapshots table.

Different output when executing statement directly and from stored procedure?

SQL Server 2008 is behaving in a strange way. When I execute the stored procedure, the output is in a different order than when I execute the statements directly with the same parameters. I am not sure what I am doing wrong. Please help!
Here is a simplified query structure and an explanation of what it does.
Top 10 Query1
Union all
Top 10 Query2
Order by name
a. When you run it in a proc:
From Query 1 it fetches the top 10, then from Query 2 it fetches the top 10, and finally it applies the order.
b. When you run the query directly:
From Query 1 it applies the order and then fetches the top 10, and from Query 2 it also applies the order and then fetches the top 10.
It is strange that it does 2 different things with the same query.
Output from Procedure
Name                                Cost Price
A2 Bag Stickerss DO NOT STOCKTAKES  24
aaaaaa                              5
aaaaaa                              7.5
Output from Query
Name                                Cost Price
A2 Bag Stickerss DO NOT STOCKTAKES  24
A2 Bag Stickerss DO NOT STOCKTAKES  27
aaaaaa                              5
aaaaaa                              7.5
aaaaaa                              9
TOP without ORDER BY is not deterministic.
It just means "Select any 10 records". So you are selecting an arbitrary set of 10 results from query 1 and an arbitrary set of 10 records from query 2 then ordering these 20 records by name.
Which TOP 10 you end up with depends on the plan chosen (which may well be different in the stored procedure). You would need to add an ORDER BY (on a set of columns with no ties) to each query to make it deterministic.
Your current query is like
SELECT TOP 10 *
FROM master..spt_values
UNION ALL
SELECT TOP 10 *
FROM master..spt_values
ORDER BY name
You can see that SQL Server just adds a TOP iterator to both branches of the plan to limit the output of both queries; these then feed into the union, and the sort by name happens after that. SQL Server chose a clustered index scan for this, so the results will likely be the TOP 10 in clustered index order (type, number, name), though this shouldn't be relied upon either. Without a specified ORDER BY to indicate what the TOP refers to, any set of 10 rows would be valid. It would be perfectly valid for it to use the advanced scanning feature here and give you an arbitrary 10 rows that it knows to be in cache because they have just been read by another query's scan.
To rewrite the query with TOP...ORDER BY specified for each element you could use CTEs as below.
;WITH Query1 AS
(
SELECT TOP 10 *
FROM master..spt_values
ORDER BY name,number,type
), Query2 AS
(
SELECT TOP 10 *
FROM master..spt_values
ORDER BY number,type,name
)
SELECT *
FROM Query1
UNION ALL
SELECT *
FROM Query2
ORDER BY name

Resources