optimising sql select statement - sql-server

Is there a way to fetch 4 million records from SQL Server 2005 in under 60 seconds?
My table has 15 columns, each with a datatype of varchar(100), and there is no primary key.

Assuming you want the entire contents of the table, try this first:
SELECT col1, col2, ... col15 FROM your_table
If that is too slow, there's not really anything more you can do apart from changing your program design so that it is not necessary to fetch so many rows at once.
If these records will be displayed in a graphical user interface, you could consider using paging instead of fetching all the rows at once.
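A rough sketch of paging on SQL Server 2005 (which has no OFFSET/FETCH), using ROW_NUMBER(); the page size, page number and the ordering column col1 are illustrative assumptions:
DECLARE @PageNumber int, @PageSize int;
SET @PageNumber = 1;
SET @PageSize = 500;

SELECT col1, col2, col3 -- ... col15
FROM (
    SELECT col1, col2, col3, -- ... col15,
           ROW_NUMBER() OVER (ORDER BY col1) AS rn
    FROM your_table
) AS paged
WHERE rn BETWEEN (@PageNumber - 1) * @PageSize + 1
             AND @PageNumber * @PageSize
ORDER BY rn;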

Actually, the last time I did something like this, I put a filter dropdown above the grid and filtered the records according to whatever the user selects. I also gave an "All" option in the dropdown; when the user selects it, I show a message like "Retrieving all records will be a bit slow. Do you want to continue?". And in any case, as Mark suggested, I used paging.

Related

Optimize SQL Server query

I have created a view that returns data from more than one table using joins. When I select from that view without an ORDER BY clause, the query takes only about 1 second or less to execute. But when I use ORDER BY with my select query, it takes about 27 seconds to return only the TOP(15) records from that view.
Here is the query I run to get data from the view:
SELECT TOP(15) *
FROM V_transaction
ORDER BY time_stamp DESC
Note: the total number of records in the view is about 300,000.
What can I change in my view's design to get better performance?
The first thing that comes to mind is creating an index on time_stamp in the view. If you don't want to (or can't) create an indexed view, you could create an index on the column in the underlying table that the value comes from. This should improve your query's performance.
If you are still having issues, post the execution plan - it should show exactly where/why your query is experiencing performance problems.
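As a sketch, the base table name here is an assumption (only the view V_transaction is shown); the idea is to index whichever column the view's time_stamp comes from:
-- Hypothetical base table "transactions" that V_transaction reads time_stamp from
CREATE NONCLUSTERED INDEX IX_transactions_time_stamp
ON dbo.transactions (time_stamp DESC);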
Why don't you create an extra column which stores the number of days plus the number of seconds for each date record, and then order by that column?

Cognos Report Studio: Cascading Prompts populating really slow

I have a Cognos report with cascading prompts. The hierarchy is defined in the attached image.
The first parent (Division) fills the two cascading children in 3-5 seconds.
But when I select any Policy (which populates the two children beneath it), it takes around 2 minutes.
Facts:
The result set after two minutes is normal (~20 rows)
The queries behind all the prompts are simple SELECT DISTINCT Col_Name
I've created indexes on all the prompt columns.
I've tried turning on the local cache and setting the Execution Method to concurrent.
I'm on Cognos Report Studio 10.1
Any help would be much appreciated.
Thanks,
Nuh
There is an alternative to a one-off dimension table. Create a Query Subject in Framework for your AL-No prompt. In it, build a query that gets the distinct AL-No values (you said that is fast, probably because there is an index on AL-No). Wrap that in a select that filters on #prompt('pPolicy')# (assuming your Policy prompt is keyed to ?pPolicy?).
This will force the Policy filter into the SQL before it is sent to the database, while wrapping the distinct AL-No query still lets the AL-No index be used.
select AL_NO from
(
select AL_NO, Policy_NO
from CLAIMS
group by AL_NO, Policy_NO
) t
where Policy_NO = #prompt('pPolicyNo')#
Your issue is just too much table scanning. Typically, one would build a prompt page from dimension-based tables, not the fact table, though I admit that is not always possible with cascading prompts. The ideal solution is to create a one-off dimension table with these distinct values, then model that strictly for the prompts.
Watch out for indexing each field individually, as those indexes will not be used due to the selectivity of the values. A compound index of the fields may work instead. As with any DDL change, open SQL Profiler and see what SQL Cognos is generating, then run an explain plan before/after the changes.

The setting 'auto create statistics' causes wildcard TEXT field searches to hang

I have an interesting issue happening in Microsoft SQL when searching a TEXT field. I have a table with two fields, Id (int) and Memo (text), populated with hundreds of thousands of rows of data. Now, imagine a query, such as:
SELECT Id FROM Table WHERE Id=1234
Pretty simple. Let's assume there is a row with Id 1234, so it returns one row.
Now, let's add one more condition to the WHERE clause.
SELECT Id FROM Table WHERE Id=1234 AND Memo LIKE '%test%'
The query should pull one record, and then check to see if the word 'test' exists in the Memo field. However, if there is enough data, this statement will hang, as if it were searching the Memo field first, and then cross referencing the results with the Id field.
While this is what it appears to be doing, I just discovered that it is actually trying to create a statistic on the Memo field. If I turn off "auto create statistics", the query runs instantly.
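One way to confirm this (a sketch, assuming the table really is named dbo.[Table] as in the queries above) is to list the auto-created statistics after running the slow query; SQL Server names them with a _WA_Sys_ prefix:
SELECT s.name AS stats_name, c.name AS column_name
FROM sys.stats AS s
JOIN sys.stats_columns AS sc
    ON s.object_id = sc.object_id AND s.stats_id = sc.stats_id
JOIN sys.columns AS c
    ON sc.object_id = c.object_id AND sc.column_id = c.column_id
WHERE s.object_id = OBJECT_ID('dbo.[Table]')
  AND s.auto_created = 1;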
So my question is: how can you disable auto create statistics, but only for one query? Perhaps something like:
SET AUTO_CREATE_STATISTICS OFF
(I know, any normal person would just create a full-text index on this field and call it a day. The reason I can't necessarily do this is that our data center hosts an application for over 4,000 customers using the same database design. Not to mention, this problem happens on a variety of text fields in the database, so it would take tens of thousands of full-text indexes if I went that route. And adding a full-text index would add storage requirements, backup changes, disaster recovery procedure changes, red tape paperwork, etc...)
I don't think you can turn this off on a per query basis.
Best you can do would be to identify all potentially problematic columns and then CREATE STATISTICS on them yourself with 0 ROWS or 0 PERCENT specified and NORECOMPUTE.
If you have a maintenance window you can run this in, it would be best to run it without the 0 ROWS qualifier but still leave NORECOMPUTE in place.
You could also consider enabling AUTO_UPDATE_STATISTICS_ASYNC instead, so that statistics are still rebuilt automatically but in the background, rather than holding up compilation of the current query. Note that this is a database-wide option.
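A sketch of both suggestions, using the table and column names from the question as placeholders (YourDatabase is likewise a placeholder):
-- Create the statistics object yourself, sampling zero rows and never recomputing,
-- so the optimizer does not pause to auto-create it at query compile time
CREATE STATISTICS st_Memo ON dbo.[Table] (Memo)
WITH SAMPLE 0 ROWS, NORECOMPUTE;

-- Or let automatic statistics updates happen asynchronously in the background (database-wide option)
ALTER DATABASE YourDatabase SET AUTO_UPDATE_STATISTICS_ASYNC ON;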

SQL Query taking forever

I have this web application tool which queries data and shows it in a grid. A lot of people use it, so it has to be quite performant.
The thing is, I needed to add a couple of extra fields through joins and now it takes forever for the query to run.
If I run the following query in SQL Server:
select top 100 *
from bam_Prestatie_AllInstances p
join bam_Zending_AllRelationships r on p.ActivityID = r.ReferenceData
join bam_Zending_AllInstances z on r.ActivityID = z.ActivityID
where p.PrestatieZendingOntvangen >= '2010-01-26' and p.PrestatieZendingOntvangen < '2010-01-27'
This takes about 35-55 seconds, which is waaay too long, because this is only a small one.
If I remove one of the two date checks it only takes 1 second. If I remove the two joins it also takes only 1 second.
When I look at the query plan I can see that 100% of the time is spent on the indexing of the PrestatieZendingOntvangen field. If I set this field to be indexed, nothing changes.
Anybody have an idea what to do?
Because my clients are starting to complain about time-outs etc.
Thanks
Besides the obvious question of an index on the bam_Prestatie_AllInstances.PrestatieZendingOntvangen column, also check if you have indices for the foreign key columns:
p.ActivityID (table: bam_Prestatie_AllInstances)
r.ReferenceData (table: bam_Zending_AllRelationships)
r.ActivityID (table: bam_Zending_AllRelationships)
z.ActivityID (table: bam_Zending_AllInstances)
Indexing the foreign key fields can help speed up JOINs on those fields quite a bit!
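A sketch of those indexes, plus one for the date filter mentioned above (index names are illustrative; check which indexes already exist first):
-- Index for the date filter in the WHERE clause
CREATE NONCLUSTERED INDEX IX_Prestatie_PrestatieZendingOntvangen
ON bam_Prestatie_AllInstances (PrestatieZendingOntvangen);

-- Indexes for the join (foreign key) columns
CREATE NONCLUSTERED INDEX IX_Prestatie_ActivityID
ON bam_Prestatie_AllInstances (ActivityID);

CREATE NONCLUSTERED INDEX IX_ZendingRel_ReferenceData
ON bam_Zending_AllRelationships (ReferenceData);

CREATE NONCLUSTERED INDEX IX_ZendingRel_ActivityID
ON bam_Zending_AllRelationships (ActivityID);

CREATE NONCLUSTERED INDEX IX_Zending_ActivityID
ON bam_Zending_AllInstances (ActivityID);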
Also, as has been mentioned already: try to limit the fields being selected by specifying an explicit list of columns rather than using SELECT *. Especially if you join several tables, the sheer number of columns you select (multiplied by the number of rows) can cause massive data transfer - and if you don't need all those columns, that's just wasted bandwidth!
Specify the fields you want to retrieve, rather than *
Specify either Inner Join or Outer Join
Try between?
where p.PrestatieZendingOntvangen
between '2010-01-26 00:00:00' and '2010-01-27 23:00:00'
Have you placed an index on the date fields in your WHERE clause?
If not, I would create an index on those fields to see if it makes any difference to your times.
Of course, indexes take up more disk space, so you will have to consider the impact of that extra index.
EDIT:
The others have also made good points about specifying which columns you require in the SELECT instead of * (wildcard), and placing more indexes on foreign keys, etc.
Someone from a DB background can clear up my doubt on this.
I think you should specify the date in a style the DB is able to understand.
For example, assuming the date is stored in mm/dd/yyyy style inside the table and your query supplies the comparison date in a different style (yyyy-mm-dd), the performance will go down.
Am I being too naive in assuming this?
How many columns do bam_Prestatie_AllInstances and the other tables have? It looks like you are pulling all columns, and that can definitely be a performance issue.
Have you tried to select specific columns from specific tables such as:
select top 100 p.column1, p.column2, p.column3
Instead of querying all columns as you are currently doing:
select top 100 *

How to 'thin' a database table out?

I have a large DB table which I am using for testing. It contains 7.3m phone call records. I would like to delete many of those, but still maintain a good spread over phone numbers and dates. Is there any way of achieving this? Maybe something to do with table sample?
Delete where the id finishes in a 1 or 6? Or similar, depending on exactly how many you need to remove.
I.e., to keep just 10% of the records for testing, delete all the records whose id doesn't end in (say) 7.
(Note that a delete like this could take a while. You might be better doing a CREATE TABLE AS with just the records you need.)
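A sketch of that approach, assuming the table is called calls and has a numeric id column (both names are illustrative):
-- Keep roughly 10% of the rows for testing: delete every row whose id does not end in 7
DELETE FROM calls
WHERE id % 10 <> 7;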
Copy the data you want to keep:
SELECT TOP 1000 * INTO dbo.Buffer
FROM Data.Numbers
ORDER BY NewID()
Delete all data:
TRUNCATE TABLE Data.Numbers
Move the kept data back:
INSERT INTO Data.Numbers (column list) SELECT (column list) FROM dbo.Buffer
