Sybase query optimization

I'm looking at how we can improve the performance of the following Sybase query. Currently it takes about 1.5 hours.
CREATE TABLE #TempTable
(
T_ID numeric,
M_ID numeric,
M_USR_NAME char(10),
M_USR_GROUP char(10),
M_CMP_DATE datetime,
M_CMP_TIME numeric,
M_TYPE char(10),
M_ACTION char(15)
)
select
T.M_USR_NAME,
T.M_USR_GROUP,
T.M_CMP_DATE,
T.M_CMP_TIME,
T.M_TYPE,
T.M_ACTION
from #TempTable T, AUD_TN B
where T.M_ID=B.M_ID
and T.T_ID in
(
select M_NB from TRN H where (M_BENTITY ="KROP" or M_SENTITY = "KROP")
)
UNION
select
A.M_USR_NAME,
A.M_USR_GROUP,
A.M_DATE_CMP,
A.M_TIME_CMP,
A.M_TYPE,
A.M_ACTION
from AUD_VAL A, TRN H
where A.M_DATE_CMP >= '1 May 2012' and A.M_DATE_CMP <= '31 May 2012'
and A.M_ACT_NB0=H.M_NB
and (H.M_BENTITY ="KROP" or H.M_SENTITY = "KROP")
UNION
select
TR.M_USR_NAME,
TR.M_USR_GROUP,
TR.M_DATE_CMP,
TR.M_TIME_CMP,
TR.M_TYPE,
TR.M_ACTION
from TRN_AUD TR, TRN H
where TR.M_DATE_CMP >= '1 May 2012' and TR.M_DATE_CMP <= '31 May 2012'
and TR.M_ACT_NB0=H.M_NB
and (H.M_BENTITY ="KROP" or H.M_SENTITY = "KROP")
DROP table #TempTable
Any help is greatly appreciated. Please note the following
The only table which is not indexed above is AUD_TN
Cheers
RC

Presumably the temporary table is populated, and with a lot of rows?
The temp doesn't need to be indexed but all joins in that part will need to use indexes.
Why not try each part of the UNION separately to find if one of them's slow?
Are you able to use SET SHOWPLAN ON? You'll probably need it, because you need to be able to check that Sybase is using indexes for the joins.
Are TRN M_BENTITY and M_SENTITY indexed? If not your IN is going to be a bit slow, although it might be okay, doing a single table scan into a worktable that Sybase'll index internally. You could also use an EXISTS instead - that might/should work better, as in the sketch below.
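A minimal sketch of that EXISTS rewrite for the first branch of the UNION, using the same tables and columns as the original query (whether it actually beats the IN depends on your indexes and what the optimizer does with it):
-- first branch of the UNION, with the IN subquery rewritten as EXISTS
select
T.M_USR_NAME,
T.M_USR_GROUP,
T.M_CMP_DATE,
T.M_CMP_TIME,
T.M_TYPE,
T.M_ACTION
from #TempTable T, AUD_TN B
where T.M_ID = B.M_ID
and exists (select 1
            from TRN H
            where H.M_NB = T.T_ID
            and (H.M_BENTITY = "KROP" or H.M_SENTITY = "KROP"))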
2nd part - both tables have SARGs (search arguments - look them up in Sybooks if you're not familiar with the term). I don't know what proportion of rows they select, but assuming it's a small fraction you should see an index used on a SARG for whichever table is scanned first, and then an index join (or perhaps merge join) to the second - but using indexes.
3rd part - similar discussion to the 2nd.
I reckon it'll be the 2nd or 3rd part.

How about using a named cache for these tables? If the query is run on a regular basis, it's better to create a named cache and bind the tables to it. Also bind tempdb to a cache. This can greatly improve execution time. If the temp table is huge then you can create an index on it, which may help with performance, but I'd need some more details for that.
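A rough sketch of that in Sybase ASE; the cache name, size and database name here are made up, and creating/binding caches needs sa_role (on older versions sp_cacheconfig may also need a restart to take effect):
-- create a named cache (example size only)
sp_cacheconfig 'audit_cache', '200M'
go
-- bind the large audit tables to it ('yourdb' is a placeholder)
sp_bindcache 'audit_cache', 'yourdb', 'AUD_VAL'
go
sp_bindcache 'audit_cache', 'yourdb', 'TRN_AUD'
go
-- bind tempdb to a cache as suggested above
sp_bindcache 'audit_cache', 'tempdb'
go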

If you still have this issue open :
1) Try this at top of sql batch
set showplan on
set noexec on
See if the expected indexes are being picked up by the SQL optimizer. If no indexes exist on the columns in the where clause, please create one. Create a clustered index if possible.
2) In the first query you can replace the subquery in where clause with
create table #T_ID (
M_NB datatype
)
insert into #T_ID
select M_NB from TRN H where (M_BENTITY ="KROP" or M_SENTITY = "KROP")
then add #T_ID to the from clause and modify the where clause as:
from #TempTable T, AUD_TN B, #T_ID I
where T.M_ID=B.M_ID
and T.T_ID = I.M_NB
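Separately, since AUD_TN is the only unindexed table in that join (per the question), a hedged sketch of indexes that could help - the column choices simply follow the join and lookup columns already shown above:
-- AUD_TN is joined on M_ID
create index idx_AUD_TN_M_ID on AUD_TN (M_ID)
go
-- the new temp table is probed on M_NB
create index idx_T_ID_M_NB on #T_ID (M_NB)
go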

Related

Optimize SQL in MS SQL Server that returns more than 90% of records in the table

I have the below sql
SELECT Cast(Format(Sum(COALESCE(InstalledSubtotal, 0)), 'F') AS MONEY) AS TotalSoldNet,
BP.BoundProjectId AS ProjectId
FROM BoundProducts BP
WHERE ( BP.IsDeleted IS NULL
OR BP.IsDeleted = 0 )
GROUP BY BP.BoundProjectId
I already have an index on the table BoundProducts on this column order (BoundProjectId, IsDeleted)
Currently this query takes around 2-3 seconds to return the result. I am trying to reduce it to zero seconds.
This query returns 25077 rows as of now.
Please give me any ideas to improve the query.
Looking at this from a slightly different point of view, I think your OR condition is hurting your query - why not rewrite it like this?
SELECT CAST(FORMAT(SUM(COALESCE(BP.InstalledSubtotal, 0)), 'F') AS MONEY) AS TotalSoldNet
, BP.BoundProjectId AS ProjectId
FROM (
SELECT BP.BoundProjectId, BP.InstalledSubtotal
FROM dbo.BoundProducts AS BP
WHERE BP.IsDeleted IS NULL
UNION ALL
SELECT BP.BoundProjectId, BP.InstalledSubtotal
FROM dbo.BoundProducts AS BP
WHERE BP.IsDeleted = 0
) AS BP
GROUP BY BP.BoundProjectId;
I've had better experience with UNION ALL than with OR.
It should return exactly the same results. On top of that, I'd create this index:
CREATE NONCLUSTERED INDEX idx_BoundProducts_IsDeleted_BoundProjectId_iInstalledSubTotal
ON dbo.BoundProducts (IsDeleted, BoundProjectId)
INCLUDE (InstalledSubTotal);
It should satisfy your query conditions and allow a proper index seek. I know it's not considered a good idea to index bit fields, but it's worth trying.
P.S. Why not default your IsDeleted column to 0 and make it NOT NULL? That way a simple WHERE IsDeleted = 0 check would be enough, which would boost your query too.
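A rough sketch of that change, assuming IsDeleted is a bit column (backfill the NULLs first so the NOT NULL constraint can be applied):
-- replace existing NULLs, then disallow them and default new rows to 0
UPDATE dbo.BoundProducts SET IsDeleted = 0 WHERE IsDeleted IS NULL;
ALTER TABLE dbo.BoundProducts ALTER COLUMN IsDeleted bit NOT NULL;
ALTER TABLE dbo.BoundProducts ADD CONSTRAINT DF_BoundProducts_IsDeleted DEFAULT (0) FOR IsDeleted;
After that, the original WHERE clause collapses to a plain WHERE BP.IsDeleted = 0.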
If you really want to try an index seek, it should be possible with the FORCESEEK query hint, but I don't think it's going to make it any faster.
The options I suggested last time are still valid: remove FORMAT and/or create an indexed view.
You should also test if the problem is the query itself or just displaying the results after that, for example trying it with "select ... into #tmp". If that's fast, then the problem is not the query.
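For example, using the original query from the question unchanged apart from the INTO:
SELECT Cast(Format(Sum(COALESCE(InstalledSubtotal, 0)), 'F') AS MONEY) AS TotalSoldNet,
       BP.BoundProjectId AS ProjectId
INTO #tmp
FROM BoundProducts BP
WHERE (BP.IsDeleted IS NULL OR BP.IsDeleted = 0)
GROUP BY BP.BoundProjectId
DROP TABLE #tmp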
The index name in the screenshot is not the same as in the create index statement, but I assume that's just a name you changed for the question. If the scan is happening on another index, then you should include that one too.

Indexing and optimization of where clause based on datetime field

I have a database table with more than a million rows. When I execute this query it takes hours, mostly due to PAGEIOLATCH_SH waits. There is currently no indexing. Can you suggest possible indexes for the where clause? I believe it should be on the datetime column, since it is used in the where clause as well as the order by - if so, which index should I use?
if(<some condition>)
BEGIN
select <some columns>
From <some tables with joins(no lock)>
WHERE
((@var2 IS NULL AND a.addr IS NOT NULL) OR
(a.addr LIKE @var2 + '%')) AND
((@var3 IS NULL AND a.ca_id IS NOT NULL) OR
(a.ca_id = @var3)) AND
b.time >= @from_datetime AND b.time <= @to_datetime AND
(
(
b.shopping_product IN ('CX12343', 'BG8945', 'GF4543') AND
b.shopping_category IN ('online', 'COD')
)
OR
(
b.shopping_product = 'LX3454' and b.sub_shopping_list in ('FF544','GT544','KK543','LK5343')
)
OR
(
b.shopping_product = 'LK434434' and b.sub_shopping_list in ('LL5435','PO89554','IO948854','OR4334','TH5444')
)
OR
(
b.shopping_product = 'AZ434434' and b.sub_shopping_list in ('LL54352','PO489554','IO9458854','OR34334','TH54344')
)
)
ORDER BY
b.time desc
END
ELSE
BEGIN
select <some columns>
From <some tables with joins(no lock)>
where <similar where as above with slight difference>
END
Okay then,
I said "first take indexes on these : shopping_product and shopping_category sub_shopping_list , and secondly u can try on the date , after that see the execution plan. (or would be better to create partition on the time column)"
I'm working on oracle, but the basics are the same.
You can create 3 distinct indexes on that cols : shopping_product, shopping_category, sub_shopping_list . Or you can create 1 composite index for that 3 cols. The point is you need to examine the execution plan which one is the most effective for you.
Oh, and here is a.ca_id column (almost forget), you need an index for this too.
For the date column i think you would better create a partition instead of an index.
Summary, two ways: - create 4 distinct index (shopping_product,shopping_category,sub_shopping_list, ca_id) , create a range typed partition on the date column
- create 1 composite index (shopping_product,shopping_category,sub_shopping_list) and 1 normal index(ca_id) , create a range typed partition on the date column
You probably should learn about indexing if you're dealing with tables of this size. It's not a trivial process. JOIN operations are a big deal when sorting out which indexes you need. Read this. http://use-the-index-luke.com/
In the meantime, if your date-range is highly selective (that is, if
b.time >= @from_datetime AND b.time <= @to_datetime
chooses a reasonably small fraction of the rows in your database) you should try the following compound index.
b.shopping_product, b.time
If that doesn't help, try
b.time
by itself. The idea is to structure your index so the server can do a range scan. Without knowledge of your whole query, there's not much else to offer.
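A minimal sketch of those two options; the table name dbo.ShoppingData standing in for whatever sits behind alias b is hypothetical, so substitute your own:
-- option 1: compound index, product first, then the time range
CREATE NONCLUSTERED INDEX IX_ShoppingData_Product_Time
    ON dbo.ShoppingData (shopping_product, [time]);
-- option 2: time alone, supporting the range scan and the ORDER BY
CREATE NONCLUSTERED INDEX IX_ShoppingData_Time
    ON dbo.ShoppingData ([time]);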

How can I speed up this sql server query?

-- Holds last 30 valdates
create table #valdates(
date int
)
insert into #valdates
select distinct top (30) valuation_date
from tbsm.tbl_key_rates_summary
where valuation_date <= 20150529
order by valuation_date desc
select
sum(fv_change), sc_group, valuation_date
from
(select *
from tbsm.tbl_security_scorecards_summary
where valuation_date in (select date from #valdates)) as fact
join
(select *
from tbsm.tbl_security_classification
where sc_book = 'UC' ) as dim on fact.classification_id = dim.classification_id
group by
valuation_date, sc_group
drop table #valdates
This query takes around 40 seconds to return because the fact table has almost 13 million rows. Can I do anything about this?
Given that there's no proper index supporting the fetch, adding one is probably the easiest (or only) option to really improve the performance. Most likely an index like this would improve the situation a lot:
create index idx_security_scorecards_summary_1 on
tbl_security_scorecards_summary (valuation_date, classification_id)
include (fv_change)
Everything depends of course on how good the selectivity of the valuation_date and classification_id fields is (= how big a portion of the table needs to be fetched), and it might work better with the fields in the opposite order. The fv_change field is in the include section so that it's part of the index structure and there's no need to fetch it from the base table.
Include fields help if the SQL has to fetch a lot of rows from the table. If the number of rows this touches is small, it might not help at all. As always with indexing, this slows down inserts / updates and is optimized for this case only, so you should look at the bigger picture too.
The select is written in a slightly strange way; I'm not sure if that makes any difference, but you could also try the normal way of writing it:
select
sum(fact.fv_change), dim.sc_group, fact.valuation_date
from
tbsm.tbl_security_scorecards_summary fact
join tbsm.tbl_security_classification dim
on fact.classification_id = dim.classification_id
where
fact.valuation_date in (select date from #valdates) and
dim.sc_book = 'UC'
group by
fact.valuation_date,
dim.sc_group
Looking at "statistics io" output should give you a good idea which table is causing the slowness, and looking at query plan to see if there's any strange operators might help to understand the situation better.

MAX keyword taking a lot of time to select a value from a column

Well, I have a table with 40,000,000+ records, but when I try to execute a simple query it takes ~3 min to finish. Since I use the same query in my C# solution, which needs to execute it 100+ times, the overall performance of the solution is deeply hit.
This is the query that I am using in a proc
DECLARE @Id bigint
SELECT @Id = MAX(ExecutionID) from ExecutionLog where TestID=50881
select @Id
Any help to improve the performance would be great. Thanks.
What indexes do you have on the table? It sounds like you don't have anything even close to useful for this particular query, so I'd suggest trying to do:
CREATE INDEX IX_ExecutionLog_TestID ON ExecutionLog (TestID, ExecutionID)
...at the very least. Your query is filtering by TestID, so this needs to be the leading column in the composite index: if you have no index on TestID, then SQL Server will resort to scanning the entire table in order to find rows where TestID = 50881.
It may help to think of indexes on SQL tables in the same way as the index in the back of a big book: hierarchical and multi-level. If you were looking for something, you'd manually look under 'T' for TestID, and there'd be a sub-heading under TestID for ExecutionID. Without an index entry for TestID, you'd have to read through the entire book looking for TestID, then see if there's a mention of ExecutionID with it. This is effectively what SQL Server has to do.
If you don't have any indexes at all, then you'll find it useful to review all the queries that hit the table, and to make sure that one of the indexes you create for them is a clustered index (rather than non-clustered).
Try to re-work everything into something that works in a set based manner.
So, for instance, you could write a select statement like this:
;With OrderedLogs as (
Select ExecutionID,TestID,
ROW_NUMBER() OVER (PARTITION BY TestID ORDER By ExecutionID desc) as rn
from ExecutionLog
)
select * from OrderedLogs where rn = 1 and TestID in (50881, 50882, 50883)
This would then find the maximum ExecutionID for 3 different tests simultaneously.
You might need to store that result in a table variable/temp table, but hopefully you can instead continue building up a larger single query that processes all of the results in parallel.
This is the sort of processing that SQL is meant to be good at - don't cripple the system by iterating through the TestIDs in your code.
If you need to pass many test IDs into a stored procedure for this sort of query, look at Table Valued Parameters.
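A minimal sketch of that approach; the type and procedure names are made up, only ExecutionLog, TestID and ExecutionID come from the question, and the column types are assumptions:
-- a table type to carry the list of test IDs (int assumed for TestID)
CREATE TYPE dbo.TestIdList AS TABLE (TestID int PRIMARY KEY);
GO
CREATE PROCEDURE dbo.GetLatestExecutions
    @TestIds dbo.TestIdList READONLY
AS
BEGIN
    SET NOCOUNT ON;
    -- one MAX per TestID, resolved in a single set-based query
    SELECT l.TestID, MAX(l.ExecutionID) AS LastExecutionID
    FROM ExecutionLog AS l
    JOIN @TestIds AS t ON t.TestID = l.TestID
    GROUP BY l.TestID;
END
GO
-- usage: fill the TVP once and make a single call instead of 100+ round trips
DECLARE @ids dbo.TestIdList;
INSERT INTO @ids (TestID) VALUES (50881), (50882), (50883);
EXEC dbo.GetLatestExecutions @TestIds = @ids;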

Tune Slow SQL Query

I got an app running on my SQL Server that is starting to slow down on a specific task. I ran SQL Profiler and noticed that the
following query is taking an enormous (1-2 minutes) amount of time. I don't have access to the code to change the query.
Is there anything I can tune/change in the database? The PC10000 table in the statement below has approx. 119000 records. I also have the execution plan attached.
SELECT TOP 25
zProjectID, zTaskID, zTransactionNumber, zTransactionDate, zUserID,
zCostCategoryDDL, zCostCategoryString, zSubCostCategory, zSubCostCategoryString,
zDepartmentID, zJournalEntry, zPostingDate, zSalesPostingDate, zPeriodNumber,
zTransactionDescription, zBillingDescriptionLine1, zBillingDescriptionLine2,
zBillingDescriptionLine3, zBillingDescriptionLine4, zSalesAccountIndex,
zSalesAccountString, zDistDocumentTypeDDL, zDistDocumentNumber, zDistSequenceNumber,
zSalesDocumentTypeDDL, zSalesDocumentNumber, zSalesLineNumber, zDistHistoryYear,
zSeriesDDL, zSourceDoc, zWebSource, zOrigDocumentNumber, zOrigDocumentDate,
zOrigID, zOrigName, zExpenseStatusDDL, zApprovalUserIDCost, zAccountIndex,
zAccountNumberString, zBillingStatusDDL, zApprovalUserIDBilling, zBillingWorkQty,
zBillingWorkAmt, zQty, zQtyBilled, zUnitCost,
zUnitPrice, zRevenueAmt, zOriginatingRevenueAmt, zCostAmtEntered, zCostAmt,
zOriginatingCostAmt, zPayGroupID, zPayrollStatusDDL, zTotalTimeStatusDDL,
zEmployeeID, zHoursEntered, zHoursPaid, zPayRecord, zItemID, zItemDescription,
zUofM, zItemQty, zBurdenStatusDDL, zUserDefinedDate, zUserDefinedDate2,
zUserDefinedString, zUserDefinedString2, zUserDefinedCurrency,
zUserDefinedCurrency2, zNoteIndex, zImportType, DEX_ROW_ID
FROM
DBServer.dbo.pc10000
WHERE
(zDistDocumentNumber in
(select cast(JRNENTRY as varchar(20))
from DBServer..GL10001
where BACHNUMB = 'PMCHK00004283')
or zSalesDocumentNumber in
(select cast(JRNENTRY as varchar(20))
from DBServer..GL10001
where BACHNUMB = 'PMCHK00004283'))
ORDER BY
zProjectID ASC ,zTaskID ASC ,zTransactionNumber ASC
The biggest problem you have looks to be due to lack of suitable indexes.
You can see that because of the presence of Table Scans within the execution plan.
Table Scans hit performance as they mean the whole table is being scanned for data that matches the given clauses in the query.
I'd recommend you add an index on BACHNUMB in GL10001
You may also want to try indexes on zDistDocumentNumber and zSalesDocumentNumber in PC10000, but I think the GL10001 index is the main one.
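A hedged sketch of those indexes (the database qualification follows the query above; INCLUDE needs SQL Server 2005 or later, drop it on older versions):
-- main one: support the BACHNUMB lookups and cover JRNENTRY for the subqueries
CREATE NONCLUSTERED INDEX IX_GL10001_BACHNUMB
    ON DBServer..GL10001 (BACHNUMB) INCLUDE (JRNENTRY);
-- optionally, support the probes on the two document-number columns of PC10000
CREATE NONCLUSTERED INDEX IX_PC10000_zDistDocumentNumber
    ON DBServer.dbo.pc10000 (zDistDocumentNumber);
CREATE NONCLUSTERED INDEX IX_PC10000_zSalesDocumentNumber
    ON DBServer.dbo.pc10000 (zSalesDocumentNumber);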
"IN" clauses are typically quite expensive compared to other techniques, but as you can't change the query itself then there's nothing you can do about that.
Without a doubt, you need to add suitable indexes
The query is doing 2 table scans on the GL10001 table. From a quick look at the query (which is a bit hard to read) I would see if you have an index on the BACHNUMB column.
The execution plan shows pretty clearly that actually locating the rows is what's taking all the time (no cumbersome bookmark lookups, or aggregation/rearrange tasks), so it's quite positively going to be a question of indexing. Hover over the table scans in the execution plan and check 'object' in the tooltip to see what columns are being used. See to it that they're indexed.
You might also want to run a trace to sample some live data, and feed that to the database tuning advisor.
You could rewrite those sub-selects as a join, and add an index to GP01..GL10001 on BACHNUMB and JRNENTRY
Since you can't change the query, the best thing you could do is make sure you have indexes on the columns that you're using for your joins (and subqueries). If you can think of a better query plan, you could provide that to SQL Server instead of letting it calculate its own (this is a very rare case).
Replace the OR with a UNION ALL of two queries; this should get shot of those spools.
i.e. run the query once with something like this
SELECT ....
FROM DBServer.dbo.pc10000
WHERE zDistDocumentNumber in
(select cast(JRNENTRY as varchar(20))
from DBServer..GL10001
where BACHNUMB = 'PMCHK00004283')
UNION ALL
SELECT ....
FROM DBServer.dbo.pc10000
WHERE zSalesDocumentNumber in
(select cast(JRNENTRY as varchar(20))
from DBServer..GL10001
where BACHNUMB = 'PMCHK00004283')
In addition to adding indexes, you can also convert the IN statements to EXISTS... something along these lines:
SELECT TOP 25 ....
FROM GP01.dbo.pc10000 parent
WHERE EXISTS
(
SELECT child.*
FROM GP01..GL10001 child
WHERE BACHNUMB = 'PMCHK00004283'
and parent.zDistDocumentNumber = child.JRNENTRY
)
OR EXISTS
(
SELECT child2.*
FROM GP01..GL10001 child2
WHERE BACHNUMB = 'PMCHK00004283'
and parent.zSalesDocumentNumber = child2.JRNENTRY
)
ORDER BY zProjectID ASC ,zTaskID ASC ,zTransactionNumber ASC
