SQL Query takes more than 3h - too long

SQL Query takes more than 3h - too long - query-optimization

I have a TableSpace with a capacity of 1GB and I send a query about PL SQL every month.
Here is my configuration
Tables
RowNumbers
DBOs
TableSpaces
Tbl_A
400 million rows
DB_E
TS_Y
Tbl_B
60 million rows
DB_F
TS_Z
Tbl_C
5 million rows
DB_F
TS_Z
I want to fill the resultset to this table :
Table
RowNumbers
DBOs
TableSpaces
Tbl_D
~10 million rows
DB_F
TS_Z
Here is my Query :
CREATE TABLE DB_F.Tbl_D TABLESPACE TS_Z AS
SELECT
SUM(DURATION) AS DURATION,
DATMOUNT,
NUMDOC,
TYPDOC,
TEAM,
MONTH
FROM (
SELECT DISTINCT
Tbl_B.NUMDOC,
Tbl_B.TYPDOC,
Tbl_C.TEAM,
Tbl_C.MONTH,
Tbl_A.DATEX,
Tbl_A.DATEX + 6 - (CASE WHEN TO_CHAR(Tbl_A.DATEX,'D') = '7' THEN '0' ELSE TO_CHAR(Tbl_A.DATEX,'D') END) AS DATMOUNT,
Tbl_A.HOURSFIRST,
Tbl_A.DURATION
FROM DB_F.Tbl_B Tbl_B
INNER JOIN DB_F.Tbl_C Tbl_C ON Tbl_B.IDAFFA = Tbl_C.ID_NAME
INNER JOIN DB_E.Tbl_A ON Tbl_A.NUMDOC = Tbl_B.NUMDOC
WHERE
Tbl_A.NUMDOC = Tbl_B.NUMDOC
AND TO_CHAR(Tbl_C.FIRSTDAY,'YYYYMM' ) <= TO_CHAR(ADD_MONTHS (SYSDATE,12),'YYYYMM' )
AND TO_CHAR(Tbl_C.FIRSTDAY,'YYYYMM' ) > TO_CHAR(ADD_MONTHS (SYSDATE,-13),'YYYYMM' )
AND Tbl_A.DATEX + 6 - (CASE WHEN TO_CHAR(Tbl_A.DATEX,'D') = '7' THEN '0' ELSE TO_CHAR(Tbl_A.DATEX,'D') END) <= Tbl_C.LASTDAY
AND Tbl_A.DATEX + 6 - (CASE WHEN TO_CHAR(Tbl_A.DATEX,'D') = '7' THEN '0' ELSE TO_CHAR(Tbl_A.DATEX,'D') END) >= Tbl_C.FIRSTDAY
AND Tbl_B.DATE_M <= Tbl_C.LASTDAY
AND Tbl_B.DATE_I >= Tbl_C.LASTDAY
AND Tbl_B.IDAFFA = Tbl_C.ID_NAME
AND (Tbl_A.NATURE = 'IT' OR Tbl_A.NATURE = 'IS')
)
GROUP BY
NUMDOC,
TYPDOC,
TEAM,
DATMOUNT,
MONTH;
Here the Query Plan with reduced Data because it takes too much time :
I will send later a second Plan
All where clauses of all 3 tables (Tbl_A, Tbl_B, Tbl_C) are indexed in a separate tablespace.
Unfortunately, the import of the data takes almost 3 hours, which is very long for 10 million lines. When the data is imported, I only have 100MB of free space in my tablespace.
Question: does the import of the data take so long because there is not enough free memory (100MB) available? Or if there is a possibility to change the query more efficient?
Thanks in advance

Related

SQL Query to group by time and roll up and concatenate string values

I am trying to get a particular format from a group of times and days between two tables.
Database:
MeetingTime table has a relationship from MeetingTime.DayOfWeekId (foreign key) to table DayOfWeek.Id (Primary Key). Example Query:
select t.ClassId, d.Name, t.StartTime, t.EndTime
From MeetingTime t
Inner Join DaysOfWeek d on d.Id = t.DayOfWeekId
Where t.classId = 8
Results:
My desired results for this set of data would be one row, because the start and end times are the same.
09:00-15:35 M/T/W/Th/F
NOTE, the start and end time above, can be separate columns above, the main goal is display the days of the week for each grouped time.
The monkey wrench is that the times can be completely different or the same. For example this data set:
I would want displayed in 2 rows:
07:35-14:15 M/T/W
08:00-14:15 Th/F
And finally, this dataset where all times are different:
Would display in 5 rows:
13:48-14:48 M
15:48-16:48 T
05:49-23:53 W
14:49-16:49 Th
13:49-16:49 F
I haven't had much success with grouping the times. I did figure out how to concatenate the days of the week rolling the days up into one column using the 'Stuff' Operator, but didn't get anywhere with the grouping of the start and end time coupled with this yet.
Concatenating and rolling up days:
STUFF((SELECT '/ ' +
(CASE
WHEN d.[Name] = 'Thursday' THEN SUBSTRING(d.[Name], 1, 2)
WHEN d.[Name] = 'Sunday' THEN 'U'
WHEN d.[Name] != '' THEN SUBSTRING(d.[Name], 1, 1)
ELSE NULL
END)
FROM MeetingTime m
Inner Join [DayOfWeek] d on d.Id = m.DayOfWeekId
Where m.ClassId = class.Id
FOR XML PATH('')), 1, 1, '') [ClassSchedule]
I'm also not opposed to just returning the rows and handling the data manipulation in C# code, but wanted to see if SQL could handle it.

I was able to get this working. Here is the query:
select
t.ClassId,
t.StartTime,
t.EndTime,
STUFF((SELECT '/' + (CASE
WHEN w.[Name] = 'Thursday' THEN SUBSTRING(w.[Name], 1, 2)
WHEN w.[Name] = 'Sunday' THEN 'U'
WHEN w.[Name] != '' THEN SUBSTRING(w.[Name], 1, 1)
ELSE NULL
END)
From MeetingTime s
Inner Join DayOfWeek w on w.Id = s.DayOfWeekId
Where s.classId = 7 and s.DayOfWeekId > 0
and s.StartTime = t.StartTime
and s.EndTime = t.EndTime
FOR XML PATH('')), 1, 1, '') [ClassSchedule]
From MeetingTime t
Inner Join DayOfWeek d on d.Id = t.DayOfWeekId
Where t.classId = 7 and t.DayOfWeekId > 0
Group by t.StartTime, t.EndTime, t.ClassId
Obviously hardcoded Id you would want to create a variable.
Results where the start and end time are all the same:
Some times the same and some different:
Some times the same and some different with days not in order:
Times all different:
Times with only Mon/Wed/Fri.
I feel pretty good about this, except I'd like to fix the order of the above result image where all times are different and the days are not in chronological order.

SQL - Finding Gaps in Coverage

I am running this problem on SQL server
Here is my problem.
have something like this
Dataset A
FK_ID StartDate EndDate Type
1 10/1/2018 11/30/2018 M
1 12/1/2018 2/28/2019 N
1 3/1/2019 10/31/2019 M
I have a second data source I have no control over with data something like this:
Dataset B
FK_ID SpanStart SpanEnd Type
1 10/1/2018 10/15/2018 M
1 10/1/2018 10/25/2018 M
1 2/15/2019 4/30/2019 M
1 5/1/2019 10/31/2019 M
What I am trying to accomplish is to check to make sure every date within each TYPE M record in Dataset A has at least 1 record in Dataset B.
For example record 1 in Dataset A does NOT have coverage from 10/26/2018 through 11/30/2018. I really only care about when the coverage ends, in this case I want to return 10/26/2018 because it is the first date where the span has no coverage from Dataset B.
I've written a function that does this but it is pretty slow because it is cycling through each date within each M record and counting the number of records in Dataset B. It exits the loop when it finds the first one but I would really like to make this more efficient. I am sure I am not thinking about this properly so any suggestions anyone can offer would be helpful.
This is the section of code I'm currently running
else if #SpanType = 'M'
begin
set #CurrDate = #SpanStart
set #UncovDays = 0
while #CurrDate <= #SpanEnd
Begin
if (SELECT count(*)
FROM eligiblecoverage ec join eligibilityplan ep on ec.plandescription = ep.planname
WHERE ec.masterindividualid = #IndID
and ec.planbegindate <= #CurrDate and ec.planenddate >= #CurrDate
and ec.sourcecreateddate = #MaxDate
and ep.medicaidcoverage = 1) = 0
begin
SET #Result = concat('NON Starting ',format(#currdate, 'M/d/yyyy'))
BREAK
end
set #CurrDate = #CurrDate + 1
end
end
I am not married to having a function it just could not find a way to do this in queries that wasn't very very slow.
EDIT: Dataset B will never have any TYPEs except M so that is not a consideration
EDIT 2: The code offered by DonPablo does de-overlap the data but only in cases where there is an overlap at all. It reduces dataset B to:
FK_ID SpanStart SpanEnd Type
1 10/1/2018 10/25/2018 M
instead of
FK_ID SpanStart SpanEnd Type
1 10/1/2018 10/25/2018 M
1 2/15/2019 4/30/2019 M
1 5/1/2019 10/31/2019 M
I am still futzing around with it but it's a start.

I would approach this by focusing on B. My assumption is that any absent record would follow span_end in the table. So here is the idea:
Unpivot the dates in B (adding "1" to the end dates)
Add a flag if they are present with type "M".
Check to see if any not-present records are in the span for A.
Check the first and last dates as well.
So, this looks like:
with bdates as (
select v.dte,
(case when exists (select 1
from b b2
where v.dte between b2.spanstart and b2.spanend and
b2.type = 'M'
)
then 1 else 0
end) as in_b
from b cross apply
(values (spanstart), (dateadd(day, 1, spanend)
) v(dte)
where b.type = 'M' -- all we care about
group by v.dte -- no need for duplicates
)
select a.*,
(case when not exists (select 1
from b b2
where a.startdate between b2.spanstart and b2.spanend and
b2.type = 'M'
)
then 0
when not exists (select 1
from b b2
where a.enddate between b2.spanstart and b2.spanend and
b2.type = 'M'
)
when exists (select 1
from bdates bd
where bd.dte between a.startdate and a.enddate and
bd.in_b = 0
)
then 0
when exists (select 1
from b b2
where a.startdate between b2.spanstart and b2.spanend and
b2.type = 'M'
)
then 1
else 0
end)
from a;
What is this doing? Four validity checks:
Is the starttime valid?
Is the endtime valid?
Are any intermediate dates invalid?
Is there at least one valid record?

Start by framing the problem in smaller pieces, in a sequence of actions like I did in the comment.
See George Polya "How To Solve It" 1945
Then Google is your friend -- look at==> sql de-overlap date ranges into one record (over a million results)
UPDATED--I picked Merge overlapping dates in SQL Server
and updated it for our table and column names.
Also look at theory from 1983 Allen's Interval Algebra https://www.ics.uci.edu/~alspaugh/cls/shr/allen.html
Or from 2014 https://stewashton.wordpress.com/2014/03/11/sql-for-date-ranges-gaps-and-overlaps/
This is a primer on how to setup test data for this problem.
Finally determine what counts via Ranking the various pairs of A vs B --
bypass those totally Within, then work with earliest PartialOverlaps, lastly do the Precede/Follow items.
--from Merge overlapping dates in SQL Server
with SpanStarts as
(
select distinct FK_ID, SpanStart
from Coverage_B as t1
where not exists
(select * from Coverage_B as t2
where t2.FK_ID = t1.FK_ID
and t2.SpanStart < t1.SpanStart
and t2.SpanEnd >= t1.SpanStart)
),
SpanEnds as
(
select distinct FK_ID, SpanEnd
from Coverage_B as t1
where not exists
(select * from Coverage_B as t2
where t2.FK_ID = t1.FK_ID
and t2.SpanEnd > t1.SpanEnd
and t2.SpanStart <= t1.SpanEnd)
),
DeOverlapped_B as
(
Select FK_ID, SpanStart,
(select min(SpanEnd) from SpanEnds as e
where e.FK_ID = s.FK_ID
and SpanEnd >= SpanStart) as SpanEnd
from SpanStarts as s
)
Select * from DeOverlapped_B
Now we have something to feed into the next steps, and we can use the above as a CTE
======================================
with SpanStarts as
(
select distinct FK_ID, SpanStart
from Coverage_B as t1
where not exists
(select * from Coverage_B as t2
where t2.FK_ID = t1.FK_ID
and t2.SpanStart < t1.SpanStart
and t2.SpanEnd >= t1.SpanStart)
),
SpanEnds as
(
select distinct FK_ID, SpanEnd
from Coverage_B as t1
where not exists
(select * from Coverage_B as t2
where t2.FK_ID = t1.FK_ID
and t2.SpanEnd > t1.SpanEnd
and t2.SpanStart <= t1.SpanEnd)
),
DeOverlapped_B as
(
Select FK_ID, SpanStart,
(select min(SpanEnd) from SpanEnds as e
where e.FK_ID = s.FK_ID
and SpanEnd >= SpanStart) as SpanEnd
from SpanStarts as s
),
-- find A row's coverage
ACoverage as (
Select
a.*, b.SpanEnd, b.SpanStart,
Case
When SpanStart <= StartDate And StartDate <= SpanEnd
And SpanStart <= EndDate And EndDate <= SpanEnd
Then '1within' -- starts, equals, during, finishes
When EndDate < SpanStart
Or SpanEnd < StartDate
Then '3beforeAfter' -- preceeds, meets, preceeded, met
Else '2overlap' -- one or two ends hang over spanStart/End
End as relation
From Coverage_A a
Left Join DeOverlapped_B b
On a.FK_ID = b.FK_ID
Where a.Type = 'M'
)
Select
*
,Case
When relation1 = '2' And StartDate < SpanStart Then StartDate
When relation1 = '2' Then DateAdd(d, 1, SpanEnd)
When relation1 = '3' Then StartDate
End as UnCoveredBeginning
From (
Select
*
,SUBSTRING(relation,1,1) as relation1
,ROW_NUMBER() Over (Partition by A_ID Order by relation, SpanStart) as Rownum
from ACoverage
) aRNO
Where Rownum = 1
And relation1 <> '1'

SQL query to show good records as well as null records

My query works perfectly well to find records with real values, however, I also need my query to show records with null values. So far my attempts at recreating this query to also show null values has resulted in losing at least 1 of my columns of results so now I'm looking for help.
This is my query so far:
SELECT sq.*, sq.TransactionCountTotal - sq.CompleteTotal as InProcTotal from
(
select
c.CustName,
t.[City],
sum (t.TransactionCount) as TransactionCountTotal
sum (
case
when (
[format] in (23,25,38)
or [format] between 400 and 499
or format between 800 and 899
)
then t.TransactionCount
else 0
end
) as CompleteTotal
FROM [log].[dbo].[TransactionSummary] t
INNER JOIN [log].[dbo].[Customer] c
on t.CustNo = c.CustNo
and t.City = c.City
and t.subno = c.subno
where t.transactiondate between '7/1/16' and '7/11/16'
group by c.CustName,t.City
) sq
This is currently what my query results show:
CustName City InProcTotal TransactionCountTotal Complete Total
Cust 1 City(a) 23 7 30
Cust 2 City(b) 74 2 76
Cust 3 City(c) 54 4 58
This is what I want my query results to show:
CustName City InProcTotal TransactionCountTotal Complete Total
Cust 1 City(a) 23 7 30
Cust 2 City(b) 74 2 76
Cust 3 City(c) 54 4 58
Cust 4 City(d) 0 0 0
Cust 5 City(e) 0 0 0

I suggest you use RIGHT JOIN in the place of INNER JOIN. You should then retain the rows from Customer that don't have matching rows in TransactionSummary.
You may also want to refactor the query like this so you use LEFT JOIN. The next person to work on the query will thank you; LEFT JOIN operations are more common.
FROM [log].[dbo].[Customer] c
LEFT JOIN [log].[dbo].[TransactionSummary] t
on t.CustNo = c.CustNo
and t.City = c.City

jwabsolution, your issue stems from grabbing all transactions instead of all customers. My mind works in this way: you want to select all of the customers & find all transaction states. Therefore, you should be selecting from the customer table. Also, you shouldn't use the INNER JOIN or you will ignore any customers that don't have transactions. Instead, use left join the transactions table. In this manner, you will retrieve all customers (even those with no transactions). Here is a good visual for SQL joins: http://www.codeproject.com/KB/database/Visual_SQL_Joins/Visual_SQL_JOINS_orig.jpg
So your query should look like this:
SELECT sq.*, sq.TransactionCountTotal - sq.CompleteTotal as InProcTotal from
(
select
c.CustName,
t.[City],
sum (t.TransactionCount) as TransactionCountTotal
sum (
case
when (
[format] in (23,25,38)
or [format] between 400 and 499
or format between 800 and 899
)
then t.TransactionCount
else 0
end
) as CompleteTotal
FROM [log].[dbo].[Customer] c
LEFT JOIN [log].[dbo].[TransactionSummary] t
on c.CustNo = t.CustNo
and c.City = t.City
and c.subno = t.subno
where t.transactiondate between '7/1/16' and '7/11/16'
group by c.CustName,t.City
) sq

Fixed it. Needed to use coalesce to get the values to show up properly.
Also added a "where" option if I want to query individual customers
SELECT sq.* ,sq.TransactionCountTotal - sq.CompleteTotal as [InProcTotal]
from
(
select
c.custname
,c.port
,sum(coalesce(t.transactioncount,0)) as TransactionCountTotal
,sum(
case when (
[format]in(23,25,38)
or[format]between 400 and 499
or[format]between 800 and 899)
then t.TransactionCount
else 0
end) as CompleteTotal
from log.dbo.customer c
left join log.dbo.TransactionSummary t
on c.custNo=t.custno
and c.subno=t.subno
and c.city=t.city
and t.transactiondate between '7/1/16' and '7/12/16'
/*where c.custname=''*/
group by c.custname,c.city
) sq

Exclude certain line depending on a certain condition in t-SQL

Firstly, I have created the following stock sales SELECT statement:
SELECT
PostST.TxDate AS TxDate,
PostST.Reference AS InvNum,
Client.Name AS CustomerName,
SalesRep.Code AS SalesRep,
WhseMst.Code AS Whse,
CONCAT (StkItem.Description_1, ' - ',StkItem.Code) AS Item,
_etblLotTracking.cLotDescription AS LotNumber,
CASE
WHEN PostST.TrCodeID = 34 THEN (PostST.Quantity * 1)
WHEN PostST.TrCodeID = 30 THEN (PostST.Quantity * -1)
ELSE 0
END AS QtySold,
CAST
(
CASE
WHEN PostST.TrCodeID = 34 THEN (PostST.Credit / PostST.Quantity)
WHEN PostST.TrCodeID = 30 THEN (PostST.Debit / (PostST.Quantity * -1))
ELSE 0
END
as [money]
)
AS CustomerPrice,
StkItem.ItemGroup AS ItemGroup,
StkItem.ulIIStockType AS StockType,
YEAR (PostST.TxDate) AS Year,
DATENAME (MONTH, (PostST.TxDate)) AS Month
FROM
PostST
INNER JOIN StkItem
ON StkItem.StockLink = PostST.AccountLink
INNER JOIN PostAR
ON PostAR.cAuditNumber = PostST.cAuditNumber
INNER JOIN Client
ON Client.DCLink = PostAR.AccountLink
INNER JOIN SalesRep
ON SalesRep.idSalesRep = PostAR.RepID
INNER JOIN WhseMst
ON WhseMst.WhseLink = PostST.WarehouseID
FULL JOIN _etblLotTracking
ON _etblLotTracking.idLotTracking = PostST.iLotID
WHERE
PostST.TrCodeID IN (34, 30) AND StkItem.ItemGroup <> 'TRAN'
This query is run from Sage Evolution and data is dumped into Excel. From there I manipulate to view each item and the sales history in quantity per month and year.
My problem is as follows:
Within the data that follows, is sometimes a customer who has perhaps stopped buying a particular item. What I need is for this statement to also filter out a line by the following condition:
If the customer does not have a sale for 6 months or more, then that STOCK ITEM must be filtered out for that specific customer.
Thank you!

Query runs quickly in Oracle SQL Developer, but slowly in SSRS 2008 R2

It's that simple: a query that runs in just a few seconds in SQL Developer connecting to Oracle 11g takes 15-25 minutes in SSRS 2008 R2. I haven't tried other versions of SSRS. So far I'm doing all the report execution from VS 2008.
I'm using the OLE DB Provider "OraOLEDB.Oracle.1" which in the past has seemed to give me better results than using the Oracle provider.
Here's what I've been able to determine so far:
• The delay is during the DataSet execution phase and has nothing to do with the result set or rendering time. (Proving by selecting the same rowset directly from a table I inserted it to.)
• SSRS itself is not hung up. It is truly waiting for Oracle which is where the delay is (proven by terminating the DB session from the Oracle side, which resulted in a prompt error in SSRS about the session being killed).
• I have tried direct queries with parameters in the form :Parameter. Very early versions of my query that were more simple worked okay for direct querying, but it seemed like past a certain complexity, the query would start taking forever from SSRS.
• I then switched to executing an SP that inserts the query results to a table or global temp table. This helped for a little while, getting me farther than direct querying, but again, it almost seems like increased query complexity or length eventually broke this method, too. Note: running a table-populating SP works because with option "use single transaction" checked in the DataSource options, DataSets are then run in the order of their appearance in the rdl file. DataSets that return no Fields are still run, as long as all their parameters are satisfied.
• I just tried a table-returning function and this still made no improvement, even though direct calls with literal parameters in SQL Developer return in 1-5 seconds.
• The database in question does not have statistics. It is part of a product created by a vendor and we have not had the time or management buy-in to create/update statistics. I played with the DYNAMIC_SAMPLING hint to calculate statistics on the fly and got a better execution plan: without statistics the cost-based optimizer had been poorly using a LOOP join instead of a HASH join, causing similar many-minute execution times. Thus I put in query hints to force join order and also to cause it to use the strategic hash join, bringing the execution time back down to just seconds. I did not go back and try direct querying in SSRS using these execution hints.
• I got some help from our Oracle DBA who set up a trace (or whatever the Oracle equivalent is) and he was able to see things being run, but he hasn't found anything useful so far. Unfortunately his time is limited and we haven't been able to really dig in to find out what's being executed server-side. I don't have the experience to do this quickly or the time to study up on how to do this myself. Suggestions on what to do to determine what's going on would be appreciated.
My only hypotheses are:
• The query is somehow getting a bad execution plan. E.g., improperly using a LOOP join instead of a HASH join when there are tens of thousands of "left" or outer-loop rows rather than just a few hundred.
• SSRS could be submitting the parameters as nvarchar(4000) or something instead of something reasonable, and as Oracle SP & function parameters don't have length specifications but get their execution lengths from the query call, then some process such as parameter sniffing is messing up the execution plan as in the previous point.
• The query is somehow being rewritten by SSRS/the provider. I AM using a multi-valued parameter, but not as is: the parameter is being submitted as expression Join(Parameters!MultiValuedParameter.Value, ",") so it shouldn't need any rewriting. Just a simple binding and submitting. I don't see how this could be true in the SP and the function calls, but gosh, what else could it be?
I realize it is a very complicated and lengthy query, but it does exactly what I need. It runs in 1-5 seconds depending on how much data is asked for. Some of the reasons for the complexity are:
Properly handling the comma-separated cost center list parameter
Allowing the weekly breakdown to be optional and if included, ensuring all the weeks in a month are shown even if there is no data for them.
Showing "No Invoices" when appropriate.
Allowing a variable number of summary months.
Having an optional YTD total.
Including previous/historical comparison data means I can't simply use this month's vendors, I have to show all the vendors that will be in any historical column.
Anyway, so here's the query, SP version (though I don't think it will be much help).
create or replace
PROCEDURE VendorInvoiceSummary (
FromDate IN date,
ToDate IN date,
CostCenterList IN varchar2,
IncludeWeekly IN varchar2,
ComparisonMonths IN number,
IncludeYTD IN varchar2
)
AS
BEGIN
INSERT INTO InvoiceSummary (Mo, CostCenter, Vendor, VendorName, Section, TimeUnit, Amt)
SELECT
Mo,
CostCenter,
Vendor,
VendorName,
Section,
TimeUnit,
Amt
FROM (
WITH CostCenters AS (
SELECT Substr(REGEXP_SUBSTR(CostCenterList, '[^,]+', 1, LEVEL) || ' ', 1, 15) CostCenter
FROM DUAL
CONNECT BY LEVEL <= Length(CostCenterList) - Length(Replace(CostCenterList, ',', '')) + 1
), Invoices AS (
SELECT /*+ORDERED USE_HASH(D)*/
TRUNC(I.Invoice_Dte, 'YYYY') Yr,
TRUNC(I.Invoice_Dte, 'MM') Mo,
D.Dis_Acct_Unit CostCenter,
I.Vendor,
V.Vendor_VName,
CASE
WHEN I.Invoice_Dte >= FromDate AND I.Invoice_Dte < ToDate
THEN (TRUNC(I.Invoice_Dte, 'W') - TRUNC(I.Invoice_Dte, 'MM')) / 7 + 1
ELSE 0
END WkNum,
Sum(D.To_Base_Amt) To_Base_Amt
FROM
ICCompany C
INNER JOIN APInvoice I
ON C.Company = I.Company
INNER JOIN APDistrib D
ON C.Company = D.Company
AND I.Invoice = D.Invoice
AND I.Vendor = D.Vendor
AND I.Suffix = D.Suffix
INNER JOIN CostCenters CC
ON D.Dis_Acct_Unit = CC.CostCenter
INNER JOIN APVenMast V ON I.Vendor = V.Vendor
WHERE
D.Cancel_Seq = 0
AND I.Cancel_Seq = 0
AND I.Invoice_Dte >= Least(ADD_MONTHS(FromDate, -ComparisonMonths), TRUNC(FromDate, 'YYYY'))
AND I.Invoice_Dte < ToDate
AND V.Vendor_Group = '1 ' -- index help
GROUP BY
TRUNC(I.Invoice_Dte, 'YYYY'),
TRUNC(I.Invoice_Dte, 'MM'),
D.Dis_Acct_Unit,
I.Vendor,
V.Vendor_VName,
CASE
WHEN I.Invoice_Dte >= FromDate AND I.Invoice_Dte < ToDate
THEN (TRUNC(I.Invoice_Dte, 'W') - TRUNC(I.Invoice_Dte, 'MM')) / 7 + 1
ELSE 0
END
), Months AS (
SELECT ADD_MONTHS(Least(ADD_MONTHS(FromDate, -ComparisonMonths), TRUNC(FromDate, 'YYYY')), LEVEL - 1) Mo
FROM DUAL
CONNECT BY LEVEL <= MONTHS_BETWEEN(ToDate, Least(ADD_MONTHS(FromDate, -ComparisonMonths), TRUNC(FromDate, 'YYYY')))
), Sections AS (
SELECT 1 Section, 1 StartUnit, 5 EndUnit FROM DUAL
UNION ALL SELECT 2, 0, ComparisonMonths FROM DUAL
UNION ALL SELECT 3, 1, 1 FROM DUAL WHERE IncludeYTD = 'Y'
), Vals AS (
SELECT LEVEL - 1 TimeUnit
FROM DUAL
CONNECT BY LEVEL <= (SELECT Max(EndUnit) FROM Sections) + 1
), TimeUnits AS (
SELECT S.Section, V.TimeUnit
FROM
Sections S
INNER JOIN Vals V
ON V.TimeUnit BETWEEN S.StartUnit AND S.EndUnit
), Names AS (
SELECT DISTINCT
M.Mo,
Coalesce(I.Vendor, '0') Vendor,
Coalesce(I.Vendor_VName, 'No Paid Invoices') Vendor_VName,
Coalesce(I.CostCenter, ' ') CostCenter
FROM
Months M
LEFT JOIN Invoices I
ON Least(ADD_MONTHS(M.Mo, -ComparisonMonths), TRUNC(M.Mo, 'YYYY')) < I.Mo
AND M.Mo >= I.Mo
WHERE
M.Mo >= FromDate
AND M.Mo < ToDate
)
SELECT
N.Mo,
N.CostCenter,
N.Vendor,
N.Vendor_VName VendorName,
T.Section,
T.TimeUnit,
Sum(I.To_Base_Amt) Amt
FROM
Names N
CROSS JOIN TimeUnits T
LEFT JOIN Invoices I
ON N.CostCenter = I.CostCenter
AND N.Vendor = I.Vendor
AND (
(
T.Section = 1 -- Weeks for current month
AND N.Mo = I.Mo
AND T.TimeUnit = I.WkNum
) OR (
T.Section = 2 -- Summary months
AND ADD_MONTHS(N.Mo, -T.TimeUnit) = I.Mo
) OR (
T.Section = 3 -- YTD
AND I.Mo BETWEEN TRUNC(N.Mo, 'YYYY') AND N.Mo
)
)
WHERE
N.Mo >= FromDate
AND N.Mo < ToDate
AND NOT ( -- Only 4 weeks when a month is less than 28 days long
T.Section = 2
AND T.TimeUnit = 5
AND TRUNC(N.Mo + 28, 'MM') <> N.Mo
AND I.CostCenter IS NULL
) AND (
T.Section <> 1
OR IncludeWeekly = 'Y'
)
GROUP BY
N.Mo,
N.CostCenter,
N.Vendor,
N.Vendor_VName,
T.Section,
T.TimeUnit
) X;
COMMIT;
END;
UPDATE
Even after learning all about Oracle execution plans and hints (to translate my SQL Server knowledge), I still could not get the query to run quickly in SSRS until I made it run in two steps, first to put the real table results into a GLOBAL TEMPORARY TABLE and then second to extract the data from that. DYNAMIC_SAMPLING gave me a good execution plan, which I then copied using join and access hints. Here's the final SP (it couldn't be a function because in Oracle you can't do DML in a function when that function is called inside of a SELECT statement):
Sometimes I swear it was ignoring my join hints such as swap_join_inputs and no_swap_join_inputs but from my reading apparently Oracle only ignores hints when they can't actually be used or you're doing something wrong. Fortunately, the tables swapped appropriately (as in the case of USE_NL(CC) it reliably puts the CC table as the swapped, left input, even though it's joined last).
CREATE OR REPLACE
PROCEDURE VendorInvoicesSummary (
FromDate IN date,
ToDate IN date,
CostCenterList IN varchar2,
IncludeWeekly IN varchar2,
ComparisonMonths IN number,
IncludeYTD IN varchar2
)
AS
BEGIN
INSERT INTO InvoiceTemp (Yr, Mo, CostCenter, Vendor, WkNum, Amt) -- A global temporary table
SELECT /*+LEADING(C I D CC) USE_HASH(I D) USE_NL(CC)*/
TRUNC(I.Invoice_Dte, 'YYYY') Yr,
TRUNC(I.Invoice_Dte, 'MM') Mo,
D.Dis_Acct_Unit CostCenter,
I.Vendor,
CASE
WHEN I.Invoice_Dte >= FromDate AND I.Invoice_Dte < ToDate
THEN (TRUNC(I.Invoice_Dte, 'W') - TRUNC(I.Invoice_Dte, 'MM')) / 7 + 1
ELSE 0
END WkNum,
Sum(D.To_Base_Amt) To_Base_Amt
FROM
ICCompany C
INNER JOIN APInvoice I
ON C.Company = I.Company
INNER JOIN APDistrib D
ON C.Company = D.Company
AND I.Invoice = D.Invoice
AND I.Vendor = D.Vendor
AND I.Suffix = D.Suffix
INNER JOIN (
SELECT Substr(REGEXP_SUBSTR(CostCenterList, '[^,]+', 1, LEVEL) || ' ', 1, 15) CostCenter
FROM DUAL
CONNECT BY LEVEL <= Length(CostCenterList) - Length(Replace(CostCenterList, ',', '')) + 1
) CC ON D.Dis_Acct_Unit = CC.CostCenter
WHERE
D.Cancel_Seq = 0
AND I.Cancel_Seq = 0
AND I.Invoice_Dte >= Least(ADD_MONTHS(FromDate, -ComparisonMonths), TRUNC(FromDate, 'YYYY'))
AND I.Invoice_Dte < ToDate
GROUP BY
TRUNC(I.Invoice_Dte, 'YYYY'),
TRUNC(I.Invoice_Dte, 'MM'),
D.Dis_Acct_Unit,
I.Vendor,
CASE
WHEN I.Invoice_Dte >= FromDate AND I.Invoice_Dte < ToDate
THEN (TRUNC(I.Invoice_Dte, 'W') - TRUNC(I.Invoice_Dte, 'MM')) / 7 + 1
ELSE 0
END;
INSERT INTO InvoiceSummary (Mo, CostCenter, Vendor, VendorName, Section, TimeUnit, Amt)
SELECT
Mo,
CostCenter,
Vendor,
VendorName,
Section,
TimeUnit,
Amt
FROM (
WITH Months AS (
SELECT ADD_MONTHS(Least(ADD_MONTHS(FromDate, -ComparisonMonths), TRUNC(FromDate, 'YYYY')), LEVEL - 1) Mo
FROM DUAL
CONNECT BY LEVEL <= MONTHS_BETWEEN(ToDate, Least(ADD_MONTHS(FromDate, -ComparisonMonths), TRUNC(FromDate, 'YYYY')))
), Sections AS (
SELECT 1 Section, 1 StartUnit, 5 EndUnit FROM DUAL
UNION ALL SELECT 2, 0, ComparisonMonths FROM DUAL
UNION ALL SELECT 3, 1, 1 FROM DUAL WHERE IncludeYTD = 'Y'
), Vals AS (
SELECT LEVEL - 1 TimeUnit
FROM DUAL
CONNECT BY LEVEL <= (SELECT Max(EndUnit) FROM Sections) + 1
), TimeUnits AS (
SELECT S.Section, V.TimeUnit
FROM
Sections S
INNER JOIN Vals V
ON V.TimeUnit BETWEEN S.StartUnit AND S.EndUnit
), Names AS (
SELECT DISTINCT
M.Mo,
Coalesce(I.Vendor, '0') Vendor,
Coalesce(I.CostCenter, ' ') CostCenter
FROM
Months M
LEFT JOIN InvoiceTemp I
ON Least(ADD_MONTHS(M.Mo, -ComparisonMonths), TRUNC(M.Mo, 'YYYY')) <= I.Mo
AND I.Mo <= M.Mo
WHERE
M.Mo >= FromDate
AND M.Mo < ToDate
)
SELECT
N.Mo,
N.CostCenter,
N.Vendor,
Coalesce(V.Vendor_VName, 'No Paid Invoices') VendorName,
T.Section,
T.TimeUnit,
Sum(I.Amt) Amt
FROM
Names N
INNER JOIN APVenMast V ON N.Vendor = V.Vendor
CROSS JOIN TimeUnits T
LEFT JOIN InvoiceTemp I
ON N.CostCenter = I.CostCenter
AND N.Vendor = I.Vendor
AND (
(
T.Section = 1 -- Weeks for current month
AND N.Mo = I.Mo
AND T.TimeUnit = I.WkNum
) OR (
T.Section = 2 -- Summary months
AND ADD_MONTHS(N.Mo, -T.TimeUnit) = I.Mo
) OR (
T.Section = 3 -- YTD
AND I.Mo BETWEEN TRUNC(N.Mo, 'YYYY') AND N.Mo
)
)
WHERE
N.Mo >= FromDate
AND N.Mo < ToDate
AND V.Vendor_Group = '1 '
AND NOT ( -- Only 4 weeks when a month is less than 28 days long
T.Section = 2
AND T.TimeUnit = 5
AND TRUNC(N.Mo + 28, 'MM') <> N.Mo
AND I.CostCenter IS NULL
) AND (
T.Section <> 1
OR IncludeWeekly = 'Y'
)
GROUP BY
N.Mo,
N.CostCenter,
N.Vendor,
V.Vendor_VName,
T.Section,
T.TimeUnit
) X;
COMMIT;
END;
This has been a long, painful ride, but if there's one thing I've learned it's that working in a database without properly updated statistics (which I'm going to look into getting our DBA to add even though the vendor doesn't care about them) can be a real disaster for someone who wants to get things done in a reasonable amount of time.

Posting the query may help.
Your DBA should be able to identify the session in a view called v$session, and the columns EVENT and WAIT_CLASS should give an indication of what is happening on the Oracle end.
He would also be able to identify the SQL (SQL_ID from v$session) and use that in a SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(sql_id)) to determine the plan.
If it is a development/test instance, see if he will grant you permissions to do that yourself if he (or she) is busy.

I know this is old but we had a similar problem and had to set the nsl_sort to binary instead of binary_ci. People could try setting the session to binary: alter session set nls_sort=binary

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight

SQL Query takes more than 3h - too long - query-optimization

Related

SQL Query to group by time and roll up and concatenate string values

SQL - Finding Gaps in Coverage

SQL query to show good records as well as null records

Exclude certain line depending on a certain condition in t-SQL

Query runs quickly in Oracle SQL Developer, but slowly in SSRS 2008 R2

Categories

Resources