I have a requirement to show 6 months of data in a Tableau dashboard. For that I created a view in SQL Server which joins 3 tables: T1, T2 and T3. Each table has 20 million records, and the number will keep increasing. The problem is that the query takes a long time to execute (about 2 hours) and nothing is displayed on the dashboard. Is there a way to improve the performance of the query?
Following is the query for creating the view. T1, T2 and T3 are the three tables: trackingIdentifier in T1 is the foreign key in T3 as transmissionTID, and
trackingIdentifier in T2 is the foreign key in T3 as claimSubmissionTID.
Indexes are built on trackingIdentifier, transmissionTID, trackingIdentifier.
RAM: 32768 MB
Processors: 32
SQL Version : 11.0.3381.0
Create View [dbo].[claimReceivingDashboard] AS
(
Select
trans.trackingIdentifier as trackingIdentifier,
trans.receiptdt as downloadDate,
trans.transactiondate as TransactionDate,
trans.purpose as Purpose ,
trans.recordCount as RecordCount ,
cast(acv.activityNet as decimal(12,4)) as Net,
d.Caption as DispositionID ,
trans.SenderID as SenderId ,
trans.transmissionfilename as filename ,
trans.damaninscomp as ReceiverID,
claim.claimid as claimid ,
claim.claimproviderid as providerid ,
claim.trackingIdentifier as claimTrackingId
from
endisposition d , t1 trans , t2 acv, t3 claim
where
claim.dispositionId = d.EnumId
and trans.trackingIdentifier = acv.transmissionTID
and claim.trackingIdentifier = acv.claimsubmissionTID
and cast(trans.transactiondate as date) > cast(Getdate() - 120 as date))
GO
You need to join on the right attributes between endisposition d, t1, t2 and t3, using explicit joins. Below is a sample:
endisposition d inner join t1 trans
on d.joiningcolumn = trans.joiningcolumn
inner join t2 acv
on trans.trackingidentifier = acv.transmissionTID
inner join t3 claim -- or left join, depending on the requirement
on acv.claimsubmissionTID = claim.trackingidentifier
where condition
When you run the query outside a view (a view cannot declare variables), you can avoid recalculating the date in the condition by computing cast(Getdate() - 120 as date) once and storing it in a variable.
If the joining columns are not indexed properly, create indexes after looking at the execution plan.
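As a rough illustration of the advice above, here is how the view from the question might look with explicit joins. This is only a sketch: the view name with the _v2 suffix is hypothetical, only a subset of the original columns is listed for brevity, and it assumes that the date column is trans.transactiondate (as in the select list) and that it has a date or datetime type. The filter compares the bare column against a computed date instead of casting the column, which is one common way to keep the predicate index-friendly.

```sql
-- Sketch only: [claimReceivingDashboard_v2] is a hypothetical name.
-- Tables and columns are taken from the question; the column list is shortened.
CREATE VIEW [dbo].[claimReceivingDashboard_v2] AS
SELECT
    trans.trackingIdentifier               AS trackingIdentifier,
    trans.receiptdt                        AS downloadDate,
    trans.transactiondate                  AS TransactionDate,
    CAST(acv.activityNet AS decimal(12,4)) AS Net,
    d.Caption                              AS DispositionID,
    claim.claimid                          AS claimid,
    claim.trackingIdentifier               AS claimTrackingId
FROM t1 AS trans
INNER JOIN t2 AS acv
        ON acv.transmissionTID = trans.trackingIdentifier
INNER JOIN t3 AS claim
        ON claim.trackingIdentifier = acv.claimsubmissionTID
INNER JOIN endisposition AS d
        ON d.EnumId = claim.dispositionId
-- Assumption: transactiondate is a date/datetime column.
-- "date part > today - 120" is the same as "value >= midnight of today - 119".
WHERE trans.transactiondate >= DATEADD(DAY, -119, CAST(GETDATE() AS date));
```

Whether this helps depends on the indexes available on transactiondate and on the join columns, so it is worth comparing the execution plans of both versions.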
I wrote the query below to pull data from different databases. I created two temp tables to pull the data from two different databases, and a final select statement against the original database joins all the tables. The query executes but returns no data (the report is blank). I tried populating the two temp tables separately, and each returns the correct data. But when I execute the whole query, the result is blank. Below is the query. Please help.
"set fmtonly off
use GODSDB
IF object_id('tempdb..#CISIS_Call_Log') IS NOT NULL DROP TABLE #CISIS_Call_Log
select *
into #CISIS_Call_Log
from OPENQUERY (CSISDB,
'select
ccl.ContractOID,
ccl.db_insertdate,
ccl.ContractCallLogStatusIdentifier,
ccl.db_UpdateDate,
ccp.ContractCallLogPurposeOID,
ccp.ContractCallLogPurposeIdentifier,
ccp.Description
from csisdb.dbo.ContractCallLog CCL
inner join csisdb.dbo.ContractCallLogPurpose CCP on ccl.ContractCallLogPurposeIdentifier = ccp.ContractCallLogPurposeIdentifier
where JurisdictionShortIdentifier = ''ON''
AND ContractCallLogStatusIdentifier IN (''DNR'', ''NR'')
')
IF object_id('tempdb..#CMS_Campaign') IS NOT NULL DROP TABLE #CMS_Campaign
select *
into #CMS_Campaign
from OPENQUERY (BA_GBASSTOCMS, '
Select
SystemSourceIdentifier,
ContractOID,
OfferSentDate,
CampaignOfferTypeIdentifier,
CampaignContractStatusIdentifier,
CampaignContractStatusUpdateDate,
DeclineDate,
CampaignOfferOID,
CampaignOID,
CampaignStartDate,
CampaignEndDate,
Jurisdiction,
CampaignDescription
from CMS.dbo.vw_CampaignInfo
where Jurisdiction = ''ON''
and CampaignOfferTypeIdentifier = ''REN''
')
select mp.CommodityTypeIdentifier as Commodity
,c.RtlrContractIdentifier as ContractID
,cs.ContractStatusIdentifier as ContractStatus
,c.SigningDate
,cf.StartDate as FlowStartDate
,cf.EndDate as FlowEndDate
,datediff(day, getdate(), c.RenewalDate) as RemainingDays
,c.RenewalDate
,l.ContractCallLogStatusIdentifier as CallLogType
,Substring (l.Description, 1, 20) as CallPurpose
,l.db_insertDate as CallLogDate
,cms.CampaignOfferOID as OfferID
,cms.CampaignContractStatusIdentifier as OfferStatus
,cms.CampaignContractStatusUpdateDate as StatusChangeDate
,cms.DeclineDate
from Contract c
inner join contractstate cs on cs.contractoid = c.ContractOID
and cs.ContractStatusIdentifier in ('ERA', 'FLW')
and datediff(day, getdate(), c.RenewalDate) > 60
inner join SiteIdentification si on si.SiteOID = c.SiteOID
inner join MarketParticipant mp on mp.MarketParticipantOID = si.MarketParticipantOID
inner join Market m on m.MarketOID = mp.MarketOID
inner join Jurisdiction j on j.JurisdictionOID = m.JurisdictionOID
and j.CountryCode = 'CA'
and j.ProvinceOrStateCode = 'ON'
inner join ContractFlow cf on cf.ContractOID = c.ContractOID
inner join #CISIS_Call_Log l on convert(varchar(15), l.ContractOID) = c.RtlrContractIdentifier
inner join #CMS_Campaign cms on convert(varchar(15), cms.ContractOID) = c.RtlrContractIdentifier
set fmtonly on"
If the data in each temp table is verified, then:
Try a smaller, less complex query to test your temp tables with. Also try joining them with a LEFT join, e.g.:
select
c.RtlrContractIdentifier as ContractID
, c.SigningDate
, datediff(day, getdate(), c.RenewalDate) as RemainingDays
, c.RenewalDate
, l.ContractCallLogStatusIdentifier as CallLogType
, Substring (l.Description, 1, 20) as CallPurpose
, l.db_insertDate as CallLogDate
, cms.CampaignOfferOID as OfferID
, cms.CampaignContractStatusIdentifier as OfferStatus
, cms.CampaignContractStatusUpdateDate as StatusChangeDate
, cms.DeclineDate
from Contract c
LEFT join #CISIS_Call_Log l on convert(varchar(15), l.ContractOID) = c.RtlrContractIdentifier
LEFT join #CMS_Campaign cms on convert(varchar(15), cms.ContractOID) = c.RtlrContractIdentifier
Does this return data? Does it return data from both joined tables?
If neither temp table is returning data then those join conditions need to be changed.
If both temp tables do return data from that query, then try INNER joins. If that still works, then add back more joins (one at a time) until you identify the join that causes the overall fault.
Without data for every table it just isn't possible for us to pinpoint the exact reason for an empty result. Only you can, so you need to troubleshoot the problem one step at a time.
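One quick way to start that troubleshooting is to count how many rows in each temp table actually find a partner in Contract under the join condition used in the question. This is only a sketch; the output column names are made up, and it has to run in the same session after both temp tables are populated.

```sql
-- Sketch only: compares total rows with rows that match Contract
-- on the same converted key the original query joins on.
SELECT
    (SELECT COUNT(*) FROM #CISIS_Call_Log) AS CallLogRows,
    (SELECT COUNT(*)
       FROM #CISIS_Call_Log AS l
      WHERE EXISTS (SELECT 1
                      FROM Contract AS c
                     WHERE c.RtlrContractIdentifier = CONVERT(varchar(15), l.ContractOID))
    ) AS CallLogRowsMatchingContract,
    (SELECT COUNT(*) FROM #CMS_Campaign) AS CampaignRows,
    (SELECT COUNT(*)
       FROM #CMS_Campaign AS cms
      WHERE EXISTS (SELECT 1
                      FROM Contract AS c
                     WHERE c.RtlrContractIdentifier = CONVERT(varchar(15), cms.ContractOID))
    ) AS CampaignRowsMatchingContract;
```

If either "matching" count is zero, the converted ContractOID probably does not have the same format as RtlrContractIdentifier, and that join is the one to look at first.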
I have a bunch of bank transactions in a table in SQL.
Example: http://sqlfiddle.com/#!6/6b2c8/1/0
I need to identify the transactions that are made between these 2 linked accounts. The Accounts table (not shown) links these 2 accounts to the one source (user).
For example:
I have an everyday account, and a savings account. From time to time, I may transfer money from my everyday account, to my savings account (or vice-versa).
The transaction descriptions are usually similar (Transfer to xxx/transfer from xxx), usually on the same day, and obviously, the same dollar amount.
EDIT: I now have the following query (dumbed down), which works for some scenarios
Basically, I created 2 temp tables with all withdrawals and deposits that met certain criteria. I then join them together, based on a few requirements (same transaction amount, different account # etc). Then using the ROW_NUMBER function, I have ordered which ones are more likely to be inter-account transactions.
I now have an issue where if, for example:
$100 transferred from Account A to Account B
$100 Transferred from Account B to Account C
My query will match the transfer between Account A and C, then there is only one transaction for account B, and it will not be matched. So essentially, instead of receiving 2 rows back (2 deposits, lined up with 2 withdrawals), I only get 1 row (1 deposit, 1 withdrawal), for a transfer from A to B :(
INSERT INTO #Deposits
SELECT t.*
FROM dbo.Customer c
INNER JOIN dbo.Source src ON src.AppID = app.AppID
INNER JOIN dbo.Account acc ON acc.SourceID = src.SourceID
INNER JOIN dbo.Tran t ON t.AccountID = acc.AccountID
WHERE c.CustomerID = 123
AND t.Template = 'DEPOSIT'
INSERT INTO #Withdrawals
SELECT t.*
FROM dbo.Customer c
INNER JOIN dbo.Source src ON src.AppID = app.AppID
INNER JOIN dbo.Account acc ON acc.SourceID = src.SourceID
INNER JOIN dbo.Tran t ON t.AccountID = acc.AccountID
WHERE c.CustomerID = 123
AND t.Template = 'WITHDRAWAL'
;WITH cte
AS ( SELECT [...] ,
ROW_NUMBER() OVER ( PARTITION BY d.TranID ORDER BY SUM( CASE WHEN w.TranDate = d.TranDate THEN 2 ELSE 1 END), w.TranID ) AS DepRN,
ROW_NUMBER() OVER ( PARTITION BY w.TranID ORDER BY SUM( CASE WHEN w.TranDate = d.TranDate THEN 2 ELSE 1 END ), d.TranID ) AS WdlRN
FROM #Withdrawals w
INNER JOIN #Deposits d ON w.TranAmount = d.TranAmount -- Same transaction amount
AND w.AccountID <> d.AccountID -- Different accounts, same customer
AND w.TranDate BETWEEN d.TranDate AND DATEADD(DAY, 3, d.TranDate) -- Same day, or within 3 days
GROUP BY [...]
)
SELECT *
FROM cte
WHERE cte.DepRN = cte.WdlRN
Maybe this is a start? I don't think we have enough info to say whether this would be reliable or would cause a lot of "false positives".
select t1.TransactionID, t2.TransactionID
from dbo.Transactions as t1 inner join dbo.Transactions as t2
on t2.AccountID <> t1.AccountID
and t2.TransactionDate = t1.TransactionDate
and t2.TransactionAmount = t1.TransactionAmount
and t2.TransactionID - t1.TransactionID between 1 and 20 -- maybe??
and t1.TransactionDesc like 'Transfer from%'
and t2.TransactionDesc like 'Transfer to%'
and t2.TransactionID > t1.TransactionID
My first post to stackoverflow (which has helped me hugely over time):
I have a query with a cross join that works fine when I run it with a where clause, but takes forever when I place it in a view and apply the where clause to the view.
I think the problem is that SQL is not applying the where clause to the cross join when the code is encapsulated in a view, and thus ending up with millions of rows (instead of 180 in this case).
The code is below - it is a query which forecasts the future on-hand stock of an item in a warehouse using an average expected monthly usage and a list of incoming orders.
CREATE VIEW [dbo].[ItemWarehouseStockForecastDaily2]
AS
SELECT
fd.AsafterDate
, iw.idItem
, iw.idWarehouse
, iw.OnHandQuantity
+ SUM(ISNULL(iwio.PurchaseOrderInboundQuantity, 0)
- iws.AverageMonthlyDemandQuantity / (365.25/12)
) OVER (ORDER BY fd.AsafterDate) AS OnHandQuantity
FROM
(
( SELECT CalendarDate Asafterdate
FROM Calendar c
WHERE c.CalendarDate > GETDATE()
AND c.CalendarDate < DATEADD(d, 180, GETDATE())
) fd -- This table has 180 rows
-- This table has 10 million rows - one per item per warehouse
CROSS JOIN ItemWarehouse iw
)
LEFT JOIN ItemWarehouseDemandFromStockStatisticsMonthly iws
ON iws.idItem = iw.idItem
AND iws.idWarehouse = iw.idWarehouse
LEFT JOIN ItemWarehouseInboundAndOutboundQuantitiesWithDueDate iwio
ON iwio.idItem = iw.idItem
AND iwio.idWarehouse = iw.idWarehouse
AND iwio.DueDate = fd.Asafterdate
/*
WHERE iw.idItem = 12345
AND iw.idWarehouse = 67
ORDER BY AsafterDate
*/
The commented-out where clause makes the query run fast (sub-second) when not in a view (tables cluster by idwarehouse, iditem)
Any/all help and advice will be greatly appreciated.
I don't have the reputation to comment yet. However, according to TechNet, a cross join should behave the same as an inner join if you add a where clause (at least for SQL Server 2008 R2). https://technet.microsoft.com/en-us/library/ms190690(v=sql.105).aspx
Did you check whether the same issue occurs if you use an inner join?
As another alternative, can you add the constraint directly in the join condition, as follows?
CREATE VIEW [dbo].[ItemWarehouseStockForecastDaily2]
AS
SELECT
fd.AsafterDate
, iw.idItem
, iw.idWarehouse
, iw.OnHandQuantity
+ SUM(ISNULL(iwio.PurchaseOrderInboundQuantity, 0)
- iws.AverageMonthlyDemandQuantity / (365.25/12)
) OVER (ORDER BY fd.AsafterDate) AS OnHandQuantity
FROM
(
( SELECT CalendarDate Asafterdate
FROM Calendar c
WHERE c.CalendarDate > GETDATE()
AND c.CalendarDate < DATEADD(d, 180, GETDATE())
) fd -- This table has 180 rows
-- Filter right here instead of later:
INNER JOIN ItemWarehouse iw ON iw.idItem = 12345 AND iw.idWarehouse = 67
)
LEFT JOIN ItemWarehouseDemandFromStockStatisticsMonthly iws
ON iws.idItem = iw.idItem
AND iws.idWarehouse = iw.idWarehouse
LEFT JOIN ItemWarehouseInboundAndOutboundQuantitiesWithDueDate iwio
ON iwio.idItem = iw.idItem
AND iwio.idWarehouse = iw.idWarehouse
AND iwio.DueDate = fd.Asafterdate
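Hardcoding the item and warehouse makes the view single-purpose. Another thing that may be worth trying is to partition the running total by item and warehouse. Without a PARTITION BY, the window SUM is defined over every row the view produces, so a filter applied from outside cannot safely be moved below it. With the filter columns in the PARTITION BY, the optimizer is generally able to push the predicate down. The sketch below assumes that is the cause here; the view name ending in 3 is hypothetical, and everything else comes from the question.

```sql
-- Sketch only: [ItemWarehouseStockForecastDaily3] is a hypothetical name.
CREATE VIEW [dbo].[ItemWarehouseStockForecastDaily3]
AS
SELECT
    fd.AsafterDate
  , iw.idItem
  , iw.idWarehouse
  , iw.OnHandQuantity
    + SUM(ISNULL(iwio.PurchaseOrderInboundQuantity, 0)
          - iws.AverageMonthlyDemandQuantity / (365.25/12)
         ) OVER (PARTITION BY iw.idItem, iw.idWarehouse  -- one running total per item and warehouse
                 ORDER BY fd.AsafterDate) AS OnHandQuantity
FROM
    ( SELECT c.CalendarDate AS AsafterDate
      FROM Calendar c
      WHERE c.CalendarDate > GETDATE()
        AND c.CalendarDate < DATEADD(d, 180, GETDATE())
    ) fd
    CROSS JOIN ItemWarehouse iw
    LEFT JOIN ItemWarehouseDemandFromStockStatisticsMonthly iws
           ON iws.idItem = iw.idItem
          AND iws.idWarehouse = iw.idWarehouse
    LEFT JOIN ItemWarehouseInboundAndOutboundQuantitiesWithDueDate iwio
           ON iwio.idItem = iw.idItem
          AND iwio.idWarehouse = iw.idWarehouse
          AND iwio.DueDate = fd.AsafterDate;
GO

-- The filter is applied to the view, as in the original use case.
SELECT *
FROM dbo.ItemWarehouseStockForecastDaily3
WHERE idItem = 12345
  AND idWarehouse = 67
ORDER BY AsafterDate;
```

The partitioning also keeps the running total from mixing quantities of different items when the view is queried without a filter. Checking the execution plan should show whether the seek on ItemWarehouse now happens before the cross join.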
I am trying to run a SELECT query using LEFT JOIN. I get a COUNT on my second table ( the table on the right side of LEFT JOIN ). This process becomes slightly heavy as the number of records on the second table goes up. My first and second table have a one-to-many relationship. The second table's CampaignId column is a foreign key to the first table's Id. This is a simplified version of my query:
SELECT a.[Id]
,a.CampaignId
,a.[Inserted] AS 'Date'
,COUNT(b.Id) AS 'Received'
FROM [CampaignRun] AS a
LEFT JOIN [CampaignRecipient] AS b
ON a.Id = b.CampaignRunId
GROUP BY
a.[Id], a.CampaignId,a.[Inserted]
HAVING
a.CampaignId = 637
ORDER BY
a.[Inserted] DESC
The number 637 is just an example for one of the records.
Is there a way to make this query run faster?
Use a sub-select to calculate Received:
SELECT a.[Id]
,a.CampaignId
,a.[Inserted] AS 'Date'
, (SELECT COUNT(*) FROM [CampaignRecipient] AS b
WHERE a.Id = b.CampaignRunId ) AS 'Received'
FROM [CampaignRun] AS a
WHERE a.CampaignId = 637
ORDER BY a.[Inserted] DESC
You have an unneeded HAVING clause here; the condition can be moved to the WHERE clause, so rows are filtered before grouping:
SELECT a.[Id]
,a.CampaignId
,a.[Inserted] AS 'Date'
,COUNT(b.Id) AS 'Received'
FROM [CampaignRun] AS a
LEFT JOIN [CampaignRecipient] AS b
ON a.Id = b.CampaignRunId
WHERE a.CampaignId = 637
GROUP BY a.[Id], a.CampaignId,a.[Inserted]
ORDER BY a.[Inserted] DESC
Also ensure that you have an index on the foreign key column CampaignRunId in the [CampaignRecipient] table. It's considered good practice.
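For reference, such an index could be created along these lines. This is a sketch that assumes the tables live in the dbo schema and that no equivalent index exists yet; both index names are hypothetical. The second index is optional and only supports the filter on CampaignId.

```sql
-- Sketch only: index names are hypothetical, dbo schema is assumed.
-- Supports the join/count on CampaignRecipient.CampaignRunId.
CREATE NONCLUSTERED INDEX IX_CampaignRecipient_CampaignRunId
    ON dbo.CampaignRecipient (CampaignRunId);

-- Optional: supports WHERE a.CampaignId = 637 on CampaignRun.
CREATE NONCLUSTERED INDEX IX_CampaignRun_CampaignId
    ON dbo.CampaignRun (CampaignId)
    INCLUDE (Inserted);
```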
I am working on a project where I need to synchronize data from our system to an external system. What I want to achieve, is to periodically send only changed items (rows) from a custom query. This query looks like this (but with many more columns) :
SELECT T1.field1,
T1.field2,
T1.field2,
T1.field3,
CASE WHEN T1.field4 = 'some-value' THEN 1 ELSE 0 END,
T2.field1,
T3.field1,
T4.field1
FROM T1
INNER JOIN T2 ON T2.pk = T2.fk
INNER JOIN T3 ON T3.pk = T2.fk
INNER JOIN T4 ON T4.pk = T2.fk
I want to avoid having to compare every field one by one between synchronizations. I came up with the idea of generating a hash for every row from my query and comparing it with the hash from the previous synchronization, so that only the changed rows are returned. I am aware of the CHECKSUM function, but it is very collision-prone and might sometimes miss changes. However, I like that I could just make a temp table and use CHECKSUM(*), which makes maintenance easier (no need to add new fields both in the query and in the CHECKSUM):
SELECT T1.field1,
T1.field2,
T1.field2,
T1.field3,
CASE WHEN T1.field4 = 'some-value' THEN 1 ELSE 0 END,
T2.field1,
T3.field1,
T4.field1
INTO #tmp
FROM T1
INNER JOIN T2 ON T2.pk = T2.fk
INNER JOIN T3 ON T3.pk = T2.fk
INNER JOIN T4 ON T4.pk = T2.fk;
-- get all columns from the query, plus a hash of the row
SELECT *, CHECKSUM(*)
FROM #tmp;
I am aware of the HASHBYTES function (which supports SHA1 and MD5, which are less prone to collisions), but it only accepts varchar, nvarchar or varbinary input, not a list of columns or * the way CHECKSUM does. Having to cast/convert every column from the query is a pain in the ... and opens the door to errors (forgetting to include a new field, for instance).
I also looked at the Change Data Capture and Change Tracking features of SQL Server, but they both seem complicated and overkill for what I am doing.
So my question: is there another method to generate a hash from a query or a temp table that meets my criteria?
If not, is there another way to achieve this kind of work (syncing differences from a query)?
I found a way to do exactly what I wanted, thanks to the FOR XML clause :
SELECT T1.field1,
T1.field2,
T1.field2,
T1.field3,
CASE WHEN T1.field4 = 'some-value' THEN 1 ELSE 0 END,
T2.field1,
T3.field1,
T4.field1
INTO #tmp
FROM T1
INNER JOIN T2 ON T2.pk = T2.fk
INNER JOIN T3 ON T3.pk = T2.fk
INNER JOIN T4 ON T4.pk = T2.fk;
-- get all columns from the query, plus a hash of the row (converted in an hex string)
SELECT T.*, CONVERT(VARCHAR(100), HASHBYTES('sha1', (SELECT T.* FOR XML RAW)), 2) AS sHash
FROM #tmp AS T;
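Once each row has a hash, finding the rows to send comes down to comparing against the hashes stored by the previous synchronization. The sketch below shows one way to do that. Everything in it is hypothetical: the table names #PreviousSync and #CurrentSync, the RowKey column (some stable key identifying a row across runs), and the placeholder hash values.

```sql
-- Sketch only: table names, RowKey and the sample hash values are made up.
CREATE TABLE #PreviousSync (RowKey INT PRIMARY KEY, sHash VARCHAR(100) NOT NULL);
CREATE TABLE #CurrentSync  (RowKey INT PRIMARY KEY, sHash VARCHAR(100) NOT NULL);

INSERT INTO #PreviousSync (RowKey, sHash) VALUES (1, 'AAAA'), (2, 'BBBB');
INSERT INTO #CurrentSync  (RowKey, sHash) VALUES (1, 'AAAA'), (2, 'CCCC'), (3, 'DDDD');

-- Rows that are new (no previous hash) or changed (hash differs).
SELECT cur.RowKey
FROM #CurrentSync AS cur
LEFT JOIN #PreviousSync AS prev
       ON prev.RowKey = cur.RowKey
WHERE prev.RowKey IS NULL
   OR prev.sHash <> cur.sHash;
-- returns RowKey 2 (changed) and 3 (new)
```

Rows deleted since the last run would need the reverse check (present in the previous set, missing from the current one).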