I have two tables like...
table1 (cid, duedate, currency, value)
main_table1 (cid)
My query is like below, I am find out co-relation between each cid and table1 contains 3 million records(cid and duedate column is compositely unique) and main_table contains 1500 records all unique.
SELECT
b.cid, c.cid,
(COUNT(*) * SUM(b.value * c.value) -
SUM(b.value) * SUM(c.value)) /
(SQRT(COUNT(*) * SUM(b.value * b.value) -
SUM(b.value) * SUM(b.value)) *
SQRT(COUNT(*) * SUM(c.value * c.value) -
SUM(c.value) * SUM(c.value))
) AS correl_ij
FROM
main_table1 a
JOIN
table1 AS b ON a.cid = b.cid
JOIN
table1 AS c ON b.cid < c.cid
AND b.duedate = c.duedate
AND b.currency = c.currency
GROUP BY
b.cid, c.cid
Please suggest how to optimize this query because it is running slow.
CREATE TABLE #table1(
id int identity,
cid int NOT NULL,
duedate date NOT NULL,
currency char(3) NOT NULL,
value float,
PRIMARY KEY(id,currency,cid,duedate)
);
CREATE TABLE #main_table1(
cid int NOT NULL PRIMARY KEY,
currency char(3)
);
--#main table contains 155000 cid records there is no duplicate values
insert into #main_table1
values(19498,'ABC'),(19500,'ABC'),(19534,'ABC')
INSERT INTO #table1(CID,DUEDATE,currency,value)
VALUES(19498,'2016-12-08','USD',-0.0279702098021799) ,
(19498,'2016-12-12','USD',0.0151285161000268),
(19498,'2016-12-15','USD',-0.00965080868337728),
(19498,'2016-12-19','USD',0.00808331709091531)
There are 3 million records in this table for diffrent dates and cid and most of the cid are present in #main_table1.
I am using a.cid < b.cid to remove duplicate relationship between a.cid and b.cid beause i am deriving corelation between each cid.
so 19498 -->>19500 corelation is calculated hence then i do not want 19500--> 19498 because it would be same but duplicate.
That PK is silly. Why would you include Iden in a composite PK let alone in the first position? Drop Iden unless you have to have it for some misguided reason.
PRIMARY KEY(cid, currency, duedate)
Or the natural key if different
If you're commonly joining or sorting on the cid column, you probably want a clustered index on that column or a composite beginning with that column.
If cid, duedate is unique then you can consider removing the id altogether.
If you want to retain id for some reason, make it PRIMARY KEY NONCLUSTERED, and specify a clustered index on cid, duedate.
I'm currently working on a stored procedure on SQL Server 2016. In my Database I have a table structure and need to add another table, which references to the same table as an existing one.
Thus, I have 2 times a 1:1 relation to the same table.
The occuring problem is, I reference the same keys from 2 different origin tables twice in the same target table.
Target table:
FK_Tables | Text
----------------
1 | Table One Text Id: 1
1 | Table Two Text Id: 1 // The error: Same FK_Tables 2 times
Table One:
ID | OtherField
---------
1 | 42
Table Two:
ID | CoolField
---------
1 | 22
Table One and Table Two are currently referencing to the table Reference Table.
Do you know how I can solve this problem, of the same ID twice?
Thanks!!
You need to add a column for each table you're referencing, otherwise you wouldn't know where the ID is coming from if they were all inserted into the same field. Something like this:
/*
CREATE TEST TABLES
*/
DROP TABLE IF EXISTS tbOne;
CREATE TABLE tbOne ( ID INT IDENTITY(1,1) NOT NULL PRIMARY KEY
, TXT VARCHAR(10)
);
DROP TABLE IF EXISTS tbTwo;
CREATE TABLE tbTwo ( ID INT IDENTITY(1,1) NOT NULL PRIMARY KEY
, TXT VARCHAR(10)
);
DROP TABLE IF EXISTS Target;
CREATE TABLE Target ( ID INT IDENTITY(1,1) NOT NULL PRIMARY KEY
, FKTB1 INT
, FKTB2 INT
, TXT VARCHAR(100)
);
-- 1st FK tbOne
ALTER TABLE Target ADD CONSTRAINT FK_One FOREIGN KEY (FKTB1) REFERENCES tbOne (ID);
--2nd FK tbTwo
ALTER TABLE Target ADD CONSTRAINT FK_Two FOREIGN KEY (FKTB2) REFERENCES tbTwo (ID);
-- Populate test tables
INSERT INTO tbOne (TXT)
SELECT TOP 100 LEFT(text, 10)
FROM SYS.messages
INSERT INTO tbTwo (TXT)
SELECT TOP 100 LEFT(text, 10)
FROM SYS.messages
INSERT INTO [Target] (FKTB1, FKTB2, TXT)
SELECT 1, 1, 'Test - constraint'
-- Check result set
SELECT *
FROM tbTwo
SELECT *
FROM tbOne
SELECT *
FROM [Target] T
INNER JOIN tbOne TB1
ON T.FKTB1 = TB1.ID
INNER JOIN tbTwo TB2
ON T.FKTB2 = TB2.ID
I have an old view that takes 4 mins to run, I have been asked to speed it up. The FROM looks like this:
FROM TableA
CROSS JOIN ViewA
INNER JOIN TableB on ViewA.Name = TableB.Name
AND TableA.Code = TableB.Code
AND TableA.Location = TableB.Location
WHERE (DATEDIFF(m, ViewA.SubmitDate, GETDATE()) = 1) -- Only pull last months rows
Table A has around 99k rows, ViewA has around 2000 rows and TableB has around 101K rows. I think the problem is at the INNER JOIN because it I remove it, the query takes 1 second.
My first thought was to see if I could down the number of rows in ViewA by breaking the whole thing into CTEs but this made zero impact. I am thinking I need to index TableB, because it is just a bunch of varchars being used in the joins. I am now changing it to temp tables so I can index it. I can not change the underlying tables and views. Is index temp tables a good way to go, or is there a better solution.
Edit to add info regarding existing indexes. Only thing with an index on it right now is TableA.Id which is the PK and a clustered Index. TableB has an Id field but it is not the PK. ViewA is not indexed.
Edit again to correct some structure. SubmitDate is in the View, not the table.
Here is a very basic structure:
CREATE TABLE TableA
(
Id int NOT NULL PRIMARY KEY,
Section varchar(20) NULL,
Code varchar(20) NULL
)
CREATE TABLE TableB
(
Id int NOT NULL PRIMARY KEY,
Name varchar(20) NULL,
Code varchar(20) NULL,
Section varchar(20) NULL
)
CREATE TABLE TableC
(
Id int NOT NULL PRIMARY KEY,
Name varchar(20) NULL,
SubmitDate DateTime NOT NULL
)
CREATE TABLE TableD
(
Id int NOT NULL PRIMARY KEY,
Section varchar(20) NULL
)
CREATE VIEW ViewA
AS
SELECT c.Section, d.Name, c.SubmitDate
FROM TableC c
JOIN TableD d ON a.Id = b.Id
One improovement is to rewrite where clause into sargable clause. Add index to SubmitDate if there is no index and change query to:
FROM TableA
CROSS JOIN ViewA
INNER JOIN TableB on ViewA.Name = TableB.Name
AND TableA.Code = TableB.Code
AND TableA.Location = TableB.Location
WHERE
TableA.SubmitDate >=DATEADD(MONTH,DATEDIFF(MONTH,0,GETDATE())-1,0)
And TableA.SubmitDate < Dateadd(DAY, 1, DATEADD(MONTH,
DATEDIFF(MONTH, -1, GETDATE())-1, -1) )
Also add nonclustered indexes on Name, Code and Location columns.
I have a view for which I want to create an Indexed view. After a lot of energy I was able to put the sql query in place for the view and It looks like this -
ALTER VIEW [dbo].[FriendBalances] WITH SCHEMABINDING as
WITH
trans (Amount,PaidBy,PaidFor, Id) AS
(SELECT Amount,userid AS PaidBy, PaidForUsers_FbUserId AS PaidFor, Id FROM dbo.Transactions
FULL JOIN dbo.TransactionUser ON dbo.Transactions.Id = dbo.TransactionUser.TransactionsPaidFor_Id),
bal (PaidBy,PaidFor,Balance) AS
(SELECT PaidBy,PaidFor, SUM( Amount/ transactionCounts.[_count]) AS Balance FROM trans
JOIN (SELECT Id,COUNT(*)AS _count FROM trans GROUP BY Id) AS transactionCounts ON trans.Id = transactionCounts.Id AND trans.PaidBy <> trans.PaidFor
GROUP BY trans.PaidBy,trans.PaidFor )
SELECT ISNULL(bal.PaidBy,bal2.PaidFor)AS PaidBy,ISNULL(bal.PaidFor,bal2.PaidBy)AS PaidFor,
ISNULL( bal.Balance,0)-ISNULL(bal2.Balance,0) AS Balance
FROM bal
left JOIN bal AS bal2 ON bal.PaidBy = bal2.PaidFor AND bal.PaidFor = bal2.Paidby
WHERE ISNULL( bal.Balance,0)>ISNULL(bal2.Balance,0)
Sample Data for FriendBalances View -
PaidBy PaidFor Balance
------ ------- -------
9990 9991 1000
9990 9992 2000
9990 9993 1000
9991 9993 1000
9991 9994 1000
It is mainly a join of 2 tables.
Transactions -
CREATE TABLE [dbo].[Transactions](
[Id] [int] IDENTITY(1,1) NOT NULL,
[Date] [datetime] NOT NULL,
[Amount] [float] NOT NULL,
[UserId] [bigint] NOT NULL,
[Remarks] [nvarchar](255) NULL,
[GroupFbGroupId] [bigint] NULL,
CONSTRAINT [PK_Transactions] PRIMARY KEY CLUSTERED
Sample data in Transactions Table -
Id Date Amount UserId Remarks GroupFbGroupId
-- ----------------------- ------ ------ -------------- --------------
1 2001-01-01 00:00:00.000 3000 9990 this is a test NULL
2 2001-01-01 00:00:00.000 3000 9990 this is a test NULL
3 2001-01-01 00:00:00.000 3000 9991 this is a test NULL
TransactionUsers -
CREATE TABLE [dbo].[TransactionUser](
[TransactionsPaidFor_Id] [bigint] NOT NULL,
[PaidForUsers_FbUserId] [bigint] NOT NULL
) ON [PRIMARY]
Sample Data in TransactionUser Table -
TransactionsPaidFor_Id PaidForUsers_FbUserId
---------------------- ---------------------
1 9991
1 9992
1 9993
2 9990
2 9991
2 9992
3 9990
3 9993
3 9994
Now I am not able to create a view because my query contains cte(s). What are the options that I have now?
If cte can be removed, what should be the other option which would help in creating indexed views.
Here is the error message -
Msg 10137, Level 16, State 1, Line 1 Cannot create index on view "ShareBill.Test.Database.dbo.FriendBalances" because it references common table expression "trans". Views referencing common table expressions cannot be indexed. Consider not indexing the view, or removing the common table expression from the view definition.
The concept:
Transaction mainly consists of:
an Amount that was paid
UserId of the User who paid that amount
and some more information which is not important for now.
TransactionUser table is a mapping between a Transaction and a User Table. Essentially a transaction can be shared between multiple persons. So we store that in this table.
So we have transactions where 1 person is paying for it and other are sharing the amount. So if A pays 100$ for B then B would owe 100$ to A. Similarly if B pays 90$ for A then B would owe only $10 to A. Now if A pays 300$ for A,b,c that means B would owe 110$ and C would owe 10$ to A.
So in this particular view we are aggregating the effective amount that has been paid (if any) between 2 users and thus know how much a person owes another person.
Okay, this gives you an indexed view (that needs an additional view on top of to sort out the who-owes-who detail), but it may not satisfy your requirements still.
/* Transactions table, as before, but with handy unique constraint for FK Target */
CREATE TABLE [dbo].[Transactions](
[Id] [int] IDENTITY(1,1) NOT NULL,
[Date] [datetime] NOT NULL,
[Amount] [float] NOT NULL,
[UserId] [bigint] NOT NULL,
[Remarks] [nvarchar](255) NULL,
[GroupFbGroupId] [bigint] NULL,
CONSTRAINT [PK_Transactions] PRIMARY KEY CLUSTERED (Id),
constraint UQ_Transactions_XRef UNIQUE (Id,Amount,UserId)
)
Nothing surprising so far, I hope
/* Much expanded TransactionUser table, we'll hide it away and most of the maintenance is automatic */
CREATE TABLE [dbo]._TransactionUser(
[TransactionsPaidFor_Id] int NOT NULL,
[PaidForUsers_FbUserId] [bigint] NOT NULL,
Amount float not null,
PaidByUserId bigint not null,
UserCount int not null,
LowUserID as CASE WHEN [PaidForUsers_FbUserId] < PaidByUserId THEN [PaidForUsers_FbUserId] ELSE PaidByUserId END,
HighUserID as CASE WHEN [PaidForUsers_FbUserId] < PaidByUserId THEN PaidByUserId ELSE [PaidForUsers_FbUserId] END,
PerUserDelta as (Amount/UserCount) * CASE WHEN [PaidForUsers_FbUserId] < PaidByUserId THEN -1 ELSE 1 END,
constraint PK__TransactionUser PRIMARY KEY ([TransactionsPaidFor_Id],[PaidForUsers_FbUserId]),
constraint FK__TransactionUser_Transactions FOREIGN KEY ([TransactionsPaidFor_Id]) references dbo.Transactions,
constraint FK__TransactionUser_Transaction_XRef FOREIGN KEY ([TransactionsPaidFor_Id],Amount,PaidByUserID)
references dbo.Transactions (Id,Amount,UserId) ON UPDATE CASCADE
)
This table now maintains enough information to allow the view to be constructed. The rest of the work we do is to construct/maintain the data in the table. Note that, with the foreign key constraint, we've already ensured that if, say, an amount is changed in the transactions table, everything gets recalculated.
/* View that mimics the original TransactionUser table -
in fact it has the same name so existing code doesn't need to change */
CREATE VIEW dbo.TransactionUser
with schemabinding
as
select
[TransactionsPaidFor_Id],
[PaidForUsers_FbUserId]
from
dbo._TransactionUser
GO
/* Effectively the PK on the original table */
CREATE UNIQUE CLUSTERED INDEX PK_TransactionUser on dbo.TransactionUser ([TransactionsPaidFor_Id],[PaidForUsers_FbUserId])
Anything that's already written to work against TransactionUser will now work against this view, and be none the wiser. Except, they can't insert/update/delete the rows without some help:
/* Now we write the trigger that maintains the underlying table */
CREATE TRIGGER dbo.T_TransactionUser_IUD
ON dbo.TransactionUser
INSTEAD OF INSERT, UPDATE, DELETE
AS
SET NOCOUNT ON;
/* Every delete affects *every* row for the same transaction
We need to drop the counts on every remaining row, as well as removing the actual rows we're interested in */
WITH DropCounts as (
select TransactionsPaidFor_Id,COUNT(*) as Cnt from deleted group by TransactionsPaidFor_Id
), KeptRows as (
select tu.TransactionsPaidFor_Id,tu.PaidForUsers_FbUserId,UserCount - dc.Cnt as NewCount
from dbo._TransactionUser tu left join deleted d
on tu.TransactionsPaidFor_Id = d.TransactionsPaidFor_Id and
tu.PaidForUsers_FbUserId = d.PaidForUsers_FbUserId
inner join DropCounts dc
on
tu.TransactionsPaidFor_Id = dc.TransactionsPaidFor_Id
where
d.PaidForUsers_FbUserId is null
), ChangeSet as (
select TransactionsPaidFor_Id,PaidForUsers_FbUserId,NewCount,1 as Keep
from KeptRows
union all
select TransactionsPaidFor_Id,PaidForUsers_FbUserId,null,0
from deleted
)
merge into dbo._TransactionUser tu
using ChangeSet cs on tu.TransactionsPaidFor_Id = cs.TransactionsPaidFor_Id and tu.PaidForUsers_FbUserId = cs.PaidForUsers_FbUserId
when matched and cs.Keep = 1 then update set UserCount = cs.NewCount
when matched then delete;
/* Every insert affects *every* row for the same transaction
This is why the indexed view couldn't be generated */
WITH TU as (
select TransactionsPaidFor_Id,PaidForUsers_FbUserId,Amount,PaidByUserId from dbo._TransactionUser
where TransactionsPaidFor_Id in (select TransactionsPaidFor_Id from inserted)
union all
select TransactionsPaidFor_Id,PaidForUsers_FbUserId,Amount,UserId
from inserted i inner join dbo.Transactions t on i.TransactionsPaidFor_Id = t.Id
), CountedTU as (
select TransactionsPaidFor_Id,PaidForUsers_FbUserId,Amount,PaidByUserId,
COUNT(*) OVER (PARTITION BY TransactionsPaidFor_Id) as Cnt
from TU
)
merge into dbo._TransactionUser tu
using CountedTU new on tu.TransactionsPaidFor_Id = new.TransactionsPaidFor_Id and tu.PaidForUsers_FbUserId = new.PaidForUsers_FbUserId
when matched then update set Amount = new.Amount,PaidByUserId = new.PaidByUserId,UserCount = new.Cnt
when not matched then insert
([TransactionsPaidFor_Id],[PaidForUsers_FbUserId],Amount,PaidByUserId,UserCount)
values (new.TransactionsPaidFor_Id,new.PaidForUsers_FbUserId,new.Amount,new.PaidByUserId,new.Cnt);
Now that the underlying table is being maintained, we can finally write the indexed view you wanted in the first place... almost. The issue is that the totals we create may be positive or negative, because we've normalized the transactions so that we can easily sum them:
CREATE VIEW [dbo]._FriendBalances
WITH SCHEMABINDING
as
SELECT
LowUserID,
HighUserID,
SUM(PerUserDelta) as Balance,
COUNT_BIG(*) as Cnt
FROM dbo._TransactionUser
WHERE LowUserID != HighUserID
GROUP BY
LowUserID,
HighUserID
GO
create unique clustered index IX__FriendBalances on dbo._FriendBalances (LowUserID, HighUserID)
So we finally create a view, built on the indexed view above, that if the balance is negative, we flip the person owed, and the person owing around. But it will use the index on the above view, which is most of the work we were seeking to save by having the indexed view:
create view dbo.FriendBalances
as
select
CASE WHEN Balance >= 0 THEN LowUserID ELSE HighUserID END as PaidBy,
CASE WHEN Balance >= 0 THEN HighUserID ELSE LowUserID END as PaidFor,
ABS(Balance) as Balance
from
dbo._FriendBalances WITH (NOEXPAND)
Now, finally, we insert your sample data:
set identity_insert dbo.Transactions on --Ensure we get IDs we know
GO
insert into dbo.Transactions (Id,[Date] , Amount , UserId , Remarks ,GroupFbGroupId)
select 1 ,'2001-01-01T00:00:00.000', 3000, 9990 ,'this is a test', NULL union all
select 2 ,'2001-01-01T00:00:00.000', 3000, 9990 ,'this is a test', NULL union all
select 3 ,'2001-01-01T00:00:00.000', 3000, 9991 ,'this is a test', NULL
GO
set identity_insert dbo.Transactions off
GO
insert into dbo.TransactionUser (TransactionsPaidFor_Id, PaidForUsers_FbUserId)
select 1, 9991 union all
select 1, 9992 union all
select 1, 9993 union all
select 2, 9990 union all
select 2, 9991 union all
select 2, 9992 union all
select 3, 9990 union all
select 3, 9993 union all
select 3, 9994
And query the final view:
select * from dbo.FriendBalances
PaidBy PaidFor Balance
9990 9991 1000
9990 9992 2000
9990 9993 1000
9991 9993 1000
9991 9994 1000
Now, there is additional work we could do, if we were concerned that someone may find a way to dodge the triggers and perform direct changes to the base tables. The first would be yet another indexed view, that will ensure that every row for the same transaction has the same UserCount value. Finally, with a few additional columns, check constraints, FK constraints and more work in the triggers, I think we can ensure that the UserCount is correct - but it may add more overhead than you want.
I can add scripts for these aspects if you want me to - it depends on how restrictive you want/need the database to be.
I have 2 tables:
Order (with a identity order id field)
OrderItems (with a foreign key to order id)
In a stored proc, I have a list of orders that I need to duplicate. Is there a good way to do this in a stored proc without a cursor?
Edit:
This is on SQL Server 2008.
A sample spec for the table might be:
CREATE TABLE Order (
OrderID INT IDENTITY(1,1),
CustomerName VARCHAR(100),
CONSTRAINT PK_Order PRIMARY KEY (OrderID)
)
CREATE TABLE OrderItem (
OrderID INT,
LineNumber INT,
Price money,
Notes VARCHAR(100),
CONSTRAINT PK_OrderItem PRIMARY KEY (OrderID, LineNumber),
CONSTRAINT FK_OrderItem_Order FOREIGN KEY (OrderID) REFERENCES Order(OrderID)
)
The stored proc is passed a customerName of 'fred', so its trying to clone all orders where CustomerName = 'fred'.
To give a more concrete example:
Fred happens to have 2 orders:
Order 1 has line numbers 1,2,3
Order 2 has line numbers 1,2,4,6.
If the next identity in the table was 123, then I would want to create:
Order 123 with lines 1,2,3
Order 124 with lines 1,2,4,6
On SQL Server 2008 you can use MERGE and the OUTPUT clause to get the mappings between the original and cloned id values from the insert into Orders then join onto that to clone the OrderItems.
DECLARE #IdMappings TABLE(
New_OrderId INT,
Old_OrderId INT)
;WITH SourceOrders AS
(
SELECT *
FROM Orders
WHERE CustomerName = 'fred'
)
MERGE Orders AS T
USING SourceOrders AS S
ON 0 = 1
WHEN NOT MATCHED THEN
INSERT (CustomerName )
VALUES (CustomerName )
OUTPUT inserted.OrderId,
S.OrderId INTO #IdMappings;
INSERT INTO OrderItems
SELECT New_OrderId,
LineNumber,
Price,
Notes
FROM OrderItems OI
JOIN #IdMappings IDM
ON IDM.Old_OrderId = OI.OrderID