SQL Server indexed calculated column that sums another table - sql-server

I'd like to effectively add a calculated column that sums a column from selected rows in another table. I need to quickly retrieve and search for values in the calculated column without re-computing the sum.
The calculated column I'd like to add would look like this in Dream-SQL:
ALTER TABLE Invoices ADD Balance
AS SUM(Transactions.Amount) WHERE Transactions.InvoiceId = Invoices.Id
Of course, this doesn't work. My understanding is that you can't add a calculated column that references another table. However, it appears that an indexed view can contain such a column.
The project is based on Entity Framework Code First. The application needs to quickly find non-zero balances.
Assuming an indexed view is the way to go, what is the best approach to integrating it with the Invoices and Transactions tables to make it easy to use with LINQ to Entities? Should the indexed view contain all the columns in the Invoices table or just the Balance (what gets persisted)? A code snippet of the SQL to create the recommended view and index would be helpful.

An indexed view won't work because it would only index expressions in the GROUP BY clause, which means it can't index the sum. A computed column won't work because the sum can't be persisted or indexed.
A trigger works, however:
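CREATE TRIGGER UpdateInvoiceBalance ON Transactions AFTER INSERT, UPDATE AS
IF UPDATE(Amount) BEGIN
SET NOCOUNT ON;
-- Recompute the balance of every invoice touched by this statement.
-- Filtering with IN (rather than joining to inserted) avoids double-counting
-- when one statement inserts or updates several transactions for the same invoice.
WITH InvoiceBalances AS (
SELECT Transactions.InvoiceId, SUM(Transactions.Amount) AS Balance
FROM Transactions
WHERE Transactions.InvoiceId IN (SELECT InvoiceId FROM inserted)
GROUP BY Transactions.InvoiceId)
UPDATE Invoices
SET Balance = InvoiceBalances.Balance
FROM InvoiceBalances
WHERE Invoices.Id = InvoiceBalances.InvoiceId
END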
It also helps to provide a default value of 0 for the Balance column since when you mark it as DatabaseGeneratedOption.Computed, EF won't provide any value for it when adding an Invoice row.
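As a rough sketch of the supporting schema change: the column can be given a default of 0, and a filtered index can cover the non-zero balances the application searches for. This assumes Balance is stored as decimal(19, 4) and that the table lives in the dbo schema, neither of which is stated above. The constraint and index names are made up for illustration.

```sql
-- Assumed type and schema; adjust to match the real Invoices table.
ALTER TABLE dbo.Invoices
    ADD Balance decimal(19, 4) NOT NULL
        CONSTRAINT DF_Invoices_Balance DEFAULT (0);
GO

-- Filtered index: only invoices with a non-zero balance are stored in it.
CREATE NONCLUSTERED INDEX IX_Invoices_NonZeroBalance
    ON dbo.Invoices (Balance)
    WHERE Balance <> 0;
```

A query such as SELECT Id, Balance FROM dbo.Invoices WHERE Balance <> 0 is the kind of lookup this index is meant to serve. Whether the optimizer actually picks it depends on the query shape and session settings.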

Related

SQL Server trigger to track annual revenue

I have a simple database with 4 tables:
Customer (cusId)
Newspaper (papId)
SubCost (subId)
Subscription (cusId, papId, subId)
Newspaper has a column to track number of subscribers which is updated via a trigger on the Subscription table. It also has a column to track annual revenue which should be based on the number of subscribers and the cost associated with the subscription (subId).
I am looking for a trigger to track annual revenue. There are 3 subscription types (subId) with differing weekly costs and a paper can have more than one type of subscription so it can't just be (cost * 52 * numSubs).
Can you help me with this logic?
Your best bet is not to use such a column at all. Instead, use a view that computes the result, and index it if necessary:
CREATE OR ALTER VIEW vTotalSubs
WITH SCHEMABINDING AS
SELECT
n.papid,
TotalRevenue = SUM(sc.Cost * 52),
TotalSubscriptions = COUNT_BIG(*) -- you MUST have this column here if aggregating with an index
FROM dbo.Newspaper n
JOIN dbo.Subscription s ON s.papid = n.papid
JOIN dbo.SubCost sc ON sc.subid = s.subid
GROUP BY
n.papid;
GO
CREATE UNIQUE CLUSTERED INDEX CX_vTotalSubs ON vTotalSubs (papid);
If you decide to index the view, be aware there are many restrictions to indexed views, in particular:
Only INNER JOIN is allowed, no other join types, no subqueries
Must schema-bind, and specify schema on all tables.
If aggregating, you must have COUNT_BIG(*), and the only other aggregation allowed is SUM
Make sure to add the WITH (NOEXPAND) hint when querying, otherwise there may be performance impacts
The server will automatically maintain the index, you do not need to update it.
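A query against the indexed view might look like the following sketch. It assumes the view was created in the dbo schema and that papid is an integer. The @papid variable is only there for illustration.

```sql
DECLARE @papid int = 1;  -- hypothetical newspaper id

-- NOEXPAND tells the optimizer to read the view's own index
-- instead of expanding the view into its base tables.
SELECT v.papid, v.TotalRevenue, v.TotalSubscriptions
FROM dbo.vTotalSubs AS v WITH (NOEXPAND)
WHERE v.papid = @papid;
```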

SQL query runs into a timeout on a sparse dataset

For sync purposes, I am trying to get a subset of the existing objects in a table.
The table has two fields, [Group] and Member, which are both stringified Guids.
All rows together may be too large to fit into a datatable; I already encountered an OutOfMemory exception. But I have to check that everything I need right now is in the datatable. So I take the Guids I want to check (they come in chunks of 1000), and query only for the related objects.
So, instead of filling my datatable once with all
SELECT * FROM Group_Membership
I am running the following SQL query against my SQL database to get related objects for one thousand Guids at a time:
SELECT *
FROM Group_Membership
WHERE
[Group] IN (@Guid0, @Guid1, @Guid2, @Guid3, @Guid4, @Guid5, ..., @Guid999)
The table in question now contains a total of 142 entries, and the query already times out (CommandTimeout = 30 seconds). On other tables, which are not as sparsely populated, similar queries don't time out.
Could someone shed some light on the logic of SQL Server and whether/how I could hint it into the right direction?
I already tried to add a nonclustered index on the column Group, but it didn't help.
I'm not sure whether WHERE ... IN with this many values can make full use of an index on [Group], or use it at all. However, if you had a second table containing the GUID values, and the column in that table had an index, then a join might perform very fast.
Create a temporary table for the GUIDs and populate it:
CREATE TABLE #Guids (
Guid varchar(255)
)
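INSERT INTO #Guids (Guid)
VALUES
-- one parenthesized row per GUID; a single (a, b, c, ...) list would be one row with many columns
(@Guid0), (@Guid1), (@Guid2), (@Guid3), (@Guid4), ...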
CREATE INDEX Idx_Guid ON #Guids (Guid);
Now try rephrasing your current query using a join instead of a WHERE IN (...):
SELECT *
FROM Group_Membership t1
INNER JOIN #Guids t2
ON t1.[Group] = t2.Guid;
As a disclaimer, if this doesn't improve the performance, it could be because your table has low cardinality. In such a case, an index might not be very effective.

Update row_number value in view joined to different table

I created a view in SQL Server that includes a row_number() function. The table referenced in the view contains every record in my database and enumerates records based on duplicate instances of a composite ID. For example:
row_number() over (
partition by
composite_id
order by
sample_value) as rownum
The issue is that whenever I join this view against another table (or filter rows based on a WHERE clause), the row number nevertheless always returns the value that would be returned for the full table referenced in the view. Instead, I'd like the row number to update depending on the records that are ultimately returned in the eventual result set.
For example:
select *
from my_created_view a
where a.sample_value in ('a','b','c')
or
select *
from my_created_view a
inner join subset_of_data b on a.sample_value = b.sample_value
...where either query above would result in a smaller number of records than are contained in the full original table and the resulting set of composite_id would sometimes contain only one instance. In cases where the result set contains only one instance of composite_id, I'd like that row to receive a value of 1.
Is this possible? Or does row numbering within a view create a row number that's tied only to the query within the created view?
Thanks in advance for any light you can shed here!
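ROW_NUMBER() inside the view is evaluated over the rows the view itself selects, so an outer filter or join does not renumber them. One possible approach, sketched here with the names from the question, is to compute a second row number in the outer query, where it is evaluated after the WHERE clause has been applied. The column alias filtered_rownum is made up for illustration.

```sql
-- a.* still carries the view's original rownum;
-- filtered_rownum is numbered over the filtered rows only.
SELECT a.*,
       ROW_NUMBER() OVER (
           PARTITION BY a.composite_id
           ORDER BY a.sample_value) AS filtered_rownum
FROM my_created_view AS a
WHERE a.sample_value IN ('a', 'b', 'c');
```

With this shape, a composite_id that has only one surviving row gets filtered_rownum = 1. The same outer ROW_NUMBER() can be added to the variant that joins against subset_of_data.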

Using count function or create specific column for counting in sql

I am working on a groups project. I have these tables:
I can get the number of members for each group by using the count function:
SELECT COUNT(1) AS Counts FROM [Groups].[GroupMembers]
WHERE GroupId=Id;
Or I can add another column to the Groups table for counting, and every time a new member joins the group, this field will increase by one. Is it better to use the count function or to add another column for counting? In other words, what are the advantages and disadvantages of each method?
Creating a column to store the counts is not recommended at all.
When you want the count of each group, you can use a simple SELECT query to show it:
SELECT G.groupid,
Count(userid)
FROM groups G
LEFT OUTER JOIN groupmembers GM
ON G.groupid = GM.groupid
GROUP BY G.groupid
If you do want to add a new column, you will need a trigger on the GroupMembers table to update the count column in the Groups table whenever a user is added to (or removed from) a group in GroupMembers.
It depends on your table engine. If your table engine is MyISAM, it would be much faster because it would simply read the number of rows in the table from a stored value; InnoDB, however, will need to do a full table scan.
It is not recommended to store a count inside of the table itself, so if this is something you're worried about, use the MyISAM engine if possible.
Storing a value in the table would needlessly require an extra UPDATE query on each new/lost membership.

How do I Iterate through a small set of records and retrieve records that match criteria set in each record

I have a table tblCriteria that contains a small (<20) set of records. Each record has a field of criteria.
I want SQL to move through these records when a run is requested (tblFilterRun), filter the main table tblRecords (~5000 records), and then insert some key fields from the matching records into another table, tblFilterResults.
tblCriteria (CriteriaID, CriteriaText)
tblFilterRun (FilterRunID, FilterRunDate)
tblFilterResults (FilterResultsID, FilterRunID, RecordID, Ref, CustomerID, SupplierID)
tblRecords (RecordID, CustomerID, SupplierID...)
Previously I would have created something in Access to iterate through each tblCriteria record, but I would like a purely server solution. I've heard cursors mentioned (usually at the same time as a profanity), what are my options?
It's not really clear what you need to do with the records in tblCriteria, but could you create a UDF that does the work of processing one record? Then you can call it on every record using one query like
SELECT *
FROM tblCriteria
CROSS APPLY dbo.udf_yourFunction(parameter1, parameter2, etc)
