SQL Persisted Computed Column with Subquery - sql-server

I have three tables
Table 1: Items
ItemID | DaysLastSold
Table2: Listings
ItemID | ListingID
Table3: Sales
ListingID | DateItemClosed
I got this query to work:
SELECT min(DATEDIFF(day, DateItemClosed, getdate())) as DaysLastSold
from Sales
where QtySold > 0
and ListingID in (SELECT ListingID from Listings where ItemID = 8101 )
What I'm trying to do is basically place this query into the DaysLastSold column in the Items table, so that whenever the column is selected it recalculates DaysLastSold using the ItemID in the neighboring column.

If you want to persist that information you could create an indexed view that is made up of your calculated value and an ItemID. Obviously this would not be a column in your original table though. You could then join in on this view when you need the information.
Personally I would probably just do it inline when you need it. If you are concerned about performance, post the execution plan here and we may be able to make some suggestions.
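The inline approach can be sketched as an ordinary view that is joined in when the value is needed. This is only a sketch: the dbo schema and the view name ItemDaysLastSold are assumptions, and QtySold is assumed to be a column of Sales, as in the question's query. It is deliberately not an indexed view, because SQL Server does not allow MIN or non-deterministic functions such as GETDATE() in an indexed view definition.

```sql
-- Ordinary (non-indexed) view: one row per item that has at least one sale.
-- ItemDaysLastSold is a hypothetical name; QtySold is assumed to be in Sales.
CREATE VIEW dbo.ItemDaysLastSold
AS
SELECT l.ItemID,
       MIN(DATEDIFF(day, s.DateItemClosed, GETDATE())) AS DaysLastSold
FROM dbo.Listings AS l
INNER JOIN dbo.Sales AS s
        ON s.ListingID = l.ListingID
WHERE s.QtySold > 0
GROUP BY l.ItemID;
GO

-- Usage: the LEFT JOIN keeps items that never sold (DaysLastSold is NULL for them).
SELECT i.ItemID, d.DaysLastSold
FROM dbo.Items AS i
LEFT JOIN dbo.ItemDaysLastSold AS d
       ON d.ItemID = i.ItemID;
```

Because the value is computed at query time, it is always current and needs no stored DaysLastSold column.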

Related

Replacing data in one table with data in another table using a unique ID

I'm using Access 2016 to view data from a table on our SQL server. I have a massive audit log where the record being viewed is represented by a "FolderID" field. I have another table that has values for the FolderID (represented as "fid") along with columns identifying the record's name and other ID numbers.
I want to be able to replace the FolderID value in the first table with CUSTOMER_NAME value from the second table so I know what's being viewed at a glance.
I've tried googling different join techniques to build a query that will accomplish this, but my google-fu is weak or I'm just not caffeinated enough today.
Table 1:
EventTime | EventType | FolderID
4/4/2019 1:23:39 PM | A | 12345
Table 2:
fid | acc | Other_ID | Third_ID | CUSTOMER_NAME
12345 | 0 | 9875 | 12345678 | Doe, John
Basically I want to query Table 2 to search for fid using the value in Table 1 for FolderID, and I want it to respond with the CUSTOMER_NAME associated with the FolderID/fid. The result would look like:
EventTime | EventType | FolderID
4/4/2019 1:23:39 PM | A | Doe, John
I'm stupid because I thought I was too smart to use the freaking Query Wizard. When I did, and it prompted me to create relationships and actually think about what I was doing, it came up with this.
SELECT [table1].EventTime, [table1].EventType, [table1].FolderID, [table1].ObjRef, [table1].AreaID, [table1].FileID, [table2].CUSTOMER_NAME, [table2].fid FROM [table2]
LEFT JOIN [table1] ON [table2].[fid] = [table1].[FolderID];
You can run this query and check whether it helps.
SELECT EventTime, EventType, CUSTOMER_NAME AS FolderID FROM Table1, Table2 WHERE Table1.FolderID = Table2.fid;
Basically, 'AS' is doing what you want here as you can rename your column to whatever you want.
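The same lookup can also be written with an explicit join that starts from the audit log, so that every audit row is kept even when no matching fid exists. This is a sketch, not a tested Access query: FolderName is a hypothetical alias, chosen to avoid reusing the name of an existing field, and the block carries no inline comments because Access's SQL view is assumed not to accept them.

```sql
SELECT t1.EventTime,
       t1.EventType,
       t2.CUSTOMER_NAME AS FolderName
FROM Table1 AS t1
LEFT JOIN Table2 AS t2
       ON t1.FolderID = t2.fid;
```

Rows without a matching customer come back with an empty FolderName rather than disappearing, which is usually what you want for an audit log.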

spx for moving values to new table

I am trying to create an spx that, based on my ID (1009), moves the data from 9 columns into a new table.
The old table has 9 columns:
CD_Train
CD_Date
CD_Score
Notes_Train
Notes_Date
Notes_Score
Ann_Train
Ann_Date
Ann_Score
userid - common in both tables
ID - 1009 - only exists in this table
and my new table has:
TrainingID,
TrainingType,
Score,
Date,
Status,
userid
TrainingType will have 3 values: Notes, CD, Ann.
Score will get its data from Notes_Score, CD_Score or Ann_Score,
Date will get its data from Notes_Date, CD_Date or Ann_Date, depending on the training type,
and Status will get its value from Notes_Train, CD_Train or Ann_Train.
Given this, I am lost as to how I should do it.
I tried writing a query against the users table with a join, but I can't work out how to make it work.
No idea yet, how to fill your column trainingId but the rest can be done by applying some UNION ALL clauses:
INSERT INTO tbl2 (TrainingType, Date, Score, Status, userid)
SELECT 'CD', CD_Date, CD_Score, CD_Train, userid FROM tbl1 WHERE CD_Date > 0
UNION ALL
SELECT 'Notes', Notes_Date, Notes_Score, Notes_Train, userid FROM tbl1 WHERE Notes_Date > 0
UNION ALL
SELECT 'Ann', Ann_Date, Ann_Score, Ann_Train, userid FROM tbl1 WHERE Ann_Date > 0
I don't know as yet whether all columns are filled in each row. That is the reason for the where clauses which should filter out only those rows with relevant data in the three selected columns.
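An alternative to the three UNION ALL branches is to unpivot each row with CROSS APPLY (VALUES ...), which reads the old table only once. The sketch below assumes SQL Server 2008 or later, that TrainingID is filled automatically (for example by an IDENTITY column), and that an unused training slot has a NULL date; the names tbl1 and tbl2 follow the answer above, while TrainingDate and TrainingStatus are hypothetical aliases.

```sql
-- Unpivot the three column groups of the row with ID 1009 into up to three rows.
-- Assumption: TrainingID in tbl2 is generated automatically (e.g. IDENTITY).
INSERT INTO tbl2 (TrainingType, [Date], Score, [Status], userid)
SELECT v.TrainingType, v.TrainingDate, v.Score, v.TrainingStatus, t.userid
FROM tbl1 AS t
CROSS APPLY (VALUES
    ('CD',    t.CD_Date,    t.CD_Score,    t.CD_Train),
    ('Notes', t.Notes_Date, t.Notes_Score, t.Notes_Train),
    ('Ann',   t.Ann_Date,   t.Ann_Score,   t.Ann_Train)
) AS v (TrainingType, TrainingDate, Score, TrainingStatus)
WHERE t.ID = 1009
  -- Assumption: a training that was never taken has a NULL date.
  AND v.TrainingDate IS NOT NULL;
```

Wrapping this statement in a stored procedure with the ID as a parameter would give the spx the question asks for.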

SQL Server FullText Search with Weighted Columns from Previous One Column

In the database on which I am attempting to create a FullText Search I need to construct a table with its column names coming from one column in a previous table. In my current implementation attempt the FullText indexing is completed on the first table Data and the search for the phrase is done there, then the second table with the search results is made.
The schema for the database is
**Players**
Id
PlayerName
Blacklisted
...
**Details**
Id
Name -> FirstName, LastName, Team, Substitute, ...
...
**Data**
Id
DetailId
PlayerId
Content
DetailId in the table Data relates to Id in Details, and PlayerId relates to Id in Players. If there are 1k rows in Players and 20 rows in Details, then there are 20k rows in Data.
WITH RankedPlayers AS
(
SELECT PlayerID, SUM(KT.[RANK]) AS Rnk
FROM Data c
INNER JOIN FREETEXTTABLE(dbo.Data, Content, '"Some phrase like team name and player name"')
AS KT ON c.DataID = KT.[KEY]
GROUP BY c.PlayerID
)
…
Then a table is made by selecting the rows in one column. Similar to a pivot.
…
SELECT rc.Rnk,
c.PlayerID,
PlayerName,
TeamID,
…
(SELECT Content FROM dbo.Data data WHERE DetailID = 1 AND data.PlayerID = c.PlayerID) AS [TeamName],
…
FROM dbo.Players c
JOIN RankedPlayers rc ON c.PlayerID = rc.PlayerID
ORDER BY rc.Rnk DESC
I can return a ranked table with this implementation. The aim, however, is to produce results from weighted columns, so that, say, the column PlayerName contributes more to the rank than TeamName.
I have tried making a schema bound view with a pivot, but then I cannot index it because of the pivot. I have tried making a view of that view, but it seems the metadata is inherited, plus that feels like a clunky method.
I then tried to do it as a straight query using sub queries in the select statement, but cannot due to indexing not liking sub queries.
I then tried to join multiple times, again the index on the view doesn't like self-referencing joins.
How to do this?
I have come across this article http://developmentnow.com/2006/08/07/weighted-columns-in-sql-server-2005-full-text-search/ , and other articles here on weighted columns, however nothing as far as I can find addresses weighting columns when the columns were initially row data.
A simple solution that works really well. Put weight on the rows containing the required IDs in another table, left join that table to the table to which the full text search had been applied, and multiply the rank by the weight. Continue as previously implemented.
In code that comes out as
DECLARE @Weight TABLE
(
    DetailID INT,
    [Weight] FLOAT
);
INSERT INTO @Weight VALUES
    (1, 0.80),
    (2, 0.80),
    (3, 0.50);
WITH RankedPlayers AS
(
    -- Details without an explicit weight fall back to 0.10
    SELECT PlayerID, SUM(KT.[RANK] * ISNULL(cw.[Weight], 0.10)) AS Rnk
    FROM Data c
    INNER JOIN FREETEXTTABLE(dbo.Data, Content, 'Karl Kognition C404') AS KT ON c.DataID = KT.[KEY]
    LEFT JOIN @Weight cw ON c.DetailID = cw.DetailID
    GROUP BY c.PlayerID
)
SELECT rc.Rnk,
...
I'm using a table variable here as a proof of concept. I am considering adding a Weight column to the Details table to avoid the extra table and left join.
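The idea of keeping the weight on the Details table could be sketched as follows. This is an untested sketch under stated assumptions: the Weight column, its default of 0.10 (mirroring the fallback used in the query above) and the constraint name DF_Details_Weight are all hypothetical, and the key column is written as DataID to match the query above even though the schema listing calls it Id.

```sql
-- Assumption: store the weight per detail; 0.10 is the default for unweighted details.
ALTER TABLE dbo.Details
    ADD [Weight] FLOAT NOT NULL
        CONSTRAINT DF_Details_Weight DEFAULT (0.10);
GO

-- Raise the weight of the details that should count most (Ids are illustrative).
UPDATE dbo.Details SET [Weight] = 0.80 WHERE Id IN (1, 2);
UPDATE dbo.Details SET [Weight] = 0.50 WHERE Id = 3;
GO

WITH RankedPlayers AS
(
    SELECT c.PlayerID,
           SUM(KT.[RANK] * d.[Weight]) AS Rnk
    FROM dbo.Data AS c
    INNER JOIN FREETEXTTABLE(dbo.Data, Content, 'Karl Kognition C404') AS KT
            ON c.DataID = KT.[KEY]      -- key column named as in the query above
    INNER JOIN dbo.Details AS d
            ON d.Id = c.DetailID        -- every Data row has a Details row
    GROUP BY c.PlayerID
)
SELECT PlayerID, Rnk
FROM RankedPlayers
ORDER BY Rnk DESC;
```

With the weight stored on Details, the ISNULL fallback is no longer needed because the column default covers unweighted details.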

Metadata database design

I am trying to store metadata about documents in SQL Server. The documents are stored in a document archive, which returns an identifier that I can later use to ask the archive for the document.
Our users would like to be able to search for documents based on different metadata. The metadata could be 1 attribute or 5, depending on the document type, and users should be able to create new document types from an admin site.
I can see two solutions here. One is that each document type gets its own metadata table in which all metadata attributes are predefined; if an attribute is added, a new column has to be created, and if a new document type is created, a new metadata table has to be created. Our DBA will freak out with a solution like this, and I also see a problem with indexes: if a document type has 5 different metadata attributes, it needs to be searchable with, say, 1 or 4 of them specified in the search. I would then need to create an index for every possible combination of search criteria.
Here is a (fictitious) example:
|documentId | Name | InsertDate | CustomerId | City
| 1 | John | 2014-01-01 | 2 | London
| 2 | John | 2014-01-20 | 5 | New York
| 3 | Able | 2014-01-01 | 10 | Paris
I could then say:
Give me all documents where Name = 'John'
Give me all documents where Name = 'John' and CustomerId = 5
Give me all documents where InsertDate = '2014-01-01' and City = 'London'
That is 3 different indexes, and I still haven't covered all possible combinations. This isn't practical.
So I am looking into the evil 'EAV' (anti)pattern.
Instead of having the metadata as columns, I can have it as rows.
|documentId | MetaAttribute | MetaValue
| 1 | Name | John
| 1 | InsertDate | 2014-01-01
| 1 | CustomerId | 2
| 1 | City | London
| 2 | Name | John
| 2 | InsertDate | 2014-01-20
| 2 | CustomerId | 5
| 2 | City | New York
| 3 | Name | Able
| 3 | InsertDate | 2014-01-01
| 3 | CustomerId | 10
| 3 | City | Paris
Here it's simple to create one index on MetaAttribute and MetaValue, and everything is covered. If a new document type is created, its metadata attributes can be added to a MetaAttribute table (which contains all the MetaAttributes for the different document types). So there is no need to create new tables or columns when a new document type is added or when a new attribute is added to a document type. The downside is that all MetaValues must be strings :( and the SQL query to find the document ID is a bit more complicated.
This is what I figured out. (In this example MetaAttribute is a string, but it would really be an ID referencing the MetaAttribute table.)
SELECT * FROM [Document]
WHERE ID IN (SELECT documentId FROM [MetaData]
WHERE ((MetaAttribute = 'Name' AND MetaValue = 'John')
OR (MetaAttribute = 'CustomerId' and MetaValue = '5'))
GROUP BY [documentId]
HAVING Count(1) = 2)
Here I need to ask whether Name = 'John' and CustomerId = 5. I do that by finding all records that have Name = 'John' or CustomerId = '5', grouping them by documentId, and counting the number of items in each group. If I get 2, then both Name = 'John' and CustomerId = '5' are true for that document. I return the documentId and use it to retrieve information about the document, such as the document archive storage ID.
There should be a better SQL statement for this, shouldn't there?
So my question is: is there a better approach than these two? Is the EAV pattern so bad that I should stick with the first approach and have a freaked-out DBA and "ten million indexes"?
We are talking about a system that will receive around 10-20 million new records each month and hold data for at least 3 years, so the tables will be pretty big and good indexes are necessary for performance.
Best Regards
Magnus
The EAV model is appealing if you have unbounded attributes--that is, anyone can set up anything as an attribute. However, it sounds from your description that this is not the case--the possible document attributes come from a known and fairly limited set. If this is the case, routine normalization suggests the following:
-- One per document
CREATE TABLE Document
(
DocumentId -- primary key
,DocumentType
,<etc>
)
-- One per "type" of document
CREATE TABLE DocumentType
(
DocumentTypeId -- primary key
,Name
)
-- One per possible document attribute.
-- Note that multiple document types can reference the same attribute
CREATE TABLE DocumentAttributes
(
AttributeId -- primary key
,Name
)
-- This lists which attributes are used by a given type
CREATE TABLE DocumentTypeAttributes
(
DocumentTypeId
,AttributeId
-- compound primary key on both columns
-- foreign keys on both columns
)
-- This contains the final association of document and attributes
CREATE TABLE DocumentAttributeValues
(
DocumentId
,AttributeId
,Value
-- compound primary key on DocumentId, AttributeId
-- foreign keys on both columns to their respective parent tables
)
A tighter model with more robust keys could be implemented to ensure at the database level that an attribute cannot be assigned to a document with an “inappropriate” type.
Queries have to use joins, but (presumably) only the Document and DocumentAttributeValues tables will ever be large. An index on (AttributeId + Value) facilitates lookups by attribute type, and depending on cardinality an index on (Value + AttributeId) could make searches for specific attributes quite efficient.
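The two indexes mentioned above might be declared as follows. This is a sketch with assumptions: the index names are hypothetical, Value is assumed to have an indexable type (for example NVARCHAR(200) rather than NVARCHAR(MAX)), and the compound primary key on (DocumentId, AttributeId) is assumed to be the clustered index, in which case SQL Server carries DocumentId in every nonclustered index automatically.

```sql
-- Lookups of the form "attribute X has value Y".
CREATE NONCLUSTERED INDEX IX_DocumentAttributeValues_Attribute_Value
    ON DocumentAttributeValues (AttributeId, Value);

-- Alternative column order; which one performs better depends on cardinality.
CREATE NONCLUSTERED INDEX IX_DocumentAttributeValues_Value_Attribute
    ON DocumentAttributeValues (Value, AttributeId);
```

In practice you would probably keep only the ordering that the execution plans show being used.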
(Edit)
Ooh, clever, I created two tables with the same name. I've renamed the last one to DocumentAttributeValues. (Free advice is clearly worth what you paid for it!)
This shows how ugly these systems can get in SQL, as you have to “look up” both attributes separately. On the plus side you don’t have to worry about “does this type go with this document”, as those rules have (better had) been applied when the data was loaded. Two examples:
This one spells everything out in joins, and as such I think it might perform worse than the next:
-- Top-down
SELECT do.DocumentId
from Document do
inner join DocumentAttributes da1
on da1.Name = 'Name'
inner join DocumentAttributeValues dav1
on dav1.DocumentId = do.DocumentId -- tie the value row to this document
and dav1.AttributeId = da1.AttributeId
and dav1.Value = 'John'
inner join DocumentAttributes da2
on da2.Name = 'CustomerId'
inner join DocumentAttributeValues dav2
on dav2.DocumentId = do.DocumentId -- tie the value row to this document
and dav2.AttributeId = da2.AttributeId
and dav2.Value = '5'
This one picks out the attributes, then finds which documents have all of them. It might perform better, as there’s one less table to process:
-- Bottom-up
SELECT xx.DocumentId
from (-- All documents with name "John"
select dav.DocumentId
from DocumentAttributes da
inner join DocumentAttributeValues dav
on dav.AttributeId = da.AttributeId
where da.Name = 'Name'
and dav.Value = 'John'
-- This combines the two sets, with "all" keeping any duplicate entries
union all
-- All documents with CustomerId = "5"
select dav.DocumentId
from DocumentAttributes da
inner join DocumentAttributeValues dav
on dav.AttributeId = da.AttributeId
where da.Name = 'CustomerId'
and dav.Value = '5') xx -- Have to give the subquery an alias
group by xx.DocumentId
having count(*) = 2
While further refinements might be possible, the more attributes you’re filtering on, the uglier the queries will be. Five attributes max might work ok in SQL, but if you’ve got tons of attributes, a NoSQL solution might be what you’re looking for.
(Please note that, as with my original post, I have not tested this code, so there may be typos or subtle--or not so subtle--errors in here.)
SQL Server 2008+ offers three related features for dealing with such cases:
Sparse Columns which allow you to define hundreds of columns even if only a subset are used at a time
Column Sets allow you to group these columns and treat them as a group
Filtered indexes can index only the rows that actually have values in them.
These features allow you to work with more-or-less normal SQL statements to handle all metadata columns.
These features were specifically added to address the EAV/metadata scenario.
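A minimal sketch of how the three features fit together, using the attributes from the question's example. The table name DocumentMetadata, the column AllMetadata, the index name and all column types are assumptions made for illustration.

```sql
-- One wide table; attributes that a document type does not use stay NULL
-- and take no storage because they are declared SPARSE.
CREATE TABLE dbo.DocumentMetadata
(
    DocumentId  INT           NOT NULL PRIMARY KEY,
    Name        NVARCHAR(100) SPARSE NULL,
    InsertDate  DATE          SPARSE NULL,
    CustomerId  INT           SPARSE NULL,
    City        NVARCHAR(100) SPARSE NULL,
    -- Column set: exposes all sparse columns of a row as one XML value.
    AllMetadata XML COLUMN_SET FOR ALL_SPARSE_COLUMNS
);
GO

-- Filtered index: only rows that actually have a CustomerId are indexed.
CREATE NONCLUSTERED INDEX IX_DocumentMetadata_CustomerId
    ON dbo.DocumentMetadata (CustomerId)
    WHERE CustomerId IS NOT NULL;
GO

-- Searches stay ordinary SQL.
SELECT DocumentId
FROM dbo.DocumentMetadata
WHERE Name = 'John'
  AND CustomerId = 5;
```

Note that once a column set is defined, SELECT * returns the column set instead of the individual sparse columns, so it is safer to name the columns explicitly.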
EDIT
If you have a limited set of attributes that are always filled, there is no need for Sparse Columns or the EAV anti-pattern either.
You can create your tables as you normally would and add indexes to optimize the real workload you encounter. Certain types of queries will occur far more often than others, and SQL Server's Database Engine Tuning Advisor can propose the indexes and statistics to use based on a trace captured using SQL Server Profiler.
It's quite possible that only a subset of the columns will accelerate searches and the rest can be added as include columns in the index.
Full Text Search
A more powerful option is to use SQL Server's Full Text Search. This will allow you to execute queries using arbitrary attributes. This is another technique used by document/content management systems, ERPs and CRMs to handle arbitrary attributes.
With FTS you simply specify the columns to include in one FTS index and don't have to create separate indexes for each attribute.
You can use FTS predicates in SELECT queries like this:
SELECT Name, ListPrice
FROM Production.Product
WHERE ListPrice = 80.99
AND CONTAINS(Name, 'Mountain')
This can result in much simpler queries (you just write a modified select) and administration (no worries about column order in indexes, only one FTS index to manage)
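For the document scenario, setting up a single full-text index over several attribute columns might look roughly like this. It is a sketch under assumptions: the table DocumentSearch, the catalog DocumentCatalog and the key name PK_DocumentSearch are hypothetical, and Full-Text Search must be installed on the instance.

```sql
-- A full-text index needs a unique, single-column, non-nullable key index.
CREATE TABLE dbo.DocumentSearch
(
    DocumentId INT NOT NULL CONSTRAINT PK_DocumentSearch PRIMARY KEY,
    Name       NVARCHAR(100) NULL,
    City       NVARCHAR(100) NULL
);
GO

CREATE FULLTEXT CATALOG DocumentCatalog;
GO

-- One full-text index covers every listed column.
CREATE FULLTEXT INDEX ON dbo.DocumentSearch (Name, City)
    KEY INDEX PK_DocumentSearch
    ON DocumentCatalog;
GO

-- Search both columns at once.
SELECT DocumentId
FROM dbo.DocumentSearch
WHERE CONTAINS((Name, City), 'London');
```

The index is populated in the background, so rows inserted just before a query may not be searchable immediately.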

Computed column expression

I have a specific need for a computed column called ProductCode
ProductId | SellerId | ProductCode
1 1 000001
2 1 000002
3 2 000001
4 1 000003
ProductId is identity, increments by 1.
SellerId is a foreign key.
So my computed column ProductCode must count how many products the seller has and format the result as 000000. The problem is how to know which seller's products to look at.
I've written some T-SQL, but it doesn't take into account how many products a seller has:
ALTER TABLE dbo.Product
ADD ProductCode AS RIGHT('000000' + CAST(ProductId AS VARCHAR(6)) , 6) PERSISTED
You cannot have a computed column based on data outside of the current row that is being updated. The best you can do to make this automatic is to create an after-trigger that queries the entire table to find the next value for the product code. But in order to make this work you'd have to use an exclusive table lock, which will utterly destroy concurrency, so it's not a good idea.
I also don't recommend using a view because it would have to calculate the ProductCode every time you read the table. This would be a huge performance-killer as well. By not saving the value in the database never to be touched again, your product codes would be subject to spurious changes (as in the case of perhaps deleting an erroneously-entered and never-used product).
Here's what I recommend instead. Create a new table:
dbo.SellerProductCode
SellerID LastProductCode
-------- ---------------
1 3
2 1
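A minimal definition of that table, together with the ordinary column the trigger writes to, might look like this. The column types are assumptions, since the original only shows column names and sample values.

```sql
-- Tracks the last product code handed out per seller (types are assumptions).
CREATE TABLE dbo.SellerProductCode
(
    SellerID        INT NOT NULL PRIMARY KEY,
    LastProductCode INT NOT NULL
);
GO

-- ProductCode has to be an ordinary column so a trigger can update it.
-- Assumption: any earlier computed column of the same name was dropped first.
ALTER TABLE dbo.Product
    ADD ProductCode CHAR(6) NULL;
```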
This table reliably records the last-used product code for each seller. On INSERT to your Product table, a trigger will update the LastProductCode in this table appropriately for all affected SellerIDs, and then update all the newly-inserted rows in the Product table with appropriate values. It might look something like the below.
See this trigger working in a Sql Fiddle
CREATE TRIGGER TR_Product_I ON dbo.Product FOR INSERT
AS
SET NOCOUNT ON;
SET XACT_ABORT ON;
-- Table variable that captures the new high-water mark per seller
DECLARE @LastProductCode TABLE (
    SellerID int NOT NULL PRIMARY KEY CLUSTERED,
    LastProductCode int NOT NULL
);
-- Reserve a block of codes for each seller that appears in this insert
WITH ItemCounts AS (
    SELECT
        I.SellerID,
        ItemCount = Count(*)
    FROM
        Inserted I
    GROUP BY
        I.SellerID
)
MERGE dbo.SellerProductCode C
USING ItemCounts I
    ON C.SellerID = I.SellerID
WHEN NOT MATCHED BY TARGET THEN
    INSERT (SellerID, LastProductCode)
    VALUES (I.SellerID, I.ItemCount)
WHEN MATCHED THEN
    UPDATE SET C.LastProductCode = C.LastProductCode + I.ItemCount
OUTPUT
    Inserted.SellerID,
    Inserted.LastProductCode
INTO @LastProductCode;
-- Hand out the reserved codes: the highest ProductID gets the highest code
WITH P AS (
    SELECT
        NewProductCode =
            L.LastProductCode + 1
            - Row_Number() OVER (PARTITION BY I.SellerID ORDER BY P.ProductID DESC),
        P.*
    FROM
        Inserted I
        INNER JOIN dbo.Product P
            ON I.ProductID = P.ProductID
        INNER JOIN @LastProductCode L
            ON P.SellerID = L.SellerID
)
UPDATE P
SET P.ProductCode = Right('00000' + Convert(varchar(6), P.NewProductCode), 6);
Note that this trigger works even if multiple rows are inserted. There is no need to preload the SellerProductCode table, either--new sellers will automatically be added. This will handle concurrency with few problems. If concurrency problems are encountered, proper locking hints can be added without deleterious effect as the table will remain very small and ROWLOCK can be used (except for the INSERT which will require a range lock).
Please do see the Sql Fiddle for working, tested code demonstrating the technique. Now you have real product codes that have no reason to ever change and will be reliable.
I would normally recommend using a view to do this type of calculation. The view could even be indexed if select performance is the most important factor (I see you're using persisted).
You cannot have a subquery in a computed column, which essentially means that you can only access the data in the current row. The only ways to get this count would be to use a user-defined function in your computed column, or triggers to update a non-computed column.
A view might look like the following:
create view ProductCodes as
select p.ProductId, p.SellerId,
(
select right('000000' + cast(count(*) as varchar(6)), 6)
from Product
where SellerID = p.SellerID
and ProductID <= p.ProductID
) as ProductCode
from Product p
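The user-defined function option mentioned above could be sketched like this. The function name GetProductCode is hypothetical, and the sketch assumes the Product table does not already have a ProductCode column. A computed column built this way cannot be marked PERSISTED, because the function reads table data, so the count is re-evaluated on every read.

```sql
-- Counts the seller's products up to and including the given product.
CREATE FUNCTION dbo.GetProductCode (@ProductId INT, @SellerId INT)
RETURNS CHAR(6)
AS
BEGIN
    RETURN (
        SELECT RIGHT('000000' + CAST(COUNT(*) AS VARCHAR(6)), 6)
        FROM dbo.Product
        WHERE SellerId = @SellerId
          AND ProductId <= @ProductId
    );
END;
GO

-- Non-persisted computed column that calls the function for each row.
ALTER TABLE dbo.Product
    ADD ProductCode AS dbo.GetProductCode(ProductId, SellerId);
```

Calling a scalar function per row tends to be slow on large result sets, which is one more reason to weigh this option against the view.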
One big caveat to your product numbering scheme, and a downfall for both the view and UDF options, is that we're relying upon a count of rows with a lower ProductId. This means that if a Product is inserted in the middle of the sequence, it would actually change the ProductCodes of existing Products with a higher ProductId. At that point, you must either:
Guarantee the sequencing of ProductId (identity alone does not do this)
Rely upon a different column that has a guaranteed sequence (still dubious, but maybe CreateDate?)
Use a trigger to get a count at insert which is then never changed.
