Sql Server - How compare hash of two rows in merge

I have several working tables that I am merging together into one final table that will be used for display. If the display table does not contain the primary key compiled from the working tables (hereafter called src), then I insert the row into display. This works fine; the next part is confusing to me.
If the primary key is already in display I only want to update the display row if the src row has the same primary key but at least one column is different from the display row. I'd like to implement this using the HASHBYTES() method using the MD5 algorithm.
From MSDN, the syntax should look like this: HASHBYTES('MD5', {@variable | 'string'})
I want to be able to do something like this in my merge statement:
WHEN MATCHED AND HASHBYTES('MD5', display) != HASHBYTES('MD5', src) THEN ...(stuff)
How do I complete the HASHBYTES function?
Here is my current merge statement
MERGE dbo.DisplayCases AS display
USING (SELECT CaseId, Title, projects.ProjectName, categories.CategoryTitle, Root, milestones.MilestoneName,
              milestones.MilestoneDate, Priority, statuses.StatusTitle, EstimatedHours, ElapsedHours, personAssigned.Name AS AssignedTo,
              personResolved.Name AS ResolvedBy, cases.IsResolved, IsOpen, Opened, Resolved, Uri, ResolveUri,
              OutlineUri, SpecUri, ParentId, Backlog
       FROM fogbugz.Cases cases
       JOIN fogbugz.Projects projects ON cases.ProjectId = projects.ProjectId
       JOIN fogbugz.Categories categories ON cases.CategoryId = categories.CategoryId
       JOIN fogbugz.Milestones milestones ON cases.MilestoneId = milestones.MilestoneId
       JOIN fogbugz.Statuses statuses ON cases.Status = statuses.StatusId
       JOIN fogbugz.People personAssigned ON cases.AssignedTo = personAssigned.Id
       LEFT JOIN fogbugz.People personResolved ON cases.ResolvedBy = personResolved.Id
      ) AS src
ON display.CaseId = src.CaseId
WHEN NOT MATCHED THEN
    INSERT (CaseId, CaseTitle, ProjectName, CategoryTitle, RootId, MilestoneName, MilestoneDate, Priority,
            StatusTitle, EstHrs, ElapsedHrs, AssignedTo, ResolvedBy, IsOpen, IsResolved, Opened, Resolved, Uri,
            ResolveUri, OutlineUri, Spec, ParentId, Backlog)
    -- The VALUES order must match the column list above (IsOpen before IsResolved).
    VALUES (src.CaseId, src.Title, src.ProjectName, src.CategoryTitle, src.Root, src.MilestoneName,
            src.MilestoneDate, src.Priority, src.StatusTitle, src.EstimatedHours, src.ElapsedHours,
            src.AssignedTo, src.ResolvedBy, src.IsOpen, src.IsResolved, src.Opened, src.Resolved,
            src.Uri, src.ResolveUri, src.OutlineUri, src.SpecUri, src.ParentId, src.Backlog);

From Martin Smith's comment...
You could do WHEN MATCHED AND EXISTS(SELECT Source.* EXCEPT SELECT Target.*) THEN UPDATE ...
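To make the comment concrete, here is a minimal sketch of how the EXCEPT test could sit inside the MERGE. It is not the asker's full statement: it uses only three columns (CaseId, Title, Priority) as an illustrative subset, and it assumes the src-to-display column pairs follow the question's INSERT list (for example src.Title maps to display.CaseTitle). Because EXCEPT compares columns by position, the two SELECT lists must name the paired columns in the same order. A commented-out HASHBYTES predicate is included for comparison.

```sql
-- Sketch only: a three-column subset of the question's tables.
MERGE dbo.DisplayCases AS display
USING (
    SELECT CaseId, Title, Priority
    FROM fogbugz.Cases
) AS src
    ON display.CaseId = src.CaseId
WHEN MATCHED AND EXISTS (
        -- Returns a row only when at least one paired column differs.
        -- EXCEPT treats two NULLs as equal, so no ISNULL wrappers are needed.
        SELECT src.Title, src.Priority
        EXCEPT
        SELECT display.CaseTitle, display.Priority
    ) THEN
    UPDATE SET CaseTitle = src.Title,
               Priority  = src.Priority
WHEN NOT MATCHED BY TARGET THEN
    INSERT (CaseId, CaseTitle, Priority)
    VALUES (src.CaseId, src.Title, src.Priority);

-- Hash-based alternative for the WHEN MATCHED predicate (CONCAT needs SQL Server 2012+):
-- WHEN MATCHED AND
--      HASHBYTES('MD5', CONCAT(src.Title, '|', src.Priority))
--   <> HASHBYTES('MD5', CONCAT(display.CaseTitle, '|', display.Priority)) THEN
```

One trade-off to keep in mind: CONCAT converts NULL to an empty string, so the hash variant cannot tell a NULL from '' in the same column, whereas the EXCEPT form can.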

Related

Apply OPENJSON to a single column

I have a products table with two attribute columns and a JSON column. I'd like to be able to split the JSON column and insert extra rows while retaining the attributes. Sample data looks like:
ID  Name     Attributes
1   Nikon    {"4e7a":["jpg","bmp","nef"],"604e":["en"]}
2   Canon    {"4e7a":["jpg","bmp"],"604e":["en","jp","de"]}
3   Olympus  {"902c":["yes"], "4e7a":["jpg","bmp"]}
I understand OPENJSON can convert JSON objects into rows, and key values into cells but how do I apply it on a single column that contains JSON data?
My goal is to have an output like:
ID  Name     key   value
1   Nikon    902c  NULL
1   Nikon    4e7a  ["jpg","bmp","nef"]
1   Nikon    604e  ["en"]
2   Canon    902c  NULL
2   Canon    4e7a  ["jpg","bmp"]
2   Canon    604e  ["en","jp","de"]
3   Olympus  902c  ["yes"]
3   Olympus  4e7a  ["jpg","bmp"]
3   Olympus  604e  NULL
Is there a way to query the products table like this, or some other way to produce my goal data set?
SELECT
ID,
Name,
OPENJSON(Attributes)
FROM products
Thanks!

Here is something that will at least start you in the right direction.
SELECT P.ID, P.[Name], AttsData.[key], AttsData.[Value]
FROM products P CROSS APPLY OPENJSON (P.Attributes) AS AttsData
The one thing that has me stuck a bit right now is the missing values (value is null in result)...
I was thinking of maybe doing some sort of outer/full join back to this, but even that is giving me headaches. Are you certain you need that? Or, could you do an existence check with the output from the SQL above?
I am going to keep at this. If I find a solution that matches your output exactly, I will add to this answer.
Until then... good luck!

You can get the rows with NULL value fields by creating a list of possible keys and using CROSS APPLY to associate each key to each row from the original dataset, and then left-joining in the parsed JSON.
Here's a working example you should be able to execute as-is:
-- Throw together a quick and dirty CTE containing your example data
WITH OriginalValues AS (
    SELECT *
    FROM (
        VALUES ( 1, 'Nikon', '{"4e7a":["jpg","bmp","nef"],"604e":["en"]}' ),
               ( 2, 'Canon', '{"4e7a":["jpg","bmp"],"604e":["en","jp","de"]}' ),
               ( 3, 'Olympus', '{"902c":["yes"], "4e7a":["jpg","bmp"]}' )
    ) AS T ( ID, Name, Attributes )
),
-- Build a separate dataset that includes all possible 'key' values from the JSON.
PossibleKeys AS (
    SELECT DISTINCT A.[key]
    FROM OriginalValues CROSS APPLY OPENJSON( OriginalValues.Attributes ) AS A
),
-- Get the existing keys and values from the JSON, associated with the record ID
ValuesWithKeys AS (
    SELECT OriginalValues.ID, Atts.[key], Atts.Value
    FROM OriginalValues CROSS APPLY OPENJSON( OriginalValues.Attributes ) AS Atts
)
-- Join each possible 'key' value with every record in the original dataset, and
-- then left join the parsed JSON values for each ID and key
SELECT OriginalValues.ID, OriginalValues.Name, KeyList.[key], ValuesWithKeys.Value
FROM OriginalValues
CROSS APPLY PossibleKeys AS KeyList
LEFT JOIN ValuesWithKeys
    ON OriginalValues.ID = ValuesWithKeys.ID
    AND KeyList.[key] = ValuesWithKeys.[key]
ORDER BY ID, [key];
If you need to include some pre-determined key values where some of them might not exist in ANY of the JSON values stored in Attributes, you could construct a CTE (like I did to emulate your original dataset) or a temp table to provide those values instead of doing the DISTINCT selection in the PossibleKeys CTE above. If you already know what your possible key values are without having to query them out of the JSON, that would most likely be a less costly approach.
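As a rough illustration of that last variant, the sketch below swaps the DISTINCT lookup for a fixed key list. The KnownKeys CTE and the extra key 'ffff' are hypothetical, added only to show a key that appears in none of the rows. The COLLATE clause is a precaution: OPENJSON returns its key column with a binary collation, which can conflict with the database default when compared against other string columns.

```sql
-- Sketch: predefined key list instead of SELECT DISTINCT over the JSON.
-- Requires database compatibility level 130 or higher for OPENJSON.
WITH Products AS (
    SELECT *
    FROM (
        VALUES ( 1, 'Nikon',   '{"4e7a":["jpg","bmp","nef"],"604e":["en"]}' ),
               ( 3, 'Olympus', '{"902c":["yes"], "4e7a":["jpg","bmp"]}' )
    ) AS T ( ID, Name, Attributes )
),
-- Hypothetical list of keys known in advance; 'ffff' exists in no row.
KnownKeys AS (
    SELECT *
    FROM ( VALUES ('4e7a'), ('604e'), ('902c'), ('ffff') ) AS K ( [key] )
)
SELECT P.ID, P.Name, K.[key], A.[value]
FROM Products AS P
CROSS JOIN KnownKeys AS K
-- OUTER APPLY keeps the (row, key) pair even when the JSON lacks that key,
-- leaving [value] NULL.
OUTER APPLY (
    SELECT J.[value]
    FROM OPENJSON( P.Attributes ) AS J
    WHERE J.[key] COLLATE DATABASE_DEFAULT = K.[key]
) AS A
ORDER BY P.ID, K.[key];
```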

SqlServer Many to Many AND

I have 3 (hypothetical) tables.
Photos (a list of photos)
Attributes (things describing the photos)
PhotosToAttributes (a table to link the first 2)
I want to retrieve the Names of all the Photos that have a list of attributes.
For example, all photos that have both dark lighting and are portraits (AttributeID 1 and 2). Or, for example, all photos that have dark lighting, are portraits and were taken at a wedding (AttributeID 1 and 2 and 5). Or any arbitrary number of attributes.
The scale of the database will be maybe 10,000 rows in Photos, 100 Rows in Attributes and 100,000 rows in PhotosToAttributes.
This question: SQL: Many-To-Many table AND query is very close. (I think.) I also read the linked answers about performance. That leads to something like the following. But, how do I get Name instead of PhotoID? And presumably my code (C#) will build this query and adjust the attribute list and count as necessary?
SELECT PhotoID
FROM PhotosToAttributes
WHERE AttributeID IN (1, 2, 5)
GROUP by PhotoID
HAVING COUNT(1) = 3
I'm a bit database illiterate (it's been 20 years since I took a database class); I'm not even sure this is a good way to structure the tables. I wanted to be able to add new attributes and photos at will without changing the data access code.

It is probably a reasonable way to structure the database. An alternative would be to keep all the attributes as a delimited list in a varchar field, but that would lead to performance issues when you search the field.
Your code is close. To take it to the final step, join the other two tables like this:
Select p.Name, p.PhotoID
From Photos As p
Join PhotosToAttributes As pta On p.PhotoID = pta.PhotoID
Join Attributes As a On pta.AttributeID = a.AttributeID
Where a.Name In ('Dark Light', 'Portrait', 'Wedding')
Group By p.Name, p.PhotoID
Having Count(*) = 3;
Joining the Attributes table this way lets you search for attributes by name instead of by ID.
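Since the question also asks about adjusting the attribute list and count from application code, one possible variation is to keep the wanted IDs in a table variable so the count never has to be hard-coded. This is only a sketch: the @Wanted name is made up for illustration, and from C# the same list could be supplied through a table-valued parameter.

```sql
-- Sketch: the list of required attributes lives in one place.
DECLARE @Wanted TABLE (AttributeID INT PRIMARY KEY);
INSERT INTO @Wanted (AttributeID) VALUES (1), (2), (5);

SELECT p.PhotoID, p.Name
FROM Photos AS p
JOIN PhotosToAttributes AS pta ON pta.PhotoID = p.PhotoID
JOIN @Wanted AS w ON w.AttributeID = pta.AttributeID
GROUP BY p.PhotoID, p.Name
-- A photo qualifies only if it matched every wanted attribute.
-- COUNT(DISTINCT ...) guards against duplicate rows in the link table.
HAVING COUNT(DISTINCT pta.AttributeID) = (SELECT COUNT(*) FROM @Wanted);
```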

First, create a view from your joins:
create view vw_PhotosWithAttributes
as
select
p.PhotoId,
a.AttributeID,
p.Name PhotoName,
a.Name AttributeName
from Photos p
inner join PhotosToAttributes pa on p.PhotoId = pa.PhotoId
inner join Attributes a on a.AttributeID = pa.AttributeID
You can then easily query by attribute, name, or ID, but don't forget to index the underlying columns properly.
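A query against the view might look like the sketch below. The attribute names are borrowed from the other answer and are assumptions about the data; the index name is made up, and whether that index helps depends on the actual workload.

```sql
-- Sketch: photos that carry both named attributes, read through the view.
SELECT PhotoId, PhotoName
FROM vw_PhotosWithAttributes
WHERE AttributeName IN ('Dark Light', 'Portrait')
GROUP BY PhotoId, PhotoName
HAVING COUNT(DISTINCT AttributeID) = 2;

-- One candidate index on the link table (hypothetical name).
CREATE INDEX IX_PhotosToAttributes_Attribute_Photo
    ON PhotosToAttributes (AttributeID, PhotoId);
```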

SQL Server FullText Search with Weighted Columns from Previous One Column

In the database on which I am attempting to create a FullText Search I need to construct a table with its column names coming from one column in a previous table. In my current implementation attempt the FullText indexing is completed on the first table Data and the search for the phrase is done there, then the second table with the search results is made.
The schema for the database is
**Players**
Id
PlayerName
Blacklisted
...
**Details**
Id
Name -> FirstName, LastName, Team, Substitute, ...
...
**Data**
Id
DetailId
PlayerId
Content
DetailId in the table Data relates to Id in Details, and PlayerId relates to Id in Players. If there are 1k rows in Players and 20 rows in Details, then there are 20k rows in Data.
WITH RankedPlayers AS
(
    SELECT PlayerID, SUM(KT.[RANK]) AS Rnk
    FROM Data c
    INNER JOIN FREETEXTTABLE(dbo.Data, Content, '"Some phrase like team name and player name"')
        AS KT ON c.DataID = KT.[KEY]
    GROUP BY c.PlayerID
)
…
Then a table is made by selecting the rows in one column. Similar to a pivot.
…
SELECT rc.Rnk,
c.PlayerID,
PlayerName,
TeamID,
…
(SELECT Content FROM dbo.Data data WHERE DetailID = 1 AND data.PlayerID = c.PlayerID) AS [TeamName],
…
FROM dbo.Players c
JOIN RankedPlayers rc ON c.PlayerID = rc.PlayerID
ORDER BY rc.Rnk DESC
I can return a ranked table with this implementation. The aim, however, is to produce results from weighted columns, so that, say, the column PlayerName contributes more to the rank than TeamName.
I have tried making a schema bound view with a pivot, but then I cannot index it because of the pivot. I have tried making a view of that view, but it seems the metadata is inherited, plus that feels like a clunky method.
I then tried to do it as a straight query using sub queries in the select statement, but cannot due to indexing not liking sub queries.
I then tried to join multiple times, again the index on the view doesn't like self-referencing joins.
How to do this?
I have come across this article http://developmentnow.com/2006/08/07/weighted-columns-in-sql-server-2005-full-text-search/ , and other articles here on weighted columns, however nothing as far as I can find addresses weighting columns when the columns were initially row data.

A simple solution that works well: store a weight for each relevant DetailID in a separate table, left join that table to the table the full-text search was applied to, and multiply the rank by the weight. The rest continues as previously implemented.
In code that comes out as
DECLARE @Weight TABLE
(
    DetailID INT,
    [Weight] FLOAT
);
INSERT INTO @Weight VALUES
    (1, 0.80),
    (2, 0.80),
    (3, 0.50);
WITH RankedPlayers AS
(
    -- Rows whose DetailID has no entry in @Weight fall back to a weight of 0.10
    SELECT PlayerID, SUM(KT.[RANK] * ISNULL(cw.[Weight], 0.10)) AS Rnk
    FROM Data c
    INNER JOIN FREETEXTTABLE(dbo.Data, Content, 'Karl Kognition C404') AS KT ON c.DataID = KT.[KEY]
    LEFT JOIN @Weight cw ON c.DetailID = cw.DetailID
    GROUP BY c.PlayerID
)
SELECT rc.Rnk,
...
I'm using a table variable here as a proof of concept. I am considering adding a Weight column to the Details table to avoid the extra table and left join.
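If the weight were moved onto Details as considered, it could look roughly like this. This is an untested sketch: the constraint name is made up, the 0.10 default mirrors the fallback used in the query, and the key column c.DataID follows the naming used in this answer rather than the schema listing in the question. GO is the SSMS/sqlcmd batch separator, used here so the new column exists before later batches reference it.

```sql
-- Sketch: keep the weight on Details instead of a separate table.
ALTER TABLE dbo.Details
    ADD [Weight] FLOAT NOT NULL
        CONSTRAINT DF_Details_Weight DEFAULT (0.10);  -- hypothetical constraint name
GO

UPDATE dbo.Details SET [Weight] = 0.80 WHERE Id IN (1, 2);
UPDATE dbo.Details SET [Weight] = 0.50 WHERE Id = 3;
GO

WITH RankedPlayers AS
(
    -- Every Data row has a Details row, so an inner join replaces the LEFT JOIN/ISNULL pair.
    SELECT c.PlayerID, SUM(KT.[RANK] * d.[Weight]) AS Rnk
    FROM dbo.Data AS c
    INNER JOIN FREETEXTTABLE(dbo.Data, Content, 'Karl Kognition C404') AS KT ON c.DataID = KT.[KEY]
    INNER JOIN dbo.Details AS d ON d.Id = c.DetailID
    GROUP BY c.PlayerID
)
SELECT PlayerID, Rnk
FROM RankedPlayers
ORDER BY Rnk DESC;
```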

Updating column based on three tables

I know it's very unprofessional, but it's our business system so I can't change it.
I have three tables: t_posList, t_url, t_type. The table t_posList has a column named URL which is also stored in the table t_url (the ID of the table t_url is not saved in t_posList so I have to find it like posList.Url = t_url.Url).
The column t_posList.status of every data row should be updated to 'non-customer' (it will be a status ID, but let's keep it simple) if the ID of t_url can NOT be found in t_type.url_id.
So the query has two steps: first I have to get all of the data rows where t_posList.Url = t_url.Url. After this I have to check which IDs of the found t_url rows can NOT be found in t_type.url_id.
I really hope you know what I mean. Because our system is very unprofessional and my SQL knowledge is not that good I'm not able to make this query.
EDIT: I tried this:
UPDATE t_poslist SET status = (
SELECT 'non-customer'
FROM t_url, t_type
WHERE url in
(select url from t_url
LEFT JOIN t_type ON t_url.ID = t_type.url_id
WHERE t_type.url_id is null)
)

What about this?
UPDATE p
SET status = 'non-customer'
FROM t_poslist p
INNER JOIN t_url u ON u.url = p.url
WHERE NOT EXISTS
(
SELECT * FROM t_type t WHERE t.url_id = u.ID
)
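Before running the UPDATE on a live business system, it may help to preview the affected rows with a SELECT that reuses the same FROM and WHERE clauses. This is a sketch of that check, using only the table and column names given in the question.

```sql
-- Sketch: list the rows the UPDATE would change, without modifying anything.
SELECT p.url,
       p.status AS current_status,
       u.ID     AS url_id
FROM t_poslist AS p
INNER JOIN t_url AS u ON u.url = p.url
WHERE NOT EXISTS
(
    -- No t_type row points at this URL's ID.
    SELECT 1 FROM t_type AS t WHERE t.url_id = u.ID
);
```

Note that because of the INNER JOIN, rows in t_poslist whose URL does not appear in t_url at all are neither listed here nor touched by the UPDATE.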

Merge statement for stocks using three tables

Here is the SQL Query:
MERGE tblProductsSold
USING tblOrders on tblOrders.OrderID = tblProductsSold.txtOrderID
WHEN NOT MATCHED THEN
Insert ( txtOrderID, txtOrderdate, txtPartno, txtQty)
values
(SELECT tblItemsOnOrder.txtOrderID,
tblOrders.txtDateTime,
tblItemsOnOrder.txtPartNO,
tblItemsOnOrder.txtQTY
FROM tblOrders INNER JOIN tblItemsOnOrder
ON tblOrders.OrderID = tblItemsOnOrder.txtOrderID
WHERE tblOrders.txtIsConfirmed = '1'
)
OUTPUT $action ;
Desired Result: need to import orders with Products that are not already in the tblProductsSold table

You cannot approach it like you are doing it right now.
The MERGE statement merges two tables - the two tables you define in the header - the source table and the target table.
Right now, you're using tblOrders as your source, and tblProducts as your target. That alone seems odd - you're trying to merge orders into products? Doesn't seem very fitting...
Once you've defined your source and target table, you start comparing which rows from the source are present in the target (or not). If a given row from your source is not present in the target, then you can insert its values into the target table.
But that only works for direct column values from the source table! You cannot go out and do subqueries into other tables as you're trying to do!
So I believe what you really should do is this:
as your source - have a view that lists the products found in your orders - the products (not the orders per se)
then compare your Products table to this view - if your orders happen to have any products that aren't present in the base Products table - insert them.
So you'd need something like:
MERGE tblProductsSold AS Target
USING (SELECT tblItemsOnOrder.txtOrderID, tblOrders.txtDateTime,
              tblItemsOnOrder.txtPartNO, tblItemsOnOrder.txtQty
       FROM tblOrders
       INNER JOIN tblItemsOnOrder ON tblOrders.OrderID = tblItemsOnOrder.txtOrderID
       WHERE tblOrders.txtIsConfirmed = '1') AS Source
-- The source query exposes the order ID as txtOrderID, so that is the name to reference here
ON Source.txtOrderID = Target.txtOrderID
WHEN NOT MATCHED THEN
    INSERT (txtOrderID, txtOrderdate, txtPartno, txtQty)
    VALUES (Source.txtOrderID, Source.txtDateTime, Source.txtPartNo, Source.txtQty)
OUTPUT $action ;
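The statement above matches on the order ID alone, so once any product from an order is in tblProductsSold, the order's remaining products count as matched and are skipped. If the goal is to add each missing product line, one option is to match on the order and the part number together. The sketch below assumes an order lists each part number at most once in tblItemsOnOrder, which the question does not confirm.

```sql
-- Sketch: treat (order, part) as the identity of a sold-product row.
MERGE tblProductsSold AS Target
USING (SELECT i.txtOrderID, o.txtDateTime, i.txtPartNO, i.txtQTY
       FROM tblOrders AS o
       INNER JOIN tblItemsOnOrder AS i ON o.OrderID = i.txtOrderID
       WHERE o.txtIsConfirmed = '1') AS Source
    ON  Source.txtOrderID = Target.txtOrderID
    AND Source.txtPartNO  = Target.txtPartno
WHEN NOT MATCHED BY TARGET THEN
    INSERT (txtOrderID, txtOrderdate, txtPartno, txtQty)
    VALUES (Source.txtOrderID, Source.txtDateTime, Source.txtPartNO, Source.txtQTY)
-- Report which order/part pairs were inserted.
OUTPUT $action, inserted.txtOrderID, inserted.txtPartno;
```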
