sql server - insert if not exist else update

I have a list of 100 item flavors and a table that needs one record for every item in every flavor. With 50 items, that means 100 records for each of the 50 items in table_A, for a total of 100 x 50 = 5,000 records.
What I have now is a random mix of data, and I know I don't have a record for each flavor for every item.
What I need is an idea or algorithm to solve this problem; pseudocode would do. I have a table with all possible flavors (tbl_flavor) and a table with all 50 items (tbl_items). These two dictate what needs to go into table_A, which is basically an inventory.
Please advise.

If I'm understanding your question correctly, a SQL Server EXCEPT query will help.
As already pointed out in the comments, here's how to get the matrix of items and flavors:
SELECT Items.Item, Flavors.Flavor
FROM Items
CROSS JOIN Flavors
Here's how to get the matrix of items and flavors, omitting the combinations that are already in your other table.
SELECT Items.Item, Flavors.Flavor
FROM Items
CROSS JOIN Flavors
EXCEPT SELECT Item, Flavor
FROM Table_A
So the INSERT becomes:
INSERT INTO Table_A (Item, Flavor)
SELECT Items.Item, Flavors.Flavor
FROM Items
CROSS JOIN Flavors
EXCEPT SELECT Item, Flavor
FROM Table_A
This query is untested because I'm not 100% sure about the question. If you post more detail I'll test it.
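To try the idea on a small data set, here is a minimal sketch using Python's built-in sqlite3 module. It assumes SQLite as a stand-in for SQL Server (the CROSS JOIN ... EXCEPT statement is written the same way in both), and the sample items and flavors are made up for illustration.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for SQL Server here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Items   (Item   TEXT PRIMARY KEY);
    CREATE TABLE Flavors (Flavor TEXT PRIMARY KEY);
    CREATE TABLE Table_A (Item TEXT, Flavor TEXT);
    INSERT INTO Items   VALUES ('cone'), ('cup');
    INSERT INTO Flavors VALUES ('vanilla'), ('mint'), ('lemon');
    -- Table_A starts out incomplete: only 2 of the 6 combinations exist.
    INSERT INTO Table_A VALUES ('cone', 'vanilla'), ('cup', 'mint');
""")

# Insert every (Item, Flavor) combination that Table_A does not have yet.
fill_gaps = """
    INSERT INTO Table_A (Item, Flavor)
    SELECT Items.Item, Flavors.Flavor
    FROM Items
    CROSS JOIN Flavors
    EXCEPT
    SELECT Item, Flavor
    FROM Table_A
"""

conn.execute(fill_gaps)
rows_after_first_run = conn.execute("SELECT COUNT(*) FROM Table_A").fetchone()[0]

# A second run inserts nothing, because no combination is missing any more.
conn.execute(fill_gaps)
rows_after_second_run = conn.execute("SELECT COUNT(*) FROM Table_A").fetchone()[0]

print(rows_after_first_run, rows_after_second_run)
```

Because the statement only adds what is missing, it should be safe to rerun whenever new items or flavors are added.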

There are a few ways you can tackle this sort of problem. Here is pseudocode for one of those ways.
UPDATE table
SET Col1 = SomeValue
WHERE MyKeys = MyKeys
IF (@@ROWCOUNT = 0)
BEGIN
INSERT table
(Cols)
VALUES
(Vals)
END
Or you can use MERGE. https://msdn.microsoft.com/en-us/library/bb510625.aspx
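The update-then-insert pattern can be sketched outside SQL Server as well. The example below is only an illustration under stated assumptions: it uses Python's sqlite3 module, a hypothetical Inventory table, and the cursor's rowcount attribute in place of the row-count check in the pseudocode.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical inventory table keyed by (Item, Flavor).
conn.execute(
    "CREATE TABLE Inventory (Item TEXT, Flavor TEXT, Qty INTEGER, "
    "PRIMARY KEY (Item, Flavor))"
)
conn.execute("INSERT INTO Inventory VALUES ('cone', 'vanilla', 5)")


def upsert(item, flavor, qty):
    """Update the row if it exists; insert it if the UPDATE touched nothing."""
    cur = conn.execute(
        "UPDATE Inventory SET Qty = ? WHERE Item = ? AND Flavor = ?",
        (qty, item, flavor),
    )
    if cur.rowcount == 0:  # same role as the row-count check in the pseudocode
        conn.execute(
            "INSERT INTO Inventory (Item, Flavor, Qty) VALUES (?, ?, ?)",
            (item, flavor, qty),
        )


upsert("cone", "vanilla", 9)  # existing row: updated
upsert("cone", "mint", 2)     # missing row: inserted
rows = conn.execute(
    "SELECT Item, Flavor, Qty FROM Inventory ORDER BY Flavor"
).fetchall()
print(rows)
```

With concurrent writers, the two statements would need to run inside one transaction so that another session cannot insert the same key between the UPDATE and the INSERT.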

Try this:
UPDATE MyTable
SET
ColumnToUpdate = NewValue
WHERE EXISTS
(
SELECT
1
FROM TableWithNewValue
WHERE ColumnFromTable1 = MyTable.ColumnName
)
INSERT INTO MyTable
(
Column1,
Column2,
...
ColumnN
)
SELECT
Value1,
Value2,
...
ValueN
FROM TableWithNewValue
WHERE NOT EXISTS
(
SELECT
1
FROM MyTable
WHERE ColumnName = TableWithNewValue.ColumnFromTable1
)

Related

How to insert multiple rows in a merge?

How to insert multiple rows in a merge in SQL?
I'm using a MERGE with an INSERT and I'm wondering whether it is possible to add two rows at the same time. Below is the query I have written. When a row is not matched, I want to insert two rows: one with IsNew = 1 and one with IsNew = 0.
How can I achieve this?
MERGE ITEMS AS TARGET
USING #table AS SOURCE
ON T.[ID]=S.ID
WHEN MATCHED THEN
UPDATE SET
T.[Content] = S.[Content],
WHEN NOT MATCHED THEN
INSERT (ID, Content, TIME, IsNew)
VALUES (ID, TEXT, GETDATE(), 1),

You can't do this directly with a merge statement, but there is a simple solution.
The <merge_not_matched> clause of the merge statement (the insert...values|default values part) can only insert one row into the target table for each row in the source table.
This means that to insert two rows for each unmatched source row, you simply need to change the source: in this case, a cross join query is enough.
However, the <merge_matched> clause requires that at most one source row match any single target row, or you will get the following error:
The MERGE statement attempted to UPDATE or DELETE the same row more than once. This happens when a target row matches more than one source row. A MERGE statement cannot UPDATE/DELETE the same row of the target table multiple times. Refine the ON clause to ensure a target row matches at most one source row, or use the GROUP BY clause to group the source rows.
To solve this problem, add a condition to the WHEN MATCHED clause so that only one source row updates each target row:
MERGE Items AS T
USING (
SELECT Id, Text, GetDate() As Date, IsNew
FROM #table
-- adding one row for each row in source
CROSS JOIN (SELECT 0 As IsNew UNION SELECT 1) AS isNewMultiplier
) AS S
ON T.[ID]=S.ID
WHEN MATCHED AND S.IsNew = 1 THEN -- Note the added condition here
UPDATE SET
T.[Content] = S.[Text]
WHEN NOT MATCHED THEN
INSERT (Id, Content, Time, IsNew) VALUES
(Id, Text, Date, IsNew);
You can see a live demo on rextester.
With all that being said, I would like to refer you to another Stack Overflow post that offers a better alternative than using the merge statement.
The author of that answer is a Microsoft MVP and an expert SQL Server DBA; you should at least read what he has to say.

It seems you can't achieve this using a merge statement. It may be better for you to split the two into separate queries for update and insert.
For example:
UPDATE ITEMS SET ITEMS.ID = #table.ID FROM ITEMS INNER JOIN #table ON ITEMS.ID = #table.ID
INSERT INTO ITEMS (ID, Content, TIME, IsNew) SELECT ID, TEXT, GETDATE(), 1 FROM #table
INSERT INTO ITEMS (ID, Content, TIME, IsNew) SELECT ID, TEXT, GETDATE(), 0 FROM #table
This will insert both rows as desired, mimicking your merge statement. However, your update statement won't do much - if you're matching based on ID, then it's impossible for you to have any IDs to update. If you wanted to update other fields, then you could change it like this:
UPDATE ITEMS SET ITEMS.Content = #table.TEXT FROM ITEMS INNER JOIN #table ON ITEMS.ID = #table.ID

How can I SELECT data that is exclusive?

I'm trying to write a SELECT statement that will pull data that only exists once. I have two columns, ItemID and OfficeID, and I need to find items from the ItemID column that are only registered to one office. Items can have multiple rows, one for each office they are assigned to. So a single ItemID can have multiple rows if it is used in multiple offices. Can I use a select statement with COUNT, or is there a better way?
Can't think of a place to start, but I've used COUNT in varying ways.

Using HAVING and EXISTS, you can use the query below. You said items can have multiple rows, one for each office they are assigned to, which I read as: an ItemID has multiple rows only if it has multiple OfficeIDs. If there can be multiple rows for the same OfficeID, just let us know.
select t.*
from tablename t
where exists(select 1 from tablename t2 where t2.ItemID = t.ItemID group by t2.ItemID having count(*) = 1)

You must group by ItemID and in the having clause apply the condition count(*) = 1:
select ItemID
from tablename
group by ItemID
having count(*) = 1
or with NOT EXISTS:
select t.ItemID
from tablename t
where not exists (
select 1 from tablename
where ItemID = t.ItemID and OfficeID <> t.OfficeID
)
This returns all items for which there is no other row with the same ItemID but a different OfficeID.
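A quick way to compare the two queries is to run them against a few rows. The sketch below assumes SQLite (through Python's sqlite3 module) as a stand-in for the real database, and the ItemID/OfficeID values are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tablename (ItemID INTEGER, OfficeID INTEGER);
    -- Items 1 and 3 are used in two offices; items 2 and 4 in only one.
    INSERT INTO tablename VALUES
        (1, 10), (1, 20), (2, 10), (3, 30), (3, 40), (4, 20);
""")

# GROUP BY / HAVING version.
with_having = [row[0] for row in conn.execute("""
    SELECT ItemID
    FROM tablename
    GROUP BY ItemID
    HAVING COUNT(*) = 1
    ORDER BY ItemID
""")]

# NOT EXISTS version: no other row with the same item and a different office.
with_not_exists = [row[0] for row in conn.execute("""
    SELECT t.ItemID
    FROM tablename t
    WHERE NOT EXISTS (
        SELECT 1 FROM tablename
        WHERE ItemID = t.ItemID AND OfficeID <> t.OfficeID
    )
    ORDER BY t.ItemID
""")]

print(with_having, with_not_exists)
```

The two versions agree here because each (ItemID, OfficeID) pair appears only once; if a pair could repeat, the HAVING COUNT(*) = 1 version would drop that item while the NOT EXISTS version would list it once per row.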

SQL: Efficient and fast way to select million of records except given records

I am using SQL Server and have a problem handling a large amount of data in a query. I want to select the records from the ITEM table that are not in a given list.
Let me elaborate.
I have an ITEM table with an ITEM_CODE column.
It contains several million records.
I populate a list of item codes from another source, for instance a file.
So I want to select the records from the ITEM table that are not in that populated list.
For example:
SELECT ITEM_CODE FROM ITEM WHERE ITEM_CODE NOT IN ('I1', 'I2', 'I3',.......);
Using NOT IN is cumbersome and takes a lot of time. Then I tried another way, like this:
SELECT ITEM_CODE FROM ITEM WHERE NOT (ITEM_CODE = 'I1' OR ITEM_CODE = 'I2' OR .....)
Note: ..... stands for millions of parameters.
This way also takes a lot of time. Another way I tried:
SELECT T.ITEM_CODE FROM ITEM T LEFT JOIN
(SELECT ITEM_CODE FROM ITEM T1
WHERE T1.ITEM_CODE ='I1' OR T1.ITEM_CODE ='I2') AS T2
on T.ITEM_CODE = T2.ITEM_CODE WHERE T2.ITEM_CODE IS NULL
This improves performance a little, but it is still not satisfactory.
Is there any way to do this fast?
Please suggest a solution.
Any answer will be appreciated.
Thank you.

How about something like this...
CREATE TABLE #TMP(ITEM_CODE VARCHAR(10))
INSERT INTO #TMP
VALUES('I1'), ('I2'), etc ....
SELECT T.ITEM_CODE
FROM ITEM T
LEFT JOIN #TMP T2 ON T.ITEM_CODE = T2.ITEM_CODE
WHERE T2.ITEM_CODE IS NULL
Or:
SELECT T.ITEM_CODE
FROM ITEM T
WHERE NOT EXISTS(
SELECT NULL
FROM #TMP T2
WHERE T2.ITEM_CODE = T.ITEM_CODE)
You can even create an index on the temp table
CREATE INDEX _temp ON #TMP (ITEM_CODE)
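As a rough illustration of loading the exclusion list into a temporary table and anti-joining against it, here is a sketch using Python's sqlite3 module. The item codes and the `exclude` list are made up, SQLite stands in for SQL Server, and the temp table is named TMP because SQLite does not use the # prefix.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ITEM (ITEM_CODE TEXT PRIMARY KEY);
    INSERT INTO ITEM VALUES ('I1'), ('I2'), ('I3'), ('I4'), ('I5');
    -- The PRIMARY KEY gives the temp table an index on ITEM_CODE.
    CREATE TEMP TABLE TMP (ITEM_CODE TEXT PRIMARY KEY);
""")

# Hypothetical exclusion list, e.g. read from a file; loaded in one batch.
exclude = ["I1", "I2", "I9"]
conn.executemany("INSERT INTO TMP VALUES (?)", [(code,) for code in exclude])

# Anti-join written as LEFT JOIN ... IS NULL.
left_join = [row[0] for row in conn.execute("""
    SELECT T.ITEM_CODE
    FROM ITEM T
    LEFT JOIN TMP T2 ON T.ITEM_CODE = T2.ITEM_CODE
    WHERE T2.ITEM_CODE IS NULL
    ORDER BY T.ITEM_CODE
""")]

# The same anti-join written as NOT EXISTS.
not_exists = [row[0] for row in conn.execute("""
    SELECT T.ITEM_CODE
    FROM ITEM T
    WHERE NOT EXISTS (
        SELECT NULL FROM TMP T2 WHERE T2.ITEM_CODE = T.ITEM_CODE
    )
    ORDER BY T.ITEM_CODE
""")]

print(left_join, not_exists)
```

The point of the design is that the list becomes data in an indexed table instead of millions of literals the parser has to process in the query text.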

SQL WHERE NOT EXISTS (skip duplicates)

Hello, I'm struggling to get the query below right. What I want is to return rows with unique names and surnames. What I get is all rows, including duplicates.
This is my SQL:
DECLARE @tmp AS TABLE (Name VARCHAR(100), Surname VARCHAR(100))
INSERT INTO @tmp
SELECT CustomerName,CustomerSurname FROM Customers
WHERE
NOT EXISTS
(SELECT Name,Surname
FROM @tmp
WHERE Name=CustomerName
AND Surname=CustomerSurname
GROUP BY Name,Surname )
Please can someone point me in the right direction here.
//Desperate (I tried without GROUP BY as well but get same result)

DISTINCT would do the trick.
SELECT DISTINCT CustomerName, CustomerSurname
FROM Customers
If you only want the records that really don't have duplicates (as opposed to getting duplicates represented as a single record) you could use GROUP BY and HAVING:
SELECT CustomerName, CustomerSurname
FROM Customers
GROUP BY CustomerName, CustomerSurname
HAVING COUNT(*) = 1

First, I thought that @David's answer is what you want. But rereading your comments, perhaps you want all combinations of names and surnames:
SELECT n.CustomerName, s.CustomerSurname
FROM
( SELECT DISTINCT CustomerName
FROM Customers
) AS n
CROSS JOIN
( SELECT DISTINCT CustomerSurname
FROM Customers
) AS s ;

Are you doing that while your @tmp table is still empty?
If so: your entire "select" is fully evaluated before the "insert" statement, it doesn't do "run the query and add one row, insert the row, run the query and get another row, insert the row, etc."
If you want to insert unique Customers only, use that same "Customer" table in your not exists clause
SELECT c.CustomerName,c.CustomerSurname FROM Customers c
WHERE
NOT EXISTS
(SELECT 1
FROM Customers c1
WHERE c.CustomerName = c1.CustomerName
AND c.CustomerSurname = c1.CustomerSurname
AND c.Id <> c1.Id)
If you want to insert a unique set of customers, use "distinct"

Typically, if you're doing a WHERE NOT EXISTS or WHERE EXISTS, or WHERE NOT IN subquery,
you should use what is called a "correlated subquery", as in ypercube's answer above, where table aliases are used for both inside and outside tables (where inside table is joined to outside table). ypercube gave a good example.
And often, NOT EXISTS is preferred over NOT IN (unless the WHERE NOT IN is selecting from a totally unrelated table that you can't join on.)
Sometimes, if you're tempted to do a WHERE EXISTS (SELECT from a small table with no duplicate values in the column), you could do the same thing by joining the main query with that table on the column you want in the EXISTS. This is not always the best or safest solution: it might make the query slower if there are many rows in that table, and it could produce many duplicate rows if there are duplicate values for that column in the joined table. In that case you'd have to add DISTINCT to the main query, which causes it to sort the data on all columns. Not efficient at all.
And, similarly, the WHERE NOT IN or NOT EXISTS correlated subqueries can be accomplished (often with the same execution plan) if you LEFT OUTER JOIN the table you were going to subquery and add a WHERE <joined column> IS NULL.
You have to be careful using that, but you don't need a DISTINCT. Frankly, I prefer to use the WHERE NOT IN subqueries or NOT EXISTS correlated subqueries, because the syntax makes the intention clear and it's hard to go wrong.
And you do not need a DISTINCT in the SELECT inside such subqueries (correlated or not). It would be a waste of processing (and for WHERE EXISTS or WHERE IN subqueries, the SQL optimizer would ignore it anyway and just use the first value that matched for each row in the outer query). (Hope that makes sense.)
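The duplicate-row risk of replacing EXISTS with a join can be seen on a tiny example. This is a sketch only: the Orders table and its rows are hypothetical, and SQLite (via Python's sqlite3) is assumed as a stand-in for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (Id INTEGER PRIMARY KEY, CustomerName TEXT);
    -- Hypothetical second table with duplicate values in the joined column.
    CREATE TABLE Orders (OrderId INTEGER PRIMARY KEY, CustomerId INTEGER);
    INSERT INTO Customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO Orders VALUES (100, 1), (101, 1), (102, 3);
""")

# Correlated EXISTS: each customer appears at most once.
with_exists = conn.execute("""
    SELECT c.CustomerName
    FROM Customers c
    WHERE EXISTS (SELECT 1 FROM Orders o WHERE o.CustomerId = c.Id)
    ORDER BY c.Id
""").fetchall()

# Plain join: Ann appears twice because she has two matching rows.
with_join = conn.execute("""
    SELECT c.CustomerName
    FROM Customers c
    JOIN Orders o ON o.CustomerId = c.Id
    ORDER BY c.Id, o.OrderId
""").fetchall()

print(with_exists)
print(with_join)
```

Adding DISTINCT to the join version would remove the repeated row, at the cost described above.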

T-SQL filtering on dynamic name-value pairs

I'll describe what I am trying to achieve:
I am passing an XML document with name-value pairs down to a stored procedure, where I put them into a table variable, let's say @nameValuePairs.
I need to retrieve a list of IDs for expressions (a table) with those exact match of name-value pairs (attributes, another table) associated.
This is my schema:
Expressions table --> (expressionId, attributeId)
Attributes table --> (attributeId, attributeName, attributeValue)
After trying complicated stuff with dynamic SQL and evil cursors (which works but it's painfully slow) this is what I've got now:
--do the magic plz!
-- retrieve number of name-value pairs
SET @noOfAttributes = (select count(*) from @nameValuePairs)
select distinct
e.expressionId, a.attributeName, a.attributeValue
into
#temp
from
expressions e
join
attributes a
on
e.attributeId = a.attributeId
join --> this join does the filtering
@nameValuePairs nvp
on
a.attributeName = nvp.name and a.attributeValue = nvp.value
group by
e.expressionId, a.attributeName, a.attributeValue
-- now select the IDs I need
-- since I did a select distinct above if the number of matches
-- for a given ID is the same as noOfAttributes then BINGO!
select distinct
expressionId
from
#temp
group by expressionId
having count(*) = @noOfAttributes
Can people please review and see if they can spot any problems? Is there a better way of doing this?
Any help appreciated!

I believe that this would satisfy the requirement you're trying to meet. I'm not sure how much prettier it is, but it should work and wouldn't require a temp table:
SET @noOfAttributes = (select count(*) from @nameValuePairs)
SELECT e.expressionid
FROM expressions e
LEFT JOIN (
SELECT a.attributeid
FROM attributes a
JOIN @nameValuePairs nvp ON nvp.name = a.attributeName AND nvp.value = a.attributeValue
) t ON t.attributeid = e.attributeid
GROUP BY e.expressionid
HAVING SUM(CASE WHEN t.attributeid IS NULL THEN (@noOfAttributes + 1) ELSE 1 END) = @noOfAttributes
EDIT: After doing some more evaluation, I found an issue where certain expressions would be included that shouldn't have been. I've modified my query to take that into account.

One error I see is that you have no table with an alias of b, yet you are using: a.attributeId = b.attributeId.
Try fixing that and see if it works, unless I am missing something.
EDIT: I think you just fixed this in your edit, but is it supposed to be a.attributeId = e.attributeId?

This is not a bad approach, depending on the sizes and indexes of the tables, including @nameValuePairs. If these row counts are high or it otherwise becomes slow, you may do better to put @nameValuePairs into a temp table instead, add appropriate indexes, and use a single query instead of two separate ones.
I do notice that you are putting columns into #temp that you are not using; it would be faster to exclude them (though it would mean duplicate rows in #temp). Also, your second query has both a "distinct" and a "group by" on the same columns. You don't need both, so I would drop the "distinct" (it probably won't affect performance, because the optimizer already figured this out).
Finally, #temp would probably be faster with a clustered non-unique index on expressionid (I am assuming that this is SQL 2005). You could add it after the SELECT..INTO, but it is usually as fast or faster to add it before you load. This would require you to CREATE #temp first, add the clustered index, and then use INSERT..SELECT to load it instead.
I'll add an example of merging the queries in a minute... OK, here's one way to merge them into a single query (this should be 2000-compatible also):
-- retrieve number of name-value pairs
SET @noOfAttributes = (select count(*) from @nameValuePairs)
-- now select the IDs I need
-- since I did a select distinct above if the number of matches
-- for a given ID is the same as noOfAttributes then BINGO!
select
expressionId
from
(
select distinct
e.expressionId, a.attributeName, a.attributeValue
from
expressions e
join
attributes a
on
e.attributeId = a.attributeId
join --> this join does the filtering
@nameValuePairs nvp
on
a.attributeName = nvp.name and a.attributeValue = nvp.value
) as Temp
group by expressionId
having count(*) = @noOfAttributes
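To see what the merged query returns, it can be run against a few rows. The sketch below makes several assumptions: SQLite (via Python's sqlite3) stands in for SQL Server, the table variable becomes an ordinary table named nameValuePairs, the count is passed as a bound parameter, and all sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE attributes (
        attributeId INTEGER PRIMARY KEY, attributeName TEXT, attributeValue TEXT
    );
    CREATE TABLE expressions (expressionId INTEGER, attributeId INTEGER);
    CREATE TABLE nameValuePairs (name TEXT, value TEXT);

    INSERT INTO attributes VALUES
        (1, 'color', 'red'), (2, 'size', 'L'), (3, 'color', 'blue');
    -- Expression 10 has red + L, 20 has only red, 30 has blue + L.
    INSERT INTO expressions VALUES (10, 1), (10, 2), (20, 1), (30, 3), (30, 2);
    -- The pairs being searched for.
    INSERT INTO nameValuePairs VALUES ('color', 'red'), ('size', 'L');
""")

# Number of name-value pairs that every returned expression must match.
no_of_attributes = conn.execute(
    "SELECT COUNT(*) FROM nameValuePairs"
).fetchone()[0]

matching_ids = [row[0] for row in conn.execute("""
    SELECT expressionId
    FROM (
        SELECT DISTINCT e.expressionId, a.attributeName, a.attributeValue
        FROM expressions e
        JOIN attributes a ON e.attributeId = a.attributeId
        JOIN nameValuePairs nvp
          ON a.attributeName = nvp.name AND a.attributeValue = nvp.value
    ) AS Temp
    GROUP BY expressionId
    HAVING COUNT(*) = ?
""", (no_of_attributes,))]

print(matching_ids)
```

Note that this pattern accepts an expression that has all the requested pairs plus extra attributes; if only exact matches are wanted, the expression's total attribute count would also have to be compared.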