SQL Server: complex operation unique code for groups of rows - sql-server

I have a table where most of the columns are filled with data. In that table I have rows where for 4 columns I have duplicate data. I wanna do something like this:
Check all rows, group all data by these four columns (one group = one unique data for these 4 columns) and sign to them UniqueCode in last column. UniqueCode are for group.
So, when I have something like this:
Name Street HouseNumber PostCode UniqueCode
Cos Cos Cos Cos
Cos Cos Cos Cos
I want to fill UniqueCode with this same code. I want to write a query which clears all current uniqueCode and fills with new and write trigger (I call it right?) which do the same for newly added rows.
It is possible write that behaviour in sql?
Or I need to do it in code in my program?
Can you help me?
Sorry for my bad English.

Assuming you have a table similar to this one:
create table Persons
(
ID int identity primary key,
Name varchar(100),
Street varchar(100),
HouseNumber int,
PostCode varchar(100),
UniqueCode uniqueidentifier
)
You would create a trigger that finds already entered Persons having the same composite key:
create trigger AssignPersonGroup on Persons
after insert, update
as
set nocount on
update Persons
set UniqueCode =
isnull(
(select top 1 UniqueCode from Persons
where UniqueCode is not null
and Inserted.Name = Persons.Name
and Inserted.Street = Persons.Street
and Inserted.HouseNumber = Persons.HouseNumber
and Inserted.PostCode = Persons.PostCode)
, newid())
from Inserted inner join Persons
on Inserted.ID = Persons.ID
And assuming this would be the data:
insert into persons (Name, Street, HouseNumber, PostCode)
values ('Zagor', 'Darkwood', 23, '01010')
insert into persons (Name, Street, HouseNumber, PostCode)
values ('Zagor', 'Darkwood', 23, '01010')
insert into persons (Name, Street, HouseNumber, PostCode)
values ('Chico', 'Darkwood', 23, '01010')
Then
select *
from Persons
Would deliver:
ID Name Street HouseNumber PostCode UniqueCode
1 Zagor Darkwood 23 01010 A113D12D-F730-42DD-B3EE-AC33E34C0679
2 Zagor Darkwood 23 01010 A113D12D-F730-42DD-B3EE-AC33E34C0679
3 Chico Darkwood 23 01010 FD0739AF-525C-42C2-B929-0AB8EEAC3A73
Your existing data would be updated if you execute following two updates:
update persons
set uniquecode = newid()
update Persons
set uniqueCode=
(
select top 1 uniqueCode
from Persons groups
where UniqueCode is not null
and groups.Name = Persons.Name
and groups.Street = Persons.Street
and groups.HouseNumber = Persons.HouseNumber
and groups.PostCode = Persons.PostCode
order by ID
)
They are separated because Sql Server executes joins before scalar expressions. My first take on this was update query with derived table grouping on all columns and adding newid() column, but as inner join to persons table was executed first ids were unique per each row again.
Disable trigger prior to doing this and reenable it later. uniqueCode should not be set in program, only in trigger. You would need composite index on (Name, Street, HouseNumber, PostCode) and another one on UniqueCode.
That aside, I have seen your other question concerning this project. What are you trying to do? I'm unsure about nature of relationship between primary and high school tables. If you want to connect them by certain person, then you should have a person table with identity primary key. Both tables would reference this one and you would have no problems identifing someone's academic path :-)

Related

Snowflake - how to do multiple DML operations on same primary key in a specific order?

I am trying to set up continuous data replication in Snowflake. I get the transactions happened in source system and I need to perform them in Snowflake in the same order as source system. I am trying to use MERGE for this, but when there are multiple operations on same key in source system, MERGE is not working correctly. It either misses an operation or returns duplicate row detected during DML operation error.
Please note that the transactions need to happen in exact order and it is not possible to take the latest transaction for a key and just do it (like if a record has been INSERTED and UPDATED, in Snowflake too it needs to be inserted first and then updated even though insert is only transient state) .
Here is the example:
create or replace table employee_source (
id int,
first_name varchar(255),
last_name varchar(255),
operation_name varchar(255),
binlogkey integer
)
create or replace table employee_destination ( id int, first_name varchar(255), last_name varchar(255) );
insert into employee_source values (1,'Wayne','Bells','INSERT',11);
insert into employee_source values (1,'Wayne','BellsT','UPDATE',12);
insert into employee_source values (2,'Anthony','Allen','INSERT',13);
insert into employee_source values (3,'Eric','Henderson','INSERT',14);
insert into employee_source values (4,'Jimmy','Smith','INSERT',15);
insert into employee_source values (1,'Wayne','Bellsa','UPDATE',16);
insert into employee_source values (1,'Wayner','Bellsat','UPDATE',17);
insert into employee_source values (2,'Anthony','Allen','DELETE',18);
MERGE into employee_destination as T using (select * from employee_source order by binlogkey)
AS S
ON T.id = s.id
when not matched
And S.operation_name = 'INSERT' THEN
INSERT (id,
first_name,
last_name)
VALUES (
S.id,
S.first_name,
S.last_name)
when matched AND S.operation_name = 'UPDATE'
THEN
update set T.first_name = S.first_name, T.last_name = S.last_name
When matched
And S.operation_name = 'DELETE' THEN DELETE;
I am expecting to see - Bellsat - as last name for employee id 1 in the employee_destination table after all rows get processed. Same way, I should not see emp id 2 in the employee_destination table.
Is there any other alternative to MERGE to achieve this? Basically to go over every single DML in the same order (using binlogkey column for ordering) .
thanks.
You need to manipulate your source data to ensure that you only have one record per key/operation otherwise the join will be non-deterministic and will (dpending on your settings) either error or will update using a random one of the applicable source records. This is covered in the documentation here https://docs.snowflake.com/en/sql-reference/sql/merge.html#duplicate-join-behavior.
In any case, why would you want to update a record only for it to be overwritten by another update - this would be incredibly inefficient?
Since your updates appear to include the new values for all rows, you can use a window function to get to just the latest incoming change, and then merge those results into the target table. For example, the select for that merge (with the window function to get only the latest change) would look like this:
with SOURCE_DATA as
(
select COLUMN1::int ID
,COLUMN2::string FIRST_NAME
,COLUMN3::string LAST_NAME
,COLUMN4::string OPERATION_NAME
,COLUMN5::int PROCESSING_ORDER
from values
(1,'Wayne','Bells','INSERT',11),
(1,'Wayne','BellsT','UPDATE',12),
(2,'Anthony','Allen','INSERT',13),
(3,'Eric','Henderson','INSERT',14),
(4,'Jimmy','Smith','INSERT',15),
(1,'Wayne','Bellsa','UPDATE',16),
(1,'Wayne','Bellsat','UPDATE',17),
(2,'Anthony','Allen','DELETE',18)
)
select * from SOURCE_DATA
qualify row_number() over (partition by ID order by PROCESSING_ORDER desc) = 1
That will produce a result set that has only the changes required to merge into the target table:
ID
FIRST_NAME
LAST_NAME
OPERATION_NAME
PROCESSING_ORDER
1
Wayne
Bellsat
UPDATE
17
2
Anthony
Allen
DELETE
18
3
Eric
Henderson
INSERT
14
4
Jimmy
Smith
INSERT
15
You can then change the when not matched to remove the operation_name. If it's listed as an update and it's not in the target table, it's because it was inserted in a previous operation in the new changes.
For the when matched clause, you can use the operation_name to determine if the row should be updated or deleted.

What's the cleanest way in SQL to break a table into two tables?

I have a table that I want to break into 2 tables. I want to pull some data out of table A, put it into a new table B, and then point each record in A to the corresponding record in the new table.
It's easy enough to populate the new table with an INSERT INTO B blah blah SELECT blah blah FROM A. But the catch is, when I create the new records in B, I want to write the ID of the B record back into A.
I've thought of two ways to do this:
Create a cursor, loop through A a record at a time, create the record in B and post the new ID back to A.
Create a temporary table with the extracted data, an ID for the new record, and the ID of A. Then use this temporary table to populate B and also to post the ID back to A.
Both methods seem cumbersome with a lot of copying all the data back and forth. Is there a clean, simple way to do this or should I just knuckle down and do it the hard way?
Oh, I'm using Microsoft SQL Server, if your answer depends on non-standard features of SQL.
Someone asks for an example. Yes, I should have included something concrete to make it clear. The real example is a bunch of data, but let me give a simplified example of what I mean.
Let's say I have a Customer table with customer_id, name, and city. I want to break city out into a separate table.
So for example:
Customer
ID Name City
17 Al Detroit
22 Betty Baltimore
39 Charles Cleveland
I want to convert this to:
Customer
ID Name City_ID
17 Al 1
22 Betty 2
39 Charles 3
City
ID Name
1 Detroit
2 Baltimore
3 Cleveland
The exact ID values don't matter.
So easy enough to create the City table and the reference ...
create table city (id int identity primary key, name varchar(50))
alter table customer add city_id int references city
And then populate the city table ...
insert into city (name)
select city from customer
The trick is how to get those city IDs back into the Customer table.
(And yes, in this simplified example, the effort may appear pointless. In real life we have many tables with addresses and I want to pull all those fields out of all the other tables and put them into a single address table, so we can standardize the declarations and processing of addresses.)
(Note: I haven't tested the sample code above. Excuse me if there's a typo or something in there.)
You can use the output clause to capture your new ID values.
without any sample data or examples of what you are doing the following is just a guide.
Create a #table to hold the new ID values, then insert the newly inserted Id identity values along with a correlating value from the inserted virtual table. You can then update the original table with the new IDs by joining on this correlating value.
create table #NewIds (TableBId int, TableAId int)
insert into TableB (column list)
output inserted.Id, inserted.TableAId into #NewIds
select column list
from TableA
update a
set a.TableBId=Id
from #NewIds n join TableA a on a.Id=n.TableAId

spx for moving values to new table

I am trying to create one spx which based upon my ID which is 1009 will move 9 columns data to new table:
The old table has 9 columns:
CD_Train
CD_Date
CD_Score
Notes_Train
Notes_Date
Notes_Score
Ann_Train
Ann_Date
Ann_Score
userid - common in both tables
ID - 1009 - only exists in this table
and my new table has:
TrainingID,
TrainingType,
Score,
Date,
Status,
userid
TrainingType will have 3 values: Notes, CD, Ann
and other fields like score will get data from notes_score and so on
and date will get data from notes_date,cd_date depending upon in which column cd training went
status will get value from Notes_Train, cd_train and so on
based upon this, I am lost how should I do it
I tried querying one sql of users table and tried to do the join but I am losing the ground how to fix it
No idea yet, how to fill your column trainingId but the rest can be done by applying some UNION ALL clauses:
INSERT INTO tbl2 (trainingType,Date,Score,Status,userid)
Select 'CD' , CD_date, CD_score, CD_Train, userid FROM tbl1 where CD_date>0
UNION ALL
SELECT 'Notes', Notes_Date, Notes_Score, Notes_Train, userid FROM tbl1 where Notes_date>0
UNION ALL
SELECT 'Ann', Ann_Date, Ann_Score, ANN_Train, userid
FROM tbl1 where Ann_date>0
I don't know as yet whether all columns are filled in each row. That is the reason for the where clauses which should filter out only those rows with relevant data in the three selected columns.

T-SQL for Updating Rows with same value in a column

I have a table lets say called FavoriteFruits that has NAME, FRUIT, and GUID for columns. The table is already populated with names and fruits. So lets say:
NAME FRUIT GUID
John Apple NULL
John Orange NULL
John Grapes NULL
Peter Canteloupe NULL
Peter Grapefruit NULL
Ok, now I want to update the GUID column with a new GUID (using NEWID()), but I want to have the same GUID per distinct name. So I want all the John Smiths to have the same GUID, and I want both the Peters to have the same GUID, but that GUID different than the one used for the Johns. So now it would look something like this:
NAME FRUIT GUID
John Apple f6172268-78b7-4c2b-8cd7-7a5ca20f6a01
John Orange f6172268-78b7-4c2b-8cd7-7a5ca20f6a01
John Grapes f6172268-78b7-4c2b-8cd7-7a5ca20f6a01
Peter Canteloupe e3b1851c-1927-491a-803e-6b3bce9bf223
Peter Grapefruit e3b1851c-1927-491a-803e-6b3bce9bf223
Can I do that in an update statement without having to use a cursor? If so can you please give an example?
Thanks guys...
Update a CTE won't work because it'll evaluate per row. A table variable would work:
You should be able to use a table variable as a source from which to update the data. This is untested, but it'll look something like:
DECLARE #n TABLE (Name varchar(10), Guid uniqueidentifier);
INSERT #n
SELECT Name, newid() AS Guid
FROM FavoriteFruits
GROUP BY Name;
UPDATE f
SET f.Guid = n.Guid
FROM #n n
JOIN FavoriteFruits f ON f.Name = n.Name
So that populates a variable with a GUID per name, then joins it back to the original table and updates accordingly.
To clarify comments re a table expression in the USING clause of a MERGE statement.
The following won't work because it'll evaluate per row:
MERGE INTO FavoriteFruits
USING (
SELECT NAME, NEWID() AS GUID
FROM FavoriteFruits
GROUP
BY NAME
) AS source
ON source.NAME = FavoriteFruits.NAME
WHEN MATCHED THEN
UPDATE
SET GUID = source.GUID;
But the following, using a table variable, will work:
DECLARE #n TABLE
(
NAME VARCHAR(10) NOT NULL UNIQUE,
GUID UNIQUEIDENTIFIER NOT NULL UNIQUE
);
INSERT INTO #n (NAME, GUID)
SELECT NAME, NEWID()
FROM FavoriteFruits
GROUP
BY NAME;
MERGE INTO FavoriteFruits
USING #n AS source
ON source.NAME = FavoriteFruits.NAME
WHEN MATCHED THEN
UPDATE
SET GUID = source.GUID;
There's a single-statement solution too, which, however, has some limitations. The idea is to use OPENQUERY(), like this:
UPDATE FavoriteFruits
SET GUID = n.GUID
FROM (
SELECT NAME, GUID
FROM OPENQUERY(
linkedserver,
'SELECT NAME, NEWID() AS GUID FROM database.schema.FavoriteFruits GROUP BY NAME'
)
) n
WHERE FavoriteFruits.NAME = n.NAME
This solution implies that you need to create a self-pointing linked server. Another specificity is that you can't use this method on table variables nor local temporary tables (global ones would do as well as 'normal' tables).

How to automatically set the SNo of the row in MS SQL Server 2005?

I have a couple of rows in a database table (lets call it Customer). Each row is numbered by SNo, which gets automatically incremented by the identity property inherent in MS SQLServer. But when I delete a particular row that particular row number is left blank, but I want the table to auto correct itself.
To give you a example:
I have a sample Customer Table with following rows:
SNo CustomerName Age
1 Dani 28
2 Alex 29
3 Duran 21
4 Mark 24
And suppose I delete 3rd row the table looks like this:
SNo CustomerName Age
1 Dani 28
2 Alex 29
4 Mark 24
But I want the table to look like this:
SNo CustomerName Age
1 Dani 28
2 Alex 29
3 Mark 24
How can I achieve that?
Please help me out
Thanks in anticipation
As has been pointed out doing that would break anything in a relationship with SNo, however if your doing this because you need ordinal numbers in you presentation layer for example, you can pull off a [1..n] row number with;
SELECT ROW_NUMBER() OVER(ORDER BY SNo ASC), SNo, CustomerName, Age FROM Customer
Obviously in this case the row number is just an incrementing number, its meaningless in relation to anything else.
I don't think you want to do that. Imagine the scenario where you have another table CustomerOrder that stores all customer orders. The structure for that table might look something like this:
CustomerOrder
-------------
OrderID INT
SNo INT
OrderDate DATETIME
...
In this case, the SNo field is a foreign key into the CustomerOrder table, and we use it to relate orders to a customer. If you delete a record from your Customer table (say with SNo = 1), are you going to go back and update the SNo values in the entire CustomerOrder table? It's best to just let the ID's autoincrement and not worry about spaces in the IDs due to deletions.
Why not create a view?
CREATE VIEW <ViewName>
AS
SELECT
ROW_NUMBER() OVER(ORDER BY SNo ASC) AS SNo
,CustomerName
,Age
FROM Customers
GO
Then access the data in customers table by selecting from the view.
Of course the SNo shown by the view has no meaning in the context of relationships, but the data returned will look exactly like you want it to look.
Using transactions when inserting records in the Database with C#
You have to use DBCC CHECKIDENT(table_name, RESEED, next_val_less_1);
As have been pointed out in other answers, this is a bad idea, and if the reason is for a presentation there are other solutions.
-- Add data to temp table
select SNo, CustomerName, Age
into #Customer
from Customer
-- Truncate Customer
-- Resets identity to seed value for column
truncate table Customer
-- Add rows back to Customer
insert into Customer(CustomerName, Age)
select CustomerName, Age
from #Customer
order by SNo
drop table #Customer

Resources