Duplicate Data over One-to-Many self relation (Tsql)

Sorry if the title is poorly descriptive, but I can't do better right now =(
So, I have this master-detail scheme, where the detail is a tree structure (a one-to-many self relation) with n levels (on SQL Server 2005).
I need to copy a detail structure from one master to another using a stored procedure, passing the source master id and the target master id as parameters (the target is new, so it doesn't have details).
I'm having trouble, and I'm asking for your kind help in finding a way to keep track of parent ids and insert the children without using cursors or nasty things like that...
This is a sample model, of course, and what I'm trying to do is copy the detail structure from one master to another. In fact, I'm creating a new master using an existing one as a template.

If I understand the problem, this might be what you want:
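-- Create the new master row.
INSERT dbo.Master VALUES (@NewMaster_ID, @NewDescription)
-- Copy the details, temporarily storing each source row's detail_id in parent_id.
INSERT dbo.Detail (parent_id, master_id, [name])
SELECT detail_ID, @NewMaster_ID, [name]
FROM dbo.Detail
WHERE master_id = @OldMaster_ID
-- Re-point each copy at the copy of its original parent.
-- The LEFT JOIN resets parent_id to NULL for copied root rows
-- (assuming root rows have a NULL parent_id).
UPDATE NewChild
SET parent_id = NewParent.detail_id
FROM dbo.Detail NewChild
JOIN dbo.Detail OldChild
ON NewChild.parent_id = OldChild.detail_id
LEFT JOIN dbo.Detail NewParent
ON NewParent.parent_id = OldChild.parent_ID
AND NewParent.master_id = @NewMaster_ID
WHERE NewChild.master_id = @NewMaster_ID
AND OldChild.master_id = @OldMaster_ID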
The trick is to use the old detail_id as the new parent_id in the initial insert. Then join back to the old set of rows using this relationship, and update the new parent_id values.
I assumed that detail_id is an IDENTITY value. If you assign them yourself, you'll need to provide details, but there's a similar solution.
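To make the trick concrete, here is a minimal, self-contained sketch that can be run in a scratch session. Everything in it is an assumption for illustration: the temp table #Detail stands in for dbo.Detail, the three sample rows are made up, and root rows are assumed to have a NULL parent_id. The parent lookup uses a LEFT JOIN so that the copied root ends up with a NULL parent instead of still pointing at the old tree.

```sql
-- Hypothetical stand-in for dbo.Detail (a real temp table, SQL Server 2005 syntax).
CREATE TABLE #Detail (
    detail_id int IDENTITY(1,1) PRIMARY KEY,
    parent_id int NULL,
    master_id int NOT NULL,
    [name]    varchar(50) NOT NULL
);

-- Sample tree under master 1: root -> child -> grandchild.
-- The table is fresh, so the identity values are 1, 2 and 3.
INSERT #Detail (parent_id, master_id, [name]) VALUES (NULL, 1, 'root');
INSERT #Detail (parent_id, master_id, [name]) VALUES (1,    1, 'child');
INSERT #Detail (parent_id, master_id, [name]) VALUES (2,    1, 'grandchild');

DECLARE @OldMaster_ID int, @NewMaster_ID int;
SET @OldMaster_ID = 1;
SET @NewMaster_ID = 2;

-- Step 1: copy the rows, parking each source row's detail_id in parent_id.
INSERT #Detail (parent_id, master_id, [name])
SELECT detail_id, @NewMaster_ID, [name]
FROM #Detail
WHERE master_id = @OldMaster_ID;

-- Step 2: replace the parked id with the id of the copied parent.
UPDATE NewChild
SET parent_id = NewParent.detail_id
FROM #Detail AS NewChild
JOIN #Detail AS OldChild
  ON OldChild.detail_id = NewChild.parent_id
LEFT JOIN #Detail AS NewParent
  ON NewParent.parent_id = OldChild.parent_id
 AND NewParent.master_id = @NewMaster_ID
WHERE NewChild.master_id = @NewMaster_ID
  AND OldChild.master_id = @OldMaster_ID;

-- Check: each copied row next to the name of its (copied) parent.
SELECT c.[name], p.[name] AS parent_name, p.master_id AS parent_master_id
FROM #Detail AS c
LEFT JOIN #Detail AS p ON p.detail_id = c.parent_id
WHERE c.master_id = @NewMaster_ID;

DROP TABLE #Detail;
```

The final SELECT should list 'root' with no parent, 'child' under the copied 'root', and 'grandchild' under the copied 'child', with every parent belonging to master 2.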

You'll have to provide CREATE TABLE and INSERT INTO statements for a little sample data, and the expected results based on that sample data.

Related

Distinct with long columns

I have a database schema with tables that have long fields (of type "text" in both MS SQL Server and Sybase), and I need to retrieve distinct rows.
The tables look like
create table node (id int primary key, … a few more fields … data text);
create table ref (id int primary key, node_id int, … a few more fields);
For one row in "node", there may be zero or more rows in "ref".
Now I have a query like
SELECT node.* FROM node, ref WHERE node.id = ref.node_id AND ... some more restrictions.
This query returns duplicates and triplicates when there is more than one row in "ref" for some "node_id".
But I need unique rows!
Using SELECT DISTINCT node.* does not work because of the columns of type "text" :-(
In Sybase there is a trick: just add "GROUP BY node.id" to the query and, voilà, you get unique rows returned.
Is there a similarly simple trick for MS SQL Server?
I already have a solution with temporary tables, but it seems to be a lot slower. Maybe that is just because of the larger number of statements sent to the database?

It looks like you are approaching this problem from the wrong direction. Joins are typically used to expand on keys where relevant data is stored in different tables. So it's no surprise you are getting more than one row per node_id.
In your query, you join the two tables together, but then you ignore everything from ref. It looks like you're just trying to filter out ids from node that are not referenced in ref. If that is the case, then you don't want to use a join. The following will work much better:
select *
from node
where id in (
select node_id
from ref
where [any restrictions placed on the ref table go here]
)
and [any restrictions placed on the node table go here]
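Furthermore, at the risk of teaching you bad join practices, the same thing can be accomplished the way you were originally trying to do it, but it's more painful to write and it's not good practice. Note also that SQL Server does not allow columns of type "text" in a GROUP BY, so any such column would have to be left out of this version: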
select node.col1, node.col2, ... , node.last_col
FROM node
inner join ref on node.id = ref.node_id
where [some restrictions.]
group by node.col1, node.col2, ... , node.last_col
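As a quick illustration of the difference, here is a small sketch with made-up data. The temp tables #node and #ref and their rows are assumptions standing in for the real tables, and EXISTS is used as an equivalent way of writing the IN filter shown earlier. Because the filter never joins the tables, no DISTINCT or GROUP BY is needed, so the "text" column is not a problem.

```sql
-- Hypothetical stand-ins for the node and ref tables.
CREATE TABLE #node (id int PRIMARY KEY, data text);
CREATE TABLE #ref  (id int PRIMARY KEY, node_id int);

INSERT #node (id, data) VALUES (1, 'first node');
INSERT #node (id, data) VALUES (2, 'second node');
INSERT #ref (id, node_id) VALUES (10, 1);
INSERT #ref (id, node_id) VALUES (11, 1);   -- node 1 is referenced twice

-- Join: node 1 comes back once per matching ref row (two rows).
SELECT n.*
FROM #node AS n
JOIN #ref AS r ON r.node_id = n.id;

-- Semi-join: node 1 comes back once, and node 2 is filtered out.
SELECT n.*
FROM #node AS n
WHERE EXISTS (SELECT 1 FROM #ref AS r WHERE r.node_id = n.id);

DROP TABLE #ref;
DROP TABLE #node;
```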

Merge query using two tables in SQL server 2012

I am very new to SQL and SQL Server, and would appreciate any help with the following problem.
I am trying to update a share price table with new prices.
The table has three columns: share code, date, price.
The share code + date = PK
As you can imagine, if you have thousands of share codes and 10 years' data for each, the table can get very big. So I have created a separate table called a share ID table, and use a share ID instead in the first table (I was reliably informed this would speed up the query, as searching by integer is faster than string).
So, to summarise, I have two tables as follows:
Table 1 = Share_code_ID (int), Date, Price
Table 2 = Share_code_ID (int), Share_name (string)
So let's say I want to update the table/s with today's price for share ZZZ. I need to:
Look for the Share_code_ID corresponding to 'ZZZ' in table 2
If it is found, update table 1 with the new price for that date, using the Share_code_ID I just found
If the Share_code_ID is not found, update both tables
Let's ignore for now how the Share_code_ID is generated for a new code, I'll worry about that later.
I'm trying to use a merge query loosely based on the following structure, but have no idea what I am doing:
MERGE INTO [Table 1]
USING (VALUES (1,23-May-2013,1000)) AS SOURCE (Share_code_ID,Date,Price)
{ SEEMS LIKE THERE SHOULD BE AN INNER JOIN HERE OR SOMETHING }
ON Table 2 = 'ZZZ'
WHEN MATCHED THEN UPDATE SET Table 1.Price = 1000
WHEN NOT MATCHED THEN INSERT { TO BOTH TABLES }
Any help would be appreciated.

http://msdn.microsoft.com/library/bb510625(v=sql.100).aspx
You use Table1 as the target table and Table2 as the source table.
You want to take an action when the given ID is not found in Table2, that is, in the source table.
In the documentation, which you have already read, that corresponds to the clause
WHEN NOT MATCHED BY SOURCE ... THEN <merge_matched>
and the latter corresponds to
<merge_matched>::=
{ UPDATE SET <set_clause> | DELETE }
Ergo, you cannot insert into the source table there.
You could use triggers for auto-insertion when you insert something into Table1, but a trigger would not be able to insert the proper Share_name; it just won't know it.
So you have two options, I guess.
1) Write a T-SQL code block; look into stored procedures. I think there is also a construct for executing an anonymous code block in MS SQL, like the EXECUTE BLOCK command in Firebird SQL Server, but I don't know for sure.
2) Create an updatable SQL view joining Table1 and Table2 to show the most current date, so that when you insert a row into this view, the view's on-insert trigger actually inserts rows into both tables. And when you update the data in the view, the on-update trigger modifies the data.
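Option 1 could look roughly like the batch below, which could later be wrapped in a stored procedure. This is only a sketch under stated assumptions: the temp tables #ShareCode and #SharePrice stand in for Table 2 and Table 1, the column names and types are invented, and Share_code_ID is assumed to be an IDENTITY column (the question left ID generation open). The lookup table is handled with a plain INSERT, and MERGE is used only for the price table.

```sql
-- Hypothetical stand-ins for Table 2 (codes) and Table 1 (prices).
CREATE TABLE #ShareCode (
    Share_code_ID int IDENTITY(1,1) PRIMARY KEY,   -- assumption: IDENTITY
    Share_name    varchar(20) NOT NULL UNIQUE
);
CREATE TABLE #SharePrice (
    Share_code_ID int   NOT NULL,
    PriceDate     date  NOT NULL,
    Price         money NOT NULL,
    PRIMARY KEY (Share_code_ID, PriceDate)
);

DECLARE @Share_name    varchar(20) = 'ZZZ',
        @PriceDate     date        = '20130523',
        @Price         money       = 1000,
        @Share_code_ID int;

-- Step 1: look for the ID that corresponds to the share code.
SELECT @Share_code_ID = Share_code_ID
FROM #ShareCode
WHERE Share_name = @Share_name;

-- Step 2: if the code is unknown, add it and pick up the new ID.
IF @Share_code_ID IS NULL
BEGIN
    INSERT #ShareCode (Share_name) VALUES (@Share_name);
    SET @Share_code_ID = SCOPE_IDENTITY();
END;

-- Step 3: update the price for that ID and date, or insert it if missing.
MERGE #SharePrice AS tgt
USING (SELECT @Share_code_ID, @PriceDate, @Price)
      AS src (Share_code_ID, PriceDate, Price)
   ON tgt.Share_code_ID = src.Share_code_ID
  AND tgt.PriceDate     = src.PriceDate
WHEN MATCHED THEN
    UPDATE SET Price = src.Price
WHEN NOT MATCHED THEN
    INSERT (Share_code_ID, PriceDate, Price)
    VALUES (src.Share_code_ID, src.PriceDate, src.Price);

SELECT c.Share_name, p.PriceDate, p.Price
FROM #SharePrice AS p
JOIN #ShareCode AS c ON c.Share_code_ID = p.Share_code_ID;

DROP TABLE #SharePrice;
DROP TABLE #ShareCode;
```

If several sessions can add the same new share code at the same time, the lookup and insert in steps 1 and 2 would need to run in a transaction with suitable locking; that is left out here to keep the sketch short.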

Stored procedure to update a table based on data from table parameter

I want to begin by stating I'm an SQL noob, so I'd appreciate any suggestions or comments on my workflow and/or mindset when trying to solve this issue.
What I'm doing is gathering usage statistics about several applications, in several categories (not all categories necessarily apply to all applications), storing them in a database.
I've set up a few tables to do that, and then one table to link everything together that's structured like so (from now on: Dtable):
(column name - details)
UserID - foreign key to another table which stores users data
ApplicationID - foreign key to another table which stores applications data
CategoryID - foreign key to another table which holds a list of different categories
Value - the actual data
Each application gathers the data, then submits it to the database using a stored procedure. As the amount of data can be different based on actual usage (not always sending every category) and for each application, I was thinking of sending the data as a DataTable with a list of CategoryID and Value so I won't have to call a procedure for every individual category (Ptable).
I need to update each record in Dtable to the correct value in Ptable according to CategoryID, but also filtered by UserID and ApplicationID. UserID and ApplicationID will be given as two other parameters to the Stored Procedure. Ptable only contains a list of CategoryID / Value records.
Now, I read about Cursors (for each record in the table parameter set the relevant data in the database table), but the consensus seems to be "Avoid at all costs".
How would I go about updating the table, then, based on the varying records in Ptable?
P.S.
The tables are structured like so to keep agility and scalability in adding more categories/applications in the future. If there's a better way to do it I'll be happy to know.

I believe the update statement would look something like this, where @Ptable is the table parameter and @ApplicationID and @UserID are the stored proc's other parameters:
update Dtable
set Dtable.Value = p.Value
from @Ptable p
where Dtable.UserID = @UserID
and Dtable.ApplicationID = @ApplicationID
and Dtable.CategoryID = p.CategoryID;
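For context, here is one way the surrounding pieces might be declared, as a sketch rather than the definitive layout. Table-valued parameters need SQL Server 2008 or later. The type name dbo.CategoryValueList, the procedure name dbo.UpdateUsageStats, and all column types (Value is assumed to be an int) are hypothetical; only the table and column names come from the question.

```sql
-- Assumed shape of Dtable (types are guesses; foreign keys omitted).
CREATE TABLE dbo.Dtable (
    UserID        int NOT NULL,
    ApplicationID int NOT NULL,
    CategoryID    int NOT NULL,
    Value         int NOT NULL,
    PRIMARY KEY (UserID, ApplicationID, CategoryID)
);
GO

-- Hypothetical table type that describes the rows sent by the application.
CREATE TYPE dbo.CategoryValueList AS TABLE (
    CategoryID int NOT NULL PRIMARY KEY,
    Value      int NOT NULL
);
GO

-- Hypothetical procedure: one set-based UPDATE, no cursor.
CREATE PROCEDURE dbo.UpdateUsageStats
    @UserID        int,
    @ApplicationID int,
    @Ptable        dbo.CategoryValueList READONLY   -- TVPs must be READONLY
AS
BEGIN
    SET NOCOUNT ON;

    UPDATE d
    SET d.Value = p.Value
    FROM dbo.Dtable AS d
    JOIN @Ptable AS p
      ON p.CategoryID = d.CategoryID
    WHERE d.UserID = @UserID
      AND d.ApplicationID = @ApplicationID;
END;
GO

-- Usage from T-SQL; a client would pass a DataTable as the @Ptable parameter.
DECLARE @stats dbo.CategoryValueList;
INSERT @stats (CategoryID, Value) VALUES (1, 10), (2, 25);
EXEC dbo.UpdateUsageStats @UserID = 1, @ApplicationID = 1, @Ptable = @stats;
```

This only updates rows that already exist in Dtable. If a category may be missing for a user and application, the UPDATE could be replaced with a MERGE that also inserts.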

Updating redundant/denormalized data automatically in SQL Server

I use a high level of redundant, denormalized data in my DB designs to improve performance. I'll often store data that would normally need to be joined or calculated. For example, if I have a User table and a Task table, I would store the Username and UserDisplayName redundantly in every Task record. Another example of this is storing aggregates, such as storing the TaskCount in the User table.
User
UserID
Username
UserDisplayName
TaskCount
Task
TaskID
TaskName
UserID
UserName
UserDisplayName
This is great for performance since the app has many more reads than insert, update or delete operations, and since some values like Username change rarely. However, the big drawback is that the integrity has to be enforced via application code or triggers. This can be very cumbersome with updates.
My question is: can this be done automatically in SQL Server 2005/2010, maybe via a persisted/permanent View? Would anyone recommend another possible solution or technology? I've heard document-based DBs such as CouchDB and MongoDB can handle denormalized data more effectively.

You might want to first try an Indexed View before moving to a NoSQL solution:
http://msdn.microsoft.com/en-us/library/ms187864.aspx
and:
http://msdn.microsoft.com/en-us/library/ms191432.aspx
Using an Indexed View would allow you to keep your base data in properly normalized tables and maintain data-integrity while giving you the denormalized "view" of that data. I would not recommend this for highly transactional tables, but you said it was heavier on reads than writes so you might want to see if this works for you.
Based on your two example tables, one option is:
1) Add a column to the User table defined as:
TaskCount INT NOT NULL DEFAULT (0)
2) Add a Trigger on the Task table defined as:
CREATE TRIGGER UpdateUserTaskCount
ON dbo.Task
AFTER INSERT, DELETE
AS
;WITH added AS
(
SELECT ins.UserID, COUNT(*) AS [NumTasks]
FROM INSERTED ins
GROUP BY ins.UserID
)
UPDATE usr
SET usr.TaskCount = (usr.TaskCount + added.NumTasks)
FROM dbo.[User] usr
INNER JOIN added
ON added.UserID = usr.UserID
;WITH removed AS
(
SELECT del.UserID, COUNT(*) AS [NumTasks]
FROM DELETED del
GROUP BY del.UserID
)
UPDATE usr
SET usr.TaskCount = (usr.TaskCount - removed.NumTasks)
FROM dbo.[User] usr
INNER JOIN removed
ON removed.UserID = usr.UserID
GO
3) Then do a View that has:
SELECT u.UserID,
u.Username,
u.UserDisplayName,
u.TaskCount,
t.TaskID,
t.TaskName
FROM dbo.[User] u
INNER JOIN dbo.Task t
ON t.UserID = u.UserID
And then follow the recommendations from the links above (WITH SCHEMABINDING, Unique Clustered Index, etc.) to make it "persisted". While it is inefficient to do an aggregation in a subquery in the SELECT as shown above, this specific case is intended to be denormalized in a situation that has higher reads than writes. So doing the Indexed View will keep the entire structure, including the aggregation, physically stored so each read will not recalculate it.
Now, if a LEFT JOIN is needed because some Users do not have any Tasks, then the Indexed View will not work due to the 5000 restrictions on creating them. In that case, you can create a real table (UserTask) that is your denormalized structure and have it populated either via a Trigger on just the User table (assuming you use the Trigger shown above, which updates the User table based on changes in the Task table), or you can skip the TaskCount field in the User table and just have Triggers on both tables to populate the UserTask table. In the end, this is basically what an Indexed View does, just without you having to write the synchronization Trigger(s).
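For the inner-join case, making the view from step 3 "persisted" might look like the sketch below. The view name dbo.vUserTask, the index name, and the column types are assumptions; the two tables are re-declared here only so the script runs on its own in an empty scratch database.

```sql
-- Indexed views need these session settings when created and modified.
SET ANSI_NULLS, ANSI_PADDING, ANSI_WARNINGS, ARITHABORT,
    CONCAT_NULL_YIELDS_NULL, QUOTED_IDENTIFIER ON;
SET NUMERIC_ROUNDABORT OFF;
GO

-- Assumed table definitions (types are guesses).
CREATE TABLE dbo.[User] (
    UserID          int           NOT NULL PRIMARY KEY,
    Username        nvarchar(50)  NOT NULL,
    UserDisplayName nvarchar(100) NOT NULL,
    TaskCount       int           NOT NULL DEFAULT (0)
);
CREATE TABLE dbo.Task (
    TaskID   int           NOT NULL PRIMARY KEY,
    TaskName nvarchar(100) NOT NULL,
    UserID   int           NOT NULL REFERENCES dbo.[User] (UserID)
);
GO

-- SCHEMABINDING requires two-part names and an explicit column list.
CREATE VIEW dbo.vUserTask
WITH SCHEMABINDING
AS
SELECT u.UserID,
       u.Username,
       u.UserDisplayName,
       u.TaskCount,
       t.TaskID,
       t.TaskName
FROM dbo.[User] AS u
INNER JOIN dbo.Task AS t
    ON t.UserID = u.UserID;
GO

-- The unique clustered index is what materializes the view.
-- TaskID is unique here because each task joins to exactly one user.
CREATE UNIQUE CLUSTERED INDEX IX_vUserTask_TaskID
    ON dbo.vUserTask (TaskID);
GO

-- NOEXPAND makes the query read the stored view rows directly.
SELECT UserID, Username, TaskCount, TaskID, TaskName
FROM dbo.vUserTask WITH (NOEXPAND);
```

On editions other than Enterprise, the optimizer does not use the view's index automatically, so the NOEXPAND hint in the last query is what makes the read hit the stored data.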

Copying select fields in one db table to another db table

I have two fields (Utter and Misery) in table Massconfusion in database A that I need to move to two fields (also named Utter and Misery) in a table also called Massconfusion in database B. There are two keys (Primary and subkey) that keep this data sorted correctly with the rest of the information in databases A and B.
(Basically, we somehow lost most of the information in the two fields and are trying to get it from an old copy of our db, and all of the easy methods of restoration have not worked.)
I am a total newbie at scripting in sql. So I am pleading, HELP! Thanks in advance.

UPDATE b
SET Utter = (SELECT a.Utter FROM A.dbo.MassConfusion a WHERE a.PrimeKey = b.PrimeKey)
FROM B.dbo.MassConfusion b
UPDATE b
SET Misery = (SELECT a.Misery FROM A.dbo.MassConfusion a WHERE a.PrimeKey = b.PrimeKey)
FROM B.dbo.MassConfusion b
You may be better off inserting into a new table, depending on the number of records and how messed up they are, though. UPDATE can be slow and expensive depending on how many indexes you have, etc.

I wasn't clear on exactly what the primary key for your MassConfusion table was. The first version assumes that the primary key for MassConfusion is just Primary. If the primary key is actually a composite of Primary and SubKey, then use the second version.
Version 1: Primary key consists of one column
/* Just to make it clear that this is run from Database B */
Use B
go
update MCB
set Utter = MCA.Utter,
Misery = MCA.Misery
from MassConfusion MCB
inner join A.dbo.MassConfusion MCA
on MCB.[Primary] = MCA.[Primary] /* PRIMARY is a reserved word, hence the brackets */
Version 2: Primary key is a composite of two columns
/* Just to make it clear that this is run from Database B */
Use B
go
update MCB
set Utter = MCA.Utter,
Misery = MCA.Misery
from MassConfusion MCB
inner join A.dbo.MassConfusion MCA
on MCB.[Primary] = MCA.[Primary] /* PRIMARY is a reserved word, hence the brackets */
and MCB.SubKey = MCA.SubKey
