SQL Delete vs Update - sql-server

I have seen something like this asked a number of times, but not quite in this configuration. I have a table that has a one-to-many relation.
Let’s say I have a computer table and a parts table. The user enters generic info in the computer table, then selects parts that are stored in the parts table, which relates to the computer table through computerId. So the original write is a simple insert. Now let’s say the user selects the computer again and changes the parts on the PC: adds some new ones, removes some, and updates a few. Then the user hits save to save the changes. I run a simple update on the computer table, but now the issue is the parts table.
Would it be better to delete all the records from the parts table for that computerId and then do a clean insert of all the selected parts?
Or should I run some method that looks at the existing parts in the table and, where a part has been updated, updates the record; where a part no longer exists, deletes it; and then inserts the remaining parts?

Clearly the simple solution is to delete all and then insert all.
The downside of this is SQL traffic, locks, and table fragmentation.
If it is a small table with only a few concurrent users, then fine.
In a high-volume environment I do the following.
There is no update; an unchanged item is simply ignored:
- delete items that are gone
- ignore any items that have not changed
- insert new items
You can do that in one pass with two or three statements.
Or you could define a stored procedure.
Do the delete before the insert to clear space first.
You can get really fancy and use an update in place of a delete/insert pair, but that gets more complex than it is worth in my mind. You would still need an insert or a delete if the item count is not the same.
delete comp_part
where compID = @compID and partID not in (....);
The insert is a little more tricky.
You can do it with a series of inserts and, if you have a PK, just let the duplicate inserts fail.
The other way is to create a #table and use it for both the delete and the insert.
This is only worth the hassle if you have a REALLY busy table.
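The delete-then-insert approach with a staging table can be sketched roughly as follows. This is only an illustration, not the answerer's actual code: it uses Python's built-in sqlite3 module in place of SQL Server, a temp table named wanted plays the role of the #table, and the function name sync_parts and the sample data are made up. The comp_part, compID and partID names come from the snippet above.

```python
import sqlite3

def sync_parts(conn, comp_id, selected_part_ids):
    """Make comp_part for comp_id match selected_part_ids with one delete and one insert."""
    selected = [(p,) for p in set(selected_part_ids)]
    with conn:  # commit on success, roll back on error
        # Stage the user's selection (hypothetical stand-in for the #table).
        conn.execute("CREATE TEMP TABLE IF NOT EXISTS wanted (partID INTEGER PRIMARY KEY)")
        conn.execute("DELETE FROM wanted")
        conn.executemany("INSERT INTO wanted (partID) VALUES (?)", selected)
        # Delete first: rows for this computer that are no longer selected.
        conn.execute(
            "DELETE FROM comp_part WHERE compID = ? "
            "AND partID NOT IN (SELECT partID FROM wanted)", (comp_id,))
        # Insert only parts that are not already present; unchanged rows are untouched.
        conn.execute(
            "INSERT INTO comp_part (compID, partID) "
            "SELECT ?, w.partID FROM wanted w "
            "WHERE NOT EXISTS (SELECT 1 FROM comp_part c "
            "WHERE c.compID = ? AND c.partID = w.partID)", (comp_id, comp_id))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comp_part (compID INTEGER, partID INTEGER, PRIMARY KEY (compID, partID))")
conn.executemany("INSERT INTO comp_part VALUES (?, ?)", [(1, 10), (1, 11), (1, 12), (2, 10)])
conn.commit()

sync_parts(conn, 1, [11, 12, 13])  # part 10 removed, part 13 added, 11 and 12 left alone
rows = conn.execute("SELECT compID, partID FROM comp_part ORDER BY compID, partID").fetchall()
print(rows)
```

Rows belonging to other computers are never touched, and rows that did not change are neither deleted nor rewritten, which is the point of the approach on a busy table.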

It all depends on the business model. If you want to track the transactions, then deleting them is not a good option. Keeping all the old transactions with your customers is beneficial for tracking purposes. Your CustomerID would be the primary key, and you can have another unique key such as PartOrderID, which will have a unique value for each insert.
Hope this helps

Really you should have three tables. Product, Part, and ProductPart; the ProductPart table would store the association of "this product has these parts". As far as updating, the simplest thing would be to delete all ProductParts for a given Product and re-insert the records you want.

Related

SQL Server: Best way to automatically stream updates to a summary table based on changes from another table

I am currently working in a SQL Server database where I have a table User that has a schema like so:
username | category
user1    | gaming
user2    | gaming
user3    | sports
My summary table UserCategoryCount is a simple GROUP BY statement for how many users belong to each category and looks like this:
category | numUsers
gaming   | 2
sports   | 1
New entries are constantly being uploaded to the User, and I want to be able to stream updates in the User table to the UserCategoryCount summary table. I am aware that I can create a simple VIEW statement that performs a groupby on the User table, but I would like UserCategoryCount to be its own table that automatically changes based on new users being uploaded to the User table.
My first thought was to create a trigger that will detect when the User table has been updated. So far, the most simple but cheesy solution I can think of is creating a trigger that simply deletes and refreshes UserCategoryCount:
CREATE TRIGGER TRG_Add_User
ON User
AS
BEGIN
DELETE FROM UserCategoryCount
INSERT INTO UserCategoryCount (category, numUsers)
SELECT Category, Count(Category) as numUsers
FROM User GROUP BY Category
END
GO
But this seems like a really hacky way of updating the UserCategoryCount table. Any help on how to improve this update statement so that I don't have to completely overwrite the table every time a new user or batch of users has been inserted would be greatly appreciated.
For a start, your trigger is seriously flawed: it does not use the inserted or deleted tables and instead recalculates the whole thing every time, which is going to be very bad for performance. It also does not specify whether it is for inserts, updates or deletes.
A much better solution is to use an indexed view. This is like a regular view, except that the server maintains the actual data on disk, and updates it in real-time whenever there are changes to the underlying tables.
CREATE OR ALTER VIEW dbo.UserCategoryCount
WITH SCHEMABINDING
AS
SELECT
u.Category,
COUNT_BIG(*) AS numUsers
FROM dbo.[User] u -- USER is a reserved word in T-SQL, so the table name is bracketed
GROUP BY u.Category;
GO
CREATE UNIQUE CLUSTERED INDEX CX_UserCategoryCount ON dbo.UserCategoryCount (Category);
There are some restrictions on indexed views, among them:
- They must be schema-bound, and therefore the underlying columns cannot be changed.
- All table references must use two-part names (schema and table).
- The only joins allowed are INNER or CROSS; no LEFT/RIGHT/FULL/APPLY, derived tables, CTEs or subqueries.
- If there is a GROUP BY, you must include COUNT_BIG(*), and the only other aggregate allowed is SUM.
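For comparison, here is a rough sketch of what an incremental trigger (one that touches only the affected category instead of rebuilding the summary) might look like. It is not the indexed-view approach recommended above: it runs on SQLite through Python's sqlite3 module, which has no indexed views, and SQLite's row-level NEW/OLD references take the place of SQL Server's inserted/deleted tables. The trigger names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE "User" (username TEXT PRIMARY KEY, category TEXT NOT NULL);
CREATE TABLE UserCategoryCount (category TEXT PRIMARY KEY, numUsers INTEGER NOT NULL);

-- On insert, bump only the affected category (create its row if missing).
CREATE TRIGGER trg_user_insert AFTER INSERT ON "User"
BEGIN
    INSERT OR IGNORE INTO UserCategoryCount (category, numUsers) VALUES (NEW.category, 0);
    UPDATE UserCategoryCount SET numUsers = numUsers + 1 WHERE category = NEW.category;
END;

-- On delete, decrement and drop categories that reach zero.
CREATE TRIGGER trg_user_delete AFTER DELETE ON "User"
BEGIN
    UPDATE UserCategoryCount SET numUsers = numUsers - 1 WHERE category = OLD.category;
    DELETE FROM UserCategoryCount WHERE category = OLD.category AND numUsers = 0;
END;
""")

conn.executemany('INSERT INTO "User" VALUES (?, ?)',
                 [("user1", "gaming"), ("user2", "gaming"), ("user3", "sports")])
conn.execute('DELETE FROM "User" WHERE username = ?', ("user3",))
conn.commit()

counts = conn.execute(
    "SELECT category, numUsers FROM UserCategoryCount ORDER BY category").fetchall()
print(counts)
```

This sketch does not handle a user changing category; that would need a third trigger for updates. Having to cover every such case by hand is a good part of why the indexed view is the more attractive option on SQL Server.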

Best practices to get a new row id from database

Hi, I am creating a project where I have 3 related tables that are connected through one table, like below:
table 1: id, name
table 2: id, tb1_id, random_thing
table 3: id, tb1_id, random_thing
I basically cannot go with an option where I create a row in table 1 first and then the rows in tables 2 and 3, because the client wants everything to be done with a single button. So I am creating a new blank row whenever the page is called, getting the new tb1_id, and then linking everything so it can be saved with a single button. The problem is the unused rows: I can delete them 2-3 days later, but that is ridiculous. Are there any best practices for getting around situations like this?
Edit
An explanation with an example would be really helpful. I am fine with any database or any language; the example just has to be good so I can understand how it is done. Sorry, but I am one of those guys who hates theory and loves practicals :d
The best practice is to use explicit foreign key relationships and transactions.
So, the basic idea is:
1) Begin a transaction.
2) Insert a row in table 1 with the name.
3) Get the id of the newly created row, ideally using a returning or output clause (depending on the database).
4) Insert a row in table 2.
5) Insert a row in table 3.
6) Commit the transaction.
When using transactions, just be careful to roll back the transaction if it does not complete for any reason.
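Since the question asked for a practical example, the steps above might look roughly like this. It is a sketch using Python's built-in sqlite3 module, chosen only because it runs anywhere. The table names table1/table2/table3, the function save_all and the NOT NULL constraint on table3 are assumptions made for the demonstration; cursor.lastrowid is SQLite's way of returning the generated id, where other databases would use a RETURNING or OUTPUT clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE table2 (id INTEGER PRIMARY KEY,
                     tb1_id INTEGER NOT NULL REFERENCES table1(id) ON DELETE CASCADE,
                     random_thing TEXT);
CREATE TABLE table3 (id INTEGER PRIMARY KEY,
                     tb1_id INTEGER NOT NULL REFERENCES table1(id) ON DELETE CASCADE,
                     random_thing TEXT NOT NULL);
""")

def save_all(conn, name, thing2, thing3):
    """Insert the parent row and both child rows as one unit; return the new parent id."""
    with conn:  # commits if the block succeeds, rolls back if it raises
        cur = conn.execute("INSERT INTO table1 (name) VALUES (?)", (name,))
        new_id = cur.lastrowid  # id generated for the row just inserted
        conn.execute("INSERT INTO table2 (tb1_id, random_thing) VALUES (?, ?)", (new_id, thing2))
        conn.execute("INSERT INTO table3 (tb1_id, random_thing) VALUES (?, ?)", (new_id, thing3))
    return new_id

new_id = save_all(conn, "first", "a", "b")
try:
    # The third insert violates NOT NULL, so the first two inserts of this call are undone.
    save_all(conn, "second", "x", None)
except sqlite3.IntegrityError:
    pass

parents = conn.execute("SELECT COUNT(*) FROM table1").fetchone()[0]
children = conn.execute("SELECT COUNT(*) FROM table2").fetchone()[0]
print(new_id, parents, children)
```

Because the single button click maps to one call of save_all, there is no need to create a blank row in advance, and a failed save leaves no orphaned rows to clean up later.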
As for deleting rows, you can have "cascading delete" options on the foreign key definitions, so if the parent row is deleted then the related rows are also deleted.
Some databases (notably Postgres) offer some functionality where you can put all this into a single statement using CTEs that modify the data. The idea is still the same, just easier to code.
I should note that there are perfectly reasonable alternatives. For instance, you could create a view on the "data" columns of the three tables and create insert/update/delete triggers on the view. Personally, I find that hiding this functionality in triggers makes it more difficult to understand and maintain. I think that is a personal opinion and this is also a reasonable approach.

How to improve performance when deleting entities from database?

I started an ASP.NET project with Entity Framework 4 for my DAL, using SQL Server 2008. In my database, I have a table Users that should have many rows (5.000.000 for example).
Initially I had my Users table designed like this:
Id uniqueidentifier
Name nvarchar(128)
Password nvarchar(128)
Email nvarchar(128)
Role_Id int
Status_Id int
I've modified my table, and added a MarkedForDeletion column:
Id uniqueidentifier
Name nvarchar(128)
Password nvarchar(128)
Email nvarchar(128)
Role_Id int
Status_Id int
MarkedForDeletion bit
Should I delete every entity each time, or use the MarkedForDeletion attribute? Using it means that I need to update the value and, at some later point, delete all users with the value set to true, with a stored procedure or something similar.
Wouldn't the update of the MarkedForDeletion attribute cost the same as a delete operation?
Depending on the requirements/needs/future needs of your system, consider moving your 'deleted' entities over to a new table. Set up an 'audit' table to hold those that are deleted. Consider the case where someone wants something 'restored'.
To your question on performance: would the update be the same cost as a delete? No. The update would be a much lighter operation, especially if you had an index on the PK (errrr, that's a guid, not an int). The point being that an update to a bit field is much less expensive. A (mass) delete would force a reshuffle of the data. Perhaps that job belongs during a downtime or a low-volume period.
Regarding performance: benchmark it to see what happens! Given your table with 5 million rows, it'd be nice to see how your SQL Server performs, in its current state of indexes, paging, etc, with both scenarios. Make a backup of your database, and restore into a new database. Here you can sandbox as you like. Run & time the scenarios:
- mass delete vs.
- update a bit or smalldatetime field vs.
- move to an audit table
In terms of books, try:
- this answer re: books
- a recommendation for Adam Machanic's book
- another question on database books.
This may depend on what you want to do with the information. For instance, you may want to mark a user for deletion but not delete all of their child records (say, something like forum posts). In this case you should mark for deletion or use a deleted-date field. If you do this, create a view for all active users (called ActiveUsers), then insist that the view be used in any query for login or wherever you only want to see the active users. That will help prevent query errors from forgetting to exclude the inactive ones. If your system is already live, do not make this change without going through and adjusting all queries that need to use the new view.
Another reason to use the second version is to prevent slowdowns when deleting large numbers of child records. They no longer need to be deleted if you use a deleted flag. This can help performance because fewer resources are needed. Additionally, you can flag records for deletion and then delete them in the middle of the night (or move them to a history table) to keep the main tables smaller while still not affecting performance during peak hours.
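A minimal sketch of the flag-plus-view idea, again using Python's sqlite3 module rather than SQL Server. The Users table is cut down to three columns and uses an integer Id instead of a uniqueidentifier for brevity; ActiveUsers is the view name suggested above, while the function names mark_for_deletion and purge_marked are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL,
                    MarkedForDeletion INTEGER NOT NULL DEFAULT 0);
-- Queries that should only see live users go through this view.
CREATE VIEW ActiveUsers AS
    SELECT Id, Name FROM Users WHERE MarkedForDeletion = 0;
""")
conn.executemany("INSERT INTO Users (Name) VALUES (?)", [("ann",), ("bob",), ("cy",)])
conn.commit()

def mark_for_deletion(conn, user_id):
    """Cheap 'delete' during peak hours: flip the flag, leave the row in place."""
    with conn:
        conn.execute("UPDATE Users SET MarkedForDeletion = 1 WHERE Id = ?", (user_id,))

def purge_marked(conn):
    """Off-peak job: physically remove flagged rows; returns how many were removed."""
    with conn:
        return conn.execute("DELETE FROM Users WHERE MarkedForDeletion = 1").rowcount

mark_for_deletion(conn, 2)
active = [name for (name,) in conn.execute("SELECT Name FROM ActiveUsers ORDER BY Id")]
purged = purge_marked(conn)
remaining = conn.execute("SELECT COUNT(*) FROM Users").fetchone()[0]
print(active, purged, remaining)
```

The purge could just as well copy the flagged rows into a history table before deleting them, if restoring users is a requirement.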

trigger insertions into same table

I have many tables in my database which are interrelated. I have a table (table one) into which data has been inserted, and its id auto-increments. Once that row has an ID, I want to insert it into another table (table three) together with another set of IDs that comes from a form (this data will also be going into a table, so it could come from that table). This is the same form that supplied the data for the first table.
The two IDs together make up the primary key of the third table.
How can I do this? It is meant to show that more than one ID is joined to a single ID for something else.
Thanks.
You can't do that through a trigger, as the trigger only has available to it the data that you have already inserted, not data that currently resides only in your user interface.
Normally, you handle this situation by writing a stored proc that inserts the meeting and returns the id value (using scope_identity() in SQL Server; I'm sure other databases have a method to return the auto-generated id as well). Then you use that value to insert into the other table, along with the other values you need for that table. You would of course want to wrap the whole thing in a transaction.
I think you can probably do what you're describing (just write the INSERTs to table 3 in the table 1 trigger), but you'll have to put the additional info for the table 3 rows into your table 1 row, which isn't very smart.
I can't see why you would do that instead of writing the INSERTs in your code, where someone reading it can see what's happening.
The trouble with triggers is that they make it easy to hide business logic in the database. I think (and I believe I'm in the majority here) that it's easier to understand, manage, maintain and generally all-round deal with an application where all the business rules exist in the same general area.
There are reasons to use triggers (for propagating denormalised values, for example), just as there are reasons for using stored procedures. I'm going to assert that they are largely related to performance-critical areas. Or should be.

Can I logically reorder columns in a table?

If I'm adding a column to a table in Microsoft SQL Server, can I control where the column is displayed logically in queries?
I don't want to mess with the physical layout of columns on disk, but I would like to logically group columns together when possible so that tools like SQL Server Management Studio list the contents of the table in a convenient way.
I know that I can do this through SQL Management Studio by going into their "design" mode for tables and dragging the order of columns around, but I'd like to be able to do it in raw SQL so that I can perform the ordering scripted from the command line.
You cannot do this programmatically (in a safe way, that is) without creating a new table.
What Enterprise Manager does when you commit a reordering is to create a new table, move the data and then delete the old table and rename the new table to the existing name.
If you want your columns in a particular order/grouping without altering their physical order, you can create a view which can be whatever you desire.
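The view option is easy to try out. Below is a small sketch using Python's sqlite3 module rather than SQL Server; the T_Testing table and its columns are borrowed from an example elsewhere in this thread, and the view name V_Testing is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE T_Testing (FirstName TEXT, LastName TEXT, PhoneNumber TEXT,
                        Email TEXT, Member_ID INTEGER PRIMARY KEY);
-- The view fixes the presentation order without touching the table itself.
CREATE VIEW V_Testing AS
    SELECT Member_ID, LastName, FirstName, PhoneNumber, Email FROM T_Testing;
""")

# cursor.description lists the result columns in the order the query returns them.
table_cols = [d[0] for d in conn.execute("SELECT * FROM T_Testing").description]
view_cols = [d[0] for d in conn.execute("SELECT * FROM V_Testing").description]
print(table_cols)
print(view_cols)
```

Tools and ad hoc queries that select from the view see the grouped order, while the table's stored column order stays as it was.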
I think what everyone here is missing is this: not everyone has to deal with 10s, 20s, or 1000s of instances of the same software system installed throughout the country and the world, but those of us who design commercially sold software do. We expand systems over time, adding fields to tables as new capability is needed, and those fields do belong in an existing table. After a decade of expanding, growing, and adding fields, we have to work with those tables in design, in support, and sometimes when digging into raw data to debug new functionality. It is incredibly aggravating not to have the primary information you want to see within the first handful of fields when you may have tables with 30, 40, 50, or even 90 fields, and yes, in a strictly normalized database.
I've often wished I could do this, for this exact reason. But there is no way short of doing exactly what SQL does: building a create script for a new table the way I want it, writing the insert into it, dropping all existing constraints, relationships, keys, indexes, etc. from the existing table, renaming the "new" table back to the old name, and then re-adding all those keys, relationships, indexes, etc.
It's not only tedious and time-consuming, but in five more years it will need to happen again.
It's so close to being worth that massive amount of work. The point, however, is that it won't be the last time we need this ability, since our systems will continue to grow, expand, and get fields in a whacked order driven by need and design additions.
A majority of developers think from a single system standpoint that serves a single company or very specific hard box market.
The "off-the-shelf" but significantly progressive designers and leaders of development in their market space will always have to deal with this problem, over and over, and would love a creative solution if anyone has one. This could easily save my company a dozen hours a week, just not having to scroll over, or remember where "that" field is in the source data table.
When Management Studio does it, it's creating a temporary table, copying everything across, dropping your original table and renaming the temporary table. There's no simple equivalent T-SQL statement.
If you don't fancy doing that, you could always create a view of the table with the columns in the order you'd like and use that?
Edit: beaten!
If I understand your question, you want to affect what columns are returned first, second, third, etc in existing queries, right?
If all of your queries are written with SELECT * FROM TABLE - then they will show up in the output as they are laid out in SQL.
If your queries are written with SELECT Field1, Field2 FROM TABLE - then the order they are laid out in SQL does not matter.
There is one way, but it is only temporary, for the query itself. For example:
Let's say you have a table with 5 columns.
The table is called T_Testing.
Its columns are FirstName, LastName, PhoneNumber, Email, and Member_ID.
You want it to list the ID, then LastName, then FirstName, then phone, then email.
You can do it in the Select:
Select Member_ID, LastName, FirstName, PhoneNumber, Email
From T_Testing
Other than that, if you just want the LastName to Show before first name for some reason, you can do it also as follows:
Select LastName, *
From T_Testing
The only thing you want to be sure of is that, if you are going to use a Where or Order By clause, the column is denoted as Table.Column.
Example:
Select LastName, *
From T_Testing
Order By T_Testing.LastName Desc
I hope this helps, I figured it out because I needed to do this myself.
1) Script your existing table to a query window.
2) Run this script against a test database (remove the Use statement).
3) Use SSMS to make the column changes you need.
4) Click Generate Change Script (the leftmost and bottommost icon on the button bar, by default).
5) Use this script against your real table.
All the script really does is create a second table with the desired column order, copy all your data into it, drop the original table and then rename the second table to take its place. This does save you writing it yourself, though, should you want a deploy script.
It is not possible to change the order of the columns without recreating the whole table. If you have a few instances of the database only, you can use SSMS for this (Select the table and click "design").
In case you have too many instances for a manual process, you should try this script:
https://github.com/Epaminaidos/reorder-columns
It can be done using SQL, by modifying the system tables directly. For example, look here:
Alter table - Add new column in between
However, I would not recommend playing with system tables, unless it's absolutely necessary.
Open your table in SSMS in design mode:
Reorder your columns:
It is important to not save your change.
Click the "Generate Change Script" button:
Now a window will open that contains the script to apply this change:
Copy the text from the window.
In this instance, it generated the following code:
/* To prevent any potential data loss issues, you should review this script in detail before running it outside the context of the database designer.*/
BEGIN TRANSACTION
SET QUOTED_IDENTIFIER ON
SET ARITHABORT ON
SET NUMERIC_ROUNDABORT OFF
SET CONCAT_NULL_YIELDS_NULL ON
SET ANSI_NULLS ON
SET ANSI_PADDING ON
SET ANSI_WARNINGS ON
COMMIT
BEGIN TRANSACTION
GO
CREATE TABLE dbo.Tmp_MyTable
(
Id int NOT NULL,
Name nvarchar(30) NULL,
Country nvarchar(50) NOT NULL
) ON [PRIMARY]
GO
ALTER TABLE dbo.Tmp_MyTable SET (LOCK_ESCALATION = TABLE)
GO
IF EXISTS(SELECT * FROM dbo.MyTable)
EXEC('INSERT INTO dbo.Tmp_MyTable (Id, Name, Country)
SELECT Id, Name, Country FROM dbo.MyTable WITH (HOLDLOCK TABLOCKX)')
GO
DROP TABLE dbo.MyTable
GO
EXECUTE sp_rename N'dbo.Tmp_MyTable', N'MyTable', 'OBJECT'
GO
COMMIT
As you can see, what it does is 1) create a new temporary table, 2) copy the data over to the temporary table, 3) delete the original table and 4) rename the temporary table to the original table's name.
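The same four steps can be tried safely outside SQL Server. The sketch below replays them on an in-memory SQLite database through Python's sqlite3 module. The starting column order (Id, Country, Name) and the sample rows are assumptions made for the demonstration, and the SQL Server specific parts of the generated script (SET options, lock hints, sp_rename) are replaced by their nearest SQLite equivalents.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode,
# so the transaction below is controlled explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE MyTable (Id INTEGER NOT NULL, Country TEXT NOT NULL, Name TEXT)")
conn.execute("INSERT INTO MyTable VALUES (1, 'NL', 'Anna'), (2, 'DE', NULL)")

conn.execute("BEGIN")
try:
    # 1) new table with the desired column order
    conn.execute("CREATE TABLE Tmp_MyTable (Id INTEGER NOT NULL, Name TEXT, Country TEXT NOT NULL)")
    # 2) copy the data, naming columns explicitly so the order change cannot mix them up
    conn.execute("INSERT INTO Tmp_MyTable (Id, Name, Country) "
                 "SELECT Id, Name, Country FROM MyTable")
    # 3) drop the original, 4) rename the new table into its place
    conn.execute("DROP TABLE MyTable")
    conn.execute("ALTER TABLE Tmp_MyTable RENAME TO MyTable")
    conn.execute("COMMIT")
except Exception:
    conn.execute("ROLLBACK")
    raise

columns = [d[0] for d in conn.execute("SELECT * FROM MyTable").description]
rows = conn.execute("SELECT * FROM MyTable ORDER BY Id").fetchall()
print(columns, rows)
```

On a real table the rebuild also has to recreate indexes, constraints, triggers and permissions, which this sketch leaves out and which the script generated by SSMS handles for you.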
