Simple SQL Server Update query taking over 15 mins

I have a SQL Server database table which contains 5 columns and 3 rows. Schema description:
cId (int) is the id of the table; it is not an identity column, but its values are just 1, 2, and 3 for now.
cIdFoot (int) is a foreign key into another static data table that has 4 rows.
cwId (int) is another foreign key, into a table with lots of columns but just 1 row.
2 description columns, which are of type varchar(MAX).
Although cIdFoot and cwId logically reference other tables, those tables are nearly empty, so in this table the columns are currently not declared as foreign keys.
My update is this:
UPDATE myTable
SET desc1 = '2018', desc2 = '2018'
WHERE (cId = 1)
This simple SQL statement takes over 6 minutes, and the table has just 3 rows.
Selects seem fine, but when I try to update it takes forever and usually times out. I have noticed that the log (.ldf) file does not seem to grow; could it be that it is corrupted or was not created properly? Strangely, bigger tables with over 10,000 rows update and insert correctly.
Any help would be appreciated.
The execution plan is attached as a screenshot.

Based on your comments, SPID 85 is blocking your update query. Your update will not finish until whatever that process is doing completes, or it releases the lock your update needs.
If you kill SPID 85 the update will run, but you need to figure out what that process is doing and why it is holding a lock on the table. It is not likely to stop acquiring a lock on this table (and thus blocking your update) until you fix it.
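As a starting point (this is a diagnostic sketch, not part of the original answer), the standard DMVs show what session 85 is currently executing; the VIEW SERVER STATE permission is required:
SELECT r.session_id, r.status, r.command, r.wait_type, r.blocking_session_id, t.text
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.session_id = 85;

DBCC INPUTBUFFER(85); -- the last statement the session submitted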

Related

Using a SQL Server trigger to find what procedure makes an update

I'm working with a decently large software platform that uses Microsoft SQL Server 2008 R2. I'm investigating a very rare bug where something in the database is updating ContactInfo's primary key (CID) to 0. The table has a two-column (composite) primary key; CID relates to a primary key in another table that stores contact information.
Is there a way to make an existing update trigger capture which stored procedure is making an update to the table? Or even better, is there a way to capture Profiler data in an audit table, such as the stored procedure execution statement with input parameters? We could continuously run a Profiler trace to try to catch the update in real time, but the infrequency of the bug would mean storing at least a few hundred gigabytes of trace data, which is not an option.
Below is my code for the existing audit table. There are 28 columns, so I just replaced them with [columns] for simplicity and space.
ALTER TRIGGER [dbo].[wt_ContactInfo_U]
ON [dbo].[ContactInfo]
FOR UPDATE AS
INSERT INTO [dbo].[ContactInfo_AUDIT] ( /* columns */ )
SELECT /* columns */
FROM Deleted D
WHERE CHECKSUM( /* D.[columns] */ ) NOT IN (SELECT CHECKSUM( /* I.[columns] */ )
                                            FROM Inserted I
                                            WHERE I.[CID] = D.[CID])
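One hedged option (an addition, not part of the original trigger): inside the trigger body you can capture the text of the batch or procedure that fired the update via the execution DMVs and write it to an extra audit column. This requires VIEW SERVER STATE, and the @CallingSql variable plus any audit column for it are hypothetical:
DECLARE @CallingSql nvarchar(max);

SELECT @CallingSql = t.text
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.session_id = @@SPID;

-- @CallingSql now holds the text of the batch or procedure that performed
-- the update and can be written to an extra column of ContactInfo_AUDIT.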

Can I write a SQL script to update all the rows for a single table in SQL Server, but in batches?

I have a large amount of rows in a single table, in our Microsoft SQL Server.
I need to add a new column (DateTime) to the table. Fine. The column needs to be NOT NULL. Ok, fine. Now, this is the problem:
It takes too long to set all column values for all the rows
I tried creating the column as NULLABLE first. That took a nanosecond. Then I UPDATEd all the rows, setting the value to GETUTCDATE().
e.g.
DECLARE @Foo DATETIME = GETUTCDATE();
UPDATE PewPew
SET NewField = @Foo;
After 2 hours, I had to cancel the query. Yeah, 2 hours. I also did NOT do this in a TRANSACTION. I then dropped the new field and we were back to where we started.
I was thinking: could we
1. Create the column NULLABLE
2. In batches of 1000 or so, UPDATE the top 1000 rows where NewField is NULL
3. Once all is done, alter the column to make it NOT NULL
The SQL Server is on Azure - it's a Standard D13 (8 cores, 56 GB memory) VM, and I think we put it on SSDs, so I'd like to think the hardware isn't too bad.
Footnote: large amount of rows = I think it's about ~20 million. That's sorta large to us, but not large to some. We get that.
SELECT 1; -- primes @@ROWCOUNT so the loop body runs at least once
WHILE (@@ROWCOUNT <> 0)
BEGIN
    UPDATE TOP (1000) PewPew SET NewField = GETUTCDATE() WHERE NewField IS NULL;
END
This will update 1000 rows at a time.
What if you add the new column as NOT NULL and with a default constraint?
ALTER TABLE dbo.YourTable
ADD NewColumn DATETIME2(3) NOT NULL
CONSTRAINT DF_YourTable_NewColumn DEFAULT (SYSUTCDATETIME())
That should add the column, as NOT NULL, and immediately fill in the default value. Not sure if that'll run any faster than your UPDATE, though - worth a shot.

ORACLE table performance basics

Complete newbie to Oracle DBA-ing, and yet trying to migrate a SQL Server DB (2008 R2) to Oracle (11g - total DB size only ~20 GB)...
I'm having a major problem with my largest single table (~30 million rows). Rough structure of the table is:
CREATE TABLE TableW (
WID NUMBER(10,0) NOT NULL,
PID NUMBER(10,0) NOT NULL,
CID NUMBER(10,0) NOT NULL,
ColUnInteresting1 NUMBER(3,0) NOT NULL,
ColUnInteresting2 NUMBER(3,0) NOT NULL,
ColUnInteresting3 FLOAT NOT NULL,
ColUnInteresting4 FLOAT NOT NULL,
ColUnInteresting5 VARCHAR2(1024 CHAR),
ColUnInteresting6 NUMBER(3,0) NOT NULL,
ColUnInteresting7 NUMBER(5,0) NOT NULL,
CreatedDate DATE NOT NULL,
ModifiedDate DATE NOT NULL,
CreatedByUser VARCHAR2(20 CHAR),
ModifiedByUser VARCHAR2(20 CHAR)
);
ALTER TABLE TableW ADD CONSTRAINT WPrimaryKey PRIMARY KEY (WID)
ENABLE;
CREATE INDEX WClusterIndex ON TableW (PID);
CREATE INDEX WCIDIndex ON TableW (CID);
ALTER TABLE TableW ADD CONSTRAINT FKTableC FOREIGN KEY (CID)
REFERENCES TableC (CID) ON DELETE CASCADE
ENABLE;
ALTER TABLE TableW ADD CONSTRAINT FKTableP FOREIGN KEY (PID)
REFERENCES TableP (PID) ON DELETE CASCADE
ENABLE;
Running some basic tests, it seems a simple 'DELETE FROM TableW WHERE PID=13455' takes a huge amount of time (~880 s) to execute what should be a quick delete (~350 rows). [Query run via SQL Developer.]
Generally, the performance of this table is noticeably worse than its SQL Server equivalent. There are no issues under SQL Server, and the structure of this table and the surrounding ones looks sensible for Oracle by comparison.
My problem is that I cannot find a useful set of diagnostics to start looking for where the problem lies. Any queries / links greatly appreciated.
[The above is a request for help based on the assumption it should not take anything like 10 minutes to delete 350 rows from a table with 30 million records, when it takes SQL Server <1s to do the same for an equivalent DB structure]
EDIT:
The migration is being performed thus:
1. In SQL Developer:
- Create the Oracle user, tablespace, grants etc. (as SYS)
- Create the tables, sequences, triggers etc. (as the new user)
2. Via some Java:
- Check SQL-Oracle structure consistency
- Disable all foreign keys
- Move data (truncate destination table; select from old, insert into new)
- Adjust sequences to the correct starting value
- Enable foreign keys
If you are asking how to improve the performance, there are several options:
Parallel DML
Partitioning
Parallel DML uses all the resources you have to perform the operation: Oracle runs several threads to complete it, and other sessions have to wait for the end of the operation because system resources are busy.
Partitioning lets you drop old sections right away. For example, say your table stores data from 2000 to 2014; most likely you don't need the old records, so you can split your table into several partitions and drop the oldest ones.
Check the wait events for your session that's doing the DELETE. That will tell you what your main bottleneck is.
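As a hedged starting point (assuming you can query the V$ views), something like this run from a second session shows what the deleting session is waiting on; the SID value is a placeholder:
SELECT sid, event, state, wait_time_micro
FROM v$session
WHERE sid = 123; -- the SID of the session running the DELETE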
And echoing Marco's comment above: make sure your table stats are up to date; that will help the optimizer build a good plan to run those queries for you.
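For example (the owner name is a placeholder; the table name is taken from the question):
EXEC DBMS_STATS.GATHER_TABLE_STATS(ownname => 'NEWUSER', tabname => 'TABLEW');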
To wrap up (and in case anyone else finds this):
The correct question to ask to find the solution was: what tables do you have referencing this one?
The problem was another table (let's call it TableV) using WID as a foreign key, where the WID column in TableV was not indexed. This meant that for every record deleted in TableW, the whole of TableV had to be scanned for associated records to cascade-delete. As TableV has >3 million rows, deleting the small set of 350 rows in TableW meant the Oracle server reading a total of >1 billion rows.
A single index added on WID in TableV, and the delete statement now takes <1 s.
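In other words, the fix amounted to a single statement (the index name is illustrative):
CREATE INDEX VWIDIndex ON TableV (WID);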
Thanks to all for the comments - a lot learnt about Oracle's inner workings!

Does ALTER TABLE ALTER COLUMN interrupt ongoing db access?

I need to alter a column in a table so that it is no longer NVARCHAR(256) but NVARCHAR(MAX). I know the command to do this (ALTER TABLE ... ALTER COLUMN ... NVARCHAR(MAX)). My question is really about disruption: I have to do this in a production environment, and I was wondering whether, while I carry it out on the live environment, there may be some disruption to users. Will users who are using the database at the time be booted off? Is this operation likely to take too long?
Thanks,
Sachin
I've deleted my previous answer, which claimed this would be a metadata-only change, and am submitting a new one with an entirely different conclusion!
Whilst a metadata-only change is what happens when widening up to nvarchar(4000), in the case of changing to nvarchar(max) the operation seems extremely expensive. SQL Server adds a new variable-length column and copies the previously existing data, which likely means a time-consuming blocking operation resulting in many page splits and both internal and logical fragmentation.
This can be seen from the below
CREATE TABLE T
(
    Foo int IDENTITY(1,1) PRIMARY KEY,
    Bar NVARCHAR(256) NULL
);

INSERT INTO T (Bar)
SELECT TOP 4 REPLICATE(CHAR(64 + ROW_NUMBER() OVER (ORDER BY (SELECT 0))), 50)
FROM sys.objects;

ALTER TABLE T ALTER COLUMN Bar NVARCHAR(MAX) NULL;
Looking at the page afterwards in SQL Server Internals Viewer (screenshot omitted) shows that the white '41 00 ...' bytes are wasted space left over from the previous version of the column.
Ongoing queries will not be interrupted, but the ALTER has to wait until it can take an exclusive table lock before the column can be altered.
While the alteration is carried out, no queries can use the table, so if there are a lot of records in the table, the database will seem unresponsive to any queries that need the table.
The advice has to be: make a backup and do it out of hours if you can.
That having been said, I would not expect your database to be disrupted by the change, and it should not take very long.
What about your client software? How will that be affected?
It should be fine unless you have a massive number of rows (millions). Yes, it will lock the table while it's updating, but pending requests will just wait on it.

How do you add a NOT NULL Column to a large table in SQL Server?

To add a NOT NULL Column to a table with many records, a DEFAULT constraint needs to be applied. This constraint causes the entire ALTER TABLE command to take a long time to run if the table is very large. This is because:
Assumptions:
The DEFAULT constraint modifies existing records. This means that the db needs to increase the size of each record, which causes it to shift records on full data-pages to other data-pages and that takes time.
The DEFAULT update executes as an atomic transaction. This means that the transaction log will need to be grown so that a roll-back can be executed if necessary.
The transaction log keeps track of the entire record. Therefore, even though only a single field is modified, the space needed by the log will be based on the size of the entire record multiplied by the number of existing records. For example, updating one column across 10 million 1 KB rows implies on the order of 10 GB of log. This means that adding a column to a table with small records will be faster than adding a column to a table with large records, even if the total number of records is the same for both tables.
Possible solutions:
Suck it up and wait for the process to complete. Just make sure to set the timeout period to be very long. The problem with this is that it may take hours or days to do depending on the # of records.
Add the column but allow NULL. Afterward, run an UPDATE query to set the DEFAULT value for existing rows. Do not update everything in one statement; update batches of records at a time or you'll end up with the same problem as solution #1. The problem with this approach is that you end up with a column that allows NULL when you know this is an unnecessary option. I believe there are best-practice documents out there that say you should not have columns that allow NULL unless it's necessary.
Create a new table with the same schema. Add the column to that schema. Transfer the data over from the original table. Drop the original table and rename the new table. I'm not certain how this is any better than #1.
Questions:
Are my assumptions correct?
Are these my only solutions? If so, which one is the best? If not, what else could I do?
I ran into this problem at my work as well, and my solution is along the lines of #2.
Here are my steps (I am using SQL Server 2005):
1) Add the column to the table with a default value:
ALTER TABLE MyTable ADD MyColumn varchar(40) DEFAULT('')
2) Add a NOT NULL constraint with the NOCHECK option. NOCHECK means the constraint is not checked against existing values:
ALTER TABLE MyTable WITH NOCHECK
ADD CONSTRAINT MyColumn_NOTNULL CHECK (MyColumn IS NOT NULL)
3) Update the values incrementally in table:
GO
UPDATE TOP(3000) MyTable SET MyColumn = '' WHERE MyColumn IS NULL
GO 1000
The UPDATE statement will only update a maximum of 3000 records at a time, which lets a chunk of data be saved at a time. I have to use "MyColumn IS NULL" because my table does not have a sequential primary key.
GO 1000 will execute the previous batch 1000 times, covering 3 million records; if you need more, just increase this number. Once no NULL rows remain, the remaining batches simply update 0 records.
Here's what I would try:
Do a full backup of the database.
Add the new column, allowing nulls - don't set a default.
Set SIMPLE recovery, which truncates the tran log as soon as each batch is committed.
The SQL is: ALTER DATABASE XXX SET RECOVERY SIMPLE
Run the update in batches as you discussed above, committing after each one.
Reset the new column to no longer allow nulls.
Go back to the normal FULL recovery.
The SQL is: ALTER DATABASE XXX SET RECOVERY FULL
Backup the database again.
The use of the SIMPLE recovery model doesn't stop logging, but it significantly reduces its impact, because the server can reuse the log space once each committed batch has been checkpointed.
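A rough sketch of steps 2-6 (the database, table, and column names are illustrative; each UPDATE batch auto-commits because there is no explicit transaction):
ALTER TABLE dbo.BigTable ADD NewCol DATETIME NULL;

ALTER DATABASE MyDb SET RECOVERY SIMPLE;

WHILE 1 = 1
BEGIN
    UPDATE TOP (5000) dbo.BigTable
    SET NewCol = GETUTCDATE()
    WHERE NewCol IS NULL;

    IF @@ROWCOUNT = 0 BREAK; -- stop once no NULL rows remain
END

ALTER TABLE dbo.BigTable ALTER COLUMN NewCol DATETIME NOT NULL;

ALTER DATABASE MyDb SET RECOVERY FULL;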
You could:
Start a transaction.
Grab a write lock on your original table so no one writes to it.
Create a shadow table with the new schema.
Transfer all the data from the original table.
execute sp_rename to rename the old table out.
execute sp_rename to rename the new table in.
Finally, you commit the transaction.
The advantage of this approach is that your readers will be able to access the table during the long process and that you can perform any kind of schema change in the background.
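A minimal sketch of that sequence (all names are illustrative; TABLOCK plus HOLDLOCK takes and holds a shared table lock, which blocks writers while readers continue):
BEGIN TRANSACTION;

-- Take and hold a shared table lock for the duration of the transaction.
SELECT TOP (1) 1 FROM dbo.MyTable WITH (TABLOCK, HOLDLOCK);

-- Copy rows into the shadow table (assumed to have been created beforehand
-- with the new schema, including the new NOT NULL column).
INSERT INTO dbo.MyTable_Shadow (Id, Payload, NewColumn)
SELECT Id, Payload, '' FROM dbo.MyTable;

-- Swap the names so the shadow table takes over.
EXEC sp_rename 'dbo.MyTable', 'MyTable_Old';
EXEC sp_rename 'dbo.MyTable_Shadow', 'MyTable';

COMMIT TRANSACTION;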
Just to update this with the latest information.
In SQL Server 2012 this can now be carried out as an online operation in the following circumstances:
Enterprise Edition only
The default must be a runtime constant
For the second requirement, examples would be a literal constant or a function such as GETDATE() that evaluates to the same value for all rows. A default of NEWID() would not qualify and would still end up updating all rows there and then.
For defaults that qualify SQL Server evaluates them and stores the result as the default value in the column metadata so this is independent of the default constraint which is created (which can even be dropped if no longer required). This is viewable in sys.system_internals_partition_columns. The value doesn't get written out to the rows until next time they happen to get updated.
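For instance, under those conditions an add like the following should be a metadata-only, effectively instant operation (the names are illustrative):
ALTER TABLE dbo.BigTable
ADD CreatedUtc datetime NOT NULL
CONSTRAINT DF_BigTable_CreatedUtc DEFAULT (GETUTCDATE());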
More details about this here: online non-null with values column add in sql server 2012
Admittedly this is an old question. My colleague recently told me that he was able to do it in one single ALTER TABLE statement on a table with 13.6M rows; it finished within a second on SQL Server 2012. I was able to confirm the same on a table with 8M rows. Did something change in later versions of SQL Server?
Alter table mytable add mycolumn char(1) not null default('N');
I think this depends on the SQL flavor you are using, but what if you took option 2, and at the very end altered the column to NOT NULL with the default value?
Would it be fast, since it sees that all the values are not null?
If you want the column in the same table, you'll just have to do it. Now, option 3 is potentially the best for this because you can still have the database "live" while this operation is going on. If you use option 1, the table is locked while the operation happens and then you're really stuck.
If you don't really care whether the column is in the same table, then I suppose a segmented approach is the next best. Though I really try to avoid that (to the point that I don't do it), because then, as Charles Bretana says, you'll have to make sure to find all the places that update/insert that table and modify those. Ugh!
I had a similar problem, and went for your option #2.
It takes 20 minutes this way, as opposed to 32 hours the other way!!! Huge difference, thanks for the tip.
I wrote a full blog entry about it, but here's the important sql:
Alter table MyTable
Add MyNewColumn char(10) null default '?';
go
update MyTable set MyNewColumn='?' where MyPrimaryKey between 0 and 1000000
go
update MyTable set MyNewColumn='?' where MyPrimaryKey between 1000000 and 2000000
go
update MyTable set MyNewColumn='?' where MyPrimaryKey between 2000000 and 3000000
go
..etc..
Alter table MyTable
Alter column MyNewColumn char(10) not null;
And the blog entry if you're interested:
http://splinter.com.au/adding-a-column-to-a-massive-sql-server-table
I had a similar problem and went with a modified #3 approach. In my case the database was in SIMPLE recovery mode, and the table to which the column was to be added was not referenced by any FK constraints.
Instead of creating a new table with the same schema and copying the contents of the original table, I used the SELECT ... INTO syntax.
According to Microsoft (http://technet.microsoft.com/en-us/library/ms188029(v=sql.105).aspx)
The amount of logging for SELECT ... INTO depends on the recovery model in effect for the database. Under the simple recovery model or bulk-logged recovery model, bulk operations are minimally logged. With minimal logging, using the SELECT ... INTO statement can be more efficient than creating a table and then populating the table with an INSERT statement. For more information, see Operations That Can Be Minimally Logged.
The sequence of steps:
1. Move data from the old table to the new one while adding the new column with its default:
SELECT t.*, CAST('default' AS nvarchar(256)) AS new_column
INTO table_copy
FROM [table] AS t
2. Drop the old table:
DROP TABLE [table]
3. Rename the newly created table:
EXEC sp_rename 'table_copy', 'table'
4. Create the necessary constraints and indexes on the new table.
In my case the table had more than 100 million rows and this approach completed faster than approach #2 and log space growth was minimal.
1) Add the column to the table with a default value:
ALTER TABLE MyTable ADD MyColumn int default 0
2) Update the values incrementally in the table (same effect as accepted answer). Adjust the number of records being updated to your environment, to avoid blocking other users/processes.
declare @rowcount int = 1
while (@rowcount > 0)
begin
    UPDATE TOP (10000) MyTable SET MyColumn = 0 WHERE MyColumn IS NULL
    set @rowcount = @@ROWCOUNT
end
3) Alter the column definition to require not null. Run the following at a moment when the table is not in use (or schedule a few minutes of downtime). I have successfully used this for tables with millions of records.
ALTER TABLE MyTable ALTER COLUMN MyColumn int NOT NULL
I would use a CURSOR instead of a single UPDATE. The cursor will update all matching records in a batch, record by record; it takes time, but it does not lock the table.
If you want to avoid locks, use WAITFOR DELAY between updates.
Also, I am not sure that a DEFAULT constraint on its own changes existing rows; probably it is the NOT NULL constraint used together with the DEFAULT that causes the case described by the author. If it does, add the NOT NULL at the end.
So the pseudocode will look like:
-- Without the NOT NULL constraint -- we will add it at the end
ALTER TABLE [table] ADD new_column INT DEFAULT 0;

DECLARE fillNullColumn CURSOR LOCAL FAST_FORWARD FOR
    SELECT [key]
    FROM [table] WITH (NOLOCK)
    WHERE new_column IS NULL;

OPEN fillNullColumn;

DECLARE @key INT;

FETCH NEXT FROM fillNullColumn INTO @key;

WHILE @@FETCH_STATUS = 0
BEGIN
    UPDATE [table] WITH (ROWLOCK)
    SET new_column = 0 -- default value
    WHERE [key] = @key;

    WAITFOR DELAY '00:00:05'; -- wait 5 seconds; keep in mind this updates only 12 rows per minute

    FETCH NEXT FROM fillNullColumn INTO @key;
END

CLOSE fillNullColumn;
DEALLOCATE fillNullColumn;

ALTER TABLE [table] ALTER COLUMN new_column INT NOT NULL;
I am sure there are some syntax errors, but I hope this helps to solve your problem.
Good luck!
Vertically segment the table. This means you will have two tables, with the same primary key and exactly the same number of records: one will be the table you already have, and the other will have just the key plus the new non-null column (with its default value).
Modify all insert, update, and delete code so they keep the two tables in sync. If you want, you can create a view that joins the two tables together into a single logical combination that appears like one table to client SELECT statements; see the sketch below.
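A hedged sketch of that layout (all names are illustrative, and the Payload column stands in for the existing columns):
CREATE TABLE dbo.MyTable_Ext
(
    Id INT NOT NULL PRIMARY KEY REFERENCES dbo.MyTable (Id),
    NewCol INT NOT NULL CONSTRAINT DF_MyTable_Ext_NewCol DEFAULT (0)
);
GO

-- A view that joins the two segments back into one logical table for SELECTs.
CREATE VIEW dbo.MyTableCombined
AS
SELECT t.Id, t.Payload, e.NewCol
FROM dbo.MyTable AS t
JOIN dbo.MyTable_Ext AS e ON e.Id = t.Id;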
