Oracle: do Truncates maintain Atomicity within a transaction? - database

Oracle 10g -- due to a compatibility issue with a 9i database, I'm pulling data through a 10g database (to be used by an 11g database) using INSERT INTO...SELECT statements via a scheduled job that runs every 15 minutes. I notice that TRUNCATE statements are much faster than DELETE statements and have read that a 'downside' to DELETE statements is that they never decrease the table high-water mark. My use for this data is purely read-only -- UPDATEs and INSERTs are never issued against the tables in question.
Given the above, I want to avoid the possible situation where my 'working' database (Oracle 11g) attempts to read from a table on my staging database (10g) that is empty for a period of time because the TRUNCATE happened straight away and the INSERT INTO...SELECT from the 9i database is taking a couple of minutes to complete.
So, I'm wondering if that is how Oracle handles TRUNCATEs within a transaction, or if the whole operation is performed and COMMITted, despite the fact that TRUNCATEs can't be rolled back? Or, put another way, from an external SELECT point of view, if I wrap a TRUNCATE and INSERT INTO...SELECT on a table in a transaction, will the table ever appear empty to an external SELECT reading from the table?

In Oracle, TRUNCATE is a DDL statement, so it implicitly commits the current transaction before it runs and commits again when it finishes. It therefore cannot be combined with the INSERT into one atomic transaction, and it cannot be rolled back. If you use TRUNCATE, you have a window when the table is truncated (empty) but the INSERT operation has not completed. This is not what you wanted, but it is what Oracle provides.
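If the slower DELETE is acceptable, one way to avoid that empty window is to keep the delete and the reload in a single transaction, since Oracle's read consistency lets other sessions keep seeing the old rows until the COMMIT. This is only a sketch: staging_table, source_table, the columns, and the database link nine_i are hypothetical names, not taken from the question.

```sql
-- Oracle sketch; all object names here are hypothetical.
-- Other sessions continue to see the old rows until the COMMIT below.
DELETE FROM staging_table;

INSERT INTO staging_table (id, payload)
SELECT id, payload
FROM source_table@nine_i;

-- The old rows disappear and the new rows appear at the same moment.
COMMIT;
```

The DELETE does not lower the high-water mark, but a conventional INSERT can reuse the freed space below it, so the table should not keep growing from one refresh to the next.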

You can use partition exchange. Have two partitions in the staging table: p_OLD and p_NEW.
Before the insert, do a partition exchange "new"->"old" and truncate the "new" partition. (At this point, if you select from the table you see the old data.)
Insert the data into the "new" partition, then truncate the "old" partition. (At this point you see the new data.)
With this approach your table is never empty to the onlooker.
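A common variant of this idea, sketched here under assumptions, loads a separate table and swaps it into a single partition, because Oracle's EXCHANGE PARTITION swaps a partition with a standalone table rather than with another partition. The names staging_data, p_all, staging_load, source_table and the database link nine_i are hypothetical, and partitioning has to be available in your Oracle edition.

```sql
-- Assumes staging_data is a partitioned table with one partition, p_all,
-- and staging_load is a non-partitioned table with identical columns.
TRUNCATE TABLE staging_load;

INSERT INTO staging_load (id, payload)
SELECT id, payload
FROM source_table@nine_i;

COMMIT;

-- Data dictionary swap: queries against staging_data should see the old rows
-- before this statement and the new rows after it.
ALTER TABLE staging_data
  EXCHANGE PARTITION p_all WITH TABLE staging_load
  WITHOUT VALIDATION;
```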
Why do you need 3 Oracle environments?

Related

How to rollback a table in SQL Server

A table was truncated by mistake, and a few records have since been inserted into it. Could you please suggest how to get the previous contents of the table back?
RESTORE YOUR BACKUP
One of the advantages of a truncate is that it leaves almost no footprint in the log. A delete works row by row, while a truncate is a lot faster because it deallocates the data pages all at once.
After the commit, there's nothing left to be rolled back.
EDIT
If there is an identity column in that table, it will also be reset, so any new inserts will "duplicate" the truncated ids.
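The identity reset can be seen with a small throwaway table. This is a minimal sketch; dbo.IdentDemo and its columns are hypothetical names used only for illustration.

```sql
-- Throwaway table with an identity column seeded at 1.
CREATE TABLE dbo.IdentDemo (id int IDENTITY(1,1), v int);

INSERT INTO dbo.IdentDemo (v) VALUES (10), (20);  -- gets ids 1 and 2

TRUNCATE TABLE dbo.IdentDemo;                     -- resets the identity to its seed

INSERT INTO dbo.IdentDemo (v) VALUES (30);

-- The new row reuses id 1, which previously belonged to the row with v = 10.
SELECT id, v FROM dbo.IdentDemo;

DROP TABLE dbo.IdentDemo;
```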

We only have SQL Server Standard edition so no access to snapshot functionality. Options?

We only have SQL Server Standard edition so I can't use the Snapshot functionality. Before spending the time, I just want to know if the following is possible (or if there is a better way), please:
At the end of every month I need to take a snapshot of the month and store it in table B. The following month I take another snapshot and append that snapshot's data to table B. And so on....
Is it possible to create a stored procedure to run at the end of every month that stores the snapshot data into a temp table A. Then using another stored procedure, take data from temp table A and append to table B? The second procedure can have a drop table A.
Cheers.
Yes, it is possible.
If I understand you, more or less, this is what you want:
Lock the table
Select everything into a staging table
Move everything from that staging table into your destination
You can lock the entire table (this will prevent changes, but can lead to deadlocks).
INSERT INTO stagingTable (
... -- field list
)
SELECT
... -- field list
FROM
myTable WITH (TABLOCK)
;
TABLOCK will place a shared lock on the table which will be released when the statement is executed (READ COMMITTED isolation level) or after the transaction is committed/rolled back (SERIALIZABLE).
If you want to keep the lock during the whole transaction, you can add the HOLDLOCK hint too, which switches the isolation level to serializable for the object, thus the lock will be released after COMMIT. Don't forget to start a transaction and commit/roll it back.
You can also use TABLOCKX, which is an exclusive lock that prevents all other processes from acquiring a lock on the table or on anything at lower levels (pages, rows, etc.) in the table. This will prevent concurrent reads too!
You can also let SQL Server decide which lock it wants to use (i.e. omit the hint); in this case SQL Server may choose more granular locks (such as page or row locks) instead of locking the whole table.
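Put together, the monthly snapshot could look roughly like the sketch below. The column names (Id, Amount, SnapshotDate) and the table name dbo.TableB are assumptions for illustration; stagingTable and myTable follow the example above.

```sql
BEGIN TRANSACTION;

-- TABLOCK + HOLDLOCK: shared table lock kept until COMMIT,
-- so myTable cannot change between the two steps.
INSERT INTO stagingTable (Id, Amount)
SELECT Id, Amount
FROM myTable WITH (TABLOCK, HOLDLOCK);

-- Append this month's snapshot to the history table.
INSERT INTO dbo.TableB (SnapshotDate, Id, Amount)
SELECT CAST(GETDATE() AS date), Id, Amount
FROM stagingTable;

COMMIT TRANSACTION;
```

If the snapshot is a single INSERT ... SELECT straight into table B, the staging step and the HOLDLOCK hint are probably unnecessary, because one statement already reads a consistent set of rows under the table lock.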

Is Log Sequence Number (LSN) unique for database or table in SQL Server?

I am using SQL Server CDC to track changes for multiple tables. I have a program which collects the data from each CDC table, and I want to report these changes in the right sequence. I want to make sure that all the changes happening to these tables are reported in the correct order. Can I rely on the LSN for the right sequence?
The LSN number is unique for a given transaction but is not globally unique. If you have multiple records within the same transaction they will all share the same __$start_lsn value in cdc. If you want the correct order of operations you need to sort by __$start_lsn, __$seqval, then __$operation. The __$seqval represents the id of the individual operation within the wrapping transaction.
For example, I have a table in the dbo schema named foo. It has one column y. If I run this statement:
INSERT INTO dbo.foo VALUES (1);
INSERT INTO dbo.foo VALUES (2);
Then I will see two separate LSN values in cdc because these are in two separate transactions. If I run this:
BEGIN TRAN
INSERT INTO dbo.foo VALUES (1);
INSERT INTO dbo.foo VALUES (2);
COMMIT TRAN
Then I will see one LSN value for both records, but they will have different __$seqval values, and the seqval for my first record will be less than the seqval for my second record.
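As a sketch of that ordering, assuming the table was enabled for CDC with the default capture instance name (which would make the change table cdc.dbo_foo_CT), the query could look like this:

```sql
-- Assumes the default capture instance dbo_foo; adjust the change table name
-- if a custom capture instance was specified when CDC was enabled.
SELECT __$start_lsn,
       __$seqval,
       __$operation,
       y
FROM cdc.dbo_foo_CT
ORDER BY __$start_lsn,   -- commit LSN of the transaction
         __$seqval,      -- order of the change within the transaction
         __$operation;   -- orders the before/after images of an update
```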
LSN is unique, ever increasing within the database, across all tables in that database.
In most cases the LSN value is unique across all tables; however, I found instances where one single LSN value belonged to changes in 40 tables. I don't know the SQL script that was associated with those changes, but I know that all the operations were 'INSERT'.
I'm not sure if it is a bug. The CDC documentation is poor and covers just the basics. Not many users know that the CDC capture process has many bugs confirmed by MS for both SQL 2014 & 2016 (we have an open case).
So I would not rely on the documentation. It may be wrong in some scenarios. It's better to implement more checks and test with a large volume of different combinations of changes.
I also encountered that scenario. From my experience, in your first example two transactions happened, so you really will get 2 different LSNs. In your second example, you have only 1 transaction with 2 statements inside. CDC will count it as only 1 transaction since it is inside BEGIN TRAN and COMMIT TRAN. I can't provide links since this is my personal experience.

What is the fastest way to insert data to MS SQL database without locking it?

I've a running system where data is inserted periodically into MS SQL DB and web application is used to display this data to users.
During the data insert users should be able to continue to use the DB; unfortunately I can't redesign the whole system right now. Every 2 hours 40k-80k records are inserted.
Right now the process looks like this:
Temp table is created
Data is inserted into it using plain INSERT statements (parameterized queries or stored procedures should improve the speed).
Data is pumped from temp table to destination table using INSERT INTO MyTable(...) SELECT ... FROM #TempTable
I think that such an approach is very inefficient. I see that the insert phase can be improved (bulk insert?), but what about transferring data from the temp table to the destination?
This is what we did a few times. Rename your table as TableName_A. Create a view that selects from that table. Create a second table exactly like the first one (TableName_B). Populate it with the data from the first one. Now set up your import process to populate the table that is not being referenced by the view. Then change the view to select from that table instead. Total downtime to users: a few seconds. Then repopulate the first table. It is actually easier if you can truncate and populate the table, because then you don't need that last step, but that may not be possible if your input data is not a complete refresh.
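The swap step might look like the following sketch. The names dbo.MyTable (the view), dbo.MyTable_A, dbo.MyTable_B and the columns Id and Payload are hypothetical; it assumes the view currently selects from dbo.MyTable_A and that the import is a complete refresh.

```sql
-- 1. Refresh the table the view is NOT pointing at; users are unaffected.
TRUNCATE TABLE dbo.MyTable_B;

INSERT INTO dbo.MyTable_B (Id, Payload)
SELECT Id, Payload
FROM #TempTable;
GO

-- 2. Repoint the view. ALTER VIEW has to start its own batch, hence the GO.
ALTER VIEW dbo.MyTable
AS
SELECT Id, Payload
FROM dbo.MyTable_B;
GO
```

On the next import the roles are reversed: load dbo.MyTable_A, then alter the view to select from it again.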
You cannot avoid locking when inserting into the table. Even with BULK INSERT this is not possible.
But clients that want to access this table during the concurrent INSERT operations can do so by changing the transaction isolation level to READ UNCOMMITTED or by executing the SELECT command with the WITH (NOLOCK) table hint.
The INSERT command will still lock the table/rows but the SELECT command will then ignore these locks and also read uncommitted entries.
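Both variants are shown in the sketch below; dbo.MyTable and its columns are hypothetical names. Keep in mind that either form can return rows that are later rolled back.

```sql
-- Variant 1: table hint on a single query.
SELECT Id, Payload
FROM dbo.MyTable WITH (NOLOCK);

-- Variant 2: isolation level for every later query in this session.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

SELECT Id, Payload
FROM dbo.MyTable;
```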

Is deleting all records in a table a bad practice in SQL Server?

I am moving a system from a VB/Access app to SQL server. One common thing in the access database is the use of tables to hold data that is being calculated and then using that data for a report.
eg.
delete from treporttable
insert into treporttable (.... this thing and that thing)
Update treporttable set x = x * price where (...etc)
and then report runs from treporttable
I have heard that SQL server does not like it when all records from a table are deleted, as it creates huge logs etc. I tried temp SQL tables, but they don't persist long enough for the report, which runs in a different process, to run and report off of them.
There are a number of places where this is done to different report tables in the application. The reports can be run many times a day and have a large number of records created in the report tables.
Can anyone tell me if there is a best practise for this or if my information about the logs is incorrect and this code will be fine in SQL server.
If you do not need to log the deletion activity you can use the truncate table command.
From books online:
"TRUNCATE TABLE is functionally identical to DELETE statement with no WHERE clause: both remove all rows in the table. But TRUNCATE TABLE is faster and uses fewer system and transaction log resources than DELETE."
http://msdn.microsoft.com/en-us/library/aa260621(SQL.80).aspx
delete from sometable
Is going to allow you to roll back the change. So if your table is very large, this can cost a lot of memory usage and time.
However, if you have no fear of failure then:
truncate table sometable
Will perform nearly instantly, and with minimal memory requirements. There is no rollback though.
To Nathan Feger:
You can rollback from TRUNCATE. See for yourself:
CREATE TABLE dbo.Test(i INT);
GO
INSERT dbo.Test(i) SELECT 1;
GO
BEGIN TRAN
TRUNCATE TABLE dbo.Test;
SELECT i FROM dbo.Test;
ROLLBACK
GO
SELECT i FROM dbo.Test;
GO
i
(0 row(s) affected)
i
1
(1 row(s) affected)
You could also DROP the table, and recreate it...if there are no relationships.
In SQL Server both the [DROP TABLE] and [TRUNCATE TABLE] statements can be rolled back when they run inside an explicit transaction.
So it depends on your schema which direction you want to go!!
Also, use SQL Profiler to analyze your execution times. Test it out and see which is best!!
The answer depends on the recovery model of your database. If you are in full recovery mode, then you have transaction logs that could become very large when you delete a lot of data. However, if you're backing up transaction logs on a regular basis to free the space, this might not be a concern for you.
Generally speaking, if the transaction logging doesn't matter to you at all, you should TRUNCATE the table instead. Be mindful, though, of any key seeds, because TRUNCATE will reseed the table.
EDIT: Note that even if the recovery model is set to Simple, your transaction logs will grow during a mass delete. The transaction logs will just be cleared afterward (without releasing the space). The idea is that DELETE will create a transaction even temporarily.
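To see which case applies, you can check the recovery model of the current database and how full each log is. This is a small sketch using built-in catalog views and commands, run in the database that holds the report tables.

```sql
-- Recovery model (FULL, BULK_LOGGED or SIMPLE) of the current database.
SELECT name, recovery_model_desc
FROM sys.databases
WHERE name = DB_NAME();

-- Log size and percentage in use for every database on the instance;
-- run it before and after a mass DELETE to compare.
DBCC SQLPERF(LOGSPACE);
```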
Consider using temporary tables. Their names start with # and they are deleted when nobody refers to them. Example:
create table #myreport (
    id int identity(1,1),
    col1 int, -- placeholder type; use the real column types here
    ...
)
Temporary tables are made to be thrown away, and that happens very efficiently.
Another option is using TRUNCATE TABLE instead of DELETE. A truncate logs only the page deallocations, so it grows the log file far less than a DELETE of the same rows.
I think your example has a possible concurrency issue. What if multiple processes are using the table at the same time? Adding a JOB_ID column or something like it would allow you to clear the relevant entries in this table without clobbering the data being used by another process.
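A rough sketch of the JOB_ID idea follows. The JobId column, the dbo.SourceData table and the x and price columns are assumptions for illustration, and the job id has to be handed to the reporting process somehow (for example as a report parameter).

```sql
-- Each run tags its own rows with a fresh id.
DECLARE @JobId uniqueidentifier = NEWID();

INSERT INTO treporttable (JobId, x, price)
SELECT @JobId, s.x, s.price
FROM dbo.SourceData AS s;

UPDATE treporttable
SET x = x * price
WHERE JobId = @JobId;

-- The report reads only the rows of its own run.
SELECT x, price
FROM treporttable
WHERE JobId = @JobId;

-- Cleanup removes this run's rows and leaves other runs untouched.
DELETE FROM treporttable
WHERE JobId = @JobId;
```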
Actually tables such as treporttable do not need to be recovered to a point of time. As such, they can live in a separate database with simple recovery mode. That eases the burden of logging.
There are a number of ways to handle this. First, you can move the creation of the data into the running of the report itself. This, I feel, is the best way to handle it: you can then use temp tables to stage your data temporarily, and no one will have concurrency issues if multiple people try to run the report at the same time. Depending on how many reports we are talking about, it could take some time to do this, so you may need another short-term solution as well.
Second, you could move all your reporting tables to a different db that is set to simple recovery mode and truncate them before running your queries to populate them. This is closest to your current process, but it could be an issue if multiple users try to run the same report.
Third, you could set up a job to populate the tables (still in a separate db set to simple recovery) once a day (truncating at that time). Then anyone running a report that day will see the same data and there will be no concurrency issues. However, the data will not be up to the minute. You could also set up a reporting data warehouse, but that is probably overkill in your case.
