Recommended order of updating a database

Suppose that when updating a database there are DELETED records, UPDATED records, and INSERTED (new) records. What would be the recommended order for performing the updates?
I currently do DELETE, UPDATE, INSERT. My reason is as follows:
DELETED records are not in the DB anymore (at least from the user's point of view) so they should be removed first.
UPDATED records should go next because they are modifying existing data.
INSERTED records go last to fill the DB with the newest records.
Is this sequence satisfactory, or would a different order be better?

This sequence is satisfactory; you don't need to change it. Deleting first also has the practical benefit of freeing unique key values that an inserted row might reuse.
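A minimal sketch of the batch (table and column names are invented here; T-SQL flavor), run inside a single transaction so a failure part-way through leaves the database untouched:

BEGIN TRANSACTION;

-- 1) DELETE first: frees unique key values that a new row might reuse
DELETE FROM target_table
WHERE id IN (SELECT id FROM staged_deletes);

-- 2) UPDATE existing rows next
UPDATE t
SET t.payload = s.payload
FROM target_table t
JOIN staged_updates s ON s.id = t.id;

-- 3) INSERT new rows last
INSERT INTO target_table (id, payload)
SELECT id, payload FROM staged_inserts;

COMMIT;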

Related

Find out the recently selected rows from an Oracle table, and can I update a LAST_ACCESSED column whenever the table is accessed

I have a database table which has more than 1 million records, uniquely identified by a GUID column. I want to find out which of these rows were selected (retrieved) in the last 5 years. The select query can happen from multiple places: sometimes the row will be returned as a single row, sometimes as part of a set of rows. A select query does the fetching over a JDBC connection from Java code, and a SQL procedure also fetches data from the table.
My intention is to clean up the table. I want to delete all rows that were never used (retrieved via a select query) in the last 5 years.
Does Oracle DB have any built-in metadata which can give me this information?
My alternative solution was to add a LAST_ACCESSED column and update it whenever I select a row from this table. But this is a costly operation in terms of the time taken for the whole process; at least 1,000-10,000 records will be selected from the table in a single operation. Is there any more efficient way to do this than updating the table after reading it? Mine is a multi-threaded application, so updating such a large data set may result in deadlocks or a long wait for the next read query.
Any elegant solution to this problem?
Oracle Database 12c introduced a new feature called Automatic Data Optimization that brings you Heat Maps to track table access (modifications as well as read operations). Be careful: the feature currently has to be licensed under the Advanced Compression Option or the In-Memory Option.
Heat Maps track whenever a database block has been modified and whenever a segment, i.e. a table or table partition, has been accessed. They do not track select operations per individual row, nor per individual block, because the overhead would be too heavy (data is generally read often and concurrently; having to keep a counter for each row would quickly become a very costly operation). However, if you have your data partitioned by date, e.g. you create a new partition for every day, you can over time easily determine which days are still read and which ones can be archived or purged. Note that Partitioning is also an option that needs to be licensed.
Once you have reached that conclusion you can then either use In-Database Archiving to mark rows as archived or just go ahead and purge the rows. If you happen to have the data partitioned you can do easy DROP PARTITION operations to purge one or many partitions rather than having to do conventional DELETE statements.
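A rough sketch of those pieces in 12c (object names are invented; the dictionary view and archival column are from the 12c documentation, but verify them and the licensing before relying on this):

-- Enable Heat Map tracking instance-wide
ALTER SYSTEM SET HEAT_MAP = ON;

-- Later: check when each segment (table or partition) was last written or read
SELECT object_name, subobject_name, segment_write_time, segment_read_time
FROM dba_heat_map_segment
WHERE object_name = 'BIG_TABLE';

-- In-Database Archiving: mark cold rows as archived instead of deleting them
ALTER TABLE big_table ROW ARCHIVAL;
UPDATE big_table SET ora_archive_state = '1'
WHERE created_date < ADD_MONTHS(SYSDATE, -60);

-- Or, if partitioned by date, purge a whole cold partition at once
ALTER TABLE big_table DROP PARTITION p_2011;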
I couldn't use any built-in solutions. I tried the solutions below:
1) The DB audit feature for select statements.
2) Adding a trigger to update a date column whenever a select query is executed on the table.
Both were discarded: auditing uses up a lot of space and has a performance hit, and similarly the trigger had a performance hit.
Finally I resolved the issue by maintaining a separate table into which entries older than 5 years that are still used (selected in a query) are inserted. While deleting, I cross-check this table and avoid deleting entries present in it.
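The purge then becomes a cross-checked delete, sketched here with invented table and column names:

-- still_used_guids holds the GUIDs seen in queries during the last 5 years
DELETE FROM big_table b
WHERE b.created_date < ADD_MONTHS(SYSDATE, -60)
AND NOT EXISTS (SELECT 1 FROM still_used_guids u WHERE u.guid = b.guid);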

How to find out the rows affected in SQL Profiler or trace?

I'm using tracing to log all delete or update queries run through the system. The problem is, if I run a query like DELETE FROM [dbo].[Artist] WHERE ArtistId>280, I know how many rows were deleted but I'm unable to find out which rows were deleted (the data they had).
I'm thinking of doing this as a logging system so it would be useful to see which rows were affected and what data they had if at all possible. I don't really want to use triggers for this job but I will if I have to (and if it's feasible).
If you need the original data and are planning on storing all the deleted data in a separate table, why not just logically delete the original data rather than physically delete it? i.e.
UPDATE dbo.Artist SET Artist_deleted = 1 WHERE ArtistId>280
Then you only need to add one column to your current table rather than creating new tables and scripts to support them. You could then partition the current table on the deleted flag if you are worried about disk space, performance, etc.
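A minimal sketch of that setup (the flag, view, and column names are just examples):

-- Add the soft-delete flag once
ALTER TABLE dbo.Artist ADD Artist_deleted BIT NOT NULL DEFAULT 0;

-- 'Deletes' become updates, so the data is retained
UPDATE dbo.Artist SET Artist_deleted = 1 WHERE ArtistId > 280;

-- Readers go through a view that hides the deleted rows
CREATE VIEW dbo.ActiveArtist AS
SELECT ArtistId, ArtistName   -- hypothetical columns
FROM dbo.Artist
WHERE Artist_deleted = 0;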

SQL Server Trigger Isolation / Scope Documentation

I have been looking for definitive documentation regarding the isolation level (or concurrency, or scope... I'm not sure EXACTLY what to call it) of triggers in SQL Server.
I have found the following sources which indicate that what I believe is true (which is to say that two users, executing updates to the same table --even the same rows-- will then have independent and isolated triggers executed):
https://social.msdn.microsoft.com/Forums/sqlserver/en-US/601977fb-306c-4888-a72b-3fbab6af0cdc/effects-of-concurrent-trigger-firing-on-inserted-and-deleted-tables?forum=transactsql
https://social.msdn.microsoft.com/forums/sqlserver/en-US/b78c3e7b-6b98-48e1-ad43-3c773c79a6ff/trigger-and-inserted-table
The first question is essentially the same question I am trying to find the answer to, but the answer given doesn't provide any sources. The second question also hits near the mark, and the answer is the same, but again, no sources are provided.
Can someone point me to where the available documentation makes the same assertions?
Thanks!
Well, Isolation Level and Scope are two very different things.
Isolation Level
Triggers operate within a transaction. By default, that transaction uses the default isolation level of READ COMMITTED. However, if the calling process has specified a different isolation level, that overrides the default. And, as usual, if desired you should be able to override the level within the trigger itself.
According to the MSDN page for DML Triggers:
The trigger and the statement that fires it are treated as a single transaction, which can be rolled back from within the trigger. If a severe error is detected (for example, insufficient disk space), the entire transaction automatically rolls back.
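For example, a sketch of overriding the level inside the trigger body itself (the trigger, table, and validation read are all hypothetical):

CREATE TRIGGER trg_Artist_Upd ON dbo.Artist
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    -- Raise the isolation level for work done inside this trigger;
    -- it reverts to the caller's level when the trigger ends
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

    DECLARE @rows INT;
    SELECT @rows = COUNT(*) FROM dbo.Artist;   -- example read at SERIALIZABLE
END;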
Scope
The context provided is:
{from you}
two users, executing updates to the same table --even the same rows
{from the first linked MSDN article in the Question that is "essentially the same question I am trying to find the answer to"}
Are the inserted and deleted tables scoped to the current session? In other words will they only contain the inserted and deleted records for the current scope, or will they contain the records for all current update operations against the same table? Can there even be truely concurrent operations or will locks prevent this?
Before getting into the inserted and deleted tables it should be made very clear that there will only ever be a single DML operation happening on a particular row at any given moment. Two or more requests might come in at the exact same nanosecond, but all requests will take their turn, one at a time (and yes, due to locking).
Now, regarding what is in the inserted and deleted tables: Yes, only the rows for that particular event will be (and even can be) in those two pseudo-tables. If you execute an UPDATE that will modify 5 rows, only those 5 rows will be in the inserted and deleted tables. And since you are looking for documentation, the MSDN page for Use the inserted and deleted Tables states:
The deleted table stores copies of the affected rows during DELETE and UPDATE statements. During the execution of a DELETE or UPDATE statement, rows are deleted from the trigger table and transferred to the deleted table. The deleted table and the trigger table ordinarily have no rows in common.
The inserted table stores copies of the affected rows during INSERT and UPDATE statements. During an insert or update transaction, new rows are added to both the inserted table and the trigger table. The rows in the inserted table are copies of the new rows in the trigger table.
Tying this back to the other part of the question, the part relating to the Transaction Isolation Level: the Transaction Isolation Level has absolutely no effect on the inserted and deleted tables, as they pertain specifically to that event/query. However, the net effect of that operation, which is captured in those two pseudo-tables, can still be visible to other processes if they are using the READ UNCOMMITTED Isolation Level or the NOLOCK table hint.
And just to clarify something, the MSDN page linked above regarding the inserted and deleted tables states at the very beginning that they are "in memory" but that is not exactly correct. Starting in SQL Server 2005, those two pseudo-tables are actually based in tempdb. The MSDN page for the tempdb Database states:
The tempdb system database is a global resource that is available to all users connected to the instance of SQL Server and is used to hold the following:
...
Row versions that are generated by data modification transactions for features, such as: online index operations, Multiple Active Result Sets (MARS), and AFTER triggers.
Prior to SQL Server 2005, the inserted and deleted tables were read from the Transaction Log (I believe).
To summarize, the inserted and deleted tables:
operate within a Transaction
are static (i.e. read-only) tables
are visible to only the current Trigger
only contain rows for the specific event/operation/query that fired that instance of that Trigger
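To tie the scope rules to something concrete, here is a sketch of an audit trigger (the audit table and all names are invented) that sees only its own statement's rows, regardless of what other sessions are doing:

CREATE TABLE dbo.ArtistAudit
(
    ArtistId  INT,
    Operation CHAR(1),                             -- 'U' or 'D'
    ChangedAt DATETIME2 DEFAULT SYSUTCDATETIME()
);
GO

CREATE TRIGGER trg_Artist_Audit ON dbo.Artist
AFTER UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;
    -- deleted holds only the rows affected by the statement that fired
    -- this trigger instance, never rows from other sessions' statements
    INSERT INTO dbo.ArtistAudit (ArtistId, Operation)
    SELECT d.ArtistId,
           CASE WHEN EXISTS (SELECT 1 FROM inserted) THEN 'U' ELSE 'D' END
    FROM deleted d;
END;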

fn_cdc_map_lsn_to_time returns NULL for some cases

I am trying to use CDC in my SQL Server database. I use a stored procedure to move the data from the temporary CDC tables to new tables, and I also move the start_lsn stamp. In the new tables I try using fn_cdc_map_lsn_to_time to map the LSN to a time; for some records it returns NULL and for others it returns the correct time. This is an example:
NULL 0x000034D5000002F80001
2014-01-16 00:38:39.377 0x0000350F000006D70001
NULL 0x00003513000003BA0001 2 0x3FFFFF
2014-01-18 02:00:05.320 0x0000351E000009FA0001 2 0x3FFFFF
Is there any explanation?
Thanks
I found the answer and I just want to add it for the record.
The problem here is that I moved the records from the CDC tables to my own tables to prevent them from being deleted.
fn_cdc_map_lsn_to_time uses a system table (cdc.lsn_time_mapping) to map the LSN from the CDC tables to a timestamp. After the retention period, CDC cleanup deletes those mapping records. That is why some of the records were mapped correctly to a time, because they were recently added, but the old ones were not mapped because their mapping entries had already been deleted.
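So the robust fix is to resolve the time while the mapping rows still exist, i.e. at the moment the data is copied out of the CDC tables, and store it alongside the LSN. A sketch (the archive table and capture instance names are invented):

INSERT INTO dbo.cdc_archive (start_lsn, change_time)
SELECT ct.__$start_lsn,
       -- resolve now, while cdc.lsn_time_mapping still holds the entry
       sys.fn_cdc_map_lsn_to_time(ct.__$start_lsn)
FROM cdc.dbo_MyTable_CT AS ct;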

Database update

Scenario:
I have Database1 (PostgreSQL). For this database:
i) When a record is deleted, the status column for that record is changed to INACTIVE.
ii) When a record is updated, the current record is rendered INACTIVE and a new record is inserted.
iii) Insertion happens as usual.
There is a timestamp column for each record in all the tables in the database.
I have another database, Database2 (SQLite), which is synced with Database1 and follows the same behavior as Database1.
Database1 gets changed regularly, and I would get the CSV files for all the tables. The CSV would include all the data, including new insertions and updates.
Requirement:
I need to make the data in Database1 consistent with the new CSV.
i) For the records that are not in the CSV but are in Database1 (DELETED RECORDS), I have to set their status to INACTIVE.
ii) For the records that are in the CSV but not in Database1 (INSERTED RECORDS), I need to insert these records.
iii) For the records that are updated in the CSV (UPDATED RECORDS), I need to set the status of the existing record to INACTIVE and insert the new version.
Kindly help me with the logical implementation of these!!!
Thanks
Jayakrishnan
I assume you're looking to build software to achieve what you want, not looking for an off-the-shelf solution.
What environments are you able to develop in? C? PHP? Java? C#?
Lots of options in many environments that can all read/write from CSV/SQLite/PostgreSQL.
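Whatever the language, the database-side logic can be sketched in plain SQL (PostgreSQL flavor; the items table, its columns, and the CSV path are all invented for illustration): load the CSV into a staging table, then apply the three rules:

-- 1) Load the CSV into a staging table
CREATE TEMP TABLE staging (id integer, payload text);
COPY staging (id, payload) FROM '/path/to/export.csv' WITH (FORMAT csv, HEADER true);

-- 2) DELETED: active rows missing from the CSV become INACTIVE
UPDATE items i SET status = 'INACTIVE'
WHERE i.status = 'ACTIVE'
AND NOT EXISTS (SELECT 1 FROM staging s WHERE s.id = i.id);

-- 3) UPDATED: active rows whose data changed become INACTIVE
UPDATE items i SET status = 'INACTIVE'
FROM staging s
WHERE s.id = i.id AND i.status = 'ACTIVE'
AND s.payload IS DISTINCT FROM i.payload;

-- 4) INSERTED (plus the new versions of updated rows): CSV rows
--    with no active counterpart are inserted as ACTIVE
INSERT INTO items (id, payload, status, updated_at)
SELECT s.id, s.payload, 'ACTIVE', now()
FROM staging s
WHERE NOT EXISTS
      (SELECT 1 FROM items i WHERE i.id = s.id AND i.status = 'ACTIVE');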
You could use an ON DELETE trigger to override the existing delete behavior.
This strikes me as dangerous, however. Someone is going to rely on this, and then when the trigger isn't there, actual deletions will occur... It's better to encapsulate this behind a view or something and put a trigger on that, or go through a stored procedure or something.
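For reference, the trigger pattern being warned about looks something like this in PostgreSQL (names invented); returning NULL from the BEFORE DELETE function suppresses the physical delete:

CREATE OR REPLACE FUNCTION soft_delete_items() RETURNS trigger AS $$
BEGIN
    UPDATE items SET status = 'INACTIVE' WHERE id = OLD.id;
    RETURN NULL;   -- cancels the physical DELETE of this row
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER items_soft_delete
BEFORE DELETE ON items
FOR EACH ROW EXECUTE PROCEDURE soft_delete_items();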
