Scenario:
I have Database1 (PostgreSQL), which behaves as follows: i) When a record is deleted, the status column for that record is changed to inactive. ii) When a record is updated, the current record is marked INACTIVE and a new record is inserted. iii) Insertion happens as usual. Every table in the database has a timestamp column for each record.
I have another database, Database2 (SQLite), which is synced with Database1 and follows the same rules as Database1.
Database1 changes regularly, and I receive CSV files for all of its tables. Each CSV includes all the data, including new insertions and updates.
Requirement:
I need to make the data in Database1 consistent with the new CSV.
i) Records that are in Database1 but not in the CSV (DELETED RECORDS): I have to set their status to inactive.
ii) Records that are in the CSV but not in Database1 (INSERTED RECORDS): I need these records to be inserted.
iii) Records that are updated in the CSV: I need to set the status of the existing record to inactive and insert a new record.
Kindly help me with the logical implementation of these!!!
Thanks
Jayakrishnan
I assume you're looking to build software to achieve what you want, not looking for an off-the-shelf solution.
What environments are you able to develop in? C? PHP? Java? C#?
Lots of options in many environments that can all read/write from CSV/SQLite/PostgreSQL.
You could use an ON DELETE trigger to override the existing delete behavior.
This strikes me as dangerous however. Someone is going to rely on this and then when the trigger isn't there, you will have actual deletions occur... It's better to encapsulate this behind a view or something and put a trigger on that. Or go through a stored procedure or something.
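The three requirements in the question can be sketched as a diff between the CSV snapshot and the currently active rows. This is a minimal sketch under stated assumptions, not a definitive implementation: it uses Python with SQLite, and the table `items`, its columns, and the business key `item_key` (something that identifies a record across versions) are all hypothetical names that do not come from the question.

```python
import csv
import io
import sqlite3


def sync_table(conn, csv_text, now):
    """Apply a full CSV snapshot to the hypothetical `items` table."""
    # Assumption: item_key identifies a record across versions.
    incoming = {r["item_key"]: r["payload"]
                for r in csv.DictReader(io.StringIO(csv_text))}
    active = dict(conn.execute(
        "SELECT item_key, payload FROM items WHERE status = 'ACTIVE'"))

    with conn:  # one transaction: the sync is applied fully or not at all
        for key, payload in active.items():
            # Missing from the CSV (deleted) or changed (updated): retire the row.
            if key not in incoming or incoming[key] != payload:
                conn.execute(
                    "UPDATE items SET status = 'INACTIVE', changed_at = ? "
                    "WHERE item_key = ? AND status = 'ACTIVE'", (now, key))
        for key, payload in incoming.items():
            # New in the CSV (inserted) or changed (updated): add a fresh row.
            if key not in active or active[key] != payload:
                conn.execute(
                    "INSERT INTO items (item_key, payload, status, changed_at) "
                    "VALUES (?, ?, 'ACTIVE', ?)", (key, payload, now))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, item_key TEXT, "
             "payload TEXT, status TEXT, changed_at TEXT)")
conn.executemany(
    "INSERT INTO items (item_key, payload, status, changed_at) "
    "VALUES (?, ?, 'ACTIVE', '2020-01-01')",
    [("a", "1"), ("b", "2"), ("c", "3")])
conn.commit()

# "a" is unchanged, "b" is updated, "c" is gone, "d" is new.
snapshot = "item_key,payload\na,1\nb,20\nd,4\n"
sync_table(conn, snapshot, now="2020-02-01")
rows = sorted(conn.execute("SELECT item_key, payload, status FROM items"))
```

Comparing whole payloads keeps the sketch short; with many columns, comparing a hash of each row or the timestamp column would serve the same purpose.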
Is there something similar to master-slave replication, but at the table level within a database?
For example, I have the following scenario:
I have a table with millions of records, because the system is more than 15 years old.
I only want to show the records of the last year (2019-2020).
I decided to create a view over that table that only shows the records in that one-year range.
Thanks to the view, that screen of the system loads faster, because fewer records are loaded.
The problem: what if the user adds a new record to the table that contains millions of records? How do I make my view change when the underlying table is modified?
I think I can use triggers to update the view, but is there a feature in Oracle that gives me something like master-slave, where the "slave" table is updated as the "master" table changes?
First of all, you have misunderstood views. A view is not a physical table and does not store any data. If you insert data into a view, you are actually inserting into the source table.
Since the view is not physical, it only filters the data at query time. By itself, this does not give any performance benefit.
For big tables, you can use partitioning, which can drastically improve performance. And if you still need archival, you can archive the partitioned data.
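A small sketch of this point, using SQLite through Python's `sqlite3` module as a stand-in for Oracle. The `orders` table and `recent_orders` view are hypothetical names. It only shows that a view is re-evaluated against the base table on every query, so a new row in the base table appears in the view without any trigger.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, created TEXT, amount REAL);

    -- The view stores no rows; it is just a saved query over orders.
    CREATE VIEW recent_orders AS
        SELECT * FROM orders WHERE created >= '2019-01-01';

    INSERT INTO orders (created, amount)
    VALUES ('2005-03-01', 10.0), ('2019-06-15', 20.0);
""")


def recent_count(conn):
    """Count the rows currently visible through the view."""
    return conn.execute("SELECT COUNT(*) FROM recent_orders").fetchone()[0]


before = recent_count(conn)

# Insert into the base table only; the view is never touched directly.
conn.execute("INSERT INTO orders (created, amount) VALUES ('2020-02-01', 30.0)")
after = recent_count(conn)
```

The count seen through the view grows by one after the insert into the base table, which is why no "slave" update is needed for a plain view.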
Partitioning is generally the best method, because you can typically archive data by simply doing an "exchange" command to archive off old data.
Data doesn't "move" in that scenario, it simply gets 'detached' from the table via data dictionary manipulation.
Would there be something similar as the master-slave database but at the table level in the database?
If you are asking about master/slave replication at the table level, then I suppose the relationship between a table and a materialized view could reasonably be called master-slave. Quote from the Oracle docs:
A materialized view is a database object that contains the results of a query. The FROM clause of the query can name tables, views, and other materialized views. Collectively these objects are called master tables (a replication term)...
When you need to "update" or, more appropriately, refresh the mview, you have different options:
refresh the mview periodically, on a schedule
refresh the mview each time the data in the master table is changed and committed
refresh it manually by calling DBMS_MVIEW.REFRESH or DBMS_SNAPSHOT.REFRESH
An mview can be faster than a view because each time you select from an mview you select from a separate "table" that was replicated from the original one. Especially if you have complex logic in your SQL, you can put that logic into the mview definition.
The drawbacks are that you need extra disk space for the mview, and that the data is only as fresh as the last refresh.
I upload an Excel file using BCP. (Every day I truncate the current table in the DB and BCP in from the Excel file to repopulate the table.) It is important for me to keep a log of all the changes made to the rows (these could be row additions or changes in columns of existing rows).
I have read a few articles online saying that we can create a log table and a trigger (I have no idea how to do it). The log table would have columns like
Date | Field | Old Value | New Value.
Firstly, how do I do this?
Secondly, what's a smarter way to log only the actual changes and not the truncation of the table? I'm thinking of creating a temp table (tbl_Excefile_Temp) into which I will import the file, and then UPDATE the current table (tbl_Excefile) from tbl_Excefile_Temp. This way all the changes made in the current table will get logged automatically in the log table.
I know it's a big use case; could you please guide me?
If you are using SQL Server 2016 or higher, I would advise you to look into temporal tables. If you stop truncating and use a MERGE statement, you have a very easy way of keeping a log. Whenever you make a change, SQL Server will write the old values away to a history table, together with the datetimes during which the old row was valid.
With temporal tables you can query your table as they were at a specific datetime. In regular use there is no difference with a non-temporal table.
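For the log-table-and-trigger approach described in the question, here is a hedged sketch of the staging idea. It uses SQLite via Python rather than SQL Server, so the trigger syntax differs from T-SQL, and SQLite has no MERGE, so an UPDATE followed by an INSERT stands in for it. The columns `id`, `name`, and `price`, the `change_log` table, and the function `apply_staging` are hypothetical; only the two table names come from the question. Deleted rows are not handled here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_Excefile      (id INTEGER PRIMARY KEY, name TEXT, price REAL);
    CREATE TABLE tbl_Excefile_Temp (id INTEGER PRIMARY KEY, name TEXT, price REAL);
    -- Date | Field | Old Value | New Value, plus the affected row id.
    CREATE TABLE change_log (changed_at TEXT DEFAULT CURRENT_TIMESTAMP,
                             row_id INTEGER, field TEXT, old_value, new_value);
    CREATE TRIGGER log_insert AFTER INSERT ON tbl_Excefile
    BEGIN
        INSERT INTO change_log (row_id, field, old_value, new_value)
        VALUES (NEW.id, '(new row)', NULL, NEW.name);
    END;
""")

# One UPDATE trigger per logged column; it fires only when the value differs.
for col in ("name", "price"):
    conn.execute(f"""
        CREATE TRIGGER log_{col} AFTER UPDATE OF {col} ON tbl_Excefile
        WHEN OLD.{col} IS NOT NEW.{col}
        BEGIN
            INSERT INTO change_log (row_id, field, old_value, new_value)
            VALUES (OLD.id, '{col}', OLD.{col}, NEW.{col});
        END""")


def apply_staging(conn):
    """Apply the staged file to the live table instead of truncating it."""
    with conn:
        conn.execute("""
            UPDATE tbl_Excefile SET
                name  = (SELECT s.name  FROM tbl_Excefile_Temp s WHERE s.id = tbl_Excefile.id),
                price = (SELECT s.price FROM tbl_Excefile_Temp s WHERE s.id = tbl_Excefile.id)
            WHERE EXISTS (SELECT 1 FROM tbl_Excefile_Temp s
                          WHERE s.id = tbl_Excefile.id
                            AND (s.name IS NOT tbl_Excefile.name
                                 OR s.price IS NOT tbl_Excefile.price))""")
        conn.execute("""
            INSERT INTO tbl_Excefile (id, name, price)
            SELECT s.id, s.name, s.price FROM tbl_Excefile_Temp s
            WHERE s.id NOT IN (SELECT id FROM tbl_Excefile)""")
        conn.execute("DELETE FROM tbl_Excefile_Temp")


def stage(conn, rows):
    """Stand-in for the daily BCP import into the staging table."""
    conn.executemany("INSERT INTO tbl_Excefile_Temp VALUES (?, ?, ?)", rows)


stage(conn, [(1, "apple", 1.0), (2, "pear", 2.0)])
apply_staging(conn)
stage(conn, [(1, "apple", 1.5), (2, "pear", 2.0), (3, "plum", 3.0)])
apply_staging(conn)
log = sorted(conn.execute("SELECT row_id, field FROM change_log"))
```

After the second import, the log holds one entry for the changed price of row 1 and one for the new row 3, and nothing for the unchanged row 2.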
I'm using tracing to log all delete or update queries run through the system. The problem is, if I run a query like DELETE FROM [dbo].[Artist] WHERE ArtistId>280, I know how many rows were deleted but I'm unable to find out which rows were deleted (the data they had).
I'm thinking of doing this as a logging system so it would be useful to see which rows were affected and what data they had if at all possible. I don't really want to use triggers for this job but I will if I have to (and if it's feasible).
If you need the original data and are planning on storing all the deleted data in a separate table why not just logically delete the original data rather than physically delete it? i.e.
UPDATE dbo.Artist SET Artist_deleted = 1 WHERE ArtistId>280
Then you only need add one column to your current table rather than creating new tables and scripts to support these. You could then partition the current table based on the deleted flag if you are worried about disk space/performance etc.
I have a db on Oracle 11g where there's a table updated by external users. Now I want to catch the inserts/updates/deletes on this table in order to apply these changes to a table in another db, and I'm researching different methods. So far I have tested polling (a job that checks every minute whether there has been an update, insert or delete on the table) and a trigger (fired on update, insert or delete on the table). Are there alternative methods?
I found AQ (Oracle Advanced Queuing), DBMS_PIPE, and Oracle SNMP Agent Integrator Polling Activity, but I don't know if they are right for this case...
It depends.
Polling or triggers are often all you need depending on the volume of data involved, and the frequency of inserts/updates/deletes.
For example, the polling method might be as simple as adding a column which is set to 1 by default, and updated to NULL when the row is "consumed" by the replication code. A trigger on the table would set it back to 1 if a row is updated. An index on this column would be lightweight (the index would only include entries for rows where the column is 1) and therefore fast to query. You'd need another table to handle deletes, though.
The trigger method would merely write insert/update/delete rows into a log table of some sort, which would then get purged periodically by a job.
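The polling variant above could look roughly like this. It is a sketch in SQLite via Python, not Oracle: the table `src`, the flag column `pending`, the delete log `src_deleted`, and the function `poll` are hypothetical names, and a partial index is used to mimic the effect of Oracle not indexing NULLs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- pending = 1 means "not yet consumed by the replication code".
    CREATE TABLE src (id INTEGER PRIMARY KEY, val TEXT, pending INTEGER DEFAULT 1);
    -- Index only the rows that still need to be picked up.
    CREATE INDEX src_pending ON src (id) WHERE pending = 1;
    -- Deleted rows leave no flag behind, so they are logged separately.
    CREATE TABLE src_deleted (id INTEGER);

    -- Fires only when val is updated, not when the consumer clears the flag.
    CREATE TRIGGER src_touch AFTER UPDATE OF val ON src
    BEGIN
        UPDATE src SET pending = 1 WHERE id = NEW.id;
    END;

    CREATE TRIGGER src_del AFTER DELETE ON src
    BEGIN
        INSERT INTO src_deleted (id) VALUES (OLD.id);
    END;
""")


def poll(conn):
    """Return (changed rows, deleted ids) since the last poll and mark them consumed."""
    with conn:
        changed = conn.execute(
            "SELECT id, val FROM src WHERE pending = 1 ORDER BY id").fetchall()
        conn.executemany("UPDATE src SET pending = NULL WHERE id = ?",
                         [(row[0],) for row in changed])
        deleted = [row[0] for row in
                   conn.execute("SELECT id FROM src_deleted ORDER BY id")]
        conn.execute("DELETE FROM src_deleted")
    return changed, deleted


conn.executemany("INSERT INTO src (id, val) VALUES (?, ?)", [(1, "a"), (2, "b")])
conn.commit()
first = poll(conn)    # both new rows are pending

conn.execute("UPDATE src SET val = 'B' WHERE id = 2")
conn.execute("DELETE FROM src WHERE id = 1")
conn.commit()
second = poll(conn)   # one updated row, one deleted id
third = poll(conn)    # nothing new
```

A real consumer would push `changed` and `deleted` to the other database before clearing the flags, ideally in a way that can be retried if the remote write fails.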
For heavier volumes solutions include Oracle GoldenGate and Oracle Streams: http://www.oracle.com/technetwork/database/focus-areas/data-integration/index.html
I have a db table that gets entirely re-populated with fresh data periodically. This data needs to be then pushed into a corresponding live db table, overwriting the previous live data.
As the table size increases, the time required to push the data into the live table also increases, and during that time the app looks like it's missing data.
One solution is to push the new data into a live_temp table and then run an SQL RENAME command on this table to rename it as the live table. The rename usually runs in sub-second time. Is this the "right" way to solve this problem?
Are there other strategies or tools to tackle this problem? Thanks.
I don't like messing with schema objects in this way - it can confuse query optimizers and I have no idea what will happen to any transactions that are going on while you execute the rename.
I much prefer to add a version column to the table, and have a separate table to hold the current version.
That way, the client code becomes
select *
from myTable t,
     myTable_currentVersion tcv
where t.versionID = tcv.CurrentVersion
This also keeps history around, which may or may not be useful; if it's not, delete the old records after setting the CurrentVersion column.
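The version-column idea can be sketched end to end as follows, again in SQLite via Python as an assumption. The table and column names follow the query above; the `name` column and the functions `publish` and `read_live` are hypothetical. New rows are loaded under the next version number while readers still see the old one, and the switch is a single-row update.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE myTable (versionID INTEGER, name TEXT);
    CREATE TABLE myTable_currentVersion (CurrentVersion INTEGER);
    INSERT INTO myTable_currentVersion VALUES (0);
""")


def publish(conn, names, keep_history=False):
    """Load a fresh data set under a new version, then make it current."""
    (current,) = conn.execute(
        "SELECT CurrentVersion FROM myTable_currentVersion").fetchone()
    new_version = current + 1

    with conn:  # step 1: the slow load; invisible to readers of the old version
        conn.executemany("INSERT INTO myTable VALUES (?, ?)",
                         [(new_version, n) for n in names])

    with conn:  # step 2: the fast switch, plus optional cleanup
        conn.execute("UPDATE myTable_currentVersion SET CurrentVersion = ?",
                     (new_version,))
        if not keep_history:
            conn.execute("DELETE FROM myTable WHERE versionID < ?", (new_version,))
    return new_version


def read_live(conn):
    """What client code sees: only rows of the current version."""
    return [row[0] for row in conn.execute("""
        SELECT t.name
        FROM myTable t
        JOIN myTable_currentVersion tcv ON t.versionID = tcv.CurrentVersion
        ORDER BY t.name""")]


publish(conn, ["a", "b"])
live_v1 = read_live(conn)
publish(conn, ["c"])
live_v2 = read_live(conn)
remaining = conn.execute("SELECT COUNT(*) FROM myTable").fetchone()[0]
```

Passing `keep_history=True` leaves older versions in place, which matches the "keeps history around" option at the cost of a growing table.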
Create a duplicate table - exact copy.
Create a new table that does nothing more than keep track of the "up to date" table.
MostCurrent (table)
id (column) - holds name of table holding the "up to date" data.
When repopulating, populate the older table and update MostCurrent.id to reflect this table.
Now, in your app where you bind the data to the page, bind the newest table.
Would it be appropriate to only push changes to the live db table? For most applications I have worked with changes have been minimal. You should be able to apply all the changes in a single transaction. Committing the transaction will make them visible with no outage on the table.
If the data does change entirely, then you could configure the database so that you can replace all the data in a single transaction.
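As an illustration of replacing all the data in a single transaction, here is a minimal sketch in SQLite via Python. The table `live` and the function `replace_all` are hypothetical, and how concurrent readers behave during the transaction depends on the database engine and its isolation settings, which this sketch does not cover. It does show the all-or-nothing property: if the reload fails part-way, the previous live data is still there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE live (id INTEGER PRIMARY KEY, val TEXT)")
conn.execute("INSERT INTO live VALUES (1, 'old')")
conn.commit()


def replace_all(conn, rows):
    """Swap the full contents of `live` atomically."""
    # `with conn` commits on success and rolls back if an exception is raised.
    with conn:
        conn.execute("DELETE FROM live")
        conn.executemany("INSERT INTO live (id, val) VALUES (?, ?)", rows)


def snapshot(conn):
    """Return the current contents of `live`, ordered by id."""
    return conn.execute("SELECT id, val FROM live ORDER BY id").fetchall()


replace_all(conn, [(1, "new"), (2, "new2")])
after_good_load = snapshot(conn)

# A bad batch (duplicate primary key) fails half-way and is rolled back.
try:
    replace_all(conn, [(1, "x"), (1, "y")])
    failed = False
except sqlite3.IntegrityError:
    failed = True
after_bad_load = snapshot(conn)
```

Because the delete and the inserts share one transaction, the failed batch leaves the table exactly as it was after the last successful load.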