I have a table for biometric devices that captures the data as soon as employees scan their fingers; it runs on SQL Server 2014 Standard Edition.
However, our legacy devices exported log files, and we used a VB engine to push them to our Oracle table, from which we generated the attendance details.
I managed to export the data from SQL Server and built the first set of records. I would like to schedule a job in SQL Server with the condition that the Oracle table should receive ONLY the rows that have NOT already been inserted from the SQL Server table.
I checked the append option, but it dumps the entire SQL Server table every time the job runs, duplicating rows in the Oracle target table. That forced me to discard the job and build a new one that drops and recreates the Oracle table on each run. I feel this is overkill...
Is there a known method to append only the rows that do NOT exist in the Oracle target table? Unfortunately the SQL Server table doesn't have a unique ID column for the transaction.
Please suggest
Thanks in advance
I think the best way is to use SQL Server replication with the Oracle database as a subscriber.
You can read about this solution on the MSDN site:
Oracle Subscribers
Regards
Giova
Since you're talking about attendance data for something like an electronic time card, you could just send the rows where the punch time is later than the last timestamp synced. You would need to maintain that value somewhere, and this approach doesn't account for retroactively entered records. If there's a record creation date in addition to the punch time, you could use the creation date instead. Further, if there is a modified date in the record, you could look into using the MERGE statement as Alex Pool suggested, so that both new records and modifications get synced to Oracle.
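A minimal sketch of the "punch time greater than the last synced timestamp" idea, using Python's built-in sqlite3 as a stand-in for both SQL Server and Oracle. The table names `punches`, `attendance` and `sync_state`, and their columns, are assumptions for illustration and do not come from the question.

```python
import sqlite3


def sync_new_punches(source, target):
    """Copy punches newer than the stored watermark; return the number copied."""
    # sync_state (hypothetical) keeps one row: the last punch time already sent.
    target.execute(
        "CREATE TABLE IF NOT EXISTS sync_state ("
        "id INTEGER PRIMARY KEY CHECK (id = 1), last_punch TEXT)"
    )
    row = target.execute(
        "SELECT last_punch FROM sync_state WHERE id = 1"
    ).fetchone()
    # An empty string sorts before any ISO-8601 timestamp, so the first run copies everything.
    last_punch = row[0] if row else ""
    rows = source.execute(
        "SELECT emp_id, punch_time FROM punches "
        "WHERE punch_time > ? ORDER BY punch_time",
        (last_punch,),
    ).fetchall()
    if rows:
        target.executemany(
            "INSERT INTO attendance (emp_id, punch_time) VALUES (?, ?)", rows
        )
        # Advance the watermark to the newest punch time just copied.
        target.execute(
            "INSERT OR REPLACE INTO sync_state (id, last_punch) VALUES (1, ?)",
            (rows[-1][1],),
        )
    target.commit()
    return len(rows)
```

Running the function twice in a row copies nothing the second time. As noted above, a row that arrives late with a punch time at or before the watermark would be skipped, which is why a creation date is the safer column to filter on if one exists.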
Related
I want to create a daily process where I reload all rows from table A into table B. Over time table A rows will change due to changes in source system and also because of aging/deletion of records in the origin table. Table A gets truncated/reloaded daily in step 1. Table B is the master table that just gets new/updated rows.
From a historical point of view, I want to keep track of ALL the rows in table B and be able to do a point in time comparison for analytics purposes.
So I need to do two things: insert rows daily from table A into table B if they don't exist, and also create a new record in table B if the record already exists but ANY of the columns have changed. At one point I attempted to use temporal tables, but I had too many false positives on 'real' changes; basically certain columns were throwing things off because a date/time column was updated (the only real change in the row).
I'm using an Azure SQL Server Managed Instance database (Microsoft SQL Azure (RTM) - 12.0.2000.8).
At my disposal I have SSMS, SQL Server and also Azure Data Factory.
Any suggestions on the best way to do this or tools to help with this?
There are two approaches, and you can implement either one:
Temporal tables
Change Data Capture (CDC)
CDC is the more commonly used approach: you can create an Azure Data Factory pipeline that loads delta data, based on change data capture (CDC) information in the source Azure SQL Managed Instance database, into Azure Blob Storage.
To implement CDC, you can follow this Microsoft tutorial: Incrementally load data from Azure SQL Managed Instance to Azure Storage using change data capture (CDC)
Note: You also need to create a storage account, which is required but not covered in the tutorial above.
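Separately from CDC, the requirement in the question (add a new version only when a tracked column changes, ignoring a volatile date/time column) can be sketched with a hash over the tracked columns. This is a different technique from the two listed above, shown with Python's sqlite3 as a stand-in database. The tables `table_a` and `table_b` follow the question's naming, but the columns `name`, `status`, `load_ts`, `row_hash`, `version_id` and `valid_from` are assumptions.

```python
import hashlib
import sqlite3


def row_hash(values):
    """Hash the tracked column values; repr() keeps None and '' distinct."""
    joined = "\x1f".join(repr(v) for v in values)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def load_history(conn, load_date):
    """Append a version to table_b for each new or changed row in table_a."""
    # Latest stored hash per business key (the row with the highest version_id).
    latest = dict(
        conn.execute(
            "SELECT id, row_hash FROM table_b WHERE version_id IN "
            "(SELECT MAX(version_id) FROM table_b GROUP BY id)"
        )
    )
    inserted = 0
    # load_ts is deliberately not selected, so it can never trigger a new version.
    staged = conn.execute("SELECT id, name, status FROM table_a").fetchall()
    for row_id, name, status in staged:
        digest = row_hash((name, status))
        if latest.get(row_id) != digest:
            conn.execute(
                "INSERT INTO table_b (id, name, status, row_hash, valid_from) "
                "VALUES (?, ?, ?, ?, ?)",
                (row_id, name, status, digest, load_date),
            )
            inserted += 1
    conn.commit()
    return inserted
```

Because only the listed columns feed the hash, a reload that changes nothing but the timestamp produces no new history rows, which is the false-positive case described in the question.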
Is there a way to automate SQL Server Profiler to record data and then save it to a table continuously?
The reason: I am supporting a fragile SQL Server application and there is no auditing. I receive a lot of support calls regarding the deletion of records. I want a quick way to be able to view who has changed what data.
You can configure your profiler to save the trace directly to table as described here: How To Save a SQL Server Trace Data to a Table
But it's not a good idea, for two reasons: first, Profiler itself will put load on your server; second, writing to a table is the most costly option, and you can even lose some events.
If you are on Enterprise edition, you may be able to use SQL Server database audit instead, which is more lightweight.
And here you can find a complete example of setting up database audit that audits the DELETE events
Here are a few articles for your reference.
Save trace results to a database table
https://learn.microsoft.com/en-us/sql/tools/sql-server-profiler/save-trace-results-to-a-table-sql-server-profiler
Save Trace Results to a Table
https://technet.microsoft.com/en-us/library/ms191276(v=sql.110).aspx
9 Steps to an Automated Trace
http://sqlmag.com/t-sql/9-steps-automated-trace
Alternatively, you may try this automated solution ( https://www.lepide.com/lepideauditor/sql-server-auditing.html ) to accomplish this task.
I have an Oracle database and a SQL Server database. There is one table, say Inventory, which contains millions of rows in both databases and keeps growing.
I want to compare the Oracle table data with the SQL Server data to find out which records are missing in the SQL Server table on a daily basis.
Which is best approach for this?
Create SSIS package.
Create Windows service.
I want to consume less resource to achieve this functionality which takes less time and less resource.
E.g., 18 million records in Oracle and 16 to 17 million in SQL Server.
The two different databases exist because there are two different applications, one online and one offline.
EDIT: How about connecting to SQL Server from Oracle through Oracle Gateway for SQL Server, in order to:
1) Query SQL Server directly from Oracle to insert the missing records into SQL Server the first time.
2) Create a trigger on Oracle that fires when a record is deleted from Oracle and inserts the deleted record into a new Oracle table.
3) Create an SSIS package to map the newly created Oracle table to SQL Server and update the SQL Server records. This way only a few records have to be processed daily through SSIS.
What do you think of this approach?
I would create an SSIS package and load the data from the Oracle table using a Data Flow with an OLE DB Source. If you have SQL Server Enterprise, the Attunity connectors are a bit faster.
Then I would load the keys from the SQL Server table into a Lookup transformation, match the two sources on the key, and direct unmatched rows to a separate output.
Finally I would direct the unmatched-rows output to an OLE DB Command to update the SQL Server table.
This SSIS package will require a lot of memory, but as the matching is done in memory with minimal IO, it will probably outperform other solutions for speed. It will need enough free memory to cache all the keys from the SQL Server Table.
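To illustrate why the memory requirement scales with the number of keys, here is a rough Python sketch of what a fully cached lookup amounts to. It is not SSIS code; the function name `find_unmatched` and the row layout are made up for the example.

```python
def find_unmatched(source_rows, target_keys, key_index=0):
    """Yield source rows whose key is absent from the cached target keys."""
    # The whole key set is held in memory, like a fully cached Lookup.
    cache = set(target_keys)
    for row in source_rows:
        # Source rows are streamed one at a time; only the keys are cached.
        if row[key_index] not in cache:
            yield row
```

For example, `list(find_unmatched([(1, "a"), (2, "b"), (3, "c")], [1, 3]))` returns `[(2, "b")]`. Only the key column of the SQL Server side needs to be cached, not the full rows, which keeps the footprint manageable even at 16 to 17 million keys.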
SSIS also has the advantage that it has lots of other transformation functions available if you need them later.
What you basically want to do is replication from Oracle to SQL Server.
You could do this in SSIS, a Windows service, or indeed a multitude of platforms.
The real trick is using the correct design pattern.
There are two general design patterns
Snapshot Replication
You take all records from both systems and compare them somewhere (so far we have suggestions to compare in SSIS or compare on Oracle but not yet a suggestion to compare on SQL Server, although this is valid)
You are comparing 18 million records here so this is a lot of work
Differential replication
You record the changes in the publisher (i.e. Oracle) since the last replication then you apply those changes to the subscriber (i.e. SQL Server)
You can do this manually by implementing triggers and log tables on the Oracle side, then use a regular ETL process (SSIS, command line tools, text files, whatever), probably scheduled in SQL Agent to apply these to the SQL Server.
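A minimal sketch of the manual trigger-and-log-table variant, using SQLite through Python's sqlite3 as a stand-in for both the Oracle publisher and the SQL Server subscriber. The `inventory` and `inventory_log` tables, their columns and the trigger names are assumptions for illustration; real Oracle trigger syntax differs.

```python
import sqlite3

# Publisher side: every insert, update and delete is recorded in a log table.
PUBLISHER_DDL = """
CREATE TABLE inventory (id INTEGER PRIMARY KEY, qty INTEGER);
CREATE TABLE inventory_log (
    log_id INTEGER PRIMARY KEY AUTOINCREMENT,
    op TEXT NOT NULL,
    id INTEGER NOT NULL,
    qty INTEGER
);
CREATE TRIGGER inventory_ai AFTER INSERT ON inventory BEGIN
    INSERT INTO inventory_log (op, id, qty) VALUES ('I', NEW.id, NEW.qty);
END;
CREATE TRIGGER inventory_au AFTER UPDATE ON inventory BEGIN
    INSERT INTO inventory_log (op, id, qty) VALUES ('U', NEW.id, NEW.qty);
END;
CREATE TRIGGER inventory_ad AFTER DELETE ON inventory BEGIN
    INSERT INTO inventory_log (op, id, qty) VALUES ('D', OLD.id, NULL);
END;
"""


def apply_changes(publisher, subscriber, last_log_id=0):
    """Replay log entries newer than last_log_id; return the new high-water mark."""
    changes = publisher.execute(
        "SELECT log_id, op, id, qty FROM inventory_log "
        "WHERE log_id > ? ORDER BY log_id",
        (last_log_id,),
    ).fetchall()
    for log_id, op, row_id, qty in changes:
        if op == "D":
            subscriber.execute("DELETE FROM inventory WHERE id = ?", (row_id,))
        else:
            # Inserts and updates are both applied as an upsert on the key.
            subscriber.execute(
                "INSERT OR REPLACE INTO inventory (id, qty) VALUES (?, ?)",
                (row_id, qty),
            )
        last_log_id = log_id
    subscriber.commit()
    return last_log_id
```

The scheduled job only has to remember the returned high-water mark between runs, so each run touches just the rows that changed rather than all 18 million. The sketch assumes the key column itself is never updated.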
Or you could do this by using the out of the box replication capability to set up Oracle as a publisher and SQL as a subscriber: https://msdn.microsoft.com/en-us/library/ms151149(v=sql.105).aspx
You're going to have to try a few of these and see what works for you.
Given this objective:
I want to consume less resource to achieve this functionality which takes less time and less resource
transactional replication is far more efficient but complicated. For maintenance purposes, which platforms (.Net, SSIS, Python etc.) are you most comfortable with?
Other alternatives:
If you can use Oracle gateway for SQL Server then you do not need to transfer data and can make the query directly.
If you can't use Oracle Gateway, you can use Pentaho Data Integration or another ETL tool to compare the tables and get the results. It is easy to use.
I think the best approach is to use Oracle Gateway. Just follow these steps; I have had a similar experience.
Install and Configure Oracle Database Gateway for SQL Server.
https://docs.oracle.com/cd/B28359_01/gateways.111/b31042/installsql.htm
Now you can create a database link from Oracle to SQL Server.
Create a procedure that finds the records missing from SQL Server and inserts them into the SQL Server database.
For example, you can use this statement inside your procedure.
INSERT INTO "dbo"."sql_server_table"@dblink_name("column1","column2"...."column5")
select column1,column2....column5 from oracle_table
minus
select "column1","column2"...."column5" from "dbo"."sql_server_table"@dblink_name
Create a scheduler job that executes the procedure daily.
When both databases are online, the missing records will be inserted into SQL Server. Otherwise the scheduled job fails, and you can execute the procedure manually.
It takes minimal resources.
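The set-difference insert can be tried out locally with SQLite, where `EXCEPT` plays the role of Oracle's `MINUS` and an attached database stands in for the database link. This is only an analogy sketched in Python's sqlite3; the schema alias `remote` and the `inventory(id, name)` layout are assumptions.

```python
import sqlite3


def insert_missing(conn):
    """Insert into remote.inventory every row of main.inventory that it lacks."""
    before = conn.total_changes
    # EXCEPT compares whole rows, like MINUS: a row counts as missing
    # unless an identical row already exists on the remote side.
    conn.execute(
        "INSERT INTO remote.inventory (id, name) "
        "SELECT id, name FROM main.inventory "
        "EXCEPT "
        "SELECT id, name FROM remote.inventory"
    )
    conn.commit()
    return conn.total_changes - before
```

Note that the comparison is on entire rows, so a record whose key exists on both sides but whose other columns differ is treated as missing. Restricting both SELECT lists to the key column avoids that if only presence matters.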
I suggest a homemade ETL solution.
Schedule an Oracle job to export the source table data (daily, based on the application logic) to plain CSV format.
Schedule a SQL Server job (with an acceptable delay after the Oracle job) to read this CSV file and import it into an intermediate (staging) table in SQL Server using BULK INSERT.
The last part of the SQL Server job reads the staging table and applies the logic (insert into or update the target table). I suggest having another table to store the results of each daily run.
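The three steps can be sketched end to end in Python with the csv module and sqlite3 standing in for both databases. The table names `inventory`, `inventory_staging` and `job_report`, and the two-column layout, are assumptions; in SQL Server the load step would be BULK INSERT rather than `executemany`.

```python
import csv
import io
import sqlite3


def export_csv(source, out):
    """Step 1: dump the source table to CSV with a header row."""
    writer = csv.writer(out)
    writer.writerow(["id", "name"])
    writer.writerows(source.execute("SELECT id, name FROM inventory ORDER BY id"))


def import_csv(target, inp):
    """Steps 2 and 3: load the CSV into staging, merge, and record the run."""
    reader = csv.reader(inp)
    next(reader, None)  # skip the header row
    target.execute("DELETE FROM inventory_staging")
    target.executemany(
        "INSERT INTO inventory_staging (id, name) VALUES (?, ?)", reader
    )
    # Upsert from staging: new keys are inserted, existing keys are overwritten.
    target.execute(
        "INSERT OR REPLACE INTO inventory (id, name) "
        "SELECT id, name FROM inventory_staging"
    )
    loaded = target.execute(
        "SELECT COUNT(*) FROM inventory_staging"
    ).fetchone()[0]
    # Keep a small report row per run, as suggested above.
    target.execute("INSERT INTO job_report (rows_loaded) VALUES (?)", (loaded,))
    target.commit()
    return loaded
```

In a test the two functions can be connected with an `io.StringIO` buffer instead of a file on disk. Exporting only the rows changed since the previous run, rather than the full table, would keep the CSV small.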
Is there an easy way to get a MySQL server to query an iSeries (AS/400 DB2)? I have the ODBC driver installed, so I can query and export the data manually to my desktop and then import it into MySQL.
The problem is that the AS/400 database is so huge that performance is poor. I need to run a query every half hour or so from MySQL to pull the newly updated information from the iSeries database.
Basically, how do you use ODBC on the MySQL server to query the iSeries?
I haven't worked on an iSeries for over 10 years but - here is what I know/remember.
You create physical files and then logicals(sort sequences) over them.
To help make it as efficient as possible, the FIRST logical that will be executed during a "reorg" should contain ALL the fields you will use in any subsequent select/sequence logicals. Then the following logicals will use the first logical to build themselves; each one is then ONLY using an index instead of a physical file.
Second, when you use Open Query, it looks for a logical that is "pre-built". If it can't find one at least "near" what it needs, it has to build one of its own every time.
My next point is the file you are reading and selecting from. When a record is added does it update physical/logicals immediately? On open? On close?
If you are looking for speed for your query then you don't want to be busy updating the records which have been added.
Note that if these are order entry type of records the update may be deliberately delayed to enhance the data entry process.
Hope this helps - An "updated" and "appropriate" keyed and sequenced logical will make a huge difference.
If you don't know the iSeries, you need someone who does to check that side. Cheers Ted
Data replication. One method is to add a row-update timestamp column and use that column to drive the replication.
alter table mylib.mytable add column
UPDATETS TIMESTAMP GENERATED ALWAYS FOR EACH ROW
ON UPDATE AS ROW CHANGE TIMESTAMP NOT NULL
Now your replication would use the updatets column and pull only the rows with an updatets greater than the current max(updatets) in the MySQL database.
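A small sketch of that pull, with Python's sqlite3 standing in for both the iSeries source and the MySQL target (a real job would connect through ODBC and a MySQL driver instead). The `id` and `payload` columns are assumptions; `mytable` and `updatets` follow the statement above.

```python
import sqlite3


def pull_changes(source, target):
    """Copy rows whose updatets is newer than anything already in the target."""
    # The high-water mark comes from the target itself, so no extra state is stored.
    high = target.execute(
        "SELECT COALESCE(MAX(updatets), '') FROM mytable"
    ).fetchone()[0]
    rows = source.execute(
        "SELECT id, payload, updatets FROM mytable WHERE updatets > ?",
        (high,),
    ).fetchall()
    # Upsert on the key: changed rows overwrite their older copy in the target.
    target.executemany(
        "INSERT OR REPLACE INTO mytable (id, payload, updatets) VALUES (?, ?, ?)",
        rows,
    )
    target.commit()
    return len(rows)
```

Since the filter runs on the source side, an index on updatets there lets the half-hourly query read only the changed rows instead of scanning the whole file.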
I am looking for information about my table's history.
I need to know the time at which a specific row was inserted.
Is there some way to find that out?
thanks.
Using Triggers
Add a datetime column to the table and an AFTER INSERT, UPDATE trigger that sets that column to GETDATE().
This will only give you details about the very last change (update/insert).
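As a rough illustration of the trigger approach, here is the same idea in SQLite (run through Python's sqlite3), where `strftime(..., 'now')` stands in for GETDATE(). The `orders` table, its columns and the trigger names are made up for the example, and T-SQL trigger syntax is different.

```python
import sqlite3

DDL = """
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    item TEXT,
    last_modified TEXT
);
-- Stamp the row when it is first inserted.
CREATE TRIGGER orders_ai AFTER INSERT ON orders BEGIN
    UPDATE orders SET last_modified = strftime('%Y-%m-%d %H:%M:%f', 'now')
    WHERE id = NEW.id;
END;
-- Re-stamp the row whenever the item column is updated.
CREATE TRIGGER orders_au AFTER UPDATE OF item ON orders BEGIN
    UPDATE orders SET last_modified = strftime('%Y-%m-%d %H:%M:%f', 'now')
    WHERE id = NEW.id;
END;
"""


def make_db():
    """Create an in-memory database with the stamped table and its triggers."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(DDL)
    return conn
```

Because the update trigger overwrites the same column, the original insert time is lost after the first update. Keeping a separate created column that only the insert trigger touches would preserve it.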
CDC
Change Data Capture (CDC) was introduced in SQL Server 2008. Change Data Capture records INSERTs, UPDATEs, and DELETEs applied to SQL Server tables, and makes a record available of what changed, where, and when, in simple relational ‘change tables’.
These "Change Tables" contain columns that reflect the column structure of the source table you have chosen to track, along with the metadata needed to understand the changes that have been made.
Read more about CDC here
Note also that it is only supported in the Datacenter and Enterprise editions.