I am trying to find a way to, as close to real time as possible, have Informix 11.50.FC9GE database data available for SQL Server 2014 SSRS reports.
Right now, we have SSIS (Integration Services) packages on a 4-hour schedule that go out to our 8 Informix databases via ODBC, gather all of their table data, and update tables on the SQL Server side.
So, table 'abc' exists on all 8 databases. All of that data is loaded into a single table on SQL Server. As that data is gathered, an artificial column is added to indicate which database the data came from:
SELECT *, '250' AS db FROM abc
This process takes about 1-2 hours to complete. If someone attempts to run a report during this time, they get skewed data.
My hope is to have all of the table data in the SQL Server, and only pass over changed data.
I was looking at SQL Server Replication, but it doesn't look like it can replicate from a Non-SQL Server database?
I also started looking at IBM InfoSphere Change Data Capture 6.5. I installed the Access Server and Management Console on a Windows server with one of my Informix databases.
I installed InfoSphere CDC Configuration Tool (Instance) on the SQL Server with the Database entries pointing to the Informix server, but when I try to start that Instance I get the error:
IBM InfoSphere Change Data Capture could not identify a supported default database encoding. The detected encoding is null. Please override the encoding with a supported IANA encoding name that matches or is very close to your default database encoding and restart IBM InfoSphere Change Data Capture. Use dmset command line utility to override the encoding.
I found this command to enter:
dmset -I instanceName database_default_character_encoding=UTF-8
But that gives an error:
C:\Program Files (x86)\IBM\InfoSphere Change Data Capture\Replication Engine for IBM Informix Dynamic Server\bin>dmset -I vsqldev2014 database_default_character_encoding=UTF-8
There is a problem with the IBM InfoSphere Change Data Capture service.
Frankly, I probably didn't set it up right, because there are hardly any instructions out there. :(
I did find a 3rd-party product that appears to work, but they are quoting tens of thousands of dollars. There's no way my company would go for that.
Any help/suggestions?
Related
As per my understanding, before setting up transactional replication with Oracle GoldenGate, we have to set up an initial data load. In my case the source is SQL Server 2012 and the destination is Oracle 12, and both reside on the same system. Now my questions are:
1. What is the best way to set up the initial load? Should I use a SQL Server utility such as SSIS, or use GoldenGate's "Direct Bulk Load" feature?
2. Though my source DB and destination DB reside on the same machine, do I still have to use two installations of GoldenGate (one for the source and the other for the destination) for transactional replication?
I used GoldenGate direct load for the MSSQL initial load; the database was huge and it went fine. The downside is that if a failure occurs, you'll need to truncate the target table and start the load from the beginning. As for multiple installations, in one environment I have both the target and source Oracle databases running on the same machine and using the same installation, so I think you'll be fine with just one.
Have a look at this link, it could be beneficial:
http://www.ateam-oracle.com/oracle-goldengate-heterogeneous-database-initial-load-using-oracle-goldengate/
I have to move data from an existing Oracle database to which I don't have direct access. The data is about 11 tables, 5 GB each. The database admin can export the tables to some .csv or XML format. The problem with CSV is that some of the data is textual, with lots of special characters. The problem with XML is that the markup overhead will significantly increase the size of the files. The DBA is not competent enough to provide a working and neat solution; he uses Toad as the database tool. Can you provide some ideas on how to perform such a migration in the best possible way?
Please refer to the steps below to migrate the data from Oracle to SQL Server.
Recommended Migration Process
To successfully migrate objects and data from Oracle databases to SQL Server, Azure SQL DB, or Azure SQL Data Warehouse, use the following process:
1. Create a new SSMA project.
2. After you create the project, you can set project conversion, migration, and type-mapping options. For information about project settings, see Setting Project Options (OracleToSQL). For information about how to customize data type mappings, see Mapping Oracle and SQL Server Data Types (OracleToSQL).
3. Connect to the Oracle database server.
4. Connect to an instance of SQL Server.
5. Map Oracle database schemas to SQL Server database schemas.
6. Optionally, create assessment reports to assess database objects for conversion and estimate the conversion time.
7. Convert Oracle database schemas into SQL Server schemas.
8. Load the converted database objects into SQL Server.
You can do this in one of the following ways:
* Save a script and run it in SQL Server.
* Synchronize the database objects.
9. Migrate data to SQL Server.
10. If necessary, update database applications.
For more details:
https://learn.microsoft.com/en-us/sql/ssma/oracle/migrating-oracle-databases-to-sql-server-oracletosql?view=sql-server-2017
After the admin exports the data into CSV, try to convert it into a character set which will recognize all of the special characters.
Then try to follow the steps from this link: link, it might work.
If, after the import, there are still special characters, try to convert them manually.
Get the DBA to export the tables using the ASCII delimiters which were designed for this purpose:
Row delimiter: Decimal 30 / 0x1E
Column delimiter: Decimal 31 / 0x1F
Then you can use BCP (or any other similar product) to upload the data to SQL Server.
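For example, a bcp command along these lines could then load one of those files; the server, database, table and file names are placeholders:
bcp TargetDb.dbo.TargetTable in table_export.dat -S SqlServerHost -T -c -t 0x1F -r 0x1E
The -t and -r switches take the field and row terminators in hexadecimal notation, and -c loads the file as character data.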
I'm trying to query my Hortonworks cluster Hive tables from SQL Server. My scenario below:
HDP 2.6, Ambari, HiveServer2
SQL Server 2016 Enterprise
Kerberos configuration for secure logins in HDP
I was reading about the PolyBase service in SQL Server 2016 and, I suppose, later versions. However, I realize that according to the documentation what this service does in SQL Server is act as a bridge to reach out to my HDFS and recreate external tables based on that data source.
What I'm expecting instead is to query Hive objects as if they were SQL Server objects, for example through a linked server.
Does anyone have an example, or know whether this is possible between SQL Server and Hive?
Thanks so much
Hive acts more as a job compiler than a database. This means every SQL statement you write will be translated into a job for Hadoop, sent over to the cluster and executed there. From the user's perspective it looks like querying a table.
The approach already mentioned, reading the HDFS data source and re-creating it in SQL Server, is the correct one. Since Hive and the database server are different technologies, something like a linked server does not seem technically possible to me.
Hive nowadays provides a JDBC interface which could be used to connect to it. But even with Hive JDBC, every query will end up as a cluster job for distributed computing, running over the files in HDFS, creating a result set and presenting it to you.
If you want to query Hive from SQL Server, you can download an ODBC driver (Microsoft or Hortonworks) and create a Data Source Name (DSN) for Hive. In the Advanced options, check Use Native Query. Then just create a new linked server in SQL Server with the same name as the Data Source Name in the ODBC driver.
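For instance, the linked server itself might be set up over that DSN roughly like this (the linked server name and DSN are placeholders):
EXEC master.dbo.sp_addlinkedserver
    @server = N'HadoopLinkedServer',   -- name to use later in OPENQUERY
    @srvproduct = N'Hive',
    @provider = N'MSDASQL',            -- generic ODBC provider
    @datasrc = N'HiveDSN';             -- must match the ODBC Data Source Name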
Write an OPENQUERY, something like:
select top 100 * from
openquery(HadoopLinkedServer,
'select column1, column2 from databaseInHadoop.tableInHadoop')
I have an Oracle database and a SQL Server database. There is one table, say Inventory, which contains millions of rows in both databases and keeps growing.
I want to compare the Oracle table data with the SQL Server data to find out which records are missing in the SQL Server table, on a daily basis.
Which is the best approach for this?
Create SSIS package.
Create Windows service.
I want this to consume as few resources and take as little time as possible.
E.g.: 18 million records in Oracle and 16-17 million in SQL Server.
This situation of two different databases arises because there are two different applications, one online and one offline.
EDIT: How about connecting to SQL Server from Oracle through Oracle Database Gateway for SQL Server, to:
1) Query SQL Server directly from Oracle to fill in the missing records in SQL Server the first time.
2) Create a trigger on Oracle which executes when a record is deleted from Oracle and inserts the deleted record into a new Oracle table.
3) Create an SSIS package to map the newly created Oracle table to SQL Server to update the SQL Server records. This way only a few records have to be processed daily through SSIS.
What do you think of this approach ?
I would create an SSIS package and load the data from the Oracle table using a Data Flow / OLE DB Source. If you have SQL Enterprise, the Attunity connectors are a bit faster.
Then I would load the keys from the SQL Server table into a Lookup transformation, where I would match the two sources on the key, and direct unmatched rows into a separate output.
Finally I would direct the unmatched rows output to an OLE DB Command, to update the SQL Server table.
This SSIS package will require a lot of memory, but as the matching is done in memory with minimal IO, it will probably outperform other solutions for speed. It will need enough free memory to cache all the keys from the SQL Server table.
SSIS also has the advantage that it has lots of other transformation functions available if you need them later.
What you basically want to do is replication from Oracle to SQL Server.
You could do this in SSIS, a Windows service, or indeed on a multitude of platforms.
The real trick is using the correct design pattern.
There are two general design patterns
Snapshot Replication
You take all records from both systems and compare them somewhere (so far we have suggestions to compare in SSIS or on Oracle, but not yet a suggestion to compare on SQL Server, although this is also valid).
You are comparing 18 million records here so this is a lot of work
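As a rough illustration of the compare-on-SQL-Server variant, assuming the Oracle rows have already been staged into a table such as dbo.Inventory_Staging (table and key names are assumptions):
INSERT INTO dbo.Inventory (InventoryId, Col1, Col2)
SELECT s.InventoryId, s.Col1, s.Col2
FROM dbo.Inventory_Staging AS s
WHERE NOT EXISTS (SELECT 1 FROM dbo.Inventory AS t
                  WHERE t.InventoryId = s.InventoryId);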
Differential replication
You record the changes in the publisher (i.e. Oracle) since the last replication, then you apply those changes to the subscriber (i.e. SQL Server).
You can do this manually by implementing triggers and log tables on the Oracle side, then use a regular ETL process (SSIS, command line tools, text files, whatever), probably scheduled in SQL Agent to apply these to the SQL Server.
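A minimal sketch of that trigger-and-log-table approach, assuming an inventory table with a numeric id column (all names here are assumptions):
-- change-log table on the Oracle side
CREATE TABLE inventory_change_log (
    inventory_id NUMBER      NOT NULL,
    change_type  VARCHAR2(1) NOT NULL,   -- 'I', 'U' or 'D'
    changed_at   DATE        DEFAULT SYSDATE
);

CREATE OR REPLACE TRIGGER trg_inventory_log
AFTER INSERT OR UPDATE OR DELETE ON inventory
FOR EACH ROW
BEGIN
    IF INSERTING THEN
        INSERT INTO inventory_change_log (inventory_id, change_type) VALUES (:NEW.id, 'I');
    ELSIF UPDATING THEN
        INSERT INTO inventory_change_log (inventory_id, change_type) VALUES (:NEW.id, 'U');
    ELSE
        INSERT INTO inventory_change_log (inventory_id, change_type) VALUES (:OLD.id, 'D');
    END IF;
END;
/
The scheduled ETL then only has to read inventory_change_log and apply those rows to SQL Server.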
Or you could do this by using the out of the box replication capability to set up Oracle as a publisher and SQL as a subscriber: https://msdn.microsoft.com/en-us/library/ms151149(v=sql.105).aspx
You're going to have to try a few of these and see what works for you.
Given this objective:
I want this to consume as few resources and take as little time as possible
transactional replication is far more efficient but complicated. For maintenance purposes, which platforms (.Net, SSIS, Python etc.) are you most comfortable with?
Other alternatives:
If you can use Oracle Database Gateway for SQL Server, then you do not need to transfer data and can run the query directly.
If you can't use the Oracle gateway, you can use Pentaho Data Integration or another ETL tool to compare the tables and get the results. It is easy to use.
I think the best approach is using the Oracle gateway. Just follow these steps; I have similar experience.
Install and Configure Oracle Database Gateway for SQL Server.
https://docs.oracle.com/cd/B28359_01/gateways.111/b31042/installsql.htm
Now you can create a database link from Oracle to SQL Server.
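For example, something along these lines; the link name, credentials and the TNS alias pointing at the gateway are placeholders:
CREATE DATABASE LINK dblink_name
  CONNECT TO "sql_server_user" IDENTIFIED BY "password"
  USING 'dg4msql_alias';   -- TNS entry configured for the gateway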
Create a procedure which compares the two tables, finds the records missing from the SQL Server database and inserts them.
For example, you can use a statement like this inside your procedure:
INSERT INTO "dbo"."sql_server_table"#dblink_name("column1","column2"...."column5")
VALUES
(
select column1,column2....column5 from oracle_table
minus
select "column1","column2"...."column5" from "dbo"."sql_server_table"#dblink_name
)
Create a scheduler job which executes the procedure daily.
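A minimal sketch using DBMS_SCHEDULER; the job name, procedure name and schedule are assumptions:
BEGIN
  DBMS_SCHEDULER.create_job(
    job_name        => 'SYNC_MISSING_RECORDS',
    job_type        => 'STORED_PROCEDURE',
    job_action      => 'compare_and_insert_proc',   -- the procedure created above
    start_date      => SYSTIMESTAMP,
    repeat_interval => 'FREQ=DAILY; BYHOUR=1',
    enabled         => TRUE);
END;
/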
When both databases are online, the missing records will be inserted into SQL Server. Otherwise the scheduler job will fail, or you can execute the procedure manually.
It takes minimal resources.
I will suggest a homemade ETL solution.
Schedule an Oracle job to export the source table data (on a daily basis, based on the application logic) to plain CSV format.
Schedule a SQL Server job (with an acceptable delay after the first Oracle job) to read this CSV file and import it into a staging table inside SQL Server using BULK INSERT, as shown in the sketch after these steps.
The last part of the SQL Server job reads the staging table data and applies the logic (insert/update the target table). I suggest having another table to store reports of each daily job's result.
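A minimal sketch of the BULK INSERT step; the file path, staging table name and delimiters are assumptions:
BULK INSERT dbo.Inventory_Staging
FROM 'D:\etl\inventory_export.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2);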
I am replacing an Access application with a web app, but the client is using SQL Server 2000, and I am using SQL Server 2008.
So, I have the database redesigned, with foreign keys, but now I need to get the data on the client's system.
Part of the problem is that they have images that are over 32k, so osql failed as the command buffer filled up.
I should be able to use osql to import the new schema at least, and perhaps all of the data except for the images.
The Export wizard just wouldn't work, even though I tried the Native SQL driver and the OLE DB SQL driver.
Flat files seem like a bad choice, as I don't know if they can handle the images.
So, what is a good way to copy a 330M database from 2008 -> 2000?
Not sure about performance or time needed, but you could always try a tool like
Red-Gate SQL Compare / SQL Data Compare
Apex SQL Diff / SQL Data Diff
These will allow you to compare both the schema and the data of two databases, and allow you to create synchronization scripts or synchronize online.
Marc
I set the image column to null, which reduced the size of the insert statements.
This enabled me to import the data into the target database.