Data migration between different DBMSs

As I couldn't get a satisfying answer to my question, it seems we have to write our own program for this. We are in the design phase and are considering which format to use to back up the data.
The program will be written in Delphi.
We need to export/import data between Oracle, Informix, and MS SQL Server. Performance is very important here, as this program will run on 1-2 GB databases. Besides the normal data, there are BLOBs in the database which have to be backed up.
We thought of XML or comma-separated data, as both are transparent (which is nice to have), but BLOBs must be considered here. The Paradox format is not an option in this case.
Can anybody recommend some performant formats?
Any other ideas to achieve the same goal are welcome.
Thanks in advance.

I use an excellent program called OmegaSync for my backups, but it will only handle Informix via ODBC and not directly. If you find you can use OmegaSync, you'll find its performance to be excellent, because it compares the databases first and then syncs only the differences. You might want to use this idea if you decide to do the programming yourself and efficiency is your number one goal.
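If you do roll your own, the compare-then-sync idea can be sketched in JDBC terms like this. The table, columns, connection strings, and credentials are all hypothetical; a real implementation would page through keys instead of loading them all, handle BLOBs, and issue UPDATEs and DELETEs as well:

import java.sql.*;
import java.util.*;

// Minimal sketch: copy only the rows that differ between two databases.
// Assumes a hypothetical table CUSTOMER(ID primary key, NAME, CITY).
public class DiffSync {
    public static void main(String[] args) throws SQLException {
        try (Connection src = DriverManager.getConnection("jdbc:informix-sqli://srchost/db", "user", "pw");
             Connection dst = DriverManager.getConnection("jdbc:oracle:thin:@dsthost:1521:orcl", "user", "pw")) {

            // Build a key -> row-hash map for the target database.
            Map<Long, Integer> targetHashes = new HashMap<>();
            try (Statement st = dst.createStatement();
                 ResultSet rs = st.executeQuery("SELECT id, name, city FROM customer")) {
                while (rs.next()) {
                    targetHashes.put(rs.getLong(1), Objects.hash(rs.getString(2), rs.getString(3)));
                }
            }

            // Walk the source and copy only the rows that are new or changed.
            try (Statement st = src.createStatement();
                 ResultSet rs = st.executeQuery("SELECT id, name, city FROM customer");
                 PreparedStatement ins = dst.prepareStatement(
                     "INSERT INTO customer (id, name, city) VALUES (?, ?, ?)")) {
                while (rs.next()) {
                    long id = rs.getLong(1);
                    int hash = Objects.hash(rs.getString(2), rs.getString(3));
                    Integer existing = targetHashes.get(id);
                    if (existing == null) {           // missing in target -> copy it over
                        ins.setLong(1, id);
                        ins.setString(2, rs.getString(2));
                        ins.setString(3, rs.getString(3));
                        ins.addBatch();
                    } else if (existing != hash) {
                        // changed row: a real sync would issue an UPDATE here
                    }
                }
                ins.executeBatch();
            }
        }
    }
}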
But programming database conversion is very complex, as other answers to your question have said. So why not just develop the SQL you need and do the conversion that way? For example, see: Convert Informix Schema to Oracle Schema Or Any Other RDBMS. For moving the data, check out sources like: Moving non-Informix data between computers and dbspaces.
You can optimize the SQL to what I'm sure will be an adequate speed if you dump and load your data smartly.

DbUnit is a popular tool which can extract and load data in XML format, see
http://www.dbunit.org/faq.html#extract
import java.io.FileOutputStream;
import org.dbunit.database.QueryDataSet;
import org.dbunit.dataset.IDataSet;
import org.dbunit.dataset.xml.FlatXmlDataSet;

// partial database export ("connection" is an open DbUnit IDatabaseConnection)
QueryDataSet partialDataSet = new QueryDataSet(connection);
partialDataSet.addTable("FOO", "SELECT * FROM TABLE WHERE COL='VALUE'");
partialDataSet.addTable("BAR");
FlatXmlDataSet.write(partialDataSet, new FileOutputStream("partial.xml"));

// full database export
IDataSet fullDataSet = connection.createDataSet();
FlatXmlDataSet.write(fullDataSet, new FileOutputStream("full.xml"));
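For the reverse direction, loading such an XML export back into a database looks roughly like this (a sketch; the builder class is from DbUnit 2.4+, so check your version):

import java.io.FileInputStream;
import org.dbunit.dataset.IDataSet;
import org.dbunit.dataset.xml.FlatXmlDataSetBuilder;
import org.dbunit.operation.DatabaseOperation;

// load a previously exported data set back into the database
IDataSet dataSet = new FlatXmlDataSetBuilder().build(new FileInputStream("full.xml"));
DatabaseOperation.CLEAN_INSERT.execute(connection, dataSet);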

Did you check ODI (Oracle Data Integrator)? It has support for lots of source databases. It is able to capture changes from the source databases and integrate them into the target database. It is performant, but has a price tag.
Ronald.

The new dbExpress framework gives you the possibility of exporting/importing data between many databases. You can check the CodeRage session Deep Dive into dbExpress by John Kaster.

You could use your own binary format, combining XML for the text data with raw streams for the BLOBs.

If you have to export metadata too, and not only data, it could be very complex. There are many subtle (and not-so-subtle) differences among the databases you're going to use, so such a format would have to be general enough, and the exporting/importing code would have to translate and map metadata across databases. And because an external application can't write directly to a database's internal structures, it would have to generate the proper DDL to create the data structures.
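To give an idea of what that metadata step involves, here is a deliberately naive sketch that reads column metadata over JDBC and emits a CREATE TABLE statement. Real cross-database mapping (type names, lengths, constraints, BLOB types) is far more involved:

import java.sql.*;

// Naive sketch: generate CREATE TABLE DDL from JDBC metadata.
// Real code would translate type names per target DBMS and handle keys, indexes, etc.
public class DdlSketch {
    public static String createTableDdl(Connection con, String table) throws SQLException {
        StringBuilder ddl = new StringBuilder("CREATE TABLE " + table + " (");
        DatabaseMetaData md = con.getMetaData();
        try (ResultSet cols = md.getColumns(null, null, table, "%")) {
            boolean first = true;
            while (cols.next()) {
                if (!first) ddl.append(", ");
                ddl.append(cols.getString("COLUMN_NAME"))
                   .append(' ')
                   .append(cols.getString("TYPE_NAME")); // source type name: must be mapped for the target DB
                first = false;
            }
        }
        return ddl.append(')').toString();
    }
}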
As long as this is a proprietary format, IMHO its design is the least of your issues; if size and performance are important and the file is read sequentially, it would not be difficult to design a binary format.
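For the data itself, a sequential, length-prefixed record layout is one simple possibility. A minimal sketch, where the two-field row layout is made up purely for illustration:

import java.io.*;
import java.nio.charset.StandardCharsets;

// Minimal sketch of a sequential binary export format:
// every field is written length-prefixed, so BLOBs need no escaping or delimiters.
public class BinaryExport {
    public static void writeRow(DataOutputStream out, String name, byte[] blob) throws IOException {
        byte[] text = name.getBytes(StandardCharsets.UTF_8);
        out.writeInt(text.length);   // length prefix for the text field
        out.write(text);
        out.writeInt(blob.length);   // length prefix for the BLOB field
        out.write(blob);
    }

    public static void readRow(DataInputStream in) throws IOException {
        byte[] text = new byte[in.readInt()];
        in.readFully(text);
        byte[] blob = new byte[in.readInt()];
        in.readFully(blob);
        // ... hand the fields to the loader for the target database
    }
}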
Anyway, imports/exports and backups are two different tasks. If you have to back up a database, use its own facilities: they usually allow far more control, e.g. point-in-time recovery. If you have to move data across databases, that's another issue; I would write just the code to move the data, not the metadata, pre-creating the required structure in the target database.

You could give Toad (Quest Software) a try.
It supports all your mentioned platforms and can do things like 'Export table data to INSERT statements' on your source platform which can then be run on the target platform.
IIRC there is even some Toad-internal backup-format which might be cross-platform.
Toad Communities:
Toad for ORACLE
Toad for SQL SERVER
Toad for OTHER RDBMS (including Informix)
Some videos about exporting, importing:
YouTube: Toad for Data Analysts v2.7 Export Enhancements
YouTube: Toad for Data Analysts v2.7 Import Enhancements

Related

MS Access database with no relations

Can anyone recommend a tool or suggest an approach for dealing with an MS Access database with no relationships between tables?
As part of a data migration project I am creating data mapping definition rules, but it becomes more and more difficult and time-consuming to correctly identify the source tables/fields for extraction.
I have many tables with the same data appearing in different places. Furthermore, as there were no validation rules when data was input, many entries contain spelling errors or generally do not match the expected data type. Most of the tables, however, already have the keys (primary & foreign) created.
I am looking for a quick solution to rebuild the database (*.mdb), ideally with the use of some software which could identify all potential data issues, suggest corrections, allow for adjustments, and finally leave me with a fully relational database where the data can easily be identified and is not scattered all over the place.
I have some general knowledge of databases and SQL but haven't used Access much before, so I'm trying to save myself some time. And - if it matters - I don't care about database performance at all... Only the data itself. I will be extracting it to *.csv files later anyway...
Comments, suggestions and/or other considerations will be appreciated.
Thanks in advance
J.
I don't believe there is any software that will analyze an Access database and use some kind of artificial intelligence to generate a new database with good data and strong relationships.
My recommendation though is to export all the data into SQL Server (or even MySQL) and then work with it there. It's much easier to manipulate the data with a real query language instead of trying to scrub data in Access.
You can do mass updates, comparisons, joins, etc. with SQL Server. You can query the schema easily (write queries to see if a field appears in a table), change schemas/table definitions with code, etc.
Then once you're done you can use jobs (SSIS) to export the data to CSV.
(You can download SQL Express if you don't have/can't afford SQL Server.)
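As a sketch of the "query the schema" idea mentioned above: once the data is in SQL Server, finding every table that contains a given column is a single query against INFORMATION_SCHEMA. Here it is wrapped in JDBC; the connection string, database name, and column name are invented:

import java.sql.*;

// Sketch: list every table in a SQL Server database that has a column with a given name.
public class FindColumn {
    public static void main(String[] args) throws SQLException {
        try (Connection con = DriverManager.getConnection(
                 "jdbc:sqlserver://localhost;databaseName=migrated", "user", "pw");
             PreparedStatement ps = con.prepareStatement(
                 "SELECT TABLE_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE COLUMN_NAME = ?")) {
            ps.setString(1, "CustomerName"); // hypothetical column we are hunting for
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getString(1)); // each table containing the column
                }
            }
        }
    }
}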

Practical Implementation for Data Warehouse

Data warehousing seems to be a big trend these days, and is very interesting to me. I'm trying to acquaint myself with its concepts, and am having a problem "seeing the forest for the trees", because all of the data warehouse models and descriptions I can find online are theoretical and don't give examples with actual technologies being used. I'm a contextual learner, so abstracted, theoretical explanations don't really help me out all that much.
Now there seem to be many "data warehousing models", but all of them seem to have some similar characteristics. There is usually an "ODS" (operational data store) that aggregates data from multiple sources into the same place. A process known as "ETL" then converts data in this ODS into a "data vault", and again into "data marts" and/or "strategy marts".
Can someone provide an example of the technologies that would be used for each of these components (ODS, ETL, data vault, data/strategy marts)?
It sounds like the ODS could just be any ordinary database, but the data vault seems to have some special things going on, because these "marts" pull data from it.
ETL is the biggest thing I'm choking on by far. Is this a language? A framework? An algorithm?
I think once I see a concrete example of what's going on at each step of the way, I'll finally get it. Thanks in advance!
ETL is a process. The abbreviation stands for Extract-Transform-Load, which describes what is being done with the data during the process. The process can be implemented anywhere you need to create a bridge between two systems with different data formats. First, you pull (extract) data from a source system (database, flat files, web service, etc.). Then the data is processed (transformed) to comply with the format of a target storage (again, this can vary: databases, files, API calls). During the transform step, further actions can be performed on the data set, such as enrichment with data from other sources, cleansing, and improving its quality. The last step is loading the transformed data into the target storage.
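To make the three steps concrete, here is a toy JDBC sketch of one ETL cycle. The connection strings, table names, and the currency-conversion "transform" are all invented for illustration; real tools add parallelism, error handling, lookups, and much more:

import java.sql.*;

// Toy ETL: extract rows from a source, transform them, load them into a target.
public class ToyEtl {
    public static void main(String[] args) throws SQLException {
        try (Connection src = DriverManager.getConnection("jdbc:mysql://srchost/sales", "user", "pw");
             Connection dst = DriverManager.getConnection("jdbc:postgresql://dwh/warehouse", "user", "pw");
             Statement extract = src.createStatement();
             ResultSet rs = extract.executeQuery("SELECT customer, amount FROM orders"); // Extract
             PreparedStatement load = dst.prepareStatement(
                 "INSERT INTO fact_orders (customer, amount_eur) VALUES (?, ?)")) {
            while (rs.next()) {
                // Transform: trim dirty input and convert the currency (made-up rate).
                String customer = rs.getString(1).trim();
                double amountEur = rs.getDouble(2) * 0.92;
                load.setString(1, customer);
                load.setDouble(2, amountEur);
                load.addBatch();
            }
            load.executeBatch(); // Load
        }
    }
}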
Typically, an ETL process is employed for loading a data warehouse, migrating data from one system or database to another when moving from a legacy system to a new one, or synchronizing data between two or more systems. It is also used as an intermediate layer in broader MDM and BI solutions.
In terms of specific software, there are many ETL tools on the market, ranging from robust solutions from big players such as Informatica, IBM DataStage, and Oracle Data Integrator, to more affordable and open-source providers such as CloverETL, Talend, or Pentaho. Most of these tools offer a GUI where the flow and processing of data is defined through diagrams.
For Microsoft SQL Server 2005 and later the ETL tool is called SSIS (SQL Server Integration Services). If you install at least the Standard version of the SQL Server you get the Business Intelligence Developer Studio with which you can design your data flows. Basically what an ETL tool does is take data from one or more sources (tables, flat files, ...) then transform it (add columns, join, filter, map to different data types, etc.) and finally store it again to one or more tables or files.
To get a basic understanding of how something works you can watch e.g. this video or this one (both from midnightdba). They're a bit lengthy, but you get an idea. They certainly helped me in understanding the basic functionality of an ETL tool.
Unfortunately I have not yet dug into other platforms or tools.
I'd highly recommend checking out some of the books by Ralph Kimball and Margy Ross (The Data Warehouse Toolkit, The Data Warehouse Lifecycle Toolkit) for an introduction to data warehousing.
My company's data warehouse is built using the Oracle Warehouse Builder tool for ETL. The OWB is a GUI tool that generates PL/SQL code on the database to manipulate the data. After manipulation and cleansing, the data is published to an Oracle datamart. The datamart is a database instance that users access for ad-hoc querying via Oracle Discoverer (Java software).

How to build a big and complex database in SQL - the easy way?

I have installed Oracle XE. I build a small database every day to practice from the command prompt, but now I want to have more. I want to have a bigger database with a lot of different data to practice and do exercises with.
So, is it possible to get a big data file from somewhere and load it into an XE database?
You can't get 'big' data for Oracle Express Edition, as it is limited to 4 GB (10g) or 10 GB (11g).
That said, there are public datasets available. Personally, I like the FAA data on registered aircraft owners/operators.
As you are practicing with Oracle, perhaps a good solution (which will also generate exactly the data you need) would be to write your own stored procedures to generate your data in a loop (or similar construct).
You could then generate as much as you like whilst also practicing your handling of large datasets and writing of efficient PL/SQL and SQL code.
This way your data will match your current database structure too without having to build a new database matching whichever dataset you download from the web.
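As a minimal sketch of that idea, an anonymous PL/SQL block fired from JDBC can fill a hypothetical practice table with a million synthetic rows (the table, columns, and connection string are all invented):

import java.sql.*;

// Sketch: generate a million practice rows with a PL/SQL loop.
public class GenerateData {
    public static void main(String[] args) throws SQLException {
        String plsql =
            "BEGIN " +
            "  FOR i IN 1 .. 1000000 LOOP " +
            "    INSERT INTO practice_orders (id, customer, amount) " +
            "    VALUES (i, 'CUST_' || MOD(i, 5000), MOD(i, 997) + 0.99); " +
            "  END LOOP; " +
            "  COMMIT; " +
            "END;";
        try (Connection con = DriverManager.getConnection(
                 "jdbc:oracle:thin:@localhost:1521:XE", "user", "pw");
             Statement st = con.createStatement()) {
            st.execute(plsql); // the loop runs entirely inside the database
        }
    }
}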
IIRC there are sample schemas, such as HR, that can be enabled. See this.

Transferring data between different DBMS's

I would like to transfer the whole database I have in Informix to Oracle. We have an application which works on both databases; one of our customers is moving from Informix to Oracle and needs to transfer the whole database to Oracle (the structure is the same).
We often need to transfer data between Oracle/MSSQL/Informix, sometimes only one table and not the whole database.
Does anybody know of a good program which does this kind of job?
The Pentaho Data Integration ETL tools (formerly known as "Kettle") are available as open source for cross-database migration and many other use cases.
From their data sheet:
Common Use Cases
Data warehouse population with built-in support for slowly changing dimensions, junk dimensions
Export of database(s) to text-file(s) or other databases
Import of data into databases, ranging from text-files to Excel sheets
Data migration between database applications
...
A list of input / output data formats can be found in the accepted answer of this question: Does anybody know the list of Pentaho Data Integration (Kettle) connectors list?
It supports all databases with a JDBC driver, which means most of them.
Check this question of mine, it includes some very good ideas: Searching for (freeware) database migration tool
You could give the Oracle Migration Workbench a try; see http://download.oracle.com/docs/html/B15858_01/toc.htm. If you want to read Informix data into Oracle on a regular basis, using Heterogeneous Services might be a better option. Check for hs4odbc or dg4odbc, depending on the Oracle release you have.
I hope this helps,
Ronald.
I have done this in the past and it is not a trivial task. We ended up writing each table out to a pipe-delimited flat file and reloading each table into Oracle with Oracle SQL*Loader (a sketch of the dump step appears at the end of this answer). There were a ton of Perl scripts to scrub the source data, and shell scripts to automate the process as much as possible and run things in parallel as well.
Gotchas that can come up:
1. Pick a delimiter that is as unique as possible.
2. Try to find data types that match the Informix ones as closely as possible, e.g. date vs. timestamp.
3. Try to get the data as clean as possible prior to dumping out the flat files.
4. HS will most likely be too slow.
This was done years ago. You may want to investigate GoldenGate (now owned by Oracle), software which may help with the process (GG did not exist when I did it).
Another idea is to use an ETL tool to read Informix and dump the data into Oracle (Informatica comes to mind).
Good luck :)
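For reference, the dump step of that approach looks roughly like this in JDBC terms (pipe delimiter, hypothetical table and connection string; the scrubbing that made up most of the real work is omitted):

import java.io.*;
import java.sql.*;

// Sketch: dump one table to a pipe-delimited flat file for loading with SQL*Loader.
public class TableDump {
    public static void main(String[] args) throws Exception {
        try (Connection con = DriverManager.getConnection("jdbc:informix-sqli://host/db", "user", "pw");
             Statement st = con.createStatement();
             ResultSet rs = st.executeQuery("SELECT * FROM customer");
             PrintWriter out = new PrintWriter(new BufferedWriter(new FileWriter("customer.dat")))) {
            int cols = rs.getMetaData().getColumnCount();
            while (rs.next()) {
                StringBuilder line = new StringBuilder();
                for (int i = 1; i <= cols; i++) {
                    if (i > 1) line.append('|');     // the chosen delimiter
                    String v = rs.getString(i);
                    line.append(v == null ? "" : v); // real code would scrub/escape the value here
                }
                out.println(line);
            }
        }
    }
}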
sqlldr - Oracle's import utility
Here's what I did to transfer 50 TB of data from MySQL to Oracle: I generated CSV files from MySQL and used the sqlldr utility in Oracle to load all the data from the files into the Oracle DB. It is the fastest way to import data. I researched this for a few weeks and ran a lot of benchmark test cases, and sqlldr is hands down the best and fastest way to import into Oracle.

Are heterogeneous database systems used in practice?

I was probing around a bit in the realm of databases and hit the notion of having heterogeneous databases. I googled and found this - link text
My question is: what kind of scenario would put this into practice, and is it really useful? Is it just another thing which was thought about but never implemented, or, if it was implemented, has it been restricted to a very niche area?
cheers
I would say yes, very much so. One implementation I am familiar with is integrating MAS90 with an LOB production system. The data is duplicated in both but accessed and used in different ways.
I've worked on a heterogeneous system before. It's a commercial system to manage study abroad programs for large universities, and they had installations on Oracle, MySql, and Sql Server. I was an outside consultant handling a very specific conversion project, though, so I didn't get to see many of the issues involved in making it work well everywhere.
I do remember that the single biggest hurdle I had to deal with was Oracle's lack of a simple autoincrement-style column and having to set up separate sequences instead. There were a number of datatype mismatches as well, but there was a pretty good system in place to just map those.
Note that even here, each customer only had one kind of database. We didn't have to worry about replicating data itself between db types (aside from a few common lookup tables). Just structure.
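For the record, the classic workaround (before Oracle added identity columns in 12c) is a sequence plus a before-insert trigger. A sketch over JDBC, with hypothetical table, sequence, and connection names:

import java.sql.*;

// Sketch: emulate an autoincrement column in Oracle with a sequence + trigger.
public class EmulateAutoIncrement {
    public static void main(String[] args) throws SQLException {
        try (Connection con = DriverManager.getConnection("jdbc:oracle:thin:@db:1521:orcl", "user", "pw");
             Statement st = con.createStatement()) {
            st.execute("CREATE SEQUENCE orders_seq");
            // Fill ORDERS.ID from the sequence on every insert.
            st.execute(
                "CREATE OR REPLACE TRIGGER orders_bi BEFORE INSERT ON orders FOR EACH ROW " +
                "BEGIN " +
                "  SELECT orders_seq.NEXTVAL INTO :NEW.id FROM dual; " +
                "END;");
        }
    }
}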
Different departments in your company might use different databases. I pull data from and push data to the following:
SQL Server
Oracle
Sybase IQ
Access
MySQL
FoxPro
Flat files
Excel files
The SQL Server database is the repository of all the data, but it pulls from many different databases to populate data, and then data is pushed out to different databases for departmental use.
