Sync, not import, CSV files with SQLite

In MySQL, I have observed that under the table options the storage engine can be configured as CSV. This gives me a nice clean CSV file in the database directory. The file doesn't come with column names, but that isn't a deal breaker.
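For concreteness, this is roughly what I set up on the MySQL side (a hedged sketch; the table and column names are just an example, and note the CSV engine requires every column to be NOT NULL):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class CsvEngineTable {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost/mydb", "user", "password");
             Statement stmt = conn.createStatement()) {
            // ENGINE=CSV keeps the rows in <datadir>/mydb/samples.CSV as plain text
            stmt.executeUpdate(
                "CREATE TABLE samples ("
                + "  taken_at DATETIME    NOT NULL,"
                + "  label    VARCHAR(32) NOT NULL,"
                + "  reading  DOUBLE      NOT NULL"
                + ") ENGINE=CSV");
        }
    }
}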
Can SQLite be configured to point to or sync with a CSV file? I am not just trying to import data from a CSV file into an SQLite database.
If this is not possible in SQLite, I welcome alternative suggestions.
Big picture: this is a portable database that deals with tables of radically different sizes. Some may have 10 lines, others a few hundred, others a few tens of thousands. Because of the nature of the data, most tables are manually maintained, and this is best suited by spreadsheet-like interfaces. Some are the result of an automated process, but in general they still require some manual characterization in a few columns to join (via SQL) to the other tables.
Previously I did all of this in Microsoft Access, but because I now need a cross-platform, open-source approach, I am exploring alternatives. I have had reasonable productivity with MySQL, but I would like something a little smaller, simpler, and more portable for my users.

Related

Data migration between different DBMS's

As I couldn't get any satisfying answer to my question, it seems we have to write our own program for this. We are in the design phase, and we are thinking about which format we should use to back up the data.
The program will be written in Delphi.
What's needed is exporting/importing data between Oracle, Informix, and MS SQL Server. Performance is very important here, as this program will run on 1-2 GB databases. Besides the normal data there are BLOBs in the database which have to be backed up.
We thought of XML data or comma-separated data, as both are transparent (which is nice to have), but BLOBs must be considered here. The Paradox format is not optimal in this case.
Can anybody recommend some performant formats?
Any other ideas to achieve the same goal are welcome.
Thanks in advance.
I use an excellent program called OmegaSync for my backups, but it will only handle Informix via ODBC, not directly. If you find you can use OmegaSync, you'll find its performance to be excellent, because it compares the databases first and then syncs only the differences. You might want to borrow this idea if you decide to do the programming yourself and efficiency is your number one goal.
But programming database conversion is very complex, as other answers to your question have said. So why not just develop the SQL you need and do the conversion that way? For example, see: Convert Informix Schema to Oracle Schema Or Any Other RDBMS. For moving the data, check out sources like: Moving non-Informix data between computers and dbspaces.
You can optimize the SQL to what I'm sure will be an adequate speed if you dump and load your data smartly.
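To make the compare-first idea concrete, here is a very reduced JDBC sketch of such a sync core; the table and column names (t, id, payload) are invented, and a real tool would batch the resulting statements and handle composite keys and type differences:

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Objects;

public class DiffSync {
    // Merge-compare source and target rows ordered by primary key and print
    // what a sync would need to do; only the differences ever get touched.
    public static void diff(Connection src, Connection dst) throws SQLException {
        String q = "SELECT id, payload FROM t ORDER BY id";
        try (Statement sa = src.createStatement(); Statement sb = dst.createStatement();
             ResultSet a = sa.executeQuery(q); ResultSet b = sb.executeQuery(q)) {
            boolean hasA = a.next(), hasB = b.next();
            while (hasA || hasB) {
                long ka = hasA ? a.getLong("id") : Long.MAX_VALUE;
                long kb = hasB ? b.getLong("id") : Long.MAX_VALUE;
                if (ka < kb) {                 // row only in source -> would INSERT
                    System.out.println("INSERT id=" + ka);
                    hasA = a.next();
                } else if (kb < ka) {          // row only in target -> would DELETE
                    System.out.println("DELETE id=" + kb);
                    hasB = b.next();
                } else {                       // row in both -> compare the contents
                    if (!Objects.equals(a.getString("payload"), b.getString("payload")))
                        System.out.println("UPDATE id=" + ka);
                    hasA = a.next();
                    hasB = b.next();
                }
            }
        }
    }
}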
DbUnit is a popular tool which can extract and load data in XML format, see
http://www.dbunit.org/faq.html#extract
import java.io.FileOutputStream;
import org.dbunit.database.IDatabaseConnection;
import org.dbunit.database.QueryDataSet;
import org.dbunit.dataset.IDataSet;
import org.dbunit.dataset.xml.FlatXmlDataSet;

// partial database export: the rows of FOO matching the query, plus all of BAR
// ("connection" is a DbUnit IDatabaseConnection wrapping a JDBC connection)
QueryDataSet partialDataSet = new QueryDataSet(connection);
partialDataSet.addTable("FOO", "SELECT * FROM FOO WHERE COL='VALUE'");
partialDataSet.addTable("BAR");
FlatXmlDataSet.write(partialDataSet, new FileOutputStream("partial.xml"));

// full database export: every table the connection can see
IDataSet fullDataSet = connection.createDataSet();
FlatXmlDataSet.write(fullDataSet, new FileOutputStream("full.xml"));
Did you check ODI (Oracle Data Integrator)? It has support for lots of source databases. It is able to capture changes from the source databases and integrate them into the target database. It is performant, but has a price tag.
Ronald.
The new dbExpress framework offers the possibility of exporting/importing data between many databases. You can check this CodeRage session: Deep Dive into dbExpress by John Kaster.
You could use your own binary format, combining XML for the text data with raw streams for the BLOBs.
If you have to export metadata too, and not only data, it could be very complex. There are many subtle (and not-so-subtle) differences among the databases you're going to use, so such a format would have to be general enough, and the exporting/importing code would have to translate and map metadata across databases; and because an external application can't write directly to a database's internal structures, it would have to generate the proper DDL to create the data structures.
As long as this is a proprietary format, IMHO its design is the least of your issues: if size and performance are important and the file is read sequentially, it would not be difficult to design a binary format.
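To gauge that effort, here is a minimal sketch (all names invented) of a sequential, length-prefixed record writer; BLOBs are copied in chunks so they never have to fit in memory:

import java.io.BufferedOutputStream;
import java.io.DataOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class RowWriter {
    private final DataOutputStream out;

    public RowWriter(String path) throws IOException {
        out = new DataOutputStream(new BufferedOutputStream(new FileOutputStream(path)));
    }

    // text fields: 4-byte length prefix, then the UTF-8 bytes
    public void writeText(String value) throws IOException {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    // BLOB fields: 8-byte length prefix, then the stream copied in chunks
    public void writeBlob(InputStream blob, long length) throws IOException {
        out.writeLong(length);
        byte[] buf = new byte[8192];
        for (int n; (n = blob.read(buf)) != -1; ) {
            out.write(buf, 0, n);
        }
    }

    public void close() throws IOException {
        out.close();
    }
}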
Anyway, import/export and backup are two different tasks. If you have to back up a database, use its own facilities: they usually allow far more control, e.g. point-in-time recovery. If you have to move data across databases, that's another issue; I would write just the code to move the data, not the metadata, pre-creating the required structures in the target database.
You could give Toad (Quest Software) a try.
It supports all your mentioned platforms and can do things like 'Export table data to INSERT statements' on your source platform which can then be run on the target platform.
IIRC there is even a Toad-internal backup format which might be cross-platform.
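For a sense of what 'export table data to INSERT statements' amounts to, here is a rough hand-rolled JDBC sketch of the same idea; the quoting is deliberately naive, and a tool like Toad also handles dates, BLOBs, and vendor-specific types:

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Statement;

public class InsertScriptExport {
    // Emit one INSERT statement per row of the given table, ready to be run
    // on the target platform (illustrative only: strings are quoted naively).
    public static void dump(Connection conn, String table) throws SQLException {
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT * FROM " + table)) {
            ResultSetMetaData meta = rs.getMetaData();
            int cols = meta.getColumnCount();
            while (rs.next()) {
                StringBuilder sql = new StringBuilder("INSERT INTO " + table + " VALUES (");
                for (int i = 1; i <= cols; i++) {
                    Object v = rs.getObject(i);
                    if (v == null) {
                        sql.append("NULL");
                    } else if (v instanceof Number) {
                        sql.append(v);
                    } else {
                        sql.append("'").append(v.toString().replace("'", "''")).append("'");
                    }
                    sql.append(i < cols ? ", " : ");");
                }
                System.out.println(sql);
            }
        }
    }
}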
Toad Communities:
Toad for ORACLE
Toad for SQL SERVER
Toad for OTHER RDBMS (including Informix)
Some videos about exporting, importing:
YouTube: Toad for Data Analysts v2.7 Export Enhancements
YouTube: Toad for Data Analysts v2.7 Import Enhancements

Transferring data between different DBMS's

I would like to transfer the whole database I have in Informix to Oracle. We have an application which works on both databases; one of our customers is moving from Informix to Oracle and needs to transfer the whole database (the structure is the same).
We often need to transfer data between Oracle, MS SQL Server, and Informix, sometimes only one table and not the whole database.
Does anybody know about any good program which does this kind of job?
The Pentaho Data Integration ETL tools (also known under their former name, "Kettle") are available as open source and cover cross-database migration and many other use cases.
From their data sheet:
Common Use Cases
Data warehouse population with built-in support for slowly changing dimensions, junk dimensions
Export of database(s) to text file(s) or other databases
Import of data into databases, ranging from text files to Excel sheets
Data migration between database applications
...
A list of input / output data formats can be found in the accepted answer of this question: Does anybody know the list of Pentaho Data Integration (Kettle) connectors list?
It supports all databases with a JDBC driver, which means most of them.
Check this question of mine; it includes some very good ideas: Searching for (freeware) database migration tool
You could give the Oracle Migration Workbench a try; see http://download.oracle.com/docs/html/B15858_01/toc.htm. If you want to read Informix data into Oracle on a regular basis, using Heterogeneous Services might be a better option. Check for hs4odbc or dg4odbc, depending on the Oracle release you have.
I hope this helps,
Ronald.
I have done this in the past, and it is not a trivial task. We ended up writing each table out to a pipe-delimited flat file and reloading each table into Oracle with Oracle SQL*Loader. There were a ton of Perl scripts to scrub the source data, and shell scripts to automate the process as much as possible and run things in parallel as well.
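Purely for illustration (we used Perl at the time), the per-table dump step amounts to something like this JDBC sketch; the table name, delimiter choice, and NULL handling stand in for what the real scripts did:

import java.io.PrintWriter;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;

public class FlatFileDump {
    // Write every row of "table" as one pipe-delimited line, ready for SQL*Loader.
    public static void dump(Connection conn, String table, String outFile) throws Exception {
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT * FROM " + table);
             PrintWriter out = new PrintWriter(outFile)) {
            int cols = rs.getMetaData().getColumnCount();
            while (rs.next()) {
                StringBuilder line = new StringBuilder();
                for (int i = 1; i <= cols; i++) {
                    if (i > 1) line.append('|');     // the chosen delimiter
                    String v = rs.getString(i);
                    line.append(v == null ? "" : v); // empty field for NULL
                }
                out.println(line);
            }
        }
    }
}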
Gotchas that can come up:
1. Pick a delimiter that is as unique as possible.
2. Try to find data types that match the Informix ones as closely as possible, e.g. date vs. timestamp.
3. Try to get the data as clean as possible prior to dumping out the flat files.
4. HS (Heterogeneous Services) will most likely be too slow.
This was done years ago. You may want to investigate the GoldenGate software (now owned by Oracle), which may help with the process (GoldenGate did not exist when I did it).
Another idea is to use an ETL tool to read Informix and dump the data into Oracle (Informatica comes to mind).
Good luck :)
sqlldr - Oracle's import utility
Here's what I did to transfer 50 TB of data from MySQL to Oracle: I generated CSV files from MySQL and used Oracle's sqlldr utility to load all the data from the files into the Oracle DB. It is the fastest way to import data. I researched this for a few weeks, ran a lot of benchmark test cases, and sqlldr is hands down the best and fastest way to import into Oracle.

What nosql database is ideal to use for storing code/snippets?

I want to store code similar to how jsfiddle stores code. I currently use Postgres for my main database, but I'm wondering whether it would be more ideal to use a NoSQL database.
Code snippets for now will have just one author, but in the future there may be multiple authors, and I want the ability to revert as well.
I know there are key/value databases and document-oriented databases. Which specific NoSQL DB would suit my needs? Or should I stick with my Postgres DB?
FYI:
I'm using Django
The users will be permanently stored in Postgres (I'm using OpenID)
You can't choose a non-relational data strategy without defining what you want to do with your data.
Relational database design comes from rules of normalization, which you can apply once you know your data alone. But non-relational database design depends on your queries more than your data.
But without knowing anything else about your application, my first recommendation would be to stick with PostgreSQL. Store your code snippets in text blobs, and metadata about the code (authorship, date, language, project, etc.) in additional columns alongside the text blob. You can also consider using GiST indexes to allow flexible searching.
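A minimal sketch of that layout (table and column names invented); the expression index over a tsvector is one way to get the flexible searching mentioned above:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class SnippetSchema {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/snippets", "user", "password");
             Statement stmt = conn.createStatement()) {
            // the code body as a plain text column, metadata alongside it
            stmt.executeUpdate(
                "CREATE TABLE snippet ("
                + "  id         serial PRIMARY KEY,"
                + "  author_id  integer NOT NULL,"  // points at the existing users table
                + "  language   varchar(32),"
                + "  created_at timestamptz NOT NULL DEFAULT now(),"
                + "  body       text NOT NULL)");
            // GiST expression index over a tsvector of the body for flexible search
            stmt.executeUpdate(
                "CREATE INDEX snippet_body_fts ON snippet"
                + " USING gist (to_tsvector('english', body))");
        }
    }
}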
You might also consider Apache Solr, which is technically similar to a document-oriented DBMS, though it is usually presented as a fulltext search engine.
As for NoSQL databases, the only ones I'm familiar with are XML databases (which don't scale well and have bad concurrency) and local databases such as Paradox, dBase, FoxPro, and Access. I would not recommend any of these.
I think that the idea that it's a NoSQL database should be a smaller factor in your decision. Consider these things instead.
Redundancy. Can you run it on two servers at the same time or does it support failover? (SQL Server, Interbase, Firebird)
Concurrency. Will you host this app on the web? How will it handle 10 concurrent operations? (PostGres, MySql, Interbase, Firebird)
Speed. How long is acceptable for a lookup or post?
Embeddability. Is this a desktop application? An embedded database can make things easier. (Local databases such as Paradox, dBase, FoxPro, Access, Interbase, Firebird or SQLite)
Portability. Desktop apps may run on Mac, Linux, Windows. (SQLite)
Sounds like a relatively uncomplicated application which could be implemented in a traditional relational database or a NoSQL store without too many problems.
However, if you're keeping the user base info in PostgreSQL, it would seem simplest to just stick with that as a single storage method. Using both an SQL database and a NoSQL store adds complexity, makes joining across the datasets hard (so, e.g., you couldn't make a query to do something like ‘list users along with their most recent document'), and makes it impossible to ensure consistency between the two datasets.
What do you get for this trouble? You want versioning. CouchDB will give you revision control, but it's questionable whether you should be using that for UI-level versioning (e.g. because compacting the database will lose your old versions).

Databases for easy comparison

We have an application whose metadata is stored in a database (some tables with relations between them). The metadata can be edited through the web app or by directly manipulating values in the SQL Server database.
The problem: the metadata changes and needs to be merged between different environments (test, staging, production, etc.). There are tools (e.g. RedGate's) that help, but it is still quite a lot of work to compare databases if autogenerated IDs are being used (as is the case in our DB now; and yes, one way is to use natural keys to make comparison easier).
However, our metadata does not necessarily have to be stored in an SQL database; it could be stored as documents in a NoSQL database (MongoDB, CouchDB, RavenDB) or even a simple XML database (maybe Berkeley DB XML?). Storing it as an XML file seems like it would work (it is easier to compare and merge files than databases) but may not be a good option, as we need some concurrency mechanisms and some degree of transaction support.
We do not need replication to other servers, there is no need for high availability, etc.
The requirements to store data:
some kind of ACID
Should run on Windows
Easy comparison (bi-directional sync)
(optional) GUI to see what is in database
(optional) export to file (JSON, XML)
What are the options?
Why conflate the storage with the representation you are performing the diff on?
I'd keep everything in SQL, but when it came time to compare, select all the important data (not the IDs) into an XML format and use an XML differencing tool (or a CSV format and a plain-text comparer).
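A reduced sketch of that extraction step, with an invented metadata table: select the business columns, skip the autogenerated IDs, order by a natural key so the files from two environments line up, and write rows as XML for an ordinary diff tool (real code would also XML-escape the values):

import java.io.PrintWriter;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;

public class XmlExtract {
    // Export the comparable columns of a metadata table; the autogenerated
    // ID column is deliberately left out so the diff only shows real changes.
    public static void export(Connection conn, String outFile) throws Exception {
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(
                 "SELECT name, category, value FROM metadata ORDER BY name");
             PrintWriter out = new PrintWriter(outFile, "UTF-8")) {
            out.println("<metadata>");
            while (rs.next()) {
                out.printf("  <entry name=\"%s\" category=\"%s\">%s</entry>%n",
                        rs.getString("name"), rs.getString("category"),
                        rs.getString("value"));
            }
            out.println("</metadata>");
        }
    }
}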
I have never used it, but CouchDB has built-in support for bidirectional syncing between DBs.

Are heterogeneous database systems used in practice?

I was probing around a bit in the realm of databases and hit upon the notion of heterogeneous databases. I googled and found this - link text
My question is: what kind of scenario would put this into practice, and is it really useful? Is it just another thing which was thought about but not implemented, or, where it was implemented, did it get restricted to a very niche area?
cheers
I would say yes, very much so. One implementation I am familiar with is integrating MAS90 with an LOB production system. The data is duplicated in both but accessed and used in different ways.
I've worked on a heterogeneous system before. It's a commercial system to manage study-abroad programs for large universities, and they had installations on Oracle, MySQL, and SQL Server. I was an outside consultant handling a very specific conversion project, though, so I didn't get to see many of the issues involved in making it work well everywhere.
I do remember that the single biggest hurdle I had to deal with was Oracle's lack of a simple autoincrement-style column and having to set up separate sequences instead. There were a number of datatype mismatches as well, but there was a pretty good system in place to just map those.
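For anyone hitting the same hurdle, the classic workaround looked roughly like this: a sequence plus a before-insert trigger (all names invented here; recent Oracle releases have identity columns that make this unnecessary):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class OracleAutoIncrement {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@//localhost:1521/ORCL", "user", "password");
             Statement stmt = conn.createStatement()) {
            // a sequence stands in for the autoincrement counter
            stmt.execute("CREATE SEQUENCE orders_seq");
            // a before-insert trigger fills the key, emulating autoincrement
            stmt.execute(
                "CREATE OR REPLACE TRIGGER orders_bi "
                + "BEFORE INSERT ON orders FOR EACH ROW "
                + "BEGIN "
                + "  SELECT orders_seq.NEXTVAL INTO :NEW.id FROM dual; "
                + "END;");
        }
    }
}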
Note that even here, each customer only had one kind of database. We didn't have to worry about replicating data itself between db types (aside from a few common lookup tables). Just structure.
Different departments in your company might use different databases. I pull data from, and push data to, the following:
SQL Server
Oracle
Sybase IQ
Access
MySQL
FoxPro
Flat files
Excel files
The SQL Server database is the repository of all the data, but it pulls from many different databases to populate data, and then data is pushed to different databases for departmental use.

Resources