Theory of how to sync two databases

What steps should I take if I'd like to synchronize two databases, e.g. every 15 minutes?
What practical advice can you give me if I'd like to sync a MySQL and an MSSQL database?

The theory behind true replication (something like MySQL to MySQL) is very complicated and difficult. I wouldn't recommend trying to implement something like that for MySQL to SQL Server.
Some things to look at:
Look at Mule ESB (http://www.mulesoft.org/). You can get off the ground pretty fast with JDBC connections to MySQL and SQL Server. Then it's just a matter of how often you want to poll one endpoint to push to another endpoint. (For example, poll MySQL every 15 minutes and take the results and write to SQL Server; a rough sketch of this kind of polling job appears after this list.)
You can write your own syncing program. Maybe export data from one system every 15 minutes and write to the file system. Have another program watch that directory and import anything it sees. (Disadvantage is you have to touch the disk.)
To be really creative, you can write triggers in MySQL and SQL Server that fire an external process to send data. That way when a record gets touched, it will send off a message in near real time to the other database.
Try to make the schemas the same. MySQL and SQL Server share many of the same data types, so definitely try not to use data types that are specific to one of the two databases. (For example, I don't believe MySQL supports the "xml" data type. But maybe I'm wrong?)
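To illustrate the polling approach from the first two points above, here is a minimal sketch of a DIY sync job in Python. It assumes a hypothetical products table with a last_modified timestamp column, plus the pymysql and pyodbc drivers; the names and connection strings are placeholders, not anything prescribed by either database.

```python
# Minimal polling sync sketch: every 15 minutes, copy rows changed since the
# last run from MySQL to SQL Server. Table/column names, credentials, and the
# presence of a last_modified column are assumptions for illustration.
import datetime
import time

import pymysql
import pyodbc

POLL_SECONDS = 15 * 60  # every 15 minutes


def sync_once(watermark):
    src = pymysql.connect(host="mysql-host", user="sync", password="secret",
                          database="appdb")
    dst = pyodbc.connect("DRIVER={ODBC Driver 17 for SQL Server};"
                         "SERVER=mssql-host;DATABASE=appdb;UID=sync;PWD=secret")
    try:
        with src.cursor() as cur:
            cur.execute("SELECT id, name, price, last_modified FROM products "
                        "WHERE last_modified > %s", (watermark,))
            rows = cur.fetchall()

        dcur = dst.cursor()
        for row_id, name, price, last_modified in rows:
            # Poor man's upsert: try an UPDATE first, INSERT if nothing matched.
            dcur.execute("UPDATE dbo.products SET name = ?, price = ?, "
                         "last_modified = ? WHERE id = ?",
                         name, price, last_modified, row_id)
            if dcur.rowcount == 0:
                dcur.execute("INSERT INTO dbo.products "
                             "(id, name, price, last_modified) VALUES (?, ?, ?, ?)",
                             row_id, name, price, last_modified)
            watermark = max(watermark, last_modified)
        dst.commit()
    finally:
        src.close()
        dst.close()
    return watermark


if __name__ == "__main__":
    watermark = datetime.datetime(1970, 1, 1)
    while True:
        watermark = sync_once(watermark)
        time.sleep(POLL_SECONDS)
```

Deleted rows are not handled here; covering those is exactly where you end up needing soft deletes or a trigger-maintained change table, as in the third point above.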

Related

How to sync SQL Server data to MongoDB

I have a table in SQL Server, and I have to sync the data to MongoDB for read/write separation. The table is inserted into, updated, and deleted from so often that SQL Server cannot be taken offline for the sync.
Is there an experienced way to implement that?
A proper answer to that is probably beyond the scope of the question as posed, but things like workload (lots of data? heavy volume? network topology?), SLAs (how fast, how often, how foolproof, can you have downtime?) and synchronicity (see Brewer's CAP theorem) will dramatically change what is or isn't a reasonable approach.
A naive answer might be as simple as "write the data to a CSV once a day, then feed it into Mongo via a script".
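For what it's worth, the naive version really can be that short. A sketch, assuming the nightly export has already produced a CSV file and using the pymongo driver; the file name, database, and collection names are placeholders:

```python
# Naive daily load: read the CSV produced by the SQL Server export and reload
# the MongoDB collection from scratch. Names and paths are placeholders.
import csv

from pymongo import MongoClient

collection = MongoClient("mongodb://localhost:27017")["reporting"]["orders"]

with open("daily_export.csv", newline="") as f:
    docs = list(csv.DictReader(f))  # one dict per row, keyed by the CSV header

collection.delete_many({})          # naive full refresh: wipe, then reload
if docs:
    collection.insert_many(docs)
```

Note that DictReader leaves every value as a string, so anything numeric or date-like would need converting before the data is useful for reporting.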
On the other side of the coin, there are ETL tools and libraries out there which I'm sure specialize in moving lots and lots of data as quickly as possible between two storage engines. In fact, I believe there's an ODBC driver for MongoDB, and ODBC is a standard you can find in products from Microsoft, Oracle, open source, and everything in between.
A happy middle ground might just be a lightweight application or script whose job is to fire off documents one by one (or in batches) from a queue until it's empty. The app reads from SQL, writes to Mongo, and handles any distributed transaction logic you may wish to impose.
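As a concrete (and entirely illustrative) example of that middle ground, a small Python worker using pyodbc and pymongo might look like this; the modified_at column, table name, and connection strings are assumptions, not anything from the original setup:

```python
# Read changed rows from SQL Server in batches and upsert them into MongoDB.
# Re-running is safe because ReplaceOne(..., upsert=True) is idempotent.
import pyodbc
from pymongo import MongoClient, ReplaceOne

BATCH_SIZE = 1000

sql = pyodbc.connect("DRIVER={ODBC Driver 17 for SQL Server};"
                     "SERVER=sql-host;DATABASE=appdb;UID=sync;PWD=secret")
orders = MongoClient("mongodb://mongo-host:27017")["reporting"]["orders"]


def sync_since(watermark):
    cur = sql.cursor()
    cur.execute("SELECT id, customer, total, modified_at FROM dbo.orders "
                "WHERE modified_at > ? ORDER BY modified_at", watermark)
    while True:
        rows = cur.fetchmany(BATCH_SIZE)
        if not rows:
            break
        ops = [ReplaceOne({"_id": r.id},
                          {"_id": r.id, "customer": r.customer,
                           "total": float(r.total),      # Decimal -> float for BSON
                           "modified_at": r.modified_at},
                          upsert=True)
               for r in rows]
        orders.bulk_write(ops, ordered=False)
        watermark = rows[-1].modified_at  # persist this somewhere between runs
    return watermark
```

Deletes still need separate handling (a tombstone table or soft-delete flag), which is usually where most of the real work in these syncs ends up.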

SQL Server Table > MS Access Local Copy?

I'm looking for a little advice.
I have some SQL Server tables I need to move to local Access databases for some local production tasks - once per "job" setup, with 400 jobs this quarter, across a dozen users...
A little background:
I am currently using a DSN-less approach to avoid distribution issues
I can create temporary LINKS to the remote tables and run "make table" queries to populate the local tables, then drop the remote tables. Works as expected.
Performance here in the US is decent - 10-15 seconds for ~40K records. Our India teams are seeing 5-10 minutes or more for the same datasets. Their internet connection is decent but not great, and a variable I cannot control.
I am wondering if MS Access is adding some overhead here that can be avoided by a more direct approach: i.e., letting the server do all/most of the heavy lifting rather than Access.
I've tinkered with various combinations, with no clear improvement or success:
Parameterized stored procedures from Access
SQL Passthru queries from Access
ADO vs DAO
Any suggestions, or an overall approach to suggest? How about moving data as XML?
Note: I have users on Access 2007, 2010, and 2013.
Thanks!
It's not entirely clear, but if the MS Access database performing the dump is local and the SQL Server database is remote, across the internet, you are bound to bump into the physical limitations of the connection.
ODBC drivers are not meant to be used for data access beyond a LAN; there is too much latency.
When Access queries data, it doesn't open a stream; it fetches a block, waits for the data to be downloaded, then requests another batch. This is OK on a LAN but quickly degrades over long distances, especially when you consider that communication between the US and India probably has around 200ms of latency that you can't do much about. That adds up very quickly when the communication protocol is chatty, and all of this sits on top of a connection whose bandwidth is very likely way below what you would get on a LAN.
The better solution would be to perform the dump locally and then transmit the resulting Access file after it has been compacted and maybe zipped (using 7z for instance for better compression). This would most likely result in very small files that would be easy to move around in a few seconds.
The process could easily be automated. The easiest approach is maybe to perform this dump automatically every day and make it available on an FTP server or an internal website, ready for download.
You can also make it available on demand, maybe through an app running on a server and made available through RemoteApp using RDP services on a Windows 2008 server, or simply through a website or a shell.
You could also have a simple Windows service on your SQL Server that listens for requests from a remote client installed on the local machines everywhere; it would process the dump and send it to the client, which would then unpack it and replace the previously downloaded database.
Plenty of solutions for this, even though they would probably require some amount of work to automate reliably.
One final note: if you automate the data dump from SQL Server to Access, avoid using Access in an automated way. It's hard to debug and quite easy to break. Use an export tool instead that doesn't rely on having Access installed.
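A rough sketch of such an export step, run on (or near) the SQL Server rather than through Access automation. It assumes a pre-built empty template .accdb with a matching Jobs table, the Microsoft Access Database Engine ODBC driver on the exporting machine, and purely illustrative table, column, and path names:

```python
# Fill a copy of a template .accdb over ODBC, then zip it for transfer.
import shutil
import zipfile

import pyodbc

TEMPLATE = r"C:\exports\template.accdb"
TARGET = r"C:\exports\jobdata.accdb"

shutil.copyfile(TEMPLATE, TARGET)

src = pyodbc.connect("DRIVER={ODBC Driver 17 for SQL Server};"
                     "SERVER=.;DATABASE=Production;Trusted_Connection=yes")
dst = pyodbc.connect(r"DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};DBQ=" + TARGET)

rows = src.cursor().execute(
    "SELECT JobID, PartNo, Qty, DueDate FROM dbo.Jobs WHERE JobID = ?", 12345
).fetchall()

cur = dst.cursor()
cur.executemany("INSERT INTO Jobs (JobID, PartNo, Qty, DueDate) VALUES (?, ?, ?, ?)",
                [tuple(r) for r in rows])
dst.commit()
dst.close()

# Compress before transfer; the zip is what gets pushed to FTP or the website.
with zipfile.ZipFile(TARGET + ".zip", "w", zipfile.ZIP_DEFLATED) as z:
    z.write(TARGET, arcname="jobdata.accdb")
```

The point is simply that nothing here requires Access to be running; the India users only ever download a small zipped file.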
Renaud and all, thanks for taking time to provide your responses. As you note, performance across the internet is the bottleneck. The fetching of blocks (vs a contiguous download) of data is exactly what I was hoping to avoid via an alternate approach.
Our workflow is evolving to better leverage both sides of the clock: User1 in the US completes their day's efforts in the local DB and then sends JUST their updates back to the server (based on timestamps). User2 in India, who also has a local copy of the same DB, grabs just the updated records off the server at the start of his day. So, pretty efficient for day-to-day stuff.
The primary issue is the initial download of the local DB tables from the server (huge multi-year DB) for the current "job" - this should happen just once at the start of the effort (a roughly week-long process). This is the piece that takes 5-10 minutes for India to accomplish.
We currently do move the DB back and forth via FTP - DAILY. It is used as a SINGLE shared DB and is a bit LARGE due to temp tables. I was hoping my new timestamp-based push-pull of just the daily changes would have been an overall plus. It seems to be, but the initial download hurdle remains.

Asking for a better strategy for implementing Delphi online reports based on a Firebird database

In a Firebird-database-driven Delphi application we need to bring some data online, so we can add online-reporting capabilities to our application.
The current approach is: whenever data is changed or added, send it to the online server (PHP + MySQL); if that fails, add it to a queue and try again. The server, having the data, is then able to create its own reports.
So, to conclude: what is a good way to bring that data online?
At the moment I know these two different strategies:
Event-based: whenever changes are detected, push them to the web server / MySQL DB. As you wrote, this requires queueing in case the destination system does not receive the messages.
Snapshot-based: extract the relevant data at intervals (for example every hour) and transfer it to the web server / MySQL DB.
The snapshot-based strategy allows you to preprocess the data so that it fits nicely into the web / MySQL data structure, which can help to decouple the systems better and keep more business logic on the side of the sending system (Delphi). It also generates a more even load, as it is not driven by bursts of data changes (a sketch of such an extraction follows below).
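Here is what that snapshot extraction could look like in practice, assuming the fdb Python driver for Firebird and pymysql for the web-side database; the sales_summary table, credentials, and the hourly interval are purely illustrative:

```python
# Hourly snapshot push: read the reporting rows from Firebird and rebuild the
# corresponding table on the web server's MySQL database.
import time

import fdb
import pymysql


def push_snapshot():
    fb = fdb.connect(dsn="fbserver:/data/app.fdb", user="SYSDBA", password="masterkey")
    my = pymysql.connect(host="web-host", user="report", password="secret",
                         database="reporting")
    try:
        cur = fb.cursor()
        cur.execute("SELECT customer_id, order_date, total FROM sales_summary")
        rows = cur.fetchall()
        with my.cursor() as mcur:
            # DELETE (not TRUNCATE) so the wipe and reload commit atomically.
            mcur.execute("DELETE FROM sales_summary")
            mcur.executemany("INSERT INTO sales_summary "
                             "(customer_id, order_date, total) VALUES (%s, %s, %s)",
                             rows)
        my.commit()
    finally:
        fb.close()
        my.close()


while True:
    push_snapshot()
    time.sleep(3600)  # hourly, per the interval mentioned above
```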
Another way would be to use replication, but I don't know of a system that replicates between Firebird and MySQL databases.
For adding online reporting capabilities, you can also check out FastReport Server.

Replicating / Cloning data from one MS SQL Server to another

I am trying to get the content of one MSSQL database into a second MSSQL database. There is no conflict management required and no schema updating. It is just a plain copy-and-replace of the data. The data in the destination database would be overwritten in case somebody had changed something there.
Obviously, there are many ways to do that
SQL Server Replication: Well established, but using old protocols. Besides that, a lot of developers keep telling me that the devil is in the details and that replication might not always work as expected, and that it is the best choice for an administrator, but not for a developer.
MS Sync Framework: MSF is said to be the cool, new technology. Yes, it is the new stuff you love to get because it sounds so innovative. There is the generic approach to synchronisation, which sounds like: learn one technology and how to integrate a data source, and you will never have to learn how to develop syncing again. But on the other hand, you can read that the main usage scenario seems to be synchronizing MSSQL Compact databases with MSSQL.
SQL Server Integration Services: This sounds like a solution for emergency planning. In case the firewall is not working, we have a package that can be executed again and again... until the firewall drops down or the authentication is fixed.
Brute Force copy and replace of database files: Probably not the best choice.
Of course, when looking on the Microsoft websites, I read that every technology (apart from brute force of course) is said to be a solid solution that can be applied in many scenarios. But that is, of course, not the stuff I wanted to hear.
So what is your opinion about this? Which technology would you suggest?
Thank you!
Stefan
The easiest mechanism is log shipping. The primary server can put the log backups on any UNC path, and then you can use any file sync tools to manage getting the logs from one server to another. The subscriber just automatically restores any transaction log backups it finds in its local folder. This automatically handles not just data, but schema changes too.
The subscriber will be read-only, but that's exactly what you want - otherwise, if someone can update records on the subscriber, you're going to be in a world of hurt.
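For illustration, the subscriber-side "restore whatever shows up in the folder" step can be very small. A sketch, assuming the copy was seeded with a full backup restored WITH NORECOVERY and using pyodbc; the database name and paths are placeholders:

```python
# Apply, in name order, every transaction log backup found in the local folder.
import glob
import os

import pyodbc

LOG_DIR = r"D:\logship\incoming"

conn = pyodbc.connect("DRIVER={ODBC Driver 17 for SQL Server};"
                      "SERVER=.;DATABASE=master;Trusted_Connection=yes",
                      autocommit=True)   # RESTORE cannot run inside a transaction
cur = conn.cursor()

for trn in sorted(glob.glob(os.path.join(LOG_DIR, "*.trn"))):
    # NORECOVERY leaves the database ready for the next log; use WITH STANDBY
    # instead if the copy must be readable between restores.
    cur.execute(f"RESTORE LOG ReportDb FROM DISK = '{trn}' WITH NORECOVERY")
    os.remove(trn)   # a real job would archive the file rather than delete it
```

A scheduled task running something like this every few minutes, plus whatever copies the .trn files across, is essentially all there is to the subscriber side.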
I'd add two techniques to your list.
Write T-SQL scripts to INSERT...SELECT the data directly
Create a full backup of the database and restore it onto the new server
If it's a big database and you're not going to be doing this too often, then I'd go for the backup and restore option. It does most of the work for you and is guaranteed to copy all the objects.
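If you want to drive the backup-and-restore option from code rather than the GUI, the whole thing is a couple of statements. A sketch using pyodbc; the server names, the share path, and the logical file names ('AppDb', 'AppDb_log') are assumptions you would replace with your own:

```python
# Back up on the source server, then restore the .bak on the destination.
import pyodbc


def run(server, statement):
    # autocommit is required: BACKUP/RESTORE cannot run inside a transaction.
    conn = pyodbc.connect("DRIVER={ODBC Driver 17 for SQL Server};"
                          f"SERVER={server};DATABASE=master;Trusted_Connection=yes",
                          autocommit=True)
    cur = conn.cursor()
    cur.execute(statement)
    while cur.nextset():   # drain informational messages so the operation completes
        pass
    conn.close()


run("source-server",
    r"BACKUP DATABASE AppDb TO DISK = '\\fileshare\transfer\AppDb.bak' WITH INIT")

# ...the share must be readable by the destination server's service account...

run("dest-server", r"""
    RESTORE DATABASE AppDb
    FROM DISK = '\\fileshare\transfer\AppDb.bak'
    WITH REPLACE,
         MOVE 'AppDb'     TO 'D:\SQLData\AppDb.mdf',
         MOVE 'AppDb_log' TO 'D:\SQLData\AppDb_log.ldf'""")
```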
I've not heard of anyone using Sync Framework, so I'd be interested to hear if anyone has used it successfully.

Copying data from a local database to a remote one

I'm writing a system at the moment that needs to copy data from a client's locally hosted SQL database to a hosted server database. Most of the data in the local database is copied to the live one, though optimisations are made to reduce the amount of data that actually needs to be sent.
What is the best way of sending this data from one database to the other? At the moment I can see a few possible options, but none of them yet stands out as the prime candidate.
Replication, though this is not ideal, and we cannot expect it to be supported in the version of SQL we use on the hosted environment.
Linked server, copying data direct - a slow and somewhat insecure method
Webservices to transmit the data
Exporting the data we require as XML and transferring to the server to be imported in bulk.
The data copied goes into copies of the tables, without identity fields, so data can be inserted/updated without any violations in that respect. This data transfer does not have to be done at the database level, it can be done from .net or other facilities.
More information
The frequency of the updates will vary completely on how often records are updated. But the basic idea is that if a record is changed then the user can publish it to the live database. Alternatively we'll record the changes and send them across in a batch on a configurable frequency.
The number of records we're talking about is around 4,000 rows per table for the core tables (product catalog) at the moment, but this is completely variable depending on the client we deploy this to, as each would have their own product catalog, ranging from hundreds to thousands of products. To clarify, each client is on a separate local/hosted database combination; they are not combined into one system.
As well as the individual publishing of items, we would also require a complete re-sync of data to be done on demand.
Another aspect of the system is that some of the data being copied from the local server is stored in a secondary database, so we're effectively merging the data from two databases into the one live database.
Well, I'm biased, I have to admit. I'd like to hypnotize you into shelling out for SQL Compare to do this. I've been faced with exactly this sort of problem in all its open-ended frightfulness. I got a copy of SQL Compare and never looked back. SQL Compare is actually a silly name for a piece of software that synchronizes databases. It will also do it from the command line once you have a working project together with all the right knobs and buttons. Of course, you can only do this for reasonably small databases, but it really is a tool I wouldn't want to be seen in public without.
My only concern with your requirements is where you are collecting product catalogs from a number of clients. If they are all in separate tables, then all is fine, whereas if they are all in the same table, then this would make things more complicated.
How much data are you talking about? How many 'client' DBs are there? And how often does this need to happen? The answers to those questions will make a big difference to the path you should take.
There is an almost infinite number of solutions for this problem. In order to narrow it down, you'd have to tell us a bit about your requirements and priorities.
Bulk operations would probably cover a wide range of scenarios, and you should add that to the top of your list.
I would recommend using Data Transformation Services (DTS) for this. You could create a DTS package for appending and one for re-creating the data.
It is possible to invoke DTS package operations from your code so you may want to create a wrapper to control the packages that you can call from your application.
In the end I opted for a set of triggers to capture data modifications in a change log table. An application then polls this table and generates XML files for submission to a web service running at the remote location.
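For anyone landing here later, a rough sketch of the polling half of that design (the trigger side just writes one row per modification into a log table). The dbo.change_log columns and the exported flag are illustrative, not the poster's actual schema:

```python
# Poll the trigger-maintained change log, emit the pending changes as XML,
# then mark them as exported. A real worker would POST the XML to the web
# service and only mark rows on a successful response.
import xml.etree.ElementTree as ET

import pyodbc

conn = pyodbc.connect("DRIVER={ODBC Driver 17 for SQL Server};"
                      "SERVER=.;DATABASE=AppDb;Trusted_Connection=yes")
cur = conn.cursor()

cur.execute("SELECT change_id, table_name, row_id, action, changed_at "
            "FROM dbo.change_log WHERE exported = 0 ORDER BY change_id")
changes = cur.fetchall()

if changes:
    root = ET.Element("changes")
    for c in changes:
        item = ET.SubElement(root, "change", id=str(c.change_id), action=c.action)
        ET.SubElement(item, "table").text = c.table_name
        ET.SubElement(item, "rowId").text = str(c.row_id)
        ET.SubElement(item, "changedAt").text = c.changed_at.isoformat()

    ET.ElementTree(root).write("changes.xml", encoding="utf-8", xml_declaration=True)

    cur.execute("UPDATE dbo.change_log SET exported = 1 "
                "WHERE exported = 0 AND change_id <= ?", changes[-1].change_id)
    conn.commit()
```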
