We have two MariaDB databases on different servers, a primary and a secondary. A master-slave relation is not in place, so we need to update both databases for each user action.
Example: if an "add employee" action is performed from the front end by a user, we need to insert the employee details into the primary database first and later into the secondary.
We designed it to insert into each database separately, i.e. two insert calls from the application, one per database.
As we have multiple DB interactions for a single operation, this solution will affect performance.
Is there any way we can achieve this by using procedures or UDFs?
Any better approaches or suggestions are helpful.
There are multiple ways we can do what you are looking for. Of course, there are pros and cons of each approach.
The simplest one would be to use MaxScale's "Tee" filter, that can automatically execute all your queries on multiple servers. This is, of course, transparent to the application code.
Refer to: https://mariadb.com/kb/en/mariadb-maxscale-24-tee-filter/
The benefit is that it puts the least amount of strain on your application's performance, as the application doesn't have to make two calls to execute your statements on the databases, and your application code remains simple.
The negative is that there are no return values from the "tee" filter! You won't be able to tell whether the insert, update or delete was successful or not.
The other method that might be much better is to use the "Spider" storage engine in MariaDB and connect to the remote table on the remote MariaDB server.
Now you can create a trigger on your primary table and depending on your business logic/requirements, you can update the Spider table, a.k.a Remote table, from within the trigger. This will give you more control and your application is still kept clean without dual connections.
Another benefit of this is that if, for some reason, your trigger fails to update the remote "Spider" table, your primary transaction will also fail! This ensures proper data integrity is maintained :)
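A rough sketch of what this could look like, assuming a simple employee table with just an id and a name (all server, schema, table and credential names here are placeholders and should be adapted to your setup):

    -- On the primary server: define the remote server and a Spider table
    -- that points at the employee table on the secondary (hypothetical names).
    CREATE SERVER secondary_srv
        FOREIGN DATA WRAPPER mysql
        OPTIONS (HOST '10.0.0.2', DATABASE 'hrdb', USER 'repl', PASSWORD 'secret', PORT 3306);

    CREATE TABLE employee_remote (
        id   INT NOT NULL PRIMARY KEY,
        name VARCHAR(100)
    ) ENGINE=SPIDER
      COMMENT='wrapper "mysql", srv "secondary_srv", table "employee"';

    -- Trigger on the local employee table copies every new row to the
    -- Spider table; if the remote insert fails, the local insert fails too.
    DELIMITER //
    CREATE TRIGGER trg_employee_to_secondary
    AFTER INSERT ON employee
    FOR EACH ROW
    BEGIN
        INSERT INTO employee_remote (id, name) VALUES (NEW.id, NEW.name);
    END//
    DELIMITER ;

The application keeps issuing a single INSERT against the primary; the duplication to the secondary happens inside the database.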
There are other engines that can be used similarly like "CONNECT" but "Spider" is the only officially supported among the ones that can connect to a remote database's table. Spider, however, can only connect as long as the remote database is also MariaDB.
Hope this answers your question and is what you are looking for :)
Cheers,
Faisal.
Related
How to handle two databases for a single REST app? I see that this is a cloud app concept, but in cases where something gets deleted in DB1 and not in DB2, how do I handle this and make sure that both are always the same?
There are many ways to do this.
First you need to consider whether the DBs have the same schema and data. In that case master-slave or master-master replication will solve it.
In the case where the schema and data are similar in some tables and not in others, you can use dblink to replicate just one table (see: Postgresql: master-slave replication of 1 table).
Most of the time such changes are made at the app level, and if you really want to avoid dealing with two-phase commit, you can have a queue and a service processing (retrying and recovering) the updates to both DBs.
Finally, in the case where direct updates to the DB (without passing through the app) need to be supported, triggers and dblink are an option, as sketched below.
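As a rough illustration of the trigger-plus-dblink option (PostgreSQL; the table, columns and connection string are hypothetical):

    -- Replicate inserts on "employees" in DB1 to the same table in DB2.
    CREATE EXTENSION IF NOT EXISTS dblink;

    CREATE OR REPLACE FUNCTION replicate_employee_insert() RETURNS trigger AS $$
    BEGIN
        -- Push the same row to DB2; if this call fails, the local insert is rolled back.
        PERFORM dblink_exec(
            'host=db2.example.com dbname=appdb user=repl password=secret',
            format('INSERT INTO employees (id, name) VALUES (%s, %L)', NEW.id, NEW.name)
        );
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER trg_replicate_employee
    AFTER INSERT ON employees
    FOR EACH ROW EXECUTE FUNCTION replicate_employee_insert();  -- on older versions: EXECUTE PROCEDURE

Note that this couples the availability of DB1 to DB2, which is exactly why the queue-based approach above is often preferred.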
Please explore these tools to get more ideas: https://wiki.postgresql.org/wiki/Replication,_Clustering,_and_Connection_Pooling
EDIT: start here if you're lucky enough to have the same db/data https://www.2ndquadrant.com/en/resources/postgres-bdr-2ndquadrant/
I have a database ("DatabaseA") that I cannot modify in any way, but I need to detect the addition of rows to a table in it and then add a log record to a table in a separate database ("DatabaseB") along with some info about the user who added the row to DatabaseA. (So it needs to be event-driven, not merely a periodic scan of the DatabaseA table.)
I know that normally, I could add a trigger to DatabaseA and run, say, a stored procedure to add log records to the DatabaseB table. But how can I do this without modifying DatabaseA?
I have free rein to do whatever I like in DatabaseB.
EDIT in response to questions/comments ...
Databases A and B are MS SQL 2008/R2 databases (as tagged), users are interacting with the DB via a proprietary Windows desktop application (not my own) and each user has a SQL login associated with their application session.
Any ideas?
Ok, so I have not put together a proof of concept, but this might work.
You can configure an Extended Events session on databaseB that watches for all the procedures on databaseA that can insert into the table, or for any SQL statements that run against the table on databaseA (using a LIKE '%your table name here%').
This is a custom solution that writes the XE session to a table:
https://github.com/spaghettidba/XESmartTarget
You could probably mimic that functionality by writing the XE events table to a custom user table every minute or so using the SQL Server Agent.
Your session would monitor databaseA and write the XE output to databaseB; you would then write a trigger so that upon each XE output write it compares the two tables and, if there are differences, writes them to your log table. This would be a nonstop running process, but it is still a kind of periodic scan in a way: the XE session only writes when the event happens, but the comparison still runs every couple of seconds.
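For what it's worth, a sketch of what such a session might look like on 2008/R2. The table name, file path and event choice are assumptions, and the event, action and target names should be double-checked against your version (on 2012+ the file target is package0.event_file rather than asynchronous_file_target):

    CREATE EVENT SESSION [WatchDatabaseA] ON SERVER
    ADD EVENT sqlserver.sql_statement_completed
    (
        ACTION (sqlserver.session_id, sqlserver.username, sqlserver.sql_text)
        WHERE sqlserver.database_name = N'DatabaseA'
          AND sqlserver.like_i_sql_unicode_string(sqlserver.sql_text, N'%YourTableName%')
    )
    ADD TARGET package0.asynchronous_file_target
        (SET filename = N'C:\XE\WatchDatabaseA.xel')
    WITH (STARTUP_STATE = ON);

    ALTER EVENT SESSION [WatchDatabaseA] ON SERVER STATE = START;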
I recommend you look at a data integration tool that can mine the transaction log for Change Data Capture events. We have recently been using StreamSets Data Collector for Oracle CDC, but it also has SQL Server CDC. There are many other competing technologies, including Oracle GoldenGate and Informatica PowerExchange (not PowerCenter). We like StreamSets because it is open source and is designed to build real-time data pipelines between databases at the schema level. Until now we have used batch ETL tools like Informatica PowerCenter and Pentaho Data Integration. I can copy all the tables in a schema in near real time in one StreamSets pipeline, provided I have already deployed the DDL in the target. I use this approach between Oracle and Vertica. You can add additional columns to the target and populate them as part of the pipeline.
The only catch might be identifying which user made the change. I don't know whether that is in the SQL Server transaction log. Seems probable but I am not a SQL Server DBA.
I looked at both solutions provided by the time of writing this answer (see the answers by Dan Flippo and dfundaka) but found that the first - using Change Data Capture - required modification to the database, and the second - using Extended Events - wasn't really a complete answer, though it got me thinking of other options.
The option that seems cleanest, and doesn't require any database modification, is to use SQL Server Dynamic Management Views. Within this library, residing in the system database, are various views and functions for examining server process history - in this case INSERTs and UPDATEs - such as sys.dm_exec_sql_text and sys.dm_exec_query_stats, which contain records of database transactions (and are, in fact, what Extended Events appears to be based on).
Though it's quite an involved process initially to extract the required information, the queries can be tuned and generalized to a degree.
There are restrictions on transaction history retention, etc but for the purposes of this particular exercise, this wasn't an issue.
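As a rough example of the kind of query involved (the table name pattern is an assumption; note that this reads the plan cache, so it shows which statements ran but not which session ran them):

    SELECT TOP (50)
        qs.last_execution_time,
        qs.execution_count,
        st.text AS statement_text
    FROM sys.dm_exec_query_stats AS qs
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
    WHERE st.text LIKE '%INSERT%YourTableName%'
    ORDER BY qs.last_execution_time DESC;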
I'm not going to select this answer as the correct one yet partly because it's a matter of preference as to how you approach the problem and also because I'm yet to provide a complete solution. Hopefully, I'll post back with that later. But if anyone cares to comment on this approach - good or bad - I'd be interested in your views.
We have a situation where it is an unmovable requirement to have two separate databases, but we want to keep the single web-based front end that we currently use to manipulate what used to be a single database. Records need to go in a child database based on the value of a column, let's say employee type "hourly" vs. "salaried".
There are a lot of synonyms, stored procedures, and other bits of SQL that lie between the web interface and the database, so we figured that instead of doing the split there we could use the current database as a "master" database and then have something behind it direct the data into either of the two child databases. (as in the following diagram:)
We seem to be good to the extent that data flows one way (from the web interface to the child databases) - to the extent that data flows back the other way (from the child databases to the master), we seem to get into some hairy situations.
Some of them seem intractable (e.g. if one person on child DB A inserts a record with an autoincrement ID of 1 at the same time a person on child DB B inserts a record with the same Id of 1), but most of them seem to just be a pain in the ass.
My question is: Does there exist a solution that will allow us to sync these databases, but allow us to insert the logic of "only if the employee column has a status of X", rather than just blindly mirroring them?
Here are a few ideas that were floated around (listed below). Triggers seem to have potential but also seem to be a lot of work, and we were wondering whether there are any tools out there that could do the heavy lifting of the sync for us. Does anyone out there have any ideas?
triggers
Service Broker
SSIS
Microsoft Sync Framework
So this is really more of an opinion thing, because there are possibilities with each solution that you already identified. There is no easy solution, but a lightweight approach would be to use CDC in combination with SSIS. SSIS has built in hooks to work with CDC and CDC will provide better performance with your master database - it will not involve the kind of waits that could occur from using triggers that insert data into another database.
Here is more on CDC
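As a rough idea of what the CDC setup involves on the master (database, schema and table names are assumptions; CDC requires Enterprise/Developer edition and db_owner rights):

    USE MasterDB;
    GO
    EXEC sys.sp_cdc_enable_db;
    GO
    EXEC sys.sp_cdc_enable_table
        @source_schema = N'dbo',
        @source_name   = N'Employee',
        @role_name     = NULL,            -- no gating role
        @supports_net_changes = 1;        -- requires a primary key
    GO

An SSIS package can then consume the generated change tables on a schedule and route the rows to the appropriate child database.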
We are considering using a single SQL Server database to store data for multiple clients. We feel having all the data in one database could make things more manageable than a "separate db per client" setup.
The biggest concern we have is accidental access to the wrong client's data. It would be very, very bad if we were ever to accidentally show one client's data to another client. We perform lots of queries, and are afraid of a scenario where someone says "write me a query of this and this to go show the client for the meeting in 15 minutes." If someone is careless and omits the WHERE clause that filters for the correct client, we would be in serious trouble. Is there a robust setup or design pattern for SQL Server that makes it impossible (or at least very difficult) to accidentally pull the wrong client's data from a single "global" database?
To be clear, this is NOT a database that the clients use directly or via apps (yet). We are talking about a database accessed by several of our programmers and we are afraid of screwing up ourselves.
At the very minimum, you should put the client data in separate schemas. In SQL Server, schemas are the unit of authorization. Only people authorized for a given client should be able to see that client's data. In addition to other protections, you should be using the built-in authorization capabilities of the database.
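A minimal sketch of that idea (the schema, role and user names are assumptions):

    CREATE SCHEMA ClientA;
    GO
    CREATE SCHEMA ClientB;
    GO
    CREATE ROLE ClientA_Readers;
    GRANT SELECT ON SCHEMA::ClientA TO ClientA_Readers;
    DENY  SELECT ON SCHEMA::ClientB TO ClientA_Readers;
    GO
    -- Add the people who work on client A's data to the matching role
    -- (on SQL Server 2012+ you can use ALTER ROLE ... ADD MEMBER instead):
    EXEC sp_addrolemember 'ClientA_Readers', 'SomeAnalystUser';

With that in place, someone working under a client-A login simply cannot select from client B's schema, WHERE clause or not.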
Right now, it sounds like you are in a situation where a very small group of people are the ones accessing all the data for everyone. Well, if you are successful, then you will probably need more people in the future. In fact, you might be giving some clients direct access to the data. If it is their data, they will want apps running on it.
My best advice, if you are planning on growing, is to place each client's data in a separate database. I would architect the system so this database can be on a remote server. If it needs to synchronize with common data, then develop a replication strategy for moving that data around.
You may think it is bad to have one client see another client's data. From the business perspective, this is deadly -- like "company goes out of business, no job" deadly. Your clients are probably more concerned about such confidentiality than you are. And, an architecture that ensures protection will make them more comfortable.
Multi-Tenant Data Architecture
http://msdn.microsoft.com/en-us/library/aa479086.aspx
Here's what we do (MySQL, unfortunately):
a "tenant" column in each table
tables are in one schema [1]
views are in another schema (for easier security and naming); a view must not include the tenant column, and it does a WHERE on the tenant based on the current user
the tenant value is set by a trigger on insert, based on the user (a sketch follows below)
Assuming that all your DDL is in .sql files under source control (which it should be), then having many databases or schemas is not so tough.
[1] a schema in mysql is called a 'database'
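A rough sketch of the pattern (schema, table and column names and the user-to-tenant mapping are assumptions; here the tenant is simply the connecting MySQL user name):

    -- Base table lives in the `data` schema and carries the tenant column:
    CREATE TABLE data.orders (
        id     INT AUTO_INCREMENT PRIMARY KEY,
        tenant VARCHAR(64) NOT NULL,
        amount DECIMAL(10,2)
    );

    -- Trigger fills in the tenant from the connected user on insert
    -- (USER() gives the connecting client, not the trigger definer):
    CREATE TRIGGER data.orders_set_tenant
    BEFORE INSERT ON data.orders
    FOR EACH ROW
    SET NEW.tenant = SUBSTRING_INDEX(USER(), '@', 1);

    -- View in the `app` schema hides the tenant column and filters by it;
    -- SQL SECURITY INVOKER so the filter uses the caller, not the definer:
    CREATE SQL SECURITY INVOKER VIEW app.orders AS
        SELECT id, amount
        FROM data.orders
        WHERE tenant = SUBSTRING_INDEX(USER(), '@', 1);

Application users are only granted access to the `app` schema, so every query is tenant-filtered by construction.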
You could set up one inline table-valued function for each table that takes a required parameter @customerID and filters that particular table to the data of that customer. If the entire app were to use only these TVFs, the app would be safe by construction.
There might be some performance implications. The exact numbers depend on the schema and queries; they can be zero, however, as inline TVFs are inlined and optimized together with the rest of the query.
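For illustration, a minimal sketch with hypothetical Orders columns:

    CREATE FUNCTION dbo.OrdersForCustomer (@customerID int)
    RETURNS TABLE
    AS
    RETURN
    (
        SELECT o.OrderID, o.OrderDate, o.Amount
        FROM dbo.Orders AS o
        WHERE o.CustomerID = @customerID
    );
    GO

    -- The app only ever queries through the function, so the filter cannot be forgotten:
    SELECT * FROM dbo.OrdersForCustomer(42);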
You can limit access to the data so that it is only available via stored procedures with an obligatory customerid parameter.
If you let your IT staff build views, sooner or later someone will forget that WHERE clause, as you said.
But a schema per client with already-prefiltered views will enable self-service and bring extra value, I guess.
What's the best way to track/log inserted/updated/deleted rows in all tables for a given database in SQL Server 2008?
Or is there a better "Audit" feature in SQL Server 2008?
The short answer is that there is no single solution that fits all. It depends on the system and the requirements, but here are a couple of different approaches.
DML Triggers
Relatively easy to implement, because you only have to write one that works well for one table and then apply it to the other tables.
Downside is that it can get messy when you have a lot of tables and even more triggers. Managing 600 triggers for 200 tables (insert, update and delete trigger per table) is not an easy task.
Also, it might cause a performance impact.
Creating audit triggers in SQL Server
Log changes to database table with trigger
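A minimal sketch of such a trigger for one hypothetical table (dbo.Employee) writing to a hypothetical audit table; the same pattern is repeated for each table you want to track:

    CREATE TABLE dbo.AuditLog (
        AuditID    INT IDENTITY(1,1) PRIMARY KEY,
        TableName  SYSNAME,
        Operation  CHAR(1),                       -- I / U / D
        KeyValue   INT,
        ChangedBy  SYSNAME  DEFAULT SUSER_SNAME(),
        ChangedAt  DATETIME DEFAULT GETDATE()
    );
    GO
    CREATE TRIGGER dbo.trg_Employee_Audit
    ON dbo.Employee
    AFTER INSERT, UPDATE, DELETE
    AS
    BEGIN
        SET NOCOUNT ON;
        -- Inserts and the "after" side of updates:
        INSERT INTO dbo.AuditLog (TableName, Operation, KeyValue)
        SELECT 'Employee',
               CASE WHEN EXISTS (SELECT 1 FROM deleted) THEN 'U' ELSE 'I' END,
               i.EmployeeID
        FROM inserted AS i;

        -- Pure deletes:
        INSERT INTO dbo.AuditLog (TableName, Operation, KeyValue)
        SELECT 'Employee', 'D', d.EmployeeID
        FROM deleted AS d
        WHERE NOT EXISTS (SELECT 1 FROM inserted);
    END;
    GO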
Change Data Capture
Very easy to implement and natively supported, but only in Enterprise edition, which can cost a lot of $ ;). Another disadvantage is that CDC is still not as mature as it should be. For example, if you change your schema, history data is lost.
Transaction log analysis
The biggest advantage of this is that all you need to do is put the database in full recovery mode, and all the info will be stored in the transaction log.
However, if you want to do this correctly you'll need a third-party log reader, because this is not natively supported.
Read the log file (*.LDF) in SQL Server 2008
SQL Server Transaction Log Explorer/Analyzer
If you want to implement this I'd recommend you try out some of the third-party tools that exist out there. I worked with a couple of tools from ApexSQL, but there are also good tools from Idera and Netwrix.
ApexSQL Log – auditing by reading transaction log
ApexSQL Comply – uses traces in the background, then parses those traces and stores the results in a central database.
Disclaimer: I’m not affiliated with any of the companies mentioned above.
Change Data Capture is designed to do what you want, but it requires each table to be set up individually, so depending on the number of tables you have, there may be some logistics to it. It will also only store the data in the capture tables for a couple of days by default, so you may need an SSIS package to pull it out and store it for longer periods.
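Once a table is enabled (via sys.sp_cdc_enable_table), reading the captured changes looks roughly like this; the dbo_Employee capture instance name is an assumption:

    DECLARE @from_lsn binary(10) = sys.fn_cdc_get_min_lsn(N'dbo_Employee');
    DECLARE @to_lsn   binary(10) = sys.fn_cdc_get_max_lsn();

    SELECT __$operation,   -- 1 = delete, 2 = insert, 3 = update (before), 4 = update (after)
           __$start_lsn,
           *               -- the captured column values
    FROM cdc.fn_cdc_get_all_changes_dbo_Employee(@from_lsn, @to_lsn, N'all');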
I don't remember whether there is already a tool for this, but you could always use triggers (then you will have access to the inserted and deleted pseudo-tables containing the changed rows). Unfortunately, it could be quite a lot of work if you want to track all tables. I believe there should be a simpler solution, but as I said, I don't remember one.
EDIT.
Maybe this could be helpful:
Change Tracking:
http://msdn.microsoft.com/en-us/library/cc280462.aspx
http://msdn.microsoft.com/en-us/library/cc280386.aspx
This allows you to do audits at the database level; it may or may not be enough to meet the business requirements, as database records usually don't make all that much sense without the logic to glue them together. For instance, knowing that user x inserted a record into the "time_booked" table with foreign keys to the "projects", "users", and "time_status" tables may not make all that much sense without the SQL query to glue those 4 tables together.
You may also need to have each database user connect with their own user ID - this is fine with integrated security and a client app, but probably won't work with a website using a connection pool.
The SQL Server logs cannot be analyzed just like that. There are some 3rd-party tools available to read the logs, but as far as I know you can't query them for statistics and such. If you need this kind of info you'll have to create some sort of auditing to capture all these events in separate tables. You can use "DDL triggers".
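As a rough illustration of the DDL-trigger suggestion (note that DDL triggers fire on schema changes such as CREATE/ALTER/DROP, not on row-level inserts/updates/deletes; the audit table name is an assumption):

    CREATE TABLE dbo.DdlAudit (
        EventTime  DATETIME DEFAULT GETDATE(),
        LoginName  SYSNAME  DEFAULT SUSER_SNAME(),
        EventData  XML
    );
    GO
    CREATE TRIGGER trg_AuditDDL
    ON DATABASE
    FOR DDL_DATABASE_LEVEL_EVENTS
    AS
    BEGIN
        SET NOCOUNT ON;
        -- EVENTDATA() contains the statement, the affected object and the user:
        INSERT INTO dbo.DdlAudit (EventData) VALUES (EVENTDATA());
    END;
    GO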