Has anyone had any experience setting up peer-to-peer replication using SQL Server 2005 or 2008?
Specifically, I'm interested in whether other options/alternatives were considered and why P2P replication was ultimately chosen.
If you have used P2P replication:
Did you encounter any issues during synchronization and was it easy to monitor?
How easy was/is it to do conflict resolution?
Did you have to make schema changes (e.g. replacing identity columns)?
Alternatively, if you considered P2P replication and went with a different option, why did you rule it out?
(Disclaimer: I'm a developer, not a DBA)
We have SQL Server 2005 merge replication set up to replicate between two active/active geographically-separated nodes for resilience in a legacy system.
I don't know whether it's easy to monitor; outside of my remit.
It creates triggers on every table to do the publish/subscribe mechanism, each of which calls its own stored procedure.
In our case, it was set up to use identities 1-1bn in node 0, 1bn-2bn in node 1 to avoid identity collisions (rather than use a composite key of NodeId + EntityId for each table, or change keys to be GUIDs, for example).
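As a purely illustrative sketch of the identity-range idea (hypothetical table, not our actual schema): each node creates the same table with its own identity seed, so the ranges never collide:

    -- Node 0: identities start at 1
    CREATE TABLE dbo.Customer (
        CustomerId bigint IDENTITY(1, 1) PRIMARY KEY,
        Name nvarchar(200) NOT NULL
    );

    -- Node 1: identities start at 1 billion
    CREATE TABLE dbo.Customer (
        CustomerId bigint IDENTITY(1000000000, 1) PRIMARY KEY,
        Name nvarchar(200) NOT NULL
    );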
I think the replication latency is around 15s (between London and New York over dedicated bandwidth).
It is a huge pain to work with:
It took a highly paid contractor a year to set it up (granted, part of this was due to the legacy nature of the DB design)
We lack anyone in-house with the expertise to support it (the in-house DBA we had took ~6 months to learn it, and has since moved on)
Schema updates are now painful. From what I understand:
Certain updates must be performed on only one node; replication then takes care of figuring out what to do on the other node(s)
Certain updates must be performed on both nodes
Data updates must be performed on one node only (I think)
All updates now take significantly longer to perform - from the split-second it takes to run a DDL change-script to ~30 minutes
I don't know for sure, but I think the bandwidth requirement for replication is very high (in the MBit/s range)
It introduces many "noise" objects (3 sprocs per table, 3 triggers per table) into the DB, making it inconvenient to find the item one wants to work on in Object Explorer.
We will never set up a third node for this system, based largely on the perceived difficulty and added pain it would introduce at deployment-time.
We also now lack a staging environment that mirrors production, because it's too painful to set up.
Anecdotal: The DBA doing the setup would frequently curse the fact that it was an "MS v1" he was being forced to work with.
Dimly remembered: The DBA needed to raise several priority support tickets to get help from MS directly.
Granted - some of the pain involved is due to our specific environment and not having in-house talent to support this setup. Your mileage may vary.
So I have this issue: our client uses MS SQL databases. Two months ago they migrated their databases to SQL Server 2019 Enterprise from an earlier version and Standard edition.
Their major reason was to secure high availability through an MS SQL feature: Availability Groups.
After that, our application got really slow. Put simply: the customer starts the app, selects a workspace, and then it takes about 15 seconds to load the data.
The first step just sends a request to the database to select data; there are no inserts, deletes or other heavy operations.
The app works with geographical and geometry data; every geo object is saved in the database as a geometry data type. The first huge select is what causes the slowness.
When I looked at Activity Monitor, under wait categories only one thing looked suspicious to me: the type Other.
In the database I don't see any high-cost queries, and the availability group mode is set to synchronous.
If I'm getting this right, synchronous mode should not be the cause of this problem, because this database is clearly for reading data, not, as I mentioned, modifying it.
I changed some instance parameters: I set Optimize for Ad hoc Workloads to True, and raised the cost threshold for parallelism from 5 to 20.
Another thing I tried was creating a new app source database and a database containing the geo data on that SQL instance, without adding them to availability groups.
For testing purposes, the application connects to that one instance with the new test databases.
Neither of these changes worked. So if you have any idea or any experience with this, please help me.
Here is a screenshot of the top 10 waits from the sys DMVs.
1 - Stats recompute...
When you move from one SQL Server version to a higher one, you must first change the database compatibility level (to get the performance benefits) and then recompute all statistics in the database with a FULLSCAN. Why? Because each version of SQL Server comes with a new optimizer that has new operators, new algorithms and many improvements. To match this new version of the optimizer, the method of computing statistics and the form of the results of these calculations is rethought with each modification of the engine; so much so that using the old statistics with a new engine is like taking the census of the population in 1930 to plan the construction of roads, schools and hospitals for the current actual population.
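As a minimal sketch (the database name is hypothetical; 150 is the compatibility level for SQL Server 2019, and sp_MSforeachtable is an undocumented but widely used helper):

    ALTER DATABASE YourDb SET COMPATIBILITY_LEVEL = 150;  -- 150 = SQL Server 2019
    GO
    USE YourDb;
    GO
    -- Recompute every table's statistics with a full scan (can take a while on big tables):
    EXEC sp_MSforeachtable N'UPDATE STATISTICS ? WITH FULLSCAN';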
2 - SQL Server Editions...
When upgrading SQL Server from Standard to Enterprise, you need to increase the "hardware" (even if it is a VM), because many of the features that run under the Enterprise edition and do not exist in Standard need more computational resources. As an example, using AUTO_UPDATE_STATISTICS_ASYNC will automatically use one more thread to the detriment of other processes. By comparison, a Rolls Royce or a Hummer, instead of a Volkswagen, is arguably more comfortable and faster, but requires more fuel and more expensive insurance!
3 - Synchronous AGs...
Synchronous AlwaysOn availability groups must have a very fast and faultless network. If this is not the case, the replication of update requests can drag performance down, especially if you are using pessimistic locking (the default mode).
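If you want to check this from the wait statistics, a query along these lines (standard DMV, nothing environment-specific) shows whether AG synchronization waits dominate:

    SELECT wait_type, waiting_tasks_count, wait_time_ms
    FROM sys.dm_os_wait_stats
    WHERE wait_type LIKE 'HADR%'
    ORDER BY wait_time_ms DESC;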
4 - Transaction logs...
One common global performance problem is latency when writing to the transaction log.
5 - Tempdb files...
Another common global performance problem is latency when accessing the tempdb files.
For those two file problems, use Glenn Berry's file latency query, which will give you an indication. Good values are under 7 ms for reads and 15 ms for writes.
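If you don't have that script to hand, a minimal query in the same spirit (standard DMVs; not Glenn Berry's exact script) is:

    SELECT DB_NAME(vfs.database_id) AS database_name,
           mf.physical_name,
           vfs.io_stall_read_ms  / NULLIF(vfs.num_of_reads, 0)  AS avg_read_ms,
           vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_ms
    FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
    JOIN sys.master_files AS mf
        ON mf.database_id = vfs.database_id
       AND mf.file_id = vfs.file_id
    ORDER BY avg_write_ms DESC;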
CONCLUSION
Many other factors can contribute to slowing down your system. But without more information, we cannot help you...
I have one central database and 25 client databases, all with the same schema.
I want any changes made to certain tables in the central database to flow down to the client databases.
The databases are SQL Server Express, so I cannot use replication.
The solution I have today keeps track of the changes in the central database; a program then writes these changes to a text file and sends it down to the client databases. Another program reads these text files and updates the client database.
There are three problems with this:
1. The files get lost or arrive in jumbled order, which messes up the client data
2. The process is slow
3. The programs are sometimes shut down, so the whole sync flow stops
Is there a reliable alternative that is fast and secure?
I wonder how banking software is made... it never loses transactions and it is fast.
Add an UpdateDate column to all the entities that need to be replicated. At each client, add a linked server pointing to the central repository. Now, every 5 minutes or so, poll your central repository for changes using the last UpdateDate of a client entity and grab the delta, as in the sketch below.
Then use MERGE, or insert-and-update, to merge the data on the client. That's a very reliable way of doing homebrew replication. To keep track of deleted elements, you would either mark them as deleted or have another table to keep track of the entity kind and its reference, again combined with UpdateDate for replication.
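A minimal sketch of one polling cycle, assuming a linked server named CENTRAL, a central database MainDb, and a replicated table dbo.Customer with an UpdateDate column (all names illustrative):

    -- Last change we already have locally; the floor date covers the empty-table case.
    DECLARE @last datetime = ISNULL((SELECT MAX(UpdateDate) FROM dbo.Customer), '19000101');

    MERGE dbo.Customer AS tgt
    USING (
        SELECT CustomerId, Name, UpdateDate
        FROM CENTRAL.MainDb.dbo.Customer
        WHERE UpdateDate > @last          -- grab only the delta
    ) AS src
        ON tgt.CustomerId = src.CustomerId
    WHEN MATCHED THEN
        UPDATE SET tgt.Name = src.Name, tgt.UpdateDate = src.UpdateDate
    WHEN NOT MATCHED THEN
        INSERT (CustomerId, Name, UpdateDate)
        VALUES (src.CustomerId, src.Name, src.UpdateDate);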
Update
Then you mention transactions and banking software. When you do your replication via files, we ain't talkin' about no transactional replication here, not by a long shot.
If you need transactional consistency you need to subscribe to the transaction flow of the data warehouse.
I don't want to be unhelpful and you haven't given any background about your business needs, but you have to decide if your priority is really "fast and secure" or if it's actually "cheap". Replicating changes between multiple databases in a reliable, consistent way is not easy (as you know) and it's highly unlikely that you will be able to develop a solution yourself that has the features, stability and performance of SQL Server replication.
SQL Express can be a replication subscriber, by the way, so it's not clear why it doesn't meet your needs. But if it doesn't, you should estimate the cost to your business (or customer) of dealing with issues caused by an unreliable solution: your time, business downtime, finding and correcting incorrect data, customer complaints, lost business etc. Then compare that to the cost of 25 SQL Server licenses (you should certainly be able to get a good discount when you order that volume), additional hardware (if any) and the costs of training, consulting and/or learning how to use replication. Then extrapolate those costs over 5 years or so. You may find that it's cheaper just to buy the solution you need. And of course buying the full SQL Server edition means you get a lot of other new features that might be useful to you.
If you (or your boss) is really determined to get something for nothing, you might want to investigate PostgreSQL or MySQL. They both have free replication solutions that seem to be widely enough used to be reliable for many companies. Of course, you then need to calculate the costs of switching to a new database platform.
If you have one central database and 25 clients, you can easily do it with one (yes, only one) SQL Server licence for the main database. Subscribers to this database can run SQL Express. As long as users only access the client databases, you are not even obliged to buy SQL CALs.
Back to banking software: rest assured that they are paying good money for their server licences! So don't be surprised that these are reliable and fast...
There is a web application that has been in production for about 3 years now. Historically, for various reasons, a decision was made to use a database-per-customer installation.
Now we have found that deployments are very slow.
Should we consider moving all the databases back into a single one to reduce environment complexity? Or is that too risky an idea?
The problem I see is that it's very hard to merge these databases while preserving referential integrity (primary keys in the different databases' tables obviously cannot be distinguished from one another).
The databases are not that big, so we don't get much benefit of reduced load from having multiple databases.
Your question is quite broad.
a) Ensure that the merged databases don't suffer degraded performance with things like JOIN statements when, say, 1000 databases are merged, even though each is small. As for your referential integrity, which I assume is auto-increment based, you can replace these relationships by altering the schema and substituting a UUID or a similar unique, non-sequential value, or even a surrogate key pair in addition to your auto-increment PK.
b) Do benchmarking to ensure your application would respond within performance limits
c) Is there a direct ROI for doing this? What are the long term cost benefits vs the expense of migration? Is the decreased complexity worth increased (if any) cost?
d) How does this impact your backup and disaster recovery plans? Does it make them cheaper? Slower? More expensive?
Abstraction and management tools approach:
If it were me, depending on the situation, I would keep the scalability that comes with per-client sharding and create a set of management tools to abstractly present one virtual database. Using these tools you can get the simplified management without losing technical flexibility. I suspect you want to simplify the cost of managing all these databases (based on your deployment statement). Creating a 'control panel' for your farm can be a good way to simplify a complex system (especially when deployments may use different schema versions).
For the migrated data: customer one's database IDs can start at 10000000, customer two's at 20000000, customer three's at 30000000, and so on, as sketched below.
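A minimal sketch of folding one customer database into a merged one using such a per-customer key offset (database, table and range are all illustrative):

    DECLARE @offset bigint = 10000000;   -- customer one's block

    -- If OrderId is an IDENTITY column on the target, wrap this in
    -- SET IDENTITY_INSERT MergedDb.dbo.Orders ON/OFF.
    INSERT INTO MergedDb.dbo.Orders (OrderId, CustomerId, Amount)
    SELECT o.OrderId + @offset,
           o.CustomerId + @offset,       -- shift foreign keys by the same offset
           o.Amount
    FROM Customer1Db.dbo.Orders AS o;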
In my opinion, when you host the database for your customers, a single database that handles multiple customers is the better idea overall. Of course you need to add a "customers" table to record the customers, a "customer_id" column on all top-level tables, and checks in all your SQL to ensure each customer's view is limited to their own data.
I'd set up a new database with the additional columns, and then test it with a dummy customer or three for a while to ensure all bugs are wiped out. Then I'd migrate all the customers across, one by one, doing checks that the data will fit.
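A minimal sketch of that shared-schema layout (table and column names illustrative):

    CREATE TABLE customers (
        customer_id int IDENTITY(1,1) PRIMARY KEY,
        name        nvarchar(200) NOT NULL
    );

    CREATE TABLE invoices (
        invoice_id  int IDENTITY(1,1) PRIMARY KEY,
        customer_id int NOT NULL REFERENCES customers (customer_id),
        total       decimal(12, 2) NOT NULL
    );

    -- Every query is scoped to the calling customer's data:
    DECLARE @customer_id int = 1;   -- supplied by the application
    SELECT invoice_id, total
    FROM invoices
    WHERE customer_id = @customer_id;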
I'm working on a SaaS project where each customer has an instance of the application (customer1.application.com, customer2.application.com, etc.) and ideally each customer would have their "own" space in the DB. The current plan is to create a DB for each customer and deploy an instance of the application into the web farm. The idea is that each customer could opt out of an upgrade to maintain the status quo (something one of our investors REALLY wanted, in large part because he hates how Facebook keeps changing how it works).
Last night I attempted to roll out, to my two test accounts, an update that altered the database. While the ensuing errors were my fault (I forgot a small but apparently very important change in the DDL), I'm starting to worry about my overall theory of operation, because missing one ALTER COLUMN statement means a whole upgrade cycle could be blown to hell. So after that long build-up, here are my questions:
1) Is there a way to do a diff between two databases (the "test" production database and an actual production database) that will accurately record each change being made?
2) Is there another database (and/or application) design model I should be considering? I know if I take away supporting multiple versions of the application that I actually remove a lot of the long term support headaches.
Food for thought:
Code upgrades happen more frequently than DB Schema upgrades. Make sure you have a really good SCM in place to handle the code upgrades. We use git with great success.
Code is easy to manage, databases are not (in comparison). The reason is that they are mutable, and change each moment. Plus, they are really hard to roll back (possible, but time consuming, with downtime). So we must arrive at a simple way to track schema updates (along with associated data changes), and be able to apply them in the future to other similar databases.
Each database schema version should be given a unique, sequential integer version number. Start at 100, say.
Each time you have to upgrade, write an SQL script like:
100-101.sql
101-102.sql
102-103.sql
It is the job of each script to perform the upgrade for that specific version. It can be as simple as adding a table, or as complicated as re-arranging foreign keys. But in any event, they will be reliable in what they are designed to do.
You can apply any given script many times during testing (on fresh data) to ensure it will work as expected.
So when you find yourself needing to upgrade a client from version 130 to 180, you can safely apply the SQL scripts (IN ORDER), and you will arrive at the correct destination.
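A minimal sketch of the bookkeeping this relies on (table and change are illustrative): a one-row version table, and each script guarding on the version it upgrades from:

    -- One-time setup:
    CREATE TABLE dbo.SchemaVersion (Version int NOT NULL);
    INSERT INTO dbo.SchemaVersion (Version) VALUES (100);

    -- Body of 100-101.sql: guard on the current version, do the work, bump the number.
    IF (SELECT Version FROM dbo.SchemaVersion) = 100
    BEGIN
        ALTER TABLE dbo.Orders ADD Notes nvarchar(max) NULL;  -- the actual change
        UPDATE dbo.SchemaVersion SET Version = 101;
    END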
You should never change DBs by hand. Do it with a script that performs all the DDL changes, etc.
Ideally, there should be a generic DB release script that uses DDL version as configuration/input.
(and DDL changes should be tagged with a specific tag in a versioning system)
You can go the Microsoft route re: supporting multiple versions as a headache: simply designate all versions prior to X (say, 2 versions back) as unsupported. That way, you can support the last 2-3 versions but not waste resources on anything more, while still allowing per-client flexibility to a large extent.
You should carefully weigh the pros and cons of having a versioned app/DB system like you propose.
List the pros (such as placating an investor; the positive experience for a client when a version changes unexpectedly, translated into a marginal probability to retain/add clients who require such a feature; an easy way to do BETA/UAT testing; plus a fairly easy way to roll back schema changes gone awry by loading a client's data into the DB schema from the prior version).
List the cons (cost of DB space, cost of your time to implement, cost of support)
Compare the two and decide which is better for your business.
Redgate's SQL Compare does a good job of comparing and diffing two SQL Server databases (warning: commercial third-party product). Also, I think there's free stuff out there that does much the same thing.
If you want to be able to leave some customers behind on older versions of your product, it might make more sense to maintain a one-database-per-customer model, with the scripts for building each version of the databases under source control. This keeps your customers isolated from each other, and even allows you to switch database vendors (e.g. from SQL Server to Oracle) or versions (e.g. from SQL Server 2000 to SQL Server 2005) for some customers while keeping others on the older versions.
Manually run scripts will not work. Nor will diff tools, for that matter. Diff works for 2, 4, maybe 10 databases, but it does not scale, because what you need is reliability in the presence of failures (offline databases, servers restarting, all that).
You deploy by scheduling upgrade scripts. For instance, see how MySpace does this for over 1000 databases: MySpace Uses SQL Server Service Broker to Protect Integrity of 1 Petabyte of Data. The key takeaway is that they use a guaranteed, reliable delivery mechanism (SSB) to deploy schema maintenance scripts. You need an asynchronous, reliable mechanism to run scripts because destination databases may be offline, running scheduled maintenance, unreachable etc., and a reliable delivery mechanism like Service Broker can handle all the retries and related issues (handling duplicates, acknowledgments etc.). You can also look at Asynchronous procedure execution for an example of how to handle script execution via SSB.
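This is not MySpace's actual code; just a minimal Service Broker sketch of the pattern, with illustrative object names, to show what "shipping upgrade scripts over a reliable queue" looks like:

    -- One-time setup on each participating database:
    CREATE MESSAGE TYPE UpgradeScript VALIDATION = NONE;
    CREATE CONTRACT UpgradeContract (UpgradeScript SENT BY INITIATOR);
    CREATE QUEUE UpgradeQueue;
    CREATE SERVICE UpgradeService ON QUEUE UpgradeQueue (UpgradeContract);

    -- Sender: the message body carries the T-SQL to run on the target.
    DECLARE @h uniqueidentifier;
    DECLARE @body varbinary(max) =
        CAST(N'ALTER TABLE dbo.Orders ADD Notes nvarchar(max) NULL;' AS varbinary(max));
    BEGIN DIALOG CONVERSATION @h
        FROM SERVICE UpgradeService
        TO SERVICE 'UpgradeService'
        ON CONTRACT UpgradeContract
        WITH ENCRYPTION = OFF;
    SEND ON CONVERSATION @h MESSAGE TYPE UpgradeScript (@body);

    -- Receiver (run by queue activation on the target): apply whatever arrived.
    DECLARE @msg varbinary(max), @sql nvarchar(max);
    RECEIVE TOP (1) @msg = message_body FROM UpgradeQueue;
    SET @sql = CAST(@msg AS nvarchar(max));
    IF @sql IS NOT NULL
        EXEC sp_executesql @sql;

Service Broker queues the message durably, so a 'copy' being offline just delays delivery instead of failing the deployment.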
As for the scripts themselves, I recommend you start looking at your database schema and configuration data as a versioned resource. I have addressed this problem several times already, e.g. see Do you put your database static data into source-control? How?
Update
I guess I owe some explanation of why I consider diffing the wrong approach. Just to make things clear, I'm talking about deployments of hundreds of servers and thousands of databases. The original post compares itself to Facebook, and I wish them the success to reach that size, but the question also asks about design principles, so I say that discussing cloud-level scale is appropriate.
I see two problems with diff tools:
Availability. All diff tools work by connecting to both the 'master' and the 'copy', so they can do their job only when both are online. This creates a hot spot, a single point of failure: the 'master' copy, whose availability becomes critical for deploying upgrades. High availability always comes at a cost. It also leaves the problem of 'copy' availability as a minor implementation detail; the upgrade scheme must handle retries, timeouts and disconnects from the client on its own (not a trivial problem by any means).
Atomicity. The diff tools expect a stable schema on the 'master'. This in effect places a freeze on the 'master' while an upgrade is taking place. While this can be controlled on a small scale, on large scales it becomes a problem, as upgrading the master itself to v. N+1 becomes a race against all the thousands of databases, some of which may still be upgrading from v. N-1.
Script-based solutions that ship the upgrade script to the 'copy' solve both of these problems. Also, diff tools like the VSDB .dbschema-based vsdbcmd.exe are better than a 'live' diff tool, since the 'master' dbschema file can be delivered to the 'copy' machine, turning the whole upgrade process into a local operation.
Overall I also believe that script-based upgrade, using metadata versioning, is superior to diff-based upgrade, for the reasons of testing and source control I have already talked about in the link to Q1525591.
if I take away supporting multiple versions of the application that I actually remove a lot of the long term support headaches
Any change, however small, has a chance of breaking something that is important for someone.
So if you have multiple customers, rolling out a fix for customer 1 will upset customer 2. It doesn't even have to be a buggy release; it might just be a change in behaviour they disagree with. For most customers, not controlling the release schedule is simply unacceptable.
So I'd advise you to keep a different codebase for every customer. Roll out fixes only after agreement with a customer.
There is a number of customers beyond which this approach breaks down (think Yahoo Mail), but reading your question I think you're safely below that number. And for a compare tool, I can't help but agree with the posts suggesting Redgate's SQL Compare.
I need to audit all database activity, regardless of whether it came from the application or from someone issuing SQL via other means, so the auditing must be done at the database level. The database in question is Oracle. I looked at doing it via triggers and also via something called Fine-Grained Auditing that Oracle provides. In both cases, we turned on auditing on specific tables and specific columns. However, we found that performance really sucks with either of these methods.
Since auditing is an absolute must due to regulations around data privacy, I am wondering what the best way to do this is without significant performance degradation. Oracle-specific experience would be helpful, but general practices around database activity auditing are welcome as well.
I'm not sure if it's a mature enough approach for a production system, but I had quite a lot of success with monitoring database traffic using a network traffic sniffer.
Send the raw data between the application and database off to another machine and decode and analyse it there.
I used PostgreSQL, and decoding the traffic and turning it into a stream of database operations that could be logged was relatively straightforward. I imagine it'd work on any database where the packet format is documented, though.
The main point was that it put no extra load on the database itself.
Also, it was passive monitoring: it recorded all activity but couldn't block any operations, so it might not be quite what you're looking for.
There is no need to "roll your own". Just turn on auditing:
Set the database parameter AUDIT_TRAIL = DB.
Start the instance.
Log in with SQL*Plus.
Enter the statement:

    audit all;

This turns on auditing for many critical DDL operations, but DML and some other DDL statements are still not audited.
To enable auditing on these other activities, try statements like these:

    audit alter table;                                             -- DDL audit
    audit select table, update table, insert table, delete table; -- DML audit
Note: All "as sysdba" activity is ALWAYS audited to the O/S. In Windows, this means the Windows event log. In UNIX, this is usually $ORACLE_HOME/rdbms/audit.
Check out the Oracle 10g R2 Audit Chapter of the Database SQL Reference.
The database audit trail can be viewed in the SYS.DBA_AUDIT_TRAIL view.
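Once auditing is on, a quick look at recent audited activity might be something like this (the one-day filter is illustrative):

    SELECT username, action_name, obj_name, timestamp
    FROM   sys.dba_audit_trail
    WHERE  timestamp > SYSDATE - 1
    ORDER  BY timestamp DESC;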
It should be pointed out that the internal Oracle auditing will be high-performance by definition. It is designed to be exactly that, and it is very hard to imagine anything else rivaling it for performance. Also, there is a high degree of "fine-grained" control of Oracle auditing. You can get it just as precise as you want it. Finally, the SYS.AUD$ table along with its indexes can be moved to a separate tablespace to prevent filling up the SYSTEM tablespace.
If you want to record copies of changed records on a target system you can do this with Golden Gate Software and not incur much in the way of source side resource drain. Also you don't have to make any changes to the source database to implement this solution.
Golden Gate scrapes the redo logs for transactions referring to a list of tables you are interested in. These changes are written to a 'Trail File' and can be applied to a different schema on the same database, or shipped to a target system and applied there (ideal for reducing load on your source system).
Once you get the trail file to the target system, there are some configuration tweaks: you can set an option to perform auditing, and if needed you can invoke two Golden Gate functions to get info about the transaction:
1) Set the INSERTALLRECORDS replication parameter to insert a new record in the target table for every change operation made to the source table. Beware: this can eat up a lot of space, but if you need comprehensive auditing this is probably expected.
2) If you don't already have a CHANGED_BY_USERID and CHANGED_DATE attached to your records, you can use the Golden Gate functions on the target side to get this info for the current transaction. Check out the following functions in the GG Reference Guide:
GGHEADER("USERID")
GGHEADER("TIMESTAMP")
So no, it's not free (it requires licensing through Oracle) and will take some effort to spin up, but it's probably a lot less effort/cost than implementing and maintaining a custom roll-your-own solution, and you have the added benefit of shipping the data to a remote system, so you can guarantee minimal impact on your source database.
If you are using Oracle, there is a feature called CDC (Change Data Capture), which is a more performance-efficient solution for audit-type requirements.