Avoiding data duplication with logical replication (PostgreSQL 10)

I've configured two servers in a redundant setup managed by pcsd. Both machines run PostgreSQL 10 with logical replication. I used the steps below for the logical replication setup.
Took a dump of Server1 using the pg_dump command.
Restored it on Server2 (also PostgreSQL 10) using pg_restore.
Made the necessary changes in the pg_hba.conf and postgresql.conf files.
Used the commands below to set up logical replication.
CREATE PUBLICATION my_publication FOR ALL TABLES;
CREATE SUBSCRIPTION my_subscription
CONNECTION 'host=Server1 port=5432 password=postgres user=postgres dbname=database1'
PUBLICATION my_publication WITH (copy_data = false);
Restarted both servers.
After these steps the services ran fine on both (redundant) systems, but the logs showed the error messages below.
...
2020-01-08 15:14:08.551 EET >LOG: logical replication apply worker for subscription "my_subscription" has started
2020-01-08 15:14:08.559 EET >ERROR: duplicate key value violates unique constraint "pk_xyz_instance"
2020-01-08 15:14:08.559 EET >DETAIL: Key (xyz_instance_id)=(103) already exists.
2020-01-08 15:14:08.560 EET >LOG: worker process: logical replication worker for subscription 23176 (PID 7411) exited with exit code 1
....
Since I need the existing data from Server1, I dumped and restored it on the other server and set copy_data to false to avoid duplicating it.
After every switchover of services from Server1 to Server2 (or vice versa), these unique-constraint violation errors appear on Server2 (where the services are in the inactive state).
Am I missing something in this replication setup on PostgreSQL 10.11?
Is the copy_data flag not working as I expect?

With asynchronous replication it can always happen that the standby is lagging at the point of failover, so some transactions are lost. If you then reuse the old primary, which may be some transactions ahead, as the new standby, the databases can be inconsistent, and replication conflicts like the one you observe can occur.
One solution would be synchronous logical replication, but that reduces availability unless you have more than one standby server.
The best option would be physical replication. Not only is it simpler and more performant, but you can also use pg_rewind to quickly turn an old primary into a new standby.
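Until the setup is changed, a stalled subscription like the one in the logs has to be unblocked by hand on the subscriber. A minimal sketch, assuming the table behind the pk_xyz_instance constraint is named xyz_instance (the table name is an assumption; the key value comes from the log above):

```sql
-- Option 1: remove the conflicting row on the subscriber so the
-- logical replication apply worker can insert the replicated version.
DELETE FROM xyz_instance WHERE xyz_instance_id = 103;

-- Option 2: skip the entire conflicting transaction by advancing the
-- replication origin past it. The origin name is 'pg_' plus the
-- subscription OID (23176 in the log); the LSN is a placeholder and
-- must be taken from the server logs.
-- SELECT pg_replication_origin_advance('pg_23176', '0/12345678'::pg_lsn);
```

Option 2 discards the whole transaction on the subscriber, so it should be used only when you are sure the skipped changes are unwanted.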

Related

Opengauss+keepalived active/standby switchover, and the active/standby replication relationship is lost
We use openGauss + keepalived to build a simple high-availability environment.
Process: after simulating a failure of the primary, the VIP drifts to the standby database, and the standby's status changes from standby to primary. When the old primary database is restarted, it preempts the VIP, which drifts back to it. However, the master/standby replication relationship previously built with `gs_ctl build -D /gaussdb/data/db1 -M standby` is gone, so the relationship has to be rebuilt manually.
Questions:
After the openGauss primary database is restored, is the previous active/standby replication relationship really gone? Can it not be recreated or repaired automatically — must it be rebuilt by hand?
Is there a way to automatically repair or recreate the master/standby replication relationship after failure recovery?
In the keepalived.conf configuration file, set the nopreempt parameter to enable non-preemptive mode, so that after the old master recovers from the failure it does not take the VIP back from the new master. Note that nopreempt requires both nodes to be configured with state BACKUP.
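A minimal sketch of the relevant keepalived.conf fragment (the interface name, router ID, and VIP are placeholders):

```
vrrp_instance VI_1 {
    state BACKUP          # both nodes must start as BACKUP for nopreempt
    nopreempt             # a recovered node does not take the VIP back
    interface eth0        # placeholder network interface
    virtual_router_id 51  # placeholder, must match on both nodes
    priority 100          # give the preferred node the higher priority
    virtual_ipaddress {
        192.168.1.100     # placeholder VIP
    }
}
```

With this in place the VIP stays on the new primary after the old primary comes back, giving you time to rebuild the replication relationship before any planned switchback.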

SQL Server Distributed Availability Group (DAG) Failover

I have a distributed SQL Server (Always On) High Availability Group, using SQL Server 2016, with the purpose of disaster recovery. The servers exist in different datacenters, and the primary is doing async commits to the Distributed AG. I want to test a failover to the DAG without disrupting the traffic flow to the primary.
My question is: if I execute the ALTER AVAILABILITY GROUP [DAG_NAME_HERE] FORCE_FAILOVER_ALLOW_DATA_LOSS; command on the DAG, will both servers then be able to handle reads/writes, or will the original primary become unavailable, leaving only the DAG side that is now primary to handle reads/writes?
Only the new primary (the former secondary) can do reads/writes. The databases on the other replicas are suspended.
From MS (https://learn.microsoft.com/en-us/sql/database-engine/availability-groups/windows/perform-a-forced-manual-failover-of-an-availability-group-sql-server?view=sql-server-ver15):
The secondary databases in the remaining secondary replicas are
suspended and must be manually resumed.
Confirmed: after running the ALTER AVAILABILITY GROUP [DAG_NAME_HERE] FORCE_FAILOVER_ALLOW_DATA_LOSS; command against the secondary in a distributed availability group configuration, both SQL Servers essentially become primaries that can handle reads/writes, but synchronization between the two AGs is halted.
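Because the forced failover leaves the remaining secondary databases suspended, they must be resumed per database before synchronization restarts. A hedged sketch (the database name is a placeholder):

```sql
-- On the replica being promoted: force the failover.
-- FORCE_FAILOVER_ALLOW_DATA_LOSS can lose unsynchronized transactions.
ALTER AVAILABILITY GROUP [DAG_NAME_HERE] FORCE_FAILOVER_ALLOW_DATA_LOSS;

-- On each remaining secondary replica: data movement is suspended
-- after a forced failover and must be resumed manually per database.
ALTER DATABASE [YourDatabase] SET HADR RESUME;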

SQL DB Mirroring

I have two SQL 2016 servers configured in a mirror with a witness. We've been running some failover tests and every test has succeeded except the following two scenarios. Does anyone know why these scenarios wouldn't result in a failover?
Dismounting the storage that the database files reside on at the primary server does not cause a failover. I thought the witness would notice that the files no longer exist and trigger a failover?
Throttling the network down to 1 kbps on the primary server disconnects the mirror on both the primary and the secondary. I would expect the witness to lose connectivity to the primary because the network is so slow and then fail over, but instead both servers go to the disconnected state.
Has anyone run into these issues?
Partial answer:
(1) Dismounting a disk does not cause a failover because mirroring doesn't actively check that SQL Server components are up and running; rather, it listens for errors and uses a timeout mechanism. Per BOL, disk failures are unlikely to be detected.
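The timeout-based detection the answer describes can be inspected and tuned. A sketch using the mirroring catalog view (the database name is a placeholder):

```sql
-- Inspect the mirroring state, role, and partner timeout (seconds)
-- for every mirrored database on this instance.
SELECT DB_NAME(database_id)          AS database_name,
       mirroring_state_desc,
       mirroring_role_desc,
       mirroring_connection_timeout
FROM sys.database_mirroring
WHERE mirroring_guid IS NOT NULL;

-- Lower the partner timeout on the principal so connectivity loss is
-- detected sooner (the default is 10 seconds; values below 5 are not
-- recommended because they can trigger false failovers).
ALTER DATABASE [YourDatabase] SET PARTNER TIMEOUT 10;
```

This doesn't make mirroring notice dismounted storage, but it shows which failure signal (the connection timeout) the witness actually acts on.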

Database is read-only error in Secondary Replica of Alwayson Groups

I have tried many things and analyzed lots of documents, but I haven't found a solution yet.
I have three virtual machines in VMware (DC, SQLServer01, SQLServer02). Both SQL Servers are members of the domain (DC). I set up a failover cluster for SQLServer01 and SQLServer02 and did the necessary configuration on SQLServer01. Then I installed SQL Server 2014 on both servers and created an AlwaysOn availability group; SQLServer01 is the primary and the other is the secondary. When I cut the connection to SQLServer01, everything is fine (the secondary becomes primary), which is the behavior I expect.
However, when all servers are online, I cannot do any operation (insert, update, delete, alter, etc.) other than reads on my secondary replica; I always get a "database is read only" error. In the availability group's properties, both the primary and secondary replicas allow all connections, and readable secondary is set to "Yes".
I want to perform CRUD operations even when all servers are online (that is, do everything on the secondary replica as well).
Do you have any suggestions or ideas?
Thanks for your time and consideration.
The error occurs because writing to secondary replicas is not possible in SQL Server. Only the primary replica can host read-write databases, and an availability group has exactly one primary replica; secondary replicas can host read-only databases only. When both replicas are available, only one of them can be the primary and therefore accept writes. When only a single replica is available, it becomes the primary (there are no other replicas), and read-write operations against it are possible.
What you need to configure instead is replication.
In SQL Server, merge replication allows you to write at multiple nodes, with periodic synchronization that resolves conflicts and pushes changes to all replicas.
Peer-to-peer replication is another option. The application layer must prevent conflicts (updates of the same row at more than one node), but it is much faster.
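Since only the primary accepts writes, an application can at least verify the local replica's role before attempting DML instead of hitting the "database is read only" error. A diagnostic sketch using the AlwaysOn DMVs:

```sql
-- Returns PRIMARY or SECONDARY for the local replica of each
-- availability group on this instance.
SELECT ag.name       AS availability_group,
       ars.role_desc AS local_role
FROM sys.dm_hadr_availability_replica_states AS ars
JOIN sys.availability_groups AS ag
     ON ars.group_id = ag.group_id
WHERE ars.is_local = 1;
```

If local_role is SECONDARY, the application should route its writes to the primary (for example via the availability group listener) rather than retrying locally.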

Database replication

When working with table creation, is it assumed that creating a table on one database (the master) means the DBA should create the table on the slave as well? Also, in a master/slave configuration, shouldn't data always be replicated from the master to the slave to keep them in sync?
Right now the problem I'm having is that my master database has a lot of data, but the slave is missing parts that exist only on the master. Is something misconfigured here?
It depends on how the replication is configured. Real-time replication should keep the master and slave in sync at all times. "Poor man's" replication is usually configured to sync when some time interval expires; that is probably what's happening in your case.
I prefer to rely on CREATE TABLE statements being replicated to set up the table on the slave, rather than creating the slave's table by hand. That, of course, relies on the DBMS supporting this.
If you have data on the master that isn't on the slave, that's some sort of failure of replication, either in setup or operationally.
Any table created on the master should be replicated on the slave, and the same goes for inserted data.
Go through the replication settings in the my.cnf file for MySQL and check whether any database or table is excluded from replication.
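Both points above can be checked on the slave itself. A sketch (run in a MySQL client connected to the slave):

```sql
-- Check that both replication threads are running, how far behind the
-- master the slave is, and whether any objects are filtered out.
SHOW SLAVE STATUS\G
-- Relevant fields in the output:
--   Slave_IO_Running / Slave_SQL_Running  -- both should be 'Yes'
--   Seconds_Behind_Master                 -- replication lag
--   Replicate_Ignore_DB, Replicate_Ignore_Table,
--   Replicate_Wild_Ignore_Table           -- filters set in my.cnf
```

If the ignore fields are non-empty, the corresponding my.cnf directives (replicate-ignore-db, replicate-ignore-table, and so on) explain why some tables or data exist only on the master.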