CouchDB - Overwrite deleted documents from backup database to original database

I want to replicate documents that are marked as deleted in the original database but are still present in the replicated (backup) database. I am performing the replication, but the deleted document is not being restored from the replicated database to the original database, even though it is present there.
For Example,
Database A contains,
Document 1
Document 2
Replicated database A to database B
Marked document 1 as deleted in database A
Now performing replication from database B to database A
But document 1 is not replicated back to database A.

If I understand correctly, you marked document 1 in database A as deleted and then replicated from B to A.
In that case, the replication will transfer no data, because there are no changes in B (the source) that A (the destination) doesn't already have: A's deletion is a newer revision of document 1 than the one stored in B, so the replicator has nothing to push. Consequently, document 1 is not going to "reappear" in A.
There is a replication guide in the CouchDB documentation.
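If you need document 1 back in A right away, one option is to copy it over by hand rather than replicating. Below is a minimal sketch using Python and the requests library; the server URL, database names and document ID are placeholders, not values from the question:
import requests

SOURCE = "http://admin:password@localhost:5984/database_b"  # backup, still has the live document
TARGET = "http://admin:password@localhost:5984/database_a"  # original, has only the deletion tombstone
DOC_ID = "document_1"

# Fetch the live copy of the document from the backup database.
doc = requests.get(f"{SOURCE}/{DOC_ID}")
doc.raise_for_status()
body = doc.json()

# Drop the revision that belongs to B's revision tree; A will assign its own.
body.pop("_rev", None)

# Write it back into A as a new revision. If your CouchDB version answers
# 409 Conflict here, fetch the tombstone's revision first and pass it as ?rev=.
resp = requests.put(f"{TARGET}/{DOC_ID}", json=body)
print(resp.status_code, resp.json())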

Related

Different query result in SQL Server high availability group's nodes

I have an availability group with two nodes, and I have made sure they use "synchronous commit" mode, so I assumed the records in their tables are identical.
A developer reported a bug: a stored procedure used for a report returns different results on the primary and the secondary node. At first I found it hard to take seriously, since I knew which mode the availability group was in, but the report is correct: the results sometimes differ.
I tried to log this difference, so I created a table in the master database (which is not part of the availability group) and saved the row count of every table involved in the query. I store two numbers: a plain COUNT(*) and a count with the NOLOCK hint, to catch dirty records. Some of the log records show that the tables are identical on both nodes, yet the result of joining them is different on the two nodes. Do you have any suggestion for how to discover what causes this discrepancy?
You'll want to read this documentation, which covers a wide variety of items that can all contribute to this. The long and the short of it is: if you need exact, 100% up-to-date, point-in-time information, use the primary replica.
Notable excerpt, though:
The primary replica sends log records of changes on primary database to the secondary replicas. On each secondary database, a dedicated redo thread applies the log records. On a read-access secondary database, a given data change does not appear in query results until the log record that contains the change has been applied to the secondary database and the transaction has been committed on primary database.
Don't forget that secondary replicas also remap the isolation level to snapshot.
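A quick way to see that redo delay in practice is to query the DMV below on the readable secondary. This is only a sketch (Python with pyodbc and a placeholder connection string), not part of the original answer; the same query can of course be run directly in SSMS:
import pyodbc

# Connect to the secondary replica (placeholder server name / auth).
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=secondary-node;DATABASE=master;Trusted_Connection=yes;"
)
cur = conn.cursor()

# redo_queue_size is the amount of log (in KB) received by this replica but
# not yet redone; anything still in that queue is invisible to queries here.
cur.execute("""
    SELECT DB_NAME(database_id)      AS database_name,
           synchronization_state_desc,
           redo_queue_size,
           last_commit_time,
           last_redone_time
    FROM sys.dm_hadr_database_replica_states
    WHERE is_local = 1;
""")
for row in cur.fetchall():
    print(row)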

AWS RDS backups and restore

I have some questions regarding AWS backups on their RDS.
I have a MySQL server with two schemas that needs protecting against unintentional changes to entries in tables. As far as I understand, snapshots and restoring the DB to a specified time would bring the entire database back to that time, but basically we need to restore a specific entry without restoring the rest. Is that possible? So far we copy the entire row of data from the table before altering entries, just in case its values need to be restored.
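For illustration only (the thread itself contains no answer to this): a point-in-time restore in RDS always creates a new DB instance rather than overwriting the existing one, so a common pattern is to restore to a temporary instance, copy just the damaged row back by hand, and then delete the temporary instance. A minimal boto3 sketch with placeholder identifiers:
import boto3
from datetime import datetime, timezone

rds = boto3.client("rds", region_name="us-east-1")

# Restore a copy of the instance as it looked at the chosen moment; the
# original instance is left untouched.
rds.restore_db_instance_to_point_in_time(
    SourceDBInstanceIdentifier="my-prod-mysql",        # placeholder
    TargetDBInstanceIdentifier="my-prod-mysql-pitr",    # temporary instance
    RestoreTime=datetime(2020, 1, 1, 12, 0, tzinfo=timezone.utc),
    DBInstanceClass="db.t3.medium",
)

# Once the temporary instance is available, copy the single row back into the
# live database (for example with mysqldump of just that row), then delete
# the temporary instance.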

Implementation of initial database replication

How do various databases implement copying data (replication) to a new instance when it is added to the replication setup?
I.e., when we add a new instance, how is the data loaded into it?
There is a lot of information about replication methods, but it is explained for cases where the target database instance already has the same data as its source, not for the case of a new, initially empty database instance.
There are basically 3 approaches here.
First, you start capturing the changes from the source database using a CDC (change data capture) tool. Since the target database is not yet created, you store all the changes so that they can be applied later.
Depending on the architecture you can:
If you have a 1:1 copy
Take a backup of the source database and restore it to the target database. Knowing the point in time at which the backup was created, you start applying the stored changes from that timestamp onwards.
Assuming you have a consistent backup of the database, you end up with the same data on the target, just delayed compared to the source.
If you have a subset of the tables or a different vendor
The same approach as in 1, but instead of backing up and restoring the full database you handle only a list of tables. You can also restore the database backup in a temporary location, export part of the tables (or not even full tables but just a subset of columns), and then load them into the target.
Once the target is initially prepared, you start applying the changes from the source to the target.
No source database snapshot available
If you can't get a snapshot of the source, the replication tool often contains a method to work around that. Depending on the tool the feature is named AUTOCORRECTION (SAP/Sybase Replication Server) or HANDLECOLLISIONS (Oracle GoldenGate). This method basically means that the replication tool has a full image of the UPDATE operation, and when the record does not exist in the target it is created. When the row for a DELETE does not exist, the operation is ignored. When the row already exists for an INSERT, the operation is ignored.
To get a consistent state of the target you work in the mode described here for some time, until the data is in sync, and then switch to regular replication.
One thing to mention about this mode is that during the reconciliation the CDC must provide the full row content for UPDATEs. If an UPDATE contains just the modified columns, you would not be able to build an INSERT command (with all column values) when the row is missing.
Of course, the replication tool you use may incorporate the solution described above and do this task for you automatically.
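As an illustration of that collision-handling (AUTOCORRECTION / HANDLECOLLISIONS) behaviour, here is a minimal, driver-agnostic Python sketch. The table name t, key column id, the change-record format and the %s parameter style are assumptions made for the example, not details of any particular replication product:
def apply_change(cur, change):
    """Apply one CDC change in reconciliation mode.

    change = {"op": "INSERT" | "UPDATE" | "DELETE",
              "key": <primary key value>,
              "row": {column: value, ...}}   # full row image for INSERT/UPDATE
    """
    cols = ", ".join(change["row"])
    vals = list(change["row"].values())
    placeholders = ", ".join(["%s"] * len(vals))

    if change["op"] == "UPDATE":
        # Full row image required: if the row is missing on the target,
        # the UPDATE is turned into an INSERT.
        sets = ", ".join(f"{c} = %s" for c in change["row"])
        cur.execute(f"UPDATE t SET {sets} WHERE id = %s", vals + [change["key"]])
        if cur.rowcount == 0:
            cur.execute(f"INSERT INTO t ({cols}) VALUES ({placeholders})", vals)

    elif change["op"] == "DELETE":
        # Deleting a row that is already gone is silently ignored.
        cur.execute("DELETE FROM t WHERE id = %s", [change["key"]])

    elif change["op"] == "INSERT":
        # Inserting a row that already exists is ignored (the target wins).
        cur.execute("SELECT 1 FROM t WHERE id = %s", [change["key"]])
        if cur.fetchone() is None:
            cur.execute(f"INSERT INTO t ({cols}) VALUES ({placeholders})", vals)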

Confused about AWS RDS read replica databases. Why can I edit rows?

Edit: I'm not trying to edit the read replica. I'm saying I did edit it, and I'm confused about why I was able to.
I have a database in US-West. I made a read replica in Mumbai, so the users in India don't experience slowness. Out of curiosity, I tried to edit a row in the Mumbai read-replica database, hoping to get a security error rejecting my write attempt (since, after all, it is a READ replica). But the write operation was successful. Why is that? Shouldn't this be a read-only database?
I then went to the master database, hoping the write would at least have been synchronized, but my change didn't show up there. The master database was now different from the replica.
I also tried editing data in the master database, hoping it would be replicated to the replica, but that failed as well.
Obviously, I'm not understanding something.
Take a look at this link from Amazon Web Services to get an idea:
How do I configure my Amazon RDS DB instance read replica to be modifiable?
Your read replica probably has the flag read_only = false.
Modify the newly created parameter group and set the following parameter:
1) In the navigation pane, choose Parameter Groups. The available DB parameter groups appear in a list.
2) In the list, select the parameter group you want to modify.
3) Choose Edit Parameters and set the following parameter to the specified value:
read_only = 0
4) Choose Save Changes.
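Conversely, if you want the Mumbai replica to reject writes again, set the parameter back to 1. A minimal boto3 sketch, with a placeholder parameter-group name and instance identifier (these are assumptions, not values from the question):
import boto3

rds = boto3.client("rds", region_name="ap-south-1")  # region of the Mumbai replica

# Set read_only = 1 in the custom parameter group used by the replica.
rds.modify_db_parameter_group(
    DBParameterGroupName="my-replica-params",        # placeholder group name
    Parameters=[{
        "ParameterName": "read_only",
        "ParameterValue": "1",                        # 1 = read-only, 0 = writable
        "ApplyMethod": "immediate",
    }],
)

# The group only has an effect if it is the one attached to the replica.
rds.modify_db_instance(
    DBInstanceIdentifier="my-mumbai-replica",         # placeholder instance id
    DBParameterGroupName="my-replica-params",
)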
I think you should read a little about Cross region read replicas and how they work.
Working with Read Replicas of MariaDB, MySQL, and PostgreSQL DB Instances
Read replica lag is influenced by a number of factors, including the load on both the primary and secondary instances, the amount of data being replicated, the number of replicas, whether they are within the same region or cross-region, etc. Lag can stretch to seconds or minutes, though typically it is under one minute.
Reference: https://stackoverflow.com/a/44442233/1715121
Facts to remember about RDS Read Replica
When a read replica is created, a snapshot is taken of the primary database.
Read replicas are available in Amazon RDS for MySQL, MariaDB, and PostgreSQL.
Read replicas in Amazon RDS for MySQL, MariaDB, and PostgreSQL provide a complementary availability mechanism to Amazon RDS Multi-AZ deployments.
All traffic between the source and destination databases is encrypted for read replicas.
You need to enable backups before creating read replicas. This can be done by setting the backup retention period to a value other than 0.
Amazon RDS for MySQL, MariaDB and PostgreSQL currently allows you to create up to five read replicas for a given source DB instance.
It is possible to create a read replica of another read replica. You can create a second-tier Read Replica from an existing first-tier Read Replica. By creating a second-tier Read Replica, you may be able to move some of the replication load from the master database instance to a first-tier Read Replica.
Even though a read replica is updated from the source database, the target replica can still become out of sync due to various reasons.
You can delete a read replica at any point in time.
I had the same issue. (Old question, but I couldn't find an answer anywhere else and this was my exact issue)
I had created a cross-region read replica; when it was complete, all the original data was there, but no updates were synchronised between the two regions.
The issue was the parameter groups.
In my case, I had changed my primary from the default parameter group to one which allowed case insensitive tables. The parameter group is not copied over to the new region, so the replication was failing with:
Error 'Table 'MY_TABLE' doesn't exist' on query. Default database: 'mydb'. Query: 'UPDATE MY_TABLE SET ....''
So in short, create a parameter group in your new region that matches the primary region and assign that parameter group as soon as the replica is created.
I also ran this script on the newly created replica:
CALL mysql.rds_start_replication
I am unsure if this was required, but I ran it anyway.
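To confirm on the replica that replication is actually running again (and to see the last error if it is not), something like the sketch below can be run against the replica endpoint. It uses the pymysql driver with placeholder credentials; on newer MySQL versions the statement is SHOW REPLICA STATUS rather than SHOW SLAVE STATUS:
import pymysql

conn = pymysql.connect(
    host="my-replica.xxxxxxxx.eu-west-1.rds.amazonaws.com",  # placeholder endpoint
    user="admin",
    password="secret",
    cursorclass=pymysql.cursors.DictCursor,
)
with conn.cursor() as cur:
    cur.execute("SHOW SLAVE STATUS")
    status = cur.fetchone() or {}
    print("IO thread running: ", status.get("Slave_IO_Running"))
    print("SQL thread running:", status.get("Slave_SQL_Running"))
    print("Seconds behind:    ", status.get("Seconds_Behind_Master"))
    print("Last SQL error:    ", status.get("Last_SQL_Error"))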
I think adding an index is about the only change that should be done on the replica DB in Amazon RDS. If you put the read replica into write mode, it will remain in write mode until you change the parameter back to read_only = 1 and apply it immediately.

SQL Azure Export/Bacpacs and Foreign Key Integrity

I've a bit of a strange problem with a BACPAC I took last night using the SQL Azure Import/Export Service.
In our database there are 2 related tables.
dbo.Documents --All Documents in the database
Id
DocName
Extension
dbo.ProcessDocuments --Docs specific to a process
Id
DocumentId (FK -> dbo.Documents.Id with Check Constraint)
ProcessId
Based on that Schema it should not be possible for the ProcessDocuments table to include a row that does not have a companion entry in the main Documents table.
However, after I did the restore of the database in another environment, I ended up with:
7001 entries in ProcessDocuments but only 7000 matching entries for them in Documents (one missing), and the restore failed when attempting to apply the ALTER TABLE CHECK CONSTRAINT on ProcessDocuments.
The only thing I can imagine is that when the backup was being taken, it was going through the tables sequentially (alphabetically?), backing up the data one table at a time, and something like the following happened:
Documents gets backed up. Contains 7000 entries.
Someone adds a new process document to the system (an insert into both Documents and ProcessDocuments).
ProcessDocuments gets backed up. Contains 7001 entries.
If that's the case, then it creates a massive problem in terms of using BACPACs as a valid disaster-recovery asset, because if they're taken while the system has data in motion, it's possible that your BACPAC contains data integrity issues.
Is this the case, or can anyone shed any light on what else could have caused this ?
Data export uses bulk operations on the DB and is NOT guaranteed to be transactional, so issues like the one you described can and eventually will happen.
"An export operation performs an individual bulk copy of the data from each table in the database so does not guarantee the transactional consistency of the data. You can use the Windows Azure SQL Database copy database feature to make a consistent copy of a database, and perform the export from the copy."
http://msdn.microsoft.com/en-us/library/windowsazure/hh335292.aspx
If you want to create transactionally consistent backups, you have to copy the DB first (which may cost you a lot, depending on the size of your DB) and then export the copied DB as a BACPAC (as ramiramilu pointed out): http://msdn.microsoft.com/en-us/library/windowsazure/jj650016.aspx
You can do it yourself or use RedGate SQL Azure Backup, but from what I understand they follow exactly the same steps as described above, so if you choose their consistent backup option it's going to cost you as well.
As per the answer from Slav, the BACPAC is non-transactional and can end up with referential integrity problems if new rows are added to any table while the BACPAC is being generated.
To avoid this:
1) Copy the target database. The command returns straight away, but the database will take some time to copy. This operation creates a full transactionally consistent copy:
CREATE DATABASE <name> AS COPY OF <original_name>
2) Find the status of your copy operation:
SELECT * FROM sys.dm_database_copies
3) Generate a bacpac file on the copied database, which isn't being used by anyone.
4) Delete the copied database, and you'll have a working bacpac file.
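For completeness, steps 1 to 3 can also be scripted. The sketch below (pyodbc, with placeholder server, credentials and database names) starts the copy from the master database and waits for the copy to come online; only then is the BACPAC export run against it, for example with sqlpackage or the portal:
import time
import pyodbc

# Connect to the logical server's master database (placeholder credentials).
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=myserver.database.windows.net;DATABASE=master;"
    "UID=admin_user;PWD=admin_password;",
    autocommit=True,  # CREATE DATABASE cannot run inside a transaction
)
cur = conn.cursor()

# 1) Kick off the transactionally consistent copy.
cur.execute("CREATE DATABASE MyDb_copy AS COPY OF MyDb;")

# 2) Wait until the copy has finished; while copying, sys.databases reports
#    the new database's state as COPYING and sys.dm_database_copies lists it.
while True:
    cur.execute("SELECT state_desc FROM sys.databases WHERE name = 'MyDb_copy';")
    row = cur.fetchone()
    if row and row[0] == "ONLINE":
        break
    time.sleep(30)

# 3) MyDb_copy is now a consistent, idle copy: export the BACPAC from it,
#    then 4) drop the copy.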
