AWS RDS: modify snapshot table columns to NULL - database

I have a simple case: I have an AWS RDS snapshot, and I want to modify the data inside that particular snapshot and then share the database (snapshot). I.e., I want to set a few columns to NULL so that the client cannot see those values.
I was looking into the API documentation but couldn't find the correct way of doing this. I looked at "Modifying an Amazon RDS DB instance", ModifyDBSnapshot, etc., but those operations cover the database engine and configuration settings, not the data itself.
I need an expert opinion on how to accomplish this task in an optimal way. TIA

You can't modify the contents of the database in a snapshot. You have to:
restore your snapshot to a new DB instance first,
remove/modify the columns you want by connecting to that instance as you normally would, and
create a new snapshot of it.
Then you can copy/share the modified snapshot.
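A minimal sketch of those steps with boto3 (the instance and snapshot identifiers and the account ID below are placeholders; the actual column scrubbing in step 2 happens over a normal SQL connection, not through the RDS API):

import boto3

rds = boto3.client("rds")

# 1. Restore the original snapshot into a temporary DB instance.
rds.restore_db_instance_from_db_snapshot(
    DBInstanceIdentifier="scrub-temp-instance",
    DBSnapshotIdentifier="original-snapshot",
)
rds.get_waiter("db_instance_available").wait(DBInstanceIdentifier="scrub-temp-instance")

# 2. Connect to scrub-temp-instance with a normal SQL client and null out the
#    sensitive columns, e.g. UPDATE customers SET ssn = NULL, salary = NULL;

# 3. Take a new snapshot of the scrubbed instance.
rds.create_db_snapshot(
    DBSnapshotIdentifier="scrubbed-snapshot",
    DBInstanceIdentifier="scrub-temp-instance",
)
rds.get_waiter("db_snapshot_available").wait(DBSnapshotIdentifier="scrubbed-snapshot")

# 4. Share the scrubbed snapshot with the client's AWS account.
rds.modify_db_snapshot_attribute(
    DBSnapshotIdentifier="scrubbed-snapshot",
    AttributeName="restore",
    ValuesToAdd=["123456789012"],  # the client's AWS account ID (placeholder)
)

The temporary instance can be deleted once the scrubbed snapshot exists.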

Related

Copying tables from databases to a database in AWS in the simplest and most reliable way

I have some tables in three databases whose data I want to copy to another database in an automated way, and the data is quite large. My servers are running on AWS. What is the simplest and most reliable way to do so?
Edit
I want them to stay in sync (an automated process; I'm a DevOps engineer).
The databases are all MySQL, and the data is all moved between AWS EC2 instances. The data ranges between 100 GiB and 200 GiB.
Currently, Maxwell takes the data from the tables and moves it to Kafka, and then a script written in Java feeds the other database.
I believe you can use AWS Database Migration Service (DMS) to replicate tables from each source into a single target. You would have a single target endpoint and three source endpoints. You would have three replication tasks that would take data from each source and put it into your target. DMS can keep data in sync via ongoing replication. Be sure to read up on the documentation before proceeding as it isn't the most intuitive service to use, but it should be able to do what you are asking.
https://docs.aws.amazon.com/dms/latest/userguide/Welcome.html
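As a rough illustration, one of the three replication tasks could be created with boto3 along these lines (the ARNs, identifiers, and schema name are placeholders; the endpoints and replication instance are assumed to exist already, and the other two sources follow the same pattern):

import json
import boto3

dms = boto3.client("dms")

# Select every table in the source schema (narrow this down for specific tables).
table_mappings = json.dumps({
    "rules": [{
        "rule-type": "selection",
        "rule-id": "1",
        "rule-name": "include-all",
        "object-locator": {"schema-name": "appdb", "table-name": "%"},
        "rule-action": "include",
    }]
})

# One replication task per source database, all pointing at the same target endpoint.
task = dms.create_replication_task(
    ReplicationTaskIdentifier="source1-to-target",
    SourceEndpointArn="arn:aws:dms:eu-west-1:111111111111:endpoint:SOURCE1",   # placeholder
    TargetEndpointArn="arn:aws:dms:eu-west-1:111111111111:endpoint:TARGET",    # placeholder
    ReplicationInstanceArn="arn:aws:dms:eu-west-1:111111111111:rep:INSTANCE",  # placeholder
    MigrationType="full-load-and-cdc",  # initial copy plus ongoing replication to stay in sync
    TableMappings=table_mappings,
)

dms.start_replication_task(
    ReplicationTaskArn=task["ReplicationTask"]["ReplicationTaskArn"],
    StartReplicationTaskType="start-replication",
)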

PostgreSQL database sync

I'm new to working with databases and I'm trying to do the following:
Copy all databaseA schemas (each of which has several tables and permissions attached), without any data, to my existing databaseB table as record fields (databaseB currently contains only one schema, plus a few tables and attached permissions).
databaseA is an Amazon Redshift database and databaseB is an Amazon RDS database. I'm connecting to both using DBeaver: for databaseA I'm using a Redshift driver and for databaseB a PostgreSQL driver.
After the initial copy I want to run a daily cron job that checks the following:
a. Compare databaseA to databaseB table
b. If databaseA does not match databaseB (in terms of schema & table permissions)
c. Then switch all permissions to match the databaseB table
Any feedback on how to approach this would be appreciated!
You could create a Python script that connects to both databases, and set up a cron job to spot the differences daily and update the database.
You can have a query like this for PG:
SELECT table_schema,table_name
FROM information_schema.tables
ORDER BY table_schema,table_name;
And something like this for Redshift:
SELECT schemaname, tablename
FROM PG_TABLE_DEF;
From there it's just a matter of comparing the two and deciding if you want to update certain tables. Good luck.
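A minimal sketch of such a script, assuming psycopg2 for both connections and hypothetical environment variables for the credentials:

import os
import psycopg2  # psycopg2 can talk to both RDS PostgreSQL and Redshift

# Hypothetical connection strings, e.g. "host=... port=5439 dbname=... user=... password=...".
REDSHIFT_DSN = os.environ["REDSHIFT_DSN"]
RDS_DSN = os.environ["RDS_DSN"]

def fetch_tables(dsn, query):
    # Return a set of (schema, table) tuples from the given database.
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        cur.execute(query)
        return set(cur.fetchall())

# Note: PG_TABLE_DEF only lists tables in schemas that are on the search_path.
redshift_tables = fetch_tables(
    REDSHIFT_DSN,
    "SELECT DISTINCT schemaname, tablename FROM pg_table_def",
)
rds_tables = fetch_tables(
    RDS_DSN,
    "SELECT table_schema, table_name FROM information_schema.tables "
    "WHERE table_schema NOT IN ('pg_catalog', 'information_schema')",
)

# Tables present in databaseA (Redshift) but missing from databaseB (RDS), and vice versa.
print("Missing in databaseB:", sorted(redshift_tables - rds_tables))
print("Only in databaseB:", sorted(rds_tables - redshift_tables))

Permissions can be compared the same way, for example by diffing the results of has_table_privilege() checks run on each side.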
I don't have experience with AWS. I'm translating the little knowledge I have from OCS, which is a younger solution than AWS.
First, Amazon Redshift is tailored for data warehousing, while RDS is a cloud relational database. I'm not sure what your aim is in copying from Redshift to RDS. It would be more natural to copy/clone the DB or multiple DBs to the data warehouse, unless this is some form of backup. You might need to look into the architecture of your solution.
Oracle Cloud, which is fairly new, provides a service for copying. Amazon should have a similar solution, as they have been in the cloud business longer.
I have had a look at the Amazon documentation. Your challenge has a solution "backwards" here.
After copying the two DBs, my assumption is they would be structurally similar. What is affecting the changes on dbA? It feels like you don't want to use the permissions on dbA; maybe it's compromised.
My suggestion is to use permissions to prevent changes to dbA. Look at the IAM documentation and check the logs for dbA. If you really need to develop a solution, use the API or CLI to interface with the DB.

Keep two different databases synchronized

I'm modeling a new microservice architecture, migrating part of a monolithic application to microservices.
I'm adding a new PostgreSQL database, and the idea is to use that database in the future, but for now I still need to keep the old SQL Server database updated and also synchronize the PostgreSQL database whenever something new appears in the old database.
I've searched for ETL tools, but they are meant to move data to a data warehouse (which is not what I need). I can't just replicate the information, because the DB model is not the same.
Basically I need a way to detect new rows inserted in the SQL Server database, transform that information and insert it in my PostgreSQL.
Any suggestions?
PostgreSQL's foreign data wrappers might be useful. My approach would be to change the frontend to use PostgreSQL and let PostgreSQL handle the split via its various features (triggers, rules, ...).
Take a look at StreamSets Data Collector. It can detect changes in SQL Server and insert/update/delete to any DB that has a JDBC driver including Postgres. It is open source but you can buy support. You can also make field changes/additions/removals/renaming to the data stream so that the fields match the target table.
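If a managed tool is more than you need, the "detect new rows, transform, insert" flow from the question can also be handled by a small polling script. A minimal sketch, assuming pyodbc/psycopg2, a hypothetical dbo.Customers source table with an incrementing Id, and a differently shaped customers target table:

import pyodbc     # SQL Server
import psycopg2   # PostgreSQL

# Hypothetical connection details.
MSSQL_CONN = "DRIVER={ODBC Driver 17 for SQL Server};SERVER=old-server;DATABASE=legacy;UID=app;PWD=secret"
PG_DSN = "host=new-server dbname=micro user=app password=secret"

def sync_new_customers():
    src = pyodbc.connect(MSSQL_CONN)
    dst = psycopg2.connect(PG_DSN)
    with dst, dst.cursor() as pg_cur:
        # Watermark: the highest legacy id already copied into the new model.
        pg_cur.execute("SELECT COALESCE(MAX(legacy_id), 0) FROM customers")
        last_id = pg_cur.fetchone()[0]

        # Detect rows inserted in SQL Server since the last run.
        ms_cur = src.cursor()
        ms_cur.execute(
            "SELECT Id, FirstName, LastName, Email FROM dbo.Customers WHERE Id > ?",
            last_id,
        )
        for row in ms_cur.fetchall():
            # Transform: the new model stores a single display name.
            pg_cur.execute(
                "INSERT INTO customers (legacy_id, display_name, email) VALUES (%s, %s, %s)",
                (row.Id, f"{row.FirstName} {row.LastName}".strip(), row.Email),
            )
    src.close()
    dst.close()

if __name__ == "__main__":
    sync_new_customers()  # run from cron or any scheduler

A timestamp or change-tracking column works just as well as an incrementing id for the watermark; the important caveat is that updates (not only inserts) also need a detection strategy if they matter to you.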

Is it possible to use SQL Server SESSION CONTEXT with Azure elastic queries

I want to know if it's possible to share SQL Server SESSION CONTEXT variables between different Azure SQL databases using elastic queries.
I searched the official documentation but couldn't find any information about whether this feature is available or not.
SESSION CONTEXT is local to a single server instance in SQL Server (it's tied to a session). SQL Azure is built using SQL Server, but some parts of the mapping are opaque to customers (they can change based on circumstances such as which edition you use or which version of the internal software we are using to deliver the service).
Elastic Queries is a feature to let you query from one database (source) to one or more other databases (target(s)). In such a model, you have a SQL Server session to the source database, and the elastic query has a separate connection/session to each other database being touched.
I think the question you are asking is "can I set the session context on the source connection/session and have it flow through to all the target connections when running queries there?" (That's my best guess - let me know if it is different). The answer today is "no" - the session variables do not flow from source to target as part of the elastic query. Also, since today elastic query is read-only, you can't use elastic query to set the session context individually on each target database connection/session as part of the operation.
In the future, we'll consider whether there is something like this we can do, but right now we don't have a committed timeline for something like this.
I hope this explains a bit how things work under the covers.
Sincerely,
Conor Cunningham
Architect, SQL
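For illustration, the behaviour described above looks like this from a client. A minimal sketch, assuming pyodbc and a hypothetical external table dbo.RemoteOrders created with CREATE EXTERNAL TABLE on the source database:

import pyodbc

# Hypothetical connection string to the *source* Azure SQL database.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=myserver.database.windows.net;DATABASE=sourcedb;UID=app;PWD=secret"
)
cur = conn.cursor()

# Set a session-context value on the source connection/session.
cur.execute("EXEC sp_set_session_context @key = N'tenant_id', @value = 42")

# Visible on the source session:
cur.execute("SELECT SESSION_CONTEXT(N'tenant_id')")
print(cur.fetchone()[0])  # 42

# This query is shipped to the target database over a separate connection/session,
# so SESSION_CONTEXT(N'tenant_id') evaluated inside the remote database would be
# NULL there - the value set above does not flow through the elastic query.
cur.execute("SELECT TOP 10 * FROM dbo.RemoteOrders")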

Move data from one database to another in Azure

I'm in the process of migrating from dedicated servers to Azure. In my existing SQL Server, I have a few jobs that move data from the live database to archives.
From what I have read so far, in Azure you cannot use cross-database scripts. The other options I have seen include Azure SQL Data Sync, Azure Data Factory, and maybe SSIS. I have to note that there's some logic to what data is archived, and I need the ability to specify this in the query.
Does anyone have some experience, and what would you recommend?
Thanks
You can use the Copy activity inside of Data Factory to do this now, directly in Azure.
Azure Data Factory
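As a rough sketch with the Azure Python SDK (azure-mgmt-datafactory), the archiving logic can sit in the Copy activity's source query; the resource group, factory, dataset names, and the query below are all placeholders:

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    AzureSqlSink, AzureSqlSource, CopyActivity, DatasetReference, PipelineResource,
)

client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

# The rule for what gets archived lives in the source query.
copy_old_rows = CopyActivity(
    name="CopyLiveToArchive",
    inputs=[DatasetReference(type="DatasetReference", reference_name="LiveOrdersDataset")],
    outputs=[DatasetReference(type="DatasetReference", reference_name="ArchiveOrdersDataset")],
    source=AzureSqlSource(
        sql_reader_query="SELECT * FROM dbo.Orders WHERE OrderDate < DATEADD(year, -1, GETDATE())"
    ),
    sink=AzureSqlSink(),
)

client.pipelines.create_or_update(
    "my-resource-group", "my-data-factory", "ArchivePipeline",
    PipelineResource(activities=[copy_old_rows]),
)

The pipeline can then be put on a schedule trigger, which takes over the role of the old SQL Agent jobs.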
