Copy Postgres database structures but not data

We are creating a Dockerfile that can spin up a Postgres database container, using this image as a basis
https://hub.docker.com/_/postgres/
Every time we test we want to create a fresh database with the production database as a template - we want to copy the database structures without copying the data in tables etc.
Can someone provide an example of doing this? I need something concrete with database urls etc.
There are some good examples here, but some are a bit nebulous to a Postgres newb.
I see examples like this:
pg_dump production-db | psql test-db
I don't know what "production-db" / "test-db" refer to (are they URL strings?), so I am lost. Also, I believe this copies over all the data in the DB, and we really just want to copy the database structures (tables, views, etc).
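For concreteness, "production-db" and "test-db" in that example are just the names (or connection strings) of the source and target databases. A schema-only variant, with made-up connection URLs, might look roughly like this:

# --schema-only copies tables, views, etc. without any row data
pg_dump --schema-only "postgresql://produser:secret@prod-host:5432/prod_db" \
  | psql "postgresql://testuser:secret@localhost:5432/test_db"

Both URLs are placeholders; the users, hosts, and database names need to be adjusted to your setup.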

Related

How do I clone an entire database from one Postgres instance, to become a part of a database hosted on a different instance?

I have a PostgreSQL instance A with 10 tables, and another instance B hosted on a different box, which contains the same 10 tables but also many others. I'd like to clone all 10 tables from the small database A to overwrite their equivalents in the larger database B. What's a good way to do this?
One path I'm considering is to do a full pg_dump of A, copy that dump file to B's host, then pg_restore it into B. It seems like it should work since I do want every single table on A to overwrite the table of the same name on B, but I'm just a bit nervous doing a pg_restore of a full database dump, and I'm also not very familiar with pg_dump and pg_restore so it would be great to have that plan validated by someone more knowledgeable.
You can use a plain format pg_dump with the --clean option and specify the tables you want to dump with -t.
Then you get an SQL script that contains only the tables you want replaced, and each table is preceded by a DROP TABLE.
You can check the script before using it.
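For example (hosts, users, and table names below are placeholders), the whole sequence could look roughly like this:

# dump only the tables to be replaced, with DROP TABLE statements included
pg_dump --clean -t table1 -t table2 -h hostA -U userA dbA > tables.sql
# review tables.sql, then apply it on the other instance
psql -h hostB -U userB -d dbB -f tables.sql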

What is the best way to update (or replace) an entire database collection on a live mongodb machine?

I'm given a data source monthly that I'm parsing and putting into a MongoDB database. Each month, some of the data will be updated and some new entries will be added to the existing collections. The source file is a few gigabytes big. Apart from these monthly updates, the data will not change at all.
Eventually, this database will be live and I want to prevent having any downtime during these monthly updates if possible. What is the best way to update my database without any downtime?
This question is basically exactly what I'm asking, but not for a MongoDB database. The accepted answer there is to upload a new version of the database and then rename the new database to use the old one's name.
However, according to this question, it is impossible to easily rename a MongoDB database. This renders that approach unusable.
Intuitively, I would try to iteratively 'upsert' the entire database using each document's unique 'gid' identifier (this is a property of the data, as opposed to the "_id" generated by MongoDB) as a filter, but this might be an inefficient way of doing things.
I'm running MongoDB version 4.2.1
Why do you think updating the data would mean downtime?
It sounds like you don't want your users to be able to access the new data mid-load.
If this is the case, a strategy could be to have two databases, a live one and a staging one; rather than renaming the staging database to live, you could just change the connection string in the client application(s) that connect to it.
Also consider mongodump and mongorestore to copy databases; although these can be slower with larger databases.
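A rough sketch of the dump/restore variant (the URI and database names are made up; --nsFrom/--nsTo rename the namespace during the restore):

# load the monthly data into a staging database, then copy it over the live one
mongodump --uri="mongodb://localhost:27017/staging" --out=/backups/monthly
mongorestore --uri="mongodb://localhost:27017" --drop \
  --nsFrom="staging.*" --nsTo="live.*" /backups/monthly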

How to take a backup of a big filestream enabled db without the files

We have a large (>40 GB) FILESTREAM-enabled db in production. I would like to automatically make a backup of this db and restore it to staging, for testing deployments. The nature of our environment is such that the filestream data is >90% of the data and I don't need it in staging.
Is there a way that I can make a backup of the db without the filestream data, as this would drastically reduce my staging disk and network requirements, while still enabling me to test a (somewhat) representative sample of prod?
I am assuming you have a fairly recent version of SQL Server. Since this is production, I am assuming you are using the full recovery model.
You can't just exclude individual tables from a backup; backup and restore do not work like that. The only possibility I can think of is to do a backup of just the filegroups that do not contain the filestream data. I am not 100% sure you will be able to restore it, though, since I have never tried it. Spend some time researching partial backups and restoring a filegroup, and give it a try.
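If you want to experiment with that idea, and assuming the FILESTREAM data lives in its own filegroup so that the regular tables are all in PRIMARY (server, database, and path names below are made up), the backup side might look something like this:

# back up only the PRIMARY filegroup, leaving the FILESTREAM filegroup out
sqlcmd -S ProdServer -Q "BACKUP DATABASE [ProdDb] FILEGROUP = N'PRIMARY' TO DISK = N'D:\Backups\ProdDb_primary.bak' WITH COPY_ONLY"

Restoring that on staging would be a partial restore (RESTORE with the PARTIAL option), which is exactly the part worth testing before relying on it.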
You can use the Generate Scripts interface and do one of the following:
copy all SQL objects and the data (without the filestream tables) and recreate the database
copy all SQL objects without the data, create the objects in a new database on the current SQL instance, and copy the data that you need directly from the first database (as sketched below)
The first is lazy and probably will not work well with a big database. The second will work for sure, but you need to sync the data on your own.
In both cases the steps are the same: open the Generate Scripts wizard, choose all objects and all tables except the big ones, and use the scripting options to control whether data is extracted (skipped or included).
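If you go with the second option, the "copy the data directly" step can be done with plain cross-database INSERT ... SELECT statements on the same instance; a sketch with hypothetical server, database, and table names:

# repeat per table whose data is needed; tables with identity columns or triggers need extra handling
sqlcmd -S MyServer -Q "INSERT INTO NewDb.dbo.Customers SELECT * FROM ProdDb.dbo.Customers"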
I guess it will be best to script all the objects without the data. Then create a model database. You can even add some sample data to your model database. When you change the production database (create a new object, delete an object, etc.), apply these changes to your model database, too. Having such a model database means you have a copy of your production database with all supported functionality, and you can restore this model database on every test SQL instance you want.

SQLDeveloper copy database

I'm trying to copy a database for use in testing/development. In SQL Developer I can only see the user views; the data objects are not accessible to me.
Is there any way to copy the views only and get DDL that creates some sort of phantom structure for the data objects that are not reachable but are referenced in the SQL queries of those views? The problem is that there are over a thousand such references.
In the example below I cannot reach the header object due to permissions.
Example:
CREATE OR REPLACE FORCE VIEW "TRADE"."EXCHANGE" ("MSGQUE", "MSGQUE2") AS
select msgque, msgque2
from head.msgqueues;
I have tried to export the views in SQL Developer, but when I import them into my Oracle test database the views contain errors and are unusable, because the data objects did not get exported in the export.sql file.
Thanks in advance
I recommend using the expdp utility to perform this. You can explicitly say to grab views and tables.
Example parfile:
SCHEMAS=SCOTT
INCLUDE=TABLE:"IN ('DEPT')"
INCLUDE=VIEW
DIRECTORY=datapump
DUMPFILE=dept.dmp
LOGFILE=dept.log
Then you can run impdp against the DB you wish using that dump file, and you will have the TABLE and the VIEW that go in the schema. You can modify the IN clause to grab whatever naming scheme you need.
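To sketch the full round trip (the connect strings, passwords, and file names here are placeholders, and the datapump directory object is assumed to exist on both instances):

# export the listed tables and the views from the source database
expdp scott/password@sourcedb parfile=dept.par
# copy dept.dmp into the datapump directory on the test instance, then import it
impdp scott/password@testdb directory=datapump dumpfile=dept.dmp logfile=imp_dept.log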

PostgreSQL: Difference between pg_dump + psql versus template create?

I know there are two ways of making a copy of a database.
One is to export the database as a giant SQL file, then load it as a separate database:
pg_dump <database> | psql <new database>
Another way is to pass the database name as a template to a database creation argument:
createdb -T <database> <new database>
What is the difference between these two methods, if any?
Are there any benefits of using one over another, such as performance?
Using CREATE DATABASE/createdb with a template makes a directory-level copy, whereas pg_dump + psql has to serialize and deserialize the whole database, send it on a round trip to the client, and run everything through the transaction and write-ahead logging machinery. So the former method should be much faster.
The disadvantage is that CREATE DATABASE locks the template database while it's being copied. So if you want to create copies of a live database, that won't work so well. But if you want to quickly make copies of an inactive/template database, then using CREATE DATABASE is probably the right solution.
According to the current docs:
Although it is possible to copy a database other than template1 by specifying its name as the template, this is not (yet) intended as a general-purpose "COPY DATABASE" facility. The principal limitation is that no other sessions can be connected to the template database while it is being copied. CREATE DATABASE will fail if any other connection exists when it starts; otherwise, new connections to the template database are locked out until CREATE DATABASE completes.
Apart from that mild warning, which goes back to at least version 8.2, you can make certain kinds of changes by using createdb, such as changing the collation, encoding, etc. (within limits).
Personally, I'd have a hard time justifying the use of createdb, which takes a full database lock, to copy a production database.
I think the other main difference is that "dump and load" is a fully supported way of copying a database. Also, you can carry a copy of the dump to an isolated development computer for testing if you need to. (The createdb utility has to have access to both the source and the target at the same time.) But I have not used createdb to make copies, so I could be wrong.
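For example, the "carry a copy to another machine" workflow might look roughly like this (host and database names are invented; -Fc writes a compressed custom-format dump that pg_restore understands):

pg_dump -Fc -h prod-host -U produser prod_db > prod_db.dump
# copy prod_db.dump to the dev machine, then:
createdb dev_copy
pg_restore -d dev_copy prod_db.dump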
