Access Newly Created PostgreSQL Cluster

I want to create a new database cluster in PostgreSQL so that it can point at a different (and larger) data directory than my current cluster on localhost:5432. To create the new cluster, I ran the command below. However, after restarting PostgreSQL I don't see the new cluster in pgAdmin and can't connect to a server on port 5435. How do I connect to this new cluster?
Alternatively, I thought I could create a new tablespace within the old cluster that points to this larger data directory, but I'm assuming that populating a database using that tablespace would still result in files being written to the cluster's data directory? I'm running PostgreSQL 9.3 on Ubuntu 12.04.
$ pg_createcluster -d /home/foo 9.3 test_env
Creating new cluster 9.3/test_env ...
config /etc/postgresql/9.3/test_env
data /home/foo/
locale en_US.UTF-8
port 5435

You need to pass the '--start' option if you want pg_createcluster to actually start the new cluster. Without it, only the data directory and configuration are created.
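For example, a minimal sketch assuming the cluster name and port from the output above (adjust the user and authentication options to your setup):
$ pg_ctlcluster 9.3 test_env start                 # start the already-created cluster
$ pg_lsclusters                                    # confirm it is online and listening on 5435
$ psql -h localhost -p 5435 -U postgres postgres   # connect to the new cluster on its own port
In pgAdmin you would then register a new server pointing at localhost port 5435; the existing server entry still points at the old cluster on 5432.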
However, using tablespaces would probably be a better solution, since running multiple database clusters introduces the overhead of running completely separate postmaster processes. If you create a database in a tablespace, all files associated with the database will go in the associated directory. The system catalogue will still be in the cluster's data directory, but that is unlikely to be large enough to cause you any problems.
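A rough sketch of the tablespace route (the tablespace and database names are just examples; the directory must exist, be empty, and be owned by the postgres OS user):
$ sudo mkdir -p /home/foo/pg_tblspc
$ sudo chown postgres:postgres /home/foo/pg_tblspc
$ psql -p 5432 -U postgres -c "CREATE TABLESPACE bigspace LOCATION '/home/foo/pg_tblspc'"
$ psql -p 5432 -U postgres -c "CREATE DATABASE bigdb TABLESPACE bigspace"
Tables created in bigdb then have their data files written under /home/foo/pg_tblspc instead of the cluster's default data directory.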

Related

How to set up an existing Postgres DB on a new server?

I have a PostgreSQL DB on an AWS instance. For some reason the instance is now damaged, and the only thing I can do is detach the disk volume and attach it to a new instance.
The challenge now is: how do I set up the PostgreSQL DB I had on the damaged instance's volume on the new instance without losing any data?
I tried to attach the damaged instance's volume as the main volume in the new instance, but it doesn't boot up. So instead I mounted the volume as a secondary disk, and now I can see the information on it, including the "data" folder where the Postgres DB information is supposed to be, but I don't know what to do to enable the DB on this new instance.
The files in the /path/to/data/ directory are all you need for a PostgreSQL instance to start up, provided the directory's permissions are set to 0700 and it is owned by the user starting the process (usually postgres, but sometimes others). The other things to bear in mind are:
Destination OS needs to be the same as where you got the data/ directory from (because filesystem variations may either corrupt your data or prevent Postgres from starting up)
Destination filesystem needs to be the same as where you got your data/ directory from (for reasons similar to above)
Postgres major version needs to be the same as where you got your data/ directory from (because the on-disk data format changes between major versions)
If these conditions are met, you should be able to bring up the database and use it.
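A minimal sketch of what this can look like on the new instance (the device name, mount point, data path, and version-specific pg_ctl location are assumptions; adjust them to your setup):
$ sudo mount /dev/xvdf1 /mnt/olddisk                       # secondary disk from the damaged instance
$ sudo chown -R postgres:postgres /mnt/olddisk/path/to/data
$ sudo chmod 700 /mnt/olddisk/path/to/data
$ sudo -u postgres /usr/lib/postgresql/9.3/bin/pg_ctl \
      -D /mnt/olddisk/path/to/data -l /tmp/pg.log start    # use the pg_ctl binary matching the data's major version
Alternatively, point the data_directory setting of the distribution's existing cluster at the mounted path and restart the service.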

How to migrate DolphinDB cluster onto another machine

Our company uses DolphinDB on a daily basis. As time goes by, we have too much data on our old server, so we decided to migrate our cluster onto another Dell PowerEdge server with 128 cores. We did this by copying all the data onto the new machine, but when we started the new cluster and tried to open a DFS database with the command:
db = database("dfs://rangeDB");
it reported an error message:
The chunk meta returned from name node didn't contain any site.
How can we solve this problem?
DolphinDB is a distributed database system. A database contains metadata on the controller and chunk data on multiple data nodes. Simply copying the directories from one machine to another doesn't work.
The best practice for DolphinDB migration is to use the built-in backup and restore functions: first use the backup function to export data to a shared disk, then use the restore function to import data from the backup directory.

Copy data between two linked servers

I have two MS SQL Server instances, and one is in a DMZ, so it has no access to the inside network.
Today SERVER1 (inside the firewall) pushes data to SERVER2 (in the DMZ).
How do I get better performance when shuffling a large number of rows to tables on SERVER2? Today I do this:
INSERT INTO SERVER2.DB.DBO.TABLE SELECT something from SERVER1Table
It's very slow and time-consuming, and not least it locks the table for outside users.
The thing is that SERVER2 is a web server that acts as a portal for customers to log in and check certain information.
Or am I more or less forced to use a pull-style query instead, so that I need to open up the SQL Server port through the firewall and let the DMZ SERVER2 pull data from SERVER1?
SQL Server Integration Services (SSIS) should be the right tool for the job.
The tool's purpose is to transfer and transform data, so it is really good at this.
You can easily extend your packages and develop simple tasks like the one you mention in minutes.
SSIS will likely take a similar amount of time. You should optimise your architecture instead. Try adding two more steps to minimise locking (see the sketch below):
1. Copy to a new local table on SERVER1 very quickly, applying any filtering.
2. Copy to a similarly named new staging table on SERVER2.
3. Copy from the staging table on SERVER2 to the final destination.
This way the slowest step happens between two tables that are completely disconnected from anything users touch.
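A rough sketch of those three steps from the command line, assuming sqlcmd access from a job host; every table, column, and filter name here is made up for illustration:
# 1. Stage the rows locally on SERVER1 (fast, no cross-server traffic yet; drop or truncate Stage_Export between runs)
$ sqlcmd -S SERVER1 -d DB -Q "SELECT * INTO dbo.Stage_Export FROM dbo.SourceTable WHERE LastModified >= DATEADD(day, -1, GETDATE())"
# 2. Push the staged rows across the link into a staging table on SERVER2 (the slow step, but it only touches staging tables)
$ sqlcmd -S SERVER1 -d DB -Q "INSERT INTO SERVER2.DB.dbo.Stage_Import SELECT * FROM dbo.Stage_Export"
# 3. On SERVER2, move the rows into the live table in one short, local transaction
$ sqlcmd -S SERVER2 -d DB -Q "BEGIN TRAN; INSERT INTO dbo.TABLE SELECT * FROM dbo.Stage_Import; TRUNCATE TABLE dbo.Stage_Import; COMMIT"
Only step 3 takes locks on the customer-facing table, and it is a local, set-based insert, so the lock window is much shorter than with the cross-server INSERT ... SELECT.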

Easiest way to replicate (copy? Export and import?) a large, rarely changing PostgreSQL database

I have imported about 200 GB of census data into a PostgreSQL 9.3 database on a Windows 7 box. The import process involves many files and has been complex and time-consuming. I'm just using the database as a convenient container. The existing data will rarely if ever change, and I'll be updating it with external data at most once a quarter (though I'll be adding and modifying intermediate result columns on a much more frequent basis). I'll call the data in the database on my desktop the “master.” All queries will come from the same machine, not remote terminals.
I would like to put copies of all that data on three other machines: two laptops, one Windows 7 and one Windows 8, and on an Ubuntu virtual machine on my Windows 7 desktop as well. I have installed copies of PostgreSQL 9.3 on each of these machines, currently empty of data. I need to be able to do both reads and writes on the copies. It is OK, and indeed I would prefer it, if changes in the daughter databases do not propagate back to the primary database on my desktop. I'd want to update the daughters from the master 1 to 4 times a year. If this wiped out intermediate results on the daughter databases, that would not bother me.
Most of the replication techniques I have read about seem to be concerned with transaction-by-transaction replication of a live and constantly changing server, and with a perfect history of queries and changes. That is overkill for me. Is there a way to replicate by just copying certain files from one PostgreSQL instance to another? (If replication is the name of a specific form of copying, I'm trying to ask the more generic question.) Or maybe by restoring each (empty) instance from a backup file of the master? Or by asking PostgreSQL to create and export (ideally onto an external hard drive) some kind of PostgreSQL binary of the data that another instance of PostgreSQL can import, without my having to define all the tables and data types and so forth again?
This question is also motivated by my desire to work around a home wifi/lan setup that is very slow – a tenth or less of the speed of file copies to an external hard drive. So if there is a straightforward way to get the imported data from one machine to another by transference of (ideally compressed) binary files, this would work best for my situation.
While you could perhaps copy the data directory directly as mentioned by Nick Barnes in the comments above, I would recommend using a combination of pg_dump and pg_restore, which will produce a self-contained dump that can then be distributed to the other copies.
You can run pg_dump on the master to get a dump of the DB. I would recommend the options -Fd -j3 to use the directory archive format (instead of dumping plain SQL; the archive is compressed, so it should be much smaller and probably faster to produce as well) and to dump 3 tables at once (adjust this up or down depending on the disk throughput of your machine and the number of cores it has). Note that parallel dumping with -j only works with the directory format; the single-file custom format (-Fc) is also compact but dumps serially.
Then you run dropdb on the copies, createdb to recreate an empty DB of the same name, and pg_restore against that new empty DB to load the dump. You would want the options -d <dbname> -j3, passing the dump directory as the final argument (pg_restore's -f flag writes output rather than reading the dump), again adjusting the number for -j to the abilities of the machine.
When you want to refresh the copies with new content from the master DB, simply repeat the above steps.
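Putting the cycle together, a sketch with example names (census_db and the dump path are placeholders, not anything from the question):
# on the master
$ pg_dump -Fd -j3 -f /media/external/census_dump census_db     # parallel, compressed directory-format dump
# copy /media/external/census_dump to the other machine (external drive, rsync, etc.), then on each copy:
$ dropdb census_db
$ createdb census_db
$ pg_restore -d census_db -j3 /media/external/census_dump      # parallel restore into the fresh, empty DB
On Windows these commands live in PostgreSQL's bin directory and may need -U/-h/-p connection options depending on how authentication is set up.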

Copying a tablespace from one PostgreSQL instance to another

I'm looking for a way to quickly "clone" a database from one PostgreSQL server to another.
Assuming...
I have a PostgreSQL server running on HostA, serving 2 databases
I have 2 devices mounted on HostA, each device storing the data for one of the databases (i.e. 1 database => 1 tablespace => 1 device)
I am able to obtain a "safe" point-in-time snapshot through some careful method
I am able to consistently and safely produce a clone of either of the 2 devices used (I could even assume both databases only ever receive reads)
The hosts involved run CentOS 5.4
Is it possible to host a second PostgreSQL server on HostB, mount one of the cloned devices, and get the corresponding database to "pop" into existence? (Kind of like when you copy the MyISAM table files in MySQL.)
If this is at all possible, what is the mechanism (i.e. what DDL or pg commands should I look into)?
It's important for me to be able to move individual databases in isolation from each other. But if this isn't possible, would a similar approach work at the server level (trying to clone and respawn a server by copying the data directory over to a host with the same PostgreSQL install)?
Not easily, because there are a number of files that are shared between databases, so every database in the same install depends on them.
You can do it at the server level, or at the cluster level, but not at individual database level. Just be sure to copy/clone over the whole data directory and all external tablespaces. As long as you can make the clone atomically (either on the same filesystem or with a system that can do atomic clones across filesystems), you don't even need to stop the database on hostA to do it.
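A hedged sketch of such a cluster-level clone (the paths are examples; do this from an atomic snapshot of both filesystems, or with the source cluster stopped):
# on HostA: copy the whole data directory and every external tablespace directory
$ rsync -a /var/lib/pgsql/data/  hostB:/var/lib/pgsql/data/
$ rsync -a /mnt/device1/tblspc1/ hostB:/mnt/device1/tblspc1/     # tablespace for database 1
$ rsync -a /mnt/device2/tblspc2/ hostB:/mnt/device2/tblspc2/     # tablespace for database 2
# on HostB: the symlinks in data/pg_tblspc point at those absolute paths, so keep the paths identical, then start
$ sudo -u postgres pg_ctl -D /var/lib/pgsql/data start
Both databases appear together on HostB; there is no supported way to attach just one of them this way.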
