Upload a Postgres db to an Amazon VM [closed]

I've been given a database which I can't handle with my pc, because of little available storage and memory.
The person who gave me this db gave me the following details:
The compressed file is about 15GB, and uncompressed it's around 85-90GB. It'll take a similar amount of space once restored, so make sure the machine that you restore it on has at least 220GB free to be safe. Ideally, use a machine with at least 8GB RAM - although even our modest 16GB RAM server can struggle with large queries on the tweet table.

You'll need PostgreSQL 8.4 or later, and you'll need to create a database to restore into with UTF8 encoding (use -E UTF8 when creating it from the command-line). If this is a fresh PostgreSQL install, I highly recommend you tweak the default postgresql.conf settings - use the pgtune utility (search GitHub) to get some sane defaults for your hardware. The defaults are extremely conservative, and you'll see terrible query performance if you don't change them.
When I told him that my PC sort of sucks, he suggested I use an Amazon EC2 instance.
My two issues are:
How do I upload the db to an Amazon VM?
How do I use it after that?
I'm completely ignorant regarding cloud services and databases as you can see. Any relevant tutorial will be highly appreciated.

If you're new to cloud hosting, consider using EnterpriseDB's cloud options rather than EC2 directly. Details here.
If you want to use EC2 directly, sign up and create an instance.
Choose your preferred Linux distro image. I'm assuming you'll use Linux on EC2; if you'd rather use Windows, you presumably already know how to set that up. Let the new VM provision and boot up, then SSH into it as per the documentation Amazon provides for EC2 and for that particular VM image, and perform any recommended setup for that image as per its documentation.
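You'll also need to copy the dump file up to the instance before you can restore it; scp works fine for that (a hedged example - the key file, dump file name, login user and host below are placeholders, and the default login user differs per image, e.g. ubuntu for Ubuntu or ec2-user for Amazon Linux):
scp -i ~/.ssh/my-ec2-key.pem thedumpfile.gz ubuntu@your-instance-public-dns:/tmp/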
Once you've done the recommended setup for that instance, you can install PostgreSQL:
For Ubuntu, sudo apt-get install postgresql
For Fedora, sudo yum install postgresql
For CentOS, use the PGDG yum repository rather than the outdated version of PostgreSQL shipped in the base repositories (see the sketch below).
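A hedged sketch of what enabling PGDG on CentOS might look like (the repo RPM URL and package names change with the CentOS and PostgreSQL versions, so treat these as placeholders and check https://yum.postgresql.org for the current ones):
sudo yum install https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm
sudo yum install postgresql12-server postgresql12
# then initialise the cluster and start the service as described in the PGDG install instructions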
You can now connect to Pg as the default postgres superuser:
sudo -u postgres psql
and then use PostgreSQL much the same way you would on any other computer. You'll probably want to make yourself a user ID and a new database to restore into:
echo "CREATE USER $USER;" | sudo -u postgres psql
echo "CREATE DATABASE thedatabase WITH OWNER $USER" | sudo -u postgres psql
Change "thedatabase" to whatever you want to call your db, of course.
The exact procedure for restoring the dump to your new DB depends on the dump format.
For pg_dump -Fc or PgAdmin-III custom-format dumps:
sudo -u postgres pg_restore --dbname thedatabase thebackupfile
See "man pg_restore" and the online documentation for details on pg_restore.
For plain SQL format dumps you will want to stream the dump through a decompression program and into psql. Since you haven't said anything about the dump file name or format it's hard to know what to do. I'll assume it's gzip'ed (".gz" file extension), in which case you'd do something like:
gzip -dc thedumpfile.gz | sudo -u postgres psql thedatabase
If its file extension is ".bz2", change gzip -dc to bzip2 -dc. If it's a .zip, unzip it first, then run psql on the extracted file with sudo -u postgres psql thedatabase -f thedumpfilename.
Once restored you can connect to the db with psql thedatabase.
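As a quick sanity check after the restore (a small sketch, reusing the placeholder database name), list the tables and check the database's size:
psql thedatabase -c "\dt"
psql thedatabase -c "SELECT pg_size_pretty(pg_database_size('thedatabase'));"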

Related

Database issue when migrating a Trac project

I am trying to migrate a series of Trac projects originally hosted on CloudForge onto a new Bitnami virtual machine (debian with Trac stack installed).
The documentation on the Trac wiki regarding restoring from a backup is a little vague for me, but it suggests that I should be able to set up a new project
$ sudo trac-admin PROJECT_PATH initenv
stop the services from running
$ sudo /opt/bitnami/ctlscript.sh stop
copy the snapshot from the backup into the new project path and restart the services
$ sudo /opt/bitnami/ctlscript.sh start
and should be good to go.
Having done this (and worked through quite a few issues on the way) I have now got to the point where the browser page shows
Trac Error
TracError: Unable to check for upgrade of trac.db.api.DatabaseManager: TimeoutError: Unable to get database connection within 0 seconds. (OperationalError: unable to open database file)
When I set up the new project I left the default (unedited) database string, but I have no idea what database type was used for the original CloudForge Trac project - i.e. is there an additional step needed to restore the database?
Any help would be greatly appreciated, thanks.
Edit
Just to add: CloudForge was using Trac 0.12.5, while the new VM uses Trac 1.5.1. Not sure if this will be an issue?
Edit
More investigation and I'm now pretty sure that the CloudForge snapshot is not an SQLite (or other) database file - it looks more like a series of SQL statements, as it starts and ends with:
BEGIN TRANSACTION;
...
COMMIT;
Thanks to anyone taking the time to read this but I think I'm sorted now.
After learning more about SQLite I discovered that the file sent by CloudForge was an SQLite dump of the database, and it was easy enough to load it into a new database instance from the command line
$ sqlite3 location_of/new_database.db < dump_file.db
I think I also needed a prior step of emptying the contents of the original new_database.db using the sqlite3 shell (just type sqlite3 in a terminal)
sqlite> .open location_of/new_database.db
sqlite> BEGIN TRANSACTION;
sqlite> DELETE FROM each_table_in_database;
sqlite> COMMIT;
sqlite> .exit
I then had some issues with credentials on the bitnami VM, so I needed to retrieve these (as per the bitnami documentation) using
$ sudo cat /home/bitnami/bitnami_credentials
and add this USER_NAME as a TRAC_ADMIN using
$ trac-admin path/to/project/ permission add USER_NAME TRAC_ADMIN
NOTE that before and after this operation you should stop and restart the bitnami services using
$ sudo /opt/bitnami/ctlscript.sh stop
$ sudo /opt/bitnami/ctlscript.sh start
I am the guy from Trac Users. You need to understand that the user isn't really stored in the db: there are some tables with columns holding the username, but there is no table for a user. Looking at your post, I think your setup used htdigest, so your user info is in that credentials file. If you cat it you should see something like
username:realmname:pwhash
I think the hash is MD5, but it doesn't really matter for your problem. So if you want to make a new user you have to use
htdigest [ -c ] passwdfile realm username
then you should use trac-admin to grant the permission, and at that point your user should be able to log in.
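For example, creating a new user and granting it admin rights might look like this (a sketch only; the digest file path, realm and user name are placeholders - use whatever your Trac/Apache configuration actually points at):
$ htdigest /path/to/trac.htdigest TracRealm newuser
$ trac-admin /path/to/project permission add newuser TRAC_ADMIN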
Cheers
Markus

postgres major upgrade (9.5.x to 9.6.x) within same data space

I was trying to upgrade my Postgres installation from 9.5.7 to 9.6.5.
My production Postgres instance has several databases and has consumed ~700 GB of space so far.
pg_upgrade needs two different directories for the old and the new data dir:
pg_upgrade -b oldbindir -B newbindir -d olddatadir -D newdatadir
So it needs a new directory to do the pg_upgrade. I was able to run the above command on my local/stage database, since it is small compared to prod, and I observed the following locally:
sudo du -sh /var/lib/pgsql/data-9.5
64G /var/lib/pgsql/data-9.5
sudo du -sh /var/lib/pgsql/data-9.6
60G /var/lib/pgsql/data-9.6
I had sufficient free space for the interim pg_upgrade step on local/stage, and I completed the upgrade successfully there.
In production, however, I have only ~300 GB free.
(After a successful upgrade we would delete the /var/lib/pgsql/data-9.5 dir anyway.)
Is there any way to do an in-place data upgrade so that it won't need the same amount of extra space for the interim pg_upgrade step?
Run pg_upgrade
/usr/lib/postgresql/9.6/bin/pg_upgrade \
  -b /usr/lib/postgresql/9.5/bin/ \
  -B /usr/lib/postgresql/9.6/bin/ \
  -d /var/lib/pgsql/data-9.5/ \
  -D /var/lib/pgsql/data/ \
  --link --check
Performing Consistency Checks
-----------------------------
Checking cluster versions ok
Checking database user is the install user ok
Checking database connection settings ok
Checking for prepared transactions ok
Checking for reg* system OID user data types ok
Checking for contrib/isn with bigint-passing mismatch ok
Checking for roles starting with 'pg_' ok
Checking for presence of required libraries ok
Checking database user is the install user ok
Checking for prepared transactions ok
Clusters are compatible
Always run the pg_upgrade binary of the new server, not the old one. pg_upgrade requires the specification of the old and new cluster's data and executable (bin) directories. You can also specify user and port values, and whether you want the data linked instead of copied (copying is the default).
If you use link mode, the upgrade will be much faster (no file copying) and use less disk space, but you will not be able to access your old cluster once you start the new cluster after the upgrade. Link mode also requires that the old and new cluster data directories be in the same file system. (Tablespaces and pg_xlog can be on different file systems.) See pg_upgrade --help for a full list of options.
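Once the --check run reports that the clusters are compatible, the real upgrade is the same command without --check; with --link it creates hard links instead of copying files, so it needs very little additional space. A sketch, reusing the paths above (pg_upgrade also leaves helper scripts in the current directory for the follow-up steps):
/usr/lib/postgresql/9.6/bin/pg_upgrade \
  -b /usr/lib/postgresql/9.5/bin/ \
  -B /usr/lib/postgresql/9.6/bin/ \
  -d /var/lib/pgsql/data-9.5/ \
  -D /var/lib/pgsql/data/ \
  --link
./analyze_new_cluster.sh   # regenerate optimizer statistics on the new cluster
./delete_old_cluster.sh    # remove the old 9.5 data directory once you are satisfied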
Thanks to the Postgres community's comprehensive documentation, which helped me a lot in finding the solution.

How to operate a postgres database from 1 hard disk on multiple systems?

My issue is that I work from various systems yet require access to a single, large (50 GB) database that is always up to date. If it were a smaller database, dumping and restoring it onto the external disk would be fine, e.g. saving it via $ pg_dump mydb > /path../db.sql and then, on the other computer, recovering the data with $ psql -d mydb -f /path../db.sql, and so on...
In this case, however, that option will not work as I don't have 50 Gb of free space on both machines. So I'd like the files for this particular db to be on a single external drive.
How would I do this?
I've been using pg_ctlcluster for this purpose, e.g.
$ cp -rp /var/lib/postgresql/9.1/main /media/newdest # move the data
$ pg_ctlcluster 9.1 main stop # stop the postgres server (on ubuntu: add -- [ctl args])
$ /usr/lib/postgresql/9.1/bin/initdb -D /media/newdest # initialise instance in new place
(On ubuntu, pg_ctlcluster is used instead of pg_ctl to allow multiple db clusters and this should allow pg_ctlcluster 9.1 main start -- -D /media/newdest to replace the last line of code, I think)
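For context, the end-to-end sequence I'm aiming for is roughly the following (a sketch only, assuming the Debian/Ubuntu packaging where data_directory is set in /etc/postgresql/9.1/main/postgresql.conf; the paths are illustrative):
$ sudo pg_ctlcluster 9.1 main stop                          # stop the cluster first
$ sudo cp -rp /var/lib/postgresql/9.1/main /media/newdest   # copy the existing data directory
$ sudo chown -R postgres:postgres /media/newdest/main       # keep postgres as the owner
$ sudo sed -i "s|^data_directory.*|data_directory = '/media/newdest/main'|" /etc/postgresql/9.1/main/postgresql.conf
$ sudo pg_ctlcluster 9.1 main start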
I suspect this approach is not the best solution because a) it's not working at present after various tries and b) I'm not sure I'll be able to access the cluster from another computer.
Database software is designed to handle large datasets, and moving your data around is a common task, so I am baffled about why there is so little information on this on the internet:
This question, which basically says "don't use TABLESPACES to do it"
This one, which just solves the permissions problem and links to a (useful) IBM page on the matter that talks about moving the entire setup, not just the one database I want to move.

Oracle public server

I want to learn about Oracle and try some queries and other SQL features of the Oracle database, but I don't want to install it and mess with all the related issues. So my question is: is there any publicly available Oracle server that I can connect to through a terminal and play with?
I mean a service where I can register and have some space allocated to my profile.
Take a look at: http://apex.oracle.com/
The only thing I can think of is SQLFiddle: http://sqlfiddle.com/
But it won't let you have a "private" space. You need to re-create your schema each time (but you can bookmark your script which might be enough for you).
You could also try one of the pre-built virtual appliances - see
http://www.oracle.com/technetwork/community/developer-vm/index.html
If you need direct database access, you can run it in a Docker instance:
docker run -d -p 1521:1521 -p 8080:8080 alexeiled/docker-oracle-xe-11g
Then connect to it with sqlplus
sqlplus system/oracle@localhost:1521/xe
See here for more passwords, info on apex, etc.
Just came across this: Oracle Live SQL. It is browser based so nothing to install locally. But, you need to have an Oracle account.
Browser based SQL worksheet access to an Oracle database schema

Restore PostgreSQL database from mounted volume

My EC2 database server failed, preventing SSH or other access (not sure why ... grrr AWS ... that's another story).
I was able to make a snapshot of the EBS root volume. I can not boot a new instance from this volume (I'm guessing the boot partition is corrupt). However, I can attach and mount the volume on a new instance.
Now I need to get the PostgreSQL 8.4 on the new machine (Ubuntu 10.04) to load the data from the mounted volume. Is this possible? I've tried:
pg_ctl start -D /<mount_dir>/etc/postgresql/8.4/main/
But no joy ... PostgreSQL just starts with empty tables.
Is /etc/postgresql/8.4/main/ the correct location for PostgreSQL data files?
Is there a way to recover the data from the mounted volume in a way that PostgreSQL can read again?
(You should really specify your distro and version, etc, with this sort of system admin question.)
Running Pg via pg_ctl as shown above should work, assuming the original database was from Pg 8.4 and that the binaries you're trying to start it with are 8.4 too. Perhaps you forgot to stop the instance of PostgreSQL automatically started by the distro? Or connected on the wrong port, so you got the distro's default instance instead of your DB on another port (or a different unix socket path, for unix sockets)?
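One quick way to check which instance you've actually connected to (a small sketch; run it as the postgres user so you're sure to have permission to read these settings):
sudo -u postgres psql -c "SHOW data_directory;"
sudo -u postgres psql -c "SHOW port;"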
Personally I wouldn't do what you're doing anyway. First, before I did anything else, I'd make a full backup of the entire data directory because you clearly don't have good backups, otherwise you wouldn't be worrying about this. Take them now, because if you break something while restoring you're going to hate yourself. As demonstrated by this fault, trusting Amazon's storage (snapshot or otherwise) probably isn't good enough.
Once you've done that: the easiest way to restore your DB will be to take a new instance that you know has no important data on it and that has the same major version of PostgreSQL (e.g. "8.4" or "9.0") installed as your original instance did, and on it run:
/etc/init.d/postgresql-8.4 stop
datadir=/var/lib/postgresql/8.4/main
rm -rf "$datadir"
cp -aR /<mount_dir>/etc/postgresql/8.4/main/ "$datadir"
chown -R postgres:postgres "$datadir"
/etc/init.d/postgresql-8.4 start
In other words: take a copy, fix the permissions, start the DB.
You might need to edit /etc/postgresql/8.4/main/postgresql.conf and/or /etc/postgresql/8.4/main/pg_hba.conf because any edits you made to the originals aren't there anymore; they're on your corrupted root FS. The postgresql.conf and pg_hba.conf in the datadir are just symlinks to the ones in etc under Debian - something I understand the rationale behind, but don't love.
Once you get it running, do an immediate pg_dumpall and/or just a pg_dump of your important DB, then copy it somewhere safe.
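For example (a sketch; the output file name and destination host are placeholders):
sudo -u postgres pg_dumpall | gzip > /tmp/all_databases.sql.gz
scp /tmp/all_databases.sql.gz you@some-other-host:/path/to/backups/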
