In a server with a single postgres database, is it possible to migrate the whole database onto a different server (running the same OS, etc) without going through the usual time-consuming way of dumping and importing (pg_dump)?
After all, everything must still be in the filesystem?
Assume the postgres service is not running and both servers run Ubuntu.
Alternatively, you can use pg_basebackup, which connects to the running source server over the network and requests a copy of all its files. This is preferable where the architecture, OS, etc. are not changing. For more complex cases, see barman, which will manage this process for you.
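As a rough sketch of what a pg_basebackup invocation might look like when driven from a script, the helper below only assembles the command line. The host, role, and target directory are placeholders, not values from the question, and the role is assumed to have the REPLICATION privilege. Note that pg_basebackup talks to a running server, so the source postgres service has to be up for this route.

```python
import subprocess


def basebackup_command(host, user, target_dir):
    """Build the argv for a pg_basebackup run (all arguments are placeholders)."""
    return [
        "pg_basebackup",
        "-h", host,        # source server to copy from
        "-U", user,        # role allowed to make replication connections
        "-D", target_dir,  # empty directory that receives the data files
        "-X", "stream",    # stream the WAL needed to make the copy consistent
        "-P",              # report progress while copying
    ]


def run_basebackup(host, user, target_dir):
    # Raises subprocess.CalledProcessError if pg_basebackup exits non-zero.
    subprocess.run(basebackup_command(host, user, target_dir), check=True)
```

Calling run_basebackup on the destination server pulls the files into target_dir, which can then be used as that server's data directory.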
I'm in charge of fixing part of an application that syncs data entities from a DB2 database running on an iSeries to a SQL Server database on Azure. The application seems to run this synchronization just fine when the iSeries and the SQL Server host are on the same local network. However, add in the change in latency when the database host is in Azure and this process slows to an unacceptable level.
I'm looking for ideas on how to proceed. Some additional information:
Some of our clients might have millions of records in some tables.
An iSeries platform is not virtualizable/containerizable as far as I know.
Internet connection speed will vary wildly. Highly dependent on the client.
The current solution uses an ODBC driver to get data from the iSeries. I've started looking at different approaches to retrieving the data, including creating APIs (which doesn't seem to be the best idea for transferring lots of data) or using Azure Data Services, but those are really pieces of a bigger puzzle.
Edit: I'm trying to determine what the bottleneck is and how to fix it. I forgot to add that the IT manager (we only have one person in IT... we are a small company) has set up a test environment using a local iSeries and an instance of our software in Azure. He analyzed the network traffic using Wireshark and told me we aren't using tons of bandwidth, but there seems to be a lot of TCP "chatter". His conclusion was that we need to consider another way to get the data. I'm not convinced, and that is part of the reason I'm here asking this question.
More details:
Our application uses data from an ERP system that runs on the iSeries.
Part of our application is the sync process in question.
The process has two components: a full data sync and a delta/changes data sync.
When these processes run with both the iSeries and our application on site, the performance isn't great, but it is acceptable.
The sync process is written in C# and connects to DB2 using an ODBC driver.
We do some data manipulation and verification as part of the sync process, e.g. making sure parent records exist, foreign keys are appropriate, etc.
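One way to sanity-check the "chatter" theory is a back-of-the-envelope estimate: if each round trip carries only a few rows, total time is dominated by round trips multiplied by latency rather than by bandwidth. The sketch below is only that arithmetic. The row count, batch sizes, and latencies are made-up illustrative numbers, not measurements from our environment.

```python
import math


def estimated_sync_seconds(rows, rows_per_round_trip, round_trip_seconds):
    """Time spent purely waiting on the network, ignoring bandwidth and CPU."""
    round_trips = math.ceil(rows / rows_per_round_trip)
    return round_trips * round_trip_seconds


# Hypothetical numbers: 1,000,000 rows, one row per round trip.
lan = estimated_sync_seconds(1_000_000, 1, 0.001)      # ~1 ms on a local network
wan = estimated_sync_seconds(1_000_000, 1, 0.050)      # ~50 ms to a remote host
batched = estimated_sync_seconds(1_000_000, 1000, 0.050)  # same link, 1000 rows per trip

print(lan, wan, batched)
```

Under these assumptions the remote case waits roughly fifty times longer than the local one, while batching many rows per round trip brings it back down, which would be consistent with low bandwidth use and slow overall throughput.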
I have a SQLite database on my local machine and my web services running on the same machine access it using SQLAlchemy like this:
engine = create_engine('sqlite:///{}'.format('mydatabase.db'), echo=True)
We are planning to host our web services on a separate machine from the one where the database is hosted. How can we make 'mydatabase.db' accessible to our web services remotely? Thanks.
From the SQLite "when to use" docs:
Situations Where A Client/Server RDBMS May Work Better
Client/Server Applications
If there are many client programs sending SQL to the same database over a network, then use a client/server database engine instead of SQLite. SQLite will work over a network filesystem, but because of the latency associated with most network filesystems, performance will not be great. Also, file locking logic is buggy in many network filesystem implementations (on both Unix and Windows). If file locking does not work correctly, two or more clients might try to modify the same part of the same database at the same time, resulting in corruption. Because this problem results from bugs in the underlying filesystem implementation, there is nothing SQLite can do to prevent it.
A good rule of thumb is to avoid using SQLite in situations where the same database will be accessed directly (without an intervening application server) and simultaneously from many computers over a network.
SQLite works well for embedded systems, or at least when you use it on the same computer. IMHO you'll have to migrate to one of the larger SQL solutions like PostgreSQL, MariaDB or MySQL. If you've generated all your queries through the ORM (SQLAlchemy), then there will be no problem migrating to another RDBMS. But even if you wrote SQL queries by hand, there should not be many problems, because all these RDBMSes use very similar dialects (unlike Microsoft's T-SQL). And since SQLite is lite, it supports only a subset of what other RDBMSes support, so there should not be a problem.
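A minimal sketch of how the switch might look on the SQLAlchemy side, assuming the connection URL is moved into an environment variable. The variable name DATABASE_URL, the helper names, and the credentials in the usage note are all hypothetical, and a PostgreSQL driver such as psycopg2 is assumed to be installed on the web-service machine.

```python
import os
from urllib.parse import quote_plus


def database_url(default="sqlite:///mydatabase.db"):
    """Return the SQLAlchemy URL, preferring the DATABASE_URL environment variable."""
    return os.environ.get("DATABASE_URL", default)


def postgres_url(user, password, host, dbname, port=5432):
    # quote_plus escapes characters such as '@' or ':' that would break the URL.
    return "postgresql://{}:{}@{}:{}/{}".format(
        quote_plus(user), quote_plus(password), host, port, dbname
    )


def make_engine():
    # Imported lazily so the URL helpers above work without SQLAlchemy installed.
    from sqlalchemy import create_engine
    return create_engine(database_url(), echo=True)
```

With DATABASE_URL set to something like the result of postgres_url("app", "secret", "db.example.com", "mydatabase"), the rest of the ORM code should not need to change. Without it, the engine still falls back to the local SQLite file.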
I am just wondering if this is the safest way, in terms of the database, to copy my production setup to a development environment?
ssh user@app.com pg_dump app-production | psql app-development
I just want to make sure that this command doesn't or can't have any unintended side effects on the database being dumped.
It will impose a considerable load on the production server, which has to read all of the data from disk and send it over the network. It will also lock each object, sometimes in ways that can interfere with the operation of the production system.
I think the least-impact method is to hook into whatever backup system you already have in place for the production system. If you use pg_dump for your backup, restore from the most recent one of those without touching production at all. If you use wal archiving for your backup, "restore" from that to get your clone, again without touching production at all.
It won't make any changes to the production database, however it might have a noticeable effect on production database performance.
It will increase the general load, as it's obviously going to access all the tables and the large objects.
However, the thing I'd be more concerned about is the way you're using the network. By piping directly through the connection, you're relying on an open network connection throughout the pg_dump, and also keeping the access open until the load has completed at app-development.
Also, if there is a network drop or anything, you'd have to restart completely.
I'd recommend you dump to a file if you can. Something like
pg_dump -Fc --file=app-production.backup app-production
And then transfer app-production.backup with sftp to your dev box.
That way you can utilise the custom format "-Fc" which compresses the data so your ssh hit will be smaller. Also once you sftp the file to your local dev box, you can then load, reload, reload again as often as you want without revisiting your production database.
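The dump-then-restore steps above could be scripted roughly as follows. This is a sketch that only assembles the two command lines. The function names are made up for illustration, and the restore options (--clean, --if-exists, --no-owner) are one reasonable choice for refreshing a dev database rather than something the workflow requires. A custom-format dump has to be loaded with pg_restore, not psql.

```python
import subprocess


def dump_command(dbname, outfile):
    """pg_dump in custom format (-Fc), written to a file instead of a pipe."""
    return ["pg_dump", "-Fc", "--file=" + outfile, dbname]


def restore_command(dbname, infile):
    """pg_restore into an existing database, dropping old objects first."""
    return [
        "pg_restore",
        "--clean",       # drop existing objects before recreating them
        "--if-exists",   # avoid errors for objects that are not there yet
        "--no-owner",    # do not try to restore production ownership
        "--dbname=" + dbname,
        infile,
    ]


def run(cmd):
    # Raises subprocess.CalledProcessError if the command exits non-zero.
    subprocess.run(cmd, check=True)
```

On the production box you would run dump_command("app-production", "app-production.backup"), copy the file over with sftp, and then run restore_command("app-development", "app-production.backup") on the dev box as many times as needed.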
PG Dump documentation
I run CouchDB on my production server, and I want to periodically replicate it to my development server (running on another machine, at my home). So what's the easiest way to do it securely?
I don't believe Replication can occur over HTTPS, at least not yet. (I believe that feature is coming with v1.1)
If you need it to be secure (i.e. encrypted over the wire), you should probably use ssh/scp to periodically copy your database files to your development server.
I am looking for an embedded database for a VB 2010 application working over the network. The database file is on a shared network folder on a NAS server (NTFS). For this reason I cannot use any server database like MySQL, SQL Server, etc.
There are nearly 20 PCs accessing the shared folder on the network.
Each PC can open up to 3 connections to the database, so we could have up to 60 connections in total. Mostly they just read the database; a write happens every 5-6 minutes, and rarely from two clients at the same time, but it can happen.
In the past I successfully used Access+Jet for such applications and never had problems, although with fewer network users.
I could still use Access+Jet (so I would not need to convert the whole database and code), but I would like to use something newer.
I have seen that SQLite is not right for a network/shared environment.
SQL Compact is also not right for a shared folder.
VistaDB is too expensive.
Firebird could be an option, but I have no experience with it. It would be used in a production system, and I do not know if I could trust it.
Any suggestions? Or shall I stay with Access?
Thanks for replying.
Go with Firebird. It is stable, lightweight, free and very fast, both as a network database and as an embedded one. I am using it everywhere.
However the database cannot reside on a shared network folder. It must reside on a hard drive that is physically connected to the host machine.
VistaDB is good as an embedded database, but has awful performance as a network database because it is not true client-server.