what's the best way to dump a large(terabytes) db? are there other faster/efficient way besides mysqldump? this is intended to be zipped, unzipped, and then reimported into another mysql db on another server.
If it's possible for you to stop the database server, the best way is probably for you to:
Stop the database
Do a file copy of the files (including appropriate transaction logs, etc) to a new file system.
Restart the database.
Then move the copied files to the new server and bring up the database on top of the files. It's a bit complicated to do this, but it's by far the fastest way.
I used to be a DBA for a terabyte+ database in MySQL and this is one of the ways we'd do nightly backups of the database. mysqldump would've never worked for data that large. We'd stop the database each night and file copy the underlying files.
Since your intent seems to be having two copies of the DB, why not set up replication to do this?
That will ensure that both copies of the DB remain in an identical state (in terms of data anyway).
And, if you want a snapshot to be exported, you can:
wait for a quiet time.
disable replication.
back up the slave copy.
re-enable replication.
Related
I removed some indexes on a very large table and realized I needed them. Instead of adding them back concurrently, which would take a very long time, I was wondering if I could just do restore using a database copy that was taken before the indexes were removed?
If by "database copy" you mean a copy of the Postgres DB directory at file level (with Postgres not running to get a consistent state), then yes, such a snapshot includes everything, indexes too. You could copy that back on file level, and then start Postgres - falling back to the previous state, of course.
If, OTOH, you mean a backup with the standard Postgres tools pg_dump or pg_dumpall, then no, indexes are not included physically. Just the instructions to build them. It would not make sense to include huge junks of functionally dependent values. Building them from restored data may be about as fast.
Either way, you could not add back an index from an older snapshot to a live DB anyway, after changes to the table have been made. That's a logically impossible. Then there is no alternative to rebuilding the index one way or another.
I'll answer for MySQL. You tagged your question with both mysql and postgresql so I don't know which one you really use.
If your backup was a physical backup made with a backup solution like Percona XtraBackup or MySQL Enterprise Backup, it will include the indexes, so restoring it will be quicker.
If your backup was a logical backup made with mysqldump or mydumper, then the backup includes only data. Restoring it will have to rebuild the indexes anyway. It will not save any time.
If you made the mistake of making a "backup" only by copying files out of the data directory, those are sort of like the physical backup, but unless you copied the files while the MySQL Server was shut down, the backup is probably not viable.
Background:
I have a medium-sized database (900GB) that needs to be copied onto another server (driven via code, not scheduled). Currently we take a backup (to .bak), move it to a staging server, and restore it to the target server. The target server does not have enough space to hold the backup file, and the restored instance simultaneously, thus the staging server. These transfers (backup to staging, restore from staging) happen over SMB2. The staging server needs to go away due to business requirements, however. It is worth mentioning the target server will be taken offline (and used offline) after the transfer, so I'm not sure the mirroring or replication options are valid.
I have identified two options -- one is to backup the database to the primary server, and open up firewall rules/smb to serve the backup file to the target server over SMB. ("RESTORE FROM \x.x.x.x\blah\db.bak"). Security isn't a fan, though.
The ideal solution (and one that could easily be implemented in every other database I've worked with), is to quiesce the database and transfer the datafiles (in the case of ms-sql, mdf and ldf files). However, upon research I see there is no such functionality available out of the box. I know I can take the database offline to copy the mdf/ldf safely, but that's not an acceptable solution (database must remain online).
I have read LOTS of posts and Microsoft documentation regarding VSS / shadow copy, but I have also read lots of conflicting information about the reliability of using VSS/sqlwriter to copy the mdf/ldf file to the target server, and simply re-attaching the database.
I am looking for documentation or advice (or even backup software that can be programmatically driven via an API) to accomplish this goal of transferring the database without requiring a secondary holding place. Currently I'm researching how to drive this copying process with Powershell, using VSS(vssadmin/vshadow from sdk), but I'm not confident in what I'm reading, and it's not even clear to me if VSS/sqlwriter is a supportable method to copying online LDF/MDF files. Any advice is appreciated.
Thanks,
We are working on generating a database that will be distributed to several third parties. We will also re-generate this database on an on-going basis, redistribute it, and those third parties will need to overwrite their existing database copy with the new version.
I'm trying to decide whether I want to detach the database temporarily, make a copy of the .mdf, and send that copy out, or whether I should just do a full backup of the database, and send the .bak out.
The primary difference I can see is that to distribute the .mdf, you must detach the database temporarily, so that you can copy it.
What are the other pros/cons of each format?
Are there security implications with distributing one over the other?
Is it easier to initially import one format over the other?
Thank you.
Neither. The proper way to distribute database changes is via upgrade scripts, otherwise those third parties using the database will loose the actual data contained in the database.
For the case when the data is never changed by the third parties (ie. the database is read only at those sites) then distribution by backup file is feasible. MDF is completely out of the question, first and foremost because MDF is not the entire database: at least the LDF is required in addition to recreate a coherent database. Simply attaching the MDF w/o a corresponding LDF will result in most cases, in a corrupt database. In addition to being incorrect, MDF distribution is inefficient (BAK files are smaller than the corresponding MDF because they do not contain unallocated pages) and also MDF manipulation requires placing the database offline during the file copy.
I need to do some structural changes to a database (alter tables, add new columns, change some rows etc) but I need to make sure that if something goes wrong i can rollback to initial state:
All needed changes are inside a SQL script file.
I don't have administrative access to database.
I really need to ensure the backup is done on server side since the BD has more than 30 GB of data.
I need to use sqlplus (under a ssh dedicated session over a vpn)
Its not possible to use "flashback database"! It's off and i can't stop the database.
Am i in really deep $#$%?
Any ideas how to backup the database using sqlplus and leaving the backup on db server?
better than exp/imp, you should use rman. it's built specifically for this purpose, it can do hot backup/restore and if you completely screw up, you're still OK.
One 'gotcha' is that you have to backup the $ORACLE_HOME directory too (in my experience) because you need that locally stored information to recover the control files.
a search of rman on google gives some VERY good information on the first page.
An alternate approach might be to create a new schema that contains your modified structures and data and actually test with that. That presumes you have enough space on your DB server to hold all the test data. You really should have a pretty good idea your changes are going to work before dumping them on a production environment.
I wouldn't use sqlplus to do this. Take a look at export/import. The export utility will grab the definitions and data for your database (can be done in read consistent mode). The import utility will read this file and create the database structures from it. However, access to these utilities does require permissions to be granted, particularly if you need to backup the whole database, not just a schema.
That said, it's somewhat troubling that you're expected to perform the tasks of a DBA (alter tables, backup database, etc) without administrative rights. I think I would be at least asking for the help of a DBA to oversee your approach before you start, if not insisting that the DBA (or someone with appropriate privileges and knowledge) actually perform the modifications to the database and help recover if necessary.
Trying to back up 30GB of data through sqlplus is insane, It will take several hours to do and require 3x to 5x as much disk space, and may not be possible to restore without more testing.
You need to use exp and imp. These are command line tools designed to backup and restore the database. They are command line tools, which if you have access to sqlplus via your ssh, you have access to imp/exp. You don't need administrator access to use them. They will dump the database (with al tables, triggers, views, procedures) for the user(s) you have access to.
I have one database which contains the most recent data, and I want to replicate the database content into some other servers. Due to non-technical reasons, I can not directly use replicate function or sync function to sync to other SQL Server instances.
Now, I have two solutions, and I want to learn the pros and cons for each solution. Thanks!
Solution 1: detach the source database which contains the most recent data, then copy to the destination servers which need the most recent data, and attach database at the destination servers;
Solution 2: make a full backup of source server for the whole database, then copy data to destination servers and take a full recovery at the destination server side.
thanks in advance,
George
The Detach / Attach option is often quicker than performing a backup as it doesn't have to create a new file. Therefore, the time from Server A to Server B is almost purely the file copy time.
The Backup / Restore option allows you to perform a full backup, restore that, then perform a differential backup which means your down time can be reduced between the two.
If it's data replication you're after, does that mean you want the database functional in both locations? In that case, you probably want the backup / restore option as that will leave the current database fully functional.
EDIT: Just to clarify a few points. By downtime I mean that if you're migrating a database from one server to another, you generally will be stopping people using it whilst it's in transit. Therefore, from the "stop" point on Server A up to the "start" point on Server B this could be considered downtime. Otherwise, any actions performed on the database on server A during transit will not be replicated onto server B.
In regards to the "create a new file". If you detach a database you can copy the MDF file immediately. It's already there ready to be copied. However, if you perform a backup, you have to wait for the .BAK file to be created and then move it to it's new location for a restore. Again this all comes down to is this a snapshot copy or a migration.
Backing up and restoring makes much more sense, even if you might eek out a few extra minutes from a detach attach option instead. You have to take the original database offline (disconnect everyone) prior to a detach, and then the db is unavailable until you reattach. You also have to keep track of all of the files, whereas with a backup all of the files are grouped. And with the most recent versions of SQL Server the backups are compressed.
And just to correct something: DB backups and differential backups do not truncate the log, and do not break the log chain.
In addition, the COPY_ONLY functionality only matters for the differential base, not for the LOG. All log backups can be applied in sequence from any backup assuming there was no break in the log chain. There is a slight difference with the archive point, but I can't see where that matters.
Solution 2 would be my choice... Primarily becuase it won't create any downtime on the source database. The only disadvatage i can see is that depending on the database recovery model, the transaction log will be truncated meaning if you wanted to restore any data from the transaction log you'd be stuffed, you'd have to use your backup file.
EDIT: Found a nice link; http://sql-server-performance.com/Community/forums/p/5838/35573.aspx