Converting massive .bak files to .parquet for upload to BigQuery - sql-server

I have 12 files, each around 20GB, on Google Drive, that are database backups in the .bak file format. I'd like to upload them to BigQuery for analysis; however, BigQuery cannot handle .bak files and requires CSV or Parquet files. I am currently planning to download each file to a local machine, restore it in Microsoft SQL Server Management Studio, convert it to .parquet, and then upload that file to BigQuery from my local machine (laptop), but this is long and painful. Is there a better way to do this?

I have the same problem. At worst, I am thinking of copying the .bak files to a storage account, creating SQL Server on a VM with plenty of memory, and then running an ADF pipeline to copy the SQL data to Parquet, unless anyone else has a better option.
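If an intermediate SQL Server is available (local or on a VM), the restore-and-export loop can be scripted rather than clicked through in SSMS. Below is a rough sketch, assuming sqlcmd, bcp and the Google Cloud SDK (gsutil/bq) are installed; it exports CSV (which BigQuery also accepts) rather than Parquet, and all database, table, bucket and path names are placeholders.

    $backup   = 'C:\backups\crm_backup_01.bak'
    $database = 'RestoredDb'
    $instance = 'localhost'

    # 1. Restore the backup (add MOVE clauses if the original file paths do not exist here).
    sqlcmd -S $instance -Q "RESTORE DATABASE [$database] FROM DISK = N'$backup' WITH REPLACE"

    # 2. Export a table to CSV (repeat per table). bcp emits no header row and does not quote
    #    embedded commas, so check your text columns or pick a different delimiter.
    bcp "$database.dbo.Customers" out 'C:\export\customers.csv' -S $instance -T -c '-t,'

    # 3. Stage the file in Cloud Storage and load it into BigQuery.
    gsutil cp 'C:\export\customers.csv' gs://my-bucket/customers.csv
    bq load --source_format=CSV --autodetect my_dataset.customers gs://my-bucket/customers.csv

Running this on a cloud VM close to the storage avoids pulling 240GB down to a laptop; only the final CSV/Parquet files need to reach BigQuery.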

Related

Data migration for .SQB files to Snowflake

I need to migrate .SQB files to Snowflake.
I have a data relay where MS SQL Server database backups are saved in .SQB format (Redgate) and made available via SFTP, with full backups every week and hourly backups in between.
Our data warehouse is Snowflake, which already holds the rest of our data from other sources. I'm looking for the simplest, most cost-effective solution to get this data into Snowflake.
My current ETL process is as follows.
An AWS EC2 instance (Windows) downloads the files and applies Redgate's SQL Backup Converter (https://documentation.red-gate.com/sbu7/tools-and-utilities/sql-backup-file-converter) to convert the files to .BAK. This tool requires a license.
Restore the MS SQL database on the same AWS EC2 instance.
Migrate the MS SQL database to Snowflake via Fivetran.
Is there a simpler / better solution? I'd love to eliminate the need for the intermediate EC2 if possible.
The .SQB files come from an external vendor and there is no way to have them change the file format or delivery method.
This isn't a full solution to your problem, but it might help to know that you're okay to use the SQL Backup file converter wherever you need to, free of any licensing restrictions. This is true for all of SQL Backup's desktop and command-line tools. Licensing only gets involved when dealing with the Server Components, but once a .SQB file has been created you're free to use SQBConverter.exe to convert it to a .BAK file wherever you need to.
My advice would be to either install SQL Backup on whichever machine you want to use the tooling on, or just copy all the files from an existing installation. Both should work fine, so pick whichever is easiest for you.
(FYI: I'm a current Redgate software engineer and I used to work on SQL Backup until fairly recently.)
You can do this in three steps:
Step 1: Export Data from SQL Server Using SQL Server Management Studio.
Step 2: Upload the CSV File to an Amazon S3 Bucket.
Step 3: Upload Data to Snowflake From S3 using COPY INTO command.
You can use your own AWS S3 bucket and create an external stage pointing to it, or you can upload the files into an internal Snowflake stage (a sketch of these steps follows after the links below).
Copy Into from External Stage -
https://docs.snowflake.com/en/sql-reference/sql/copy-into-table.html#loading-files-from-a-named-external-stage
Copy Into from an Internal Stage -
https://docs.snowflake.com/en/sql-reference/sql/copy-into-table.html#loading-files-from-an-internal-stage
Creating External Stage-
https://docs.snowflake.com/en/sql-reference/sql/create-stage.html
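A rough sketch of those three steps, assuming bcp, the AWS CLI and the snowsql CLI are installed and your snowsql connection settings are already configured; the bucket, stage, table and credential names are placeholders:

    # Step 1: export the table to CSV (bcp emits no header row).
    bcp "MyDb.dbo.Orders" out 'C:\export\orders.csv' -S localhost -T -c '-t,'

    # Step 2: upload the CSV to your S3 bucket.
    aws s3 cp 'C:\export\orders.csv' s3://my-bucket/exports/orders.csv

    # Step 3: create an external stage (once), then COPY INTO the target table.
    snowsql -q "CREATE STAGE IF NOT EXISTS my_s3_stage URL = 's3://my-bucket/exports/' CREDENTIALS = (AWS_KEY_ID = '<key>' AWS_SECRET_KEY = '<secret>')"
    snowsql -q "COPY INTO mydb.public.orders FROM @my_s3_stage/orders.csv FILE_FORMAT = (TYPE = CSV)"

An internal stage works the same way, except the file is uploaded with Snowflake's PUT command instead of aws s3 cp.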

Exporting documents stored inside SQL Server database to AWS S3 Bucket

We have a SQL Server 2016 (Windows) database that stores documents inside BLOB (varbinary) fields. We are looking to migrate these documents out of the database and store them in AWS S3 storage.
I have been able to read the BLOB data out of SQL Server in PowerShell using a stream read/write to the local file system, and I can then use AWS S3 CP ... to get the files out to S3. However, this approach requires an extra step: storing the file locally, on the SQL Server drive.
Is there a way to read binary data out of a SQL Server database and store it directly in S3? I tried Write-S3Object, but it looks like it expects text (System.String) for the content, which does not work with the images/PDFs and other non-text documents we have.
Any suggestions, Powershell code samples, or references are much appreciated!
Thank you in advance
--Alex
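One possible approach, sketched under the assumption that the AWS Tools for Windows PowerShell are installed and configured and that your version of Write-S3Object exposes a -Stream parameter (newer releases do; if yours does not, a temp file remains the fallback). Table, column and bucket names are placeholders:

    Import-Module AWSPowerShell   # or AWS.Tools.S3; credentials assumed already configured

    $connString = 'Server=localhost;Database=DocsDb;Integrated Security=True'
    $conn = New-Object System.Data.SqlClient.SqlConnection $connString
    $conn.Open()

    $cmd = $conn.CreateCommand()
    $cmd.CommandText = 'SELECT DocId, FileName, Content FROM dbo.Documents'
    $reader = $cmd.ExecuteReader()

    while ($reader.Read()) {
        $key    = 'documents/' + $reader['FileName']
        $bytes  = [byte[]]$reader['Content']               # the varbinary payload, held in memory
        $stream = [System.IO.MemoryStream]::new($bytes)    # PowerShell 5+

        # -Stream uploads the in-memory content directly; nothing is written to disk.
        Write-S3Object -BucketName 'my-doc-bucket' -Key $key -Stream $stream
        $stream.Dispose()
    }

    $reader.Close()
    $conn.Close()

This loads each document fully into memory before upload, which is fine for typical office files; for very large BLOBs a chunked read via GetStream would be a better fit.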

Fastest way to copy/transfer terabytes of data between SQL database servers

I am looking for the most efficient and fastest way to transfer a huge amount of data from a SQL Server located in Europe to a SQL Server located in the USA.
The traditional approaches are taking too long:
Linked Server
SQL bulk copy or BCP
SQL database replication
SQL Import Wizard
Cloud is an option, but it comes with data privacy issues. I am not looking for an offline copy using backup and restore, or a transfer via hard disk.
Can anyone suggest the best way to overcome this issue?
You can ask the company in Europe to back everything up onto a hard drive and ship it securely. My workplace does it this way, shipping Oracle DB copies from LA.
Alternative 1: using a compressed full backup file of the database
Take a full backup of the database.
Compress the backup file and split it into smaller chunks of 500MB (or less) using a zip tool.
Note: you can instead back up to multiple files with compression (which saves about 60%) in one or more locations, using SSMS or a T-SQL script, and multiple threads will be used. This makes the backup faster and removes the need for a zip tool (see the sketch after this list).
Host the files on an FTP server or an HTTP upload server.
Copy the data files from the source over HTTP/FTP.
On the target server, uncompress the files and reassemble them into one backup file.
Restore the database.
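A minimal sketch of that striped, compressed backup, assuming the SqlServer PowerShell module (Invoke-Sqlcmd) is available; the database name and file paths are placeholders:

    # Stripe one compressed backup across four files; SQL Server writes them in parallel.
    $sql = "
    BACKUP DATABASE [BigDb]
    TO DISK = N'D:\backup\BigDb_1.bak',
       DISK = N'D:\backup\BigDb_2.bak',
       DISK = N'D:\backup\BigDb_3.bak',
       DISK = N'D:\backup\BigDb_4.bak'
    WITH COMPRESSION, STATS = 5;
    "
    Invoke-Sqlcmd -ServerInstance 'localhost' -QueryTimeout 65535 -Query $sql
    # The restore on the target server lists all four files in the same way:
    # RESTORE DATABASE [BigDb] FROM DISK = ..., DISK = ..., DISK = ..., DISK = ...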
Update:
Alternative 2: using compressed bcp files
Bulk copy the data out with BCP in native format.
Compress the files using a zip tool.
Host the files on an FTP server or an HTTP upload server.
Copy the files from the source over HTTP/FTP.
On the target server, uncompress the files.
BCP the data back in from the data files (a bcp sketch follows below).
Notes:
You can use batch files or PowerShell scripts to automate these tasks.
Network speed is limited by your service provider; contact your internet service provider about the maximum bandwidth available.
This approach avoids a live connection between the source and target SQL Servers, and therefore avoids network timeouts.
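A rough sketch of Alternative 2 for a single table, assuming the bcp utility is installed on both servers; server, database and path names are placeholders, and for files beyond Compress-Archive's size limits a tool like 7-Zip may be needed:

    # Source server: export one table in native format (-n) over a trusted connection (-T).
    bcp "SrcDb.dbo.BigTable" out 'D:\transfer\BigTable.dat' -S source-server -T -n

    # Compress before transfer.
    Compress-Archive -Path 'D:\transfer\BigTable.dat' -DestinationPath 'D:\transfer\BigTable.zip'

    # ...copy BigTable.zip to the target over FTP/HTTP, then on the target server:
    Expand-Archive -Path 'D:\transfer\BigTable.zip' -DestinationPath 'D:\transfer'
    bcp "DstDb.dbo.BigTable" in 'D:\transfer\BigTable.dat' -S target-server -T -n -b 10000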

SQL Server 2012: Backup restore from a compressed backup

I am trying to restore a database from a backup (.bak) file that is saved inside a ZIP file, but I have not been successful so far. However, I am able to restore it after extracting it from the ZIP file.
This MS page says every edition of SQL Server 2008 and later can restore a compressed backup with the following restrictions.
Restrictions: The following restrictions apply to compressed backups:
Compressed and uncompressed backups cannot co-exist in a media set.
Previous versions of SQL Server cannot read compressed backups.
NTbackups cannot share a tape with compressed SQL Server backups.
I do not clearly understand the first restriction. Could someone please clarify/elaborate this?
I have done the following steps:
Took a backup on a staging SQL Server (MyTestDB.bak); compressed it (Right Click > Send To > Compressed (zipped) folder); named it MyTestDB.zip.
FTP'd the ZIP file to the local development SQL Server and tried to restore from the ZIP file, but the database is not available to select for restore.
Both SQL Server versions are exactly the same.
SQL Servers Version: Microsoft SQL Server 2012 - 11.0.5058.0 (X64)-Standard Edition (64-bit)
Is the term 'compressed backup' correct for a backup file saved inside a ZIP file, or does it need to be backed up in a different way to count as a compressed backup?
If I select the .bak file after extracting it from the ZIP file, it all works fine.
I am not sure where I am going wrong. I can simply extract the backup and restore it without any problem, but I would like to know why restoring directly from the ZIP does not work, as that would have been a much better solution.
Backup compression is something different.
You specify this when you configure or start the backup and you then get a backup file that contains compressed data, as opposed to it containing uncompressed data if you don't enable compression.
SQL Server is not able to use a zip file: while the backup file inside is certainly compressed, it is not the kind of "compressed backup" that the documentation refers to.
So yes, you need to extract the backup file before restoring from it.
If you want to learn how to make compressed backups correctly, check out this page full of links to related material:
Configure Backup Compression
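For contrast, a genuine compressed backup is produced at backup time with the COMPRESSION option. A minimal sketch, assuming Invoke-Sqlcmd from the SqlServer module and placeholder database name and path:

    $sql = "
    BACKUP DATABASE [MyTestDB]
    TO DISK = N'D:\backup\MyTestDB.bak'
    WITH COMPRESSION, STATS = 10;
    "
    Invoke-Sqlcmd -ServerInstance 'localhost' -QueryTimeout 65535 -Query $sql
    # The resulting .bak is compressed internally and restores like any other backup file,
    # with no unzip step.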

Converting .BAK file of SQL Server DB to .CSV

OK, so I have a .bak file which is a backup of our old CRM data. It comes from a backup of a SQL Server database (not sure what version). What I am trying to achieve is converting all the data that file contains into a .CSV that I can then use to import the data into a different CRM.
I have tried the obvious things: renaming the file .csv and trying various text editors and applications that claim to be able to view these kinds of files. All I ever get is gibberish, by which I mean a ton of characters and symbols that clearly do not represent what is in the data backup.
From what I have gathered already, I need to restore this file to a SQL Server database and then do the export to .csv. I have managed to set up a trial version of SQL Server 2012; however, when I try to import the file (import from flat file option), the preview again appears to be gibberish populating the fields, and if I run it anyway, it fails and returns errors. I can confirm that another CRM company managed to restore this and extract what they needed (sadly we decided not to continue with them), but based on that I would say the .bak file is not corrupted.
I assume I am doing something wrong. My question is what is the correct way to import / restore a .bak file into MS SQL 2012?
Is there an alternative that I have missed or is this not the right approach to begin with?
Any help greatly appreciated as always!
I recently needed to convert a MS SQL Server database backup in .BAK format to something digestible by other tools, such as CSV or SQL. The only thing I found was the online converter RebaseData. It's free for smaller files, up to 10 MB.
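Alternatively, the restore-then-export route described in the question can be scripted once a SQL Server instance is available. A rough sketch, assuming sqlcmd and bcp are installed; the database, logical file and table names are placeholders that must be adjusted to match the RESTORE FILELISTONLY output:

    $bak = 'C:\crm\old_crm.bak'

    # 1. List the logical file names contained in the backup.
    sqlcmd -S localhost -Q "RESTORE FILELISTONLY FROM DISK = N'$bak'"

    # 2. Restore, moving the data/log files to local paths (logical names come from step 1).
    sqlcmd -S localhost -Q "RESTORE DATABASE [OldCrm] FROM DISK = N'$bak' WITH MOVE 'OldCrm_Data' TO N'C:\data\OldCrm.mdf', MOVE 'OldCrm_Log' TO N'C:\data\OldCrm.ldf', REPLACE"

    # 3. Export each table to CSV (repeat per table, or drive a loop from sys.tables).
    bcp "OldCrm.dbo.Contacts" out 'C:\crm\contacts.csv' -S localhost -T -c '-t,'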
