I am working on a project to import data from a Sybase database backup. From my review of the Snowflake documentation, I see no mention of how to do this other than writing custom ETL to export each table's data into a supported structured format (e.g. CSV or XML) and then loading that file into Snowflake.
Is there a way to have Snowflake load schema and data directly from a database backup file? Even if there is a way to do this for some other database vendor (other than Sybase) that might be helpful.
I frequently need to validate CSVs submitted from clients to make sure that the headers and values in the file meet our specifications. Typically I do this by using the Import/Export Wizard and have the wizard create the table based on the CSV (file name becomes table name, and the headers become the column names). Then we run a set of stored procedures that checks the information_schema for said table(s) and matches that up with our specs, etc.
Most of the time, this involves loading multiple files at a time for a client, which quickly becomes time consuming and laborious with the Import/Export Wizard. I tried using an xp_cmdshell SQL script to load everything from a path at once and get the same result, but xp_cmdshell is not supported by Azure SQL Database.
https://learn.microsoft.com/en-us/azure/azure-sql/load-from-csv-with-bcp
The above says that one can load using bcp, but it also requires the table to exist before the import... I need the table structure to mimic the CSV. Any ideas here?
Thanks
If you want to load the data into your target SQL database, you can upload your CSV files to Azure Blob Storage and then use a Copy Data activity in Azure Data Factory (ADF) to load the data from those CSV files into Azure SQL Database tables, without creating the tables up front.
ADF supports 'auto create' of sink tables. See this, and this
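If you would rather script the "table mimics the CSV" step yourself (for example, so that bcp has a table to load into), here is a minimal sketch in Python. It is not what ADF does internally; it just reads the header row and emits a CREATE TABLE statement. The function names, the sample data, and the choice of NVARCHAR(MAX) for every column are my own assumptions for illustration.

```python
import csv
import io


def quote_identifier(name: str) -> str:
    """Wrap a name in T-SQL square brackets, escaping any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def create_table_sql(table_name: str, csv_text: str) -> str:
    """Build a CREATE TABLE statement whose columns mirror the CSV header row."""
    header = next(csv.reader(io.StringIO(csv_text)), None)
    if header is None:
        raise ValueError("CSV is empty, so there is no header row")
    columns = [h.strip() for h in header]
    if any(not c for c in columns):
        raise ValueError("CSV header contains an empty column name")
    # Assumption: every column is loaded as text; validation happens afterwards.
    column_defs = ",\n".join(
        f"    {quote_identifier(c)} NVARCHAR(MAX) NULL" for c in columns
    )
    return f"CREATE TABLE {quote_identifier(table_name)} (\n{column_defs}\n);"


# Hypothetical sample file content; in practice, read the client's CSV.
sample = "client_id,order date,amount\n1,2024-01-05,19.99\n"
print(create_table_sql("orders_2024", sample))
```

Loading everything as text keeps the import from failing on bad values, so the existing stored procedures that check information_schema and the data can still do the real validation.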
What are the steps to restore data dumped from an Oracle database into a SQL Server database?
Our goal is to get data from an external Oracle database outside our organization. Due to security concerns, the team that manages the data source refused to let us transfer the data through an ODBC linked server. Instead, they dumped the selected tables we need so that we can restore the data in our organization. Each table's data files include a .sql file to create the table and constraints, a ".ctl" file, and one or more ".ldr" files.
An additional complication: one of the tables contains a BLOB column, which stores a lot of binary files, such as PDFs. This column accounts for most of the size of the dumped files. Otherwise I could ask them to send us the data in Excel directly.
Can someone give me a suggestion about what route we should take?
Either get them to export the data in an open format, or load it into an Oracle instance you have full control over. The .ctl and .ldr files suggest they used the old SQL*Loader.
I have a question about Apache Sqoop. I have imported data into HDFS files using Sqoop's import facility.
Next, I need to put the data into another database (basically, I am transferring data from one database vendor to another) using Hadoop (Sqoop).
To put the data into SQL Server, there are two options:
1) Use the Sqoop export facility to connect to my RDBMS (SQL Server) and export the data directly.
2) Copy the HDFS data files (which are in CSV format) to my local machine using the copyToLocal command, then run BCP (or a BULK INSERT query) on those CSV files to load the data into the SQL Server database.
I would like to understand which approach is the correct one and which of the two is faster: bulk insert, or Sqoop export from HDFS into the RDBMS?
Are there any other ways, apart from the two above, to transfer data faster from one database vendor to another?
I am using 6-7 mappers (around 20-25 million records are to be transferred).
Please advise, and let me know if my question is unclear.
Thanks in advance.
If all you do is ETL from one vendor to another, then going through Sqoop/HDFS is a poor choice. Sqoop makes perfect sense if the data originates in HDFS or is meant to stay in HDFS. I would also consider Sqoop if the data set were large enough to warrant a large cluster for the transformation stage. But a mere 25 million records is not worth it.
With SQL Server, it is imperative on large imports to achieve minimal logging, which requires bulk insert. Although 25 million records is not so large as to make the bulk option imperative, AFAIK neither Sqoop nor Sqoop2 supports bulk insert for SQL Server yet.
I recommend SSIS instead. It is much more mature than Sqoop, it has a Bulk Insert task, and it has a rich transformation feature set. Your small import is well within the size SSIS can handle.
I'd like to perform an Oracle dump of my data and then load it back in after re-installing the software.
The problem, however, is that when I re-install the software, the schema of the data I just exported may have changed slightly.
In MySQL, I would hand-edit the SQL-formatted dump file before importing it to match any schema changes.
But Oracle uses a proprietary dump/load format :(
Any tricks to preserving my data? Thanks!
BH
You can export the data, import it into a separate schema that you create yourself, and then copy the data from that schema into the newly created application schema with a few SQL statements.
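To make the staging-schema idea concrete, here is a minimal sketch of the copy step. It uses Python's built-in sqlite3 as a stand-in so that it runs anywhere; against Oracle the same INSERT ... SELECT pattern applies, with schema-qualified table names. The table names, column names, and the particular schema changes are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical staging table: holds the imported dump in the OLD layout.
cur.execute("CREATE TABLE staging_customer (id INTEGER, name TEXT, phone TEXT)")
cur.executemany(
    "INSERT INTO staging_customer VALUES (?, ?, ?)",
    [(1, "Ada", "555-0100"), (2, "Grace", None)],
)

# Hypothetical application table: the NEW layout after the re-install.
# 'name' was renamed to 'full_name', 'phone' was dropped, 'status' was added.
cur.execute(
    "CREATE TABLE app_customer ("
    "id INTEGER PRIMARY KEY, full_name TEXT NOT NULL, status TEXT NOT NULL)"
)

# Map old columns to new ones explicitly and supply a value for the new column.
cur.execute(
    "INSERT INTO app_customer (id, full_name, status) "
    "SELECT id, name, 'active' FROM staging_customer"
)
conn.commit()

rows = cur.execute(
    "SELECT id, full_name, status FROM app_customer ORDER BY id"
).fetchall()
print(rows)
```

Listing the columns explicitly on both sides is the point: the schema differences are handled in the SELECT, so the proprietary dump file never needs to be edited.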
I need to export the data from 36 SQL tables containing 24GB of data into flat files, copy them to the client and import them there into the existing tables in his SQL database.
And I will need this for several customers (same tables, though).
How do I mass export and import data?
Is there a command line tool for this so I can write a script for repeated use?
You will find the basics here: Importing and Exporting Bulk Data
What is bcp?
bcp.exe is the standard bulk import/export tool for SQL Server. Using SSIS packages is an alternative, but it brings a lot of overhead with it: it's a full ETL tool. In T-SQL there is also a BULK INSERT statement that you can use as an alternative to "bcp in", but I personally haven't experimented enough to say which one is faster or more useful.
See "bulk exporting" and "bulk importing" in Books Online for all the details.
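For a repeatable script, something along these lines could generate the bcp calls for each table. This is only a sketch in Python: the server names, database name, table list, and the dbo schema are placeholders I made up, and it assumes Windows authentication (-T) and native format (-n), which suits SQL Server to SQL Server transfers. It prints the commands rather than running them.

```python
from pathlib import Path


def bcp_command(table, direction, data_dir, server, database):
    """Return the argument list for one bcp export ('out') or import ('in')."""
    if direction not in ("in", "out"):
        raise ValueError("direction must be 'in' or 'out'")
    data_file = Path(data_dir) / f"{table}.dat"
    return [
        "bcp",
        f"dbo.{table}",  # assumption: all tables live in the dbo schema
        direction,
        str(data_file),
        "-S", server,    # server to connect to
        "-d", database,  # database that contains the table
        "-T",            # trusted connection (Windows authentication)
        "-n",            # native data format
    ]


# Placeholder names; replace with the real 36 tables and servers.
tables = ["Customers", "Orders", "OrderLines"]

export_cmds = [bcp_command(t, "out", "dump", "SOURCE_SERVER", "Sales") for t in tables]
import_cmds = [bcp_command(t, "in", "dump", "CLIENT_SERVER", "Sales") for t in tables]

for cmd in export_cmds + import_cmds:
    print(" ".join(cmd))
```

To actually execute a command you could pass the list to subprocess.run(cmd, check=True). Since the target tables already exist at the client, make sure the import order respects any foreign keys.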