I want to convert an Azure Synapse script to GCP BigQuery. The Azure Data Lake script is written in T-SQL, and I want to convert it to a BigQuery script. Please guide me: is there a procedure for converting a T-SQL query to BigQuery's SQL dialect? Thank you.
Related
I am trying to load data from a couple of Snowflake tables (200-300 columns) into Azure SQL Server. Is there a way to convert the data types automatically, or to convert the entire table-creation script?
Paste the DDL into a text editor and use find-and-replace to change the data types (a scripted version of this is sketched below).
There are tools online that can assist here (Your mileage may vary)
Example: https://www.jooq.org/translate/
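If you prefer to script the find-and-replace rather than do it by hand, a minimal sketch in Python might look like the following. The type mapping is only partial and illustrative; extend it for the Snowflake types your DDL actually uses, and review the output before running it against Azure SQL.

    import re

    # Illustrative Snowflake -> Azure SQL Server type mapping; extend as needed.
    TYPE_MAP = {
        r"\bVARIANT\b": "NVARCHAR(MAX)",
        r"\bSTRING\b": "NVARCHAR(MAX)",
        r"\bTIMESTAMP_NTZ\b": "DATETIME2",
        r"\bTIMESTAMP_TZ\b": "DATETIMEOFFSET",
        r"\bNUMBER\b": "DECIMAL",
        r"\bBOOLEAN\b": "BIT",
    }

    def translate_ddl(ddl: str) -> str:
        # Replace each Snowflake type keyword with its SQL Server counterpart.
        for snowflake_type, sqlserver_type in TYPE_MAP.items():
            ddl = re.sub(snowflake_type, sqlserver_type, ddl, flags=re.IGNORECASE)
        return ddl

    with open("snowflake_tables.sql") as f:      # exported Snowflake DDL (placeholder file name)
        print(translate_ddl(f.read()))           # paste the output into Azure SQL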
I am working on a SQL Server migration to Databricks.
I have a number of T-SQL procedures, each a minimum of 100 lines of code.
I want to convert these procedures to Spark code.
For a POC (I worked on one T-SQL proc), all source files were imported and registered as global temp views, and the T-SQL was converted to Spark SQL.
The final global temp view was then exported as a file.
Now, my question: is creating global temp views and converting the T-SQL proc to Spark SQL the best way, or is it better to load all files into DataFrames and rewrite the T-SQL proc as Spark DataFrame logic?
Please let me know which is the better way to convert T-SQL procs (Spark SQL or DataFrames), and why.
You can use Databricks to query many SQL databases through JDBC drivers, so no extra work is strictly required to convert the existing stored procedures to Spark code.
Check the official Databricks documentation for more details and the steps to establish a connection with SQL Server.
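For example, reading a SQL Server table into a DataFrame over JDBC from a Databricks notebook looks roughly like this (server, database, table, and credential values are placeholders):

    from pyspark.sql import SparkSession

    # Databricks notebooks already provide `spark`; the builder line is only
    # needed when running outside Databricks.
    spark = SparkSession.builder.getOrCreate()

    jdbc_url = "jdbc:sqlserver://myserver.database.windows.net:1433;database=mydb"

    # Read a SQL Server table (or push down a query) into a DataFrame.
    df = (spark.read
          .format("jdbc")
          .option("url", jdbc_url)
          .option("dbtable", "dbo.SourceTable")   # or .option("query", "SELECT ...")
          .option("user", "my_user")
          .option("password", "my_password")
          .load())

    # Register it so the converted Spark SQL can reference it.
    df.createOrReplaceGlobalTempView("source_table")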
Migrating the files to DataFrames is another possible approach, but be aware that Spark DataFrames are immutable, so any UPDATE or DELETE actions will have to be changed to produce a new, modified DataFrame.
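For instance, assuming df is one of those DataFrames with hypothetical status and amount columns, a T-SQL UPDATE or DELETE would be rewritten as transformations that return new DataFrames:

    from pyspark.sql import functions as F

    # UPDATE t SET status = 'closed' WHERE amount = 0
    # becomes: build a new DataFrame with the changed column.
    updated_df = df.withColumn(
        "status",
        F.when(F.col("amount") == 0, F.lit("closed")).otherwise(F.col("status"))
    )

    # DELETE FROM t WHERE amount = 0
    # becomes: build a new DataFrame that filters those rows out.
    remaining_df = df.filter(F.col("amount") != 0)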
I suggest you go through Executing SQL Server Stored Procedures from Databricks (PySpark) if you plan to execute stored procedures from Databricks.
I want to load data from multiple T-SQL queries into my Azure SQL database. The target is a single table with 8 columns, and for those 8 columns we have multiple T-SQL statements, one for each, that insert the data from the SELECT statements into the Azure SQL database. How can this be achieved? Long term, we want this to run as a scheduled job.
If your multiple T-SQL queries run against one database, I suggest you consider Azure Data Factory.
Azure Data Factory can help migrate data from one or more tables to an Azure SQL database using T-SQL queries.
You can also run the pipeline on a schedule by creating a schedule trigger that runs the pipeline periodically (hourly, daily, and so on).
For details about Data Factory, please see the Azure Data Factory Documentation.
Tutorials:
Incrementally load data from multiple tables in SQL Server to an Azure SQL database
Copy multiple tables in bulk by using Azure Data Factory.
If your source data is in a SQL Server instance, you can also create a linked server to Azure SQL Database; this can help you achieve the same result.
You can then query and insert data into the linked Azure SQL server with T-SQL statements (a small example follows below).
About SQL Server linked server, please see: Create Linked Servers (SQL Server Database Engine)
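As a rough illustration of the linked-server approach: once a linked server (here called AzureSqlLinked, a placeholder) is defined on the on-premises instance, each SELECT can feed the Azure table with four-part naming. The sketch below runs the T-SQL from Python via pyodbc; all server, database, and column names are placeholders, and you could equally run the same INSERT from a SQL Agent job.

    import pyodbc

    # Connect to the on-premises SQL Server that hosts the linked server definition.
    conn = pyodbc.connect(
        "Driver={ODBC Driver 17 for SQL Server};"
        "Server=onprem-sql;Database=SourceDb;Trusted_Connection=yes;"
    )

    # INSERT ... SELECT through the linked server using four-part naming:
    # [LinkedServer].[Database].[Schema].[Table]
    conn.cursor().execute("""
        INSERT INTO [AzureSqlLinked].[TargetDb].[dbo].[TargetTable] (col1, col2)
        SELECT col1, col2
        FROM dbo.SourceTable;
    """)
    conn.commit()
    conn.close()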
Hope this helps.
Architectural/perf question here.
I have an on-premises SQL Server database with ~200 tables and ~10 TB in total.
I need to make this data available in Azure in Parquet format for Data Science analysis via HDInsight Spark.
What is the optimal way to copy/convert this data to Azure (Blob storage or Data Lake) in Parquet format?
Due to the manageability aspect of the task (~200 tables), my best shot was: extract the data locally to a file share via sqlcmd, compress it as csv.bz2, and use Data Factory to copy the file share (with 'PreserveHierarchy') to Azure. Finally, run PySpark to load the data and save it as .parquet (a sketch of that step follows below).
Given the table schema, I can auto-generate the SQL data-extract and Python scripts from the SQL database via T-SQL.
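For reference, the PySpark step is roughly the following (paths, storage account, and options are illustrative; Spark decompresses .bz2 CSV files transparently):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("csv-to-parquet").getOrCreate()

    # Read the compressed CSV extracts for one table; .bz2 is handled automatically.
    df = (spark.read
          .option("header", "true")
          .option("inferSchema", "true")    # or supply an explicit schema per table
          .csv("wasbs://extracts@mystorageaccount.blob.core.windows.net/dbo.MyTable/*.csv.bz2"))

    # Write out as Parquet for downstream analysis.
    df.write.mode("overwrite").parquet(
        "wasbs://parquet@mystorageaccount.blob.core.windows.net/dbo.MyTable/"
    )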
Are there faster and/or more manageable ways to accomplish this?
ADF matches your requirement perfectly, with both one-time and schedule-based data movement.
Try the ADF Copy Data wizard. With it, you can move on-prem SQL Server data directly to Blob/ADLS in Parquet format with just a couple of clicks.
Copy Activity Overview
I am a newbie and need guidance or resources to read. I have two databases: one is in Azure SQL Server 2012 and the other is in MongoDB at a remote location. I access the Azure SQL Server data using SQL Server Management Studio (SSMS) from my PC, and the MongoDB data in a browser using a REST API. The retrieved data is in JSON format.
For analysis, I want to merge the data from MongoDB into SQL Server. How can I store the results of the REST API query as a table in SQL Server 2012? Note that the columns I want to retrieve from MongoDB are not sub-structured, so they can easily fit in a relational database.
Azure SQL Database supports the OPENJSON function, which can parse JSON text and transform it into a table; see https://azure.microsoft.com/en-us/updates/public-preview-json-in-azure-sql-database/
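A rough sketch of that approach from Python: fetch the REST API result, then let OPENJSON shred the JSON array into rows on the server side (OPENJSON requires database compatibility level 130 or higher; the endpoint, connection string, table, and column names below are all placeholders):

    import json
    import pyodbc
    import requests

    # Fetch the MongoDB documents via the REST API (placeholder URL).
    docs = requests.get("https://example.com/api/collection").json()

    conn = pyodbc.connect(
        "Driver={ODBC Driver 17 for SQL Server};"
        "Server=myserver.database.windows.net;Database=mydb;UID=my_user;PWD=my_password;"
    )

    # OPENJSON parses the JSON text into rows, which are inserted into a staging table.
    conn.cursor().execute("""
        INSERT INTO dbo.MongoStaging (id, name, amount)
        SELECT id, name, amount
        FROM OPENJSON(?)
        WITH (
            id     NVARCHAR(50)   '$._id',
            name   NVARCHAR(200)  '$.name',
            amount FLOAT          '$.amount'
        );
    """, json.dumps(docs))
    conn.commit()
    conn.close()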