Hi all, I have an Alteryx workflow that reads data from Amazon Redshift with a Connect In-DB tool. I've been provided CSV files, but I don't have access to Redshift. I want to run the workflow, but I can't figure out how to import a CSV into the Connect In-DB tool. Any ideas how to do this?
[Image: Alteryx Connect In-DB tool workflow]
As #johnjps111 mentioned in the comments, you need to replace all the in-database tools with an Input Data tool that reads the .csv. This assumes the .csv was filtered and summarized to match the output that would come from the in-database tools; otherwise you would need to replicate those steps with their standard (non-In-DB) counterparts.
We have Snowflake in our organization, and we currently don't have an ETL tool.
I would like to pull data directly from Salesforce into a Snowflake staging table manually for an analysis.
Would it be possible to do this with Python or Java code?
Many thanks,
This article is a useful reference for this requirement: https://rudderstack.com/guides/how-to-load-data-from-salesforce-to-snowflake-step-by-step
You can do this using the simple-salesforce library for Python:
https://pypi.org/project/simple-salesforce/
Run the query, write out the results to a CSV, then load the CSV into Snowflake.
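For what it's worth, here's a minimal sketch of the query-and-export step; the credentials, SOQL query, and file name are all placeholders:

```python
# Sketch: pull Salesforce records with simple-salesforce and dump them to CSV.
import csv

from simple_salesforce import Salesforce

sf = Salesforce(
    username="user@example.com",   # placeholder credentials
    password="password",
    security_token="token",
)

# query_all pages through the full result set for you.
result = sf.query_all("SELECT Id, Name, CreatedDate FROM Account")
records = result["records"]

if records:
    with open("accounts.csv", "w", newline="") as f:
        # Skip the 'attributes' metadata key simple-salesforce adds to each record.
        fieldnames = [k for k in records[0] if k != "attributes"]
        writer = csv.DictWriter(f, fieldnames=fieldnames, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(records)
```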
You can use Salesforce Data Loader (provided by Salesforce): a simple interface to export Salesforce data to a .csv file. It's a Java-based tool.
Once exported, you can PUT your flat file into a Snowflake stage and then use COPY INTO to load it into your final table.
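A rough sketch of the PUT/COPY INTO step using Snowflake's Python connector; the connection details, file path, and table name are placeholders (the export itself is done with Data Loader, as described above):

```python
# Sketch: stage a local CSV in a Snowflake table stage, then COPY it in.
import snowflake.connector

conn = snowflake.connector.connect(
    user="USER", password="PASSWORD", account="my_account",  # placeholders
    warehouse="MY_WH", database="MY_DB", schema="STAGING",
)
cur = conn.cursor()

# Upload the file to the table's internal stage (@%table)...
cur.execute("PUT file:///tmp/accounts.csv @%SALESFORCE_ACCOUNTS")

# ...then load it, skipping the header row.
cur.execute("""
    COPY INTO SALESFORCE_ACCOUNTS
    FROM @%SALESFORCE_ACCOUNTS
    FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1 FIELD_OPTIONALLY_ENCLOSED_BY = '"')
""")
conn.close()
```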
I can't really find anything online about how to do this.
There are a few separate, offline Microsoft Access databases in play...
Everyone has begun staging the different .accdb files in an Amazon S3 bucket. I'm hoping Snowflake now provides an easy(ish) way to read them into the SQL database I'm building.
The short answer is that you can't. Snowflake can import text files in various formats (CSV, XML, JSON, etc.), but it has no extract capabilities, so it can't connect to applications and read data from them: asking it to read an MS Access file is no different from asking it to read an Oracle or SQL Server file.
You probably have two options:
1. Export the data from MS Access to a file format that Snowflake can ingest (a sketch of this follows the list).
2. Use an ETL tool that can read from MS Access and write to S3 as text files (or directly to Snowflake, which is probably simpler).
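As a sketch of option 1, assuming you have the Microsoft Access ODBC driver installed (Windows only), something like this could dump a table to a CSV that Snowflake can ingest; the paths and table name are placeholders:

```python
# Sketch: export one Access table to CSV via the Access ODBC driver.
import csv

import pyodbc

conn = pyodbc.connect(
    r"DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};"
    r"DBQ=C:\data\legacy.accdb"  # placeholder path
)
cur = conn.cursor()
cur.execute("SELECT * FROM Customers")  # placeholder table name

with open("customers.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow([col[0] for col in cur.description])  # header row
    writer.writerows(cur.fetchall())
conn.close()
```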
You should be able to connect to Snowflake in Microsoft Access through an ODBC connection. You first need to install the Snowflake ODBC Driver and configure a DSN.
I've been trying to sort through Microsoft's extensive documentation but cannot find the answer I'm looking for, hence posting it here for the experts!
I have a table in a database on MS SQL Server 2016 that I read from and write to using SSMS (SQL Server Management Studio). I would like to export this single table to my Azure storage account for further analysis in the MS Data Science Virtual Machine, but I cannot find a way to do this. Any suggestions?
Thanks.
You can also use tools built into the MS Data Science Virtual Machine (DSVM) to first export from SQL Server to CSV. BCP (command line) is one such tool.
If you want a graphical tool, use SSMS and its "Import and Export" option to save the result of your query to a CSV file. Then you can copy the CSV file to an Azure storage account using AzCopy (command line) or Azure Storage Explorer (graphical), both of which are also available on the DSVM. Hope this helps.
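If you'd rather script it than use BCP or the graphical tools, here's a rough Python sketch of the same export-then-upload flow, using pandas/pyodbc for the CSV export and the azure-storage-blob SDK for the upload; the server, table, container, and connection string are placeholders:

```python
# Sketch: export a SQL Server table to CSV, then upload it to Azure Blob storage.
import pandas as pd
import pyodbc
from azure.storage.blob import BlobServiceClient

sql_conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=myserver;DATABASE=MyDb;Trusted_Connection=yes;"  # placeholders
)
df = pd.read_sql("SELECT * FROM dbo.MyTable", sql_conn)  # placeholder table
df.to_csv("mytable.csv", index=False)

blob_service = BlobServiceClient.from_connection_string("<storage-connection-string>")
blob_client = blob_service.get_blob_client(container="exports", blob="mytable.csv")
with open("mytable.csv", "rb") as data:
    blob_client.upload_blob(data, overwrite=True)
```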
I have a .GDB database (an old one), and the data in it is very important.
I need to convert that .gdb database to a SQL Server database. Can anyone help me...
Create connections to both the source GDB and the destination SQL Server in ArcCatalog. Copy everything from the source and paste it into the destination. You won't be able to do it with SQL tools alone.
Lacking ESRI software, for simple cases, my workflow is to use the GDAL C++ API to read the GDB. This requires the GDAL File GDB driver. Then I will use Microsoft.SqlServer.Types to transfer to SQL Server. This involves low-level APIs and you need to understand the spatial types in the respective libraries. It gets complex if you have polygons with rings, for example.
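For illustration, here's a rough sketch of the same idea using GDAL's Python bindings rather than the C++ API; it assumes GDAL's built-in read-only OpenFileGDB driver, and the paths, layer, column, and table names are placeholders:

```python
# Sketch: read features from a File Geodatabase with GDAL/OGR and insert
# them into SQL Server as WKT via geometry::STGeomFromText.
import pyodbc
from osgeo import ogr

ds = ogr.Open(r"C:\data\old.gdb")   # opened via the OpenFileGDB driver
layer = ds.GetLayer("Parcels")      # placeholder layer name

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=myserver;DATABASE=GIS;Trusted_Connection=yes;"  # placeholders
)
cur = conn.cursor()

for feature in layer:
    wkt = feature.GetGeometryRef().ExportToWkt()
    # geometry::STGeomFromText builds a SQL Server spatial value from WKT;
    # the SRID (4326 here) should match your data.
    cur.execute(
        "INSERT INTO dbo.Parcels (Name, Shape) "
        "VALUES (?, geometry::STGeomFromText(?, 4326))",
        feature.GetField("Name"), wkt,
    )
conn.commit()
```

As the answer above notes, this only covers simple cases; multipart geometries and polygons with rings need more careful handling.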
I'm not aware of a tool that will automatically convert between these database types. You'll need to use an application that can read the old database type (Firebird/InterBase, which used the .gdb extension), learn the table design, create a similar table design in SQL Server, and use the application to load the data from Firebird into SQL Server.
Typically, this kind of work is called ETL (Extract/Transform/Load) and is done with migration tools like SQL Server Integration Services (SSIS). SSIS is free with SQL Server, and there are a lot of books available on how to use it, but like learning to develop software, this isn't a small task.
The easiest way to export Esri File Geodatabase (FGDB, .gdb) data to MS SQL Server is with ArcGIS for Desktop at the Standard or Advanced license level.
You may also want to try exporting to shapefile (SHP) format (an open transitional format) and then importing into MS SQL Server. I've seen a tool online that has worked for me called Shape2SQL.
Esri also has an open File Geodatabase API that you can use to write your own tool.
I highly recommend FME Workbench for GIS data conversion. It's like SQL Server Integration Services (an ETL tool) but for GIS: a graphical interface where you connect data readers to data writers, insert transformers, run them, etc.
I have a client who needs to import rows from a LARGE Excel file (72K rows) into their SQL Server database. The file is uploaded by users of the system. Performance became an issue when we tried to process the file in the same step as the user's upload. Now we just save it to disk, and an admin picks it up, splits it into 2K-row chunks, and runs them through an upload tool one by one. Is there an easier way to accomplish this without hurting performance or hitting timeouts?
If I understand your problem correctly you get a large spreadsheet and need to upload it into a SQL Server database. I'm not sure why your process is slow at the moment, but I don't think that data volume should be inherently slow.
Depending on what development tools you have available it should be possible to get this to import in a reasonable time.
SSIS can read from Excel files. You could schedule a job that wakes up periodically and checks for a new file. If it finds one, it uses a data flow task to import the file into a staging table, and then a SQL task to run some processing on it.
If you can use .NET, you could write an application that reads the data out through the OLE automation API and loads it into a staging area through SqlBulkCopy. You can read an entire range into a variant array through the Excel COM API. This is not super fast, but it should be fast enough for your purposes.
If you don't mind using VBA, you can write a macro that does something similar. However, I don't think traditional ADO has a bulk load feature, so to do this you would need to export a .CSV (or something similar) to a drive that is visible from the server and then BULK INSERT from that file. You might also have to make a bcp format file for the output .CSV.
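For comparison, here's a rough Python analogue of the staging approach described above: read the whole sheet in one pass, then bulk-insert into a staging table with pyodbc's fast_executemany. The file, sheet, column, and table names are placeholders:

```python
# Sketch: load a large Excel sheet into a SQL Server staging table.
import pandas as pd
import pyodbc

# Read the entire sheet at once rather than processing row by row.
df = pd.read_excel("upload.xlsx", sheet_name=0)  # placeholder file

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=myserver;DATABASE=MyDb;Trusted_Connection=yes;"  # placeholders
)
cur = conn.cursor()
cur.fast_executemany = True  # parameter arrays instead of per-row round trips

cur.executemany(
    "INSERT INTO dbo.StagingUpload (Col1, Col2, Col3) VALUES (?, ?, ?)",
    list(df.itertuples(index=False, name=None)),
)
conn.commit()
```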
Headless imports from user-supplied spreadsheets are always troublesome, so there is quite a bit of merit in doing this through a desktop application. The principal benefit is error reporting: a headless job can really only send an email with some status information, whereas with an interactive application the user can troubleshoot the file and make multiple attempts until they get it right.
I could be wrong, but from your description it sounds like you were doing the processing in application code (i.e., the file is uploaded and the code that handles the upload then processes the import, possibly row by row).
In any event, I've had the most success importing large datasets like that using SSIS. I've also set up a spreadsheet as a linked server, which works but has always felt a bit hacky to me.
Take a look at this article, which details how to import data using several different methods, namely:
SQL Server Data Transformation Services (DTS)
Microsoft SQL Server 2005 Integration Services (SSIS)
SQL Server linked servers
SQL Server distributed queries
ActiveX Data Objects (ADO) and the Microsoft OLE DB Provider for SQL Server
ADO and the Microsoft OLE DB Provider for Jet 4.0