Migrating from SQL Server to Neo4j with Pentaho Kettle Spoon

I want to migrate from SQL Server to Neo4j. I used CSV files for this, but I need an ETL tool to solve the problem in the simplest way.
For this reason I use Pentaho Kettle Spoon.
I used this to connect to Neo4j with Pentaho Kettle Spoon.
How can I migrate from SQL Server to Neo4j with Pentaho Kettle Spoon?
Which tools in Pentaho Kettle Spoon can help me?

I faced this problem and was able to solve it. :)
First, add a Table Input step to get the records from SQL Server, then add an Execute SQL Script step from the Scripting category.
Create a hop from the Table Input step to the Execute SQL Script step, then click Get fields and check:
Execute for each row?
Execute as a single statement
Then add your Cypher query, like this:
CREATE (NodeName:NodeLabel {field1: ?, field2: ?, field3: ?, ...})
Execute the transformation and enjoy! :)
--------------------------------------------------------
Edited:
The LOAD CSV command in Neo4j is much faster than creating all the nodes one by one with CREATE. You can take advantage of LOAD CSV from Pentaho Kettle Spoon. For this we need two transformations: the first transformation exports the data to CSV, and the second transformation loads the CSV into Neo4j.
For the first transformation:
Add a Table Input and a Text File Output step to the transformation, and configure the connection string and the other settings of both.
To configure the Neo4j connection string, refer to this.
For the second transformation:
Add an Execute SQL Script step to the transformation, configure the connection string, and write the code below for it:
LOAD CSV FROM 'file:///C:/test.csv' AS Line
CREATE (NodeName:NodeLabel {field1: Line[0], field2: Line[1], field3: Line[2], ...})
Finally, create a job and add the two transformations to it.
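If your exported CSV has a header row, a variant with LOAD CSV WITH HEADERS lets you refer to columns by name instead of by position, and batching the import keeps Neo4j from building one huge transaction. A minimal sketch, assuming Neo4j 3.x (newer versions replace USING PERIODIC COMMIT with CALL { ... } IN TRANSACTIONS); the file path, label, and property names are placeholders:
// Minimal sketch, assuming a CSV with a header row (id,name,city).
// The file path, the Person label and the property names are placeholders.
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM 'file:///C:/test.csv' AS row
CREATE (n:Person {id: row.id, name: row.name, city: row.city})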

Related

Is there a way to save all queries present in an SSIS package/dtsx file?

I need to run some analysis on my queries (specifically, finding all the tables that an SSIS package calls).
Right now I'm opening up every single SSIS package and every single step in it, and manually copying and pasting the tables from it.
As you can imagine, it's very time-consuming and mind-numbing.
Is there a way to export all the queries automatically?
By the way, I'm using SQL Server 2012.
Retrieving the queries is not a simple process; you can work in two ways to achieve it:
Analyzing the .dtsx package XML content using regular expressions
SSIS packages (.dtsx) are XML files; you can read these files as text and use regular expressions to retrieve the tables (for example, you may search for all statements that start with the SELECT, UPDATE, DELETE, DROP, ... keywords).
There are some questions about retrieving information from .dtsx files that you can refer to for ideas:
Reverse engineering SSIS package using C#
Automate Version number Retrieval from .Dtsx files
Using SQL Profiler
You can create and run a SQL Profiler trace on the SQL Server instance and filter on all T-SQL commands executed while the SSIS package runs. Some examples can be found in the following posts:
How to capture queries, tables and fields using the SQL Server Profiler
How to monitor just t-sql commands in SQL Profiler?
SSIS OLE DB Source Editor Data Access Mode: “SQL command” vs “Table or view”
Is there a way in SQL profiler to filter by INSERT statements?
Filter Events in a Trace (SQL Server Profiler)
You can also use Extended Events (which has more options than Profiler) to monitor the server and collect SQL commands; a minimal event session sketch follows the links below:
Getting Started with Extended Events in SQL Server 2012
Capturing queries run by user on SQL Server using extended events
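A minimal sketch of such an event session, assuming you write the captured commands to an event file; the session name and output path are placeholders, and you could add a WHERE predicate on sqlserver.client_app_name to keep only the connections made by your packages:
-- Minimal sketch: capture completed T-SQL batches and RPC calls to an event file.
-- The session name and file path are placeholders.
CREATE EVENT SESSION CaptureSsisQueries ON SERVER
ADD EVENT sqlserver.sql_batch_completed
    (ACTION (sqlserver.client_app_name, sqlserver.sql_text)),
ADD EVENT sqlserver.rpc_completed
    (ACTION (sqlserver.client_app_name, sqlserver.sql_text))
ADD TARGET package0.event_file (SET filename = N'C:\Temp\ssis_queries.xel');

ALTER EVENT SESSION CaptureSsisQueries ON SERVER STATE = START;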
You could create a schema for this specific project and then have all the SQL stored in views on that schema... It will help keep things tidy and help with issues like this.

Import data from MySQL to Neo4j using Pentaho Kettle Spoon

I was trying to migrate my MySQL database to Neo4j. For that I was using Pentaho Kettle, and I have also downloaded the Neo4j plugins for Kettle.
I am a beginner with Kettle and don't know an efficient way to bulk load data from a MySQL database to Neo4j using it. I am thinking of doing the following:
First, take input from the table by writing an SQL query.
Then connect it to the Execute SQL Script step and create a Cypher query for creating the nodes and relationships corresponding to the table schema.
Is there any better way of doing this? My database is large, and I would have to write many SQL and Cypher queries to import my SQL database into Neo4j. I am looking for some bulk load feature in Pentaho Kettle Spoon.
Here are snapshots of my setup (screenshots of the overall Transformation, the Input Table step, and the Execute Script step).
My PDI version is 8.1.

Daily feed a SQL Server table from a CSV file on sFTP

I have a CSV file that can be accessed only via sFTP.
The CSV file is updated daily (same structure but different values).
My aim is to copy the values of the CSV into a SQL Server table every day. Of course the process needs to be automated.
The CSV also contains many rows. The first column of the CSV is 'ID', and I have a fixed list of IDs, so I need to do some filtering before loading into SQL Server.
What would be the best option to reach this aim: an external ETL tool, a batch file, PowerShell, a SQL script?
Integration Services (SSIS) is a good choice, because you can use a combination of tasks (FTP connection, flat file source, T-SQL, ...), and you can schedule the SSIS package in a SQL Server Agent job so it is executed daily.
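For the filtering requirement, one hedged option is to let the package bulk load the downloaded file into a staging table and then insert only the rows whose ID is in the fixed list; the table, column, and file names below are placeholders:
-- Minimal sketch, assuming the FTP task already downloaded the CSV locally.
-- dbo.StagingFeed, dbo.TargetTable, dbo.AllowedIds and the file path are placeholders.
BULK INSERT dbo.StagingFeed
FROM 'C:\Feeds\daily.csv'
WITH (FIRSTROW = 2, FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');

INSERT INTO dbo.TargetTable (ID, Col1, Col2)
SELECT s.ID, s.Col1, s.Col2
FROM dbo.StagingFeed AS s
WHERE s.ID IN (SELECT ID FROM dbo.AllowedIds);  -- keep only the fixed list of IDs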

Speeding Up ETL DB2 to SQL Server?

I came across this blog post when looking for a quicker way of importing data from a DB2 database to SQL Server 2008.
http://blog.stevienova.com/2009/05/20/etl-method-fastest-way-to-get-data-from-db2-to-microsoft-sql-server/
I'm trying to figure out how to achieve the following:
3) Create a BULK Insert task, and load up the file that the execute process task created. (note you have to create a .FMT file for fixed-width import. I create a .NET app to load the FDF file (the transfer description) which will auto create a .FMT file for me, and a SQL Create statement as well – saving time and tedious work)
I've got the data in a TXT file and a separate FDF with the details of the table structure. How do I combine them to create a suitable .FMT file?
I couldn't figure out how to create suitable .FMT files.
Instead I ended up creating replica tables from the source DB2 system in SQL Server and ensured that the column order was the same as what was coming out of the IBM File Transfer Utility.
Using an Excel sheet to control which file transfers/tables should be loaded (allowing me to enable/disable them as I please), along with a For Each Loop in SSIS, I've got a suitable solution to load multiple tables quickly from our DB2 system.
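For anyone who still wants the format-file route, a non-XML format file for a fixed-width import is just a small text file listing each column's width and target column. A minimal sketch follows (the column names, widths, table name, and paths are invented for illustration, and the version number on the first line should match your SQL Server version):
-- Contents of a hypothetical customers.fmt for a fixed-width data file:
--   10.0
--   3
--   1  SQLCHAR  0  10  ""      1  CustomerID  ""
--   2  SQLCHAR  0  30  ""      2  FirstName   SQL_Latin1_General_CP1_CI_AS
--   3  SQLCHAR  0  30  "\r\n"  3  LastName    SQL_Latin1_General_CP1_CI_AS

-- Then point BULK INSERT at the data file and the format file:
BULK INSERT dbo.Customers
FROM 'C:\Transfers\customers.txt'
WITH (FORMATFILE = 'C:\Transfers\customers.fmt');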

Batch Inserting from Excel into SQL Server

Is there a way to do a batch insert into SQL Server from rows of data in Excel? We have Excel documents that contain 2,000+ rows that need to be imported into SQL Server. Is there a way to do a batch insert without calling the database over and over to insert one row at a time?
SQL Server Integration Services offers a wizard-based import which can help you easily set up a package to import an Excel file. You can even save the package and schedule it to repeat the import in the future.
You have other options as well. If you save the Excel file as a comma- or tab-delimited text file, you can use the BULK INSERT T-SQL command; see an example on the sqlteam.com forums.
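A minimal sketch of that approach, assuming the sheet was saved as a tab-delimited text file with a header row; the table name and file path are placeholders:
-- Minimal sketch: load a tab-delimited export of the Excel sheet.
-- dbo.ImportedRows and the file path are placeholders.
BULK INSERT dbo.ImportedRows
FROM 'C:\Data\rows.txt'
WITH (FIRSTROW = 2, FIELDTERMINATOR = '\t', ROWTERMINATOR = '\n');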
Another T-SQL option is SELECT INTO. Excel is a valid OLEDB or ODBC data source from T-SQL. Here's an example.
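A hedged sketch of that SELECT INTO approach using OPENROWSET (this assumes ad hoc distributed queries are enabled and the ACE OLE DB provider is installed; the file path, sheet name, and target table are placeholders):
-- Minimal sketch: query the Excel sheet as an OLE DB source and load it into a new table.
-- The file path, [Sheet1$] and dbo.ImportedRows are placeholders.
SELECT *
INTO dbo.ImportedRows
FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
                'Excel 12.0;Database=C:\Data\rows.xlsx;HDR=YES',
                'SELECT * FROM [Sheet1$]');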
There's also a command line import tool included with Microsoft SQL Server called BCP. Good documentation on BCP and the other options can be found on MSDN at: http://msdn.microsoft.com/en-us/library/ms187042.aspx
You can create an SSIS package to read your Excel file. When you create your task, you can select a connection type of "Excel", and then it helps you create an "Excel Connection Manager". Then you can easily send the data to your SQL Server table. Here's a tutorial on how to import an Excel file into SQL Server (2005). Give it a look.
Yes! Use the Import/Export Wizard in SSMS! Use an Excel source and a SQL Server destination. You can also create an SSIS package in BIDS, or use the BULK INSERT statement from T-SQL if you convert your Excel sheets into CSV files.
