Problem with ADF Dataflow pipeline when writing data to SQL Server

I have an ADF Dataflow pipeline that reads data from a data lake and writes it to a SQL Server table. One problematic column, "gdtxft", was converted from binary to string using the 'toString' function in the Derived Column setting. The column is nvarchar(max) in SQL Server.
The problem is that after the ADF Dataflow writes to SQL Server, only the first character of the string is displayed when I query the table, yet when I save the query results using Excel, I can see all of the string's characters. So the data does appear to be stored in SQL Server; I have verified this by running queries against the table.
I have also checked the pipeline settings and verified that the correct encoding is being used and that the data is read and written correctly. I have tried reading the data from the source using other tools and comparing the results with the data in SQL Server, but I haven't found any differences.
In short: I converted the binary column to a string using the 'toString' function in the Derived Column setting of the ADF Dataflow pipeline, expecting the complete string to be written to the SQL Server table, but only the first character of the string shows up.
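A minimal T-SQL sketch for narrowing this down (the table name dbo.YourTable is a placeholder): if the binary source was UTF-16 encoded, every second byte is 0x00, and many client tools stop rendering at the first null byte, which would make a fully stored string display as a single character.

```sql
-- Hypothetical diagnostic; dbo.YourTable is a placeholder name.
SELECT TOP (10)
       gdtxft,
       LEN(gdtxft)                   AS char_count,   -- characters as SQL Server counts them
       DATALENGTH(gdtxft)            AS byte_count,   -- bytes actually stored
       CAST(gdtxft AS varbinary(64)) AS first_bytes   -- inspect for embedded 0x00 bytes
FROM dbo.YourTable;
```

If byte_count is far larger than the single character you see, the bytes are all there, and the issue is how the string (with embedded nulls or a mismatched encoding) is being decoded for display, not how ADF writes it.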

Related

SSIS Oracle Source not pulling through Unicode characters correctly

Problem
I'm creating an SSIS package to move data from Oracle to SQL Server but am having trouble with certain Unicode characters.
The source Oracle database column is NVARCHAR2; an example problem character is U+0102 (Ă), but the issue applies to other characters as well. These will be migrated to an NVARCHAR column in SQL Server, but the issue seems to be at the point of extraction: when I preview the source in SSIS, the characters in question show as inverted question marks, e.g.
DEMO¿DEMO
Setup
I'm using the Attunity Oracle Source task/connection, as I couldn't get the OLE DB connection working
Oracle Database has NLS_NCHAR_CHARACTERSET AL16UTF16
Things I've tried
Changing the DefaultCodePage value in the Advanced settings of the Source task to 1252, 65001, 1200, and 1201
Converting the source column in the SQL command text in various ways, e.g. CONVERT(SOURCE_COLUMN, 'AL32UTF8') (see the sketch after this list)
Using UTL_RAW.Convert_To_Raw in the SQL command text. This generates the correct binary values (as DT_BYTES in SSIS), but I couldn't then transform them back into DT_WSTR using either Data Conversion or Derived Column.
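For illustration, one spelled-out variant of those conversion attempts, as a sketch only; the explicit source character set argument is an assumption taken from the NLS_NCHAR_CHARACTERSET noted in the setup above, and may need adjusting to your environment.

```sql
-- Oracle syntax; CONVERT(value, dest_charset, source_charset).
-- 'AL16UTF16' as the explicit source character set is an assumption
-- based on the NLS_NCHAR_CHARACTERSET mentioned in the setup.
SELECT CONVERT(SOURCE_COLUMN, 'AL32UTF8', 'AL16UTF16') AS CONVERTED_COLUMN
FROM SOURCE_TABLE;
```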
To rule out an SSIS issue, I've tested extracting the same characters from a SQL Server database, and they appear correctly in the SSIS preview window.
I'm using the SQL Command access mode as per below:
SELECT SOURCE_COLUMN
FROM SOURCE_TABLE;
Any help greatly appreciated.

SSIS - Using an ODBC Execute SQL Task with parameters to produce a result set to be imported

First things first: I'm totally new to SSIS and trying to figure out its potential when it comes to ETL, and eventually move on to SSAS. I have the following scenario:
I have an Intersystems Database which I can connect via ADO .NET
I want to take data from this db and make inserts into MS SQL through incremental loads
My proposed solution/target is:
Have a table in MS SQL that stores the last pointer read or a date/time snapshot (irrelevant at this stage). Let's keep it simple and say we are going to use the record ID that exists in the Intersystems database
Get the pointer from this table and use it as a parameter through ODBC to read the source database and then make inserts into the target MS SQL db
Update the pointer with the last record read so that next time we continue from there (I don't want to get into the complications of updates/deletes; let's keep it simple)
Progress so far:
I have succeeded in making a connection to MS SQL to read the pointer from there and place it in a variable
I have managed to use the [Execute SQL Task] with parameters to read data from the Intersystems DB, placing the result into a variable as a FullResultSet
I have managed to use the [ForEach Loop Container] with the [Foreach ADO Enumerator] to go through each record and each field (yeeeey!)
Now I could use a [Script Task] that inserts into the MS SQL database using VB.NET code (theoretically) and then updates the counter with the last record read from the source database. I have spent endless hours looking for solutions using ODBC parameters, and the above is the only way forward I could see working.
My question is this:
Is this the only way, and is it best practice? Isn't there some easy way that I can plug this result set into some dataflow components which do the inserts and update the record pointer for me?
Please assume that I do not have write access to the Intersystems DB, and thus I cannot make any changes to the table structures there. I can only read data, so that I can place it into MS SQL.
Over to you guys (or gals?)
I would suggest using a dataflow to improve your design, both for efficiency (bulk loading vs. row-by-row in a script) and ease of use (no need for scripting):
1. Use an Execute SQL Task to get your pointer and save it into a variable.
2. Build a SQL string variable using dynamic SQL and the above variable (see the sketch after this list).
3. Make a data connection in the connection manager to the source.
4. Add a Data Flow and go into it.
5. Add a source component and select your source from the popup.
6. Choose "SQL command from variable" and choose your variable.
At this point you should have all the data you want, and you can continue to transform it or load it directly to your target.
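A sketch of step 2, assuming the pointer from step 1 is stored in an SSIS variable named User::LastPointer and the source keys on RECORD_ID (both names are placeholders):

```sql
-- SSIS variable expression (set EvaluateAsExpression = True), sketched:
--   "SELECT * FROM SOURCE_TABLE WHERE RECORD_ID > " + (DT_WSTR, 20) @[User::LastPointer]
-- which evaluates to a statement like:
SELECT *
FROM SOURCE_TABLE
WHERE RECORD_ID > 12345;  -- 12345 stands in for the stored pointer value
```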
Edit: Record Pointer part
7. Add a Multicast (this makes as many copies of the stream as you want).
8. Add an Aggregate object and take MAX(whatever your pointer is).
9. Add an OLE DB Command object (this allows live SQL and is used mainly for updates).
9a. UPDATE "YourPointerTable" SET "PointerField in DB" = ? (the ? is literally what you enter).
9b. Map the parameter to whatever you named in step 8.
This will also allow you to handle inserts/updates:
10. From the Multicast, flow a new stream to a Lookup object and map your key to the key of the destination table.
11. Specify that rows with no match are redirected to the no-match output.
12. Map your matches to an UPDATE.
13. Map your no-matches to an INSERT.
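For concreteness, a minimal sketch of the pointer table and the statement behind step 9a; all names here are placeholders.

```sql
-- Hypothetical pointer table; one row per source table being loaded.
CREATE TABLE dbo.LoadPointer (
    TableName sysname NOT NULL PRIMARY KEY,
    LastId    bigint  NOT NULL
);

-- Statement behind the OLE DB Command in step 9; the single ? parameter
-- is mapped (step 9b) to the MAX value produced by the Aggregate in step 8.
UPDATE dbo.LoadPointer
SET LastId = ?
WHERE TableName = 'SOURCE_TABLE';
```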

Link_Server data displayed as junk characters when passed from SQL Server to Oracle DB

I'm currently having issues executing an Oracle procedure from a SQL Server DB over a linked server. I'm able to execute the procedure, which takes data from the SQL Server DB and inserts it into the Oracle DB. However, for some records' values it inserts the data as "junk characters" (e.g. "䍄䵒吠獥⁴潦⁲乓‣湡⁤䅖呓䤠" instead of "Test Description"), and I'm not entirely sure why. We're not using any special characters, but when the values are passed from the SQL Server DB to our Oracle DB they change to these junk characters. Let me know if more information is needed, and thanks in advance.
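One observation, offered as a sketch: the junk above looks like single-byte text being reinterpreted as UTF-16, which points at a character-set mismatch on the link rather than at the data itself. The effect can be reproduced in T-SQL:

```sql
-- Reinterpret the single-byte bytes of an ASCII string as UTF-16:
-- the result is CJK-looking "junk", much like the characters above.
SELECT CAST(CAST('Test Description' AS varbinary(32)) AS nvarchar(16));
```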

How to copy data from one table to another in SQL Server

I am trying to copy data from views on a trusted SQL Server 2012 to tables on a local instance of SQL Server on a scheduled transfer. What would be the best practice for this situation?
Here are the options I have come up with so far:
Write an executable program in C# or VB to delete the existing local table, query the data from the remote database, and then write the results to tables in the local database. The executable would run as a scheduled task.
Use BCP to copy data to a file and then upload into local table.
Use SSIS
Note: The connection between local and remote SQL Server is very slow.
Since the transfers are scheduled, I suppose you want this data to be up-to-date.
My recommendation would be to use SSIS and schedule it using SQL Agent. If you wrote a C# program, I think the best outcome you would gain is a program imitating SSIS. Moreover, SSIS makes it very easy to amend the workflow at any time.
Either way, to keep such a program/package up-to-date, you will have to answer an important question: is the source table updatable, or is it like a log (inserts only)?
This question is important because it determines how you will fetch new updates from the source table. For example, if the table represents logs, you will most probably use the primary key to detect new records; if not, you might want to look for a column representing the update date/time. If you have the authority to alter the source table, you might want to add a timestamp column representing the row version (timestamp differs from datetime).
For building an SSIS package, it will mainly contain the following components:
1. An Execute SQL Task to get the maximum value from the source table.
2. An Execute SQL Task to get the value the load should start from at the destination. You can get this value either by selecting the maximum value from the destination table or, if the table is pretty large, by storing that value in another table (a configuration table, for example).
3. A Data Flow which moves the data from the source table, starting after the value fetched in step 2 up to the value fetched in step 1.
4. An Execute SQL Task for updating the new maximum value back to the configuration table, if you chose that technique.
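A hedged sketch of the queries behind those four components, assuming a date/time high-water mark; the table, column, and configuration names are placeholders:

```sql
-- Step 1: current maximum at the source.
SELECT MAX(UpdateDateTime) FROM dbo.SourceView;

-- Step 2: last value already loaded (configuration-table variant).
SELECT LastLoadedValue FROM dbo.EtlConfig WHERE SourceName = 'SourceView';

-- Step 3: source query for the Data Flow, parameterized with the two values.
SELECT *
FROM dbo.SourceView
WHERE UpdateDateTime >  ?   -- value from step 2
  AND UpdateDateTime <= ?;  -- value from step 1

-- Step 4: persist the new high-water mark.
UPDATE dbo.EtlConfig SET LastLoadedValue = ? WHERE SourceName = 'SourceView';
```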
BCP can be used to export the data to a file, which can be compressed and transferred over the network, then imported into the local instance of SQL Server.
BCP exports can also be split into smaller batches of data for easier management.
https://msdn.microsoft.com/en-us/library/ms191232.aspx
https://technet.microsoft.com/en-us/library/ms190923(v=sql.105).aspx
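A minimal sketch of that round trip; the server, table, and file names are placeholders, and the bcp commands appear as comments since bcp is a command-line tool, not T-SQL:

```sql
-- Export on/near the remote server (command line, not T-SQL):
--   bcp "SELECT * FROM SourceDb.dbo.SourceView" queryout data.bcp -n -S RemoteServer -T
-- Import into the local table; BATCHSIZE gives the smaller batches noted above.
BULK INSERT dbo.LocalTable
FROM 'C:\transfer\data.bcp'
WITH (DATAFILETYPE = 'native', BATCHSIZE = 10000);
```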

SSIS cannot convert between unicode and non-unicode string data types

I am new to SSIS and have been asked to migrate data from a Postgres database to SQL Server. There are no data transformation requirements, but I have hit an early snag.
I've created a System DSN on my machine for an ODBC driver to connect to the Postgres database, and it connects fine.
In SSIS, I created an ODBC Data Provider connection and used the DSN I created. It connects well. I then create an ADO.NET data source and select my source data connection. I select the table I want this flow to pull from, press Preview, and I see the data. All good.
I then create an OLE DB Destination component. I can't use an ADO.NET Destination, as one of my fields is a 'GEOMETRIC' type and SSIS seems to have an issue with that; OLE DB allows it.
I connect the Source to the Destination and then open the Destination item. I select the destination table, select 'Mappings', and all is mapped perfectly.
I close that screen, but I see a red cross next to my Destination component:
"Column "pid" cannot convert between unicode and non-unicode string data types."
This is the case with ALL VARCHAR fields in my database. Do I need to create a Data Conversion or something for every table and every VARCHAR in my database, or is there a way to fix this?
To convert between Unicode and non-Unicode string data types, use a Data Conversion transformation: for every problematic Unicode source column, add a converted copy of type string [DT_STR], then map the converted column to the destination.
