ODI: SQL Server Source

We need to extract a SQL Server source into our Oracle database with ODI.
In this source there is a difference between a NULL and an empty string, and we need to carry that difference over into ODI. The idea is something like nvl(attribute, 'XXX'), so a NULL is replaced by a marker value before the data reaches Oracle, where a NULL and an empty string end up being the same thing.
But in the physical mapping, coming from SQL Server, ODI always uses a temporary C$ table (which is already an Oracle table). My 'nvl' only gets applied after that C$ table, and by then Oracle already treats a NULL and an empty string the same.
Does anyone know how to handle this issue?
Thanks!

In the logical mapping you can apply the ANSI SQL function coalesce(attribute, 'XXX') to the target column; this is valid SQL Server syntax.
If you set the property Execute on Hint to Source, the function is applied in the SELECT statement that runs on the source, before the rows are inserted into the C$ table.
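As a rough illustration (table and column names are mine, not from the question), the SELECT that ODI would then push down to SQL Server when loading the C$ table looks something like this, so NULLs already carry the marker value when they arrive in Oracle:

SELECT
    SRC.ID,
    COALESCE(SRC.ATTRIBUTE, 'XXX') AS ATTRIBUTE
FROM dbo.SOURCE_TABLE SRC;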

Related

Is it possible to compare tables from different SQL Servers?

I have to compare the tables in Server1, database A, dbo.X and Server2, database B, dbo.Y. Both table X and table Y should contain the same values.
So I need to validate that both tables contain the same values in every row and column. Is it possible to do this?
Thanks
If you do not want to use any tool like SSIS/Visual Studio, then a linked server will be required.
SELECT * FROM Server1.databaseA.dbo.X
EXCEPT
SELECT * FROM Server2.databaseB.dbo.Y
EXCEPT returns distinct rows from the left input query that aren’t output by the right input query.
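Note that EXCEPT only checks one direction: rows that exist in Y but not in X will not show up. To validate both directions, run the reverse comparison as well (a minimal sketch using the same linked-server names as above):

SELECT * FROM Server2.databaseB.dbo.Y
EXCEPT
SELECT * FROM Server1.databaseA.dbo.X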
Sure, you can do it by creating linked servers. Please follow this manual to create one:
Creating Linked Servers
After this you will be able to run SQL queries against the other server like this:
SELECT name FROM [SRVR002\ACCTG].master.sys.databases ;
There is an easier way if you have Visual Studio installed. There is an option to compare schema and data with any server, and it is very efficient because you can update the target server from within the tool as well.
Visual Studio -> Tools -> SQL Server -> Data Comparison

MS SQL Server: parameterize the database name, path and file name of a RESTORE script in an SSIS package

To restore a database in MS SQL Server I have the script below, but I would like to parameterize the database name, path and file name in an SSIS package. How do I do that?
Details:
The script, which works (I got it by right-clicking Restore in MS SQL Server):
USE [master]
RESTORE DATABASE [DataSystem2014] FROM DISK = N'F:\Folders\99_Temp\12_DataSystem\DataSystem_20140915_181216.BAK' WITH FILE = 1, MOVE N'DataSystem2014' TO N'F:\DatabaseFiles\DataSystem2014_Primary.mdf', MOVE N'DataSystem2014_log' TO N'F:\DatabaseLogFiles\DataSystem2014_Primary.ldf', NOUNLOAD, STATS = 5
GO
but I'd like to use the above as a SQL task in an SSIS package, and I couldn't properly parameterize the database name ( [DataSystem2014] ), the path ( F:\Folders\99_Temp\12_DataSystem\ ) or the file name ( DataSystem_20140915_181216.BAK ).
The database name will be fairly stable, but I would still like to bring it into the SQL statement as a parameter; the path might change but is also stable enough; the file name always changes. I tried a few versions, used ? with parameter mapping, and used @[User::Variable] in the SQL statement, but couldn't get them working - always error messages.
Is this something I could get some help with, please?
The task for issuing SQL statements is called the Execute SQL Task. Depending on your connection manager type, you will use different characters as placeholders. Map Query Parameters
Generally speaking, people use an OLE DB connection manager when working with SQL Server, so the replacement character is ?. That is an ordinal-based replacement token, so even if the same value appears N times in the statement, you have to add N ? placeholders and make N variable mappings.
Looking at your query,
RESTORE DATABASE
[DataSystem2014]
FROM DISK = N'F:\Folders\99_Temp\12_DataSystem\DataSystem_20140915_181216.BAK' WITH FILE = 1
, MOVE N'DataSystem2014' TO N'F:\DatabaseFiles\DataSystem2014_Primary.mdf'
, MOVE N'DataSystem2014_log' TO N'F:\DatabaseLogFiles\DataSystem2014_Primary.ldf'
, NOUNLOAD
, STATS = 5;
it could be as heavily parameterized as this
RESTORE DATABASE
?
FROM
DISK = ? WITH FILE = 1
, MOVE ? TO ?
, MOVE ? TO ?
, NOUNLOAD
, STATS = 5;
Since those are all unique-ish values, I'd create a number of Variables within SSIS to hold their values. Actually, I'd create more Variables than I directly map.
For example, the name of my restored database, DataSystem2014, might always match the virtual (logical) names of the data and log files, so knowing one, I could derive the other values. The mechanism for deriving values is an Expression. Thus, if I create a Variable called @[User::BaseDBName], I could then create @[User::DBLogName] by setting EvaluateAsExpression = True and using a formula like
@[User::BaseDBName] + "_log"
You can see in the linked MSDN article how to actually map these Variables to their placeholders in the query.
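For illustration only (the Variable names here are mine, not from the question), the Parameter Mapping tab of the Execute SQL Task would then pair the Variables with the ordinal placeholders, in the order the ? markers appear:

0 -> User::BaseDBName       (database to restore)
1 -> User::BackupFilePath   (full path to the .BAK file)
2 -> User::DataLogicalName  (logical name of the data file)
3 -> User::DataFilePath     (target .mdf path)
4 -> User::LogLogicalName   (logical name of the log file)
5 -> User::LogFilePath      (target .ldf path)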
Where this all falls apart though, at least in my mind, is when you have multiple data files. You're now looking at building your restore command dynamically.
I'm assuming you know what you are doing with parameterized Execute SQL Tasks, so I'll skip right to the solution I had when doing this:
declare variables in the SQL with quotename('?'), then build dynamic SQL to execute the restore statement.
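A minimal sketch of that approach, assuming an OLE DB connection manager with two mapped ? parameters (database name and backup file path); the variable names and WITH options are illustrative:

DECLARE @DbName     sysname       = ?;
DECLARE @BackupFile nvarchar(260) = ?;
DECLARE @Sql        nvarchar(max);

-- Build the RESTORE statement dynamically, bracketing the database name with QUOTENAME
SET @Sql = N'RESTORE DATABASE ' + QUOTENAME(@DbName)
         + N' FROM DISK = N''' + REPLACE(@BackupFile, N'''', N'''''') + N''''
         + N' WITH FILE = 1, NOUNLOAD, STATS = 5;';

EXEC sys.sp_executesql @Sql;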
Hope it helps.

copy large table from sql server to odbc linked database (postgresql) in ssms

I'm trying to copy a whole database created in SQL Server to Postgres. I've been trying to write a script that can run in SSMS with a Postgres instance set up as a linked server. This will not be a one-off operation.
I've managed to create a script that creates most of the schema, i.e. tables, constraints, indexes etc. I do this by using the information_schema tables in SQL Server and formatting the data to form valid SQL for Postgres, then running an EXEC(@sql) AT POSTGRES statement, where POSTGRES is the linked server and @sql is a variable containing my statement. This is working fine.
Now I'm trying to insert the data using a statement like this:
INSERT INTO OPENQUERY(POSTGRES,'SELECT * FROM targettable')
SELECT *
FROM sourcetable
The statements are actually slightly modified in some cases to deal with different data types, but this is the idea. The problem is that when the table is particularly large, this statement fails with the error:
Msg 109, Level 20, State 0, Line 0
A transport-level error has occurred when receiving results from the server. (provider: Named Pipes Provider, error: 0 - The pipe has been ended.
I think the error is caused by either Postgres or SQL Server using too much memory generating the large statement. I've found that if I manually select only part of the data at a time, for instance the top 10000 rows, it works. But I don't know a way to write a general statement that selects only x rows at a time without being specific to the table I'm referencing.
Perhaps someone can suggest a better way of doing this, though. Keep in mind I do need to change some of the data before inserting it into Postgres, e.g. geospatial information is transformed to a string so Postgres will be able to interpret it.
Thanks!
I have transferred some large databases, and for PostgreSQL I see 2 ways:
export the data into a CSV file, convert the CSV file into PostgreSQL COPY format (see https://wiki.postgresql.org/wiki/COPY) and use COPY; the wiki page shows more alternatives, and there is a sketch after this list
make a Jython program that connects to both databases (Python is easy and Jython can work with JDBC drivers), make a SELECT from the source database (if you have a lot of data then use setFetchSize()), use a PreparedStatement with INSERT on the destination database and then dest_insert_stmt.setObject(i, src_rs.getObject(i))
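A minimal sketch of the COPY route (the path, table name and options are illustrative; the file must be readable by the PostgreSQL server, or use psql's \copy for a client-side file):

COPY targettable
FROM '/var/lib/postgresql/import/sourcetable.csv'
WITH (FORMAT csv, HEADER true, NULL '');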
I ended up using the OFFSET X ROWS FETCH NEXT Y ROWS ONLY syntax introduced in SQL Server 2012, so the complete statement looked like this:
INSERT INTO OPENQUERY(POSTGRES,'SELECT * FROM targettable')
SELECT *
FROM sourcetable
ORDER BY 1
OFFSET 0 ROWS FETCH NEXT 10000 ROWS ONLY
And everything works, with no error appearing! I actually iterate through the OFFSET value, adding 10,000 to it on every iteration, using dynamic SQL.
Not the cleanest or nicest solution. I think most people would be better off writing something in another language, as mentioned by Michal Niklas, but this worked for me.
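A rough sketch of that batching loop (the 10,000 batch size and the object names mirror the statement above; the total row count of the source table drives the loop):

DECLARE @Offset int = 0;
DECLARE @Batch  int = 10000;
DECLARE @Total  int = (SELECT COUNT(*) FROM sourcetable);
DECLARE @Sql    nvarchar(max);

WHILE @Offset < @Total
BEGIN
    -- Rebuild the INSERT ... OPENQUERY statement for the next slice of rows
    SET @Sql = N'INSERT INTO OPENQUERY(POSTGRES, ''SELECT * FROM targettable'') '
             + N'SELECT * FROM sourcetable ORDER BY 1 '
             + N'OFFSET ' + CAST(@Offset AS nvarchar(20)) + N' ROWS '
             + N'FETCH NEXT ' + CAST(@Batch AS nvarchar(20)) + N' ROWS ONLY;';
    EXEC sys.sp_executesql @Sql;
    SET @Offset += @Batch;
END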

SSIS Package: convert between unicode and non-unicode string data types

I am connecting to an Oracle DB and the connection works, but I get the following error for some of the columns:
Description: Column "RESOURCE_NAME" cannot convert between unicode
and non-unicode string data types.
The definition of RESOURCE_NAME:
For Oracle: VARCHAR2(200 BYTE)
For SQL Server: VARCHAR(200)
I can connect to the Oracle DB via Oracle SQL Developer without any issues. Also, I have the SSIS package setting Run64BitRuntime = False.
The Oracle data type VARCHAR2 appears to be equivalent to NVARCHAR in SQL Server, or DT_WSTR in SSIS. Reference
You will have to convert, either with the Data Conversion transformation or with the CAST or CONVERT functions in SQL Server.
If the package works on one machine and doesn't on another, try setting NLS_LANG to the right language, territory and character set, and test the package.
[Command Prompt]> set NLS_LANG=AMERICAN_AMERICA.WE8ISO8859P1
The easiest way around this is to open the SSIS package in Notepad (the .dtsx file) and do a global find and replace of all instances of validateExternalMetadata="True" with validateExternalMetadata="False".
Note: we encountered this issue when connecting to an Oracle 11g database on Linux through SSIS.
On the OLE DB source -> Advanced Editor -> Input and Output Properties -> Output Columns -> select the RESOURCE_NAME column and change the data type to DT_WSTR; you can also change the length as required.
You can use a SQL command in SSIS and use CONVERT or CAST. If SSIS still gives you an error, it's because of the metadata. Here is how you can fix it:
Open the Advanced Editor.
Under the Input and Output Properties, expand Source Output.
Expand Output Columns.
Select the column which is causing the issue.
Go to Data Type Properties and change the DataType to your desired type: DT_STR, DT_TEXT, etc.
You can just double-click on the "Data Conversion" block in the Data Flow and, for every item, change it to "Unicode string [DT_WSTR]".
Works.
If everything above failed: create a table variable and insert the data into it, then select all records from it as the source. Use SET NOCOUNT ON in the script.
I encountered a very similar problem, even using SQL Server rather than Oracle. In my case I was using a flat file as the data source, so I just went into the Flat File Connection Manager and manually changed the column type to be a Unicode string.
I don't know if this will fix your problem or not, but it helped me; hopefully someone else will be helped too. (I was inspired to try that by this previous answer to this question, BTW, just to give credit where credit's due.)

DATASTAGE - SQL Server from DataStage: load table with strange name

I have a SQL Server table whose name is like Vers-xxx_yyy.
As you can see, there is a "-" character in it.
I don't know why this table was made this way, but I have to load it from a DataStage job.
When I run my job, I get the error "table doesn't exist".
I use the ODBC stage.
Directly on SQL Server it is possible to use the syntax [Vers-xxx_yyy], but not in DataStage.
This DB already exists and it is used by other applications.
Is there a way to avoid/resolve the problem?
Try using double quotes around the table name. Also, it is good practice not to use a hyphen; you can use an underscore instead.
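For example (assuming the ODBC connection issues ANSI-quoted identifiers, which SQL Server accepts when QUOTED_IDENTIFIER is on):

SELECT * FROM "Vers-xxx_yyy"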
Try using a backslash \ to escape the - character: Vers\-xxx_yyy.
You should be able to put the table name in this form on the ODBC Connector too: [Vers-xxx_yyy]
Another solution would be to supply the SQL that queries this table yourself: SELECT * FROM [Vers-xxx_yyy]
