Pentaho insert operation error due to a collation - sql-server

I have a Pentaho transformation that executes a procedure in one database and then inserts the resulting rows into another database.
The source is a SQL Server 2008 R2 database (not UTF-8; collation Latin_general_ci), and the destination is a PostgreSQL database with UTF-8 charset.
When I run the ETL, it throws an error while attempting to execute the following insert statement:
INSERT INTO aux (name, account, id, state) VALUES ( 'UAN 5 BAR ','01082082R','UY903847JDNF','BAJA')
As you can see, the name value contains some unknown characters. Pentaho shows them as square glyphs, exactly 6 of them. If I copy the insert statement from the log, the row breaks at this point, so I understand that it is a line break or something similar.
I solved this for other rows where the hidden characters were different, but I cannot solve it in this case. Furthermore, I would like a solution that covers all possible charset problems.
Does anyone know how to solve this?
For the other hidden characters I solved it by applying cast(name as binary), but that does not work in this case.
EDIT
The value 'UAN 5 BAR ' has unknown characters right after the word BAR, up to the closing quote character '.
By unknown I mean weird characters that I cannot identify.

Related

IBM DB2 values displayed as utf-8 text

When I connect to the database (DB2) and check the values in the tables, values containing special characters are displayed as their UTF-8 text value:
I expected to see the correct value instead: Tükörfúrógép.
I am still able to handle the value properly, but is there any configuration in the db that I am missing to display the value properly when checking the table?
More Info:
Connected to DB with Intellij and also tried with DbVisualizer.
The following JDBC connection was used in intellij:
jdbc:db2://(...)?characterEncoding=UTF-8;
I tried both with and without characterEncoding and got the same results.
DB Version: v11 LUW
JDBC: com.ibm.db2.jcc -- db2jcc4 -- Version 10.5
Encoding being used: UTF-8
db2 "select char(value,10), char(name,10) from sysibmadm.dbcfg where name like 'code%'"

1          2
---------- ----------
1208       codepage
UTF-8      codeset

  2 record(s) selected.
UPDATE 1:
I was able to insert values with special characters directly into the database, so I am starting to think this is not a missing DB2 configuration but maybe a JDBC or other related issue.
In a UTF-8 database, the string Tükörfúrógép should have the following hex representation:
54C3BC6BC3B67266C3BA72C3B367C3A970.
But you have the following instead, with a repeating garbage byte sequence (83C2):
54C383C2BC6BC383C2B67266C383C2BA72C383C2B367C383C2A970
You may try to remove that byte sequence manually with the following statements, but it's better to understand the root cause of the garbage in this column.
VALUES REPLACE (x'54C383C2BC6BC383C2B67266C383C2BA72C383C2B367C383C2A970', x'83C2', '');
SELECT REPLACE (TOWN, x'83C2', '') FROM ...;
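The garbage pattern in the hex above is what you get when UTF-8 bytes are mis-read as a single-byte encoding and then encoded as UTF-8 a second time. The following Python sketch reproduces both hex strings under the assumption that the intermediate encoding was Latin-1 (ISO-8859-1); the source does not say where in the JDBC/client chain this happened, so treat it as an illustration of the mechanism rather than a diagnosis.

```python
# Assumption: the text was UTF-8 encoded, mis-read as Latin-1, then
# UTF-8 encoded again. The intermediate encoding is a guess.
original = "Tükörfúrógép"

# What a UTF-8 database should store.
correct_bytes = original.encode("utf-8")

# What ends up stored after the accidental second encoding.
double_encoded = correct_bytes.decode("latin-1").encode("utf-8")

print(correct_bytes.hex().upper())
print(double_encoded.hex().upper())

# Repair by reversing the two steps.
repaired = double_encoded.decode("utf-8").encode("latin-1").decode("utf-8")
print(repaired == original)

# Byte-level equivalent of the REPLACE(..., x'83C2', '') statement above.
stripped = double_encoded.replace(b"\x83\xc2", b"")
print(stripped == correct_bytes)
```

Removing the 83C2 bytes only works here because every non-ASCII character in this string starts with the byte C3. Reversing the two encoding steps is the more general repair.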

Bulk import only works with 1 column csv file

Whenever I try to import a CSV file with more than one column into SQL Server, nothing is imported. I know the file is terminated correctly because the import works with one column if I modify the file and table. I am limiting the rows so it never reaches the end of the file, and the line terminator is the correct one (as shown by the import working with only one column).
All I get is this and no errors
0 rows affected
I've also checked the various other questions like this, and they all point to a bad end of file or line terminator, but all is well here...
I have tried quotes and no quotes. For example, I have a table with 2 columns of varchar(max).
I run:
bulk insert mytable from 'file.csv' WITH (FIRSTROW=2,lastrow=4,rowterminator='\n')
My sample file is:
name,status
TEST00040697,OK
TEST00042142,OK
TEST00042782,OK
TEST00043431,BT
If I drop a column from the table and delete the second column in the CSV, keeping the same \n line terminator, it works just fine.
I have also tried specifying the 'errorfile' parameter, but it never seems to write anything or even create the file.
Well, that was embarrassing.
SQL Server, in its wisdom, uses \t as the default field terminator even for a CSV file. I guess when the documentation says 'FORMAT = 'CSV'' it's an example and not the default.
If only it produced actual proper and useful error messages...
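To see why a tab default matters, here is a small Python sketch that parses the sample file from the question with each delimiter. It uses Python's csv module only as an analogy for a delimiter-based parser and does not reproduce SQL Server's actual parsing. In BULK INSERT itself, the corresponding option is FIELDTERMINATOR (for example FIELDTERMINATOR=',').

```python
import csv
import io

# The sample file from the question.
SAMPLE = (
    "name,status\n"
    "TEST00040697,OK\n"
    "TEST00042142,OK\n"
    "TEST00042782,OK\n"
    "TEST00043431,BT\n"
)


def fields_per_row(text, delimiter):
    """Return how many fields each line yields for the given delimiter."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return [len(row) for row in reader]


# With a tab delimiter every line is a single field, so a two-column
# table never receives a second value.
print(fields_per_row(SAMPLE, "\t"))

# With a comma delimiter every line splits into the expected two fields.
print(fields_per_row(SAMPLE, ","))
```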

How to insert XML into SQL Server when it contains escaped invalid characters

I'm trying to insert some XML into a SQL Server database table which uses column type XML.
This works fine most of the time, but one user submitted some XML with the character with hex value 3, and SQL Server gave the error "hexadecimal value 0x03, is an invalid character."
Now I want to check, and remove, any invalid XML characters before doing the insert, and there are various articles suggesting how invalid XML characters can be replaced using regex or something similar.
However, the problem for me is that the user submitted the XML document with the invalid character escaped, i.e. "&#x3;", and none of the methods I've found will detect this. This is also why the error was not detected earlier: the problem only occurs when inserting it into the SQL database.
Has anyone written a function that will check for all escaped invalid XML characters? I suppose the character above could have been written as &#3; or &#x03;, or lots of other ways, so it's quite hard to catch them all.
Thanks in advance for any help you can offer.
You could try importing the XML into a temporary varchar(max) variable or table column and using REPLACE to strip out the offending characters, then insert the cleansed string into the destination, CASTing it to XML.
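If the cleansing is done in application code before the insert, one possible approach is to match every numeric character reference, decode it, and drop it when the code point falls outside the XML 1.0 Char production. The following is a minimal Python sketch under that assumption; the function names are hypothetical, and it does not handle raw (unescaped) invalid characters or references that appear literally inside CDATA sections.

```python
import re

# Numeric character references: decimal (&#3;) or hexadecimal (&#x03;).
_CHAR_REF = re.compile(r"&#(?:x([0-9a-fA-F]+)|([0-9]+));")


def _is_valid_xml_char(code_point):
    """True if the code point is allowed by the XML 1.0 Char production."""
    return (
        code_point in (0x9, 0xA, 0xD)
        or 0x20 <= code_point <= 0xD7FF
        or 0xE000 <= code_point <= 0xFFFD
        or 0x10000 <= code_point <= 0x10FFFF
    )


def strip_invalid_char_refs(xml_text):
    """Remove character references that resolve to invalid XML characters."""

    def replace(match):
        hex_digits, dec_digits = match.groups()
        code_point = int(hex_digits, 16) if hex_digits else int(dec_digits)
        # Keep valid references untouched; drop invalid ones entirely.
        return match.group(0) if _is_valid_xml_char(code_point) else ""

    return _CHAR_REF.sub(replace, xml_text)


print(strip_invalid_char_refs("<note>bad&#x3;char and &#03; too</note>"))
print(strip_invalid_char_refs("<note>&#x41; and &amp; stay</note>"))
```

Because each reference is decoded to a number before the check, variants with leading zeros or in decimal form are caught by the same rule rather than by a list of literal strings.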

String or binary data would be truncated, even though the string is shorter than column length

I had a table with a column which is a VARCHAR(4). In SQL Server (Right-click > Design) I changed this to VARCHAR(10).
Now when I try to update the value from AAAA to BBBBBB, it gives me the error message:
String or binary data would be truncated. The statement has been terminated.
Is there anything else than the column's length in VARCHAR(10) that might be causing this error?
Some things you need to confirm:
Is the column actually resized (does it register as varchar(10))?
Are you updating the correct table?
Are you updating the correct database?
Check in Profiler the update command that is sent to the database; maybe your client does something to it.
Are there any other string fields in your update? Maybe one of them is too long.
Are you updating the correct field with BBBBBB?

INSERT Query SQL (Error converting data type nvarchar to (null))

I'm trying to run an INSERT query but it asks me to convert varchar to null. Here's the code:
INSERT Runtime.dbo.History (DateTime, TagName, vValue)
VALUES ('2015-09-10 09:00:00', 'ErrorComment', 'Error1')
Error message:
Error converting data type nvarchar to (null).
The problem is at the vValue column.
column vValue(nvarchar, null)
How it looks in the database:
The values inside vValue are placed by the program I'm using. I'm just trying to manually insert into the database.
Last post was with the wrong column, I apologize.
After contacting Wonderware support I found out that INSERT is not supported on the vValue column by design. It's a string value, and updates are supposed to be carried out via the StringHistory table.
What is the type of the value column in the database?
If it's float, you should insert a number, not a string.
Casting "Error1" to FLOAT makes no sense.
A float is a number, for example: 1.15, 12.00, 150.15.
When you try to CAST "Error1" to float, the server tries to convert the text "Error1" to a number and it can't, which is logical.
You should insert a number into the column.
I think I can help you with your problem since I've got a decent test environment to experiment with.
Runtime.dbo.History is not a table you can interact with directly; it is a view. In our case here the view is defined as:
select * from [INSQL].[Runtime].dbo.History
...Which I believe implies the History data you are viewing is from the Historian flat file storage itself, a Wonderware Proprietary system. You might see some success if you expand the SQL Server Management Studio's
Server Objects -> Linked Servers -> INSQL
...and play with the data there but I really wouldn't recommend it.
With that said, for what reason do you need to insert tag history? There might be other workarounds for the purpose you need.

Resources