Incorrect encoding of Cyrillic characters by bindValue - sql-server

I have a DB on MS SQL 2008 R2. I have installed the following software:
Ubuntu 16
Apache 2
PHP 7
Symfony 3.0 (with leaseweb/doctrine-pdo-dblib)
Windows Server 2008 R2 SP1
MS SQL Server 2008 R2 SP3 (with Cyrillic_General_CI_AS - I can't change the collation)
In my project, there is a controller with the following code:
$em = $this->getDoctrine()->getManager();
$connection = $em->getConnection();
$sql="
select
isnull(ID,'') as 'id'
,isnull(SNAME,'') as 'sname'
,isnull(FNAME,'') as 'fname'
from TBL1
where SNAME like '%'+:sname+'%' ";
$sql = iconv('UTF-8','Windows-1251', $sql);
$statement = $connection->prepare($sql);
$statement->bindValue(':sname', iconv('UTF-8', 'Windows-1251', $request->request->get('sname')), 'text');
$statement->execute();
In SQL Server Profiler I see the string 0xc8e2e0edeee2 instead of Иванов: the parameter is sent as a binary literal (these are the Windows-1251 codes).
If I don't use the code page conversion,
$statement->bindValue(':sname', $request->request->get('sname'), 'text');
I see the string 0xd098d0b2d0b0d0bdd0bed0b2 (the UTF-8 codes) in SQL Server Profiler.
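For reference, both hex strings are just the byte sequences of Иванов in the two encodings, which can be checked with a small PHP snippet (assuming the source file itself is saved as UTF-8):
echo bin2hex(iconv('UTF-8', 'Windows-1251', 'Иванов')); // c8e2e0edeee2
echo bin2hex('Иванов');                                 // d098d0b2d0b0d0bdd0bed0b2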
If I don't use bindValue and put the variable directly into the SQL, as shown below,
$sname = $request->request->get('sname');
$em = $this->getDoctrine()->getManager();
$connection = $em->getConnection();
$sql="
select
isnull(ID,'') as 'id'
,isnull(SNAME,'') as 'sname'
,isnull(FNAME,'') as 'fname'
from TBL1
where SNAME like '%'+'".$sname."'+'%' ";
$sql = iconv('UTF-8','Windows-1251', $sql);
$statement = $connection->prepare($sql);
$statement->execute();
I see the correct string Иванов in SQL Server Profiler.
If the variable contains a string with only Latin characters, there isn't any problem:
I send the string Ivanov and I see the string Ivanov.
What is the problem?

The problem is in the quoter function dblib_handle_quoter in ext\pdo_dblib\dblib_driver.c, which is called for each string parameter. It converts any string containing a character with a code outside the 32-127 range into a binary string.
So if Иванов is not converted to Windows-1251, it becomes 0xd098d0b2d0b0d0bdd0bed0b2 - its UTF-8 bytes sent as a binary string (mojibake, something like Рванов).
SQL Server then implicitly converts this binary value to a string using:
single-byte conversion (with the CP1251 code page in your case, I guess), if the expected data type is char or varchar. Here you need to convert parameters to Windows-1251.
Unicode conversion, if the expected data type is nchar or nvarchar. Here you need to convert parameters to UCS-2LE.
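You can see this implicit conversion directly; a quick illustration (my addition, assuming a database with the Cyrillic_General_CI_AS collation):
select convert(varchar(6), 0xc8e2e0edeee2)   -- single-byte CP1251 decoding: Иванов
select convert(nvarchar(3), 0xc8e2e0edeee2)  -- UCS-2LE decoding: three unrelated characters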
That's why there is no other choice (at least for now) than to use iconv on the parameters.
P.S. If you set the connection charset correctly, you shouldn't need to convert $sql itself, as that conversion should be done by dblib.
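For example (a sketch only; the exact settings depend on your FreeTDS build), the client charset can be set in freetds.conf, and pdo_dblib also accepts a charset element in the DSN:
; freetds.conf
[sqlserver]
host = sql.server.host.name
tds version = 7.1
client charset = UTF-8
// or via the DSN:
$dbh = new PDO('dblib:host=sqlserver;dbname=DataBase;charset=UTF-8', $user, $password);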
I use freetds-1.00.13 with PHP 7 via pdo_dblib like this (the column data types are varchar):
function pdo_params(...$params){
    // re-encode every parameter from UTF-8 to the server's single-byte code page
    foreach ($params as &$v){
        $v = iconv('UTF-8', 'Windows-1251', $v);
    }
    unset($v);
    return $params;
}
$dsn = 'dblib:dbname=DataBase;host=sql.server.host.name';
<...>
$dbh = new PDO($dsn, $user, $password,array(PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION));
$stmt = $dbh->prepare('INSERT INTO table (a,b) VALUES (?,?)');
$stmt->execute(pdo_params("Test",$msg));

Related

Storing binary with JDBCTemplate

I have a table like:
create table test(payload varbinary(max))
I am trying to store text lines in compressed format in the database using the following code:
String sql = "insert into test(payload) values (compress(:payload))";
MapSqlParameterSource msps = new MapSqlParameterSource();
msps.addValue("payload", "some text", Types.VARBINARY);
NamedParameterJdbcTemplate npjt = //;
npjt.update(sql, msps);
This gives the following error -
String is not in a valid hex format
If I provide the datatype in MapSqlParameterSource as VARCHAR, it doesn't give any error, but then MSSQL's DECOMPRESS function returns a garbage value:
select decompress(payload) from test
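A hedged sketch of one likely fix (my addition, not from the original thread): pass the payload as raw bytes so the driver sends a genuine varbinary parameter; names mirror the snippet above:
import java.nio.charset.StandardCharsets;
import java.sql.Types;
import org.springframework.jdbc.core.namedparam.MapSqlParameterSource;

// send the text as bytes, so Types.VARBINARY matches the actual value type
MapSqlParameterSource msps = new MapSqlParameterSource();
msps.addValue("payload", "some text".getBytes(StandardCharsets.UTF_8), Types.VARBINARY);
npjt.update("insert into test(payload) values (compress(:payload))", msps);
Note also that DECOMPRESS returns varbinary(max), so hex output in SSMS is expected until you cast it back: select cast(decompress(payload) as varchar(max)) from test.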

Writing Unicode from R to SQL Server

I'm trying to write Unicode strings from R to SQL, and then use that SQL table to power a Power BI dashboard. Unfortunately, the Unicode characters only seem to work when I load the table back into R, and not when I view the table in SSMS or Power BI.
require(odbc)
require(DBI)
require(dplyr)
con <- DBI::dbConnect(odbc::odbc(),
.connection_string = "DRIVER={ODBC Driver 13 for SQL Server};SERVER=R9-0KY02L01\\SQLEXPRESS;Database=Test;trusted_connection=yes;")
testData <- data_frame(Characters = "❤")
dbWriteTable(con,"TestUnicode",testData,overwrite=TRUE)
result <- dbReadTable(con, "TestUnicode")
result$Characters
Successfully yields:
> result$Characters
[1] "❤"
However, when I pull that table in SSMS:
SELECT * FROM TestUnicode
I get two different characters:
Characters
~~~~~~~~~~
â¤
Those characters are also what appear in Power BI. How do I correctly pull the heart character outside of R?
It turns out this is a bug somewhere in R/DBI/the ODBC driver. The issue is that R stores strings as UTF-8 encoded, while SQL Server stores them as UTF-16LE encoded. Also, when dbWriteTable creates a table, it by default creates a VARCHAR column for strings which can't even hold Unicode characters. Thus, you need to both:
Change the column in the R data frame from being a string column to a list column of UTF-16LE raw bytes.
When using dbWriteTable, specify the field type as being NVARCHAR(MAX)
This seems like something that should still be handled by either DBI or ODBC or something though.
require(odbc)
require(DBI)
# This function takes a string vector and turns it into a list of raw UTF-16LE bytes.
# These will be needed to load into SQL Server
convertToUTF16 <- function(s){
lapply(s, function(x) unlist(iconv(x,from="UTF-8",to="UTF-16LE",toRaw=TRUE)))
}
# create a connection to a sql table
connectionString <- "[YOUR CONNECTION STRING]"
con <- DBI::dbConnect(odbc::odbc(),
.connection_string = connectionString)
# our example data
testData <- data.frame(ID = c(1,2,3), Char = c("I", "❤","Apples"), stringsAsFactors=FALSE)
# we adjust the column with the UTF-8 strings to instead be a list column of UTF-16LE bytes
testData$Char <- convertToUTF16(testData$Char)
# write the table to the database, specifying the field type
dbWriteTable(con,
"UnicodeExample",
testData,
append=TRUE,
field.types = c(Char = "NVARCHAR(MAX)"))
dbDisconnect(con)
Inspired by the last answer and the GitHub issue r-dbi/DBI#215 (Storing unicode characters in SQL Server):
I followed field.types = c(Char = "NVARCHAR(MAX)"), but built a vector of field types and computed the maximum length instead, because NVARCHAR(MAX) triggered the error dbReadTable/dbGetQuery returns Invalid Descriptor Index ...:
vector_nvarchar <- c(Filter(Negate(is.null),
  lapply(testData, function(x){
    if (is.character(x)) c(
      names(x),
      paste0("NVARCHAR(",
        max(
          # nvarchar(max) gave the "Invalid Descriptor Index" error on SQL Server
          # (https://github.com/r-dbi/odbc/issues/112), so we compute the max length
          nchar(
            iconv( # nchar doesn't count bytes for UTF-8 strings: see help(nchar)
              Filter(Negate(is.null), x),
              "UTF-8", "ASCII", sub = "x"
            )
          ),
          na.rm = TRUE
        ),
        ")"
      )
    )
  })
))
con <- DBI::dbConnect(odbc::odbc(), .connection_string = xxxxt, encoding = 'UTF-8')
DBI::dbWriteTable(con, "UnicodeExample", testData, overwrite = TRUE, append = FALSE, field.types = vector_nvarchar)
DBI::dbGetQuery(con, iconv('select * from UnicodeExample'))
Inspired by the last answer, I also tried to find an automated way of writing data frames to SQL Server. I cannot confirm the nvarchar(max) errors, so I ended up with these functions:
library(rlist)  # provides list.cbind, used below

convertToUTF16_df <- function(df){
  output <- cbind(
    df[sapply(df, typeof) != "character"],
    list.cbind(apply(df[sapply(df, typeof) == "character"], 2, function(x){
      return(lapply(x, function(y) unlist(iconv(y, from = "UTF-8", to = "UTF-16LE", toRaw = TRUE))))
    }))
  )[colnames(df)]
  return(output)
}
field_types <- function(df){
output <- list()
output[colnames(df)[sapply(df, typeof) == "character"]] <- "nvarchar(max)"
return(output)
}
DBI::dbWriteTable(odbc_connect
, name = SQL("database.schema.table")
, value = convertToUTF16_df(df)
, overwrite = TRUE
, row.names = FALSE
, field.types = field_types(df)
)
I found the previous answer very useful but ran into problems with character vectors that had another encoding such as 'latin1' instead of UTF-8. This resulted in random NULLs in the database column due to special characters such as non-breaking spaces.
In order to avoid these encoding issues, I've made the following modifications to detect the character vector encoding or otherwise default back to UTF-8 before conversion to UTF-16LE:
library(rlist)
convertToUTF16_df <- function(df){
  output <- cbind(
    df[sapply(df, typeof) != "character"],
    list.cbind(apply(df[sapply(df, typeof) == "character"], 2, function(x){
      return(lapply(x, function(y) {
        # strings with unknown encoding are assumed to be UTF-8;
        # otherwise convert from the declared encoding
        if (Encoding(y) == "unknown") {
          unlist(iconv(enc2utf8(y), from = "UTF-8", to = "UTF-16LE", toRaw = TRUE))
        } else {
          unlist(iconv(y, from = Encoding(y), to = "UTF-16LE", toRaw = TRUE))
        }
      }))
    }))
  )[colnames(df)]
  return(output)
}
field_types <- function(df){
output <- list()
output[colnames(df)[sapply(df, typeof) == "character"]] <- "nvarchar(max)"
return(output)
}
DBI::dbWriteTable(odbc_connect
, name = SQL("database.schema.table")
, value = convertToUTF16_df(df)
, overwrite = TRUE
, row.names = FALSE
, field.types = field_types(df)
)
Ideally, I'd still modify this to remove the rlist dependency, but it seems to work for now.
You could consider using the package RODBC instead of odbc/DBI. I have used RODBC with SQL Server and with Microsoft Access as permanent data storage systems, and I never had trouble with German umlauts (e.g. Ä, ä, ..., ß).
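A minimal sketch of that approach (my addition; the connection string and varTypes values are illustrative):
library(RODBC)
ch <- odbcDriverConnect("DRIVER={ODBC Driver 13 for SQL Server};SERVER=myserver;Database=Test;trusted_connection=yes;")
# varTypes plays the same role as DBI's field.types: force an NVARCHAR column
sqlSave(ch, testData, tablename = "UnicodeExample", rownames = FALSE,
        varTypes = c(Characters = "nvarchar(max)"))
odbcClose(ch)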
I wonder whether iconv is an appealing alternative, as there seem to be some '\x00' issues (e.g. https://www.r-bloggers.com/2010/06/more-powerful-iconv-in-r/).
I am posting this answer as an extension to the top answer, because some people might find it useful.
If you need Unicode strings in SQL statements such as INSERT or UPDATE, where you cannot use dbWriteTable(), you can construct your query with dbBind() like this:
x <- "äöü"
x <- iconv(x, from="UTF-8", to="UTF-16LE", toRaw = TRUE)
q <- "
update foobar
set umlauts = ?
where id = 1
"
query <- DBI::dbSendStatement(con, q)
DBI::dbBind(query, list(x))
DBI::dbClearResult(query)

Invalid parameter number error in CodeIgniter

I am working on a process in CodeIgniter to take a user-uploaded image (managed using the CI upload library) and insert it into a varbinary(max) field in a SQL Server database. My controller and model code are as follows.
if($this->upload->do_upload($upload_name)) {
//get temp image
$tmpName = $config['upload_path'] . $config['file_name'];
// Read it into $data variable
$fp = fopen($tmpName, 'rb');
$data = fread($fp, filesize($tmpName));
fclose($fp);
//insert into DB
$this->the_model->storeImage($data, $user_id);
//delete temp image
unlink($config['upload_path'] . $config['file_name']);
}
/***** Function from the_model ************/
function storeImage($image_data, $user_id) {
$my_db = $this->load->database('admin');
$stmt = "INSERT INTO my_table (UserID, ImageData) VALUES (" . $my_db->escape($user_id) . ", " . $my_db->escape($image_data) . ")";
$insert = $my_db->query($stmt);
return $insert;
}
This all seems like it should be OK, but when I run the code I get the error:
Fatal error: Uncaught exception 'PDOException' with message
'SQLSTATE[HY093]: Invalid parameter number: mixed named and positional parameters'
in {my app path}\helpers\mssql_helper.php on line 213
I've done some googling on this error message, and the results seem to indicate it is caused by a colon character in the $data value being sent to the model, making the DB layer think I am trying to pass a named parameter when I am not. However, I haven't been able to find any reports that match my specific use case or that have much information on how to correct the error.
I'd appreciate any insight on where I might be tripping up.
$image_data is a binary string. ->escape may not work on it, since it may escape random bytes in it, leaving you with a corrupted image. Also, the binary string may contain quote characters (or other characters) that make your query invalid.
Try to encode the binary string as hex before inserting into MySQL. You can use PHP's bin2hex for this.
$escaped_user_id = $my_db->escape($user_id);
$hex_image = bin2hex($image_data);
$stmt = "INSERT INTO my_table (UserID, ImageData) VALUES ({$escaped_user_id}, X'{$hex_image}')";
The X in X'{$hex_image}' is how MySQL handles literal hex strings: http://dev.mysql.com/doc/refman/5.1/en/hexadecimal-literals.html
If that doesn't work, you can also try UNHEX().
$escaped_user_id = $my_db->escape($user_id);
$hex_image = bin2hex($image_data);
$stmt = "INSERT INTO my_table (UserID, ImageData) VALUES ({$escaped_user_id}, UNHEX('{$hex_image}'))";
EDIT: I didn't notice you were using MSSQL and not MySQL. My bad. In MSSQL, you can insert literal hex strings with a 0x prefix.
$escaped_user_id = $my_db->escape($user_id);
$hex_image = bin2hex($image_data);
$stmt = "INSERT INTO my_table (UserID, ImageData) VALUES ({$escaped_user_id}, 0x{$hex_image})";
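If the helper is PDO-based (the error message suggests it is), a hedged alternative sketch is to bind the image as a parameter instead of splicing it into the SQL string; $pdo here is a hypothetical handle to the underlying PDO connection:
$stmt = $pdo->prepare("INSERT INTO my_table (UserID, ImageData) VALUES (?, ?)");
$stmt->bindValue(1, $user_id, PDO::PARAM_INT);
// PARAM_LOB sends the raw bytes without string quoting, so no hex round-trip is needed
$stmt->bindValue(2, $image_data, PDO::PARAM_LOB);
$stmt->execute();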

Maximum length of string which can be returned from stored proc in SQL Server 2008 to .net apps

I am returning a static string from a stored procedure (in SQL Server 2008) as below:
select 'abcdefgh.........xyz'
If the static string length exceeds some limit (e.g. 8 KB), then only a partial string (e.g. 7 KB) is returned to the .NET app.
Though I tried different approaches, such as assigning the static string to a varchar(max) variable and selecting the variable, it still returns only a partial string.
I need to return the complete string, which could be up to 5 MB. So, the main concerns are:
What is the max string length I can return from a stored procedure?
How do I return a 5 MB string from a stored procedure to a .NET app?
I'd appreciate any help resolving this issue.
Please find the code below:
using (SqlCommand command = new SqlCommand(Source.GetExportRecordSP, Connection))
{
command.CommandType = CommandType.StoredProcedure;
command.Parameters.Add(new SqlParameter("@CandidateRecordID", SqlDbType.NVarChar, 32)).Value = record;
try
{
if (Connection.State != ConnectionState.Open)
{
Connection.Open();
}
using (SqlDataReader reader = command.ExecuteReader())
{
if(reader.Read())
{
xmlRecord = new XmlDocument();
xmlRecord.LoadXml(reader.GetString(0));
}
}
}
catch (Exception Ex)
{
Logging.WriteError(string.Format("Error while retrieving the Record \"{0}\" details from Database. Exception: {1}", record, Ex.ToString()));
throw;
}
}
Thanks in advance geeks.
Since you appear not to be using an OLEDB connection (which has an 8k limit), I think the problem is in your procedure code.
Or, perhaps, the compatibility version of your database is set to something other than SQL Server 2008 (SQL Server 2000 could not return more than 8k using GetString()).
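To rule that out, you can check the database's compatibility level (a quick sanity check, my addition):
select name, compatibility_level from sys.databases where name = db_name();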
Thanks for the support. I found a fix for this at
http://www.sqlservercentral.com/Forums/Topic350590-145-1.aspx
The fix is to declare a variable, initialize it to an empty string, and concatenate it with the main string:
DECLARE @test varchar(MAX);
SET @test = '';
SELECT @test = @test + '<Invoice>.....'
If the string length is under 8000 characters, it works without the above approach.
Thanks all.

How to insert a file into sql-server via tiny_tds?

In a data importing script:
client = TinyTds.Client.new(...)
insert_str = "INSERT INTO [...] (...) VALUES (...)"
client.execute(insert_str).do
So far so good.
However, how can I insert a .pdf file into the varbinary field (SQL Server 2000)?
I've recently had the same issue, and using ActiveRecord was not really suited to what I wanted to do...
So, without using ActiveRecord:
client = TinyTds.Client.new(...)
data = "0x" + File.open(file, 'rb').read.unpack('H*').first
insert_str = "INSERT INTO [...] (...) VALUES (... #{data})"
client.execute(insert_str).do
To send proper varbinary data, you need to read the file, convert it to a hexadecimal string with unpack('H*').first, and prepend '0x' to the result.
Here is PHP-MSSQL code to save binary data:
mssql_query("SET TEXTSIZE 2147483647",$link);
$sql = "UPDATE UploadTable SET UploadTable_Data = ".varbinary_encode($data)." WHERE Person_ID = '".intval($p_id)."'";
mssql_query($sql,$link) or
die('cannot upload_resume() in '.__FILE__.' on line '.__LINE__.'.<br/>'.mssql_get_last_message());
function varbinary_encode($data=null) {
$encoded = null;
if (!is_null($data)) {
$a = unpack("H*hex", $data);
$encoded = "0x";
$encoded .= $a['hex'];
}
return $encoded;
}
Here is PHP-MSSQL code to get binary data:
mssql_query("SET TEXTSIZE 2147483647",$link);
$sql = "SELECT * FROM UploadTable WHERE ID = 123";
$db_result = mssql_query($sql,$link);
// work with result like normal
I ended up using ActiveRecord:
require 'rubygems'
require 'tiny_tds'
require 'activerecord-sqlserver-adapter'
..
my_table.create(:file_name => "abc.pdf", :file_data => File.open("abc.pdf", "rb").read)
For SQL Server 2000 support, use the 2.3.x version of the activerecord-sqlserver-adapter gem.
