I have an EF code-first database. To populate the initial tables I am using SQL scripts (they are far easier to handle and update than the seed methods).
The problem is that the scripts are inserting without special characters....
The database collation is: SQL_Latin1_General_CP1_CI_AS
The seed is reading the script like this:
context.Database.ExecuteSqlCommand(File.ReadAllText(baseDir + @"..\..\Scripts\Oficina.sql"));
And the script looks like this:
INSERT [dbo].[Oficina] ([IdOficina], [Nombre], [SISTEMA], [ORDEN]) VALUES (20, N'Comisión Admisión', 1, 5)
The problem is that it is being saved in the database as:
Comisi�n Admisi�n
I have no clue what the problem could be.....any ideas?
I faced the same problem a while ago
public static void ExecuteBatchFromFile(this DataContext dc, String fileName, String batchSeparator,
    Encoding enc = null) {
    if (enc == null)
        enc = Encoding.UTF8;
    // ReadAllText (not ReadAllLines) returns a single string; pass the encoding explicitly
    String stSql = File.ReadAllText(fileName, enc);
    /* ... */
}
I solved it by adding the enc parameter to my function.
The key is to read the source file with the correct encoding. The collation is not necessarily the storage encoding; it primarily defines the rules used for comparison and sorting.
Check the table definition to see whether the data type is varchar or nvarchar. This question has been asked on the site before; here is a good explanation:
Which special characters are allowed?
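For reference, the symptom in the question can be reproduced outside C#: reading a file saved in Windows-1252 (the usual "ANSI" default) while assuming UTF-8 mangles every accented character. A small Python sketch of the same failure and fix (hypothetical file content, same characters as the question):

```python
# "Comisión" saved as Windows-1252 bytes, then read assuming UTF-8
data = "Comisión".encode("cp1252")            # b'Comisi\xf3n'

broken = data.decode("utf-8", errors="replace")
print(broken)   # Comisi�n  <- the symptom from the question

fixed = data.decode("cp1252")                 # read with the right encoding
print(fixed)    # Comisión
```

The equivalent fix in the C# above is simply passing the right Encoding to File.ReadAllText.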
Sorry if I'm using the wrong terminology - I'm not into MSSQL or VBScript.
I have a given script implementing a SQL query containing
.. AND (rep.is_primary_replica is null or rep.is_primary_replica = ''True'' or rep.is_primary_replica = ''False'') ..
which returns no results on a German server, apparently because rep.is_primary_replica is rendered as ''Wahr'' and ''Falsch'' instead of ''True'' and ''False''
This question is about an at least similar problem.
Unfortunately there is no wtf flag.
Is there a way to do that correctly?
Can I disable localization of string conversions (in MS SQL Server? In VBScript?)
Don't let printed values in the script confuse you. There is an implicit conversion from the original value to a string, and it depends on the client's locale (VBScript in your case, I think).
The field is_primary_replica must be of type bit if rep is an instance of the sys.dm_hadr_database_replica_states view.
If that is the case, it is pointless to check whether a bit field is NULL or equal to True or False; there is no other possibility anyway.
It appears that you can safely remove this condition from the query.
If you insist on including it, use number literals instead:
AND (rep.is_primary_replica is null or rep.is_primary_replica = 1 or rep.is_primary_replica = 0)
This is the proper way to query a bit field; this way, neither the server's nor the client's locale configuration will cause any problems.
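The locale-proofing effect of number literals can be demonstrated outside SQL Server too. A sketch using SQLite from Python (the table and column are stand-ins for the DMV in the question; SQLite stores bits as integers, much like a bit column):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rep (is_primary_replica INTEGER)")  # bit-like column
conn.executemany("INSERT INTO rep VALUES (?)", [(1,), (0,), (None,)])

# Comparing against a locale-dependent display string matches nothing
by_string = conn.execute(
    "SELECT COUNT(*) FROM rep WHERE is_primary_replica = 'True'").fetchone()[0]

# Number literals compare against the stored value itself
by_number = conn.execute(
    "SELECT COUNT(*) FROM rep WHERE is_primary_replica IS NULL"
    " OR is_primary_replica = 1 OR is_primary_replica = 0").fetchone()[0]

print(by_string, by_number)  # 0 3
```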
On a Linux machine, I am using PDO DBLIB to connect to an MSSQL database and insert data in a SQL_Latin1_General_CP1_CI_AS table. The problem is that when I am trying to insert chinese characters (multibyte) they are inserted as 哈市香åŠåŒºç 江路å·.
My (part of) code is as follows:
$DBH = new PDO("dblib:host=$myServer;dbname=$myDB;", $myUser, $myPass);
$query = "
INSERT INTO UserSignUpInfo
(FirstName)
VALUES
(:firstname)";
$STH = $DBH->prepare($query);
$STH->bindParam(':firstname', $firstname);
What I've tried so far:
Doing mb_convert_encoding to UTF-16LE on $firstname and CAST as VARBINARY in the query like:
$firstname = mb_convert_encoding($firstname, 'UTF-16LE', 'UTF-8');
VALUES
(CAST(:firstname AS VARBINARY));
Which results in the characters being inserted properly, until the input contains some non-multibyte characters, which break the PDO execute.
Setting my connection as utf8:
$DBH = new PDO("dblib:host=$myServer;dbname=$myDB;charset=UTF-8;", $myUser, $myPass);
$DBH->exec('SET CHARACTER SET utf8');
$DBH->query("SET NAMES utf8");
Setting client charset to UTF-8 in my freetds.conf
Which had no impact.
Is there any way at all, to insert multibyte data in that SQL database? Is there any other workaround? I've thought of trying PDO ODBC or even mssql, but thought it's better to ask here before wasting any more time.
Thanks in advance.
EDIT:
I ended up using the mssql extension and the N data-type prefix. I will switch to PDO_ODBC and try it when I have more time. Thanks everyone for the answers!
Is there any way at all, to insert multibyte data in [this particular] SQL
database? Is there any other workaround?
If you can switch to PDO_ODBC, Microsoft provides free SQL Server ODBC drivers for Linux (only for 64-bit Red Hat Enterprise Linux, and 64-bit SUSE Linux Enterprise) which support Unicode.
If you can change to PDO_ODBC, then the N-prefix for inserting Unicode is going to work.
If you can change the affected column from a varchar type to a Unicode type such as nvarchar, that would be ideal.
Your case is more restricted. This solution suits the case where you have mixed multibyte and single-byte characters in your input string, you need to save them to a Latin1 table, the N data-type prefix isn't working, and you don't want to move away from PDO_DBLIB (because Microsoft's Unicode-capable ODBC driver is barely supported on Linux). Here is one workaround.
Conditionally encode the input string as Base64. After all, that's how pictures are safely transported inline in emails.
Working Example:
$DBH = new PDO("dblib:host=$myServer;dbname=$myDB;", $myUser, $myPass);
$query = "
INSERT INTO [StackOverflow].[dbo].[UserSignUpInfo]
([FirstName])
VALUES
(:firstname)";
$STH = $DBH->prepare($query);
$firstname = "输入中国文字!Okay!";
/* First, check if this string has any Unicode at all */
if (strlen($firstname) != strlen(utf8_decode($firstname))) {
/* If so, change the string to base64. */
$firstname = base64_encode($firstname);
}
$STH->bindParam(':firstname', $firstname);
$STH->execute();
Then, to go the other way, you can test for Base64 strings and decode only those, without damaging your existing entries, like so:
while ($row = $STH->fetch()) {
$entry = $row[0];
if (base64_encode(base64_decode($entry , true)) === $entry) {
/* Decoding and re-encoding a true base64 string results in the original entry */
print_r(base64_decode($entry) . PHP_EOL);
} else {
/* Previous entries not encoded will fall through gracefully */
print_r($entry . PHP_EOL);
}
}
Entries will be saved like this:
Guan Tianlang
5pys6Kqe44KS5a2maGVsbG8=
But you can easily convert them back to:
Guan Tianlang
输入中国文字!Okay!
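The same round-trip logic, sketched in Python for clarity. This mirrors the PHP above and shares its caveat: a plain-ASCII value that happens to be valid Base64 would be decoded by mistake, so the detection is heuristic, not bulletproof.

```python
import base64

def encode_if_multibyte(s: str) -> str:
    # Only strings containing non-ASCII characters are converted
    return s if s.isascii() else base64.b64encode(s.encode("utf-8")).decode("ascii")

def decode_if_base64(s: str) -> str:
    # Decoding and re-encoding a true Base64 string reproduces it exactly
    try:
        raw = base64.b64decode(s, validate=True)
        if base64.b64encode(raw).decode("ascii") == s:
            return raw.decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        pass  # not Base64: an older, unencoded entry falls through gracefully
    return s

stored = encode_if_multibyte("输入中国文字!Okay!")   # becomes a Base64 string
print(decode_if_base64(stored))                      # 输入中国文字!Okay!
print(decode_if_base64("Guan Tianlang"))             # unchanged
```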
Collation shouldn't matter here.
Double-byte characters need to be stored in nvarchar, nchar, or ntext fields. You don't need to perform any casting.
The n data type prefix stands for National, and it causes SQL Server to store text as Unicode (UTF-16).
Edit:
PDO_DBLIB does not support Unicode, and is now deprecated.
If you can switch to PDO_ODBC, Microsoft provides free SQL Server ODBC drivers for Linux which support Unicode.
Microsoft - SQL Server ODBC Driver Documentation
Blog - Installing and Using the Microsoft SQL Server ODBC Driver for Linux
You can use a Unicode-compatible data type for the table column to support foreign languages (exceptions are shown in EDIT 2).
(char, varchar, text) Versus (nchar, nvarchar, ntext)
Non-Unicode :
Best suited for US English: "One problem with data types that use 1 byte to encode each character is that the data type can only represent 256 different characters. This forces multiple encoding specifications (or code pages) for different alphabets such as European alphabets, which are relatively small. It is also impossible to handle systems such as the Japanese Kanji or Korean Hangul alphabets that have thousands of characters."
Unicode
Best suited for systems that need to support at least one foreign language: "The Unicode specification defines a single encoding scheme for most characters widely used in businesses around the world. All computers consistently translate the bit patterns in Unicode data into characters using the single Unicode specification. This ensures that the same bit pattern is always converted to the same character on all computers. Data can be freely transferred from one database or computer to another without concern that the receiving system will translate the bit patterns into characters incorrectly."
Example :
I have also tried an example (the screenshots are not reproduced here); it is helpful for foreign-language insertion issues like the one in this question. The column is nvarchar, and it does support Chinese.
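Since the screenshots are not reproduced here, the single-byte limitation can be illustrated directly. A Python sketch, with code page 1252 standing in for any one-byte-per-character varchar collation:

```python
text = "输入"  # the Chinese sample from the question

# A single-byte code page has no slot at all for these characters
try:
    text.encode("cp1252")
    fits_in_cp1252 = True
except UnicodeEncodeError:
    fits_in_cp1252 = False

# UTF-16 (what nchar/nvarchar/ntext store) represents them fine
utf16_bytes = text.encode("utf-16-le")
print(fits_in_cp1252, len(utf16_bytes))  # False 4  (2 bytes per BMP character)
```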
EDIT 1:
Another related issue is discussed here
EDIT 2 :
Unicode unsupported scripts are shown here
Just use nvarchar, ntext, or nchar, and when you want to insert,
use
INSERT INTO UserSignUpInfo
(FirstName)
VALUES
(N'firstname');
The N prefix marks a Unicode character literal; it is a worldwide standard.
Ref :
https://aalamrangi.wordpress.com/2012/05/13/storing-and-retrieving-non-english-unicode-characters-hindi-czech-arabic-etc-in-sql-server/
https://technet.microsoft.com/en-us/library/ms191200(v=sql.105).aspx
https://irfansworld.wordpress.com/2011/01/25/what-is-unicode-and-non-unicode-data-formats/
This link explains inserting Chinese characters in MySQL: Can't insert Chinese character into MySQL.
You have to create the table with CHARACTER SET = utf8:
CREATE TABLE table_name (...) CHARACTER SET = utf8;
Use UTF-8 on the connection when you insert into the table:
SET NAMES utf8; INSERT INTO table_name (ABC) VALUES (VAL);
And create the database with CHARACTER SET utf8 COLLATE utf8_general_ci;
then you can insert Chinese characters into the table.
I have an old system that uses varchar datatype in its database to store Arabic names, now the names appear in the database like this:
"ãíÓÇÁ ÇáãÈíÖíä"
Now I am building a new system using VB.NET; how can I read these names so that they appear in Arabic characters?
I should also point out that even though the old system stores the data as shown above, it displays the characters in the correct format.
How to display it properly in the new system and in the SQL Server Management Studio?
Have you tried nvarchar? You may find some useful information at the link below:
When must we use NVARCHAR/NCHAR instead of VARCHAR/CHAR in SQL Server?
I faced the same problem, and I solved it in two steps:
1. Change the data type of the column in the DB to nvarchar.
2. Use encoding to convert the data back to Arabic.
I used the following function
private string GetDataWithArabic(string srcData)
{
    // Re-encode the garbled text back into its raw bytes via ISO-8859-1,
    // then reinterpret those bytes with the system default code page
    // (Windows-1256 on an Arabic system)
    Encoding iso = Encoding.GetEncoding("iso-8859-1");
    Encoding unicode = Encoding.Default;
    byte[] unicodeBytes = iso.GetBytes(srcData);
    return unicode.GetString(unicodeBytes);
}
But make sure you run this method only once on the DB data, because it will corrupt the data if applied twice.
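A hypothetical sketch of the same repair in Python, assuming the stored bytes are Windows-1256 (Arabic) that were misread as Latin-1; this is effectively what the function above does when Encoding.Default is the Arabic code page:

```python
mojibake = "ãíÓÇÁ"  # the first word from the question

# Turn the mojibake back into its raw bytes, then reinterpret as Arabic
raw = mojibake.encode("iso-8859-1")
repaired = raw.decode("windows-1256")
print(repaired)  # ميساء
```

As the answer warns, this must be applied exactly once; repairing already-correct text corrupts it.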
I think your answer is here: "storing and retrieving non english characters" http://aalamrangi.wordpress.com/2012/05/13/storing-and-retrieving-non-english-unicode-characters-hindi-czech-arabic-etc-in-sql-server/
I am using Access database for one system, and SQL server for another system. The data gets synced between these two systems.
The problem is that one of the fields in a table in Access database is a Memo field which is in double-byte format. When I read this data using DataGridView in a Windows form, the text is displayed as ???.
Also, when data from this field is inserted in sql server database nvarchar(max) field, non-English characters are inserted as ???.
How can I fetch data from memo field, convert its encoding to Unicode, so that it appears correctly in SQL server database as well?
Please help!!!
I have no direct experience with datagrid controls, but I have noticed that some database values are not displayed correctly through MS-Access controls. Uniqueidentifiers, for example, show as '?????' when displayed on a form. You can try this in the debug window, where the "myIdField" control is bound to the "myIdField" field of the underlying recordset (a uniqueidentifier field):
? screen.activeForm.recordset.fields("myIdField")
{F0E3C822-BEE9-474F-8A4D-445A33F363EE}
? screen.activeForm.controls("myIdField")
????
Here is what the Access Help says on this issue:
The Microsoft Jet database engine stores GUIDs as
arrays of type Byte. However, Microsoft Access can't return Byte data
from a control on a form or report. In order to return the value of a
GUID from a control, you must convert it to a string. To convert a
GUID to a string, use the StringFromGUID function. To convert a string
back to a GUID, use the GUIDFromString function.
So if you are extracting values from controls to update a table (either directly or through a recordset), you might face similar issues...
One solution would be to update the data directly from the recordset's original value. Another option would be to open the original recordset with a query containing the necessary conversion instructions, so that the field is displayed correctly through the control.
What I usually do in a similar situation, where I have to manipulate uniqueidentifier fields from multiple data sources (MS-Access and SQL Server, for example), is to 'standardize' these fields as text in the recordsets. The recordsets are then built with queries such as:
SQL Server
"SELECT convert(nvarchar(36),myIdField) as myIdField, .... FROM .... "
MS-Access
"SELECT stringFromGUID(myIdField) as myIdField, .... FROM .... "
I solved this issue by converting the encoding as follows:
//Define Windows 1252, Big5 and Unicode encodings
System.Text.Encoding enc1252 = System.Text.Encoding.GetEncoding(1252);
System.Text.Encoding encBig5 = System.Text.Encoding.GetEncoding(950);
System.Text.Encoding encUTF16 = System.Text.Encoding.Unicode;
byte[] arrByte1 = enc1252.GetBytes(note); //string to be converted
byte[] arrByte2 = System.Text.Encoding.Convert(encBig5, encUTF16, arrByte1);
string convertedText = encUTF16.GetString(arrByte2);
return convertedText;
Thank you all for pitching in!
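The repair above can be sketched in Python as well, under the assumption that the memo field's Big5 bytes were misread as Windows-1252 (this only works when every byte survives the 1252 round trip; a few byte values are undefined in that code page):

```python
mojibake = "¤¤¤å"  # the Big5 bytes of "中文" misread as Windows-1252

raw = mojibake.encode("cp1252")   # recover the original bytes
repaired = raw.decode("big5")     # reinterpret them as Big5
print(repaired)  # 中文
```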
I have a very large CSV file that I imported into a sqlite table. There are over 50 columns and I'd like to find all rows where any of the columns are null. Is this even possible? I'm just trying to save myself the time of writing out all of the 50 different columns in a where clause. Thanks.
It's an interesting question, but it's probably quicker to write a quick script that converts the copy/pasted header row from your CSV into the appropriate SQL.
For instance this works in LINQPad (C#)
void Main()
{
    string input = "adasda|sadasd|adasd|";
    char delim = '|';

    // Join the non-empty column names with " IS NULL OR ",
    // so there is no trailing separator to trim off afterwards
    var columns = input.Split(delim).Where(s => !String.IsNullOrEmpty(s));
    string sql = "SELECT * FROM table WHERE "
        + String.Join(" IS NULL OR ", columns) + " IS NULL";
    sql.Dump();
}
No, not without querying the table schema (SQLite has no DESCRIBE TABLE; PRAGMA table_info is the equivalent) or an intermediate technology.
Your best bet would be to declare the columns DEFAULT NULL and re-import the data. But depending on the CSV import and the column types, you may still get empty strings rather than NULLs in the columns.
Sucks, but it is probably quicker to just copy and paste the SQL commands. The script would be reusable.
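If generating the clause from the schema rather than from the CSV header is acceptable, SQLite's PRAGMA table_info makes it short. A Python sketch (the table and columns are placeholders for the 50-column import):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a, b, c)")  # stand-in for the imported table
conn.executemany("INSERT INTO t VALUES (?, ?, ?)",
                 [(1, None, 3), (1, 2, 3)])

# Pull the column names from the schema instead of typing them out
cols = [row[1] for row in conn.execute("PRAGMA table_info(t)")]
where = " OR ".join(f'"{c}" IS NULL' for c in cols)
rows = conn.execute(f"SELECT * FROM t WHERE {where}").fetchall()
print(rows)  # [(1, None, 3)]
```

Note this finds real NULLs only; empty strings from the CSV import would need an extra `OR "col" = ''` term per column.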