SQL Server 2019 CHARINDEX returns weird result

When I run the following query in SQL Server 2019, the result is 1, whereas it should be 0.
select CHARINDEX('αρ', 'αυρ')
What could be the problem?

As was mentioned in the comments, it may be because you have not declared your string literals as Unicode strings but are using Unicode characters in them. SQL Server will convert the strings to another code page and do a bad job of it. Try running this query to see the difference.
SELECT 'αρ', 'αυρ', N'αρ', N'αυρ'
On my server, this gives the following output:
a? a?? αρ αυρ
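To check which collation (and therefore which code page) those non-Unicode literals are being converted with, you can query the current database's collation (a small sketch using the built-in DATABASEPROPERTYEX function):
SELECT DATABASEPROPERTYEX(DB_NAME(), 'Collation') AS DatabaseCollation;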
Another issue is that CHARINDEX uses the collation of the input, which is probably not set correctly in this instance. You can force a collation by setting it on one of the inputs. It is also possible to set it at the instance, database, and column level.
There are different collations that may be applicable. These have different features; for example, some are case sensitive and some are not. Also, not all collations are installed with every SQL Server instance. It is worth running SELECT * FROM sys.fn_helpcollations() to see the descriptions of all the installed ones.
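For example, to list only the Greek collations available on an instance, you could filter that result set (a sketch):
SELECT name, description
FROM sys.fn_helpcollations()
WHERE name LIKE 'Greek%';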
If you change your query to this, you should get the result you are looking for (0, since N'αρ' does not occur in N'αυρ'):
SELECT CHARINDEX(N'αρ' COLLATE Greek_BIN, N'αυρ')

Related

IS NOT NULL AS ... - Convert Query from MySQL to MS-SQL

I have an application that is working fine using a MySQL/MariaDB-Database.
I did make it more flexible and now I am basically able to use a Microsoft SQL-Server database.
I found out, that some SQL-queries do NOT work anymore.
I don't have experience with MS-SQL and am looking for support to convert the following query to make it work with MS-SQL. It would be great if the query could be converted to work in both MS-SQL and MySQL ...
I have created an SQL-Fiddle with some example-data.
Link: http://sqlfiddle.com/#!18/5fb718/2
The Query itself looks like this:
SELECT computermapping.PrinterGUID, computerdefaultprinter.PrinterGUID IS NOT NULL AS isDefaultPrinter
FROM computermapping
LEFT JOIN computerdefaultprinter ON computerdefaultprinter.ComputerGUID = computermapping.ComputerGUID
AND computerdefaultprinter.PrinterGUID = computermapping.PrinterGUID
WHERE computermapping.ComputerGUID = "5bec3779-b002-46ba-97c4-19158c13001f"
When I run this on SQL-Fiddle I get the following error:
Incorrect syntax near the keyword 'IS'.
When I run this query in Microsoft SQL Server Management Studio I get the same error. I have a German installation, so the message reads:
Meldung 156, Ebene 15, Status 1, Zeile 1
Falsche Syntax in der Nähe des IS-Schlüsselworts.
(In English: Msg 156, Level 15, State 1, Line 1 - Incorrect syntax near the 'IS' keyword.)
I was looking on the Internet for information on how to use IS NOT NULL AS in MS-SQL. Maybe I was using the wrong keywords, but I was not able to find a solution myself.
If it does matter, I am using "SQL-Server 2014 SP3" at the moment.
Thank you
Convert
computerdefaultprinter.PrinterGUID IS NOT NULL AS isDefaultPrinter
to
CASE
    WHEN computerdefaultprinter.PrinterGUID IS NOT NULL THEN 1
    ELSE 0
END AS isDefaultPrinter
Also bear in mind that there is no BOOLEAN type in SQL Server; the BIT type is used instead.
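For example, the flag above would typically end up in a BIT variable or column (a minimal sketch):
DECLARE @isDefaultPrinter bit = 1; -- BIT holds 0, 1, or NULL
SELECT @isDefaultPrinter AS isDefaultPrinter;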
Finally
WHERE computermapping.ComputerGUID = "5bec3779-b002-46ba-97c4-19158c13001f"
should be converted to
WHERE computermapping.ComputerGUID = '5bec3779-b002-46ba-97c4-19158c13001f'
since the single quote character is used to delimit strings in SQL Server.
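Putting the two fixes together, the full query for SQL Server might look like this (a sketch based on the tables shown in the question):
SELECT computermapping.PrinterGUID,
    CASE WHEN computerdefaultprinter.PrinterGUID IS NOT NULL THEN 1 ELSE 0 END AS isDefaultPrinter
FROM computermapping
LEFT JOIN computerdefaultprinter ON computerdefaultprinter.ComputerGUID = computermapping.ComputerGUID
    AND computerdefaultprinter.PrinterGUID = computermapping.PrinterGUID
WHERE computermapping.ComputerGUID = '5bec3779-b002-46ba-97c4-19158c13001f'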
MySQL evaluates boolean expressions like:
computerdefaultprinter.PrinterGUID IS NOT NULL
as 1 for TRUE or 0 for FALSE.
SQL Server does not do such an evaluation, so you need a CASE expression:
CASE WHEN computerdefaultprinter.PrinterGUID IS NOT NULL THEN 1 ELSE 0 END
The exact question was already answered, but you should really, really try using the SQL Server Migration Assistant for MySQL. No database supports the SQL standard beyond a basic compatibility level and MySQL is one of the quirkiest databases when it comes to SQL compatibility.
The SQL standard process is sloooow so all vendors implement features long before they're standardized. Different databases have different priorities too, and MySQL's priority for the first decade at least wasn't enterprise applications, so SQL compatibility wasn't a high priority.
Some common features were added only in MySQL 8. Some hacks allowed (but discouraged) in MySQL, like non-aggregate columns in a grouping query, or quirky updates to calculate row numbers, don't work in any other databases because logically, they lead to unpredictable results.
Even in MySQL, non-aggregate columns can cause serious performance degradation when upgrading from one 5.x version to the next. There's at least one SO question about this.
The Migration Assistant will convert tables and types where possible and flag any stored procedure or view queries that can't be translated. It can also copy data from an existing MySQL database to a SQL Server database.

Uniform SQL Date in Oracle SQL and SQL Server

Like the title indicates, I'm trying to make a uniform query with compatible syntax for both Oracle SQL and SQL Server.
The current syntax I have is written in Oracle and is:
SELECT *
FROM TableA
WHERE CREATION_DATE = TO_DATE('12-09-2015', 'DD-MM-YYYY')
When I try to build this in SQL Server, this doesn't work. The syntax of SQL Server is
SELECT *
FROM TableA
WHERE OrderDate='12-09-2015'
But this doesn't work in Oracle SQL...
So what is the uniform way to write this?
I am not sure if SQL Server supports ANSI DATE literal, but it is supported in Oracle. The default string literal format is YYYY-MM-DD.
DATE '2015-09-12'
so, if ANSI standard is supported in SQL Server too, then use it. It is simple.
Based on this link, I think you could use the above in both databases.
WHERE OrderDate='12-09-2015'
Never do that. You are comparing a DATE with a STRING; you might just be lucky enough to get correct data depending on your locale-specific NLS settings. Never rely on implicit data type conversion.
Use this for something which works on both:
CAST('2015-09-14' AS DATE)
The cast is executed only once, so it has no effect on performance.
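Applied to the query from the question, a version that should run on both engines might then look like this (a sketch; the column name is taken from the Oracle query above):
SELECT *
FROM TableA
WHERE CREATION_DATE = CAST('2015-09-12' AS DATE)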

Data Type Conversion from SQL Server to Oracle

I am currently moving a product from SQL Server to Oracle. I am semi-familiar with SQL Server and know nothing about Oracle, so I apologize if the mere presence of this question offends anyone.
Inferring from this page, http://download.oracle.com/docs/cd/E12151_01/doc.150/e12156/ss_oracle_compared.htm, it would seem that the data type conversion from SQL Server to Oracle should be:
REAL = FLOAT(24) -> FLOAT(63)
FLOAT(p) -> FLOAT(p)
TIMESTAMP -> NUMBER
NVARCHAR(n) -> VARCHAR(n*2)
NCHAR(n) -> CHAR(n*2)
Here are my questions regarding them:
For FLOAT, considering that FLOAT(p) -> FLOAT(p), wouldn't it also mean that FLOAT -> FLOAT(24)?
For TIMESTAMP, since Oracle also has its own version of it, wouldn't it be better that TIMESTAMP -> TIMESTAMP?
Finally, for NVARCHAR(n) and NCHAR(n), I thought the issue would be regarding Unicode. Then, again, since Oracle provides its own version of both, wouldn't it make more sense that NVARCHAR(n) -> NVARCHAR(n) and NCHAR(n) -> NCHAR(n)?
It would be much appreciated if someone were to elaborate on the previous 3 matters.
Thanks in advance.
It appears that Oracle's CHAR and VARCHAR2 (always use VARCHAR2 instead of VARCHAR) already support Unicode - the document you've linked to advises converting to those from the SQL Server NCHAR and NVARCHAR datatypes.
The SQL Server TIMESTAMP isn't actually a timestamp at all - it's an identifier based on the time that's used only to indicate that a row has changed - and it can't be converted back into any kind of DATETIME (at least not in any way that I know of).
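To illustrate, here is a minimal sketch (table and column names are hypothetical) of what the type really is - an automatically maintained binary(8) row-version stamp, typically used for optimistic concurrency checks rather than for storing a point in time:
CREATE TABLE dbo.OrdersDemo (
    OrderId int PRIMARY KEY,
    Status nvarchar(20),
    RowVer rowversion -- regenerated automatically on every INSERT/UPDATE of the row
);
-- e.g. UPDATE dbo.OrdersDemo SET Status = N'Shipped'
-- WHERE OrderId = 1 AND RowVer = @LastSeenRowVer -- @LastSeenRowVer is hypothetical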
For FLOAT, the default FLOAT(126) (126 binary digits of precision) would be enormous - since the developer tools automatically map SQL Server's FLOAT to Oracle's FLOAT(53), why not use that amount?
This is more FYI than an answer to your question, but you're potentially going to run into a particularly painful difference between SQL Server and Oracle. In SQL Server, you can define a string column (of whatever flavor) to not allow NULL values, and then insert zero-length (aka "blank") strings into that column, because SQL Server does not consider a blank string to be the same as a NULL.
Oracle does consider a blank string to be the same as a NULL, so Oracle will not let you insert blank values into a NOT NULL column. This obviously causes problems when copying data from a table in SQL Server into its counterpart table in Oracle. Your choices for dealing with this are:
Set the offending string column in Oracle to allow NULL values (not a good idea)
When copying the data, replace the blank strings with something else (I have no idea what you should use here)
Skip the offending rows and pretend you never saw them
I'd love to think that Oracle's choice to consider blank strings to be NULL (where they're alone among the major DBs) was to lock customers into their platform, but this one actually works in the opposite direction. You can move a database from Oracle to something else without the blank=NULL difference causing any problems.
See this earlier question: Oracle considers empty strings to be NULL while SQL Server does not - how is this best handled?
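A quick way to see the difference for yourself (a sketch; run each statement on its respective engine):
-- SQL Server: the empty string is a real value, distinct from NULL
SELECT CASE WHEN '' IS NULL THEN 'treated as NULL' ELSE 'real value' END; -- 'real value'
-- Oracle: the same test reports the empty string as NULL
SELECT CASE WHEN '' IS NULL THEN 'treated as NULL' ELSE 'real value' END FROM dual; -- 'treated as NULL'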
The following is not a direct answer to your question, but it is worth taking a look at the sqlteam blog:
sqlteam - Datatypes translation between Oracle and SQL Server part 2: number
It has detailed explanation about how to handle numbers, etc.

SQL Server Collation / ADO.NET DataTable.Locale with different languages

We have a WinForms app which stores data in SQL Server (2000; we are working on porting it to 2008) through ADO.NET (1.1, working on porting to 4.0). Everything works fine if I read data previously written in a Western European locale (e.g. "test", "test ù"), but now we have to be able to mix Western and non-Western alphabets as well (e.g. "test - ۓےۑ" - these are just random Arabic chars).
On the SQL Server side, the database has been set up with the Latin1_General collation and the field is an nvarchar(80). If I run a SQL SELECT statement (e.g. "SELECT * FROM MyTable WHERE field = 'test - ۓےۑ'"; don't mind the "*" or the actual names) from Query Analyzer, I get no results; the same happens if I pass the SQL statement to an ADO.NET DataAdapter to fill a DataTable. My guess is that it has something to do with collation, but I don't know how to correct this: do I have to change the collation (SQL Server) to a different one? Or do I have to set the locale on the DataAdapter/DataTable (ADO.NET)?
Thanks in advance to anyone who will help
Shouldn't you use N when comparing nvarchar with an extended character set?
SELECT * From TestTable WHERE GreekColCaseInsensitive = N'test - ۓےۑ'
Yes, the problem is most likely the collation. The Latin1_General collation does not include the rules to sort and compare non-Latin characters.
MSDN claims:
If you must store character data that reflects multiple languages, you can minimize collation compatibility issues by always using the Unicode nchar, nvarchar, and ntext data types instead of the char, varchar, text data types. Using the Unicode data types eliminates code page conversion issues.
Since you have already complied with this, you should read further on the info about Mixed Collation Environments here.
Additionally, I want to add that changing a collation is not something done easily; check the MSDN for SQL 2000:
When you set up SQL Server 2000, it is important to use the correct collation settings. You can change collation settings after running Setup, but you must rebuild the databases and reload the data. It is recommended that you develop a standard within your organization for these options. Many server-to-server activities can fail if the collation settings are not consistent across servers.
You can specify a collation on a per-column basis, however:
CREATE TABLE TestTable (
    id int,
    GreekColCaseInsensitive nvarchar(10) COLLATE Greek_CI_AS,
    LatinColCaseSensitive nvarchar(10) COLLATE Latin1_General_CS_AS
)
Have a look at the different binary multilingual collations here. Depending on the charset you use, you should find one that fits your purpose.
If you are not able or willing to change the collation of a column you can also just specify the collation to be used in the query like:
SELECT * From TestTable
WHERE GreekColCaseInsensitive = N'test - ۓےۑ'
COLLATE latin1_general_cs_as
As jfrobishow pointed out, the use of N in front of the string you want to compare is essential. What does it do:
It denotes that the subsequent string is in Unicode (the N actually stands for National language character set). Which means that you are passing an NCHAR, NVARCHAR or NTEXT value, as opposed to CHAR, VARCHAR or TEXT. See Article #2354 for a comparison of these data types.
You can find a quick rundown here.

SQL Server 2008 - Difference between collation types

I'm installing a new SQL Server 2008 server and am having some problems getting any usable information regarding the different collations. I have searched the SQL Server BOL and googled for an answer but can't seem to find any usable information.
1. What is the difference between the Windows collations "Finnish_Swedish_100" and "Finnish_Swedish"? I suppose that the "_100" version is an updated collation in SQL Server 2008, but what has changed from the older version if that is the case?
2. Is it usually a good thing to have "Accent-sensitive" enabled? I know that it depends on the task and all that, but are there any well-known pros and cons to consider?
3. The "Binary" and "Binary-code point" parameters - in which cases should these be enabled?
The _100 suffix indicates a collation new in SQL Server 2008; those with _90 are for 2005, and those with no suffix are for 2000. I don't know what the differences are and can't find any documentation. Unless you are doing linked-server queries to another SQL Server of a different version, I'd be tempted to go with the _100 one. Sorry I can't help with the differences.
The letters ÅÄÖ/åäö are not mixed in with A and O just by setting the collation to AI (Accent Insensitive). That is, however, true for â and other "combinations" that are not part of the Swedish alphabet as individual letters; â will mix or not mix depending on the setting in question.
Since I have a lot of old databases I still need to communicate with, also using linked servers, I chose Finnish_Swedish_CI_AS now that I'm installing SQL 2008. That was the default setting for Finnish_Swedish when the Windows collations first appeared in SQL Server.
Use the query below to try it out yourself.
As you can see, å, ä, etc. do not count as accented characters, and are sorted according to the Swedish alphabet when using the Finnish/Swedish collation.
However, the accents are only considered if you use the AS collation. For the AI collation, their order is unchanged, as if there was no accent at all.
CREATE TABLE #Test (
    Number int identity,
    Value nvarchar(20) NOT NULL
);
GO
INSERT INTO #Test VALUES ('àá');
INSERT INTO #Test VALUES ('áa');
INSERT INTO #Test VALUES ('aa');
INSERT INTO #Test VALUES ('aà');
INSERT INTO #Test VALUES ('áb');
INSERT INTO #Test VALUES ('ab');
-- w is considered an accented version of v
INSERT INTO #Test VALUES ('wa');
INSERT INTO #Test VALUES ('va');
INSERT INTO #Test VALUES ('zz');
INSERT INTO #Test VALUES ('åä');
GO
SELECT Number, Value FROM #Test ORDER BY Value COLLATE Finnish_Swedish_CI_AS;
SELECT Number, Value FROM #Test ORDER BY Value COLLATE Finnish_Swedish_CI_AI;
GO
DROP TABLE #Test;
GO
To address question 3 (info taken off the MSDN; wording theirs, format mine):
Binary (_BIN):
Sorts and compares data in SQL Server tables based on the bit patterns defined for each character.
Binary sort order is case-sensitive and accent-sensitive.
Binary is also the fastest sorting order.
If this option is not selected, SQL Server follows sorting and comparison rules as defined in dictionaries for the associated language or alphabet.
Binary-code point (_BIN2):
For Unicode data: Sorts and compares data in SQL Server tables based on Unicode code points.
For non-Unicode data: will use comparisons identical to binary sorts.
The advantage of using a Binary-code point sort order is that no data resorting is required in applications that compare sorted SQL Server data. As a result, a Binary-code point sort order provides simpler application development and possible performance increases.
For more information, see Guidelines for Using BIN and BIN2 Collations.
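As a quick illustration of the binary behavior (a sketch; Latin1_General_BIN is one of the standard installed collations):
SELECT CASE WHEN 'abc' = 'ABC' COLLATE Latin1_General_BIN
    THEN 'equal' ELSE 'different' END; -- 'different', since binary comparison is case-sensitive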
To address your question 2: accent sensitivity is a good thing to have enabled for Finnish-Swedish. Otherwise your "å"s and "ä"s will be sorted as "a"s, and "ö"s as "o"s. (Assuming you will be using those kinds of international characters.)
More here: http://msdn.microsoft.com/en-us/library/ms143515.aspx (discusses both binary codepoint and accent sensitivity)
On Questions 2 and 3
Accent Sensitivity is something I would suggest turning OFF if you are accepting user data, and ON if you have clean, sanitized data.
Not being Finnish myself, I don't know how many words there are that are different depending on the ó ô õ or ö that they have in them. But if there are users entering data, you can be sure that they will NOT be consistent in their usage, and you want to be able to match them.
If you are gathering data from a dataset that you know the content of, and know the consistency of, then you will want to turn Accent Sensitivity ON because you know that the differences are purposeful.
The same questions apply when considering question 3. (I'm mostly getting this from the link Tomalak provided.) If the data is case and accent sensitive, then you want _BIN, because it will sort faster. If the data is irregular and not case/accent sensitive, then you will want _BIN2, because it is designed for Unicode data.
To address question 2:
Yes, if accents are grammatically required in the given language.
