SQL Server - Encoding issue, replace strange characters

After importing some data into a SQL Server 2014 database, I realized that in some fields German characters (ü, ß, ä, ö, etc.) had been replaced with strange characters. For example:
MÃ¼nchen should be München
ChiemgaustraÃe should be Chiemgaustraße
KÃ¶nigstr should be Königstr
I would like to replace these characters with the correct German letters. For example:
Ã¼ -> ü
Ã -> ß
Ã¶ -> ö
However, when I run queries like the following to try to identify which rows contain these characters, they return 0 rows.
select address
from Directory
where street like N'%ChiemgaustraÃe 50%'
select address
from Directory
where street like N'%Ã¼%'
Is there a query I can run to identify and replace these characters?
I must clarify that most of the data was imported correctly, in fact I believe the strange characters were already part of the original data.
Also, I think I can export the data to a text file, replace the characters and re-import, but I was wondering if there is a way to do it directly in sql.
Thanks in advance for the help.

I couldn't get it fixed using only SQL.
FutbolFa's suggestion worked for the most part, but there were a couple of symbols, in particular "Ã", that weren't picked up by any query I tried. I ended up exporting the data to a text file and replacing the symbols there. Then I just re-imported the data.
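
For anyone taking the export-and-fix route: text such as MÃ¼nchen in place of München is typically what UTF-8 bytes look like after being decoded as Latin-1 or Windows-1252. If that is what happened to this data (an assumption; the thread does not confirm it), the damage can be reversed by turning the text back into bytes and decoding those bytes as UTF-8. It would also explain why "Ã" was hard to match on its own: ß is the UTF-8 byte pair C3 9F, and 9F is an invisible control character in Latin-1. A minimal Python sketch follows; the function name `fix_mojibake` is made up for illustration.

```python
def fix_mojibake(text: str) -> str:
    """Undo UTF-8 text that was wrongly decoded as Latin-1.

    Returns the input unchanged when it does not look like that kind of damage.
    """
    try:
        # Recover the original bytes, then decode them the way they were meant to be read.
        return text.encode("latin-1").decode("utf-8")
    except UnicodeError:
        # Either a character outside Latin-1 or bytes that are not valid UTF-8:
        # the value was probably imported correctly, so leave it alone.
        return text


if __name__ == "__main__":
    samples = [
        "M\u00c3\u00bcnchen",         # damaged "München"
        "Chiemgaustra\u00c3\u009fe",  # damaged "Chiemgaustraße" (second character is invisible)
        "K\u00c3\u00b6nigstr",        # damaged "Königstr"
        "M\u00fcnchen",               # already correct, must stay as it is
    ]
    for value in samples:
        print(repr(value), "->", fix_mojibake(value))
```

Because correctly imported values fail the round trip and are returned untouched, the function can be applied to every row of the exported file rather than only to the rows known to be damaged.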

Related

Encoding error reading Greek characters string form SQL database

I have a search form (with method GET) with only one text field named “search_field”. When a user submits the form, the characters the user typed are appended to the URL. For example, if the user types "blablabla", the generated URL will be something like this:
results.asp?search_field=blablabla
In my MSSQL 2012 database I have a table named “Products” with a column named “kodikos” in it.
I want to display all the records whose “kodikos” column contains the typed characters. My SQL select statement is the following:
"SELECT * FROM dbo.Products WHERE dbo.Products.kodikos LIKE '%' + ? + '%' "
(The question mark is the “search_field” parameter that contains the characters typed by the user.)
All of the above works perfectly and I am getting the correct results. The problem I am facing is with Greek characters. For example, when the user types “fff”, my code works and finds all the records containing “fff”. It also works with numbers. But if the user types the Greek characters “φφφ”, I get no results, even though there are a lot of records containing “φφφ”. The Greek characters are not recognized at all.
For your information:
On my local PC, with the same SQL Server version, the Greek characters are recognized correctly by my code, because my regional settings are set to Greek. But the same code on the hosting server in the US does not recognize them.
All of my pages have UTF-8 encoding.
Does anyone have an idea how to solve this issue?
SQL Server knows two encodings natively:
2-byte Unicode, i.e. UCS-2/UTF-16 (in most cases NVARCHAR)
extended ASCII in connection with a collation (in most cases VARCHAR)
I assume that the language you are calling this from uses 2-byte Unicode for normal strings. This is pretty usual today.
I also assume that your column Products.kodikos is of type NVARCHAR (2-byte Unicode). In this case it should help to force your search string to be 2-byte Unicode too. Try
LIKE N'%' + CAST(? AS NVARCHAR(MAX)) + N'%'
If your column is not 2-byte encoded, it might help to use COLLATE so that your search string is interpreted with a code page that knows your special characters.
If you pass this string into a SQL Server routine as-is, you should make sure that the accepting parameter is 2-byte Unicode too.
You have to make sure your search string is two-byte encoded by using the N'' notation.
For instance, the following query uses a string that is two byte encoded:
SELECT * FROM dbo.Products WHERE dbo.Products.kodikos LIKE N'%φφφ%'
But this query uses a string that is not two byte encoded (you won't get any results):
SELECT * FROM dbo.Products WHERE dbo.Products.kodikos LIKE '%φφφ%'
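
To see why the query without the N prefix finds nothing: a literal without N is converted to the code page of the database's collation before the comparison runs. Assuming the US server uses a Windows-1252 collation (the thread does not say), Greek letters have no slot in that code page and turn into question marks. The Python sketch below only mimics that conversion; `to_code_page` is a hypothetical helper, not a SQL Server function.

```python
def to_code_page(text: str, code_page: str = "cp1252") -> str:
    """Round-trip text through a single-byte code page, roughly as a VARCHAR literal would be."""
    # Characters that do not exist in the code page are replaced with '?'.
    return text.encode(code_page, errors="replace").decode(code_page)


if __name__ == "__main__":
    pattern = "%\u03c6\u03c6\u03c6%"        # the search pattern with three Greek phi letters
    print(to_code_page(pattern))            # Western European code page
    print(to_code_page(pattern, "cp1253"))  # Greek code page
```

With the Greek code page the pattern survives, which is consistent with the query working on a machine whose regional settings are Greek.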

_x000D_ appearing when importing into SQL

I am importing some Excel spreadsheets into a MS SQL Server. I load the spreadsheets, cleanse the data and then export it to SQL using Alteryx. Some files have text columns where the cells span multiple lines (i.e. with new line characters, like when you press ALT + ENTER in Excel). When I export the tables to SQL and then query the table, I see lots of '_x000D_' which are not in the original file.
Is it some kind of newline character encoding? How do I get rid of it?
I haven't been able to replicate the error. The original file contains some letters with accents (à, á, etc.); I created multi-line spreadsheets with accented letters, but I managed to export these to SQL just fine, with no '_x000D_'.
If these were CSV files I would think of character encoding, but Excel spreadsheets? Any ideas? Thanks!
I know this is old, but: if you're using Alteryx, just run it through the "Data Cleansing" tool as the last thing prior to your export to SQL. For the field in question, tell the tool to remove new lines by checking the appropriate checkbox.
If that still doesn't work: 0x000D is ASCII 13, the carriage return (hex D = decimal 13). So try running your data through a regular Formula tool and, for the [field] in question, use the expression Replace([field],CharFromInt(13),""), which should remove that character by replacing it with the empty string.
This worked for me:
REGEX_REPLACE([field],"_x000D_","")
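
`_x000D_` matches the `_xHHHH_` escape form that Excel's XML file format uses for characters it does not store literally; the hex value 000D is the carriage return that accompanies each in-cell line break. If the cleanup has to happen outside Alteryx, a small Python sketch could decode such escapes and then drop the carriage returns. The function names are illustrative only.

```python
import re

# Matches escapes such as _x000D_ (underscore, x, four hex digits, underscore).
_ESCAPE = re.compile(r"_x([0-9A-Fa-f]{4})_")


def decode_xml_escapes(text: str) -> str:
    """Replace every _xHHHH_ escape with the character it stands for."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


def strip_carriage_returns(text: str) -> str:
    """Decode the escapes, then remove carriage returns but keep line feeds."""
    return decode_xml_escapes(text).replace("\r", "")


if __name__ == "__main__":
    cell = "first line_x000D_\nsecond line"
    print(repr(decode_xml_escapes(cell)))
    print(repr(strip_carriage_returns(cell)))
```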

insert special character in my sql server database

ANSWER :
Sorry about this sort of question, guys. I assumed that it wouldn't work if I entered the special character directly into the string in my query, but it does. So all you need to do is locate the special character, copy it, and paste it into your query, and it works :)
folks,
QUESTION CHANGED:
I want to enter an extended-ASCII character into the database, the standard trademark symbol (®), using a direct query, and have it read correctly. How can I do this?
PREVIOUS QUESTION:
How can I enter a special character into a varchar column in SQL Server, namely ® (there is also a line below this symbol which I am unable to paste here), so that it is read correctly?
Also, I am unable to find the character sequence for that symbol. Is there any place where I can look it up?
The symbol is the standard ® symbol, raised to the top of the line, with a line below it just like an underscore.
Thanks
EDIT 1: I am talking about a direct query to the database.
You can use this T-SQL query:
INSERT INTO dbo.YourTable(UnicodeCol)
VALUES(nchar(0x00AE))
® is the Unicode character with code 0x00AE
But of course - since this is a Unicode character, the column you're inserting into must be of type NVARCHAR (not VARCHAR)
You can convert it to Unicode NCR format before you store it in the database, or just encode it with the relevant functions of the language you are using, like JavaScript's encodeURIComponent or PHP's urlencode.
You can put 'N' in front of the data.
This query might be helpful to you:
insert into product_master(product_name) values(N'कंप्यूटर')
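
Whether the N prefix and an NVARCHAR column are needed depends on whether the character exists in the column's code page. As a rough check outside the database, assuming a Windows-1252 collation (not stated in the thread), this Python sketch tests whether a string survives that code page. `fits_code_page` is a made-up helper name.

```python
def fits_code_page(text: str, code_page: str = "cp1252") -> bool:
    """Return True if every character of text exists in the given single-byte code page."""
    try:
        text.encode(code_page)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    registered = chr(0x00AE)  # the registered sign, same code as nchar(0x00AE)
    print(hex(ord(registered)), fits_code_page(registered))
    print(fits_code_page("\u0915\u0902\u092a"))  # first letters of the Hindi example
```

The registered sign fits, which matches the asker's finding that pasting it straight into the query worked; the Hindi text does not, so it needs N'...' and a Unicode column.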

Hyphen vs Dash : Replace Dash with Hyphen

Alright, so we had a problem recently.
In Reporting Services, some of the string columns were appearing as gibberish Chinese characters.
On further investigation we found the cause was the hyphen. Well, that's what we thought at first. Looking closer, we found it was a dash (an en dash). Basically, this happened because people copy and paste values into this column from Word, which automatically converts hyphens into dashes.
But if you look at the database, they both look the same, though on the application side you can see the difference.
How do I replace the dash with a normal hyphen?
If you copy the value and put it in SQL Server, a hyphen is gray while a dash is black,
but they both look exactly the same (i.e. not bigger or smaller). The problem is that I can't write a REPLACE script then (they look identical):
REPLACE ('-' with '-')
Is there a way special characters like the dash can be identified in SQL Server?
SQL Server v 2005
You can use NCHAR(8211) for the en dash, or NCHAR(8212) for the em dash.
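
The same code points can be used to tell the look-alike characters apart outside the database, for example when checking values before they are pasted in. This is a small Python sketch under that assumption; the helper names are invented for illustration.

```python
EN_DASH = chr(8211)  # U+2013, the same value as NCHAR(8211)
EM_DASH = chr(8212)  # U+2014, the same value as NCHAR(8212)

# Translation table that maps both dashes to the plain ASCII hyphen.
_TO_HYPHEN = str.maketrans({EN_DASH: "-", EM_DASH: "-"})


def find_dashes(text: str) -> list:
    """List (position, code point) for every en or em dash in text."""
    return [(i, f"U+{ord(ch):04X}") for i, ch in enumerate(text) if ch in (EN_DASH, EM_DASH)]


def normalize_dashes(text: str) -> str:
    """Replace en and em dashes with a normal hyphen."""
    return text.translate(_TO_HYPHEN)


if __name__ == "__main__":
    value = "AB" + EN_DASH + "12"
    print(find_dashes(value))
    print(normalize_dashes(value))
```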

SQL 2005 CSV Import Quote Delimited with inner Quotes and Commas

I have a CSV file with quotes as text delimiters. Most of the 90000 rows are fine, but I have a few rows in which a text field contains both a quote and a comma. For example, the field's value would be:
AB",AB
When delimited, this becomes
"AB"",AB"
When SQL 2005 attempts to import this I get errors such as...
Messages
Error 0xc0202055: Data Flow Task: The column delimiter for column "Column 4" was not found.
(SQL Server Import and Export Wizard)
This only seems to happen when a quote and comma are in a text value together. Values like
AB"AB which becomes "AB""AB"
or
AB,AB which becomes "AB,AB"
work fine.
Here are some example rows...
"1464885","LEVER WM","","B","MP17"
"1465075",":PLT-BC !!NOTE!!","","B",""
"1465076","BRKT-STR MTR !NOTE!","","B",""
"1465172",":BRKT-SW MTG !NOTE!","","B","MP16"
"1465388","BUSS BAR !NOTE!","","B","MP10"
"1465391","PLT-BLKHD ""NOTE""","","B","MP20"
"1465564","SPROCKET:13TEETH,74MM OD,66MM","ID W/.25"" SETSCR","B","MP6"
"S01266330002","CABLE:224"",E122/261,8 CO","","B","MP11"
The last row is an example of the problem - the "", causes the error.
I've had MAJOR problems with SSIS. Things that Access, Excel and even DTS seemed to do very well, SSIS chokes on. Variable record-length data is another problem but, yes, these embedded qualifiers are a major problem, especially if you do not have access to the import files because they're on someone else's server that you pay to gain access to, and they might even be 4 to 5 GB in size! You can't just do a "replace all" on that for every import.
You may want to look into a tool at Microsoft Downloads called "UnDouble"; there is also another workaround you might try.
It seems that with SSIS in SQL Server 2008, the bug is still there. I don't know why they haven't addressed this in the parser, but it's like we went back in time with SSIS in basic import functionality.
UPDATE 11-18-2010: This bug still exists in SSIS. Amazing.
How about just:
Search/replace all "", with ''; (fix all the broken fields)
Search/replace all ;''; with ,"", (to "unfix" properly empty fields.)
Search/replace all '';''; with "","", (to "unfix" properly empty fields which follow a correct encapsulation of embedded delimiters.)
That converts your original to:
"1464885","LEVER WM","","B","MP17"
"1465075",":PLT-BC !!NOTE!!","","B",""
"1465076","BRKT-STR MTR !NOTE!","","B",""
"1465172",":BRKT-SW MTG !NOTE!","","B","MP16"
"1465388","BUSS BAR !NOTE!","","B","MP10"
"1465391","PLT-BLKHD ""NOTE""","","B","MP20"
"1465564","SPROCKET:13TEETH,74MM OD,66MM","ID W/.25"" SETSCR","B","MP6"
"S01266330002","CABLE:224'';E122/261,8 CO","","B","MP11"
This seems to run the gauntlet fine in SSIS. You may have to apply step 3 repeatedly to account for 3 or more empty fields in a row ('';'';'';, etc.), but the bottom line here is that when you have embedded text qualifiers, you have to either escape them or replace them. Let this be a lesson for your CSV creation processes going forward.
Microsoft says doubled double quotes inside double-quote-delimited fields just don't work. A fix is planned for the end of 2011...
In the meantime, we will have to use workarounds like those described in the other answers.
I would just do a search/replace for ", and replace it with ,
Do you have access to the original file?
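
If pre-processing the file is an option, note that the failing row is well-formed under the usual CSV convention of doubling embedded quotes, so a parser that implements that convention can read it and re-emit it in a form the import handles. A minimal Python sketch using the standard csv module follows; the pipe separator and the helper names are assumptions chosen for illustration, not something the thread proposes.

```python
import csv
import io

# Two rows taken from the question, including the one that breaks the import.
SAMPLE = '''"1465564","SPROCKET:13TEETH,74MM OD,66MM","ID W/.25"" SETSCR","B","MP6"
"S01266330002","CABLE:224"",E122/261,8 CO","","B","MP11"
'''


def parse_rows(text: str) -> list:
    """Parse quote-delimited CSV; doubled quotes inside a field become one quote."""
    return list(csv.reader(io.StringIO(text)))


def to_delimited(rows: list, sep: str = "|") -> str:
    """Re-emit rows without text qualifiers, using a separator absent from the data."""
    lines = []
    for row in rows:
        if any(sep in field for field in row):
            raise ValueError(f"separator {sep!r} occurs in the data; pick another one")
        lines.append(sep.join(row))
    return "\n".join(lines)


if __name__ == "__main__":
    rows = parse_rows(SAMPLE)
    for row in rows:
        print(row)
    print(to_delimited(rows))
```

Re-emitting without text qualifiers sidesteps the quote handling entirely, at the cost of needing a separator that never appears in the data, which is why the sketch checks for it.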
