Oracle database encoding and decoding - database

I'm running an SQL file from a shell script (.sh). The file contains an UPDATE statement, but one of the columns does not get the expected value. I tried converting the file encoding to UTF-8 with Notepad++, but the issue is still there.
Below are the current value of the column and the correct one:
current (issue):
R��servation
required:
Réservation

You must set the encoding of your shell to match the NLS_LANG value.
On Windows it would be for example:
C:\>chcp 65001
Active code page: 65001
C:\>set NLS_LANG=.AL32UTF8
C:\>sqlplus ...
See also OdbcConnection returning Chinese Characters as "?"
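The garbled value in the question is consistent with UTF-8 bytes being read by something that expects a different character set. A minimal Python sketch of that effect follows. The choice of plain ASCII as the mismatched decoder is an assumption made for illustration only; the question does not say which character set was actually in effect.

```python
# Illustration only: which wrong decoder was really involved is an assumption.
text = "Réservation"

# In UTF-8 the letter é is stored as two bytes: 0xC3 0xA9.
utf8_bytes = text.encode("utf-8")

# A reader that cannot interpret those two bytes replaces each of them
# with U+FFFD, the replacement character.
garbled = utf8_bytes.decode("ascii", errors="replace")

print(utf8_bytes)                  # b'R\xc3\xa9servation'
print(garbled == "R\ufffd\ufffdservation")   # True
print(utf8_bytes.decode("utf-8"))  # Réservation
```

When both sides agree on the encoding (the last line), the value round-trips unchanged, which is what matching the shell code page to NLS_LANG is meant to achieve.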

Related

En Dash in file path batch job query

Hoping someone can help.
I have a batch job (Windows env) which simply copies a file to another folder.
copy "\\ACP-MS-NAS21\Global\MEC Daily Productivity\Business Analysts\Master_List\HCP_Master_List.xlsx" ^
"\\ACP-MS-NAS21\Global\CSD [?] DWP Medical Services\CSL_CSD_DB\Master_List"
But I get the following error:
The filename, directory name, or volume label syntax is incorrect.
I can see there's an en dash in the file path which I believe is causing the problem.
Is there any way to have a wildcard in the file path or any other way the job can recognise this?
Thanks in advance.
PS. newbie at batch programming, coding etc, so please can explanations be in plain English. Many thanks
Command-line windows (and, by extension, BAT files) operate in an OEM code page by default. Which code page exactly is determined by your OS settings (Language for non-Unicode programs or similar). Therefore you cannot normally use anything outside ASCII or whatever that code page contains.
Save script as UTF-8
To use those characters you will need to work in a code page that actually has them. UTF-8 is the best one for this task (and probably the only one that will work in your case).
First, save your script as UTF-8. In Notepad you can select the encoding in the Save As dialog.
If your editor does not let you select UTF-8 without BOM, leave the first line of your BAT file blank, because some editors prefix the file with a special header called a BOM that helps detect the encoding. If yours does and you leave the first line blank, you will get a Bad command or file name error as soon as the script starts, but this won't prevent it from running properly.
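To check whether an editor added a BOM, it can help to look at the first three bytes of the saved file. A small Python sketch of that check follows; the helper names `has_utf8_bom` and `strip_utf8_bom` are hypothetical and exist only for this illustration.

```python
import codecs

def has_utf8_bom(data: bytes) -> bool:
    """Return True if the bytes start with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)

def strip_utf8_bom(data: bytes) -> bytes:
    """Return the bytes without a leading UTF-8 BOM, if one is present."""
    return data[len(codecs.BOM_UTF8):] if has_utf8_bom(data) else data

# Python's "utf-8-sig" codec writes a BOM, plain "utf-8" does not.
with_bom = "CHCP 65001\r\n".encode("utf-8-sig")
without_bom = "CHCP 65001\r\n".encode("utf-8")

print(has_utf8_bom(with_bom), has_utf8_bom(without_bom))  # True False
print(strip_utf8_bom(with_bom) == without_bom)            # True
```

For a real script you would pass the result of reading the file in binary mode to `has_utf8_bom`.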
Select UTF-8 codepage
Now your script is in UTF-8, but the Windows command processor will still execute it as if it were in the default OEM code page, thus corrupting all special characters. To specify our encoding we need to add the following command, preferably as the first non-blank line of your script, before any comments (those may contain non-ASCII characters, with unpredictable results):
CHCP 65001
CHCP changes the current code page to 65001, which is the code page number for UTF-8.
This works because Latin letters and digits in UTF-8 have the same encoding as in ASCII and the OEM code pages. Your script therefore starts executing in the OEM code page, but since the CHCP 65001 command itself contains no non-ASCII characters, it is understood correctly. All following commands will be executed in UTF-8 and may contain non-ASCII characters.
You may now insert the en dash into the file path and it won't be replaced with ? upon saving.
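The reasoning above can be checked with a few lines of Python. This is only a sketch: cp437 and cp850 are used as example OEM code pages, which is an assumption, since the actual OEM code page depends on the OS settings.

```python
command = "CHCP 65001"
en_dash = "\u2013"  # the en dash from the folder name

# ASCII-only text encodes to the same bytes in UTF-8 and in OEM code pages,
# so the CHCP line is read correctly before the switch takes effect.
same = command.encode("utf-8") == command.encode("cp437") == command.encode("cp850")
print(same)  # True

# The en dash needs three bytes in UTF-8 ...
print(en_dash.encode("utf-8"))  # b'\xe2\x80\x93'

# ... and has no slot in cp437, so a lossy save falls back to "?".
print(en_dash.encode("cp437", errors="replace"))  # b'?'
```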
Set UTF-8 font
Unfortunately the default console window font does not cover these characters, so you won't see non-ASCII characters displayed correctly.
To solve this, right-click the command-line window title bar, select Properties, and change the font to a Unicode one. Consolas, Lucida Console and Courier New should work.
Thanks for all the responses.
I've had to make a design change to my DB so have managed to get around the need to do this now (phew :))
Thanks again

ASCII Code [☢] in Batch

I've tried pasting this character: ☢ into a txt file, but when I use TYPE, it doesn't show the symbol. Is there a certain code page I need?
Tried different encoding options.
I know about the CHCP command and tried different pages.
As Josep indicates, the text file containing the UTF-8 character has to be encoded in UTF-8 with BOM.
Also, your console font has to include that character. The default fonts don't. I just tested DejaVu Sans Mono, and it works. See this page for instructions on how to add a TrueType font to the cmd console.
Finally, you can get around messing with the code page garbage by invoking powershell. That way you don't have to mess with scraping the current code page, changing it to 65001, doing your thing, then restoring back to original. Just do this:
powershell "gc nuke.txt"
... where nuke.txt contains ☢
Edit: Since you're unable to install new fonts on your computer, the only solution is ascii art.
___
\_/
.--,O.--,
\/ \/
You have to make sure your file has the UTF-8 BOM (use Notepad++ or the option suggested in the answers below), then you may need to change the character set of your console with chcp 65001. See these answers for more information:
How to make Unicode charset in cmd.exe by default?
How to convert a batch file stored in utf-8 to something that works via another batch file and run it
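As a rough illustration of why trying different legacy code pages does not help here, the Python sketch below checks whether ☢ can be encoded at all. The three legacy code pages listed are just common examples, and `encodable` is a hypothetical helper.

```python
symbol = "\u2622"  # RADIOACTIVE SIGN

def encodable(text: str, codec: str) -> bool:
    """Return True if every character of text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

# Only UTF-8 can represent the symbol; the legacy code pages cannot.
for codec in ("cp437", "cp850", "cp1252", "utf-8"):
    print(codec, encodable(symbol, codec))

# The bytes of a UTF-8 file with BOM that contains just this symbol.
print(symbol.encode("utf-8-sig"))  # b'\xef\xbb\xbf\xe2\x98\xa2'
```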

OSQL Incorrect syntax - Character Encoding - Powershell Scripting Help Needed

So we use osql to run in stored procedures as part of our build process. We use a project with an sp folder that gets published with applications as part of a build pack.
I used Visual Studio to create this project structure and created the sql scripts to run in the procs.
Visual Studio saved the files with UTF8 formatting (by default). osql when running in the scripts complained about every single script having a syntax error on line 1 i.e.
> Incorrect syntax near '´'. 1> 2> Msg 102, Level 15, State 1,
> Server GBEPIAP-SQL01, Line 1
Rather baffling.
Anyway; to fix the issue, the sql scripts could be saved with Unicode Codepage 1200 (File -> Advanced Save Options)
et voila - problem gone
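One plausible explanation for the '´' in the error is the UTF-8 signature (BOM) at the start of each file. The Python sketch below assumes that the tool decoded the file with OEM code page 850, which the post does not confirm; the script text is made up.

```python
import codecs

script = "CREATE PROCEDURE demo AS SELECT 1"     # hypothetical script text
data = codecs.BOM_UTF8 + script.encode("utf-8")  # UTF-8 file with signature

# Assumption: the file is decoded with an OEM code page such as cp850.
seen = data.decode("cp850")

# The first BOM byte, 0xEF, is the acute accent in code page 850,
# so the parser meets a stray accent before the first keyword.
print(seen[0] == "\u00b4")  # True
print(seen[3:] == script)   # True: the rest of the script is intact
```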
Now that's left me with an even bigger problem; I have over 200 proc scripts that I need to open, change encoding and save with the new encoding.
Can any powershell guru do me up a quick script to change the encoding of every file in a folder to Unicode Codepage 1200? Would be doing me a favour while also saving time.
In the end I used the approach documented here
Save all files in Visual Studio project as UTF-8
But instead of UTF8; I specified Unicode.
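using System.IO;
using System.Text;
// Re-save every .sql file under the folder as UTF-16 LE (Encoding.Unicode, code page 1200).
foreach (var f in new DirectoryInfo(@"...").GetFiles("*.sql", SearchOption.AllDirectories)) {
    string s = File.ReadAllText(f.FullName);
    File.WriteAllText(f.FullName, s, Encoding.Unicode);
}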
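For anyone without a C# environment at hand, the same idea can be sketched in Python. This is not the approach used above, only an equivalent under the assumption that the input files are UTF-8 (with or without BOM); the function name `reencode_to_utf16` is hypothetical.

```python
import codecs
from pathlib import Path

def reencode_to_utf16(folder: str, pattern: str = "*.sql") -> int:
    """Rewrite matching files as UTF-16 LE with BOM; return how many changed."""
    count = 0
    for path in Path(folder).rglob(pattern):
        raw = path.read_bytes()
        if raw.startswith(codecs.BOM_UTF16_LE):
            continue  # already converted, leave it alone
        # "utf-8-sig" accepts UTF-8 input with or without a BOM.
        text = raw.decode("utf-8-sig")
        # Working on bytes avoids any newline translation.
        path.write_bytes(codecs.BOM_UTF16_LE + text.encode("utf-16-le"))
        count += 1
    return count
```

It would be called with the folder that holds the scripts, for example `reencode_to_utf16("sp")`, ideally on a copy of the files first.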

Batch script Latin characters

I am writing a batch script that goes through some directories doing a specific task, something like the following:
set DBCreationScript=//Here I set the full path for the script
echo %DBCreationScript%
The problem is that the path contains some Latin characters (ç, ã, á), and when I run the script, the output shows strange characters, not the ones I typed in. The batch script is in ANSI encoding.
I already tried setting the script encoding to UTF-8, but apparently the batch interpreter can't handle the control characters that appear at the beginning of the file.
Any thoughts?
Save the batch file in OEM encoding (a decent editor should allow this) or change the code page prior to running it with
chcp 1252
You can also save it as UTF-8 without signature (BOM) and use
chcp 65001
but down that path lies peril and dragons await to eat you (in short: It's usually painful and has a few weird side-effects).
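The mismatch behind the strange characters can be reproduced in Python. In this sketch the console's OEM code page is assumed to be 850; the real one depends on the system's regional settings.

```python
path_text = "ç ã á"  # sample characters from the question

# How an editor saving in "ANSI" (Windows-1252) stores them.
ansi_bytes = path_text.encode("cp1252")

# Assumption: the console interprets bytes as OEM code page 850.
shown_in_oem = ansi_bytes.decode("cp850")

# After "chcp 1252" the console interprets the same bytes as Windows-1252.
shown_after_chcp = ansi_bytes.decode("cp1252")

print(shown_in_oem == path_text)      # False: different characters appear
print(shown_after_chcp == path_text)  # True
```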

Exporting UTF8 data from db2

I have a db2 table that contains values in many languages (including right-to-left languages). When I export this table on a Linux box using the CLI's "export" command, I get a good-looking comma-delimited text file (DEL file), but when I try it on AIX, it replaces all non-ASCII characters with 0x1a.
I tried playing around with LC_LANG and DB2CODEPAGE, no go. I also tried using the codepage modifier, but the CLI said it can't convert between these two codepages (for any non-English codepage I tried).
I also tried IXF export, and the data is corrupted there as well.
Help! F1!
Thanks
The codepage of the database has to be set when creating the database. It is not possible to modify it later. You can check the codepage of the database with the following command and look for the value of "Database code page":
db2 get db cfg for [database_name]
Newer AIX versions shouldn't have problems with Unicode, but if you have an older version, that might cause problems too.
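The byte 0x1a is the ASCII SUBSTITUTE control character, which is typically written in place of a character that could not be converted. To see how much of an export is affected, a quick scan such as the Python sketch below may help. The helper name `lines_with_sub` and the sample data are made up for illustration.

```python
SUB = b"\x1a"  # ASCII SUBSTITUTE control character

def lines_with_sub(data: bytes) -> list[int]:
    """Return 1-based line numbers that contain at least one 0x1A byte."""
    return [
        number
        for number, line in enumerate(data.splitlines(), start=1)
        if SUB in line
    ]

# Made-up export content: the second row lost its non-ASCII text.
sample = b'1,"hello"\n2,"\x1a\x1a\x1a"\n3,"ok"\n'
print(lines_with_sub(sample))  # [2]
```

For a real DEL file, the bytes would come from opening the file in binary mode and reading it.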
