Batch file dropping characters - batch-file

I'm creating a simple batch file that uses the Azure REST API to download data from a blob. If I type the request directly into the command prompt, it works perfectly and my data appears in the directory. However, when I run it as a batch file, it does not work, and I can see in the command line that some characters from the blob connection string (which acts as an access token) have been dropped. I cannot share the full access token, but I can show that the drop happens at the end of the connection string, in what is known as the signature:
correct: "...5U%2BJgo%3D"
batch file output: "...5UBJgoD"
It appears the issue is with special characters and some numbers. There are no other special characters in the signature and other numbers in the rest of the signature are not affected.
Other notes:
The connection string is indeed entered within a "" string
I tried forcing UTF-8 encoding by running chcp 65001 before the request line executes; that didn't work

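In a batch file, % starts variable and parameter expansion, so %2 and %3 are replaced by the second and third command-line arguments (empty here), which is why "%2B" becomes "B" and "%3D" becomes "D". Escape each percent sign (%) by doubling it (%%). For example, you should type: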
"...5U%%2BJgo%%3D"
It is worth searching the internet before you post here on Stack Overflow. Check these links:
http://www.robvanderwoude.com/escapechars.php
https://ss64.com/nt/syntax-esc.html
Special Characters in Batch File
Hope this helps!
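If the batch file is generated by a script, the doubling can be automated. A minimal Python sketch, assuming only the percent sign needs handling; the helper name escape_for_batch is hypothetical and not part of any library:

```python
def escape_for_batch(text: str) -> str:
    """Double every percent sign so cmd.exe keeps it literal inside a .bat file."""
    return text.replace("%", "%%")


# Tail of the signature from the question (the rest is elided there as well).
signature_tail = "5U%2BJgo%3D"
print(escape_for_batch(signature_tail))
```

Other characters that are special to cmd.exe (such as ^ or &) are not covered by this sketch.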

Related

Why are the contents of my batch file being interpreted as alt codes?

I'm using a batch file to execute some adb commands. When trying to do a longpress of the power key I use the following line:
adb -s <ipaddress>:5555 shell input keyevent --longpress 26.
If I type this command into cmd, it works without a hitch. Running it from the batch file, however, results in a short press. I created a single line batch file, with the above command as the sole contents. When running the batch file (I just type the file name in cmd), the command is printed as:
adb -s <ipaddress>:5555 shell input keyevent -ΓÇôlongpress 26
Is there a setting I may have unknowingly enabled that is causing this, or do I need some sort of escape character?
I'm rather embarrassed that I came across a solution to my issue only a few minutes after posting this, but I figure I should share and not waste anyone's time.
I've replaced the second hyphen in my command with its own alt code (i.e. alt 45) and it is now interpreted correctly in the batch file. The line still reads:
adb -s <ipaddress>:5555 shell input keyevent --longpress 26
I don't understand why this works, and would appreciate it if someone would shed light on the subject.
Edit: Based on the recommendation in the comment below, I looked up the differences between encoding schemes. If I understand it correctly, ASCII characters are limited to 7 bits of data. This keeps them within the first 128 members of the ASCII table, so the alt codes I saw previously couldn't be generated.
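The garbled output is consistent with the second hyphen having been an en dash (U+2013) that was saved as UTF-8 and then displayed in the console's OEM code page. A small Python sketch of that reading, assuming code page 437 (the question does not state which console code page was active):

```python
# An en dash stored as UTF-8 occupies three bytes: E2 80 93.
en_dash = "\u2013"
utf8_bytes = en_dash.encode("utf-8")
print(utf8_bytes.hex())

# Reading those three bytes in OEM code page 437 (assumed) yields three
# unrelated glyphs: Greek capital gamma, C with cedilla, o with circumflex.
shown_in_console = utf8_bytes.decode("cp437")
print(shown_in_console == "\u0393\u00c7\u00f4")

# A plain ASCII hyphen is a single byte below 128 and survives unchanged.
print("-".encode("utf-8").decode("cp437") == "-")
```

Typing the hyphen explicitly replaced the three-byte sequence with a single ASCII byte, which every one of these code pages maps to the same character.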

Escape character for '£' in batch file

I am trying to execute a program from the command line with parameters. My password contains the symbol '£', and I could not find a way to escape it.
It is always a good idea to enclose a parameter string in double quotes, especially a strong password that contains characters other than ASCII letters and digits.
But care must be taken when using characters in batch files that are not in the ASCII table, i.e. characters whose code point value (byte value) is greater than 127 decimal.
When Windows Notepad is used to write a batch file and the file is saved with ANSI encoding, characters with a code point value greater than 127 are saved using the code page determined by the Windows Region and Language settings. For North American and Western European countries this means code page Windows-1252. The pound sign has decimal value 163 (hexadecimal: A3) in this code page.
But a command process uses a different code page, which can be seen by opening a command prompt window and running the command CHCP (change code page) without any parameter. This command outputs the active code page of the command process, which also depends on the Windows Region and Language settings. By default, a command process uses code page OEM 437 in North American countries and OEM 850 in Western European countries. The pound sign has the decimal value 156 in both code page 437 and code page 850.
In other words you need to know what the application which compares the password expects for the pound sign in password:
A byte with value 163, if the password was defined using a GUI application.
A byte with value 156, if the password was defined from within a command prompt window.
Or one or more other byte values, depending on the code page and character encoding (ANSI, OEM, UTF-8, UTF-16) in use when the password with the pound sign was defined. For example, UTF-8 encodes a pound sign as 2 bytes with the decimal values 194 and 163.
So what to write into the batch file?
Well, you have to find that out by yourself.
For example, suppose the password was defined from within a command prompt window using code page 850, so the pound sign in the stored password is a single byte with value 156. If the batch file is edited in Notepad using code page 1252, the character œ must be used in the password string to get a byte with value 156 in the batch file.
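The byte values quoted above can be checked with a few lines of Python. This is only an illustrative sketch using Python's built-in codecs; it does not tell you which encoding your own application expects:

```python
POUND = "\u00a3"  # the pound sign

# Byte values of the pound sign under the encodings discussed above.
for encoding in ("cp1252", "cp437", "cp850", "utf-8"):
    print(encoding, list(POUND.encode(encoding)))

# Byte 156 viewed in an editor that uses Windows-1252 is U+0153,
# the "oe" ligature mentioned in the example.
oe_ligature = bytes([156]).decode("cp1252")
print(oe_ligature == "\u0153")
```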
Thank you for your detailed answer @Mofi.
Background: My CMD program calls SQLPlus and the database password contains a '£'.
Summing this up into a short fix, the following steps worked for me.
The fix:
Open your script in a robust text editor (e.g. Atom, Notepad++, etc.)
Change the file encoding (of the text editor) to CP-1252
Add chcp 1252>nul to the top of your script
Run your script and enjoy the results!
As you have found, handling of the UK pound sign is a trap for the unwary in batch files.
The issue here is that the UK pound sign £ is not an ASCII character, so it is processed differently by the command prompt and by Windows GUI programs like Notepad.
A solution that worked for me was to change the code page in the batch file to 65001 (UTF-8) before using the £ sign.
This idea was discussed at Change the active console Code Page, which explains that the default code page is determined by the Windows Locale.
For example, put this code at the start of your batch file:
@echo off
:: Change the code page to Unicode/65001 before using non-ascii characters.
chcp 65001

ftp file upload fails when special character is present in password

I was trying to upload a file through an application I wrote in C.
As I did not find any API, I decided to go through commands.
Input command line looked like this.
ftp -u ftp://ftpuser:password#123@x.x.x.x/test.txt /tmp/test.txt
Whenever a special character is present in the password, login fails. When I tried a different user without any special characters in the password, the upload worked.
How can this issue be resolved, or is there another method available, such as an API, that I can use?
If any sample code is available, it would be of great help.
Special character means @, $, # (e.g. password#123, password$123)
code snippet:
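#include <stdio.h>

void RunCommandWithPipe(PSTRING CmdLine)
{
    FILE *fp;
    int status;

    /* Run the command through the shell; its output is not read here. */
    fp = popen(CmdLine, "r");
    if (fp == NULL)
    {
        ErrGen(constErrOpenFile);
        return; /* nothing to close if popen() failed */
    }
    status = pclose(fp);
    if (status == -1)
    {
        ErrGen(constErrCloseFile);
    }
}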
The reason why this doesn't work is because you are passing unfiltered meta characters into the shell. This is very dangerous. If someone untrustworthy gets to decide the value of any of the parameters to your ftp command, such as the username, password, ftp server, or file name, then that person will be able to run arbitrary shell commands.
You can see what's going on by putting an "echo" in front of your ftp command:
echo ftp -u ftp://ftpuser:password$123@x.x.x.x/test.txt /tmp/test.txt
You'll get this result:
ftp -u ftp://ftpuser:password23@x.x.x.x/test.txt /tmp/test.txt
The shell is trying to evaluate $1 as a variable, leaving an empty result.
There's a couple of things you can do.
1) Make the command safe by escaping all the meta characters. Here you need to be very careful, using a whitelist approach rather than just trying to get rid of the special characters you've thought of. In the whitelist approach you accept that some set of characters is safe, such as [A-Za-z0-9:_-]. Every other character you either strip out or escape by preceding it with a backslash (e.g. "foo:bar$baz&abc" becomes "foo:bar\$bazabc"). Don't do it the other way round, trying to think of all the characters you know are special and escaping only those. You will most likely forget some, and fail to handle input like this:
ftp -u ftp://ftpuser:; rm -rf /;echo @x.x.x.x/test.txt /tmp/test.txt
2) Don't pass arguments on the shell, instead control the FTP client through fread()/fwrite() on the pipe that popen() gave you.
In this case you launch the ftp client with no arguments. Then you write "OPEN 192.168.1.1" or wherever you want to connect. Then you write the username. Then you write the password. Then you write the GET or PUT command you want. Then you write "EXIT" or send an EOF. You should read the result codes from the server. You'll get 200 series results on success. You'll get a 500 series result if the login is bad, etc.
You still have to watch out when piping into the FTP command because it will take shell escapes like "!rm -rf /", but there is much less opportunity for that than on the shell. You just need to make sure the strings you get to build your FTP commands are one line and that you always precede them with a valid FTP command. You should also watch out for any funny business with untrustworthy filenames. (eg. don't allow absolute paths, "..", and so forth)
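The whitelist approach from option 1 can be sketched as follows. This is a minimal illustration in Python rather than C, assuming the example whitelist [A-Za-z0-9:_-] from above and escaping (not stripping) everything else; the helper name escape_for_shell is hypothetical:

```python
import re

# Characters assumed safe, following the example whitelist in the answer.
SAFE = re.compile(r"[A-Za-z0-9:_-]")


def escape_for_shell(text: str) -> str:
    """Backslash-escape every character that is not on the whitelist."""
    return "".join(ch if SAFE.fullmatch(ch) else "\\" + ch for ch in text)


print(escape_for_shell("password$123"))
print(escape_for_shell("; rm -rf /;"))
```

A newline is a special case that this sketch does not handle, because a backslash followed by a newline is treated by the shell as a line continuation. In Python itself, shlex.quote() or passing an argument list to subprocess.run() without a shell avoids hand-written escaping altogether.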
You are probably using the wrong charset to send the password.

Foreign language characters replaced by "?"

I am working on a program that takes file/folder names as input. Currently, when I try to run it on a file that has foreign-language characters in its name, each of those characters is replaced by a ?. I am running my exe from the command prompt, so trying to process that file results in an error. When I use DIR at the command prompt, it also displays a ? for each character of the file name. Is there any way to display the actual foreign-language characters in the command prompt? I believe this could be why my exe does not work on any of those files.
This is the text that I am trying to read - 科普書籍推展教案 which is being replaced by ? on the console.
The command prompt can only display characters in your current ACP. So, if you have files with names outside the ACP, you're going to see ?. You can use chcp to pick a different CP, but there is no code page for full Unicode in the DOS box.
Inside your code, you need to learn to use 'W' API to work with full unicode pathnames. The safest thing is to just #define _UNICODE and use it uniformly.

Strange Characters in database text: Ã, Ã, ¢, â‚ €,

I'm not certain when this first occurred.
I have a new drop-shipping affiliate website, and receive an exported copy of the product catalog from the wholesaler. I format and import this into Prestashop 1.4.4.
The front end of the website contains combinations of strange characters inside product text: Ã, Ã, ¢, â‚ etc. They appear in place of common characters like , - : etc.
These characters are present in about 40% of the database tables, not just product specific tables like ps_product_lang.
Another website thread says this same problem occurs when the database connection string uses an incorrect character encoding type.
In /config/setting.inc, there is no character encoding string mentioned, just the MySQL Engine, which is set to InnoDB, which matches what I see in PHPMyAdmin.
I exported ps_product_lang, replaced all instances of these characters with correct characters, saved the CSV file in UTF-8 format, and reimported them using PHPMyAdmin, specifying UTF-8 as the language.
However, after doing a new search in PHPMyAdmin, I now have about 10 times as many instances of these bad characters in ps_product_lang than I started with.
If the problem is as simple as specifying the correct language attribute in the database connection string, where/how do I set this, and what to?
Incidentally, I tried running this command in PHPMyAdmin, as mentioned in this thread, but the problem remains:
SET NAMES utf8
UPDATE: PHPMyAdmin says:
MySQL charset: UTF-8 Unicode (utf8)
This is the same character set I used in the last import file, which caused more character corruptions. UTF-8 was specified as the charset of the import file during the import process.
UPDATE2
Here is a sample:
people are truly living untetheredâ€ïâ€Â
Ã‚ï† buying and renting movies online, downloading software, and
sharing and storing files on the web.
UPDATE3
I ran an SQL command in PHPMyAdmin to display the character sets:
character_set_client utf8
character_set_connection utf8
character_set_database latin1
character_set_filesystem binary
character_set_results utf8
character_set_server latin1
character_set_system utf8
So, perhaps my database needs to be converted (or deleted and recreated) to UTF-8. Could this pose a problem if the MySQL server is latin1?
Can MySQL handle the translation of serving content as UTF8 but storing it as latin1? I don't think it can, as UTF8 is a superset of latin1. My web hosting support has not replied in 48 hours. Might be too hard for them.
If the charset of the tables is the same as that of their content, try using mysql_set_charset('UTF8', $link_identifier). Note that MySQL uses the name UTF8 for the UTF-8 encoding, rather than the more common spelling UTF-8.
Check my other answer on a similar question too.
This is surely an encoding problem. Your database and your website use different encodings, and that is the cause of the problem. Also, if you ran that command, you still have to convert the records that are already in your tables to UTF-8.
Update: Based on your last comment, the core of the problem is that your database and your data source (the CSV file) use different encodings. Hence you can either convert your database to UTF-8 or, at least, convert the data you get from the CSV from UTF-8 to latin1.
You can do the conversion by following these articles:
Convert latin1 to UTF8
http://wordpress.org/support/topic/convert-latin1-to-utf-8
This appears to be a UTF-8 encoding issue that may have been caused by a double-UTF8-encoding of the database file contents.
This situation could happen due to factors such as the character set that was or was not selected (for instance when a database backup file was created) and the file format and encoding the database file was saved with.
I have seen these strange UTF-8 characters in the following scenario (the description may not be entirely accurate as I no longer have access to the database in question):
As I recall, the database and tables had a "utf8_general_ci" collation.
Backup is made of the database.
Backup file is opened on Windows in UNIX file format and with ANSI encoding.
Database is restored on a new MySQL server by copy-pasting the contents from the database backup file into phpMyAdmin.
Looking into the file contents:
Opening the SQL backup file in a text editor shows that the SQL backup file has strange characters such as "sÃ¥". On a side note, you may get different results if opening the same file in another editor. I use TextPad here but opening the same file in SublimeText said "så" because SublimeText correctly UTF8-encoded the file -- still, this is a bit confusing when you start trying to fix the issue in PHP because you don't see the right data in SublimeText at first. Anyways, that can be resolved by taking note of which encoding your text editor is using when presenting the file contents.
The strange characters are double-encoded UTF-8 characters, so in my case the first "Ãƒ" part equals "Ã" and "Â¥" = "¥" (this is my first "encoding"). The "Ã¥" characters equal the UTF-8 character for "å" (this is my second encoding).
So, the issue is that "false" (UTF8-encoded twice) utf-8 needs to be converted back into "correct" utf-8 (only UTF8-encoded once).
Trying to fix this in PHP turns out to be a bit challenging:
utf8_decode() is not able to process the characters.
// Fails silently (as in - nothing is output)
$str = "så";
$str = utf8_decode($str);
printf("\n%s", $str);
$str = utf8_decode($str);
printf("\n%s", $str);
iconv() fails with "Notice: iconv(): Detected an illegal character in input string".
echo iconv("UTF-8", "ISO-8859-1", "så");
Another otherwise reasonable solution also fails silently in this scenario:
$str = "så";
echo html_entity_decode(htmlentities($str, ENT_QUOTES, 'UTF-8'), ENT_QUOTES , 'ISO-8859-15');
mb_convert_encoding() fails silently:
$str = "så";
echo mb_convert_encoding($str, 'ISO-8859-15', 'UTF-8');
// (No output)
Trying to fix the encoding in MySQL by converting the MySQL database character set and collation to UTF-8 was unsuccessful:
ALTER DATABASE myDatabase CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE myTable CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
I see a couple of ways to resolve this issue.
The first is to make a backup with correct encoding (the encoding needs to match the actual database and table encoding). You can verify the encoding by simply opening the resulting SQL file in a text editor.
The other is to replace double-UTF8-encoded characters with single-UTF8-encoded characters. This can be done manually in a text editor. To assist in this process, you can pick the incorrect characters from the UTF-8 Encoding Debugging Chart (it may be a matter of replacing 5-10 errors).
Finally, a script can assist in the process:
// Double-encoded input: "så" that was UTF-8-encoded twice.
$str = "sÃƒÂ¥";
// The two arrays can also be generated by double-encoding values in the first array and single-encoding values in the second array.
$str = str_replace(["Ãƒ","Â¥"], ["Ã","¥"], $str);
// Undo the remaining (single) extra encoding step.
$str = utf8_decode($str);
echo $str;
// Output: "så" (correct)
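The same repair can be sketched in Python without a replacement table. This is only an illustration, assuming the damage came from UTF-8 bytes being misread as Windows-1252 and re-encoded, twice over; the function names are hypothetical:

```python
def double_encode(text: str) -> str:
    """Simulate the damage: UTF-8 bytes misread as Windows-1252, twice."""
    once = text.encode("utf-8").decode("cp1252")
    return once.encode("utf-8").decode("cp1252")


def repair(text: str) -> str:
    """Undo double_encode() by reversing both misreadings."""
    once = text.encode("cp1252").decode("utf-8")
    return once.encode("cp1252").decode("utf-8")


original = "s\u00e5"                # "s" followed by a-with-ring
damaged = double_encode(original)   # five characters instead of two
print(len(original), len(damaged))
print(repair(damaged) == original)
```

Bytes that have no Windows-1252 mapping (for example 0x81) make this simple round trip raise an error, so real dumps may still need a more tolerant tool.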
I encountered quite a similar problem today: mysqldump dumped the UTF-8 diacritic characters of my UTF-8 database as two latin1 characters each, although the file itself is regular UTF-8.
For example: "é" was encoded as the two characters "Ã©". These two characters correspond to the two-byte UTF-8 encoding of the letter, but they should be interpreted as a single character.
To solve the problem and correctly import the database on another server, I had to convert the file using the ftfy ("Fixes Text For You") Python library (https://github.com/LuminosoInsight/python-ftfy). The library does exactly what I expect: it transforms badly encoded UTF-8 into correctly encoded UTF-8.
For example: the latin1 combination "Ã©" is turned into an "é".
ftfy comes with a command-line script, but it transforms the file in a way that means it cannot be imported back into MySQL.
I wrote a python3 script to do the trick :
#!/usr/bin/python3
# coding: utf-8
import ftfy

# Open the badly encoded dump for reading and the repaired dump for writing,
# both as UTF-8 text.
with open('mysql.utf8.bad.dump', 'r', encoding='utf-8') as input_file, \
        open('mysql.utf8.good.dump', 'w', encoding='utf-8') as output_file:
    # fix_file() yields repaired lines one at a time. The encoding fix is
    # enabled and most other transformations are switched off.
    # Keyword names are the ones used in the original answer; other ftfy
    # releases may name some of these options differently.
    stream = ftfy.fix_file(
        input_file,
        encoding=None,
        fix_entities='auto',
        remove_terminal_escapes=False,
        fix_encoding=True,
        fix_latin_ligatures=False,
        fix_character_width=False,
        uncurl_quotes=False,
        fix_line_breaks=False,
        fix_surrogates=False,
        remove_control_chars=False,
        remove_bom=False,
        normalization='NFC'
    )
    # Write each repaired line to the output file.
    for line in stream:
        output_file.write(line)
Apply these two things.
You need to set the character set of your database to be utf8.
You need to call the mysql_set_charset('utf8') in the file where you made the connection with the database and right after the selection of database like mysql_select_db use the mysql_set_charset. That will allow you to add and retrieve data properly in whatever the language.
The error usually gets introduced during creation of the CSV. Try using Linux to save the CSV as a Text CSV. LibreOffice in Ubuntu can enforce UTF-8 encoding; this worked for me.
I wasted a lot of time trying this on Mac OS. Linux is the key. I've tested on Ubuntu.
Good Luck
