Save text in UTF-8 via Batch

I am still working on a converter for .m3u playlist files that ports them from a Windows Media Player generated format into a format that gets accepted by the Teamspeak 3 plugin "Soundboard".
The main converter is finished now and I encountered a last problem:
When writing the new code with a Batch script it gets saved into an ANSI encoded file using echo a-lot-of-text-and-code >> 3.txt and it seems like the plugin can only open UTF-8 encoded files.
Is there any way to change the encoding of 3.txt from ANSI to UTF-8 with Batch only?
Regards, Joe

It took a bit of experimentation, but I successfully tweaked Simon Sheppard's UCS-2 encode method to encode a file as UTF-8 with batch.
@echo off
setlocal
:: utf8.bat infile outfile
:: convert infile to utf8 and save as outfile
if not exist "%~1" goto usage
if "%~2"=="" goto usage
set "infile=%~f1"
set "outfile=%~f2"
:: store current console codepage to var
for /f "tokens=2 delims=:" %%I in ('chcp') do set "_codepage=%%I"
:: temporarily change console codepage to UTF-8
>NUL chcp 65001
:: set byte order mark for outfile (in ANSI the characters ï»¿ are the bytes EF BB BF)
>"%outfile%" set /p "=ï»¿" <NUL
:: dump infile to outfile encoded as UTF-8
>>"%outfile%" type "%infile%"
:: restore console to original codepage
>NUL chcp %_codepage%
goto :EOF
:usage
echo Usage: %~nx0 infile outfile
This script itself needs to be saved in ANSI encoding, though.
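For comparison, the byte-level result the script aims for can be sketched outside of batch. This is a minimal Python sketch, not part of the batch-only solution. It assumes the ANSI code page is Windows-1252 and that the output should start with a UTF-8 byte order mark. The helper name ansi_to_utf8 is hypothetical.

```python
import codecs


def ansi_to_utf8(data: bytes, source_encoding: str = "cp1252") -> bytes:
    """Re-encode ANSI bytes as UTF-8, prefixed with a byte order mark.

    source_encoding is an assumption; use the code page the file was
    actually written in.
    """
    text = data.decode(source_encoding)
    return codecs.BOM_UTF8 + text.encode("utf-8")


# One playlist-like line containing a non-ASCII character (e with acute).
converted = ansi_to_utf8(b"Caf\xe9.mp3\r\n")
print(converted)
```

Comparing such output with the file produced by the batch script in a hex viewer is one way to check that the conversion really happened.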

Related

How do I read special characters in a file using a .bat file

I have a .bat file (batch file) and I want it to be able to read a file that contains lots of special characters. The code I currently have is type "C:\Users\mehid\Desktop\Command prompt game\Stuff for coders ;)\Map.txt". The .txt file contains ─ │ ┌ ┐ └ ┘ ├ ┤ ┬ ┴ ┼ and similar characters, but the output is ΓöÇ Γöé Γöî ΓöÉ Γöö Γöÿ Γö£ Γöñ Γö¼ Γö┤ Γö╝
The most likely reason for your issue is that your Map.txt file was saved with UTF-8 encoding, and your batch file has not told cmd.exe to use anything other than the default codepage.
The simplest way to change that, other than changing the encoding of your text file, is to change your codepage to one which can read UTF-8 characters, i.e. codepage 65001.
Example:
@Echo Off
SetLocal EnableExtensions
Rem Determining, and saving to a variable, your current codepage.
Set "codepage=" & For /F "Tokens=*" %%G In ('%SystemRoot%\System32\chcp.com'
) Do For %%H In (%%G) Do If Not %%~nH Equ 65001 (Set "codepage=%%~nH"
%SystemRoot%\System32\chcp.com 65001 1> NUL)
Echo typing your text file using codepage 65001 & Echo(
Type "C:\Users\mehid\Desktop\Command prompt game\Stuff for coders ;)\Map.txt"
Pause
Rem Changing your codepage back to what it was previously.
If Defined codepage %SystemRoot%\System32\chcp.com %codepage% 1> NUL
Echo typing your text file using your original codepage, [%codepage%]. & Echo(
Type "C:\Users\mehid\Desktop\Command prompt game\Stuff for coders ;)\Map.txt"
Pause
I removed some characters at the beginning of the file that weren't needed and it fixed itself
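The garbled output in the question is what UTF-8 bytes look like when they are read with an OEM code page. This can be checked with a small Python sketch, assuming the console was using code page 437, which is an assumption that happens to reproduce the output shown above. The function name misread_utf8 is hypothetical.

```python
def misread_utf8(text: str, console_codepage: str = "cp437") -> str:
    """Encode text as UTF-8, then decode the bytes as the console would."""
    return text.encode("utf-8").decode(console_codepage)


# The box-drawing characters from Map.txt.
for ch in "─│┌┐└┘├┤┬┴┼":
    print(ch, "->", misread_utf8(ch))
```

Each box-drawing character is three bytes in UTF-8, which is why every one of them turns into three characters on screen.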

Recursively run ffprobe to get codec types

I modified @rojo's code from here a little to look for h264/AC3 and to recurse through all the child folders. My only issue is that it always says the videos have h264 and AC3, but when I run the ffprobe command manually it reports something different. Am I missing something?
@if (@CodeSection == @Batch) @then
@echo off & setlocal
for /R %%f in (*.mkv, *.mp4) do (
echo Testing %%f
set ffprobe=C:\ffmpeg-4.0.2-win64-static\bin\ffprobe -v quiet -show_entries "stream=codec_name,height" -of json "%%f"
for /f "delims=" %%I in ('%ffprobe% ^| cscript /nologo /e:JScript "%~f0"') do set "%%~I"
set "pre=-hide_banner -fflags +genpts+discardcorrupt+fastseek -analyzeduration 100M"
set "pre=%pre% -probesize 50M -hwaccel dxva2 -y -threads 3 -v error -stats"
set "global="
set "video=-c:v h264_nvenc"
set "audio=-c:a ac3"
if defined h264 if defined ac3 (
echo %%~nf already in x264 + AC3 format.
)
if not defined h264 if not defined ac3 (
if not defined ac3 (
echo Already has AC3 audio. Re-encoding video only.
set "audio=-c:a copy"
)
if not defined h264 (
echo Already has h264 video. Re-encoding audio only.
set "video=-c:v copy"
)
echo output "%%~df%%~pf%%~nf.new.mkv"
echo C:\ffmpeg-4.0.2-win64-static\bin\ffmpeg %pre% -i "%%f" %global% %video% %audio% "%%~df%%~pf%%~nf.new.mkv"
pause
echo del "%%f" /f /q
echo ren "%%~df%%~pf%%~nf.new.mkv" "%%f"
)
)
@end // end Batch / begin JScript
var stdin = WSH.CreateObject('Scripting.FileSystemObject').GetStandardStream(0),
htmlfile = WSH.CreateObject('htmlfile'),
JSON;
htmlfile.write('<meta http-equiv="x-ua-compatible" content="IE=9" />');
htmlfile.close(JSON = htmlfile.parentWindow.JSON);
var obj = JSON.parse(stdin.ReadAll());
for (var i = obj.streams.length; i--;) {
if (/h264/i.test(obj.streams[i].codec_name)) WSH.Echo('h264=true');
if (/ac3/i.test(obj.streams[i].codec_name)) WSH.Echo('ac3=true');
}
I had this working for a second, then it stopped for no reason.
@if (@CodeSection == @Batch) @then
@echo off & setlocal & goto run
:run
for /R %%f in (*.mkv, *.mp4) do (
echo Testing %%f
set "file=%%f"
set "drive=%%~df"
set "dir=%%~pf"
set "name=%%~nf"
set "ext=%%~xf"
for /f "delims=" %%I in ('C:\ffmpeg-4.0.2-win64-static\bin\ffprobe.exe -v quiet -show_entries "stream=codec_name,height" -of json "%%f" ^| cscript /nologo /e:JScript "%~f0"') do (set "%%~I")
set "pre=-hide_banner -fflags +genpts+discardcorrupt+fastseek -analyzeduration 100M"
set "pre=%pre% -probesize 50M -hwaccel dxva2 -y -threads 3 -v error -stats"
set "global="
set "video=-c:v h264_nvenc"
set "audio=-c:a ac3"
if defined ac3 if defined h264 call :both
if not defined ac3 call :either
if not defined h264 call :either
)
:both
echo %name% already in x264 + AC3 format.
goto :EOF
:either
if not defined h264 (
echo Already has AC3 audio. Re-encoding video only.
set "audio=-c:a copy"
)
if not defined ac3 (
echo Already has h264 video. Re-encoding audio only.
set "video=-c:v copy"
)
echo "C:\ffmpeg-4.0.2-win64-static\bin\ffmpeg %pre% -i "%file%" %global% %video% %audio% "%drive%%dir%%name%.new.mkv""
echo del "%file%" /f /q
echo ren "%drive%%dir%%name%.new.mkv" "%name%%ext%"
goto :EOF
@end // end Batch / begin JScript
var stdin = WSH.CreateObject('Scripting.FileSystemObject').GetStandardStream(0),
htmlfile = WSH.CreateObject('htmlfile'),
JSON;
htmlfile.write('<meta http-equiv="x-ua-compatible" content="IE=9" />');
htmlfile.close(JSON = htmlfile.parentWindow.JSON);
var obj = JSON.parse(stdin.ReadAll());
for (var i = obj.streams.length; i--;) {
if (/h264/i.test(obj.streams[i].codec_name)) WSH.Echo('h264=true');
if (/ac3/i.test(obj.streams[i].codec_name)) WSH.Echo('ac3=true');
}
ffprobe Output for h264
{
"programs": [
],
"streams": [
{
"codec_name": "h264",
"height": 528
},
{
"codec_name": "aac"
}
]
}
output for ac3
{
"programs": [
],
"streams": [
{
"codec_name": "h265",
"height": 528
},
{
"codec_name": "ac3"
}
]
}
output for both ac3/h264
{
"programs": [
],
"streams": [
{
"codec_name": "h264",
"height": 528
},
{
"codec_name": "ac3"
}
]
}
It looks like the batch file / JScript hybrid script written by rojo is not designed for recursive execution on all *.mkv and *.mp4 files in a directory tree. For that reason I completely rewrote the batch file and omitted the JScript parts.
The video height output by ffprobe because of option "stream=codec_name,height" is not really needed here, because every video should be processed independently of its height. For that reason "stream=codec_name" on the ProbeOptions definition line is enough for this task and reduces the output of ffprobe by one line.
In this use case the JSON output of ffprobe can also be processed directly with a FOR loop, using as delimiters comma ,, colon :, left square bracket [, horizontal tab TAB, right square bracket ], left brace {, right brace } and normal space SPACE. Lines starting with { can be ignored completely when processing the JSON formatted output. A case-sensitive string comparison is used to find out whether a line contains a codec_name value, with the first codec value interpreted as the video codec and the second as the audio codec.
@echo off
setlocal EnableExtensions DisableDelayedExpansion
set "ProgramFolder=C:\ffmpeg-4.0.2-win64-static\bin"
set "ProbeOptions=-v quiet -show_entries "stream^^=codec_name" -of json"
set "MpegOptions=-hide_banner -fflags +genpts+discardcorrupt+fastseek -analyzeduration 100M -probesize 50M -hwaccel dxva2 -y -threads 3 -v error -stats"
set "FilesFound=0"
set "FilesEncoded=0"
for /F "delims=" %%I in ('dir *.mkv *.mp4 /A-D-H /B /S 2^>nul') do (
set "FullFileName=%%I"
set "TempFileName=%%~dpnI_new%%~xI"
set "AudioCodec="
set "AudioOption=ac3"
set "VideoCodec="
set "VideoOption=h264_nvenc"
set /A FilesFound+=1
for /F "eol={ tokens=1,2 delims=,:[ ]{} " %%B in ('""%ProgramFolder%\ffprobe.exe" %ProbeOptions% "%%I""') do (
if "%%~B" == "codec_name" (
if not defined VideoCodec (
set "VideoCodec=%%~C"
if "%%~C" == "h264" set "VideoOption=copy"
) else (
set "AudioCodec=%%~C"
if "%%~C" == "ac3" set "AudioOption=copy"
)
)
)
setlocal EnableDelayedExpansion
echo(
echo File: !FullFileName!
echo Video codec: !VideoCodec!
echo Audio codec: !AudioCodec!
if not "!VideoOption!" == "!AudioOption!" (
"%ProgramFolder%\ffmpeg.exe" %MpegOptions% -i "!FullFileName!" -c:v !VideoOption! -c:a !AudioOption! "!TempFileName!"
if not errorlevel 1 (
move /Y "!TempFileName!" "!FullFileName!"
if not errorlevel 1 set /A FilesEncoded+=1
)
if exist "!TempFileName!" del "!TempFileName!"
)
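rem Carry the counter out of the local scope, which ENDLOCAL would otherwise discard.
for %%J in (!FilesEncoded!) do endlocal & set "FilesEncoded=%%J"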
)
if %FilesFound% == 1 (set "PluralS=") else set "PluralS=s"
echo(
echo Re-encoded %FilesEncoded% of %FilesFound% video file%PluralS%.
endlocal
pause
Attention: The whitespace between [ and ] in the batch file must be a tab character!
The batch file first sets up a local environment with command extensions enabled, as required for this batch file, and with delayed environment variable expansion disabled, so that files with one or more exclamation marks in the file name or file path are also processed correctly.
Next, some environment variables are defined for later use in the script. The definition of variable ProbeOptions is special because of the argument string "stream=codec_name", which is later passed to a separate command process started by FOR. This requires double escaping the equal sign with two ^ so that = is finally passed to ffprobe.exe.
The outer FOR executes the following command line once, in a separate command process started in the background with cmd.exe /C:
dir *.mkv *.mp4 /A-D-H /B /S 2>nul
DIR outputs to handle STDOUT of this command process
only the file names because of /B
of non-hidden files because of /A-D-H (attribute not directory and not hidden)
matching either wildcard pattern *.mkv or *.mp4
in current directory and all its subdirectories because of /S
with full path also because of /S.
It is possible that no matching file name is found, in which case DIR outputs an error message to handle STDERR. This error message is suppressed by redirecting it to device NUL.
Read the Microsoft article about Using Command Redirection Operators for an explanation of 2>nul. The redirection operator > must be escaped with the caret character ^ on the FOR command line so that it is interpreted as a literal character when the Windows command interpreter processes this command line before executing command FOR, which runs the embedded dir command line in a separate command process started in the background.
FOR captures all lines output to STDOUT of the background command process and processes them after the started cmd.exe has terminated. So FOR processes a list of fully qualified file names that does not change while the loop is running.
On drives with NTFS it would also be quite safe to use:
for /R %%I in (*.mkv *.mp4) do (
This also results in processing all non-hidden *.mkv and *.mp4 files in the current directory and all subdirectories. NTFS returns the list of files sorted alphabetically. But this approach is problematic on FAT32 and exFAT drives, because the code executed in every iteration of the loop could result in an update of the file allocation table. FAT32 and exFAT return file names matching the criteria simply in the order currently stored in the file allocation table, in which the last modified file in a directory is always at the bottom of the directory table. This means the list of file names could change while the loop is running on the first, second, third, ... file name returned by the file system, so a video file could be processed more than once and others could be skipped. It is therefore better to process a list of file names that is loaded completely into memory before the loop iterations start.
FOR with option /F skips empty lines by default, which DIR does not output in this case, and also lines starting with a semicolon, which is not possible here either because every line starts with a drive letter. But FOR would split every captured line into substrings (tokens) using normal space and horizontal tab as string delimiters and would assign just the first space/tab separated string to the specified loop variable I. This behavior is not wanted here, because the fully qualified file name is always needed, even if it contains one or more spaces. For that reason delims= is used to define an empty list of string delimiters, which turns off the string splitting completely, so that loop variable I always gets the file name of a found *.mkv or *.mp4 file with path, name and extension.
The following happens on every loop iteration:
The fully qualified file name of the current *.mkv or *.mp4 file is assigned to environment variable FullFileName.
The fully qualified file name of the current *.mkv or *.mp4 file, with _new inserted to the left of the file extension, is assigned to environment variable TempFileName.
The environment variable AudioCodec is deleted if existing from a previous iteration of the loop.
The environment variable AudioOption is defined with string value ac3 being the wanted audio codec.
The environment variable VideoCodec is deleted if existing from a previous iteration of the loop.
The environment variable VideoOption is defined with string value h264_nvenc being the wanted video codec.
The environment variable FilesFound is incremented by one with a simple arithmetic expression evaluated by command SET.
Then one more FOR is used to run the ffprobe command line with cmd.exe /C in the background. In this special case it is necessary to enclose the entire command line in double quotes because of the argument string "stream=codec_name", so that the entire command line is passed correctly to the additional command process started by FOR.
The inner FOR captures the output written by ffprobe in JSON format to handle STDOUT of the started command process and processes this output line by line. Only the lines containing "codec_name" are of interest. Therefore option eol={ is used to completely ignore all lines starting with {. Option tokens=1,2 assigns the first substring to the specified loop variable B and the second substring to the next loop variable C according to the ASCII table. The list of delimiters specified with option delims= results in more or less just the property name enclosed in double quotes, like "codec_name", and its value, also enclosed in double quotes, like "h264", being assigned to the loop variables B and C.
If the string assigned to loop variable B, with its own double quotes removed and then explicitly enclosed in double quotes again, is case-sensitively equal to the string "codec_name", then this line is of real interest. The codec value assigned to loop variable C is assigned without double quotes to either environment variable VideoCodec or AudioCodec, depending on whether a video codec was already found in one of the lines of JSON output processed before. Additionally, the video or audio option that may be used later is set to copy if the video or audio codec is already the wanted codec, h264 or ac3 respectively.
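The decision logic described above can be cross-checked outside of batch. The following is a minimal Python sketch that mirrors it with a real JSON parser. It assumes, as the batch file does, that the first codec_name belongs to the video stream and any later one to an audio stream. The function name choose_options is hypothetical, and the sample input is one of the ffprobe outputs from the question.

```python
import json
from typing import Tuple


def choose_options(ffprobe_json: str) -> Tuple[str, str]:
    """Return (video_option, audio_option) for ffmpeg's -c:v and -c:a."""
    video_codec = None
    video_option = "h264_nvenc"  # default: re-encode video
    audio_option = "ac3"         # default: re-encode audio
    for stream in json.loads(ffprobe_json).get("streams", []):
        codec = stream.get("codec_name")
        if not codec:
            continue
        if video_codec is None:
            # First codec found is treated as the video codec.
            video_codec = codec
            if codec == "h264":
                video_option = "copy"
        elif codec == "ac3":
            # Any later codec is treated as an audio codec.
            audio_option = "copy"
    return video_option, audio_option


sample = '{"programs": [], "streams": [{"codec_name": "h264", "height": 528}, {"codec_name": "aac"}]}'
print(choose_options(sample))
```

A file needs no re-encoding only when both returned options are copy, which is the same condition the batch file tests by comparing the two option strings.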
It is necessary to enable delayed environment variable expansion after processing the output of ffprobe in order to access the values of environment variables defined earlier in the same command block. Read this answer for details about the commands SETLOCAL and ENDLOCAL.
The output is first an empty line with echo(, then the fully qualified file name of the current video file and its current video and audio codec.
The IF condition compares the video and audio option case-sensitively. The two option strings are identical only if the current video file is already h264/ac3 encoded, in which case both environment variables have the value copy. So if the two compared strings are not identical, the video file must be re-encoded with ffmpeg to change the video codec, the audio codec, or both.
The re-encoding of the video file was successful if ffmpeg exits with an exit code that is not greater than or equal to 1, i.e. with value 0. In this case the temporary video file created by ffmpeg is moved over the current video file, overwriting it, provided the current video file is not write-protected by the read-only attribute or NTFS permissions.
These actions result in an update of the file allocation table on FAT32 and exFAT drives, which is the reason why the outer FOR runs DIR to get the list of video file names into memory before the loop iterations.
The environment variable FilesEncoded is incremented by one if the original video file was really replaced successfully by a re-encoded version.
If the temporary video file created by ffmpeg still exists after the other command lines because of any error, it is finally deleted.
Finally, after processing all non-hidden *.mkv and *.mp4 files, a summary is output using the two counter environment variables, and the initial environment is restored before batch file execution is halted, so that all the output can be seen when the batch file was started with a double click.
To understand the commands used and how they work, open a command prompt window, execute the following commands there, and carefully read all help pages displayed for each command.
del /?
dir /?
echo /?
endlocal /?
for /?
if /?
move /?
pause /?
set /?
setlocal /?

Can't get batch to remove a weird double quote character from a string

In our company, SAP data is exported into a text file. Sometimes the SAP users type the double quote character as ” and not as ". When it gets exported, it looks weird in Notepad++ and when the batch file echoes it. I would like to remove this character when I'm running my batch file for analysis. However, I don't know how to do this.
You can find a sample text file here:
https://1drv.ms/t/s!At4HWeqiFYNvh-d-1vpV9cLWvsXQug
The code I am using is:
@echo off
setlocal EnableDelayedExpansion
Set FileLocation=C:\desktop\test.txt
FOR /f "delims=" %%a IN ('findstr /I "ABC" %FileLocation%%') DO (
Set FileLine=%%a
echo !FileLine!
Set RemoveChar=!FileLine:”=!
Set RemoveChar=!RemoveChar:ö=!
Echo !RemoveChar!
)
Thank you in advance!
Late answer, but I found a solution.
By changing the code page, it is possible for both Notepad++ and the command prompt to process the text string properly.
First, save the batch file with UTF-8 encoding. This can be done by choosing "UTF-8" in the encoding menu.
Second, add chcp 65001 to the beginning of the script. This allows the command processor to interpret the characters as UTF-8.
Here is the fixed code:
@echo off
CHCP 65001
setlocal EnableDelayedExpansion
Set FileLocation=C:\desktop\test.txt
FOR /f "delims=" %%a IN ('findstr /I "ABC" %FileLocation%') DO (
Set FileLine=%%a
echo !FileLine!
Set RemoveChar=!FileLine:”=!
Set RemoveChar=!RemoveChar:ö=!
Echo !RemoveChar!
)

grep specific string from a file and store the result in another file using bat or vbs

I have the below contents in file1
Windows Registry Editor Version 5.00
[HKEY_CLASSES_ROOT\Applications\vlc.exe\shell\Open\command]
@="\"C:\\Program Files (x86)\\VideoLAN\\VLC\\vlc.exe\" --started-from-file \"%1\""
I need C:\\Program Files (x86)\\VideoLAN\\VLC\\vlc.exe string to be copied from file1 to file2.
As an end result, file2 should contain
C:\\Program Files (x86)\\VideoLAN\\VLC\\vlc.exe
How can we achieve this using a bat file or a vbs ? Please share your thoughts. Thanks!
@echo off
for /f usebackq^ tokens^=^3^ delims^=^" %%a in ("file1") do >"file2" echo %%a
Using the double quote as delimiter, this reads file1, splits each line into tokens, and sends the third token of the line to file2.
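To see which piece of the line ends up as the third token, the splitting can be roughly emulated outside of batch. This Python sketch is only an approximation of how FOR /F tokenizes, under the assumption that runs of delimiters are treated as one and empty pieces are dropped. The function name for_f_tokens is hypothetical, and the input is the registry line from the question.

```python
def for_f_tokens(line: str, delims: str = '"') -> list:
    """Split on delimiter characters and drop empty pieces."""
    tokens, current = [], ""
    for ch in line:
        if ch in delims:
            if current:
                tokens.append(current)
            current = ""
        else:
            current += ch
    if current:
        tokens.append(current)
    return tokens


registry_line = r'@="\"C:\\Program Files (x86)\\VideoLAN\\VLC\\vlc.exe\" --started-from-file \"%1\""'
tokens = for_f_tokens(registry_line)
third = tokens[2] if len(tokens) >= 3 else None
print(third)
```

Under this emulation the third token still carries the backslash that escaped the closing quote, so the value written to file2 may need that trailing backslash trimmed.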
@ECHO OFF
SETLOCAL
FOR /f "tokens=1*delims==" %%a IN (q21568377.txt) DO IF NOT "%%b"=="" FOR /f "tokens=1,2delims=:-" %%c IN (%%b) DO SET var1=%%c&SET var2=%%d
SET var=%var1:~-1%:%var2:~0,-3%
ECHO %var%
GOTO :EOF
I used a file named q21568377.txt for my testing.
Output simply shown on screen. Redirect to file if that's your wish.

In a batch file, set variables from lines in a file containing spanish characters

I found on this site this useful script to assign values to variables from lines in a txt file.
@echo off
setlocal ENABLEDELAYEDEXPANSION
set vidx=0
for /F "tokens=*" %%A in (caplist.txt) do (
SET /A vidx=!vidx! + 1
set var!vidx!=%%A
echo %%A
)
set var > test.txt
pause
It works fine, but the lines in my text file contain Spanish characters like ñ, é, etc.
The lines in test.txt are OK, but the echo in the for loop displays exotic characters.
BTW, I get the same effect with a simple Type command on the file.
Can someone help?
Many thanks
That's because the terminal window displaying the output of cmd.exe is using a different code page than the one the characters in caplist.txt are encoded in.
They are probably encoded using either ISO 8859-1 or UTF-8.
Try chcp 1252 (the Windows code page closest to ISO 8859-1) at your terminal to see if this fixes your problem. If the file is UTF-8, try chcp 65001 instead.
If not, dump the text file in hexadecimal, figure out the encoding used, and set the terminal code page accordingly.
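The hex dump check can be sketched in a few lines of Python. This is a rough heuristic rather than a reliable detector: it assumes the file is either UTF-8 or a single-byte Western European encoding, and the names hex_dump, guess_encoding and CHCP_FOR are hypothetical.

```python
def hex_dump(data: bytes) -> str:
    """Show the raw bytes so multi-byte sequences are visible."""
    return " ".join(f"{b:02x}" for b in data)


def guess_encoding(data: bytes) -> str:
    """Return 'utf-8' if the bytes decode cleanly as UTF-8, else 'cp1252'."""
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "cp1252"


# Code page numbers to pass to chcp for each guessed encoding.
CHCP_FOR = {"utf-8": 65001, "cp1252": 1252}

for sample in ("ñ".encode("utf-8"), "ñ".encode("cp1252")):
    encoding = guess_encoding(sample)
    print(hex_dump(sample), encoding, CHCP_FOR[encoding])
```

A file containing only ASCII characters decodes cleanly as UTF-8 as well, so the guess only says something when non-ASCII characters such as ñ or é are present.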
