Convert all CR to CRLF in text file using CMD - batch-file

Is there a way to convert all CRs to CRLFs in a text file?
When I open a text file from Linux server on Windows, all text is displayed in one line, but actually it's a multi line one.
I'd like to perform the conversion in a batch file.
Can anyone advise, please?

Line separators and line terminators have been a source of compatibility friction between systems as long as there has been more than one kind of system and an urge to exchange data. The Wikipedia article on the Newline has a decent overview of the historical context. And, it suggests a variety of solutions to this problem specifically for use on the Unix side or the Windows side.
On the Unix (Linux) side, look for a utility named unix2dos and its close relative dos2unix. These are commonly available, either as a component of a commercial Unix or as open source tools. If available, they are the best answer because they (usually, see your version's man pages for details) are careful about files that are accidentally written with both line endings. In that unfortunate case, a trip through both utilities will usually clean up the file to be internally consistent. In the absence of these convenient commands, many native utilities can be made to do the conversion. For instance, converting DOS CRLF line endings to Unix newlines can be done with the tr command:
$ tr -d '\r' < inputfile > outputfile
But note the caveat: this command assumes that every line is terminated by CRLF (or LFCR) and works by simply deleting every CR character from the input. Any naked CR characters will be lost.
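As an illustration of the "careful about mixed line endings" behaviour described above, here is a minimal Python sketch (not what unix2dos or dos2unix actually do internally). The function name normalize_eol is made up for this example. It treats CRLF, a lone CR and a lone LF each as one line ending, so naked CRs are converted rather than lost; an LF followed by a CR is counted as two endings here.

```python
import re

# Try CRLF first so it counts as one ending, then a lone CR or a lone LF.
_ANY_EOL = re.compile(rb"\r\n|\r|\n")


def normalize_eol(data: bytes, eol: bytes = b"\r\n") -> bytes:
    """Return data with every line ending replaced by eol (hypothetical helper)."""
    # A function is used as the replacement so eol is inserted literally.
    return _ANY_EOL.sub(lambda match: eol, data)


if __name__ == "__main__":
    mixed = b"one\ntwo\r\nthree\rfour"
    print(normalize_eol(mixed))         # every ending becomes CRLF
    print(normalize_eol(mixed, b"\n"))  # every ending becomes LF
```

Working on bytes rather than text keeps the sketch independent of the file's encoding, as long as that encoding is ASCII-compatible (an assumption).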
On the DOS and Windows side, it used to be a lot bleaker. Ports of unix2dos and dos2unix certainly exist, for instance they are included in the much larger Cygwin tools that provide a complete unix emulation on a Windows machine. But a solution using only built-in features was hard to find.
Modern Windows (probably since Windows XP), however, is better. There, the built-in FIND command is much less touchy about choice of line terminator than it used to be, and can be used to do the required conversion from Unix line endings to DOS endings. The Wiki page cited above gives this recipe:
C:\...> TYPE filename.u | FIND "" /V >filename.txt
Experimentation shows that this works as well, but it may not give identical results for unknown reasons:
C:\...> FIND "" /V <filename.u >filename.txt
In both cases, you create a copy of the file with the changed line endings. It would probably not be recommended to change the files in place.
I'll mention one other approach that always seems tempting on paper. When you use Samba to provide the file system share on the Linux server for mounting by Windows, there is a configuration option you can set for the share that mounts it in "text mode". Shares mounted in "text mode" automatically have line endings converted. If it works for you, that is probably the cleanest possible solution. Both systems use their preferred text file format, and neither has to fuss about it. But test carefully, this solution is full of edge cases and pitfalls. Most importantly, don't expect binary files on a text mode file system mount point to read correctly. They often will, but not necessarily always.

type inputfile | find /v "" > outputfile
That should do it. type reads the input file and pipes it to find, whose parameters match every line; the result is redirected to the output file. In the process, LF is converted to CRLF.

A possible though quite cumbersome way is to use CertUtil.exe, an executable that has shipped with Windows versions after Windows XP, if I remember correctly. Here is a possible script (let us call it conv-eol.bat; see all the explanatory rem remarks in the code):
@echo off
setlocal EnableExtensions DisableDelayedExpansion
rem // Define constants here:
set "_IFILE=%~1" & rem // (input file; first command line argument)
set "_OFILE=%~2" & rem // (output file; second command line argument)
set "_IEOL=0d" & rem // (incoming line-breaks; `0d` or `0a`)
set "_OEOL=0d 0a" & rem // (outgoing line-breaks; `0d`, `0a`, `0d 0a`, ``)
set "_TFILE1=%TEMP%\%~n0_%RANDOM%.hex" & rem // (first temporary file)
set "_TFILE2=%TEMP%\%~n0_%RANDOM%.tmp" & rem // (second temporary file)
rem // Verify input file:
< "%_IFILE%" rem/ || exit /B
rem // Convert input file to hexadecimal values (first temporary file):
CertUtil -f -encodehex "%_IFILE%" "%_TFILE1%" 4 > nul
rem // Write to second temporary file:
> "%_TFILE2%" (
setlocal EnableDelayedExpansion
rem // Read first temporary file line by line:
for /F "usebackq delims=" %%L in ("!_TFILE1!") do (
rem /* Store current line (hex. values), then replace line-breaks
rem using the given line-break codes and return result: */
set "LINE=%%L" & echo(!LINE:%_IEOL%=%_OEOL%!
)
endlocal
)
rem // Verify output file:
> "%_OFILE%" rem/ || exit /B
rem // Convert second temporary file back to text into output file:
CertUtil -f -decodehex "%_TFILE2%" "%_OFILE%" 4 > nul
rem // Clean up temporary files:
del "%_TFILE1%" "%_TFILE2%"
endlocal
exit /B
Provide the input file as the first command line argument and the output file as the second one to the script (they may even be the same file):
conv-eol.bat "input-file.txt" "output-file.txt"
The input and output line-breaks must be specified as hexadecimal character codes, where 0d represents the carriage-return (CR) and 0a the line-feed (LF) character.
The following table tells how to set the variables _IEOL and _OEOL at the top of the script for different line-break style conversion tasks:
from \ to           || Mac (CR)            || Unix/Linux (LF)     || DOS/Windows (CR+LF)
Mac (CR)            || ################### || _IEOL=0d, _OEOL=0a  || _IEOL=0d, _OEOL=0d 0a
Unix/Linux (LF)     || _IEOL=0a, _OEOL=0d  || ################### || _IEOL=0a, _OEOL=0d 0a
DOS/Windows (CR+LF) || _IEOL=0a, _OEOL=    || _IEOL=0d, _OEOL=    || ###################
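For readers who want to check what a given pair of settings from the table does, the substitution can be sketched in a few lines of Python. This is only an illustration of the idea, not a port of the script. The function name convert_eol is made up, and its two string arguments mirror the _IEOL and _OEOL values.

```python
def convert_eol(data: bytes, ieol_hex: str, oeol_hex: str) -> bytes:
    """Replace the bytes named by ieol_hex with the bytes named by oeol_hex.

    Both arguments are space-separated hex codes such as "0a" or "0d 0a";
    an empty string stands for "no bytes", as in the DOS/Windows row.
    """
    return data.replace(bytes.fromhex(ieol_hex), bytes.fromhex(oeol_hex))


if __name__ == "__main__":
    # Unix/Linux (LF) to DOS/Windows (CR+LF): _IEOL=0a, _OEOL=0d 0a
    print(convert_eol(b"a\nb\n", "0a", "0d 0a"))
    # DOS/Windows (CR+LF) to Unix/Linux (LF): _IEOL=0d, _OEOL=(empty)
    print(convert_eol(b"a\r\nb\r\n", "0d", ""))
```

Note that a plain substitution like this does not look at context: running the LF to CR+LF conversion on data that already contains CR+LF turns each of those into CR CR LF.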

cat file | perl -pe 's/\R/\n/g'

The following batch fragment does the trick:
del outputfile
for /f "delims=" %%x in (inputfile) do echo %%x>>outputfile
Its advantage is not relying on the find program, which is rather temperamental (hangs or doesn't work on some machines where I tested the other solutions).

In Windows XP and earlier, you can convert a text file to CRLF simply by opening and saving it in Dos Edit (or Windows Edit). Unfortunately, the Edit program was removed in Vista.

One ridiculous way. Works with the following scenarios:
Text file with a CR at end of every line.
Text file with a repeating set of CR at end of line followed by an empty line with CRLF. Good luck!
Open the file in Notepad++ (free app) and set View -> All Characters.
IF all lines end in CR then:
Open in Microsoft Wordpad - NOT - Word and save the file in MSDOS-Format.
ELSE IF lines end in CR followed by a blank line ending with CRLF then
remove the blank lines first with Notepad++. Go to Edit -> Line Operations -> Remove empty lines and save the file.
Open the file in Microsoft Wordpad and save in MSDOS-Format.
END IF

Related

Use content of a file in a string

I have a file (let's call it version.txt) that contains a version number and some text:
v5.02
Some text explaining
where and how this
number is used
Based on this answer, I use
set /p version=<version.txt
to store the first line of the file in the version variable. Now I'm trying to write a batch script that operates on folders that contain this version number in their name. However, I get unexpected results because something seems to go wrong when I insert the variable in a path. For example, this script
@set /p version=<version.txt
@echo C:\some\folder\%version%\some\file.exe
prints
C:\some\folder\v5.02
instead of
C:\some\folder\v5.02\some\file.exe
What's going on? I have a feeling there are hidden characters of some sort at the end of the text in the variable, because setting the variable by hand to a constant in the script works.
Edit: I'm using Windows 10 with Notepad++ as my editor, if it helps.
I can only replicate your issue when version.txt uses Unix line endings (LF) instead of Windows line endings (CRLF). for /f is immune to this issue:
for /f "delims=" %%a in (version.txt) do set "version=%%a" & goto :skip
:skip
echo C:\some\folder\%version%\some\file.exe
goto :skip breaks the loop after reading the first line.
Since everything I tried didn't seem to work, the solution I found in the end is to call the batch script from a Python script. The Python script reads the first line of the version file and passes it as an argument to the batch script. Out of context, it is a bit of an inelegant solution, but in my case the batch script was already called by a Python script, so it's not that terrible.
Here is a minimal example:
version.txt
v5.02
Some text explaining
where and how this
number is used
script.bat
@echo C:\some\folder\release\%1\some\file.exe
script.py
import os
with open("version.txt") as f:
    version = f.readline().rstrip()
os.system("cmd /c script.bat %s" % version)
Edit: Following Stephan's comment, I tried to change the line ending in the text file from LF to CRLF and it indeed solves the problem. However, since I don't really have control over everything that writes in that file, the solution above remains the most feasible in my case.
Edit 2: Stephan's answer (with the for loop) is actually a better solution than this one since it avoids having to transfer part of the work to the calling Python script.

Odd behavior with long file names and Win10 console VT-100 escape sequences

I have a Windows 10 batch file that runs ffprobe on all files in a directory. Each file name is displayed on the screen on a specific line. When the next file is processed the previous file name is erased and the next file name is written on the same line (so the file names don't run down the screen). I've noticed that for file names greater than 120 characters in length the VT100 escape sequences I'm using break to some extent. Here's the portion of my code that is applicable:
echo Checking files......
for %%a in ("*.*") do (
set filename=%%a
for /f "tokens=1,2,3 delims=," %%a in ('ffprobe [...]') do (
set codec=%%b
)
set /p="%ESC%[1G%ESC%[2K%%~a"<nul
)
set /p="%ESC%[A%ESC%[1G%ESC%[2K"<nul
(I've edited the ffprobe portion just so everything is more readable. ffprobe has nothing to do with the issue I'm seeing).
The escape sequences normally display the current file name; once ffprobe is finished with that file, the cursor is moved to the first position on that line and the line (file name) is deleted. After the for loop the cursor has advanced one line down, so the sequence [A is used to move it back up one line so that all the file names display on the same line.
This has been working fine for months, but I just noticed on a very long file name that was 124 characters the file name is not erased and the next file name is displayed on the following line and the batch file runs correctly from there with that long file name remaining on the screen and the rest of my script runs below it. The way things should work is that each file name is deleted from the screen and none should be shown once this section of the batch file completes.
I deleted characters in the file name to see what number of characters would result in the escape sequences processing correctly and apparently there's a 120 character max.
Is this known behavior? This really isn't a major problem, but it is kind of annoying. Is there any way to get around this?
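No answer is recorded here. One plausible but unconfirmed explanation is that the console window is 120 columns wide, so a longer name wraps onto a second row and the erase sequence only clears the row the cursor ends up on. Under that assumption, a possible workaround is to shorten the displayed name to the console width before printing it. The Python sketch below shows the idea; fit_to_width is a made-up helper, and it counts characters rather than display columns, which is only right for single-width characters.

```python
import shutil


def fit_to_width(text: str, width: int) -> str:
    """Shorten text so it occupies at most width - 1 characters (hypothetical helper)."""
    limit = max(width - 1, 1)  # leave the last column free to avoid wrapping
    if len(text) <= limit:
        return text
    if limit < 4:
        return text[:limit]
    return text[: limit - 3] + "..."


if __name__ == "__main__":
    columns = shutil.get_terminal_size().columns
    name = "x" * 124 + ".mkv"
    # ESC[1G moves to column 1 and ESC[2K erases the line, as in the question.
    print("\x1b[1G\x1b[2K" + fit_to_width(name, columns), end="", flush=True)
    print()
```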

How to edit/delete an entire section or just a line from an initialization file with script?

I have been tasked to edit multiple configuration files on my company's proprietary software (Windows OS). These configuration files are in INI file format (config.ini) which structures are composed of sections, properties, and values. The requirement is to:
Search for the section name and remove all corresponding properties on config.ini file.
Example: Remove entire section [RegistryService] and its properties.
[DummyProcessor]
CCLTsVersion=112
ETransformsDescription=
ETransformsVersion=0.0.0.0
LWTs=21.10.25
Transform=10.2.2.0
[RegistryService]
LoadRegistry=1
Delete an entry from a different configuration file (not limited to section):
Example: Delete just line with LoadRegistryManager=1 entry from:
[DummyService]
InitInstructions=0
ESAPsVersion=
ESVersion=10.2
LoadRegistryManager=1
Can I use Windows command line batch scripting to make these edits?
Please provide an example. I am more comfortable with Linux commands and not as familiar with Windows batch scripting, aside from creating/deleting files and folders.
The question How can you find and replace text in a file using the Windows command-line environment? has lots of answers. One of those answers links to JREPL.BAT written by Dave Benham which is a batch file / JScript hybrid making it possible to use regular expressions as supported by JScript from Windows command line.
The requirements can be fulfilled with jrepl.bat and following batch file both stored together in same directory:
@echo off
call "%~dp0jrepl.bat" "^\[RegistryService\][^[]+" "" /M /F config.ini /O -
call "%~dp0jrepl.bat" "^LoadRegistryManager=[01].*\r\n" "" /X /F config.ini /O -
The regular expression search string ^\[RegistryService\][^[]+ searches case-sensitive for [RegistryService] at beginning of a line and matches everything after this string up to next opening square bracket [ or end of file.
Note: This search string as is can't be used if one of the entries in section [RegistryService] contains by chance an opening square bracket.
The regular expression search string ^LoadRegistryManager=[01].*\r\n searches case-sensitive for a line starting with LoadRegistryManager= followed by the digit 0 or 1, then zero or more characters other than newline characters, then a carriage return and line-feed.
The replace string for both replacements is an empty string, which deletes everything matched by the search expression.
The first replace requires Multi-line mode enabled with /M. The second replace requires eXtended ASCII and escape sequences enabled with /X.
For understanding the used commands and how they work, open a command prompt window, execute there the following commands, and read entirely all help pages displayed for each command very carefully.
call /?
echo /?
jrepl.bat /?
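To see what the two search expressions match without running JREPL.BAT, they can be tried with Python's re module. This is only a sketch of the same regular-expression idea, not part of the jrepl.bat solution. The function names are made up, and remove_entry is slightly broader than the expression above because it accepts any value after the equals sign and either LF or CRLF line endings.

```python
import re


def remove_section(text: str, section: str) -> str:
    """Delete [section] and everything up to the next '[' or the end of the text."""
    pattern = r"^\[" + re.escape(section) + r"\][^\[]+"
    return re.sub(pattern, "", text, flags=re.MULTILINE)


def remove_entry(text: str, key: str) -> str:
    """Delete every line of the form key=value, including its line ending."""
    pattern = r"^" + re.escape(key) + r"=.*(?:\r?\n|\Z)"
    return re.sub(pattern, "", text, flags=re.MULTILINE)


if __name__ == "__main__":
    config = "[DummyProcessor]\nLWTs=21.10.25\n[RegistryService]\nLoadRegistry=1\n"
    print(remove_section(config, "RegistryService"))
    other = "[DummyService]\nESVersion=10.2\nLoadRegistryManager=1\n"
    print(remove_entry(other, "LoadRegistryManager"))
```

As with the original search string, remove_section stops early if a value inside the section happens to contain an opening square bracket.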

Dos Batches: write to files without a line ending

I have a situation while writing a DOS batch script. A tool I am using to calculate the CRC (checksum) of a text string requires the text to be in a file. The data I am trying to get the CRC for is a filename, but when I use a batch file to put this filename into a text file, the batch script naturally adds a line ending (CR/LF) and a blank line at the end. As this causes the CRC to be wrong, it is a problem.
Is there any way to get a batch script to write to a text file without appending a line ending, i.e. to output a single unterminated line to a file?
-K.Barad
<nul set /p ".=text" > file
It's faster and safer than echo.|set /P ="text" > file
The nul redirection is faster than a pipe with echo (by the way, echo. can fail).
The style of the quotes also allows quotes to be output.
But there are always restrictions!
Vista and Win7 have a "feature" that suppresses leading spaces, tabs and CRs.
XP can output text with leading spaces and the like.
And it's not possible to begin the output text with an equal sign (results in a syntax error)
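If a scripting language is available alongside the batch file, the same result can be had without the set /p restrictions. The Python sketch below is an alternative, not a batch solution. It writes exactly the given text with no line ending and uses zlib.crc32 to show that a trailing CR/LF changes the checksum. The helper name write_exact and the UTF-8 encoding are assumptions, and the CRC tool in question may use a different CRC variant or code page.

```python
import os
import tempfile
import zlib


def write_exact(path: str, text: str, encoding: str = "utf-8") -> bytes:
    """Write text to path with nothing appended and return the bytes written."""
    payload = text.encode(encoding)
    with open(path, "wb") as handle:  # binary mode: no newline translation
        handle.write(payload)
    return payload


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "name.txt")
    written = write_exact(target, "report.pdf")
    print(os.path.getsize(target), "bytes written")
    print("CRC without line ending:", hex(zlib.crc32(written)))
    print("CRC with CR/LF appended:", hex(zlib.crc32(written + b"\r\n")))
```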
You could
echo.|set /P ="text" > file
source
Or directly pipe the text to the command line:
echo "text" | checksum_program.exe
edit:
if you're using CRC32DOS, then you can use its command line option -c to ignore CRs.

Strip drive letter

By using the command from a folder in D: drive
for /f "delims=" %%d in ('cd') do set pathdrv=%%d
echo %pathdrv%
I get "d:\some folder". I want to write batch commands to create an autorun file in the root of the drive. Please help me strip the drive letter "d:" so that the output is "\some folder". What extra change do I need to make to strip the leading "\" as well?
Short answer: Use the substring syntax to strip the first two characters from the %cd% pseudo-variable:
%cd:~2%
To remove the first backslash too:
%cd:~3%
This reliably works even with Unicode paths when the console window is set to raster fonts.
Longer answer, detailing some more options (none of which work well enough):
For arguments to the batch file you can use the special syntax %~p1, which gives you the path of the first argument given to a batch file (see this answer).
This doesn't work the same way with environment variables but there are two tricks you can employ:
Use a subroutine:
call :foo "%cd%"
...
goto :eof
:foo
set result=%~p1
goto :eof
Subroutines can have arguments, just like batch files.
Use for:
for %%d in ("%cd%") do set mypath=%%~pd
However, neither variant works when:
The console is set to "Raster fonts" instead of a TrueType font such as Lucida Console or Consolas.
The current directory contains Unicode characters that have no representation in the current legacy codepage (for Western cultures, CJK is a good example of text that doesn't fit). Remember that in this case you'll only get question marks instead of the characters.
The problem is that while environment variables can hold Unicode just fine, you'll run into trouble once you try to build a command line that sets them. Every option detailed above relies on output of some kind before the commands are executed, and in that output Unicode isn't preserved but replaced by ?. The only exception is the substring variant at the very start of this answer, which retains Unicode characters in the path even with raster fonts.
Drive letter:
%CD:~0,1%
Full drive name (incl. colon):
%CD:~0,2%
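For comparison, the same split can be sketched in Python with the standard ntpath module, which applies Windows path rules on any platform. This is an illustration only, not a replacement for the %cd:~2% substring syntax above. The helper name strip_drive is made up.

```python
import ntpath


def strip_drive(path: str, keep_leading_separator: bool = True) -> str:
    """Return path without its drive part, like %cd:~2% (or %cd:~3%)."""
    drive, rest = ntpath.splitdrive(path)  # "d:\\some folder" -> ("d:", "\\some folder")
    if not keep_leading_separator:
        rest = rest.lstrip("\\/")
    return rest


if __name__ == "__main__":
    print(strip_drive(r"d:\some folder"))
    print(strip_drive(r"d:\some folder", keep_leading_separator=False))
    print(ntpath.splitdrive(r"d:\some folder")[0])  # the drive name with colon
```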
