Remove lines in text files that have "." (Batch File)

I have a text file that looks like this.
/var/www/xxx/html/TEST/VIDEOS/video3.mp4
/var/www/xxx/html/TEST/video_folder_1/cideo.mp4
/var/www/xxx/TEST/video_folder_1/sadasd
/var/www/xxx/html/TEST/video_folder_2/asdsadasdasdsadsadsadsadas
/var/www/xxx/html/TEST/video_folder_2/cideo2.mp4
/var/www/xxx/html/TEST/video_folder_2/sadsada
I would like it to look like this:
/var/www/xxx/TEST/video_folder_1/sadasd
/var/www/xxx/html/TEST/video_folder_2/asdsadasdasdsadsadsadsadas
/var/www/xxx/html/TEST/video_folder_2/sadsada
The idea is to remove any line that ends in a file extension, e.g. .mp4 in this case.
So I guess it would look at the last 4 characters of each line and check whether they start with a "."
If they do, remove the line.

In batch you should be able to do this in many ways:
findstr /V /L "." theFile.txt
As Aacini suggests, which checks if the line contains a . and works fine when tested.
If you want to use a regular expression:
findstr /V /R "\....$" theFile.txt
This does exactly what you asked for: it checks whether a line ends with a . followed by exactly three characters.
Lastly what I would recommend is using this:
findstr /V /R "\.[a-z0-9]*$" theFile.txt
which checks whether the line ends with a . followed by any number of letters or digits, so it also covers extensions that are not exactly three characters long, such as 4-letter ones.
I have tested each of these and they all work fine.
I really don't know why Serenity insists you use VBScript, which is no doubt a great language, but for something this simple batch is much simpler.
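For comparison outside of batch, the same filter can be sketched in Python. This is only an illustration and not part of the original answers: the function name lines_without_extension is hypothetical, and it assumes that "has an extension" means the line ends with a dot followed by one or more ASCII letters or digits.

```python
import re

# A dot followed by one or more letters/digits at the very end of the line.
# The character class excludes "/", so the dot must be in the last path part.
EXTENSION_AT_END = re.compile(r"\.[A-Za-z0-9]+$")


def lines_without_extension(lines):
    """Return only the lines that do not end in something like '.mp4'."""
    kept = []
    for line in lines:
        stripped = line.rstrip("\r\n")  # ignore the line terminator, if any
        if not EXTENSION_AT_END.search(stripped):
            kept.append(stripped)
    return kept


if __name__ == "__main__":
    sample = [
        "/var/www/xxx/html/TEST/VIDEOS/video3.mp4",
        "/var/www/xxx/TEST/video_folder_1/sadasd",
        "/var/www/xxx/html/TEST/video_folder_2/cideo2.mp4",
        "/var/www/xxx/html/TEST/video_folder_2/sadsada",
    ]
    for line in lines_without_extension(sample):
        print(line)
```

Unlike the plain "contains a dot" test, this keeps a line whose dot appears only in a directory name.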

This is a VBS regex replace program. Far more powerful than FindStr. It works globally, so you can target line endings.
This is help from a similar line based, not global, program of mine. The point is to show sample RegEx expressions.
To extract all section headers, ie, lines without an equal sign
filter filter iv "=" < "%systemroot%\win.ini"
To extract all section headers starting with a lower case letter
filter filter n "\[[a-z].+" < "%systemroot%\win.ini"
This shows the caret escaping an opening bracket for CMD.EXE and the backslash escaping the opening bracket for the RegEx engine
filter filter n "\^(" < "%systemroot%\win.ini"
This shows searching for a quote character
filter filter n "\x22" < "%systemroot%\win.ini"
Use $1, $2, $..., $n to specify sub matches in the replace string
filter replace i "=" "No equal sign" < "%systemroot%\win.ini"
This searches for text within square brackets and replaces the line with cat followed by the text within brackets
Filter replace i "^\[^(.*^)\]" "cat$1" < %windir%\win.ini
This searches for any text and prints from the 11th character to the end of the line.
Filter replace i "^.{10}^(.*^)$" "$1" < %windir%\win.ini
This searches a CSV file and prints the second and fourth field
Filter replace i "^.+,^(.+^),.+,^(.+^)$" "$1,$2" < csv.txt
The script.
On Error Resume Next
Set ShellApp = CreateObject("Shell.Application")
ReportErrors "Creating Shell.App"
set WshShell = WScript.CreateObject("WScript.Shell")
ReportErrors "Creating Wscript.Shell"
Set objArgs = WScript.Arguments
ReportErrors "Creating Wscript.Arg"
Set regEx = New RegExp
ReportErrors "Creating RegEx"
Set fso = CreateObject("Scripting.FileSystemObject")
ReportErrors "Creating FSO"
If objArgs.Count = 0 then
wscript.echo "No parameters", 16, "Serenity's ReplaceRegExp"
ReportErrors "Help"
ElseIf objArgs.Count = 1 then
wscript.echo "Only one parameter", 16, "Serenity's ReplaceRegExp"
ReportErrors "Help"
ElseIf objArgs.Count = 2 then
Set srcfile = fso.GetFile(objArgs(0))
ReportErrors "srcFile"
If err.number = 0 then Set TS = srcFile.OpenAsTextStream(1, 0)
If err.number <> 0 then
wscript.echo err.description & " " & srcFile.path, 48, "Serenity's Search"
err.clear
else
ReportErrors "TS" & " " & srcFile.path
Src=ts.readall
If err.number = 62 then
err.clear
else
ReportErrors "ReadTS" & " " & srcFile.path
regEx.Pattern = objArgs(1)
regEx.IgnoreCase = True
regEx.Global = True
If regEx.Test(Src) = True then
wscript.echo "Found in " & srcfile.path, 64, "Serenity's Search"
End If
End If
End If
ReportErrors "Check OK" & " " & srcFile.path
Elseif objArgs.count = 3 then
Set srcfile = fso.GetFile(objArgs(0))
ReportErrors "srcFile"
If err.number = 0 then Set TS = srcFile.OpenAsTextStream(1, 0)
If err.number <> 0 then
wscript.echo err.description & " " & srcFile.path, 48, "Serenity's Search"
err.clear
else
ReportErrors "TS" & " " & srcFile.path
Src=ts.readall
If err.number = 62 then
err.clear
else
ReportErrors "ReadTS" & " " & srcFile.path
regEx.Pattern = objArgs(1)
regEx.IgnoreCase = True
regEx.Global = True
NewSrc= regEx.Replace(Src, objArgs(2))
If NewSrc<>Src then
wscript.echo "Replacement made in " & srcfile.path, 64, "Serenity's Search"
TS.close
Set TS = srcFile.OpenAsTextStream(2, 0)
ts.write newsrc
ReportErrors "Writing file"
End If
End If
End If
ReportErrors "Check OK" & " " & srcFile.path
Else
wscript.echo "Too many parameters", 16, "Serenity's ReplaceRegExp"
ReportErrors "Help"
ReportErrors "All Others"
End If
Sub ReportErrors(strModuleName)
If err.number<>0 then wscript.echo "An unexpected error occurred. This dialog provides details on the error." & vbCRLF & vbCRLF & "Error Details " & vbCRLF & vbCRLF & "Script Name" & vbTab & Wscript.ScriptFullName & vbCRLF & "Module" & vbtab & vbTab & strModuleName & vbCRLF & "Error Number" & vbTab & err.number & vbCRLF & "Description" & vbTab & err.description, vbCritical + vbOKOnly, "Something unexpected"
Err.clear
End Sub
RegEx Reference
Regular Expressions Reference
From the Windows Vista SDK, VBScript Language Reference © Microsoft Corp 2006
Character Description
\ Marks the next character as either a special character or a literal. For example, "n" matches the character "n". "\n" matches a newline character. The sequence "\\" matches "\" and "\(" matches "(".
^ Matches the beginning of input.
$ Matches the end of input.
* Matches the preceding character zero or more times. For example, "zo*" matches either "z" or "zoo".
+ Matches the preceding character one or more times. For example, "zo+" matches "zoo" but not "z".
? Matches the preceding character zero or one time. For example, "a?ve?" matches the "ve" in "never".
. Matches any single character except a newline character.
(pattern) Matches pattern and remembers the match. The matched substring can be retrieved from the resulting Matches collection, using Item [0]...[n]. To match parentheses characters ( ), use "\(" or "\)".
x|y Matches either x or y. For example, "z|wood" matches "z" or "wood". "(z|w)oo" matches "zoo" or "wood".
{n} n is a nonnegative integer. Matches exactly n times. For example, "o{2}" does not match the "o" in "Bob," but matches the first two o's in "foooood".
{n,} n is a nonnegative integer. Matches at least n times. For example, "o{2,}" does not match the "o" in "Bob" and matches all the o's in "foooood." "o{1,}" is equivalent to "o+". "o{0,}" is equivalent to "o*".
{ n , m } m and n are nonnegative integers. Matches at least n and at most m times. For example, "o{1,3}" matches the first three o's in "fooooood." "o{0,1}" is equivalent to "o?".
[ xyz ] A character set. Matches any one of the enclosed characters. For example, "[abc]" matches the "a" in "plain".
[^ xyz ] A negative character set. Matches any character not enclosed. For example, "[^abc]" matches the "p" in "plain".
[ a-z ] A range of characters. Matches any character in the specified range. For example, "[a-z]" matches any lowercase alphabetic character in the range "a" through "z".
[^ m-z ] A negative range of characters. Matches any character not in the specified range. For example, "[^m-z]" matches any character not in the range "m" through "z".
\b Matches a word boundary, that is, the position between a word and a space. For example, "er\b" matches the "er" in "never" but not the "er" in "verb".
\B Matches a non-word boundary. "ea*r\B" matches the "ear" in "never early".
\d Matches a digit character. Equivalent to [0-9].
\D Matches a non-digit character. Equivalent to [^0-9].
\f Matches a form-feed character.
\n Matches a newline character.
\r Matches a carriage return character.
\s Matches any white space including space, tab, form-feed, etc. Equivalent to "[ \f\n\r\t\v]".
\S Matches any nonwhite space character. Equivalent to "[^ \f\n\r\t\v]".
\t Matches a tab character.
\v Matches a vertical tab character.
\w Matches any word character including underscore. Equivalent to "[A-Za-z0-9_]".
\W Matches any non-word character. Equivalent to "[^A-Za-z0-9_]".
\num Matches num, where num is a positive integer. A reference back to remembered matches. For example, "(.)\1" matches two consecutive identical characters.
\ n Matches n, where n is an octal escape value. Octal escape values must be 1, 2, or 3 digits long. For example, "\11" and "\011" both match a tab character. "\0011" is the equivalent of "\001" & "1". Octal escape values must not exceed 256. If they do, only the first two digits comprise the expression. Allows ASCII codes to be used in regular expressions.
\xn Matches n, where n is a hexadecimal escape value. Hexadecimal escape values must be exactly two digits long. For example, "\x41" matches "A". "\x041" is equivalent to "\x04" & "1". Allows ASCII codes to be used in regular expressions.

If you want to eliminate lines with "." from the output created in your previous question, then a much simpler solution is to insert a test in that code:
@echo off
setlocal EnableDelayedExpansion
(for /F "delims=" %%a in (test.txt) do (
set "line=%%a"
if "!line:~0,1!" equ "/" (
set "header=%%a"
) else (
if "!line:.=!" equ "!line!" echo !header:~0,-1!/%%a
)
)) > testnew.txt

Related

Repeat loop with if conditions for first and last time

I have a script that I made that works fine but I have to make some very minor edits to the output. Instead I'd like to just do it correctly.
on run {input, parameters}
set the formatted to {}
set listContents to get the clipboard
set delimitedList to paragraphs of listContents
repeat with listitem in delimitedList
set myVar to "#\"" & listitem & "\"," & (ASCII character 10)
copy myVar to the end of formatted
end repeat
display dialog formatted as string
return formatted as string
end run
I'd like to prepend to the first item slightly differently and append to the last one a little differently.
I tried the following, but the script is not right.
repeat with n from 1 to count of delimitedList
-- not sure how to if/else n == 0 or delimitedList.count
end repeat
There is a more efficient way: text item delimiters. They can insert the comma and the linefeed character between the list items.
on run {input, parameters}
set the formatted to {}
set listContents to get the clipboard
set delimitedList to paragraphs of listContents
repeat with listitem in delimitedList
copy "#" & quote & listitem & quote to the end of formatted
end repeat
set {saveTID, text item delimiters} to {text item delimiters, {"," & linefeed}}
set formatted to formatted as text
set text item delimiters to saveTID
display dialog formatted
return formatted
end run
Side note: ASCII character 10 has been deprecated since macOS 10.5 Leopard; use the constants linefeed, tab (9), return (13), space (32) and quote (34) instead.
In addition to an index or range, the various items of a list can also be accessed by using location parameters such as first, last, beginning, etc (note that AppleScript lists start at index 1). To deal with the first and last items separately, you can do something like:
on run {input, parameters} -- example
set formatted to {}
set delimitedList to paragraphs of (the clipboard)
if delimitedList is not {} then
if (count delimitedList) > 2 then repeat with anItem in items 2 thru -2 of delimitedList
set the end of formatted to "#" & quote & anItem & quote
end repeat
set the beginning of formatted to "#" & quote & "First: " & first item of delimitedList & quote -- or whatever
if rest of delimitedList is not {} then set the end of formatted to "#" & quote & "Last: " & last item of delimitedList & quote -- or whatever
set {tempTID, AppleScript's text item delimiters} to {AppleScript's text item delimiters, "," & linefeed}
set {formatted, AppleScript's text item delimiters} to {formatted as text, tempTID}
end if
display dialog formatted as text
return formatted as text
end run
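The idea shared by both answers (build the items first, then let a delimiter go between them instead of appending it to each one) can be sketched in Python for comparison. This is not AppleScript and not the answerers' code; the function name format_items and the "First: " / "Last: " prefixes are hypothetical and only mirror the example above.

```python
def format_items(items, first_prefix="First: ", last_prefix="Last: "):
    """Quote each item, mark the first and last, and join with ',' + linefeed."""
    if not items:
        return ""
    formatted = []
    last_index = len(items) - 1
    for index, item in enumerate(items):
        text = item
        if index == 0:
            text = first_prefix + text      # special handling for the first item
        elif index == last_index:
            text = last_prefix + text       # special handling for the last item
        formatted.append('#"' + text + '"')
    # The delimiter is placed between items only, so there is no trailing comma.
    return ",\n".join(formatted)


if __name__ == "__main__":
    print(format_items(["alpha", "beta", "gamma"]))
```

A single-item list is treated as "first" only, which matches the behaviour of the AppleScript example.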

Best methods to extract substring into arguments and allow arguments with multiple "main split character" indicated by a character in Lua?

Let's say I have this string
"argument \"some argument\""
which prints out as
argument "some argument"
Now, as an example, let's say I want to split it using the space character as the "main split character", but be able to mark with another character which parts should be kept together as one argument even though they contain spaces. Let's say that character is the quotation mark ". In the end it should extract the arguments like so:
> [1] = "argument"
> [2] = "some argument"
I am able to do this with the code here:
ExtractArgs = function(text,splitKey)
local skip = 0
local arguments = {}
local curString = ""
for i = 1, text:len() do
if (i <= skip) then continue end
local c = text:sub(i, i)
if (c == "\"") and (text:sub(i-1, i-1) ~= "\\") then
local match = text:sub(i):match("%b\"\"")
if (match) then
curString = ""
skip = i + match:len()
arguments[#arguments + 1] = match:sub(2, -2)
else
curString = curString..c
end
elseif (c == splitKey and curString ~= "") then
arguments[#arguments + 1] = curString
curString = ""
else
if (c == splitKey and curString == "") then
continue
end
curString = curString..c
end
end
if (curString ~= "") then
arguments[#arguments + 1] = curString
end
return arguments
end;
print(ExtractArgs("argument \"some argument\"", " "))
So what's the issue?
I am wondering about better ways. The current way doesn't allow me to use " inside an argument, because it is being used for processing.
Here is another way: Extract substring inbetween quotation marks, but skip \" and turn it into " instead in Lua
but I am wondering if there are even better ways.
Ways that would allow me to have something like this:
argument " argument2" "some argument" "some argument with a quotation mark " inside of it" another_argument"
to turn into something like this
> [1] = 'argument'
> [2] = 'argument2"'
> [3] = 'some argument'
> [4] = 'some argument with a quotation mark " inside of it'
> [5] = 'another_argument"'
The example that I made right now might be impossible, because no character really indicates what should be processed and what should not.
But I am looking for better ways to extract substrings as arguments while allowing text that would normally be split into several arguments to be kept as one argument.
So if it weren't for the ", \"some arguments\" would just have been split into "some" and "arguments" instead of "some arguments".
Maybe a method like the one Lua itself uses with ' and " could be a way.
It would probably be impossible to have a perfectly working system that turns the input """ into " as an extracted argument. I would imagine it just extracting "", an empty string.
It would be different if it looked like this: '"'. But then the other question would be how to allow ''' to extract into '. I would imagine it working if it were written as "'". But this is getting too complicated.
I am wondering: is there a better way to extract arguments that allows certain special operations, like keeping multiple arguments as one argument by wrapping them in something, or in any other way?
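One existing convention that covers the cases described above is shell-style quoting, where either ' or " can wrap an argument and a backslash escapes a quote inside double quotes. The sketch below uses Python's standard shlex module to show that behaviour; it is not Lua and is offered only as a reference for the rules, with extract_args being a hypothetical wrapper name. The ambiguous input with an unescaped " in the middle of an argument is still not solvable this way; the quote has to be escaped or wrapped in the other quote kind.

```python
import shlex


def extract_args(text):
    """Split text on whitespace, honouring '...' and "..." quoting.

    Inside double quotes, a backslash can escape a double quote.
    Raises ValueError when a quote is left unclosed.
    """
    return shlex.split(text)


if __name__ == "__main__":
    print(extract_args('argument "some argument"'))
    print(extract_args(r'say "a \"quoted\" word" done'))
    print(extract_args("""'"' "'" """))
```

Wrapping a single " in single quotes yields ", and wrapping a single ' in double quotes yields ', which is the behaviour the question guesses at.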

Getting error "Array is Fixed or Locked" when splitting string

I have this issue with my VBScript file. It reads data from a first file and compares it with values in a second file. If there is a match, the necessary things are done. The code works fine for the first value from the first file, but for the second input an error appears saying that this array is fixed or temporarily locked. I searched the internet and found that the issue is with the fixed size of the array and can be solved with ReDim. I don't know whether that is the correct solution. Here is the complete code. Thanks in advance.
set fs = CreateObject("Scripting.FileSystemObject")
set shell = CreateObject("Wscript.Shell")
set file =fs.OpenTextFile("C:\wamp\www\order.csv",1,false)'Read from orders.csv (Current Orders)
set output =fs.OpenTextFile("C:\wamp\www\completed-orders.csv",8)'Output file. All Completed Order's Number are placed here
Set tempfile = fs.CreateTextFile("C:\wamp\www\temp-order.csv",true) 'Create temporary text files to store the line numbers to be deleted
set p=fs.OpenTextFile("C:\wamp\www\current.csv",1,false)'Read the path to the current Stock Data File
path=p.ReadLine
p.close()
path = Trim(path)
set n=fs.OpenTextFile(path,1,false)'Read the data from it and check for name match and then whether current price is >= order price.
line=0
'format of line in order.csv written from php 'fwrite($file,$company."||".$price."||".$oid."||".$type."||".$email
do while file.AtEndOfStream<>true
lin=file.ReadLine
Wscript.Echo line
Wscript.Echo lin
if lin="" Then
noerr=false
else
noerr=true
end if
Wscript.Echo noerr & " error variable"
if noerr Then
temp=Split(lin,"||") 'Read the first order (id,price)
i=0
for each data in temp
If i=0 Then
namestock=data
ElseIf i=1 Then
target=CDbl(data)
ElseIf i=2 Then
orderid=data
ElseIf i=3 Then
buysell=data
ElseIf i=4 Then
transtype=data
ElseIf i=5 Then
emailto=data
ElseIf i=6 Then 'Bharti Airtel ||300.90||100000004||BUY||INTR||email@hotmail.com||5||1506.45585 format of lin
nos=CInt(data)
Else
totalprice=CDbl(data)
End If
i=i+1
next
do while n.AtEndOfStream <>true 'Compare against current data
stock=Split(n.ReadLIne,"||")
j=0
for each name in stock
If j=0 Then
If StrComp(namestock,name,1)=0 Then
Wscript.Echo namestock & " name matched"
check = true
j=j+1
Else
Exit For
End If
Else If j=1 Then
If buysell="BUY" Then
If CDbl(name) <=target Then
tempfile.WriteLine(line) 'Add the line to temporary file if condition is satisfied.
output.WriteLine(orderid) 'Write orderno to complete-orders
shell.run "send.vbs 1 "& emailto &" "& target &" "& buysell &" "& transtype &" "& nos &" "&totalprice &" "& namestock & " Order Completed"
Wscript.Echo "sended"
End If
ElseIf buysell="SELL" Then
If CDbl(name) >=target Then
tempfile.WriteLine(line) 'Add the line to temporary file if condition is satisfied.
output.WriteLine(orderid) 'Write orderno to complete-orders
shell.run "send.vbs 1 "& emailto &" "& target &" "& buysell &" "& transtype &" "& nos &" "&totalprice &" "& namestock & " Order Completed"
End If
End If
Exit Do
End If
End If
next
loop
end if
line=line+1
loop
n.close()
tempfile.close()
file.close()
output.close()
This is part of my mini project, a simple stock market system. This piece of code decides whether an order is to be accepted by the system. It does the intended work fine for the first line, but for the second line from order.csv it does not work, due to the error occurring on the stock variable.
Your whole approach is flawed. You try to read all lines from n for each line from file. That won't work without opening (and closing) n each time.
You should edit your question to disclose your real world problem/what you want to achieve with some sample data. Then it may be possible to give you useful advice.
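One way to avoid re-reading the second file for every order is to load the current prices once into a lookup table and then walk through the orders. The sketch below shows that shape in Python rather than VBScript. It is not the asker's code: the function names are hypothetical, the stock file is assumed to hold name||price per line (inferred from the loop in the question), and names are assumed to match after trimming and ignoring case.

```python
def load_prices(stock_lines):
    """Build {lowercased stock name: current price} in a single pass."""
    prices = {}
    for raw in stock_lines:
        fields = raw.strip().split("||")
        if len(fields) >= 2 and fields[0].strip():
            prices[fields[0].strip().lower()] = float(fields[1])
    return prices


def completed_orders(order_lines, prices):
    """Return the ids of orders whose price condition is currently met."""
    done = []
    for raw in order_lines:
        fields = raw.strip().split("||")
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        name = fields[0].strip()
        target = float(fields[1])
        order_id = fields[2]
        side = fields[3]
        current = prices.get(name.lower())
        if current is None:
            continue  # no current price for this stock
        # Same conditions as in the question: BUY at or below the target,
        # SELL at or above the target.
        if (side == "BUY" and current <= target) or (side == "SELL" and current >= target):
            done.append(order_id)
    return done


if __name__ == "__main__":
    prices = load_prices(["Bharti Airtel||299.50"])
    orders = ["Bharti Airtel ||300.90||100000004||BUY||INTR||email@hotmail.com||5||1506.45585"]
    print(completed_orders(orders, prices))
```

Because the price table is built once, every order after the first sees the same data, which is what the nested read loop in the question cannot provide.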

How do I set word delimiters?

User's Guide chapter 6.1.5, "The Word Chunk", says: a word is a string of characters delimited by space, tab, or return characters or enclosed by double quotes. Is it possible to have additional word delimiters?
I have the following code snippet taken from the User's Guide chapter 6.5.1 'When to use arrays', p. 184
on mouseUp
--cycle through each word adding each instance to an array
repeat for each word tWord in field "sample text"
add 1 to tWordCount[tWord]
end repeat
-- combine the array into text
combine tWordCount using return and comma
answer tWordCount
end mouseUp
It counts the number of occurrences of each word form in the field "Sample text".
I realize that, with the default setting, full stops after words are counted as part of the word.
How do I change the settings so that a full stop (and/or a comma) is considered a word boundary?
Alternatively you could simply remove the offending characters before processing.
This can be done using either the REPLACE command or the REPLACETEXT function.
The REPLACETEXT function can use a regular expression matchstring but is slower than the REPLACE command, so here I am using REPLACE.
on mouseUp
put field "sample" into twords
--remove all trailing punctuation and quotes
replace "." with "" in twords
replace "," with "" in twords
replace "?" with "" in twords
replace ";" with "" in twords
replace ":" with "" in twords
replace quote with "" in twords
--hyphenated words need to be separated?
replace "-" with " " in twords
repeat for each word tword in twords
add 1 to twordcount[tword]
end repeat
combine twordcount using return and comma
answer twordcount
end mouseUp
I think you are asking a question about delimiters. Some delimiters are built-in:
spaces for words,
commas for items,
return (CR) for lines.
The ability to create your own custom delimiter property (the itemDelimiter) is a powerful feature of the language, and pertains to "items". You can set this to any single character:
set the itemDelimiter to "C"
answer the number of items in "XXCXXCXX" --call this string "theText"
The result will be "3"
As others have pointed out, the method of replacing one string with another allows formidable control over custom parsing of text:
replace "C" with space in theText
yields "XX XX XX"
Craig Newman
As the User's Guide says in chapter 6.1.5, "The Word Chunk": a word is a string of characters delimited by space, tab, or return characters or enclosed by double quotes.
There is an itemDelimiter but no wordDelimiter.
So punctuation has to be removed first, before adding the word to the word count array.
This may be done with a function effectiveWord.
function effectiveWord aWord
put last char of aWord into it
if it is "." then delete last char of aWord
if it is "," then delete last char of aWord
if it is ":" then delete last char of aWord
if it is ";" then delete last char of aWord
return aWord
end effectiveWord
on mouseUp
--cycle through each word adding each instance to an array
repeat for each word tWord in field "Sample text"
add 1 to tWordCount[effectiveWord(tWord)]
end repeat
-- combine the array into text
combine tWordCount using return and comma
answer tWordCount
end mouseUp
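The "treat punctuation as a word boundary" idea from the answers above can also be expressed as a single pattern that lists everything that is not part of a word. The following Python sketch is a comparison only, not LiveCode: the function name count_words is hypothetical, and the set of boundary characters (whitespace plus . , ; : ? ! and the double quote) is an assumption that can be adjusted.

```python
import re
from collections import Counter

# A word is any run of characters that are not whitespace and not one of
# the punctuation marks listed here.
WORD_PATTERN = re.compile(r"[^\s.,;:?!\"]+")


def count_words(text):
    """Count each word form, with punctuation acting as a word boundary."""
    return Counter(WORD_PATTERN.findall(text))


if __name__ == "__main__":
    counts = count_words("One fish, two fish. Red fish; blue fish?")
    for word, number in sorted(counts.items()):
        print(word + "," + str(number))
```

The count is case-sensitive, so "One" and "one" would be separate entries; lowercase the text first if that is not wanted.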

What roles do minus signs and single quotation marks play in if statements?

In the following code, adding the same letter to both operands of the comparison changes the result. Despite - being not greater than j, -k is greater than jk.
This only happens if one of the operands is the minus sign (-) or single quotation mark (').
Why does this happen? What are the rules?
if - gtr j (echo - greater than j) else echo - less than j
if "-" gtr "j" (echo "-" greater than "j") else echo "-" less than "j"
echo.
if -k gtr jk (echo -k greater than jk) else echo -k less than jk
if "-k" gtr "jk" (echo "-k" greater than "jk") else echo "-k" less than "jk"
echo.
if ' gtr u (echo ' greater than u) else echo ' less than u
if "'" gtr "u" (echo "'" greater than "u") else echo "'" less than "u"
echo.
if 'v gtr uv (echo 'v greater than uv) else echo 'v less than uv
if "'v" gtr "uv" (echo "'v" greater than "uv") else echo "'v" less than "uv"
The result is:
- less than j
"-" less than "j"
-k greater than jk
"-k" greater than "jk"
' less than u
"'" less than "u"
'v greater than uv
"'v" greater than "uv"
You may be assuming that strings are just compared character by character, taking their ordinal values.
That's not true. Collation is much more complex than that.
In fact, you can see the same in other environments, such as Windows PowerShell:
PS Home:\> '-' -gt 'j'
False
PS Home:\> '-k' -gt 'jk'
True
PS Home:\> '''' -gt 'u'
False
PS Home:\> '''v' -gt 'uv'
True
It could very well be that the order of strings varies with your locale as well.
As for your particular problem here, quoting from the Unicode Collation Algorithm (UTS #10):
Collation order is not preserved under concatenation or substring operations, in general.
For example, the fact that x is less than y does not mean that x + z is less than y + z, because characters may form contractions across the substring or concatenation boundaries. In summary:
x < y does not imply that xz < yz
x < y does not imply that zx < zy
xz < yz does not imply that x < y
zx < zy does not imply that x < y
and to clear up the misconception you're likely under:
Collation is not code point (binary) order.
A simple example of this is the fact that capital Z comes before lowercase a in the code charts. As noted earlier, beginners may complain that a particular Unicode character is “not in the right place in the code chart.” That is a misunderstanding of the role of the character encoding in collation. While the Unicode Standard does not gratuitously place characters such that the binary ordering is odd, the only way to get the linguistically-correct order is to use a language-sensitive collation, not a binary ordering.
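To make the difference concrete, here is a small Python sketch that compares plain code point ordering with a toy collation key. The toy key is a deliberate simplification and is neither the algorithm used by cmd.exe nor the Unicode Collation Algorithm: it only assumes that the hyphen and the apostrophe are ignored in the first comparison pass and used as a tie-breaker afterwards. The function names are hypothetical.

```python
def codepoint_greater(a, b):
    """Python compares str values by code point, i.e. binary order."""
    return a > b


def toy_collation_key(s):
    """Toy key: compare without '-' and "'" first, then the full string."""
    significant = "".join(ch for ch in s if ch not in "-'")
    return (significant, s)


def toy_greater(a, b):
    """Compare two strings using the toy collation key."""
    return toy_collation_key(a) > toy_collation_key(b)


if __name__ == "__main__":
    pairs = [("-", "j"), ("-k", "jk"), ("'", "u"), ("'v", "uv")]
    for left, right in pairs:
        print(left, right, codepoint_greater(left, right), toy_greater(left, right))
```

Under code point order the left-hand string is never greater in these four pairs, while the toy key reports "-k" greater than "jk" and "'v" greater than "uv", the same pattern as the output shown in the question.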
