Batch script to remove special characters - batch-file

I am trying to create a batch file, but I am unable to find out how.
My requirement is below.
1) I have a group of text files, like Text File 1, Text File 2, Text File 3.
2) Each text file contains some special characters.
3) I want to remove those special characters from all the text files.
4) Some of the special characters are ones we can type in Notepad.
5) So I need a batch file which can search for special characters by ASCII value and remove them.
Please let us know; I would be grateful.
Below is the text file format:
81
2016-03-13 00:13:05 2016-03-14 00:51:39 �# 81
101
2016-03-13 00:13:05 2016-03-14 03:02:48 xuyou �#
2016-03-14 03:16:06 2016-03-14 08:16:13 =M 100
2016-03-14 03:16:06 2016-03-14 08:16:13
2016-03-14 03:16:06 2016-03-14 08:16:41 Search : ��~ 100
dhfcjchjchjcdhj �
huge files are not ready f okay
~
fd

Looks like binary data. Maybe the easiest way would be to use Strings.exe:
strings -n 1 -a -q nottextfile >purified
and see if the purified file contains what you want.
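If a POSIX shell is available on the Windows machine (an assumption; for example Git Bash, Cygwin, or WSL), tr is a different tool from Strings.exe that can do a similar cleanup by byte value, which is close to the "by ASCII value" idea in the question. A minimal sketch, using a made-up sample line in place of the real file:

```shell
# Hypothetical input containing two non-printable bytes (octal 200 and 001).
printf 'abc \200\001 81\n' > nottextfile

# -c complements the set and -d deletes, so every byte NOT listed is removed.
# Kept bytes (octal): tab (11), newline (12), carriage return (15),
# and printable ASCII from space (40) through tilde (176).
LC_ALL=C tr -cd '\11\12\15\40-\176' < nottextfile > purified

cat purified
```

The set of kept bytes is a choice, not a rule; adjust the octal values if some of the "special" characters should survive.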

Related

Split file in dynamic name in unix

I have a file of 500 million records and I have to split it into files of 1 million records each. The files should be named dynamically with a numeric suffix. I tried:
split -dl 1000000 myInputFile.txt output_
But after creating 99 files (like output_00 ... output_99), I got the error:
split: output file suffixes exhausted
Any suggestion?
man split says that -a sets the suffix length, so use
-a 3
to get 3-digit suffixes.
If all else fails, read the manual.
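Put together with the original command, the suggestion might look like the sketch below. The input here is a made-up 25-line file split into 10-line pieces, standing in for the 500-million-line file; -d is a GNU split option, as in the question.

```shell
# Hypothetical small input standing in for the real file.
seq 1 25 > myInputFile.txt

# -a 3 : use 3-character suffixes (room for 1000 output files)
# -d   : numeric suffixes instead of alphabetic ones (GNU split)
# -l   : number of lines per output file
split -a 3 -d -l 10 myInputFile.txt output_

ls output_*
```

For the real file, the same flags with -l 1000000 would give output_000 onward; 500 pieces fit comfortably within three digits.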

How to search a file for lines that contain a particular string but not another particular string?

I have a file in which I want to output all the lines containing a particular phrase but not another phrase to a different file. For example here is a sample file:
qwert catsanddogs werwer
sdfg catsandlions sdfggf
dfhgsdg catsandtigers dhjtjye
tqvtw erytwy weyyq
........
So I want to redirect all the lines which have cats but not the lines which have catsanddogs into another file.
Can anyone explain how I can do this?
grep is a tool that filters its input. You describe what you want to match with a regular expression (a regexp describes a set of strings) and decide whether matching lines are included in or excluded from the output (grep works on lines of text).
grep -v could be your friend:
grep cats myfile | grep -v catsanddogs > resultfile
You first keep each line that contains cats and then, from those, exclude every line that contains catsanddogs.
You can also do it the other way around: exclude catsanddogs first and then include cats.
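As a quick check of the two-step filter against the sample from the question (keep cats, drop catsanddogs), here is a small sketch. The awk one-liner is an alternative that was not part of the answer; it applies both conditions in a single pass.

```shell
# Sample lines from the question.
printf '%s\n' \
    'qwert catsanddogs werwer' \
    'sdfg catsandlions sdfggf' \
    'dfhgsdg catsandtigers dhjtjye' \
    'tqvtw erytwy weyyq' > myfile

# Keep lines containing "cats", then drop those containing "catsanddogs".
grep cats myfile | grep -v catsanddogs > resultfile

# Alternative: one awk pass with both conditions.
awk '/cats/ && !/catsanddogs/' myfile > resultfile_awk

cat resultfile
```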

Batch script to delete all the lines from the line that starts with a specific word "TRAIL"

I want to write a batch script for the first time and am stuck.
The requirement is:
I want to delete all the lines in a text file that come after the line that starts with the letters "TRAI......".
Example:
My test file looks like:
123 sdefef dhufheij 123232
234 ddefef mijijijj 232323
345 jcdhence 345987
TRAILER0000034
456 edrftg nbuyfjjf 678655
Result should be:
123 sdefef dhufheij 123232
234 ddefef mijijijj 232323
345 jcdhence 345987
456 edrftg nbuyfjjf 678655
This can be easily done with sed, which can find a pattern:
sed -i 's/TRAI.*//' your_file.txt
The -i flag modifies the actual file. "s/TRAI.*//" finds a pattern starting with TRAI and everything that follows on that line (represented by .*) and replaces it with nothing, so the line is left empty rather than removed.
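Because the substitution only empties the matched text, deleting the line itself needs sed's d command. The sketch below shows two readings of the requirement on the sample from the question; it writes to new files (hypothetical names) instead of using -i, so the input stays untouched.

```shell
# Sample lines from the question.
printf '%s\n' \
    '123 sdefef dhufheij 123232' \
    '234 ddefef mijijijj 232323' \
    '345 jcdhence 345987' \
    'TRAILER0000034' \
    '456 edrftg nbuyfjjf 678655' > your_file.txt

# Reading 1: delete only the lines that start with TRAI
# (this matches the "Result should be" listing in the question).
sed '/^TRAI/d' your_file.txt > without_trailer.txt

# Reading 2: delete from the first line starting with TRAI to the end of the file
# (this matches the wording of the title).
sed '/^TRAI/,$d' your_file.txt > up_to_trailer.txt

cat without_trailer.txt
```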
Edit: Never mind, I misread "batch script" as "bash script".
The findstr utility can be used to filter lines. I'll leave the rest of the logic up to you.

printing part of file

Is there a magic unix command for printing part of a file? I have a file that has several million lines and I would like to skip the first million or so lines and print the next million lines of the file.
Thank you in advance.
To extract data, sed is your friend.
Assuming a one-off task that you can enter on your command line:
sed -n '200000,300000p' file | enscript
"number comma (,) number" is one form of a range command in sed. This one starts at line 200,000 and prints (p) through line 300,000.
If you want the output to go to your screen, remove the | enscript.
enscript is a utility that manages the process of sending data to PostScript-compatible printers. My Linux distro doesn't have it, so it's not necessarily a standard utility. Hopefully you know what command you need to pipe to in order to get output printed on paper.
If you want to "print" to another file, use
sed -n '200000,300000p' file > smallerFile
IHTH
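For the case in the question (skip the first million lines, print the next million), the range would be 1000001,2000000. The sketch below uses a made-up 10-line file and the hypothetical variables start and end so the idea can be tried quickly; the added q command makes sed stop reading once the last wanted line has been printed, which matters on very large files.

```shell
# Hypothetical small input; for the real task use start=1000001 and end=2000000.
seq 1 10 > file
start=4
end=6

# Print lines start..end, then quit at line end instead of scanning the rest.
sed -n "${start},${end}p;${end}q" file > smallerFile

# Same result with tail and head: skip to line start, then take the needed count.
tail -n +"$start" file | head -n "$((end - start + 1))" > smallerFile2

cat smallerFile
```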
I would suggest awk as it is a little easier and more flexible than sed:
awk 'FNR>12 && FNR<23' file
where FNR is the record number within the current file. So the above prints lines 13 through 22.
And you can make it more specific like this:
awk 'FNR<100 || FNR >990' file
which prints lines if the record number is less than 100 or over 990. Or, to print lines after line 100 as well as any line containing "fred":
awk 'FNR >100 || /fred/' file
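The first awk filter can be checked on a small made-up file, as sketched below. The second command is a variation that was not in the answer: it exits as soon as the range has passed, so awk does not read the remainder of a multi-million-line file.

```shell
# Hypothetical 30-line input.
seq 1 30 > file

# Lines strictly between 12 and 23, i.e. 13 through 22.
awk 'FNR>12 && FNR<23' file > middle.txt

# Same lines, but stop reading once line 22 has been printed.
awk 'FNR>22{exit} FNR>12' file > middle_fast.txt

cat middle.txt
```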

How to select an element from a 2d array in a file in Linux shell

I am new to shell scripting and what I need is to read from a file that contains a 2d array. Assume there is a file named test.dat which contains values as:
- Paris London Lisbon
- Manchester Nurnberg Istanbul
- Stockholm Kopenhag Berlin
What is the easiest way to select an element from this table in Linux bash scripts? For example, the user inputs -r 2 -c 2 test.dat, which means selecting the element at row[2] and column[2] (Nurnberg).
I have seen the read command and googled, but most of the examples were about 1d arrays.
This one looks similar, but I could not understand it exactly.
awk is great for this:
$ awk 'NR==row{print $col}' row=2 col=2 file
Nurnberg
NR==row{} means: on record number row, do {}. The record number is normally the line number.
{print $col} means: print field number col.
row=2 col=2 passes both parameters to awk.
Update
One more little question: How can I transform this into a sh file so
that when I enter -r 2 -c 2 test.dat into prompt, I get to run the
script so that it reads from the file and echoes the output? –
iso_9001_.
For example:
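#!/bin/bash
# Usage: ./script FILE ROW COL
file=$1
row=$2
col=$3
# Print field number col of record number row; quoting keeps odd file names intact
awk -v row="$row" -v col="$col" 'NR==row{print $col}' "$file"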
And you execute it like this:
./script a 3 2
Kopenhag
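The comment asked for the -r 2 -c 2 test.dat form specifically. One way to get that is the shell's getopts builtin, sketched below. The function name pick_cell is hypothetical, and the sample file assumes plain whitespace-separated names with nothing in front of the first city, which is the layout the awk answer relies on.

```shell
# pick_cell (hypothetical name): print the field at row -r and column -c of a file.
pick_cell() {
    local OPTIND=1 opt row="" col=""
    while getopts "r:c:" opt; do
        case "$opt" in
            r) row=$OPTARG ;;
            c) col=$OPTARG ;;
            *) echo "usage: pick_cell -r ROW -c COL FILE" >&2; return 1 ;;
        esac
    done
    # Drop the parsed options; what remains is the file name.
    shift $((OPTIND - 1))
    awk -v row="$row" -v col="$col" 'NR==row{print $col}' "$1"
}

# Sample data in the assumed layout.
printf '%s\n' \
    'Paris London Lisbon' \
    'Manchester Nurnberg Istanbul' \
    'Stockholm Kopenhag Berlin' > test.dat

pick_cell -r 2 -c 2 test.dat
```

Saved as a script, the function body could run directly on "$@" so that ./script -r 2 -c 2 test.dat behaves the same way.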
