How to find non-printable characters in a file?

I tried to find the unprintable characters in a data file on Unix.
Code:
#!/bin/ksh
export SRCFILE='/data/temp1.dat'
while read line
do
len=lenght($line)
for( $i = 0; $i < $len; $i++ ) {
if( ord(substr($line, $i, 1)) > 127 )
{
print "$line\n";
last;
}
done < $SRCFILE
The code is not working. Please help me find a solution.

You can use grep for finding non-printable characters in a file, something like the following, which finds all non-printable-ASCII and all non-ASCII:
grep -P -n "[\x00-\x1F\x7F-\xFF]" input_file
-P gives you the more powerful Perl regular expressions (PCREs) and -n shows line numbers.
If your grep doesn't support PCREs, I'd just use Perl for this directly:
perl -ne '$x++;if($_=~/[\x00-\x1F\x7F-\xFF]/){print"$x:$_"}' input_file

You may try something like this:
grep '[^[:print:]]' filePath
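A minimal sketch of the bracket-expression approach, assuming a grep that accepts -a (GNU and BSD grep do). The sample file and its contents are made up for illustration. LC_ALL=C makes grep work on bytes, so anything outside printable ASCII is reported, and [:blank:] keeps tabs from being flagged.

```shell
#!/bin/bash
# Hypothetical sample: line 2 holds a control byte (0x01), line 3 a non-ASCII byte (0xE9).
sample=$(mktemp)
printf 'plain text\nbad\001byte\ncaf\351\n' > "$sample"

# -a forces text mode so grep prints matching lines instead of a "binary file" notice.
# cut keeps only the line numbers that -n prefixes to each hit.
hits=$(LC_ALL=C grep -a -n '[^[:print:][:blank:]]' "$sample" | cut -d: -f1 | paste -sd, -)
echo "lines with non-printable bytes: $hits"

rm -f "$sample"
```

With this sample, lines 2 and 3 are reported. Dropping the cut stage shows the offending lines themselves.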

This sounds pretty trite but I was not sure how to do it just now.
I have become fond of od. Depending on what you are doing, you may want something suited to printing arbitrary characters. The awk code below is not very elegant, but it is flexible if you are looking for specific bytes; the point is just to show the use of od. Note the problems with awk string comparisons and the padding spaces:
cat filename | od -A n -t x1z | awk '{ p=0; i=1; if ( NF>16) { while (i<17) {if ( $i!="0d"){ if ( $i!="0a") {if ( $i" " < "20 " ) {print $i ; p=1;} if ( $i" "> "7f "){print $i; p=1;}}} i=i+1} if (p==1) print $0; }}' | more
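A shorter sketch of the same od idea, assuming a POSIX od and awk. The sample file is hypothetical. Putting one hex byte per line sidesteps the field-counting and padding issues mentioned above, and LC_ALL=C keeps the awk string comparisons bytewise.

```shell
#!/bin/bash
# Hypothetical sample: a CRLF line, a line with byte 0x01, and a line with byte 0xE9.
sample=$(mktemp)
printf 'ok\r\nbad\001byte\ncaf\351\n' > "$sample"

# od -An -v -tx1 dumps every byte as two lowercase hex digits with no offsets;
# tr puts one byte per line so awk can compare plain two-character strings.
# CR (0d) and LF (0a) are skipped, everything else outside 20..7e is kept.
odd_bytes=$(od -An -v -tx1 "$sample" | tr -s ' ' '\n' |
  LC_ALL=C awk 'NF && $1 != "0a" && $1 != "0d" && ($1 < "20" || $1 > "7e")' |
  paste -sd' ' -)
echo "suspicious bytes: $odd_bytes"

rm -f "$sample"
```

This lists the byte values only. Keeping od's offset column (omit -An) would be one way to locate them in the file.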

Related

How to replace a word from a file with elements in an array

I am trying to replace every occurrence of the word "null" with elements of an array. The problem is that after replacing one "null", I would like to replace the next "null" with the next element in the array.
I am not very good with bash and I feel like this is quite a basic question.
Here is what I have so far:
for m in $(cat finalfile.csv)
do
if [ "$m" = "null" ]
then
m=cwearray[$counter]
let counter++
fi
done
This doesn't replace anything in the finalfile.csv.
For example if the file has:
"value1","value2","null","value3"\n
"value1","value2","null","value3"...
and the array has ["foo","bar"]
I would like it to be:
"value1","value2","foo","value3"\n
"value1","value2","bar","value3"...
This can be done with bash, even with multiple nulls per line:
$ cat finalfile.csv
"value1","value2","null","null"
"value1","value2","null","value3"
$ cwearray=( foo bar baz )
$ idx=0
$ while read -r line; do
while [[ $line == *null* ]]; do
line=${line/null/${cwearray[idx++]}}
# ...............^^^^^^^^^^^^^^^^^^
# replace the _first_ "null" with the _next_ array element
done
echo "$line"
done < finalfile.csv > updatedfinalfile.csv
$ cat updatedfinalfile.csv
"value1","value2","foo","bar"
"value1","value2","baz","value3"
It's easier in Perl where you can increase the index directly in the replacement part of a substitution:
printf '%s\n' 1,2,3,null null,2,3,4 null,null,null,null \
| perl -pe 'BEGIN { @cwe = qw( A B C D E F ) }
s/(?:^|(?<=,))null(?=,|$)/$cwe[$i++]/g'
Update: It seems you've updated your question with a sample input. If nulls are double quoted, it gets even easier, as there's no need to check whether they're surrounded with commas or beginning/end of the line.
perl -pe 'BEGIN{ @cwe = qw( foo bar ) }
s/"null"/"$cwe[$i++]"/g'
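The same quoted-token idea can be sketched in pure bash. This is an illustration rather than a drop-in replacement: the here-document stands in for finalfile.csv, the names needle, repl and result are made up, and the extra bounds check is an assumption about what should happen when the array runs out.

```shell
#!/bin/bash
cwearray=(foo bar baz)   # replacement values, as in the question
idx=0
needle='"null"'          # the quoted token, so "nullable" is left alone
result=""

while IFS= read -r line; do
  # Replace the first remaining "null" with the next array element,
  # stopping early if the array runs out.
  while [[ $line == *"$needle"* ]] && (( idx < ${#cwearray[@]} )); do
    repl="\"${cwearray[idx]}\""
    line=${line/"$needle"/"$repl"}
    idx=$((idx + 1))
  done
  result+="$line"$'\n'
done <<'EOF'
"value1","nullable","null","null"
"value1","value2","null","value3"
EOF

printf '%s' "$result"
```

Matching the surrounding double quotes is what keeps a value such as "nullable" from being touched.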
An awk solution:
declare -a cwearray
cwearray=(foo bar)
awk -F, 'NR==FNR{repl[NR]=$0; next}{for(i=1;i<=NF;i++){if($i=="\"null\""){$i="\""repl[++counter]"\""}}}1' OFS="," <(for i in "${cwearray[@]}"; do echo "$i"; done) <file>
Read the file line by line. If a line contains null, then use sed to replace all occurrences of null with the corresponding value, retrieved via array index.
#!/bin/bash
file="finalfile.csv"
counter=0
array=(
"foo"
"bar"
)
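while IFS= read -r line; do
# Only consume an array element when the line actually contains "null".
if [[ $line == *null* ]]; then
item="${array[$counter]}"
line=$(printf '%s\n' "$line" | sed "s/null/$item/g")
((counter++))
fi
printf '%s\n' "$line"
done < "$file"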

Iterating over lines (w/ numbers) read from a file to an array in bash

I'm trying to write a small script that takes the 4th column of a file, stores it in an array, and then does a little comparison. If an element in the array is greater than 0 and less than 500, I have to increment the counter. However, when I run the script the counter always shows 0. Here's my script:
#!/bin/bash
mapfile -t my_array < <(cat file1.txt | awk '{ print $4 }' > test.txt)
COUNTER=0
for i in ${my_array[@]}; do
if [["${my_array[$i]}" -gt 0 -a "${my_array[$i]}" -lt 500 ]]
then
COUNTER=$((COUNTER + 1))
fi
printf "%s\t%s\n" "%i" "${my_array[$i]}"//just to test if the mapfile command is working
done
echo $COUNTER
output:
./script1.bash
0
#!/bin/bash
mapfile -t my_array < <(awk '{ print $4 }' file1.txt | tee test.txt)
COUNTER=0
for idx in "${!my_array[@]}"; do
value=${my_array[$idx]}
if (( value > 0 )) && (( value < 500 )); then
COUNTER=$((COUNTER + 1))
fi
printf "%s\t%s\n" "$idx" "$value"
done
echo "$COUNTER"
The use of cat here is needless: It added nothing but inefficiency (requiring an extra process to be started, and forcing awk to read from a pipe rather than direct from a file).
mapfile had nothing to read because the output of awk was redirected to test.txt. If you want it to go to both a file and stdout, then you need to use tee.
-a is not valid in [[ ]]; use && instead there. However, since you're doing only arithmetic, (( )) is more appropriate. Incidentally, -a is officially marked obsolescent even for [ ] and test; see the current POSIX standard.
${my_array[@]} iterates over values. If you want to iterate over indexes, you need ${!my_array[@]} instead.
Whitespace is mandatory in separating command names. [["$foo" is a different command from [[, unless $foo is empty or starts with a character in $IFS.
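The points above about indexes and (( )) can be sketched without any input file. The array contents are hypothetical stand-ins for the fourth column.

```shell
#!/bin/bash
my_array=(120 0 499 500 7)   # hypothetical column values

COUNTER=0
for idx in "${!my_array[@]}"; do   # ${!my_array[@]} expands to the indexes 0..4
  value=${my_array[idx]}
  # Arithmetic context: no -gt/-lt and no quoting needed.
  if (( value > 0 && value < 500 )); then
    COUNTER=$((COUNTER + 1))
  fi
done
echo "$COUNTER"
```

Only 120, 499 and 7 fall strictly between 0 and 500, so the counter ends at 3.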
If you redirect the output to a file: > test.txt then there is no output in "standard output" because it is consumed by the file. So, first, you need to remove that redirection. You may use:
mapfile -t my_array < <(cat file1.txt | awk '{ print $4 }' )
But since awk could perfectly well read a file, this is better:
mapfile -t my_array < <(awk '{ print $4 }' file1.txt)
And since you are using awk, it could do the comparison to 0 and 500 and output the whole count.
counter=$(awk '{if($4>0 && $4<500){c++}}END{print c}' file1.txt)
echo "$counter"
Simpler, faster.
That will also avoid some simple mistakes in your script, like a missing space in the [[ … ]] construct:
if [[ "${my … # NOT "if [["${my …"
And some missing quotes:
for i in "${my_array[@]}" # NOT for i in ${my_array[@]}
In general, it is a good idea to check your script with ShellCheck.net to remove some simple mistakes.
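One detail of the awk one-liner may be worth sketching: if no row matches, c is never set and print c emits an empty line. Assuming a plain 0 is wanted in that case, c+0 forces a number. The sample data is hypothetical.

```shell
#!/bin/bash
sample=$(mktemp)
# Hypothetical four-column data; only the fourth column matters here.
printf '%s\n' 'a b c 120' 'a b c 0' 'a b c 499' 'a b c 500' > "$sample"

# c+0 forces a numeric result, so 0 is printed even if no row matches.
counter=$(awk '$4 > 0 && $4 < 500 { c++ } END { print c+0 }' "$sample")
none=$(awk '$4 > 1000 { c++ } END { print c+0 }' "$sample")
echo "$counter $none"

rm -f "$sample"
```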

Prepending a variable to all items in a bash array

CURRENTFILENAMES=( "$(ls $LOC -AFl | sed "1 d" | grep "[^/]$" | awk '{ print $9 }')" )
I have written the above code; however, it is not behaving as I expect in a for loop, which I wrote like so:
for a in "$CURRENTFILENAMES"; do
CURRENTFILEPATHS=( "${LOC}/${a}" )
done
I expected this to prepend the value of the variable LOC to all of the items in the CURRENTFILENAMES array. However, it has only prepended it to the beginning of the array. How can I remedy this?
You need to use the += operator to append to an array:
CURRENTFILEPATHS+=( "${LOC}/${a}" )
However, parsing ls output is not advisable; use find instead.
EDIT: Proper way to run this loop:
CURRENTFILEPATHS=()
while IFS= read -d '' -r f; do
CURRENTFILEPATHS+=( "$f" )
done < <(find "$LOC" -maxdepth 1 -type f -print0)
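If the names are already in an array, a loop is not strictly needed. The sketch below uses bash pattern substitution to prepend a prefix to every element; the directory and file names are hypothetical.

```shell
#!/bin/bash
LOC=/var/data                          # hypothetical directory
CURRENTFILENAMES=("a.txt" "b c.txt")   # hypothetical names, one with a space

# ${array[@]/#/prefix} substitutes the empty match at the start of each element,
# which prepends the prefix to every element in one expansion.
CURRENTFILEPATHS=("${CURRENTFILENAMES[@]/#/$LOC/}")
printf '%s\n' "${CURRENTFILEPATHS[@]}"
```

The double quotes around the expansion keep names containing spaces as single elements.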

working with arrays in bash and do stuff

I am writing a bash script and need help. This is what I tried:
With the help of @merlin2011
#!/bin/bash
if [ $# -ne 2 ]; then
echo "Usage: `basename $0` <absolute-path> <number>"
exit 1
fi
if [ "$(id -u)" != "0" ]; then
echo "This script must be run as root" 1>&2
exit 1
fi
#find . -name "$2" -exec mv {} /some/path/here \;
find $1 >> /tmp/test
for line in $(cat `/tmp/test`); do
echo $line | mv $2 awk -F"/" '{for (i = 1; i < NF; i++) if ($i == "$2") print $(i-1)}'
done
Now I want to check the results of the find command in an array and, if there is a directory named 2010, get its absolute path. For example:
arr[1]="/path/to/2010/file.db"
Then I want to rename 2010 to the name of its parent directory. My pattern is:
arr[1]="/path/to/2010/file.db"
arr[2]="/path/test/2010/fileee.db"
arr[3]="/path/tt/2010/fileeeee.db"
.
.
.
arr[100]="/path/last/2010/fileeeeeee.db"
Result should be:
mv /path/to/2010/ to
mv /path/test/2010 test
mv /path/tt/2010/ tt
.
.
.
mv /path/last/2010 last
UPDATE:
Overall, I want to know how to get the preceding field in awk...
/path/to/dir1/2010/file.db
I want to search the absolute path, find 2010, and rename it to the previous path component, with / as the separator, something like: awk -F"/" {print [what?]}
That is, tell awk that the field I am looking for is 2010, then print the field before it, given that the separator is /.
UPDATE
The files dirs and subdirs pattern are:
/path/to/file/efsef/2010/1.db
/path/to/file/hfjh/sdfsf/2010/2.db
/path/to/file/dsf/sdhher/aqwe/sfrt/2010/3.db
.
.
.
/path/to/file/kldf/2010/100.db
I want to rename all 2010 dirs to their parent then tar all .db
This is what exactly I want :)
This answer addresses only the OP's update. My best interpretation is that you are trying to get awk to print the value dir1 inside the string /path/to/dir1/2010/file.db. The following line will achieve it.
awk -F"/" '{for (i = 1; i < NF; i++) if ($i == "2010") print $(i-1)}'
I tested using the following command, which will output dir1.
echo /path/to/dir1/2010/file.db | awk -F"/" '{for (i = 1; i < NF; i++) if ($i == "2010") print $(i-1)}'
Based on your update, you should surround the awk command with backticks.
mv $2 `awk -F"/" '{for (i = 1; i < NF; i++) if ($i == "$2") print $(i-1)}'`
To implement this, we have to search the folders recursively.
It should be a combination of two commands, find and mv:
find . -name "2010" -exec mv {} /some/path/here \;
Another way, shared by merlin2011:
mv $2 `awk -F"/" '{for (i = 1; i < NF; i++) if ($i == "$2") print $(i-1)}'`
Here is an awk command:
awk -F"/$2/" '{split($1, a, "/"); system("echo mv " $0 " " a[length(a)]);}' <<< "$1"
mv /path/to/dir1/2010/file.db dir1
Once you're satisfied, remove echo from the system command.
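The parent-directory lookup can also be sketched with bash parameter expansion instead of awk. The paths array is a hypothetical stand-in for the find results, and the loop only prints the mv commands in the form the question asks for; it does not run them.

```shell
#!/bin/bash
year=2010
# Hypothetical find results, following the pattern in the question.
paths=(/path/to/2010/file.db /path/test/2010/fileee.db /path/other/2011/x.db)

cmds=()
for p in "${paths[@]}"; do
  [[ $p == */"$year"/* ]] || continue   # skip paths without a 2010 component
  parent=${p%%/"$year"/*}               # e.g. /path/to
  cmds+=("mv $parent/$year ${parent##*/}")
done
printf '%s\n' "${cmds[@]}"   # dry run: only prints the commands
```

${parent##*/} strips everything up to the last slash, which leaves the name of the directory that contains 2010.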

Is there a way to search an entire array inside of an argument?

I've posted my code below. I'm wondering if I can search one array for a match, or if there's a way I can search a Unix file inside of an argument.
#!/bin/bash
# store words in file
cat $1 | ispell -l > file
# move words in file into array
array=($(< file))
# remove temp file
rm file
# move already checked words into array
checked=($(< .spelled))
# print out words & ask for corrections
for ((i=0; i<${#array[@]}; i++ ))
do
if [[ ! ${array[i]} = ${checked[@]} ]]; then
read -p "' ${array[i]} ' is mispelled. Press "Enter" to keep
this spelling, or type a correction here: " input
if [[ ! $input = "" ]]; then
correction[i]=$input
else
echo ${array[i]} >> .spelled
fi
fi
done
echo "MISPELLED: CORRECTIONS:"
for ((i=0; i<${#correction[@]}; i++ ))
do
echo ${array[i]} ${correction[i]}
done
Otherwise, I would need to write a for loop to check each array index, and then somehow decide whether to go through the loop and print/take input.
The usual shell incantation to do this is:
ispell -l < "$1" | while read -r ln
do
# Read the answer from the terminal; stdin inside the loop is the pipe from ispell.
read -r -p "$ln is misspelled. Enter correction: " corrected < /dev/tty
if [ -n "$corrected" ]; then
ln=$corrected
fi
echo "$ln"
done > correctedwords.txt
The while;do;done is kind of like a function and you can pipe data into and out of it.
P.S. I didn't test the above code so there may be syntax errors
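On the original question of searching a whole array: comparing ${array[i]} against ${checked[@]} in a single [[ ]] test compares against the array as a whole rather than element by element. A small helper that loops over the candidates is one common way to test membership. The function name in_array and the word lists are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if the first argument equals any later argument.
in_array() {
  local needle=$1 item
  shift
  for item in "$@"; do
    [[ $item == "$needle" ]] && return 0
  done
  return 1
}

checked=(teh recieve)       # words already accepted, as .spelled would hold
array=(teh wierd recieve)   # words reported by the spell checker

unchecked=()
for word in "${array[@]}"; do
  in_array "$word" "${checked[@]}" || unchecked+=("$word")
done
printf '%s\n' "${unchecked[@]}"
```

Only the words left in unchecked would then need a prompt.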
