Shell script to fix broken output - problem with concatenation while reading a file

I have a quick question: I am trying to fix a series of files whose output format has been changed.
The output should look like this:
>Tests HadI-sdds1:4134:AAABBBBB:1:1101:6635:2407_2:N:0:TTTTTT
AAAABBBBBEEEECCCCERTTSFASFASFDSGFSDGGSFGFSGDFGDFGDFGDFGDFGDFGDFGDCCVBWAAAABBBBBEEEECCCCERTTSFASFASFDSGFSDGGSFGFSG
But appears as:
>Tests HadI-sdds1:4134:AAABBBBB:1:1101:6635:2407_2:N:0:TTTTTT
AAAABBBBBEEEECCCCERTTSFASFASFDSGFSDGGSFGFSGDFGDFGDFGDFGDFGDFGDFGDCCVBW
AAAABBBBBEEEECCCCERTTSFASFASFDSGFSDGGSFGFSG
I have written the following code to try to fix it, but line 16 appears to return an empty string. However, when I run the echo without storing the result in a variable, I get the complete line.
#!/bin/sh
FILENAME=$1
OUTPUT=$2
set LineToWrite=''
while read LINE
do
if [ `echo "$LINE" | awk '{print substr($0,1,1)}'` == ">" ]
then
echo "$LineToWrite" >> $OUTPUT
echo "$LINE" >> $OUTPUT
set LineToWrite=''
else
set currLine=`echo "$LINE" | awk '{print substr($0,1,70)}'`
set LineToWrite+=$currLine
fi
done <$FILENAME
Any idea how to solve my problem? (The files contain more than 1 million lines.)
Thanks a lot in advance!!!!

Three things:
assign with var=value: no set in front and no spaces around the =
put quotes around all variable expansions
there is no need for cat FILE | while; use while <condition>; do ...; done < FILE
USE MORE QUOTES! They are vital. Also, learn the difference between ' and " and `. See http://mywiki.wooledge.org/Quotes and http://wiki.bash-hackers.org/syntax/words
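Putting those points together, here is a minimal sketch of the loop, not a drop-in replacement. In sh, set LineToWrite='' does not assign a variable; it replaces the positional parameters, which is why LineToWrite stayed empty in the question. The sketch assumes that every line not starting with > belongs to the preceding header and is appended whole (the substr(...,1,70) truncation from the question is left out). The function name join_wrapped is made up for illustration.

```shell
#!/bin/sh
# join_wrapped FILE: print each ">" header line unchanged and join the
# sequence lines that follow it into one line.
join_wrapped() {
    seq=''
    # "|| [ -n "$line" ]" also handles a last line without a trailing newline
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            '>'*)
                # New record: flush the sequence collected so far, if any
                if [ -n "$seq" ]; then
                    printf '%s\n' "$seq"
                fi
                printf '%s\n' "$line"
                seq=''
                ;;
            *)
                # Plain assignment: no "set", no spaces around "="
                seq=$seq$line
                ;;
        esac
    done < "$1"
    # Flush the last record, which no following header triggers
    if [ -n "$seq" ]; then
        printf '%s\n' "$seq"
    fi
}

# Usage (file names are placeholders):
# join_wrapped input.txt > output.txt
```

For files with millions of lines, the same logic written in awk would probably run much faster than a shell loop.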

Related

Read lines looping over array which contains filenames in bash

In bash, I would like to loop over a previously defined array that contains filenames. Each file in the array must in turn be read and processed dynamically (while read line...).
This is an example of what the files of the array contains:
_VALUE1_,_VALUE1_,1,Name 1
_VALUE2_,_VALUE2_,1,Name 2
_VALUE3_,_VALUE3_,1,Name 3
_VALUE4_,_VALUE4_,1,Name 4
_VALUE5_,_VALUE5_,1,Name 5
This is what I've tested with no luck.
#!/bin/bash
. functions.sh
GEN_ARQ_ARRAY=("./cfg_file.txt" "./euro_file.txt" "./zl_file.txt")
WB_ARQ_ARRAY=("./rn_cfg_wb_file.txt" "./rn_eur_wb_file.txt" "./rn_zl_wb_file.txt")
BN_ARQ_ARRAY=("./rn_cfg_bn_file.txt" "./rn_eur_bn_file.txt" "./rn_zl_bn_file.txt")
AM_ARQ_ARRAY=("./rn_cfg_am_file.txt" "./rn_eur_am_file.txt" "./rn_zl_am_file.txt")
STATUS_BOOL=true
for i in "${!GEN_ARQ_ARRAY[@]}"; do
while IFS=$'\r' read -r line || [[ -n "$line" ]];do
STACK_NAME=${line%%,*} # Gets the first substring of a string divided by ','
STACK_STATUS=$(curl -su "${USERNAME}":"${PASSWORD}" -X GET http://"${SERVER_NAME}":9100/api/stacks/"${STACK_NAME}"/state | ./jq-linux64 -cr '.result.value')
if [[ $(echo "$STACK_STATUS" | tr -d '\r') == "$STATUS_BOOL" ]]; then
echo "${line}" >> "${GEN_ARQ_ARRAY[i]}"
case ${line} in
*"ARQBS"*|*"ARCBS"*|*"ARQWB"*|*"ARCWB"*) echo "${line}" >> "${WB_ARQ_ARRAY[$i]}";;
*"ARQOF"*|*"ARCOF"*|*"ARQBN"*|*"ARCBN"*) echo "${line}" >> "${BN_ARQ_ARRAY[$i]}";;
*"ARQAM"*|*"ARCAM"*) echo "${line}" >> "${AM_ARQ_ARRAY[$i]}";;
*) echo "$(logWarn) No matches -- ${STACK_NAME}" | tee -a "$LOGFILE";;
esac
else
echo "$(logInfo) ${STACK_NAME} is not running" | tee -a "$LOGFILE"
fi
done < "${GEN_ARQ_ARRAY[i]}"
done
The problem here is that the for loop starts, detects the array content, gets the first value of the array, enters the while loop, and then keeps looping on the first position of the array even after the end of the file is reached. I can't find a way to exit the while loop and continue with the next array position.
I'm pretty sure there is a better way to implement this.
Hearing your ideas!
Edit:
Solved by replacing the line echo "${line}" >> "${GEN_ARQ_ARRAY[i]}", which was in-loop filling up the file. Once I did, the code worked flawlessly.
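For reference, a minimal sketch of the loop structure once nothing inside the while loop appends to the file being read. The function name print_stack_names is hypothetical, and the curl call and the case dispatch from the question are left out.

```shell
#!/bin/bash
# Print the first comma-separated field of every line in every file given.
print_stack_names() {
    local file line
    for file in "$@"; do
        # The loop only reads from "$file". Appending to that same file in
        # here keeps adding lines to read, so end of file may never be reached.
        while IFS= read -r line || [[ -n $line ]]; do
            line=${line%$'\r'}            # drop a trailing carriage return
            printf '%s\n' "${line%%,*}"   # text before the first comma
        done < "$file"
    done
}

# Usage with the array from the question:
# GEN_ARQ_ARRAY=("./cfg_file.txt" "./euro_file.txt" "./zl_file.txt")
# print_stack_names "${GEN_ARQ_ARRAY[@]}"
```

Passing the array as "${GEN_ARQ_ARRAY[@]}" and iterating over "$@" avoids the index bookkeeping; the index form is only needed when several parallel arrays have to be addressed, as in the question.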

Bash Add elements to an array does not work [duplicate]

Why isn't this bash array populating? I believe I've done it like this in the past. Echoing ${#XECOMMAND[@]} shows no data.
DIR=$1
TEMPFILE=/tmp/dir.tmp
ls -l $DIR | tail -n +2 | sed 's/\s\+/ /g' | cut -d" " -f5,9 > $TEMPFILE
i=0
cat $TEMPFILE | while read line ;do
if [[ $(echo $line | cut -d" " -f1) == 0 ]]; then
XECOMMAND[$i]="$(echo "$line" | cut -d" " -f2)"
(( i++ ))
fi
done
When you run the while loop like
somecommand | while read ...
then the while loop is executed in a subshell, i.e. a different process than the main script. Thus, none of the variable assignments that happen in the loop are reflected in the main process. The workaround is to use input redirection and/or process substitution, so that the loop executes in the current process. For example, if you want to read from a file, you do
while read ....
do
# do stuff
done < "$filename"
or if you want the output of a process you can do
while read ....
do
# do stuff
done < <(some command)
Finally, in bash 4.2 and above, you can set shopt -s lastpipe, which causes the last command in the pipeline to be executed in the current process. This only takes effect when job control is not active, as in a non-interactive script.
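The difference can be seen in a small experiment. This is only a sketch: the function name collect_zero_sized and the sample data are made up, and the input is assumed to be "SIZE NAME" lines like the ones the question writes to its temp file.

```shell
#!/bin/bash
# Read "SIZE NAME" lines from stdin and store the names whose size is 0
# in the global array XECOMMAND.
collect_zero_sized() {
    local size name
    XECOMMAND=()
    while read -r size name; do
        if [[ $size == 0 ]]; then
            XECOMMAND+=("$name")
        fi
    done
}

sample=$'0 empty.txt\n12 data.txt\n0 also-empty.txt'

# Pipeline: the function runs in a subshell, so the array it fills is lost.
XECOMMAND=()
printf '%s\n' "$sample" | collect_zero_sized
echo "after pipe: ${#XECOMMAND[@]} element(s)"

# Process substitution: the function runs in the current shell.
collect_zero_sized < <(printf '%s\n' "$sample")
echo "after process substitution: ${#XECOMMAND[@]} element(s)"
```

With bash's default settings (lastpipe off), the first echo should report 0 elements and the second 2.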
I think you're trying to construct an array consisting of the names of all zero-length files and directories in $DIR. If so, you can do it like this:
mapfile -t ZERO_LENGTH < <(find "$DIR" -maxdepth 1 -size 0)
(Add -type f to the find command if you're only interested in regular files.)
This sort of solution is almost always better than trying to parse ls output.
The use of process substitution (< <(...)) rather than piping (... |) is important, because it means that the shell variable will be set in the current shell, not in an ephemeral subshell.
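Wrapped in a function, that could look like the sketch below. It assumes bash 4 or later (for mapfile), a find that supports -maxdepth, and file names without embedded newlines, since mapfile -t splits on newlines. The function name list_zero_length is made up.

```shell
#!/bin/bash
# Fill the global array ZERO_LENGTH with the zero-length regular files
# directly inside the directory given as $1.
list_zero_length() {
    mapfile -t ZERO_LENGTH < <(find "$1" -maxdepth 1 -type f -size 0)
}

# Usage:
# list_zero_length "$DIR"
# printf '%s\n' "${ZERO_LENGTH[@]}"
```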

Echoing a variable and grepping to see if its value exists in a file returns nothing (Unix shell scripting)

I'm trying to figure out how to determine whether a variable contains a value from a file using grep. It is not returning anything, so let me explain.
This is my code:
MyFiles="MyFile-I-20160606_141_Employees.txt"
DirFiles="/dev/fs/C/Users/salasfri/Desktop/MyFiles.txt"
for OutFile in $(cat $DirFiles); do
if [[ $( echo $MyFiles | grep -c $OutFile ) -gt 0 ]]; then
print "The file $OutFile exist!!"
fi
done
and the file in /dev/fs/C/Users/salasfri/Desktop/MyFiles.txt contains the following values:
MyFile-I-*_141_Employees.txt
MyFile-I-*_141_Products.txt
MyFile-I-*_141_Deparments.txt
The idea is to verify whether the variable "MyFiles" matches an entry in the MyFiles.txt file. As you can see, the entries use the pattern "*" in place of the date, because the date changes.
This solution is not returning any count of files. Is there something that I'm doing wrong?
You can try to change the searchstring before searching.
An example with three teststrings:
for teststring in MyFile-I-20160606_141_Employees.txt MyFile-I-20160606_142_Employees.txt MyFile-I-20160606_141_Others.txt
do
grepstr=$(sed 's/[0-9]\{8\}_/*_/' <<< "${teststring}")
fgrep "${grepstr}" "${DirFiles}"
found=$(fgrep "${grepstr}" "${DirFiles}")
if [ $? -eq 0 ]; then
echo "${found} matches ${teststring}."
fi
done
In your case you can make the code shorter with
fgrep -q "$(sed 's/[0-9]\{8\}_/*_/' <<< "${MyFiles}")" $DirFiles &&
echo "The file $(sed 's/[0-9]\{8\}_/*_/' <<< "${MyFiles}") exist!!"
Your patterns are glob-style patterns, not regular expressions. The pattern abc-*_X.txt will not match the string abc-1234_X.txt.
You want to use a shell construct that does glob matching.
MyFiles="MyFile-I-20160606_141_Employees.txt"
sed 's/\r$//' "/dev/fs/C/Users/salasfri/Desktop/MyFiles.txt" \
| while IFS= read -r Pattern; do
if [[ $MyFiles == $Pattern ]]; then
print "$MyFiles matches pattern $Pattern"
break
fi
done
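The same idea as a bash function, as a sketch rather than a finished tool. The function name first_matching_pattern is made up. It reads the pattern file with a redirection instead of a pipe and strips a trailing carriage return in the shell instead of with sed.

```shell
#!/bin/bash
# first_matching_pattern NAME PATTERN_FILE
# Print the first glob pattern in PATTERN_FILE that matches NAME and return 0;
# return 1 if no pattern matches.
first_matching_pattern() {
    local name=$1 file=$2 pattern
    while IFS= read -r pattern || [[ -n $pattern ]]; do
        pattern=${pattern%$'\r'}        # tolerate Windows line endings
        [[ -z $pattern ]] && continue   # skip empty lines
        # Unquoted right-hand side: [[ ... == ... ]] does glob matching
        if [[ $name == $pattern ]]; then
            printf '%s\n' "$pattern"
            return 0
        fi
    done < "$file"
    return 1
}

# Usage:
# if first_matching_pattern "$MyFiles" "$DirFiles" >/dev/null; then
#     echo "The file $MyFiles exists"
# fi
```

Quoting the right-hand side ("$pattern") would turn the test into a literal string comparison, so it is left unquoted on purpose.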

Using array inside awk in shell script

I am very new to Unix shell scripting and am trying to learn more about it. Please check my requirement and my approach.
I have an input file containing this data:
ABC = A:3 E:3 PS:6
PQR = B:5 S:5 AS:2 N:2
I am trying to parse the data and get the result as
ABC
A=3
E=3
PS=6
PQR
B=5
S=5
AS=2
N=2
The values can be added horizontally and vertically so I am trying to use an array. I am trying something like this:
myarr=(main.conf | awk -F"=" 'NR!=1 {print $1}'))
echo ${myarr[1]}
# Or loop through every element in the array
for i in "${myarr[@]}"
do
:
echo $i
done
or
awk -F"=" 'NR!=1 {
print $1"\n"
STR=$2
IFS=':' read -r -a array <<< "$STR"
for i in "${!array[@]}"
do
echo "$i=>${array[i]}"
done
}' main.conf
But when I add this code to a .sh file and try to run it, I get these syntax errors:
$ awk -F"=" 'NR!=1 {
> print $1"\n"
> STR=$2
> FS= read -r -a array <<< "$STR"
> for i in "${!array[@]}"
> do
> echo "$i=>${array[i]}"
> done
>
> }' main.conf
awk: cmd. line:4: FS= read -r -a array <<< "$STR"
awk: cmd. line:4: ^ syntax error
awk: cmd. line:5: for i in "${!array[@]}"
awk: cmd. line:5: ^ syntax error
awk: cmd. line:8: done
awk: cmd. line:8: ^ syntax error
How can I get the expected output shown above?
This is the awk code to do what you want:
$ cat tst.awk
BEGIN { FS="[ =:]+"; OFS="=" }
{
print $1
for (i=2;i<NF;i+=2) {
print $i, $(i+1)
}
print ""
}
and this is the shell script (yes, all a shell script does to manipulate text is call awk):
$ awk -f tst.awk file
ABC
A=3
E=3
PS=6
PQR
B=5
S=5
AS=2
N=2
A UNIX shell is an environment from which to call UNIX tools (find, sort, sed, grep, awk, tr, cut, etc.). It has its own language for manipulating (e.g. creating/destroying) files and processes and sequencing calls to tools but it is NOT intended to be used to manipulate text. The guys who invented shell also invented awk for shell to call to manipulate text.
Read https://unix.stackexchange.com/questions/169716/why-is-using-a-shell-loop-to-process-text-considered-bad-practice and the book Effective Awk Programming, 4th Edition, by Arnold Robbins.
First off, a command that does what you want:
$ sed 's/ = /\n/;y/: /=\n/' main.conf
ABC
A=3
E=3
PS=6
PQR
B=5
S=5
AS=2
N=2
This replaces, on each line, the first (and only) occurrence of " = " with a newline (the s command), then turns all : into = and all spaces into newlines (the y command). Notice that
this works only because there is a space at the end of the first line (otherwise it would be a bit more involved to get the empty line between the blocks) and
this works only with GNU sed because it substitutes newlines; see this fantastic answer for all the details and how to get it to work with BSD sed.
As for what you tried, there is almost too much wrong with it to try and fix it piece by piece: from the wild mixing of awk and Bash to syntax errors all over the place. I recommend you read good tutorials for both, for example:
The BashGuide
Effective AWK Programming
A Bash solution
Here is a way to solve the same in Bash; I didn't use any arrays.
#!/bin/bash
# Read line by line into the 'line' variable. Setting 'IFS' to the empty string
# preserves leading and trailing whitespace; '-r' prevents interpretation of
# backslash escapes
while IFS= read -r line; do
# Three parameter expansions:
# Replace ' = ' by newline (escape backslash)
line="${line/ = /\\n}"
# Replace ':' by '='
line="${line//:/=}"
# Replace spaces by newlines (escape backslash)
line="${line// /\\n}"
# Print the modified input line; '%b' expands backslash escapes
printf "%b" "$line"
done < "$1"
Output:
$ ./SO.sh main.conf
ABC
A=3
E=3
PS=6
PQR
B=5
S=5
AS=2
N=2

Store the output of find command in an array [duplicate]

This question already has answers here:
How can I store the "find" command results as an array in Bash
(8 answers)
Closed 4 years ago.
How do I put the result of find $1 into an array?
In for loop:
for /f "delims=/" %%G in ('find $1') do %%G | cut -d\/ -f6-
I want to cry.
In bash:
file_list=()
while IFS= read -d $'\0' -r file ; do
file_list=("${file_list[@]}" "$file")
done < <(find "$1" -print0)
echo "${file_list[@]}"
file_list is now an array containing the results of find "$1".
What's special about "field 6"? It's not clear what you were attempting to do with your cut command.
Do you want to cut each file after the 6th directory?
for file in "${file_list[@]}" ; do
echo "$file" | cut -d/ -f6-
done
But why "field 6"? Can I presume that you actually want to return just the last element of the path?
for file in "${file_list[@]}" ; do
echo "${file##*/}"
done
Or even
echo "${file_list[@]##*/}"
Which will give you the last path element for each path in the array. You could even do something with the result
for file in "${file_list[@]##*/}" ; do
echo "$file"
done
Explanation of the bash program elements:
(One should probably use the builtin readarray instead)
find "$1" -print0
Find stuff and 'print the full file name on the standard output, followed by a null character'. This is important as we will split that output by the null character later.
<(find "$1" -print0)
"Process Substitution" : The output of the find subprocess is read in via a FIFO (i.e. the output of the find subprocess behaves like a file here)
while ...
done < <(find "$1" -print0)
The output of the find subprocess is read by the while command via <
IFS= read -d $'\0' -r file
This is the while condition:
read
Read one record of input (from the find command). The return value of read is 0 unless EOF is encountered, at which point the while loop exits.
-d $'\0'
...taking the null character as the delimiter (see QUOTING in the bash manpage). This is done because -print0 told find earlier to terminate each name with a null character.
-r
backslash is not considered an escape character as it may be part of the filename
file
Result (first word actually, which is unique here) is put into variable file
IFS=
The command is run with IFS, the special variable containing the characters on which read splits input into words, set to the empty string, because we don't want any splitting.
And inside the loop:
file_list=("${file_list[@]}" "$file")
Inside the loop, the file_list array is just grown by $file, suitably quoted.
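As a function, the whole thing might look like the sketch below. The function name find_to_array is made up. It appends with += instead of rebuilding the array on every iteration, which should give the same result, and it uses -d '', which bash treats the same way as -d $'\0'.

```shell
#!/bin/bash
# Fill the global array file_list with everything find reports under $1.
find_to_array() {
    local file
    file_list=()
    # NUL-delimited records keep names with spaces or newlines intact
    while IFS= read -r -d '' file; do
        file_list+=("$file")    # append one element to the array
    done < <(find "$1" -print0)
}

# In bash 4.4 or later the loop can be replaced by the builtin:
# readarray -d '' file_list < <(find "$1" -print0)

# Usage:
# find_to_array "$1"
# printf '%s\n' "${file_list[@]##*/}"
```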
arrayname=( $(find $1) )
I don't understand your loop question. If you are asking how to work with that array, then in bash you can loop through all array elements like this:
for element in $(seq 0 $((${#arrayname[@]} - 1)))
do
echo "${arrayname[$element]}"
done
This is probably not 100% foolproof, but it will probably work 99% of the time (I used the GNU utilities; the BSD utilities won't work without modifications; also, this was done using an ext4 filesystem):
declare -a BASH_ARRAY_VARIABLE=$(find <path> <other options> -print0 | sed -e 's/\x0$//' | awk -F'\0' 'BEGIN { printf "("; } { for (i = 1; i <= NF; i++) { printf "%c"gensub(/"/, "\\\\\"", "g", $i)"%c ", 34, 34; } } END { printf ")"; }')
Then you would iterate over it like so:
for FIND_PATH in "${BASH_ARRAY_VARIABLE[@]}"; do echo "$FIND_PATH"; done
Make sure to enclose $FIND_PATH inside double-quotes when working with the path.
Here's a simpler pipeless version, based on the version of user2618594
declare -a names=$(echo "("; find <path> <other options> -printf '"%p" '; echo ")")
for nm in "${names[@]}"
do
echo "$nm"
done
To loop through a find, you can simply use find:
for file in "`find "$1"`"; do
echo "$file" | cut -d/ -f6-
done
That is what I understood from your question.
