Bash script to loop through file name to delete at specific index

I have a ton of files that are named like this:
nn - xxxxxxxxxxxxxx-OOO.ext
Here nn is always a two-digit number and xxxxx is text of variable length. (The -OOO suffix is the same in all of the files.) What should be in the loop to rename the files to:
xxxxxxxxxxxxxx.ext
This removes the leading "nn - " (always the first 5 characters) and the trailing -OOO.

You can do that with two substring operations:
$ name="nn - xxxx x xx xx xxxxx-OOO.ext"
$ part1=${name:5} # substring starting at offset 5 (drops the first 5 characters)
$ part2=${part1%-OOO.ext} # remove `-OOO.ext` at the end of $part1
$ final="$part2".ext
$ echo "$final"
xxxx x xx xx xxxxx.ext
$ mv "$name" "$final"
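To apply this to every file, the same two expansions can go inside a for loop. The following is a minimal sketch, assuming the files sit in the current directory and follow the naming pattern from the question; strip_name and rename_all are hypothetical helper names, not part of the answer above.

```shell
#!/bin/bash
# strip_name (hypothetical helper): drop the 5-character "nn - " prefix
# and the "-OOO" that precedes the extension.
strip_name() {
    local name=$1
    local ext=${name##*.}      # text after the last dot, e.g. "ext"
    local rest=${name:5}       # everything from offset 5 onwards
    printf '%s.%s\n' "${rest%-OOO."$ext"}" "$ext"
}

# rename_all (hypothetical helper): rename every matching file in the
# current directory, skipping targets that already exist.
rename_all() {
    local f new
    for f in [0-9][0-9]' - '*-OOO.*; do
        [ -e "$f" ] || continue          # the glob matched nothing
        new=$(strip_name "$f")
        [ -e "$new" ] || mv -- "$f" "$new"
    done
}
```

Call rename_all from inside the directory that holds the files. Temporarily replacing mv with echo mv gives a dry run that only prints what would be renamed.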

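echo "$file_name" | sed "s/.*-\s*\(.*\)-.*/\1.ext/" will give you the "xxxxxxx.ext" as you asked for in the OP. Note that \s is not POSIX (GNU sed supports it), and that the pattern only works when the xxxxx part itself contains no hyphen.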

Related

find first numeric value of first row and insert it at the end of second line, do it for multiples files in the same directory

I have multiple files to process within a unique directory.
They share the same extension (.dat) but their name could be anything.
The 1st line of each file is arbitrary text; the first numeric value found in it has to be captured and appended to the end of the 2nd line.
Many other lines follow.
The number of fields in the 1st and 2nd lines is unknown, as is the position of the numeric value in the 1st line, which can also contain several numeric values.
A file currently looks like the example below, where '850' is the value to capture in 'xxx.dat':
typical input:
field11 field21 ... 850 ... 520 ... blabla ... 1100 ... fieldi1
field12 field22 ... fieldj2
field13 field23 ... fieldk3
...
field1n field2n ... fieldzn
desired output:
field11 field21 ... 850 ... 520 ... blabla ... 1100 ... fieldi1
field12 field22 ... fieldj2 850
field13 field23 ... fieldk3
...
field1n field2n ... fieldzn
Ideally a single command or loop would process all the .dat files.
I am a beginner with sed and awk and unfortunately far from being able to solve this.
Could I have any advice or solutions for this?
Thanks.

You can use this shell script:
#!/bin/sh
# make a temp file
if command -v mktemp >/dev/null 2>&1; then
tmp=$(mktemp) || exit 1
else
tmp=edit.tmp
[ -e "$tmp" ] && exit 1 # refuse to clobber an existing file
fi
# remove the temp file if interrupted (set only once $tmp is known)
trap 'rm -f "$tmp"; exit' INT TERM
# edit .dat files
for i in *.dat; do
awk '
NR==1 {while ($(++i) ~ /[^0-9]/); num=$i}
NR==2 {print $0,num}
NR!=2' "$i" > "$tmp" &&
mv "$tmp" "$i"
done
rm -f "$tmp"
It grabs the first digits-only field in line 1 and appends it to line 2.
Run it in a directory containing only the .dat files you wish to edit.
It helps a lot to say which OS platform you're targeting.
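The field test used by the awk program can also be written in plain bash, which makes it easier to see why a field such as field11 is skipped while 850 is picked. This is only a sketch of the same idea: first_number and print_with_number are hypothetical names, and the second function prints the edited text to standard output instead of rewriting the file.

```shell
#!/bin/bash
# first_number (hypothetical helper): print the first whitespace-separated
# field of LINE that consists of digits only; return 1 if there is none.
first_number() {
    local -a fields
    local f
    read -ra fields <<< "$1"
    for f in "${fields[@]}"; do
        if [[ $f =~ ^[0-9]+$ ]]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    return 1
}

# print_with_number (hypothetical helper): print FILE with the number
# found on line 1 appended to line 2.
print_with_number() {
    local file=$1 line num="" n=0
    while IFS= read -r line; do
        n=$((n + 1))
        if [ "$n" -eq 1 ]; then
            num=$(first_number "$line")
        elif [ "$n" -eq 2 ]; then
            line="$line $num"
        fi
        printf '%s\n' "$line"
    done < "$file"
}
```

For large files the awk version is likely to be faster; the bash version is mainly useful for checking which value would be picked.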

How to store variables from loop to a file

I am trying to store the variables $d, $tf_name and $db_orig, created in the following loop, in a file. I want to end up with a tab-separated MY_FILE.txt containing the fields $d, $tf_name and $db_orig, with each iteration's set of variables stored on a new line of MY_FILE.txt.
MY_ARRAY=()
for d in */
do
IN=$d
folderIN=(${IN//_/ })
tf_name=${folderIN[-1]%/*}
db_orig=${folderIN[-2]%/*};
ENTRY="$d\t$tf\t$id\t$db_orig\n"
MY_ARRAY+=$ENTRY
done
$MY_ARRAY > MY_FILE.txt
It doesn't recognise \t and \n as TAB and NEWLINE respectively. It stores all the values next to each other in the same line without TAB, in the array MY_ARRAY.
Any help?

Yes, this happens because $MY_ARRAY > MY_FILE.txt is not a valid command.
You need to print your array to the file.
To print it correctly, use either
echo -e "${MY_ARRAY[@]}" >file or printf.
From man echo:
echo -e : enable interpretation of backslash escapes
Moreover, if you need to store the $ENTRY to your array you need to do it like this:
MY_ARRAY+=("$ENTRY")
In any case, you can do it without the need of an array. You can just apply += in the ENTRY : ENTRY+="$d\t$tf\t$id\t$db_orig\n"
Test:
$ e+="a\tb\tc\td\n"
$ e+="aa\tbb\tcc\tdd\n"
$ e+="aaa\tbbb\tccc\tddd\n"
$ echo -e "$e"
a b c d
aa bb cc dd
aaa bbb ccc ddd
# Test with array
$ e="a\tb\tc\td\n" && myar+=("$e")
$ e="aa\tbb\tcc\tdd\n" && myar+=("$e")
$ e="aaa\tbbb\tccc\tddd\n" && myar+=("$e")
$ echo -e "${myar[@]}"
a b c d
aa bb cc dd
aaa bbb ccc ddd
#Alternative array printing
$ for i in "${myar[@]}";do echo -en "$i";done
a b c d
aa bb cc dd
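The printf route mentioned above avoids both the array and echo -e, because printf interprets \t and \n in its format string. A minimal sketch, assuming directory names of the form something_dborig_tfname as in the question; list_dirs is a hypothetical helper name.

```shell
#!/bin/bash
# list_dirs (hypothetical helper): print one tab-separated line per
# directory: directory name, last "_" component, second-to-last component.
list_dirs() {
    local d base n
    local -a parts
    for d in */; do
        [ -d "$d" ] || continue            # the glob matched nothing
        base=${d%/}                        # strip the trailing slash
        IFS=_ read -ra parts <<< "$base"   # split the name on "_"
        n=${#parts[@]}
        [ "$n" -ge 2 ] || continue         # need at least two components
        printf '%s\t%s\t%s\n' "$d" "${parts[n-1]}" "${parts[n-2]}"
    done
}
```

Running list_dirs > MY_FILE.txt in the parent directory writes the whole table in one go.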

Append elements of an array to the end of a line

First let me say I followed questions on stackoverflow.com that relate to my question and it seems the rules are not applying. Let me show you.
The following script:
#!/bin/bash
OUTPUT_DIR=/share/es-ops/Build_Farm_Reports/WorkSpace_Reports
TODAY=`date +"%m-%d-%y"`
HOSTNAME=`hostname`
WORKSPACES=( "bob" "mel" "sideshow-ws2" )
if ! [ -f $OUTPUT_DIR/$HOSTNAME.csv ] && [ $HOSTNAME == "sideshow" ]; then
echo "$TODAY","$HOSTNAME" > $OUTPUT_DIR/$HOSTNAME.csv
echo "${WORKSPACES[0]}," >> $OUTPUT_DIR/$HOSTNAME.csv
sed -i "/^'"${WORKSPACES[0]}"'/$/'"${WORKSPACES[1]}"'/" $OUTPUT_DIR/$HOSTNAME.csv
sed -i "/^'"${WORKSPACES[1]}"'/$/${WORKSPACES[2]}"'/" $OUTPUT_DIR/$HOSTNAME.csv
fi
I want the output to look like:
09-20-14,sideshow
bob,mel,sideshow-ws2
The sed statements are supposed to append successive array elements to the preceding ones on the same line. I know there's a simpler way to do this, like:
echo "${WORKSPACES[0]},${WORKSPACES[1]},${WORKSPACES[2]}" >> $OUTPUT_DIR/$HOSTNAME.csv
But let's say I had 30 elements in the array and wanted to append them one after the other on the same line. Can you show me how to loop through the elements of an array and append them one after the other on the same line?
Also let's say I had the output of a command like:
df -m /export/ws/$ws | awk '{if (NR!=1) {print $3}}'
and I wanted to append that to the end of the same line.
But when I run it I get:
+ OUTPUT_DIR=/share/es-ops/Build_Farm_Reports/WorkSpace_Reports
++ date +%m-%d-%y
+ TODAY=09-20-14
++ hostname
+ HOSTNAME=sideshow
+ WORKSPACES=("bob" "mel" "sideshow-ws2")
+ '[' -f /share/es-ops/Build_Farm_Reports/WorkSpace_Reports/sideshow.csv ']'
And the file right now looks like:
09-20-14,sideshow
bob,
I am happy to report that user syme solved this (see below), but then I realized I need the date in the first column:
09-7-14,bob,mel,sideshow-ws2
Can I do this using syme's for loop?
User syme solved this too; he said "Just add $TODAY to the for loop", like this:
for v in "$TODAY" "${WORKSPACES[@]}"
Now the output looks like this (I changed the elements in the array, by the way):
sideshow
09-20-14,bob_avail,bob_used,mel_avail,mel_used,sideshow-ws2_avail,sideshow-ws2_used
Below that, the next line should start with a , in the first column (skipping the date), followed by:
df -m /export/ws/$v | awk '{if (NR!=1) {print $3}}'
which equals the available space on bob in the first iteration,
and then:
df -m /export/ws/$v | awk '{if (NR!=1) {print $2}}'
which equals the used space on bob in the 2nd iteration.
Then we just move on to the next value in ${WORKSPACES[@]},
which will be mel, and do the available and used values as we did with bob ($v above).
I know you geniuses on here will make child's play out of this.
I solved my own last question on this thread:
WORKSPACES2=( "bob" "mel" "sideshow-ws2" )
separator="," # a comma precedes every value, so the line starts with an empty first column
for v in "${WORKSPACES2[@]}"
do
available=`df -m /export/ws/$v | awk '{if (NR!=1) {print $3}}'`
used=`df -m /export/ws/$v | awk '{if (NR!=1) {print $2}}'`
echo -n "$separator$available$separator$used" >> $OUTPUT_DIR/$HOSTNAME.csv # append, concatenated, the separator and the value to the file
done
produces:
sideshow
09-20-14,bob_avail,bob_used,mel_avail,mel_used,sideshow-ws2_avail,sideshow-ws2_used
,470400,1032124,661826,1032124,43443,1032108

echo -n prints text without a trailing line break.
To loop over the values of the array, you can use a for-loop:
echo "$TODAY,$HOSTNAME" > $OUTPUT_DIR/$HOSTNAME.csv # with a linebreak
separator="" # defined empty for the first value
for v in "${WORKSPACES[@]}"
do
echo -n "$separator$v" >> $OUTPUT_DIR/$HOSTNAME.csv # append, concatenated, the separator and the value to the file
separator="," # comma for the next values
done
echo >> $OUTPUT_DIR/$HOSTNAME.csv # add a linebreak (if you want it)
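As an alternative to tracking the separator by hand, bash can join the elements itself: "$*" expands to all arguments joined by the first character of IFS. A small sketch of that technique; join_by_comma is a hypothetical helper name, not something from the original answer.

```shell
#!/bin/bash
# join_by_comma (hypothetical helper): print all arguments on one line,
# separated by commas. IFS is changed only inside the function.
join_by_comma() {
    local IFS=,
    printf '%s\n' "$*"
}

WORKSPACES=( "bob" "mel" "sideshow-ws2" )
join_by_comma "${WORKSPACES[@]}"    # prints bob,mel,sideshow-ws2
```

To put the date first, pass it as an extra argument: join_by_comma "$TODAY" "${WORKSPACES[@]}" >> "$OUTPUT_DIR/$HOSTNAME.csv".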

Store grep output in an array

I need to search a pattern in a directory and save the names of the files which contain it in an array.
Searching for pattern:
grep -HR "pattern" . | cut -d: -f1
This prints me all filenames that contain "pattern".
If I try:
targets=$(grep -HR "pattern" . | cut -d: -f1)
length=${#targets[@]}
for ((i = 0; i != length; i++)); do
echo "target $i: '${targets[i]}'"
done
This prints only one element, which contains a string with all the filenames.
output: target 0: 'file0 file1 .. fileN'
But I need:
output: target 0: 'file0'
output: target 1: 'file1'
.....
output: target N: 'fileN'
How can I achieve the result without doing a boring split operation on targets?

You can use:
targets=($(grep -HRl "pattern" .))
Note use of (...) for array creation in BASH.
Also you can use grep -l to get only file names in grep's output (as shown in my command).
The answer above (written 7 years ago) assumed that the output filenames won't contain special characters such as whitespace or glob characters. Here is a safe way to read such filenames into an array (this also works with older bash versions):
while IFS= read -rd ''; do
targets+=("$REPLY")
done < <(grep --null -HRl "pattern" .)
# check content of array
declare -p targets
On Bash 4.4+ (the first version in which readarray accepts -d) you can use readarray instead of a loop:
readarray -d '' -t targets < <(grep --null -HRl "pattern" .)
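Putting the pieces together, the NUL-delimited read can be combined with a loop over the array indices to produce the output format the question asks for. This is a sketch under the assumption that grep supports --null (as GNU and BSD grep do); collect_targets and print_targets are hypothetical helper names.

```shell
#!/bin/bash
# collect_targets (hypothetical helper): fill the global array "targets"
# with the names of files under DIR (default ".") that contain PATTERN.
collect_targets() {
    local pattern=$1 dir=${2:-.} file
    targets=()
    while IFS= read -r -d '' file; do
        targets+=("$file")
    done < <(grep --null -Rl -- "$pattern" "$dir")
}

# print_targets (hypothetical helper): one line per array element.
print_targets() {
    local i
    for i in "${!targets[@]}"; do
        printf "target %d: '%s'\n" "$i" "${targets[i]}"
    done
}
```

A typical call is collect_targets "pattern" . followed by print_targets. Filenames containing spaces stay intact as single array elements.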

What is the shell script instruction to divide a file with sorted lines to small files?

I have a large text file in the following format:
1 2327544589
1 3554547564
1 2323444333
2 3235434544
2 3534532222
2 4645644333
3 3424324322
3 5323243333
...
The output should be text files whose names carry the number from the first column of the original file as a suffix, each containing the corresponding second-column numbers, as follows:
file1.txt:
2327544589
3554547564
2323444333
file2.txt:
3235434544
3534532222
4645644333
file3.txt:
3424324322
5323243333
...
The script should run on Solaris, but I'm also having trouble with awk and with options of other commands, such as -c with cut; the system is very limited, so I am looking for commands commonly available on Solaris. I am not allowed to change or install anything on the system. Using a loop is not very efficient because the script takes too long with large files. So, aside from awk and loops, any suggestions?

Something like this perhaps:
$ awk 'NF>1{print $2 > ("file" $1 ".txt")}' input
$ cat file1.txt
2327544589
3554547564
2323444333
or if you have bash available, try this:
#!/bin/bash
while read -r a b
do
[ -z "$a" ] && continue
echo "$b" >> "file$a.txt"
done < input
output:
$ paste file{1..3}.txt
2327544589 3235434544 3424324322
3554547564 3534532222 5323243333
2323444333 4645644333
