Bash shell script to extract archive - arrays

I am trying to convert the following script which I use to create archives into one which extracts them.
[[ $# -lt 2 ]] && exit 1
name=$1; shift
files=("$@")
#exclude all files/directories that are not readable
for index in "${!files[@]}"; do
    [[ -r ${files[index]} ]] || unset "files[index]"
done
[[ ${#files[@]} -eq 0 ]] && exit 1
if tar -czvf "${name:-def_$$}.tar.gz" "${files[@]}"; then
    echo "Ok"
else
    echo "Error"
    exit 1
fi
So far I have this:
[[ $# -lt 1 ]] && exit 1
files=("$@")
#remove files and directories which are not readable
for index in "${!files[@]}"; do
    [[ -r ${files[index]} ]] || unset "files[index]"
done
[[ ${#files[@]} -eq 0 ]] && exit 1
if tar -xzvf "${files[@]}".tar.gz; then
    echo "OK"
else
    echo "Error"
    exit 1
fi
I don't know whether I need to keep the shift, since this script does not need to discard any arguments: I want to take them all and extract each one. I also see there is a -C switch that lets the user choose where the extracted files go. How would I add this as an option, given that the user may or may not want to change the directory the files are extracted to?

Unfortunately, you can't just do tar -xzvf one.tar.gz two.tar.gz. The straightforward approach is a good old for loop:
for file in "${files[@]}"; do
    tar -xzvf "$file"
done
Or you can use this:
cat "${files[@]}" | tar -xzvf - -i
You can make the first argument the target directory for the -C option:
[[ $# -lt 2 ]] && exit 1
target=$1; shift
files=("$@")
#remove files and directories which are not readable
for index in "${!files[@]}"; do
    [[ -r ${files[index]} ]] || unset "files[index]"
done
[[ ${#files[@]} -eq 0 ]] && exit 1
mkdir -p -- "$target" || exit 1
for file in "${files[@]}"; do
    tar -xzvf "$file" -C "$target"
done
./script /some/path one.tar.gz two.tar.gz
The list of files for tar can also be constructed like this:
target=$1; shift
for file; do
[[ -r $file ]] && files+=("$file")
done
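Since the question asks for -C to be optional rather than a required first argument, here is a minimal sketch using the getopts builtin. The function name extract_all and its usage message are hypothetical, not part of the original scripts; it assumes gzip-compressed tar archives, as in the scripts above, and defaults to the current directory when -C is not given.

```bash
#!/usr/bin/env bash
# Hypothetical helper: extract_all [-C target_dir] archive.tar.gz...
extract_all() {
    local target=. opt file status=0
    local OPTIND=1                      # reset so the function can be called repeatedly
    while getopts ':C:' opt; do
        case $opt in
            C) target=$OPTARG ;;        # optional target directory
            *) echo "Usage: extract_all [-C dir] archive..." >&2; return 2 ;;
        esac
    done
    shift $((OPTIND - 1))               # drop the parsed options, keep the archives
    (( $# > 0 )) || return 2
    mkdir -p -- "$target" || return 1
    for file; do
        if [[ -r $file ]]; then
            tar -xzf "$file" -C "$target" || status=1
        else
            echo "Skipping unreadable file: $file" >&2
            status=1
        fi
    done
    return "$status"
}
```

Called as extract_all one.tar.gz two.tar.gz it extracts into the current directory; extract_all -C /some/path one.tar.gz two.tar.gz extracts into /some/path, creating it if needed.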

Related

Change a variable's output if a match is successful with each loop of for command

Startup action:
I am trying to make a function that I will call with .bash_functions which is sourced in my .bashrc file.
Goal:
I am trying to reduce the amount of code required. If I make a function with repeated commands that have (mostly minor) differences for each command line I end up with a HUGE function...
Purpose of function:
The function will search the current working directory for files that match an array of predefined extensions. If a file with a matching extension is found it will execute a certain customized command line. If another match is found with a different extension then another type of command must be used instead of the first command that was mentioned and so forth.
If the following files are in the current working directory:
file1.7z
file2.bz2
file3.gz
file4.tgz
file5.xz
Then running the function in a terminal will output the following lines:
7z x -o/root/file1 /root/file1.7z
tar -xvfj /root/file2.bz2 -C /root/file2
tar -xvf /root/file3.gz -C /root/file3
tar -xvf /root/file4.tgz -C /root/file4
tar -xvf /root/file5.xz -C /root/file5
Where I am at so far:
I don't yet have the array set up because I am stuck figuring out the flow to loop the commands correctly. I have the below script which does what I want however like I mentioned I want to slim it down if possible (learn some new tricks from you guys)!
untar()
{
clear
for i in *.*
do
local EXT="$(echo "${i}" | sed 's/.*\.//')"
if [ -n "${EXT}" ]; then
if [[ "${EXT}" == '7z' ]]; then
if [ ! -d "${PWD}"/"${i%%.*}" ]; then mkdir -p "${PWD}"/"${i%%.*}"; fi
echo 7z x -o"${PWD}"/"${i%%.*}" "${PWD}"/"${i}"
elif [[ "${EXT}" == 'bz2' ]]; then
if [ ! -d "${PWD}"/"${i%%.*}" ]; then mkdir -p "${PWD}"/"${i%%.*}"; fi
echo tar -xvfj "${PWD}"/"${i}" -C "${PWD}"/"${i%%.*}"
elif [[ "${EXT}" == 'gz' ]]; then
if [ ! -d "${PWD}"/"${i%%.*}" ]; then mkdir -p "${PWD}"/"${i%%.*}"; fi
echo tar -xvf "${PWD}"/"${i}" -C "${PWD}"/"${i%%.*}"
elif [[ "${EXT}" == 'tgz' ]]; then
if [ ! -d "${PWD}"/"${i%%.*}" ]; then mkdir -p "${PWD}"/"${i%%.*}"; fi
echo tar -xvf "${PWD}"/"${i}" -C "${PWD}"/"${i%%.*}"
elif [[ "${EXT}" == 'xz' ]]; then
if [ ! -d "${PWD}"/"${i%%.*}" ]; then mkdir -p "${PWD}"/"${i%%.*}"; fi
echo tar -xvf "${PWD}"/"${i}" -C "${PWD}"/"${i%%.*}"
fi
fi
done;
}
Required to function:
Essentially, two different tar commands and a single 7z command are needed (as far as i can come up with anyways)
Command 1: tar -xvf <target.{gz,tgz,xz}> -C <output folder>
Command 2: tar -xvfj <target.bz2> -C <output folder>
Command 3: 7z x -o<output folder> <target.7z>
Any ideas you can throw my way?
Since you're familiar with parameter substitution I'd start with eliminating the overhead of the dual subprocess invocations for finding EXT; shorter code but also a performance improvement as this doesn't require spawning/cleaning-up any subprocesses (for each pass through the loop):
local EXT="${i##*.}"
NOTE: this assumes all files have a single extension (e.g., .tgz instead of .tar.gz); otherwise OP may need to add more logic to determine how to process files with multiple extensions (e.g., abc.tar.gz vs a.b.c.tar.gz).
Next idea would be to pull the mkdir logic out by itself; this eliminates 4 mkdir lines:
if [ ! -d "${PWD}"/"${i%%.*}" ]; then mkdir -p "${PWD}"/"${i%%.*}"; fi
# or
[[ ! -d "${PWD}"/"${i%%.*}" ]] && mkdir -p "${PWD}"/"${i%%.*}"
Small detour ...
OP has mentioned wanting to use an array to manage a list of extensions and while such an approach is doable it would also require some thought on how to store and reference the associated commands.
Assuming this will be the only piece of code that needs to process a list of extensions I'd probably opt for 'hardcoding' the logic for each extension (as opposed to storing in a resource/config file and then loading into an array). Net result, stick with current approach but with a few improvements.
Back to code (re)design ...
Next idea would be to collapse the tar calls into a single call with a test for bz2; this eliminates 3 tests and 3 tar lines:
if [[ "${EXT}" == '7z' ]]; then
echo 7z x -o"${PWD}"/"${i%%.*}" "${PWD}"/"${i}"
else
jflag=""
[[ "${EXT}" == 'bz2' ]] && jflag="j"
# keep f last in the option cluster: tar reads whatever follows f as the archive name
echo tar -xv${jflag}f "${PWD}"/"${i}" -C "${PWD}"/"${i%%.*}"
fi
Personally, I'd probably opt for a case statement:
case "${EXT}" in
7z)
echo 7z x -o"${PWD}"/"${i%%.*}" "${PWD}"/"${i}"
;;
*)
jflag=""
[[ "${EXT}" == 'bz2' ]] && jflag="j"
# keep f last in the option cluster: tar reads whatever follows f as the archive name
echo tar -xv${jflag}f "${PWD}"/"${i}" -C "${PWD}"/"${i%%.*}"
;;
esac
Pulling this all together:
untar()
{
clear
local EXT
for i in *.*
do
EXT="${i##*.}"
[[ ! -d "${PWD}"/"${i%%.*}" ]] && mkdir -p "${PWD}"/"${i%%.*}"
case "${EXT}" in
7z)
echo 7z x -o"${PWD}"/"${i%%.*}" "${PWD}"/"${i}"
;;
*)
jflag=""
[[ "${EXT}" == 'bz2' ]] && jflag="j"
# keep f last in the option cluster: tar reads whatever follows f as the archive name
echo tar -xv${jflag}f "${PWD}"/"${i}" -C "${PWD}"/"${i%%.*}"
;;
esac
done
}
I'm posting this as another example for others to see; I came up with it after seeing the great example by markp-fuso.
I added zip files to the list this time.
untar()
{
clear
local EXT
for i in *.*
do
EXT="${i##*.}"
[[ ! -d "${PWD}"/"${i%%.*}" ]] && mkdir -p "${PWD}"/"${i%%.*}"
case "${EXT}" in
7z|zip)
7z x -o"${PWD}"/"${i%%.*}" "${PWD}"/"${i}"
;;
bz2|gz|tgz|xz)
jflag=""
[[ "${EXT}" == 'bz2' ]] && jflag="j"
# keep f last in the option cluster: tar reads whatever follows f as the archive name
tar -xv${jflag}f "${PWD}"/"${i}" -C "${PWD}"/"${i%%.*}"
;;
esac
done
}
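For the array of extensions mentioned in the question, one possible shape is an associative array that maps each extension to the tool that handles it. This is only a sketch: the names EXTRACTOR and extract_cmd are hypothetical, it prints the command instead of running it, and it assumes a tar (such as GNU tar) that detects the compression type on its own when reading an archive file, so no j flag is passed.

```bash
#!/usr/bin/env bash
# Hypothetical lookup table: extension -> extraction tool.
declare -A EXTRACTOR=(
    [7z]=7z [zip]=7z
    [gz]=tar [tgz]=tar [bz2]=tar [xz]=tar
)

# Print (not run) the command that would extract "$1" into a directory
# named after the file with every extension stripped.
extract_cmd() {
    local file=$1 ext=${1##*.} dest
    [[ -n $ext ]] || return 1           # an empty subscript is not allowed
    dest=$PWD/${file%%.*}
    case ${EXTRACTOR[$ext]-} in
        7z)  printf '7z x -o%s %s\n' "$dest" "$PWD/$file" ;;
        tar) printf 'tar -xvf %s -C %s\n' "$PWD/$file" "$dest" ;;
        *)   return 1 ;;                # unknown extension: nothing to do
    esac
}
```

Supporting a new extension then means adding one entry to the table; the loop body can stay as for i in *.*; do extract_cmd "$i"; done.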

Check if associative array element exists in bash

In a bash script, I have a locale in a variable like so
locale=fr_ma
I also have an associative array like this
declare -A new_loc
new_loc[fr_ma]=en_ma
new_loc[el_gr]=en_gr
new_loc[sl_si]=en_si
I want to check if new_loc element with the key ${locale} exists
I thought this should work but it doesn't:
if [[ -v "${new_loc[${locale}]}" ]]
then
    echo -e "${locale} has a new_loc"
fi
Any ideas on how else I can test for this?
For older versions of bash (it looks like [[ -v array[index] ]] was introduced in version 4.3), you can use the ${var-word} form to test if a variable has been set:
$ zz="$RANDOM$RANDOM$RANDOM"
$ echo $zz
270502100415054
$ declare -a name
$ locale=foo
$ [[ ${name[$locale]-$zz} = "$zz" ]] && echo var is unset || echo var has a value
var is unset
$ name[$locale]=""
$ [[ ${name[$locale]-$zz} = "$zz" ]] && echo var is unset || echo var has a value
var has a value
$ [[ ${name[$locale]:-$zz} = "$zz" ]] && echo var is unset or empty || echo var has a value
var is unset or empty
The tricky part is devising a $zz string that won't appear as actual data in your array.
Much better suggestion from @chepner:
if [[ -z "${name[$locale]+unset}" ]]; then
echo "no name for $locale"
else
echo "name for $locale is ${name[$locale]}"
fi
-v takes an (indexed) name as its argument, since you are trying to determine if the expansion makes sense in the first place.
if [[ -v new_loc[$locale] ]]; then
echo "Locale ${locale} now maps to ${new_loc[$locale]}"
fi
Word of warning
While the BASH manual page describes -v for [[ and test, reliable results are returned from [[ only.
Consider this (Bash 4.4):
> [ -v "$a[1]" ] && echo true
> a[1]=''
> [ -v "$a[1]" ] && echo true
> declare -p a
declare -a a=([1]="")
> [ -v $a[1] ] && echo true
> [[ -v $a[1] ]] && echo true
> [[ -v a[1] ]] && echo true
true
> [[ -v a[0] ]] && echo true
>
I managed to solve the problem by checking if the variable is not an empty string instead.
Example:
locale=fr_ma
declare -A new_loc
new_loc[fr_ma]=en_ma
new_loc[el_gr]=en_gr
if [[ ! -z ${new_loc[$locale]} ]]; then
echo "Locale ${locale} now maps to ${new_loc[$locale]}"
fi
Output:
Locale fr_ma now maps to en_ma
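The empty-string check above cannot tell a missing key from a key whose value is empty. As a small sketch of the difference, the following uses the ${var+word} expansion suggested earlier; the helper name has_new_loc and the empty el_gr entry are made up for illustration.

```bash
#!/usr/bin/env bash
declare -A new_loc=([fr_ma]=en_ma [el_gr]='')   # el_gr exists but is empty

# Succeeds when the key exists, even if its value is empty.
has_new_loc() {
    [[ -n ${new_loc[$1]+set} ]]
}

for locale in fr_ma el_gr sl_si; do
    if has_new_loc "$locale"; then
        echo "$locale -> '${new_loc[$locale]}'"
    else
        echo "$locale has no mapping"
    fi
done
```

Here el_gr is reported as present with an empty value, whereas a [[ -n ${new_loc[el_gr]} ]] test would treat it the same as the missing sl_si.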

Using mapfile to save output to associative arrays

While practicing bash, I tried writing a script that searches the home directory for duplicates of the files in a given directory and deletes them. Here's what my script looks like now.
#!/bin/bash
# create-list: create a list of regular files in a directory
declare -A arr1 sumray origray
if [[ -d "$HOME/$1" && -n "$1" ]]; then
echo "$1 is a directory"
else
echo "Usage: create-list Directory | options" >&2
exit 1
fi
for i in $HOME/$1/*; do
[[ -f $i ]] || continue
arr1[$i]="$i"
done
for i in "${arr1[@]}"; do
Name=$(sed 's/[][?*]/\\&/g' <<< "$i")
dupe=$(find ~ -name "${Name##*/}" ! -wholename "$Name")
if [[ $(find ~ -name "${Name##*/}" ! -wholename "$Name") ]]; then
mapfile -t sumray["$i"] < <(find ~ -name "${Name##*/}" ! -wholename "$Name")
origray[$i]=$(md5sum "$i" | cut -c 1-32)
fi
done
for i in "${!sumray[@]}"; do
poten=$(md5sum "$i" | cut -c 1-32)
for i in "${!origray[@]}"; do
if [[ "$poten" = "${origray[$i]}" ]]; then
echo "${sumray[$i]} is a duplicate of $i"
fi
done
done
Originally, where mapfile -t sumray["$i"] < <(find ~ -name "${Name##*/}" ! -wholename "$Name") is now, my line was the following:
sumray["$i"]=$(find ~ -name "${Name##*/}" ! -wholename "$Name")
This saved the output of find to the array. But I had an issue. If a single file had multiple duplicates, then all locations found by find would be saved to a single value. I figured I could use the mapfile command to fix this, but now it's not saving anything to my array at all. Does it have to do with the fact that I'm using an associative array? Or did I just mess up elsewhere?
I'm not sure if I'm allowed to answer my own question, but I figured that I should post how I solved my problem.
As it turns out, the mapfile command does not work on associative arrays at all. So my fix was to save the output of find to a text file and then store that information in an indexed array. I tested this a few times and I haven't seemed to encounter any errors yet.
Here's my finished script.
#!/bin/bash
# create-list: create a list of regular files in a directory
declare -A arr1 origray
declare indexray
#Verify that Parameter is a directory.
if [[ -d "$HOME/$1/" && -n "$1" ]]; then
echo "Searching for duplicates of files in $1"
else
echo "Usage: create-list Directory | options" >&2
exit 1
fi
#create list of files in specified directory
for i in $HOME/${1%/}/*; do
[[ -f $i ]] || continue
arr1[$i]="$i"
done
#search for all duplicate files in the home directory
#by name
#find checksum of files in specified directory
for i in "${arr1[@]}"; do
Name=$(sed 's/[][?*]/\\&/g' <<< "$i")
if [[ $(find ~ -name "${Name##*/}" ! -wholename "$Name") ]]; then
find ~ -name "${Name##*/}" ! -wholename "$Name" >> temp.txt
origray[$i]=$(md5sum "$i" | cut -c 1-32)
fi
done
#create list of duplicate file locations.
if [[ -f temp.txt ]]; then
mapfile -t indexray < temp.txt
else
echo "No duplicates were found."
exit 0
fi
#compare similarly named files by checksum and delete duplicates
count=0
for i in "${!indexray[@]}"; do
poten=$(md5sum "${indexray[$i]}" | cut -c 1-32)
for i in "${!origray[@]}"; do
if [[ "$poten" = "${origray[$i]}" ]]; then
echo "${indexray[$count]} is a duplicate of a file in $1."
fi
done
count=$((count+1))
done
rm temp.txt
This is kind of sloppy, but it does what it's supposed to do. md5sum may not be the optimal way to check for duplicate files, but it works. All I have to do is replace echo "${indexray[$count]} is a duplicate of a file in $1." with rm -i -- "${indexray[$count]}" and it's good to go.
So my next question would have to be...why doesn't mapfile work with associative arrays?
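A likely explanation is that mapfile is documented as reading lines into an indexed array, assigning them to consecutive integer indexes, so it has no keys to use for an associative array, and a subscripted target such as sumray["$i"] is not a plain array name. The sketch below avoids the temporary file by feeding find to mapfile through process substitution. The function name same_content_copies is hypothetical, it swaps the md5sum comparison for cmp, and it assumes file names without glob characters or newlines.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print every file under search_root that has the same
# name as "$1" but a different path, and byte-identical content.
same_content_copies() {
    local original=$1 search_root=$2 candidate
    local -a candidates
    # mapfile fills an indexed array: one element per line of find output.
    mapfile -t candidates < <(
        find "$search_root" -type f -name "${original##*/}" ! -path "$original"
    )
    for candidate in "${candidates[@]}"; do
        # cmp -s is silent and succeeds only when the two files are identical
        if cmp -s -- "$original" "$candidate"; then
            printf '%s\n' "$candidate"
        fi
    done
}
```

To keep one list per original file, the output can still be stored in an associative array as a newline-separated string, for example dupes[$f]=$(same_content_copies "$f" "$HOME") after declare -A dupes.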

rename specific pattern of files in bash

I have the following files and directories:
/tmp/jj/
/tmp/jj/ese
/tmp/jj/ese/2010
/tmp/jj/ese/2010/test.db
/tmp/jj/dfhdh
/tmp/jj/dfhdh/2010
/tmp/jj/dfhdh/2010/rfdf.db
/tmp/jj/ddfxcg
/tmp/jj/ddfxcg/2010
/tmp/jj/ddfxcg/2010/df.db
/tmp/jj/ddfnghmnhm
/tmp/jj/ddfnghmnhm/2010
/tmp/jj/ddfnghmnhm/2010/sdfs.db
I want to rename all 2010 directories to their parent directories then tar all .db files...
What I tried is:
#!/bin/bash
if [ $# -ne 1 ]; then
echo "Usage: `basename $0` <absolute-path>"
exit 1
fi
if [ "$(id -u)" != "0" ]; then
echo "This script must be run as root" 1>&2
exit 1
fi
rm /tmp/test
find $1 >> /tmp/test
for line in $(cat /tmp/test)
do
arr=$( (echo $line | awk -F"/" '{for (i = 1; i < NF; i++) if ($i == "2010") print $(i-1)}') )
for index in "${arr[@]}"
do
echo $index #HOW TO WRITE MV COMMAND RATHER THAN ECHO COMMAND?
done
done
1) The result is:
ese
dfhdh
ddfxcg
ddfnghmnhm
But it should be:
ese
dfhdh
ddfxcg
ddfnghmnhm
2) How can I rename all 2010 directories to their parent directory?
I mean, how do I do the following (I want to do it in a loop because of the large number of directories):
mv /tmp/jj/ese/2010 /tmp/jj/ese/ese
mv /tmp/jj/dfhdh/2010 /tmp/jj/dfhdh/dfhdh
mv /tmp/jj/ddfxcg/2010 /tmp/jj/ddfxcg/ddfxcg
mv /tmp/jj/ddfnghmnhm/2010 /tmp/jj/ddfnghmnhm/ddfnghmnhm
You could instead use find in order to determine if a directory contains a subdirectory named 2010 and perform the mv:
find /tmp -type d -exec sh -c '[ -d "$1/2010" ] && mv "$1/2010" "$1/$(basename "$1")"' sh {} \;
I'm not sure if you have any other question here but this would do what you've listed at the end of the question, i.e. it would:
mv /tmp/jj/ese/2010 /tmp/jj/ese/ese
and so on...
Can be done using grep -P:
grep -oP '[^/]+(?=/2010)' file
ese
ese
dfhdh
dfhdh
ddfxcg
ddfxcg
ddfnghmnhm
ddfnghmnhm
This should be close:
find "$1" -type d -name 2010 -print |
while IFS= read -r dir
do
parentPath=$(dirname "$dir")
parentDir=$(basename "$parentPath")
echo mv "$dir" "$parentPath/$parentDir"
done
Remove the echo after testing. If your dir names can contain newlines then look into the -print0 option for find, and the -0 option for xargs.
First, only iterate through the dirs you're interested in, and avoid temporary files:
for d in $(find $1 -type d -name '2010') ; do
Then you can use basename and dirname to extract parts of that directory name and reconstruct the desired one. Something like:
b="$(dirname "$d")"
p="$(basename "$b")"
echo mv "$d" "$b/$p"
You could use shell string replace operations instead of basename/dirname.
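As a sketch of that last suggestion, the parameter expansions ${dir%/*} and ${parent_path##*/} can stand in for dirname and basename, combined with find -print0 so that unusual directory names survive. The function name rename_to_parent is hypothetical, and the sketch assumes the root is given as a path containing at least one slash, such as the absolute path the question's script expects.

```bash
#!/usr/bin/env bash
# Hypothetical helper: rename every directory called 2010 under "$1"
# to the name of its parent directory.
rename_to_parent() {
    local root=$1 dir parent_path parent_name
    while IFS= read -r -d '' dir; do
        parent_path=${dir%/*}            # strip the last component, like dirname
        parent_name=${parent_path##*/}   # keep the last component, like basename
        mv -- "$dir" "$parent_path/$parent_name"
    done < <(find "$root" -depth -type d -name 2010 -print0)
    # -depth makes find report a directory only after it has finished with its
    # contents, so renaming it afterwards does not disturb the traversal.
}
```

For the tree in the question, rename_to_parent /tmp/jj would move /tmp/jj/ese/2010 to /tmp/jj/ese/ese, and so on; replacing mv with echo mv gives a dry run first.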

Bash: set array within braces in a while loop? (sub-shell problem)

I'm having problems getting a variable "${Error[*]}", which is a regular indexed array, to stay set from the time it's declared until it's checked. It seems to me that a sub-shell must be launched so the declaration doesn't stick. I didn't think sub-shells were opened when using braces { stuff...; }. I want to know how to get my variable, Error to stick in the case I'm trying to write up. Here's a sample of my script:
TestFunction () {
unset Error
local archive="$1" extlist="$2" && local ext="${archive##*.}"
shopt -s nocasematch
local -i run=0
while [[ "$run" == 0 || -n "${Error[run]}" ]]; do
(( run++ ))
local IFS=$'\n\r\t '
if [[ ! "${Error[*]}" =~ 'cpio' && "$ext" =~ ^(pax|cpio|cpgz|igz|ipa|cab)$ && -n "$(which 'cpio')" ]]; then
## Try to cpio the archive. Since cpio cannot handle '.cab' archive, I want to declare an Error ##
{ cpio -ti --quiet <"$archive" 2>'/dev/null' || local -a Error[run]='cpio'; } | grep -Ei '$extlist'
elif [[ ! "${Error[*]}" =~ 'zipinfo' && "$ext" =~ ^(zip|[jw]ar|ipa|cab)$ && -n "$(which 'unzip')" ]]; then
## If cpio fails, then try zipinfo, or unzip on the next run through the loop... ##
if which 'zipinfo' &>'/dev/null'; then
{ zipinfo -1 "$archive" 2>'/dev/null' || local -a Error[run]='zipinfo'; } | grep -Ei "$scanlist"
elif which 'unzip' &>'/dev/null'; then
{ unzip -lqq "$archive" 2>'/dev/null' || local -a Error[run]='unzip'; } | gsed -re '/^ +[0-9]+/!d;s|^ +[0-9]+ +[0-9-]+ [0-9:]+ +||' | grep -Ei "$exlist"
fi
## many more elifs... ##
fi
done
shopt -u nocasematch
return 0
}
Archives='\.(gnutar|7-zip|lharc|toast|7zip|boz|bzi?p2?|cpgz|cpio|gtar|g?z(ip)?|lzma(86)?|t[bg]z2?|ar[cgjk]|bz[2a]?|cb[7rz]|cdr|deb|[dt]lz|dmg|exe|fbz|fgz|gz[aip]|igz|img|iso|lh[az]|lz[hmswx]?|mgz|mpv|mpz|pax|piz|pka|[jrtwx]ar|rpm|s?7-?z|sitx?|m?pkg|sfx|nz|xz)$'
IFS=$'\n'
declare -a List=($(TestFunction '/Users/aesthir/Programming/│My Projects│/Swipe Master/Test Folder/SDKSetup.cab' "$Archives"))
IFS=$' \t\n'
xtrace output:
  〔xtrace〕 unset Error
  〔xtrace〕 local 'archive=/Users/aesthir/Programming/│My Projects│/Swipe Master/Test Folder/SDKSetup.cab' 'extlist=\.(gnutar|7-zip|lharc|toast|7zip|boz|bzi?p2?|cpgz|cpio|gtar|g?z(ip)?|lzma(86)?|t[bg]z2?|ar[cgjk]|bz[2a]?|cb[7rz]|cdr|deb|[dt]lz|dmg|exe|fbz|fgz|gz[aip]|igz|img|iso|lh[az]|lz[hmswx]?|mgz|mpv|mpz|pax|piz|pka|[jrtwx]ar|rpm|s?7-?z|sitx?|m?pkg|sfx|nz|xz)$'
  〔xtrace〕 local ext=cab
  〔xtrace〕 shopt -s nocasematch
  〔xtrace〕 local -i run=0
  〔xtrace〕 [[ 0 == 0 ]]
  〔xtrace〕 (( run++ ))
  〔xtrace〕 local 'IFS=
'
  〔xtrace〕 [[ ! '' =~ cpio ]]
  〔xtrace〕 [[ cab =~ ^(pax|cpio|cpgz|igz|ipa|cab)$ ]]
  〔xtrace〕 which cpio
  〔xtrace〕 [[ -n /usr/bin/cpio ]]
  〔xtrace〕 grep -Ei '$extlist'
  〔xtrace〕 cpio -ti --quiet
  〔xtrace〕 local -a 'Error[run]=cpio'
  〔xtrace〕 [[ 1 == 0 ]]
  〔xtrace〕 [[ -n '' ]] ## <—— Problem is here... when checking "${Error[run]}", it's unset ##
  〔xtrace〕 shopt -u nocasematch
  〔xtrace〕 return 0
Now obviously I know cpio, zipinfo, and unzip cannot handle cab files... I put 'cab' in the extension list on purpose to cause an error.
I want to stay in TestFunction and keep looping with different archivers until a success (file list is dumped, which cabextract would gladly do in this case) without repeating an already failed archiver.
Finally, since this works fine...
TestFunction () {
unset Error
local archive="$1" extlist="$2" && local ext="${archive##*.}"
local -i run=0
while [[ "$run" == 0 || -n "${Error[run]}" ]]; do
(( run++ ))
local IFS=$'\n\r\t '
if [[ ! "${Error[*]}" =~ 'cpio' && "$ext" =~ ^(pax|cpio|cpgz|igz|ipa|cab)$ && -n "$(which 'cpio')" ]]; then
cpio -ti --quiet <"$archive" 2>'/dev/null' || local -a Error[run]='cpio'
fi
done
shopt -u nocasematch
return 0
}
... I have to assume the problem is the braces because I want the results grep'd right away. However, I need those braces there because I don't want Error[run] to be set if grep turns up no results, only if cpio fails. I don't want to grep outside TestFunction for other reasons (I would have to do a complete rewrite).
Any quick solution to this without massive rewriting? Maybe echo 'cpio' to some fd and read -u6ing it somehow?
I'd much prefer not to have to set an array to the file list and then for loop | grep through every file as it would really slow things down.
The problem is not the braces, but the pipe. Because you're using a pipe, the assignment to Error[run] is happening in a subshell, so that assignment disappears when the subshell exits.
Change:
{ cpio -ti --quiet <"$archive" 2>'/dev/null' || local -a Error[run]='cpio'; } | grep -Ei '$extlist'
to:
cpio -ti --quiet <"$archive" 2>'/dev/null' | grep -Ei "$extlist"
[[ ${PIPESTATUS[0]} -ne 0 ]] && Error[run]='cpio'
(By the way, the grep pattern needs double quotes so that $extlist is expanded.)
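The effect can be reproduced without any archiver. In this minimal sketch, false stands in for a failing cpio and cat for grep; both stand-ins are illustrative only.

```bash
#!/usr/bin/env bash
declare -a Error=()

# The left side of a pipe runs in a subshell, so this assignment is lost.
{ false || Error[1]='lost'; } | cat
echo "after braces in a pipe: '${Error[1]-}'"

# PIPESTATUS is read in the parent shell, so this assignment sticks.
false | cat
[[ ${PIPESTATUS[0]} -ne 0 ]] && Error[1]='kept'
echo "after PIPESTATUS check: '${Error[1]-}'"
```

The first echo shows an empty value and the second shows kept. PIPESTATUS must be read immediately after the pipeline, because the next command overwrites it.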

Resources