"basename" command won't include multiple files - loops

I have a problem with the “basename” command, as follows:
In my host directory I have the fastq.gz files of two samples, named:
A29_WES_S3_R1_001.fastq.gz
A29_WES_S3_R2_001.fastq.gz
A30_WES_S1_R1_001.fastq.gz
A30_WES_S1_R2_001.fastq.gz
Now I need to have their basename without suffix like:
A29_WES_S3_R1_001
A29_WES_S3_R2_001
A30_WES_S1_R1_001
A30_WES_S1_R2_001
I used the bash pipeline as follow:
#!/bin/bash
FILES1=(*R1_001.fastq.gz)
FILES2=(*R2_001.fastq.gz)
read1="${FILES1[@]}"
read2="${FILES2[@]}"
Ffile=$read1
Ffileprevix=$(basename "$Ffile" .fastq.gz)
Mfile=$read2
Mfileprevix=$(basename "$Mfile" .fastq.gz)
echo $Ffileprevix
echo $Mfileprevix
exit;
But every time I just get this output:
A29_WES_S3_R1_001.fastq.gz A30_WES_S1_R1_001
A29_WES_S3_R2_001.fastq.gz A30_WES_S1_R2_001
Only the last file (A30) would be included in the command!
I checked my pipeline in this way:
echo $read1
echo $read2
The result:
A29_WES_S3_R1_001.fastq.gz A30_WES_S1_R1_001.fastq.gz
A29_WES_S3_R2_001.fastq.gz A30_WES_S1_R2_001.fastq.gz
Then I did:
echo $Ffile
echo $Mfile
The result:
A29_WES_S3_R1_001.fastq.gz A30_WES_S1_R1_001.fastq.gz
A29_WES_S3_R2_001.fastq.gz A30_WES_S1_R2_001.fastq.gz
So $read1, $read2, $Ffile, and $Mfile work well.
Then I put “-a” in my basename command as it will take multiple files:
Ffileprevix=$(basename -a "$Ffile" .fastq.gz)
Mfileprevix=$(basename -a "$Mfile" .fastq.gz)
But it got worse! The result was like:
A29_WES_S3_R1_001.fastq.gz A30_WES_S1_R1_001.fastq.gz .fastq.gz
A29_WES_S3_R2_001.fastq.gz A30_WES_S1_R2_001.fastq.gz .fastq.gz
Finally, I tried a “for ... do ...” loop around the basename command. Again, nothing changed!
Can anybody help me obtain what I want:
A29_WES_S3_R1_001
A29_WES_S3_R2_001
A30_WES_S1_R1_001
A30_WES_S1_R2_001

I'd leave basename out of this entirely, but that's entirely personal preference. You could do something more like:
FILES_PATTERN_1=".*R1_001.fastq.gz"
FILES_PATTERN_2=".*R2_001.fastq.gz"
# Get FILE PATTERN 1
echo "Pattern 1:"
for FILE in $(find . | grep "${FILES_PATTERN_1}" | cut -d. -f2 | tr -d /); do
echo $FILE
done
# Get FILE PATTERN 2
echo "Pattern 2:"
for FILE in $(find . | grep "${FILES_PATTERN_2}" | cut -d. -f2 | tr -d /); do
echo $FILE
done
Output should be:
Pattern 1:
A30_WES_S1_R1_001
A29_WES_S3_R1_001
Pattern 2:
A29_WES_S3_R2_001
A30_WES_S1_R2_001
You could also play with awk to parse things instead:
# Get FILE PATTERN 1
echo "Pattern 1:"
for FILE in $(find . | grep "${FILES_PATTERN_1}" | awk -F '[/.]' '{print $3}'); do
echo $FILE
done
There are a number of ways to approach this. If you had a lot more patterns to test you could make more use of functions here to reduce code duplication.
Also note, I'm doing this from a shell on Mac OSX, so if you're doing this from a Linux box some of these commands may need to be tweaked due to differences in output for some commands, like find. (ex: print $1 instead of print $3)
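If you would rather keep basename, here is a minimal sketch, assuming the files sit in the current directory. The script in the question joins the whole array into one string, so basename only sees a single argument and strips the suffix from its very end. Handing it one array element at a time avoids that. The function name strip_fastq_suffix is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print every argument without its directory part
# and without the .fastq.gz suffix, one name per line.
strip_fastq_suffix() {
    local f
    for f in "$@"; do
        basename "$f" .fastq.gz
    done
}

# Possible usage with the globs from the question:
#   FILES1=(*R1_001.fastq.gz)
#   strip_fastq_suffix "${FILES1[@]}"
```

The key point is "${FILES1[@]}" expanded as separate words rather than assigned to a scalar first. In the question's -a attempt, .fastq.gz was treated as one more file name; with GNU or BSD basename, -s .fastq.gz combined with -a should give the same result without a loop.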

Related

How to properly pass a $string with spaces into grep

I tried to write a bash script that finds a "keyword" inside *.desktop files. My approach is to store some keywords in an array, then pass them to grep. It works flawlessly until a keyword contains two or more words separated by a space.
what it should be
cat /usr/share/applications/*.desktop | grep -i "Mail Reader"
what i have tried
search=$(printf 'Name=%s' "${appsx[$index]}")
echo \""$search\"" #debug
cat /usr/share/applications/*.desktop | grep -i $search
search=$(printf 'Name=%s' "${appsx[$index]}")
echo \""$search\"" #debug
cat /usr/share/applications/*.desktop | grep -i \""$search\""
search=$(printf '"Name=%s"' "${appsx[$index]}")
echo $search #debug
cat /usr/share/applications/*.desktop | grep -i $search
Any suggestions are highly appreciated.
If you simply assign Mail Reader to the variable search like below
search=Mail Reader
bash would complain that Reader command is not found as it takes anything after that first blank character to be a subsequent command. What you need is
search="Mail Reader" # 'Mail Reader' would also do.
The same applies to your command substitution. Use double quotes around it rather than single quotes, because the substitution does not happen inside single quotes:
search="$(command)"
In your case, though, the command substitution is overkill. It can be simplified to:
search="Name=${appsx[$index]}"
# Then do the grep.
# Note that cat-grep combo could be simplified to
# -h suppresses printing filenames to get same result as cat .. | grep
grep -ih "$search" /usr/share/applications/*.desktop
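To make the quoting point concrete, here is a small sketch. The function name find_desktop_name is hypothetical, and the -F flag (treat the keyword as a fixed string, not a regex) is my addition, not part of the answer above.

```shell
#!/bin/bash
# Hypothetical helper: search the given files for a "Name=<keyword>" entry.
# Usage: find_desktop_name KEYWORD FILE...
find_desktop_name() {
    local keyword=$1
    shift
    # The double quotes keep "Name=Mail Reader" as ONE argument to grep.
    # -F: fixed string, -i: ignore case, -h: no file names, --: end of options
    grep -Fih -- "Name=${keyword}" "$@"
}

# Possible usage:
#   find_desktop_name "${appsx[$index]}" /usr/share/applications/*.desktop
```

Without the quotes around "Name=${keyword}", the shell would split the value at the space and grep would take the second word as a file name.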

importing data from a CSV in Bash

I have a CSV file that I need to use in a bash script. The CSV is formatted like so.
server1,file.name
server1,otherfile.name
server2,file.name
server3,file.name
I need to be able to pull this information into an array, or in some other way, so that I can filter it and pull out only the data for a single server, which I can then pass to another command within the script.
I need it to go something like this.
Import workfile.csv
check hostname | return only lines from workfile.csv that have the hostname as column one and store column 2 as a variable.
find / -xdev -type f -perm -002 | compare to stored info | chmod o-w all files not in listing
I'm stuck using bash because of the environment that I'm working in.
The CSV can be too big to add all file names to the find parameter list.
You also do not want to call find in a loop for every line in the csv.
Solution:
First, write the complete list of files to a temporary file.
Second, parse the CSV and remove the listed files from that list.
Third, run chmod o-w on what remains.
The solution below stores the file list in a temporary file.
Make a script that gets the servername as a parameter.
See comment in the code:
# Before EDIT:
# Hostname by parameter 1
# Check that you have a hostname
if [ $# -ne 1 ]; then
echo "Usage: $0 hostname"
# Exit script, failure
exit 1
fi
hostname=$1
# Edit, get hostname by system call
hostname=$(hostname)
# Or: hostname=$(hostname -s)
# Additional check
if [ ! -f workfile.csv ]; then
echo "inputfile missing"
exit 1
fi
# After edits, ${hostname} is now filled.
# List every world-writable file; the CSV filtering happens afterwards.
find / -xdev -type f -perm -002 > /tmp/allfiles.tmp
# Do not use cat workfile.csv | grep ..., you do not need to call cat
# grep with ^ for beginning of line, add a , for a complete first field
# grep "^${hostname}," workfile.csv
# cut for selecting second field with delimiter ','
# cut -d"," -f2
# while read file => can be improved with xargs but lets start with this.
grep "^${hostname}," workfile.csv | cut -d"," -f2 | while read file; do
# Using sed with # as the delimiter, not /, since you need / in the search string
# A custom address delimiter must be introduced with a backslash (\#...#)
# The variable in sed must be outside the single quotes and in double quotes
# Add $ after the file for end-of-line
# delete the line with the file (\#searchstring#d)
sed -i '\#/'"${file}"'$#d' /tmp/allfiles.tmp
done
echo "Review /tmp/allfiles.tmp before chmodding all these files"
echo "Delete the echo and exit when you are happy"
# Just an exit for testing
exit
# Using < is for avoiding a call to cat
</tmp/allfiles.tmp xargs chmod o-w
It might be easier to chmod o-w all the files and then chmod o+w the files in the CSV. This is a little different from what you asked, since all files from the CSV are world-writable after this process; maybe you do not want that.
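The "parse the CSV" step can also be done without grep and cut. The following is a minimal sketch in plain bash, assuming the two-column format shown in the question; the function name files_for_host is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print column 2 of every CSV line whose column 1
# equals the given host name.
# Usage: files_for_host CSV_FILE HOSTNAME
files_for_host() {
    local csv=$1 host=$2 server file
    # IFS=, splits each line at the first comma; the rest goes into $file.
    while IFS=, read -r server file; do
        if [ "$server" = "$host" ]; then
            printf '%s\n' "$file"
        fi
    done < "$csv"
}

# Possible usage:
#   files_for_host workfile.csv "$(hostname)"
```

Because the comparison is an exact string match on the first field, a host called server1 does not accidentally match server10.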

Using a variable to pass grep pattern in bash

I am struggling with passing several grep patterns that are contained within a variable. This is the code I have:
#!/bin/bash
GREP="$(which grep)"
GREP_MY_OPTIONS="-c"
for i in {-2..2}
do
GREP_MY_OPTIONS+=" -e "$(date --date="$i day" +'%Y-%m-%d')
done
echo $GREP_MY_OPTIONS
IFS=$'\n'
MYARRAY=( $(${GREP} ${GREP_MY_OPTIONS} "/home/user/this path has spaces in it/"*"/abc.xyz" | ${GREP} -v :0$ ) )
This is what I wanted it to do:
determine/define where grep is
assign a variable (GREP_MY_OPTIONS) holding parameters I will pass to grep
assign several patterns to GREP_MY_OPTIONS
using grep and the patterns I have stored in $GREP_MY_OPTIONS search several files within a path that contains spaces and hold them in an array
When I use "echo $GREP_MY_OPTIONS" it is generating what I expected but when I run the script it fails with an error of:
/bin/grep: invalid option -- ' '
What am I doing wrong? If the path does not have spaces in it everything seems to work fine so I think it is something to do with the IFS but I'm not sure.
If you want to grep some content in a set of paths, you can do the following:
find <directory> -type f -print0 |
grep -z '^/home/user/this path has spaces in it/.*/abc\.xyz$' |
xargs -0 -I {} grep <your_options> -f <patterns> {}
So that <patterns> is a file containing the patterns you want to search for in each file from directory.
Considering your answer, this shall do what you want:
find "/path with spaces/" -type f -print0 | xargs -0 grep -H -c -e 2013-01-17
From man grep:
-H, --with-filename
Print the file name for each match. This is the default when
there is more than one file to search.
Since you want to insert the elements into an array, you can do the following:
IFS=$'\n'; array=( $(find "/path with spaces/" -type f -print0 |
xargs -0 grep -H -c -e 2013-01-17) )
And then use the values as:
echo ${array[0]}
echo ${array[1]}
echo ${array[...]}
When using variables to pass the parameters, use eval to evaluate the entire line. Do the following:
parameters="-H -c"
eval "grep ${parameters} file"
If you build the GREP_MY_OPTIONS as an array instead of as a simple string, you can get the original outline script to work sensibly:
#!/bin/bash
path="/home/user/this path has spaces in it"
GREP="$(which grep)"
GREP_MY_OPTIONS=("-c")
j=1
for i in {-2..2}
do
GREP_MY_OPTIONS[$((j++))]="-e"
GREP_MY_OPTIONS[$((j++))]=$(date --date="$i day" +'%Y-%m-%d')
done
IFS=$'\n'
MYARRAY=( $(${GREP} "${GREP_MY_OPTIONS[@]}" "$path/"*"/abc.xyz" | ${GREP} -v :0$ ) )
I'm not clear why you use GREP="$(which grep)" since you will execute the same grep as if you wrote grep directly — unless, I suppose, you have some alias for grep (which is then the problem; don't alias grep).
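A compact variant of the same array idea, as a sketch only: the function name count_matches and its "--" calling convention are hypothetical, and -H is added so the file name is printed even when only one file matches the glob.

```shell
#!/bin/bash
# Hypothetical helper: count matching lines per file for any of the given
# patterns and drop files with zero matches.
# Usage: count_matches PATTERN... -- FILE...
count_matches() {
    local opts=(-c -H)
    # Collect every pattern before "--" as a separate -e option.
    while [ "$#" -gt 0 ] && [ "$1" != "--" ]; do
        opts+=(-e "$1")
        shift
    done
    shift   # drop the "--" separator
    # Each array element stays one word, so spaces in paths are harmless.
    grep "${opts[@]}" -- "$@" | grep -v ':0$'
}

# Possible usage (mapfile needs bash 4 or newer):
#   mapfile -t MYARRAY < <(count_matches 2013-01-17 2013-01-18 -- "$path/"*"/abc.xyz")
```

Reading the result with mapfile avoids changing IFS for the rest of the script.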
You can do one thing without making things complex:
First change directory in your script, like this:
$ cd /home/user/this\ path\ has\ spaces\ in\ it/
$ pwd
/home/user/this path has spaces in it
or
$ cd "/home/user/this path has spaces in it/"
$ pwd
/home/user/this path has spaces in it
Then do whatever you want in your script.
$(${GREP} ${GREP_MY_OPTIONS} */abc.xyz)
EDIT :
[sgeorge@sgeorge-ld stack1]$ ls -l
total 4
drwxr-xr-x 2 sgeorge eng 4096 Jan 19 06:05 test tesd
[sgeorge@sgeorge-ld stack1]$ cat test\ tesd/file
SUKU
[sgeorge@sgeorge-ld stack1]$ grep SUKU */file
SUKU
EDIT :
[sgeorge@sgeorge-ld stack1]$ find */* -print | xargs -I {} grep SUKU {}
SUKU

Need bash to separate cat'ed string to separate variables and do a for loop

I need to get a list of files added to a master folder and copy only the new files to the respective backup folders; The paths to each folder have multiple folders, all named by numbers and only 1 level deep.
ie /tester/a/100
/tester/a/101 ...
diff -r returns typically "Only in /testing/a/101: 2093_thumb.png" per line in the diff.txt file generated.
NOTE: there is a space after the colon
I need to get the 101 from the path and filename into separate variables and copy them to the backup folders.
I need to get the lesserfolder var to get 101 without the colon
and mainfile var to get 2093_thumb.png from each line of the diff.txt and do the for loop but I can't seem to get the $file to behave. Each time I try testing to echo the variables I get all the wrong results.
#!/bin/bash
diff_file=/tester/diff.txt
mainfolder=/testing/a
bacfolder= /testing/b
diff -r $mainfolder $bacfolder > $diff_file
LIST=`cat $diff_file`
for file in $LIST
do
maindir=$file[3]
lesserfolder=
mainfile=$file[4]
# cp $mainfolder/$lesserFolder/$mainfile $bacfolder/$lesserFolder/$mainfile
echo $maindir $mainfile $lesserfolder
done
If I could just get the echo statement working the cp would work then too.
I believe this is what you want:
#!/bin/bash
diff_file=/tester/diff.txt
mainfolder=/testing/a
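bacfolder=/testing/b
# Keep only the "Only in <mainfolder>..." lines; fields 3 and 4 are "dir:" and the file name
diff -r -q "$mainfolder" "$bacfolder" | grep -E "^Only in ${mainfolder}" | awk '{print $3,$4}' > "$diff_file"
while read -r foldercolon mainfile ; do
# Strip the trailing colon, then the leading main folder path
folderpath=${foldercolon%:}
lesserFolder=${folderpath#${mainfolder}/}
cp "$mainfolder/$lesserFolder/$mainfile" "$bacfolder/$lesserFolder/$mainfile"
done < "$diff_file"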
But it is much more reliable (and much easier!) to use rsync for this kind of backup. For example:
rsync -a /testing/a/* /testing/b/
You could try a while read loop
diff -r $mainfolder $bacfolder | while read dummy dummy dir file; do
echo $dir $file
done
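The splitting of one diff line can also be done with parameter expansion alone. This is a minimal sketch, assuming lines of the exact form "Only in DIR: FILE" as described in the question; the function name parse_only_in is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: split one "Only in DIR: FILE" line from diff -r into
# the last path component of DIR and the file name.
parse_only_in() {
    local line=$1 rest dir file
    rest=${line#Only in }     # "/testing/a/101: 2093_thumb.png"
    dir=${rest%%: *}          # "/testing/a/101"
    file=${rest#*: }          # "2093_thumb.png"
    printf '%s %s\n' "${dir##*/}" "$file"
}

# Possible usage:
#   diff -r "$mainfolder" "$bacfolder" | while IFS= read -r line; do
#       parse_only_in "$line"
#   done
```

Since the line is split at the first ": " rather than at whitespace, a file name containing spaces stays in one piece.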

Append some text to the end of multiple files in Linux

How can I append the following code to the end of numerous php files in a directory and its sub directory:
</div>
<div id="preloader" style="display:none;position: absolute;top: 90px;margin-left: 265px;">
<img src="ajax-loader.gif"/>
</div>
I have tried with:
echo "my text" >> *.php
But the terminal displays the error:
bash : *.php: ambiguous redirect
I usually use tee because I think it looks a little cleaner and it generally fits on one line.
echo "my text" | tee -a *.php
You don't specify the shell, you could try the foreach command. Under tcsh (and I'm sure a very similar version is available for bash) you can say something like interactively:
foreach i (*.php)
foreach> echo "my text" >> $i
foreach> end
$i will take on the name of each file each time through the loop.
As always, when doing operations on a large number of files, it's probably a good idea to test them in a small directory with sample files to make sure it works as expected.
Oops .. bash in error message (I'll tag your question with it). The equivalent loop would be
for i in *.php
do
echo "my text" >> "$i"
done
If you want to cover multiple directories below the one where you are you can specify
*/*.php
rather than *.php
BashFAQ/056 does a decent job of explaining why what you tried doesn't work. Have a look.
Since you're using bash (according to your error), the for command is your friend.
for filename in *.php; do
echo "text" >> "$filename"
done
If you'd like to pull "text" from a file, you could instead do this:
for filename in *.php; do
cat /path/to/sourcefile >> "$filename"
done
Now ... you might have files in subdirectories. If so, you could use the find command to find and process them:
find . -name "*.php" -type f -exec sh -c 'cat /path/to/sourcefile >> "$1"' sh {} \;
The find command identifies what files using conditions like -name and -type, then the -exec command runs basically the same thing I showed you in the previous "for" loop. The final \; indicates to find that this is the end of arguments to the -exec option.
You can man find for lots more details about this.
The find command is portable and is generally recommended for this kind of activity especially if you want your solution to be portable to other systems. But since you're currently using bash, you may also be able to handle subdirectories using bash's globstar option:
shopt -s globstar
for filename in **/*.php; do
cat /path/to/sourcefile >> "$filename"
done
You can man bash and search for "globstar" for more details about this. This option requires bash version 4 or higher.
NOTE: You may have other problems with what you're doing. PHP scripts don't need to end with a ?>, so you might be adding HTML that the script will try to interpret as PHP code.
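The find approach can be wrapped in a function so the snippet file and the start directory become parameters. This is only a sketch; the name append_to_php is hypothetical, and it uses the "{} +" form of -exec so that one sh process handles many files.

```shell
#!/bin/bash
# Hypothetical helper: append the contents of SNIPPET to every *.php file
# below DIR, including subdirectories.
# Usage: append_to_php SNIPPET DIR
append_to_php() {
    local snippet=$1 dir=$2
    # File names are passed to sh as arguments, never pasted into the
    # command string, so spaces and quotes in names are safe.
    find "$dir" -type f -name '*.php' -exec sh -c '
        snippet=$1
        shift
        for f in "$@"; do
            cat "$snippet" >> "$f"
        done
    ' sh "$snippet" {} +
}

# Possible usage:
#   append_to_php /path/to/sourcefile .
```

As with any bulk edit, it is worth trying this on a copy of the directory first.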
You can use sed combined with find. Assume your project tree is
/MyProject/
/MyProject/Page1/file.php
/MyProject/Page2/file.php
etc.
Save the code you want to append in /MyProject/. Call it append.txt
From /MyProject/ run:
find . -name "*.php" -print0 | xargs -0 sed -i '$r append.txt'
Explanation:
find looks for all .php files, including those in subdirectories
xargs runs sed on all the .php files that were just found
sed does the appending. '$r append.txt' means go to the last line of the file ($) and read in (paste) whatever is in append.txt there. Don't forget -i, otherwise sed just prints the appended result and does not save it.
Source: http://www.grymoire.com/unix/Sed.html#uh-37
You can do this (it works even if there are spaces in your file paths):
#!/bin/bash
# Create a temporary file named /tmp/end_of_my_php.txt
cat << EOF > /tmp/end_of_my_php.txt
</div>
<div id="preloader" style="display:none;position: absolute;top: 90px;margin-left: 265px;">
<img src="ajax-loader.gif"/>
</div>
EOF
find . -type f -name "*.php" | while IFS= read -r the_file
do
echo "Processing $the_file"
#cp "$the_file" "${the_file}.bak" # Uncomment if you want to save a backup of your file
cat /tmp/end_of_my_php.txt >> "$the_file"
done
echo
echo done
PS: You must run the script from the directory you want to browse
Inspired by @Dantastic's answer:
echo "my text" | tee -a file1.txt | tee -a file2.txt
