How can I edit my code so that I can account for output that goes against my sed command - database

I am writing a script that prints the species name matched from a remote NCBI BLAST database together with the file the match came from. I want to make the script more robust so that it can handle files that have no match, and names that do not fit my current sed command:
#!/bin/bash
for i in ./split.contigs.Parsed/*.csv ; do
sciname=$(head -1 $i | sed -E "s/([A-Z][a-z]+ [a-z]+) .+/\1/")
contigname=$(echo $i | sed -E "s/.fa.csv//" | sed -E "s/\.\/split.contigs.Parsed\///")
echo "$sciname,$contigname"
done
Expected
Drosophila melanogaster,contig_66:1.0-213512.0_pilon
Drosophila melanogaster,contig_67:1.0-138917.0_pilon
Drosophila sechellia,contig_67:139347.0-186625.0_pilon
Drosophila melanogaster,contig_68:3768.0-4712.0_pilon
Actual
Drosophila ananassae,contig_393:1.0-13214.0_pilon
,contig_393:13217.0-13563.0_pilon
Drosophila sp. pallidosa-like-Wau w,contig_393:14835.0-18553.0_pilon
Apteryx australis,contig_393:19541.0-21771.0_pilon
,contig_393:21780.0-22772.0_pilon
Drosophila sp. pallidosa-like-Wau w,contig_393:22776.0-31442.0_pilon
Drosophila melanogaster,contig_394:1.0-89663.0_pilon

Simply skip the current iteration if $sciname is empty. Put this one line right after the definition of $sciname:
[[ -z $sciname ]] && continue
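Note that this check only covers empty names. A line such as "Drosophila sp. pallidosa-like-Wau w" does not match the sed pattern at all, so sed passes it through unchanged. Here is a minimal sketch of one way to handle both cases, assuming you would rather print a placeholder than skip the file. The function name extract_sciname and the NA placeholder are made up for illustration.

```shell
# Hypothetical helper: print "Genus species" if the line starts with one,
# otherwise print a fallback label (second argument, default "NA").
extract_sciname() {
    local name
    # -n plus the p flag prints only when the substitution succeeded,
    # so empty and non-matching lines leave $name empty.
    name=$(printf '%s\n' "$1" | sed -nE 's/^([A-Z][a-z]+ [a-z]+)( .*)?$/\1/p')
    printf '%s\n' "${name:-${2:-NA}}"
}

extract_sciname "Drosophila melanogaster strain w1118"   # Drosophila melanogaster
extract_sciname "Drosophila sp. pallidosa-like-Wau w"    # NA
extract_sciname ""                                       # NA
```

Inside the loop it could be called as sciname=$(extract_sciname "$(head -1 "$i")").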

Related

Using sed to remove line breaks after execute command and save it on array

I'm working on some inventory tooling and I'm trying to save all the AWS regions in one array, then show the elements one under another as an input menu.
The following command gives me the right output, but when I walk through the array with a for loop, the array length is just 1 because the result is a single string:
aws ec2 describe-regions --output text|awk -F\t '{print $3}'| sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/ /g'
eu-north-1 ap-south-1 eu-west-3 eu-west-2 eu-west-1 ap-northeast-2
ap-northeast-1 sa-east-1 ca-central-1 ap-southeast-1 ap-southeast-2
eu-central-1 us-east-1 us-east-2 us-west-1 us-west-2
This is how I'm filling the array:
# Get regions
declare -a regions=$(aws ec2 describe-regions --output text | awk -F\t '{print $3}' | sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/ /')
echo -e "\nPlease, select the region you would like to query: "
# Print Regions
len=${#regions[@]}
last=$((len+1))
for (( i=0; i<$len; i++ )); do
echo -e "$i.${regions[$i]}\n" ;
done
echo -e "$last All of them (this could take a while...O_o)\n"
read region_opt
if [${region_opt}!=${last}] then
region=(${regions[$region_opt]})
What I want to have in the output is something like
eu-north-1
ap-south-1
eu-west-3 ....
You're missing parentheses around your array values, e.g.,
declare -a ARRAY=(value1 value2 ... valueN)
(refs: https://www.tldp.org/LDP/Bash-Beginners-Guide/html/sect_10_02.html, https://www.gnu.org/software/bash/manual/bash.html)
The following forms also work, and the first (without declare -a) is given as an example in the GNU's Bash reference manual, the Bash guide for beginners, and the Advanced bash-scripting guide:
ARRAY=(value1 value2 ... valueN)
declare ARRAY=(value1 value2 ... valueN)
$() is command substitution: it captures the command's stdout as a single string, which is then assigned to the variable.
If, as you say, the result is
eu-north-1 ap-south-1 eu-west-3...
then to get an array out of it, wrap the expansion in parentheses so that it has array syntax, and let Bash evaluate it as such:
regions=($regions)
After expansion this is valid array syntax:
regions=(eu-north-1 ap-south-1 eu-west-3)
The same assignment can also be enclosed in double quotes and passed to eval as its argument:
$ eval "regions=($regions)"
$ echo ${regions[0]}
eu-north-1
With that, I am sure you will be able to solve the rest on your own.
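As an alternative to eval, bash 4 and later can read command output straight into an array with the mapfile builtin. This is a minimal sketch: list_regions is a made-up stand-in for the real aws ... | awk pipeline and is assumed to print one region per line, so the sed step that joins the lines is not needed.

```shell
# Stand-in for the real pipeline (assumption: one region per line).
list_regions() {
    printf '%s\n' eu-north-1 ap-south-1 eu-west-3
}

# -t strips the trailing newline from each element.
mapfile -t regions < <(list_regions)

# Print a numbered menu, one region per line.
for i in "${!regions[@]}"; do
    printf '%d. %s\n' "$i" "${regions[i]}"
done
```

Because each line becomes exactly one element, no word splitting or eval is involved.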

How to properly pass a $string with spaces into grep

I tried to write a bash script that finds a "keyword" inside *.desktop files. My approach is to store the keywords in an array and pass each one to grep. It works flawlessly until a keyword contains two or more words separated by a space.
What it should be:
cat /usr/share/applications/*.desktop | grep -i "Mail Reader"
What I have tried:
search=$(printf 'Name=%s' "${appsx[$index]}")
echo \""$search\"" #debug
cat /usr/share/applications/*.desktop | grep -i $search
search=$(printf 'Name=%s' "${appsx[$index]}")
echo \""$search\"" #debug
cat /usr/share/applications/*.desktop | grep -i \""$search\""
search=$(printf '"Name=%s"' "${appsx[$index]}")
echo $search #debug
cat /usr/share/applications/*.desktop | grep -i $search
Any suggestions are highly appreciated.
If you simply assign Mail Reader to the variable search like below
search=Mail Reader
bash would complain that the Reader command is not found, because it treats search=Mail as a temporary assignment and everything after the first blank as the command to run. What you need is
search="Mail Reader" # 'Mail Reader' would also do.
The same applies to your command substitution. Use double quotes around it, though, since the substitution would not happen inside single quotes:
search="$(command)"
In your case, though, the command substitution is overkill. It can be simplified to:
search="Name=${appsx[$index]}"
# Then do the grep.
# Note that cat-grep combo could be simplified to
# -h suppresses printing filenames to get same result as cat .. | grep
grep -ih "$search" /usr/share/applications/*.desktop
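The key part of the fix is the double quotes around $search in the grep call. Here is a small sketch of what word splitting does to the unquoted form. The helper count_args is made up for illustration and only reports how many arguments it received.

```shell
# Hypothetical helper: report how many arguments were passed.
count_args() {
    echo "$#"
}

search="Name=Mail Reader"

count_args $search     # 2: split into "Name=Mail" and "Reader"
count_args "$search"   # 1: passed as a single argument
```

With the unquoted form, grep receives Name=Mail as the pattern and Reader as a file name, which is why the two-word keywords fail.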

I want to edit a specific lines (multiple) with sed command

I have a test file of around 20K lines, and I want to change a specific string on specific lines. I am given the line numbers and the strings to change. In this scenario I want to change one string to another on multiple lines. Earlier I used a command like
sed -i '12s/stringone/stringtwo/g' filename
but in this case I have to run multiple commands for the same test, like
sed -i '15s/stringone/stringtwo/g' filename
sed -i '102s/stringone/stringtwo/g' filename
sed -i '11232s/stringone/stringtwo/g' filename
Then I tried the following:
sed -i '12,15,102,11232/stringone/stringtwo/g' filename
but I am getting the error
sed: -e expression #1, char 5: unknown command: `,'
Could someone please help me achieve this?
The functionality you're trying to get with GNU sed would look like this in GNU awk:
awk -i inplace '
BEGIN {
split("12 15 102 11232",tmp)
for (i in tmp) lines[tmp[i]]
}
NR in lines { gsub(/stringone/,"stringtwo") }
' filename
Just like with a sed script, the above will fail when the strings contain regexp or backreference metacharacters. If that's an issue then with awk you can replace gsub() with index() and substr() for string literal operations (which are not supported by sed).
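A minimal sketch of the index() and substr() approach mentioned above. The function name literal_replace and the environment variable names OLD and REPL are made up for illustration. It reads stdin and replaces on every line, so the line-number test from the script above would still have to be added. The strings are passed through the environment because awk's -v option interprets backslash escapes in its values.

```shell
# Hypothetical helper: replace every literal occurrence of $1 with $2
# on each input line, with no regexp or backreference interpretation.
literal_replace() {
    OLD=$1 REPL=$2 awk '
    BEGIN { old = ENVIRON["OLD"]; repl = ENVIRON["REPL"]; n = length(old) }
    {
        rest = $0; out = ""
        # Copy the text before each match, then the replacement,
        # and continue searching after the match.
        while (n > 0 && (pos = index(rest, old)) > 0) {
            out = out substr(rest, 1, pos - 1) repl
            rest = substr(rest, pos + n)
        }
        print out rest
    }'
}

printf 'a.b axb\n' | literal_replace 'a.b' 'X&Y'   # X&Y axb
```

The "." is not treated as a wildcard and the "&" is not treated as a backreference, which is the difference from gsub() and from sed's s command.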
You get the error because N,M in sed is a range (from line N to line M), not a list of individual line numbers.
An alternative is to use printf and sed:
sed -i "$(printf '%ds/stringone/stringtwo/g;' 12 15 102 11232)" filename
The printf statement repeats the pattern Ns/stringone/stringtwo/g; for each number N given as an argument.
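To see what printf generates, it can help to build the script in a variable first. This sketch uses line numbers 2 and 4 and four sample input lines instead of the real file; the variable names script and result are only for illustration.

```shell
# Build the sed script once: one "Ns/.../.../g;" command per line number.
script=$(printf '%ds/stringone/stringtwo/g;' 2 4)
echo "$script"   # 2s/stringone/stringtwo/g;4s/stringone/stringtwo/g;

# Apply it to four sample lines; only lines 2 and 4 change.
result=$(printf '%s\n' stringone stringone stringone stringone | sed "$script")
echo "$result"
```

printf reuses its format string until all arguments are consumed, which is what produces one sed command per line number.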
This might work for you (GNU sed):
sed '12ba;15ba;102ba;11232ba;b;:a;s/pattern/replacement/' file
For each address, branch to a common label (in this case :a) and do the substitution there. For all other lines, the bare b ends the sed cycle before the substitution is reached.
If the addresses were in a file:
sed 's/.*/&ba/' fileOfAddresses | sed -f - -e 'b;:a;s/pattern/replacement/' file

Unable to add element to array in bash

I have the following problem. Let's assume that $@ contains only valid files. The variable file contains the name of the current file (the file I'm currently "on"). The variable element then contains data in the format file:function.
Now, when the variable element is not empty, it should be put into the array, and that's the problem. If I echo element, it contains exactly what I want, but it is not stored in the array, so the for loop doesn't print anything.
I have written two ways I tried to insert element into the array, but neither works. Can you tell me what I am doing wrong, please?
I'm using Linux Mint 16.
#!/bin/bash
nm $@ | while read line
do
pattern="`echo \"$line\" | sed -n \"s/^\(.*\):$/\1/p\"`"
if [ -n "$pattern" ]; then
file="$pattern"
fi
element="`echo \"$line\" | sed -n \"s/^U \([0-9a-zA-Z_]*\).*/$file:\1/p\"`"
if [ -n "$element" ]; then
array+=("$element")
#array[$[${#array[@]}+1]]="$element"
echo element - "$element"
fi
done
for j in "${array[@]}"
do
echo "$j"
done
Your problem is that the while loop runs in a subshell because it is the second command in a pipeline, so any changes made in that loop are not available after the loop exits.
You have a few options. I often use { and } for command grouping:
nm "$@" |
{
while read line
do
…
done
for j in "${array[@]}"
do
echo "$j"
done
}
In bash, you can also use process substitution:
while read line
do
…
done < <(nm "$@")
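The difference between the piped loop and the process-substitution loop can be checked with a small experiment. This is a sketch with made-up input (two lines from printf instead of nm output), assuming bash with its default settings, where the lastpipe option is off.

```shell
# Piped form: the loop runs in a subshell, so its changes are lost.
array=()
printf 'a\nb\n' | while read -r line; do
    array+=("$line")
done
piped=${#array[@]}

# Process substitution: the loop runs in the current shell.
array=()
while read -r line; do
    array+=("$line")
done < <(printf 'a\nb\n')
direct=${#array[@]}

echo "piped=$piped direct=$direct"   # piped=0 direct=2
```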
Also, it is better to use $(…) in place of back-quotes `…` (and not just because it is hard work getting back quotes into markdown text!).
Your line:
element="`echo \"$line\" | sed -n \"s/^U \([0-9a-zA-Z_]*\).*/$file:\1/p\"`"
could be written:
element="$(echo "$line" | sed -n "s/^U \([0-9a-zA-Z_]*\).*/$file:\1/p")"
or even:
element=$(echo "$line" | sed -n "s/^U \([0-9a-zA-Z_]*\).*/$file:\1/p")
It really helps when you need them nested. For example, to list the lib directory adjacent to where gcc is found:
ls -l $(dirname $(dirname $(which gcc)))/lib
vs
ls -l `dirname \`dirname \\\`which gcc\\\`\``/lib
I know which I find easier!

How to find and remove line from file in Unix?

I have one file (for example: test.txt), this file contains some lines and for example one line is: abcd=11
But it can be for example: abcd=12
The number varies, but abcd= is the same in every case. Could anybody give me a command that finds this line and removes it?
I have tried: sed -e \"/$abcd=/d\" /test.txt >/test.txt but it removes all lines from my file and I also have tried: sed -e \"/$abcd=/d\" /test.txt >/testNew.txt but it doesn't delete line from test.txt, it only creates new file (testNew.txt) and in this file it removes my line. But it is not what I want.
Based on your description in your text, here is a cleaned-up version of your sed script that should work.
Assuming Linux with GNU sed:
sed -i '/abcd=/d' /test.txt
If you're using OS X, then you need
sed -i "" '/abcd=/d' /test.txt
If these don't work, then use old-school sed with a conditional mv to manage your tmpfiles.
sed '/abcd=/d' /test.txt > /test.txt.$$ && /bin/mv /test.txt.$$ /test.txt
Notes:
Not sure why you're writing \"/$abcd=/d\". You don't need to escape the " characters unless you're doing more with this code than you indicate (like using eval). Just write it as "/$abcd=/d".
Normally you don't need -e when there is only one expression.
If you really want to use $abcd, then you need to give it a value, and since you're matching the string 'abcd=', you can do
abcd='abcd='
sed -i "/${abcd}/d" /test.txt
I hope this helps.
Here's a solution using grep:
$ grep -v '^\$abcd=' test.txt
Proof of concept:
$ cat test.txt
a
b
ab
ac
$abcd=1
$abcd=2
$abcd
ab
a
$abcd=3
x
$ grep -v '^\$abcd=' test.txt
a
b
ab
ac
$abcd
ab
a
x
As far as I know, this command can be used to create a new file that omits the matching lines. Once we have that file, we can rename it over the original if we want.
You just have to do this:
grep -v '^\$abcd=' test.txt > tmp.txt
Now tmp.txt will have the contents
a
b
ab
ac
$abcd
ab
a
x
If you want, you may then rename tmp.txt to test.txt, replacing the original.
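The temporary file is needed because a redirection such as > test.txt truncates the file before the command gets to read it, which is why the original attempt emptied the file. Here is a minimal sketch that wraps the grep -v approach in a function. The name delete_matching_lines is made up for illustration, and the pattern is matched as a basic regular expression, as in the commands above.

```shell
# Hypothetical helper: delete every line of FILE ($1) that matches PATTERN ($2).
delete_matching_lines() {
    local file=$1 pattern=$2 tmp status
    tmp=$(mktemp) || return 1
    grep -v -- "$pattern" "$file" > "$tmp"
    status=$?
    # grep exits 0 or 1 on success (1 means no lines were left to print)
    # and 2 or more on a real error such as an unreadable file.
    if [ "$status" -le 1 ]; then
        cat "$tmp" > "$file"   # overwrite in place, keeping the file's permissions
    fi
    rm -f -- "$tmp"
    [ "$status" -le 1 ]
}
```

For the file from the question, the call would be delete_matching_lines test.txt '^abcd='.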
