Bash running explicit path works, but variable path doesn't [duplicate] - arrays

This question already has answers here:
How can I store a command in a variable in a shell script?
(12 answers)
Closed 4 years ago.
These work as advertised:
grep -ir 'hello world' .
grep -ir hello\ world .
These don't:
argumentString1="-ir 'hello world'"
argumentString2="-ir hello\\ world"
grep $argumentString1 .
grep $argumentString2 .
Despite 'hello world' being enclosed by quotes in the second example, grep interprets 'hello (and hello\) as one argument and world' (and world) as another, which means that, in this case, 'hello will be the search pattern and world' will be the search path.
Again, this only happens when the arguments are expanded from the argumentString variables. grep properly interprets 'hello world' (and hello\ world) as a single argument in the first example.
Can anyone explain why this is? Is there a proper way to expand a string variable that will preserve the syntax of each character such that it is correctly interpreted by shell commands?

Why
When the string is expanded, it is split into words, but it is not re-evaluated to find special characters such as quotes or dollar signs or ... This is the way the shell has 'always' behaved, since the Bourne shell back in 1978 or thereabouts.
Fix
In bash, use an array to hold the arguments:
argumentArray=(-ir 'hello world')
grep "${argumentArray[@]}" .
Or, if brave/foolhardy, use eval:
argumentString="-ir 'hello world'"
eval "grep $argumentString ."
On the other hand, discretion is often the better part of valour, and working with eval is a place where discretion is better than bravery. If you are not completely in control of the string that is eval'd (if there's any user input in the command string that has not been rigorously validated), then you are opening yourself to potentially serious problems.
Note that the sequence of expansions for Bash is described in Shell Expansions in the GNU Bash manual. Note in particular sections 3.5.3 Shell Parameter Expansion, 3.5.7 Word Splitting, and 3.5.9 Quote Removal.
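To see the splitting for yourself, here is a minimal sketch. The count_args helper is hypothetical (not part of the original answer); it just reports how many arguments a command would receive from the unquoted string versus from the array:

```shell
#!/usr/bin/env bash
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

argumentString="-ir 'hello world'"
argumentArray=(-ir 'hello world')

# Unquoted expansion: split on whitespace; the quote characters stay literal.
from_string=$(count_args $argumentString)

# Quoted array expansion: one word per element, embedded spaces preserved.
from_array=$(count_args "${argumentArray[@]}")

echo "string: $from_string, array: $from_array"   # string: 3, array: 2
```

The string yields three words (-ir, 'hello, and world'), while the array yields the two that grep was meant to see.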

When you put quote characters into variables, they just become plain literals (see http://mywiki.wooledge.org/BashFAQ/050; thanks @tripleee for pointing out this link)
Instead, try using an array to pass your arguments:
argumentString=(-ir 'hello world')
grep "${argumentString[@]}" .

In looking at this and related questions, I'm surprised that no one brought up running the command in an explicit child shell. Bash and other modern shells can execute a command line passed to them as a string; in bash, that requires the -c option.
argumentString="-ir 'hello world'"
bash -c "grep $argumentString ."
Works exactly as original questioner desired. There are two restrictions to this technique:
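Quoting inside the command string needs care: with the string in double quotes as above, single quotes can be used freely, but inner double quotes, backticks, and $ must be backslash-escaped or the calling shell will process them first.
Only exported environment variables will be available to the command; ordinary shell variables are visible only if they are expanded into the string beforehand.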
This technique also handles redirection, piping, and other shellisms. You can use bash builtins as well as any other command that works at the command line, because you are essentially asking a child bash to interpret the string directly as a command line. Here's a more complex example, a somewhat gratuitously complex ls variant that prefixes each entry with the current directory.
# \$ and \" are escaped so that the child bash, not the calling shell, evaluates them
cmd="prefix=\$(pwd) && ls | xargs -n 1 echo \"In \$prefix:\""
bash -c "$cmd"
I have built command processors both this way and with parameter arrays. Generally, this way is much easier to write and debug, and it's trivial to echo the command you are executing. OTOH, param arrays work nicely when you really do have abstract arrays of parameters, as opposed to just wanting a simple command variant.
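If the interpolated value comes from somewhere you don't control, one hedged alternative to pasting it into the command string is to keep the command text fixed and hand the value to the child bash as a positional parameter. This is a minimal sketch, not the answerer's method; the pattern variable is illustrative and the `_` placeholder simply fills `$0`:

```shell
#!/usr/bin/env bash
pattern="hello world; echo injected"

# The command string is single-quoted, so nothing is interpolated into it.
# "_" becomes $0 in the child shell and "$pattern" arrives intact as $1.
output=$(bash -c 'printf "%s\n" "args=$#" "first=$1"' _ "$pattern")
echo "$output"
```

The same shape should carry over to the grep case, for example bash -c 'grep -ir "$1" .' _ "$pattern", with the value never being re-parsed as shell code.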

Related


Create array in bash with variables as array name

I'm not sure if this has been answered, I've looked and haven't found anything that looks like what I'm trying to do. I also posted this to stackexchange (https://unix.stackexchange.com/questions/189293/create-array-in-bash-with-variables-as-array-name)
I have a number of shell scripts that are capable of running against a ksh or bash shell, and they make use of arrays. I created a function named "setArray" that interrogates the running shell and determines what builtin to use to create the array - for ksh, set -A, for bash, typeset -a. However, I'm having some issues with the bash portion.
The function takes two arguments, the name of the array and the value to add. This then becomes ${ARRAY_NAME} and ${VARIABLE_VALUE}. Doing the following:
set -A $(eval echo \${ARRAY_NAME}) $(eval echo \${${ARRAY_NAME}[*]}) "${VARIABLE_VALUE}"
works perfectly in ksh. However,
typeset -a $(eval echo \${ARRAY_NAME})=( $(eval echo \${${ARRAY_NAME}[*]}) "${VARIABLE_VALUE}" )
does not. This provides
bash: syntax error near unexpected token '('
I know I can just make it a list of strings (e.g. MYARRAY="one two three") and just loop through it using the IFS, but I don't want to lose the ability to use an array either.
Any thoughts ?
Given the assertion that the ksh portion of this function is working, only the bash portion needs to be created. The following should work for that and, I believe, be safe and robust (though evidence to the contrary is welcome).
eval $ARRAY_NAME+=\(\"\$VARIABLE_VALUE\"\)
First expansion only expands $ARRAY_NAME to get
eval array+=("$VARIABLE_VALUE")
which eval then causes to be evaluated again normally.
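As a rough sketch of how that line might sit inside the bash branch of the function: the setArray name and the two variable names come from the question, while the identifier check is an added assumption, there to stop eval from ever seeing anything but a plain variable name.

```shell
#!/usr/bin/env bash
# Sketch of the bash branch of the question's setArray function.
setArray() {
    local ARRAY_NAME=$1 VARIABLE_VALUE=$2
    # Assumption: reject anything that is not a plain identifier before eval sees it.
    [[ $ARRAY_NAME =~ ^[A-Za-z_][A-Za-z0-9_]*$ ]] || return 1
    # After the first expansion this reads: MYARRAY+=("$VARIABLE_VALUE")
    eval "$ARRAY_NAME+=(\"\$VARIABLE_VALUE\")"
}

MYARRAY=()
setArray MYARRAY "one two"
setArray MYARRAY 'three $(not run)'

count=${#MYARRAY[@]}
second=${MYARRAY[1]}
echo "$count elements; second is: $second"
```

The value is only expanded inside double quotes at the second evaluation, so spaces and $(...) in it stay literal. On bash 4.3 or newer, a nameref (local -n ref=$ARRAY_NAME; ref+=("$VARIABLE_VALUE")) should avoid eval altogether.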

Use of echo and system to run a software in C

I am trying to run a biological program called BLASTP which takes in two strings (fasta_GWIDD and fasta_UNIPROT in the code) and compares them. The problem that I am encountering is the use of echo/system in the code. Can anyone suggest what I am missing?
for(i=0;i<index1;i++)
{
sprintf(fasta_GWIDD,">%s\\n%s\n",fasta_name1[i],fasta_seq1[i]);
setenv("GwiddVar", fasta_GWIDD, 1) ;
sprintf(fasta_UNIPROT,">%s\\n%s\n",fasta_name2[i],fasta_seq2[i]);
setenv("UniprotVar", fasta_UNIPROT, 1) ;
system("blastp -query <(echo -e $GwiddVar) -subject<(echo -e $UniprotVar)");
}
The error is:
sh: -c: line 0: syntax error near unexpected token `('
sh: -c: line 0: `blastp -query <(echo -e $GwiddVar) -subject<(echo -e $UniprotVar)'
It seems that the shell does not understand the
<(echo -e $GwiddVar)
syntax. Mind that the system command may use a different shell than the one you are used to (like csh instead of bash, and so on). That is decided somewhere in your OS config files and profile, but I can't guess what you have out there.
Btw. I think that you should be able to check which shell is being used by the system() command by either of these:
system("echo $SHELL") // should simply write the path to current shell
system("ps -aux") // look at it and find what is the parent of the PS
etc.
Considering that this was correct on some shell:
blastp -query <(echo -e $GwiddVar) -subject<(echo -e $UniprotVar)
The syntax cited above apparently is meant only to pass the variable as input. I think you are overdoing it. You are using echo -e $GwiddVar to print and capture the data, which you already have in a variable at hand. Have you tried something as simple as:
blastp -query $GwiddVar -subject $UniprotVar
I don't know which shell you are trying to use, but considering that echo got its data, then it should be exactly the same.
If you are worried about spaces, then various shells usually allow you to use quotation marks:
blastp -query "$GwiddVar" -subject "$UniprotVar"
Of course it depends on the shell. If your program uses a shell that does not like quotation marks, well, you have to adapt it. Not to your shell, but to the shell the system() has used.
Another thing is that using system is quite rough. When you have arguments that are difficult to escape correctly, you should be using other functions like execve that are able to take an array of real raw direct strings and pass them directly as ARGV to the process. Using these, you will not need (and you should not) add any quotes or escape any spaces in the strings to be passed.
sprintf(fasta_GWIDD,">%s\\n%s\n",fasta_name1[i],fasta_seq1[i]);
sprintf(fasta_UNIPROT,">%s\\n%s\n",fasta_name2[i],fasta_seq2[i]);
/* argv for the child process; the array must end with a NULL pointer */
char *args[] = { "blastp", "-query", fasta_GWIDD, "-subject", fasta_UNIPROT, NULL };
execvp(args[0], args); /* looks blastp up in PATH; returns only on failure */
perror("execvp"); /* reached only if the exec failed: check errno and react */
However! Note that execvp, like execve and the rest of the exec family, replaces your current process. This is why I write only a sketch and don't show the whole ready-to-run code. You will probably need to fork() before it and then wait for the children in the outer loop.
So, I'd first check the shell and syntax ;)
From man 3 system:
DESCRIPTION
system() executes a command specified in command by calling /bin/sh -c
command, and returns after the command has been completed.
On many systems, /bin/sh is not bash, and even when it is, it is a different configuration of bash (bash typically operates differently if it is invoked as /bin/sh). So, you are passing bash syntax to a shell that is either not bash, or doesn't allow the full set of bash-isms... Also, there's a space missing after -subject that might be confusing things as well... And, I'm not entirely sure environment variables are expanded within system() strings...
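A small sketch of two ways around this, assuming bash is installed and on the PATH: either name bash explicitly so that it, rather than /bin/sh, parses the process substitution, or avoid the bash-only syntax and feed the data through a pipe, which a plain POSIX sh also understands. The echo/cat commands stand in for the real blastp call.

```shell
# Ask for bash by name; the calling shell only has to parse: bash -c '...'
via_bash=$(bash -c 'cat <(echo hello)')

# Portable alternative: no <( ... ) at all, the data arrives on stdin.
via_pipe=$(echo hello | sh -c 'cat')

echo "$via_bash / $via_pipe"
```

From C, the first form could be handed to system() as a single string, since /bin/sh would then only see an ordinary bash -c '...' invocation.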

How to 'cut' on null?

Unix 'file' command has a -0 option to output a null character after a filename. This is supposedly good for using with 'cut'.
From man file:
-0, --print0
Output a null character ‘\0’ after the end of the filename. Nice
to cut(1) the output. This does not affect the separator which is
still printed.
(Note, on my Linux, the '-F' separator is NOT printed - which makes more sense to me.)
How can you use 'cut' to extract a filename from output of 'file'?
This is what I want to do:
find . "*" -type f | file -n0iNf - | cut -d<null> -f1
where <null> is the NUL character.
Well, that is how I am trying to do it; what I actually want is to get all file names from a directory tree that have a particular MIME type. I use a grep (not shown).
I want to handle all legal file names and not get stuck on file names with colons, for example, in their name. Hence, NUL would be excellent.
I guess non-cut solutions are fine too, but I hate to give up on a simple idea.
Just specify an empty delimiter:
cut -d '' -f1
(N.B.: The space between the -d and the '' is important, so that the -d and the empty string get passed as separate arguments; if you write -d'', then that will get passed as just -d, and then cut will think you're trying to use -f1 as the delimiter, which it will complain about, with an error message that "the delimiter must be a single character".)
This works with gnu awk.
awk 'BEGIN{FS="\x00"}{print$1}'
ruakh's helpful answer works well on Linux.
On macOS, the cut utility doesn't accept '' as a delimiter argument (it fails with "bad delimiter").
Here is a portable workaround that works on both platforms, via the tr utility; it only makes one assumption:
The input mustn't contain \1 control characters (START OF HEADING, U+0001) - which is unlikely in text.
You can substitute any character known not to occur in the input for \1; if it's a character that can be represented verbatim in a string, that simplifies the solution because you won't need the aux. command substitution ($(...)) with a printf call for the -d argument.
If your shell supports so-called ANSI C-quoted strings - which is true of bash, zsh and ksh - you can replace "$(printf '\1')" with $'\1'
(The following uses a simpler input command to demonstrate the technique).
# In zsh, bash, ksh you can simplify "$(printf '\1')" to $'\1'
$ printf '[first field 1]\0[rest 1]\n[first field 2]\0[rest 2]' |
tr '\0' '\1' | cut -d "$(printf '\1')" -f 1
[first field 1]
[first field 2]
Alternatives to using cut:
C. Paul Bond's helpful answer shows a portable awk solution.
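Another non-cut option, sketched here under the assumption that bash is available, is to let read split on the NUL itself. The printf input below is made up to mimic the name, NUL, description, newline layout; in practice the loop would read from the file -0 pipeline instead.

```shell
#!/usr/bin/env bash
names=()
# First read: everything up to the NUL (the file name, colons and spaces included).
# Second read: the remainder of that record, up to the newline.
while IFS= read -r -d '' name && IFS= read -r rest; do
    names+=("$name")
done < <(printf 'a:b.txt\0 text/plain\nc d.txt\0 text/html\n')

printf '%s\n' "${names[@]}"
```

Because the name is taken up to the NUL rather than up to a colon, file names containing colons or spaces come through unchanged.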

Is it possible to combine bash and awk script files?

I have a bash script where I get the values of some variables that I would like to use in awk.
Is it possible to include a whole awk file in bash (like it is possible with bash script files), e.g.:
#!/bin/sh
var1=$1
source myawk.sh
and myawk.sh:
print $1;
Bash and awk are different languages, each with their own interpreter of the same name. The tiny sample you show is stripped down too far to make much sense:
You've marked both files as shell scripts; one using the shebang #!/bin/sh and the other using the extension .sh. Obviously the shell can read shell script, and the command to do so is called . in Bourne shell (or source in csh and bash).
The shell script assigns a variable, but you're not using it anywhere. Did you mean passing it on to the awk script?
Both the awk and shell script use $1, which has different meanings for them (in bash, it's from the command line or a set command; in awk, it's from a parsed input line).
The two tools are often used in tandem, as the shell is better at combining separate programs and awk is better at reformatting tabular or structured text. It was so common that a whole language evolved to combine the tasks; Perl's roots are as a combination of shell, awk and sed.
If you just wanted to pass a variable from the shell script into an awk script, use -v. The man page is your friend.
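For illustration, a minimal sketch of the -v route; the awk variable name prefix and the sample input are made up for the example:

```shell
var1="hello world"

# -v assigns the shell value to the awk variable "prefix" before the program runs.
result=$(printf 'a b\nc d\n' | awk -v prefix="$var1" '{ print prefix ": " $1 }')
echo "$result"
```

Note that awk processes backslash escape sequences in a value assigned with -v, so values containing backslashes may need extra care.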
First of all, if you're writing bash, don't use #!/bin/sh: that will put you in compatibility mode, which is only necessary if you're writing for portability (and then you have to adhere to the POSIX standard).
Now, regarding your question: you just have to run awk from inside your bash script, like this:
#!/bin/bash
var1=$1
awk -f myawk.sh
Also, you should use .awk as the extension, I guess.
Or, many people do something like this:
#!/usr/bin/env bash
#Bash things start
...
var1=$1
#Bash things stop
#Awk things start,
#may use pipes or variable to interact with bash
awk -v V1="$var1" '
#AWK program, can even include awk scripts here.
'
#Bash things
I suggest this page here by Bruce Barnett:
http://www.grymoire.com/Unix/Awk.html#uh-3
You can also put the awk program in double quotes so that the shell expands its variables inside it, but it is confusing.
Personally I just try to avoid those fancy gnu additions of bash or awk and make my scripts ksh+(n)awk compatible.
As a hardcore AWK user, I soon realized that doing the following was really a huge help:
Defining and exporting an AWK_REPO variable in my bashrc
#Content of bashrc
export AWK_REPO=~/bin/AWK
Storing there every AWK script I write using the .awk extension.
You can then call it from anywhere like this:
awk -f "$AWK_REPO/myScript.awk" "$file"
or even, using Shebangs and adding AWK_REPO to PATH (with export PATH=${AWK_REPO}:${PATH})
myScript.awk $file

Resources