Is it possible to combine bash and awk script files?

I have a bash script where I get the value of a variable that I would like to use in awk. Is it possible to include a whole awk file in bash (the way you can source bash script files), e.g.:
#!/bin/sh
var1=$1
source myawk.sh
and myawk.sh:
print $1;

Bash and awk are different languages, each with their own interpreter of the same name. The tiny sample you show is stripped down too far to make much sense:
You've marked both files as shell scripts: one using the shebang #!/bin/sh and the other using the extension .sh. Obviously the shell can read a shell script, and the command to do so is called . in the Bourne shell (or source in csh and bash).
The shell script assigns a variable, but you're not using it anywhere. Did you mean to pass it on to the awk script?
Both the awk and shell script use $1, which has different meanings for them (in bash, it's from the command line or a set command; in awk, it's from a parsed input line).
The two tools are often used in tandem, as the shell is better at combining separate programs and awk is better at reformatting tabular or structured text. The combination was so common that a whole language evolved to unify the tasks; Perl's roots are as a blend of shell, awk and sed.
If you just wanted to pass a variable from the shell script into an awk script, use -v. The man page is your friend.
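For example, a minimal sketch (the variable names here are just illustrative):
#!/bin/sh
var1=$1
# Hand the shell variable to awk as the awk variable v1
awk -v v1="$var1" 'BEGIN { print v1 }'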

First of all, if you're writing bash, don't use #!/bin/sh: that puts you in compatibility mode, which is only necessary if you're writing for portability (and then you have to adhere to the POSIX standard).
Now, regarding your question: you just have to run awk from inside your bash script, like this:
#!/bin/bash
var1=$1
awk -f myawk.sh
Also, you should probably use .awk as the extension.
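Putting those pieces together, one possible sketch (file names are illustrative) passes the shell variable across with -v:
#!/bin/bash
var1=$1
awk -v v1="$var1" -f myawk.awk
where myawk.awk contains:
# Print the value handed over from the shell driver
BEGIN { print v1 }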

Or, many people do something like this:
#!/usr/bin/env bash
#Bash things start
...
var1=$1
#Bash things stop
#Awk things start,
#may use pipes or variable to interact with bash
awk -v V1="$var1" '
#AWK program, can even include awk scripts here.
'
#Bash things
I suggest this page by Bruce Barnett:
http://www.grymoire.com/Unix/Awk.html#uh-3
You can also use double quotes around the awk program to let the shell expand variables inside it, but that gets confusing.
Personally, I just try to avoid the fancy GNU additions to bash and awk and keep my scripts ksh+(n)awk compatible.
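To see where the confusion comes from, compare these two equivalent lines (a small illustration, assuming var1 holds no quote characters): with single quotes awk receives the program text verbatim, while with double quotes the shell expands $var1 into the program text before awk ever runs.
awk -v V1="$var1" 'BEGIN { print V1 }'
awk "BEGIN { print \"$var1\" }"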

As a hardcore AWK user, I soon realized that doing the following was a huge help:
Defining and exporting an AWK_REPO variable in my bashrc
#Content of bashrc
export AWK_REPO=~/bin/AWK
Storing every AWK script I write there, using the .awk extension.
You can then call it from anywhere like this:
awk -f $AWK_REPO/myScript.awk $file
or even, using shebangs and adding AWK_REPO to PATH (with export PATH=${AWK_REPO}:${PATH}):
myScript.awk $file
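For the shebang route, the first line of the awk file names the awk interpreter itself (the path may differ on your system):
#!/usr/bin/awk -f
# myScript.awk -- print the first field of every input line
{ print $1 }
After chmod +x ${AWK_REPO}/myScript.awk, the script can be invoked by name as shown above.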

Related

shell send args to a C program with spaces [duplicate]

This question already has answers here: How can I store a command in a variable in a shell script? (12 answers) Closed 4 years ago.
These work as advertised:
grep -ir 'hello world' .
grep -ir hello\ world .
These don't:
argumentString1="-ir 'hello world'"
argumentString2="-ir hello\\ world"
grep $argumentString1 .
grep $argumentString2 .
Despite 'hello world' being enclosed by quotes in the second example, grep interprets 'hello (and hello\) as one argument and world' (and world) as another, which means that, in this case, 'hello will be the search pattern and world' will be the search path.
Again, this only happens when the arguments are expanded from the argumentString variables. grep properly interprets 'hello world' (and hello\ world) as a single argument in the first example.
Can anyone explain why this is? Is there a proper way to expand a string variable that will preserve the syntax of each character such that it is correctly interpreted by shell commands?
Why
When the string is expanded, it is split into words, but it is not re-evaluated to find special characters such as quotes, dollar signs, and so on. This is the way the shell has 'always' behaved, since the Bourne shell back in 1978 or thereabouts.
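You can watch the splitting happen, and confirm that the embedded quotes are never re-evaluated, with printf, which here prints each argument it receives on its own line:
argumentString="-ir 'hello world'"
printf '<%s>\n' $argumentString
# Output:
# <-ir>
# <'hello>
# <world'>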
Fix
In bash, use an array to hold the arguments:
argumentArray=(-ir 'hello world')
grep "${argumentArray[@]}" .
Or, if brave/foolhardy, use eval:
argumentString="-ir 'hello world'"
eval "grep $argumentString ."
On the other hand, discretion is often the better part of valour, and working with eval is a place where discretion is better than bravery. If you are not completely in control of the string that is eval'd (if there's any user input in the command string that has not been rigorously validated), then you are opening yourself to potentially serious problems.
Note that the sequence of expansions for Bash is described in Shell Expansions in the GNU Bash manual. Note in particular sections 3.5.3 Shell Parameter Expansion, 3.5.7 Word Splitting, and 3.5.9 Quote Removal.
When you put quote characters into variables, they just become plain literals (see http://mywiki.wooledge.org/BashFAQ/050; thanks @tripleee for pointing out this link).
Instead, try using an array to pass your arguments:
argumentString=(-ir 'hello world')
grep "${argumentString[@]}" .
In looking at this and related questions, I'm surprised that no one brought up using an explicit subshell. For bash, and other modern shells, you can execute a command line explicitly. In bash, it requires the -c option.
argumentString="-ir 'hello world'"
bash -c "grep $argumentString ."
This works exactly as the original questioner desired. There are two restrictions to this technique:
You can only use single quotes within the command or argument strings.
Only exported environment variables will be available to the command.
Also, this technique handles redirection and piping, and other shellisms work as well. You can also use bash built-in commands as well as any other command that works at the command line, because you are essentially asking a subshell bash to interpret the string directly as a command line. Here's a more complex example, a somewhat gratuitously complex ls variant.
cmd="prefix=\`pwd\` && ls | xargs -n 1 echo 'In' \$prefix:"
bash -c "$cmd"
I have built command processors both this way and with parameter arrays. Generally, this way is much easier to write and debug, and it's trivial to echo the command you are executing. OTOH, param arrays work nicely when you really do have abstract arrays of parameters, as opposed to just wanting a simple command variant.
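For instance, the echo-to-debug step is as simple as:
echo "grep $argumentString ."   # shows exactly what the subshell will be asked to run
bash -c "grep $argumentString ."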

Run C program from shell script [duplicate]

I have a script in unix that looks like this:
#!/bin/bash
gcc -osign sign.c
./sign < /usr/share/dict/words | sort | squash > out
Whenever I try to run this script it gives me an error saying that squash is not a valid command. squash is a shell script stored in the same directory as this script and looks like this:
#!/bin/bash
awk -f squash.awk
I have execute permissions set correctly but for some reason it doesn't run. Is there something else I have to do to make it able to run like shown? I am rather new to scripting so any help would be greatly appreciated!
As mentioned in @Biffen's comment, unless . is in your $PATH variable, you need to specify ./squash for the same reason you need to specify ./sign.
When parsing a bare word on the command line, bash checks all the directories listed in $PATH to see if said word is an executable file living inside any of them. Unless . is in $PATH, bash won't find squash.
To avoid this problem, you can tell bash not to go looking for squash by giving bash the complete path to it, namely ./squash.
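With that change, the script from the question becomes (a sketch, assuming squash sits next to it and is executable):
#!/bin/bash
gcc -o sign sign.c
./sign < /usr/share/dict/words | sort | ./squash > out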

In C, what's the typical way to handle multiple arguments that are "list"-like?

Suppose I have some program called "combine" that takes input of "red", "green" and "blue"-type files to produce an output file (let's say "color.jpg")... BUT the number of each type is arbitrary. Let's also suppose that there's no way to determine what type the file is except through how the user classifies them. What do people usually do in this case?
For instance, on the command line, some of the approaches might be:
command red1,red2,red3 green1,green2 blue1 color.jpg
This comma-approach breaks down if commas can appear in the filenames. It's the approach I like the most though. Another idea would be
command "red1 red2 red3" "green1 green2" "blue1" color.jpg
but this approach also has trouble with spaces in names.
I could also require ASCII files containing lists giving the files of each type:
command redlist greenlist bluelist color.jpg
but this requires lugging around extra files.
Further ideas? Is there a standard LINUX way of doing this?
The standard way would be this:
command --red red1.jpg --red red2.jpg --blue blue1.jpg
With short options:
command -r red1.jpg -r red2.jpg -b blue1.jpg
With bash shorthand:
command --red={red1,red2}.jpg -b blue1.jpg
(The shell's brace expansion turns this into --red=red1.jpg --red=red2.jpg -b blue1.jpg, the same repeated-option form as the first invocation.)
Doing things this way avoids arbitrary limitations like "no commas in filenames" and also makes your program more interoperable with standard *nix utilities like xargs and so on.
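You can check what the shell actually hands to the program by prefixing echo (a quick sanity check, not part of the original answer):
echo command --red={red1,red2}.jpg -b blue1.jpg
# prints: command --red=red1.jpg --red=red2.jpg -b blue1.jpg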
Another way is accepting:
command -r redfile1 redfile2 -b bluefile1 bluefile2 -g greenfile1
so that:
command -r red* -b blue* -g green*
is possible.

Moving things in terminal based on their name

Edit: I think this has been answered successfully, but I can't check 'til later. I've reformatted it as suggested though.
The question: I have a series of files, each with a name of the form XXXXNAME, where XXXX is some number. I want to move them all to separate folders called XXXX and have them called NAME. I can do this manually, but I was hoping that by naming them XXXXNAME there'd be some way I could tell Terminal (I think that's the right name, but not really sure) to move them there. Something like
mv *NAME */NAME
but where it takes whatever * was in the first case and regurgitates it to the path.
This is on some form of Linux, with a bash shell.
In the real-life case, the files are 0000GNUmakefile, with sequential numbering. I'm having to make lots of similar-but-slightly-altered versions of a program to compile and run on a cluster as part of my research. It would probably have been quicker to write a program to edit all the files and put them in the right place to begin with, but I didn't.
This is probably extremely simple, and I should be able to find an answer myself, if I knew the right words. Thing is, I have no formal training in programming, so I don't know what to call things to search for them. So hopefully this will result in me getting an answer, and maybe knowing how to find out the answer for similar things myself next time. With the basic programming I've picked up, I'm sure I could write a program to do this for me, but I'm hoping there's a simple way to do it just using functionality already in Terminal. I probably shouldn't be allowed to play with these things.
Thanks for any help! I can actually program in C and Python a fair amount, but that's through trial and error largely, and I still don't know what I can do and can't do in Terminal.
There are so many ways to achieve this.
I find that the old standbys sed and awk are often the most powerful.
ls | sed -rne 's:^([0-9]{4})(NAME)$:mv -iv & \1/\2:p'
If you're satisfied that the commands look right, pipe the command line through a shell:
ls | sed -rne 's:^([0-9]{4})(NAME)$:mv -iv & \1/\2:p' | sh
I put NAME in parentheses and used \2 so that if it varies more than your example indicates, you can adapt the regular expression to handle your filenames better.
To do the same thing in gawk (GNU awk, the variant found in most GNU/Linux distros):
ls | gawk '/^[0-9]{4}NAME$/ {printf("mv -iv %s %s/%s\n", $0, substr($0,1,4), substr($0,5))}'
As with the first sample, this produces commands which, if they make sense to you, can be piped through a shell by appending | sh to the end of the line.
Note that with all these mv commands, I've added the -i and -v options. This is for your protection. Read the man page for mv (by typing man mv in your Linux terminal) to see if you should be comfortable leaving them out.
Also, I'm assuming with these lines that all your directories already exist. You didn't mention if they do. If they don't, here's a one-liner to create the directories.
ls | sed -rne 's:^([0-9]{4})(NAME)$:mkdir -p \1:p' | sort -u
As with the others, append | sh to run the commands.
I should mention that it is generally recommended to use constructs like for (in Tim's answer) or find instead of parsing the output of ls. That said, when your filename format is as simple as /[0-9]{4}word/, I find the quick sed one-liner to be the way to go.
Lastly, if by NAME you actually mean "any string of characters" rather than the literal string "NAME", then in all my examples above, replace NAME with .*.
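For example, the first sed line then becomes:
ls | sed -rne 's:^([0-9]{4})(.*)$:mv -iv & \1/\2:p'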
The following script will do this for you. Copy the script into a file on the remote machine (we'll call it sortfiles.sh).
#!/bin/bash
# Get all files in current directory having names XXXXsomename, where X is an integer
files=$(find . -name '[0-9][0-9][0-9][0-9]*')
# Build a list of the XXXX patterns found in the list of files
dirs=
for name in ${files}; do
    # Characters 3-6 are the digit prefix, because find prepends "./" to each name
    dirs="${dirs} $(echo ${name} | cut -c 3-6)"
done
# Remove redundant entries from the list of XXXX patterns
# (uniq only drops adjacent duplicates, so put one entry per line and sort first)
dirs=$(echo ${dirs} | tr ' ' '\n' | sort -u)
# Create any XXXX directories that are not already present
for name in ${dirs}; do
    if [[ ! -d ${name} ]]; then
        mkdir ${name}
    fi
done
# Move each XXXXsomename file into the matching directory, renaming it to somename
for name in ${files}; do
    mv ${name} $(echo ${name} | cut -c 3-6)/$(echo ${name} | cut -c 7-)
done
# Return from script with normal status
exit 0
From the command line, do chmod +x sortfiles.sh
Execute the script with ./sortfiles.sh
Just open the Terminal application, cd into the directory that contains the files you want moved/renamed, and copy and paste these commands into the command line.
shopt -s extglob   # the *( ) patterns below are extended globs
for file in [0-9][0-9][0-9][0-9]*; do
    dirName="${file%%*([^0-9])}"
    mkdir -p "$dirName"
    mv "$file" "$dirName/${file##*([0-9])}"
done
This assumes all the files that you want to rename and move are in the same directory. The file globbing also assumes that there are at least four digits at the start of the filename. A file with more than four leading digits will still be caught, but one with fewer will not; in that case, take the appropriate number of [0-9]s off the first line.
It does not handle the case where "NAME" (i.e. the name of the new file you want) starts with a number.
See this site for more information about string manipulation in bash.
