How can a post-receive hook written in perl get the branch name? - githooks

I have a post-receive hook that is written in perl. I need to be able to figure out which branch is being pushed to. How can I do this? I tried looking at #ARGV and $ARGV[2] without success.

The key from the git documentation is that the post-receive hook receives no arguments:
This hook executes once for the receive operation. It takes no arguments, but gets the same information as the <> hook does on its standard input.
Here is some perl code that I have used to parse ref:
while (<>) {
chomp;
next unless my($old,$new,$ref) =
m/ ^ ([0-9a-f]+) \s+ # old SHA-1
([0-9a-f]+) \s+ # new SHA-1
refs\/heads\/(.*?) # ref
\s* $ /x;
#...
}

Related

How can I overwrite file after replace the word?

I want to replace the text in the file and overwrite file.
use strict;
use warnings;
my ($str1, $str2, $i, $all_line);
$str1 = "AAA";
$str2 = "bbb";
open(RP, "+>", "testfile") ;
$all_line = $_;
$i = 0;
while(<RP>) {
while(/$str1/) {
$i++;
}
s/$str1/$str2/g;
print RP $_;
}
close(RP);
A normal process is to read the file line by line and write out each line, changed as/if needed, to a new file. Once that's all done rename that new file, atomically as much as possible, so to overwrite the orgiinal.
Here is an example of a library that does all that for us, Path::Tiny
use warnings;
use strict;
use feature 'say';
use Path::Tiny;
my $file = shift || die "Usage: $0 file\n";
say "File to process:";
say path($file)->slurp;
# NOTE: This CHANGES THE INPUT FILE
#
# Process each line: upper-case a letter after .
path($file)->edit_lines( sub { s/\.\s+\K([a-z])/\U$1/g } );
say "File now:";
say path($file)->slurp;
This upper-cases a letter following a dot (period), after some spaces, on each line where it is found and copies all other lines unchanged. (It's just an example, not a proper linguistic fix.)
Note: the input file is edited in-place so it will have been changed after this is done.
This capability was introduced in the module's version 0.077, of 2016-02-10. (For reference, Perl version 5.24 came in 2016-08. So with the system Perl 5.24 or newer, a Path::Tiny installed from an OS package or as a default CPAN version should have this method.)
Perl has a built-in in-place editing facility: The -i command line switch. You can access the functionality of this switch via $^I.
use strict;
use warnings;
my $str1 = "AAA";
my $str2 = "bbb";
local $^I = ''; # Same as `perl -i`. For `-i~`, assign `"~"`.
local #ARGV = "testfile";
while (<>) {
s/\Q$str1/$str2/g;
print;
}
I was not going to leave an answer, but I discovered when leaving a comment that I did have some feedback for you, so here goes.
open(RP, "+>", "testfile") ;
The mode "+>" will truncate your file, delete all it's content. This is described in the documentation for open:
You can put a + in front of the > or < to indicate that you want both read and write access to the file; thus +< is almost always preferred for read/write updates--the +> mode would clobber the file first. You can't usually use either read-write mode for updating textfiles, since they have variable-length records. See the -i switch in perlrun for a better approach.
So naturally, you can't read from the file if you first delete it's content. They mention here the -i switch, which is described like this in perl -h:
-i[extension] edit <> files in place (makes backup if extension supplied)
This is what ikegami's answer describes, only in his case it is done from within a program file rather than on the command line.
But, using the + mode for both reading and writing is not really a good way to do it. Basically it becomes difficult to print where you want to print. The recommended way is to instead read from one file, and then print to another. After the editing is done, you can rename and delete files as required. And this is exactly what the -i switch does for you. It is a predefined functionality of Perl. Read more about it in perldoc perlrun.
Next, you should use a lexical file handle. E.g. my $fh, instead of a global. And you should also check the return value from the open, to make sure there was not an error. Which gives us:
open my $fh, "<", "testfile" or die "Cannot open 'testfile': $!";
Usually if open fails, you want the program to die, and report the reason it failed. The error is in the $! variable.
Another thing to note is that you should not declare all your variables at the top of the file. It is good that you use use strict; use warnings, Perl code should never be written without them, but this is not how you handle it. You declare your variable as close to the place you use the variable as possible, and in the smallest scope possible. With a my declared variable, that is the nearest enclosing block { ... }. This will make it easy to trace your variable in bigger programs, and it will encapsulate your code and protect your variable.
In your case, you would simply put the my before all the variables, like so:
my $str1 = "AAA";
my $str2 = "bbb";
my $all_line = $_;
my $i = 0;
Note that $_ will be empty/undefined there, so that assignment is kind of pointless. If your intent was to use $all_line as the loop variable, you would do:
while (my $all_line = <$fh>) {
Note that this variable is declared in the smallest scope possible, and we are using a lexical file handle $fh.
Another important note is that your replacement strings can contain regex meta characters. Sometimes you want them to be included, like for example:
my $str1 = "a+"; # match one or more 'a'
Sometimes you don't want that:
my $str1 = "google.com"; # '.' is meant as a period, not the "match anything" character
I will assume that you most often do not want that, in which case you should use the escape sequence \Q ... \E which disables regex meta characters inside it.
So what do we get if we put all this together? Well, you might get something like this:
use strict;
use warnings;
my $str1 = "AAA";
my $str2 = "bbb";
my $filename = shift || "testfile"; # 'testfile', or whatever the program argument is
open my $fh_in, "<", $filename or die "Cannot open '$filename': $!";
open my $fh_out, ">", "$filename.out" or die "Cannot open '$filename.out': $!";
while (<$fh_in>) { # read from input file...
s/\Q$str1\E/$str2/g; # perform substitution...
print $fh_out $_; # print to output file
}
close $fh_in;
close $fh_out;
After this finishes, you may choose to rename the files and delete one or the other. This is basically the same procedure as using the -i switch, except here we do it explicitly.
rename $filename, "$filename.bak"; # save backup of input file in .bak extension
rename "$filename.out", $filename; # clobber the input file
Renaming files is sometimes also facilitated by the File::Copy module, which is a core module.
With all this said, you can replace all your code with this:
perl -i -pe's/AAA/bbb/g' testfile
And this is the power of the -i switch.

How to set arrays with variables with loop in bash [duplicate]

I am confused about a bash script.
I have the following code:
function grep_search() {
magic_way_to_define_magic_variable_$1=`ls | tail -1`
echo $magic_variable_$1
}
I want to be able to create a variable name containing the first argument of the command and bearing the value of e.g. the last line of ls.
So to illustrate what I want:
$ ls | tail -1
stack-overflow.txt
$ grep_search() open_box
stack-overflow.txt
So, how should I define/declare $magic_way_to_define_magic_variable_$1 and how should I call it within the script?
I have tried eval, ${...}, \$${...}, but I am still confused.
I've been looking for better way of doing it recently. Associative array sounded like overkill for me. Look what I found:
suffix=bzz
declare prefix_$suffix=mystr
...and then...
varname=prefix_$suffix
echo ${!varname}
From the docs:
The ‘$’ character introduces parameter expansion, command substitution, or arithmetic expansion. ...
The basic form of parameter expansion is ${parameter}. The value of parameter is substituted. ...
If the first character of parameter is an exclamation point (!), and parameter is not a nameref, it introduces a level of indirection. Bash uses the value formed by expanding the rest of parameter as the new parameter; this is then expanded and that value is used in the rest of the expansion, rather than the expansion of the original parameter. This is known as indirect expansion. The value is subject to tilde expansion, parameter expansion, command substitution, and arithmetic expansion. ...
Use an associative array, with command names as keys.
# Requires bash 4, though
declare -A magic_variable=()
function grep_search() {
magic_variable[$1]=$( ls | tail -1 )
echo ${magic_variable[$1]}
}
If you can't use associative arrays (e.g., you must support bash 3), you can use declare to create dynamic variable names:
declare "magic_variable_$1=$(ls | tail -1)"
and use indirect parameter expansion to access the value.
var="magic_variable_$1"
echo "${!var}"
See BashFAQ: Indirection - Evaluating indirect/reference variables.
Beyond associative arrays, there are several ways of achieving dynamic variables in Bash. Note that all these techniques present risks, which are discussed at the end of this answer.
In the following examples I will assume that i=37 and that you want to alias the variable named var_37 whose initial value is lolilol.
Method 1. Using a “pointer” variable
You can simply store the name of the variable in an indirection variable, not unlike a C pointer. Bash then has a syntax for reading the aliased variable: ${!name} expands to the value of the variable whose name is the value of the variable name. You can think of it as a two-stage expansion: ${!name} expands to $var_37, which expands to lolilol.
name="var_$i"
echo "$name" # outputs “var_37”
echo "${!name}" # outputs “lolilol”
echo "${!name%lol}" # outputs “loli”
# etc.
Unfortunately, there is no counterpart syntax for modifying the aliased variable. Instead, you can achieve assignment with one of the following tricks.
1a. Assigning with eval
eval is evil, but is also the simplest and most portable way of achieving our goal. You have to carefully escape the right-hand side of the assignment, as it will be evaluated twice. An easy and systematic way of doing this is to evaluate the right-hand side beforehand (or to use printf %q).
And you should check manually that the left-hand side is a valid variable name, or a name with index (what if it was evil_code # ?). By contrast, all other methods below enforce it automatically.
# check that name is a valid variable name:
# note: this code does not support variable_name[index]
shopt -s globasciiranges
[[ "$name" == [a-zA-Z_]*([a-zA-Z_0-9]) ]] || exit
value='babibab'
eval "$name"='$value' # carefully escape the right-hand side!
echo "$var_37" # outputs “babibab”
Downsides:
does not check the validity of the variable name.
eval is evil.
eval is evil.
eval is evil.
1b. Assigning with read
The read builtin lets you assign values to a variable of which you give the name, a fact which can be exploited in conjunction with here-strings:
IFS= read -r -d '' "$name" <<< 'babibab'
echo "$var_37" # outputs “babibab\n”
The IFS part and the option -r make sure that the value is assigned as-is, while the option -d '' allows to assign multi-line values. Because of this last option, the command returns with an non-zero exit code.
Note that, since we are using a here-string, a newline character is appended to the value.
Downsides:
somewhat obscure;
returns with a non-zero exit code;
appends a newline to the value.
1c. Assigning with printf
Since Bash 3.1 (released 2005), the printf builtin can also assign its result to a variable whose name is given. By contrast with the previous solutions, it just works, no extra effort is needed to escape things, to prevent splitting and so on.
printf -v "$name" '%s' 'babibab'
echo "$var_37" # outputs “babibab”
Downsides:
Less portable (but, well).
Method 2. Using a “reference” variable
Since Bash 4.3 (released 2014), the declare builtin has an option -n for creating a variable which is a “name reference” to another variable, much like C++ references. Just as in Method 1, the reference stores the name of the aliased variable, but each time the reference is accessed (either for reading or assigning), Bash automatically resolves the indirection.
In addition, Bash has a special and very confusing syntax for getting the value of the reference itself, judge by yourself: ${!ref}.
declare -n ref="var_$i"
echo "${!ref}" # outputs “var_37”
echo "$ref" # outputs “lolilol”
ref='babibab'
echo "$var_37" # outputs “babibab”
This does not avoid the pitfalls explained below, but at least it makes the syntax straightforward.
Downsides:
Not portable.
Risks
All these aliasing techniques present several risks. The first one is executing arbitrary code each time you resolve the indirection (either for reading or for assigning). Indeed, instead of a scalar variable name, like var_37, you may as well alias an array subscript, like arr[42]. But Bash evaluates the contents of the square brackets each time it is needed, so aliasing arr[$(do_evil)] will have unexpected effects… As a consequence, only use these techniques when you control the provenance of the alias.
function guillemots {
declare -n var="$1"
var="«${var}»"
}
arr=( aaa bbb ccc )
guillemots 'arr[1]' # modifies the second cell of the array, as expected
guillemots 'arr[$(date>>date.out)1]' # writes twice into date.out
# (once when expanding var, once when assigning to it)
The second risk is creating a cyclic alias. As Bash variables are identified by their name and not by their scope, you may inadvertently create an alias to itself (while thinking it would alias a variable from an enclosing scope). This may happen in particular when using common variable names (like var). As a consequence, only use these techniques when you control the name of the aliased variable.
function guillemots {
# var is intended to be local to the function,
# aliasing a variable which comes from outside
declare -n var="$1"
var="«${var}»"
}
var='lolilol'
guillemots var # Bash warnings: “var: circular name reference”
echo "$var" # outputs anything!
Source:
BashFaq/006: How can I use variable variables (indirect variables, pointers, references) or associative arrays?
BashFAQ/048: eval command and security issues
Example below returns value of $name_of_var
var=name_of_var
echo $(eval echo "\$$var")
Use declare
There is no need on using prefixes like on other answers, neither arrays. Use just declare, double quotes, and parameter expansion.
I often use the following trick to parse argument lists contanining one to n arguments formatted as key=value otherkey=othervalue etc=etc, Like:
# brace expansion just to exemplify
for variable in {one=foo,two=bar,ninja=tip}
do
declare "${variable%=*}=${variable#*=}"
done
echo $one $two $ninja
# foo bar tip
But expanding the argv list like
for v in "$#"; do declare "${v%=*}=${v#*=}"; done
Extra tips
# parse argv's leading key=value parameters
for v in "$#"; do
case "$v" in ?*=?*) declare "${v%=*}=${v#*=}";; *) break;; esac
done
# consume argv's leading key=value parameters
while test $# -gt 0; do
case "$1" in ?*=?*) declare "${1%=*}=${1#*=}";; *) break;; esac
shift
done
Combining two highly rated answers here into a complete example that is hopefully useful and self-explanatory:
#!/bin/bash
intro="You know what,"
pet1="cat"
pet2="chicken"
pet3="cow"
pet4="dog"
pet5="pig"
# Setting and reading dynamic variables
for i in {1..5}; do
pet="pet$i"
declare "sentence$i=$intro I have a pet ${!pet} at home"
done
# Just reading dynamic variables
for i in {1..5}; do
sentence="sentence$i"
echo "${!sentence}"
done
echo
echo "Again, but reading regular variables:"
echo $sentence1
echo $sentence2
echo $sentence3
echo $sentence4
echo $sentence5
Output:
You know what, I have a pet cat at home
You know what, I have a pet chicken at home
You know what, I have a pet cow at home
You know what, I have a pet dog at home
You know what, I have a pet pig at home
Again, but reading regular variables:
You know what, I have a pet cat at home
You know what, I have a pet chicken at home
You know what, I have a pet cow at home
You know what, I have a pet dog at home
You know what, I have a pet pig at home
This will work too
my_country_code="green"
x="country"
eval z='$'my_"$x"_code
echo $z ## o/p: green
In your case
eval final_val='$'magic_way_to_define_magic_variable_"$1"
echo $final_val
This should work:
function grep_search() {
declare magic_variable_$1="$(ls | tail -1)"
echo "$(tmpvar=magic_variable_$1 && echo ${!tmpvar})"
}
grep_search var # calling grep_search with argument "var"
An extra method that doesn't rely on which shell/bash version you have is by using envsubst. For example:
newvar=$(echo '$magic_variable_'"${dynamic_part}" | envsubst)
For zsh (newers mac os versions), you should use
real_var="holaaaa"
aux_var="real_var"
echo ${(P)aux_var}
holaaaa
Instead of "!"
As per BashFAQ/006, you can use read with here string syntax for assigning indirect variables:
function grep_search() {
read "$1" <<<$(ls | tail -1);
}
Usage:
$ grep_search open_box
$ echo $open_box
stack-overflow.txt
Even though it's an old question, I still had some hard time with fetching dynamic variables names, while avoiding the eval (evil) command.
Solved it with declare -n which creates a reference to a dynamic value, this is especially useful in CI/CD processes, where the required secret names of the CI/CD service are not known until runtime. Here's how:
# Bash v4.3+
# -----------------------------------------------------------
# Secerts in CI/CD service, injected as environment variables
# AWS_ACCESS_KEY_ID_DEV, AWS_SECRET_ACCESS_KEY_DEV
# AWS_ACCESS_KEY_ID_STG, AWS_SECRET_ACCESS_KEY_STG
# -----------------------------------------------------------
# Environment variables injected by CI/CD service
# BRANCH_NAME="DEV"
# -----------------------------------------------------------
declare -n _AWS_ACCESS_KEY_ID_REF=AWS_ACCESS_KEY_ID_${BRANCH_NAME}
declare -n _AWS_SECRET_ACCESS_KEY_REF=AWS_SECRET_ACCESS_KEY_${BRANCH_NAME}
export AWS_ACCESS_KEY_ID=${_AWS_ACCESS_KEY_ID_REF}
export AWS_SECRET_ACCESS_KEY=${_AWS_SECRET_ACCESS_KEY_REF}
echo $AWS_ACCESS_KEY_ID $AWS_SECRET_ACCESS_KEY
aws s3 ls
Wow, most of the syntax is horrible! Here is one solution with some simpler syntax if you need to indirectly reference arrays:
#!/bin/bash
foo_1=(fff ddd) ;
foo_2=(ggg ccc) ;
for i in 1 2 ;
do
eval mine=( \${foo_$i[#]} ) ;
echo ${mine[#]}" " ;
done ;
For simpler use cases I recommend the syntax described in the Advanced Bash-Scripting Guide.
KISS approach:
a=1
c="bam"
let "$c$a"=4
echo $bam1
results in 4
I want to be able to create a variable name containing the first argument of the command
script.sh file:
#!/usr/bin/env bash
function grep_search() {
eval $1=$(ls | tail -1)
}
Test:
$ source script.sh
$ grep_search open_box
$ echo $open_box
script.sh
As per help eval:
Execute arguments as a shell command.
You may also use Bash ${!var} indirect expansion, as already mentioned, however it doesn't support retrieving of array indices.
For further read or examples, check BashFAQ/006 about Indirection.
We are not aware of any trick that can duplicate that functionality in POSIX or Bourne shells without eval, which can be difficult to do securely. So, consider this a use at your own risk hack.
However, you should re-consider using indirection as per the following notes.
Normally, in bash scripting, you won't need indirect references at all. Generally, people look at this for a solution when they don't understand or know about Bash Arrays or haven't fully considered other Bash features such as functions.
Putting variable names or any other bash syntax inside parameters is frequently done incorrectly and in inappropriate situations to solve problems that have better solutions. It violates the separation between code and data, and as such puts you on a slippery slope toward bugs and security issues. Indirection can make your code less transparent and harder to follow.
For indexed arrays, you can reference them like so:
foo=(a b c)
bar=(d e f)
for arr_var in 'foo' 'bar'; do
declare -a 'arr=("${'"$arr_var"'[#]}")'
# do something with $arr
echo "\$$arr_var contains:"
for char in "${arr[#]}"; do
echo "$char"
done
done
Associative arrays can be referenced similarly but need the -A switch on declare instead of -a.
POSIX compliant answer
For this solution you'll need to have r/w permissions to the /tmp folder.
We create a temporary file holding our variables and leverage the -a flag of the set built-in:
$ man set
...
-a Each variable or function that is created or modified is given the export attribute and marked for export to the environment of subsequent commands.
Therefore, if we create a file holding our dynamic variables, we can use set to bring them to life inside our script.
The implementation
#!/bin/sh
# Give the temp file a unique name so you don't mess with any other files in there
ENV_FILE="/tmp/$(date +%s)"
MY_KEY=foo
MY_VALUE=bar
echo "$MY_KEY=$MY_VALUE" >> "$ENV_FILE"
# Now that our env file is created and populated, we can use "set"
set -a; . "$ENV_FILE"; set +a
rm "$ENV_FILE"
echo "$foo"
# Output is "bar" (without quotes)
Explaining the steps above:
# Enables the -a behavior
set -a
# Sources the env file
. "$ENV_FILE"
# Disables the -a behavior
set +a
While I think declare -n is still the best way to do it there is another way nobody mentioned it, very useful in CI/CD
function dynamic(){
export a_$1="bla"
}
dynamic 2
echo $a_2
This function will not support spaces so dynamic "2 3" will return an error.
for varname=$prefix_suffix format, just use:
varname=${prefix}_suffix

Bash array indirection in a function [duplicate]

Bash script to create multiple arrays from csv with unknown columns.
I am trying to write a script to compare two csv files with similar columns. I need it to locate the matching column from the other csv and compare any differences. The kicker is I would like the script to be dynamic to allow any number of columns to be entered and it still be able to function. I thought I had a good plan to solve this but turns out I'm running into syntax errors. Here is a sample of a csv I need to compare.
IP address, Notes, Nmap-SSH, Nmap-SMTP, Nmap-HTTP, Nmap-HTTPS,
10.0.0.1, , open, closed, open, open,
10.0.0.2, , closed, open, closed, closed,
When I read the csv file I was planning to look for "IF column == open; then; populate this column's array with the IP address" This would have given me 4 lists in this scenario with the IPs that were listening on said port. I could then compare that to my security device configuration to make sure it was configured properly. Finally to the meat, here is what I thought would accomplish creating the arrays for me to search later. However I ran into a snag when I tried to use a variable inside an array name. Can my syntax be corrected or is there just a better way to do this sort of thing?
#!/bin/bash
#
#
# This script compares config_cleaned_<ip>.txt output against ext_web_env.csv and outputs the differences
#
#
# Read from ext_web_env.csv file and create Array
#
FILENAME=./tmp/ext_web_env.csv
#
index=0
#
while read line
do
# How many columns are in the .csv?
varEnvCol=$(echo $line | awk -F, '{print NF}')
echo "columns = $varEnvCol"
# While loop to create array for each column
while [ $varEnvCol != 2 ]
do
# Checks to see if port is open; if so then add IP address to array
varPortCon=$(echo $line | awk -F, -v i=$varEnvCol '{print $i}')
if [ $varPortCon = "open" ]
then
arr$varEnvCol[$index]="$(echo $line | awk -F, '{print $1}')"
# I get this error message "line29 : arr8[194]=10.0.0.194: command not found"
fi
echo "arrEnv$varEnvCol is: ${arr$varEnvCol[#]}"
# Another error but not as important since I am using this to debug "line31: arr$varEnvCol is: ${arr$varEnvCol[#]}: bad substitution"
varEnvCol=$(($varEnvCol - 1))
done
index=$(($index + 1 ))
done < $FILENAME
UPDATE
I also tried using the eval command since all the data will be populated by other scripts.
but am getting this error message:
./compare.sh: line 41: arr8[83]=10.0.0.83: command not found
Here is my new code for this example:
if [[ $varPortCon = *'open'* ]]
then
eval arr\$varEnvCol[$index]=$(echo $line | awk -F, '{print $1}')
fi
arr$varEnvCol[$index]="$(...)"
doesn't work the way you expect it to - you cannot assign to shell variables indirectly - via an expression that expands to the variable name - this way.
Your attempted workaround with eval is also flawed - see below.
tl;dr
If you use bash 4.3 or above:
declare -n targetArray="arr$varEnvCol"
targetArray[index]=$(echo $line | awk -F, '{print $1}')
bash 4.2 or earlier:
declare "arr$varEnvCol"[index]="$(echo $line | awk -F, '{print $1}')"
Caveat: This will work in your particular situation, but may fail subtly in others; read on for details, including a more robust, but cumbersome alternative based on read.
The eval-based solution mentioned by #shellter in a since-deleted comment is problematic not only for security reasons (as they mentioned), but also because it can get quite tricky with respect to quoting; for completeness, here's the eval-based solution:
eval "arr$varEnvCol[index]"='$(echo $line | awk -F, '\''{print $1}'\'')'
See below for an explanation.
Assign to a bash array variable indirectly:
bash 4.3+: use declare -n to effectively create an alias ('nameref') of another variable
This is by far the best option, if available:
declare -n targetArray="arr$varEnvCol"
targetArray[index]=$(echo $line | awk -F, '{print $1}')
declare -n effectively allows you to refer to a variable by another name (whether that variable is an array or not), and the name to create an alias for can be the result of an expression (an expanded string), as demonstrated.
bash 4.2-: there are several options, each with tradeoffs
NOTE: With non-array variables, the best approach is to use printf -v. Since this question is about array variables, this approach is not discussed further.
[most robust, but cumbersome]: use read:
IFS=$'\n' read -r -d '' "arr$varEnvCol"[index] <<<"$(echo $line | awk -F, '{print $1}')"
IFS=$'\n' ensures that that leading and trailing whitespace in each input line is left intact.
-r prevents interpretation of \ chars. in the input.
-d '' ensures that ALL input is captured, even multi-line.
Note, however, that any trailing \n chars. are stripped.
If you're only interested in the first line of input, omit -d ''
"arr$varEnvCol"[index] expands to the variable - array element, in this case - to assign to; note that referring to variable index inside an array subscript does NOT need the $ prefix, because subscripts are evaluated in arithmetic context, where the prefix is optional.
<<< - a so-called here-string - sends its argument to stdin, where read takes its input from.
[simplest, but may break]: use declare:
declare "arr$varEnvCol"[index]="$(echo $line | awk -F, '{print $1}')"
(This is slightly counter-intuitive, in that declare is meant to declare, not modify a variable, but it works in bash 3.x and 4.x, with the constraints noted below.)
Works fine OUTSIDE a FUNCTION - whether the array was explicitly declared with declare or not.
Caveat: INSIDE a function, only works with LOCAL variables - you cannot reference shell-global variables (variables declared outside the function) from inside a function that way. Attempting to do so invariably creates a LOCAL variable ECLIPSING the shell-global variable.
[insecure and tricky]: use eval:
eval "arr$varEnvCol[index]"='$(echo $line | awk -F, '\''{print $1}'\'')'
CAVEAT: Only use eval if you fully control the contents of the string being evaluated; eval will execute any command contained in a string, with potentially unwanted results.
Understanding what variable references/command substitutions get expanded when is nontrivial - the safest approach is to delay expansion so that they happen when eval executes rather than immediate expansion that happens when arguments are passed to eval.
For a variable assignment statement to succeed, the RHS (right-hand side) must eventually evaluate to a single token - either unquoted without whitespace or quoted (optionally with whitespace).
The above example uses single quotes to delay expansion; thus, the string passed mustn't contain single quotes directly and thus is broken into multiple parts with literal ' chars. spliced in as \'.
Also note that the LHS (left-hand side) of the assignment statement passed to eval must be a double-quoted string - using an unquoted string with selective quoting of $ won't work, curiously:
OK: eval "arr$varEnvCol[index]"=...
FAILS: eval arr\$varEnvCol[index]=...

How do I store output of a git command in a Perl array or scalar?

In my code I have
git diff --numstat
I know I could create a file with
git diff --numstat > log.log
But is it even possible to pass this into an array or scalar of some sort? I was thinking something like this but I'm not sure why it doesn’t compile.
my #array;
push (#array, git diff --numstat);
Use backticks, also known more generally as qx//:
qx/STRING/
`STRING`
A string which is (possibly) interpolated and then executed as a system command with /bin/sh or its equivalent. Shell wildcards, pipes, and redirections will be honored. The collected standard output of the command is returned; standard error is unaffected. In scalar context, it comes back as a single (potentially multi-line) string, or undef if the command failed. In list context, returns a list of lines (however you've defined lines with $/ or $INPUT_RECORD_SEPARATOR), or an empty list if the command failed.
You have options, and which is better depends on what you want to do with the output.
To read all of the standard output into a scalar, use the operator in scalar context as in
$output = `git diff --numstat`;
In list context with the default value of $/, perl splits the output into separate lines. If you want to append the git output to the end of an existing array, use push, as in
push #array, `git diff --numstat`;
Although you mentioned push specifically in your question, I’m having a hard time imagining why you’d mix the output of git with something else. Storing the output in an array directly is simpler:
#array = `git diff --numstat`;
Note that the list of lines returned retain their end-of-line characters. To create a new array and remove all of the newlines in one line, write
chomp(#array = `git diff --numstat`);
or even
chomp(my #array = `git diff --numstat`);
if you’re running under use strict.
Error Handling
For code that you plan to use more than once or twice, you should check that `git diff --numstat`, or any other command whose output you want to read, actually succeeded. Otherwise, with the warnings pragma enabled, you’ll see lots of diagnostic messages about undefined variables or missing output.
In scalar context, failure will return the undefined value. Check it as in
my $output = `git diff --numstat`;
die "$0: git may not be installed" unless defined $output;
Failure in list context produces an empty list.
my #output = `git diff --numstat`;
die "$0: git may not be installed" unless #output;

Exporting an array in bash script

I can not export an array from a bash script to another bash script like this:
export myArray[0]="Hello"
export myArray[1]="World"
When I write like this there are no problem:
export myArray=("Hello" "World")
For several reasons I need to initialize my array into multiple lines. Do you have any solution?
Array variables may not (yet) be exported.
From the manpage of bash version 4.1.5 under ubuntu 10.04.
The following statement from Chet Ramey (current bash maintainer as of 2011) is probably the most official documentation about this "bug":
There isn't really a good way to encode an array variable into the environment.
http://www.mail-archive.com/bug-bash#gnu.org/msg01774.html
TL;DR: exportable arrays are not directly supported up to and including bash-5.1, but you can (effectively) export arrays in one of two ways:
a simple modification to the way the child scripts are invoked
use an exported function to store the array initialisation, with a simple modification to the child scripts
Or, you can wait until bash-4.3 is released (in development/RC state as of February 2014, see ARRAY_EXPORT in the Changelog). Update: This feature is not enabled in 4.3. If you define ARRAY_EXPORT when building, the build will fail. The author has stated it is not planned to complete this feature.
The first thing to understand is that the bash environment (more properly command execution environment) is different to the POSIX concept of an environment. The POSIX environment is a collection of un-typed name=value pairs, and can be passed from a process to its children in various ways (effectively a limited form of IPC).
The bash execution environment is effectively a superset of this, with typed variables, read-only and exportable flags, arrays, functions and more. This partly explains why the output of set (bash builtin) and env or printenv differ.
When you invoke another bash shell you're starting a new process, you loose some bash state. However, if you dot-source a script, the script is run in the same environment; or if you run a subshell via ( ) the environment is also preserved (because bash forks, preserving its complete state, rather than reinitialising using the process environment).
The limitation referenced in #lesmana's answer arises because the POSIX environment is simply name=value pairs with no extra meaning, so there's no agreed way to encode or format typed variables, see below for an interesting bash quirk regarding functions , and an upcoming change in bash-4.3(proposed array feature abandoned).
There are a couple of simple ways to do this using declare -p (built-in) to output some of the bash environment as a set of one or more declare statements which can be used reconstruct the type and value of a "name". This is basic serialisation, but with rather less of the complexity some of the other answers imply. declare -p preserves array indexes, sparse arrays and quoting of troublesome values. For simple serialisation of an array you could just dump the values line by line, and use read -a myarray to restore it (works with contiguous 0-indexed arrays, since read -a automatically assigns indexes).
These methods do not require any modification of the script(s) you are passing the arrays to.
declare -p array1 array2 > .bash_arrays # serialise to an intermediate file
bash -c ". .bash_arrays; . otherscript.sh" # source both in the same environment
Variations on the above bash -c "..." form are sometimes (mis-)used in crontabs to set variables.
Alternatives include:
declare -p array1 array2 > .bash_arrays # serialise to an intermediate file
BASH_ENV=.bash_arrays otherscript.sh # non-interactive startup script
Or, as a one-liner:
BASH_ENV=<(declare -p array1 array2) otherscript.sh
The last one uses process substitution to pass the output of the declare command as an rc script. (This method only works in bash-4.0 or later: earlier versions unconditionally fstat() rc files and use the size returned to read() the file in one go; a FIFO returns a size of 0, and so won't work as hoped.)
In a non-interactive shell (i.e. shell script) the file pointed to by the BASH_ENV variable is automatically sourced. You must make sure bash is correctly invoked, possibly using a shebang to invoke "bash" explicitly, and not #!/bin/sh as bash will not honour BASH_ENV when in historical/POSIX mode.
If all your array names happen to have a common prefix you can use declare -p ${!myprefix*} to expand a list of them, instead of enumerating them.
You probably should not attempt to export and re-import the entire bash environment using this method, some special bash variables and arrays are read-only, and there can be other side-effects when modifying special variables.
(You could also do something slightly disagreeable by serialising the array definition to an exportable variable, and using eval, but let's not encourage the use of eval ...
$ array=([1]=a [10]="b c")
$ export scalar_array=$(declare -p array)
$ bash # start a new shell
$ eval $scalar_array
$ declare -p array
declare -a array='([1]="a" [10]="b c")'
)
As referenced above, there's an interesting quirk: special support for exporting functions through the environment:
function myfoo() {
echo foo
}
with export -f or set +a to enable this behaviour, will result in this in the (process) environment, visible with printenv:
myfoo=() { echo foo
}
The variable is functionname (or functioname() for backward compatibility) and its value is () { functionbody }.
When a subsequent bash process starts it will recreate a function from each such environment variable. If you peek into the bash-4.2 source file variables.c you'll see variables starting with () { are handled specially. (Though creating a function using this syntax with declare -f is forbidden.) Update: The "shellshock" security issue is related to this feature, contemporary systems may disable automatic function import from the environment as a mitigation.
If you keep reading though, you'll see an #if 0 (or #if ARRAY_EXPORT) guarding code that checks variables starting with ([ and ending with ), and a comment stating "Array variables may not yet be exported". The good news is that in the current development version bash-4.3rc2 the ability to export indexed arrays (not associative) is enabled. This feature is not likely to be enabled, as noted above.
We can use this to create a function which restores any array data required:
% function sharearray() {
array1=(a b c d)
}
% export -f sharearray
% bash -c 'sharearray; echo ${array1[*]}'
So, similar to the previous approach, invoke the child script with:
bash -c "sharearray; . otherscript.sh"
Or, you can conditionally invoke the sharearray function in the child script by adding at some appropriate point:
declare -F sharearray >/dev/null && sharearray
Note there is no declare -a in the sharearray function, if you do that the array is implicitly local to the function, not what is wanted. bash-4.2 supports declare -g that makes a variable declared in a function into a global, so declare -ga can then be used. (Since associative arrays require a declare -A you won't be able to use this method for global associative arrays prior to bash-4.2, from v4.2 declare -Ag will work as hoped.) The GNU parallel documentation has useful variation on this method, see the discussion of --env in the man page.
Your question as phrased also indicates you may be having problems with export itself. You can export a name after you've created or modified it. "exportable" is a flag or property of a variable, for convenience you can also set and export in a single statement. Up to bash-4.2 export expects only a name, either a simple (scalar) variable or function name are supported.
Even if you could (in future) export arrays, exporting selected indexes (a slice) may not be supported (though since arrays are sparse there's no reason it could not be allowed). Though bash also supports the syntax declare -a name[0], the subscript is ignored, and "name" is simply a normal indexed array.
Jeez. I don't know why the other answers made this so complicated. Bash has nearly built-in support for this.
In the exporting script:
myArray=( ' foo"bar ' $'\n''\nbaz)' ) # an array with two nasty elements
myArray="${myArray[#]#Q}" ./importing_script.sh
(Note, the double quotes are necessary for correct handling of whitespace within array elements.)
Upon entry to importing_script.sh, the value of the myArray environment variable comprises these exact 26 bytes:
' foo"bar ' $'\n\\nbaz)'
Then the following will reconstitute the array:
eval "myArray=( ${myArray} )"
CAUTION! Do not eval like this if you cannot trust the source of the myArray environment variable. This trick exhibits the "Little Bobby Tables" vulnerability. Imagine if someone were to set the value of myArray to ) ; rm -rf / #.
The environment is just a collection of key-value pairs, both of which are character strings. A proper solution that works for any kind of array could either
Save each element in a different variable (e.g. MY_ARRAY_0=myArray[0]). Gets complicated because of the dynamic variable names.
Save the array in the file system (declare -p myArray >file).
Serialize all array elements into a single string.
These are covered in the other posts. If you know that your values never contain a certain character (for example |) and your keys are consecutive integers, you can simply save the array as a delimited list:
export MY_ARRAY=$(IFS='|'; echo "${myArray[*]}")
And restore it in the child process:
IFS='|'; myArray=($MY_ARRAY); unset IFS
Based on #mr.spuratic use of BASH_ENV, here I tunnel $# through script -f -c
script -c <command> <logfile> can be used to run a command inside another pty (and process group) but it cannot pass any structured arguments to <command>.
Instead <command> is a simple string to be an argument to the system library call.
I need to tunnel $# of the outer bash into $# of the bash invoked by script.
As declare -p cannot take #, here I use the magic bash variable _ (with a dummy first array value as that will get overwritten by bash). This saves me trampling on any important variables:
Proof of concept:
BASH_ENV=<( declare -a _=("" "$#") && declare -p _ ) bash -c 'set -- "${_[#]:1}" && echo "$#"'
"But," you say, "you are passing arguments to bash -- and indeed I am, but these are a simple string of known character. Here is use by script
SHELL=/bin/bash BASH_ENV=<( declare -a _=("" "$#") && declare -p _ && echo 'set -- "${_[#]:1}"') script -f -c 'echo "$#"' /tmp/logfile
which gives me this wrapper function in_pty:
in_pty() {
SHELL=/bin/bash BASH_ENV=<( declare -a _=("" "$#") && declare -p _ && echo 'set -- "${_[#]:1}"') script -f -c 'echo "$#"' /tmp/logfile
}
or this function-less wrapper as a composable string for Makefiles:
in_pty=bash -c 'SHELL=/bin/bash BASH_ENV=<( declare -a _=("" "$$#") && declare -p _ && echo '"'"'set -- "$${_[#]:1}"'"'"') script -qfc '"'"'"$$#"'"'"' /tmp/logfile' --
...
$(in_pty) test --verbose $# $^
I was editing a different post and made a mistake. Augh. Anyway, perhaps this might help?
https://stackoverflow.com/a/11944320/1594168
Note that because the shell's array format is undocumented on bash or any other shell's side,
it is very difficult to return a shell array in platform independent way.
You would have to check the version, and also craft a simple script that concatinates all
shell arrays into a file that other processes can resolve into.
However, if you know the name of the array you want to take back home then there is a way, while a bit dirty.
Lets say I have
MyAry[42]="whatever-stuff";
MyAry[55]="foo";
MyAry[99]="bar";
So I want to take it home
name_of_child=MyAry
take_me_home="`declare -p ${name_of_child}`";
export take_me_home="${take_me_home/#declare -a ${name_of_child}=/}"
We can see it being exported, by checking from a sub-process
echo ""|awk '{print "from awk =["ENVIRON["take_me_home"]"]"; }'
Result :
from awk =['([42]="whatever-stuff" [55]="foo" [99]="bar")']
If we absolutely must, use the env var to dump it.
env > some_tmp_file
Then
Before running the another script,
# This is the magic that does it all
source some_tmp_file
As lesmana reported, you cannot export arrays. So you have to serialize them before passing through the environment. This serialization useful other places too where only a string fits (su -c 'string', ssh host 'string'). The shortest code way to do this is to abuse 'getopt'
# preserve_array(arguments). return in _RET a string that can be expanded
# later to recreate positional arguments. They can be restored with:
# eval set -- "$_RET"
preserve_array() {
_RET=$(getopt --shell sh --options "" -- -- "$#") && _RET=${_RET# --}
}
# restore_array(name, payload)
restore_array() {
local name="$1" payload="$2"
eval set -- "$payload"
eval "unset $name && $name=("\$#")"
}
Use it like this:
foo=("1: &&& - *" "2: two" "3: %# abc" )
preserve_array "${foo[#]}"
foo_stuffed=${_RET}
restore_array newfoo "$foo_stuffed"
for elem in "${newfoo[#]}"; do echo "$elem"; done
## output:
# 1: &&& - *
# 2: two
# 3: %# abc
This does not address unset/sparse arrays.
You might be able to reduce the 2 'eval' calls in restore_array.
Although this question/answers are pretty old, this post seems to be the top hit when searching for "bash serialize array"
And, although the original question wasn't quite related to serializing/deserializing arrays, it does seem that the answers have devolved in that direction.
So with that ... I offer my solution:
Pros
All Core Bash Concepts
No Evals
No Sub-Commands
Cons
Functions take variable names as arguments (vs actual values)
Serializing requires having at least one character that is not present in the array
serialize_array.bash
# shellcheck shell=bash
##
# serialize_array
# Serializes a bash array to a string, with a configurable seperator.
#
# $1 = source varname ( contains array to be serialized )
# $2 = target varname ( will contian the serialized string )
# $3 = seperator ( optional, defaults to $'\x01' )
#
# example:
#
# my_arry=( one "two three" four )
# serialize_array my_array my_string '|'
# declare -p my_string
#
# result:
#
# declare -- my_string="one|two three|four"
#
function serialize_array() {
declare -n _array="${1}" _str="${2}" # _array, _str => local reference vars
local IFS="${3:-$'\x01'}"
# shellcheck disable=SC2034 # Reference vars assumed used by caller
_str="${_array[*]}" # * => join on IFS
}
##
# deserialize_array
# Deserializes a string into a bash array, with a configurable seperator.
#
# $1 = source varname ( contains string to be deserialized )
# $2 = target varname ( will contain the deserialized array )
# $3 = seperator ( optional, defaults to $'\x01' )
#
# example:
#
# my_string="one|two three|four"
# deserialize_array my_string my_array '|'
# declare -p my_array
#
# result:
#
# declare -a my_array=([0]="one" [1]="two three" [2]="four")
#
function deserialize_array() {
IFS="${3:-$'\x01'}" read -r -a "${2}" <<<"${!1}" # -a => split on IFS
}
NOTE: This is hosted as a gist here:
https://gist.github.com/TekWizely/c0259f25e18f2368c4a577495cd566cd
[edits]
Logic simplified after running through shellcheck + shfmt.
Added URL for hosted GIST
you (hi!) can use this, dont need writing a file, for ubuntu 12.04, bash 4.2.24
Also, your multiple lines array can be exported.
cat >>exportArray.sh
function FUNCarrayRestore() {
local l_arrayName=$1
local l_exportedArrayName=${l_arrayName}_exportedArray
# if set, recover its value to array
if eval '[[ -n ${'$l_exportedArrayName'+dummy} ]]'; then
eval $l_arrayName'='`eval 'echo $'$l_exportedArrayName` #do not put export here!
fi
}
export -f FUNCarrayRestore
function FUNCarrayFakeExport() {
local l_arrayName=$1
local l_exportedArrayName=${l_arrayName}_exportedArray
# prepare to be shown with export -p
eval 'export '$l_arrayName
# collect exportable array in string mode
local l_export=`export -p \
|grep "^declare -ax $l_arrayName=" \
|sed 's"^declare -ax '$l_arrayName'"export '$l_exportedArrayName'"'`
# creates exportable non array variable (at child shell)
eval "$l_export"
}
export -f FUNCarrayFakeExport
test this example on terminal bash (works with bash 4.2.24):
source exportArray.sh
list=(a b c)
FUNCarrayFakeExport list
bash
echo ${list[#]} #empty :(
FUNCarrayRestore list
echo ${list[#]} #profit! :D
I may improve it here
PS.: if someone clears/improve/makeItRunFaster I would like to know/see, thx! :D
For arrays with values without spaces, I've been using a simple set of functions to iterate through each array element and concatenate the array:
_arrayToStr(){
array=($#)
arrayString=""
for (( i=0; i<${#array[#]}; i++ )); do
if [[ $i == 0 ]]; then
arrayString="\"${array[i]}\""
else
arrayString="${arrayString} \"${array[i]}\""
fi
done
export arrayString="(${arrayString})"
}
_strToArray(){
str=$1
array=${str//\"/}
array=(${array//[()]/""})
export array=${array[#]}
}
The first function with turn the array into a string by adding the opening and closing parentheses and escaping all of the double quotation marks. The second function will strip the quotation marks and the parentheses and place them into a dummy array.
In order export the array, you would pass in all the elements of the original array:
array=(foo bar)
_arrayToStr ${array[#]}
At this point, the array has been exported into the value $arrayString. To import the array in the destination file, rename the array and do the opposite conversion:
_strToArray "$arrayName"
newArray=(${array[#]})
Much thanks to #stéphane-chazelas who pointed out all the problems with my previous attempts, this now seems to work to serialise an array to stdout or into a variable.
This technique does not shell-parse the input (unlike declare -a/declare -p) and so is safe against malicious insertion of metacharacters in the serialised text.
Note: newlines are not escaped, because read deletes the \<newlines> character pair, so -d ... must instead be passed to read, and then unescaped newlines are preserved.
All this is managed in the unserialise function.
Two magic characters are used, the field separator and the record separator (so that multiple arrays can be serialized to the same stream).
These characters can be defined as FS and RS but neither can be defined as newline character because an escaped newline is deleted by read.
The escape character must be \ the backslash, as that is what is used by read to avoid the character being recognized as an IFS character.
serialise will serialise "$#" to stdout, serialise_to will serialise to the varable named in $1
serialise() {
set -- "${#//\\/\\\\}" # \
set -- "${#//${FS:-;}/\\${FS:-;}}" # ; - our field separator
set -- "${#//${RS:-:}/\\${RS:-:}}" # ; - our record separator
local IFS="${FS:-;}"
printf ${SERIALIZE_TARGET:+-v"$SERIALIZE_TARGET"} "%s" "$*${RS:-:}"
}
serialise_to() {
SERIALIZE_TARGET="$1" serialise "${#:2}"
}
unserialise() {
local IFS="${FS:-;}"
if test -n "$2"
then read -d "${RS:-:}" -a "$1" <<<"${*:2}"
else read -d "${RS:-:}" -a "$1"
fi
}
and unserialise with:
unserialise data # read from stdin
or
unserialise data "$serialised_data" # from args
e.g.
$ serialise "Now is the time" "For all good men" "To drink \$drink" "At the \`party\`" $'Party\tParty\tParty'
Now is the time;For all good men;To drink $drink;At the `party`;Party Party Party:
(without a trailing newline)
read it back:
$ serialise_to s "Now is the time" "For all good men" "To drink \$drink" "At the \`party\`" $'Party\tParty\tParty'
$ unserialise array "$s"
$ echo "${array[#]/#/$'\n'}"
Now is the time
For all good men
To drink $drink
At the `party`
Party Party Party
or
unserialise array # read from stdin
Bash's read respects the escape character \ (unless you pass the -r flag) to remove special meaning of characters such as for input field separation or line delimiting.
If you want to serialise an array instead of a mere argument list then just pass your array as the argument list:
serialise_array "${my_array[#]}"
You can use unserialise in a loop like you would read because it is just a wrapped read - but remember that the stream is not newline separated:
while unserialise array
do ...
done
I've wrote my own functions for this and improved the method with the IFS:
Features:
Doesn't call to $(...) and so doesn't spawn another bash shell process
Serializes ? and | characters into ?00 and ?01 sequences and back, so can be used over array with these characters
Handles the line return characters between serialization/deserialization as other characters
Tested in cygwin bash 3.2.48 and Linux bash 4.3.48
function tkl_declare_global()
{
eval "$1=\"\$2\"" # right argument does NOT evaluate
}
function tkl_declare_global_array()
{
local IFS=$' \t\r\n' # just in case, workaround for the bug in the "[#]:i" expression under the bash version lower than 4.1
eval "$1=(\"\${#:2}\")"
}
function tkl_serialize_array()
{
local __array_var="$1"
local __out_var="$2"
[[ -z "$__array_var" ]] && return 1
[[ -z "$__out_var" ]] && return 2
local __array_var_size
eval declare "__array_var_size=\${#$__array_var[#]}"
(( ! __array_var_size )) && { tkl_declare_global $__out_var ''; return 0; }
local __escaped_array_str=''
local __index
local __value
for (( __index=0; __index < __array_var_size; __index++ )); do
eval declare "__value=\"\${$__array_var[__index]}\""
__value="${__value//\?/?00}"
__value="${__value//|/?01}"
__escaped_array_str="$__escaped_array_str${__escaped_array_str:+|}$__value"
done
tkl_declare_global $__out_var "$__escaped_array_str"
return 0
}
function tkl_deserialize_array()
{
local __serialized_array="$1"
local __out_var="$2"
[[ -z "$__out_var" ]] && return 1
(( ! ${#__serialized_array} )) && { tkl_declare_global $__out_var ''; return 0; }
local IFS='|'
local __deserialized_array=($__serialized_array)
tkl_declare_global_array $__out_var
local __index=0
local __value
for __value in "${__deserialized_array[#]}"; do
__value="${__value//\?01/|}"
__value="${__value//\?00/?}"
tkl_declare_global $__out_var[__index] "$__value"
(( __index++ ))
done
return 0
}
Example:
a=($'1 \n 2' "3\"4'" 5 '|' '?')
tkl_serialize_array a b
tkl_deserialize_array "$b" c
I think you can try it this way (by sourcing your script after export):
export myArray=(Hello World)
. yourScript.sh

Resources