I am trying to solve a problem in shell.
I'm trying to find a way to delete all newlines from each element of an array. I tried to do this with a for loop.
The strings look like this (always three numbers, separated by dots):
"14.1.3\n" and I need to get rid of the newline at the end.
This is what I tried to do:
As a single-liner
for i in ${backup_versions[*]}; do backup_versions[$i]=echo "$i" | tr '\n' ' ' ; done
Easier to read
for i in ${backup_versions[*]};
do
backup_versions[$i]=echo "$i" | tr '\n' ' '
done
I think I am reassigning the element with the wrong syntax, but I have tried every way of writing it that I found or knew myself.
Deleting the newline works just fine; only the reassignment is my problem.
If the strings are always of that form and don't contain any whitespace or wildcard characters, you can just use the shell's word-splitting to remove extraneous whitespace characters from the values.
backup_versions=(${backup_versions[*]})
If you used mapfile to create the array, you can use the -t option to prevent it from including the newline in the value in the first place.
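As a minimal sketch of the mapfile point, assuming the versions come from a command that prints one per line (list_versions is a made-up stand-in for whatever produces the real list):

```bash
#!/usr/bin/env bash
# Hypothetical source of version strings; replace with the real command.
list_versions() {
  printf '%s\n' 14.1.3 14.2.0 15.0.1
}

# Without -t, each element keeps its trailing newline.
mapfile with_newlines < <(list_versions)

# With -t, the trailing newline is dropped as each line is read.
mapfile -t backup_versions < <(list_versions)

declare -p with_newlines backup_versions
```

Comparing the two declare -p lines shows that only the array read without -t carries the newlines.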
Use Bash's string substitution expansion ${var//old/new} to delete all newlines, and dynamically create a declaration for a new array, with elements stripped of newlines:
#!/usr/bin/env bash
backup_versions=(
$'foo\nbar\n'
$'\nbaz\ncux\n\n'
$'I have spaces\n and newlines\n'
$'It\'s a \n\n\nsingle quote and spaces\n'
$'Quoted "foo bar"\n and newline'
)
# shellcheck disable=SC2155 # Dynamically generated declaration
declare -a no_newlines="($(
printf '%q ' "${backup_versions[@]//$'\n'/}"
))"
# Debug print original array declaration
declare -p backup_versions
# Debug print the declaration of no_newlines
declare -p no_newlines
declare -a no_newlines="($(: Creates a dynamically generated declaration for the no_newlines array.
printf '%q ': Prints each argument, shell-quoted where necessary, followed by a space.
"${backup_versions[@]//$'\n'/}": Expand each element of the backup_versions array, // replacing all $'\n' newlines by nothing to delete them.
Finally the no_newlines array will contain all entries from backup_versions, with newlines stripped-out.
The debug output matches expectations:
declare -a backup_versions=([0]=$'foo\nbar\n' [1]=$'\nbaz\ncux\n\n' [2]=$'I have spaces\n and newlines\n' [3]=$'It\'s a \n\n\nsingle quote and spaces\n' [4]=$'Quoted "foo bar"\n and newline')
declare -a no_newlines=([0]="foobar" [1]="bazcux" [2]="I have spaces and newlines" [3]="It's a single quote and spaces" [4]="Quoted \"foo bar\" and newline")
You can use a modifier when expanding the array, then save the modified contents. If the elements just have a single trailing newline, use substring removal to trim it:
backup_versions=("${backup_versions[@]%$'\n'}")
(Note: when expanding an array, you should almost always use [@] instead of [*], and put double-quotes around it to avoid weird parsing. Bash doesn't generally let you combine modifiers, but you can combine them with [@] to apply the modifier to each element as it's expanded.)
If you want to remove all newlines from the elements (in case there are multiple newlines in some elements), use a substitution (with an empty replacement string) instead:
backup_versions=("${backup_versions[@]//$'\n'/}")
(But as several comments have mentioned, it'd probably be better to look at how the array's being created, and see if it's possible to just avoid putting newlines in the array in the first place.)
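For the loop in the question, the reassignment fails because $i holds an element's value rather than its index, and because the right-hand side is not a command substitution. If a loop is preferred over the one-line expansions, a sketch could iterate over the indices instead (the sample values below are made up):

```bash
#!/usr/bin/env bash
# Sample data with trailing newlines, as described in the question.
backup_versions=( $'14.1.3\n' $'14.2.0\n' $'15.0.1\n' )

# "${!backup_versions[@]}" expands to the indices, not the values.
for i in "${!backup_versions[@]}"; do
  # Trim one trailing newline with parameter expansion; no tr or subshell needed.
  backup_versions[i]=${backup_versions[i]%$'\n'}
done

declare -p backup_versions
```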
I'm trying to use Bash's built-in glob and brace expansion to expand a variable into an array.
path='./tmp2/tmp23/*'
expanded=($(eval echo "$(printf "%q" "${path}")"))
results:
declare -- path="./tmp2/tmp23/*"
declare -a expanded=([0]="./tmp2/tmp23/testfile" [1]="./tmp2/tmp23/testfile2" [2]="./tmp2/tmp23/testfile3" [3]="./tmp2/tmp23/testfile4" [4]="./tmp2/tmp23/tmp231")
This works.
(I have 4 files named testfileX and 1 folder in the ./tmp2/tmp23 folder.)
Each file or folder ends up in its own array element.
Now if my path contains spaces:
path='./tmp2/tmp2 3/*'
expanded=($(eval echo "$(printf "%q" "${path}")"))
Results
declare -- path="./tmp2/tmp2 3/*"
declare -a expanded=([0]="./tmp2/tmp2" [1]="3/")
This doesn't work: nothing is expanded, and the path is split on the space because of IFS word splitting.
Now with same path containing spaces but without glob:
path='./tmp2/tmp2 3/'
expanded=($(eval echo "$(printf "%q" "${path}"*)")) # glob added here, outside the quotes
Results:
declare -a expanded=([0]="./tmp2/tmp2" [1]="3/testfile./tmp2/tmp2" [2]="3/testfile2./tmp2/tmp2" [3]="3/testfile3./tmp2/tmp2" [4]="3/testfile4./tmp2/tmp2" [5]="3/tmp231")
The path is expanded, but the results are wrong and split because of IFS.
Now with a quoted $(eval)
expanded=("$(eval echo "$(printf "%q" "${path}"*)")")
Results:
declare -a expanded=([0]="./tmp2/tmp2 3/testfile./tmp2/tmp2 3/testfile2./tmp2/tmp2 3/testfile3./tmp2/tmp2 3/testfile4./tmp2/tmp2 3/tmp231")
Now the whole expansion ends up in a single array element.
Why does glob or brace expansion work inside a variable if there is no space?
Why does it stop working when there is a space? It is exactly the same code, with just a space added. Glob and brace expansions need to be outside double quotes, and eval seems to have no effect.
Is there an alternative (such as read or mapfile), or is it possible to escape the space character?
I found the question how-to-assign-a-glob-expression-to-a-variable-in-a-bash-script, but it says nothing about spaces.
Is there a way to expand a variable containing glob or brace expansion patterns into an array, using one method that works with or without spaces and without word splitting?
Kind Regards
Don't use eval. Don't use a subshell. Just clear IFS.
path='./tmp2/tmp2 3/*'
oIFS=${IFS:-$' \t\n'} IFS='' # backup prior IFS value
expanded=( $path ) # perform expansion, unquoted
IFS=$oIFS # reset to original value, or an equivalent thereto
When you perform an unquoted expansion, two separate things happen in order:
All the characters found in the $IFS variable are used to split the string into words
Each word is then expanded as a separate glob.
The default value of IFS contains the space, the tab and the newline. If you don't want spaces, tabs and newlines to be treated as delimiters between words, then you need to modify that default.
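One way to keep the IFS change from leaking into the rest of the script is to make it local to a function. The following is only a sketch: the scratch directory, the file names and the expand_glob helper are all invented for the demonstration.

```bash
#!/usr/bin/env bash
# Scratch directory with a space in one path component (made-up names).
tmp=$(mktemp -d)
mkdir -p "$tmp/tmp2 3"
touch "$tmp/tmp2 3/testfile" "$tmp/tmp2 3/testfile2"

expand_glob() {
  local IFS=        # empty IFS: no word splitting; globbing still happens
  expanded=( $1 )   # intentionally unquoted so the pattern is expanded
}

expand_glob "$tmp/tmp2 3/*"
declare -p expanded

rm -rf "$tmp"
```

Because IFS is declared local, its previous value is restored automatically when the function returns, so no manual backup and reset is needed.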
Using Bash, I am extracting multiple strings from a binary file. Those strings are filenames, so only NUL and slash cannot appear in them. I use a function that outputs those filenames, and I collect them in an array. I know I can set IFS to newline to handle filenames with spaces. I hope it is possible to separate the function's multi-line strings with NUL when saving them in the array, so that any legal *nix filename can be handled. If I set IFS to '' or '\0', I get some numbers instead of names. I'm not sure why, and maybe I have overlooked something pretty basic :)
How do I achieve getting all possible filename strings including not just spaces, but newlines and other characters/byte values as well?
Here is my simplified example.
#! /bin/bash
binaryFile=$1
getBinaryList () {
fileNameAddresses=( 123 456 789 ) #Just a mock example for simplicity
for currAddr in "${fileNameAddresses[@]}"
do
fileNameStart=$((currAddr)) #Just a mock example for simplicity
fileNameLength=48 #Just a mock example for simplicity
currFileName=$( dd status=none bs=1 skip=$fileNameStart count=$fileNameLength if=$binaryFile )
printf "%s\n" "$currFileName"
done
}
IFS=$'\n'
allFileNames=($(getBinaryList $binaryFile))
echo "${#allFileNames[@]}"
printf "%s\n" "${allFileNames[@]}"
Your idea is right, but with a couple of slight modifications you can achieve what you are looking for. In the getBinaryList function, instead of having printf emit each name followed by a newline, use a NUL byte as the separator, i.e.
printf "%s\0" "$currFileName"
Then, instead of setting IFS to newline and slurping the result into an array through word splitting, use mapfile, which reads its input directly into an array. Its -d '' option splits the input on the NUL byte (this needs Bash 4.4 or newer), and -t removes the trailing delimiter from each element. So your result can look like
mapfile -t -d '' allFileNames < <(getBinaryList "$binaryFile")
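A self-contained sketch of the NUL-delimited approach follows. The emit_names function is a hypothetical stand-in for getBinaryList, and the file names are made up. One caveat that still applies inside the original function: $(...) strips trailing newlines, so a name that ends in a newline would lose it before printf ever sees it.

```bash
#!/usr/bin/env bash
# Stand-in for getBinaryList: emits names containing a space and a newline,
# each terminated by a NUL byte.
emit_names() {
  printf '%s\0' 'plain.txt' 'with space.txt' $'with\nnewline.txt'
}

# -d '' splits on NUL; -t drops that delimiter from each element (Bash 4.4+).
mapfile -t -d '' allFileNames < <(emit_names)

echo "${#allFileNames[@]}"
printf '<%s>\n' "${allFileNames[@]}"
```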
I'm trying to find out whether the following command stores just one file in the array array_a:
array_a=$(find /path/dir1 -maxdepth 1 -name file_orders?.csv)
echo $array_a
/path/dir1/file_orders1.csv /path/dir1/file_orders2.csv
echo ${#array_a[@]}
1
So it tells me there's just one element, but obviously there are 2.
If I type echo ${array_a[0]} it doesn't return anything. It's as if the variable array_a isn't an array at all. How can I force it to store the elements in an array?
You are missing the parentheses that define an array. But the fundamental problem is that the unquoted output of a command substitution such as $(find ...) is split on whitespace, so if any matching file name contains a space, it will produce more than one element in the resulting array.
With -maxdepth 1 anyway, just use the shell's globbing facilities instead; you don't need find at all.
array_a=(/path/dir1/file_orders?.csv)
Also pay attention to quotes when using the array.
echo "${array_a[@]}"
Without the quotes, the whitespace splitting will happen again.
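To check whether exactly one file matched, the glob can be combined with nullglob so that an unmatched pattern yields an empty array rather than the literal pattern. This is a sketch using a throwaway directory; the directory and file names are invented.

```bash
#!/usr/bin/env bash
# Throwaway directory whose name contains a space (made-up layout).
tmp=$(mktemp -d)
mkdir "$tmp/dir 1"
touch "$tmp/dir 1/file_orders1.csv" "$tmp/dir 1/file_orders2.csv"

shopt -s nullglob                           # unmatched pattern expands to nothing
array_a=( "$tmp/dir 1"/file_orders?.csv )   # quote the directory, leave the glob unquoted
shopt -u nullglob

echo "${#array_a[@]}"
if (( ${#array_a[@]} == 1 )); then
  echo "exactly one match"
fi

rm -rf "$tmp"
```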
I thought setting IFS to $'\n' would help me in reading an entire file into an array, as in:
IFS=$'\n' read -r -a array < file
However, the above command only reads the first line of the file into the first element of the array, and nothing else.
Even this reads only the first line into the array:
string=$'one\ntwo\nthree'
IFS=$'\n' read -r -a array <<< "$string"
I came across other posts on this site that talk about either using mapfile -t or a read loop to read a file into an array.
Now my question is: when do I use IFS=$'\n' at all?
You are a bit confused as to what IFS is. IFS is the Internal Field Separator used by bash to perform word-splitting to split lines into words after expansion. The default value is [ \t\n] (space, tab, newline).
By reassigning IFS=$'\n', you are removing the ' \t' and telling bash to only split words on newline characters (your thinking is correct). That has the effect of allowing some line with spaces to be read into a single array element without quoting.
Where your implementation fails is in your read -r -a array < file. The -a causes words in the line to be assigned to sequential array indexes. However, you have told bash to only break on a newline (which is the whole line). Since you only call read once, only one array index is filled.
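The single-call behaviour can be seen in a small sketch (the sample string is made up). Both read calls consume only the first line; the IFS setting merely decides how that one line is split into elements.

```bash
#!/usr/bin/env bash
string=$'one two\nthree four'

# IFS is newline only: the first line is a single field, so one element.
IFS=$'\n' read -r -a array <<< "$string"

# Default IFS: the first line is split on the space, so two elements.
read -r -a words <<< "$string"

declare -p array words
```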
You can either do:
while IFS=$'\n' read -r line; do
array+=( "$line" )
done < "$filename"
(the IFS=$'\n' prefix applies only to read, so "$line" must be quoted; unquoted, it would be split again on the default IFS)
Or using IFS=$'\n', you could do
IFS=$'\n'
array=( $(<filename) )
or finally, you could use readarray, which needs no IFS change (-t strips the trailing newline from each element):
readarray -t array <filename
Try them and let me know if you have questions.
Your second try almost works, but you have to tell read that it should not just read until newline (the default behaviour), but for example until the null string:
$ IFS=$'\n' read -a arr -d '' <<< $'a b c\nd e f\ng h i'
$ declare -p arr
declare -a arr='([0]="a b c" [1]="d e f" [2]="g h i")'
But as you pointed out, mapfile/readarray is the way to go if you have it (requires Bash 4.0 or newer):
$ mapfile -t arr <<< $'a b c\nd e f\ng h i'
$ declare -p arr
declare -a arr='([0]="a b c" [1]="d e f" [2]="g h i")'
The -t option removes the newlines from each element.
As for when you'd want to use IFS=$'\n':
As just shown, if you want to read a file into an array, one line per element, if your Bash is older than 4.0, and you don't want to use a loop
Some people promote using an IFS without a space to avoid unexpected side effects from word splitting; the proper approach in my opinion, though, is to understand word splitting and make sure to avoid it with proper quoting as desired.
I've seen IFS=$'\n' used in tab completion scripts, for example the one for cd in bash-completion: this script fiddles with paths and replaces colons with newlines, to then split them up using that IFS.
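The colon-to-newline technique mentioned for completion scripts might look roughly like the sketch below. It is not taken from bash-completion; the search_path value and the split_on_colons helper are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical colon-separated list, similar in shape to CDPATH.
search_path='/usr/local/my dir:/opt/tools:/home/user'

split_on_colons() {
  local IFS=$'\n'   # split only on newlines, and only inside this function
  # Unquoted on purpose: colons become newlines, then word splitting
  # produces one element per entry. Glob characters in an entry would
  # still be expanded here.
  dirs=( ${1//:/$'\n'} )
}

split_on_colons "$search_path"
declare -p dirs
```

The entry containing a space stays in one element because the space is no longer part of IFS while the expansion is split.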
I have a file called failedfiles.txt with the following content:
failed1
failed2
failed3
I need to use grep to return the content on each line in that file, and save the output in a list to be accessed. So I want something like this:
temp_list=$(grep "[a-z]" failedfiles.txt)
However, the problem with this is that when I type
echo ${temp_list[0]}
I get the following output:
failed1 failed2 failed3
But what I want is when I do:
echo ${temp_list[0]}
to print
failed1
and when I do:
echo ${temp_list[1]}
to print
failed2
Thanks.
@devnull's helpful answer explains why your code didn't work as expected: command substitution always returns a single string (possibly composed of multiple lines).
However, simply putting (...) around a command substitution to create an array of lines will only work as expected if the lines output by the command do not have embedded spaces - otherwise, each individual (whitespace-separated) word will become its own array element.
Capturing command output lines at once, in an array:
To capture the lines output by an arbitrary command in an array, use the following:
bash < 4 (e.g., on OSX as of OS X 10.9.2): use read -a
IFS=$'\n' read -rd '' -a linesArray <<<"$(grep "[a-z]" failedfiles.txt)"
bash >= 4: use readarray:
readarray -t linesArray <<<"$(grep "[a-z]" failedfiles.txt)"
Note:
<<< initiates a so-called here-string, which pipes the string to its right (which happens to be the result of a command substitution here) into the command on the left via stdin.
While command <<< string is functionally equivalent to echo string | command in principle, the crucial difference is that the latter creates subshells, which make variable assignments in command pointless - they are localized to each subshell.
An alternative to combining here-strings with command substitution is [input] process substitution - <(...) - which, simply put, allows using a command's output as if it were an input file; the equivalent of <<<"$(command)" is < <(command).
read: -a reads into an array, and IFS=$'\n' ensures that every line is considered a separate field and thus read into its own array element; -d '' ensures that ALL lines are read at once (before breaking them into fields); -r turns off interpretation of escape sequences in the input.
readarray (also callable as mapfile) directly breaks input lines into an array of lines; -t ensures that the terminating \n is NOT included in the array elements.
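The subshell point from the notes above can be illustrated with a short sketch. The produce function is a made-up stand-in for the grep command, and the behaviour shown assumes Bash's default settings (the lastpipe option is off).

```bash
#!/usr/bin/env bash
# Stand-in for the grep command; the lines contain spaces on purpose.
produce() {
  printf '%s\n' 'failed 1' 'failed 2' 'failed 3'
}

# Pipeline: readarray runs in a subshell, so the array does not survive.
unset piped
produce | readarray -t piped
echo "piped: ${#piped[@]}"

# Process substitution: readarray runs in the current shell.
readarray -t linesArray < <(produce)
echo "linesArray: ${#linesArray[@]}"
```

With default settings the first count stays at zero, while the second reflects the three lines read.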
Looping over command output lines:
If there is no need to capture all lines in an array at once and looping over a command's output line by line is sufficient, use the following:
while IFS= read -r line; do
# ...
done < <(grep "[a-z]" failedfiles.txt)
IFS= ensures that each line is read unmodified in terms of whitespace; remove it to have leading and trailing whitespace trimmed.
-r ensures that the lines are read 'raw' in that substrings in the input that look like escape sequences - e.g., \t - are NOT interpreted as such.
Note the use of [input] process substitution (explained above) to provide the command output as input to the read loop.
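The effect of IFS= and -r on a single line can be compared in a small sketch; the sample input, with leading and trailing spaces and a literal backslash followed by t, is made up.

```bash
#!/usr/bin/env bash
# Literal backslash-t inside single quotes, plus surrounding spaces.
input='  indented\tline  '

IFS= read -r raw <<< "$input"    # whitespace and backslash both preserved
read -r trimmed <<< "$input"     # leading/trailing whitespace trimmed
read cooked <<< "$input"         # no -r: the backslash is consumed as an escape

printf '[%s]\n' "$raw" "$trimmed" "$cooked"
```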
You did not create an array. What you did was Command Substitution which would simply put the output of a command into a variable.
In order to create an array, say:
temp_list=( $(grep "[a-z]" failedfiles.txt) )
You might also want to refer to Guide on Arrays.
The proper and portable way to loop over lines in a file is simply
while read -r line; do
printf '%s\n' "$line" # placeholder: do something with "$line" here
done <failedfiles.txt