Split long string into array and keep delimiter - arrays

I've got a strange edge-case.
I have a long string which contains \n (newline characters).
So the string looks something like:
text="loremipsum\nDollor sit atmet \n aliquyam erat,
sed diam\naliquyam erat \n sed diam"
I need to split the string into an array, but keep the newline characters uninterpreted,
so the array/output looks like:
"loremipsum\n"
"Dollor sit atmet \n"
"aliquyam erat, sed diam\n"
"aliquyam erat \n"
"sed diam"
I couldn't find a way to split the string and preserve the \n characters.
If I use IFS=$"\n" the \n characters are deleted,
but if I use IFS="\n" the string gets split and all occurrences of n are deleted.
I tried it like:
IFS=$"\n" read -d '' -a arr <<<"$text"
How can I solve this?
Clarification/Update
The text is dynamic and can be very long 3000+ chars,
so creating the array like: declare -a arr=([0]=$'loremipsum\n'... is not an option.
The \n characters (0x5c + 0x6e in ascii code) should all be treated the same,
they should not be replaced with an actual newline.
The \n characters must be preserved,
because the program which gets the output looks for these in plaintext.
The \n characters can be at any position in a sentence,
also in a word like:
lor\nem or with spaces: Lorem \n ipsum
So the \n characters must be at the end of the elements inside the array, like shown above.
The text must only be split at \n, not at spaces etc.

My understanding from the sample (input/output) data given:
there is one actual newline character in text (between erat, and sed diam); this is to be removed, and assuming there is no (space) after erat, we need to add a (space), i.e., replace the actual newline character with a (space)
there are 4 literal strings of \ + n; we are to break the array after these literals; the literal \ + n are to remain in the text that is stored in the array
the output should have a leading space removed from array values
I'm assuming the final results should not include the double quotes (ie, OP included the double quotes in the desired output as a means of delimiting the array values for display purposes)
One idea:
text="loremipsum\nDollor sit atmet \n aliquyam erat,
sed diam\naliquyam erat \n sed diam"
# convert actual newline character to a (space)
text=${text//$'\n'/ }
# add an actual newline character after the literal `\` + `n`
text=${text//\\n/\\n$'\n'}
# print our value, remove leading (space), and load into array
IFS=$'\n' arr=( $(printf "%s\n" "${text}." | sed 's/^ //g') )
# display array
typeset -p arr
declare -a arr=([0]="loremipsum\\n" [1]="Dollor sit atmet \\n" [2]="aliquyam erat, sed diam\\n" [3]="aliquyam erat \\n" [4]="sed diam.")
# loop through array and display individual strings; add double quotes as delimiters for display purposes
for i in "${!arr[@]}"
do
echo "\"${arr[${i}]}\""
done
"loremipsum\n"
"Dollor sit atmet \n"
"aliquyam erat, sed diam\n"
"aliquyam erat \n"
"sed diam."

You can use process substitution and echo, e.g.
text="loremipsum\nDollor sit atmet \n aliquyam erat, sed diam\naliquyam erat \n sed diam"
readarray arr < <(echo -e "$text")
You can also use printf in the process substitution as well, e.g.
< <(printf "$text")
Since the -t option is not given to readarray, the '\n' is included as part of the array element.
Example Use/Output
Adding a declare -p arr to output the array, you would have:
text="loremipsum\nDollor sit atmet \n aliquyam erat, sed diam\naliquyam erat \n sed diam"
readarray arr < <(echo -e "$text")
declare -p arr
declare -a arr=([0]=$'loremipsum\n' [1]=$'Dollor sit atmet \n' [2]=$' aliquyam erat, sed diam\n' [3]=$'aliquyam erat \n' [4]=$' sed diam\n')
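If you want to trim leading whitespace, you can use a parameter expansion such as ${element#"${element%%[![:space:]]*}"}. (The shorter ${element#*[[:space:]]} strips everything up to and including the first whitespace character, so it is only safe for elements that begin with whitespace.) Up to you.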
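If the two-character \n sequences have to stay in the elements as plain text (backslash plus n), a loop built on parameter expansion is one way to do it without echo -e or sed. This is only a sketch: the names rest and piece are made up for illustration, the sample text is written on one line (an actual newline inside it could first be replaced with a space, as in the answer above), and leading spaces are deliberately left alone.

```bash
# Split on the literal two-character sequence \n and keep it on each piece.
text='loremipsum\nDollor sit atmet \n aliquyam erat, sed diam\naliquyam erat \n sed diam'

arr=()
rest=$text
while [[ $rest == *'\n'* ]]; do
    piece=${rest%%'\n'*}      # everything before the first literal \n
    arr+=( "$piece"'\n' )     # store the piece with its delimiter re-attached
    rest=${rest#*'\n'}        # drop that piece and its delimiter from the front
done
# whatever is left after the last \n becomes the final element
if [[ -n $rest ]]; then
    arr+=( "$rest" )
fi

declare -p arr
```

Because the pattern '\n' is single-quoted, bash matches it literally, so no real newline is ever introduced.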

Related

sed: sed -i "s/secret/$token/g" returning error when value of token have "/" at the end [duplicate]

In my bash script I have an external (received from user) string, which I should use in sed pattern.
REPLACE="<funny characters here>"
sed "s/KEYWORD/$REPLACE/g"
How can I escape the $REPLACE string so it would be safely accepted by sed as a literal replacement?
NOTE: The KEYWORD is a dumb substring with no matches etc. It is not supplied by user.
Warning: This does not consider newlines. For a more in-depth answer, see this SO-question instead. (Thanks, Ed Morton & Niklas Peter)
Note that escaping everything is a bad idea. Sed needs many characters to be escaped to get their special meaning. For example, if you escape a digit in the replacement string, it will turn in to a backreference.
As Ben Blank said, there are only three characters that need to be escaped in the replacement string (escapes themselves, forward slash for end of statement and & for replace all):
ESCAPED_REPLACE=$(printf '%s\n' "$REPLACE" | sed -e 's/[\/&]/\\&/g')
# Now you can use ESCAPED_REPLACE in the original sed statement
sed "s/KEYWORD/$ESCAPED_REPLACE/g"
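To see the escaping step at work, here is a small self-contained sketch. The helper name escape_replacement and the sample strings are invented for the example. It uses | as the delimiter of the escaping command and lists the backslash explicitly in the bracket expression, so it does not depend on how a particular sed treats \/ inside brackets; the outer substitution still uses / as in the commands above. Like those commands, it assumes the replacement contains no newlines.

```bash
# Escape the three characters that are special in a sed replacement: \ / &
escape_replacement() {
    printf '%s\n' "$1" | sed -e 's|[\\/&]|\\&|g'
}

replace='a/b & c\d'                       # sample "funny" replacement text
escaped=$(escape_replacement "$replace")  # each \, / and & now has a \ in front

# Use the escaped text in an ordinary s/// command
result=$(printf '%s\n' 'fooKEYWORDbar' | sed "s/KEYWORD/$escaped/g")
printf '%s\n' "$result"
```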
If you ever need to escape the KEYWORD string, the following is the one you need:
sed -e 's/[]\/$*.^[]/\\&/g'
And can be used by:
KEYWORD="The Keyword You Need";
ESCAPED_KEYWORD=$(printf '%s\n' "$KEYWORD" | sed -e 's/[]\/$*.^[]/\\&/g');
# Now you can use it inside the original sed statement to replace text
sed "s/$ESCAPED_KEYWORD/$ESCAPED_REPLACE/g"
Remember, if you use a character other than / as delimiter, you need to replace the slash in the expressions above with the character you are using. See PeterJCLaw's comment for explanation.
Edited: Due to some corner cases previously not accounted for, the commands above have changed several times. Check the edit history for details.
The sed command allows you to use other characters instead of / as separator:
sed 's#"http://www\.fubar\.com"#URL_FUBAR#g'
The double quotes are not a problem.
The only three literal characters which are treated specially in the replace clause are / (to close the clause), \ (to escape characters, backreference, &c.), and & (to include the match in the replacement). Therefore, all you need to do is escape those three characters:
sed "s/KEYWORD/$(echo $REPLACE | sed -e 's/\\/\\\\/g; s/\//\\\//g; s/&/\\\&/g')/g"
Example:
$ export REPLACE="'\"|\\/><&!"
$ echo fooKEYWORDbar | sed "s/KEYWORD/$(echo $REPLACE | sed -e 's/\\/\\\\/g; s/\//\\\//g; s/&/\\\&/g')/g"
foo'"|\/><&!bar
Based on Pianosaurus's regular expressions, I made a bash function that escapes both keyword and replacement.
function sedeasy {
sed -i "s/$(echo $1 | sed -e 's/\([[\/.*]\|\]\)/\\&/g')/$(echo $2 | sed -e 's/[\/&]/\\&/g')/g" $3
}
Here's how you use it:
sedeasy "include /etc/nginx/conf.d/*" "include /apps/*/conf/nginx.conf" /etc/nginx/nginx.conf
It's a bit late to respond... but there IS a much simpler way to do this. Just change the delimiter (i.e., the character that separates fields). So, instead of s/foo/bar/ you write s|foo|bar|.
And, here's the easy way to do this:
sed 's|/\*!50017 DEFINER=`snafu`@`localhost`\*/||g'
The resulting output is devoid of that nasty DEFINER clause.
It turns out you're asking the wrong question. I also asked the wrong question. The reason it's wrong is the beginning of the first sentence: "In my bash script...".
I had the same question & made the same mistake. If you're using bash, you don't need to use sed to do string replacements (and it's much cleaner to use the replace feature built into bash).
Instead of something like, for example:
function escape-all-funny-characters() { UNKNOWN_CODE_THAT_ANSWERS_THE_QUESTION_YOU_ASKED; }
INPUT='some long string with KEYWORD that need replacing KEYWORD.'
A="$(escape-all-funny-characters 'KEYWORD')"
B="$(escape-all-funny-characters '<funny characters here>')"
OUTPUT="$(sed "s/$A/$B/g" <<<"$INPUT")"
you can use bash features exclusively:
INPUT='some long string with KEYWORD that need replacing KEYWORD.'
A='KEYWORD'
B='<funny characters here>'
OUTPUT="${INPUT//"$A"/"$B"}"
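A quick check of the bash-only approach with characters that would upset sed. The sample values are invented. Quoting the replacement keeps & and \ literal (newer bash versions can otherwise treat an unquoted & in the replacement as the matched text), and the assignment is written without outer quotes, which avoids a quoting difference in older bash releases.

```bash
# Literal search-and-replace using only bash parameter expansion
input='some long string with KEYWORD that need replacing KEYWORD.'
a='KEYWORD'
b='/\&*'                          # characters that are troublesome for sed

# Quoted "$a" is matched literally; quoted "$b" is inserted literally
output=${input//"$a"/"$b"}
printf '%s\n' "$output"
```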
Use awk - it is cleaner:
$ awk -v R='//addr:\\file' '{ sub("THIS", R, $0); print $0 }' <<< "http://file:\_THIS_/path/to/a/file\\is\\\a\\ nightmare"
http://file:\_//addr:\file_/path/to/a/file\\is\\\a\\ nightmare
Here is an example of an AWK I used a while ago. It is an AWK that prints new AWKS. AWK and SED being similar it may be a good template.
ls | awk '{ print "awk " "'"'"'" " {print $1,$2,$3} " "'"'"'" " " $1 ".old_ext > " $1 ".new_ext" }' > for_the_birds
It looks excessive, but somehow that combination of quotes works to keep the ' printed as literals. Then if I remember correctly the variables are just surrounded with quotes like this: "$1". Try it, let me know how it works with SED.
These are the escape codes that I've found:
* = \x2a
( = \x28
) = \x29
" = \x22
/ = \x2f
\ = \x5c
' = \x27
? = \x3f
% = \x25
^ = \x5e
sed is typically a mess, especially the difference between gnu-sed and bsd-sed
might just be easier to place some sort of sentinel at the sed side, then a quick pipe over to awk, which is far more flexible in accepting any ERE regex, escaped hex, or escaped octals.
e.g. OFS in awk is the true replacement ::
date | sed -E 's/[0-9]+/\xC1\xC0/g' |
mawk NF=NF FS='\xC1\xC0' OFS='\360\237\244\241'
1 Tue Aug 🤡 🤡:🤡:🤡 EDT 🤡
(tested and confirmed working on both BSD-sed and GNU-sed - the emoji isn't a typo that's what those 4 bytes map to in UTF-8 )
There are dozens of answers out there... If you don't mind using a bash function schema, below is a good answer. The objective below was to allow using sed with practically any parameter as a KEYWORD (F_PS_TARGET) or as a REPLACE (F_PS_REPLACE). We tested it in many scenarios and it seems to be pretty safe. The implementation below supports tabs, line breaks and single quotes for both KEYWORD and REPLACE.
NOTES: The idea here is to use sed to escape entries for another sed command.
CODE
F_REVERSE_STRING_R=""
f_reverse_string() {
: 'Do a string reverse.
To undo just use a reversed string as STRING_INPUT.
Args:
STRING_INPUT (str): String input.
Returns:
F_REVERSE_STRING_R (str): The modified string.
'
local STRING_INPUT=$1
F_REVERSE_STRING_R=$(echo "x${STRING_INPUT}x" | tac | rev)
F_REVERSE_STRING_R=${F_REVERSE_STRING_R%?}
F_REVERSE_STRING_R=${F_REVERSE_STRING_R#?}
}
# [Ref(s).: https://stackoverflow.com/a/2705678/3223785 ]
F_POWER_SED_ECP_R=""
f_power_sed_ecp() {
: 'Escape strings for the "sed" command.
Escaped characters will be processed as is (e.g. \n, \t ...).
Args:
F_PSE_VAL_TO_ECP (str): Value to be escaped.
F_PSE_ECP_TYPE (int): 0 - For the TARGET value; 1 - For the REPLACE value.
Returns:
F_POWER_SED_ECP_R (str): Escaped value.
'
local F_PSE_VAL_TO_ECP=$1
local F_PSE_ECP_TYPE=$2
# NOTE: Operational characters of "sed" will be escaped, as well as single quotes.
# By Questor
if [ ${F_PSE_ECP_TYPE} -eq 0 ] ; then
# NOTE: For the TARGET value. By Questor
F_POWER_SED_ECP_R=$(echo "x${F_PSE_VAL_TO_ECP}x" | sed 's/[]\/$*.^[]/\\&/g' | sed "s/'/\\\x27/g" | sed ':a;N;$!ba;s/\n/\\n/g')
else
# NOTE: For the REPLACE value. By Questor
F_POWER_SED_ECP_R=$(echo "x${F_PSE_VAL_TO_ECP}x" | sed 's/[\/&]/\\&/g' | sed "s/'/\\\x27/g" | sed ':a;N;$!ba;s/\n/\\n/g')
fi
F_POWER_SED_ECP_R=${F_POWER_SED_ECP_R%?}
F_POWER_SED_ECP_R=${F_POWER_SED_ECP_R#?}
}
# [Ref(s).: https://stackoverflow.com/a/24134488/3223785 ,
# https://stackoverflow.com/a/21740695/3223785 ,
# https://unix.stackexchange.com/a/655558/61742 ,
# https://stackoverflow.com/a/11461628/3223785 ,
# https://stackoverflow.com/a/45151986/3223785 ,
# https://linuxaria.com/pills/tac-and-rev-to-see-files-in-reverse-order ,
# https://unix.stackexchange.com/a/631355/61742 ]
F_POWER_SED_R=""
f_power_sed() {
: 'Facilitate the use of the "sed" command. Replaces in files and strings.
Args:
F_PS_TARGET (str): Value to be replaced by the value of F_PS_REPLACE.
F_PS_REPLACE (str): Value that will replace F_PS_TARGET.
F_PS_FILE (Optional[str]): File in which the replacement will be made.
F_PS_SOURCE (Optional[str]): String to be manipulated in case "F_PS_FILE" was
not informed.
F_PS_NTH_OCCUR (Optional[int]): [1~n] - Replace the nth match; [n~-1] - Replace
the last nth match; 0 - Replace every match; Default 1.
Returns:
F_POWER_SED_R (str): Return the result if "F_PS_FILE" is not informed.
'
local F_PS_TARGET=$1
local F_PS_REPLACE=$2
local F_PS_FILE=$3
local F_PS_SOURCE=$4
local F_PS_NTH_OCCUR=$5
if [ -z "$F_PS_NTH_OCCUR" ] ; then
F_PS_NTH_OCCUR=1
fi
local F_PS_REVERSE_MODE=0
if [ ${F_PS_NTH_OCCUR} -lt -1 ] ; then
F_PS_REVERSE_MODE=1
f_reverse_string "$F_PS_TARGET"
F_PS_TARGET="$F_REVERSE_STRING_R"
f_reverse_string "$F_PS_REPLACE"
F_PS_REPLACE="$F_REVERSE_STRING_R"
f_reverse_string "$F_PS_SOURCE"
F_PS_SOURCE="$F_REVERSE_STRING_R"
F_PS_NTH_OCCUR=$((-F_PS_NTH_OCCUR))
fi
f_power_sed_ecp "$F_PS_TARGET" 0
F_PS_TARGET=$F_POWER_SED_ECP_R
f_power_sed_ecp "$F_PS_REPLACE" 1
F_PS_REPLACE=$F_POWER_SED_ECP_R
local F_PS_SED_RPL=""
if [ ${F_PS_NTH_OCCUR} -eq -1 ] ; then
# NOTE: We kept this option because it performs better when we only need to replace
# the last occurrence. By Questor
# [Ref(s).: https://linuxhint.com/use-sed-replace-last-occurrence/ ,
# https://unix.stackexchange.com/a/713866/61742 ]
F_PS_SED_RPL="'s/\(.*\)$F_PS_TARGET/\1$F_PS_REPLACE/'"
elif [ ${F_PS_NTH_OCCUR} -gt 0 ] ; then
# [Ref(s).: https://unix.stackexchange.com/a/587924/61742 ]
F_PS_SED_RPL="'s/$F_PS_TARGET/$F_PS_REPLACE/$F_PS_NTH_OCCUR'"
elif [ ${F_PS_NTH_OCCUR} -eq 0 ] ; then
F_PS_SED_RPL="'s/$F_PS_TARGET/$F_PS_REPLACE/g'"
fi
# NOTE: As the "sed" commands below always process literal values for the "F_PS_TARGET"
# so we use the "-z" flag in case it has multiple lines. By Quaestor
# [Ref(s).: https://unix.stackexchange.com/a/525524/61742 ]
if [ -z "$F_PS_FILE" ] ; then
F_POWER_SED_R=$(echo "x${F_PS_SOURCE}x" | eval "sed -z $F_PS_SED_RPL")
F_POWER_SED_R=${F_POWER_SED_R%?}
F_POWER_SED_R=${F_POWER_SED_R#?}
if [ ${F_PS_REVERSE_MODE} -eq 1 ] ; then
f_reverse_string "$F_POWER_SED_R"
F_POWER_SED_R="$F_REVERSE_STRING_R"
fi
else
if [ ${F_PS_REVERSE_MODE} -eq 0 ] ; then
eval "sed -i -z $F_PS_SED_RPL \"$F_PS_FILE\""
else
tac "$F_PS_FILE" | rev | eval "sed -z $F_PS_SED_RPL" | tac | rev > "$F_PS_FILE"
fi
fi
}
MODEL
f_power_sed "F_PS_TARGET" "F_PS_REPLACE" "" "F_PS_SOURCE"
echo "$F_POWER_SED_R"
EXAMPLE
f_power_sed "{ gsub(/,[ ]+|$/,\"\0\"); print }' ./ and eliminate" "[ ]+|$/,\"\0\"" "" "Great answer (+1). If you change your awk to awk '{ gsub(/,[ ]+|$/,\"\0\"); print }' ./ and eliminate that concatenation of the final \", \" then you don't have to go through the gymnastics on eliminating the final record. So: readarray -td '' a < <(awk '{ gsub(/,[ ]+/,\"\0\"); print; }' <<<\"$string\") on Bash that supports readarray. Note your method is Bash 4.4+ I think because of the -d in readar"
echo "$F_POWER_SED_R"
IF YOU JUST WANT TO ESCAPE THE PARAMETERS TO THE SED COMMAND
MODEL
# "TARGET" value.
f_power_sed_ecp "F_PSE_VAL_TO_ECP" 0
echo "$F_POWER_SED_ECP_R"
# "REPLACE" value.
f_power_sed_ecp "F_PSE_VAL_TO_ECP" 1
echo "$F_POWER_SED_ECP_R"
IMPORTANT: If the strings for KEYWORD and/or replace REPLACE contain tabs or line breaks you will need to use the "-z" flag in your "sed" command. More details here.
EXAMPLE
f_power_sed_ecp "{ gsub(/,[ ]+|$/,\"\0\"); print }' ./ and eliminate" 0
echo "$F_POWER_SED_ECP_R"
f_power_sed_ecp "[ ]+|$/,\"\0\"" 1
echo "$F_POWER_SED_ECP_R"
NOTE: The f_power_sed_ecp and f_power_sed functions above were made available completely free as part of this project ez_i - Create shell script installers easily!.
Standard recommendation here: use perl :)
echo KEYWORD > /tmp/test
REPLACE="<funny characters here>"
perl -pi.bck -e "s/KEYWORD/${REPLACE}/g" /tmp/test
cat /tmp/test
Don't forget all the fun that comes with the shell's quoting limitations around " and '
so (in ksh)
Var=">New version of \"content' here <"
printf "%s" "${Var}" | sed "s/[&\/\\\\*\\\"']/\\\\&/g" | read -r EscVar
echo "Here is your \"text\" to change" | sed "s/text/${EscVar}/g"
If you are generating a random password to pass to a sed replacement pattern, be careful about which characters the random string can contain. If you generate the password by encoding a value as base64, there is only one character that is both possible in base64 and special in a sed replacement pattern. That character is "/", and it is easily removed from the password you are generating:
# password 32 characters long, minus any copies of the "/" character.
pass=`openssl rand -base64 32 | sed -e 's/\///g'`;
If you are just looking to substitute a variable's value in a sed command, then just remove the single quotes so the shell can expand the variable.
Example:
change sed -i 's/dev-/dev-$ENV/g' test to sed -i s/dev-/dev-$ENV/g test
I have an improvement over the sedeasy function, which WILL break with special characters like tab.
function sedeasy_improved {
sed -i "s/$(
echo "$1" | sed -e 's/\([[\/.*]\|\]\)/\\&/g' |
sed -e 's:\t:\\t:g'
)/$(
echo "$2" | sed -e 's/[\/&]/\\&/g' |
sed -e 's:\t:\\t:g'
)/g" "$3"
}
So, what's different? $1 and $2 are wrapped in quotes to avoid shell expansions and preserve tabs or double spaces.
Additional piping | sed -e 's:\t:\\t:g' (I like : as token) which transforms a tab in \t.
An easier way to do this is simply building the string before hand and using it as a parameter for sed
rpstring="s/KEYWORD/$REPLACE/g"
sed -i "$rpstring" test.txt

remove enclosing brackets from a file

How can I efficiently remove enclosing brackets from a file with bash scripting (first occurrence of [ and last occurrence of ] in file)?
All brackets that are nested within the outer brackets and may extend over several lines should be retained.
Leading or trailing whitespaces may be present.
content of file1
[
Lorem ipsum
[dolor] sit [amet
conse] sadip elitr
]
cat file1 | magicCommand
desired output
Lorem ipsum
[dolor] sit [amet
conse] sadip elitr
content of file2
[Lorem ipsum [dolor] sit [amet conse] sadip elitr]
cat file2 | magicCommand
desired output
Lorem ipsum [dolor] sit [amet conse] sadip elitr
If you want to edit the file to remove the brackets, use ed:
printf '%s\n' '1s/^\([[:space:]]*\)\[/\1/' '$s/\]\([[:space:]]*\)$/\1/' w | ed -s file1
If you want to pass on the modified contents of the file to something else as part of a pipeline, use sed:
sed -e '1s/^\([[:space:]]*\)\[/\1/' -e '$s/\]\([[:space:]]*\)$/\1/' file1
Both of these will, for the first line of the file, remove a [ at the start of the line (skipping over any initial whitespace before the opening bracket), and for the last line of the file (which can be the same line, as in your second example), remove a ] at the end of the line (not counting any trailing whitespace after the closing bracket). Any leading/trailing whitespace will be preserved in the result; use s/...// instead to remove it too.
perl -0777 -pe 's/^\s*\[\s*//; s/\s*\]\s*$//' file
That's aggressive about removing all whitespace around the outer brackets, which isn't exactly what you show in your desired output.
With GNU sed for -E and -z:
$ sed -Ez 's/\[(.*)]/\1/' file1
Lorem ipsum
[dolor] sit [amet
conse] sadip elitr
$ sed -Ez 's/\[(.*)]/\1/' file2
Lorem ipsum [dolor] sit [amet conse] sadip elitr
The above will read the whole file into memory.

how to add \n before 4th pipe and after last double Quotes in a file in unix

I have a line in a file. like:
1|4|ab|"abnchf "dnvjnkjf" fdvjnfkjnv" 2|12|df|"dskfnkfv "A"
I want to break it into two rows by adding \n before the 4th pipe and after the last double quote.
it should be like:
1|4|ab|"abnchf "dnvjnkjf" fdvjnfkjnv"
2|12|df|"dskfnkfv "A"
I have tried a sed command, but it's not working:
sed 's/\(|[^|]*\)(|[^|]*\)(|[^|]*\)|/\1\n|/g'
You may use
sed 's/\([^|]*|\)\{3\}[^|]* /&\n/' file > newfile
See the online demo
Details
\([^|]*|\)\{3\} - three consecutive occurrences of
[^|]* - 0+ chars other than |
| - a pipe symbol
[^|]* - 0+ chars other than |
- a space
The replacement pattern is &\n, the whole match (&) and a newline (\n).
The replacement is only done once per line since I removed the g option.
To avoid overescaping, you may use a POSIX ERE based sed:
sed -E 's/([^|]*\|){3}[^|]* /&\n/' file > newfile
where you do not need to escape capturing parentheses and range/interval quantifier braces (but you have to escape a literal | char).
This might work for you (GNU sed):
sed 's/[^ |]*|/\n&/4' file
Insert a newline before the fourth field delimited by |.
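For comparison, here is a small variant of the ERE command above that replaces the separating space with the newline instead of keeping it, so the first row has no trailing blank. It is a sketch that assumes GNU sed (for -E and for \n in the replacement) and feeds the sample line from the question through a pipe; the variable names line and out are just for the demo.

```bash
# Sample line from the question
line='1|4|ab|"abnchf "dnvjnkjf" fdvjnfkjnv" 2|12|df|"dskfnkfv "A"'

# Group 1 = three pipe-terminated fields plus the following run of non-pipe
# characters; the space right after it is replaced by a newline.
out=$(printf '%s\n' "$line" | sed -E 's/(([^|]*\|){3}[^|]*) /\1\n/')
printf '%s\n' "$out"
```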

how to return an array from a script in Bash?

suppose I have a script called 'Hello'. something like:
array[0]="hello world"
array[1]="goodbye world"
echo ${array[*]}
and I want to do something like this in another script:
tmp=(`Hello`)
the result I need is:
echo ${tmp[0]} #prints "hello world"
echo ${tmp[1]} #prints "goodbye world"
instead I get
echo ${tmp[0]} #prints "hello"
echo ${tmp[1]} #prints "world"
or in other words, every word is put in a different spot in the tmp array.
how do I get the result I need?
Emit it as a NUL-delimited stream:
printf '%s\0' "${array[@]}"
...and, in the other side, read from that stream:
array=()
while IFS= read -r -d '' entry; do
array+=( "$entry" )
done
This often comes in handy in conjunction with process substitution; in the below example, the initial code is in a command (be it a function or an external process) invoked as generate_an_array:
array=()
while IFS= read -r -d '' entry; do
array+=( "$entry" )
done < <(generate_an_array)
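Putting the two halves together, a minimal round trip could look like the following. Here generate_an_array is defined as a function purely for the demo (it stands in for the Hello script from the question), and tmp is the receiving array.

```bash
# Producer: emits each element followed by a NUL byte
generate_an_array() {
    local array=( "hello world" "goodbye world" )
    printf '%s\0' "${array[@]}"
}

# Consumer: rebuilds the array, one NUL-terminated entry at a time
tmp=()
while IFS= read -r -d '' entry; do
    tmp+=( "$entry" )
done < <(generate_an_array)

printf '[%s]\n' "${tmp[@]}"
# prints:
# [hello world]
# [goodbye world]
```

Since a NUL byte cannot occur inside a bash string, elements containing spaces or newlines survive the trip intact.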
You can also use declare -p to emit a string which can be evaled to get the content back:
array=( "hello world" "goodbye world" )
declare -p array
...and, on the other side...
eval "$(generate_an_array)"
However, this is less preferable -- it's not as portable to programming languages other than bash (whereas almost all languages can read a NUL-delimited stream), and it requires the receiving program to trust the sending program to return declare -p results and not malicious content.
Although there are workarounds, you can't really "return" an array from a bash function or script, since the normal way of "returning" a value is to send it as a string to stdout and let the caller capture it with command substitution. [Note 1] That's fine for simple strings or very simple arrays (such as arrays of numbers, where the elements cannot contain whitespace), but it's really not a good way to send structured data.
There are workarounds, such as printing a string with specific delimiters (in particular, with NUL bytes) which can be parsed by the caller, or in the form of an executable bash statement which can be evaluated by the caller with eval, but on the whole the simplest mechanism is to require that the caller provide the name of an array variable into which the value can be placed. This only works with bash functions, since scripts can't modify the environment of the caller, and it only works with functions called directly in the parent process, so it won't work with pipelines. Effectively, this is a mechanism similar to that used by the read built-in, and a few other bash built-ins.
Here's a simple example. The function split takes three arguments: an array name, a delimiter, and a string:
split () {
IFS=$2 read -a "$1" -r -d '' < <(printf %s "$3")
}
eg:
$ # Some text
$ lorem="Lorem ipsum dolor
sit amet, consectetur
adipisicing elit, sed do
eiusmod tempor incididunt"
# Split at the commas, putting the pieces in the array phrase
$ split phrase "," "$lorem"
# Print the pieces in a way that you can see the elements.
$ printf -- "--%s\n" "${phrase[@]}"
--Lorem ipsum dolor
sit amet
-- consectetur
adipisicing elit
-- sed do
eiusmod tempor incididunt
Notes:
Any function or script does have a status return, which is a small integer; this is what is actually returned by the return and exit special forms. However, the status return mostly works as a boolean value, and certainly cannot carry a structured value.
hello.sh
declare -a array # declares a global array variable
array=(
"hello world"
"goodbye world"
)
other.sh
. hello.sh
tmp=( "${array[@]}" ) # if you need to make a copy of the array
echo "${tmp[0]}"
echo "${tmp[1]}"
If you truly want a function to spit out values that your script will capture, do this:
hello.sh
#!/bin/bash
array=(
"hello world"
"goodbye world"
)
printf "%s\n" "${array[@]}"
other.sh
#!/bin/bash
./hello.sh | {
readarray -t tmp
echo "${tmp[0]}"
echo "${tmp[1]}"
}
# or
readarray -t tmp < <(./hello.sh)
echo "${tmp[0]}"
echo "${tmp[1]}"

How to prevent c from interpreting \s of sed

I'm using below command to remove commented, empty lines from a file and search for a particular pattern.
sed '/#/d' $file | sed '/^[ ]*$/d' | tr -d '\n' | sed -n 's/^.*tags\s*[{]\s*hosttags\s*=\s*\([0-1]\)\s*[}].*/tags {hosttags = \1}/p'
Though the above expression works for me in the shell, I have to use it in C. The problem is in this line.
sprintf(buf, "sed '/#/d' %s | sed '/^[ ]*$/d' | tr -d '\n' | sed -n 's/^.*tags\s*[{]\s*hosttags\s*=\s*\([0-1]\)\s*[}].*/tags {hosttags = \1}/p'",file);
C tries to interpret \s and compiling fails. Replacing \s with [[:space]] is not working.
Please let me know how I can get this working in C.
Double up the backslashes, changing \n to \\n, \s to \\s, and so on:
sprintf(buf, "sed '/#/d' %s | sed '/^[ ]*$/d' | tr -d '\\n' | sed -n 's/^.*tags\\s*[{]\\s*hosttags\\s*=\\s*\\([0-1]\\)\\s*[}].*/tags {hosttags = \\1}/p'",file);
When \\ appears in C string literal, a single backslash is embedded into the string in its place.
Replace \s with the POSIX character class [[:space:]].

Resources