SED Match/Replace URL and Update Serialized Array Count

SED Match/Replace URL and Update Serialized Array Count - arrays

Below is an example snippet from a sql dump file. This specific row contains a meta_value of a Wordpress PHP serialized array. During database restores in dev., test., and qc. environments I'm using sed to replace URLs with the respective environment sub-domain.
INSERT INTO `wp_postmeta`
(`meta_id`,
`post_id`,
`meta_key`,
`meta_value`)
VALUES
(527,
1951,
'ut_parallax_image',
'a:4:{
s:17:\"background-image\";
s:33:\"http://example.com/background.jpg\";
s:23:\"mobile-background-image\";
s:37:\"www.example.com/mobile-background.jpg\";
}')
;
However, I need to extend this to correct the string length in the serialized arrays after replace.
sed -r -e "s/:\/\/(www\.)?${domain}/:\/\/\1${1}\.${domain}/g" "/vagrant/repositories/apache/$domain/_sql/$(basename "$file")" > "/vagrant/repositories/apache/$domain/_sql/$1.$(basename "$file")"
The result should look like this for dev.:
INSERT INTO `wp_postmeta`
(`meta_id`,
`post_id`,
`meta_key`,
`meta_value`)
VALUES
(527,
1951,
'ut_parallax_image',
'a:4:{
s:17:\"background-image\";
s:37:\"http://dev.example.com/background.jpg\";
s:23:\"mobile-background-image\";
s:41:\"www.dev.example.com/mobile-background.jpg\";
}')
;
I'd prefer to not introduce any dependancies other than sed.

Thanks #John1024. #Fabio and #Seth, I not sure for perfomance, but these code work and without wp-cli:
localdomain=mylittlewordpress.local
maindomain=strongwordpress.site.ru
cat dump.sql | sed 's/;s:/;\ns:/g' | awk -F'"' '/s:.+'$maindomain'/ {sub("'$maindomain'", "'$localdomain'"); n=length($2)-1; sub(/:[[:digit:]]+:/, ":" n ":")} 1' | sed ':a;N;$!ba;s/;\ns:/;s:/g' | sed "s/$maindomain/$localdomain/g" | mysql -u$USER -p$PASS $DBNAME
PHP serialized string exploded by ';s:' to multiline string and awk processed all lines by #John1024 solution.
cat dump.sql | sed 's/;s:/;\ns:/g'
Redirect output to awk
awk -F'"' '/^s:.+'$maindomain'/ {sub("'$maindomain'", "'$localdomain'"); n=length($2)-1; sub(/:[[:digit:]]+:/, ":" n ":")} 1'
After all lines processed, multiline implode to one line (as then exists in original dump.sql). Thanks #Zsolt https://stackoverflow.com/a/1252191
sed ':a;N;$!ba;s/;\ns:/;s:/g'
Addition sed replacement need for any other strings in wordpress database.
sed "s/$maindomain/$localdomain/g"
And load into main server DB
... | mysql -u$USER -p$PASS $DBNAME

Your algorithm involves arithmetic. That makes sed a poor choice. Consider awk instead.
Consider this input file:
$ cat inputfile
something...
s:33:\"http://example.com/background.jpg\";
s:37:\"www.example.com/mobile-background.jpg\";
s:33:\"http://www.example.com/background.jpg\";
more lines...
I believe that this does what you want:
$ awk -F'"' '/:\/\/(www[.])?example.com/ {sub("example.com", "dev.example.com"); n=length($2)-1; sub(/:[[:digit:]]+:/, ":" n ":")} 1' inputfile
something...
s:37:\"http://dev.example.com/background.jpg\";
s:37:\"www.example.com/mobile-background.jpg\";
s:41:\"http://www.dev.example.com/background.jpg\";
more lines...

WP-CLI handles serialized PHP arrays during a search-replace http://wp-cli.org/commands/search-replace/. I wanted to try a native shell solution, but having WP-CLI was worth the extra overhead in the end.

Here is a sample text file you asked for (it's a database export).
Original (https://www.example.com) :
LOCK TABLES `wp_options` WRITE;
INSERT INTO `wp_options` VALUES (1,'siteurl','https://www.example.com','yes'),(18508,'optionsframework','a:48:{s:4:\"logo\";s:75:\"https://www.example.com/wp-content/uploads/2014/04/logo_imbrique_small3.png\";s:7:\"favicon\";s:62:\"https://www.example.com/wp-content/uploads/2017/04/favicon.ico\";}','yes')
/*!40000 ALTER TABLE `wp_options` ENABLE KEYS */;
UNLOCK TABLES;
Result needed (http://example.localhost) :
LOCK TABLES `wp_options` WRITE;
INSERT INTO `wp_options` VALUES (1,'siteurl','http://example.localhost','yes'),(18508,'optionsframework','a:48:{s:4:\"logo\";s:76:\"http://example.localhost/wp-content/uploads/2014/04/logo_imbrique_small3.png\";s:7:\"favicon\";s:64:\"https://example.localhost/wp-content/uploads/2017/04/favicon.ico\";}','yes');
/*!40000 ALTER TABLE `wp_options` ENABLE KEYS */;
UNLOCK TABLES;
As you can see :
there is multiple occurence on the same line
escape characters aren't counted in length number (eg: "/")
some occurence aren't preceded by "s:" length number (no need to replace, it can be done after awk with a simple sed)
Thanks in advance !

#Alexander Demidov's answer is great, here's our implementation for reference
public static function replaceInFile(string $replace, string $replacement, string $absoluteFilePath): void
{
ColorCode::colorCode("Attempting to replace ::\n($replace)\nwith replacement ::\n($replacement)\n in file ::\n(file://$absoluteFilePath)", iColorCode::BACKGROUND_MAGENTA);
$replaceDelimited = preg_quote($replace, '/');
$replacementDelimited = preg_quote($replacement, '/');
$replaceExecutable = CarbonPHP::CARBON_ROOT . 'extras/replaceInFileSerializeSafe.sh';
// #link https://stackoverflow.com/questions/29902647/sed-match-replace-url-and-update-serialized-array-count
$replaceBashCmd = "chmod +x $replaceExecutable && $replaceExecutable '$absoluteFilePath' '$replaceDelimited' '$replace' '$replacementDelimited' '$replacement'";
Background::executeAndCheckStatus($replaceBashCmd);
}
public static function executeAndCheckStatus(string $command, bool $exitOnFailure = true): int
{
$output = [];
$return_var = null;
ColorCode::colorCode('Running CMD >> ' . $command,
iColorCode::BACKGROUND_BLUE);
exec($command, $output, $return_var);
if ($return_var !== 0 && $return_var !== '0') {
ColorCode::colorCode("The command >> $command \n\t returned with a status code (" . $return_var . '). Expecting 0 for success.', iColorCode::RED);
$output = implode(PHP_EOL, $output);
ColorCode::colorCode("Command output::\t $output ", iColorCode::RED);
if ($exitOnFailure) {
exit($return_var);
}
}
return (int) $return_var;
}
#!/usr/bin/env bash
set -e
SQL_FILE="$1"
replaceDelimited="$2"
replace="$3"
replacementDelimited="$4"
replacement="$5"
if ! grep --quiet "$replace" "$SQL_FILE" ;
then
exit 0;
fi
cp "$SQL_FILE" "$SQL_FILE.old.sql"
# #link https://stackoverflow.com/questions/29902647/sed-match-replace-url-and-update-serialized-array-count
# #link https://serverfault.com/questions/1114188/php-serialize-awk-command-speed-up/1114191#1114191
sed 's/;s:/;\ns:/g' "$SQL_FILE" | \
awk -F'"' '/s:.+'$replaceDelimited'/ {sub("'$replace'", "'$replacement'"); n=length($2)-1; sub(/:[[:digit:]]+:/, ":" n ":")} 1' 2>/dev/null | \
sed -e ':a' -e 'N' -e '$!ba' -e 's/;\ns:/;s:/g' | \
sed "s/$replaceDelimited/$replacementDelimited/g" > "$SQL_FILE.replaced.sql"
cp "$SQL_FILE.replaced.sql" "$SQL_FILE"

Related

sed: sed -i "s/secret/$token/g" returning error when value of token have "/" at the end [duplicate]

In my bash script I have an external (received from user) string, which I should use in sed pattern.
REPLACE="<funny characters here>"
sed "s/KEYWORD/$REPLACE/g"
How can I escape the $REPLACE string so it would be safely accepted by sed as a literal replacement?
NOTE: The KEYWORD is a dumb substring with no matches etc. It is not supplied by user.

Warning: This does not consider newlines. For a more in-depth answer, see this SO-question instead. (Thanks, Ed Morton & Niklas Peter)
Note that escaping everything is a bad idea. Sed needs many characters to be escaped to get their special meaning. For example, if you escape a digit in the replacement string, it will turn in to a backreference.
As Ben Blank said, there are only three characters that need to be escaped in the replacement string (escapes themselves, forward slash for end of statement and & for replace all):
ESCAPED_REPLACE=$(printf '%s\n' "$REPLACE" | sed -e 's/[\/&]/\\&/g')
# Now you can use ESCAPED_REPLACE in the original sed statement
sed "s/KEYWORD/$ESCAPED_REPLACE/g"
If you ever need to escape the KEYWORD string, the following is the one you need:
sed -e 's/[]\/$*.^[]/\\&/g'
And can be used by:
KEYWORD="The Keyword You Need";
ESCAPED_KEYWORD=$(printf '%s\n' "$KEYWORD" | sed -e 's/[]\/$*.^[]/\\&/g');
# Now you can use it inside the original sed statement to replace text
sed "s/$ESCAPED_KEYWORD/$ESCAPED_REPLACE/g"
Remember, if you use a character other than / as delimiter, you need replace the slash in the expressions above wih the character you are using. See PeterJCLaw's comment for explanation.
Edited: Due to some corner cases previously not accounted for, the commands above have changed several times. Check the edit history for details.

The sed command allows you to use other characters instead of / as separator:
sed 's#"http://www\.fubar\.com"#URL_FUBAR#g'
The double quotes are not a problem.

The only three literal characters which are treated specially in the replace clause are / (to close the clause), \ (to escape characters, backreference, &c.), and & (to include the match in the replacement). Therefore, all you need to do is escape those three characters:
sed "s/KEYWORD/$(echo $REPLACE | sed -e 's/\\/\\\\/g; s/\//\\\//g; s/&/\\\&/g')/g"
Example:
$ export REPLACE="'\"|\\/><&!"
$ echo fooKEYWORDbar | sed "s/KEYWORD/$(echo $REPLACE | sed -e 's/\\/\\\\/g; s/\//\\\//g; s/&/\\\&/g')/g"
foo'"|\/><&!bar

Based on Pianosaurus's regular expressions, I made a bash function that escapes both keyword and replacement.
function sedeasy {
sed -i "s/$(echo $1 | sed -e 's/\([[\/.*]\|\]\)/\\&/g')/$(echo $2 | sed -e 's/[\/&]/\\&/g')/g" $3
}
Here's how you use it:
sedeasy "include /etc/nginx/conf.d/*" "include /apps/*/conf/nginx.conf" /etc/nginx/nginx.conf

It's a bit late to respond... but there IS a much simpler way to do this. Just change the delimiter (i.e., the character that separates fields). So, instead of s/foo/bar/ you write s|bar|foo.
And, here's the easy way to do this:
sed 's|/\*!50017 DEFINER=`snafu`#`localhost`\*/||g'
The resulting output is devoid of that nasty DEFINER clause.

It turns out you're asking the wrong question. I also asked the wrong question. The reason it's wrong is the beginning of the first sentence: "In my bash script...".
I had the same question & made the same mistake. If you're using bash, you don't need to use sed to do string replacements (and it's much cleaner to use the replace feature built into bash).
Instead of something like, for example:
function escape-all-funny-characters() { UNKNOWN_CODE_THAT_ANSWERS_THE_QUESTION_YOU_ASKED; }
INPUT='some long string with KEYWORD that need replacing KEYWORD.'
A="$(escape-all-funny-characters 'KEYWORD')"
B="$(escape-all-funny-characters '<funny characters here>')"
OUTPUT="$(sed "s/$A/$B/g" <<<"$INPUT")"
you can use bash features exclusively:
INPUT='some long string with KEYWORD that need replacing KEYWORD.'
A='KEYWORD'
B='<funny characters here>'
OUTPUT="${INPUT//"$A"/"$B"}"

Use awk - it is cleaner:
$ awk -v R='//addr:\\file' '{ sub("THIS", R, $0); print $0 }' <<< "http://file:\_THIS_/path/to/a/file\\is\\\a\\ nightmare"
http://file:\_//addr:\file_/path/to/a/file\\is\\\a\\ nightmare

Here is an example of an AWK I used a while ago. It is an AWK that prints new AWKS. AWK and SED being similar it may be a good template.
ls | awk '{ print "awk " "'"'"'" " {print $1,$2,$3} " "'"'"'" " " $1 ".old_ext > " $1 ".new_ext" }' > for_the_birds
It looks excessive, but somehow that combination of quotes works to keep the ' printed as literals. Then if I remember correctly the vaiables are just surrounded with quotes like this: "$1". Try it, let me know how it works with SED.

These are the escape codes that I've found:
* = \x2a
( = \x28
) = \x29
" = \x22
/ = \x2f
\ = \x5c
' = \x27
? = \x3f
% = \x25
^ = \x5e

sed is typically a mess, especially the difference between gnu-sed and bsd-sed
might just be easier to place some sort of sentinel at the sed side, then a quick pipe over to awk, which is far more flexible in accepting any ERE regex, escaped hex, or escaped octals.
e.g. OFS in awk is the true replacement ::
date | sed -E 's/[0-9]+/\xC1\xC0/g' |
mawk NF=NF FS='\xC1\xC0' OFS='\360\237\244\241'
1 Tue Aug 🤡 🤡:🤡:🤡 EDT 🤡
(tested and confirmed working on both BSD-sed and GNU-sed - the emoji isn't a typo that's what those 4 bytes map to in UTF-8 )

There are dozens of answers out there... If you don't mind using a bash function schema, below is a good answer. The objective below was to allow using sed with practically any parameter as a KEYWORD (F_PS_TARGET) or as a REPLACE (F_PS_REPLACE). We tested it in many scenarios and it seems to be pretty safe. The implementation below supports tabs, line breaks and sigle quotes for both KEYWORD and replace REPLACE.
NOTES: The idea here is to use sed to escape entries for another sed command.
CODE
F_REVERSE_STRING_R=""
f_reverse_string() {
: 'Do a string reverse.
To undo just use a reversed string as STRING_INPUT.
Args:
STRING_INPUT (str): String input.
Returns:
F_REVERSE_STRING_R (str): The modified string.
'
local STRING_INPUT=$1
F_REVERSE_STRING_R=$(echo "x${STRING_INPUT}x" | tac | rev)
F_REVERSE_STRING_R=${F_REVERSE_STRING_R%?}
F_REVERSE_STRING_R=${F_REVERSE_STRING_R#?}
}
# [Ref(s).: https://stackoverflow.com/a/2705678/3223785 ]
F_POWER_SED_ECP_R=""
f_power_sed_ecp() {
: 'Escape strings for the "sed" command.
Escaped characters will be processed as is (e.g. /n, /t ...).
Args:
F_PSE_VAL_TO_ECP (str): Value to be escaped.
F_PSE_ECP_TYPE (int): 0 - For the TARGET value; 1 - For the REPLACE value.
Returns:
F_POWER_SED_ECP_R (str): Escaped value.
'
local F_PSE_VAL_TO_ECP=$1
local F_PSE_ECP_TYPE=$2
# NOTE: Operational characters of "sed" will be escaped, as well as single quotes.
# By Questor
if [ ${F_PSE_ECP_TYPE} -eq 0 ] ; then
# NOTE: For the TARGET value. By Questor
F_POWER_SED_ECP_R=$(echo "x${F_PSE_VAL_TO_ECP}x" | sed 's/[]\/$*.^[]/\\&/g' | sed "s/'/\\\x27/g" | sed ':a;N;$!ba;s/\n/\\n/g')
else
# NOTE: For the REPLACE value. By Questor
F_POWER_SED_ECP_R=$(echo "x${F_PSE_VAL_TO_ECP}x" | sed 's/[\/&]/\\&/g' | sed "s/'/\\\x27/g" | sed ':a;N;$!ba;s/\n/\\n/g')
fi
F_POWER_SED_ECP_R=${F_POWER_SED_ECP_R%?}
F_POWER_SED_ECP_R=${F_POWER_SED_ECP_R#?}
}
# [Ref(s).: https://stackoverflow.com/a/24134488/3223785 ,
# https://stackoverflow.com/a/21740695/3223785 ,
# https://unix.stackexchange.com/a/655558/61742 ,
# https://stackoverflow.com/a/11461628/3223785 ,
# https://stackoverflow.com/a/45151986/3223785 ,
# https://linuxaria.com/pills/tac-and-rev-to-see-files-in-reverse-order ,
# https://unix.stackexchange.com/a/631355/61742 ]
F_POWER_SED_R=""
f_power_sed() {
: 'Facilitate the use of the "sed" command. Replaces in files and strings.
Args:
F_PS_TARGET (str): Value to be replaced by the value of F_PS_REPLACE.
F_PS_REPLACE (str): Value that will replace F_PS_TARGET.
F_PS_FILE (Optional[str]): File in which the replacement will be made.
F_PS_SOURCE (Optional[str]): String to be manipulated in case "F_PS_FILE" was
not informed.
F_PS_NTH_OCCUR (Optional[int]): [1~n] - Replace the nth match; [n~-1] - Replace
the last nth match; 0 - Replace every match; Default 1.
Returns:
F_POWER_SED_R (str): Return the result if "F_PS_FILE" is not informed.
'
local F_PS_TARGET=$1
local F_PS_REPLACE=$2
local F_PS_FILE=$3
local F_PS_SOURCE=$4
local F_PS_NTH_OCCUR=$5
if [ -z "$F_PS_NTH_OCCUR" ] ; then
F_PS_NTH_OCCUR=1
fi
local F_PS_REVERSE_MODE=0
if [ ${F_PS_NTH_OCCUR} -lt -1 ] ; then
F_PS_REVERSE_MODE=1
f_reverse_string "$F_PS_TARGET"
F_PS_TARGET="$F_REVERSE_STRING_R"
f_reverse_string "$F_PS_REPLACE"
F_PS_REPLACE="$F_REVERSE_STRING_R"
f_reverse_string "$F_PS_SOURCE"
F_PS_SOURCE="$F_REVERSE_STRING_R"
F_PS_NTH_OCCUR=$((-F_PS_NTH_OCCUR))
fi
f_power_sed_ecp "$F_PS_TARGET" 0
F_PS_TARGET=$F_POWER_SED_ECP_R
f_power_sed_ecp "$F_PS_REPLACE" 1
F_PS_REPLACE=$F_POWER_SED_ECP_R
local F_PS_SED_RPL=""
if [ ${F_PS_NTH_OCCUR} -eq -1 ] ; then
# NOTE: We kept this option because it performs better when we only need to replace
# the last occurrence. By Questor
# [Ref(s).: https://linuxhint.com/use-sed-replace-last-occurrence/ ,
# https://unix.stackexchange.com/a/713866/61742 ]
F_PS_SED_RPL="'s/\(.*\)$F_PS_TARGET/\1$F_PS_REPLACE/'"
elif [ ${F_PS_NTH_OCCUR} -gt 0 ] ; then
# [Ref(s).: https://unix.stackexchange.com/a/587924/61742 ]
F_PS_SED_RPL="'s/$F_PS_TARGET/$F_PS_REPLACE/$F_PS_NTH_OCCUR'"
elif [ ${F_PS_NTH_OCCUR} -eq 0 ] ; then
F_PS_SED_RPL="'s/$F_PS_TARGET/$F_PS_REPLACE/g'"
fi
# NOTE: As the "sed" commands below always process literal values for the "F_PS_TARGET"
# so we use the "-z" flag in case it has multiple lines. By Quaestor
# [Ref(s).: https://unix.stackexchange.com/a/525524/61742 ]
if [ -z "$F_PS_FILE" ] ; then
F_POWER_SED_R=$(echo "x${F_PS_SOURCE}x" | eval "sed -z $F_PS_SED_RPL")
F_POWER_SED_R=${F_POWER_SED_R%?}
F_POWER_SED_R=${F_POWER_SED_R#?}
if [ ${F_PS_REVERSE_MODE} -eq 1 ] ; then
f_reverse_string "$F_POWER_SED_R"
F_POWER_SED_R="$F_REVERSE_STRING_R"
fi
else
if [ ${F_PS_REVERSE_MODE} -eq 0 ] ; then
eval "sed -i -z $F_PS_SED_RPL \"$F_PS_FILE\""
else
tac "$F_PS_FILE" | rev | eval "sed -z $F_PS_SED_RPL" | tac | rev > "$F_PS_FILE"
fi
fi
}
MODEL
f_power_sed "F_PS_TARGET" "F_PS_REPLACE" "" "F_PS_SOURCE"
echo "$F_POWER_SED_R"
EXAMPLE
f_power_sed "{ gsub(/,[ ]+|$/,\"\0\"); print }' ./ and eliminate" "[ ]+|$/,\"\0\"" "" "Great answer (+1). If you change your awk to awk '{ gsub(/,[ ]+|$/,\"\0\"); print }' ./ and eliminate that concatenation of the final \", \" then you don't have to go through the gymnastics on eliminating the final record. So: readarray -td '' a < <(awk '{ gsub(/,[ ]+/,\"\0\"); print; }' <<<\"$string\") on Bash that supports readarray. Note your method is Bash 4.4+ I think because of the -d in readar"
echo "$F_POWER_SED_R"
IF YOU JUST WANT TO ESCAPE THE PARAMETERS TO THE SED COMMAND
MODEL
# "TARGET" value.
f_power_sed_ecp "F_PSE_VAL_TO_ECP" 0
echo "$F_POWER_SED_ECP_R"
# "REPLACE" value.
f_power_sed_ecp "F_PSE_VAL_TO_ECP" 1
echo "$F_POWER_SED_ECP_R"
IMPORTANT: If the strings for KEYWORD and/or replace REPLACE contain tabs or line breaks you will need to use the "-z" flag in your "sed" command. More details here.
EXAMPLE
f_power_sed_ecp "{ gsub(/,[ ]+|$/,\"\0\"); print }' ./ and eliminate" 0
echo "$F_POWER_SED_ECP_R"
f_power_sed_ecp "[ ]+|$/,\"\0\"" 1
echo "$F_POWER_SED_ECP_R"
NOTE: The f_power_sed_ecp and f_power_sed functions above was made available completely free as part of this project ez_i - Create shell script installers easily!.

Standard recommendation here: use perl :)
echo KEYWORD > /tmp/test
REPLACE="<funny characters here>"
perl -pi.bck -e "s/KEYWORD/${REPLACE}/g" /tmp/test
cat /tmp/test

don't forget all the pleasure that occur with the shell limitation around " and '
so (in ksh)
Var=">New version of \"content' here <"
printf "%s" "${Var}" | sed "s/[&\/\\\\*\\"']/\\&/g' | read -r EscVar
echo "Here is your \"text\" to change" | sed "s/text/${EscVar}/g"

If the case happens to be that you are generating a random password to pass to sed replace pattern, then you choose to be careful about which set of characters in the random string. If you choose a password made by encoding a value as base64, then there is is only character that is both possible in base64 and is also a special character in sed replace pattern. That character is "/", and is easily removed from the password you are generating:
# password 32 characters log, minus any copies of the "/" character.
pass=`openssl rand -base64 32 | sed -e 's/\///g'`;

If you are just looking to replace Variable value in sed command then just remove
Example:
sed -i 's/dev-/dev-$ENV/g' test to sed -i s/dev-/dev-$ENV/g test

I have an improvement over the sedeasy function, which WILL break with special characters like tab.
function sedeasy_improved {
sed -i "s/$(
echo "$1" | sed -e 's/\([[\/.*]\|\]\)/\\&/g'
| sed -e 's:\t:\\t:g'
)/$(
echo "$2" | sed -e 's/[\/&]/\\&/g'
| sed -e 's:\t:\\t:g'
)/g" "$3"
}
So, whats different? $1 and $2 wrapped in quotes to avoid shell expansions and preserve tabs or double spaces.
Additional piping | sed -e 's:\t:\\t:g' (I like : as token) which transforms a tab in \t.

An easier way to do this is simply building the string before hand and using it as a parameter for sed
rpstring="s/KEYWORD/$REPLACE/g"
sed -i $rpstring test.txt

Return an array/map from awk to bash

An awk command creates an array which I want to return to bash.
GIT=($(history | grep -c git | awk '{ TMP[4]=$1 } END { print TMP }'))
and getting an error attempt to use array TMP in a scalar context.
So I try inside an awk use already existed array
GIT=()
history | grep -c git | awk 'END { GIT[4]=$1 }'
echo "${GIT[#]}" # empty result
but the result GIT arrary is empty.
The example which I presented doesn't make a lot of sense eg. why 4 in an index? It's just an example. I need use array/map created by awk where the key is an integer. And I need these original indexes, because not all keys are in sequence. I have 1,2,4,7,8 and so on.

You can have awk emit shell commands and then the awk output can be sourced
GIT=()
source <(
history |
grep -c git |
awk -v shellVarname="GIT" '
{tmp[4] = $1}
END {
for (idx in tmp)
printf "%s[%d]=\"%s\"\n", shellVarname, idx, tmp[idx]
}
'
)
declare -p GIT # just to dump the array contents for review

Bash array values as variables

Is it possible to use array values as variables?
For example, i have this script:
#!/bin/bash
SOURCE=$(curl -k -s $1 | sed 's/{//g;s/}//g;s/,/"\n"/g;s/:/=/g;s/"//g' | awk -F"=" '{ print $1 }')
JSON=$(curl -k -s $1 | sed 's/{//g;s/}//g;s/,/"\n"/g;s/:/=/g;s/"//g' | awk -F"=" '{ print $NF }')
data=$2
readarray -t prot_array <<< "$SOURCE"
readarray -t pos_array <<< "$JSON"
for ((i=0; i<${#prot_array[#]}; i++)); do
echo "${prot_array[i]}" "${pos_array[i]}" | sed 's/NOK/0/g;s/OK/1/g' | grep $2 | awk -F' ' '{ print $2,$3,$4 }'
done
EDIT:
I just added: grep $2 | awk -F' ' '{ print $2,$3,$4 }'
Usage:
./json.sh URL
Sample (very short) output:
DATABASE 1
STATUS 1
I don't want to echo out all the lines, i would like to use DATABASE STATUS as variable $DATABASE and echo that out.
I just need DATABASE (or any other) value from command line.
Is it somehow possible to use something like this?
./json.sh URL $DATABASE
Happy to explain more if needed.
EDIT:
curl output without any formattings etc:
{
"VERSION":"R3.1",
"STATUS":"OK",
"DATABASES":{
"READING":"OK"
},
"TIMESTAMP":"2017-03-08-16-20-35"
}
Output using script:
VERSION R3.1
STATUS 1
DATABASES 1
TIMESTAMP 2017-03-08-16-21-54
What i want is described before. For example use DATABASE as varible $DATABASE and somehow get the value "1"
EDIT:
Random json from uconn.edu
./json.sh https://github.uconn.edu/raw/nam12023/novaLauncher/master/manifest.json
Another:
./json.sh https://gitlab.uwe.ac.uk/dc2-roskilly/angular-qs/raw/master/.npm/nan/2.4.0/package/package.json
Last output begins with:
name nan
version 2.4.0
From command line: ./json.sh URL version
At leats it works for me.

I think you want to use jq something like this:
$ curl -k -s "$1" | jq --arg d DATABASES -r '
"VERSION \(.VERSION)",
"STATUS \(if .STATUS == "OK" then 1 else 0 end)",
"DATABASES \(if .[$d].READING == "OK" then 1 else 0 end)",
"TIMESTAMP \(.TIMESTAMP)"
'
VERSION R3.1
STATUS 1
DATABASES 1
TIMESTAMP 2017-03-08-16-20-35
(I'm probably missing a simpler way to convert a boolean value to an integer.)
Quick explanation:
The ,-separated strings each become a separate output line.
The -r option outputs a raw string, rather than a JSON string value.
The name of the database field is passed using the --arg option.
\(...) is jq's interpolation operator; the contents are evaluated as a JSON expression and the result is inserted into the string.

How do i echo specific rows and columns from csv's in a variable?

The below script:
#!/bin/bash
otscurrent="
AAA,33854,4528,38382,12
BBB,83917,12296,96213,13
CCC,20399,5396,25795,21
DDD,27198,4884,32082,15
EEE,2472,981,3453,28
FFF,3207,851,4058,21
GGG,30621,4595,35216,13
HHH,8450,1504,9954,15
III,4963,2157,7120,30
JJJ,51,59,110,54
KKK,87,123,210,59
LLL,573,144,717,20
MMM,617,1841,2458,75
NNN,234,76,310,25
OOO,12433,1908,14341,13
PPP,10627,1428,12055,12
QQQ,510,514,1024,50
RRR,1361,687,2048,34
SSS,1,24,25,96
TTT,0,5,5,100
UUU,294,1606,1900,85
"
IFS="," array1=(${otscurrent})
echo ${array1[4]}
Prints:
$ ./test.sh
12
BBB
I'm trying to get it to just print 12... And I am not even sure how to make it just print row 5 column 4
The variable is an output of a sqlquery that has been parsed with several sed commands to change the formatting to csv.
otscurrent="$(sqlplus64 user/password#dbserverip/db as sysdba #query.sql |
sed '1,11d; /^-/d; s/[[:space:]]\{1,\}/,/g; $d' |
sed '$d'|sed '$d'|sed '$d' | sed '$d' |
sed 's/Used,MB/Used MB/g' |
sed 's/Free,MB/Free MB/g' |
sed 's/Total,MB/Total MB/g' |
sed 's/Pct.,Free/Pct. Free/g' |
sed '1b;/^Name/d' |
sed '/^$/d'
)"
Ultimately I would like to be able to call on a row and column and run statements on the values.
Initially i was piping that into :
awk -F "," 'NR>1{ if($5 < 10) { printf "%-30s%-10s%-10s%-10s%-10s\n", $1,$2,$3,$4,$5"%"; } else { echo "Nothing to do" } }')"
Which works but I couldn't run commands from if else ... or atleaste I didn't know how.

If you have bash 4.0 or newer, an associative array is an appropriate way to store data in this kind of form.
otscurrent=${otscurrent#$'\n'} # strip leading newline present in your sample data
declare -A data=( )
row=0
while IFS=, read -r -a line; do
for idx in "${!line[#]}"; do
data["$row,$idx"]=${line[$idx]}
done
(( row += 1 ))
done <<<"$otscurrent"
This lets you access each individual item:
echo "${data[0,0]}" # first field of first line
echo "${data[9,0]}" # first field of tenth line
echo "${data[9,1]}" # second field of tenth line

"I'm trying to get it to just print 12..."
The issue is that IFS="," splits on commas and there is no comma between 12 and BBB. If you want those to be separate elements, add a newline to IFS. Thus, replace:
IFS="," array1=(${otscurrent})
With:
IFS=$',\n' array1=(${otscurrent})
Output:
$ bash test.sh
12

All you need to print the value of the 4th column on the 5th row is:
$ awk -F, 'NR==5{print $4}' <<< "$otscurrent"
3453
and just remember that in awk row (record) and column (field) numbers start at 1, not 0. Some more examples:
$ awk -F, 'NR==1{print $5}' <<< "$otscurrent"
12
$ awk -F, 'NR==2{print $1}' <<< "$otscurrent"
BBB
$ awk -F, '$5 > 50' <<< "$otscurrent"
JJJ,51,59,110,54
KKK,87,123,210,59
MMM,617,1841,2458,75
SSS,1,24,25,96
TTT,0,5,5,100
UUU,294,1606,1900,85
If you'd like to avoid all of the complexity and simply parse your SQL output to produce what you want without 20 sed commands in between, post a new question showing the raw sqlplus output as the input and what you want finally output and someone will post a brief, clear, simple, efficient awk script to do it all at one time, or maybe 2 commands if you still want an intermediate CSV for some reason.

How can I use sed to make thousands of substitutions in a file using a reference file?

I have a big file with two columns like this:
tiago#tiago:~/$ head Ids.txt
TRINITY_DN126999_c0_g1_i1 ENSMUST00000040656.6
TRINITY_DN126999_c0_g1_i1 ENSMUST00000040656.6
TRINITY_DN126906_c0_g1_i1 ENSMUST00000126770.1
TRINITY_DN126907_c0_g1_i1 ENSMUST00000192613.1
TRINITY_DN126988_c0_g1_i1 ENSMUST00000032372.6
.....
and I have another file with data, like this:
"baseMean" "log2FoldChange" "lfcSE" "stat" "pvalue" "padj" "super" "sub" "threshold"
"TRINITY_DN41319_c0_g1" 178.721774751278 2.1974294626636 0.342621318593487 6.41358066008381 1.4214085388179e-10 5.54686423073089e-08 TRUE FALSE "TRUE"
"TRINITY_DN87368_c0_g1" 4172.76139849472 2.45766387851112 0.404014016558211 6.08311538160958 1.17869459181235e-09 4.02673069375893e-07 TRUE FALSE "TRUE"
"TRINITY_DN34622_c0_g1" 39.1949851245197 3.28758092748061 0.54255370348027 6.05945716781964 1.3658169042862e-09 4.62597265729593e-07 TRUE FALSE "TRUE"
.....
I was thinking of using sed to perform a translation of the values in the first column of the data file, using the first file as a dictionary.
That is, considering each line of the data file in turn, if the value in the first column matches a value in the first column of the dictionary file, then a substitution would be be made; otherwise, the line would simply be printed.
Any suggestions would be appreciated.

You can turn your first file Ids.txt into a sed script:
$ sed -r 's| *(\S+) (\S+)|s/^"\1/"\2/|' Ids.txt > repl.sed
$ cat repl.sed
s/^"TRINITY_DN126999_c0_g1_i1/"ENSMUST00000040656.6/
s/^"TRINITY_DN126999_c0_g1_i1/"ENSMUST00000040656.6/
s/^"TRINITY_DN126906_c0_g1_i1/"ENSMUST00000126770.1/
s/^"TRINITY_DN126907_c0_g1_i1/"ENSMUST00000192613.1/
s/^"TRINITY_DN126988_c0_g1_i1/"ENSMUST00000032372.6/
This removes leading spaces and makes each line into a substitution command.
Then you can use this script to do the replacements in your data file:
sed -f repl.sed datafile
... with redirection to another file, or in-place with sed -i.
If you don't have GNU sed, you can use this POSIX conformant version of the first command:
sed 's| *\([^ ]*\) \([^ ]*\)|s/^"\1/"\2/|' Ids.txt
This uses basic instead of extended regular expressions and uses [^ ] for "not space" instead of \S.

Since the first file (the dictionary file) is large, using sed may be very slow; a much faster and not much more complex approach would be to use awk as follows:
awk -v col=1 -v dict=Ids.txt '
BEGIN {while(getline<dict){a["\""$1"\""]="\""$2"\""} }
$col in a {$col=a[$col]}; {print}'
(Here, "Ids.txt" is the dictionary file, and "col" is the column number of the field of interest in the data file.)
This approach also has the advantage of not requiring any modification to the dictionary file.

#!/bin/bash
# Declare hash table
declare -A Ids
# Go though first input file and add key-value pairs to hash table
while read Id; do
key=$(echo $Id | cut -d " " -f1)
value=$(echo $Id | cut -d " " -f2)
Ids+=([$key]=$value)
done < $1
# Go through second input file and replace every first column with
# the corresponding value in the hash table if it exists
while read line; do
first_col=$(echo $line | cut -d '"' -f2)
new_id=${Ids[$first_col]}
if [ -n "$new_id" ]; then
sed -i s/$first_col/$new_id/g $2
fi
done < $2
I would call the script as
./script.sh Ids.txt data.txt

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight

SED Match/Replace URL and Update Serialized Array Count - arrays

WP-CLI handles serialized PHP arrays during a search-replace http://wp-cli.org/commands/search-replace/. I wanted to try a native shell solution, but having WP-CLI was worth the extra overhead in the end.

Related

sed: sed -i "s/secret/$token/g" returning error when value of token have "/" at the end [duplicate]

Return an array/map from awk to bash

Bash array values as variables

How do i echo specific rows and columns from csv's in a variable?

How can I use sed to make thousands of substitutions in a file using a reference file?

Categories

Resources