String substitution using tcl API - c

Is there a way to (ab)use the tcl C-API to 'parse' a string, doing all the replacement (including sub commands in square brackets), but stopping before actually evaluating the resulting command line?
What I'm trying to do is create a command (in C, but I'll consider doing a tcl-wrapper, if there's an elegant way to do it there) which takes a block as a parameter (i.e. curly-braces-quoted-string). I'd like to take that block, split it up and perform substitutions in the same way as if it was to be executed, but stop there and interpret the resulting lines instead.
I've considered creating a namespace, where all valid first-words are defined as commands, however this list is so vast (and pretty much dynamic) so it quickly becomes too cumbersome. I also tried this approach but with the unknown command to intercept the different commands. However, unknown is used for a bunch of stuff, and cannot be bound to a namespace, so I'd have to define it whenever I execute the block, and set it back to whatever it was before when I'm done, which feels pretty shaky. On top of that I'd run the risk (fairly low risk, but not zero) of colliding with an actual command, so I'd very much prefer to not use the unknown command.
The closest I can get is Tcl_ParseCommand (and the rest of the family), which produces a parse tree, which I could manually evaluate. I guess I'll resort to doing it this way if there's no better solution, but I would of course prefer it, if there was an 'official' way..
Am I missing something?

Take a look at Tcl_SubstObj. It's the C equivalent of the [subst] command, which appears to be what you're looking for.
As you indicated in your comment, subst doesn't quite do what you're looking to do. If it helps, the following Tcl code may be what you're looking for:
> set mydata {mylist item $listitem group item {$group item}}
> set listitem {1 2 3}
> subst $mydata ;# error: can't read "group": no such variable
> proc groupsubst {data} {
    return [uplevel 1 list $data]
}
> groupsubst $mydata ;# mylist item {1 2 3} group item {$group item}


Bash - loop through array of objects and combine them

I'm trying to create a for-loop to go through all the items from an array, and add the items to a string. The tags are given as a single string with format "tag1 tag2 tag3", and the tagging parameter can be given as many times as I want in a single command, with syntax "-tag tag1 -tag tag2 -tag tag3". I'm unable to create a for loop for the job, and I'm a little confused about what is wrong with my code.
TAGS="asd fgh jkl zxc bnm" # Amount of tags varies, but there is always at least one
ARRAY=($TAGS)
TAGSTOBEADDED=""
for i in "$ARRAY[@]"
do
STRINGTOBEADDED="-tag ${ARRAY[$i]}"
$TAGSTOBEADDED=$TAGSTOBEADDED+$STRINGTOBEADDED
done
command $TAGSTOBEADDED
First, your array syntax is wrong, as @oguz ismail said. To iterate through the array items you should use this:
for i in "${ARRAY[@]}"; do echo "$i"; done
Second, $TAGSTOBEADDED=$TAGSTOBEADDED+$STRINGTOBEADDED also fails.
Variables are set like so: var="$var 123". You don't need the $ in front of the variable name when you assign to it. Back to the code: in this example you don't even need an array, just use the TAGS variable (unquoted, so that it is split into words):
for i in $TAGS; do TAGSTOBEADDED+=" -tag $i"; done
First: avoid storing lists of things in space-delimited strings (as you're currently doing with TAGS and TAGSTOBEADDED) -- there are a bunch of things that can go wrong if they have any "funny" characters (or if IFS gets changed). Use an array instead. Storing them as a string and then converting doesn't help; all of the same potential problems apply during the conversion.
I also recommend using lower- or mixed-case variable names in scripts, since there are a bunch of all-caps names with special meanings, and accidentally using one of those for something else can have weird effects. So, to define the array of tags, I'd just use this:
tags=(asd fgh jkl zxc bnm)
You also have a number of syntax errors in the script. In this line:
for i in "$ARRAY[@]"
... the shell will try to expand $ARRAY as a plain variable (not an array), and then treat "[@]" as just some unrelated characters that go after it. You need braces around the variable reference (like "${ARRAY[@]}") any time you're doing anything nontrivial with a variable reference. BTW, this idiom -- including double-quotes, braces, square-brackets and at-sign -- is what you almost always want when getting the contents of an array.
In this line:
STRINGTOBEADDED="-tag ${ARRAY[$i]}"
$i will expand to one of the array elements, not its index. That is, it'll expand to something like:
STRINGTOBEADDED="-tag ${ARRAY[asd]}"
...which doesn't make any sense. You just want
STRINGTOBEADDED="-tag $i"
...except you don't want that either, because (as I said before) storing lists of things space-delimited in a string is a bad idea. But I'll get to that because fixing it will involve the next line:
$TAGSTOBEADDED=$TAGSTOBEADDED+$STRINGTOBEADDED
There are two problems here: you don't want a dollar sign on the variable being assigned to ($varname gets the value of a variable; anytime you're setting it, don't use the $). Also the + isn't needed to add strings, you just stick them end to end. Well, you'd need to add a space in between, something like one of these:
TAGSTOBEADDED=$TAGSTOBEADDED" "$STRINGTOBEADDED
TAGSTOBEADDED="$TAGSTOBEADDED $STRINGTOBEADDED"
(Generally, you should have double-quotes around all variable references; on the right side of a plain assignment is one of the few places it's safe to leave them unquoted, but I tend to prefer to just double-quote always rather than try to remember all of the exceptions about where it's safe and where it isn't. Plus, quoting just the space looks weird.)
But you don't want to do that either, because (again) space-delimited strings are a bad way to do things. Use an array. So before the loop, create an empty array instead of an empty string:
tagstobeadded=()
...and then inside the loop, append to it with +=( ):
tagstobeadded+=(-tag "$i")
...and then at the end, use it with all the appropriate quotes, braces, etc:
command "${tagstobeadded[@]}"
So, with all of these changes, here's what I'd recommend:
tags=(asd fgh jkl zxc bnm)
tagstobeadded=()
for i in "${tags[@]}"
do
    tagstobeadded+=(-tag "$i")
done
command "${tagstobeadded[@]}"

Reversing shell-style brace expansion

Brace expansion takes a pattern and expands it. For example:
sp{el,il,al}l
Expands to:
spell spill spall
Is there an algorithm (potentially with a JavaScript implementation) to do the reverse in a way that minimizes the constructed string?
i.e., take in an array [spell spill spall] and return a string "sp{e,i,a}ll"
Minimizing the resulting string can be done in many different ways, but since you mention Bash, I'll choose the Bash way which is not the most optimized one.
Yes, there is a Bash way! Bash creators have included it as the readline command complete-into-braces. When using Bash interactively, if you hit Meta{ (which is either Alt{ or Esc-then-{ on my machine), all possible completions are grouped into one single brace expansion.
$ echo /usr/
bin/ games/ include/ lib/ local/ sbin/ share/ src/
$ echo /usr/{bin,games,include,l{ib,ocal},s{bin,hare,rc}}
Above, the first time I hit Tab to show all possible completions, and the second time I hit Alt{.
Back to your question: you are looking for an algorithm. Obviously you may find something in Bash source code. The function you are looking for is really_munge_braces() in bracecomp.c
As requested in the original question, node-brace-compression contains a JavaScript implementation. E.g.
var compress = require('brace-compression');
var data = [
  'foo-1',
  'foo-2',
  'foo-3'
];
console.log(compress(data));
// => "foo-{1..3}"
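One way to approximate what Bash's completion does is to factor out shared prefixes recursively. The following is a minimal Python sketch under that assumption. It is not a port of really_munge_braces(), the function name compress is made up for illustration, and it only factors prefixes, so [spell spill spall] becomes sp{all,ell,ill} rather than the shorter sp{e,i,a}ll. It also assumes the words contain no braces or commas.

```python
import os


def compress(words):
    """Fold a list of words into one brace expression by factoring shared prefixes."""
    words = sorted(set(words))
    if not words:
        return ""
    if len(words) == 1:
        return words[0]

    # Longest prefix shared by every word (character-wise comparison).
    prefix = os.path.commonprefix(words)
    tails = [w[len(prefix):] for w in words]

    # Group the remaining tails by their first character; an empty tail
    # (a word equal to the prefix) gets its own group under the key "".
    groups = {}
    for tail in tails:
        groups.setdefault(tail[:1], []).append(tail)

    # Each group is compressed recursively; its members share at least one
    # leading character, so the recursion always makes progress.
    parts = ["" if key == "" else compress(members)
             for key, members in groups.items()]
    return prefix + "{" + ",".join(parts) + "}"


print(compress(["/usr/bin", "/usr/games", "/usr/include", "/usr/lib",
                "/usr/local", "/usr/sbin", "/usr/share", "/usr/src"]))
# prints /usr/{bin,games,include,l{ib,ocal},s{bin,hare,rc}}
```

Factoring common suffixes as well (to reach sp{a,e,i}ll) or detecting numeric ranges like {1..3} would need extra passes that this sketch leaves out.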

SPSS loop ROC analysis for lots of variables

In SPSS, I would like to perform ROC analysis for lots of variables (989). The problem, when selecting all variables, it gives me the AUC values and the curves, but a case is immediately excluded if it has one missing value within any of the 989 variables. So, I was thinking of having a single-variable ROC analysis put into loop. But I don't have any idea how to do so. I already named all the variables var1, var2, var3, ..., var988, var989.
So, how could I loop a ROC analysis? (Checking "Treat user-missing values as valid" doesn't do the trick)
Thanks!
This sounds like a job for Python. It's usually the best solution for this sort of job in SPSS.
So here's a framework that might help you. I am woefully unfamiliar with ROC analysis, but this general pattern is applicable to all kinds of looping scenarios:
begin program.
import spss
for i in range(spss.GetVariableCount()):
    var = spss.GetVariableName(i)
    cmd = r'''
* Your variable-wise analysis goes here, written in SPSS syntax.
* No indentation is needed between the triple quotes.
* As an example, this runs descriptives and frequencies for each variable.
descriptives %(var)s
  /sta mean stddev min max.
fre %(var)s.
'''%locals()
    spss.Submit(cmd)
end program.
Just to quickly go over what this does: the for loop tells SPSS to repeat the following once for every variable in the active dataset, 989 times in your case. Inside the loop we define a (Python) variable named var, which contains the name of the variable at index i (0 to 988, the first variable in the dataset having index 0). Then we define a command for SPSS to execute. I like to put it in raw strings because that simplifies things like giving directories. A raw string starts with r''' and ends at the closing '''. spss.Submit(cmd) gives the command defined after "cmd = " to SPSS for execution. Most importantly though, wherever the name of the variable would appear in your syntax, substitute it with "%(var)s".
If you put "set mprint on." a line above "begin program.", you'll see exactly what it does in the viewer.
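To make the substitution step concrete, here is a small sketch that only builds the syntax strings, so it can be tried outside SPSS. The helper name build_commands, the state variable outcome and its positive value 1 are assumptions for illustration, and the ROC subcommands shown are my best understanding of the ROC command rather than something taken from the question, so check them against the Command Syntax Reference.

```python
def build_commands(var_names, template):
    """Return one SPSS syntax string per variable by filling the %(var)s placeholder."""
    return [template % {"var": name} for name in var_names]


# The question says the variables are named var1 ... var989.
var_names = ["var%d" % i for i in range(1, 990)]

# Assumed template: "outcome" is a hypothetical state variable whose
# positive cases are coded 1. Adjust the subcommands to your own analysis.
roc_template = "ROC %(var)s BY outcome (1)\n  /PLOT=CURVE\n  /PRINT=SE."

commands = build_commands(var_names, roc_template)
print(commands[0])
```

Inside a begin program block, each string in commands could then be passed to spss.Submit, so that every ROC run only drops the cases that are missing on its own test variable.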

SPSS: Use index variable inside quotation marks

I have several datasets over which I want to run identical commands.
My basic idea is to create a vector with the names of the datasets and loop over it, using the specified name in my GET command:
VECTOR=(9) D = Name1 to Name9.
LOOP #i = 1 to 9.
GET
FILE = Directory\D(#i).sav
VALUE LABELS V1 to V8 'some text D(#i)'
LOOP END.
Now SPSS doesn't recognize that I want it to use the specific value of the vector D.
In Stata I'd use
local D(V1 to V8)
foreach D{
....`D' .....
}
You can't use VECTOR in this way, i.e. you can't use the GET command within a VECTOR/LOOP loop.
However, you can use DEFINE/!ENDDEFINE. This is SPSS's native macro facility. If you are not familiar with it, you'll most likely need to do a lot of reading to understand its syntax and usage.
Here's an example:
DEFINE !RunJob ()
!DO !i = 1 !TO 9
GET FILE = !QUOTE(!CONCAT("Directory\D(", !i, ").sav")).
VALUE LABELS V1 to V8 !QUOTE(!CONCAT("some text D(", !i, ")")).
!DOEND
!ENDDEFINE.
SET MPRINT ON.
!RunJob.
SET MPRINT OFF.
All the code between DEFINE and !ENDDEFINE is the body of the macro, and the !RunJob. call near the end then executes the commands defined in the macro.
This is a very simple use of a macro with no parameters/arguments assigned, but there is scope for much more complexity.
If you are new to DEFINE/!ENDDEFINE, I would actually suggest you NOT invest time in learning it, but instead learn Python programmability, which can achieve the same (and much more) with relative ease compared to DEFINE/!ENDDEFINE.
A python solution to your example would look like this (you will need Python Programmability integration with your SPSS):
BEGIN PROGRAM.
import spss
for i in range(1, 9 + 1):
    spss.Submit(r"""
GET FILE = 'Directory\D(%(i)s).sav'.
VALUE LABELS V1 to V8 'some text D(%(i)s)'.""" % locals())
END PROGRAM.
As you will notice there is much more simplicity to the python solution.
@Caspar: use Python for SPSS for such jobs. SPSS macros have been long deprecated and had better be avoided.
If you use Python for this, you don't even have to type in the file names: you can simply look up all file names in some folder that end with ".sav" as shown in this example.
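As a rough sketch of that lookup (the helper names sav_files and get_file_syntax and the folder C:/data are made up for illustration), the standard glob module is enough to collect the files and build one GET FILE command per file:

```python
import glob
import os


def sav_files(folder):
    """Return the sorted paths of all .sav files directly inside folder."""
    return sorted(glob.glob(os.path.join(folder, "*.sav")))


def get_file_syntax(path):
    """Build a GET FILE command for one data file (the path is quoted for SPSS)."""
    return "GET FILE = '%s'." % path


# Hypothetical folder; nothing is printed if it holds no .sav files.
for path in sav_files("C:/data"):
    print(get_file_syntax(path))
```

Inside a BEGIN PROGRAM block, the print call would be replaced by spss.Submit with the same string, followed by whatever commands should run on each file. Paths containing a single quote would need extra escaping, which this sketch does not handle.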
HTH!
The Python approach is as Ruben says much superior to the old macro facility, but you can use the SPSSINC PROCESS FILES extension command to do tasks like this without any need to know Python. PROCESS FILES is included in the Python Essentials in recent versions of Statistics but can be downloaded from the SPSS Community website (www.ibm.com/developerworks/spssdevcentral) in older versions.
The idea is that you create a syntax file that works on one data file, and PROCESS FILES iterates that over a list of input files or a wildcard specification. For each file, it defines file handles and macros that you can use in the syntax file to open and process the data.

Using exec on each file in a bash script

I'm trying to write a basic find command for an assignment (without using find). Right now I have an array of files I want to exec something on. The syntax would look like this:
-exec /bin/mv {} ~/.TRASH
I have an array called current that holds all of the files. A second array, called arguments, holds only /bin/mv, {}, and ~/.TRASH (since I shift the -exec out).
I need it so that every file gets passed into {} and exec is called on it.
I'm thinking I should use sed to replace the contents of {} like this (within a for loop):
for i in "${current[@]}"; do
sed "s#$i#{}"
#exec stuff?
done
How do I exec the other arguments though?
You can do something like this:
cmd='-exec /bin/mv {} ~/.TRASH'
current=(test1.txt test2.txt)
for f in "${current[@]}"; do
eval $(sed "s/{}/$f/;s/-exec //" <<< "$cmd")
done
Be very careful with eval command though as it can do nasty things if input comes from untrusted sources.
Here is an attempt to avoid eval (thanks to @gniourf_gniourf for his comments):
current=( test1.txt test2.txt )
arguments=( "/bin/mv" "{}" ~/.TRASH )
for f in "${current[@]}"; do
    "${arguments[@]/\{\}/$f}"
done
You are lucky that your design is not too bad and that your arguments are in an array.
But you certainly don't want to use eval.
So, if I understand correctly, you have an array of files:
current=( [0]='/path/to/file1' [1]='/path/to/file2' ... )
and an array of arguments:
arguments=( [0]='/bin/mv' [1]='{}' [2]='/home/alex/.TRASH' )
Note that you don't have the tilde here, since Bash already expanded it.
To perform what you want:
for i in "${current[@]}"; do
    ( "${arguments[@]//'{}'/"$i"}" )
done
Observe the quotes.
This will replace all the occurrences of {} in the fields of arguments by the expansion of $i, i.e., by the filename (see note 1), and execute this expansion. Note that each field of the array will be expanded to one argument (thanks to the quotes), so that all this is really safe regarding spaces, glob characters, etc. This is really the safest and most correct way to proceed. Every solution using eval is potentially dangerous and broken (unless some special quoting is used, e.g., with printf '%q', but this would make the method uselessly awkward). By the way, using sed is also broken in at least two ways.
Note that I enclosed the expansion in a subshell, so that it's impossible for the user to interfere with your script. Without this, and depending on how your full script is written, it's very easy to make your script break by (maliciously) changing some variables or cd-ing somewhere else. Running your argument in a subshell, or in a separate process (e.g., a separate instance of bash or sh, though this would add extra overhead), is really mandatory for obvious security reasons!
Note that with your script, the user has direct access to all the Bash builtins (this is a huge pro), compared to some more standard find versions (see note 2)!
1 Note that POSIX clearly specifies that this behavior is implementation-defined:
If a utility_name or argument string contains the two characters "{}", but not just the two characters "{}", it is implementation-defined whether find replaces those two characters or uses the string without change.
In our case, we chose to replace all occurrences of {} with the filename. This is the same behavior as, e.g., GNU find. From man find:
The string {} is replaced by the current file name being processed everywhere it occurs in the arguments to the command, not just in arguments where it is alone, as in some versions of find.
2 POSIX also specifies that calling builtins is not defined:
If the utility_name names any of the special built-in utilities (see Special Built-In Utilities), the results are undefined.
In your case, it's well defined!
I think that trying to implement (in pure Bash) a find command is a wonderful exercise that should teach you a lot… especially if you get relevant feedback. I'd be happy to review your code!
