Compute if loop SPSS - loops

Ultimately, I want to change scores of 0 to 1, scores of 1 to 2, and scores of 2 to 3. I thought one way to do that was using +1, but I realize I could also use a more complicated if then series.
Here is what I did so far:
I used the existing variable (x) to create a new variable (y=x+1) using SPSS syntax. I only want to do this for variables with values >=0 (this was my approach to excluding cells with missing data; the range for x is 0-2).
I can create x+1, but it overwrites the existing variables.
DO REPEAT x =var_1 TO var_86.
if (x>=0) x=(x+1).
end repeat.
exe.
I tried this modification, but it doesn't work:
DO REPEAT x = var_1 TO var_86 / y = var_1a TO var_86a.
IF (x >= 0) y=x +1.
END REPEAT.
EXE.
The error message is:
DO REPEAT The form VARX TO VARY to refer to a range of variables has
been used incorrectly. When using VARX TO VARY to create new
variables, X must be an integer less than or equal to the integer Y.
(Can't use A3 TO A1.)
I tried many other configurations including vectors and loops but haven't yet figured out how to do this computation across the range of variables without overwriting the existing ones. Thanks in advance for any recommendations.

The message you are getting is because SPSS doesn't understand the form var_1a TO var_86a.
For the x to y form to work the number has to be at the end of the name, so for example varA_1 to varA_86 should work.
While you're at it, here's a simple way to go about your task:
recode var_1 TO var_86 (0=1)(1=2)(2=3) into varA_1 TO varA_86.

Related

Structuring a for loop to output classifier predictions in python

I have an existing .py file that prints a classifier.predict for a SVC model. I would like to loop through each row in the X feature set to return a prediction.
I am currently trying to define the element from which to iterate over so as to allow for definition of the test statistic feature set X.
The test statistic feature set X is written in code as:
X_1 = xspace.iloc[testval-1:testval, 0:5]
testval is the element name used in the for loop in the above line:
for testval in X.T.iterrows():
print(testval)
I am having trouble returning a basic set of index values for X (X is the pandas dataframe)
I have tested the following with no success.
for index in X.T.iterrows():
print(index)
for index in X.T.iteritems():
print(index)
I am looking for the set of index values, with base 1 if possible, like 1,2,3,4,5,6,7,8,9,10...n
seemingly simple stuff...i haven't located an existing question via stackoverflow or google.
ALSO, the individual dataframes I used as the basis for X were refined with the line:
df1.set_index('Date', inplace = True)
Because dates were used as the basis for the concatenation of the individual dataframes the loops as written above are returning date values rather than
location values as I would prefer hence:
X_1 = xspace.iloc[testval-1:testval, 0:5]
where iloc, location is noted
please ask for additional code if you'd like to see more
the loops i've done thus far are returning date values, I would like to return index values of the location of the rows to accommodate the line:
X_1 = xspace.iloc[testval-1:testval, 0:5]
The loop structure below seems to be working for my application.
i = 1
j = list(range(1, len(X),1)
for i in j:

SPSS recoding variables data from multiple variables into boolean variables

I have 26 variables and each of them contain numbers ranging from 1 to 61. I want for each case of 1, each case of 2 etc. the number 1 in a new variable. If there is no 1, the variable should contain 2.
So 26 variables with data like:
1 15 28 39 46 1 12 etc.
And I want 61 variables with:
1 2 1 2 2 1 etc.
I have been reading about creating vectors, loops, do if's etc but I can't find the right way to code it. What I have done is just creating 61 variables and writing
do if V1=1 or V2=1 or (etc until V26).
recode newV1=1.
end if.
exe.
**repeat this for all 61 variables.
recode newV1 to newV61(missing=2).
So this is a lot of code and quite a detour from what I imagine it could be.
Anyone who can help me out with this one? Your help is much appreciated!
noumenal is correct, you could do it with two loops. Another way though is to access the VECTOR using the original value though, writing that as 1, and setting all other values to zero.
To illustrate, first I make some fake data (with 4 original variables instead of 26) named X1 to X4.
*Fake Data.
SET SEED 10.
INPUT PROGRAM.
LOOP Id = 1 TO 20.
END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.
VECTOR X(4,F2.0).
LOOP #i = 1 TO 4.
COMPUTE X(#i) = TRUNC(RV.UNIFORM(1,62)).
END LOOP.
EXECUTE.
Now what this code does is create four vector sets to go along with each variable, then uses DO REPEAT to actually refer to the VECTOR stub. Then finishes up with RECODE - if it is missing it should be coded a 2.
VECTOR V1_ V2_ V3_ V4_ (61,F1.0).
DO REPEAT orig = X1 TO X4 /V = V1_ V2_ V3_ V4_.
COMPUTE V(orig) = 1.
END REPEAT.
RECODE V1_1 TO V4_61 (SYSMIS = 2).
It is a little painful, as for the original VECTOR command you need to write out all of the stubs, but then you can copy-paste that into the DO REPEAT subcommand (or make a macro to do it for you).
For a more simple illustration, if we have our original variable, say A, that can take on integer values from 1 to 61, and we want to expand to our 61 dummy variables, we would then make a vector and then access the location in that vector.
VECTOR DummyVec(61,F1.0).
COMPUTE DummyVec(A) = 1.
For a record if A = 10, then here DummyVec10 will equal 1, and all the others DummyVec variables will still by system missing by default. No need to use DO IF for 61 values.
The rest of the code is just extra to do it in one swoop for multiple original variables.
This should do it:
do repeat NewV=NewV1 to NewV61/vl=1 to 61.
compute NewV=any(vl,v1 to v26).
end repeat.
EXPLANATION:
This syntax will go through values 1 to 61, for each one checking whether any of the variables v1 to v26 has that value. If any of them do, the right NewV will receive the value of 1. If none of them do, the right NewV will receive the value of 0.
Just make sure v1 to v26 are consecutively ordered in the file. if not, then change to:
compute NewV=any(vl,v1, v2, v3, v4 ..... v26).
You need a nested loop: two loops - one outer and one inner.

Operating elementwise on an array

I'm trying to check if my arrays are returning nonsense by accessing out of bounds elements, in fortran. And I want to check these values are less than one, and if they are, change them to one.
This is the piece of my code causing issues:
lastNeighLabel=(/clusterLabel(jj-1,kk,ll), clusterLabel(jj,kk-1,ll), clusterLabel(jj,kk,ll-1)/)
LastNeighLabel contains the cluster label (between 1 and n, where n isthe total number of unique seperate clusters found) for the last neighbour in the x,y,z direction respectively.
When jj or kk or ll are 1, they try and access the 0th element in the array, and as FORTRAN counts from 1 in arrays, it tries to destroy the universe. I'm currently in a tangled mess of about 8 if/elseif statements trying to code for every eventuality. But I was hoping there was a way of operating on each element. So basically I'd like to say where((/jj-1,kk-1,ll-1/).lt.1) do clusterLabel(jj-1,kk,ll)=0 etc depending on which element is causing the problem.
But I can't think of a way to do that because where will only manipulate the variables passed to it, not a different array at the same index. Or am I wrong?
Will gladly edit if this doesn't make sense.
It is not obligatory that Fortran accesses arrays starting from one. Any starting value is allowed. If it more convenient to you to have a zero indexed array, declare the array as:
real, dimension (0:N-1, 0:M-1) :: array
Or
real, dimension (0:N, 0:M) :: array
and have the 0 indices be extra to catch special cases.
This might be another solution to your problem, since zero index values would be legal.
Another possible way to approach this, is to create an extended cluster label array (with index bounds starting at 0), which is equal to the cluster label array with a layer of zeroes tacked on the outside. You can then let your loop run safely over all values of jj, kk, and ll. It depends on the size of the array if this is a feasible solution.
integer :: extended_cluster_label(0:size(cluster_label,1), &
0:size(cluster_label,2), &
0:size(cluster_label,3) &
)
extended_cluster_label(0,:,:) = 0
extended_cluster_label(:,0,:) = 0
extended_cluster_label(:,:,0) = 0
extended_cluster_label(1:, 1:, 1:) = cluster_label
Maybe you could use a function?
real function f(A,i,j,k)
real :: A(:,:,:)
integer :: i,j,k
if (i==0.or.j==0.or.k==0) then
f=0
else
f=A(i,j,k)
endif
end function f
and then use f(clusterLabel,jj-1,kk,ll) etc.

Finding whether a value is equal to the value of any array element in MATLAB

Can anyone tell me if there is a way (in MATLAB) to check whether a certain value is equal to any of the values stored within another array?
The way I intend to use it is to check whether an element index in one matrix is equal to the values stored in another array (where the stored values are the indices of the elements which meet a certain criteria).
So, if the indices of the elements which meet the criteria are stored in the matrix below:
criteriacheck = [3 5 6 8 20];
Going through the main array (called array) and checking if the index matches:
for i = 1:numel(array)
if i == 'Any value stored in criteriacheck'
%# "Do this"
end
end
Does anyone have an idea of how I might go about this?
The excellent answer previously given by #woodchips applies here as well:
Many ways to do this. ismember is the first that comes to mind, since it is a set membership action you wish to take. Thus
X = primes(20);
ismember([15 17],X)
ans =
0 1
Since 15 is not prime, but 17 is, ismember has done its job well here.
Of course, find (or any) will also work. But these are not vectorized in the sense that ismember was. We can test to see if 15 is in the set represented by X, but to test both of those numbers will take a loop, or successive tests.
~isempty(find(X == 15))
~isempty(find(X == 17))
or,
any(X == 15)
any(X == 17)
Finally, I would point out that tests for exact values are dangerous if the numbers may be true floats. Tests against integer values as I have shown are easy. But tests against floating point numbers should usually employ a tolerance.
tol = 10*eps;
any(abs(X - 3.1415926535897932384) <= tol)
you could use the find command
if (~isempty(find(criteriacheck == i)))
% do something
end
Note: Although this answer doesn't address the question in the title, it does address a more fundamental issue with how you are designing your for loop (the solution of which negates having to do what you are asking in the title). ;)
Based on the for loop you've written, your array criteriacheck appears to be a set of indices into array, and for each of these indexed elements you want to do some computation. If this is so, here's an alternative way for you to design your for loop:
for i = criteriacheck
%# Do something with array(i)
end
This will loop over all the values in criteriacheck, setting i to each subsequent value (i.e. 3, 5, 6, 8, and 20 in your example). This is more compact and efficient than looping over each element of array and checking if the index is in criteriacheck.
NOTE: As Jonas points out, you want to make sure criteriacheck is a row vector for the for loop to function properly. You can form any matrix into a row vector by following it with the (:)' syntax, which reshapes it into a column vector and then transposes it into a row vector:
for i = criteriacheck(:)'
...
The original question "Can anyone tell me if there is a way (in MATLAB) to check whether a certain value is equal to any of the values stored within another array?" can be solved without any loop.
Just use the setdiff function.
I think the INTERSECT function is what you are looking for.
C = intersect(A,B) returns the values common to both A and B. The
values of C are in sorted order.
http://www.mathworks.de/de/help/matlab/ref/intersect.html
The question if i == 'Any value stored in criteriacheck can also be answered this way if you consider i a trivial matrix. However, you are proably better off with any(i==criteriacheck)

Subset Sum TI Basic Programming

I'm trying to program my TI-83 to do a subset sum search. So, given a list of length N, I want to find all lists of given length L, that sum to a given value V.
This is a little bit different than the regular subset sum problem because I am only searching for subsets of given lengths, not all lengths, and recursion is not necessarily the first choice because I can't call the program I'm working in.
I am able to easily accomplish the task with nested loops, but that is becoming cumbersome for values of L greater than 5. I'm trying for dynamic solutions, but am not getting anywhere.
Really, at this point, I am just trying to get the list references correct, so that's what I'm looking at. Let's go with an example:
L1={p,q,r,s,t,u}
so
N=6
let's look for all subsets of length 3 to keep it relatively short, so L = 3 (6c3 = 20 total outputs).
Ideally the list references that would be searched are:
{1,2,3}
{1,2,4}
{1,2,5}
{1,2,6}
{1,3,4}
{1,3,5}
{1,3,6}
{1,4,5}
{1,4,6}
{1,5,6}
{2,3,4}
{2,3,5}
{2,3,6}
{2,4,5}
{2,4,6}
{2,5,6}
{3,4,5}
{3,4,6}
{3,5,6}
{4,5,6}
Obviously accomplished by:
FOR A,1,N-2
FOR B,A+1,N-1
FOR C,B+1,N
display {A,B,C}
END
END
END
I initially sort the data of N descending which allows me to search for criteria that shorten the search, and using FOR loops screws it up a little at different places when I increment the values of A, B and C within the loops.
I am also looking for better dynamic solutions. I've done some research on the web, but I can't seem to adapt what is out there to my particular situation.
Any help would be appreciated. I am trying to keep it brief enough as to not write a novel but explain what I am trying to get at. I can provide more details as needed.
For optimisation, you simply want to skip those sub-trees of the search where you already now they'll exceed the value V. Recursion is the way to go but, since you've already ruled that out, you're going to be best off setting an upper limit on the allowed depths.
I'd go for something like this (for a depth of 3):
N is the total number of array elements.
L is the desired length (3).
V is the desired sum
Y[] is the array
Z is the total
Z = 0
IF Z <= V
FOR A,1,N-L
Z = Z + Y[A]
IF Z <= V
FOR B,A+1,N-L+1
Z = Z + Y[B]
IF Z <= V
FOR C,B+1,N-L+2
Z = Z + Y[C]
IF Z = V
DISPLAY {A,B,C}
END
Z = Z - Y[C]
END
END
Z = Z - Y[B]
END
END
Z = Z - Y[A]
END
END
Now that's pretty convoluted but it basically check at every stage whether you've already exceed the desired value and refuses to check lower sub-trees as an efficiency measure. It also keeps a running total for the current level so that it doesn't have to do a large number of additions when checking at lower levels. That's the adding and subtracting of the array values against Z.
It's going to get even more complicated when you modify it to handle more depth (by using variables from D to K for 11 levels (more if you're willing to move N and L down to W and X or if TI BASIC allows more than one character in a variable name).
The only other non-recursive way I can think of doing that is to use an array of value groups to emulate recursion with iteration, and that will look only slightly less hairy (although the code should be less nested).

Resources