I have a problem that I think would be solved relatively quickly with a loop. I have to work with SPSS and I think it can only be solved in syntax.
Unfortunately I am not good with loops, so I hope that one of you can help me.
I have done a study on reasons for abortions. Now I would like to present the distribution of reasons.
The problem is that each person was first asked about all their pregnancies (because this is also relevant for the later analysis), then the pregnancy was determined to which the questionnaire will further refer.
So the further questionnaire was only about one of the pregnancies, whereas the first questions (f.ex. year of pregnancy, reason for abortion) were answered for each pregnancy. For the reasons I only need the information that refers to the pregnancy that was also used for the further questionnaire.
I have an index variable that determines the loop at which pass the relevant pregnancy is asked ("index"). Then I have the variable "Loop_1_R" to "Loop_5_R" which queries the reasons for each up to 5 abortions (of course, for each woman, only the number of pregnancies that she also indicated). In between there are some missing data, for ex. it could be that a woman said that she had 5 pregnancies, but only two of them were abortions (f.ex. the third and fifth). So then she would only give reasons for an abortion in loop3 and loop5.
Now I want to create a new variable which contains only the reason which refers to the relevant pregnancy. So for each woman only one value. I was thinking, you could build a loop in the sense of calculate new variable in such a way that loop i is taken at index i.
I could of course do it by hand, but with a VPN count of over 3000 it will obviously take considerably longer.
I hope someone can help me! This is an example dataset with less loops and VPN:
You can use do repeat to loop and catch the value you need this way:
do repeat vr=Loop_1_R to Loop_5_R/vl=1 to 5.
if Index=vl reason=vr.
end repeat.
Related
I am currently working on analyzing a within-subject dataset with 8 time-ordered assessment points for each subject.
The variables of interest in this example is ID, time point, and accident.
I want to create two variables: accident_intercept and accident_slope, based on the value on accident at a particular time point.
For the accident_intercept variable, once a participant indicated the occurrence of an accident (e.g., accident = 1) at a specific time point, I want the values for that time point and the remaining time points to be 1.
For the accident_slope variable, once a participant indicated the occurrence of an accident (e.g., accident = 1) at a specific time point, I want the value of that time point to be 0, but count up by 1 for the remaining time points until the end time point, for each subject.
The main challenge here is that the process stated above need to be repeated/looped for each participant that occupies 8 rows of data.
Please see how the newly created variables would look like:
I have looked into the instruction for different SPSS syntax, such as loop, the lag/lead functions. I also tried to break my task into different components and google each one. However, I have not made any progress :)
I would be really grateful of any helps and directions that you provide.
Here is one way to do what you need using aggregate to calculate "accident time":
if accident=1 accidentTime=TimePoint.
aggregate out=* mode=addvariables overwrite=yes /break=ID/accidentTime=max(accidentTime).
if TimePoint>=accidentTime Accident_Intercept=1.
if TimePoint>=accidentTime Accident_Slope=TimePoint-accidentTime.
recode Accident_Slope accidentTime (miss=0).
Here is another approach using the lag function:
compute Accident_Intercept=0.
if accident=1 Accident_Intercept=1.
if $casenum>1 and id=lag(id) and lag(Accident_Intercept)=1 Accident_Intercept=1.
compute Accident_Slope=0.
if $casenum>1 and id=lag(id) and lag(Accident_Intercept)=1 Accident_Slope=lag(Accident_Slope) +1.
exe.
We have a dataset of 1222x20 in Stata.
There are 611 individuals, such that each individual is in 2 rows of the dataset. There is only one variable of interest in each second row of each individual that we would like to use.
This means that we want a dataset of 611x21 that we need for our analysis.
It might also help if we could discard each odd/even row, and merge it later.
However, my Stata skills let me down at this point and I hope someone can help us.
Maybe someone knows any command or menu option that we might give a try.
If someone knows such a code, the individuals are characterized by the variable rescode, and the variable of interest on the second row is called enterprise.
Below, the head of our dataset is given. There is a binary time variable followup, where we want to regress the enterprise(yes/no) as dependent variable at time followup = followup onto enterprise as independent variable at time followup = baseline
We have tried something like this:
reg enterprise(if followup="Folowup") i.aimag group loan_baseline eduvoc edusec age16 under16 marr_cohab age age_sq buddhist hahl sep_f nov_f enterprise(if followup ="Baseline"), vce(cluster soum)
followup is a numeric variable with value labels, as its colouring in the Data Editor makes clear, so you can't test its values directly for equality or inequality with literal strings. (And if you could, the match needs to be exact, as Folowup would not be read as implying Followup.)
There is a syntax for identifying observations by value labels: see [U] 13.11 in the pdf documentation or https://www.stata-journal.com/article.html?article=dm0009.
However, it is usually easiest just to use the numeric value underneath the value label. So if the variable followup had numeric values 0 and 1, you would test for equality with 0 or 1.
You must use == not = for testing for equality here:
... if followup == 1
For any future Stata questions, please see the Stata tag wiki for detailed advice on how to present data. Screenshots are usually difficult to read and impossible to copy, and leave many details about the data obscure.
Probably an easy question:
I want to run this piece of syntax:
SUMMARIZE
/TABLES=AGENCY
PIN
AGE
GENDER
DISABILITY
MAINSERVICE
MRESAGENCY
MRESSUPPORT
/FORMAT=LIST NOCASENUM TOTAL
/TITLE='Case Summaries'
/MISSING=VARIABLE
/CELLS=COUNT.
for 264 different agencies which are all values contained in the variable 'AGENCY'.
I want to create a different table for each agency outlining the above information for them.
I think I can do this using a DO REPEAT or LOOP on SPSS.
Any advice would be much appreciated.
Thank you :)
note: I have Google'd and read endless amounts on looping I am just a little unsure as to which method is what I am looking for
Take a look at SPLIT FILE, which meets your needs
I'm taking my first course in AI, and I have to define some problems in my homework (not yet solve them, just supply a definition).
So I have to define about boolean satisfiability problem
:
What is a state?
What is the initial state?
What is a final state?
What are the operators?
My question is: Should the formula be a part of the state?
Considerations so far:
The operator doesn't change it, and it's constant through the computation, so it's not.
If I do include it, in theory, the search space gets much bigger, since more states are possible, but in reality the formula can't be changed, so I get a big state, and a branching factor that is not corresponding.
It's varying from one execution to the next, so it should be a part of the state.
You need only really consider the varying parts of the problem to be a state when conducting a search such as this, although I'd say in this case it really comes down to how you define the problem.
The search space for a given run of the algorithm depends upon the input formula, but after that is fixed, ie you are searching the space of n length bit vectors where n is the number of variables in the formula. So the formula is not part of the state because it does not vary.
The counter claim is that you are searching in a larger space of formula-vector pairs, but as you cannot change the formula as part of the problem, this has not really increased the size of the search space. So I would not make the claim that "If I do include it, in theory the search space gets much bigger". It does not, the reachable states are the same, the branching is the same, the space that requires exploring to solve the problem is the same.
Given this, my answer would that the formula is not part of the state, but defines the nature of the state space. So the answers to your four questions will each be functionally dependant on the formula in some way, but the state depends only on the length of the formula.
Hope that makes sense!
This is just a note for future readers - Not an answer
Vic Smith is right, another way to look at the fact that in theory there are more states but in practice not (my second dot), is just to think about it as separate bondage spaces. For example for the formula X or Y there is one bondage space, and for not X and Y there is another one and they have no common nodes in the representation.
So it can vary from one execution to another, but still has the same "reachable" states, and same branching factor. And each execution has a different starting state.
A user has to complete ten steps to achieve a desired result. The ten steps can be completed in any order.
If there is a bug, the bug is dependent only on the steps that have been taken, not the order in which they were taken (i.e., the bug is path independent).
For example: If the user performs three steps in the order 10, 1, 2 and produces a bug the exact same bug will be produced if the user performs the same three steps in the order 1, 2, 10.
What is the maximum number of unique bugs this program can have?
You mean what is the number of distinct sets pickable from 10 elements? That's a powerset: 2**10.
Much later: some knowledgeable others have suggested that having no bugs should not be counted as a bug. Accordingly, I revise my count: 2**10 - 1.
hughdbrown's answer is correct, but there is another possible interpretation of the question. Suppose that a sequence of operations can never produce more than one bug (i.e. that it should just be counted as one bug). For example, if the operations (3,6,2) is a bug, then you shouldn't be allowed to count (3,6,2,5) as another bug. In that case, rather than finding the maximum possible number of subsets of {1,2,...,10}, you want to find the maximum number of possible subsets so that no one contains another. The answer to this version of the question is "10 choose 5"=252.
Edit: by the way, the result that says this is maximal is called Sperner's Theorem.
It depends entirely how many ways there are of doing each step. If you have a process that involves only one step, but there are multiple ways of doing that step, every step could have an associated bug.
There's also the misuse of functions, which you cant prevent against, which could be considered a bug. ie:
If a user was to think that
rm -rf /
was short for
remove media --really fast /
ie: eject all devices1
I would guess that would be a potential bug. Its user error really, but its still a singular thing that can occur that produces results other than that were wanted.
You could argue the above is a bit over the top, but ultimately, there is no limitation on the ways users can do things wrong.
When users are there, assume, anything that can go wrong, will.
The only problem with the above reasoning, is you have to prematurely delete powerful things so users don't hurt themselves, which leads to less effective tools for those who know how to use them. Like corks on forks sort of rationale.
The only way to solve this concern effectively is give newbs blunt objects to learn with, and then give them an option which takes away all the foam padding once they learn the ropes, so experienced users don't have to keep working with blunt tools, and don't have to deblunten every tool themself.
( If there are infinite numbers of possibly ways to do 1 step, I don't even want to begin to think of the numbers of ways to do 10 steps wrong )
1: If you don't know, this will erase lots of your hard drive and cause much pain. Don't do it.
One, a designer fault? :)