I have a number of variables whose names begin with the prefix indoor. What comes after indoor is not numeric (that would make everything simpler).
I would like a tabulation for each of these variables.
My code is the following:
local indoor indoor*
foreach i of local indoor {
tab `i' group, col freq exact chi2
}
The problem is that indoor in the foreach command resolves to indoor* and not to the list of the indoor questions, as I hoped. For this reason, the tab command is followed by too many variables (it can only handle two) and this results in an error.
The simple fix is to substitute the first command with:
local indoor <full list of indoor questions>
But this is what I would like to avoid: having to find all the names of these variables and paste them into the code. It seems there should be a quicker fix, but I can't think of one.
The trick is to use ds or unab to create the varlist expansion before asking Stata to loop over values in the foreach loop.
Here's an example of each:
******************! BEGIN EXAMPLE
** THIS FIRST SECTION SIMPLY CREATES SOME FAKE DATA & INDOOR VARS **
clear
set obs 10000
local suffix `c(ALPHA)'
token `"`suffix'"'
while "`1'" != "" {
g indoor`1'`2'`3' = 1+int((5-1+1)*runiform())
lab var indoor`1'`2'`3' "Indoor Values for `1'`2'`3'"
mac shift 1
}
g group = rbinomial(1,.5)
lab var group "GROUP TYPE"
** NOW, YOU SHOULD HAVE A BUNCH OF FAKE INDOOR
**VARS WITH ALPHA, NOT NUMERIC SUFFIXES
desc indoor*
**USE ds TO CREATE YOUR VARLIST FOR THE foreach LOOP:
ds indoor*
di "`r(varlist)'"
local indoorvars `r(varlist)'
local n 0
foreach i of local indoorvars {
**LET'S CLEAN UP YOUR TABLES A BIT WITH SOME HEADERS VIA display
local ++n
di in red "--------------------------------------------"
di in red "Table `n': `:var l `i'' by `:var l group'"
di in red "--------------------------------------------"
**YOUR tab TABLES
tab `i' group, col freq chi2 exact nolog nokey
}
******************! END EXAMPLE
OR using unab instead:
******************! BEGIN EXAMPLE
unab indoorvars: indoor*
di "`indoorvars'"
local n 0
foreach i of local indoorvars {
local ++n
di in red "--------------------------------------------"
di in red "Table `n': `:var l `i'' by `:var l group'"
di in red "--------------------------------------------"
tab `i' group, col freq chi2 nokey //I turned off exact to speed things up
}
******************! END EXAMPLE
The advantages of ds come into play if you want to select your indoor vars using a tricky selection rule, like selecting indoor vars based on information in the variable label or some other characteristic.
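As a minimal sketch of such a selection rule, and assuming the fake data created in the first example is still in memory, ds can filter on the variable label through its has(varlabel ...) option. The pattern below is only an illustration chosen to fit the fake labels; substitute whatever text your own labels contain.

```stata
* Sketch: keep only the indoor* variables whose label matches a pattern.
* The pattern "*Values for A*" is a made-up example for the fake labels above.
ds indoor*, has(varlabel "*Values for A*")
local picked `r(varlist)'

* Loop over the selected variables only
foreach i of local picked {
    tab `i' group, col chi2
}
```

The matching is case-sensitive by default, so a pattern that differs from the label only in case selects nothing.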
You could do this with
foreach i of var `indoor' {
tab `i' group, col freq exact chi2
}
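This works with the local indoor defined as indoor* in the question, because foreach ... of varlist expands the wildcard before looping. It is almost identical to the code in the question.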
unab indoor : indoor*
foreach i of local indoor {
tab `i' group, col freq exact chi2
}
foreach v of varlist indoor* {
tab `v' group, col freq exact chi2
}
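The difference between the two loop forms can be seen in a small sketch. It uses the auto dataset shipped with Stata rather than the indoor variables, and the local name pat is just an illustrative choice.

```stata
* Sketch: "of local" loops over the literal words stored in the macro,
* while "of varlist" expands wildcards into variable names first.
sysuse auto, clear
local pat m*

* One iteration: the loop sees the unexpanded pattern m*
foreach v of local pat {
    display "of local sees: `v'"
}

* One iteration per matching variable (make and mpg in auto.dta)
foreach v of varlist `pat' {
    display "of varlist sees: `v'"
}
```

This is why the loop in the question passed indoor* straight to tab, and why switching to of varlist, or expanding first with unab or ds, fixes it.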
I am trying to create two lists of files and create two new datasets that merge all those files. To do so, I tried the following:
*** SET FOLDER PATHS ***********************************************************
global projectFolder "C:\Users\XXX"
global codeFolder "${projectFolder}\code"
global databaseFolder "${projectFolder}\data"
global rawFolder "${databaseFolder}\raw"
global outputsFolder "${databaseFolder}\output"
*** CREATING VECTORS WITH FILE NAMES *******************************************
global file_all dir "$outputsFolder" files "*.dta"
di `$file_all'
global file_monthly dir "$outputsFolder" files "*_monthly.dta"
di `$file_monthly'
global file_yearly : list global file_all - global file_monthly
di `$file_yearly'
I found a few problems. First, I was not able to create the list of files, and second, I couldn't find a way to write this loop without merging the first dataset twice.
*** MERGING YEARLY OUTCOMES ****************************************************
use "$outputsFolder\first_dataset.dta", clear
foreach file in `file_yearly' {
merge 1:1 muni_code year using `file', nogen
}
Within your foreach loop over the files, you can conditionally load/use if the first file (in the example below, it requires knowing the name of the "first" file), else merge, like this:
local files: dir "." files "yearly*.dta"
foreach f of local files {
if "`f'" == "yearly_1.dta" use `f'
else merge 1:1 year muni using `f', nogen
}
list, clean
Output:
year muni val1 val2 val3
1. 2001 1 .3132002 .1924075 .8190824
2. 2002 2 .5559791 .1951401 .4882096
3. 2003 3 .9382851 .9509598 .2704866
4. 2004 4 .7363221 .2904454 .5859706
Input:
set seed 123
forvalues i = 1/3 {
clear
set obs 4
gen year = 2000 + _n
gen muni = _n
gen val`i' = runiform()
save yearly_`i', replace
}
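The answer above does not cover the first problem in the question, building the file lists. A possible sketch follows, assuming the global outputsFolder from the question is defined and that every yearly file contains the keys muni_code and year. The local name outdir and the first flag are illustrative choices; the flag also removes the need to know the name of the first file.

```stata
* Sketch under assumptions: folder taken from the question's global,
* keys muni_code and year present in every yearly file.
local outdir "$outputsFolder"

* The dir extended macro function needs a colon after the macro name
local file_all     : dir "`outdir'" files "*.dta"
local file_monthly : dir "`outdir'" files "*_monthly.dta"

* Macro list subtraction: everything that is not a monthly file
local file_yearly  : list file_all - file_monthly

* Load the first file, merge all the others
local first 1
foreach f of local file_yearly {
    if `first' {
        use "`outdir'/`f'", clear
        local first 0
    }
    else {
        merge 1:1 muni_code year using "`outdir'/`f'", nogenerate
    }
}
```

Forward slashes are used in the paths on purpose: a backslash directly before a macro reference stops Stata from expanding the macro.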
I have a loop in LaTeX, where I access 5 different tables to include in my document. The loop has two elements: one variable holding the short name of the category (\n, which can be A, O, I, R or H) and one variable holding the long name (\m, which can be "Apartment", "Office", etc).
This loop works as intended for caption and for input. But it does not work for "\label". In other words, the loop produces 5 tables, pulling the right files each time. It puts correct caption on these tables (Apartment, Office, etc), but \label does not get populated correctly. It produces only one label as "output_reg\n" instead of 5 labels as "output_reg_A", "output_reg_O", etc.
I would appreciate all the help I can get!
\documentclass{article}
\usepackage{tikz}
\begin{document}
\foreach \n/\m in {A/Apartments,O/Office,R/Retail,I/Industrial,H/Hotel}
{ \begin{table}
\small
\centering
\caption{Regression results \n - \m } \label{output_reg_\n}
\begin{tabular}{ccccc}
a & a & \\
a & a &
\end{tabular}
\end{table}
}
content
% I want to be able to reference the tables as \ref{output_reg_A} and \ref{output_reg_O} and so on.
\end{document}
I'm just learning SPSS and I want to do simple subgroup analysis based on a variable "status" I created which can take values from 0 to 8. I would like to print outputs in one go.
This is the pseudocode for what I want to do:
for( i = 1, i = 8, i++)
{
filter by (status = i)
display analysis
remove filter
}
That way I can do it all in one go, but I can also add to the analysis code and easily apply something to all 8 subgroups.
I don't know if it's relevant but here is the code I want to iterate over currently:
USE ALL.
COMPUTE filter_$=(Workforce EQ 1 AND SurveySample = 1 AND State = 1).
VARIABLE LABELS filter_$ 'Workforce EQ 1 (FILTER)'.
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
FORMATS filter_$ (f1.0).
FILTER BY filter_$.
EXECUTE.

FREQUENCIES VARIABLES = Q86 Q33 Q34 Q88 FSEScore /BARCHART FREQ
/ORDER=ANALYSIS.

CROSSTABS /TABLES=FSEScore BY Q86 /FORMAT=AVALUE TABLES
/CELLS=ROW /COUNT ROUND CELL.

FILTER OFF.
USE ALL.
Thanks guys.
The split file command may solve the problem: it makes every analysis that follows report its results separately for each category of the split variable:
*run your transformations.
sort cases by status.
split file by status.
FREQUENCIES .....
CROSSTABS ....
split file off.
If this is not enough, you can use a macro to loop through the "status" categories.
First define the macro:
define MyMacro ()
!do !ST=1 !to 8
* filter commands using status = !ST .
* transformations using status = !ST .
FREQUENCIES .....
CROSSTABS ....
!doend
!enddefine.
Now call your macro:
MyMacro .
This is probably a very ghetto way of doing this; the suggestion above is probably more sensible.
You can run Python inside SPSS. The following code works:
begin program.
import spss
# range(1, 9) yields 1..8; the original xrange(1,8) stopped at 7
for i in range(1, 9):
    # %s in the syntax below is replaced by the current status value
    spss.Submit("""
USE ALL.
COMPUTE filter_$=(Workforce EQ 1 AND SurveySample = 1 AND Status = %s).
VARIABLE LABELS filter_$ 'Workforce EQ 1 (FILTER)'.
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
FORMATS filter_$ (f1.0).
FILTER BY filter_$.
EXECUTE.
* analysis as required.
FREQUENCIES VARIABLES = Q86
/BARCHART FREQ
/ORDER=ANALYSIS.
""" % i)
end program.
Many thanks to eli-k. I probably should have just used split file.
I am trying to run this code:
function calcs.grps(Number,ion_color)
grp .. ion_color .. Y[Number] = (ion_py_mm)
grp .. ion_color .. Z[Number] = (ion_pz_mm)
end
in a Lua script, the arrays already exist (eg grp2Y,grp5Z etc) and I want to use this function to populate them based upon the two variables fed in. I keep getting the error ' '=' expected near '..' '. What am I doing wrong?
To flesh it out a bit:
I am 'flying' 120 ions in my simulation. This is actually 12 groups of 10 ions. The individual groups of 10 are distinguished by ion_color, which is an integer value from 1 to 12. The variable 'Number' just cycles through 1 to 10 each time before moving on to the next color. Once I have populated these arrays I want to get the standard deviation for each group.
Thank you!
You can't "construct" the name of a variable, but you can construct an index. Use two levels of nested tables (the outer table, here ion, must already exist and hold the inner tables).
function calcs.grps(Number,ion_color)
ion['grp' .. ion_color .. 'Y'][Number] = (ion_py_mm)
ion['grp' .. ion_color .. 'Z'][Number] = (ion_pz_mm)
end
Well, actually you can, since all global variables are just entries in _G table, but don't do that since it is bad - it is unreadable, makes stuff spill to other functions you didn't intend to, etc.
The technical answer to your question is to simply index _G, _G is a table which holds all global variables:
function calcs.grps(Number,ion_color)
_G['grp' .. ion_color .. 'Y'][Number] = (ion_py_mm)
_G['grp' .. ion_color .. 'Z'][Number] = (ion_pz_mm)
end
But I think the better question is why you aren't organizing it like this:
local ions = {
Red = {
{
Y = 0, --Y property
Z = 0 --Z property
},
--Continue your red ions
},
NewColor = {
{
Y = 0, --Y property
Z = 0 --Z property
},
--Continue this color's ions
},
--You get the idea
}
function calcs.grps(color, number)
ions[color][number].Y = (ion_py_mm)
ions[color][number].Z = (ion_pz_mm)
end
Then you would pass a color and a number indicating which ion of that color to update.
It looks a lot cleaner, IMO.
Let's say I have 60 variables, none with similar naming patterns. I want to assign labels to all variables, which I stored locally. So for example
local mylabels "dog cat bird"
However I am struggling with the exact expression of the loop. Do I have to store my variable range globally and then use a foreach? Or do I use forvalues?
Edit: I was referring to variable labels. I managed to create a loop, similar to the method used here http://www.stata.com/support/faqs/programming/looping-over-parallel-lists/. However I ran into a more difficult problem: my variables have no particular naming patterns, and the labels have special characters (spaces, commas, %-signs), and here is where my loop does not work.
Some example data (excuse the randomness):
gen Apples_ts_sum = .
gen Pears_avg_1y = .
gen Bananas_max_2y = .
And some example labels:
"Time series of apples, sum, %" "Average of pears, over 1 year"
"Maximum of bananas, over 2 years".
I ran into this entry by Nick Cox: http://www.stata.com/statalist/archive/2012-10/msg00285.html and tried to apply the mentioned parentheses method, like so:
local mylabels `" "Time series of apples, sum, %" "Average of pears, over 1 year" "Maximum of bananas, over 2 years" "'
But could not get it to work.
If you want to give all the variables the same label, for example "dog cat bird", you can use the varlist option of the describe command. Let's say your 60 variables can be listed with the expression EXP. Then:
qui des EXP, varlist
foreach variable in `r(varlist)'{
label var `variable' "dog cat bird"
}
Edited:
Taking your example data, I created another local containing the variable names.
local myvar `" "Apples_ts_sum" "Pears_avg_1y" "Bananas_max_2y" "'
local mylabels `" "Time series of apples, sum, %" "Average of pears, over 1 year" "Maximum of bananas, over 2 years" "'
forval n = 1/3{
local a: word `n' of `mylabels'
local b: word `n' of `myvar'
di "variable `b', label `a'"
label var `b' "`a'"
}
Note that I manually created the list of variables. You can automatically create this list using the method I listed above, with des, varlist.
qui des , varlist
foreach var in `r(varlist)'{
local myvar_t "`myvar_t' `var'"
}
You can then use the local myvar_t instead of myvar in the above example.
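Putting the pieces together, a self-contained sketch of the parallel-list loop might look like the following. It assumes the labels are listed in the same order as the variables appear in the dataset; the local names myvars, k, v and lab are illustrative. Compound double quotes around the label keep it intact even if a label were to contain a double quote itself.

```stata
* Sketch: label variables from a parallel list of labels.
* Assumes the order of the labels matches the order of the variables.
clear
set obs 1
gen Apples_ts_sum = .
gen Pears_avg_1y = .
gen Bananas_max_2y = .

* Expand _all into the list of variable names, in dataset order
unab myvars : _all
local mylabels `" "Time series of apples, sum, %" "Average of pears, over 1 year" "Maximum of bananas, over 2 years" "'

* Walk both lists by position
local k : word count `myvars'
forvalues n = 1/`k' {
    local v   : word `n' of `myvars'
    local lab : word `n' of `mylabels'
    label variable `v' `"`lab'"'
}
describe
```

With 60 variables, only the mylabels list has to be typed by hand; the variable list comes from the dataset itself.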