Repeating syntax for a list of variables in SPSS - loops

I have a dataset with 134 different variables. I have syntax that works for one variable and runs fine in SPSS, but I don't want to repeat this step by hand 133 more times. I want to find the string term present in AE_term1 and create a new numeric variable, AE_term1num, that classifies it. However, I don't know how to write syntax that does this automatically for all 134 AE_term variables. Could someone please help me?
The data looks like this:
AE_term1   AE_dur1   AE_term2     AE_dur2   etc
Cystitis   14        Heamaturia   3
uti        15        Sepsis       8
etc        etc       etc          etc
To identify the string variable, I have written the following syntax:
`compute AE_term1num=0.
do if (char.index(lower(AE_term1), "cystitis")).
compute AE_term1num=1.
else if (char.index(lower(AE_term1), "uti")).
compute AE_term1num = 2.
else.
compute AE_term1num=0.
end if.
execute.`
This syntax finds the correct values and returns what I want. However, I do not know how to loop it over the remaining 133 AE_term variables, if this is even possible.

To start with, your original syntax can be streamlined like this:
compute AE_term1num=0.
if char.index(lower(AE_term1), "cystitis") AE_term1num=1.
if char.index(lower(AE_term1), "uti") AE_term1num = 2.
Now, if exactly the same conditions apply to all the rest of the variables, you can use do repeat to automate the process:
do repeat AE_term = AE_term1 to AE_term134 / AE_termN = AE_termN1 to AE_termN134.
compute AE_termN=0.
if char.index(lower(AE_term), "cystitis") AE_termN=1.
if char.index(lower(AE_term), "uti") AE_termN= 2.
end repeat.
NOTE: using AE_term1 to AE_term134 for existing variables requires that they be consecutive in the dataset. Using AE_termN1 to AE_termN134 to create new variables requires that the number be at the end of the name, so AE_term1num to AE_term134num wouldn't work.
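
For readers more used to general-purpose languages, the do repeat block above behaves roughly like the Python sketch below. It is an illustration only, not SPSS syntax: the `classify` helper and the dictionary standing in for one case are hypothetical, and only three variable pairs are shown instead of 134.

```python
def classify(term):
    """Mirror the two IF commands: a later match overwrites an earlier one."""
    text = (term or "").lower()
    code = 0
    if "cystitis" in text:
        code = 1
    if "uti" in text:
        code = 2
    return code

# One case (row) with hypothetical values
row = {"AE_term1": "Cystitis", "AE_term2": "Heamaturia", "AE_term3": "UTI"}

# do repeat pairs the i-th source name with the i-th target name
for i in range(1, 4):
    row[f"AE_termN{i}"] = classify(row[f"AE_term{i}"])

print(row)
```

Because this is a substring test, "uti" also matches any longer word that contains those three letters; the same is true of char.index.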

How can you run a loop in SPSS syntax that calculates the difference between many sets of variables?

I have a set of variables (A1, A2, B1, B2, C1, C3 ...) that I need to calculate the difference for to eventually create a set of Bland-Altman plots after extracting the mean difference and sd of the difference from a t-test using OMS.
As a first step I have it working for a single pair of variables (e.g. A1 and A2) and am now trying to create a macro that will loop through the first few pairs as a test:
```
DEFINE BlandAlt (scan1vars=!CMDEND / scan2vars=!CMDEND)
COMPUTE diff = scan1vars - scan2vars.
EXECUTE.
T-TEST
/TESTVAL=0
/MISSING=ANALYSIS
/VARIABLES=diff
/CRITERIA=CI(.95).
!ENDDEFINE.
BlandAlt
scan1vars = JumpJumpHeightcm.1 JumpJumpHeightt_score.1 JumpMaxChangeinAccelerationms3.1 JumpMaxChangeinAccelerationt_score.1 JumpMaxAccelerationms2.1 JumpMaxAccelerationt_score.1
scan2vars= JumpJumpHeightcm.2 JumpJumpHeightt_score.2 JumpMaxChangeinAccelerationms3.2 JumpMaxChangeinAccelerationt_score.2 JumpMaxAccelerationms2.2 JumpMaxAccelerationt_score.2.
```
When I run the macro I get an error on the first variable:
Error # 4381 in column 35. Text: JumpJumpHeightt_score.1 The
expression ends unexpectedly. Execution of this command stops.
and a warning when it tries to run the t-test:
Text: diff Command: T-TEST An undefined variable name, or a scratch or
system variable was specified in a variable list which accepts only
standard variables. Check spelling and verify the existence of this
variable. Execution of this command stops.
Is anyone able to help get this part working? I'm hoping it should then be easy to include the other commands within the macro.
See my comment for corrections to your original macro. After correcting the macro it should work, but you are not calling the macro the way it is built. You need to call it this way:
BlandAlt scan1vars = JumpJumpHeightcm.1 / scan2vars= JumpJumpHeightcm.2 .
BlandAlt scan1vars = JumpJumpHeightt_score.1 / scan2vars=JumpJumpHeightt_score.2 .
...
Now this is obviously not looping through your variable list. The problem with SPSS macros is that it is very difficult to loop through two lists at the same time. But in your case there is no need: there is only one actual list to loop through, and the macro can add the .1 or .2 suffix to each variable name. Try this:
DEFINE BlandAlt (vrs=!CMDEND)
!do !vr !in(!vrs)
COMPUTE diff = !concat(!vr,".1") - !concat(!vr,".2").
EXECUTE.
TTEST.....
!doend
!enddefine.
Now the macro call would look like this:
BlandAlt vrs = JumpJumpHeightcm JumpJumpHeightt_score JumpMaxChangeinAccelerationms3
JumpMaxChangeinAccelerationt_score JumpMaxAccelerationms2 JumpMaxAccelerationt_score .
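
To see what the !do loop produces, the expansion can be mimicked outside SPSS. The Python sketch below only builds the text of the COMPUTE command that the macro would generate for each stem; `expand` is a hypothetical helper, and the T-TEST part is left out because it is elided in the answer.

```python
def expand(stems):
    """Return the COMPUTE command generated for each variable stem."""
    return [f"COMPUTE diff = {stem}.1 - {stem}.2." for stem in stems]

commands = expand(["JumpJumpHeightcm", "JumpJumpHeightt_score"])
for command in commands:
    print(command)
```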
I have another solution to offer - putting this in a separate answer because it uses a completely different approach. The idea is that sometimes instead of looping through variables for analyses, you can get the same results by restructuring to long format and then analysing only once using split file. Like this:
varstocases
make scan1vars from JumpJumpHeightcm.1 JumpJumpHeightt_score.1 JumpMaxChangeinAccelerationms3.1
JumpMaxChangeinAccelerationt_score.1 JumpMaxAccelerationms2.1 JumpMaxAccelerationt_score.1 /
make scan2vars from JumpJumpHeightcm.2 JumpJumpHeightt_score.2 JumpMaxChangeinAccelerationms3.2
JumpMaxChangeinAccelerationt_score.2 JumpMaxAccelerationms2.2 JumpMaxAccelerationt_score.2 /
index=origvar(scan1vars).
* now you can do the whole process only once.
COMPUTE diff = scan1vars - scan2vars.
sort cases by origvar.
split file by origvar.
T-TEST
/TESTVAL=0
/MISSING=ANALYSIS
/VARIABLES=diff
/CRITERIA=CI(.95).
split file off.
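
The same restructure-then-split idea, sketched in Python purely for illustration. The numbers are made up, and `wide`, `long_rows` and `summary` are hypothetical names: each pair of columns becomes rows tagged with the original variable name, and the difference is then summarised once per tag.

```python
from statistics import mean, stdev

# Wide data: one dict per case, made-up values
wide = [
    {"A.1": 10.0, "A.2": 9.0, "B.1": 5.0, "B.2": 7.0},
    {"A.1": 12.0, "A.2": 9.0, "B.1": 6.0, "B.2": 6.0},
    {"A.1": 11.0, "A.2": 9.0, "B.1": 4.0, "B.2": 8.0},
]

# Restructure to long format: one (origvar, diff) row per case and pair
long_rows = [
    (stem, case[f"{stem}.1"] - case[f"{stem}.2"])
    for case in wide
    for stem in ("A", "B")
]

# "Split file": summarise diff separately for each origvar
summary = {}
for stem in ("A", "B"):
    diffs = [d for s, d in long_rows if s == stem]
    summary[stem] = (mean(diffs), stdev(diffs))

print(summary)
```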

Power Query M loop table / lookup via a self-join

First of all, I'm new to Power Query, so I'm taking my first steps. But I need to deliver something at work so I can gain some breathing room to learn.
I have the following table (example):
Orig_Item Alt_Item
5.7 5.10
79.19 79.60
79.60 79.86
10.10
And I need to create a column that will loop the table and display the final Alt_Item. So the result would be the following:
Orig_Item Alt_Item Final_Item
5.7 5.10 5.10
79.19 79.60 79.86
79.60 79.86 79.86
10.10
Many thanks
Actually, this is far too complicated for a first Power Query experience.
If that's what you've got to do, then so be it, but you should be aware that you are starting with a quite difficult task.
Small detail: I would expect the last Final_Item to be 10.10. According to the example, the Final_Item will be null if Alt_Item is null. If that is not correct, well that would be a nice first step for you to adjust the code below accordingly.
You can create a new blank query, copy and paste this code in the Advanced Editor (replacing the default code) and adjust the Source to your table name.
let
Source = Table.Buffer(Table1),
AddedFinal_Item =
Table.AddColumn(
Source,
"Final_Item",
each if [Alt_Item] = null
then null
else List.Last(
List.Generate(
() => [Final_Item = [Alt_Item], Continue = true],
each [Continue],
each [Final_Item =
Table.First(
Table.SelectRows(
Source,
(x) => x[Orig_Item] = [Final_Item]),
[Alt_Item = "not found"]
)[Alt_Item],
Continue = Final_Item <> "not found"],
each [Final_Item])))
in
AddedFinal_Item
This code uses function List.Generate to perform the looping.
For performance reasons, the table should always be buffered in memory (Table.Buffer), before invoking List.Generate.
List.Generate is one of the most complex Power Query functions.
It requires 4 arguments, each of which is a function in itself.
In this case the first argument starts with () and the other 3 with each (it should be clear from the outline above: they are aligned).
Argument 1 defines the initial values: a record with fields Final_Item and Continue.
Argument 2 is the condition to continue: if an item is found.
Argument 3 is the actual transformation in each iteration: the Source table is searched (with Table.SelectRows) for an Orig_Item equal to the current Final_Item. This is wrapped in Table.First, which returns the first record (if any is found) and accepts a default value if nothing is found, in this case a record with field Alt_Item set to "not found". From this result the value of record field [Alt_Item] is returned, which is either the value from the first matching record or "not found" from the default value.
If the value is "not found", then Continue becomes false and the iterations will stop.
Argument 4 is the value that will be returned: Final_Item.
List.Generate returns a list of all values from each iteration. Only the last value is required, so List.Generate is wrapped in List.Last.
Final remark: actual looping is rarely required in Power Query and I think it should be avoided as much as possible. In this case, however, it is a feasible solution as you don't know in advance how many Alt_Items will be encountered.
An alternative to List.Generate is using a recursive function.
Also List.Accumulate is close to looping, but that has a fixed number of iterations.
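
The chain-following logic described above can be written down compactly outside Power Query. The Python sketch below is an illustration, not a translation of the M code: `final_item` and `lookup` are hypothetical names, items are kept as text so that "5.10" and "5.1" stay distinct, and the guard against circular references is an addition that the M code does not have.

```python
def final_item(alt, lookup):
    """Follow Orig_Item -> Alt_Item links until no further match exists."""
    if alt is None:
        return None
    seen = set()
    current = alt
    # Stop when the chain ends, hits a blank Alt_Item, or loops back on itself
    while lookup.get(current) is not None and current not in seen:
        seen.add(current)
        current = lookup[current]
    return current

table = [("5.7", "5.10"), ("79.19", "79.60"), ("79.60", "79.86"), ("10.10", None)]
lookup = {orig: alt for orig, alt in table}
result = [(orig, alt, final_item(alt, lookup)) for orig, alt in table]

for row in result:
    print(row)
```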
This can be solved simply with a self-join; the open question is how many levels of indirection you are expected to support.
Assuming just one level of indirection, no duplicates on Orig_Item, the solution is:
let
Source = #"Input Table",
SelfJoin1 = Table.NestedJoin( Source, {"Alt_Item"}, Source, {"Orig_Item"}, "_tmp_" ),
Expand1 = Table.ExpandTableColumn( SelfJoin1, "_tmp_", {"Alt_Item"}, {"_lkp_"} ),
ChkJoin1 = Table.AddColumn( Expand1, "Final_Item", each (if [_lkp_] = null then [Alt_Item] else [_lkp_]), type number)
in
ChkJoin1
This is doable with the regular UI, using Merge Queries, then Expand Column and adding a custom column.
If you want to support more than one level of indirection, turn it into a function to be called X times. For data-driven levels of indirection, you wrap the calls in a List.Generate that keeps the intermediate tables in a structured column, though that's a much more advanced level of PQ.
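
As a rough illustration of "one join per level of indirection", the Python sketch below uses a dictionary in place of the table. `join_once` and `join_levels` are hypothetical names; each call looks the current targets up in the original table once, so a chain with N links needs N - 1 calls.

```python
def join_once(current, source):
    """One level of indirection: look each current target up as an Orig_Item."""
    joined = {}
    for orig, alt in current.items():
        lkp = source.get(alt)  # None when there is no match
        joined[orig] = alt if lkp is None else lkp
    return joined

def join_levels(source, levels):
    """Apply the one-level join a fixed number of times."""
    current = dict(source)
    for _ in range(levels):
        current = join_once(current, source)
    return current

source = {"a": "b", "b": "c", "c": "d", "x": None}
one = join_levels(source, 1)
two = join_levels(source, 2)
print(one)
print(two)
```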

easier use of loops and vectors in spss to combine variables

I have a student who has gathered data in a survey online whereby each response was given a variable, rather than the variable having whatever the response was. We need a scoring algorithm which reads the statements and integrates. I can do this with IF statements per item, e.g.,
if Q1_1=1 var1=1.
if Q1_2=1 var1=2.
if Q1_3=1 var1=3.
if Q1_4=1 var1=4.
Doing this for a 200 item survey (now more like 1000) will be a drag and subject to many typos unless automated. I have no experience of vectors and loops in SPSS, but some reading suggests this is the way to approach the problem.
I would like to run if statements as something like (pseudocode):
for item=1 to 30
for response=1 to 4
if Q1_1=1 a=1.
if Q1_2=1 a=2.
if Q1_3=1 a=3.
if Q1_4=1 a=4.
compute newitem(item)=a.
next response.
next item.
I would hope this produces new variables (newitem1 to newitem30), each holding one of the 4 responses taken from its 4 corresponding original variables.
Never written serious spss code before: please advise!
This will do the job:
* creating some sample data.
data list free (",")/Item1_1 to Item1_4 Item2_1 to Item2_4 Item3_1 to Item3_4.
begin data
1,,,,,1,,,,,1,,
,1,,,1,,,,1,,,,
,,,1,,,1,,,,,1,
end data.
* now looping over the items and constructing the "NewItems".
do repeat Item1=Item1_1 to Item1_4
/Item2=Item2_1 to Item2_4
/Item3=Item3_1 to Item3_4
/Val=1 to 4.
if Item1=1 NewItem1=Val.
if Item2=1 NewItem2=Val.
if Item3=1 NewItem3=Val.
end repeat.
execute.
In this way you run all your loops simultaneously.
Note that "ItemX_1 to ItemX_4" will only work if these four variables are consecutive in the dataset. If they aren't, you have to name each of them separately - "ItemX_1 ItemX_2 ItemX_3 itemX_4".
Now if you have many such item sets, all named regularly as in the example, the following macro can shorten the process:
define !DoItems (ItemList=!cmdend)
!do !Item !in (!ItemList)
do repeat !Item=!concat(!Item,"_1") !concat(!Item,"_2") !concat(!Item,"_3") !concat(!Item,"_4")/Val=1 2 3 4.
if !item=1 !concat("New",!Item)=Val.
end repeat.
!doend
execute.
!enddefine.
* now you just have to call the macro and list all your Item names:
!DoItems ItemList=Item1 Item2 Item3.
The macro will work with any item name, as long as the variables are named ItemName_1, ItemName_2, etc.
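
The collapsing step itself is small. As a hedged illustration outside SPSS, the Python sketch below does the same thing for one case; `collapse` is a hypothetical helper and the dictionary stands in for a data row with missing indicators simply absent.

```python
def collapse(case, item, n_responses=4):
    """Return the response number whose indicator equals 1, else None."""
    value = None
    for val in range(1, n_responses + 1):
        if case.get(f"{item}_{val}") == 1:
            value = val  # like the IF command, a later match overwrites
    return value

case = {"Item1_1": 1, "Item2_3": 1, "Item3_4": 1}
new_items = {f"New{item}": collapse(case, item) for item in ("Item1", "Item2", "Item3")}
print(new_items)
```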

Reduced Survey Frequency - Salesforce Workflow

Hoping you can help me review the logic below for errors. I am looking to create a workflow that will send a survey out to end users on a reduced frequency. Basically, it will check the Account object of the Case for a field, 'Reduced Survey Frequency', which contains a # and will not send a survey until that # of days has passed since the last date set on the Contact field 'Last Survey Date'. Please review the code and let me know any recommended changes!
AND( OR(ISPICKVAL(Status,"Closed"), ISPICKVAL(Status,"PM Sent")),
OR(CONTAINS(RecordType.Name,"Portal Case"),CONTAINS(RecordType.Name,"Standard Case"),
CONTAINS(RecordType.Name,"Portal Closed"),
CONTAINS(RecordType.Name,"Standard Closed")),
NOT( Don_t_sent_survey__c )
,
OR(((TODAY()- Contact.Last_Survey_Date__c) >= Account.Reduced_Survey_Frequency__c ),Account.Reduced_Survey_Frequency__c==0,
ISBLANK(Account.Reduced_Survey_Frequency__c),
ISBLANK(Contact.Last_Survey_Date__c)
))
Thanks,
Brian H.
Personally I prefer the syntax where && and || are used instead of the AND() and OR() functions. It just reads a bit nicer to me: no need to trace so many commas or keep track of indentation in the more complex logic. But if you're more used to this Excel-like flow, go for it. In the end it has to be readable for YOU.
Also I'd consider reordering this a bit - simple checks, most likely to fail first.
The first part - irrelevant to your question
Don't use RecordType.Name because these Names can be translated to say French and it will screw your logic up for users who will select non-English as their preferred language. Use RecordType.DeveloperName, it's safer.
CONTAINS - do you really have so many record types that share this part in their name? What's wrong with normal = comparison? You could check if the formula would be more readable with CASE() statement. Or maybe flip the logic if there are say 6 rec types and you've explicitly listed 4 (this might have to be reviewed though when you add new rec. type). If you find yourself copy-pasting this block of 4 checks frequently - consider making a helper formula field with it...
The second part
ISBLANK checks could be skipped if you properly use the "treat nulls as blanks / as zeroes" setting at the bottom of the formula editor, because you're making a check like
OR(...,
Account.Reduced_Survey_Frequency__c==0,
ISBLANK(Account.Reduced_Survey_Frequency__c),
...
)
which is essentially what this setting was designed for. I'd flip it to "treat nulls as zeroes" (but that means the ISBLANK check will never "fire"). If you're not comfortable with that, you can also safely compare or subtract by using
BLANKVALUE(Account.Reduced_Survey_Frequency__c,0)
which will have a similar "treat null as zero" effect, but only in this one place.
So... I'd end up with something like this:
(ISPICKVAL(Status,'Closed') || ISPICKVAL(Status, 'PM Sent')) &&
(RecordType.DeveloperName = 'Portal_Case' ||
RecordType.DeveloperName = 'Standard_Case' ||
RecordType.DeveloperName = 'Portal_Closed' ||
RecordType.DeveloperName = 'Standard_Closed'
) &&
NOT(Don_t_sent_survey__c) &&
(Contact.Last_Survey_Date__c + Account.Reduced_Survey_Frequency__c <= TODAY())
No promises though ;)
You can easily test them by enabling debug logs. You'll see there the workflow formula together with values that are used to evaluate it.
Another option is to make a temporary formula field with same logic and observe (in a report?) where it goes true/false for mass spot check.
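
Before spot-checking, it can help to write down the expected outcome for a few boundary cases. The Python sketch below restates only the date/frequency part of the question's original OR(...) block; it is not Salesforce formula syntax, `survey_due` is a hypothetical name, and blank fields are modelled as None.

```python
from datetime import date

def survey_due(last_survey, frequency_days, today):
    """Date/frequency part of the rule, with blanks modelled as None."""
    if last_survey is None or not frequency_days:
        return True  # never surveyed, or no reduced frequency set
    return (today - last_survey).days >= frequency_days

# Boundary cases worth checking against the real workflow
cases = [
    (None, 30, date(2024, 1, 31)),              # no previous survey
    (date(2024, 1, 1), 0, date(2024, 1, 2)),    # frequency of zero
    (date(2024, 1, 1), 30, date(2024, 1, 30)),  # 29 days elapsed
    (date(2024, 1, 1), 30, date(2024, 1, 31)),  # exactly 30 days elapsed
]
for last, freq, today in cases:
    print(last, freq, today, survey_due(last, freq, today))
```

The last two cases are the ones to watch: whether the survey goes out on exactly the Nth day depends on whether the comparison is strict.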

Lua string library choices for finding and replacing text

I'm new to Lua programming, having come over from python to basically make a small addon for world of warcraft for a friend. I'm looking into various ways of finding a section of text from a rather large string of plain text. I need to extract the information from the text that I need and then process it in the usual way.
The string of text could contain anything; however, the below is what we are looking to extract and process:
-- GSL --
items = ["itemid":"qty" ,"itemid":"qty" ,"itemid":"qty" ,]
-- ENDGSL --
We want to strip the whole block of text from a potentially large block of text surrounding it, then remove the -- GSL -- and -- ENDGSL -- to be left with:
items = ["itemid":"qty …
I've looked into various methods, and can't seem to get my head around any of them.
Anyone have any suggestions on the best method to tackle this problem?
EDIT: Additional problem,
Based on the accepted answer I've changed the code slightly to the following.
function GuildShoppingList:GUILDBANKFRAME_OPENED()
-- Actions to be taken when guild bank frame is opened.
if debug == "True" then self:Print("Debug mode on, guild bank frame opened") end
gslBankTab = GetCurrentGuildBankTab()
gslBankInfo = GetGuildBankText(gslBankTab)
p1 = gslBankInfo:match('%-%- GSL %-%-%s+(.*)%s+%-%- ENDGSL %-%-')
self:Print(p1)
end
The string has now changed slightly the information we are parsing is
{itemid:qty, itemid:qty, itemid:qty, itemid:qty}
Now, this is the string that ends up in p1. I need to update the s:match call to also strip the { }, and iterate over each item:qty pair, separated by commas, so I'm left with
itemid:qty
itemid:qty
itemid:qty
itemid:qty
Then I can identify each line individually and place it where it needs to go.
try
s=[[-- GSL --
items = ["itemid":"qty" ,"itemid":"qty" ,"itemid":"qty" ,]
-- ENDGSL --]]
print(s:match('%-%- GSL %-%-%s+(.*)%s+%-%- ENDGSL %-%-'))
The key is probably that - is a pattern quantifier (it matches the shortest possible repetition), so it must be escaped as %- if you want a literal hyphen. More info on patterns is in the Lua Reference Manual, chapter 5.4.1.
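
Since the question mentions coming from Python, here is a rough Python counterpart of the Lua pattern for comparison. It is only an analogy: in Python's re syntax the hyphen needs no escaping outside a character class, and re.DOTALL is needed so that . also matches newlines, which . in a Lua pattern does by default.

```python
import re

s = '''-- GSL --
items = ["itemid":"qty" ,"itemid":"qty" ,"itemid":"qty" ,]
-- ENDGSL --'''

# Capture everything between the two marker lines
match = re.search(r"-- GSL --\s+(.*)\s+-- ENDGSL --", s, re.DOTALL)
block = match.group(1) if match else None
print(block)
```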
Edit:
To the additional problem of looping through keys of what is almost an array, you could do 2 things:
Either loop over it as a string, assuming both key and quantity are integers:
p="{1:10, 2:20, 3:30}"
for id,qty in p:gmatch('(%d+):(%d+)') do
--do something with those keys:
print(id,qty)
end
Or slightly change the string, evaluate it as a Lua table:
p="{1:10, 2:20, 3:30}"
p=p:gsub('(%d+):','[%1]=') -- replace : by = and enclose keys with []
t=loadstring('return '..p)() -- at this point, the anonymous function
-- returned by loadstring gets executed,
-- returning the wanted table
for k,v in pairs(t) do
print(k,v)
end
If the formats of keys or quantities is not simply integer, changing it in the patterns should be trivial.
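
For comparison with the gmatch loop, the same pair extraction looks like this in Python. This is a sketch under the same assumption that both item id and quantity are integers; `pairs` and `table` are just illustrative names.

```python
import re

p = "{1:10, 2:20, 3:30}"

# Each match yields one (id, qty) pair, like the two captures in gmatch
pairs = [(int(item_id), int(qty)) for item_id, qty in re.findall(r"(\d+):(\d+)", p)]
table = dict(pairs)

for item_id, qty in pairs:
    print(item_id, qty)
```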
