How to call a function in Nim based on a number without ifs?

So I want to write a procedure to which I'll pass a number and it will call some other function based on the number I gave it. It could work something like this:
#example procs
proc a(): proc =
  echo 'a'
proc b(): proc =
  echo 'b'
proc callProcedure(x:int): proc =
  if x>5:
    a()
  else:
    b()
callProcedure(10) #calls function a
callProcedure(1) #calls function b
However, I want to do this without ifs. For example, I was looking into somehow storing a reference to the functions in an array and calling it via its index, but I could not get that to work and I've got the suspicion it cannot be done (in which case I would like to understand why). I am also interested in other ways of achieving the same result. I've got multiple reasons why I want to do this:
a) I've read ifs are slow(er-ish), among other things (like that novices should leave optimizing them up to the compiler, so I know I don't have to even think about it, but I would like to try something like this to learn nonetheless)
b) I think it opens up some possibilities, like being very dynamic in the order of and which procedures will be called
c) just out of curiosity
Some of the things I've looked at in relation to this but could not manage to find/learn what I need from them:
Nim return custom tuple containing proc
Nim stored procedure reference in tuple
PS: I'm very new to this so if I missed something obvious feel free to point me in the right direction.
Thanks in advance!

You can store procedures in an array just fine, but they have to be of the same type:
proc a() =
  echo "a"
proc b() =
  echo "b"
let procs = [a, b]
proc callProcedure(x: int) =
  procs[int(x <= 5)]()
callProcedure(10) # Calls function a
callProcedure(1) # Calls function b
You don't need the "proc" return type; a procedure that returns nothing simply omits it. Also, this still performs a comparison (x <= 5) to compute the index, so the test itself has not gone away. And ifs aren't slow.
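For readers more familiar with Python, the same table-of-procedures idea can be sketched as below. This is only an illustration of the technique, not Nim code; the names PROCS and call_procedure are made up for the sketch, and the functions return their letter instead of echoing it so the result is easy to check.

```python
def a():
    return "a"


def b():
    return "b"


# Index 0 -> a, index 1 -> b, mirroring the array in the Nim answer.
PROCS = [a, b]


def call_procedure(x):
    # int(x <= 5) is 0 when x > 5 and 1 otherwise, so no if statement is needed.
    return PROCS[int(x <= 5)]()


print(call_procedure(10))  # selects a
print(call_procedure(1))   # selects b
```

As in the Nim version, every entry in the table has to be callable in the same way (here: no arguments).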

What is the difference between indexing an array and a dictionary?

From my understanding, an array is a simple table of values, such as local t = {"a","b","c"}, and a dictionary is a table of objects, such as local t = {a = 1, b = 2, c = 3} Of course, let me know if I'm wrong in either or both cases.
Anyways, my question lies in how we index the entries in either of these cases. For example, let's say I have the following code:
local t = {"TestEntry"}
print(t["TestEntry"])
Of course, this prints nil. However, when we use a dictionary the same way:
local t = {TestEntry = 1}
print(t["TestEntry"])
This, naturally, prints 1. My question is, why does it work this way for dictionaries, but not arrays?
Finally, I'd like to address the issue that led me to this question. Let's say, before I want to run a chunk of code, I need to see if a specific value is inside a table. It would be convenient if I could just check if it is in the table with table["GivenEntry"], but, as we have seen, this would only work if the entry in the table is actually an object. In my specific case, I am simply using an array, so it is not an object.
Thus, I had to resort to using a for loop to check the table:
local t = {"TestEntry1","TestEntry2"}
for i,v in pairs(t) do
  if v == "TestEntry1" then
    --do code
  end
end
After doing this, it almost seemed as if it would be easier to create a silly dictionary, like:
local t = {TestEntry1 = "TestEntry1"}
because then, I could simply run t["TestEntry1"], and I wouldn't have to worry about having an empty table (because then the for loop would not run). Are there ramifications to creating a dictionary for such purposes? Is it less efficient in general?
Your input is appreciated,
Thank you.
In Lua both arrays and dictionaries are the same type (the table). local t = {"TestEntry"} is essentially short for local t = {[1] = "TestEntry"} (the brackets are required for a numeric key, and you would access the value with t[1]). That is why t["TestEntry"] is nil in your first example: "TestEntry" is stored as a value under key 1, not as a key.
So the options for checking whether "TestEntry1" is in the table are the ones you have written. A dictionary takes more memory and, depending on how many values you have, may take a while to create, but looking up a key takes roughly constant time. Looping through the table takes longer the more items it holds, so it is a tradeoff you have to decide on.
There are faster ways to search an array however (e.g. if it is sorted: https://en.wikipedia.org/wiki/Binary_search_algorithm)
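The three lookup strategies mentioned above (linear scan, key lookup, binary search on sorted data) can be compared in a short Python sketch. It is an analogy rather than Lua code: a Python set stands in for a Lua table used as a dictionary, and the helper names in_list and in_sorted are made up for the example.

```python
from bisect import bisect_left

entries = ["TestEntry1", "TestEntry2"]


def in_list(items, value):
    # Linear scan: time grows with the number of items.
    for item in items:
        if item == value:
            return True
    return False


# Key lookup: build the set once, then each membership test is
# roughly constant time (at the cost of extra memory).
entry_set = set(entries)


def in_sorted(sorted_items, value):
    # Binary search: requires sorted input, takes O(log n) steps.
    i = bisect_left(sorted_items, value)
    return i < len(sorted_items) and sorted_items[i] == value


sorted_entries = sorted(entries)

print(in_list(entries, "TestEntry1"))
print("TestEntry1" in entry_set)
print(in_sorted(sorted_entries, "TestEntry1"))
```

All three handle an empty collection without special cases, which addresses the worry about the for loop not running.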

Unknown identifiers for ensure block in Eiffel

So I'm new to Eiffel programming and I'm trying to learn to write postconditions in the ensure block of a feature, in particular with writing loops.
So I tried this:
feature
    -- sets the value of a particular in an array to x
    foo (a: ARRAY[INTEGER]; target_val, x: INTEGER)
        require
            valid_target: 1 <= target_val and target_val <= a.count
        do
            a[target_val] := x
        ensure
            across
                1 |..| a.count as i
            all
                across
                    1 |..| a.count as j
                all
                    i.item /= j.item implies a[i.item] /= a[j.item]
                end
            end
        end
But for some reason I get an unknown identifier for i and j. Does anyone know what causes this error and how I could fix it? Also, is there another alternative to using across ... as ... all ... end in the ensure block? Thanks so much in advance!
I don't know why you get a compilation error - I pasted your code and it compiles fine.
Incidentally, the Eiffel style guidelines say your comment should come AFTER the feature name and arguments, not before it.
As mentioned in another answer, there seem to be no issues with compilation. So, some more information may be required to figure out what's wrong: the compiler, its version, etc.
There are at least several alternatives to the example code:
Replace iteration over indexes with iteration over structures themselves:
across a as u all
across a as v all
u.target_index /= v.target_index implies u.item /= v.item
end
end
Write a helper function that will do the necessary tests and return their results as a BOOLEAN.
Add a helper function that iterates over the structure and takes a test agent as an argument, similar to
for_all_with_index (a: ARRAY [BAR]; test: FUNCTION [BAR, INTEGER, BOOLEAN]): BOOLEAN
do
Result := across a as c all test (c.item, c.target_index) end
end
and pass agents that will test items. However, even if it works nicely with a single agent, the code with nested, mutually dependent agents becomes too heavy.
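To make the helper-function alternative concrete, here is the same pairwise check written as a small Python function. It is only a sketch of the logic in the postcondition (elements at different positions must differ), not Eiffel code, and the name all_distinct is made up.

```python
def all_distinct(values):
    # For every pair of different positions i and j, the elements must differ.
    # This is the nested "across ... all" condition from the question.
    return all(
        u != v
        for i, u in enumerate(values)
        for j, v in enumerate(values)
        if i != j
    )


print(all_distinct([1, 2, 3]))
print(all_distinct([1, 2, 1]))
```

Wrapping the test in a BOOLEAN-returning helper like this keeps the ensure clause down to a single call.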

macro to run iterations from data table in existing program

I am completely unfamiliar with macros/do loops/arrays in SAS, but I have been trying to read up on them. It is not going well.
I have a dataset that has 148,176 rows, 9 columns. I want to run all 148176 combinations one by one through my program (so each row one by one) and have it spit out each result as one long list. I should have 148176 values at the end.
Before working with the macro piece, I just used macro variables so the user could input each value, like so:
%let classIin = 1;
%let classIIin = 0.8;
Now I would like to replace each number of the above %let statements with a variable from the 9 columns (each column would correspond to one of the above macro variables, there are 9 I just didn't list them all).
I started trying to write this code, but I am really confused and I know I am missing key things about this process. If anyone has some helpful video tutorials I should watch, I am happy to do that, because nothing I am finding is helping me much so far.
In the following, "AA" and "AB" are two of the column names in Work.MasterPlanList, but I'm not sure if I can call forth variables in this way.
%macro masterlist;
%do i=1 %to 148176;
Data Work.test;
Set work.MasterPlanList(firstobs=&i obs=&i);
call symputx ('classIin', AA)
call symputx ('classIIin', AB)
%end;
%mend;
Then I would theoretically call in the %macro in my code, but the other problem is that I need each variable from this list at different times in my code. Is that an issue or will my macro work by looking at row 1, go through my whole code/calculation set, spit out value 1, then go back to the beginning and look at row 2, go through the code/calc, value 2, etc. etc. etc. until 148176?
It is hard to answer without more specifics of the calculations you are doing. For example you could possibly just do all of your calculations in a data step and never use macro variables or macros.
But if you have structured your analysis for one set of parameters as a macro, then you can use the dataset to generate multiple calls to the macro, although 150K calls to a long, complex macro is quite a lot.
Say you had a macro called %MYMACRO that had 2 input parameters. And you had a SAS dataset with 2 variables with the values for those parameters. You could then use CALL EXECUTE() or other code generation methods to generate one macro call per observation.
For code generation on this scale I find that using a data step to write the code is easier to understand and debug than using CALL EXECUTE, especially if you name your dataset variables with the same names as the macro parameters.
filename code temp;
data _null_;
set my_metadata ;
file code ;
put '%mymacro(' var1= ',' var2= ')';
run;
%include code /source2;
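The code-generation idea is independent of SAS: read one row at a time and write out one macro call per row. The Python sketch below only illustrates that shape. The rows and their values are made up, and macro_call is a hypothetical helper, not part of any SAS interface.

```python
# Made-up rows standing in for observations of the parameter dataset.
rows = [
    {"var1": 1, "var2": 0.8},
    {"var1": 2, "var2": 0.5},
]


def macro_call(row, macro_name="mymacro"):
    # Build name=value pairs, using the column names as parameter names.
    args = ",".join(f"{name}={value}" for name, value in row.items())
    return f"%{macro_name}({args})"


# One generated line of code per row.
generated = [macro_call(row) for row in rows]
for line in generated:
    print(line)
```

Because the generated text is ordinary code, you can inspect it before running it, which is the debugging advantage mentioned above.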

In Lua, how should I handle a zero-based array index which comes from C?

Within C code, I have an array and a zero-based index used to lookup within it, for example:
char * names[] = {"Apple", "Banana", "Carrot"};
char * name = names[index];
From an embedded Lua script, I have access to index via a getIndex() function and would like to replicate the array lookup. Is there an agreed on "best" method for doing this, given Lua's one-based arrays?
For example, I could create a Lua array with the same contents as my C array, but this would require adding 1 when indexing:
names = {"Apple", "Banana", "Carrot"}
name = names[getIndex() + 1]
Or, I could avoid the need to add 1 by using a more complex table, but this would break things like #names:
names = {[0] = "Apple", "Banana", "Carrot"}
name = names[getIndex()]
What approach is recommended?
Edit: Thank you for the answers so far. Unfortunately the solution of adding 1 to the index within the getIndex function is not always applicable. This is because in some cases indices are "well-known" - that is, it may be documented that an index of 0 means "Apple" and so on. In that situation, should one or the other of the above solutions be preferred, or is there a better alternative?
Edit 2: Thanks again for the answers and comments, they have really helped me think about this issue. I have realized that there may be two different scenarios in which the problem occurs, and the ideal solution may be different for each.
In the first case consider, for example, an array which may differ from time to time and an index which is simply relative to the current array. Indices have no meaning outside the code. Doug Currie and RBerteig are absolutely correct: the array should be 1-based and getIndex should contain a +1. As was mentioned, this allows the code on both the C and Lua sides to be idiomatic.
The second case involves indices which have meaning, and probably an array which is always the same. An extreme example would be where names contains "Zero", "One", "Two". In this case, the expected value for each index is well-known, and I feel that making the index on the Lua side one-based is unintuitive. I believe one of the other approaches should be preferred.
Use 1-based Lua tables, and bury the + 1 inside the getIndex function.
I prefer
names = {[0] = "Apple", "Banana", "Carrot"}
name = names[getIndex()]
Some table-manipulation features (#, insert, remove, sort) are broken.
Others, such as concat(t, sep, 0) and unpack(t, 0), require an explicit starting index to run correctly:
print(table.concat(names, ',', 0)) --> Apple,Banana,Carrot
print(unpack(names, 0)) --> Apple Banana Carrot
I hate constantly having to remember that +1 to cater to Lua's default 1-based indexing style.
Your code should reflect your domain-specific indices to be more readable.
If 0-based indices fit your task well, you should use 0-based indices in Lua.
I like how array indices are implemented in Pascal: you are absolutely free to choose any range you want, e.g., array[-10..-5]of byte is absolutely OK for an array of 6 elements.
This is where Lua metamethods and metatables come in handy. Using a proxy table and a couple of metamethods, you can modify access to the table in a way that fits your need.
local names = {"Apple", "Banana", "Carrot"} -- Original Table
local _names = names -- Keep private access to the table
local names = {} -- Proxy table, used to capture all accesses to the original table
local mt = {
  __index = function (t,k)
    return _names[k+1] -- Access the original table
  end,
  __newindex = function (t,k,v)
    _names[k+1] = v -- Update original table
  end
}
setmetatable(names, mt)
So what's going on here is that the original table gets a proxy, and the proxy catches every access to the table. When the proxy is indexed, it increments the key it was given before looking in the original table, simulating a 0-based array. Here are the printed results:
print(names[0]) --> Apple
print(names[1]) --> Banana
print(names[2]) --> Carrot
print(names[3]) --> nil
names[3] = "Orange" --Add a new field to the table
print(names[3]) --> Orange
Reads and writes through the proxy behave as they would on an ordinary table, so you don't have to handle the index offset anywhere else.
EDIT: I'd like to point out that the new "names" table is merely a proxy to access the original names table. So if you queried for #names the result would be 0, because the proxy table itself holds no values. You'd need to query for #_names to get the size of the original table.
EDIT 2: As Charles Stewart pointed out in the comment below, you can add a __len metamethod to the mt table so that #names gives the correct result (Lua honors __len on tables from version 5.2 onward).
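The proxy idea carries over to other languages. Below is a rough Python analogue, offered as a sketch rather than as Lua code: a dict keyed from 1 stands in for the Lua table, and the class name ZeroBasedProxy is made up. The three special methods play the roles of __index, __newindex and __len.

```python
class ZeroBasedProxy:
    """Expose 0-based indexing over storage that is keyed from 1."""

    def __init__(self, one_based):
        # A dict keyed 1..n, standing in for a 1-based Lua table.
        self._store = one_based

    def __getitem__(self, k):
        # Shift the key by one; missing keys give None, like nil in Lua.
        return self._store.get(k + 1)

    def __setitem__(self, k, v):
        # Writes go to the underlying storage with the same shift.
        self._store[k + 1] = v

    def __len__(self):
        # Report the size of the underlying storage, not of the proxy.
        return len(self._store)


names = ZeroBasedProxy({1: "Apple", 2: "Banana", 3: "Carrot"})
print(names[0])
names[3] = "Orange"
print(names[3])
print(len(names))
```

The length has to be forwarded explicitly, for the same reason #names needs a __len metamethod in the Lua version.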
First of all, this situation is not unique to applications that mix Lua and C; you can face the same question even when using Lua only apps. To provide an example, I'm using an editor component that indexes lines starting from 0 (yes, it's C-based, but I only use its Lua interface), but the lines in the script that I edit in the editor are 1-based. So, if the user sets a breakpoint on line 3 (starting from 0 in the editor), I need to send a command to the debugger to set it on line 4 in the script (and convert back when the breakpoint is hit).
Now the suggestions.
(1) I personally dislike using [0] hack for arrays as it breaks too many things. You and Egor already listed many of them; most importantly for me it breaks # and ipairs.
(2) When using 1-based arrays I try to avoid indexing them and use iterators as much as possible (for i, v in ipairs(...) do instead of for i = 1, #array do).
(3) I also try to isolate my code that deals with these conversions; for example, if you are converting between lines in the editor to manage markers and lines in the script, then have marker2script and script2marker functions that do the conversion (even if it's simple +1 and -1 operations). You'd have something like this anyway even without +1/-1 adjustments, it would just be implicit.
(4) If you can't hide the conversion (and I agree, +1 may look ugly), then make it even more noticeable: use c2l and l2c calls that do the conversion. In my opinion it's not as ugly as +1/-1, but has the advantage of communicating the intent and also gives you an easy way to search for all the places where the conversion happens. It's very useful when you are looking for off-by-one bugs or when API changes cause updates to this logic.
Overall, I wouldn't worry about these aspects too much. I'm working on a fairly complex Lua app that wraps several 0-based C components and don't remember any issues caused by different indexing...
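Suggestion (4) amounts to two tiny named functions. A minimal sketch in Python follows; the names c2l and l2c come from the answer, while the bodies and the breakpoint example are assumptions about what such helpers would contain.

```python
def c2l(index):
    """Convert a 0-based (C-side) index to a 1-based (Lua-side) index."""
    return index + 1


def l2c(index):
    """Convert a 1-based (Lua-side) index to a 0-based (C-side) index."""
    return index - 1


# Breakpoint example from the answer: editor line 3 (0-based)
# corresponds to script line 4 (1-based).
editor_line = 3
script_line = c2l(editor_line)
print(script_line)
```

Searching the code base for c2l and l2c then finds every place where the two index bases meet.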
Why not just turn the C-array into a 1-based array as well?
char * names[] = {NULL, "Apple", "Banana", "Carrot"};
char * name = names[index];
Frankly, this will lead to some unintuitive code on the C-side, but if you insist that there must be 'well-known' indices that work in both sides, this seems to be the best option.
A cleaner solution is of course not to make those 'well-known' indices part of the interface. For example, you could use named identifiers instead of plain numbers. Enums are a nice match for this on the C side, while in Lua you could even use strings as table keys.
Another possibility is to encapsulate the table behind an interface so that the user never accesses the array directly but only via a C-function call, which can then perform arbitrarily complex index transformations. Then you only need to expose that C function in Lua and you have a clean and maintainable solution.
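The named-identifier idea can be sketched as follows, using Python in place of C and Lua. The enum Fruit and the helper name_of are made up for the illustration; in C the same role would be played by an enum, and in Lua by string keys.

```python
from enum import IntEnum


class Fruit(IntEnum):
    # The numeric values stay an implementation detail behind the names.
    APPLE = 0
    BANANA = 1
    CARROT = 2


NAMES = {
    Fruit.APPLE: "Apple",
    Fruit.BANANA: "Banana",
    Fruit.CARROT: "Carrot",
}


def name_of(fruit):
    # Callers pass a named identifier, never a bare number.
    return NAMES[fruit]


print(name_of(Fruit.BANANA))
```

Since callers never write the number itself, it no longer matters whether the underlying array starts at 0 or 1.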
Why not present your C array to Lua as userdata? The technique is described with code in PiL, section 'Userdata'; you can set the __index, __newindex, and __len metatable methods, and you can inherit from a class to provide other sequence manipulation functions as regular methods (e.g., define an array with array.remove, array.sort, array.pairs functions, which can be defined as object methods by a further tweak to __index). Doing things this way means you have no "synchronisation" issues between Lua and C, and it avoids risks that "array" tables get treated as ordinary tables resulting in off-by-one errors.
You can fix this lua-flaw by using an iterator that is aware of different index bases:
function iarray(a)
  local n = 0
  local s = #a
  if a[0] ~= nil then
    n = -1
  end
  return function()
    n = n + 1
    if n <= s then return n,a[n] end
  end
end
However, you still have to add the zeroth element manually:
Usage example:
myArray = {1,2,3,4,5}
myArray[0] = 0
for _,e in iarray(myArray) do
-- do something with element e
end

Defining variables in sas to clean up code

I'm new to SAS coming from python, java and C++. From these languages, the proper thing to do when writing/repeating large statements is to encapsulate them in a variable that is defined once and repeated several times in the code.
I.e. instead of writing the same where statement over and over each time two similar datasets are merged, I want to write:
WHERE_CONDITION_VARIABLE = 'X in (10, 100, 1000, 10000 ......100000000);
data output;
merge in1 in2;
WHERE WHERE_CONDITION_VARIABLE;
run;
data output2;
merge in3 in4;
WHERE WHERE_CONDITION_VARIABLE;
run;
Unfortunately, I haven't been able to figure out how to define a variable such as WHERE_CONDITION_VARIABLE to streamline the code. Is what I'm asking possible to do in SAS?
You can use macro variables.
You define them like this:
%let WHERE_CONDITION_VARIABLE = X in (10, 100, 1000);
And reference them like this:
&WHERE_CONDITION_VARIABLE
SAS has a lot of options for avoiding repeating code; in that way it's actually a lot like python, although the method for accomplishing it is a little different as you do have a separate compilation step (so you can't just say WHERE like you ask directly).
First off, you have the macro variable. If you're just repeating text several times, you can define it in a macro variable, like so:
%let condition=X in (1,10,100,1000);
Macro variables are treated as if they were text you had written. They do not need quotation marks or other text qualifiers, unless they are intended to contain them as legal code, ie:
%let condition=X in ('A','B','C');
would be legal, but
%let condition="X in ('A','B','C')";
would probably not be what you want (unless you want that to be evaluated as a string, anyhow).
Through macro variables, you also have the ability to generate larger amounts of code in a datastep and then include it. For example, if you have a dataset containing a list of conditions, you could apply them this way:
data conditions;
format condition $50.;
input condition $50.; /* formatted input reads the whole line; list input ($) would stop at the first blank */
datalines4;
if x = 15 then y=5;
if x = 20 then y=10;
if x = 20 and z = 5 then y=15;
if x = 20 and z = 10 then y=20;
;;;;
run;
proc sql;
select condition into :condlist separated by ' ' from conditions;
quit;
data want;
set have;
&condlist;
run;
That would take the conditions from "conditions" dataset and push it into a macro variable "&condlist". The PROC SQL call is the easiest way to get it into a macro variable, but there are others; CALL SYMPUT also can do it in a data step, or you can write it to a text file and then %include the text file as code as well. This is more commonly used in advanced programming by generating calls to a macro, with the conditions dataset providing the macro parameters; in this case you might have a macro
%macro cond(x=,y=,z=);
if x=&x and z=&z then y=&y;
%mend cond;
Then you could generate calls to cond from a dataset with just x,y,z values:
proc sql;
select cats('%cond(x=',x,',y=',y,',z=',z,')') into :condlist separated by ' ' from conditions;
quit;
and use it in the same way.
Macro programming in general is a good solution for avoiding code creep; a macro is written once and then can be run multiple times with different parameters. A macro can be anywhere from one line of code (like above) executed inside a data step, to hundreds of lines containing multiple DATA and PROC steps. Macro programming is a complex topic in and of itself, and worth reading more on.
You can also write a function in SAS. PROC FCMP (function compile) allows you to write fairly complex functions and execute them in your data step or even your PROC statements. http://www.lexjansen.com/pharmasug/2011/tu/pharmasug-2011-tu07.pdf is a good place to start with FCMP if you have 9.2; if you have 9.3, I haven't seen any papers yet (but there may be some out there) showing the newer things in FCMP. FCMP is fairly new so there are still a lot of changes in each iteration of SAS.
Here's an example of FCMP to do your condition:
proc fcmp
outlib=work.funcs.Test; /* where will the functions be saved */
function condition(x); /* declare a function returning a number */
if x in (1,10,100,1000) then return(1);
else return(0);
endsub;
quit;
data have;
do x = 1,5,10,20,100,150,1000,1500;
output;
end;
run;
options cmplib=work.funcs;
data want;
set have;
if condition(x) then output;
run;
You also have the CALL EXECUTE statement, which allows you to directly execute code from a dataset. Using the same CONDITIONS dataset:
data _null_;
set conditions end=eof;
if _n_ = 1 then call execute('data want; set have;');
call execute(condition);
if eof then call execute('run;');
run;
That would effectively construct a data step that executes immediately following your DATA _NULL_ step, with the same code as in the macro variable example. CALL EXECUTE works a little differently, so while in this example there shouldn't be any difference, there are a few issues with timing that can cause problems (or can be advantageous); which you use depends on the circumstance. Particularly for CALL EXECUTE, read up on the documentation and online papers (SUGI papers most commonly) to find out more details.
In addition to directly executing code via macro variables or CALL EXECUTE, you have a lot of other ways of performing tasks to avoid wallpaper code. For example, to more easily perform the if statements above, you might be able to use a format. Formats convert one value to another value; most commonly you might have something like 'DOLLAR6.2' which would give you $3.50 from the number 3.5. However, formats can also be used to replace if-this-then-that expressions. If there were only X and Y (and no Z conditions), then you could do this, given this conditions dataset:
data conditions;
input x y;
datalines;
1 5
2 10
3 20
4 50
5 100
;;;;
run;
data for_fmt;
set conditions;
rename x=start y=label;
fmtname='XTOY';
type='i'; *type=i means numeric informat, so numeric to numeric conversion. Informat = to numeric, Format= to character.;
run;
proc format cntlin=for_fmt;
quit;
data want;
set have;
y = input(x,XTOY.);
run;
There you have one line of code converting x to y. (Of course there is a bit of code setting up the format, but it can be separated from the main code and included in the set-up portion of your program, like a .h file in C.)
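Conceptually, the informat is a lookup table that is built once and then applied in one line. A Python dictionary gives the same shape. This sketch reuses the x/y pairs from the conditions dataset above; the name x_to_y and the choice to return None for unmapped values are assumptions of the sketch, not a description of what SAS does with unmatched input.

```python
# Set-up portion: the mapping from the conditions dataset (x -> y).
X_TO_Y = {1: 5, 2: 10, 3: 20, 4: 50, 5: 100}


def x_to_y(x):
    # One-line conversion; unmapped values give None in this sketch.
    return X_TO_Y.get(x)


have = [1, 3, 5, 9]
want = [(x, x_to_y(x)) for x in have]
print(want)
```

As with the format, the mapping lives in one place, so adding a new x/y pair does not touch the code that applies it.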
You also have hash table lookups, which are really helpful when you have more complex conversions - either 1 to many or many to 1. They work just like they sound - you load the hash table into memory and perform lookups. http://support.sas.com/rnd/base/datastep/dot/hash-getting-started.pdf is one good place to start.
Finally, one good way to avoid repeating code is to use fewer separate datasets. SAS data steps and procedures have the "BY" statement available, which means they treat each different value of the BY variable(s) as effectively a separate dataset. The variable names and lengths need to match, as it is still technically one dataset, but if you have many datasets of similar data, and want to perform the same action to each, you can perform them once with a BY statement rather than multiple times.
For example, say you had the dataset SASHELP.CARS. You might want to calculate something separately for each make of car. You could either do:
data acura;
set sashelp.cars;
if make='Acura'; /* values of MAKE in SASHELP.CARS are mixed case */
run;
data honda;
set sashelp.cars;
if make='Honda';
run;
And then run your code on each dataset separately. However, a more SASsy way to do it is to use the BY statement:
proc means data=sashelp.cars;
by make;
var mpg_city mpg_highway;
run;
Now you get a separate page for each make. You can use the BY statement in data step processing as well; you get the variables FIRST.make and LAST.make, which tell you if you're on the first record of a new MAKE or the last record of a MAKE (the record just before a change in value). These let you do things based on where you are in a dataset's BY group. For example, if first.make then counter=0; gives you a counter that is reset each time make takes a new value. The only caveat for BY groups is that you have to sort your dataset by the BY variable prior to using it (or have an index on that variable, or both). This is really helpful for analysis of bootstrap samples or other processes where you have many nearly-identical datasets and perform identical actions on them.
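Since the question comes from a Python background: itertools.groupby behaves much like BY-group processing, including the requirement that the data be sorted by the grouping key first. The rows below are made-up values, not the contents of SASHELP.CARS, and the sketch computes only a per-make mean.

```python
from itertools import groupby
from operator import itemgetter

# Made-up rows; deliberately not sorted by make.
cars = [
    {"make": "Acura", "mpg_city": 17},
    {"make": "Honda", "mpg_city": 26},
    {"make": "Acura", "mpg_city": 24},
]

# Like a BY statement, groupby only merges adjacent rows,
# so the data has to be sorted by the key beforehand.
cars_sorted = sorted(cars, key=itemgetter("make"))

mean_mpg = {}
for make, group in groupby(cars_sorted, key=itemgetter("make")):
    values = [row["mpg_city"] for row in group]
    mean_mpg[make] = sum(values) / len(values)

print(mean_mpg)
```

One pass over one sorted dataset replaces a separate subset and a separate run per make.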
I am assuming you want to put all the WHERE conditions in one bucket and then use them through an index-like structure (as in Python).
If that's the case, then you may want to have a look at the INTO clause.
With INTO you can store all of your X values in macro variables,
and then reference them whenever you want.
