Variable array length and template input in C++11

I've seen tons of questions about this. Some have answers, some don't, but none seem to work for me. I have this program (somebody else wrote it) that I wish to use. However there are two problems in the constructor:
template<unsigned N>
class Enumeration {
public:
    Enumeration(const array<vector<pair<unsigned char, double>>, N>& pDistribution);
};
The problem is that I wish to run this class on user-defined input, and this input decides the value of N. But N has to be a compile-time constant twice over: (1) as the size of the array that I need to construct and pass to the constructor, and (2) as the template argument. So I am in quite a pickle.
I tried double pointers, a proxy class, and constexpr functions; none seem to work (though I may not have done it correctly, as I'm relatively new to C++).
My last resort is to do something really ugly with a many-case switch statement, but I was hoping someone here can help me out, preferably without using a compiler extension.

The class you have shown does not support N being determined at run-time. It is intended for a different purpose, for when N can be determined at compile time.
Trying to allow N to be determined at run-time in the above case is almost certainly a bad idea.
Instead, the general approach is to write a variant of your type whose outermost container is a vector rather than an array, so that its size can be determined at run time.
This will involve rewriting most of the class.
class Enumeration_Runtime {
public:
    Enumeration_Runtime(const std::vector<std::vector<std::pair<unsigned char, double>>>& pDistribution);
};
The const reference parameter might be better turned into pass-by-value, but I am unsure.
There is no easy route here, because the person who wrote Enumeration<N> wrote it to not allow N to vary at run time.
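A minimal sketch of what such a run-time variant could look like, assuming the rest of the class only needs to know how many distributions there are and to read them by index. The alias Distribution, the members size() and at(), and the helper makeUniform() are hypothetical names used for illustration; they are not part of the original Enumeration<N>.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// One distribution: a list of (symbol, probability) pairs.
using Distribution = std::vector<std::pair<unsigned char, double>>;

class Enumeration_Runtime {
public:
    // Pass-by-value plus std::move avoids a copy when the caller hands over a temporary.
    explicit Enumeration_Runtime(std::vector<Distribution> pDistribution)
        : distributions_(std::move(pDistribution)) {}

    // Plays the role of the former template parameter N, but is known only at run time.
    std::size_t size() const { return distributions_.size(); }

    // Bounds-checked read access to the i-th distribution.
    const Distribution& at(std::size_t i) const { return distributions_.at(i); }

private:
    std::vector<Distribution> distributions_;
};

// Hypothetical helper: builds n identical one-symbol distributions,
// where n could come from user input.
inline Enumeration_Runtime makeUniform(std::size_t n) {
    Distribution one;
    one.emplace_back(static_cast<unsigned char>('a'), 1.0);
    return Enumeration_Runtime(std::vector<Distribution>(n, one));
}
```

Every place where the original class used N would have to query size() instead, which is why most of the class needs rewriting.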


How can I parametrize a callback function that I submit to an external library?

Say I have an external library that computes the optima, say minima, of a given function. Say its headers give me a function
double[] minimizer(ObjFun f)
where the headers define
typedef double (*ObjFun)(double x[])
and "minimizer" returns the minima of the function f of, say, a two dimensional vector x.
Now, I want to use this to minimize a parameterized function. I don't know how to express this in code exactly, but say if I am minimizing quadratic forms (just a silly example, I know these have closed form minima)
double quadraticForm(double x[]) {
    return x[0]*x[0]*q11 + 2*x[0]*x[1]*q12 + x[1]*x[1]*q22;
}
which is parameterized by the constants (q11, q12, q22). I want to write code where the user can input (q11, q12, q22) at runtime, I can generate a function to give to the library as a callback, and return the optima.
What is the recommended way to do this in C?
I am rusty with C, so asking about both feasibility and best practices. Really I am trying to solve this using C/Cython code. I was using python bindings to the library so far and using "inner functions" it was really obvious how to do this in python:
def getFunction(q11, q12, q22):
    def f(x):
        return x[0]*x[0]*q11 + 2*x[0]*x[1]*q12 + x[1]*x[1]*q22
    return f
# now submit getFunction(<user params>) to the library
I am trying to figure out the C construct so that I can be better informed in creating a Cython equivalent.
The header defines the prototype of a function which can be used as a callback. I am assuming that you can't/won't change that header.
If your function has more parameters, the library's call cannot fill them in.
Passing such a function as the callback would therefore lead to undefined behaviour or bogus parameter values, so a function with additional parameters cannot be given as the callback.
The above means you need to drop the idea of "parameterizing" your function.
Your actual goal is to somehow allow the constants/coefficients to be changed during runtime.
Find a different way of doing that. Think of "dynamic configuration" instead of "parameterizing".
I.e. the function does not always expect those values at each call. It just has access to them.
(This suggests the configuration values are less often changed than the function is called, but does not require it.)
How:
I can only think of one simple way, and it is pretty ugly and vulnerable (e.g. to race conditions, concurrent access, reentrancy; you name it, it will hurt you ...):
Introduce a set of global variables, or better one struct-variable, for readability. (See recommendation below for "file-global" instead of "global".)
Set them at runtime to the desired values, using a separate function.
Initialise them to meaningful defaults, in case they never get written.
Read them at the start of the minimizing callback function.
Recommendation: Have everything (the minimizing function, the configuration variable and the function which sets the configuration at runtime) in one code file and make the configuration variable(s) static (i.e. restrict access to them to this code file).
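The steps above can be sketched as follows. This is only an illustration of the file-static configuration idea, written so that it should compile as C as well as C++; the names QuadraticConfig, g_config and setQuadraticConfig are made up for the example, and the ObjFun typedef is copied from the question rather than from any real library header. It is not thread-safe or reentrant, exactly as warned above.

```cpp
/* Callback signature as given in the question's header. */
typedef double (*ObjFun)(double x[]);

/* Configuration shared between the setter and the callback. */
typedef struct {
    double q11;
    double q12;
    double q22;
} QuadraticConfig;

/* "static" restricts access to this code file; initialised to meaningful defaults. */
static QuadraticConfig g_config = {1.0, 0.0, 1.0};

/* Called at runtime, before the minimizer runs, to set the coefficients. */
void setQuadraticConfig(double q11, double q12, double q22) {
    g_config.q11 = q11;
    g_config.q12 = q12;
    g_config.q22 = q22;
}

/* Matches ObjFun exactly: no extra parameters, the coefficients come from g_config. */
double quadraticForm(double x[]) {
    const QuadraticConfig c = g_config; /* read the configuration once, at the start */
    return x[0] * x[0] * c.q11 + 2 * x[0] * x[1] * c.q12 + x[1] * x[1] * c.q22;
}
```

The caller would first call setQuadraticConfig with the user's values and then hand quadraticForm to the library as the ObjFun callback.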
Note:
The answer itself is only the analysis of why you should not try extra parameters.
The proposed method is not considered part of the answer; it is simple rather than good.
I invite more holistic answers that propose a safer implementation.

Sorting and managing numerous variables

My project has classes which, unavoidably, contain hundreds upon hundreds of variables that I'm always having to keep straight. For example, I'm always having to keep track of specific kinds of variables for a recurring set of "items" that occur inside of a class, where placing those variables between multiple classes would cause a lot of confusion.
How do I better sort my variables to keep from going crazy, especially when it comes time to save my data?
Am I missing something? Actionscript is an Object Oriented language, so you might have hundreds of variables, but unless you've somehow treated it like a grab bag and dumped it all in one place, everything should be to hand. Without knowing what all you're keeping track of, it's hard to give concrete advice, but here's an example from a current project I'm working on, which is a platform for building pre-employment assessments.
The basic unit is a Question. A Question has a stem, text that can go in the status bar, a collection of answers, and a collection of measures of things we're tracking about what the user does in that particular type of question.
The measures are, again, their own type of object, and come in two "flavors": one that is used to track a time limit and one that isn't. The measure has a name (so we know where to write back to the database) and a value (which tells us what). Timed ones also have a property for the time limit.
When we need to time the question, we hand that measure to yet another object that counts the time down and a separate object that displays the time (if appropriate for the situation). The answers, known as distractors, have a label and a value that they can impart to the appropriate measure based on the user selection. For example, if a user selects "d", its value, "4" is transferred to the measure that stores the user's selection.
Once the user submits his answer, we loop through all the measures for the question and send those to the database. If those were not treated as a collection (in this case, a Vector), we'd have to know exactly what specific measures are being stored for each question and each question would have a very different structure that we'd have to dig through. So if looping through collections is your issue, I think you should revisit that idea. It saves a lot of code and is FAR more efficient than "var1", "var2", "var3."
If the part you think is unwieldy is the type checking you have to do because literally anything could be in there, then Vector could be a good solution for you as long as you're using at least Flash Player 10.
So, in summary:
When you have a lot of related properties, write a Class that keeps all of those related bits and pieces together (like my Question).
When objects have 0-n "things" that are all the same or very similar, use a collection of some sort, such as an Array or Vector, to allow you to iterate through them as a group and perform the same operation on each (for example, each Question is part of a larger grouping that allows each question to be presented in turn, and each question has a collection of distractors and another of measures).
These two concepts, used together, should help keep your information tidy and organized.
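To make the two ideas concrete, here is a rough sketch of the Question/measure structure described above. It is written in C++ rather than ActionScript, and all names (Measure, Question, collectMeasures) are illustrative assumptions, not the actual project's code. Building a list of strings stands in for writing to the database.

```cpp
#include <string>
#include <vector>

// A measure has a name (where to write it back) and a value (what to write).
struct Measure {
    std::string name;
    int value;
};

// Related properties live together in one class; the 0-n measures are a collection.
struct Question {
    std::string stem;
    std::vector<Measure> measures;
};

// Loops over every measure of a question without knowing which ones exist.
std::vector<std::string> collectMeasures(const Question& q) {
    std::vector<std::string> rows;
    for (const Measure& m : q.measures) {
        rows.push_back(m.name + "=" + std::to_string(m.value));
    }
    return rows;
}
```

Because the measures are a collection, adding a new kind of measure to a question does not change the code that saves them.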
While I'm certain there are numerous ways of keeping arrays straight, I have found a method that works well for me. Best of all, it collapses large amounts of information into a handful of arrays that I can parse to an XML file or other storage method. I call this method my "indexed array system".
There are actually multiple ways to do this: creating a handful of 1-dimensional arrays, or creating 2-dimensional (or higher) array(s). Both work equally well, so choose the one that works best for your code. I'm only going to show the 1-dimensional method here. Those of you who are familiar with arrays can probably figure out how to rewrite this to use higher dimensional arrays.
I use Actionscript 3, but this approach should work with almost any programming or scripting language.
In this example, I'm trying to keep various "properties" of different "activities" straight. In this case, we'll say these properties are Level, High Score, and Play Count. We'll call the activities Pinball, Word Search, Maze, and Memory.
This method involves creating multiple arrays, one for each property, and creating constants that hold the integer "key" used for each activity.
We'll start by creating the constants, as integers. Constants work for this, because we never change them after compile. The value we put into each constant is the index the corresponding data will always be stored at in the arrays.
const pinball:int = 0;
const wordsearch:int = 1;
const maze:int = 2;
const memory:int = 3;
Now, we create the arrays. Remember, arrays start counting from zero. Since we want to be able to modify the values, this should be a regular variable.
Note, I am constructing the array to be the specific length we need, with the default value for the desired data type in each slot. I've used all integers here, but you can use just about any data type you need.
var highscore:Array = [0, 0, 0, 0];
var level:Array = [0, 0, 0, 0];
var playcount:Array = [0, 0, 0, 0];
So, we have a consistent "address" for each property, and we only had to create four constants, and three arrays, instead of 12 variables.
Now we need to create the functions to read and write to the arrays using this system. This is where the real beauty of the system comes in. Be sure this function is written in public scope if you want to read/write the arrays from outside this class.
To create the function that gets data from the arrays, we need two arguments: the name of the activity and the name of the property. We also want to set up this function to return a value of any type.
GOTCHA WARNING: In Actionscript 3, this won't work in static classes or functions, as it relies on the "this" keyword.
public function fetchData(act:String, prop:String):*
{
    var r:*;
    r = this[prop][this[act]];
    return r;
}
That queer bit of code, r = this[prop][this[act]], simply uses the provided strings "act" and "prop" as the names of the constant and array, and sets the resulting value to r. Thus, if you feed the function the parameters ("maze", "highscore"), that code will essentially act like r = highscore[2] (remember, this[act] returns the integer value assigned to it.)
The writing method works essentially the same way, except we need one additional argument, the data to be written. This argument needs to be able to accept any data type.
GOTCHA WARNING: One significant drawback to this system in strictly typed languages is that you must remember the data type for the array you're writing to. The compiler cannot catch these type errors, so your program will simply throw a fatal error if it tries to write the wrong value type.
One clever way around this is to create different functions for different data types, so passing the wrong data type in an argument will trigger a compile-time error.
public function writeData(act:String, prop:String, val:*):void
{
    this[prop][this[act]] = val;
}
Now, we just have one additional problem. What happens if we pass an activity or property name that doesn't exist? To protect against this, we just need one more function.
This function will validate a provided constant or variable key by attempting to access it, and catching the resulting fatal error, returning false instead. If the key is valid, it will return true.
function validateName(ID:String):Boolean
{
    var checkthis:*;
    var r:Boolean = true;
    try
    {
        checkthis = this[ID];
    }
    catch (error:ReferenceError)
    {
        r = false;
    }
    return r;
}
Now, we just need to adjust our other two functions to take advantage of this. We'll wrap the function's code inside an if statement.
If one of the keys is invalid, the function will do nothing - it will fail silently. To get around this, just put a trace (a.k.a. print) statement or a non-fatal error in the else construct.
public function fetchData(act:String, prop:String):*
{
    var r:*;
    if(validateName(act) && validateName(prop))
    {
        r = this[prop][this[act]];
        return r;
    }
}
public function writeData(act:String, prop:String, val:*):void
{
    if(validateName(act) && validateName(prop))
    {
        this[prop][this[act]] = val;
    }
}
Now, to use these functions, you simply need to use one line of code each. For the example, we'll say we have a text object in the GUI that shows the high score, called txtHighScore. I've omitted the necessary typecasting for the sake of the example.
//Get the high score.
txtHighScore.text = fetchData("maze", "highscore");
//Write the new high score.
writeData("maze", "highscore", txtHighScore.text);
I hope y'all will find this tutorial useful in sorting and managing your variables.
(Afternote: You can probably do something similar with dictionaries or databases, but I prefer the flexibility with this method.)
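For comparison, here is a sketch of the same "one array per property, one constant index per activity" layout in C++. It swaps the string lookup for an enum, so a misspelled activity or property name becomes a compile-time error instead of needing a run-time validation function. The names Activity, ActivityCount and ActivityStats are assumptions made for this example.

```cpp
#include <array>
#include <cstddef>

// The constant "keys": each activity's fixed index into every property array.
enum Activity : std::size_t { Pinball = 0, WordSearch, Maze, Memory, ActivityCount };

// One array per property, each sized to the number of activities and zero-filled.
struct ActivityStats {
    std::array<int, ActivityCount> highscore{};
    std::array<int, ActivityCount> level{};
    std::array<int, ActivityCount> playcount{};
};
```

Reading and writing then look like stats.highscore[Maze] = 1200; with no strings involved. The trade-off is that names can no longer be chosen from run-time strings.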

Why is it a bad idea to have an Object[] array?

I was explaining to a friend a few days ago the concept of inheritance and containers.
He has very little programming knowledge so it was really just a friendly chat.
During the conversation he came to me with a question that I just couldn't answer.
"Why can't you just have an array of the top-level class, and add anything to it?"
I know this is a bad idea, having been told so before by someone far smarter, but for the life of me I couldn't remember why.
I mean, we do it all the time with inheritance.
Say we have class Animal, which is the parent of Cat and Dog. If we need a container of both of these, we make the array of type Animal.
So let's say we didn't have that inheritance link; couldn't we just use the base Object class and have everything in the one container?
No specific programming language.
Syntactically, there is no problem with this. By declaring an array of a specific type, you are giving implicit information about the contents of that array. You could well declare a container of Object instances, but it means you lose all the type information of the original class at compile-time.
It also means that each time you get an object out of the array at runtime, the only field instances and methods you know exist are the fields/methods of Object (which arguably is a compile time problem). To use any of the fields and methods of more specific subclasses of the object, you'd have to cast.
Alternatively, to find out the specific class at runtime you'd have to use features like reflection which are overkill for the majority of cases.
When you take elements out of the container you want to have some guarantees as to what can be done with them. If all elements of the container are returned as instances of Animal (remember here that instances of Dog are also instances of Animal) then you know that they can do all the things that Animals can do (which is more things than what all Objects can do).
Maybe, we do it in programming for the same reason as in Biology? Reptiles and Whales are animals, but they are quite different.
It depends on the situation, but without context, it's definitely okay in most (if not all) object-oriented languages to have an array of a base type (that is, as long as they follow all the substitution principles) containing various instances of different derived types.
Object arrays exist in certain cases in most languages. The problem is that whenever you want to use the elements, you need to remember what type they were and keep casting them.
It also makes the code very horrible to follow and even more horrible to extend, not to mention error prone.
Plant myplant = new Plant();
listOfAnimals.Add(myplant);
would work if the list's element type is Object, but you'd get a compile-time error if it was Animal.
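The difference can be sketched in C++, which has no built-in universal Object class, so a hypothetical empty base class Object stands in for it here. All class and function names are invented for the illustration.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a universal base class.
struct Object { virtual ~Object() = default; };
struct Animal : Object { virtual std::string speak() const = 0; };
struct Dog : Animal { std::string speak() const override { return "Woof"; } };
struct Cat : Animal { std::string speak() const override { return "Meow"; } };
struct Plant : Object {};

// Typed container: every element is guaranteed to be able to speak().
std::string chorusTyped(const std::vector<std::unique_ptr<Animal>>& animals) {
    std::string out;
    for (const auto& a : animals) out += a->speak();
    return out;
}

// Untyped container: each element must be cast and checked at run time.
std::string chorusUntyped(const std::vector<std::unique_ptr<Object>>& objects) {
    std::string out;
    for (const auto& o : objects) {
        if (const Animal* a = dynamic_cast<const Animal*>(o.get())) {
            out += a->speak();
        }
        // Anything that is not an Animal (e.g. a Plant) is silently skipped here.
    }
    return out;
}
```

With the typed vector, trying to add a Plant fails to compile. With the Object vector it compiles, and the mistake only shows up (or is silently ignored) at run time.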

Specific functions vs many Arguments vs context dependent

An Example
Suppose we have a text to write that can be converted to uppercase or lowercase and can be printed at the left, center or right.
Specific case implementation (too many functions)
void writeInUpperCaseAndCentered(char *str) { /* ... */ }
void writeInLowerCaseAndCentered(char *str) { /* ... */ }
void writeInUpperCaseAndLeft(char *str) { /* ... */ }
and so on...
vs
Many Argument function (bad readability and even hard to code without a nice autocompletion IDE)
void write(char *str, int toUpper, int centered) { /* ... */ }
vs
Context dependent (hard to reuse, hard to code, use of ugly globals, and sometimes even impossible to "detect" a context)
void writeComplex(char *str)
{
    // analyze str and perhaps some global variables and
    // (under who knows what rules) put it center/left/right and upper/lowercase
}
And perhaps there are other options (which are welcome).
The question is:
Is there any good practice or experience/academic advice for this (recurrent) trilemma?
EDIT:
What I usually do is combine the "specific case" implementation with an internal (I mean not in the header) general many-argument function, implementing only the cases actually used and hiding the ugly code, but I don't know if there is a better way. This kind of thing makes me realize why OOP was invented.
I'd avoid your first option because, as you say, the number of functions you end up having to implement (though possibly only as macros) can grow out of control. The count doubles when you decide to add italic support, and doubles again for underline.
I'd probably avoid the second option as well. Again, consider what happens when you find it necessary to add support for italics or underlines. Now you need to add another parameter to the function, find all of the places where you called the function, and update those calls. In short, annoying, though once again you could probably simplify the process with appropriate use of macros.
That leaves the third option. You can actually get some of the benefits of the other alternatives with this using bitflags. For example
#define WRITE_FORMAT_LEFT 1
#define WRITE_FORMAT_RIGHT 2
#define WRITE_FORMAT_CENTER 4
#define WRITE_FORMAT_BOLD 8
#define WRITE_FORMAT_ITALIC 16
....
void write(char *string, unsigned int format)
{
    if (format & WRITE_FORMAT_LEFT)
    {
        // write left
    }
    ...
}
EDIT: To answer Greg S.
I think the biggest improvement is that if I decide, at this point, to add support for underlined text, it takes two steps:
Add #define WRITE_FORMAT_UNDERLINE 32 to the header
Add the support for underlines in write().
At this point I can call write(..., ... | WRITE_FORMAT_UNDERLINE) wherever I like. More to the point, I don't need to modify pre-existing calls to write, which I would have to do if I added a parameter to its signature.
Another potential benefit is that it allows you do something like the following:
#define WRITE_ALERT_FORMAT (WRITE_FORMAT_CENTER | \
WRITE_FORMAT_BOLD | \
WRITE_FORMAT_ITALIC)
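A small self-contained sketch of the bitflag approach, for illustration only. It uses constexpr constants instead of #define, handles only alignment plus a hypothetical WRITE_FORMAT_UPPER flag (not in the list above), and returns the padded string instead of printing it so the result is easy to inspect. The function name formatLine and the width parameter are assumptions.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

constexpr unsigned WRITE_FORMAT_LEFT   = 1;
constexpr unsigned WRITE_FORMAT_RIGHT  = 2;
constexpr unsigned WRITE_FORMAT_CENTER = 4;
constexpr unsigned WRITE_FORMAT_UPPER  = 64; // hypothetical extra flag for this sketch

// Returns text padded with spaces to the given width according to the flags.
std::string formatLine(std::string text, unsigned format, std::size_t width) {
    if (format & WRITE_FORMAT_UPPER) {
        for (char& c : text) {
            c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
        }
    }
    if (text.size() >= width) return text; // nothing to pad
    const std::size_t pad = width - text.size();
    if (format & WRITE_FORMAT_RIGHT) return std::string(pad, ' ') + text;
    if (format & WRITE_FORMAT_CENTER) {
        return std::string(pad / 2, ' ') + text + std::string(pad - pad / 2, ' ');
    }
    return text + std::string(pad, ' '); // left alignment is the fallback
}
```

Adding a new flag later means adding one constant and one branch; existing calls such as formatLine("ab", WRITE_FORMAT_RIGHT, 4) keep compiling unchanged.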
I prefer the argument way.
Because there's going to be some code that all the different scenarios need to use. Making a function out of each scenario will produce code duplication, which is bad.
Instead of using an argument for each different case (toUpper, centered etc..), use a struct. If you need to add more cases then you only need to alter the struct:
typedef struct {
int toUpper;
int centered;
// etc...
} cases;
void write(char *str, cases c) { /* ... */ }
I'd go for a combination of methods 1 and 2.
Code a method (A) that has all the arguments you need/can think of right now and a "bare" version (B) with no extra arguments. This version can call the first method with the default values. If your language supports it add default arguments. I'd also recommend that you use meaningful names for your arguments and, where possible, enumerations rather than magic numbers or a series of true/false flags. This will make it far easier to read your code and what values are actually being passed without having to look up the method definition.
This gives you a limited set of methods to maintain and 90% of your usages will be the basic method.
If you need to extend the functionality later add a new method with the new arguments and modify (A) to call this. You might want to modify (B) to call this as well, but it's not necessary.
I've run into exactly this situation a number of times -- my preference is none of the above, but instead to use a single formatter object. I can supply it with the number of arguments necessary to specify a particular format.
One major advantage of this is that I can create objects that specify logical formats instead of physical formats. This allows, for example, something like:
Format title = {upper_case, centered, bold};
Format body = {lower_case, left, normal};
write(title, "This is the title");
write(body, "This is some plain text");
Decoupling the logical format from the physical format gives you roughly the same kind of capabilities as a style sheet. If you want to change all your titles from italic to bold-face, change your body style from left justified to fully justified, etc., it becomes relatively easy to do that. With your current code, you're likely to end up searching through all your code and examining "by hand" to figure out whether a particular lower-case, left-justified item is body-text that you want to re-format, or a foot-note that you want to leave alone...
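One way the formatter-object idea might be fleshed out, as a sketch under assumptions: the enumerator names follow the snippet above, but the struct layout, the render function, its width parameter, and the choice to show bold as surrounding asterisks are all invented for illustration.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

enum Case { upper_case, lower_case };
enum Align { left, centered, right };
enum Weight { normal, bold };

// A logical style: callers name the style, not the individual physical settings.
struct Format {
    Case letterCase;
    Align align;
    Weight weight;
};

// Applies a Format to text and pads the result to width (placeholder rendering).
std::string render(const Format& f, std::string text, std::size_t width) {
    for (char& c : text) {
        const unsigned char u = static_cast<unsigned char>(c);
        c = static_cast<char>(f.letterCase == upper_case ? std::toupper(u) : std::tolower(u));
    }
    if (f.weight == bold) text = "*" + text + "*"; // stand-in for real bold output
    if (text.size() >= width) return text;
    const std::size_t pad = width - text.size();
    if (f.align == right) return std::string(pad, ' ') + text;
    if (f.align == centered) {
        return std::string(pad / 2, ' ') + text + std::string(pad - pad / 2, ' ');
    }
    return text + std::string(pad, ' ');
}
```

Changing how every title looks then means editing the one Format object that defines "title", not every call site.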
As you already mentioned, one striking point is readability: writeInUpperCaseAndCentered("Foobar!") is much easier to understand than write("Foobar!", true, true), although you could eliminate that problem by using enumerations. On the other hand, having arguments avoids awkward constructions like:
if(foo)
    writeInUpperCaseAndCentered("Foobar!");
else if(bar)
    writeInLowerCaseAndCentered("Foobar!");
else
    ...
In my humble opinion, this is a very strong argument (no pun intended) for the argument way.
I suggest more cohesive functions as opposed to superfunctions that can do all kinds of things unless a superfunction is really called for (printf would have been quite awkward if it only printed one type at a time). Signature redundancy should generally not be considered redundant code. Technically speaking it is more code, but you should focus more on eliminating logical redundancies in your code. The result is code that's much easier to maintain with very concise, well-defined behavior. Think of this as the ideal when it seems redundant to write/use multiple functions.

GCC function attributes vs caching

I have one costly function that gets called many times and there is a very limited set of possible values for the parameter.
The function's return code depends only on its arguments, so the obvious way to speed things up is to keep a static cache within the function for possible arguments and corresponding return codes, so for every combination of the parameters, the costly operation will be performed only once.
I always use this approach in such situations and it works fine but it just occurred to me that GCC function attributes const or pure probably can help me with this.
Does anybody have experience with this? How does GCC use the pure and const attributes: only at compile time, or at runtime as well?
Can I rely on GCC to be smart enough to call a function, declared as
int foo(int) __attribute__ ((pure))
just once for the same parameter value, or is there no guarantee whatsoever, so that I had better stick to the caching approach?
EDIT: My question is not about caching/memoization/lookup tables, but about GCC function attributes.
I think you are confusing the GCC pure attribute with memoization.
The GCC pure attribute allows the compiler to reduce the number of times the function is called in certain circumstances (for example, by treating repeated calls with the same arguments as a common subexpression, or by moving a call out of a loop). However, it makes no guarantee that it will do so; it does so only if it thinks it's appropriate.
What you appear to be looking for is memoization of your function. Memoization is an optimization where calculations for the same input should not be repeated. Instead the previous result should be returned. The GCC pure attribute does not make a function work in this way. You would have to hand implement this.
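A hand-written memoization wrapper could look roughly like this. The function costly is a hypothetical stand-in for the real expensive function, and the counter g_costlyCalls exists only to make the caching visible. The static cache is not thread-safe as written.

```cpp
#include <map>

// Counts how often the expensive function really runs (for demonstration only).
static int g_costlyCalls = 0;

// Hypothetical stand-in for the costly function; the result depends only on x.
int costly(int x) {
    ++g_costlyCalls;
    return x * x + 1;
}

// Memoized wrapper: each distinct argument is computed once, then served from the cache.
int cachedCostly(int x) {
    static std::map<int, int> cache;
    const std::map<int, int>::const_iterator it = cache.find(x);
    if (it != cache.end()) {
        return it->second; // cache hit: skip the costly call
    }
    const int result = costly(x);
    cache.emplace(x, result);
    return result;
}
```

Unlike the pure attribute, this guarantees at run time that costly is evaluated at most once per distinct argument.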
I have one costly function that gets called many times and there is very limited set of possible values for the parameter.
Why not use a static constant map then (the arguments can be hashed to generate a key, with the return code as the value)?
This sounds like it might be solved with a template function. If all of the parameters and return values are known at compile-time, you could perhaps generate a template instance of the function for each possible parameter. Essentially you'd be calling a different instance of the function for each possible parameter. Not sure it would be any easier than the static cache you've already implemented, but might be worth exploring.
Check out template metaprogramming. The concepts are similar to 'memoization', suggested by JaredPar, even using the same introductory example of a factorial function. It might be appropriate to say that these kinds of templates are compile-time implementations of memoization.
I don't like to reopen old threads, but there was a particularly offensive comment here:
"templates are for dealing with different types, rather than different values of the same type"
Now, take a simple template factorial implementation:
template<int n> struct Factorial {
static const int value = n * Factorial<n-1>::value;
};
template<> struct Factorial<0> {
static const int value = 1;
};
The template parameter here is an integer, not a typename.
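To show that the value really is available at compile time, the template can be checked with static_assert, as sketched below. The constexpr function factorial is an added C++11 alternative for comparison and is not part of the original answer.

```cpp
// The value-parameterized template from above, repeated so this block stands alone.
template <int n> struct Factorial {
    static const int value = n * Factorial<n - 1>::value;
};
template <> struct Factorial<0> {
    static const int value = 1;
};

// C++11 alternative (an addition for comparison): a constexpr function.
constexpr int factorial(int n) {
    return n <= 0 ? 1 : n * factorial(n - 1);
}

// Both checks are evaluated by the compiler; nothing is computed at run time.
static_assert(Factorial<5>::value == 120, "Factorial<5> is computed at compile time");
static_assert(factorial(5) == 120, "factorial(5) is usable in a constant expression");
```

Either form only helps when the arguments are compile-time constants; for arguments known only at run time, a cache is still needed.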
