How to design an easily imported global float array - c

Inside my program I am trying to define an array of floats that is globally accessible throughout the program. It is only there for development purposes, so I am trying to achieve this with the fewest possible code changes.
I do realize that this is VERY bad programming practice and style, but I am on a single-developer project and this code will only be there for a day or two, on a separate branch of the codebase.
The array needs to be accessed from 2 parts of a very large codebase, and I don't have many ways for these two sections to communicate with each other without a lot of issues.
I am considering two designs:
globalTweakingArray.h
v1
#define TWEAKING_ELEMENTS 10
float tweakingArray[TWEAKING_ELEMENTS];
v2
#define TWEAKING_ELEMENTS 10
float *tweakingArray;
// Where can I malloc this
Now my confusion comes from how these are used, what address space they are in, etc. The goal would be that a couple of C++ source files can #import "globalTweakingArray.h" and all be reading and writing the same values.
I am a bit confused about the implications of global variables here, so I am not certain what dangers come with the designs of v1 and v2. Furthermore, I am not even sure how I can malloc some space for the array in v2.
Is either of these designs a correct way to define a global array? What are the dangers that come with each?
Side note: one of the places I will be accessing this array from is Objective-C source. That shouldn't matter, right? A float is a float and a pointer is a pointer, right?

After looking into potential solutions, here is the one I ended up going for. Having the variable defined in the header also caused build errors, since every source file that includes the header gets its own definition of the array, so I now have the following set up.
Header
#define TWEAKING_ELEMENTS 10
float getTweaking(int i);
void setTweaking(int i, float val);
Implementation
float tweakingArray[TWEAKING_ELEMENTS];

float getTweaking(int i) {
    return tweakingArray[i];
}

void setTweaking(int i, float val) {
    tweakingArray[i] = val;
}
Keep in mind that if you have Objective-C code referencing this C++ code, the Objective-C code actually needs to be Objective-C++; this means the file name must end in .mm. This may introduce additional complications.
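For reference, the classic way to share a plain global array without accessor functions, and without the multiple-definition problem that a header-defined variable causes, is to declare it extern in the header and define it in exactly one source file. A minimal sketch:

// globalTweakingArray.h
#define TWEAKING_ELEMENTS 10
extern float tweakingArray[TWEAKING_ELEMENTS];

// globalTweakingArray.c (the one and only definition)
#include "globalTweakingArray.h"
float tweakingArray[TWEAKING_ELEMENTS];

Every .cpp or .mm file that includes the header then reads and writes the same array; no malloc is needed because the definition reserves static storage.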

Related

Sharing common data in C for scientific computing

In assignments where I have been forced to use C for scientific computing (rather than say, C++, my default choice), I often come across the following pattern:
There's usually a set of data that is commonly needed by many functions. For example, in solving differential equations, I would need to know number of points, the equations' parameters, etc:
struct parameters {
    unsigned int num_x;
    double length_x;
    // so forth
};
I often end up having to combine them in a structure and then end up repeating myself in nearly every function: void f(struct parameters* p, ...). This wouldn't be so bad if it made sense for every function to have it as part of its interface, but it is not always the case, and I dislike the repetition anyway.
Furthermore, it is not always meaningful to have all these parameters in one structure, but splitting it up would make the interface more unmanageable.
Are there any workarounds or useful design patterns to deal with this? Making a global p would fix this, but justifying the use of a global when they are generally not recommended is difficult.
To my mind, there are two big reasons not to use global variables:
Since they're accessible everywhere, it can be impossible to keep track of when and how they get changed.
Their use makes it much more difficult to turn some standalone code into a utility function (a library, for example) that can be easily called from another program, with perhaps multiple instances.
But sometimes, there is data that is just truly global, potentially needed in all parts of a program, and if that's the case, I don't believe there should be any stigma against making it global, if it basically is.
You can dutifully pass around a pointer to your "shared" or "common" data (as you suggested), and often this is absolutely the right pattern, but in that case you've basically reintroduced problem #1.
And if you're sure you're never going to want to repackage your program as a separable, callable library, objection #2 goes away, too.
As Mark Benningfield suggested in a comment, the reason not to use globals is not just because everyone says you shouldn't. If you know what you're doing, and if a global isn't going to cause you problems, you should go ahead and use it.
Me, the only thing I insist on is that if a variable is global, it must have a nice, long, descriptive name. One- or two-character global variable names are right out.
(But with all of that said, you will usually find that global variables, like gotos, can be kept to a bare minimum. The general advice to steer clear of them when possible, though that advice is indeed sometimes overzealously or religiously applied, is usually right.)
As generally you will just be passing a pointer around, using one big struct may be preferred. You can document in each function which members it uses (its actual interface).
You could break down the struct in a number of structs for different types of computation, all having distinct members, and can combine them all in the big struct.
There may be no preferred design pattern.
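For example, here is a sketch of that splitting idea (the member names are hypothetical): each kind of computation gets its own small struct, the big struct simply aggregates them, and a function can take, and thereby document, only the part it actually uses.

struct grid_params   { unsigned int num_x; double length_x; };
struct solver_params { double tolerance; unsigned int max_iter; };

struct parameters {                /* the one big struct passed around */
    struct grid_params   grid;
    struct solver_params solver;
};

/* This function's interface documents that it only needs the grid part: */
double grid_spacing(const struct grid_params *g)
{
    return g->length_x / g->num_x;
}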
One method I have used in the past is to declare a set of macro constants at the file level - your example would then be something akin to
#define NUM_X <value>
#define LENGTH_X <value>
These are of course substituted by the preprocessor, which in a global-averse situation is beneficial.
If you really want to avoid using global variables but only want to set your structure once, then you can write a module to do just that.
parameters.h:
struct t_param {
    unsigned int num_x;
    double length_x;
    // so forth
};

int get_parameters(struct t_param *out);
int init_parameters(const struct t_param *in);
parameters.c:
#include <string.h>
#include "parameters.h"
static struct t_param parameters = {0, 0.0};
static int initialized = 0;

int init_parameters(const struct t_param *in)
{
    if (initialized == 0)
    {
        memcpy(&parameters, in, sizeof parameters);
        initialized = 1;
        return 0;
    }
    else
    {
        return -1;
    }
}

int get_parameters(struct t_param *out)
{
    if (initialized == 0)
    {
        return -1;
    }
    else
    {
        memcpy(out, &parameters, sizeof *out);
        return 0;
    }
}
Edit: Traded out member assignment for memcpy calls.
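A quick sketch of how calling code would use the module above:

#include <stdio.h>
#include "parameters.h"

int main(void)
{
    struct t_param p = {100, 2.5};    /* num_x, length_x */
    if (init_parameters(&p) != 0)     /* only the first call succeeds */
        return 1;

    struct t_param q;
    if (get_parameters(&q) == 0)      /* all later readers get a copy */
        printf("num_x = %u, length_x = %f\n", q.num_x, q.length_x);
    return 0;
}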

Integrate C function with multiple outputs built with MATLAB Coder

I have been coding in MATLAB and I managed to convert my work into a single function; however, it has several inputs and outputs. For the sake of simplicity, let's say it receives three inputs: X (vectorial and for reading only), Y (vectorial and for reading and writing) and Z (scalar and for writing only). Thanks to the reply here I was able to understand that I must create variables with special MATLAB types in order to pre-allocate space and then send them as parameters to my function in the C code.
An initial version with a single scalar output (Z) worked as expected, however taking the next step towards having multiple outputs has raised some questions. I'll try to be as concise as possible. Here's the header of my function in MATLAB and C code once I change Z to a vector:
[Y,Z]=foo(X,Y)
void foo(const unsigned int *X, float Y[n_Y], float Z[n_Z])
These are my doubts so far.
1 - I would expect that if Z is only created inside, it should not appear as an input for the C function. What should I do with it in order to obtain it outside the function? My idea would be to provide a fake variable with the same name that would later be overwritten.
2 - If Y is being changed, then the function should receive the pointer to Y. Is it being updated this way, as it should?
3 - Right now the dimensions are set for X as (1x:inf), which causes the pointer to show up. If I change to a smaller and realistic bound, that single input transforms into two, although nothing else changed (the variable creation in C is independent). Now there is const unsigned int X_data[], const int X_size[2] instead of just const unsigned int X. How should I deal with it within the C code?
The call to the function in C is being made as follows:
emxArray_uint32_T *X=emxCreate_uint32_T(1,n_X);
static emxArray_uint32_T *Y=emxCreate_real32_T(1,n_Y), *Z=emxCreate_real32_T(1,n_Z);
foo(X,&Y,&Z);
emxDestroyArray_uint32_T(X);
I should say that I have not tried to compile the latest steps, since I need a specific environment to do so (laboratory). However, when I have access to it, the code needs to be almost ready to go. Also, I shouldn't try until these doubts are solved: if it works somehow and I don't understand why, then it's the same as not working.
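On doubts 1 and 2 in plain C terms (this is not the Coder-generated emxArray interface, just the general pattern behind it): output arrays are passed as pointers into storage the caller owns, so Z does appear as a parameter even though MATLAB creates it internally, and Y really is updated through its pointer. A minimal sketch with a hypothetical stand-in for foo:

#include <stdio.h>

#define N_Y 3
#define N_Z 2

/* Stand-in for the generated foo(): X is read-only, Y is read/write,
   Z is write-only but its storage still comes from the caller. */
static void foo(const unsigned int *X, float Y[N_Y], float Z[N_Z])
{
    for (int i = 0; i < N_Y; i++) Y[i] += (float)*X; /* updates Y in place */
    for (int i = 0; i < N_Z; i++) Z[i] = (float)i;   /* fills Z from scratch */
}

int main(void)
{
    unsigned int X = 2;
    float Y[N_Y] = {1.0f, 2.0f, 3.0f}; /* caller allocates both outputs */
    float Z[N_Z];                      /* contents before the call are irrelevant */
    foo(&X, Y, Z);
    printf("Y[0]=%f Z[0]=%f\n", Y[0], Z[0]);
    return 0;
}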

Linux kernel: why do 'subclass' structs put base class info at end?

I was reading the chapter in Beautiful Code on the Linux kernel, where the author discusses how the Linux kernel implements inheritance in the C language (amongst other topics). In a nutshell, a 'base' struct is defined, and in order to inherit from it, the 'subclass' struct places a copy of the base at the end of the subclass struct definition. The author then spends a couple of pages explaining a clever and complicated macro that figures out how many bytes to move back in order to convert from a pointer to the base part of the object to a pointer to the subclass object.
My question: Within the subclass struct, why not declare the base struct as the first thing in the struct, instead of the last thing?
The main advantage of putting the base struct stuff first is when casting from the base to the subclass you wouldn't need to move the pointer at all - essentially, doing the cast just means telling the compiler to let your code use the 'extra' fields that the subclass struct has placed after the stuff that the base defines.
Just to clarify my question a little bit let me throw some code out:
struct device { // this is the 'base class' struct
    int a;
    int b;
    // etc.
};

struct usb_device { // this is the 'subclass' struct
    int usb_a;
    int usb_b;
    struct device dev; // This is what confuses me -
                       // why put this here, rather than before usb_a?
};
If one happens to have a pointer to the "dev" field inside of a usb_device object then in order to cast it back to that usb_device object one needs to subtract 8 from that pointer. But if "dev" was the first thing in a usb_device casting the pointer wouldn't need to move the pointer at all.
Any help on this would be greatly appreciated. Even advice on where to find an answer would be appreciated - I'm not really sure how to Google for the architectural reason behind a decision like this. The closest I could find here on StackOverflow is:
why to use these weird nesting structure
And, just to be clear - I understand that a lot of bright people have worked on the Linux kernel for a long time so clearly there's a good reason for doing it this way, I just can't figure out what it is.
The Amiga OS uses this "common header" trick in a lot of places and it looked like a good idea at the time: Subclassing by simply casting the pointer type. But there are drawbacks.
Pro:
You can extend existing data structures
You can use the same pointer in all places where the base type is expected, no pointer arithmetic needed, saving precious cycles
It feels natural
Con:
Different compilers tend to align data structures differently. If the base structure ended with char a;, then you could have 0, 1 or 3 pad bytes afterwards before the next field of the subclass starts. This led to quite nasty bugs, especially when you had to maintain backwards compatibility (i.e. for some reason, you have to have a certain padding because an ancient compiler version had a bug and now there is lots of code which expects the buggy padding). See the sketch after this list.
You don't notice quickly when you pass the wrong structure around. With the code in your question, fields get trashed very quickly if the pointer arithmetic is wrong. That is a good thing, since it raises the chances that a bug is discovered earlier.
It leads to an attitude "my compiler will fix it for me" (which it sometimes won't) and all the casts lead to a "I know better than the compiler" attitude. The latter one would make you automatically insert casts before understanding the error message, which would lead to all kinds of odd problems.
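A contrived sketch of that padding Con: where the subclass's own fields land depends on how the compiler pads the base part, so two compilers can disagree about the layout.

#include <stdio.h>
#include <stddef.h>

struct base { int a; char c; };  /* 0, 1 or 3 pad bytes may follow c */
struct sub  { struct base b; int sub_field; };

int main(void)
{
    /* With the "base first, just cast" trick, code built by a compiler
       that pads differently disagrees about where sub_field lives. */
    printf("sub_field offset: %zu\n", offsetof(struct sub, sub_field));
    return 0;
}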
The Linux kernel is putting the common structure elsewhere; it can be but doesn't have to be at the end.
Pro:
Bugs will show early
You will have to do some pointer arithmetic for every structure, so you're used to it
You don't need casts
Con:
Not obvious
Code is more complex
I'm new to the Linux kernel code, so take my ramblings here with a grain of salt. As far as I can tell, there is no requirement as to where to put the "subclass" struct. That is exactly what the macros provide: you can cast to the "subclass" structure regardless of its layout. This provides robustness to your code (the layout of a structure can be changed without having to change your code).
Perhaps there is a convention of placing the "base class" struct at the end, but I'm not aware of it. I've seen lots of code in drivers, where different "base class" structs are used to cast back to the same "subclass" structure (from different fields in the "subclass" of course).
I don't have fresh experience from the Linux kernel, but from other kernels. I'd say that this doesn't matter at all.
You are not supposed to cast from one to the other. Allowing casts like that should only be done in very specific situations. In most cases it reduces the robustness and flexibility of the code and is considered quite sloppy. So the deepest "architectural reason" you're looking for might just be "because that's the order someone happened to write it in". Or alternatively, that's what the benchmarks showed would be the best for performance of some important code path in that code. Or alternatively, the person who wrote it thinks it looks pretty (I always build upside-down pyramids in my variable declarations and structs if I have no other constraints). Or someone happened to write it this way 20 years ago and since then everyone else has been copying it.
There might be some deeper design behind this, but I doubt it. There's just no reason to design those things at all. If you want to find out from an authoritative source why it's done this way, just submit a patch to linux that changes it and see who yells at you.
It's for multiple inheritance. struct device isn't the only interface you can apply to a struct in the Linux kernel, and if you have more than one, simply casting the subclass to a base class wouldn't work. For example:
struct device {
    int a;
    int b;
    // etc...
};

struct asdf {
    int asdf_a;
};

struct usb_device {
    int usb_a;
    int usb_b;
    struct device dev;
    struct asdf asdf;
};
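For what it's worth, the "clever and complicated macro" the question alludes to is essentially the kernel's container_of. A simplified version (the real one adds type checking) shows why the embedded member can sit anywhere in the layout:

#include <stddef.h>

/* Recover the enclosing struct from a pointer to one of its embedded
   members, wherever that member happens to sit. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct device     { int a; int b; };
struct usb_device { int usb_a; struct device dev; int usb_b; };

static struct usb_device *to_usb_device(struct device *d)
{
    return container_of(d, struct usb_device, dev);
}

Because offsetof computes the member's position at compile time, moving dev around inside usb_device changes nothing in the calling code.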

alternative to sandboxing C for a programming contest with very limited constraints

I'm trying to organize a programming contest for signal processing; originally it was going to be in Python, but the question came up if I could expand allowable entries to C.
The type of programming needed for the entries is really pretty limited:
no stdin/stdout needed
contestants can declare 1 struct containing state variables
entries can declare functions
I will create my own trusted C code for a test harness that calls into the contestants' entries
So I am wondering:
is it possible to declare a particular C file as "safe" by parsing, if there are severe restrictions on the type of calculations allowed? The one thing I can't seem to figure out is how to easily prevent casting pointers or pointer arithmetic.
Entries would be of this form (more or less):
#include "contest.h"
// includes stdint.h and math.h and some other things
// no "#" signs after this line allowed
typedef struct MyState {
    int16_t somevar;
    int16_t anothervar;
    ...
} MyState_t;

void dosomething(MyState_t *pstate)
{
    ...
}

void dosomethingelse(MyState_t *pstate)
{
    ...
}

void calculate_timestep(MyState_t *pstate, ContestResults *presults)
{
    ...
}
I've read some of the sandboxing questions (this and this) and it looks a bit difficult to find a way to sandbox one part of C code but allow other trusted parts of C code. So I'm hoping that parsing may be able to help "bless" C code that meets certain constraints.
Any advice? I don't really care if it gets stuck in an infinite loop (I can kill it if the time takes too long) but I do want to prevent OS access or unwanted memory access.
There's no point in allowing C if you also want to disallow things that are part of C, such as pointers, casting, and pointer arithmetic. Many valid C programs then become impossible to write, which would seem counter-intuitive if you're saying "you can use C".
It's hard to detect statically that a program won't do
*(uint32_t *) 0 = 0xdeadf00d;
which might cause a segmentation fault on your host operating system. I'm sure it's possible, or that very good attempts have been made. This Wikipedia article has a list of C and C++ static checking tools that you can investigate.
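Even if you banned casts and the obvious pointer operators syntactically, ordinary array indexing is pointer arithmetic in disguise, which is part of what makes "blessing by parsing" so hard. A contrived example:

#include <stdint.h>
#include <stddef.h>

/* No cast and no visible pointer arithmetic, yet this can write far
   outside the struct: a[i] is defined as *(a + i) in C. */
void dosomething(int16_t *state, ptrdiff_t i)
{
    state[i] = 0; /* i computed at runtime can land anywhere in memory */
}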

what the author of nedtries means by "in-place"?

I just implemented a kind of bitwise trie (based on nedtries), but my code does a lot of memory allocation (one for each node).
Contrary to my implementation, nedtries are claimed to be fast, among other things, because of their small number of memory allocations (if any).
The author claims his implementation to be "in-place", but what does that really mean in this context?
And how does nedtries achieve such a small number of dynamic memory allocations?
PS: I know that the sources are available, but the code is pretty hard to follow and I cannot figure out how it works.
I'm the author, so this is for the benefit of the many people who, according to Google, are similarly having difficulties in using nedtries. I would like to thank the people here on Stack Overflow for not making unpleasant comments about me personally, which some other discussions about nedtries do.
I am afraid I don't understand the difficulties with knowing how to use it. Usage is exceptionally easy - simply copy the example in the Readme.html file:
#include <assert.h>
#include <stdio.h>
#include "nedtrie.h"

typedef struct foo_s foo_t;
struct foo_s {
    NEDTRIE_ENTRY(foo_t) link;
    size_t key;
};
typedef struct foo_tree_s foo_tree_t;
NEDTRIE_HEAD(foo_tree_s, foo_t);
static foo_tree_t footree;

static size_t fookeyfunct(const foo_t *RESTRICT r)
{
    return r->key;
}

NEDTRIE_GENERATE(static, foo_tree_s, foo_s, link, fookeyfunct, NEDTRIE_NOBBLEZEROS(foo_tree_s));

int main(void)
{
    foo_t a, b, c, *r;
    NEDTRIE_INIT(&footree);
    a.key = 2;
    NEDTRIE_INSERT(foo_tree_s, &footree, &a);
    b.key = 6;
    NEDTRIE_INSERT(foo_tree_s, &footree, &b);
    r = NEDTRIE_FIND(foo_tree_s, &footree, &b);
    assert(r == &b);
    c.key = 5;
    r = NEDTRIE_NFIND(foo_tree_s, &footree, &c);
    assert(r == &b); /* NFIND finds next largest. Invert the key function to invert this */
    NEDTRIE_REMOVE(foo_tree_s, &footree, &a);
    NEDTRIE_FOREACH(r, foo_tree_s, &footree)
    {
        printf("%p, %zu\n", (void *) r, r->key);
    }
    NEDTRIE_PREV(foo_tree_s, &footree, &a);
    return 0;
}
You declare your item type - here it's struct foo_s. You need the NEDTRIE_ENTRY() inside it otherwise it can contain whatever you like. You also need a key generating function. Other than that, it's pretty boilerplate.
I wouldn't have chosen this system of macro based initialisation myself! But it's for compatibility with the BSD rbtree.h so nedtries is very easy to swap in to anything using BSD rbtree.h.
Regarding my usage of "in place" algorithms, well, I guess my lack of computer science training shows here. What I would call "in place" is when you only use the memory passed into a piece of code, so if you hand 64 bytes to an in-place algorithm it will only touch those 64 bytes, i.e. it won't make use of extra metadata, or allocate some extra memory, or indeed write to global state. A good example is an "in place" sort implementation where only the collection being sorted (and, I suppose, the thread stack) gets touched.
Hence no, nedtries doesn't need a memory allocator. It stores all the data it needs in the NEDTRIE_ENTRY and NEDTRIE_HEAD macro expansions. In other words, when you allocate your struct foo_s, you do all the memory allocation for nedtries.
Regarding understanding the "macro goodness", it's far easier to understand the logic if you compile it as C++ and then debug it :). The C++ build uses templates and the debugger will cleanly show you the state at any given time. In fact, all debugging from my end happens in a C++ build, and I meticulously transcribe the C++ changes into macroised C.
Lastly, before a new release, I search Google for people having problems with my software to see if I can fix things, and I am typically amazed at what some people say about me and my free software. Firstly, why didn't those people having difficulties ask me directly for help? If I know that there is something wrong with the docs, then I can fix them - equally, asking on Stack Overflow doesn't let me know immediately that there is a docs problem but rather relies on me to find it before the next release. So all I would say is that if anyone finds a problem with my docs, please do email me and say so, even if there is a discussion like this one here on Stack Overflow.
Niall
I took a look at the nedtrie.h source code.
It seems that the reason it is "in-place" is that you have to add the trie bookkeeping data to the items that you want to store.
You use the NEDTRIE_ENTRY macro to add parent/child/next/prev links to your data structure, and you can then pass that data structure to the various trie routines, which will extract and use those added members.
So it is "in-place" in the sense that you augment your existing data structures and the trie code piggybacks on that.
At least that's what it looks like. There's lots of macro goodness in that code so I could have gotten myself confused (:
In-place means you operate on the original (input) data, so the input data becomes the output data. Not-in-place means that you have separate input and output data, and the input data is not modified. In-place operations have a number of advantages - smaller cache/memory footprint, lower memory bandwidth, hence typically better performance, etc, but they have the disadvantage that they are destructive, i.e. you lose the original input data (which may or may not matter, depending on the use case).
In-place means to operate on the input data and (possibly) update it. The implication is that there is no copying and/or moving of the input data. This may result in losing the input data's original values, which you will need to consider if that is relevant for your particular case.
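A tiny, self-contained illustration of the distinction (plain C, nothing to do with nedtries itself): reversing an array in place versus into a fresh buffer.

#include <stdlib.h>

/* In-place: touches only the caller's buffer; the original order is lost. */
void reverse_in_place(int *a, size_t n)
{
    for (size_t i = 0; i < n / 2; i++) {
        int tmp = a[i];
        a[i] = a[n - 1 - i];
        a[n - 1 - i] = tmp;
    }
}

/* Not in place: allocates new storage; the input is left untouched. */
int *reversed_copy(const int *a, size_t n)
{
    int *out = malloc(n * sizeof *out);
    if (out)
        for (size_t i = 0; i < n; i++)
            out[i] = a[n - 1 - i];
    return out;
}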
