Some background to the issue
if I have a struct like
typedef struct {
idx_type type;
union {
char *str;
int num;
} val;
} cust_idx;
and I have loops like this
for (i = 0; i < some_get(x); i++) {
some_fun(z, NULL, i);
}
that I want to refactor to use the struct like some_fun(z, idx) where idx is one of my cust_idx structs, would it be best to keep i as the loop counter and update idx or change the for header to use idx.val.num instead of i?
For the purposes of this, assume idx_type is an enum for string and number types, and all other fields will have macros, but I'm only going to use the IDX_NUM macro here as I'm not worried about anything to do with idx.type.
To sum up my concerns:
Will it be readable? I don't want to leave behind a mess that someone will read and just shake their head...
Is it advised against?
Which of these is the best solution?
Struct field as loop counter
#define IDX_NUM(x) ((x).val.num)
...
cust_idx j;
j.type = TYPE_num;
for (IDX_NUM(j) = 0; IDX_NUM(j) < some_get(x); IDX_NUM(j)++) {
some_fun(z, j);
}
This does the same as the original. In my opinion, using the struct field/macro lengthens and complicates the for loop header, but it's still fairly understandable.
Modify struct with original counter
cust_idx j;
j.type = TYPE_num;
for (i = 0; i < some_get(x); i++) {
IDX_NUM(j) = i;
some_fun(z, j);
}
This results in the fewest logical changes from the old code, but will end up with by far the most code because of the added assignment lines.
Pointer to struct field
cust_idx j;
int *i = &(j.val.num);
j.type = TYPE_num;
for ((*i) = 0; (*i) < some_get(x); (*i)++) {
some_fun(z, j);
}
I'm not sure how good this would be in the long run, or if it's advised against.
As to readability, I would always prefer separate loop counters.
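A minimal sketch of what "separate loop counter" could look like without the extra assignment line in every loop body. The names num_idx and run_loop, the enumerator TYPE_str, and the signature of some_fun are assumptions made for illustration; the question does not show them.

```c
/* Assumed definitions: the question only says idx_type is an enum. */
typedef enum { TYPE_str, TYPE_num } idx_type;

typedef struct {
    idx_type type;
    union {
        char *str;
        int num;
    } val;
} cust_idx;

/* Hypothetical helper: build a numeric index from a plain int. */
static cust_idx num_idx(int n)
{
    cust_idx idx;
    idx.type = TYPE_num;
    idx.val.num = n;
    return idx;
}

/* Stand-in for some_fun(); the real signature is not shown in the question. */
static int some_fun(int z, cust_idx idx)
{
    return z + idx.val.num;
}

/* The loop keeps its plain counter; the struct is built at the call site. */
static int run_loop(int z, int count)
{
    int i;
    int total = 0;

    for (i = 0; i < count; i++) {
        total += some_fun(z, num_idx(i));
    }
    return total;
}
```

With a helper like this the for header stays exactly as it was in the old code, and the conversion to cust_idx is visible in one place.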
EDIT: The caveat in the next two paragraphs does not apply in this specific case, because C passes structs by value, so passing j to some_fun() in the loop is OK. I'll leave the caveat here anyway, as it applies to many similar situations where the struct or array is passed via a pointer (a.k.a. 'passed by reference').
That is especially true in code like you posted, where you call a function with the structure as an argument inside the loop.
If I don't know what some_fun() does, I can only hope that the struct's member that I use as a loop counter is not modified. And hope is not a strategy.
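The difference between the two cases can be sketched as follows. The functions takes_copy and takes_pointer are hypothetical stand-ins for a callee whose body you cannot see, and the enum values are assumed.

```c
/* Assumed definitions: the question only says idx_type is an enum. */
typedef enum { TYPE_str, TYPE_num } idx_type;

typedef struct {
    idx_type type;
    union {
        char *str;
        int num;
    } val;
} cust_idx;

/* Receives a copy: whatever it does to idx, the caller's struct is untouched. */
static void takes_copy(cust_idx idx)
{
    idx.val.num = -1;
}

/* Receives a pointer: it can overwrite the caller's loop counter. */
static void takes_pointer(cust_idx *idx)
{
    idx->val.num = -1;
}
```

If j.val.num were the loop counter, a call to takes_copy(j) would leave the loop intact, while takes_pointer(&j) would reset the counter on every iteration.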
So, unless there are very hard reasons for doing otherwise, I'd always place readability first. Remember: If you write code that is at the limits of your own syntactic and semantic capabilities, you will have very little fun debugging such code, as debugging is an order of magnitude harder than writing (buggy) code. ;)
Addition: You could look at the disassemblies of all variants. The compiler might do a lot of optimizations here, especially if it can 'see' some_fun().
Related
I'm writing a program which uses a number of structures that must be cleaned up/updated/initialized and so on. I'm currently using something like the following code to deal with them:
typedef struct Thing { ... } Thing;
typedef void (*ProcessThing)(Thing * target);
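enum { ThingCount = 3 }; /* a const int is not a constant expression in C */
Thing ListOfThings[ThingCount];
Thing * ThingA = &ListOfThings[0];
Thing * ThingB = &ListOfThings[1];
Thing * ThingC = &ListOfThings[2];
/* Applies Action to every Thing and returns how many were processed. */
int DoStuffToThings(ProcessThing Action){
int i;
for(i = 0; i < ThingCount; i++){
(*Action)(&ListOfThings[i]);
}
return i;
}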
ThingA, ThingB, and ThingC (and so on) are known at compile time, though the exact number/nature of them is changing as I develop the program.
The main benefit of this setup is that I can easily add more ThingN if needed by adding another definition line and incrementing ThingCount, and can easily process the list of things in new ways with more ProcessThing's.
This isn't a technique I often see used in example code, even when it seems like it would be useful. I don't have a lot of practical experience with C, and am concerned that there may be some problem with this method that I am not seeing. Is there such a problem, or am I being overly nervous?
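A compilable sketch of the same pattern, for reference. The field value and the two actions ResetThing and BumpThing are made up for illustration, since the real contents of Thing are elided above. Here the count is derived from the array with sizeof, which is one way to avoid keeping a separate counter in sync by hand.

```c
#include <stddef.h>

typedef struct Thing { int value; } Thing;   /* hypothetical field */
typedef void (*ProcessThing)(Thing *target);

static Thing ListOfThings[3];

/* Element count derived from the array itself. */
#define THING_COUNT (sizeof ListOfThings / sizeof ListOfThings[0])

/* Two hypothetical actions with the ProcessThing signature. */
static void ResetThing(Thing *t) { t->value = 0; }
static void BumpThing(Thing *t)  { t->value += 1; }

/* Applies action to every Thing and returns how many were processed. */
static size_t DoStuffToThings(ProcessThing action)
{
    size_t i;

    for (i = 0; i < THING_COUNT; i++) {
        action(&ListOfThings[i]);
    }
    return i;
}
```

A call such as DoStuffToThings(BumpThing) then visits every element, and adding a new action only requires writing another function with the same signature.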
I was playing with C++ earlier and it got me thinking. Since my C compiler refuses to let me write code such as:
for (int i = 0; i < 30; ++i)
{
...
}
I try writing something like:
#include <stdio.h>
int main(int argc, char **argv)
{
{
int i;
for (i = 0; i < 30; ++i) {
printf("%d.\n", i);
}
}
return 0;
}
The result is that the i variable stays in scope no longer than I need it, which frees up i to be used for other purposes in the main scope (I wouldn't do this to i, since it's the iterator by convention). So I am allowed to write silly code like:
#include <stdio.h>
int main(int argc, char **argv)
{
int i = 3;
/* my loop scope. */
{
int i;
for (i = 0; i < 30; ++i) {
printf("%d.\n", i);
}
}
printf("i remains intact! %d.\n", i);
return 0;
}
Again, I would not intentionally write real code that abuses i like this. But in many cases, especially when dealing with the temporary variables needed for libc function calls such as sem_getvalue or the Windows API FormatMessage, it is easy to clutter up the scope with many temp variables and lose track of what's going on. By using unnamed scopes, I could in theory reduce the complexity of my code by isolating these temporary variables.
The idea seems silly but compelling, and I am curious whether there is any cost or drawback to writing code in this style. Are there inherent issues with writing code this way, or is this a decent practice?
{
int x = 3;
{
int y = 4;
int sum = x+y;
}
}
Has no more cost than:
{
int x = 3;
int y = 4;
int sum = x+y;
}
Because braces do not translate into any machine code themselves, they are just there to tell the compiler about scope.
The fact your variables have the same name also has no effect because variable names are also just for the compiler and do not change machine code.
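The scoping behaviour described above can be checked with a small function. This is only a sketch; the name outer_i_after_inner_block is made up for the example.

```c
/* Returns the value of the outer i after an inner block has declared
 * and used its own i. */
static int outer_i_after_inner_block(void)
{
    int i = 3;          /* outer variable */
    int sum = 0;

    {
        int i;          /* shadows the outer i inside these braces only */
        for (i = 0; i < 30; ++i) {
            sum += i;   /* 0 + 1 + ... + 29 = 435 */
        }
    }

    /* Here i refers to the outer variable again, still 3. */
    return (sum == 435) ? i : -1;
}
```

The braces only change which declaration the name i refers to; the inner loop never touches the outer variable.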
"frees up i to be used for other purposes in the main scope"
This is the flaw in your reasoning. You should not use a variable with the same name for unrelated purposes inside the same function. Apart from making the code unreadable, it opens up for all kinds of subtle bugs. Copy/paste one snippet and put it elsewhere in the function, and suddenly it is working with another variable. That is very bad practice, period.
Similarly, it is most often bad practice to have variables in different scopes but with the same name.
If you have multiple loops in the same function that all use an iterator i of the same type, simply declare it at the beginning of the function and re-use it over and over.
If you need an i with different type at different places in a function, that's a clear sign saying that you should split the function in several.
Overall, whenever you find yourself in need to use an obscure language mechanism, you need to step back and consider if you couldn't just design the program in a simpler way. This is almost always the case. Excellent programmers always strive for simplicity and never for complexity.
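One simpler design along these lines is to move the temporaries into a small function of their own instead of an unnamed scope. The sketch below is hypothetical: describe_count and its buffer size are invented for illustration and stand in for a call that needs scratch variables.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: every scratch variable lives and dies in here, so
 * the caller's scope stays free of temporaries. Returns the length of the
 * text written to out, or 0 if out is too small. */
static size_t describe_count(char *out, size_t out_size, int count)
{
    char scratch[32];   /* large enough for "count=" plus any int */
    int written = sprintf(scratch, "count=%d", count);

    if (written < 0 || (size_t)written >= out_size) {
        return 0;
    }
    memcpy(out, scratch, (size_t)written + 1);  /* include the '\0' */
    return (size_t)written;
}
```

The caller sees one function call and one result, and the helper can be tested and reused on its own.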
I don't think local variable allocation works how you think it does.
The compiler typically adds up the memory requirements of all local variables and, on entry to the function, reserves that much stack space in a single adjustment, regardless of how many variables there are. So allocating 20 variables takes the same time as allocating 1.
If I have two arrays of the same size, let's say,
int pa1[100];
int pa2[100];
I know that if, at some point in the code, I want to copy the contents of pa2 in pa1,
pa1 = pa2;
is not the correct way to do it. Instead I could use perhaps a loop. However I was thinking that if I had two struct pointers (ps1, ps2) it is valid to write:
*ps1 = *ps2;
If that structure contained a 100 int array, and I made ps1 and ps2 point to pa1 and pa2 respectively, what is the difference between the previous instruction and a loop that copies every single element in the arrays?
for (int i = 0; i<100; i++) pa1[i] = pa2[i];
Does it have any performance difference? Why?
My first guess is that using the pointers is better than using a loop, but I am not sure. I tried to make a web search but had no success, maybe because I could not find the exact words that describe what I want to know.
Compilers usually use the standard C function memcpy (or equivalent inline code) in such cases. It is typically at least as fast as a manually written loop.
Yes, it works, as the program below demonstrates.
No, you can't be really sure which code the compiler will choose to generate internally. It is likely that the compiler will emit very efficient code, such as a call to memcpy (no include necessary) or optimized assembly.
But it can't be guaranteed. You will have to test or analyse the generated assembly output.
On the readability side, I believe copying a structure is at least as easy to read as an explicit loop or a call to memcpy. It could be even better if the two arrays are accompanied by other related variables that could also go into the structures.
#include <stdio.h>
int main() {
struct {
int t[10];
} s1, s2;
int i = 0;
for(i=0; i < 10 ; ++i){
s1.t[i] = i;
s2.t[i] = -1;
}
printf("s1 [%d %d ... %d %d] s2 [%d %d ... %d %d]\n",
s1.t[0], s1.t[1], s1.t[8], s1.t[9],
s2.t[0], s2.t[1], s2.t[8], s2.t[9]);
s2 = s1;
printf("s1 [%d %d ... %d %d] s2 [%d %d ... %d %d]\n",
s1.t[0], s1.t[1], s1.t[8], s1.t[9],
s2.t[0], s2.t[1], s2.t[8], s2.t[9]);
}
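For comparison, here are the three ways of copying discussed in this thread side by side. This is a sketch: COPY_LEN, struct wrapped and the function names are made up for the example, and which variant is fastest depends on the compiler and its settings.

```c
#include <stddef.h>
#include <string.h>

#define COPY_LEN 100

struct wrapped {
    int v[COPY_LEN];
};

/* Element-by-element loop. */
static void copy_loop(int *dst, const int *src)
{
    size_t i;

    for (i = 0; i < COPY_LEN; i++) {
        dst[i] = src[i];
    }
}

/* Explicit memcpy of the whole array. */
static void copy_memcpy(int *dst, const int *src)
{
    memcpy(dst, src, COPY_LEN * sizeof *dst);
}

/* Struct assignment: the compiler decides how to copy the bytes. */
static void copy_struct(struct wrapped *dst, const struct wrapped *src)
{
    *dst = *src;
}
```

All three leave the destination with the same contents; comparing the generated assembly for your own compiler is the only way to know whether they differ in speed.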
You mean this?
int pa1[100], pa2[100];
int *ps1 = pa1, *ps2 = pa2;
// Initialize the arrays
*ps1 = *ps2;
If it is, then all the last line is doing is:
pa1[0] = pa2[0];
nothing else.
If you need to copy all elements of an array to the other, you will need to iterate through them.
EDIT: On a second read, now I think you meant this:
struct a {
int v[100];
};
int pa1[100], pa2[100];
struct a *ps1, *ps2;
// Initialize the arrays
ps1 = (struct a *)pa1;
ps2 = (struct a *)pa2;
*ps1 = *ps2; // copies all 100 ints, assuming struct a has no padding
Pretty clever, although you won't find a definite answer to your question. That's implementation-specific, although many implementations seem to use memcpy or something like it, which may be marginally faster than iterating over the array. On the other hand, your compiler will easily optimize a copy between two arrays written as a loop, so the difference is probably negligible and not worth the extra effort of writing the struct and the pointers in the first place.
So, the real question here is: why do you need such an optimization?
Is it more efficient to access an array each time I use a variable, or to create a temporary variable and set it to the array:
For example:
int A; int B; ...etc... int Z;
int *ints = [1000 ints in here];
for (int i = 0; i < 1000; i++) {
A = ints[i];
B = ints[i];
C = ints[i];
...etc...
Z = ints[i];
}
or
int A; int B; ...etc... int Z;
int *ints = [1000 ints in here];
for (int i = 0; i < 1000; i++) {
int temp = ints[i];
A = temp;
B = temp;
C = temp;
...etc...
Z = temp;
}
Yes, this is not something I want to do, but it is the easiest example I could think of.
So which for loop would be quicker at using the array?
It doesn't matter; the compiler will most likely produce the same code in both cases (unless you have disabled all optimizations). (The generated assembly code will likely resemble the second example - first, the array element will be loaded into a register, and then, the register will be used whenever the array element is needed.) Go with the style you find to be most readable and least prone to errors (which is probably the latter style, which avoids repeating the index).
(This assumes that you don't have any threads or volatile variables, so that the array element is guaranteed not to change in the course of a loop iteration.)
The compiler is smart enough to realize that these are equivalent and will produce the same code. You should therefore write it in the most understandable way for future people reading your code.
As Aasmund's answer states, there is likely no performance difference since the compiler will treat both in the same way. However, you might find assigning to a temporary variable gives improved code readability, and if in the future you want to use ints[i+1] throughout the loop you will only need to change one line rather than many. Never call a variable "temp" though, give it a useful name like currentInt.
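The two styles can be put next to each other as follows. This is a sketch with made-up function names; the arithmetic is arbitrary and only there so that the element is used more than once per iteration.

```c
#include <stddef.h>

/* Style 1: index the array every time the element is needed. */
static long sum_scaled_direct(const int *ints, size_t n)
{
    long total = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        total += ints[i];
        total += 2L * ints[i];
    }
    return total;
}

/* Style 2: read the element once into a descriptively named local. */
static long sum_scaled_named(const int *ints, size_t n)
{
    long total = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        const int currentInt = ints[i];
        total += currentInt;
        total += 2L * currentInt;
    }
    return total;
}
```

Both functions return the same result. In the second one, switching to ints[i + 1] would mean changing a single line.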
I have twenty or so integers which I want to be able to refer to by name when they're being set, but I would like to also be able refer to them by number like they were in an array, so I can print them out one by one using a for loop. Any ideas how to code this in C? Here's what I'm talking about in pseudo code:
/* a data structure to keep a count of each make of car I own */
my_cars;
/* set the counts */
my_cars.saabs = 2;
my_cars.hondas = 3;
my_cars.porsches = 0;
/* print the counts */
for(all i in my_cars) {
print my_cars[i];
}
Is this asking too much of a low level language like C?
struct car {
const char *name;
int count;
} my_cars[] = {{"Saab", 2}, {"Honda", 3}, {"Porsche", 0}};
size_t i; /* matches the unsigned result of sizeof */
for (i = 0; i < sizeof(my_cars) / sizeof(my_cars[0]); i++)
printf("%s: %d\n", my_cars[i].name, my_cars[i].count);
To do that, you should use an array instead of standalone data fields:
#include <stdio.h>
typedef enum CarType {
CART_SAAB,
CART_HONDA,
CART_PORSCHE,
CART_COUNT_
} CarType;
typedef struct MyCars {
unsigned ncars[CART_COUNT_];
} MyCars;
int main(void)
{
MyCars my_cars = { 0 } ;
unsigned i;
my_cars.ncars[CART_SAAB] = 2;
my_cars.ncars[CART_HONDA] = 3;
for (i = 0; i < CART_COUNT_; ++i)
printf("%u\n", my_cars.ncars[i]);
return 0;
}
C can do anything any other language can do. This does look like homework and I bet you are expected to make something with a key. Remember, your instructor wants you to use the data structures he or she is trying to teach you. He doesn't really want the problem solved in any random way, he wants it solved applying the topics you have been discussing.
So think about a data structure containing both strings and counts, one that can be searched, and provide functions to do that. What you are likely to get here are nice, professional, simple solutions to the problem. And that's not really what your instructor wants...
enum Makes { SAAB, HONDA, PORSCHE, INVALID };
int my_cars[INVALID];
my_cars[SAAB] = 2;
my_cars[HONDA] = 3;
my_cars[PORSCHE] = 0;
You need two data structures. An array to hold the numbers, and a map from the name to the index in the array. In C++ you'd use one of the map classes in the standard library. I don't know what's available in C but I'm sure there are map implementations available.
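Standard C has no map type, but for twenty or so entries a linear search over an array of name/count pairs is a reasonable stand-in. The sketch below assumes that approach; find_car and CAR_COUNT are names invented for the example.

```c
#include <stddef.h>
#include <string.h>

struct car {
    const char *name;
    int count;
};

static struct car my_cars[] = {
    { "Saab", 2 }, { "Honda", 3 }, { "Porsche", 0 }
};

#define CAR_COUNT (sizeof my_cars / sizeof my_cars[0])

/* Hypothetical lookup: returns the matching entry, or NULL if the name
 * is not in the table. Access by number is simply my_cars[i]. */
static struct car *find_car(const char *name)
{
    size_t i;

    for (i = 0; i < CAR_COUNT; i++) {
        if (strcmp(my_cars[i].name, name) == 0) {
            return &my_cars[i];
        }
    }
    return NULL;
}
```

Setting a count by name then looks like find_car("Saab")->count = 4, after checking the result for NULL.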
The low-level C way to do this would be to wrap the cars structure into a union:
#include <stdio.h>
// define a structure for the cars.
typedef struct
{
int saabs;
int hondas;
int porsches;
} cars;
// wrap it into a union:
typedef union
{
cars byname;
int byid[3]; // Note: Number and type must match with the cars structure.
} cars2;
int main (int argc, char **arg)
{
cars2 my_cars;
int i;
// fill out by name:
my_cars.byname.saabs = 1;
my_cars.byname.hondas = 5;
my_cars.byname.porsches = 3;
// print by index:
for (i=0; i<3; i++)
printf ("%d\n", my_cars.byid[i]);
}
Umm... based on what you've pseudo-coded up there, you could probably use a union. The answers others are giving seem oriented around providing a mapping between names and numbers. If that's what you're looking for (as in, being able to print the names), then their answers will be better. However, it sounds to me like you're simply looking for clarity in the code, so that you can reference things by name or by number; in that case a union would be ideal, I think. This is exactly the type of thing a low-level language like C lets you do.
union {
struct {
int saab;
int ford;
...
} names;
int nums[NUM_MAKES];
} my_cars;
You will have to be careful to ensure NUM_MAKES is in sync with the number of makes you define. Then you can do something like:
my_cars.names.saab = 20;
my_cars.nums[0] = 30;
And modify the same element.
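The "keep NUM_MAKES in sync" requirement can be checked at compile time. The sketch below assumes a C11 compiler (for _Static_assert) and a two-member struct; the names car_counts and sum_via_index are made up for illustration.

```c
#define NUM_MAKES 2

struct names {
    int saab;
    int ford;
};

union car_counts {
    struct names names;     /* access by name  */
    int nums[NUM_MAKES];    /* access by index */
};

/* C11: refuse to compile if the struct and the array differ in size,
 * for example after adding a make to one but not the other. */
_Static_assert(sizeof(struct names) == sizeof(int[NUM_MAKES]),
               "NUM_MAKES is out of sync with struct names");

/* Writes through the named members, reads back through the array. */
static int sum_via_index(void)
{
    union car_counts c;
    int i;
    int total = 0;

    c.names.saab = 20;
    c.names.ford = 7;
    for (i = 0; i < NUM_MAKES; i++) {
        total += c.nums[i];
    }
    return total;
}
```

The size check catches a mismatched count, but it cannot catch members of a different type that happen to have the same total size, so the struct should still contain only int members.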
FYI, my C is a little rusty so there may be syntax errors there, feel free to correct.
EDIT:
Ahh, I read some of the other answers using ENUMs or DEFINEs and those might actually be simpler/easier than this one...but unions are still worth knowing.
There are maybe a couple of options.
It is possible to have the same space in memory defined (and used) in two different ways. In other words, you could have a struct with the named members and reference it either as the struct or as an array depending on how you intended to address it.
Alternatively, you could do an enumerated typedef that names the locations in the array, e.g.
typedef enum {
SAABS = 0,
HONDAS,
PORSCHES
} cars;
This would then allow you to refer to offsets in the array by name, e.g.
mycars[SAABS] = 5;
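If the names should also be printable, one common way to keep the enum and a table of strings in step is an "X-macro" list. This is a sketch of that technique, not something the answer above prescribes; CAR_MAKES, CAR_COUNT, car_names and print_cars are invented names.

```c
#include <stdio.h>

/* Single list of makes; each use of CAR_MAKES expands it differently. */
#define CAR_MAKES(X) \
    X(SAABS)         \
    X(HONDAS)        \
    X(PORSCHES)

/* Expansion 1: enumerators, with CAR_COUNT as the element count. */
#define AS_ENUM(name) name,
typedef enum { CAR_MAKES(AS_ENUM) CAR_COUNT } car_make;

/* Expansion 2: matching strings, in the same order. */
#define AS_STRING(name) #name,
static const char *const car_names[CAR_COUNT] = { CAR_MAKES(AS_STRING) };

static int mycars[CAR_COUNT];

/* Prints every make with its count, one per line. */
static void print_cars(void)
{
    int i;

    for (i = 0; i < CAR_COUNT; i++) {
        printf("%s: %d\n", car_names[i], mycars[i]);
    }
}
```

Adding a make means adding one X(...) line; the enum, the count and the name table all follow automatically.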