Multiple instances of a variable (static, non-static) - c

I came across this piece of C code:
main(){
    static int i=0;
    i++;
    if(i<=5){
        int i = 3;
        printf(" %d",i);
        main();
    }
}
1. First, I expected this code to give a compilation error as there are multiple definitions of the variable i. But, it compiled and ran successfully and gave this output.
3 3 3 3 3
2. Observing the output, 3 is printed exactly 5 times, which means the loop was counted from 0 to 5 thus implying that for the if condition , the first definition (static) of i was used.
3. However, the value being printed is 3, which is the 2nd definition of i.
So the variable label i is referring to two different instances in memory. One is being used as the loop count, to do the increment, and the other is the value being printed.
The only way I can somehow explain this is:
int i = 3 (the 2nd definition) is repeated in every recursive call. That instance of i is created when the function is called, and killed when the next recursive call is made. (Because of static scoping). printf uses this instance, as it is the latest definition(?)
When entering a new level of recursion, i++ is being done. Since there is no other way to resolve this i, it uses the static "instance" of i , which is still "alive" in the code as it was defined as static.
However, I'm unable to exactly put a finger on how this works..can anyone explain what's going on here, in the code and the memory?
How is the variable binding being done by the compiler here?

The inner scope wins.
Example:
int i = 1;
void foo() {
    int i = 2; // hides the global i
    {
        int i = 3; // hides local i
    }
}
This behavior is by design. What you can do is use different naming conventions for variable scopes:
global/statics
function arguments
locals
class/struct members
Some compilers will issue a warning if you hide a variable in the same function (e.g. function argument and regular local variable). So use the maximum warning level on your compiler.

The compiler will always use the most local version of a variable when more than one variable of that name exists.
Outside the if block, the first i is the only one that exists, so it is the one that is checked. Then a new i is created, with value 3. At this point, whenever you refer to i the compiler assumes you mean the second one, since that is more local. When you leave the block, the second i goes out of scope and is destroyed, so any later reference to i means the first one again.

The {} of the if statement creates a new block scope and when you declare i in that scope you are hiding the i in the outer scope. The new scope does not start until { and thus the if statement is referring to the i in the outer scope.
Hiding is covered in the draft C99 standard, section 6.2.1 Scopes of identifiers, paragraph 4, which says (emphasis mine):
[...]If an identifier designates two different entities in the same name space, the scopes might overlap. If so, the scope of one entity (the inner scope) will be a strict subset of the scope of the other entity (the outer scope). Within the inner scope, the identifier designates the entity declared in the inner scope; the entity declared in the outer scope is hidden (and not visible) within the inner scope.
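A minimal sketch of the same binding rules, assuming a hypothetical helper named recurse() in place of the recursive main() so that the number of prints can be returned and checked. The comments mark which i each statement refers to.

```c
#include <stdio.h>

/* Hypothetical helper named recurse(); it mirrors the question's main()
   but returns how many times it printed, so the result can be checked. */
int recurse(void)
{
    static int i = 0;       /* static storage: one object shared by every call */

    i++;                    /* only the static i is in scope here */
    if (i <= 5) {           /* still the static i: the inner block has not begun */
        int i = 3;          /* automatic storage: a new object, hides the static i */
        printf(" %d", i);   /* innermost declaration wins: prints 3 */
        return 1 + recurse();
    }
    return 0;               /* no inner i was created in this call */
}
```

The first call to recurse() prints 3 five times and returns 5. A second call returns 0 without printing, because the static i keeps counting across calls while each inner i lives only until its closing brace.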

Related

Need clarification of the meaning of "side effect" in C

I have read many discussions of what constitutes a "side effect" in C, and many seem to indicate that it must involve changing something that is not local to the function causing the change. Changing an external variable or a file are the typical types of things the discussions say have to be changed to qualify as a side effect. The discussions also commonly imply that merely changing the value of a local automatic variable in the same block in which it is declared is not a side effect. However, 5.1.2.3.2 of the C17 standard states the following:
Accessing a volatile object, modifying an object, modifying a file, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment.
My understanding from C17 6.2.4.1 is that all variables (among other things) represent objects. If that is the case, wouldn't changing the value of any variable regardless of scope or storage class qualify as a side effect?
If that is the case, wouldn't changing the value of any variable regardless of scope or storage class qualify as a side effect?
Yes.
Consider the code a = 3; printf("%d\n", a++);. This prints “3”, because the main effect of a++ is to produce the value of a for the expression, and the value of a is 3. The code also changes the value of a, and that is a side effect.
Consider printf("%d\n", (a = 3) * 4);. The main effect of the assignment expression, a = 3, is to produce the new value of the assigned object, so its value is 3. Then that is multiplied by 4, so printf prints “12”. As a side effect, the stored value of a is changed to 3.
Also, printf writes to standard output, and that is also a side effect.
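The two expressions can be checked with a small sketch. The function names and the after parameter are made up for illustration; each function returns the value of the expression and reports the stored value of a afterwards.

```c
/* Value of a++ is the old a; the side effect is the store of a + 1. */
int value_of_post_increment(int *after)
{
    int a = 3;
    int value = a++;        /* value is 3; as a side effect a becomes 4 */

    *after = a;
    return value;
}

/* Value of (a = 3) is the assigned value; the side effect is the store into a. */
int value_of_assignment_times_four(int *after)
{
    int a = 0;
    int value = (a = 3) * 4;    /* value is 12; as a side effect a becomes 3 */

    *after = a;
    return value;
}
```

In both functions a is a plain automatic variable, and modifying it still counts as a side effect under the quoted definition.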

Declaring struct instances as static/local in main() with while(1)

I am working on an RZA1 microcontroller, with the KPIT GNUARM 16 toolchain, in e2 studio. I am not an expert on the subject, so I'll try to explain the problem as best I can. The issue is related to a structure mainwindow, defined in my code, which contains important features of the graphical interface:
typedef struct
{
    page_t pages[MAXNUMPAGE];
    logger_t storico;
    messagges_t messaggio;
    graph_t grafico;
} mainwindow_t;
In the main() function I declare a local instance of this struct, as it contains a while(1) loop, which is used to refresh the GUI application in case of user interaction (e.g. a pushbutton click). The problem that I have encountered is that the program executes differently depending on whether the instance of mainwindow_t is declared with or without the static keyword. For instance,
main()
{
    static mainwindow_t mainwindow;
    ....
    init_pages(mainwindow.pages);
    while(1)
    {
        page_update(mainwindow.pages);
    }
}
works perfectly well, whereas with only mainwindow_t mainwindow; it seems that the changes made in the function init_pages() have no effect: the entire content of the array pages[MAXNUMPAGE] is uninitialized.
Therefore, my question is: should there be any functional difference between a non-static local and a static local declaration of an array inside a function, if that function basically never returns?
The problem has nothing to do with whether the variable lives on the stack or not. It has to do with initialization.
Variables with static storage duration, i.e. file-scope variables or local variables with the static keyword, are implicitly initialized so that (loosely speaking) all variables with arithmetic type are initialized to 0 and all pointer variables are initialized to NULL.
In contrast, variables with automatic storage duration, i.e. variables declared inside a function without the static keyword, are not initialized if there is no explicit initializer, and their values are indeterminate.
While you didn't show your initialization function, it apparently doesn't set all fields in mainwindow.pages and depends on the other fields being zero-initialized. When mainwindow is declared non-static, this results in your program reading some indeterminate fields which causes undefined behavior, which explains why the problem mysteriously disappears when you attempt to trim down the code.
Adding an initializer to mainwindow addresses this issue by setting any fields explicitly listed, while applying the static object initialization rules to any remaining fields not explicitly initialized.
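A minimal sketch of the two safe variants, under stated assumptions: the page_t layout, the visible field, the value of MAXNUMPAGE and the body of init_pages() are all invented here, since the question does not show them. The sketch assumes an init_pages() that sets only one field and relies on the rest being zero.

```c
#include <stddef.h>

#define MAXNUMPAGE 4                 /* assumed value */

typedef struct {
    int id;
    int visible;                     /* hypothetical field that init_pages() never sets */
} page_t;

typedef struct {
    page_t pages[MAXNUMPAGE];
} mainwindow_t;

/* Sets only id, like the initialization routine the answer suspects. */
void init_pages(page_t *pages)
{
    for (size_t k = 0; k < MAXNUMPAGE; k++)
        pages[k].id = (int)k + 1;
}

/* Counts pages whose visible field is nonzero. */
int count_visible(const page_t *pages)
{
    int n = 0;

    for (size_t k = 0; k < MAXNUMPAGE; k++)
        n += pages[k].visible != 0;
    return n;
}

int visible_with_static(void)
{
    static mainwindow_t w;           /* static storage: implicitly zero-initialized */

    init_pages(w.pages);
    return count_visible(w.pages);
}

int visible_with_initializer(void)
{
    mainwindow_t w = {0};            /* automatic storage, but every member starts at zero */

    init_pages(w.pages);
    return count_visible(w.pages);
}
```

Both functions read visible safely and return 0. Declaring w as a plain mainwindow_t w; with no initializer would leave visible indeterminate, so that variant is deliberately not shown.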

Scope rules in C: Nested blocks

I have the following nested function:
int main()
{
    int a, b, c;
    a = 10;
    int foo()
    {
        int a, b, c;
        //some more code here
    }
    // some more code here
}
Now, I need to assign the variable a that belongs to foo(), with the value of the variable a that belongs to main(). Basically, something like foo.a = main.a is what I'm looking for.
Is there any way of doing this kind of assignment? I read through scope rules here and here , but didn't find anything I could use in this situation.
I know that using a nested function is not advisable, but I'm working on preexisting code, and I don't have permission to change the structure of the code.
How do I proceed?
Setting the nested function part aside, AFAIK, C does not provide any direct way to access the shadowed variable.
Primary Advice: Do not use this approach. Always use separate variable names for inner scopes and pass -Wshadow to gcc to detect and avoid possible shadowing.
However, just in case you have to use the same variable names for the inner and outer scope and you have to access the outer scope variable from the inner scope, your best bet is to do the following (in this very order, inside the inner block):
1. declare a pointer and assign the address of the outer variable to it.
2. declare and define the local variable.
3. use both.
Note: As a general word of advice, please try not to write new code (I understand the maintenance part) in this manner. It is both hard to manage and hard to read.
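The three steps above can be sketched as follows. This version uses an ordinary nested block instead of a GCC nested function so that it stays standard C; the function name shadow_demo and its two output parameters are made up for illustration.

```c
/* Reports the final values of the outer and inner a through the two pointers. */
void shadow_demo(int *outer_result, int *inner_result)
{
    int a = 10;                 /* outer a */
    {
        int *outer_a = &a;      /* inner a is not declared yet, so this is the outer a */
        int a = *outer_a;       /* inner a starts as a copy of the outer value */

        a += 5;                 /* changes only the inner a */
        *outer_a += 1;          /* changes the outer a through the pointer */
        *inner_result = a;
    }
    *outer_result = a;          /* the inner a is out of scope again */
}
```

The order matters: writing int a = a; inside the block would not copy the outer value, because the scope of the inner a already begins at its own initializer.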

Verifying data types/structs in a parser

I'm writing a recursive descent parser, and I'm at the point where I'm unsure how to validate everything. I'm not even sure if I should be doing this at the parser stage. What I mean is, I could have some syntax, e.g.:
int x = 5
int x = 5
And that would be valid, so would the parser check if x has already been defined? If so, would I use a hashmap? And what kind of information would I need to store, like how can I handle the scope of a variable, since x could be defined in a function in a local and global scope:
int x = 5;
void main() {
    int x = 2;
}
And finally, when I store to the hashmap, how can I differentiate the types? For example, I could have a variable called foo, and a struct also called foo. So when I put foo in a hashmap, it will probably cause some errors. I'm thinking I could prefix it like storing this as the hashmaps key for a struct struct_xyz where xyz is the name of the struct, and for variables int_xyz?
Thanks :)
I'm going to assume that regardless of which approach you choose, your parser will be constructing some kind of abstract syntax tree. You now have two options. Either, the parser could populate the tree with identifier nodes that store the name of the variable or function that they are referencing. This leaves the issue of scope resolution to a later pass, as advocated in many compiler textbooks.
The other option is to have the parser immediately look the identifier up in a symbol table that it builds as it goes, and store a pointer to the symbol in the abstract syntax tree node instead. This approach tends to work well if your language doesn't allow implicit forward-references to names that haven't been declared yet.
I recently implemented the latter approach in a compiler that I'm working on, and I've been very pleased with the result so far. I will briefly describe my solution below.
Symbols are stored in a structure that looks something like this:
typedef struct symbol {
    char *name;
    Type *type;
    struct scope *scope; // Points to the scope in which the symbol was defined.
} Symbol;
So what is this Scope thing? The language I'm compiling is lexically scoped, and each function definition, block, etc, introduces a new scope. Scopes form a stack where the bottom element is the global scope. Here's the structure:
typedef struct scope {
    struct scope *parent;
    Symbol *buckets;
    size_t nbuckets;
} Scope;
The buckets and nbuckets fields are a hash map of identifiers (strings) to Symbol pointers. By following the parent pointers, one can walk the scope stack while searching for an identifier.
With the data structures in place, it's easy to write a parser that resolves names in accordance with the rules of lexical scoping.
Upon encountering a statement or declaration that introduces a new scope (such as a function declaration or a block statement), the parser pushes a new Scope onto the stack. The new scope's parent field points to the old scope.
When the parser sees an identifier, it tries to look it up in the current scope. If the lookup fails in the current scope, it continues recursively in the parent scope, etc. If no corresponding Symbol can be found, an error is raised. If the lookup is successful, the parser creates an AST node with a pointer to the symbol.
Finally, when a variable or function declaration is encountered, it is bound in the current scope.
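The push, lookup and bind steps above could look roughly like this. It is a simplified sketch, not the answerer's actual code: a linked list per scope stands in for the hash buckets, types are plain strings, and the function names (scope_push, scope_pop, scope_bind, scope_lookup) are invented for illustration.

```c
#include <stdlib.h>
#include <string.h>

typedef struct symbol {
    const char *name;        /* not copied: must outlive the scope */
    const char *type;        /* type name as a string, to keep the sketch short */
    struct scope *scope;     /* scope in which the symbol was defined */
    struct symbol *next;
} Symbol;

typedef struct scope {
    struct scope *parent;    /* NULL for the global scope */
    Symbol *symbols;         /* linked list standing in for hash buckets */
} Scope;

/* Enter a new scope whose parent is the current one. */
Scope *scope_push(Scope *parent)
{
    Scope *s = calloc(1, sizeof *s);

    if (s)
        s->parent = parent;
    return s;
}

/* Leave a scope: free its symbols and return the parent. */
Scope *scope_pop(Scope *s)
{
    Scope *parent = s->parent;

    while (s->symbols) {
        Symbol *next = s->symbols->next;
        free(s->symbols);
        s->symbols = next;
    }
    free(s);
    return parent;
}

/* Bind a name in the current scope; NULL means it is already declared there. */
Symbol *scope_bind(Scope *s, const char *name, const char *type)
{
    for (Symbol *p = s->symbols; p; p = p->next)
        if (strcmp(p->name, name) == 0)
            return NULL;

    Symbol *sym = malloc(sizeof *sym);
    if (!sym)
        return NULL;
    sym->name = name;
    sym->type = type;
    sym->scope = s;
    sym->next = s->symbols;
    s->symbols = sym;
    return sym;
}

/* Search the current scope first, then each enclosing scope. */
Symbol *scope_lookup(Scope *s, const char *name)
{
    for (; s; s = s->parent)
        for (Symbol *p = s->symbols; p; p = p->next)
            if (strcmp(p->name, name) == 0)
                return p;
    return NULL;
}
```

With this shape, a second int x = 5 in the same scope is rejected by scope_bind, while an x declared inside a function body simply hides the global x until that scope is popped.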
Some languages use more than one namespace. For instance, in Erlang, functions and variables occupy different namespaces, requiring awkward syntax like fun foo:bar/1 to get at the value of a function. This is easily implemented in the model I outlined above by keeping several Scope stacks - one for each namespace.
If we define "scope" or "context" as a mapping from variable names to types (and possibly some more information, such as scope depth), then its natural implementation is either a hashmap or some sort of search tree. Upon reaching any variable definition, the compiler should insert the name with the corresponding type into this data structure. When some sort of 'end scope' operator is encountered, we must already have enough information to 'backtrack' the changes in this mapping to its previous state.
For the hashmap implementation, for each variable definition we can store the previous mapping for this name, and restore that mapping when we reach the 'end of scope' operator. We should keep a stack of stacks of these changes (one stack for each currently open scope), and backtrack the topmost stack of changes at the end of each scope.
One drawback of this approach is that we must either complete compilation in one pass, or store the mapping for each identifier in the program somewhere, as we can't inspect any scope more than once, or in an order other than the order of appearance in the source file (or AST).
For the tree-based implementation, this can be easily achieved with so-called persistent trees. We just maintain a stack of trees, one for each scope, pushing as we 'open' a scope and popping when the scope ends.
The 'depth of scope' is enough to choose what to do when a new variable name conflicts with one already in the mapping. Just check for old depth < new depth and overwrite on success, or report an error on failure.
To differentiate between function and variable names you can use separate (yet similar or identical) mappings for those objects. If some context permits only a function name or only a variable name, you already know where to look. If both are permitted in some context, perform the lookup in both structures, and report an "ambiguity error" if the name corresponds to a function and a variable at the same time.
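The hashmap-with-backtracking idea and the depth check described above can be sketched like this. Several simplifications are assumptions of the sketch: a small linear table stands in for the hashmap, one undo log with per-scope marks stands in for the stack of stacks, the capacity limits are arbitrary, and all names (Context, ctx_open, ctx_close, ctx_define, ctx_type) are invented.

```c
#include <string.h>

#define MAX_NAMES 64
#define MAX_UNDO  256
#define MAX_DEPTH 32

typedef struct {
    const char *name;
    const char *type;        /* NULL means "currently unbound" */
    int depth;               /* scope depth at which the binding was made */
} Binding;

typedef struct {
    Binding table[MAX_NAMES];                     /* linear table standing in for a hashmap */
    int count;
    struct { int slot; Binding previous; } undo[MAX_UNDO];
    int undo_top;
    int marks[MAX_DEPTH];                         /* undo_top at the start of each open scope */
    int depth;
} Context;                                        /* must start zero-initialized */

static int find_slot(const Context *c, const char *name)
{
    for (int k = 0; k < c->count; k++)
        if (strcmp(c->table[k].name, name) == 0)
            return k;
    return -1;
}

/* Open a scope; returns 0 if the nesting limit is reached. */
int ctx_open(Context *c)
{
    if (c->depth == MAX_DEPTH)
        return 0;
    c->marks[c->depth++] = c->undo_top;
    return 1;
}

/* Close the innermost scope by undoing its changes in reverse order. */
void ctx_close(Context *c)
{
    int mark = c->marks[--c->depth];

    while (c->undo_top > mark) {
        c->undo_top--;
        c->table[c->undo[c->undo_top].slot] = c->undo[c->undo_top].previous;
    }
}

/* Define a name; returns 0 on a conflict in the same scope or when full. */
int ctx_define(Context *c, const char *name, const char *type)
{
    int slot = find_slot(c, name);

    if (slot >= 0 && c->table[slot].type && c->table[slot].depth >= c->depth)
        return 0;                                 /* old depth < new depth failed */
    if (c->undo_top == MAX_UNDO || (slot < 0 && c->count == MAX_NAMES))
        return 0;
    if (slot < 0) {
        slot = c->count++;
        c->table[slot] = (Binding){ name, NULL, 0 };
    }
    c->undo[c->undo_top].slot = slot;             /* remember the previous mapping */
    c->undo[c->undo_top].previous = c->table[slot];
    c->undo_top++;
    c->table[slot].type = type;
    c->table[slot].depth = c->depth;
    return 1;
}

/* Current type of a name, or NULL if it is not bound. */
const char *ctx_type(const Context *c, const char *name)
{
    int slot = find_slot(c, name);

    return slot < 0 ? NULL : c->table[slot].type;
}
```

A Context declared as Context c = {0}; starts at depth 0, which plays the role of the global scope. Keeping a second Context for struct tags would cover the case of a variable and a struct that share the name foo, without prefixing the keys.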
The best way is to use a class that holds structures such as a HashMap, which lets you check the type and/or the existence of a variable. This class should have static methods that interface with the grammar rules written in the parser.

Is this a global?

I'm trying to understand this function and convert it to ctypes:
XDisplay* GetXDisplay() {
    static XDisplay* display = NULL;
    if (!display)
        display = OpenNewXDisplay();
    return display;
}
We see here if(!display) then do display = OpenNewXDisplay(); but what confuses me is the guy defines on the line above it that display is NULL (static XDisplay* display = NULL;) so why on earth the need for the if, if he just set it to null? Is display a global variable somehow?
display is a static variable.
For a static variable, initialisation only happens once, not every time the function is entered. This is just basic C (also basic C++, or basic Objective-C).
So this code is just a primitive way to create a singleton object.
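A minimal sketch of the same pattern without X11, assuming a hypothetical open_new_resource() as a stand-in for OpenNewXDisplay() and a counter that records how often it runs:

```c
#include <stddef.h>

int open_calls = 0;                  /* how many times the expensive step ran */

/* Hypothetical stand-in for OpenNewXDisplay(). */
int *open_new_resource(void)
{
    static int resource = 42;

    open_calls++;
    return &resource;
}

int *get_resource(void)
{
    static int *cached = NULL;       /* initialized once, not on every call */

    if (!cached)                     /* true only on the first call */
        cached = open_new_resource();
    return cached;
}
```

Calling get_resource() twice returns the same pointer and leaves open_calls at 1. The name cached is still visible only inside get_resource(), so it is not a global, and this simple form is not safe if several threads can make the first call at once.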
As the others mentioned before display is a static variable.
The static storage class instructs the compiler to keep a local variable in existence during the life-time of the program instead of creating and destroying it each time it comes into and goes out of scope. Therefore, making local variables static allows them to maintain their values between function calls.
Source: http://www.tutorialspoint.com/cprogramming/c_storage_classes.htm
You should read more about what the static keyword means:
http://en.wikipedia.org/wiki/Static_variable
Basically it means that the variable is initialized only once, so the next time the function is called the variable still holds its previous value.
So it's not quite a global variable: it has the scope of a regular local variable but keeps its value across function calls.

Resources