Why is an undefined-symbol error detected during semantic analysis? - c

As I understand it, an undefined symbol should be detected in the parsing phase. After the lexical phase produces a token for it, its type is still undefined, so if we try to assign a value to that symbol, how can the parse tree be constructed without an error?
int a;
b=19;
Here the lexical analyser has already generated a token for the symbol b, and a value is associated with it as well. So what happens in the parsing phase that lets the undefined symbol go undetected there?

Related

Is having global variables in common blocks an undefined behaviour?

0.c
int i = 5;
int main(){
    return i;
}
1.c
int i;
The above compiles fine with gcc 0.c 1.c, without any link errors about multiple definitions. The reason is that i is emitted as a common block (-fcommon, which was the default behaviour in gcc before GCC 10).
The proper way to do this is using the extern keyword which is missing here.
I have been searching online to see whether this is undefined behaviour or not. Some posts say it is, some say it isn't, and it's very confusing:
It is UB
Is having multiple tentative definitions in separate files undefined behaviour?
Why can I define a variable twice in C?
How do I use extern to share variables between source files?
http://port70.net/~nsz/c/c11/n1570.html#J.2
An identifier with external linkage is used, but in the program there does not exist exactly one external definition for the identifier, or the identifier is not used and there exist multiple external definitions for the identifier (6.9).
It is NOT UB
Global variables and the .data section
Defining an extern variable in multiple files in C
Does C have One Definition Rule like C++?
Look for -fno-common:
https://gcc.gnu.org/onlinedocs/gcc-4.8.5/gcc/Code-Gen-Options.html
So which one is it? Is using -fcommon one of the few places where having multiple definitions is allowed and the compiler sorts it out for you, or is it still UB?
Analysis of the code according to the C Standard
This is covered in section 6.9/5 of the latest C Standard:
Semantics
An external definition is an external declaration that is also a definition of a function (other than an inline definition) or an object. If an identifier declared with external linkage is used in an expression (other than as part of the operand of a sizeof or _Alignof operator whose result is an integer constant), somewhere in the entire program there shall be exactly one external definition for the identifier; otherwise, there shall be no more than one.
The term "external definition" should not be confused with "external linkage" or the extern keyword; those are entirely different concepts that happen to have similar spelling.
"external definition" means a definition that is not tentative, and not inside a function.
Regarding tentative definitions, this is covered by 6.9.2/2:
A declaration of an identifier for an object that has file scope without an initializer, and without a storage-class specifier or with the storage-class specifier static, constitutes a tentative definition. If a translation unit contains one or more tentative definitions for an identifier, and the translation unit contains no external definition for that identifier, then the behavior is exactly as if the translation unit contains a file scope declaration of that identifier, with the composite type as of the end of the translation unit, with an initializer equal to 0.
So in your file 1.c, as per 6.9.2/2 the behaviour is exactly as if it had said int i = 0; instead. Which would be an external definition. This means 0.c and 1.c both behave as if they had external definitions which violates the rule 6.9/5 saying there shall be no more than one external definition.
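Violating a "shall" requirement that appears outside a Constraints section, as this one in a Semantics section does, means the behaviour is undefined (C11 section 4/2), with no diagnostic required.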
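To see the tentative-definition rule of 6.9.2/2 in isolation, here is a minimal single-file sketch. The names counter and limit, and the two accessor functions, are made up for illustration and do not come from the question.

```c
/* Single translation unit; `counter` and `limit` are hypothetical names. */

int counter;     /* tentative definition */
int counter;     /* a second tentative definition is allowed in the same file */
/* No initialized definition of counter appears in this file, so by
 * 6.9.2/2 it behaves as if the file contained `int counter = 0;`. */

int limit;       /* tentative definition */
int limit = 10;  /* external definition; the tentative one above refers to it */

/* Small accessors so the values can be inspected from other code. */
int get_counter(void) { return counter; }
int get_limit(void)   { return limit; }
```

Within one file this is well defined. The problem in the question arises only because 0.c and 1.c each end up with their own external definition of i.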
Explanation of what "undefined behaviour" means
See also: Undefined, unspecified and implementation-defined behavior
In case it is unclear, the C Standard saying "the behaviour is undefined" means that the C Standard does not define the behaviour. The same code built on different conforming implementations (or rebuilt on the same conforming implementation) may behave differently, including rejecting the program, accepting it, or any other outcome you might imagine.
(Note - some programs can have the defined-ness of their behaviour depend on runtime conditions; those programs cannot be rejected at compile-time and must behave as specified unless the condition occurs that causes the behaviour to be undefined. But that does not apply to the program in this question since all possible executions would encounter the violation of 6.9/5).
Compiler vendors may or may not provide stable and/or documented behaviour for cases where the C Standard does not define the behaviour.
For the code in your question it is common (ha ha) for compiler vendors to provide reliable behaviour; this is documented in the non-normative Annex J.5.11 of the Standard:
J.5 Common extensions
J.5.11 Multiple external definitions
1 There may be more than one external definition for the identifier of an object, with or without the explicit use of the keyword extern; if the definitions disagree, or more than one is initialized, the behavior is undefined (6.9.2).
It seems the gcc compiler implements this extension if the -fcommon switch is provided, and disables it if -fno-common is provided (and the default setting may vary between compiler versions).
Footnote: I intentionally avoid using the word "defined" in relation to behaviour that is not defined by the C Standard, as it seems to me that is one of the causes of confusion for the OP.
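For comparison, the arrangement that stays within 6.9/5 has exactly one external definition and only extern declarations everywhere else. The sketch below squeezes the pieces into one listing, with comments marking where each part would live. The names shared.h, read_i and read_i_elsewhere are hypothetical and not part of the question.

```c
/* --- shared.h (hypothetical header) --- */
extern int i;          /* declaration only: not a definition, not tentative */

/* --- 0.c: the single external definition in the whole program --- */
int i = 5;

int read_i(void)       /* stands in for the question's main() */
{
    return i;
}

/* --- 1.c: would #include "shared.h" instead of writing `int i;` --- */
int read_i_elsewhere(void)
{
    return i;          /* refers to the object defined in 0.c */
}
```

Split into real files, this should link the same way with -fcommon and with -fno-common, because there is no second definition for the linker to merge or reject.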

Is there a way to throw a link time error if a variable is initialized at declaration and linked to a NOLOAD section?

I have a variable that is initialized at its declaration and is also marked to be linked into a NOLOAD section, i.e.:
struct mystruct_s mystruct __attribute((section(".noload_sec"))) =
{
    .something = 100,
    .something_else = 100,
};
Is there a way for the linker to detect this invalid condition automatically? That is, can we raise an error if someone initializes, at its declaration, a variable that is placed in a section which will not be loaded?
After a few different attempts, and a lack of answers here, I've concluded that this cannot be enforced with current GCC.
A solution would be to write a build-time script that scans the source and throws an error when the declaration of a symbol located in a NOLOAD section has an initializer.
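As a rough idea of what such a scan could look like, here is a deliberately naive sketch written as a C function rather than a script. The function name is hypothetical, and the check is only a text heuristic: it assumes the section name appears in the declaration itself and it does not understand comments, string literals or macros.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if `src` contains a declaration that
 * mentions `section` and has an '=' before its terminating ';',
 * otherwise 0. Purely textual; not a real C parser. */
int noload_decl_is_initialized(const char *src, const char *section)
{
    size_t len = strlen(section);
    const char *p = src;

    if (len == 0)
        return 0;                      /* nothing sensible to search for */

    while ((p = strstr(p, section)) != NULL) {
        const char *q = p + len;       /* continue after the section name */

        /* Walk to the end of this declaration. */
        while (*q != '\0' && *q != ';') {
            if (*q == '=')
                return 1;              /* initializer found */
            q++;
        }
        p = q;                         /* look for further occurrences */
    }
    return 0;
}
```

A build step could read each source file into memory, call this with ".noload_sec", and fail the build on a non-zero result. A production check would more likely inspect the compiler's or linker's output than raw source text.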

What does _("write error") mean?

In Debian 8's source code /source/procps-3.3.9/lib/fileutils.c line 38 is:
char const *write_error = _("write error");
I am confused about the _("write error") part. Google only showed results about variable naming conventions or library-reserved names, but nothing about _ appearing on the right-hand side of = in front of a parenthesized, quoted string.
I also put this line into a minimal test program as its only meaningful line, and compilation failed with:
test.c:5:20: warning: implicit declaration of function ‘_’ [-Wimplicit-function-declaration]
char const *str = _("test string");
^
test.c:5:20: warning: initialization makes pointer from integer without a cast [-Wint-conversion]
/tmp/cczQpqTh.o: In function `main':
test.c:(.text+0x15): undefined reference to `_'
collect2: error: ld returned 1 exit status
Does anyone know what _(" ") format means?
This is the standard way to mark up strings for translation using GNU gettext, a free software translation tool.
The _() macro is recognised by an external tool that extracts the text so it can be translated, and at run time it performs the look-up that replaces the literal with the appropriate translation.
There is nothing special about the name _, it's just a very short but perfectly valid C identifier. Perhaps it's a bit iffy to begin a public symbol with an underscore, I'm not sure right now.
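The error you're getting is because your test program has no definition of _. The <libintl.h> header (part of gettext, of course) declares gettext(), and programs conventionally define _ themselves as a macro that expands to a gettext() call, usually in one of their own headers. Without that macro the compiler treats _ as an implicitly declared function, and the linker then reports the normal "undefined reference" error, as expected.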
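A minimal sketch of the usual setup follows. It assumes a system where <libintl.h> is available; write_error_text is a hypothetical helper that mirrors the line from fileutils.c, and the macro is the conventional shorthand rather than something copied from procps.

```c
#include <libintl.h>   /* declares gettext() */

/* Conventional shorthand; projects normally define this in their own header. */
#define _(String) gettext(String)

/* Hypothetical helper that mirrors the line from fileutils.c. */
const char *write_error_text(void)
{
    /* With no message catalog set up, gettext() returns its argument. */
    return _("write error");
}
```

A real program would also call setlocale(), bindtextdomain() and textdomain() at start-up so that a catalog can be found; those calls are omitted here. On systems where gettext is not part of the C library, linking may additionally need -lintl.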

At what stage is error thrown?

Compilation generally occurs in several stages: lexical analysis, syntax analysis, etc. Say, in C, I wrote
a=24;
without declaring a as int. Now, at what stage of compilation is the error detected? At the syntax analysis stage? If that is the case, then what does the lexical analyzer do? Just tokenize the source code?
Speaking of compilers in general, the error will be reported in the syntax analysis phase, when the parser looks the symbol up in the symbol table; the subsequent phases run only if processing continues after recovering from the error.
The dragon book states this as well, on the page that lists the types of errors. The section to study thoroughly to understand this issue is 4.1.3 - Syntax Error Handling.
a = 24; // without declaring a as an int type variable.
Here, the job of the lexical phase is simply to read characters, form tokens, and pass them on to the later phases, i.e., to the parser in the syntax analysis phase, and so on.
I don't know your compiler, but in general this would be in the parsing stage (syntax analysis) and not the lexical stage (tokenizing). Many C compilers have been written using a lex/yacc variant, which makes the above assumption more plausible. If you want to know the details, dive into the dragon book, a great resource.
If I were to write the compiler, I'd have the lexical analyzer spit out tokens (in this case: a, =, 24 and finally ;). The parser would maintain a symbol table and upon seeing the symbol a it would check whether the symbol was in the table; if not (as in your example) it would signal an error.
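The division of labour described above can be sketched in a few lines. This is only an illustration under simplifying assumptions: the function names are hypothetical, the "symbol table" is a plain array of names, and only the identifier on the left of the assignment is examined.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Lexer step: copy the leading identifier of `src` into `out` and return
 * its length (0 if the text does not start with an identifier). The lexer
 * neither knows nor cares whether the name was declared. */
static size_t lex_identifier(const char *src, char *out, size_t cap)
{
    size_t n = 0;

    while (isspace((unsigned char)*src))
        src++;
    if (isdigit((unsigned char)*src)) {     /* identifiers cannot start with a digit */
        out[0] = '\0';
        return 0;
    }
    while ((isalnum((unsigned char)src[n]) || src[n] == '_') && n + 1 < cap) {
        out[n] = src[n];
        n++;
    }
    out[n] = '\0';
    return n;
}

/* Later step: look the assignment target up in the symbol table.
 * Returns 1 if it was declared, 0 if it is undefined. */
int target_is_declared(const char *stmt, const char *const symbols[], size_t count)
{
    char name[64];
    size_t i;

    if (lex_identifier(stmt, name, sizeof name) == 0)
        return 0;
    for (i = 0; i < count; i++) {
        if (strcmp(symbols[i], name) == 0)
            return 1;
    }
    return 0;
}
```

With a table that holds only "a", the statement a=24; passes the lookup and b=19; does not, even though both tokenize in exactly the same way. Whether that lookup is counted as part of parsing or as a separate semantic pass depends on how the compiler is organised.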

C89 - error: expected ')' before '*' token

I am getting this error within C.
error: expected ')' before '*' token
But I cannot trace it.
void print_struct(struct_alias *s) //error within this line
{
...
} //end of print method
My question is: when I receive this error, where can it stem from? Is it a problem with the function, or can it be a problem with what is being passed in? What is the scope of the error?
The compiler doesn't recognize the name struct_alias as a type name.
For that code to compile, struct_alias would have to be declared as a typedef, and that declaration would have to be visible to the compiler when it sees the definition of print_struct.
(Typedef names are tricky. In effect, they become temporarily user-defined keywords, which is why errors involving them can produce such confusing error messages.)
This is not specific to C89; it applies equally to C90 (which is exactly the same language as C89), to C99, and to C11.
The error means that there's no such type as struct_alias declared in this translation unit.
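A minimal sketch of a layout that compiles is shown below. The struct tag example_record and the member id are hypothetical, since the question does not show what struct_alias actually contains.

```c
#include <stdio.h>

/* The typedef has to be visible before the first use of struct_alias.
 * The struct tag and the member `id` are hypothetical. */
typedef struct example_record {
    int id;
} struct_alias;

/* Now the parameter list parses: struct_alias is a known type name here. */
void print_struct(struct_alias *s)
{
    printf("id = %d\n", s->id);
}
```

Moving the typedef below print_struct, or leaving out the header that provides it, should bring back the "expected ')' before '*' token" message.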
