How to Protect Against Symbol Redefinition - c

My project incorporates a stack, which has a number of user-defined types (typedef). The problem is that many of these type definitions conflict with our in-house type definitions. That is, the same symbol name is being used. Is there any way to protect against this?
The root of the problem is that to use the stack in our application, or wrapper code, as the case may be, a certain header file must be included. This stack header file in turn includes the stack provider's types definition file. That's the problem. They should have included their type definition file via a non-public include path, but they didn't. Now, there are all sorts of user-defined type conflicts for very common names, such as BYTE, WORD, DWORD, and so forth.

Since you probably can't easily change the program stack you are using, you will have to start with your own code.
The first thing to do is (obviously) to limit the number of names in the global namespace as far as possible. For example, prefer file-scope static variables to global ones.
The next step is to adopt a naming convention for your code modules. Suppose you have an "input module" in the project. You could then for example prefix all functions in the input module "inp".
void inp_init (void);
void inp_get (int input);
#define INP_SOMECONSTANT 4
typedef enum
{
INP_THIS,
INP_THAT,
} inp_something_t;
And so on. Whenever these items are used elsewhere in the code, they will not only have a unique identifier, it will also be obvious to the reader which module they belong to, and therefore what purpose they have. So while fixing the namespace conflicts, you gain readability at the same time.
Something like the above could be the first steps to implementing a formal coding standard, something you need to do sooner or later anyway as a professional programmer.
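As a rough sketch of how the two ideas combine (static for anything private, a module prefix for anything public), the input module might look like the following. The variable inp_last, the helper inp_clamp and the accessor inp_last_value are hypothetical names added only for illustration.

```c
/* inp.c: sketch of an "input module" (hypothetical contents) */

static int inp_last;                  /* internal linkage: invisible to other files */

static int inp_clamp(int v)           /* private helper, also internal linkage */
{
    return v < 0 ? 0 : v;
}

/* Only the prefixed functions below end up in the global namespace. */
void inp_init(void)       { inp_last = 0; }
void inp_get(int input)   { inp_last = inp_clamp(input); }
int  inp_last_value(void) { return inp_last; }   /* hypothetical accessor */
```

Another module can define its own static inp_last or inp_clamp without any conflict, because names with internal linkage never reach the linker.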

I suggest you define a wrapping header that redefines all of the functions and structures exported by the stack in terms of your own types. This header is then included in your system files but not in the stack files (where it would conflict). You can then compile and link, but there is a weak point at the interface. If you select your types correctly in your redefinitions, it should work correctly, leaving only a maintenance problem on each update from the stack supplier...

I think that I've come up with a reasonable workaround, for the time being, but as Lundin stated, a formal coding standard is needed for a long-term solution.
Basically what I did was to move the inclusion of the required stack header file to before the inclusion of our in-house type definitions file. Then, between those two includes I added a compiler macro to set a defined constant dependent on whether the stack's header file single-include protection definition has been defined. Then, I used that conditional defined constant as a conditional compile option in our in-house type definition file to prevent the conflicting data-types from being re-defined. It's a little sloppy, but progress can only be made in incremental steps.
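A minimal sketch of that workaround, squeezed into one file so the order is visible. All names here (STACK_TYPES_H, INHOUSE_SKIP_BASIC_TYPES, INHOUSE_TYPES_H, SINT, byte_size) are assumptions; the real guard macro is whatever the stack vendor's header happens to define.

```c
/* --- stands in for the vendor's type header (hypothetical guard name) --- */
#ifndef STACK_TYPES_H
#define STACK_TYPES_H
typedef unsigned char  BYTE;
typedef unsigned short WORD;
typedef unsigned long  DWORD;
#endif

/* --- glue placed between the two includes --- */
#ifdef STACK_TYPES_H
#define INHOUSE_SKIP_BASIC_TYPES 1    /* the stack already provides these names */
#endif

/* --- stands in for the in-house type header --- */
#ifndef INHOUSE_TYPES_H
#define INHOUSE_TYPES_H
#ifndef INHOUSE_SKIP_BASIC_TYPES
typedef unsigned char  BYTE;
typedef unsigned short WORD;
typedef unsigned long  DWORD;
#endif
typedef signed int SINT;              /* non-conflicting names are always defined */
#endif

/* Code from here on sees exactly one definition of each name. */
unsigned int byte_size(void) { return (unsigned int)sizeof(BYTE); }
```

This only holds up if the vendor's BYTE, WORD and DWORD are compatible with what the in-house code expects (same size and signedness); otherwise the conflict is merely hidden, not solved.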

Related

how to share array type definition without common header file?

Situation
I'm using the MinGW compiler:
>bin\cpp --version
cpp.exe (GCC) 6.1.0
Copyright (C) 2016 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
I have two header files, main.h and submodule.h. For various reasons I cannot simply include one of these headers into the other.
[update]
I think you need to explain the various reasons why you cannot simply include one of these headers into the other because that's the obvious answer... – Andrew
I cannot include main.h into submodule.h because in that case a change in main.h would trigger a recompilation of the submodule although nothing changed there. Compile time is a major concern for my client.
I cannot include submodule.h into main.h because submodule.h defines lots of stuff but only a few definitions are public. My client wants to reduce visibility of identifiers as much as possible.
My client uses the content of main.h to verify compatibility of different versions of the target software. Existence and size of the mentioned array is one of the compatibility criteria. Therefore the definition of the array must stay in main.h
There are some versions of the target software that do not have the submodule at all. Therefore the files building this submodule may or may not be present. There is a lot of overhead (for my client) to deal with that situation which has to be done by someone else, not me. So my client also wants to limit the number of "flickery" files.
I also have lots of other *.h files that include main.h but not submodule.h, and they should not include it, so that some things in the submodule stay hidden.
The submodule.h defines lots of stuff implemented in submodule.c.
Among that is an array type definition and a global variable of that type:
typedef const char INDEX_TABLE_t[42];
const INDEX_TABLE_t INDEX_TABLE;
The submodule.c implements this array:
const INDEX_TABLE_t INDEX_TABLE {/* 42 random char values */};
The variable INDEX_TABLE is used in those other *.h files:
char SOME_OTHER_INDEX[23] = {/* 23 random char values */};
#define SELECTOR_VALUE 5
#define a_fix_name INDEX_TABLE[SOME_OTHER_INDEX[SELECTOR_VALUE]]
These *.h files include main.h but not submodule.h.
Therefore I used to add the (exact same) declarations of INDEX_TABLE_t and INDEX_TABLE to main.h, which compiles fine.
Problem
My client uses a code analysis tool (QA-C) that complains about the doubled definition of the type INDEX_TABLE_t.
[C] More than one declaration of 'INDEX_TABLE_t' (with no linkage).
The client instructed me to change the code so that this error will no longer be issued by the code analysis tool.
I usually solve this by adding the extern keyword to all but one occurrence.
But in this case the compiler throws an exception:
error: conflicting specifiers in declaration of 'INDEX_TABLE_t'
But the declarations are equal (they are rendered based on a model).
Questions
Do I have any chance to make both happy, the compiler and the code analyser?
Is creating another header file to be included in main.h or in all the other *.h files my only option?
I have two header files, main.h and submodule.h. For various reasons I cannot simply include one of these headers into the other.
Then do yourself a favor and fix that, even if you don't actually #include "submodule.h" inside main.h. Your claim that you cannot do so has a very bad smell.
The submodule.c implements this array:
const INDEX_TABLE_t INDEX_TABLE {/* 42 random char values */};
You appear to have omitted an = before the initializer. Also, with INDEX_TABLE_t being an array type with const elements, I don't think the extra const there has any additional effect.
My client uses a code analysis tool (QA-C) that complains about the
doubled definition of the type INDEX_TABLE_t.
[C] More than one declaration of 'INDEX_TABLE_t' (with no linkage).
I suppose the tool is concerned by exactly the fact that the declaration is repeated in separate files, instead of being centralized in a single header. This is a valid concern, not so much for the program now, but for ongoing maintenance and development. You have set a trap for a future maintainer (maybe future you) wherein they might change only one of the type definitions, or change the two in incompatible ways, thus introducing a subtle but impactful bug.
The client instructed me to change the code so that this error will no
longer be issued by the code analysis tool.
I usually solve this by adding the extern keyword to all but one
occurrence. But in this case the compiler throws an exception:
error: conflicting specifiers in declaration of 'INDEX_TABLE_t'
But the declarations are equal (they are rendered based on a model).
INDEX_TABLE_t designates a type, not an object or function. It cannot have external linkage (per extern) because it automatically and necessarily has no linkage.
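To make the distinction concrete, here is a small sketch using the names from the question, collapsed into one file. The initializer values and the helper index_table_len are made up for illustration.

```c
/* Type definition: no linkage, so it is written once (ideally in one shared header). */
typedef const char INDEX_TABLE_t[42];

/* Object declarations: 'extern' is valid here and may be repeated. */
extern INDEX_TABLE_t INDEX_TABLE;
extern INDEX_TABLE_t INDEX_TABLE;

/* Exactly one definition, normally in submodule.c. */
INDEX_TABLE_t INDEX_TABLE = { 1, 2, 3 };   /* remaining elements are zero */

/* extern typedef const char INDEX_TABLE_t[42];
   ^ rejected: 'extern' and 'typedef' are both storage-class specifiers */

/* Hypothetical helper: the array size comes from the single typedef. */
unsigned int index_table_len(void) { return (unsigned int)sizeof(INDEX_TABLE); }
```

So the extern trick works for the variable INDEX_TABLE but not for the type INDEX_TABLE_t; the type has to come from a single place.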
Do I have any chance to make both happy, the compiler and the code
analyser?
Yes.
Is creating another header file to be included in main.h or in all
the other *.h files my only option?
Not exactly, but you do need to put the type definition in a single header, and have all your sources get it from there, directly or indirectly. One alternative to your idea would be to #include that header directly into your .c files, which probably would require careful management of the order of your #include statements.
But overall, it sounds like your header collection could benefit from some refactoring. As a general rule, each header should (and should be able to) include all the headers needed to provide declarations for identifiers used but not declared within, and no other headers. This is facilitated in part by using include guards in every header. There may be other aspects to making that work if you did not design for it from the beginning, but it certainly can be done.
In response to the edited question
That some builds of the software do not include submodule but (presumably) do use main.h is a strong indication that main.h is the wrong place for a typedef for the type of an object of which only submodule provides an instance. It should go in a header associated with submodule or more broadly with the collection of different sources that all use this attribute of submodule.
Perhaps that header could be submodule.h itself. Perhaps it should be a separate header, say submodule_general.h, which might even be a better place for some of the other stuff now in submodule.h, too. Perhaps there's stuff in submodule.h that doesn't need to be there, and removing it -- possibly in conjunction with converting some objects and functions from external to internal -- would make it more palatable to include submodule.h in more places.
However you split declarations among headers to avoid duplication and serve whatever other objectives you may have, you always have the alternatives of including headers into the sources that need them either directly, or indirectly via other headers.

is the purpose of header files in C only warning to users?

I'm a beginner at linking, sorry if my questions are too basic. Let's say I have two .c files:
file1.c is
int main(int argc, char *argv[])
{
int a = function2();
return 0;
}
file2.c is
int function2()
{
return 2018;
}
I know the norm is, create a file2.h and include it in file1.c, and I have some questions:
Q1. The #include in file1.c doesn't make much difference to me: I can still compile file1.c correctly without file2.h, and the compiler will just warn me 'implicit declaration of function 'function2''. Does this warning help a lot? Programmers might know that function2 is defined in another .c file (if you use function2 but don't define it, you certainly know the definition is somewhere else), and the linker will do its job to produce the final executable file. So the only purpose of including file2.h, to me, is to avoid warnings during compilation. Is my understanding correct?
Q2. Imagine this scenario: a programmer defines function2 in file1.c, and he doesn't know that his function2 conflicts with the one in file2.c until the linker throws the error (obviously he can compile his file1.c alone correctly). But if we want him to know his mistake when he compiles his file1.c, adding file2.h still doesn't help, so what's the purpose of adding a header file?
Q3. What should we add to let the programmer know he should choose a different name for function2, rather than being informed of the error by the linker in the final stage?
Per C89 3.3.2.2 Function calls emphasis mine:
If the expression that precedes the parenthesized argument list in a function call consists solely of an identifier, and if no declaration is visible for this identifier, the identifier is implicitly declared exactly as if, in the innermost block containing the function call, the declaration
extern int identifier();
appeared
Now, remember, an empty parameter list (declared with nothing inside the () parentheses) declares a function that takes an unspecified number and type of arguments. Write void inside the parentheses to declare that a function takes no arguments, like int func(void).
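A small sketch of the difference, with hypothetical function names. It describes the rules before C23; as far as I know, C23 changed an empty list to mean the same as (void).

```c
int no_info();         /* pre-C23: parameters unspecified, so calls are not checked */
int no_args(void);     /* prototype: takes no arguments, so calls are checked */

int no_info(void) { return 1; }
int no_args(void) { return 2; }

int call_both(void)
{
    /* no_args(5);   <- the compiler must diagnose this: too many arguments   */
    /* no_info(5);   <- pre-C23 compilers need not diagnose this, but the     */
    /*                  call has undefined behavior                           */
    return no_info() + no_args();
}
```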
Q1:
does this warning help a lot?
Yes and no. This is a subjective question. It helps those who use it. As a personal note, always make this warning an error. With gcc, use -Werror=implicit-function-declaration. But you can also ignore this warning and write the simplest main() { printf("hello world!\n"); } program.
linker will do its job to produce the final executable file? so the only purpose of include file2.h to me is, don't show any warning during compilation, is my understanding correct.
No. If the function is called through a different, incompatible function type, the call invokes undefined behavior. If the function is declared as void (*function2(void))(int a); then calling ((int(*)())function2)() is UB, as is calling function2() without a previous declaration. Per Annex J.2 (informative):
The behavior is undefined in the following circumstances:
A pointer is used to call a function whose type is not compatible with the pointed-to type (6.3.2.3).
and per C11 6.3.2.3p8:
A pointer to a function of one type may be converted to a pointer to a function of another type and back again; the result shall compare equal to the original pointer. If a converted pointer is used to call a function whose type is not compatible with the referenced type, the behavior is undefined.
So in your lucky case of int function2(), this does indeed work. It also works, for example, for the atoi() function. But calling atol() will invoke undefined behavior.
Q2:
the linker throws the error
This should happen, but it really is linker dependent. If you compile all sources in a single stage with gcc, it will throw an error. But if you create static libraries and then link them with gcc without -Wl,-whole-archive, the linker will pick the first definition it sees; see this thread.
what's the purpose of adding header file?
I guess simplicity and order. It is a convenient and standard way to share data structures (enum, struct, typedefs) and declarations (function and variable types) between developers and libraries, and also to share preprocessor directives. Imagine you are writing a big library with 1000+ files that will work with 100+ other libraries. At the beginning of each file, would you write struct mydata_s { int member1; int member2; ... }; int printf(const char*, ...); int scanf(const char *, ...); etc., or just #include "mydata.h" and #include <stdio.h>? If you needed to change the mydata_s structure, you would need to change all files in your project, and all the other developers who use your library would need to change the definition too. I'm not saying you can't do it, but it would be more work and no one would use your library.
Q3:
What should we add to let the programmer know he should choose a different name for function2 rather than be informed the error by linker in the final stage.
In case of name clashes you will be informed (hopefully) by the linker that it found two identifiers with the same name. To be told earlier, you would need to create a tool that checks your sources for exactly that. I don't see the need for this: the linker is specifically made to resolve symbols, so it naturally handles the case where two symbols with the same identifier exist.
Short answer:
Takeaway: the earlier the compiler alerts you, the better.
Q1: meaning of .h: consistency and early alerts. Alerting early on common ways of going wrong improves reliability of code and adds up to less debugging and production crashes.
Q2: Clashing Names bring early alerts to developers, which are usually easier to fix.
Q3: Early duplicate definition alerts are not baked into the C standard.
Exercises:
1. Define a function in one file that prints an int argument with printf("%d\n",i), then call that function from another file with a float of 42.0.
2. Call with (double)42.0.
3. Define a function with a char *str argument printed with %s, then call it with an int argument.
Longer answers:
Popular convention: in typical use the name of the .h file is derived from the .c file, or files, it is associated with: file.h and file.c. For .h files with many definitions, say string.h, derive the file name from a higher-level view of what's within (as in the str... functions).
My big rule: it’s always better to structure your code so compilers can alert you to bugs immediately at compile time, rather than letting them slide through to debugging or run time, where finding them depends on the code actually running in just the right way. Run-time errors can be very difficult to diagnose, especially if they hit long after the program is in production; they are expensive in maintenance and degrade your customer experience. See "yoda notation".
Q1: meaning of .h: consistency and early alerts and improved reliability of code.
C .h files allow developers of .c files compiled at different times to share common declarations. No duplicate code. .h files also allow functions to be called consistently from all files while identifying improper argument signatures (argument counts, bad clashes, etc.). Having the .c files that define functions also #include the .h file helps ensure the arguments in the definition are consistent with the calls; this may sound elementary, but without it all the human errors of signature clashes can sneak through.
Omitting .h files only works if the argument signatures of all callers perfectly match those in the definitions. This is often not the case, so without .h files any clashing signatures would produce bad numbers unless you also had parallel externs in the calling file (bad bad bad). Things like int vs float can produce spectacularly wrong argument values. Bad pointers can produce segmentation faults and other total crashes.
Advantage: with externs in .h files, compilers can convert mismatching arguments to the correct type, assuring better calls. While you can still botch arguments, it’s much less likely. It also helps avoid conditions where the mismatches work on one implementation but not another.
Implicit declaration warnings are hugely helpful to me, as they usually indicate I’ve forgotten a .h file or spelled an external name wrong.
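A minimal illustration of that conversion, with hypothetical names (half, call_with_int). With the prototype in scope, an int argument is converted to double at the call; without it, the int would be passed as-is and the behavior would be undefined.

```c
/* Normally this line lives in a header shared by caller and definition. */
double half(double x);

double call_with_int(void)
{
    /* The prototype is visible, so the int 3 is converted to 3.0 here. */
    return half(3);
}

double half(double x)
{
    return x / 2.0;
}
```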
Q2: Clashing Names. Early alerts.
Clashing names are bad and it is the developer's responsibility to avoid problems. C++ solves the issue with namespaces, which C, being a lower-level language, does not have.
Use of .h files can let compiler diagnostics alert developers to clashes early in the game. If compiler diagnostics don’t do this, hopefully linkers will do so with multiply-defined symbol errors, but this is not guaranteed by the standard.
A common way to fake name spaces is by starting all potentially clashing definitions in a .h with some prefix (extern int filex_function1(int arg, char *string) or #define FILEX_DEF 42).
What to do if two different external libraries being used share the same names is beyond the scope of this answer.
Q3: early duplicate alerts. Sorry… early alerts are implementation dependent.
This would be difficult for the C standard to define. As C is an old language there are many creative different ways C programs are written and stored.
Hunting for clashing names before using them is up to the developer. Tools like cross reference programs can help. Even something stupid like ctags associated with vim or emacs can help.
you misunderstand usage of header files and function prototypes.
header files are needed to share common information between multiple code files. such information includes macro definition, data types, and, possibly, function prototypes.
function prototypes are needed for the compiler to correctly handle return data types and to give you early warnings of misuse of function return types and arguments.
function prototypes can be declared in header files or can be declared in the files which use them (more typing).
you have a very simple example, with just 2 files. Now imagine a project with hundreds of files and thousands of functions. You will be lost in linker errors.
'c' allows you to use an undeclared function due to legacy reasons. In this situation it assumes that the function has a return type of 'int'. However, modern data types have a bigger variety than in the early days. The function can return pointers, 64-bit data, structures. To express that you must use prototypes or nothing will work. The compiler has to know how to handle function returns correctly.
Also, it can give you warnings about incorrect use of argument types. Due to legacy, those are still warnings, but they got addressed in early c++ and converted to errors.
Those warnings give you early debugging capabilities. Type mismatch warnings can save you days of debugging in some cases.
So, in your example you do not need the header file. You can prototype the function in the 'main' file using the 'extern' syntax. You can even do without prototyping. However, in real modern programming world you cannot allow the latter. In particular when you work in a team or want your program to be maintainable.
It is a good idea to store your function prototypes in header files. This would be a good documentation source, in particular with good comments. BTW, function names must make sense to be maintainable.
Q1. Yes. C is a low level language, and was historically used to bind low level constructs into higher level concepts. For example, traditionally the label _end is at the last address in a program. The label is typeless but you can declare it as any type that is convenient to you. A "properly typed" language would make this sort of abuse difficult.
Q2. By convention, both file1.c and file2.c would include file2.h; one as consumer, the other as producer. Following this simple idiom will catch declaration vs definition errors; although again, the "warning" is not necessarily enforced.
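Roughly, the idiom looks like this when the three files from the question are laid end to end (use_function2 is a hypothetical stand-in for the caller in file1.c):

```c
/* --- file2.h: the one shared declaration --- */
int function2(void);

/* --- file2.c (producer): includes file2.h, so the compiler compares
       the definition against the declaration above --- */
int function2(void)     /* 'long function2(void)' here would not compile */
{
    return 2018;
}

/* --- file1.c (consumer): includes file2.h and calls through the
       same declaration --- */
int use_function2(void)
{
    return function2();
}
```

If the producer did not include its own header, a mismatch between declaration and definition would go unnoticed until run time.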
Q3. Many software organizations take a "warnings are errors" rule to socially control their programmers.

Should _GNU_SOURCE be defined throughout the project?

If I'm planning to use something that is only provided after declaring _GNU_SOURCE, do I need to declare _GNU_SOURCE at the top of all source files in the project?
Is it safe to only declare it at the top of any source files that require it?
My initial concern is related to type declarations... it's of course possible that a struct changes shape after defining _GNU_SOURCE, but is that likely, or is it guaranteed that such things "will not change shape"?
For example, if I use a struct to declare a variable in one file (with _GNU_SOURCE), and then use that variable in another (without _GNU_SOURCE), is guaranteed that I will not run into problems?
In this case I'm after pthread_tryjoin_np().
It is safe to declare it only in files that need it.
After all, the whole point is that some code would break if it was defined, but that code still needs to be linked with code that uses it.
As @IanAbbott notes below, the exception is when you use some of the varying types in your interface. Then you need to keep the definition consistent for the modules that use it. E.g. off_t becomes, under _GNU_SOURCE, an alias for off64_t, so if you then include the same header with _GNU_SOURCE turned off, it will declare different functions.
That said, within a project there is really not much reason not to define it everywhere, because once you define it in any file, you depend on it. So defining it locally only helps anything if it is in an optional component or if you have alternates for other systems that use functions specific to those other systems instead.

Is hidden declaration possible in a project?

In my project, a structure is being used in several functions.
like this:
void function1 (Struct_type1 * pstType1);
but when I search for Struct_type1's references, I can't find any. This structure must be defined somewhere. How can I find the definition?
OS- Windows
Edit: I think it's difficult to answer this without source code and I can't share that big project here. So, I've changed my question to:
Is Hidden Declaration possible in an embedded project?
(by hidden I mean no one can see the definition.)
Is Hidden Declaration possible in an embedded project?
If you have access to all source code in the project, then no.
This is only possible in one specific case, and that is when you have an external library for which you don't have the C code, you only have a header file and an object file or lib file (or DLL etc).
For such cases it is possible (and good practice) for the library header to forward-declare an incomplete type in the header, and hide the actual implementation in the C file which you don't have access to.
You would then have something like this in the h file:
typedef struct Struct_type1 Struct_type1;
The compiler might often do things like this with its own libraries too, if they want to hide away the implementation. One such example is the FILE struct.
Not an answer, but possibly a way to find the answer. Idea: Let compiler help you.
Define the struct yourself, then look at compiler errors like "struct struct_type1 is already defined in... at line ..."
If you get no compiler error in this case, maybe the struct is only forward declared, but not defined.
To explain why this is sometimes done, here is a bit of code:
// Something.h
struct struct_type1; // Forward declaration.
struct struct_type1 *SomethingInit();
void SomethingDo( struct struct_type1 * context );
In code looking like the above, the definition of the struct is hidden inside the implementation. On the outside, it need not be known, how the struct is defined or its size etc, as it is only traded as a pointer to the struct (and never as a value). This technique is used to keep internal types out of public header files and used often by library designers. You can think of it as an opaque handle of sorts.
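For completeness, a sketch of what the hidden side might look like. The struct member, the int return value of SomethingDo and the function SomethingFree are assumptions added so that the example does something observable; a real library would put the second half in a .c file you never see.

```c
#include <stdlib.h>

/* --- Something.h: everything a caller can see --- */
struct struct_type1;                               /* incomplete type */
struct struct_type1 *SomethingInit(void);
int  SomethingDo(struct struct_type1 *context);    /* returns the call count (assumption) */
void SomethingFree(struct struct_type1 *context);  /* hypothetical cleanup function */

/* --- Something.c: the definition callers never see --- */
struct struct_type1 {
    int calls;
};

struct struct_type1 *SomethingInit(void)
{
    return calloc(1, sizeof(struct struct_type1));  /* calls starts at 0 */
}

int SomethingDo(struct struct_type1 *context)
{
    return ++context->calls;
}

void SomethingFree(struct struct_type1 *context)
{
    free(context);
}
```

A caller that only has the header can hold and pass the pointer, but cannot use sizeof on the struct or touch its members, which is exactly why a search for the definition comes up empty.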
But then, you still should be able to find the forward declaration, albeit not the definition.

Modular programming in C (nested headers)

I'm creating a large program that's supposed to be simulating a MIPS pipeline. I'm trying to modularize my code as much as possible to keep things simple, but I'm having trouble compiling.
Currently my program contains the files:
pipe.c --- Containing main
IF.h
ID.h
EX.h
MEM.h
WB.h
global.h --- containing global #define functions
reg.h
ALU.h
control.h
dMem.h
fBuffer.h
parser.h
bin.h
I'm new to C programming but I have protected myself against multiple includes using #ifndef, #define, #endif in every header file. My problem is that when I compile I get errors claiming: "previous implicit declaration of..."
Many of the header files are used by multiple files, so I'm not sure if this is the issue. Is there some sort of big thing that I'm missing?
an implicit declaration means that there was something that wasn't declared in a header (instead, the compiler simply found the function). a previous implicit declaration means that it's come across the declaration later, after assuming an implicit declaration for a "raw" function (or, i guess, as Doug suggests in the comments, another function with the same name).
there are a number of ways this can occur:
maybe you didn't include the header in the associated file. so IF.c doesn't include IF.h. the compiler will read IF.c and create the implicit definition. later, when it reads IF.h somewhere else, it will give this error.
maybe you use a function in a file that doesn't include the relevant header. so maybe IF.h defines myfunction(), but you use myfunction() in dMem.c and don't include IF.h there. so the compiler sees the use of myfunction() in dMem.c before it sees the definition in IF.h when included in IF.c.
without header files at all, you can get this with mutually recursive functions. see How to sort functions in C? "previous implicit declaration of a function was here" error
as Doug suggested, you define two functions with the same name (and don't have a definition in a header).
basically, somewhere, somehow, the compiler got to a function before it got to the header with the associated declaration. when it did find the header it realised things were messed up and generated the error.
(one classic source of header errors is cut+paste the "ifdefs" from one file to another and forget to change the name...)
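a small sketch of the usual cure, with made-up function names (stage_a, stage_b): put the prototypes ahead of the first use, normally by including the header that holds them. without the stage_b prototype, an older compiler would assume int stage_b() at the call inside stage_a and then complain when it reached the real definition returning double.

```c
/* these two lines would normally come from an included header */
double stage_a(int n);
double stage_b(int n);

/* stage_a calls stage_b before stage_b is defined; the prototype
   above tells the compiler its real return type */
double stage_a(int n)
{
    return n <= 0 ? 0.0 : 1.0 + stage_b(n - 1);
}

double stage_b(int n)
{
    return n <= 0 ? 0.0 : 0.5 + stage_a(n - 1);
}
```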
[reading your question again i assumed you'd only listed the header files. but now i see that is all the files you have. why do you have so many more headers than source files? typically each source file is associated with one or two headers that contain the declarations for the functions it defines (although it will likely import others that it needs for support). this is unrelated to your compiler error, but it sounds like maybe you need to split your source up. it also suggests that either i have misunderstood you, or you are misunderstanding how headers are typically used.]

Resources