C - Scope of C Functions - c

I apologize if this is a beginner's question, but after working in C for a bit, I finally would like a bit of clarification on exactly what kind of files/functions are available to a function.
I understand we can explicitly include other files with the #include macro like so:
#include bar.c
But what about files that are in the same directory? Let's say we have the following file structure:
src-
|
a.c
b.c
And let's say a.c has the function "foo()" and b.c has "bar()"
Would I be able to call bar() within a.c even without explicitly including b.c in the header file?
And what about in sub-directories such as the following:
src-
|
a.c
|
someFolder-
|
b.c
Would I still be able to access bar() in a.c?
Basically, how exactly is the scope of functions (without including them in headers) defined?

You are confusing scope with linkage.
The term scope descibes if an identifier is visible to the compiler in a given context. The broadest scope known to the C language is file scope which means that the identifier is visible from the point it is declared to the end of the translation unit after preprocessing. You can make any identifier visible/known by declaring it, but making an identifier visible does not mean that refering to the identifier refers to the thing it refers to in another file.
The term linkage describes how two identifiers refer to the same thing. There are three levels of linkage: with no linkage, each identifier with the name refers to a different thing. With internal linkage, each identifier within the same translation unit refers to the same thing. With external linkage, each identifier in the whol program refers to the same thing. For two identifiers in two translation units to refer to the same thing, you need to declare both of them to have external linkage.
This is independent of how the declaration is obtained. There is no difference in writing int foo() in a common header file to writing the same line in both source files. In either case, all identifiers refer to the same foo() as functions have external linkage unless explicitly declared to have internal linkage with the static type specifier.
It is also independent of how your source code is laid out. As long as all translation units are linked into the same binary, you can call all functions with external linkage in each translation unit.

Depends on whether you are linking these into one binary or not.
If linking - then yes, functions can be called with implicit declaration.
Common pattern in this case is making a header(s) file with all declarations, including *.c files is usually not preferred, since #include is just textual inclusion of file contents

#include (a preprocessor directive, not a macro, BTW) has a configurable list of directories it searches.
If you use the #include "something.h" form (as compared to the angle bracket form for including system headers), the first directory in the list is the directory of the including file.
Although #include does a simple textual inclusion so it is technically possible to include any text file, c files are almost never included.
In C, you usually compile c files separately into object files and link them together, and what you do include is header files which consist of declarations, which represent interfaces of the other object files which your current compilation unit otherwise wouldn't see.
Check out http://www.cs.bu.edu/teaching/c/separate-compilation/ or another article on that topic.

The forward slash "/" should work in this case.
If you can include foo.h without problem, this means the directory (say, C:\DIRFOO) is predefined in the "include" directive of the compiler. So, another file (say bar.h) in a sub directory (say, C:\DIRFOO\DIRBAR) can be included via #include <DIRBAR/bar.h>.

Related

weird static function behaviour

i am looking at static functions, i know they have a scope limited to the file they are declared in. For clarification i use code blocks with GCC. If i declare the function in a ..c file and include it in my main.c file, the function can be accessed (but the compiler will complain if a non static function is defined, as i have multiple definitions). However if i have a c file with some static function and another non-static function, then the static function is not accessible and the non-static function is accessible. This to me is a rather weird. I know that the #include directive copies the contents of the to-be-included file in to the file where the include directive is declared. But then why can i access the non-static function without including the .c file in the main.c file?
Any suggestions on where i could read up on this topic? I am thinking it has something to do with linking, but i may be wrong.
When a function is declared without static, it has external linkage.1 This means that, when the program is linked2, the same name in different translation units3 can be made to refer to the same thing.
When you define a function in one source file, say int square(int x) { return x*x; }, compiling that source file produces an object file that contains information saying “The function foo is defined here.” When you declare a function in another source file and use that function, such as int square(int x); … b = square(a);, compiling that source file produces an object file that contains information saying “The function foo is used here.” When the linker or program loader processes these object files, it modifies the place where foo is used to refer to the place where foo is defined.
If you use a function without declaring it in that translation unit, your compiler might provide a default declaration for the function, as if it had been declared to return int with no information about its parameters. This behavior is a legacy of old versions of C. Quite likely your compiler issues a warning such as “implicit declaration of function” for this. You should not ignore this warning. When it appears, find the code causing the warning and fix it—either correct the typing of a misspelled function name or insert a declaration for the function.
When using GCC or Clang, you should use the switch -Werror to elevate warnings to errors, so that the compiler will not compile programs that cause warning messages. While this may at first make it more of a nuisance to get your programs compiled, it causes you to fix errors early and to learn to program correctly.
Footnotes
1 With static, an identifier for a function has internal linkage. The rules for identifiers for objects (which you may think of as variables) are more complicated, and they can also have no linkage.
2 In practice, each compilation produces an object module. Linking combines object modules to make an executable program or, in complicated builds, new object modules.
3 A translation unit is the source file being compiled including all the files it includes with #include statements.
So I found a decent answer to your question in a different stackoverflow question, but before I get to that I want to explain why including .c files is bad practice, and shouldn't be done:
As you pointed out, #include copies the content of the file. This means that you can't include the .c file more than once, since you would be creating >1 instance of the same function (of the same name).
If you try to create a universal .c file, say, math.c, you can't create any other .c files that use any of its functions, since you can only include it once, and thats reserved for your main.c file. This severely limits the usefulness of every function in your .c files
instead, you should only ever include header files (.h). You declare your functions in your .h file, define in your .c file.
Whatever functions you declare in your .h file, you can then use in any other files where said .h file is included, without causing compiler errors
I get that this is confusing, but here's examples from another stackoverflow thread
here's a further better document explaining how header files (.h) work
Why can you access non-static function despite not including the .c file?
this should not be possible generally in c if you're not explicitly including said .c file. If I were to do that I would get an "implicit declaration" function, because the compiler doesn't recognize the functions, as they've not been declared previously.
relevant stack overflow answer
long story short, declaring a function static in a .c file means that its scope is limited only to said .c file, you can't call it anywhere else

What is and isn't available to linked files in C?

As of late, I've been trying to work through opaque pointers as a programming concept and one of the main things I've had difficulties with is figuring out what is or isn't available to other files. In a previous question, I failed in trying to create an opaque pointer to a struct and even though the answer explained how to fix that, I still don't quite understand where I went wrong.
I think that if a struct is defined in file2.c, file1.c can use it if both files include header.h which includes a declaration of the struct? That doesn't entirely make sense to me. header.h is used by both files, so I can see how they would access the stuff in it, but I don't understand how they would use it to access each other.
When I started programming, I thought it was pretty straight forwards, where you have program files, they can't access anything in each other, and those program files can #include header files with definitions and declarations in them (e.g. file1.c has access to variables/functions/etc. defined in header.h). Turns out I was wrong and things are quite a bit more complicated.
So from what I can tell, func() defined in header.h can be used by file1.c without being declared in file1.c, if file1.c includes header.h. As opposed to var defined in header.h which needs to be declared in file1.c with the extern keyword? And I think if var is defined in file2.c, file1.c can use it if it extern declares it, even if neither file1.c nor file2.c include header.h?
I apologize if the previous paragraphs makes no sense, I'm having quite a bit of difficulty with trying to describe something that confuses me. By all means, please edit this if you are able to fix mistakes or whatnot.
Books and webpages don't seem to help at all. They end up giving me misconceptions because I already don't understand something and draw the wrong conclusions, or they bring up concepts that throw me off even more.
What I'm looking what I'm looking for is an answer that lays this all down in front of me. For example 'this can access this under these circumstances', 'this cannot access this'.
Functions defined in one .c file can use anything defined in another .c file except for those things which are marked as static. Functions and global variables which are marked as static cannot be accessed from other translation units.
Whether something is declared in a header file or not doesn't really matter--you can declare functions locally in the same .c file which calls them if you want.
Your question asks about “access” at several points, but I do not think that is what you mean to use. Any object or function can be accessed (for an object: read or written, for a function: called) from anywhere as long as a pointer to it is provided in some way). I think what you mean to ask is what names are available.
Any declaration that is outside of a function is an external declaration. In this use of “external” in the C standard, it simply means outside of a function. (That includes a function declaration or definition; although it is declaring or defining a function, it is not inside itself or any other function declaration, so it is outside of any function.)
Any identifier for an object or function with an external declaration has either internal linkage or external linkage. If it is first declared with static, it has internal linkage (and may be later declared with extern, but that will not change the linkage). Otherwise, it has external linkage.
Any identifier with external linkage will refer to the same object or function in all translation units (provided other rules of the C standard are satisfied—a program can do various things that will result in behavior not defined by the C standard).
Thus your answer is: The name of any object or function that is (a) defined outside of any function and (b) not initially declared with static is available to be linked to from other translation units.
Some technicalities that may be of interested:
What people think of as a variable is two things: an identifier (the name) and an object (a region of memory that stores the value).
Identifiers have scope, which is where they are visible in the source code. Identifiers declared outside functions have file scope; they are visible for the rest of the translation unit. Identifiers declared inside functions have various other types of scope: function scope, function prototype scope, and block scope.
You may sometimes seem people refer to global scope or external scope, but these are misnomers; they are not terms used in the C standard.
Linkage is related to scope and is sometimes confused with it, but linkage is a different concept: Two identical identifiers declared in different places can be made to refer to the same thing. Those identifiers have different scopes, notably one having file scope in one translation unit and the other having file scope in a different translation unit. Since each translation unit is compiled separately, the compiler generates code regarding each identifier separately. When the object modules are linked, then the code is bounded together, causing the separate identifiers to refer to the same object or function.
Identifiers can be declared with extern inside functions, but these can only link to objects or functions defined elsewhere; external definitions cannot appear inside functions.

Do all C functions need to be declared in a header file

Do I need to declare all functions I use in a .c file in a header file, or can I just declare and define right there in the .c file? If so, does a definition in the .c file in this case count as the declaration also?
Do I need to declare all functions I use in a .c file in a header file,
or can I just declare and define right there in the .c file?
You used "use" in the first question and "define" in the next question. There is a difference.
void foo()
{
bar(10);
}
Here, foo is defined and bar is used. You should declare bar. If you don't declare bar, the compiler makes assumptions about its return type.
You can declare bar in the .c file or add the declaration in a .h file and #include the .h file in the .c file. Whether you use the first method or the second method is up to you. If you use the declaration in more than one .c file, it is better to put that in a .h file.
You can define foo without a declaration.
If so, does a definition in the .c file in this case count as the declaration also?
Every function definition counts as a declaration too.
For the compiler, it does not matter if a declaration occurs in a .h or a .c file, because the compiler sees the preprocessed form.
For the human developer reading and contributing to your code, it is much better (to avoid copy&pasting the same declaration twice) to put the declaration of any function used in more than one translation unit (i.e. .c file) in some #include-d header.
And you can define a function before using it.
BTW, you might even avoid declaring a function that you are calling (it defaults to returning int for legacy purposes), but this is poor taste and obsolete way of coding (and most compilers can emit a warning in that case).
No, it is not necessary.
The reason of the header files is to separate the interface from the implementation. The header declares "what" a class (or whatever is being implemented) will do, while the .c file defines "how" it will perform those features.
This reduces dependencies so that code that uses the header doesn't necessarily need to know all the details of the implementation and any other classes/headers needed only for that. This will reduce compilation times and also the amount of recompilation needed when something in the implementation changes.
The answer to both questions is yes. You can declare c-functions in both header and .c file. Same with definition. However, if you are defining it in header file, you may have slight problems during compilation.
By default functions have external linkage. It means that it is supposed that functions potentially will be used in several compilation units.
However sometimes some auxiliary functions that form implementations of other functions are not designed to be used in numerous compilation units. Such functions declared with keyword static have internal linkage.
Usually they are declared and defined inside some .c module and are not visible in other compilation units.
One occasion that requires functions to be declared in a separate header is when one is creating a library for other developers to use. Some libraries are distributed as closed source and they are provided to you as a library file (*.dll / *.so ...) and a header.
The header file would contain declarations of all publicly accessible functions and definitions of all publicly required structures, enums and datatypes etc.
Without this header file the 3rd party library user would not know how to interface with the library file and thus would not be able to link against it.
But for small, trivial C programs that are not intended for use by other people, no you can just dump everything into a C file and build it. Although you might curse yourself years later when you need to maintain that code :)

Why won't C compile if two separate source files in the same workspace share function names?

I'm using eclipse indigo, gcc and cdt in a project. If two functions in separate source files share names (regardless of return type or parameters), eclipse flags a redefinition error. This isn't a huge issue regarding this project given I can easily rename these functions, and I'm well aware of wrappers if it were. Although this isn't a critical issue, it does make me think I'm not understanding the c build process. What occurs during the build process in which a program structure like this would cause issue?
Here's some more info. on the situation, and where my understanding is so far -- not necessary to answer the question, although there must be a hole in my understanding.
In this case, the two functions are intended to be used only locally, as such their prototypes are not given in the .h interface, and for the sake of my point, neither are defined 'static'.
Neither of these source files are being included anywhere in the project, so they shouldn't be sharing any compilation units. With that in consideration, I would have assumed that the neither source file is aware of the presence of the other, and the compiler would have no problem indexing the two functions, as the separate files would allow for proper distinguishing between the two during linking -- so long as they weren't included in the same compilation unit.
I noticed that statically defining either instance of the function declaration removes the error. I remember reading at some point that every function not declared static is global -- although given these functions are not a part of the .h interface, the practical example in which including the .h interface doesn't allow for the including program to reference all .c functions would indicate "hiding" these functions would be of no issue.
What am I overlooking?
Some insight would be greatly appreciated, thanks!
This is the concept of "linkage". Every function and variable in C has a linkage type, one of "external", "internal", and "none". (Only variables can have no linkage.)
Functions have external linkage by default, which means that they can be called by name from any compilation unit (where "compilation unit" roughly means one source file and all the headers it includes). This can be expressed explicitly by declaring them extern, or it can be overridden by declaring them static. Functions declared static have internal linkage, meaning they can be referenced by name only from other functions in the same compilation unit.
No two external functions anywhere in the same program can have the same name, regardless of header files, but static functions in different compilation units may have the same name. A static function may have the same name as an external function, too -- then the name resolves to the static function within its compilation unit, and to the external function elsewhere. These restrictions make sense, for otherwise it would be possible for a function call to be ambiguous.
Header files don't factor into the linkage equation at all. They are primarily a vehicle for sharing declarations, but a function's linkage depends only on how it is declared, not on where.
I leave discussion of variables' linkage for another time.
It doesn't matter whether one source module includes headers for another. Header files only contain declarations for the purpose of local functions being able to find functions in other modules. It doesn't mean that functions not declared don't exist from the perspective of that module.
When everything gets linked together, anything not specifically defined to be local to one source module (i.e. static) has to have a unique name across all linked components.
remember reading at some point that every function not declared static is global
Having understood this you got the main point and reason for the behaviour observed.
.h files are not known to the linker, after pre-processing there are only translation units left (typically a .c file with all includes merged in), from which .o files are compiled.
There are no interfaces on language level in C.
Neither of these source files are being included anywhere in the project, so they shouldn't be sharing any compilation units.
Declare those functions as static. This is the only way to "hide" a function from the linker "inside" a translation unit.
C doesn't "mangle" function names the way C++ or Java do (since C doesn't support function polymorphism).
For example, in C++, the functions
void foo( void );
void foo( int x );
void foo( int x, double y );
have their names "mangled" into the unique symbols1
_Z3fooid
_Z3fooi
_Z3foov
which is how overloaded function/method calls are disambiguated at the machine level.
C doesn't do that; instead, the linker sees two different function definitions using the same symbol and yaks because it has no way to disambiguate the two.
1. This is what happens on my system, anyway

In C, how do I restrict the scope of a global variable to the file in which it's declared?

I'm new to C. I have a book in front of me that explains C's "file scope", including sample code. But the code only declares and initializes a file scoped variable - it doesn't verify the variable's scope by, say, trying to access it in an illegal way. So! In the spirit of science, I constructed an experiment.
File bar.c:
static char fileScopedVariable[] = "asdf";
File foo.c:
#include <stdio.h>
#include "bar.c"
main()
{
printf("%s\n", fileScopedVariable);
}
According to my book, and to Google, the call to printf() should fail - but it doesn't. foo.exe outputs the string "asdf" and terminates normally. I would very much like to use file scoping. What am I missing?
You #included bar.c, which has the effect of causing the preprocessor to literally copy the contents of bar.c into foo.c before the compiler touches it.
Try getting rid of the include, but telling your compiler to compile both files (e.g. gcc foo.c bar.c) and watch it complain as you expect.
Edit: I suppose the primary confusion is between the compiler and the preprocessor. Language rules are enforced by the compiler. The preprocessor runs before the compiler, and acts on those commands prefixed by #. All the preprocessor does is manipulate plain text. It doesn't parse the code or try to interpret the meaning of the code in any way. The "#include" directive is very literal - it tells the preprocessor "insert the contents of this file here". This is why you normally only use #include on .h (header) files, and you only place function prototypes and extern variable declarations in header files. Otherwise, you will end up compiling the same functions, or defining the same variables, multiple times, which is not legal.
This is caused by confusing terms. file scope in C does not refer to restrict linkage of an identifier to only one translation unit. It also doesn't mean that scope is limited to one physical file. Instead, file scope means that your identifier is global. The term file, here, refers to the text that results from processing all #include, #define and other pre-processor directives.
In general, scope is only a concept taking effect within one translation unit. When multiple compilations are involved, linkage starts to happen.
If you declare your file scope variable static, then it gives the variable internal linkage, which means that it isn't visible outside of that translation unit.
If you don't declare it static explicitly, or if you declare the file scope variable extern, then it is visible to other translation units: Those, if they declare a file-scope variable with the same identifier, will have that identifier link to that same variable.
In your case, the inclusion of bar.c into foo.c inserts the definition of fileScopeVariable into the translation unit being compiled. Thus, it's visible in that unit.
Don't ever #include a .c file like you are doing there. The language allows it, but C coders just don't do that, so you will confuse the heck out of people if you do it. Most likely including yourself.
"#include" means "Compiler, please go to that other file and tack it to the front of this one before you start compiling my code."
I once lost a whole day in total confusion because one of the vxWorks source files did this. I'm still POed at them over that.
Remove the second include instruction. As it was said above...
Before compiling your code a compiler preprocesses it. In that stage it processes all the instructions starting with '#', like #include, #define and etc.
To see the result of that stage you can just run 'gcc -E ' (if you are using gcc).

Resources