I was reviewing some code and I came across something similar to this.
File foo.c:
int bar(int param1)
{
return param1*param1;
}
File main.c:
#include <stdio.h>
int bar(int param1, int unusedParam);
int main (void)
{
int param = 2, unused = 0;
printf("%d\n", bar(param, unused));
}
Running gcc main.c foo.c -Wall --pedantic -O0 it compiles, links and works properly without throwing a single warning in the process. Why is that?
Thanks!
This really depends on the calling convention and architecture. For example, with cdecl on x86, where arguments are pushed right to left and the caller restores the stack, the presence of an additional parameter is transparent to the function bar:
push 11
push 10
call _bar
add esp, 8
bar will only "see" the 10, and will function as expected with that parameter, returning 100. The stack is restored afterwards so there is no misalignment in main either; if you had just passed the 10 it would have added 4 to esp instead.
This is also true of the x64 calling conventions for both MSVC on Windows and the System V ABI, where the first few[1] integral arguments are passed in registers; the second argument will be populated in its designated register by the call in main, but not even looked at by bar.
If, however, you tried to use an alternate calling convention where the callee is responsible for cleaning up the stack, you would run into trouble either at the build stage or (worse) at runtime. stdcall, for example, decorates the function name with the number of bytes used by the argument list, so I'm not even able to link the final executable by changing bar to use stdcall instead:
error LNK2019: unresolved external symbol _bar@8 referenced in function _main
This is because bar now has the signature _bar@4 in its object file, as it should.
This gets interesting if you use the obsolete calling convention pascal, where parameters are pushed left-to-right:
push 10
push 11
call _bar
Now bar returns 121, not 100, like you expected. That is, if the function successfully returns, which it won't, since the callee was supposed to clean up the stack but failed due to the extra parameter, trashing the return address.
1: 4 for MSVC on Windows; 6 on System V ABI
Normally you'd have this file structure:
foo.c
#include "foo.h"
int bar(int param1)
{
return param1*param1;
}
foo.h
int bar(int param1);
main.c
#include <stdio.h>
#include "foo.h"
int main (void)
{
int param = 2, unused = 0;
printf("%d\n", bar(param, unused));
}
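Now you'll get a compilation error as soon as you call bar with non-matching arguments, because the definition in foo.c and the call in main.c are both checked against the same declaration in foo.h.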
I have been trying to implement a small simulation to understand memory allocation of malloc(). I created a shared library called mem.c. I am linking the library to the main but cannot pass the correct address of the simulated "heap". Heap is created by a malloc() call in the shared library.
Address in the shared library: 0x55ddaff662a0
Address in the main: 0xffffffffaff662a0
Only the low 4 bytes seem to be correct. The rest is set to 0xff.
However, when I #include "mem.c" in main.c it works correctly. How can I achieve the same result without including mem.c? I am trying to solve this without including mem.c or mem.h. I create the shared library like this:
gcc -c -fpic mem.c
gcc -shared -o libmem.so mem.o
gcc main.c -lmem -L. -o main
From your comments
I am trying to implement without using #include mem.h or mem.c.
Then you must provide a prototype for the function you're calling by other means. Without an explicit function prototype, following the tradition of K&R and later ANSI C, an undeclared function is assumed to return an int and to take an unspecified number of arguments, each passed after the default argument promotions.
EDIT: Essentially you need to write what you'd normally find in a header, somewhere before you make first use of the function. Or, if it's a function pointer, you need an appropriate variable to store the function pointer.
For example, to declare a function that returns an untyped pointer and takes an arbitrary, unspecified number of arguments, you'd write
void *getAddr();
Note that using the extern keyword here is not required, since extern linkage is always implied for non-static function declarations.
In case you want to dynamically link at runtime (using dlopen / LoadLibrary → dlsym / GetProcAddress), you'd define a function pointer variable
void* (*getAddr_fptr)();
You can set it using dlsym with
*(void**)(&getAddr_fptr) = dlsym(…)
This awkward way of writing it is needed because function pointers are allowed to have a different size and alignment than data pointers (see the dlsym manpage for details).
These days, on the majority of platforms int is a 4-byte type, and the most common calling conventions pass the first few function arguments, and the return value, in registers. On x86 and x86_64 the general-purpose registers (AX, BX, CX, DX and so on) can be accessed at different widths, so a value may be written at one width and read back at another. This explains why only the low 4 bytes survive: main believes getAddr returns an int, so it uses only the 32-bit part of the return register and then sign-extends that int to pointer width. Because the top bit of 0xaff662a0 is set, the upper 32 bits come out as all ones.
From the comments:
Do you have a declaration for getAddr in your main code?
No I don't have but I am trying to implement without a declaration, is it possible?
Then that's your problem. Without a declaration, the compiler falls back to a default declaration of int getAddr(). This is incompatible with the actual definition which returns a void *, and calling a function through an incompatible declaration triggers undefined behavior.
What probably happened is that when the return value of the function was actually returned you only got back the 4 low-order bytes. Assuming your system is little-endian, and int is 4 bytes, and a void * is 8 bytes, this would explain the low bits being the same.
You must include a valid declaration before the function is called. It doesn't necessarily have to reside in a header file, but it has to be visible at the point the call happens.
I'm assuming you're trying to accomplish something like this? For mem.c
#include <stdlib.h>
#include <stdio.h>
void* getAddr() {
char *heap = (char *)malloc(10);
printf("%p\n", (void*)heap);
return heap;
}
And then without including any headers for the mem.c functions, you'd probably create a library out of mem.c as you've already mentioned in the question and have something as follows in main.c
#include <stdio.h>
#include <stdlib.h>
#include <dlfcn.h>
typedef void* (*getAddr)(); //prototype for getAddr() in mem.c
int main() {
void* handle = dlopen("./libmem.so", RTLD_LAZY);
if(handle) {
void* fn = dlsym(handle, "getAddr");
if(fn) {
void* addr = ((getAddr)(fn))();
printf("%p\n", addr);
free(addr);
addr = NULL;
} else {
printf("Failed to dlsym %s\n", dlerror());
}
} else {
printf("Failed to dlopen %s\n", dlerror());
}
}
EDIT: For OP's purpose, as @Zilog80 mentioned, since the library is being linked with the main executable, the dlopen() part can be dropped and main.c can be simplified to
#include <stdio.h>
#include <stdlib.h>
extern void* getAddr(); //prototype for getAddr() in mem.c
int main() {
void* addr = getAddr();
printf("%p\n", addr);
free(addr);
addr = NULL;
}
Then use compilation commands similar to the OP's, i.e.
gcc -shared -o libmem.so -fpic mem.c
gcc main.c -lmem -L . -o main
while executing
LD_LIBRARY_PATH=. ./main
I am using a library whose files I shouldn't change, although it includes my own header file.
The code of the library looks something like this:
#include "my_file"
extern void (*some_func)();
void foo()
{
(some_func)();
}
My problem is that I want some_func to be an extern function rather than an extern pointer to a function (I am implementing and linking some_func), and I want foo to call it directly.
That way I would save a little run time and code space, and no one could change this global by mistake.
Is it possible?
I thought about adding something like this to my_file.h:
#define *some_func some_func
but it won't compile because an asterisk is not allowed in a macro name.
EDIT
The library is not compiled yet, so changes to my_file.h will affect the compilation.
First of all, you say that you can't change the source of the library. Well, this is bad, and some "betrayal" is necessary.
My approach is to let the declaration of the pointer some_func as is, a non-constant writable variable, but to implement it as constant non-writable variable, which will be initialized once for all with the wanted address.
Here comes the minimal, reproducible example.
The library is implemented as you show us:
// lib.c
#include "my_file"
extern void (*some_func)();
void foo()
{
(some_func)();
}
Since you have this include file in the library's source, I provide one. But it is empty.
// my_file
I use a header file that declares the public API of the library. This file still has the writable declaration of the pointer, so that offenders believe they can change it.
// lib.h
extern void (*some_func)();
void foo();
I separated an offending module to try the impossible. It has a header file and an implementation file. In the source the erroneous assignment is marked, already revealing what will happen.
// offender.h
void offend(void);
// offender.c
#include <stdio.h>
#include "lib.h"
#include "offender.h"
static void other_func()
{
puts("other_func");
}
void offend(void)
{
some_func = other_func; // the assignment gives a run-time error
}
The test program consists of this little source. To avoid compiler errors, the declaration has to be qualified as const. Here, where we include the declaring header file, we can use some preprocessor magic.
// main.c
#include <stdio.h>
#define some_func const some_func
#include "lib.h"
#undef some_func
#include "offender.h"
static void my_func()
{
puts("my_func");
}
void (* const some_func)() = my_func;
int main(void)
{
foo();
offend();
foo();
return 0;
}
The trick is, that the compiler places the pointer variable in the read-only section of the executable. The const attribute is just used by the compiler and is not stored in the intermediate object files, and the linker happily resolves all references. Any write access to the variable will generate a runtime error.
Now all of this is compiled in an executable, I used GCC on Windows. I did not bother to create a separated library, because it doesn't make a difference for the effect.
gcc -Wall -Wextra -g main.c offender.c lib.c -o test.exe
If I run the executable in "cmd", it just prints "my_func". Apparently the second call of foo() is never executed. The ERRORLEVEL is -1073741819, which is 0xC0000005. Looking up this code gives the meaning "STATUS_ACCESS_VIOLATION", on other systems known as "segmentation fault".
Because I deliberately compiled with the debugging flag -g, I can use the debugger to examine more deeply.
d:\tmp\StackOverflow\103> gdb -q test.exe
Reading symbols from test.exe...done.
(gdb) r
Starting program: d:\tmp\StackOverflow\103\test.exe
[New Thread 12696.0x1f00]
[New Thread 12696.0x15d8]
my_func
Thread 1 received signal SIGSEGV, Segmentation fault.
0x00000000004015c9 in offend () at offender.c:16
16 some_func = other_func;
Alright, as I intended, the assignment is blocked. However, the reaction of the system is quite harsh.
Unfortunately we cannot get a compile-time or link-time error. This is because of the design of the library, which is fixed, as you say.
You could look at the ifunc attribute if you are using GCC or related. It should patch a small trampoline at load time. So when calling the function, the trampoline is called with a known static address and then inside the trampoline there is a jump instruction that was patched with the real address. So when running, all jump locations are directly in the code, which should be efficient with the instruction cache. Note that it might even be more efficient than this, but at most as bad as calling the function pointer. Here is how you would implement it:
extern void (*some_func)(void); // declared in the header you do not have control over
void some_func_resolved(void) __attribute__((ifunc("resolve_some_func")));
static void (*resolve_some_func(void)) (void)
{
return some_func;
}
// call some_func_resolved instead now
In C90, can I redefine main and give it another name, and possibly add extra parameters using #define?
Have this in a header file for example:
#include <stdio.h>
#include <stdlib.h>
#define main( void ) new_main( void )
int new_main( void );
The header doesn't show any errors when compiling.
When I try compiling it with the main C file, however, I keep getting an error
In function '_start': Undefined reference to 'main'
No, you cannot do that, because it would go against language and OS standards. The name main and its arguments argc, argv and environ form part of the system loader's calling conventions.
A somewhat simplified explanation (no ABI level, just API level) follows. When your program has been loaded into memory and is about to start, the loader needs to know which function to call as the entry point, and how to pass the environment to it. If it were possible to change the name of main and/or its parameter list, the details of the new calling interface would have to be communicated back to the loader. And there is no convenient way to do that (apart from writing your own executable loader).
In function '_start': Undefined reference to 'main'
Here you can see an implementation detail of the Linux/POSIX ELF loader interface. The toolchain adds a function _start to your program behind the scenes, which is the actual program entry point. _start is tasked with extra initialization steps common to most programs that use libc. It is _start that later calls your main. Theoretically, you could write a program that has its own function called _start and no main, and it would be fine. It is not trivial, as you will have to make sure that the default _start code is no longer being attached to your program (no double definitions), but it is doable. And no, you cannot choose a name other than _start, for the same reasons.
The presence of #define main new_main within a compilation unit will not affect the name of the function the implementation will call on program startup. The implementation is going to call a function called main regardless of any macros you define.
If you are going to use a #define like that to prevent the primary declaration of main() from producing a function by that name, you'll need to include a definition of main() somewhere else; that alternate version could then invoke the original. For example, if the original definition didn't use its arguments, and if the program exits only by returning from main() [as opposed to using exit()] you might put #define main new_main within a header file used by the primary definition of main, and then in another file do something like:
#include <stdio.h>
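#include <conio.h> // For getch() function.
int new_main(void); // The original main(), renamed by the #define in its own file.
int main(void)
{
int result = new_main(); // Calling main() here would just recurse forever.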
printf("\nExit code was %d. Strike any key.\n", result);
getch();
return result;
}
In most cases, it would be better to add any such code within the ordinary "main" function, but this approach can be useful in cases where the file containing main is produced by code generation tools on every build, or for some other reason cannot be modified to include such code.
No you cannot (as Grigory said).
You can, however, immediately call your proxy main:
int
your_new_main(int argc, char* argv[], char* envp[]) {
... //your stuff goes here
}
//just place this in an include file, and only include in main...
int
main( int argc, char* argv[], char* envp[])
{
int result = your_new_main(argc, argv, envp);
return result;
}
As for whether envp is supported everywhere, see "Is char *envp[] as a third argument to main() portable".
Alternatively, assuming you're using gcc, you can pass -nostdlib and set a new entry point with -Wl,-enew_main, which gcc forwards to the linker. Doing this won't give you access to any of the nice things the C runtime sets up before calling your main, so you'd have to do them yourself.
You can look at resources about what happens before main is called.
What Happens Before main
I have a homework assignment that requires us to open, read and write to file using system calls rather than standard libraries. To debug it, I want to use std libraries when test-compiling the project. I did this:
#ifdef HOME
//Home debug printf function
#include <stdio.h>
#else
//Dummy printf function
int printf(const char* ff, ...) {
return 0;
}
#endif
And I compile it like this: gcc -DHOME -m32 -static -O2 -o main.exe main.c
The problem is that with the -nostdlib argument the entry point is void _start, but without that argument the entry point is int main(const char** args). You'd probably do this:
//Normal entry point
int main(const char** args) {
_start();
}
//-nostdlib entry point
void _start() {
//actual code
}
In that case, this is what you get when you compile without -nostdlib:
/tmp/ccZmQ4cB.o: In function `_start':
main.c:(.text+0x20): multiple definition of `_start'
/usr/lib/gcc/i486-linux-gnu/4.7/../../../i386-linux-gnu/crt1.o:(.text+0x0): first defined here
Therefore I need to detect whether stdlib is included and do not define _start in that case.
The low-level entry point is always _start for your system. With -nostdlib, its definition is omitted from linking so you have to provide one. Without -nostdlib, you must not attempt to define it; even if this didn't get a link error from duplicate definition, it would horribly break the startup of the standard library runtime.
Instead, try doing it the other way around:
int main() {
/* your code here */
}
#ifdef NOSTDLIB_BUILD /* you need to define this with -D */
void _start() {
main();
}
#endif
You could optionally add fake arguments to main. It's impossible to get the real ones from a _start written in C though. You'd need to write _start in asm for that.
Note that -nostdlib is a linker option, not a compile-time one, so there's no way to determine automatically at compile time that -nostdlib is going to be used. Instead, just make your own macro and pass it on the command line as -DNOSTDLIB_BUILD or similar.
I have these files
test1.h
extern int value;
void inc_value();
int print_value();
test1.c
#include "test1.h"
int value=0;
void inc_value()
{
printf("inc value from test3.c = %d\n", value++);
}
int print_value()
{
printf(" value in test1.c = %d\n", value);
return value;
}
test3.c
# include "test1.h"
main()
{
inc_value();
}
test4.c
# include <stdio.h>
#include "test1.h"
main()
{
printf("value from test4 = %d\n", print_value());
}
I'm updating variable "value" from test3.c and trying to read it from test4.c. However test3.c is unable to update the "value" that is declared in test1.h and defined in test1.c
What point am I missing here..
This will never work.
You can't use an external variable from two different programs and magically expect it to work. It's just ... wrong. Each program runs in its own address space, and doesn't know anything about any other process's address space. There are techniques for doing this (look up interprocess communication), but that's a whole different area.
The way extern works is that it allows you to access a variable defined in a different C file within the same program.
You seem to be misunderstanding, at a quite fundamental level, how the programs you are writing work and execute, since you expect this to work. I recommend reading up more on how C works, and also perhaps a bit on how operating systems host programs in order to run them.
One way of sharing information between programs like you describe is to store the data in a file, which is written by one program (the one that runs first) and read by the other, but that is quite tricky to get right, too.
If you want to call void inc_value() from another file, you should declare it (probably in the header):
void inc_value();
If you want to directly access value, you can, as it was declared as an extern:
# include "test1.h"
main()
{
value = 6;
}
Also note that in the current implementation of inc_value, value is incremented after it is passed to printf, i.e. the printed value will be the previous one.
You should put extern int value in the test3.c and just put int value in test1.h. Look at this link: http://www.learncpp.com/cpp-tutorial/42-global-variables/ Hope this helps...