IMAGE_REL_AMD64_ADDR64 64-bit relocation

I was trying to get the Microsoft compiler to generate a relocation of type IMAGE_REL_AMD64_ADDR64 for testing purposes. The compiler prefers to use relative 32-bit LEA instructions, which is understandable, but I thought the following program would require it to use a 64-bit relocation:
#include <stdio.h>
char a[5000000000];
char b[5000000000];
int main(){
printf("%p %p %p %p\n", a, b, b - a, a - b);
}
It didn't, so I tried it with MinGW (both compilers in 64-bit mode on 64-bit Windows), and got the same results, respectively:
0000000169CE5A40 000000013FC86840 FFFFFFFFD5FA0E00 000000002A05F200
000000002A46D580 000000000040E380 FFFFFFFFD5FA0E00 000000002A05F200
I then tried it with GCC on Linux and the result was more enlightening, being a linker error message: 'relocation truncated to fit: R_X86_64_32'.
So just to make sure I'm not missing something, this is a bug in both compilers, except that in the case of GCC on Linux, at least the linker notices the problem instead of silently giving a wrong answer?
And how do you get the Microsoft compiler to generate 64-bit relocations? They occur in the Microsoft standard library (which is what prompted me to try to generate some in the first place) so presumably there must be a way?

Mind you, the bug is understandable because the compiler doesn't know that the offsets will end up exceeding 32 bits; basically it's an interaction between the separate compiling and linking model and a quirk of the x64 instruction set.
Anyway, it turns out you can get actual 64-bit relocations with e.g.
char *p = a;

Related

What is this value? [duplicate]

printf("%d", "10+10");
prints "17661648", and a similar thing happens with
printf("%d", "Hello");
What is this value?
Is it the sum of the ASCII codes of "1,0,+,1,0" or "H,e,l,l,o" in decimal, or just a garbage value?
According to the C11 standard n1570 (see its §7.21.6.1) you've got undefined behavior (UB), which is also documented here or in printf(3). So be very scared, since arbitrarily bad things could happen, and get into the habit of reading the documentation of every function that you are using.
If you ask your compiler to disassemble the generated form of your program (e.g. by compiling with gcc -S -O -fverbose-asm if you use GCC, on Linux/x86-64) you'll discover that the address of the string literal "10+10" is passed (on 64 bits) and then truncated (inside printf, because of the %d) to an int. So the 17661648 could correspond to the lowest 32 bits of that address.
Details are of course implementation-specific (they can vary from one run to the next because of ASLR, and depend upon the compiler, the ABI and the target system). To actually understand and explain the behavior requires diving into many details (your particular computer, your particular compiler and optimization flags, your particular operating system, the compiler-generated assembler and machine code, your particular C standard library, etc.) and you don't want to do that (because it could take years).
You should take several hours to read more about UB. It is an essential notion to understand when programming in C, and you should avoid it.
Any good compiler would have warned you, and then you should improve your code to get no warnings. If using GCC, be sure to compile with gcc -Wall -Wextra -g to get all warnings and debug info. Then use the gdb debugger to understand the actual behavior of your program on your system. In all cases, be sure to configure your C compiler to enable all warnings and debug info, and learn to use your debugger. Read How To Debug Small Programs.
Something like this should work:
printf("Hello");
int total = 20;
printf("10+10 = %d", total);

What is wrong with printf("%llx")?

I have this piece of code that is challenging all my knowledge of C.
Here it is:
#include <stdio.h>
int main(void){
unsigned long long int massage ;
scanf("%llX", &massage); //input: 0x1234567890abcdef
printf("%llX", massage);
return 0;
}
On my "64bit - Corei5 - Fedora - GCC" system it prints out exactly what I fed it, but on my buddy's system (32bit, MS XP, MinGW) it prints 90ABCDEF. I don't understand why. Does anyone know?
BTW: sizeof(unsigned long long int) on his system is 8.
The issue is a discrepancy between what the compiler believes (as reflected in sizeof: sizeof(unsigned long long int) is evaluated at compile-time) and what the run-time library believes (as reflected in printf: the printf function is called at run-time, so that's when its format-specifiers take effect).
According to "C99" in the MinGW documentation:
GCC does not include a C runtime library. This is supplied by the platform. The MinGW port of GCC uses Microsoft's original (old) Visual C runtime, MSVCRT, which was targeted by Microsoft Visual Studio 6 (released in 1998).
[…]
Because MinGW relies on MSVCRT, it has many of the same limitations and quirks with compatibility as Visual Studio 6. You should assume that MinGW applications cannot rely on C99 behaviour, only on C89. For example, the newer format characters in printf like %a and %ll are not supported, although there exists a workaround for %ll.
(The workaround that it mentions is to use I64 instead of ll: so, %I64X. Annoyingly, at least on my system, GCC will issue a warning when it sees that in a literal format-string, because it assumes it'll have a better run-time library.)
The Windows C library uses "%I64d", not "%lld", to print arguments of
type "long long".
Ref: http://gcc.gnu.org/ml/gcc-patches/2004-11/msg01966.html

Is the -mx32 GCC flag implemented (correctly)?

I am trying to build a program that communicates with a 32-bit embedded system, that runs on a Linux based x86_64 machine (host). On the host program I have a structure containing a few pointers that reflects an identical structure on the embedded system.
The problem is that on the host, pointers are natively 64-bits, so the offset of the structure members is not the same as in the embedded system. Thus, when copying the structure (as memcpy), the contents end up at the wrong place in the host copy.
struct {
float a;
float b;
float *p;
float *q;
} mailbox;
// sizeof(mailbox) is 4*4=16 on the embedded, but 2*4+2*8=24 on the host
Luckily, I found out here that gcc has an option -mx32 for generating 32-bit pointers on x86_64 machines. But, when trying to use this, I get an error saying:
$ gcc -mx32 test.c -o test.e
cc1: error: unrecognized command line option "-mx32"
This is for gcc versions 4.4.3 and 4.7.0 20120120 (experimental).
Why doesn't this option work? Is there a way around this?
EDIT: According to the v4.4.7 manual, there was no -mx32 option available, and this is true up to v4.6.3. OTOH, v4.7.0 does show that option, so it may be that the Jan-20 version I am using is not the final one?!
Don't do this. First, x32 is a separate architecture. It's not merely a compiler switch. You need an x32 version of every library you link against to make this work. Linux distros aren't yet producing x32 versions, so that means you'll be either linking statically or rolling your own library environment.
More broadly: that's just asking for trouble. If your structure contains pointers they should be pointers. If it contains "32 bit addresses" they should be a 32 bit integer type.
You might need a newer version of binutils, though I think gcc 4.8 is recommended.
In general, though, you also need a kernel compiled with x32 (multilib) support: https://unix.stackexchange.com/questions/121424/linux-and-x32-abi-how-to-use

Are macro definitions compatible between MIPS and Intel C compiler?

I seem to be having a problem with a macro that I have defined in a C program.
I compile this software and run it successfully with the MIPS compiler.
It builds OK but throws the error "Segmentation fault" at runtime when using icc.
I compiled both of these on 64 bit architectures (MIPS on SGI, with -64 flag and icc on an intel platform).
Is there some magic switch I need to use to make this work correctly on both systems? I turned on warnings for the Intel compiler, and EVERY one of the places in my program where a macro is invoked throws a warning, usually something along the lines of mismatched types on the macro's parameters (int to char *) or some such thing.
Here is the offending macro
#define DEBUG_ENTER(name) {tdepth++; \
if(tnames[tdepth] == NULL) tnames[tdepth] = memalign(8, sizeof(char)*MAXLEN); \
strcopy(tnames[tdepth],name); \
FU_DEBUG("Entering \n");}
This basically is used for debugging - printing to a log file with a set number of tabs in based on how many function calls there are. (tdepth = tab depth)
I did some checking around in man pages. It seems like memalign is only supported on IRIX. This may be my problem. I am going to track it down.
This might have to do with the system's "endianness." Looking here, it seems that MIPS has switchable endianness. I'm not sure if you are using the correct endianness already, but if you aren't, you will DEFINITELY have problems.
This might be a byte order issue. MIPS can be big endian but intel is little endian.
It sounds like the array tnames is an array of int. If you're assigning pointers to it, it should be an array of a pointer type - in this case probably char * is appropriate.
(Also, strcopy() isn't a standard function - are you sure you don't mean strcpy()?)

__udivdi3 undefined — how to find the code that uses it?

Compiling a kernel module on 32-Bit Linux kernel results in
"__udivdi3" [mymodule.ko] undefined!
"__umoddi3" [mymodule.ko] undefined!
Everything is fine on 64-bit systems. As far as I know, the reason for this is that 64-bit integer division and modulo are not supported inside a 32-bit Linux kernel.
How can I find the code issuing the 64-bit operations? They are hard to find manually because I cannot easily check whether a "/" is 32-bit wide or 64-bit wide. If "normal" functions are undefined, I can grep for them, but this is not possible here. Is there another good way to search for the references? Some kind of "machine code grep"?
The module consists of some thousand lines of code. I really cannot check every line manually.
First, you can do 64-bit division by using the do_div macro. Note that the prototype is uint32_t do_div(uint64_t dividend, uint32_t divisor), that the quotient is stored back into "dividend", and that "dividend" may be evaluated multiple times.
{
unsigned long long int x = 6;
unsigned long int y = 4;
unsigned long int rem;
rem = do_div(x, y);
/* x now contains the result of x/y */
}
Additionally, you should be able to either find usage of long long int (or uint64_t) types in your code, or alternately, you can build your module with the -g flag and use objdump -S to get a source annotated disassembly.
Note: this applies to 2.6 kernels; I have not checked usage for anything lower.
Actually, 64-bit integer division and modulo are supported within a 32-bit Linux kernel; however, you must use the correct macros to do so (which ones depends on your kernel version, since newer, better ones were created recently, IIRC). The macros will do the correct thing in the most efficient way for whichever architecture you are compiling for.
The easiest way to find where they are being used is (as mentioned in @shodanex's answer) to generate the assembly code; IIRC, the way to do so is something like make directory/module.s (together with whatever parameters you already have to pass to make). The next easiest way is to disassemble the .o file (with something like objdump --disassemble). Both ways will give you the functions where the calls are being generated (and, if you know how to read assembly, a general idea of where within the function the division is taking place).
After the compilation stage, you should be able to get some documented assembly and see where those functions are called. Try adding the -S flag to CFLAGS.
Compilation should stop at the assembly stage. You can then grep for the offending function call in the assembly file.
