Why does GCC store global and static int differently?

Here is my C program with one static, two global, one local and one extern variable.
#include <stdio.h>
int gvar1;
int gvar2 = 12;
extern int evar = 1;
int main(void)
{
int lvar;
static int svar = 4;
lvar = 2;
gvar1 = 3;
printf ("global1-%d global2-%d local+1-%d static-%d extern-%d\n", gvar1, gvar2, (lvar+1), svar, evar);
return 0;
}
Note that gvar1, gvar2, evar, lvar and svar are all defined as integers.
I dumped the object file using objdump, and the .debug_str section shows the following:
Contents of section .debug_str:
0000 76617269 61626c65 732e6300 6c6f6e67 variables.c.long
0010 20756e73 69676e65 6420696e 74002f75 unsigned int./u
0020 73657273 2f686f6d 6534302f 72616f70 sers/home40/raop
0030 2f626b75 702f6578 616d706c 65730075 /bkup/examples.u
0040 6e736967 6e656420 63686172 00737661 nsigned char.sva
0050 72006d61 696e006c 6f6e6720 696e7400 r.main.long int.
0060 6c766172 0073686f 72742075 6e736967 lvar.short unsig
0070 6e656420 696e7400 67766172 31006776 ned int.gvar1.gv
0080 61723200 65766172 00474e55 20432034 ar2.evar.GNU C 4
0090 2e342e36 20323031 31303733 31202852 .4.6 20110731 (R
00a0 65642048 61742034 2e342e36 2d332900 ed Hat 4.4.6-3).
00b0 73686f72 7420696e 7400 short int.
Why is it showing the following?
unsigned char.svar
long int.lvar
short unsigned int.gvar1.gvar2.evar
How does GCC decide which type it should be stored as?
I am using GCC 4.4.6 20110731 (Red Hat 4.4.6-3)

Why is it showing the following?
Simple answer: it is not showing what you think. It is showing these strings:
1 "variables.c"
2 "long unsigned int"
2a "unsigned int"
2b "int"
3 "/users/home40/raop/bkup/examples"
4 "unsigned char"
4a "char"
5 "svar"
6 "main"
7 "long int"
8 "lvar"
9 "short unsigned int"
10 "gvar1"
11 "gvar2"
12 "evar"
13 "GNU C 4.4.6 20110731 (Red Hat 4.4.6-3)"
14 "short int"
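The section is named .debug_str; it contains a list of strings which are separated by NUL bytes. These strings can be in any order and they are referenced by the section .debug_info. So the fact that svar follows unsigned char has no meaning at all. Entries such as 2a and 2b are not stored separately: a reference may point into the middle of a longer string, so "unsigned int" and "int" reuse the tail of "long unsigned int".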
The .debug_info section contains the actual debugging information. This section does not contain strings. Instead it will contain information like this:
...
Item 123:
Type of information: Data type
Name: 2b /* String #2b in ".debug_str" is "int" */
Kind of data type: Signed integer
Number of bits: 32
... some more information ...
Item 124:
Type of information: Global variable
Name: 10 /* "gvar1" */
Data type defined by: Item 123
Stored at: Address 0x1234
... some more information ...
Begin item 125:
Type of information: Function
Name: 6 /* "main" */
... some more information ...
Item 126:
Type of information: Local variable
Name: 5 /* "svar" */
Data type defined by: Item 123
Stored at: Address 0x1238
... some more information ...
End item 125 /* Function "main" */
Item 127:
...
You can see this information using the following command:
readelf --debug-dump filename.o
Why does GCC store global and static int differently?
I compiled your example twice: Once with optimization and once without optimization.
Without optimization, svar and gvar1 were stored exactly the same way: data type int, stored at a fixed address. lvar was: data type int, stored on the stack.
With optimization, lvar and svar were stored the same way: data type int, not stored at all; instead they are treated as constant values.
(This makes sense because the values of these variables never change.)

The C11 specification (read n1570), like the older C standards, does not define at what addresses or offsets global or static variables are stored, so the implementation (your gcc compiler and your ld linker) is free to put them anywhere.
The organization and layout of the data segments is an implementation detail.
You may want to read more about DWARF to understand debug information, which is useful to the gdb debugger.
You may want to read more about linkers and loaders, and about the ELF format, if you want to understand how they work. On Linux, there are several utilities to inspect elf(5) files, including objdump(1), readelf(1), nm(1).
Notice that your GCC 4.4 is an old, obsolete version of GCC. The current version is GCC 7, and GCC 8 will be released in a few weeks (spring 2018). I strongly recommend upgrading your compiler.
If you need to understand how and why the data segments are organized in such a way and why your implementation chooses such a layout, you could take advantage of the fact that both gcc and ld (from binutils) are free software, and study their source in detail. You'll need many years of work, since they are complex software (more than ten million lines of source code).
If you happen to start studying the internals of GCC, be sure to study a recent version. Most people in the GCC community have probably forgotten the details of GCC 4.4 (released in 2009). A lot of things have changed in GCC since that ancient thing. A few years ago, I wrote many slides about GCC internals; see the documentation of GCC MELT.
BTW, the layout of data segments, or of variables inside them, might vary with optimization options. It might happen that lvar does not sit in memory (e.g. stays in a register only); it could happen that a static variable is removed (using something like the as-if rule) etc.
For a single translation unit foo.c, you might compile it into assembler code using gcc -fverbose-asm -S -O foo.c and look into the emitted foo.s assembler code.
To understand more about how your ld linker works, you might look into the relevant linker script. You can find out how ld is invoked from gcc by using gcc -v (instead of gcc) in your compilation and linking command.
In most cases, you should not care about the particular offsets (in object files or executables) or addresses (in the virtual address space of your process) of global or static variables. Be also aware of ASLR. The proc(5) filesystem can be used to understand your process.
(your question is severely lacking some motivation and context)

Related

How to tell gcc to not align function parameters on the stack?

I am trying to decompile an executable for the 68000 processor into C code, replacing the original subroutines with C functions one by one.
The problem I faced is that I don't know how to make gcc use the calling convention that matches the one used in the original program. I need the parameters on the stack to be packed, not aligned.
Let's say we have the following function
int fun(char arg1, short arg2, int arg3) {
return arg1 + arg2 + arg3;
}
If we compile it with
gcc -m68000 -Os -fomit-frame-pointer -S source.c
we get the following output
fun:
move.b 7(%sp),%d0
ext.w %d0
move.w 10(%sp),%a0
lea (%a0,%d0.w),%a0
move.l %a0,%d0
add.l 12(%sp),%d0
rts
As we can see, the compiler assumed that parameters have addresses 7(%sp), 10(%sp) and 12(%sp),
but to work with the original program they need to have addresses 4(%sp), 5(%sp) and 7(%sp).
One possible solution is to write the function in the following way (the processor is big-endian):
int fun(int bytes4to7, int bytes8to11) {
char arg1 = bytes4to7>>24;
short arg2 = (bytes4to7>>8)&0xffff;
int arg3 = ((bytes4to7&0xff)<<24) | (bytes8to11>>8);
return arg1 + arg2 + arg3;
}
However, the code looks messy, and I was wondering: is there a way to both keep the code clean and achieve the desired result?
UPD: I made a mistake. The offsets I'm looking for are actually 5(%sp), 6(%sp) and 8(%sp) (the char-s should be aligned with the short-s, but the short-s and the int-s are still packed).
Hopefully, this doesn't change the essence of the question.
UPD 2: It turns out that the 68000 C Compiler by Sierra Systems gives the described offsets (as in UPD, with 2-byte alignment).
However, the question is about tweaking calling conventions in gcc (or perhaps another modern compiler).
Here's a way with a packed struct. I compiled it on an x86 with -m32 and got the desired offsets in the disassembly, so I think it should still work for an mc68000:
typedef struct {
char arg1;
short arg2;
int arg3;
} __attribute__((__packed__)) fun_t;
int
fun(fun_t fun)
{
return fun.arg1 + fun.arg2 + fun.arg3;
}
But, I think there's probably a still cleaner way. It would require knowing more about the other code that generates such a calling sequence. Do you have the source code for it?
Does the other code have to remain in asm? With the source, you could adjust the offsets in the asm code to be compatible with modern C ABI calling conventions.
I've been programming in C since 1981 and spent years doing mc68000 C and assembler code (for apps, kernel, device drivers), so I'm somewhat familiar with the problem space.
It's not a gcc 'fault'; it is the 68k architecture that requires the stack to always be aligned on 2 bytes.
So there is simply no way to break 2-byte alignment on the hardware stack.
but to work with the original program they need to have addresses
4(%sp), 5(%sp) and 7(%sp):
Accessing word or long values at an odd memory address will immediately trigger an address error exception on the 68000.
To get integral parameters passed using 2-byte alignment instead of 4-byte alignment, you can change the default int size to 16 bits with -mshort. You then need to replace every int in your code with long (if you want them to be 32 bits wide). The crude way to do that is to also pass -Dint=long to your compiler. Obviously, you will break ABI compatibility with object files compiled with -mno-short (which appears to be the default for gcc).

How to define C functions with LuaJIT?

This:
local ffi = require "ffi"
ffi.cdef[[
int return_one_two_four(){
return 124;
}
]]
local function print124()
print(ffi.C.return_one_two_four())
end
print124()
Throws an error:
Error: main.lua:10: cannot resolve symbol 'return_one_two_four': The specified procedure could not be found.
I have a sort-of moderate grasp on C and wanted to use some of its strengths for a few things, but I couldn't find many examples of LuaJIT's FFI library. It seems like cdef is only used for function declarations and not definitions. How can I make functions in C and then use them in Lua?
LuaJIT is a Lua compiler, but not a C compiler. You have to compile your C code into a shared library first. For example with
gcc -shared -fPIC -o libtest.so test.c
luajit test.lua
with the files test.c and test.lua as below.
test.c
int return_one_two_four(){
return 124;
}
test.lua
local ffi = require"ffi"
local ltest = ffi.load"./libtest.so"
ffi.cdef[[
int return_one_two_four();
]]
local function print124()
print(ltest.return_one_two_four())
end
print124()
Live example on Wandbox
A JIT within LuaJIT
In the comments under the question, someone mentioned a workaround to write functions in machine code and have them executed within LuaJIT on Windows. Actually, the same is possible on Linux by essentially implementing a JIT within LuaJIT. While on Windows you can just insert opcodes into a string, cast it to a function pointer and call it, the same is not possible on Linux due to page restrictions. The memory that holds a string is normally not executable, and security policies commonly forbid pages that are writable and executable at the same time, so we have to allocate a page in read-write mode, insert the machine code and then change the mode to read-execute. To this end, use mmap, mprotect and getpagesize to map the memory, change its protection and get the page size. However, if you make even the tiniest mistake, like a typo in one of the opcodes, the program will most likely crash with a segfault. I'm using 64-bit assembly because I'm using a 64-bit operating system.
Important: Before executing this on your machine, check the magic numbers in <bits/mman-linux.h>. They are not the same on every system.
local ffi = require"ffi"
ffi.cdef[[
typedef unsigned char uint8_t;
typedef long int off_t;
// from <sys/mman.h>
void *mmap(void *addr, size_t length, int prot, int flags,
int fd, off_t offset);
int munmap(void *addr, size_t length);
int mprotect(void *addr, size_t len, int prot);
// from <unistd.h>
int getpagesize(void);
]]
-- magic numbers from <bits/mman-linux.h>
local PROT_READ = 0x1 -- Page can be read.
local PROT_WRITE = 0x2 -- Page can be written.
local PROT_EXEC = 0x4 -- Page can be executed.
local MAP_PRIVATE = 0x02 -- Changes are private.
local MAP_ANONYMOUS = 0x20 -- Don't use a file.
local page_size = ffi.C.getpagesize()
local prot = bit.bor(PROT_READ, PROT_WRITE)
local flags = bit.bor(MAP_ANONYMOUS, MAP_PRIVATE)
local code = ffi.new("uint8_t *", ffi.C.mmap(ffi.NULL, page_size, prot, flags, -1, 0))
local count = 0
local asmins = function(...)
for _,v in ipairs{ ... } do
assert(count < page_size)
code[count] = v
count = count + 1
end
end
asmins(0xb8, 0x7c, 0x00, 0x00, 0x00) -- mov rax, 124
asmins(0xc3) -- ret
ffi.C.mprotect(code, page_size, bit.bor(PROT_READ, PROT_EXEC))
local fun = ffi.cast("int(*)(void)", code)
print(fun())
ffi.C.munmap(code, page_size)
Live example on Wandbox
How to find opcodes
I see that this answer has attracted some interest, so I want to add something which I was having a hard time with at first, namely how to find opcodes for the instructions you want to perform. There are some resources online, most notably the Intel® 64 and IA-32 Architectures Software Developer Manuals, but nobody wants to go through thousands of PDF pages just to find out how to do mov rax, 124. Therefore some people have made tables which list instructions and corresponding opcodes, e.g. http://ref.x86asm.net/, but looking up opcodes in a table is cumbersome as well, because even mov can have many different opcodes depending on what the target and source operands are. So what I do instead is write a short assembly file, for example
mov rax, 124
ret
You might wonder why there are no functions and no things like segment .text in my assembly file. Well, since I don't ever want to link it, I can just leave all of that out and save some typing. Then just assemble it using
$ nasm -felf64 -l test.lst test.s
The -felf64 option tells the assembler that I'm using 64-bit syntax, the -l test.lst option that I want to have a listing of the generated code in the file test.lst. The listing looks similar to this:
$ cat test.lst
1 00000000 B87C000000 mov rax, 124
2 00000005 C3 ret
The third column contains the opcodes I am interested in. Just split these into units of 1 byte and insert them into your program, i.e. B87C000000 becomes 0xb8, 0x7c, 0x00, 0x00, 0x00 (hexadecimal numbers are luckily case-insensitive in Lua and I like lowercase better).
Technically you can do the sorts of things you want to do without too much trouble (as long as the code is simple enough).
Using something like this:
https://github.com/nucular/tcclua
With tcc (which is very small, and which you can even deploy easily) it's quite a nice way to have the best of both worlds, all in a single package :)
LuaJIT includes a recognizer for C declarations, but it isn't a full-fledged C compiler. The purpose of its FFI system is to be able to define what C functions a particular DLL exports so that it can load that DLL (via ffi.load) and allow you to call those functions from Lua.
LuaJIT can load pre-compiled code through a DLL C-based interface, but it cannot compile C itself.

The hidden __result local variable in armcc DWARF debug information

I'm writing tools for debugging Cortex-M and I have discovered an artefact when reviewing the DWARF .debug_info section which the armcc outputs for some C source. (The exact compiler is ARM Compiler 5.05.)
For example when the C source contains a simple function such as:
int function(int a)
{
int x;
int y;
I have discovered that the .debug_info describes the x and y local variables as expected, and additionally a "hidden variable" called __result as follows:
<2><218>: Abbrev Number: 94 (DW_TAG_variable)
<219> DW_AT_name : __result
<222> DW_AT_type : DW_FORM_ref2 <0x188>
<225> DW_AT_location : 1 byte block: 50 (DW_OP_reg0 (r0))
<227> DW_AT_start_scope : 64
<228> DW_AT_artificial : 1
The clue to the "hidden" nature of this "variable" is the presence of the DW_AT_artificial flag.
I have read the DWARF documentation regarding the DW_AT_artificial flag, which confirmed my suspicions. I've also deduced by experimentation that this feature relates to the return value of the function, since this "variable" does not appear in the DWARF for a void function.
What I cannot find is any confirmation of this entity's use as intended by the designers of the armcc toolchain. Can anybody elaborate on the meaning and usage of my discovery?

The parameter type is not valid for a function of this linkage type

I'm working on AIX with IBM's XL C compiler. I'm catching a compile error and I'm not sure how to proceed:
$ xlc -g3 -O0 -qarch=pwr8 -qaltivec fips197-p8.c -o fips197-p8.exe
"fips197-p8.c", line 59.16: 1506-754 (W) The parameter type is not valid for a function of this linkage type.
The relevant source code is shown below. The complete source code is available at fips197-p8.c. The source code is a test driver for Power 8 __cipher and __vcipherlast. It has a main and a few C functions. Effectively it is a minimal complete working example for Power 8 AES.
$ cat -n fips197-p8.c
...
11 #if defined(__xlc__) || defined(__xlC__)
12 // #include <builtins.h>
13 #include <altivec.h>
14 typedef vector unsigned char uint8x16_p8;
15 typedef vector unsigned int uint64x2_p8;
16 #else
17 #include <altivec.h>
18 typedef vector unsigned char uint8x16_p8;
19 typedef vector unsigned long long uint64x2_p8;
20 #endif
...
52 uint8x16_p8 Load8x16(const uint8_t src[16])
53 {
54 #if defined(__xlc__) || defined(__xlC__)
55 /* IBM XL C/C++ compiler */
56 # if defined(__LITTLE_ENDIAN__)
57 return vec_xl_be(0, src);
58 # else
59 return vec_xl(0, src);
60 # endif
61 #else
62 /* GCC, Clang, etc */
63
64 #endif
65 }
The compiler version is shown below. We don't control the compiler, so this is what we have:
$ xlc -qversion
IBM XL C/C++ for AIX, V13.1.3 (5725-C72, 5765-J07)
Version: 13.01.0003.0000
vec_xl is fine on a little-endian. vec_xl for big-endian is giving us the trouble.
What is the problem, and how do I fix it?
So a little guesswork (confirmed by the OP's comments, since it works) led me to think that this cryptic and obscure "The parameter type is not valid for a function of this linkage type." message (the first Google match is this question!) could be a qualifier issue.
Since your contract is
uint8x16_p8 Load8x16(const uint8_t src[16])
it is possible that, given the options & the current endianness, the compiler/prototype believes that vec_xl_be expects a non-const parameter as src.
So passing a const violates the contract (and that's the nicest way xlc could find to notify you).
So either change to
uint8x16_p8 Load8x16(uint8_t src[16])
(with the risk of dropping constant constraints for all callers)
or drop the const by a non-const cast (like we do when the prototype lacks const, but the data is in fact not modified in the function):
vec_xl_be(0,(uint8_t*)src);

How to place a variable at a given absolute address in memory (with GCC)

The RealView ARM C Compiler supports placing a variable at a given memory address using the variable attribute at(address):
int var __attribute__((at(0x40001000)));
var = 4; // changes the memory located at 0x40001000
Does GCC have a similar variable attribute?
I don't know, but you can easily create a workaround like this:
int *var = (int*)0x40001000;
*var = 4;
It's not exactly the same thing, but in most situations a perfect substitute. It will work with any compiler, not just GCC.
If you use GCC, I assume you also use GNU ld (although it is not a certainty, of course) and ld has support for placing variables wherever you want them.
I imagine letting the linker do that job is pretty common.
Inspired by the answer by @rib, I'll add that if the absolute address is for some control register, I'd add volatile to the pointer definition. If it is just RAM, it doesn't matter.
You could use the section attributes and an ld linker script to define the desired address for that section. This is probably messier than your alternatives, but it is an option.
Minimal runnable linker script example
The technique was mentioned at: https://stackoverflow.com/a/4081574/895245 but I will now provide a concrete example.
main.c
#include <stdio.h>
int myvar __attribute__((section(".mySection"))) = 0x9ABCDEF0;
int main(void) {
printf("adr %p\n", (void*)&myvar);
printf("val 0x%x\n", myvar);
myvar = 0;
printf("val 0x%x\n", myvar);
return 0;
}
link.ld
SECTIONS
{
.mySegment 0x12345678 : {KEEP(*(.mySection))}
}
Compile and run:
gcc -fno-pie -no-pie -o main.out -std=c99 -Wall -Wextra -pedantic link.ld main.c
./main.out
Output:
adr 0x12345678
val 0x9abcdef0
val 0x0
So we see that it was put at the desired address.
I cannot find where this is documented in the GCC manual, but the following syntax:
gcc link.ld main.c
seems to append the given linker script to the default one that would be used.
-fno-pie -no-pie is required, because the Ubuntu toolchain is now configured to generate PIE executables by default, which leads the Linux kernel to place the executable on a different address every time, which messes with our experiment. See also: What is the -fPIE option for position-independent executables in gcc and ld?
TODO: compilation produces a warning:
/usr/bin/x86_64-linux-gnu-ld: warning: link.ld contains output sections; did you forget -T?
Am I doing something wrong? How to get rid of it? See also: How to remove warning: link.res contains output sections; did you forget -T?
Tested on Ubuntu 18.10, GCC 8.2.0.
You answered your own question.
The page you linked above states:
With the GNU GCC Compiler you may use only pointer definitions to access absolute memory locations. For example:
#define IOPIN0 (*((volatile unsigned long *) 0xE0028000))
IOPIN0 = 0x4;
Btw http://gcc.gnu.org/onlinedocs/gcc-4.5.0/gcc/Variable-Attributes.html#Variable%20Attributes
Here is one solution that actually reserves space at a fixed address in memory without having to edit the linker file:
extern const uint8_t dev_serial[12];
asm(".equ dev_serial, 0x1FFFF7E8");
/* or asm("dev_serial = 0x1FFFF7E8"); */
...
for (i = 0 ; i < sizeof(dev_serial); i++)
printf((char *)"%02x ", dev_serial[i]);
In GCC you can place a variable into a specific section:
__attribute__((section (".foo"))) static uint8_t * _rxBuffer;
or
static uint8_t * _rxBuffer __attribute__((section (".foo")));
and then specify the address of the section in the GNU Linker Memory Settings:
.foo=0x800000
I had a similar issue. I wanted to allocate a variable in my own section at a specific offset. At the same time I wanted the code to be portable (no explicit memory address in my C code). So I defined the RAM section in the linker script, and defined an array with the same length as my section (the .noinit section is 0x0F long).
uint8_t no_init_sec[0x0f] __attribute__ ((section (".noinit")));
This array maps all locations of this section. This solution is not suitable when the section is large, as the unused locations in the allocated array will be wasted space in the data memory.
In my opinion, the right answer is the "Minimal runnable linker script example" one.
However, there was something not mentioned there:
If the variable is not used in code (e.g. the variable holds read-only data such as version...), it is necessary to add the 'used' attribute.
Refer to my answer at https://stackoverflow.com/a/75468786/3887115.

Resources