I'm writing my own kernel (using multiboot2) and have followed this tutorial to bring it into long-mode. I am now linking with the following C code:
void kernel_main()
{
*(uint64_t*) 0xb8000 = 0x2f592f412f4b2f4f;
}
This prints OKAY to the screen.
However, I now create a global variable called VGA_buffer that holds this memory address.
volatile static const void* VGA_buffer = 0xb8000;
void kernel_main()
{
*(uint64_t*) VGA_buffer = 0x2f592f412f4b2f4f;
}
The code is no longer working, OKAY is not appearing on the screen.
How do I fix this?
I think this is because my linker script is not including the global variable data. This is what I've got:
ENTRY(start)
SECTIONS
{
. = 1M;
.boot :
{
*(.multiboot_header)
}
.text :
{
*(.text)
}
}
I also tried adding the following with no luck:
...
.rodata :
{
*(.rodata)
}
.data :
{
*(.data)
}
.bss :
{
*(.bss)
}
I'm not very familiar with custom linker scripts so I really don't know what I'm doing, and I'm not sure if this is even the problem.
You need to both link the .data segment and execute some initialization code for it, in order to initialize VGA_buffer. Meaning that you need to ensure that some manner of "CRT" (C run-time) code executes .data initialization. If you run some "no ABI" version of the compiler, this part might not happen at all unless you write it yourself manually.
Casting away const and volatile qualifiers invokes undefined behavior. Not sure why you added the const in the first place.
volatile static void* VGA_buffer = 0xb8000; isn't valid C. See "Pointer from integer/integer from pointer without a cast" issues
Pedantically, always write storage class specifier at the start of the declaration. volatile static... is obsolete style C. Instead, always write static volatile...
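The points above can be sketched as follows, assuming the standard VGA text layout (one 16-bit cell per character, attribute in the high byte, ASCII code in the low byte). The helper vga_write and the macro VGA_TEXT_BASE are hypothetical names introduced for illustration. The buffer is passed as a parameter so the same code can be tried against an ordinary array on a host machine.

```c
#include <stddef.h>
#include <stdint.h>

/* Physical address of the VGA text buffer on PC-compatible hardware. */
#define VGA_TEXT_BASE ((uintptr_t)0xB8000)

/*
 * Write a NUL-terminated string into a VGA-style text buffer.
 * Each cell is (attribute << 8) | character.
 * Returns the number of cells written.
 * vga_write is a hypothetical helper, not part of any library.
 */
static size_t vga_write(volatile uint16_t *buf, const char *s, uint8_t attr)
{
    size_t i = 0;
    for (; s[i] != '\0'; i++)
        buf[i] = (uint16_t)(((uint16_t)attr << 8) | (uint8_t)s[i]);
    return i;
}
```

In the kernel this would be called as vga_write((volatile uint16_t *)VGA_TEXT_BASE, "OKAY", 0x2f). The integer-to-pointer conversion is an explicit cast and the volatile qualifier is never cast away.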
Hours wasted because I was missing two characters: -c
When compiling my C code into kernel.o, I forgot to tell gcc that this was a compilation-only step and that no linking should be done. After adding the -c flag, everything worked!
And I got great hints on the way, but I never even noticed them:
I was wondering, when compiling my C code, why I had to put in -nostdlib to stop standard library from being linked... because I forgot the -c flag.
I was also wondering why it was complaining that some externs were not provided because it was trying to link at this step when it shouldn't have been.
I am using a library whose files I shouldn't change, and it includes my header file.
The library's code looks something like this:
#include "my_file"
extern void (*some_func)();
void foo()
{
(some_func)();
}
My problem is that I want some_func to be an extern function rather than an extern pointer to a function (I am implementing and linking some_func), and that is how main will call it.
That way I will save a little run time and code space, and no one will change this global by mistake.
Is it possible?
I thought about adding something like this to my_file.h:
#define *some_func some_func
but it won't compile because an asterisk is not allowed in a #define name.
EDIT
The library is not compiled yet, so changes to my_file.h will affect the compilation.
First of all, you say that you can't change the source of the library. Well, this is bad, and some "betrayal" is necessary.
My approach is to let the declaration of the pointer some_func as is, a non-constant writable variable, but to implement it as constant non-writable variable, which will be initialized once for all with the wanted address.
Here comes the minimal, reproducible example.
The library is implemented as you show us:
// lib.c
#include "my_file"
extern void (*some_func)();
void foo()
{
(some_func)();
}
Since you have this include file in the library's source, I provide one. But it is empty.
// my_file
I use a header file that declares the public API of the library. This file still has the writable declaration of the pointer, so that offenders believe they can change it.
// lib.h
extern void (*some_func)();
void foo();
I separated an offending module to try the impossible. It has a header file and an implementation file. In the source the erroneous assignment is marked, already revealing what will happen.
// offender.h
void offend(void);
// offender.c
#include <stdio.h>
#include "lib.h"
#include "offender.h"
static void other_func()
{
puts("other_func");
}
void offend(void)
{
some_func = other_func; // the assignment gives a run-time error
}
The test program consists of this little source file. To avoid compiler errors, the declaration has to be qualified as const. Here, where we include the declaring header file, we can use some preprocessor magic.
// main.c
#include <stdio.h>
#define some_func const some_func
#include "lib.h"
#undef some_func
#include "offender.h"
static void my_func()
{
puts("my_func");
}
void (* const some_func)() = my_func;
int main(void)
{
foo();
offend();
foo();
return 0;
}
The trick is, that the compiler places the pointer variable in the read-only section of the executable. The const attribute is just used by the compiler and is not stored in the intermediate object files, and the linker happily resolves all references. Any write access to the variable will generate a runtime error.
Now all of this is compiled in an executable, I used GCC on Windows. I did not bother to create a separated library, because it doesn't make a difference for the effect.
gcc -Wall -Wextra -g main.c offender.c lib.c -o test.exe
If I run the executable in "cmd", it just prints "my_func". Apparently the second call of foo() is never executed. The ERRORLEVEL is -1073741819, which is 0xC0000005. Looking up this code gives the meaning "STATUS_ACCESS_VIOLATION", on other systems known as "segmentation fault".
Because I deliberately compiled with the debugging flag -g, I can use the debugger to examine more deeply.
d:\tmp\StackOverflow\103> gdb -q test.exe
Reading symbols from test.exe...done.
(gdb) r
Starting program: d:\tmp\StackOverflow\103\test.exe
[New Thread 12696.0x1f00]
[New Thread 12696.0x15d8]
my_func
Thread 1 received signal SIGSEGV, Segmentation fault.
0x00000000004015c9 in offend () at offender.c:16
16 some_func = other_func;
Alright, as I intended, the assignment is blocked. However, the reaction of the system is quite harsh.
Unfortunately we cannot get a compile-time or link-time error. This is because of the design of the library, which is fixed, as you say.
You could look at the ifunc attribute if you are using GCC or related. It should patch a small trampoline at load time. So when calling the function, the trampoline is called with a known static address and then inside the trampoline there is a jump instruction that was patched with the real address. So when running, all jump locations are directly in the code, which should be efficient with the instruction cache. Note that it might even be more efficient than this, but at most as bad as calling the function pointer. Here is how you would implement it:
extern void (*some_func)(void); // defined in the header you do not have control about
void some_func_resolved(void) __attribute__((ifunc("resolve_some_func")));
static void (*resolve_some_func(void)) (void)
{
return some_func;
}
// call some_func_resolved instead now
Given a header that defines a static inline function containing static local variables, how can I merge the identical static local variables across all TUs that make up the final loadable module? In a less abstract way:
/*
* inc.h
*/
#include <stdlib.h>
/*
* This function must be provided via header. No extra .c source
* is allowed for its definition.
*/
static inline void* getPtr() {
static void* p;
if (!p) {
p = malloc(16);
}
return p;
}
/*
* 1.c
*/
#include "inc.h"
void* foo1() {
return getPtr();
}
void* bar1() {
return getPtr();
}
/*
* 2.c
*/
#include "inc.h"
void* foo2() {
return getPtr();
}
void* bar2() {
return getPtr();
}
Platform is Linux, and this file set is built via:
$ clang -O2 -fPIC -shared 1.c 2.c
It is quite expected that both TUs receive own copies of getPtr.p. Though inside each TU getPtr.p is shared across all getPtr() instantiations. This can be confirmed by inspecting final loadable binary:
$ readelf -s --wide a.out | grep getPtr
32: 0000000000201030 8 OBJECT LOCAL DEFAULT 21 getPtr.p
34: 0000000000201038 8 OBJECT LOCAL DEFAULT 21 getPtr.p
At the same time I'm looking for a way of how to share getPtr.p across separate TU boundary. This vaguely resembles what happens with C++ template instantiations. And likely GRP_COMDAT would help me but I was not able to find any info about how to label my static var to be put into COMDAT.
Is there any attribute or other source-level (not a compiler option) way to achieve merging such objects?
If I understand correctly what you want, you can get this effect by simply declaring a global variable.
/*
* inc.h
*/
#include <stdlib.h>
void* my_p;
static inline void* getPtr() {
if (!my_p) {
my_p = malloc(16);
}
return my_p;
}
This will use the same variable my_p for all instances of getPtr throughout the program (since it's global). And it is not necessary to have an explicit definition of my_p in any module. It will be initialized to NULL, which is just what you want. So nothing besides inc.h needs to change, and no additional .c file is needed.
Of course, you'll probably want to give my_p a name that is less likely to conflict with any identifier in the user's program. Maybe Sergios_include_file_p_for_getPtr or something of the sort.
This is actually an extension to standard C (mentioned in Annex J.5.11 in N2176), but it's provided by gcc and clang on most modern platforms. It's documented under the -fcommon compiler option (which was enabled by default before GCC 10 and Clang 11; newer releases default to -fno-common, so they need -fcommon or the attribute mentioned below). It's typically implemented by putting the variable in a common section, and the linker then merges all instances together, just as you suggest. But the code above shows how to access the feature without needing to use attributes or other obscure incantations.
If you want to be extra paranoid, you can declare my_p with __attribute__((common)) which will cause the variable to be treated in this way even if -fno-common is in effect. (Of course, that may cause trouble if -fno-common was being used for a reason...)
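A minimal sketch of that last variant, assuming GCC or Clang on an ELF platform such as Linux. Only the attribute is new compared with the header above. The name my_p is kept from the earlier example.

```c
#include <stdlib.h>

/*
 * inc.h (sketch): request common-symbol treatment explicitly, so the
 * tentative definitions from every TU that includes this header are
 * merged by the linker whether -fcommon or -fno-common is in effect.
 */
__attribute__((common)) void *my_p;

static inline void *getPtr(void)
{
    if (!my_p)
        my_p = malloc(16);
    return my_p;
}
```

Running readelf -s on the resulting shared object should then list a single my_p object instead of one local copy per TU.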
I am developing software for an AVR microcontroller. To say it up front, I currently only have LEDs and pushbuttons for debugging. The problem appears when I pass a string literal into the following function:
void test_char(const char *str) {
if (str[0] == -1)
LED_PORT ^= 1 << 7; /* Test */
}
Somewhere in main()
test_char("AAAAA");
And now the LED changes state. On my x86_64 machine I wrote the same function to compare (without the LED, of course), and there str[0] equals 'A'. Why is this happening?
Update:
Not sure whether this is related, but I have a struct called button, like this:
typedef struct {
int8_t seq[BTN_SEQ_COUNT]; /* The sequence of button */
int8_t seq_count; /* The number of buttons registered */
int8_t detected; /* The detected button */
uint8_t released; /* Whether the button is released
after a hold */
} button;
button btn = {
.seq = {-1, -1, -1},
.detected = -1,
.seq_count = 0,
.released = 0
};
But it turned out that btn.seq_count starts out as -1 even though I initialized it to 0.
Update2
I solved the latter problem by initializing the values in a function. However, that does not explain why seq_count was set to -1 in the previous case, nor does it explain why the character in the string literal equals -1.
Update3
Back to the original problem, I added a complete mini example here, and same occurs:
void LED_on() {
PORTA = 0x00;
}
void LED_off() {
PORTA = 0xFF;
}
void port_init() {
PORTA = 0xFF;
DDRA |= 0xFF;
}
void test_char(const char* str) {
if (str[0] == -1) {
LED_on();
}
}
void main() {
port_init();
test_char("AAAAA");
while(1) {
}
}
Update 4
I am trying to follow Nominal Animal's advice, but not quite successful. Here is the code I have changed:
void test_char(const char* str) {
switch(pgm_read_byte(str++)) {
case '\0': return;
case 'A': LED_on(); break;
case 'B': LED_off(); break;
}
}
void main() {
const char* test = "ABABA";
port_init();
test_char(test);
while(1) {
}
}
I am using gcc 4.6.4,
avr-gcc -v
Using built-in specs.
COLLECT_GCC=avr-gcc
COLLECT_LTO_WRAPPER=/home/carl/Softwares/AVR/libexec/gcc/avr/4.6.4/lto-wrapper
Target: avr
Configured with: ../configure --prefix=/home/carl/Softwares/AVR --target=avr --enable-languages=c,c++ --disable-nls --disable-libssp --with-dwarf2
Thread model: single
gcc version 4.6.4 (GCC)
Rewritten from scratch, to hopefully clear up some of the confusion.
First, some important background:
AVR microcontrollers have separate address spaces for RAM and ROM/Flash ("program memory").
GCC generates code that assumes all data is always in RAM. (Older versions used to have special types, such as prog_char, that referred to data in the ROM address space, but newer versions of GCC do not and cannot support such data types.)
When linking against avr-libc, the linker adds code (__do_copy_data) to copy all initialized data from program memory to RAM. If you have both avr-gcc and avr-libc packages installed, and you use something like avr-gcc -Wall -O2 -fomit-frame-pointer -mmcu=AVRTYPE source.c -o binary.elf to compile your source file into a program binary, then use avr-objcopy to convert the elf file into the format your firmware utilities support, you are linking against avr-libc.
If you use avr-gcc to only produce an object file source.o, and some other utilities to link and upload your program to your microcontroller, this copying from program memory to RAM may not happen. It depends on what linker and libraries you use.
As most AVRs have only a few dozen to a few hundred bytes of RAM available, it is very, very easy to run out of RAM. I'm not certain if avr-gcc and avr-libc reliably detect when you have more initialized data than you have RAM available. If you specify any arrays containing strings, it is very likely you've already overrun your RAM, causing all sorts of interesting bugs to appear.
The avr/pgmspace.h header file is part of avr-libc, and defines a macro, PROGMEM, that can be used to specify data that will only be referred to by functions that take program memory addresses (pointers), such as pgm_read_byte() or strcmp_P() defined in the same header file. The linker will not copy such variables to RAM -- but neither will the compiler tell you if you're using them wrong.
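The startup copy described above can be sketched in portable C. This is not avr-libc's actual __do_copy_data (which is assembly and reads flash with lpm). The function names copy_data and clear_bss and their parameters are hypothetical. On real hardware the bounds would come from linker-script symbols rather than arguments.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Copy the initial values of .data from their load address
 * (the image stored in ROM/flash) to their run-time address in RAM.
 */
static void copy_data(uint8_t *ram_start, const uint8_t *ram_end,
                      const uint8_t *load_image)
{
    for (uint8_t *dst = ram_start; dst != ram_end; dst++)
        *dst = *load_image++;
}

/* Zero the .bss region, as C requires for objects without an initializer. */
static void clear_bss(uint8_t *bss_start, const uint8_t *bss_end)
{
    for (uint8_t *dst = bss_start; dst != bss_end; dst++)
        *dst = 0;
}
```

If neither step runs before main(), initialized globals and string literals hold whatever happened to be in RAM, which is one way to get the symptoms in the question.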
If you use both avr-gcc and avr-libc, I recommend using the following approach for all read-only data:
#include <avr/pgmspace.h>
/*
* Define LED_init(), LED_on(), and LED_off() functions.
*/
void blinky(const char *str)
{
while (1) {
switch (pgm_read_byte(str++)) {
case '\0': return;
case 'A': LED_on(); break;
case 'B': LED_off(); break;
}
/* Add a sleep or delay here,
* or you won't be able to see the LED flicker. */
}
}
static const char example1[] PROGMEM = "AB";
const char example2[] PROGMEM = "AAAA";
int main(void)
{
static const char example3[] PROGMEM = "ABABB";
LED_init();
while (1) {
blinky(example1);
blinky(example2);
blinky(example3);
}
}
Because of changes (new limitations) in GCC internals, the PROGMEM attribute can only be used with a variable; if it refers to a type, it does nothing. Therefore, you need to specify strings as character arrays, using one of the forms above. (example1 is visible within this compilation unit only, example2 can be referred to from other compilation units too, and example3 is visible only in the function it is defined in. Here, visible refers to where you can refer to the variable; it has nothing to do with the contents.)
The PROGMEM attribute does not actually change the code GCC generates. All it does is put the contents to .progmem.data section, iff without it they'd be in .rodata. All of the magic is really in the linking, and in linked library code.
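To try PROGMEM-style code away from the target, one option is a small host fallback. The sketch below assumes that on a non-AVR host PROGMEM can expand to nothing and pgm_read_byte can be a plain dereference. The fallback macros and the helper count_char_P are hypothetical and not part of avr-libc.

```c
#include <stddef.h>

#ifdef __AVR__
#include <avr/pgmspace.h>
#else
/* Host fallback (an assumption for off-target testing): one flat address space. */
#define PROGMEM
#define pgm_read_byte(addr) (*(const unsigned char *)(addr))
#endif

/* Must be an array variable carrying PROGMEM, not a pointer to a plain literal. */
static const char pattern[] PROGMEM = "ABABA";

/* Count occurrences of c in a NUL-terminated string stored in program memory. */
static size_t count_char_P(const char *progmem_str, char c)
{
    size_t n = 0;
    unsigned char b;
    while ((b = pgm_read_byte(progmem_str++)) != '\0')
        if (b == (unsigned char)c)
            n++;
    return n;
}
```

On the AVR, passing an ordinary literal such as const char *test = "ABABA" to this function would read program memory at what is really a RAM address, which is what the code in Update 4 does.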
If you do not use avr-libc, then you need to be very specific with your const attributes, as they determine which section the contents will end up in. Mutable (non-const) data should end up in the .data section, while immutable (const) data ends up in .rodata section(s). Remember to read the specifiers from right to left, starting at the variable itself, separated by '*': the leftmost refers to the content, whereas the rightmost refers to the variable. In other words,
const char *s = p;
defines s so that the value of the variable can be changed, but the content it points to is immutable (unchangeable/const); whereas
char *const s = p;
defines s so that you cannot modify the variable itself, but you can the content -- the content s points to is mutable, modifiable. Furthermore,
const char *s = "literal";
defines s to point to a literal string (and you can modify s, ie. make it point to some other literal string for example), but you cannot modify the contents; and
char s[] = "string";
defines s to be a character array (of length 7: the string length 6, plus 1 for the end-of-string char), that happens to be initialized to { 's', 't', 'r', 'i', 'n', 'g', '\0' }.
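The declaration forms above, collected into one compilable sketch. The function and array names are hypothetical. The lines that the compiler would reject are left as comments.

```c
#include <stddef.h>

static char text_a[] = "string";   /* mutable array: 6 characters plus '\0' */
static char text_b[] = "other";

/* Size of the array object, terminator included. */
static size_t array_size(void)
{
    return sizeof text_a;
}

/* Pointer to const content: the pointer can be re-aimed, the content is read-only. */
static const char *retarget(void)
{
    const char *s = text_a;
    s = text_b;            /* allowed: s itself is not const */
    /* s[0] = 'X'; */      /* rejected: content is const through s */
    return s;
}

/* Const pointer to mutable content: the content can change, the pointer cannot. */
static char first_after_write(void)
{
    char *const s = text_a;
    s[0] = 'S';            /* allowed: content is mutable */
    /* s = text_b; */      /* rejected: s itself is const */
    return s[0];
}
```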
All linker tools that work on object files use the sections to determine what to do with the contents. (Indeed, avr-libc copies the contents of .rodata sections to RAM, and only leaves .progmem.data in program memory.)
Carl Dong, there are several cases where you may observe weird behaviour, even reproducible weird behaviour. I'm no longer certain which one is the root cause of your problem, so I'll just list the ones I think are likely:
If linking against avr-libc, running out of RAM
AVRs have very little RAM, and copying even string literals to RAM easily eats it all up. If this happens, any kind of weird behaviour is possible.
Failing to link against avr-libc
If you think you use avr-libc, but are not certain, then use avr-objdump -d binary.elf | grep -e '^[0-9a-f]* <_' to see if the ELF binary contains any library code. You should expect to see at least <__do_clear_bss>:, <_exit>:, and <__stop_program>: in that list, I believe.
Linking against some other C library, but expecting avr-libc behaviour
Other libraries you link against may have different rules. In particular, if they're designed to work with some other C compiler -- especially one that supports multiple address spaces, and therefore can deduce when to use ld and when lpm based on types --, it might be impossible to use avr-gcc with that library, even if all the tools talk to each other nicely.
Using a custom linker script and a freestanding environment (no C library at all)
Personally, I can live with immutable data (.rodata sections) being in program memory, with myself having to explicitly copy any immutable data to RAM whenever needed. This way I can use a simple microcontroller-specific linker script and GCC in freestanding mode (no C library at all used), and get complete control over the microcontroller. On the other hand, you lose all the nice predefined macros and functions avr-libc and other C libraries provide.
In this case, you need to understand the AVR architecture to have any hope of getting sensible results. You'll need to set up the interrupt vectors and all kinds of other stuff to get even a minimal do-nothing loop to actually run; personally, I read all the assembly code GCC produces (from my own C source) simply to see if it makes sense, and to try to make sure it all gets processed correctly.
Questions?
I faced a similar problem (inline strings were equal to 0xff,0xff,...) and solved it by just changing a line in my Makefile
from :
.out.hex:
$(OBJCOPY) -j .text \
-j .data \
-O $(HEXFORMAT) $< $@
to :
.out.hex:
$(OBJCOPY) -j .text \
-j .data \
-j .rodata \
-O $(HEXFORMAT) $< $@
or seems better :
.out.hex:
$(OBJCOPY) -R .fuse \
-R .lock \
-R .eeprom \
-O $(HEXFORMAT) $< $@
You can see full problem and answer here : https://www.avrfreaks.net/comment/2943846#comment-2943846
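A small sketch of why 0xff bytes like these show up as -1 in the original question. It assumes that plain char is signed (the avr-gcc default) and that the bytes read are 0xFF, as erased flash is. The helper as_signed_char is hypothetical.

```c
#include <stdint.h>

/* Interpret a raw byte the way a signed 8-bit char does. */
static int as_signed_char(uint8_t raw)
{
    return (int8_t)raw;
}
```

So a comparison such as str[0] == -1 succeeds whenever the string's bytes were never written to the place the code reads them from.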
I just cannot find the solution to this issue..
What I'm trying to do is calling an assembly function using gcc. Just take a look:
// Somewhere in start.s
global _start_thread
_start_thread:
; ...
// Somewhere in UserThread.cpp
extern void _start_thread( pointer );
static void UserMainHack()
{
_start_thread(((UserThread*)currentThread)->getUserMain());
}
Thanks for any help..
Did you know that many C toolchains automatically add a leading underscore when looking for identifiers? So in the C source (not the assembler source), just remove the leading underscore:
extern void start_thread( pointer );
static void UserMainHack()
{
start_thread(((UserThread*)currentThread)->getUserMain());
}
Give your function [declaration] assembly linkage by using an "Asm Label":
extern void start_thread(pointer) __asm__("start_thread");
(and have the .global on the asm side match it.)
It works much like extern "C" in that it can be used for both functions and variables, and that it's one-sided (but on the C side this time).
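A minimal sketch of the asm-label form, assuming GCC or Clang. The symbol name start_thread_entry and the int signature are hypothetical, chosen so the example can stand alone. In a real project the body would live in the assembly file under the matching global label, and only the declaration would appear in C.

```c
/*
 * The C name is start_thread; the symbol seen by the assembler and
 * linker is exactly "start_thread_entry", with no prefix added.
 */
int start_thread(int arg) __asm__("start_thread_entry");

/* Stand-in definition so the sketch links on its own. */
int start_thread(int arg)
{
    return arg + 1;
}
```

Running nm on the object file should show start_thread_entry rather than start_thread, which is the name the assembly side has to export.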
The RealView ARM C Compiler supports placing a variable at a given memory address using the variable attribute at(address):
int var __attribute__((at(0x40001000)));
var = 4; // changes the memory located at 0x40001000
Does GCC have a similar variable attribute?
I don't know, but you can easily create a workaround like this:
int *var = (int*)0x40001000;
*var = 4;
It's not exactly the same thing, but in most situations a perfect substitute. It will work with any compiler, not just GCC.
If you use GCC, I assume you also use GNU ld (although it is not a certainty, of course) and ld has support for placing variables wherever you want them.
I imagine letting the linker do that job is pretty common.
Inspired by the answer by @rib, I'll add that if the absolute address is for some control register, I'd add volatile to the pointer definition. If it is just RAM, it doesn't matter.
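The pointer approach with volatile can be wrapped in a small accessor. This is a sketch only: REG32, reg_write and reg_read are hypothetical names, and the address is a parameter so the code can be exercised on a host against an ordinary variable instead of a real register.

```c
#include <stdint.h>

/* Treat an integer address as a 32-bit memory-mapped register. */
#define REG32(addr) (*(volatile uint32_t *)(uintptr_t)(addr))

/* volatile forces every access to actually reach memory. */
static inline void reg_write(uintptr_t addr, uint32_t value)
{
    REG32(addr) = value;
}

static inline uint32_t reg_read(uintptr_t addr)
{
    return REG32(addr);
}
```

On the target from the question this would be reg_write(0x40001000, 4).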
You could use the section attributes and an ld linker script to define the desired address for that section. This is probably messier than your alternatives, but it is an option.
Minimal runnable linker script example
The technique was mentioned at: https://stackoverflow.com/a/4081574/895245 but now I will provide a concrete example.
main.c
#include <stdio.h>
int myvar __attribute__((section(".mySection"))) = 0x9ABCDEF0;
int main(void) {
printf("adr %p\n", (void*)&myvar);
printf("val 0x%x\n", myvar);
myvar = 0;
printf("val 0x%x\n", myvar);
return 0;
}
link.ld
SECTIONS
{
.mySegment 0x12345678 : {KEEP(*(.mySection))}
}
GitHub upstream.
Compile and run:
gcc -fno-pie -no-pie -o main.out -std=c99 -Wall -Wextra -pedantic link.ld main.c
./main.out
Output:
adr 0x12345678
val 0x9abcdef0
val 0x0
So we see that it was put at the desired address.
I cannot find where this is documented in the GCC manual, but the following syntax:
gcc link.ld main.c
seems to append the given linker script to the default one that would be used.
-fno-pie -no-pie is required, because the Ubuntu toolchain is now configured to generate PIE executables by default, which leads the Linux kernel to place the executable on a different address every time, which messes with our experiment. See also: What is the -fPIE option for position-independent executables in gcc and ld?
TODO: compilation produces a warning:
/usr/bin/x86_64-linux-gnu-ld: warning: link.ld contains output sections; did you forget -T?
Am I doing something wrong? How to get rid of it? See also: How to remove warning: link.res contains output sections; did you forget -T?
Tested on Ubuntu 18.10, GCC 8.2.0.
You answered your question,
In your link above it states:
With the GNU GCC Compiler you may use only pointer definitions to access absolute memory locations. For example:
#define IOPIN0 (*((volatile unsigned long *) 0xE0028000))
IOPIN0 = 0x4;
Btw http://gcc.gnu.org/onlinedocs/gcc-4.5.0/gcc/Variable-Attributes.html#Variable%20Attributes
Here is one solution that actually reserves space at a fixed address in memory without having to edit the linker file:
extern const uint8_t dev_serial[12];
asm(".equ dev_serial, 0x1FFFF7E8");
/* or asm("dev_serial = 0x1FFFF7E8"); */
...
for (i = 0 ; i < sizeof(dev_serial); i++)
printf((char *)"%02x ", dev_serial[i]);
In GCC you can place variable into specific section:
__attribute__((section (".foo"))) static uint8_t * _rxBuffer;
or
static uint8_t * _rxBuffer __attribute__((section (".foo")));
and then specify address of the section in GNU Linker Memory Settings:
.foo=0x800000
I had a similar issue. I wanted to allocate a variable in my defined section at a special offset. At the same time I wanted the code to be portable (no explicit memory address in my C code). So I defined the RAM section in the linker script, and defined an array with the same length as my section (the .noinit section is 0x0F bytes long).
uint8_t no_init_sec[0x0f] __attribute__ ((section (".noinit")));
This array maps all locations of this section. This solution is not suitable when the section is large as the unused locations in the allocated array will be a wasted space in the data memory.
In my opinion, the right answer is the "Minimal runnable linker script example" one.
However, there was something not mentioned there:
If the variable is not used in code (e.g. the variable holds read-only data such as version...), it is necessary to add the 'used' attribute.
Refer to my answer at https://stackoverflow.com/a/75468786/3887115.
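A minimal sketch of the 'used' attribute combined with a section, assuming GCC or Clang on an ELF target. The section name .fw_version and the variable name are hypothetical. The linker script would still have to place that section at the wanted address.

```c
/*
 * No code references this object, so without 'used' the compiler is
 * free to drop it. ".fw_version" is a hypothetical section name.
 */
static const char firmware_version[]
    __attribute__((used, section(".fw_version"))) = "1.0.0";
```

The attribute only stops the compiler from discarding the object. If the link uses --gc-sections, the linker script may also need KEEP(...) around the section, as in the linker script example above.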