How do you get a pointer to the .text section? - c

How do you get a pointer to the .text section of memory for a program from within that program? I also need the length of the section to do a "Flash to Memory" compare as part of a continuous selftest that runs in the background.
The toolset automatically generates the linker .cmd file for the tools I'm using, and the Board Support Package for the board I'm using requires that I use the generated .cmd file instead of making my own. (There is no makefile either, so I can't add a script to muck with it afterwards.)
Edit:
I'm working with a TI TMS 6713 DSP using the Code Composer 3.1 environment. The card I'm using was contracted by our customer and produced by another organization, so I can't really point you to any info on it. However, the BSP is dependent upon TI's "DSP BIOS" config tool, and I can't really fudge the settings too much without digging into an out-of-scope effort.

You need to put "variables" in the linker script.
In one of my projects I have this in one of my sections:
__FlashStart = .;
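In the C program I have this (note the C name has one underscore fewer; that matches toolchains which prepend an underscore to C identifiers, otherwise the two names must be identical):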
extern unsigned long int _FlashStart;
unsigned long int address = (unsigned long int)&_FlashStart;

It would definitely be easier if you could modify the linker script. Since you cannot, it is possible to extract section names, addresses, and sizes from the program binary. For example, here is how one would use libbfd to examine all code sections.
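#include <stdio.h>
#include <bfd.h>

bfd *abfd;
asection *p;
const char *filename = "/path/to/my/file";

/* Open the binary for reading; NULL selects the default target. */
if ((abfd = bfd_openr(filename, NULL)) == NULL) {
    /* ... error handling */
}
/* Make sure the file really is an object file or executable. */
if (!bfd_check_format(abfd, bfd_object)) {
    /* ... error handling */
}
/* Walk every section and report the ones that contain code.
   Note: newer binutils releases changed these accessors to take
   only the section argument. */
for (p = abfd->sections; p != NULL; p = p->next) {
    bfd_vma base_addr = bfd_section_vma(abfd, p);
    bfd_size_type size = bfd_section_size(abfd, p);
    const char *name = bfd_section_name(abfd, p);
    flagword flags = bfd_get_section_flags(abfd, p);
    if (flags & SEC_CODE) {
        /* bfd_vma and bfd_size_type are integers, not pointers */
        printf("%s: addr=0x%lx size=%lu\n", name,
               (unsigned long)base_addr, (unsigned long)size);
    }
}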
If you only want to look at the .text segment, you'd strcmp against the section name.
The downside of this approach? Libbfd is licensed under the GPL, so your entire project would be encumbered with the GPL. For a commercial project, this might be a non-starter.
If your binary is in ELF format, you could use libelf instead. I'm not familiar with how the libelf APIs work, so I can't provide sample code. The Linux libelf is also GPL, but I believe the BSD projects have their own libelf which you could use.
Edit: as you're working on a DSP in a minimal real-time OS environment, this answer isn't going to work. Sorry, I tried.

Could you clarify which toolchain and architecture you are interested in?
On the compiler I am using right now (IAR ARM C/C++) there are operators built into the compiler which return the segment begin address __sfb(...), segment end address __sfe(...), and segment size __sfs(...).

The symbols you're looking for are __text__ and __etext__ which point to the start and end of the .text section, respectively.
You may find the generated .map file useful, as it lists all the symbols and sections defined in your application.
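Once start and end symbols for .text are available, the compare itself is small. The following is only a sketch: first_mismatch is a hypothetical helper, and the symbol names and flash address in the usage comment are assumptions that depend on your toolchain and board.

```c
#include <stddef.h>

/*
 * Compare the memory-resident copy of a region against its reference
 * image (for example, the copy held in flash).
 * Returns the offset of the first differing byte, or len if both match.
 */
size_t first_mismatch(const unsigned char *running,
                      const unsigned char *reference,
                      size_t len)
{
    size_t i;
    for (i = 0; i < len; ++i) {
        if (running[i] != reference[i]) {
            return i;
        }
    }
    return len;
}

/*
 * Hypothetical use on the target (all names below are assumptions):
 *
 *   extern const unsigned char text_start[], text_end[];
 *   size_t len = (size_t)(text_end - text_start);
 *   if (first_mismatch(text_start, flash_image, len) != len) {
 *       // report the self-test failure
 *   }
 */
```

For a background self-test you would probably call this on a small slice of the region per iteration rather than on the whole section at once.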

Related

How can I make sure that my struct fits in a memory area defined in a linker script?

I'm having a problem very similar to this one, but no answer there is helping me. Building with gcc on ARM Cortex M4.
I have:
a memory area defined in a linker script
a complex structure, the size of which is computed and stored in a define in a header file
I would like:
a compile or link time error, if that structure does not fit in the memory area.
I tried (like the person asking the question I linked to above):
importing linker symbols with extern uint8_t __AreaStart[]; and extern uint8_t __AreaEnd[]; from the linker script. No compile time error, which makes sense since the values in the Area symbols are not known at compile time.
I could see:
making ASSERTS in the linker script, but that would mean giving the size of the struct to the linker, and I'm not sure how to do that. For one, the size is currently in a pre-processor macro, not in an actual C symbol (it would be neat not to spend actual memory for communicating size from C to the linker).
giving the struct type to the linker, so if I could get the equivalent of sizeof(type) in the linker script.
actually defining a variable of that type in the memory area, in the C file. If it doesn't fit, the linker should complain. The problem is that this area holds user data, and needs to stay untouched over reprogramming. It cannot be part of the final binary, or user data would get overwritten. I could make an additional separate application just for the sake of checking, but I feel there must be a simpler solution (as of today, the Area does not even have a Section. Maybe add a NOLOAD section there?).
How should I go about failing at building, knowing that the size of the struct is available in a macro, generated at each build?
Background: the struct is generated by protobuf, that's why the size is considered variable. Maybe I could make the check after generating the struct.
A solution is to create a section in the Area in the linker script:
SECTIONS
{
.config_no_load (NOLOAD) :
{
. = ALIGN(4);
*(.config_no_load*);
KEEP(*(.config_no_load))
} > CONFIG1
}
KEEP prevents the linker's section garbage collection (--gc-sections) from discarding the section even though nothing references it. CONFIG1 is the Area:
MEMORY
{
/* ... */
/* Configuration areas (App data) */
CONFIG1 (rx) : ORIGIN = 0x000f6000, LENGTH = 0x0100
}
Then in the C code, define a variable that is too large:
__attribute__((section(".config_no_load"))) __attribute__((unused))
static uint8_t volatile fake_config_header[0x200] = {0};
This results in an error such as:
LD application.elf
/usr/lib/gcc/arm-none-eabi/9.2.1/../../../arm-none-eabi/bin/ld: application.elf section `.config_no_load' will not fit in region `CONFIG1'
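If the region length can also be made visible to C, a compile-time check is another option. This is a sketch under assumptions: CONFIG1_LENGTH is a hypothetical macro that has to be kept in sync with LENGTH = 0x0100 in the linker script, and CONFIG_STRUCT_SIZE stands in for the generated size macro from the question.

```c
#include <stddef.h>

/* Assumption: mirrors LENGTH = 0x0100 of CONFIG1 in the linker script.
 * The two values must be kept in sync, by hand or via a shared header. */
#define CONFIG1_LENGTH 0x0100u

/* Hypothetical stand-in for the size macro generated at each build. */
#define CONFIG_STRUCT_SIZE 0x0080u

/* C11: compilation fails here if the struct outgrows the region. */
_Static_assert(CONFIG_STRUCT_SIZE <= CONFIG1_LENGTH,
               "config struct does not fit in CONFIG1");

/* Bytes left over in the region, useful for logging or further checks. */
size_t config_spare_bytes(void)
{
    return (size_t)(CONFIG1_LENGTH - CONFIG_STRUCT_SIZE);
}
```

The drawback compared with the linker-side check is the duplicated length, so this fits best when the linker script and the header are generated from the same source.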

How to specify the memory location (fast/slow) of functions/vars in C99?

On many embedded architectures, it is possible to run the code or store data either into the internal RAM (fast access) or the external SDRAM (slow access).
On architectures like SHARC processors it is possible to define the memory region where a function will be linked to.
segment("seg_ext_dm32") void foo( void ); // External memory 32-bit location
Unfortunately the specifier segment("seg_ext_dm32") is not really ANSI and I cannot really omit it on my generic libraries that could be unit-tested on a different architecture (x86 for instance).
So I am looking for a more generic solution to classify my functions/variables to be stored either in a slow or a fast memory segment. Here is an example:
__slow void fft_configure( int parameter );
__fast void fft_tick();
What would be the most common way to do this?
Of course one easy way to do it is to add a general header file to my specific compiler to define what __slow or __fast would be:
In my main file:
#ifndef __slow
#define __slow /* nothing */
#endif
#ifndef __fast
#define __fast /* nothing */
#endif
On my compiler's command line (quoted so the shell passes the parentheses and double quotes through):
cc '-D__slow=segment("seg_ext_dm32")' '-D__fast=segment("seg_dm32")'
But I assume this is not the best solution.
There are a LOT of ways to skin this cat, some more portable than others.
Probably the most portable method would be to segregate your functions into separate source files that are 'fast' or 'slow' (or even finer-grained than that by putting one function group per file), and then have the linker descriptor file deal with sticking the segments where you want. This keeps all the non-standard stuff out of your source files and puts it in one spot.
The linker descriptor file will have to be managed by the person using the library, but they'll have to do that anyways to properly locate any segments into 'fast' and 'slow' memory. With this method, they'll just have to specify the right .o files in the load segment they've defined in the right place, rather than relying on the compiler to emit the catchall name you've chosen.
This is what I do.
I create an attribute that I can change depending on the compiler.
For my embedded application I use the following
#ifdef IS_EMBEDDED
#define FAST_DATA __attribute__ ((section ("fast_data")))
#else
#define FAST_DATA /* DO NOTHING */
#endif
Then when I create global data that needs to be inside of the fast data section I apply the section location as follows.
char fastStack[1024*5] FAST_DATA;
Finally inside of my linker script.
MEMORY
{
fastMemory : ORIGIN = 0x00000050, LENGTH = 0x0000ffff
DDR3 : ORIGIN = 0x60000000, LENGTH = 0x8000000
FLASH : ORIGIN = 0xC0000000, LENGTH = 0x02000000
}
...
.fast_data : {
__fast_data_start = .;
. = ALIGN(8);
*(fast_data)
. = ALIGN(8);
__fast_data_end = .;
} > fastMemory
...
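The same macro trick works for functions as well as data. The sketch below reuses the fft_configure and fft_tick names from the question; the FAST_CODE and SLOW_CODE macros, the section names, and the fft_tick_count helper are hypothetical, and the attribute syntax assumes a GCC-compatible compiler on the target.

```c
/* Assumption: IS_EMBEDDED is defined only for the target build. */
#ifdef IS_EMBEDDED
#define FAST_CODE __attribute__((section(".fast_text"))) /* hypothetical name */
#define SLOW_CODE __attribute__((section(".slow_text"))) /* hypothetical name */
#else
#define FAST_CODE /* host build: default placement */
#define SLOW_CODE /* host build: default placement */
#endif

static unsigned fft_step = 1;
static unsigned fft_ticks = 0;

/* Called rarely, so it can live in slow memory. */
SLOW_CODE void fft_configure(int parameter)
{
    fft_step = (unsigned)parameter;
    fft_ticks = 0;
}

/* Called at high frequency, so it goes to fast memory. */
FAST_CODE void fft_tick(void)
{
    fft_ticks += fft_step;
}

unsigned fft_tick_count(void)
{
    return fft_ticks;
}
```

On a host build (x86 unit tests, for instance) both macros expand to nothing, so the library compiles unchanged; the linker script on the target then maps the two sections to the right memories.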

Pointer to end of a function code

I understand that a function pointer points to the starting address of the code for a function. But is there any way to be able to point to the end of the code of a function as well?
Edit: Specifically on an embedded system with a single processor and no virtual memory. No optimisation too. A gcc compiler for our custom processor.
I wish to know the complete address range of my function.
If you put the function within its own special linker section, then your toolchain might provide a pointer to the end (and the beginning) of the linker section. For example, with Green Hills Software (GHS) MULTI compiler I believe you can do something like this:
#pragma ghs section text=".mysection"
void MyFunction(void) { }
#pragma ghs section
That will tell the linker to locate the code for MyFunction in .mysection. Then in your code you can declare the following pointers, which point to the beginning and end of the section. The GHS linker provides the definitions automatically.
extern char __ghsbegin_mysection[];
extern char __ghsend_mysection[];
I don't know whether GCC supports similar functionality.
You didn't say why you need this information, but on some embedded systems you have to copy a single function from flash to RAM in order to (re)program the flash.
Normally you place these functions in a new, unique section, and depending on your linker you can copy this section to the new (RAM) location with pure C or with assembler.
You also need to tell the linker that the code will run from a different address than the one where it is stored in flash.
In a project the flash.c could look like
#pragma define_section pram_code "pram_code.text" RWX
#pragma section pram_code begin
uint16_t flash_command(uint16_t cmd, uint16_t *addr, uint16_t *data, uint16_t cnt)
{
...
}
#pragma section pram_code end
The linker command file looks like
.prog_in_p_flash_ROM : AT(Fpflash_mirror) {
Fpram_start = .;
# OBJECT(Fflash_command,flash.c)
* (pram_code.text)
. = ALIGN(2);
# save data end and calculate data block size
Fpram_end = .;
Fpram_size = Fpram_end - Fpram_start;
} > .pRAM
But as others said, this is very toolchain specific.
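The copy step itself can be written in plain C. This is a minimal sketch, assuming the start, size and load address come from linker symbols like Fpram_start, Fpram_size and Fpflash_mirror above; relocate_section is a hypothetical helper, and how the linker symbols are spelled in C depends on the toolchain.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy a code section from its load address (flash) to its run address
 * (RAM) and verify the copy.
 * Returns 1 if the RAM copy matches the flash image, 0 otherwise.
 */
int relocate_section(void *run_addr, const void *load_addr, size_t size)
{
    memcpy(run_addr, load_addr, size);
    return memcmp(run_addr, load_addr, size) == 0;
}
```

On targets with an instruction cache, the cache usually has to be invalidated after the copy and before the relocated function is called.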
There is no way in C to point to the end of a function. A C compiler has a lot of latitude in how it arranges the machine code it emits during compilation. With various optimization settings, a C compiler may actually merge machine code, intermingling the machine code of the various functions.
On top of whatever the C compiler does, the linker and the loader also rearrange things when the compiled pieces of object code are linked together and the application is loaded, possibly along with various kinds of shared libraries.
In the complex running environment of modern operating systems and modern development tool chains, unless the language provides a specific mechanism for doing something, it is prudent to not try to get fancy leaving yourself open to an application which suddenly stops working due to changes in the operating environment.
In most cases if you use a non-optimizing setting of the compiler with static linked libraries, the symbol map that most linkers provide will give you a good idea as to where functions begin and end. However the only thing you can really depend on is knowing the address of the function entry points.
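In some implementations (including gcc) you could do something like this (but it's not guaranteed, and lots of implementation details could affect it):
#include <stdio.h>
#include <stdint.h>

int foo(void) {
    printf("testing\n");
    return 7;
}
void foo_end(void) { }
int sizeof_foo(void) {
    // assumes the compiler emits foo_end directly after foo
    // (no reordering, no code size optimizations across functions)
    // function could be smaller than reported
    // reports size, not range
    // converting function pointers to integers is implementation-defined
    return (int)((uintptr_t)foo_end - (uintptr_t)foo);
}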

creating a C function with a given size in the text segment

I'm programming an embedded powerpc 32 system with a 32 kbyte 8-way set associative L2 instruction cache. To avoid cache thrashing we align functions in a way such that the text of a set of functions called at a high frequency (think interrupt code) ends up in separate cache sets. We do this by inserting dummy functions as needed, e.g.
void high_freq1(void)
{
...
}
void dummy(void)
{
__asm__(/* Silly opcodes to fill ~100 to ~1000 bytes of text segment */);
}
void high_freq2(void)
{
...
}
This strikes me as ugly and suboptimal. What I'd like to do is
avoid __asm__ entirely and use pure C89 (maybe C99)
find a way to create the needed dummy() spacer that the GCC optimizer does not touch
the size of the dummy() spacer should be configurable as a multiple of 4 bytes. Typical spacers are 260 to 1000 bytes.
should be feasible for a set of about 50 functions out of a total of 500 functions
I'm also willing to explore entirely new techniques of placing a set of selected functions in a way so they aren't mapped to the same cache lines. Can a linker script do this?
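To check whether two function addresses from the .map file would compete for the same cache set, the set index can be computed by hand. This is a sketch under assumptions: the 32-byte line size is a guess that is not stated above (check your core's manual), the index is assumed to come straight from the address bits, and cache_set_index and same_cache_set are hypothetical helpers.

```c
#include <stdint.h>

#define CACHE_SIZE_BYTES 32768u /* 32 KB, from the description above */
#define CACHE_WAYS       8u     /* 8-way set associative */
#define CACHE_LINE_BYTES 32u    /* ASSUMPTION: not given in the question */
#define CACHE_SETS (CACHE_SIZE_BYTES / (CACHE_WAYS * CACHE_LINE_BYTES))

/* Set that the cache line containing addr maps to. */
uint32_t cache_set_index(uint32_t addr)
{
    return (addr / CACHE_LINE_BYTES) % CACHE_SETS;
}

/* Non-zero if both addresses fall into the same cache set. */
int same_cache_set(uint32_t a, uint32_t b)
{
    return cache_set_index(a) == cache_set_index(b);
}
```

With these numbers there are 128 sets and each way covers 4 KB, so two addresses that differ by a multiple of 4096 bytes land in the same set.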
Use GCC's __attribute__(( aligned(size) )).
Or, pass -falign-functions=n on your GCC command line.
GCC Function Attributes
GCC Optimize Options
Maybe linker scripts are the way to go. The GNU linker can use these, I think... I've used LD files for the AVR and on MQX, both of which were using GCC-based compilers... might help...
You can define your memory sections etc. and what goes where... Each time I come to write one, it's been so long since the last that I have to read up again...
Have a search for SVR3-style command files to gen up.
DISCLAIMER: Following example for a very specific compiler... but the SVR3-like format is pretty general... you'll have to read up for your system
For example you can use commands like...
ApplicationStart = 0x...;
MemoryBlockSize = 0x...;
ApplicationDataSize = 0x...;
ApplicationLength = MemoryBlockSize - ApplicationDataSize;
MEMORY {
RAM: ORIGIN = 0x... LENGTH = 1M
ROM: ORIGIN = ApplicationStart LENGTH = ApplicationLength
}
This defines two memory regions (RAM and ROM) for the linker. Then you can say things like
SECTIONS
{
GROUP :
{
.text :
{
* (.text)
* (.init , '.init$*')
* (.fini , '.fini$*')
}
.my_special_text ALIGN(32):
{
* (.my_special_text)
}
.initdat ALIGN(4):
// Blah blah
} > ROM
// SNIP
}
The SECTIONS command tells the linker how to map input sections into output sections, and how to place the output sections in memory... Here we're saying what is going into the ROM region, which we defined in the MEMORY definition above. The bit possibly of interest to you is .my_special_text. In your code you can then do things like...
__attribute__ ((section(".my_special_text")))
void MySpecialFunction(...)
{
....
}
The linker will put any function preceded by the __attribute__ statement into the .my_special_text section. In the above example this is placed into ROM on the next 32-byte aligned boundary (because of ALIGN(32)) after the text section, but you can put it anywhere you like. So you could make a few sections, one for each of the functions you describe, and make sure the addresses won't cause clashes...
You can get the size and memory location of the section using linker-defined variables of the form
extern char _fsection_name[]; // Set to the address of the start of section_name
extern char _esection_name[]; // Set to the first byte following section_name
So for this example...
extern char _fmy_special_text[]; // Set to the address of the start of .my_special_text
extern char _emy_special_text[]; // Set to the first byte following .my_special_text
If you are willing to expend some effort, you can use
__attribute__((section(".text.hotpath.a")))
to place the function into a separate section, and then in a custom linker script explicitly place the functions.
This gives you a bit more fine-grained control than simply asking for the functions to be aligned, but requires more hand-holding.
Example, assuming that you want to lock 4KiB into cache:
SECTIONS {
.text.hotpath.one BLOCK(0x1000) : {
*(.text.hotpath.a)
*(.text.hotpath.b)
}
}
ASSERT(SIZEOF(.text.hotpath.one) <= 0x1000, "Hot Path functions do not fit into 4KiB")
This will make sure the hot path functions a and b are next to each other and both fit into the same block of 4 KiB that is aligned on a 4 KiB boundary, so you can simply lock that page into the cache; if the code doesn't fit, you get an error.
You can even use
NOCROSSREFS(.text.hotpath.one .text)
to forbid hot path functions calling other functions.
Assuming you're using GCC and GAS, this may be a simple solution for you:
void high_freq1(void)
{
...
}
asm(".org .+288"); /* Advance location by 288 bytes */
void high_freq2(void)
{
...
}
You could, possibly, even use it to set absolute locations for the functions rather than using relative increments in address, which would insulate you from consequences due to the functions changing in size when/if you modify them.
It's not pure C89, for sure, but it may be less ugly than using dummy functions. :)
(Then again, it should be mentioned that linker scripts aren't standardized either.)
EDIT: As noted in the comments, it seems to be important to pass the -fno-toplevel-reorder flag to GCC in this case.

Avoiding the main (entry point) in a C program

Is it possible to avoid the entry point (main) in a C program? In the code below, is it possible to invoke func() without calling it via main()? If yes, how is it done, when would it be required, and why is such a provision given?
int func(void)
{
printf("This is func \n");
return 0;
}
int main(void)
{
printf("This is main \n");
return 0;
}
If you're using gcc, I found a thread that said you can use the -e command-line parameter to specify a different entry point; so you could use func as your entry point, which would leave main unused.
Note that this doesn't actually let you call another routine instead of main. Instead, it lets you call another routine instead of _start, which is the libc startup routine -- it does some setup and then it calls main. So if you do this, you'll lose some of the initialization code that's built into your runtime library, which might include things like parsing command-line arguments. Read up on this parameter before using it.
If you're using another compiler, there may or may not be a parameter for this.
When building embedded systems firmware to run directly from ROM, I often will avoid naming the entry point main() to emphasize to a code reviewer the special nature of the code. In these cases, I am supplying a customized version of the C runtime startup module, so it is easy to replace its call to main() with another name such as BootLoader().
I (or my vendor) almost always have to customize the C runtime startup in these systems because it isn't unusual for the RAM to require initialization code for it to begin operating correctly. For instance, typical DRAM chips require a surprising amount of configuration of their controlling hardware, and often require a substantial (thousands of bus clock cycles) delay before they are useful. Until that is complete, there may not even be a place to put the call stack so the startup code may not be able to call any functions. Even if the RAM devices are operational at power on, there is almost always some amount of chip select hardware or an FPGA or two that requires initialization before it is safe to let the C runtime start its initialization.
When a program written in C loads and starts, some component is responsible for making the environment in which main() is called exist. In Unix, linux, Windows, and other interactive environments, much of that effort is a natural consequence of the OS component that loads the program. However, even in these environments there is some amount of initialization work to do before main() can be called. If the code is really C++, then there can be a substantial amount of work that includes calling the constructors for all global object instances.
The details of all of this are handled by the linker and its configuration and control files. The linker ld(1) has a very elaborate control file that tells it exactly what segments to include in the output, at what addresses, and in what order. Finding the linker control file you are implicitly using for your toolchain and reading it can be instructive, as can the reference manual for the linker itself and the ABI standard your executables must follow in order to run.
Edit: To more directly answer the question as asked in a more common context: "Can you call foo instead of main?" The answer is "Maybe, but only by being tricky".
On Windows, an executable and a DLL are very nearly the same format of file. It is possible to write a program that loads an arbitrary DLL named at runtime, and locates an arbitrary function within it, and calls it. One such program actually ships as part of a standard Windows distribution: rundll32.exe.
Since a .EXE file can be loaded and inspected by the same APIs that handle .DLL files, in principle if the .EXE has an EXPORTS section that names the function foo, then a similar utility could be written to load and invoke it. You don't need to do anything special with main, of course, since that will be the natural entry point. Of course, the C runtime that was initialized in your utility might not be the same C runtime that was linked with your executable. (Google for "DLL Hell" for hint.) In that case, your utility might need to be smarter. For instance, it could act as a debugger, load the EXE with a break point at main, run to that break point, then change the PC to point at or into foo and continue from there.
Some kind of similar trickery might be possible on Linux since .so files are also similar in some respects to true executables. Certainly, the approach of acting like a debugger could be made to work.
A rule of thumb would be that the loader supplied by the system would always run main. With sufficient authority and competence you could theoretically write a different loader that did something else.
Rename main to be func and func to be main, and call func from main.
If you have access to the source, you can do this and it's easy.
If you are using an open source compiler such as GCC or a compiler targeted at embedded systems, you can modify the C runtime startup (CRT) to start at any entry point you need. In GCC this code is in crt0.s. Generally this code is partially or wholly in assembler. For most embedded systems compilers, example or default start-up code will be provided.
However a simpler approach is to simply 'hide' main() in a static library that you link to your code. If that implementation of main() looks like:
int main(void)
{
func() ;
}
Then it will look to all intents and purposes as if the user entry point is func(). This is how many application frameworks with entry points other than main() work. Note that because it is in a static library, any user definition of main() will override that static library version.
The solution depends on the compiler and linker you use. In every case, main is not the real entry point of the application: the real entry point performs some initialization and then calls, for example, main. If you write programs for Windows using Visual Studio, you can use the linker's /ENTRY switch to override the default entry point mainCRTStartup and call func() instead of main():
#ifdef NDEBUG
void mainCRTStartup()
{
ExitProcess (func());
}
#endif
This is standard practice if you want to write the smallest possible application. In that case you are restricted in your use of C runtime functions and should use Windows API functions instead. For example, instead of printf("This is func \n") you should use OutputString(TEXT("This is func \n")), where OutputString is implemented using only WriteFile or WriteConsole:
static HANDLE g_hStdOutput = INVALID_HANDLE_VALUE;
static BOOL g_bConsoleOutput = TRUE;
BOOL InitializeStdOutput()
{
g_hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
if (g_hStdOutput == INVALID_HANDLE_VALUE)
return FALSE;
g_bConsoleOutput = (GetFileType (g_hStdOutput) & ~FILE_TYPE_REMOTE) != FILE_TYPE_DISK;
#ifdef UNICODE
if (!g_bConsoleOutput && GetFileSize (g_hStdOutput, NULL) == 0) {
DWORD n;
WriteFile (g_hStdOutput, "\xFF\xFE", 2, &n, NULL);
}
#endif
return TRUE;
}
void Output (LPCTSTR pszString, UINT uStringLength)
{
DWORD n;
if (g_bConsoleOutput) {
#ifdef UNICODE
WriteConsole (g_hStdOutput, pszString, uStringLength, &n, NULL);
#else
CHAR szOemString[MAX_PATH];
CharToOem (pszString, szOemString);
WriteConsole (g_hStdOutput, szOemString, uStringLength, &n, NULL);
#endif
}
else
#ifdef UNICODE
WriteFile (g_hStdOutput, pszString, uStringLength * sizeof (TCHAR), &n, NULL);
#else
{
//PSTR pszOemString = _alloca ((uStringLength + sizeof(DWORD)));
CHAR szOemString[MAX_PATH];
CharToOem (pszString, szOemString);
WriteFile (g_hStdOutput, szOemString, uStringLength, &n, NULL);
}
#endif
}
void OutputString (LPCTSTR pszString)
{
Output (pszString, lstrlen (pszString));
}
This really depends how you are invoking the binary, and is going to be reasonably platform and environment specific. The most obvious answer is to simply rename the "main" symbol to something else and call "func" "main", but I suspect that's not what you are trying to do.

Resources