Static linking causes a segmentation fault - C

I have a problem linking my C application statically. All the libraries exist as .a files, and just a month ago I was able to link my application statically without any error. But as soon as I activate the static linking option in Eclipse, it compiles without errors; when I try to run it, though, I get a "Segmentation fault" and it stops.
I tried to debug it, and this is what Eclipse shows me:
No source available for "_start() at 0x4017f7"
No source available for "__libc_start_main() at 0x522389"
No source available for "__libc_csu_init() at 0x5228f7"
No source available for "frame_dummy() at 0x4018bd"
No source available for "__register_frame_info_bases() at 0x52194b"
No source available for "0x0"
I use the following libraries: -lgcrypt -lgpg-error -lmxml -lpthread -lrt. Any ideas what the problem could be? I can also post the gdb traces, but they're long.
Linker command:
Invoking: GCC C Linker
gcc -static -o "X - Client" ./src/lib/stopwatch-0.2/stopwatch.o ./src/lib/rscode-1.3/berlekamp.o ./src/lib/rscode-1.3/crcgen.o ./src/lib/rscode-1.3/galois.o ./src/lib/rscode-1.3/rs.o ./src/lib/Salsa20/ecrypt.o ./src/lib/helper-Client.o ./src/PoR-Client.o -lgcrypt -lgpg-error -lmxml -lpthread -lrt
Finished building target: X - Client

This is probably not a problem with the linking. More likely you are reading uninitialized memory, or reading or writing past the end of an array.
What happens in such cases is that in one build the memory you are reading just happens to hold a non-crashing value (e.g. you read past the end of an array into an area that contains zeroes), but in another build the data structures are laid out in a different order and you are now reading something with unexpected values.
Or in one build you could be writing past the end into a data structure you no longer need, while in this build the thing past the end of the array is critical.
Also check whether your program behaves differently in debug vs. optimized builds. Optimization changes the layout, padding and initialization of data structures (debug builds, for example, typically zero all memory, and stack frames are padded out with debugging data).
I strongly suggest you run your program through a tool like valgrind. It will find these kinds of problems for you.
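As a concrete illustration (a minimal hypothetical sketch, not taken from the code in question): the heap overrun below may run "fine" or crash depending on what happens to sit next to the allocation, but valgrind's memcheck reports it as an invalid write either way.

#include <stdlib.h>
#include <string.h>

int main(void)
{
    char *buf = malloc(8);      /* room for 8 bytes */
    if (buf == NULL)
        return 1;
    strcpy(buf, "AAAAAAAA");    /* 8 chars + '\0' = 9 bytes written: one past the end */
    free(buf);
    return 0;                   /* may "work" or crash, depending on heap layout */
}

Running the program under valgrind flags the strcpy as an "Invalid write", which is exactly the kind of latent bug that only surfaces when the layout changes, e.g. when switching to static linking.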

Related

GDB cannot step into function, Ozone can

I have a setup like this:
GDB from "GNU Arm Embedded Toolchain" 10.3-2021.10
GDB server from "Segger JLink" 7.54d
JLink Ultra+ connected to my PC and my embedded device
Arm Compiler 6.15
I'm having problems stepping into a certain function from a C module (let's call it "F1"). When trying, I get the error message
Single stepping until exit from function "F1", which has no line number information.
If I use Segger Ozone, with the same .elf file, stepping into "F1" works fine.
I've tried to narrow down the problem and have the following observations:
1. A single line of code from the C module holding "F1" makes the difference. If I remove this line, it works. The line is a simple increment (++) of a static uint32_t variable, and it is in a separate function (i.e. not "F1").
2. If I don't link with the "--inline" option, it stops working, even with the "fix" in (1).
3. All source files (a mix of C and C++ files) are compiled with the -g option.
I may try to reproduce it in a much smaller context that I could share here, but until then I'm hoping for some hints.
Anything is appreciated.
[Update 2021-11-10] Tried with older/newer versions of "GNU Arm Embedded Toolchain" as well as "Segger JLink". Same problem.
[Update 2021-11-10] Compiler/linker command used:
armclang -g --target=arm-arm-none-eabi -mcpu=cortex-m33 -mfloat-abi=soft -MMD -Werror -D__STDC_LIMIT_MACROS -I<my_include_paths>
armlink --inline --info=sizes --info=veneers --info=unused --info=totals --map --symbols --scatter=<my_scatter_file> --list=list.txt

Parse Error in Diab compiler

I'm trying to run LZMA (C version, 7-zip.org/sdk.html) on an MPC5748G from NXP by compiling a simple program to encode/decode a stream, but I get some errors.
The files compile successfully on my laptop, however, and there I was able to run the LZMA application.
Here are the errors that DCC displays:
scons: done reading SConscript files.
scons: Building targets ...
..\tools\wr\mpc5748_wr594\diab\5.9.4.2\WIN32\bin\dcc.exe -c -Xenum-is-best -Xrtti-off -Xexceptions-off -Xforce-declarations -ee1481 -tPPCVLEES:simple -Xsection-split -g3 -XO -Xsize-opt -DTGT_MPC5748_WR594 -DC_DERIVATIVE_MPC5748G -DFREESCALE_OS -DAUTOSAR_OS_USED -DOSDIABPPC -DADC_INTERRUPT_TYPE=MCAL_ISR_TYPE_NONE -DCAN_INTERRUPT_TYPE=MCAL_ISR_TYPE_NONE -DGPT_INTERRUPT_TYPE=MCAL_ISR_TYPE_NONE -DICU_INTERRUPT_TYPE=MCAL_ISR_TYPE_NONE -DLIN_INTERRUPT_TYPE=MCAL_ISR_TYPE_NONE -DPWM_INTERRUPT_TYPE=MCAL_ISR_TYPE_NONE -DSPI_INTERRUPT_TYPE=MCAL_ISR_TYPE_NONE -DTGT_DBG -DTGT_APP -DCFG_CFG -DCFG_CAN -DCFG_CSL -DCFG_MCU -DCFG_DUT -DCFG_MEM -DCFG_MOV -DCFG_GPI -DCFG_GPO -DCFG_ADC -DCFG_SED -DCFG_FRY -DCFG_LPM -DCFG_ETH -IC:\GW_MCU\tools\wr\mpc5748_wr594\diab\5.9.4.2\include -Ibsw\mcal\mcalAS\inc -I. lib\lzma\Alloc.c -o lib\lzma\Alloc.o
..\tools\wr\mpc5748_wr594\diab\5.9.4.2\WIN32\bin\dcc.exe -tPPCVLEES:simple -u__lear_calypso_memory_init -Wl,-Xremove-unused-sections -Wl,-Xunused-sections-list -lc -Wl,-m6 -Wm bsw/mcal/mcalm/linkerDescriptionVLE_App.dld -o out\app\BmwBdc2018GwmDutApp.elf out/app\objToLink.inl 1>out/app/BmwBdc2018GwmDutApp.map
dld: warning: Undefined symbol '__HEAP_END' in file 'sbrk.o(C:\GW_MCU\tools\wr\mpc5748_wr594\diab\5.9.4.2\PPCVLEE\libchar.a)'
dld: warning: Undefined symbol '__HEAP_START' in file 'sbrk.o(C:\GW_MCU\tools\wr\mpc5748_wr594\diab\5.9.4.2\PPCVLEE\libchar.a)'
dld: error: Undefined symbols found - no output written
This error is specific to the Diab compiler, and here is what I've found in the documentation:
Dynamic Memory Allocation - the heap, malloc( ), sbrk( )
malloc( ) allocates memory from a heap managed by function sbrk( ) in src/sbrk.c.
There are two ways to create the heap:
■ Define __HEAP_START and __HEAP_END, typically in a linker command file.
See the files conf/default.dld, conf/sample.dld, and 25.6 Command File
Structure, p.419 for examples.
■ Recompile sbrk.c as follows:
dcc -ttarget -c -D SBRK_SIZE=n sbrk.c
where n is the size of the desired heap in bytes.
I'm not the author of either the target source code or the LZMA SDK C code.
What I have understood is that the LZMA encoder allocates at least 1 MB of RAM, while the MPC5748 provides only 768 KB of RAM.
So I have tagged the question with LZMA and the Diab compiler (no such tag found); only someone who has worked with both can probably help me.
UPDATE:
I removed the warning dld: warning: Undefined symbol 'LzmaEncProps_Init' in file 'lib/lzma/LzmaLib.o' by including the corresponding source file in my makefile; however, the HEAP problem persisted.
The HEAP problem has since disappeared as well; however, the app doesn't run.
Using the TRACE32 debugger I was able to diagnose the source of the error:
at the line
p->probs = (CLzmaProb *)alloc->Alloc(alloc, numProbs * sizeof(CLzmaProb));
This line leaves p->probs empty, so it seems that alloc was not able to allocate the required size.
Thanks
Maybe malloc is not available on the MCU you are using; try static allocation or implement your own allocator (see the sketch below).
You can find a useful explanation here:
https://www.quora.com/Why-is-malloc-harmful-in-embedded-systems
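If malloc cannot be made to work on this target, one option that matches the allocator callbacks shown in the question (alloc->Alloc(alloc, size)) is a fixed static pool with a simple bump allocator. This is only a sketch under that assumption; the exact allocator struct and how it is handed to the LZMA SDK functions depend on the SDK version, so check your headers.

#include <stddef.h>

/* Hypothetical static pool; size it to what the 768 KB of RAM can spare. */
#define POOL_SIZE (512u * 1024u)
static unsigned char pool[POOL_SIZE];
static size_t pool_used;

static void *PoolAlloc(void *p, size_t size)
{
    (void)p;
    size = (size + 7u) & ~(size_t)7u;   /* keep 8-byte alignment */
    if (size > POOL_SIZE - pool_used)
        return NULL;                    /* same failure mode as the empty p->probs above */
    void *block = &pool[pool_used];
    pool_used += size;
    return block;
}

static void PoolFree(void *p, void *address)
{
    (void)p;
    (void)address;                      /* bump allocator: individual frees are no-ops */
}

The two functions would then be plugged into the SDK's allocator object (the struct holding the Alloc/Free pointers) instead of the malloc-based default. Keep in mind that no allocator can conjure 1 MB out of 768 KB, so if the encoder really needs that much, you still have to reduce its memory requirements or run only the decoder on the target.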

How can compiling the same source code generate different object files?

After a long debugging session I've narrowed my problem down to one file. The problem is that the file compiles differently in two different directories, even though everything else is the same.
I'm using CodeSourcery's arm gcc compiler (gcc version 4.3.3, Sourcery G++ Lite 2009q1-161) to compile a simple file. I was using it in one module with no issues and then I copied it to another module to use there. When it compiles, the object file is significantly different. The command line to compile the two files is identical (I used the linux history to make sure), and the 3 include files are also identical copies (checked with diff).
I did a binary compare of the two object files and they have a lot of individual byte differences scattered around. I did an objdump -D of both and compared them and there are a lot of differences. Here is dump1, dump2, and the diff. The command line is "arm-none-eabi-gcc --std=gnu99 -Wall -O3 -g3 -ggdb -Wextra -Wno-unused -c crc.c -o crc.o".
How is this possible? I've also compiled with -S instead of -c and looked at the assembler output and that's identical except for the directory path. So how can the object file be different?
My real problem is that when I try to link the object file for dump2 into my program, I get undefined reference errors, so something in the object is wrong, whereas the object for dump1 gets no such errors and links fine.
In large-scale software, many implementations hash on pointer values. This is one major cause of result randomization: even when the program logic is correct, the order of some internal data structures can differ between runs, which is harmless in most cases.
Also, don't compare the 'objdump -D' output: since you are compiling the code from different directories, the string table, symbol table, DWARF and eh_frame sections are expected to differ, so you will certainly get lots of diff lines.
The only comparison that makes sense is the output of 'objdump -d', which only covers the text section. If the text sections are the same (or similar), the object files can be considered identical.
Most likely your file picks up different include files.
Check that your include paths are exactly the same, including the paths in the include statements themselves; they may point to different directories. C and C++ have a rule that #include "abcd.h" is first resolved relative to the directory of the including file. Check this (see the sketch below).
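A hypothetical layout (file and macro names invented for illustration) showing how that rule can produce different object files from byte-identical sources:

/* Directory layout assumed for this sketch:
 *   module1/crc.c   and   module2/crc.c   -- identical copies
 *   module1/crc.h   -- #define CRC_TABLE_SIZE 256
 *   module2/crc.h   -- #define CRC_TABLE_SIZE 16
 */

/* crc.c (same bytes in both directories): */
#include "crc.h"        /* quoted include: searched first in the directory
                           of the file that contains this #include */

unsigned char crc_table[CRC_TABLE_SIZE];

unsigned int crc_table_bytes(void)
{
    return (unsigned int)sizeof crc_table;  /* 256 in module1, 16 in module2 */
}

If the two local headers differ in more than a macro, for example in function declarations or inline definitions, the same crc.c can compile cleanly in both places yet leave undefined references when the wrong object file is linked.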

Can the object files output by gcc vary between compilations of the same source with the same options?

Does the gcc output of the object file (C language) vary between compilations? There is no time-specific information, no change in compilation options or the source code, and no change in linked libraries or environment variables either. This is a VxWorks MIPS64 cross compiler, if that helps. I personally think it shouldn't change, but I observe that sometimes, seemingly at random, the generated instructions change. I don't know the reason. Can anyone throw some light on this?
How is this built? For example, if I build the very same Linux kernel, it includes a counter that is incremented on each build. GCC has options to use profiling information to guide code generation; if the profiling information changes, so will the code.
What did you analyze? The generated assembly, an objdump of object files or the executable? How did you compare the different versions? Are you sure you looked at executable code, not compiler/assembler/linker timestamps?
Did anything change in the environment? New libraries (and header files/declarations/macro definitions!)? New compiler, linker? New kernel (yes, some header files originate with the kernel source and are shipped with it)?
Any changes in environment variables (another user doing the compiling, a different machine, a different hookup to the net giving a different IP address that makes its way into the build)?
I'd try tracing the build process in detail (run a build and capture the output in a file, and do so again; compare those).
Completely mystified...
I had a similar problem with g++. Pre-4.3 versions produced exactly the same object files each time. With 4.3 (and later?) some of the mangled symbol names are different on each run, even without -g or other recordings. Perhaps they use a time stamp or random number (I hope not). Obviously some of those symbols make it into the .o symbol table and you get a difference.
Stripping the object file(s) makes them equal again (with respect to binary comparison).
g++ -c file.C ; strip file.o; cmp file.o origfile.o
Why should it vary? It is the same result always. Try this:
for i in `seq 1000`; do gcc 1.c; md5sum a.out; done | sort | uniq | wc -l
The answer is always 1. Replace 1.c and a.out to suit your needs.
The above counts how many different executables are generated by gcc when the same source is compiled 1000 times.
I've found that in at least some environments, the same source may yield a different executable if the source tree for the subsequent build is located in a different directory. Example:
Check out a pristine copy of your project to dir1. Do a full rebuild from scratch.
Then, with the same user on the same machine, check out the exact same copy of your source code to dir2 (dir1 != dir2). Do another full rebuild from scratch.
These builds are minutes apart, with no change in the toolchain or any 3rd-party libs or code. A binary comparison shows the source code is the same. However, the executable in dir1 has a different md5sum than the executable in dir2.
If I compare the different executables in BeyondCompare's hex editor, the difference is not just some tiny section that could plausibly be a timestamp.
I do get the same executable if I build in dir1, then rebuild again in dir1. Same if I keep building the same source over and over from dir2.
My only guess is that absolute paths from the include hierarchy are embedded in the executable, as in the example below.
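One concrete mechanism (a hypothetical example, not taken from the build in question): anything that expands __FILE__, such as assert(), stores the source path exactly as it was passed to the compiler, and debug info (-g) records the compilation directory as well. If the build system passes absolute paths, those strings end up in the binary and differ between dir1 and dir2.

#include <assert.h>
#include <stdio.h>

int main(void)
{
    /* __FILE__ is the path exactly as given on the compiler command line;
     * with an absolute path it embeds the checkout directory. */
    printf("compiled from: %s\n", __FILE__);
    assert(1 + 1 == 2);    /* assert() also bakes __FILE__ into the object file */
    return 0;
}

Comparing the strings output (or the DWARF comp_dir attributes) of the two executables would confirm or rule out this guess.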
My gcc sometimes produces different code for exactly the same input; the output object files differ in exactly one byte.
Sometimes this causes linker errors, because one of the possible object files is invalid. Recompiling usually fixes the linker error.
The gcc version is 4.3.4 on SUSE Linux Enterprise.
The gcc parameters are:
cc -std=c++0x -Wall -fno-builtin -march=native -g -I<path1> -I<path2> -I<path3> -o obj/file.o -c file.cpp
If someone experiences the same effect, then please let me know.

gprof : How to generate call graph for functions in shared library that is linked to main program

I am working in a Linux environment. I have two C source packages, train and test_train.
The train package, when compiled, generates libtrain.so.
test_train links against libtrain.so and generates the executable train-test.
Now I want to use gprof to generate a call graph that shows the calling sequence of functions in the main program as well as those inside libtrain.so.
I am compiling and linking both packages with the -pg option, at optimization level -O0.
After I run ./train-test, gmon.out is generated. Then I do:
$ gprof -q ./train-test gmon.out
Here, the output shows the call graph of functions in train-test but not of those in libtrain.so.
What could be the problem?
gprof won't work here; you need to use sprof instead. I found these links helpful:
How to use sprof?
http://greg-n-blog.blogspot.com/2010/01/profiling-shared-library-on-linux-using.html
Summary from the 2nd link:
1. Compile your shared library (libmylib.so) in debug (-g) mode. No -pg.
2. export LD_PROFILE_OUTPUT=`pwd`
3. export LD_PROFILE=libmylib.so
4. rm -f $LD_PROFILE.profile
5. Execute your program that loads libmylib.so.
6. sprof PATH-TO-LIB/$LD_PROFILE $LD_PROFILE.profile -p >log
7. See the log.
I found that in step 2, LD_PROFILE_OUTPUT needs to be an existing directory -- otherwise you get a helpful warning. And in step 3, you might need to specify the library as libmylib.so.X (maybe even .X.Y, not sure) -- otherwise you get no warning whatsoever.
I'm loading my library from Python and didn't have any luck with sprof. Instead, I used oprofile, which was in the Fedora repositories, at least:
operf --callgraph /path/to/mybinary
Wait for your application to finish, or press Ctrl-C to stop profiling. Now let's generate a profile summary:
opreport --callgraph --symbols
See the documentation to interpret it; it's kind of a mess. In the generated report, each symbol is listed in a block of its own. The block's main symbol is the one that is not indented. The items above it are the functions that call it, and the ones below it are the things it calls. The percentages in the section below it are the relative amount of time spent in those callees.
If you're not on Linux (like me on Solaris), you're simply out of luck, as there is no sprof there.
If you have the sources of your library, you can work around the problem by building it as a static library and linking your profiling binary against that instead.
Another way I manage to trace calls to shared libraries is by using truss. With the option -u [!]lib,...:[:][!]func, ... one can get a good picture of the call history of a run. It's not quite the same as profiling, but it can be very useful in some scenarios.

Resources