Storing CRC into an AXF/ELF file - c

I'm currently working on a C program in the LPCXpresso (Eclipse-based) tool-chain on Windows 7, an IDE with gcc targeting an NXP Cortex M3 microprocessor. It provides a simple way to compile, link, and program the microprocessor over JTAG. The result of a build is an AXF file (ELF format) that is loaded by a debug configuration.
The loaded program resides in Flash memory from 0x00000 to 0x3FFFB. I'd like to include a 4-byte CRC-32 at 0x3FFFC to validate the program at start-up. I added another section and use the gcc __attribute__ directive to access that memory location.
uint32_t crc32_build __attribute__ ((section(".text_MFlashCRC")));
To compute and store the CRC-32 value, my plan was to use SRecord with the following post-build steps:
arm-none-eabi-size "${BuildArtifactFileName}"
arm-none-eabi-objcopy -O binary "${BuildArtifactFileName}" "${BuildArtifactFileBaseName}.bin"
checksum -p ${TargetChip} -d "${BuildArtifactFileBaseName}.bin"
../util/srec_cat "${BuildArtifactFileBaseName}.bin" -binary -crop 0 0x3FFFC -fill 0xFF 0x00000 0x3FFFC -crc32-b-e 0x3FFFC -o "${BuildArtifactFileBaseName}.crc.bin" -binary
echo ""
echo "CRC32:"
../util/srec_cat "${BuildArtifactFileBaseName}.crc.bin" -binary -crop 0x3FFFC 0x40000 -o - -hex-dump
This creates a binary with a checksum (necessary for bootloader) and then computes the CRC over the used Flash memory, storing the CRC value at 0x3FFFC.
However, I don't think I can load the binary file using the debugger. LPCXpresso has a built-in programming utility that can load the modified binary file, but it doesn't let me debug. I believe I could then start a debugging session with the original AXF file in "attach-only" mode, but this becomes cumbersome.
I've been able to use readelf to inspect the crc32_build variable in the AXF file. Is there a way to edit the variable in the AXF file? Is there an industry-standard approach to inserting a CRC as a post-build step?

There is no industry standard that I am aware of. There are various techniques to do this. I would suggest that you use the crc32_build as an extern in 'C' and define it via a linker script. For instance,
$ cat ld.script
.text : {
_start_crc_region = .;
*(.text);
_end_crc_region = .;
crc32_build = .;
LONG(CALC_CRC);
}
You pass the value CALC_CRC as zero for a first invocation and then relink with the value set. For instance,
$ ld --defsym=CALC_CRC=0 -T ld.script *.o -o phony.elf
$ objcopy -O binary -j sections phony.elf phony.bin # sections means checksum 'areas'
$ ld --defsym=CALC_CRC=0x`crc32 phony.bin` -T ld.script *.o -o target.elf
I use this technique to add digital signing to images; it should apply equally well to crc values. The linker script allows you to position the variable, which is often important for integrity checks like a CRC, but wouldn't matter for a simple checksum. A linker script also allows you to define symbols for both the start and end of the region. Without a script, you need some elf introspection.
You can of course extend the idea to include init data and other allocated sections. At some point you need to use objcopy to extract the sections and do the integrity check at build time. The sections may have various alignment constraints and you need to mimic this (in phony.bin above) on the host when doing the build time crc calculation.
As a bonus, everything is already done when you generate an srec file.
If you have trouble with --defsym, you can just pre-process the ld.script with sed, awk, perl, python, etc and substitute text with a hex value where CALC_CRC is.
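On the target side, the start-up check could look roughly like the sketch below. It is only an illustration: the function names crc32_calc and image_crc_ok are made up here, and the CRC variant shown (reflected polynomial 0xEDB88320, as used by zlib) is an assumption that has to be checked against whatever host tool computes the value at build time.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32: reflected polynomial 0xEDB88320, initial value and
 * final XOR of 0xFFFFFFFF (the zlib/PKZIP variant). Assumption: the
 * build-time tool uses the same variant. */
uint32_t crc32_calc(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            /* Shift out one bit; XOR in the polynomial if it was set. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Returns 1 if the CRC over [start, end) equals the stored value. */
int image_crc_ok(const uint8_t *start, const uint8_t *end, uint32_t stored)
{
    return crc32_calc(start, (size_t)(end - start)) == stored;
}

/* On the target, the symbols from the linker script would be used as:
 *   extern const uint8_t _start_crc_region[], _end_crc_region[];
 *   extern const uint32_t crc32_build;
 *   if (!image_crc_ok(_start_crc_region, _end_crc_region, crc32_build))
 *       ... refuse to boot ...
 */
```

A bitwise loop is slow but small. A table-driven version trades 1 KiB of flash for speed, which may or may not matter for a one-time check at boot.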

Related

What is difference between u-boot.bin and u-boot.rom

I have built U-Boot for the MinnowBoard MAX. I am seeing files like u-boot.rom, u-boot.bin, etc.
What is the difference between u-boot.rom and u-boot.bin? Which file should I flash to SPI NOR flash?
This is explained in doc/README.x86. In short, if you are going to be writing U-Boot to SPI NOR then you need to ensure that you have the correct binary blobs in the correct locations AND use BUILD_ROM=y so that u-boot.rom is generated as this is the file that is required on x86 to run on bare metal (rather than say as a coreboot payload).
Edit to address the comment:
The file 'u-boot' is the ELF object that is the result of building all of the U-Boot sources and linking them. This includes all of the extra sections and information an ELF file can contain. This is also by and large not bootable. The u-boot.bin file is the ELF u-boot but passed via objcopy to strip out (by and large, see the Makefile for the various flags or build with V=1) everything except for text/data sections so that we have only what is required to boot. Then u-boot.rom is the combination of objects and formatting that the x86 architecture requires in order to execute and run an image. Building with V=1 will show all of the details here.

binutils - kernel - "_binary" meaning?

I am reading xv6 lectures.
I have a file named initcode.S that is to be linked in the kernel.
Now two symbols are created that way :
extern char _binary_initcode_start[], _binary_initcode_size[];
inside a function.
The lecture says :
as part of the kernel build process, the linker embeds that binary and defines two special symbols, _binary_initcode_start and _binary_initcode_size, indicating the location and size of the binary.
I understand that binutils is getting the address and the size of this assembled code.
I wonder about the notation : is it default ? my searches didn't prove that clearly.
_binary -> it is originally an assembly code
_initcode -> the name of my file
_start -> the parameter i am interested in.
It would imply that any assembly code compiled would have those variables too.
I have no proof of that, though.
The question is:
is _binary_myAsmFileHere_myParameterHere the default naming scheme binutils gives an assembly file to export its address, size and so on?
Could someone tell me if my assumption is right or, better still, what the actual rule is?
Thanks
Strangely enough, it doesn't seem to be documented in the ld manual. However, man objcopy does say this:
You can access this binary data inside a program by referencing the
special symbols that are created by the conversion process. These
symbols are called _binary_objfile_start, _binary_objfile_end and
_binary_objfile_size. e.g. you can transform a picture file into an object file and then access it in your code using these symbols.
Apparently the same logic is used by ld when embedding binary files.
Notice that the Makefile for xv6 contains this line for linking the kernel:
$(LD) $(LDFLAGS) -T kernel.ld -o kernel entry.o $(OBJS) -b binary initcode entryother
As you can see, it uses -b binary to embed the files initcode and entryother, so the above symbols will be defined during this process.
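One detail that is easy to miss: the _size symbol is an absolute symbol, so the size is the symbol's address, not a value stored in memory. That is why the declaration in the question uses char arrays for both symbols. The sketch below illustrates the idea. The helper names symbol_value and copy_blob are invented for this example and are not part of xv6 or binutils.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Recover the value of an absolute symbol such as _binary_initcode_size.
 * The linker encodes the size as the symbol's *address*, so the pointer
 * is never dereferenced, only converted to an integer. */
size_t symbol_value(const char *sym)
{
    return (size_t)(uintptr_t)sym;
}

/* Copy an embedded blob given its _start and _size symbols. */
void copy_blob(void *dst, const char *start_sym, const char *size_sym)
{
    memcpy(dst, start_sym, symbol_value(size_sym));
}

/* With the declaration from the question,
 *   extern char _binary_initcode_start[], _binary_initcode_size[];
 * a call would look like:
 *   copy_blob(dst, _binary_initcode_start, _binary_initcode_size);
 */
```

This works in a statically linked kernel. In a position-independent executable the address of an absolute symbol may be relocated, so treat the cast as something to verify on your platform.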
When a .global variable is defined in an assembly file, for a C file to be able to reference that variable, the C file has to prepend a '_' to the variable name. This is so the linker can 'link' the name in the C file with the name in the assembly file.

Difference between using objcopy and xxding the file into a c source

Say I want to embed a file called data in my C executable.
The result that comes up on Google is this Linux Journal page, which says to use objcopy like this:
objcopy --input binary \
--output elf32-i386 \
--binary-architecture i386 data data.o
However this is dependent on the architecture of the computer, for example when compiling the object from the previous command it gives i386 architecture of input file 'data.o' is incompatible with i386:x86-64 output and I have to change the arguments.
However with the unix tool xxd, I can simply make a c source code with the data in a unsigned char array and an integer with its length and obtain the same result with device independent compilation commands.
data.o: data.c
gcc -c data.c
data.c: data
xxd -i data > data.c
What is the preferred method and why?
xxd is not a standard UNIX tool. It is actually part of Vim and is used to implement its hex editor function. Vim is an optional tool and is not universally available.
GNU objcopy, on the other hand, is part of GNU binutils and is generally preinstalled on GNU systems.
In general, when one needs to include a binary file in a program, something simple (as you do with xxd) is preferred over objcopy, mainly because objcopy is heavily under-documented and leaves the impression of being an unpolished front-end to BFD, the underlying library of binutils. Another reason is that along with the .c file you can also create the .h file, and make the generated files an integral part of your project.
The article you link already contains a number of examples of how to accomplish that. Probably the most popular tool for the purpose is hexdump, which is preinstalled on virtually all systems. For example, off the top of my head:
# .c
echo '#include <stddef.h>' > data.c
echo 'char data[] = {' >> data.c
hexdump -v -e '1/1 "0x%02X,"' < data.bin >> data.c
echo >> data.c
echo '};' >> data.c
echo 'size_t data_size = sizeof(data);' >> data.c
# .h
echo '#include <stddef.h>' > data.h
echo 'extern char data[];' >> data.h
echo 'extern size_t data_size;' >> data.h
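If neither xxd nor hexdump can be relied on, the generator itself is small enough to write in C and build as part of the project. The following is a minimal sketch, not a replacement for either tool: the function name emit_c_array is hypothetical, it emits a const unsigned char array rather than the plain char array used above, and it assumes a non-empty input.

```c
#include <stdio.h>
#include <stddef.h>

/* Write `len` bytes of `buf` to `out` as a C array definition named
 * `name`, followed by a size variable `<name>_size`.
 * Assumes len > 0 (an empty initializer is not valid before C23).
 * Returns 0 on success, -1 if the stream reported an error. */
int emit_c_array(FILE *out, const char *name,
                 const unsigned char *buf, size_t len)
{
    fprintf(out, "#include <stddef.h>\n");
    fprintf(out, "const unsigned char %s[] = {", name);
    for (size_t i = 0; i < len; i++) {
        /* Start a new indented line every 12 bytes for readability. */
        fprintf(out, "%s0x%02X,", (i % 12 == 0) ? "\n    " : " ", buf[i]);
    }
    fprintf(out, "\n};\n");
    fprintf(out, "const size_t %s_size = sizeof(%s);\n", name, name);
    return ferror(out) ? -1 : 0;
}
```

A small main() that reads the input file into a buffer and calls this function with stdout would slot into the Makefile rule in place of the xxd invocation.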

Width of symbols created by gcc's objectcopy

I am using objcopy to remove some necessary scripting to embed a resource file (zip file) in flash memory (ARM embedded thingy).
I am using objcopy like this:
arm-none-eabi-objcopy.exe -I binary -O elf32-littlearm -B arm --rename-section .data=.rodata input.zip input.zip.o
arm-none-eabi-nm.exe -S -t d input.zip.o
00006301 R _binary_input_zip_end
00006301 A _binary_input_zip_size
00000000 R _binary_input_zip_start
What I need to know is what is the width of the _end and _size symbols. I can only guess that the _start is an address which can be accessed like a byte array: extern uint8_t _binary_input_zip_start[];. And I am assuming that the _end and _size are of 'native' int-size, and I suppose I can safely assume I can interpret these as uint32_t.
However I can't be certain. I can't find anything "size" related in the docs of objcopy: https://sourceware.org/binutils/docs/binutils/objcopy.html
I'm not 100% sure this will work, but try adding the --size-sort option to arm-none-eabi-nm. This is supposed to sort the symbols by size, by comparing each symbol to the next one above it. In combination with the -S option, it should print a size. Hopefully, this will help you deduce their width.
What ARM micro are you using? 32-bits is a good guess, but there are exceptions. If you happen to be using a Texas Instruments part, I can help a lot more.
I don't have an ARM project handy that I can test this on, but it's worth a shot. If that doesn't work, I'll keep digging.
Source: My knowledge, and double-checking via http://manned.org/arm-none-eabi-nm
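It may also help to note that these symbols do not have a width of their own: each one is just an address (or, for _size, an absolute value), and no storage is allocated for them. One way to avoid the question entirely is to declare _start and _end as byte arrays and take their difference. The sketch below assumes that approach. blob_size is a hypothetical helper name.

```c
#include <stddef.h>
#include <stdint.h>

/* Size of an embedded blob, computed from its _start and _end symbols.
 * Both are addresses, so the result does not depend on any assumed
 * integer width for the symbols themselves. */
size_t blob_size(const uint8_t *start, const uint8_t *end)
{
    return (size_t)(end - start);
}

/* With the object file from the question this would be used as:
 *   extern const uint8_t _binary_input_zip_start[];
 *   extern const uint8_t _binary_input_zip_end[];
 *   size_t n = blob_size(_binary_input_zip_start, _binary_input_zip_end);
 */
```

For the nm output shown in the question (decimal radix), _start is 0 and _end is 6301, so the computed size would be 6301 bytes, matching the value of _binary_input_zip_size.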

Tool to analyze size of ELF sections and symbol

I need a way to analyze output file of my GCC compiler for ARM. I am compiling for bare metal and I am quite concerned with size. I can use arm-none-eabi-objdump provided by the cross-compiler but parsing the output is not something I would be eager to do if there exists a tool for this task. Do you know of such a tool existing? My search turned out no results.
One more thing, every function in my own code is in its own section.
You can use nm and size to get the size of functions and ELF sections.
To get the size of the functions (and objects with static storage duration):
$ nm --print-size --size-sort --radix=d tst.o
The second column shows the size in decimal of function and objects.
To get the size of the sections:
$ size -A -d tst.o
The second column shows the size in decimal of the sections.
The readelf utility is handy for displaying a variety of section information, including section sizes, e.g.:
arm-none-eabi-readelf -e foo.o
If you're interested in the run-time memory footprint, you can ignore the sections that do not have the 'A' (allocate) flag set.
Revisiting this question 10 years later, one must mention elf-size-analyze, a small Python-based wrapper around readelf and nm.
puncover uses objdump and a few other gcc tools to generate html pages you can easily browse to figure out where your code and data space is going.
It's a much nicer frontend than the text output of the gcc tools.

Resources