Contiguous Hex file generation using GCC - c

I have a hex file for an STM32F427 that was built using GCC (gcc-arm-none-eabi) version 4.6 and had contiguous memory addresses. I wrote a boot loader for loading that hex file and also added a checksum capability to make sure the hex file is correct before starting the application.
Snippet of Hex file:
:1005C80018460AF02FFE07F5A64202F1D00207F5F9
:1005D8008E4303F1A803104640F6C821C2F2000179
:1005E8001A460BF053F907F5A64303F1D003184652
:1005F8000BF068F907F5A64303F1E80340F6FC1091
:10060800C2F2000019463BF087FF07F5A64303F145
:10061800E80318464FF47A710EF092FC07F5A643EA
:1006280003F1E80318460EF03DFC034607F5A64221
:1006380002F1E0021046194601F0F2FC07F56A5390
As you can see, all the addresses are sequential. Then we changed the compiler to version 4.8 and I got the same type of hex file.
But now we have moved to compiler version 6.2 and the generated hex file is not contiguous. It looks something like this:
:10016000B9BC0C08B9BC0C08B9BC0C08B9BC0C086B
:10017000B9BC0C08B9BC0C08B9BC0C08B9BC0C085B
:08018000B9BC0C08B9BC0C0865
:1001900081F0004102E000BF83F0004330B54FEA38
:1001A00041044FEA430594EA050F08BF90EA020FA5
As you can see, the record ending at 0x0187 is followed by one starting at 0x0190, which means the 8 bytes in between (0x0188 to 0x018F) are left as 0xFF because they are not flashed.
The boot loader is kind of dumb: we just pass it the starting address and the number of bytes to calculate the checksum over.
Is there a way to make the hex file contiguous, as with compilers 4.6 and 4.8? The code is the same in all three cases.

If post-processing the hex file is an option, you can consider using the IntelHex Python library. It lets you manipulate hex file data (i.e. ignoring the 'markup': record type, address, checksum etc.) rather than lines, and will for instance create output with the correct record checksums.
A fast way to get this up and running could be to use the bundled convenience scripts hex2bin.py and bin2hex.py:
python hex2bin.py --pad=FF noncontiguous.hex tmp.bin
python bin2hex.py tmp.bin contiguous.hex
The first line converts the input file noncontiguous.hex to a binary file, padding it with FF where there is no data. The second line converts the binary file back to a hex file.
The result would be
:08018000B9BC0C08B9BC0C0865
becomes
:10018000B9BC0C08B9BC0C08FFFFFFFFFFFFFFFF65
As you can see, padding bytes are added where the input doesn't have any data, equivalent to writing the input file to the device and reading it back out. Bytes that are in the input file are kept the same - and at the same address.
The record checksum is also still correct: the length byte grows by 0x08 (from 0x08 to 0x10) and the eight 0xFF pad bytes add 0x7F8 to the byte sum, a total of 0x800, whose low byte is zero - so the two's-complement checksum is unchanged. If you padded with something else, IntelHex would output whatever checksum is then correct.
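If you want to convince yourself, here is a small standalone Python check (not something IntelHex requires, just the record-checksum arithmetic applied to the two records above):
# Intel HEX record checksum: two's complement of the low byte of the sum of
# all record bytes (length, address, record type, data).
def record_checksum(record):
    data = bytes.fromhex(record.lstrip(":"))[:-1]   # drop the stored checksum byte
    return (-sum(data)) & 0xFF

print(hex(record_checksum(":08018000B9BC0C08B9BC0C0865")))                  # 0x65
print(hex(record_checksum(":10018000B9BC0C08B9BC0C08FFFFFFFFFFFFFFFF65")))  # still 0x65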
You can skip the creation of a temporary file by piping: omit tmp.bin in the first line and replace it with - in the second line:
python hex2bin.py --pad=FF noncontiguous.hex | python bin2hex.py - contiguous.hex
An alternative would be to have a base file filled with 0xFF and use the hexmerge.py convenience script to merge gcc's output onto it with --overlap=replace.
The longer, more flexible way would be to implement your own tool using the IntelHex API. I've used this to good effect in situations similar to yours - tweaking hex files to satisfy tools that are costly to change, but only handle hex files the way they looked when the tool was written.
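For the API route, a minimal sketch (assuming the intelhex package is installed; file names reuse the examples above) that loads the non-contiguous output and rewrites it with every address between the lowest and highest explicitly set:
from intelhex import IntelHex

ih = IntelHex("noncontiguous.hex")        # the hex file produced by GCC 6.2 / objcopy
start, end = ih.minaddr(), ih.maxaddr()

filled = IntelHex()
for addr in range(start, end + 1):
    # ih[addr] returns ih.padding (0xFF by default) where the input has no data
    filled[addr] = ih[addr]

filled.write_hex_file("contiguous.hex")
This is essentially what the hex2bin.py/bin2hex.py round trip does, but from here you can add whatever extra checks or tweaks your boot loader needs.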

One of many possible ways:
Make your hex file with v6.2, e.g., foo.hex.
Postprocess it with this Perl oneliner:
perl -pe 'if(m/^:(..)(.*)$/) { my $rest=16-hex($1); $_ = ":10" . $2 . ("FF" x $rest) . "\n"; }' foo.hex > foo2.hex
Now foo2.hex will contain only 16-byte records.
Note: all this does is pad each record out to 0x10 bytes. It doesn't check addresses or anything else. Strictly speaking, the old checksum byte becomes the first pad byte and the last FF becomes the new checksum; the record checksum still works out because the count grows by exactly as many bytes as there are 0xFF pads, and each 0xFF contributes -1 modulo 256.
Explanation
perl -pe '<some script>' <input file> runs <some script> for each line of <input file>, and prints the result. The script is:
if (m/^:(..)(.*)$/) {                        # $1 = existing byte count, $2 = rest of the line
    my $rest = 16 - hex($1);                 # how many 0xFF pad bytes we need
    $_ = ":10" . $2 . ("FF" x $rest) . "\n"; # new count 0x10, the existing bytes, then the pads
}

Another solution is to change the linker script to ensure the preceding .isr_vector section ends on a 16-byte boundary, since the map file reveals that the following .text section is 16-byte aligned.
This will ensure there are no unprogrammed flash bytes between the two sections.

You can use bincopy to fill all empty space with 0xff.
$ pip install bincopy
$ bincopy fill foo.hex

Use the --gap-fill option of objcopy, e.g.:
arm-none-eabi-objcopy --gap-fill 0xFF -O ihex firmware.elf firmware.hex

Related

IDA PRO change .rdata value

How can I change the circled value? (5CB4F8B3h)
I am not a professional, but it is vital for me to influence this value.
Treat it with understanding :)
Using a hex editor, locate the bytes you need to modify and change them. TimeDateStamp is in epoch time, so you should get the new timestamp in epoch time and convert it to hexadecimal (you can use a calculator in "programmer mode": type the decimal timestamp and you will get its hex value).
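As a quick illustration of that conversion (Python used here just for the arithmetic; the 2020-01-01 date is only an example):
from datetime import datetime, timezone

ts = 0x5CB4F8B3                                     # the value circled in the question
print(datetime.fromtimestamp(ts, tz=timezone.utc))  # the build date/time that stamp encodes

new = datetime(2020, 1, 1, tzinfo=timezone.utc)     # whatever new date you want
print(hex(int(new.timestamp())))                    # hex value to write with the hex editor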
After some googling, I found that a tool called PE Explorer can modify the timestamp of PE executables, which is what you're attempting to do, so this could be a better solution for a non-professional. The instructions are here. I'm in no way affiliated with this product; I just thought it may be useful for your needs.
You can modify these values in IDA with the IDC function PatchDword:
success PatchDword(long ea,long value);
This function changes a dword (4 bytes) at address ea to value.
You want to change 3 values at 0x14010D484, 0x14010D4A0 and 0x14010D4BC.
In IDA run "File->IDC command..." and enter this script with newValue1, newValue2 and newValue3 set to new values for these addresses:
if (!PatchDword(0x14010D484, newValue1) ||
    !PatchDword(0x14010D4A0, newValue2) ||
    !PatchDword(0x14010D4BC, newValue3))
    Message("Failed to patch!\n");
Click OK and check the output window - there should be no error messages.
Generate DIF file - "File->Produce File->Create DIF file...". The DIF file is a text file with a list of patched bytes. Each line looks like:
0005A827: 5D BB
In this example 0005A827 is an offset in the binary on the disk, 5D is the original value, BB is the new value.
Apply this DIF file with some patch applier, or do it manually in a hex editor - move to the offsets from the DIF file and change the values of the bytes.
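If you want to script that last step, here is a rough Python sketch (my own helper, not an IDA tool); it assumes the simple "OFFSET: OLD NEW" line format shown above and skips anything else, such as the header and file-name lines:
import re, sys

def apply_dif(dif_path, target_path):
    with open(target_path, "r+b") as target, open(dif_path) as dif:
        for line in dif:
            m = re.match(r"([0-9A-Fa-f]+):\s*([0-9A-Fa-f]{2})\s+([0-9A-Fa-f]{2})", line)
            if not m:
                continue                          # header / file-name lines
            offset = int(m.group(1), 16)
            old, new = int(m.group(2), 16), int(m.group(3), 16)
            target.seek(offset)
            if target.read(1) != bytes([old]):
                sys.exit(f"mismatch at {offset:#x}: file does not contain the expected byte")
            target.seek(offset)
            target.write(bytes([new]))

apply_dif("patched.dif", "target.exe")            # file names are placeholders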
Your binary may use checksum verification; in that case it will not run after the patch. The checksum fix depends on the checksum type used.

Compare data in bytes to a hex number

I need to read data from stdin, where I need to check whether 2 bytes are equal to 0x0001 or 0x0002 and then, depending on the value, execute one of two types of code. Here is what I tried:
uint16_t type;
uint16_t type1 = 0x0001;
while ((read(0, &type, sizeof(type))) > 0) {
    if (type == type1) {
        // do smth
    }
}
I'm not sure what numeric format read uses, since printing the value of type both as decimal and as hex returns something completely different, even when I write 0001 on stdin.
The values 0x0001 and 0x0002 are control characters in UTF-8; you won't be able to type them manually.
If you are trying to compare character input, use fgets and atoi to convert the character-based input from stdin into an integer, or, if you want to keep using read, compare against the characters '0' '0' '0' '1'/'2' instead of the numeric values to get proper results.
You didn't say what kind of OS you're using. If you're using Unix or Linux or MacOS, at the command line, it can be pretty easy to rig up simple ways of testing this sort of program. If you're using Windows, though, I'm sorry, my answer here is not going to be very useful.
(I'm also going to be making heavy use of the Unix/Linux "pipe" concept, that is, the vertical bar | which sends the output of one program to the input of the next. If you're not familiar with this concept, my answer may not make much sense at first.)
If it were me, after compiling your program to a.out, I would just run
echo 00 00 00 01 00 02 | unhex | a.out
In fact, I did compile your program to a.out, and tried this, and... it didn't work. It didn't work because my machine (like yours) uses little-endian byte order, so what I should have done was
echo 00 00 01 00 02 00 | unhex | a.out
Now it works perfectly.
The only problem, as you've discovered if you tried it, is that unhex is not a standard program. It's one of my own little utilities in my personal bin directory. It's massively handy, and it would be super useful if something like it were standard, but it's not. (There's probably a semistandard Linux utility for doing this sort of thing, but I don't know what it is.)
What is pretty standard, and you probably can use it, is a base-64 decoder. The problem there is that there's no good way to hand-generate the input you need. But I can generate it for you (using my unhexer). If you can run
echo AAABAAIA | base64 -d | a.out
that will perform an equivalent test of your program. AAABAAIA is the base-64 encoding of the three 16-bit little-endian words 0000, 0001, and 0002, as you can see by running
echo AAABAAIA | base64 -d | od -h
On some systems, the base64 command uses a -D option, rather than -d, to request decoding. That is, if you get an error message like "invalid option -- d", you can try
echo AAABAAIA | base64 -D | a.out
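If Python happens to be installed, yet another option is to generate the little-endian test bytes directly, with no unhex or base-64 step (the hex string below is the same 0000 0001 0002 sequence as above):
python3 -c "import sys; sys.stdout.buffer.write(bytes.fromhex('000001000200'))" | a.out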

Alpine APK Package Repositories, how are the checksums calculated?

I'm trying to work out how the pull checksum for packages is calculated within Alpine APK package repositories. The documentation regarding the format is lacking in any detail.
I run apk index -o APKINDEX.unsigned.tar.gz *.apk, which generates the repository index. When you extract the text file from inside the gz, it contains the following...
C:Q17KXT6xFVWz4EZDIbkcvXQ/uz9ys=
P:redis-server
V:3.2.3-0
A:noarch
S:2784844
I:102400
T:An advanced key-value store
U:http://redis.io/
L:
D:linux-headers
I'm interested in how the very first line is generated. I've tried to read the actual source that's used to generate this, but I'm not a C programmer, so it's hard for me to comprehend as it jumps all over the place.
The two files mentioned in the documentation are database.c and package.c.
In case this somewhat helps, the original APK file has these various hashes...
CRC32 = ac17ea88
MD5 = a035ecf940a67a6572ff40afad4f396a
SHA1 = eca5d3eb11555b3e0464321b91cbd743fbb3f72b
SHA256 = 24bc1f03409b0856d84758d6d44b2f04737bbc260815c525581258a5b4bf6df4
The pull checksum is the sha1sum of the second tar.gz file in the apk file, containing the .PKGINFO file.
The Alpine APK package is actually a concatenation in disguise of 3 tar.gz files.
We can split the package below using gunzip-split into 3 .gz files, then rename them to .tar.gz
./gunzip-split -s -o ./out/ strace-5.14-r0.apk
mv ./out/file_1.gz ./out/file_1.tar.gz
mv ./out/file_2.gz ./out/file_2.tar.gz
mv ./out/file_3.gz ./out/file_3.tar.gz
sha1sum ./out/file_2.tar.gz
7a266425df7bfd7ce9a42c71a015ea2ae5715838 out/file_2.tar.gz
tar tvf out/file_2.tar.gz
-rw-r--r-- root/root 702 2021-09-03 01:34 .PKGINFO
In the case of the strace package the checksum value can be derived as above:
apk index strace-5.14-r0.apk -o APKINDEX.tar.gz
tar xvf APKINDEX.tar.gz
cat APKINDEX
echo eiZkJd97/XzppCxxoBXqKuVxWDg=|base64 -d|xxd
00000000: 7a26 6425 df7b fd7c e9a4 2c71 a015 ea2a z&d%.{.|..,q...*
00000010: e571 5838 .qX8
When comparing them we see that they match.
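If you want to reproduce this without the external gunzip-split tool, here is a rough Python sketch (my own approach, not apk-tools code). It relies on zlib's decompressobj stopping at the end of one gzip member and reporting the leftover bytes in unused_data, which lets us locate the raw bytes of the second member and SHA-1 them:
import hashlib, zlib

def apk_pull_checksum(path):
    # The pull checksum is the SHA-1 of the raw (still compressed) bytes of the
    # second gzip member, i.e. the control segment containing .PKGINFO.
    data = open(path, "rb").read()
    offsets, rest = [0], data
    for _ in range(2):                      # find where member 1 and member 2 end
        d = zlib.decompressobj(wbits=31)    # wbits=31: expect gzip framing
        d.decompress(rest)
        consumed = len(rest) - len(d.unused_data)
        offsets.append(offsets[-1] + consumed)
        rest = d.unused_data
    return hashlib.sha1(data[offsets[1]:offsets[2]]).hexdigest()

print(apk_pull_checksum("strace-5.14-r0.apk"))   # should match the sha1sum shown above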
References
https://github.com/martencassel/apk-tools/blob/master/README.md
https://gitlab.com/cg909/gunzip-split/-/releases
https://lists.alpinelinux.org/~alpine/devel/%3C257B6969-21FD-4D51-A8EC-95CB95CEF365%40ferrisellis.com%3E#%3C20180309152107.472e4144#vostro.util.wtbts.net%3E
So...
/* Internal container for MD5 or SHA1 */
struct apk_checksum {
unsigned char data[20];
unsigned char type;
};
Basically, take the C: value, chop the Q off the front, then base64-decode it. Chop off the last value (the type, which defaults to SHA1) and you have your SHA-1. This appears to be computed over the CONTENTS of the package, but that would take further looking into.
You need to look here: https://git.alpinelinux.org/cgit/apk-tools/tree/src/blob.c#n492
It is apk_blob_pull_csum
First 'Q' stands for encoding
Next '1' stands for SHA1
It looks like this checksum is made in database.c, in apk_db_unpack_pkg:
apk_sign_ctx_init(&ctx.sctx, APK_SIGN_VERIFY_IDENTITY, &pkg->csum, db->keys_fd);
tar = apk_bstream_gunzip_mpart(bs, apk_sign_ctx_mpart_cb, &ctx.sctx);
r = apk_tar_parse(tar, apk_db_install_archive_entry, &ctx, TRUE, &db->id_cache);
but I'm not sure, because I failed to trace this code.
It is really not easy to understand what they are doing.
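Putting the pieces together, decoding the C: field back to a raw SHA-1 is short; here it is applied to the value from the question's APKINDEX (strip the leading 'Q' encoding marker and the '1' type digit, then base64-decode):
import base64

c_value = "Q17KXT6xFVWz4EZDIbkcvXQ/uz9ys="      # the C: line from the APKINDEX above
assert c_value[0] == "Q" and c_value[1] == "1"   # 'Q' = encoded checksum, '1' = SHA-1
print(base64.b64decode(c_value[2:]).hex())       # hex digest to compare with sha1sum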

Escape characters (0x1b/27) in binary packages don't get sent through Wi-Fi and corrupt message during transmission

I am developing on an embedded system (STM32F4) and I tried to send some data to a simple Windows Forms client program on the PC side. When I used a character-based string format everything worked fine, but when I changed to a binary package to increase performance I ran into a problem with escape characters.
I'm using nanopb to implement Google's Protocol Buffers for transmission, and I observed that for about 5% of packages I receive exceptions in my client program telling me that my packages are corrupted.
I debugged in Wireshark and saw that in these corrupted packages the size was 2-4 bytes smaller than the original package size. Upon further inspection I found out that the corrupted packages always included the binary value 27 and other packages never included this value. I searched for it and saw that this value represents an escape character and that this might lead to problems.
The technical document of the Wi-Fi module I'm using (Gainspan GSM2100) mentions that commands are preceded by an escape character, so I think I need to get rid of these values in my package.
I couldn't find a solution to my problem, so I would appreciate it if somebody more experienced could lead me to the right approach to solve this problem.
How are you sending the data? Are you using a library or sending raw bytes? According to the manual, your data commands should start with an escape sequence, but also have data length specified:
// Each escape sequence starts with the ASCII character 27 (0x1B),
// the equivalent to the ESC key. The contents of < > are a byte or byte stream.
// - Cid is connection id (udp, tcp, etc)
// - Data Length is 4 ASCII char represents decimal value
// i.e. 1400 bytes would be '1' '4' '0' '0' (0x31 0x34 0x30 0x30).
// - Data size must match with specified length.
// Ignore all command or esc sequence in between data pay load.
<Esc>Z<Cid><Data Length xxxx 4 ascii char><data>
Note the remark regarding data size: "Ignore all command or esc sequence in between data pay load".
For example, this is what the GSCore::writeData function in GSCore.cpp looks like:
// Including a trailing 0 that snprintf insists to write
uint8_t header[8];
// Prepare header: <esc> Z <cid> <ascii length>
snprintf((char*)header, sizeof(header), "\x1bZ%x%04d", cid, len);
// First, write the escape sequence up to the cid. After this, the
// module responds with <ESC>O or <ESC>F.
writeRaw(header, 3);
if (!readDataResponse()) {
if (GS_LOG_ERRORS && this->error)
this->error->println("Sending bulk data frame failed");
return false;
}
// Then, write the rest of the escape sequence (-1 to not write the
// trailing 0)
writeRaw(header + 3, sizeof(header) - 1 - 3);
// And write the actual data
writeRaw(buf, len);
This should most likely work. Alternatively, a dirty hack might be to "escape the escape character" before sending, i.e. replace each 0x1B byte with two bytes (0x1B 0x1B) - but this is just a wild guess, and you should really check the manual.
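If you do end up needing that workaround, it amounts to doubling every ESC (0x1B) byte before it goes out on the wire - shown here as a Python sketch only; whether the module really treats a doubled ESC as a literal byte is something the GS2100 manual has to confirm:
ESC = b"\x1b"

def escape_payload(payload: bytes) -> bytes:
    # Double every ESC byte so the module does not mistake it for the start
    # of a command sequence (verify this behaviour against the manual).
    return payload.replace(ESC, ESC + ESC)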

Linux programming: which device a file is in

I would like to know which entry under /dev a file is in. For example, if /dev/sdc1 is mounted under /media/disk, and I ask for /media/disk/foo.txt, I would like to get /dev/sdc as response.
Using the stat system call on that file I will get its partition's major and minor numbers (8 and 33 for sdc1). Now I need to get the "root" device (sdc) or its major/minor from that. Is there any syscall or library function I could use to link a partition to its main device? Or, even better, to get that device directly from the file?
brw-rw---- 1 root floppy 8, 32 2011-04-01 20:00 /dev/sdc
brw-rw---- 1 root floppy 8, 33 2011-04-01 20:00 /dev/sdc1
Thanks in advance!
The quick and dirty version: df $file | awk 'NR == 2 {print $1}'.
Programmatically... well, there's a reason I started with the quick and dirty version. There's no portable way to programmatically get the list of mounted filesystems. (getmntent() gets fstab entries, which is not the same thing.) Moreover, you can't even parse the output of mount(8) reliably; on different Unixes, the mountpoint may be the first or the last item. The most portable way to do this ends up being... parsing df output (And even that is iffy, as you noticed with the partition number.). So you're right back to the quick and dirty shell solution anyway, unless you want to traverse /dev and look for block devices with matching major(st_rdev) (major() being from sys/types.h).
If you restrict this to Linux, you can use /proc/mounts to get the list of mounted filesystems. Other specific Unixes can similarly be optimized: for example, on OS X and I think FreeBSD, you can use sysctl() on the vfs tree to get mountpoints. At worst you can find and use the appropriate header file to decipher whatever the mount table file is (and yes, even that varies: on Solaris it's /etc/mnttab, on many other systems it's /etc/mtab, some systems put it in /var/run instead of /etc, and on many Linuxes it's either nonexistent or a symlink to /proc/mounts). And its format is different on pretty much every Unix-like OS.
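On Linux, a sketch of the /proc/mounts approach looks like this (it returns the mounted partition, just like df does; the sysfs answer below shows how to map that to the parent disk):
import os

def device_for(path):
    # Find the mount entry whose mount point is the longest prefix of the path.
    path = os.path.realpath(path)
    best_device, best_mountpoint = "", ""
    with open("/proc/mounts") as mounts:
        for line in mounts:
            device, mountpoint = line.split()[:2]
            if path == mountpoint or path.startswith(mountpoint.rstrip("/") + "/"):
                if len(mountpoint) > len(best_mountpoint):
                    best_device, best_mountpoint = device, mountpoint
    return best_device

print(device_for("/media/disk/foo.txt"))   # e.g. /dev/sdc1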
The information you want exists in sysfs, which exposes the Linux device tree. This models the relationships between the devices on the system, and since you are trying to determine a parent disk device from a partition, this is the place to look. I don't know if there are any hard and fast rules you can rely on to stop your code breaking with future versions of the kernel, but the kernel developers do try to maintain sysfs as a stable interface.
If you look at /sys/dev/block/<major>:<minor>, you'll see it is a symlink with the tail components being block/<disk-device-name>/<partition-device-name>. If you were to perform a readlink(2) system call on that, you could parse the link destination to get the disk device name. In shell (since it's easier to express this way, but doing it in C will be pretty easy):
$ echo $(basename $(dirname $(readlink /sys/dev/block/8:33)))
sdc
Alternatively, you could take advantage of the nesting of partition directories in the disk directories (again in shell, but from C it's an open(2), read(2), and close(2)):
$ cat /sys/dev/block/8:33/../dev
8:32
That assumes your starting major:minor is actually for a partition, not some other sort of non-nested device.
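The same lookup from code, sketched here in Python (in C it is just a stat(2), a readlink(2), and some string slicing); it assumes the file lives on a partition of a disk, as in the question:
import os

def parent_disk(path):
    st = os.stat(path)
    major, minor = os.major(st.st_dev), os.minor(st.st_dev)   # device holding the file
    link = os.readlink(f"/sys/dev/block/{major}:{minor}")     # e.g. .../block/sdc/sdc1
    parts = link.split("/")
    return parts[-2]                                          # disk name, e.g. sdc

print(parent_disk("/media/disk/foo.txt"))   # sdc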
What you are looking for is impossible - there is no 1:1 connection between a block device file and the partition it is describing.
Consider:
You can create multiple block device files with different names (but the same major and minor numbers) and they are indistinguishable (N:1)
You can use a block device file as an argument to mount to mount a partition and then delete the block device file leaving the partition mounted. (0:1)
So there is no way to do what you want except in a few specific and narrow cases.
The major number tells you which device it is: 3 is IDE on the 1st controller, 22 is IDE on the 2nd controller, and 8 is SCSI.
The minor number tells you the partition number and - for IDE devices - whether it is the primary or secondary drive. The calculation is different for IDE and SCSI.
For IDE it is x*64 + p, where x is the drive number on the controller (0 or 1) and p is the partition.
For SCSI it is y*16 + p, where y is the drive number and p is the partition.
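A quick worked check against the question's own numbers (minor 33 for /dev/sdc1), using the SCSI formula:
minor = 33                          # from the ls -l output in the question
drive, part = minor // 16, minor % 16
print(drive, part)                  # 2 1 -> third SCSI disk (sda=0, sdb=1, sdc=2), partition 1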
Not a syscall, but:
df -h /path/to/my/file
From https://unix.stackexchange.com/questions/128471/determine-what-device-a-directory-is-located-on
So you could look at df's source code and see what it does.
I realize this post is old, but this question was the 2nd result in my search and no one has mentioned df -h
