I am doing an ongoing project to write a simplified OS for hobby/learning purposes. I can generate hex files, and now I want to write a script on the chip to accept them via serial, load them into RAM, then execute them. For simplicity I'm writing in assembly so that all of the startup code is up to me. Where do I start here? I know that the hex file format is well documented, but is it as simple as reading the headers for each line, aligning the addresses, then putting the data into RAM and jumping to the address? It sounds like I need a lot more than that, but this is a problem that most people don't even try to solve. Any help would be great.
This is too vague: there are many different file formats, and at least two really popular ones store the data as hex text, so "hex file" alone doesn't tell us much.
Writing a script on the chip implies you have an operating system running on your microcontroller. Which operating system is it, and what does the command line look like?
Assembly is not required to control everything (basically bare metal). You can use asm to bootstrap C and write the rest in C, not a problem.
Do you want to download to RAM and run, or download and then burn to flash so that you can reset into it?
Basically what you are making is a bootloader. And yes we write bootloaders all the time, one for each platform, sometimes borrowing code from a prior platform sometimes not.
First off, on your development computer (Windows, Mac, Linux, whatever), write a program (ideally in C or Pascal, something you can easily port to the microcontroller) that reads the whole file into an array. Then write some code that accepts one byte at a time, as you would if you were receiving it serially. Parse the file format you chose (you can change formats later if you decide you no longer like it). Take real programs that you have built; the disassembler or other tools should have output options that show you which bytes or words should land at which addresses. Parse this data, printf the address/byte or address/word items you find, and compare that to what the toolchain showed. Then carve the parser out and replace the printf with a write to memory at that address. And yes, you then jump to the entry point, if you can figure it out, or you as the designer decide that all programs must have a specific entry point.
Intel HEX and Motorola S-record are good choices (-O ihex or -O srec; my current favorite is --srec-forceS3 -O srec). Both are documented on Wikipedia, and at least the GNU tools will produce both. You can use a dumb terminal program (minicom) to dump the file into your microcontroller and, hopefully, parse and write to RAM as it comes in. If you can't handle that flow, you might consider a raw binary (-O binary) and implement an XMODEM receiver in your bootloader.
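To make the "parse on the host first" step concrete, here is a minimal sketch of a parser for a single Intel HEX line. The struct and function names (ihex_record, ihex_parse_line) are made up for illustration, and it assumes the whole line is already in a NUL-terminated buffer rather than arriving byte by byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One decoded Intel HEX record (hypothetical struct, for illustration). */
typedef struct {
    uint8_t  len;        /* number of data bytes */
    uint16_t addr;       /* 16-bit load offset from the record header */
    uint8_t  type;       /* 0x00 data, 0x01 end of file, 0x04 ext. linear address */
    uint8_t  data[255];
} ihex_record;

/* Convert one ASCII hex digit to its value, or -1 if it is not a hex digit. */
static int hex_nibble(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Read two hex digits at s into *out. Returns 0 on success, -1 on error. */
static int hex_byte(const char *s, uint8_t *out)
{
    int hi = hex_nibble(s[0]);
    if (hi < 0) return -1;              /* also stops at the terminating NUL */
    int lo = hex_nibble(s[1]);
    if (lo < 0) return -1;
    *out = (uint8_t)((hi << 4) | lo);
    return 0;
}

/* Parse one line such as ":0300300002337A1E". Returns 0 if the line is
   well formed and its checksum matches, -1 otherwise. */
int ihex_parse_line(const char *line, ihex_record *rec)
{
    assert(line != NULL && rec != NULL);
    if (line[0] != ':') return -1;
    const char *p = line + 1;
    uint8_t hdr[4], sum = 0, b;

    for (int i = 0; i < 4; i++) {       /* length, address hi/lo, type */
        if (hex_byte(p, &hdr[i]) < 0) return -1;
        sum = (uint8_t)(sum + hdr[i]);
        p += 2;
    }
    rec->len  = hdr[0];
    rec->addr = (uint16_t)((hdr[1] << 8) | hdr[2]);
    rec->type = hdr[3];

    for (unsigned i = 0; i < rec->len; i++) {
        if (hex_byte(p, &b) < 0) return -1;
        rec->data[i] = b;
        sum = (uint8_t)(sum + b);
        p += 2;
    }
    if (hex_byte(p, &b) < 0) return -1; /* checksum byte */
    sum = (uint8_t)(sum + b);
    return sum == 0 ? 0 : -1;           /* all bytes must sum to 0 modulo 256 */
}
```

On the host you would call ihex_parse_line for each line of the file and printf rec.addr plus the data bytes; on the target the same function can feed a loop that writes rec.data to RAM instead.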
I am trying to run some MC9S12DP256 example files, but I want to see the code to understand it. Are there any ways to convert a .s19 or .abs file to a C code?
An ".s19" or an ".abs" file contains mainly the machine code of the application. The source code is not included, regardless of the language it was written in. Even if it was written in assembly language, all symbolic information and comments are gone.
However, you can try to de-compile the machine code. This is not a trivial or quick task, you need to know the target really well. I did this with software for other processors, it is feasible for code up to some KB.
These are the steps I recommend:
Get a disassembler and an assembler for the target processor, optimally from the vendor.
Let it disassemble the machine code into assembly source code. You might need to convert the ".s19" file into a binary file, one possible tool for this is "srecord".
Assemble the resulting source code again into ".s19" or ".abs", and make sure that it generates the same contents as your original.
Insert labels for the reset and interrupt entry points. Start at the reset entry point with your analysis.
Read the source code, think about what it does.
You will quickly "dive" into subroutines that execute small functions, like reading ADC or sending data. Place a label and replace the numerical value at the call sites with the label.
Expect sections of (constant) data mixed with executable code.
Repeat often from point 3. If you have a difference, undo your last step and redo it in another way until you produce the same contents.
If you want C source, it is usually much more difficult. You need a lot of experience with how C is compiled into machine code. Be aware that variables and even functions are commonly placed in a different order than they were declared. If you want to go that route, you usually also have to use the exact version of the compiler that generated the original machine code.
Be aware that the original might be produced with any other language.
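If you would rather not depend on an external tool for the ".s19" to binary conversion mentioned in the steps above, the record format is simple enough to decode by hand. The following is only a sketch: srec_to_image and its parameters are hypothetical names, and it assumes you already know the base address and size of the region you want to extract.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Convert one ASCII hex digit to its value, or -1 if it is not a hex digit. */
static int hex_nibble(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Read two hex digits at s into *out. Returns 0 on success, -1 on error. */
static int hex_byte(const char *s, uint8_t *out)
{
    int hi = hex_nibble(s[0]);
    if (hi < 0) return -1;
    int lo = hex_nibble(s[1]);
    if (lo < 0) return -1;
    *out = (uint8_t)((hi << 4) | lo);
    return 0;
}

/* Decode one S-record line and, if it is a data record (S1/S2/S3), copy its
   payload into image[] at (address - base). Returns the number of data bytes
   stored (0 for non-data records), or -1 on a malformed line, a bad checksum,
   or an address outside the image. */
int srec_to_image(const char *line, uint8_t *image, size_t image_size, uint32_t base)
{
    assert(line != NULL && image != NULL);
    if (line[0] != 'S' || line[1] < '0' || line[1] > '9') return -1;

    int addr_len = 0;                       /* address bytes of a data record */
    if (line[1] == '1') addr_len = 2;
    else if (line[1] == '2') addr_len = 3;
    else if (line[1] == '3') addr_len = 4;

    uint8_t count;                          /* bytes that follow the count byte */
    if (hex_byte(line + 2, &count) < 0 || count < 3) return -1;

    uint8_t buf[255];
    unsigned sum = count;
    for (unsigned i = 0; i < count; i++) {
        if (hex_byte(line + 4 + 2 * i, &buf[i]) < 0) return -1;
        sum += buf[i];
    }
    if ((sum & 0xFFu) != 0xFFu) return -1;  /* checksum is the ones' complement */
    if (addr_len == 0) return 0;            /* header, count or termination record */
    if (count < (unsigned)addr_len + 1) return -1;

    uint32_t addr = 0;
    for (int i = 0; i < addr_len; i++) addr = (addr << 8) | buf[i];
    size_t data_len = (size_t)count - (size_t)addr_len - 1;

    if (addr < base) return -1;
    size_t off = addr - base;
    if (off > image_size || data_len > image_size - off) return -1;
    memcpy(image + off, buf + addr_len, data_len);
    return (int)data_len;
}
```

Feeding every line of the file through this function fills a flat image that a disassembler can read as raw binary. Gaps between records stay at whatever value the image was initialized with.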
In my project, we are building an application to run on Linux on both x86 and ARM. I accidentally ran the x86 binary on ARM, and to my surprise the binary launched, sort of. It wrote one of the string literals to stdout and immediately ended with a segfault.
No meaningful message along the lines "This binary cannot be run on this platform" was shown, which is something I was assuming would happen.
Is it technically possible to set up my compiler/linker/anything so that the output binary will not run at all if launched on the wrong architecture? Or so that some meaningful message is displayed?
What you want is FatELF.
Since that isn't really supported, you could write a shell-script, put your executable's content in there (base64-encoded), and write the correct executable for the correct architecture to /tmp, and if the architecture is not supported, you could display an error message.
That way, you'd have one executable for all Unix/Linux/Mac platforms for all processor architectures, with no dependency on the user making a (wrong) decision.
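A launcher like that has to decide which architecture a given executable was built for. This is not the shell-script approach itself, just a sketch in C of one way to make that decision: read the e_machine field from the ELF header. The function names are made up; the offsets and machine numbers come from the ELF specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the e_machine value from the first bytes of an ELF file,
   or -1 if the buffer is too short or is not an ELF header. */
int elf_machine(const uint8_t *hdr, size_t len)
{
    assert(hdr != NULL);
    if (len < 20) return -1;                 /* e_machine sits at offset 18..19 */
    if (hdr[0] != 0x7F || hdr[1] != 'E' || hdr[2] != 'L' || hdr[3] != 'F')
        return -1;
    if (hdr[5] == 1)                         /* EI_DATA: little-endian file */
        return hdr[18] | (hdr[19] << 8);
    if (hdr[5] == 2)                         /* EI_DATA: big-endian file */
        return (hdr[18] << 8) | hdr[19];
    return -1;
}

/* Human-readable name for a few common e_machine values. */
const char *elf_machine_name(int machine)
{
    switch (machine) {
    case 3:   return "x86 (EM_386)";
    case 40:  return "ARM (EM_ARM)";
    case 62:  return "x86-64 (EM_X86_64)";
    case 183: return "AArch64 (EM_AARCH64)";
    default:  return "unknown";
    }
}
```

A wrapper could read the first 20 bytes of the payload, compare the result against the machine it is running on, and print a clear message instead of starting a binary that will only segfault.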
I am trying to read the MCU_ID (device electronic signature) from an STM32L476 chip using a JTAG ST-Link/V2 on Windows 7. No code has to be uploaded to the chip; the program should just be launched on my computer and read this information from the flash memory.
I have managed to find and extract the following screenshot from the Reference Manual available on the ST website:
So I have to read the value stored in the flash memory at the address 0x1FFF7590 using a C program. I am using the Atollic TrueStudio IDE, which is recommended by ST itself, but it seems to me that it includes the "stm32l476xx.h" library, which does not even contain any function that could help me.
What I have done so far
After spending days and days looking for some functions or examples to do something as simple as read flash memory, I have asked on this very site How to interact with a STM32 chip memory, which helped me understand a couple of things about what I had to do; nevertheless, I haven't been able to find what I was looking for even after days reading all the links and docs advised in the comments.
I have asked a couple of professionals, who told me that I should search for a JTAG driver to interact with the flash memory, but it seems a bit complicated and I haven't been able to find any.
Someone on this site told me that simply using a pointer should be enough; given the lack of C examples and internet tutorials, I couldn't figure out how to do so.
Finally, I recently started digging around STM32Cube and HAL, even though I wanted to avoid using those because I thought that a simple read could be done without having to include those layers. Asking this question is my final hope before trying to use them.
In conclusion:
I can't show any code, since the only thing I have so far is a #include "stm32l476xx.h" and an empty main.
A hint or solution on How to read a STM32L476's flash memory in C would be just perfect. Every example of C (or any programming language which would be as low or higher level) program or instructions interacting with a STM32 chip's memory could help me a lot since it is very hard to find on internet.
Reading MCU ID using ST-Link (graphical interface)
You can use ST-Link Utility (can be downloaded from ST.com here: http://www.st.com/en/embedded-software/stsw-link004.html). After you do Target->Connect you can specify the address and number of bytes you want to read on top of the Window. This also works for the memory area where MCU ID is defined.
For STM32L476 MCU that you use it's going to be memory address 0x1FFF7590, size 0xC (96 bits). Pressing enter should allow you to see the unique ID read from the MCU you're connected to, in form of 3x32 bit values.
Reading MCU ID using ST-Link (command line interface)
ST-Link Utility provides CLI (command line interface) to do the most common operations. This is done using ST-LINK_CLI.exe located in your ST-Link Utility installation directory.
Reading Unique ID as 32-bit values (STM32L476 MCU from the example):
ST-LINK_CLI.exe -r32 0x1FFF7590 0xC
Reading as 8-bit values:
ST-LINK_CLI.exe -r8 0x1FFF7590 0xC
You can also read it to file using -Dump parameter:
ST-LINK_CLI.exe -Dump 0x1FFF7590 0xC D:\temp\out.bin
Keep in mind that you must have the privileges to write to the destination directory. If you don't run the command prompt with administrative privileges, in most cases this means that you won't be able to create the file in locations such as the root drive directory (C:\out.bin) or inside "Program Files", where your program is most likely installed (for example by specifying a relative path, such as giving only an output file name, out.bin). The program sadly doesn't report failed attempts to write the file; however, it does say when it succeeds in creating the file. The program execution should end with a green line saying Dumping memory to D:\temp\out.bin succeded. In addition, keep in mind that only the following file extensions are supported: *.bin *.hex *.srec *.s19. It cannot be anything else, because the extension determines the format in which the data will be written to the file.
You can find more information about CLI parameters in User Manual UM0892.
Reading MCU ID using C code
The same can be done using a program loaded into the MCU. You read it by simply accessing the memory directly. Sample code:
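#include <stdint.h>
/* Unique ID base address for the STM32L476, as a pointer to 32-bit words */
#define STM32_UNIQUEID_ADDR ((const volatile uint32_t *)0x1FFF7590)
uint32_t id[3];
/* Indexing the pointer steps in 32-bit words: +0, +4 and +8 bytes */
id[0] = STM32_UNIQUEID_ADDR[0];
id[1] = STM32_UNIQUEID_ADDR[1];
id[2] = STM32_UNIQUEID_ADDR[2];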
After this operation id array should contain the same 3x32bit values you've previously read using ST-Link Utility. You may of course choose to read it as uint8_t byte array of size 12, you may even choose to read it into a struct, in case you're interested in the details (lot number, wafer number etc.). This example should however give you a general idea of how to access this value.
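If you prefer the byte-array view mentioned above, something like the following sketch may help. The helper names (uid_read_words, uid_words_to_bytes) are invented for this example; taking the base address as a parameter is my own choice so that the same code can be exercised on a PC with a fake buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STM32L476_UID_BASE 0x1FFF7590u   /* address quoted above for the STM32L476 */
#define UID_WORDS 3u                     /* 96 bits = 3 x 32-bit words */

/* Copy the three 32-bit unique ID words starting at 'base' into out[]. */
void uid_read_words(const volatile uint32_t *base, uint32_t out[UID_WORDS])
{
    assert(base != NULL && out != NULL);
    for (unsigned i = 0; i < UID_WORDS; i++)
        out[i] = base[i];
}

/* Same data as 12 bytes, least significant byte of each word first. */
void uid_words_to_bytes(const uint32_t words[UID_WORDS], uint8_t out[12])
{
    assert(words != NULL && out != NULL);
    for (unsigned i = 0; i < UID_WORDS; i++)
        for (unsigned b = 0; b < 4; b++)
            out[4 * i + b] = (uint8_t)(words[i] >> (8 * b));
}
```

On the target the call would look like uid_read_words((const volatile uint32_t *)STM32L476_UID_BASE, id). On a PC that address is not mapped, so pass a pointer to test data instead.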
There is Texane stlink, that does what you want. It's written in C, interacts with STM32 chips through an ST-Link adapter, and it can read from chip memory.
What you are looking for is not a feature of ST but a feature of ARM.
Remember, ST simply uses an ARM core. I know most programmers load some code into RAM and use that to access flash. You can find these simple programs in the install directory of Keil, for example.
I think this is the manual you will need. But I don't know if there is more information behind the login-wall.
I'm not sure if this is specific to the processor I'm using, so for what it's worth I'm using a Cortex M0+. I was wondering: if I generate a hex file through gcc using -fPIC, I produce...Position Independent Code. However, the intel hex file format that I get out of objcopy always has address information on each line's header. If I'm trying to write a bootloader, do I just ignore that information, skip the bytes relating to it, and load the actual code into memory wherever I want, or do I have to keep track of it somehow?
The Intel HEX format was designed specifically for programming PROMs, EPROMs, or processors with an internal EPROM, and is normally used with programmers for these devices. The addresses at the beginning of the records have little to do with the program code directly. They indicate at which address of the PROM the data will be written. Remember also that the PROM can be mapped anywhere into the address space of the processor, so the final address can change anyway.
As long as you don't want to program a PROM, you must strip everything except the data from the records. (Don't forget the checksum at the end ;-)
As I understand the Intel HEX format, the records need not be contiguous; there may be holes in between.
Some remarks:
The -fPIC parameter is not responsible for the Intel HEX format. I think that somewhere in your command lines you'll find -O ihex. If you want a file that can be executed, objcopy provides better-suited output formats.
As long as you don't write earlier stages of the boot process by yourself, you don't load your bootloader - it will be loaded for you. The address at which this will happen is normally fixed and not changeable. So there is no need for position independent code, but it doesn't hurt either.
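Because records may leave holes, a loader that relocates the image probably cannot just append the data bytes one after another; it has to keep each byte's offset relative to some base. Here is a rough sketch of that bookkeeping. The names and the idea of a known link_base are assumptions for illustration, not something objcopy hands you.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Absolute address of a data record: the upper 16 bits come from the most
   recent type 0x04 (extended linear address) record, the lower 16 bits from
   the record's own address field. */
uint32_t ihex_abs_addr(uint16_t upper16, uint16_t rec_addr)
{
    return ((uint32_t)upper16 << 16) | rec_addr;
}

/* Map an address from the hex file to where the loader will actually put it.
   link_base:   lowest address the image was built for (assumed to be known)
   load_base:   where the loader wants the image in RAM
   region_size: size of the RAM region reserved for the image
   Returns 0 and stores the target in *out, or -1 if the address falls
   outside the region. */
int ihex_relocate(uint32_t abs_addr, uint32_t link_base, uint32_t load_base,
                  uint32_t region_size, uint32_t *out)
{
    assert(out != NULL);
    if (abs_addr < link_base) return -1;
    uint32_t offset = abs_addr - link_base;  /* holes between records survive */
    if (offset >= region_size) return -1;
    *out = load_base + offset;
    return 0;
}
```

For example, a record at 0x08001234 in an image linked at 0x08000000 and loaded at 0x20000000 would land at 0x20001234, whatever gaps precede it in the file.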
Why does inserting characters into an executable binary file cause it to "break" ?
And, is there any way to add characters without breaking the compiled program?
Background
I've known for a long time that it is possible to use a hex editor to change code in a compiled executable file and still have it run as normal...
Example
As an example in the application below, Facebook could be changed to Lacebook, and the program will still execute just fine:
But it Breaks with new Characters
I'm also aware that if new characters are added, it will break the program and it won't run, or it will crash immediately. For example, adding My in front of Facebook would achieve this:
What I know
I've done some work with C and understand that code is written in human-readable form, compiled, and linked into an executable file.
I've done introductory studies of assembly language and understand the concepts of data, commands, and pointers being moved around.
I've written small programs for Windows, Mac and Linux
What I don't know
I don't quite understand the relationship between the operating system and the executable file. I'd guess that when you type in the name of the program and press return you are basically instructing the operating system to "execute" that file, which basically means loading the file into memory, setting the processor's pointer to it, and telling it 'Go!'
I understand why having extra characters in a text string of the binary file would cause problems
What I'd like to know
Why do the extra characters cause the program to break?
What thing determines that the program is broken? The OS? Does the OS also keep this program sandboxed so that it doesn't crash the whole system nowadays?
Is there any way to add in extra characters to a text string of a compiled program via a hex editor and not have the application break?
I don't quite understand the relationship between the operating system and the executable file. I'd guess that when you type in the name of the program and press return you are basically instructing the operating system to "execute" that file, which basically means loading the file into memory, setting the processor's pointer to it, and telling it 'Go!'
Modern operating systems just map the file into memory. They don't bother loading pages of it until it's needed.
Why do the extra characters cause the program to break?
Because they put all the other information in the file in the wrong place, so the loader winds up loading the wrong things. Also, jumps in the code wind up being to the wrong place, perhaps in the middle of an instruction.
What thing determines that the program is broken? The OS? Does the OS also keep this program sandboxed so that it doesn't crash the whole system nowadays?
It depends on exactly what gets screwed up. It may be that you move a header and the loader notices that some parameters in the header have invalid data.
Is there any way to add in extra characters to a text string of a compiled program via a hex editor and not have the application break?
Probably not reliably. At a minimum, you'd need to reliably identify sections of code that need to be adjusted. That can be surprisingly difficult, particularly if someone has attempted to make it so deliberately.
When a program is compiled into machine code, it includes many references to the addresses of instructions and data in the program memory. The compiler determines the layout of all the memory of the program, and puts these addresses into the program. The executable file is also organized into sections, and there's a table of contents at the beginning that contains the number of bytes in each section.
If you insert something into the program, the address of everything after that is shifted up. But the parts of the program that contain references to the program and data locations are not updated, they continue to point to the original addresses. Also, the table that contains the sizes of all the sections is no longer correct, because you increased the size of whatever section you modified.
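A toy model may make this easier to see. The "file format" below is entirely made up (byte 0 holds the offset of a string, byte 1 the offset of a single "code" byte), but it breaks in the same way when a byte is inserted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Insert one byte at position pos, shifting everything after it up by one.
   Returns the new length. The header offsets are deliberately NOT updated. */
size_t insert_byte(uint8_t *buf, size_t len, size_t cap, size_t pos, uint8_t value)
{
    assert(buf != NULL && pos <= len && len < cap);
    memmove(buf + pos + 1, buf + pos, len - pos);
    buf[pos] = value;
    return len + 1;
}

/* Toy "loader": follow the offset stored in header byte 1 to the code byte. */
uint8_t toy_code_byte(const uint8_t *img)
{
    assert(img != NULL);
    return img[img[1]];
}
```

With the image {2, 10, 'F','a','c','e','b','o','o','k', 0xC3}, overwriting 'F' with 'L' leaves toy_code_byte at 0xC3. Inserting 'M' at position 2 shifts the code byte to offset 11, so the loader now fetches 'k' from offset 10 until header byte 1 is patched as well.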
The format of a machine-language executable file is based on hard offsets, rather than on parsing a byte stream (like textual program source code). When you insert a byte somewhere, the file format continues to reference information which follows the insertion point at the original offsets.
Offsets may occur in the file format itself, such as the header which tells the loader where things are located in the file and how big they are.
Hard offsets also occur in machine language itself, such as in instructions which refer to the program's data or in branch instructions.
Suppose an instruction says "branch 200 bytes down from where we are now", and you insert a byte into those 200 bytes (because a character string happens to be there that you want to alter). Oops; the branch still covers 200 bytes.
On some machines, the branch couldn't even be 201 bytes even if you fixed it up because it would be misaligned and cause a CPU exception; you would have to add, say, four bytes to patch it to 204 (along with a myriad other things needed to make the file sane).
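The arithmetic of that last fix-up can be sketched in a few lines. The function names are hypothetical, and this covers only the one forward branch that spans the insertion point, not the myriad other things.

```c
#include <assert.h>
#include <stdint.h>

/* Round an insertion of n bytes up to a multiple of the instruction
   alignment, so that everything after it stays aligned. */
uint32_t padded_insert_size(uint32_t n, uint32_t align)
{
    assert(align != 0);
    return (n + align - 1) / align * align;
}

/* New displacement for a forward branch that jumps over the insertion. */
uint32_t patched_branch(uint32_t old_disp, uint32_t inserted, uint32_t align)
{
    return old_disp + padded_insert_size(inserted, align);
}
```

With 4-byte alignment, inserting a single character turns the 200-byte branch into a 204-byte one: patched_branch(200, 1, 4) gives 204.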