How to calculate the MD5 checksum of a firmware?

I am working on a microcontroller. I have to check that the loaded firmware is the real firmware, so I need to calculate the MD5 checksum of the loaded firmware. But I have run into a problem.
As I understand it, MD5 takes four 32-bit words as input and gives four 32-bit words as output. How do I calculate the checksum of the whole firmware? When I try, the output is the same size as the firmware itself, which wastes too much RAM. Is there any possible way to get a single MD5 checksum of four 32-bit words for the whole firmware?

The easy way: https://www.st.com/en/embedded-software/stm32-cryp-lib.html#overview
The hard way: https://github.com/mikeferguson/stm32/blob/master/libraries/lwip/src/netif/ppp/md5.c
Both work.
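To show why the digest stays at four 32-bit words no matter how large the firmware is, here is a minimal sketch of MD5 (following RFC 1321) with an init/update/final interface. The names `md5_ctx`, `md5_init`, `md5_update` and `md5_final` are made up for this illustration and are not the API of either library linked above. The update step is byte-at-a-time for brevity, not speed.

```c
#include <stddef.h>
#include <stdint.h>

/* Fixed-size context: this is all the RAM the hash needs. */
typedef struct {
    uint32_t state[4];  /* running 128-bit digest (4 x 32 bits) */
    uint64_t total;     /* number of bytes fed so far */
    uint8_t  block[64]; /* buffer for the current 64-byte block */
} md5_ctx;

/* Per-step constants from RFC 1321: floor(2^32 * |sin(i + 1)|). */
static const uint32_t K[64] = {
    0xd76aa478u, 0xe8c7b756u, 0x242070dbu, 0xc1bdceeeu,
    0xf57c0fafu, 0x4787c62au, 0xa8304613u, 0xfd469501u,
    0x698098d8u, 0x8b44f7afu, 0xffff5bb1u, 0x895cd7beu,
    0x6b901122u, 0xfd987193u, 0xa679438eu, 0x49b40821u,
    0xf61e2562u, 0xc040b340u, 0x265e5a51u, 0xe9b6c7aau,
    0xd62f105du, 0x02441453u, 0xd8a1e681u, 0xe7d3fbc8u,
    0x21e1cde6u, 0xc33707d6u, 0xf4d50d87u, 0x455a14edu,
    0xa9e3e905u, 0xfcefa3f8u, 0x676f02d9u, 0x8d2a4c8au,
    0xfffa3942u, 0x8771f681u, 0x6d9d6122u, 0xfde5380cu,
    0xa4beea44u, 0x4bdecfa9u, 0xf6bb4b60u, 0xbebfbc70u,
    0x289b7ec6u, 0xeaa127fau, 0xd4ef3085u, 0x04881d05u,
    0xd9d4d039u, 0xe6db99e5u, 0x1fa27cf8u, 0xc4ac5665u,
    0xf4292244u, 0x432aff97u, 0xab9423a7u, 0xfc93a039u,
    0x655b59c3u, 0x8f0ccc92u, 0xffeff47du, 0x85845dd1u,
    0x6fa87e4fu, 0xfe2ce6e0u, 0xa3014314u, 0x4e0811a1u,
    0xf7537e82u, 0xbd3af235u, 0x2ad7d2bbu, 0xeb86d391u
};

/* Left-rotation amounts for the four rounds (RFC 1321). */
static const uint8_t S[4][4] = {
    {7, 12, 17, 22}, {5, 9, 14, 20}, {4, 11, 16, 23}, {6, 10, 15, 21}
};

static uint32_t rotl32(uint32_t x, unsigned n) {
    return (x << n) | (x >> (32u - n));
}

/* Mix one 64-byte block into the running state. */
static void md5_block(uint32_t st[4], const uint8_t *p) {
    uint32_t m[16], a = st[0], b = st[1], c = st[2], d = st[3];
    for (int i = 0; i < 16; i++) /* little-endian word load */
        m[i] = (uint32_t)p[4 * i] | ((uint32_t)p[4 * i + 1] << 8) |
               ((uint32_t)p[4 * i + 2] << 16) | ((uint32_t)p[4 * i + 3] << 24);
    for (int i = 0; i < 64; i++) {
        uint32_t f, t;
        int g;
        if (i < 16)      { f = (b & c) | (~b & d); g = i; }
        else if (i < 32) { f = (d & b) | (~d & c); g = (5 * i + 1) & 15; }
        else if (i < 48) { f = b ^ c ^ d;          g = (3 * i + 5) & 15; }
        else             { f = c ^ (b | ~d);       g = (7 * i) & 15; }
        t = d; d = c; c = b;
        b += rotl32(a + f + K[i] + m[g], S[i >> 4][i & 3]);
        a = t;
    }
    st[0] += a; st[1] += b; st[2] += c; st[3] += d;
}

void md5_init(md5_ctx *c) {
    c->state[0] = 0x67452301u; c->state[1] = 0xefcdab89u;
    c->state[2] = 0x98badcfeu; c->state[3] = 0x10325476u;
    c->total = 0;
}

/* Feed any number of bytes; call repeatedly, one firmware chunk at a time. */
void md5_update(md5_ctx *c, const uint8_t *data, size_t len) {
    for (size_t i = 0; i < len; i++) {
        c->block[c->total++ & 63u] = data[i];
        if ((c->total & 63u) == 0) md5_block(c->state, c->block);
    }
}

/* Append the padding and message length, then write the 16-byte digest. */
void md5_final(md5_ctx *c, uint8_t out[16]) {
    uint64_t bits = c->total * 8u;
    uint8_t pad = 0x80, len_le[8];
    md5_update(c, &pad, 1);
    pad = 0;
    while ((c->total & 63u) != 56u) md5_update(c, &pad, 1);
    for (int i = 0; i < 8; i++) len_le[i] = (uint8_t)(bits >> (8 * i));
    md5_update(c, len_le, 8);
    for (int i = 0; i < 16; i++)
        out[i] = (uint8_t)(c->state[i >> 2] >> (8 * (i & 3)));
}
```

On the target, one would presumably call `md5_init` once, call `md5_update` for each chunk of firmware as it is read from flash or received, and call `md5_final` at the end. The context is under 100 bytes regardless of image size.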

Related

How does a bootloader know the "expected" CRC value?

Let's say we have some firmware, and a bootloader. When we flash both onto the device, during boot, the bootloader would know some "expected" CRC from the binary firmware image. The bootloader would compare the expected vs. actually calculated CRC value from the binary firmware image. If they're equal, it jumps to the firmware application startup address, and if not, it just stays in the bootloader.
What I'm confused by is how the bootloader would know some "expected" CRC value. How does a discrepancy arise between an incorrect CRC value and the expected one? And where does the "expected" one come from?
I use two methods:
1. The CRC is stored somewhere in the binary image. The bootloader calculates the CRC of the image and compares it with that value. If they match, the image is good and can be executed.
2. The same CRC is always used, and some additional data is appended to the image to make it match this CRC. This requires slightly more complicated post-build steps.
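The first method can be sketched as follows. The layout (payload followed by its CRC-32 in little-endian order) and the names `image_append_crc` and `image_is_valid` are assumptions made for this example; a real project would use whatever layout its build scripts and bootloader agree on.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 as used by zip: reflected, polynomial 0xEDB88320,
 * initial value and final XOR both 0xFFFFFFFF. */
static uint32_t crc32_zip(const uint8_t *p, size_t n) {
    uint32_t crc = 0xFFFFFFFFu;
    while (n--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Post-build step (host side): append the CRC to the payload.
 * buf must have room for len + 4 bytes. Returns the new length. */
size_t image_append_crc(uint8_t *buf, size_t len) {
    uint32_t crc = crc32_zip(buf, len);
    for (int i = 0; i < 4; i++)
        buf[len + i] = (uint8_t)(crc >> (8 * i));
    return len + 4;
}

/* Bootloader side: recompute the CRC over the payload and compare it
 * with the value stored in the last four bytes. */
int image_is_valid(const uint8_t *image, size_t total_len) {
    if (total_len < 4) return 0;
    size_t n = total_len - 4;
    uint32_t stored = (uint32_t)image[n] | ((uint32_t)image[n + 1] << 8) |
                      ((uint32_t)image[n + 2] << 16) | ((uint32_t)image[n + 3] << 24);
    return crc32_zip(image, n) == stored;
}
```

So the "expected" value is simply produced at build time and travels inside the image; the bootloader never needs to know it in advance.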
CRCs are calculated following a well-known formula, so the bootloader applies that formula to the full boot record to get what is called the actual value (depending on the software and the CRC algorithm, the calculation may or may not include the CRC field carried in the data) and compares it with the expected value, which is the code that comes in the data.
Other programs store in the CRC field a value derived from the calculated CRC that forces the algorithm to return a fixed value (which one depends on the algorithm, but it is always the same, e.g. zero). This simplifies the check: the algorithm just calculates the CRC over the full data, including the CRC field, and if the data has not been modified the result is that fixed value.
If you are dealing with some established protocol that defines a CRC algorithm to calculate and verify the data in a boot record, you need to look for the documentation on how the CRC is calculated and stored in it. So your expected value will be described there.
With respect to CRCs, some algorithms initialize the CRC machine so that the start of the stream can be distinguished from a sequence of leading zeros, by initializing the register with (or, equivalently, prepending to the bit string) a fixed string of ones. This is easily implemented by initializing the shift register with ones and then feeding the block contents into it. Others append a trailing field of zeros and XOR the calculated CRC into that field, so that the whole chain of bits always yields an expected CRC of all zeros. You need to consult the firmware provider to see how the boot record CRC is calculated, as the device will most probably refuse to boot until the CRC is properly set.
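Both effects can be checked with a small experiment. The sketch below uses the zip-style CRC-32 as an example algorithm (an assumption; the boot record in question may use a different one). The function name `crc32_run` is made up for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Reflected CRC-32 (polynomial 0xEDB88320) with selectable initial
 * value and final XOR, so different conventions can be compared. */
uint32_t crc32_run(uint32_t init, uint32_t xorout, const uint8_t *p, size_t n) {
    uint32_t crc = init;
    while (n--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ xorout;
}

/* CRC over "data followed by its own CRC (little-endian)".
 * buf must have room for len + 4 bytes. */
uint32_t crc32_residue(uint8_t *buf, size_t len) {
    uint32_t crc = crc32_run(0xFFFFFFFFu, 0xFFFFFFFFu, buf, len);
    for (int i = 0; i < 4; i++)
        buf[len + i] = (uint8_t)(crc >> (8 * i));
    return crc32_run(0xFFFFFFFFu, 0xFFFFFFFFu, buf, len + 4);
}
```

With these parameters, `crc32_residue` returns the same constant for any unmodified input, which is the "fixed expected value" scheme described above. Calling `crc32_run` with an initial value of 0 gives the same result for `{0x12}` and `{0x00, 0x00, 0x12}`, while an initial value of all ones tells them apart.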

Getting CRC-32 over STM32 flash and consistency with other CRC-32 tools

I'm moving my STM32F1xx project to a solution with a bootloader.
Part of this is that I want to be able to compute a CRC value over the existing bootloader and application flash ranges to compare existing and possible upload candidates.
Using a simple implementation on the STM32 which just does the following steps:
1. Enable the CRC peripheral
2. Reset the peripheral CRC value (sets it to 0xFFFFFFFF)
3. Iterate over the flash range (in this case 0x08000000 to 0x08020000), passing values to the CRC peripheral
4. Return the CRC peripheral output
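uint32_t get_crc(void) {
    RCC->AHBENR |= RCC_AHBENR_CRCEN;   /* enable the CRC peripheral clock */
    CRC->CR |= CRC_CR_RESET;           /* reset the CRC value to 0xFFFFFFFF */
    /* feed every 32-bit word of the flash range to the peripheral */
    for (uint32_t *n = (uint32_t *)FLASH_BASE; n < (uint32_t *)(FLASH_BANK1_END + 1u); n++) {
        CRC->DR = *n;
    }
    return CRC->DR;                    /* read back the accumulated CRC */
}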
The value I get from this is 0x0deddeb3.
To compare this value with something I am running the .bin file through two tools.
The value I get from npm's crc-32 is 0x776b0ea2
The value I get from a zip file's CRC-32 is also 0x776b0ea2
What could be causing this? Is there a difference between iterating over the entire flash range and the contents of the bin file (smaller than entire flash range)? The polynomial used by the STM32 is 0x04c11db7 which seems to be fairly standard for a CRC-32. Would the zip tool and npm crc-32 be using a different polynomial?
I have also tried iterating over bytes and half-words as well as words on the STM32 in case the other tools used a different input format.
There is a similar question on here already, but I'm hoping to use a node-js solution because that is the platform my interface application is being developed on.
Calculating CRCs is a minefield. Your question already has some points to look at:
"Is there a difference between iterating over the entire flash range and the contents of the bin file (smaller than entire flash range)?"
Yes, of course.
"Would the zip tool and npm crc-32 be using a different polynomial?"
The documentation will tell you. And I'm sure that you can use another polynomial with these tools via an option.
Anyway, these are the things to consider when calculating CRCs:
1. The number of bytes (words, ...) to "sum up".
2. The contents of the flash not covered by the binary file, most probably all bits set to 1.
3. Width of the polynomial (in your case fixed to 32 bits).
4. Value of the polynomial.
5. Initial value for the register.
6. Whether the bits of each byte are reflected before being processed.
7. Whether the algorithm feeds input bytes through the register or XORs them with a byte from one end and then straight into the table.
8. Whether the final register value should be reversed (as in reflected versions).
9. Value to XOR with the final register value.
Points 3 to 9 are shamelessly copied from "A PAINLESS GUIDE TO CRC ERROR DETECTION ALGORITHMS", which I recommend reading.
The polynomial is only one of several parameters that define a CRC. In this case the CRC is not reflected, whereas the standard zip CRC, using the same polynomial is reflected. Also that zip CRC is exclusive-or'ed with 0xffffffff at the end, whereas yours isn't. They do both get initialized the same, which is with 0xffffffff.
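The differences listed above can be bridged in software. The sketch below models the STM32F1 CRC unit on the host (`stm32_crc_word`, a made-up name) and shows one way to get the zip value out of it: bit-reverse each input word, then bit-reverse the result and XOR it with 0xFFFFFFFF. It assumes the words are read from memory in little-endian order, as they are on the STM32, and that the length is a multiple of four bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Reverse the bit order of a 32-bit word. */
static uint32_t rbit32(uint32_t x) {
    uint32_t r = 0;
    for (int i = 0; i < 32; i++, x >>= 1)
        r = (r << 1) | (x & 1u);
    return r;
}

/* Software model of one write to the CRC data register: polynomial
 * 0x04C11DB7, 32 bits at a time, MSB first, no reflection. */
static uint32_t stm32_crc_word(uint32_t crc, uint32_t w) {
    crc ^= w;
    for (int i = 0; i < 32; i++)
        crc = (crc & 0x80000000u) ? (crc << 1) ^ 0x04C11DB7u : (crc << 1);
    return crc;
}

/* zip-compatible CRC-32 computed through the STM32-style engine. */
uint32_t zip_crc_via_stm32(const uint32_t *words, size_t n_words) {
    uint32_t crc = 0xFFFFFFFFu;              /* value after a CRC reset */
    for (size_t i = 0; i < n_words; i++)
        crc = stm32_crc_word(crc, rbit32(words[i]));
    return rbit32(crc) ^ 0xFFFFFFFFu;
}

/* Reference: the byte-wise reflected CRC-32 that zip tools compute. */
uint32_t crc32_zip(const uint8_t *p, size_t n) {
    uint32_t crc = 0xFFFFFFFFu;
    while (n--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}
```

On the device the same idea would presumably look like `CRC->DR = __RBIT(*n);` inside the loop and `return __RBIT(CRC->DR) ^ 0xFFFFFFFF;` at the end, using the CMSIS `__RBIT` intrinsic. The loop would also need to cover only the length of the .bin file (or the host tool would need to pad the file to the full flash size with 0xFF) for the two values to agree.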

Random formula based on 15 seeds

I am working on my university degree and I got stuck on a random function.
I am using a microcontroller which has no configured clock. So, I decided to use ADC (analog-to-digital conversion) readings as seeds for my random function.
So I have 15 two-byte variables which store some 'random' values. The conversion result is not always the same, and the difference is in the LSBs (the last bit in my case): e.g. the value of an ADC read is 700 now, 701 in 5 ms, then back to 700, then 702, etc. So I was thinking of building a random function which uses, let's say, the last 4 bits of those variables.
My question is: Can you give me an example of a good random formula?
Like ( Variable1 >> 4 ) ^ ( Variable2 << 4 ) and so on ...
I want to be able to obtain a fairly random 1-byte number (this is the best case). It will be used in an RSA algorithm which I have already implemented (I have a big look-up table of prime numbers, and I need 2 random numbers from that table).
Usually a cryptographic hash function like SHA or MD5 is used for this purpose. As long as your input data contains enough entropy, you will get a random output. See https://en.wikipedia.org/wiki/Entropy_(computing)
However, that may be a little too much work for your use case. If you only need 8 bits, you could use an 8-bit cyclic redundancy code (CRC). It will have similar properties -- since any 8 of your input bits can be used to completely determine the output, the output will be random as long as at least 8 of your input bits are random. See http://www.sunshine2k.de/articles/coding/crc/understanding_crc.html
That will do what you ask for... but beware! It sounds like you are writing a completely insecure implementation of RSA. Under no circumstances could you use only 8 bits of randomness to securely generate an RSA key.
If you think that the LS bit of every word is truly random (which is likely), and if they are uncorrelated, pack 8 LS bits into 1 byte. There is no use for the remaining 15 x 16 - 8 bits.
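Both suggestions fit in a few lines. This is a minimal sketch, assuming the 15 readings are available as an array of `uint16_t`; the function names are made up, and CRC-8 with polynomial 0x07 is just one possible choice of 8-bit CRC.

```c
#include <stddef.h>
#include <stdint.h>

#define N_SAMPLES 15  /* the 15 two-byte ADC readings from the question */

/* Option A: use the least significant bit of the first 8 samples,
 * sample i providing bit i of the result. */
uint8_t pack_lsbs(const uint16_t s[N_SAMPLES]) {
    uint8_t out = 0;
    for (int i = 0; i < 8; i++)
        out |= (uint8_t)((s[i] & 1u) << i);
    return out;
}

/* One CRC-8 step: polynomial x^8 + x^2 + x + 1 (0x07), MSB first. */
uint8_t crc8_update(uint8_t crc, uint8_t byte) {
    crc ^= byte;
    for (int k = 0; k < 8; k++)
        crc = (uint8_t)((crc & 0x80u) ? (crc << 1) ^ 0x07u : (crc << 1));
    return crc;
}

/* Option B: fold all 30 sample bytes into one byte with CRC-8. */
uint8_t crc8_mix(const uint16_t s[N_SAMPLES]) {
    uint8_t crc = 0;
    for (int i = 0; i < N_SAMPLES; i++) {
        crc = crc8_update(crc, (uint8_t)(s[i] & 0xFFu));
        crc = crc8_update(crc, (uint8_t)(s[i] >> 8));
    }
    return crc;
}
```

Neither function creates randomness; each only condenses whatever entropy the readings already contain, so the warning above about key generation still applies.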

Array storage and multiplication in Verilog

I have a peripheral connected to my Altera FPGA and am able to read data from it using SPI. I would like to store this incoming data in an array, preferably as floating-point values. Further, I have a CSV file on my computer and want to store that data in another array, and then, after triggering a 'start' signal, multiply both arrays and send the output via RS-232 to my PC. Any suggestions on how to go about this? The code for reading data from the peripheral is as follows:
// we sample on negative edge of clock
always @(negedge SCL)
begin
    // data comes as MSB first.
    MOSI_reg[31:0] <= {MOSI_reg[30:0], MOSI}; // left shift for MOSI data
    MISO_reg[31:0] <= {MISO_reg[30:0], MISO}; // left shift for MISO data
end
thank you.
A 1024x28 matrix with 32-bit elements requires 917504 bits of RAM in your FPGA, plus another 28*32 = 896 bits for the SPI data. Multiplying the two will result in a vector of 1024x1 elements, which adds 32768 bits for the result. This sums to 951168 bits that you will need in your device. Does your FPGA chip have this much memory?
Assuming it does, yes: you can instantiate a ROM inside your design and initialize it with $readmemh or $readmemb (for values in hexadecimal or binary form, respectively).
If precision is not an issue, go for fixed point, as implementing multiplication and addition in floating point is quite a hard job.
You then need an FSM to fill your source vector with SPI data, do the multiplication and store the result in your destination vector. You may consider instantiating a small processor to do the job more easily.
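Before writing the hardware, it can help to have a software reference model of the fixed-point arithmetic to compare simulation results against. The sketch below is such a model in C, assuming a Q16.16 format (16 integer bits, 16 fractional bits); the format and all names are choices made for this example, not something the question specifies.

```c
#include <stddef.h>
#include <stdint.h>

typedef int32_t q16_16;  /* signed fixed point, 16 fractional bits */

/* Convert a real value to Q16.16 (truncating toward zero). */
q16_16 q_from_double(double x) {
    return (q16_16)(x * 65536.0);
}

/* Matrix (rows x cols, row-major) times vector (cols) in fixed point.
 * Products are accumulated at full 64-bit precision and scaled back
 * once per row, which is what a multiply-accumulate unit would do.
 * Note: >> on a negative value is assumed to be an arithmetic shift,
 * which holds on common compilers but is implementation-defined in C. */
void q_matvec(const q16_16 *mat, const q16_16 *vec, q16_16 *out,
              size_t rows, size_t cols) {
    for (size_t r = 0; r < rows; r++) {
        int64_t acc = 0;
        for (size_t c = 0; c < cols; c++)
            acc += (int64_t)mat[r * cols + c] * (int64_t)vec[c];
        out[r] = (q16_16)(acc >> 16);
    }
}
```

For the sizes mentioned above, `rows` would be 1024 and `cols` 28. The inner loop corresponds to the multiply-accumulate state of the FSM, and the outer loop to stepping through the rows of the ROM.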
Multiplication is non-trivial in hardware, and 'assign c = a*b' is not necessarily going to produce what you want.
If your FPGA has DSP blocks, you can use one of Altera's customizable IP cores to do your multiplication in a DSP block. If not, you can still use an IP core to tune the multiplier the way you want (with regards to signed/unsigned, latency, etc.) and likely produce a better result.

Checksum with low probability of false negative

At the moment I'm using a simple checksum scheme that just adds the words in a buffer. Firstly, what is the probability of a false negative, that is, of the receiving system calculating the same checksum as the sending system even though the data is different (corrupted)?
Secondly, how can I reduce the probability of false negatives? What is the best checksumming scheme for that? Note that each word in the buffer is 64 bits (8 bytes) in size, that is, a long variable on a 64-bit system.
Assuming a sane checksum implementation, the probability of a randomly chosen input string colliding with a reference input string is 1 in 2^n, where n is the checksum length in bits.
However, if you're talking about input that differs from the original by a low number of bits, then the probability of collision is generally much, much lower.
One possibility is to have a look at T. Maxino's thesis titled "The Effectiveness of Checksums for Embedded Networks" (PDF), which contains an analysis for some well-known checksums.
However, usually it is better to go with CRCs, which have additional benefits, such as detection of burst errors.
For these, P. Koopman's paper "Cyclic Redundancy Code (CRC) Selection for Embedded Networks" (PDF) is a valuable resource.
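A concrete weakness of a plain additive checksum is that it ignores the order of the words. The sketch below, with made-up function names and the zip-style CRC-32 as an example CRC, shows a buffer with two words swapped: the sum is unchanged, while the CRC differs.

```c
#include <stddef.h>
#include <stdint.h>

/* The scheme from the question: add up all 64-bit words. */
uint64_t sum64(const uint64_t *w, size_t n) {
    uint64_t s = 0;
    for (size_t i = 0; i < n; i++)
        s += w[i];
    return s;
}

/* CRC-32 (reflected, polynomial 0xEDB88320) over the same words,
 * taking the bytes of each word in little-endian order. */
uint32_t crc32_words(const uint64_t *w, size_t n) {
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < n; i++) {
        for (int b = 0; b < 8; b++) {
            crc ^= (uint8_t)(w[i] >> (8 * b));
            for (int k = 0; k < 8; k++)
                crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;
}
```

For example, the buffers `{1, 2, 3}` and `{2, 1, 3}` have the same `sum64` but different `crc32_words` values. The same goes for compensating errors such as `{0, 3, 3}`, where one word lost exactly what another gained.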
