I am implementing GHASH as part of an AES-GCM implementation, and I need to implement the following definition:
where v is the bit length of the final block of A, u is the bit length of the final block of C, and || denotes concatenation of bit strings.
How can I pad the final block of A with zeros from bit v up to 128 bits, given that I do not know the total length of A?
My attempt just takes the block and XORs it with a 128-bit array of zeros:
void GHASH(uint8_t H[16], uint8_t len_A, uint8_t A_i[len_A], uint8_t len_C,
uint8_t C_i[len_C], uint8_t X_i[16]) {
uint8_t m;
uint8_t n;
uint8_t i;
uint8_t j;
uint8_t zeros[16] = {0};
if (i == m + n) {
for(j=16; j>=0; j--){
C_i[j] = C_i[j] ^ zeros[j]; //XOR with zero array to fill in 0 of length 128-u
tmp[j] = X_i[j] ^ C_i[j]; // X[m+n+1] XOR C[i] left shift by (128bit-u) and store into tmp
gmul(tmp, H, X_i); //Do Multiplication of tmp to H and store into X
}
}
I am pretty sure that I am not correct. But I have no idea how to do it.
It seems to me that you've got several issues here, and conflating them is a big part of the problem. It'll be much easier when you separate them.
First: declaring parameters of the form uint8_t len_A, uint8_t A_i[len_A] won't give you what you want. The array parameter is adjusted to a pointer, so you're actually getting uint8_t len_A, uint8_t * A_i, and the real length of A_i is determined by how it was declared on the level above, not by how you tried to pass it in. (Note that uint8_t * A and uint8_t A[] are functionally identical in a parameter list; the difference is mostly syntactic sugar for the programmer.)
On the level above, since I don't know if it was declared by malloc() or on the stack, I'm not going to get fancy with memory management issues. I'm going to use local storage for my suggestion.
Unit clarity: You've got a bad case going on here: bit vs. byte vs. block length. Without knowing the core algorithm, it appears to me that the undeclared m & n are block lengths of A & C; i.e., A is m blocks long, and C is n blocks long, and in both cases the last block is not required to be full length. You're passing in len_A & len_C without telling us (or using them in code so we can see) whether they're the bit length u/v, the byte length of A_i/C_i, or the total length of A/C, in bits or bytes or blocks. Based on the (incorrect) declaration, I'm assuming they're the length of A_i/C_i in bytes, but it's not obvious... nor is it the obvious thing to pass. By the name, I would have guessed it to be the length of A/C in bits. Hint: if your units are in the names, it becomes obvious when you try to add bitLenA to byteLenB.
Iteration control: You appear to be passing in 16-byte blocks for the i'th iteration, but not passing in i. Either pass in i, or pass in the full A & C instead of A_i & C_i. You're also using m & n without setting them or passing them in; the same issue applies. I'll just pretend they're all correct at the moment of use and let you fix that.
Finally, I don't understand the summation notation for the i=m+n+1 case, in particular how len(A) & len(C) are treated, but you're not asking about that case so I'll ignore it.
Given all that, let's look at your function:
void GHASH(uint8_t H[], uint8_t len_A, uint8_t A_i[], uint8_t len_C, uint8_t C_i[], uint8_t X_i[]) {
uint8_t tmpAC[16] = {0};
uint8_t tmp[16];
uint8_t * pAC = tmpAC;
uint8_t j; // loop index; i, m and n are still assumed to come from the caller
if (i == 0) { // Initialization case
for (j=0; j<16; ++j) { // X is always a full 16-byte block
X_i[j] = 0;
}
return;
} else if (i < m) { // Use the input memory for A
pAC = A_i;
} else if (i == m) { // Use temp memory init'ed to 0; copy in A as far as it goes
for (j=0; j<len_A; ++j) {
pAC[j] = A_i[j];
}
} else if (i < m+n) { // Use the input memory for C
pAC = C_i;
} else if (i == m+n) { // Use temp memory init'ed to 0; copy in C as far as it goes
for (j=0; j<len_C; ++j) {
pAC[j] = C_i[j];
}
} else if (i == m+n+1) { // Do something unclear to me. Maybe this?
// Use temp memory init'ed to 0; copy in len(A) & len(C)
pAC[0] = len_A; // in blocks? bits? bytes?
pAC[1] = len_C; // in blocks? bits? bytes?
}
for (j = 0; j < 16; ++j) {
tmp[j] = X_i[j] ^ pAC[j]; // XOR the running hash X with the current block of A or C
}
gmul(tmp, H, X_i); // Multiply tmp by H once per block and store the result in X
}
We only copy memory in the last block of A or C, and use local memory for the copy. Most blocks are handled with a single pointer copy to point to the correct bit of input memory.
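The zero-padding step on its own can be sketched as a small helper. This is only an illustration, assuming the lengths are whole bytes rather than arbitrary bit counts; the names pad_block and GHASH_BLOCK_BYTES are made up for this example and do not come from the question.

```c
#include <stdint.h>
#include <string.h>

#define GHASH_BLOCK_BYTES 16   /* hypothetical name for the 128-bit block size */

/* Copy `len` bytes (0..16) of a final, possibly partial block into a
 * 16-byte buffer and fill the remainder with zeros. */
static void pad_block(uint8_t out[GHASH_BLOCK_BYTES],
                      const uint8_t *in, size_t len)
{
    if (len > GHASH_BLOCK_BYTES) {
        len = GHASH_BLOCK_BYTES;   /* defensive clamp; callers should not rely on it */
    }
    memset(out, 0, GHASH_BLOCK_BYTES);   /* the zero padding */
    memcpy(out, in, len);                /* the real data, as far as it goes */
}
```

The padded buffer can then be XORed with X and multiplied by H exactly like a full block, so no separate code path is needed for the last block.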
if you don't care about every little bit of efficiency (i assume this is to experiment, and not for real use?) just reallocate and pad (in practice, you could round up and calloc when you first declare these):
size_t round16(size_t n) {
// if n isn't a multiple of 16, round up to next multiple
if (n % 16) return 16 * (1 + n / 16);
return n;
}
size_t realloc16(uint8_t **data, size_t len) {
// if len isn't a multiple of 16, extend with 0s to next multiple
size_t n = round16(len);
*data = realloc(*data, n);
for (size_t i = len; i < n; ++i) (*data)[i] = 0;
return n;
}
void xor16(uint8_t *result, uint8_t *a, uint8_t *b) {
// 16 byte xor
for (size_t i = 0; i < 16; ++i) result[i] = a[i] ^ b[i];
}
void xorandmult(uint8_t *x, uint8_t *data, size_t n, uint8_t *h) {
// run along the length of the (extended) data, xoring and multiplying
uint8_t tmp[16];
for (size_t i = 0; i < n / 16; ++i) {
xor16(tmp, x, data+i*16);
multgcm(x, h, tmp);
}
}
void ghash(uint8_t *x, uint8_t **a, size_t len_a, uint8_t **c, size_t len_c, uint8_t *h) {
size_t m = realloc16(a, len_a);
xorandmult(x, *a, m, h);
size_t n = realloc16(c, len_c);
xorandmult(x, *c, n, h);
// then handle lengths
}
uint8_t x[16] = {0};
ghash(x, &a, len_a, &c, len_c, h);
disclaimer - no expert, just skimmed the spec. code uncompiled, unchecked, and not intended for "real" use. also, the spec supports arbitrary (bit) lengths, but i assume you're working in bytes.
also, i am still not sure i am answering the right question.
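For the "then handle lengths" step, here is a rough sketch of building the final block. It rests on my reading of the GCM specification, which should be verified: the last GHASH input is the bit length of A as a 64-bit big-endian value followed by the bit length of C in the same form. The helper names store_be64 and make_length_block are invented for this example, and the inputs are assumed to be byte lengths.

```c
#include <stdint.h>

/* Store a 64-bit value into 8 bytes, most significant byte first. */
static void store_be64(uint8_t out[8], uint64_t v)
{
    for (int k = 7; k >= 0; --k) {
        out[k] = (uint8_t)(v & 0xFF);
        v >>= 8;
    }
}

/* Build the final 16-byte GHASH input block from the byte lengths of A and C
 * (assumed layout: len(A) in bits, then len(C) in bits, each 64-bit big-endian). */
static void make_length_block(uint8_t out[16], uint64_t a_bytes, uint64_t c_bytes)
{
    store_be64(out, a_bytes * 8);      /* len(A) in bits */
    store_be64(out + 8, c_bytes * 8);  /* len(C) in bits */
}
```

The resulting block would go through the same XOR-and-multiply step as every other block.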
I've been reading up on the use of pointers and on allocating memory for embedded projects. I must admit that I perhaps don't understand it fully, as I can't seem to figure out where my problem lies.
My two functions are supposed to take 4 float values and return 16 bytes that represent them, in order to transfer them through SPI. It works great, but only for a minute, before the program crashes and my SPI and I2C die, lol.
Here are the functions:
/*Function that wraps a float value, by allocating memory and casting pointers.
Returns 4 bytes that represents input float value f.*/
typedef char byte;
byte* floatToByteArray(float f)
{
byte* ret = malloc(4 * sizeof(byte));
unsigned int asInt = *((int*)&f);
int i;
for (i = 0; i < 4; i++) {
ret[i] = (asInt >> 8 * i) & 0xFF;
}
return ret;
memset(ret, 0, 4 * sizeof(byte)); //Clear allocated memory, to avoid taking all memory
free(ret);
}
/*Takes a list of 4 quaternions, and wraps every quaternion in 4 bytes.
Returns a 16 element byte list for SPI transfer, that effectively contains the 4 quaternions*/
void wrap_quaternions(float Quaternion[4], int8_t *buff)
{
uint8_t m;
uint8_t n;
uint8_t k = 0;
for (m = 0; m < 4; m++)
{
for (n = 0; n < 4; n++)
{
byte* asBytes = floatToByteArray(Quaternion[m]);
buff[n+4*k] = asBytes[n];
}
k++;
}
}
The error message I receive afterwards is the following, in the disassembly window of Atmel Studio:
[Atmel Studio screenshot]
You might drop all the dynamic memory allocation completely.
void floatToByteArray(float f, byte buf[4])
{
memcpy(buf, &f, sizeof(f));
}
void wrap_quaternions(float Quaternion[4], int8_t *buff)
{
for (int i = 0; i < 4; i++)
{
floatToByteArray(Quaternion[i], (byte *)&buff[4*i]); // the parameter is named buff, and byte is char
}
}
With this approach you do not need to care about freeing allocated memory after use. It is also much more efficient because dynamic memory allocation is rather expensive.
Gerhardh is correct: the return statement prevents the memory from being released, because the lines after it never run.
If you need to return 4 bytes, you might check if your environment can return a uint32_t or something like that.
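Returning the four bytes as one integer could look roughly like this. It is a sketch that assumes sizeof(float) is 4 on the target; the function name float_to_u32 is made up for the example.

```c
#include <stdint.h>
#include <string.h>

/* Return the object representation of a float as a 32-bit integer.
 * Assumes sizeof(float) == 4, which holds on most but not all targets. */
static uint32_t float_to_u32(float f)
{
    uint32_t u = 0;
    memcpy(&u, &f, sizeof u);   /* copy the bytes; no pointer casts, no malloc */
    return u;
}
```

Nothing is allocated, so there is nothing for the caller to free.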
As already mentioned, the lines below return ret; are never executed. In any case, if you want to return allocated memory from a function (which is fine), you can't free it in the function itself; it has to be freed by the caller when it isn't needed anymore. So your calling function should look like this:
/*Takes a list of 4 quaternions, and wraps every quaternion in 4 bytes.
Returns a 16 element byte list for SPI transfer, that effectively contains the 4 quaternions*/
void wrap_quaternions(float Quaternion[4], int8_t *buff)
{
uint8_t m;
uint8_t n;
uint8_t k = 0;
for (m = 0; m < 4; m++)
{
byte* asBytes = floatToByteArray(Quaternion[m]); // no need to call it for every n
for (n = 0; n < 4; n++)
{
buff[n+4*k] = asBytes[n];
}
free(asBytes); // asBytes is no longer needed and can be free()d
k++;
}
}
regarding:
buff[n+4*k] = asBytes[n];
This results in:
buff[0] << asBytes[0] // from first call to `byte* floatToByteArray(float f)`
buff[1] << asBytes[1] // from second call to `byte* floatToByteArray(float f)`
buff[2] << asBytes[2] // from third call to `byte* floatToByteArray(float f)`
buff[3] << asBytes[3] // from fourth call to `byte* floatToByteArray(float f)`
That is four allocations per float, none of which is ever freed. Most of the above problem can be fixed by calling the function once per float and using memcpy() to copy the 4 bytes from asBytes[] to buff[], similar to:
memcpy( &buff[ m*4 ], asBytes, 4 );
Of course, there is also the consideration: Is the length of a float, on your hardware/compiler actually 4 bytes.
'Magic' numbers are bare numeric literals with no stated basis, such as the 4 here; they make the code much more difficult to understand and debug. I suggest using something like length = sizeof( float ); and then using length everywhere 4 currently stands for the size of a float. For the number of entries in the Quaternion[] array, I strongly suggest placing #define arraySize 4 early in your code and using arraySize each time the code references the number of elements in the array.
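Putting the memcpy() and named-constant suggestions together might look like the sketch below. The names QUATERNION_COUNT, FLOAT_BYTES and wrap_quaternions_memcpy are invented for the example, and the buffer is taken as uint8_t rather than the int8_t used in the question.

```c
#include <stdint.h>
#include <string.h>

#define QUATERNION_COUNT 4              /* elements in the quaternion array */
#define FLOAT_BYTES      sizeof(float)  /* bytes per element, not a literal 4 */

/* Copy each float's bytes into buff, in native byte order.
 * buff must hold QUATERNION_COUNT * FLOAT_BYTES bytes. */
static void wrap_quaternions_memcpy(const float q[QUATERNION_COUNT], uint8_t *buff)
{
    for (size_t m = 0; m < QUATERNION_COUNT; ++m) {
        memcpy(&buff[m * FLOAT_BYTES], &q[m], FLOAT_BYTES);
    }
}
```

Because nothing is allocated, the leak disappears along with the malloc()/free() bookkeeping.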
I'm trying to iteratively copy an unsigned char array to a uint32_t variable (in 4-byte blocks), perform some operation on the uint32_t variable, and copy it back to the unsigned char array.
Here's my code:
unsigned char byteArray[len]
for (int i=0; i<len; i+=4) {
uint32_t tmpInt = 0;
memcpy(&tmpInt, byteArray+(i*4), sizeof(uint32_t));
// do some operation on tmpInt here
memcpy((void*)(byteArray+(i*4)), &tmpInt, sizeof(uint32_t));
}
It doesn't work though. What's wrong, and how can I achieve what I want to do?
The problem is that you are adding 4 to i with each iteration and multiplying by 4. You should be using byteArray + i.
Also, as @WeatherVane pointed out below, your loop would be more consistent with a sizeof():
for (int i = 0; i < len; i += sizeof(uint32_t)).
As others pointed out, you are doing too much by incrementing i as well as multiplying it by the size of your target.
On top of this:
- the code shown might run into a buffer overflow, reading beyond the source array;
- the sizeof operator evaluates to size_t, not int;
- the code repeats the size of the target independently several times.
Fixing all, the result might look like this:
unsigned char byte_array[len];
typedef uint32_t target_type;
const size_t s = sizeof (target_type);
for (size_t i = 0; i < (len/s)*s; i += s) {
target_type target;
memcpy(&target, byte_array + i, s);
// do some operation on target here
memcpy(byte_array + i, &target, s);
}
To avoid the typedef just define the target outside of the for-loop:
unsigned char byte_array[len];
{
uint32_t target;
const size_t s = sizeof target;
for (size_t i = 0; i < (len/s)*s; i += s) {
memcpy(&target, byte_array + i, s);
// do some operation on target here
memcpy(byte_array + i, &target, s);
}
}
An equivalent to
byte_array + i
would be
&byte_array[i]
which might be more intuitive to read.
To avoid the "strange" (len/s)*s one could step away from using an index at all, but use a pointer instead:
for (unsigned char *p = byte_array; p < byte_array + len; p += s) { // assumes len is a multiple of s
memcpy(&target, p, s);
// do some operation on target here
memcpy(p, &target, s);
}
In my opinion this is a more elegant solution.
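For reference, a self-contained version of the index-based loop might look like this. It is a sketch: the function name increment_words and the placeholder operation (adding 1) are my own, and the loop condition i + s <= len is used in place of (len/s)*s to skip an incomplete trailing group.

```c
#include <stdint.h>
#include <string.h>

/* Add 1 to every complete 4-byte group of buf, interpreting each group as a
 * uint32_t in native byte order. Trailing bytes (len % 4) are left untouched. */
static void increment_words(unsigned char *buf, size_t len)
{
    const size_t s = sizeof(uint32_t);
    for (size_t i = 0; i + s <= len; i += s) {
        uint32_t word;
        memcpy(&word, buf + i, s);
        word += 1;                     /* placeholder operation */
        memcpy(buf + i, &word, s);
    }
}
```

Going through memcpy() in both directions sidesteps alignment and aliasing concerns that casting the byte pointer to uint32_t * would raise.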
I'm using a C program on Linux to read data from a serial port.
The data to read comes from Code Composer Studio from the line: UART_writePolling(uartHandle, (uint8_t*) &value, sizeof(float));
value is the float I want to read in C, where value = 1.5.
When I read in the data from the serial port, in C, into a buffer and print with printf("%u\n", (int)buffer[i]);
I get value to be:
0
0
4294967232
63
and when I insert buffer[i] into a.array and print with
printf("%d\n", a.array[i]);
I get value to be:
0
0
-64
63
I've also tried using unions:
unsigned int value = 0;
for (int j = 3; j >= 0; j--){
//value <<= 8;
value = value + (int)a.array[i+8+j];
}
printf("value: %u\n", value);
data.u = value;
printf("(float): %f\n", data.f);
which doesn't give the correct answer.
How can I use union to get the correct data as a float?
Do I need to use <<?
EDIT: better idea of the code
//headers
typedef struct {
int *array;
size_t used;
size_t size;
} Array;
void initArray(Array *a, size_t initialSize) {
a->array = (int *)malloc(initialSize * sizeof(int));
a->used = 0;
a->size = initialSize;
}
... //more functions/code to resize array and free the memory later
union Data {
float f;
unsigned int u;
};
int main(){
union Data data;
//open serial port code
char buffer[1]; /* Buffer to store the data received,
reading one at a time */
Array a;
initArray(&a, 5); /* initialise an array to store the read data
that is read into buffer*/
//while loop to read in data for some amount of time/data
int b_read = 0;
b_read = read(fd, &buffer, sizeof(buffer));
for (int i=0; i<b_read; i++){
printf("%u\n", (int)buffer[i]);
// how the first set of values above were printed
insertArray(&a, buffer[i]);
// also adding the values read to buffer into array a
}
//end while
// close the port
for(int i=0; i<no. of elements in array a; i++){
printf("%d\n", a.array[i]);
// how the second set of results were printed
}
//below is an attempt at using union and <<:
unsigned int value = 0;
for (int j = 3; j >= 0; j--){
//value <<= 8;
value = value + (int)a.array[i+8+j]; //index used is particular to my code, where value is in a specific place in the array
}
printf("value: %u\n", value);
data.u = value;
printf("(float): %f\n", data.f);
//these printfs don't give a reasonable answer
// free memory
return 0;
}
Once the bytes are in buffer starting at offset i, you can reinterpret the bytes as a float with:
float f;
memcpy(&f, buffer+i, sizeof f);
To use a union, you could use:
union { uint32_t u; float f; } x;
x.u = value;
float f = x.f;
However, this requires that value contain all 32 bits that represent the float. When you attempted to construct the value with:
//value <<= 8;
value = value + (int)a.array[i+8+j];
There are two issues. First, value <<= 8 is needed. I presume you tried it first and did not get a correct answer, so you commented it out. However, it is required. Second, this code to insert the bytes one-by-one into value is order-dependent. Once the shift is restored, it will insert greater-addressed bytes into less-significant bits of value. Systems generally arrange bytes in objects in one of two orders: More significant bytes in lower addresses or more significant bytes in greater addresses. We do not know which order your system uses, so we do not know whether your code to insert the greater-addressed bytes in less significant bytes is correct.
Note: The above assumes that the bytes are read and written in the same order, or that issues of endianness have already been handled in other code.
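If the byte order on the wire is known, the value can be assembled explicitly so the result no longer depends on the receiver's own byte order. The sketch below assumes the sender transmits the least significant byte first and that both sides use the same 32-bit float format; the function name float_from_le_bytes is made up for this example.

```c
#include <stdint.h>
#include <string.h>

/* Rebuild a float from 4 bytes that were sent least significant byte first.
 * Assumes a 32-bit float with the same format on both ends. */
static float float_from_le_bytes(const unsigned char b[4])
{
    uint32_t u = (uint32_t)b[0]
               | ((uint32_t)b[1] << 8)
               | ((uint32_t)b[2] << 16)
               | ((uint32_t)b[3] << 24);
    float f;
    memcpy(&f, &u, sizeof f);   /* reinterpret the 32 bits as a float */
    return f;
}
```

Note that the bytes are taken as unsigned char; with plain char or int elements holding negative values, the shifts and ORs would not combine correctly.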
You use printf with %u but cast to an int. So it's not surprising to see this behavior, since 2^32 = 4294967296, and 4294967296 - 64 (your second printf result) = 4294967232 (your first printf result).
Just cast to "unsigned" if you use "%u", or cast to "int" if you use "%d".
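To see the raw byte value between 0 and 255, one option is to convert through unsigned char before widening. This is a small sketch assuming buffer holds plain char, which is signed on many platforms; the helper name byte_value is invented for the example.

```c
/* Convert a raw byte stored in a (possibly signed) char to the range 0..255. */
static unsigned int byte_value(char c)
{
    return (unsigned char)c;   /* drop the sign before widening to unsigned int */
}
```

Printing with printf("%u\n", byte_value(buffer[i])); should then show 192 for the byte that previously appeared as -64 or 4294967232.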
I want to declare a double type array dynamically, so here is my code
void function(int length, ...)
{
...
double *a = malloc(sizeof(double) * length);
memset(a, 1, sizeof(double) * length);
for (int i = 0; i < length; i++)
{
printf("%f", a[i]);
}
...
}
When I pass a length of 2, the code does not print all 1s. It just prints the following:
7.7486e-304
7.7486e-304
So, what should I do to fix it?
memset sets bytes. You're trying to set doubles. Just loop from 0 to length and set each one to 1.0:
for (int i = 0; i < length; i ++)
{
a[i] = 1; // or 1.0 if you want to be explicit
}
You are confusing setting an array and setting the underlying memory that stores an array.
A double is made up of 8 bytes. You are setting each byte that makes up the double to 1.
If you want to initialise each element of the array to 1.0 then you can use a for(;;) loop or since you do seem to be using C++ you can use a container and use a constructor to initialise each element (if the constructor has the ability) or use an algorithm to achieve the same effect.
memset sets every byte of your array to 1 not every int or double element.
You are trying to set double values (maybe 8 or more bytes.) Your approach will only work for the number 0.0 as it happens to be represented with all bytes 0 on systems that use IEEE-754 floating point formats. Note that this would be non portable as the C Standard allows other representations for floating point values.
If a was pointing to an array of integers, your approach would work for 0 and -1 and some special values such as 0x01010101... But it would still be a non portable approach as it would fail or even invoke undefined behavior on exotic architectures with padding bits or non 2s complement integer representation.
The correct way to initialize the array is an explicit loop like this:
for (int i = 0; i < length; i++) {
a[i] = 1.0;
}
The compiler will likely compile this loop into very efficient code.
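The difference between filling bytes and filling elements can be made visible with a short sketch. The helper names fill_doubles and memset_one_pattern are made up for the example, and a uint64_t stands in for the 8 bytes of a typical double.

```c
#include <stdint.h>
#include <string.h>

/* Fill `count` doubles with `value`, one element at a time. */
static void fill_doubles(double *a, size_t count, double value)
{
    for (size_t i = 0; i < count; ++i) {
        a[i] = value;
    }
}

/* Return the bit pattern memset(..., 1, ...) leaves in one 8-byte element. */
static uint64_t memset_one_pattern(void)
{
    uint64_t bits;
    memset(&bits, 1, sizeof bits);   /* every byte becomes 0x01 */
    return bits;
}
```

The pattern is 0x0101010101010101, which as an IEEE-754 double is a tiny positive number rather than 1.0, consistent with the value printed in the question.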
memset sets 1 byte at a time. Because of that, I recommend that you use a custom function to set an array of any data type to a valid value like the following:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
void *g_memset(void *dst, void *val, size_t valSize, size_t count);
int main(void)
{
double x = 1.0;
double Array[50];
g_memset(Array, &x, sizeof(x), 20); /* set the 1st 20 elements to 1.0 */
for (int n = 0; n < 20; n++) {
printf("%.1lf ", Array[n]);
}
putchar('\n');
return 0;
}
void *g_memset(void *dst, void *val, size_t valSize, size_t count)
{
char *ptr = (char *)dst;
while (count-- > 0) {
memcpy(ptr, val, valSize);
ptr += valSize;
}
return dst;
}
You use memset to set every byte of array a. A double variable is typically 8 bytes, so after the memset every byte of each element of a is 1.
The memset function is meant for char arrays.
If you want to initialize your array a, you can use a loop (while/for):
int j;
for(j = 0; j < length; j++)
a[j] = 1;
How do I read 3 bytes from unsigned char buffer at once (as a whole number)?
uint_24 p = *(unsigned char[3])buffer;
The above code doesn't work.
If the buffer can be redefined as part of a union and integer endian is as expected:
union {
unsigned char buffer[3];
uint_24 p;
} x;
foo(x.buffer); // somehow data is loaded into x.buffer
uint_24 destination = x.p; // read: let compiler do the work
By putting into a union, alignment issues are satisfied.
The short answer is: you can't (unless the machine int size is 3 bytes).
Because machines generally have an even number of bytes as their int size (word size, register size), the hardware will typically fetch an even number of bytes from memory over the bus into its registers, or fetch one single byte into a (lower) register. Hence the solutions provided in the comments to your question load a byte, shift it left, load the next byte, and so on. Alternatively, you can fetch a word and AND out the upper byte(s). You must also take the endianness into account. Lastly, not all machines can read ints starting at odd memory addresses; some require them to be aligned to some even multiple.
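The load-and-shift approach can be sketched as follows, with one helper per byte order. The function names are made up for the example, and a uint32_t holds the 24-bit result since standard C has no 24-bit integer type.

```c
#include <stdint.h>

/* Combine 3 bytes into a 24-bit value, first byte most significant. */
static uint32_t read_u24_be(const unsigned char *p)
{
    return ((uint32_t)p[0] << 16) | ((uint32_t)p[1] << 8) | (uint32_t)p[2];
}

/* Combine 3 bytes into a 24-bit value, first byte least significant. */
static uint32_t read_u24_le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16);
}
```

Because each byte is read individually, this works at any address and gives the same result regardless of the host's own byte order.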
You can copy any number of bytes that you want, as follows:
#include <stdio.h>
#include <string.h>
void showbits(int n)
{
int i;
unsigned int k, andmask;
for(i=31;i>=0;i--)
{
andmask = 1u << i; // unsigned shift: 1 << 31 would overflow a signed int
k = (unsigned int)n & andmask;
k == 0 ? printf("0") : printf("1");
}
printf("\n");
}
int main()
{
unsigned char buff[] = {'a',0,0,
0,'b',0,
0,0,'c'};
//'a'=97=01100001
//'b'=98=01100010
//'c'=99=01100011
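unsigned char * src_ptr = buff; // byte pointer, so that += 3 is standard C
int i;
for(i = 0 ; i < (int)sizeof(buff) ; i += 3)
{
int num = 0 ;
void * num_ptr = &num;
memcpy(num_ptr , src_ptr , 3); // copies into the 3 lowest-addressed bytes of num
showbits(num);
src_ptr += 3;
}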
return 0;
}
output:
00000000000000000000000001100001
00000000000000000110001000000000
00000000011000110000000000000000