Suspicious pointer-to-pointer conversion (area too small) in C


uint8_t *buf;
uint16_t *ptr = (uint16_t *)buf;
I believe the above code is correct, but I get a "Suspicious pointer-to-pointer conversion (area too small)" Lint warning. Does anyone know how to resolve this warning?

IMHO, don't cast the pointer at all. It changes the way (representation) a variable (and corresponding memory) is accessed, which is very problematic.
If you have to, cast the value instead.

It's a problem in your code.
In the line uint8_t *buf; you have presumably pointed buf at a memory area that is accessed in 8-bit units,
and then in the line uint16_t *ptr = (uint16_t *)buf; you assign it to a pointer that will access that memory as 16 bits.
When you access memory through ptr you may be accessing memory beyond what was assigned to you, and that is undefined behaviour.

The problem here is strict aliasing. Imagine that when you have a case like this:
uint8_t raw_data [N]; // chunk of raw data
uint16_t val = 1;
memcpy(raw_data, &val, sizeof(val)); // copy an uint16_t into this array
uint16_t* lets_go_crazy = (uint16_t*)raw_data;
*lets_go_crazy = 5;
print(raw_data);
Assume the print function prints all the values of the bytes. You might then think that the two first bytes have changed to contain your machine's representation of an uint16_t containing the value 5. Not necessarily so.
Because the compiler on the other hand is free to assume that you have never modified the raw_data array since the memcpy, because an uint16_t is not allowed to alias an uint8_t. So it might optimize the whole code into something like:
uint8_t raw_data [N]; // chunk of raw data
uint16_t val = 1;
memcpy(raw_data, &val, sizeof(val)); // copy an uint16_t into this array
print(raw_data);
What will happen is undefined, because the strict aliasing was broken.
Though note that your code might be perfectly fine if, and only if, you can guarantee that your particular compiler does not apply strict aliasing to whatever scenario you have. There are compiler options to disable it on some compilers.
Alternatively, you could use a pointer to a struct or union containing an uint16_t, in which case the aliasing does not apply.
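A common way to sidestep the cast entirely is to copy the bytes with memcpy, which is well defined regardless of alignment or aliasing. The following is a minimal sketch, not code from the question; load_u16 and store_u16 are hypothetical helper names, and the value is stored in the machine's native byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a uint16_t stored at any offset of a byte buffer.
 * memcpy accesses the bytes as characters, so no uint16_t lvalue
 * ever aliases the uint8_t array. */
static uint16_t load_u16(const uint8_t *buf)
{
    uint16_t value;
    assert(buf != NULL);
    memcpy(&value, buf, sizeof value);
    return value;
}

/* Write a uint16_t into a byte buffer in native byte order. */
static void store_u16(uint8_t *buf, uint16_t value)
{
    assert(buf != NULL);
    memcpy(buf, &value, sizeof value);
}
```

Compilers typically turn these small fixed-size memcpy calls into a single load or store where the target allows it, so the safe version usually costs nothing.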

Related

How to convert/align my uint8_t pointer buffer to uint32 for it to work with auto-generated HAL functions

In my auto-generated HAL code for implementing CRC I have the following function:
uint32_t HAL_CRC_Accumulate(CRC_HandleTypeDef *hcrc, uint32_t pBuffer[], uint32_t BufferLength)
{
  uint32_t index;      /* CRC input data buffer index */
  uint32_t temp = 0U;  /* CRC output (read from hcrc->Instance->DR register) */
  /* Change CRC peripheral state */
  hcrc->State = HAL_CRC_STATE_BUSY;
  switch (hcrc->InputDataFormat)
  {
    case CRC_INPUTDATA_FORMAT_WORDS:
      /* Enter Data to the CRC calculator */
      for (index = 0U; index < BufferLength; index++)
      {
        hcrc->Instance->DR = pBuffer[index];
      }
      temp = hcrc->Instance->DR;
      break;
    case CRC_INPUTDATA_FORMAT_BYTES:
      temp = CRC_Handle_8(hcrc, (uint8_t *)pBuffer, BufferLength);
      break;
    case CRC_INPUTDATA_FORMAT_HALFWORDS:
      temp = CRC_Handle_16(hcrc, (uint16_t *)(void *)pBuffer, BufferLength); /* Derogation MisraC2012 R.11.5 */
      break;
    default:
      break;
  }
  /* Change CRC peripheral state */
  hcrc->State = HAL_CRC_STATE_READY;
  /* Return the CRC computed value */
  return temp;
}
The problem is that my own inBuff is of type uint8_t * inBuff, and I can see in the auto-generated code that it needs a uint32_t pointer as input, which it later just typecasts to a uint8_t pointer because I use the CRC_INPUTDATA_FORMAT_BYTES option for my CRC. I need my uint8_t buffer to make it work with the rest of my code (a pretty big project). Is there any way to work around this efficiently without damaging my own inBuff content? It needs to be aligned correctly.

STMCubeMX does NOT generate the best code and you have found a good example of this.
First of all, they(ST) DO want you to pass in a uint32_t buffer because they want the data aligned on a 32-bit boundary. See this line:
hcrc->Instance->DR = pBuffer[index];
They are expecting to read a uint32_t value and write it to the 32-bit DR register. A misaligned pointer would cause a fault right there.
You trying to cast your uint8_t buffer to a uint32_t could be problematic if the buffer isn't actually aligned on a 32-bit boundary. You could align yours like this:
uint8_t mybuf[128] __attribute__((aligned(4)));
You would still need to cast your pointer to a uint32_t* to pass it to HAL_CRC_Accumulate so the compiler wouldn't complain, BUT it would work because it is really aligned!
I would highly encourage you to NOT modify code generated by CubeMX. Unless your code is added within special /* USER */ sections of the ST code, it will be lost each time you regenerate the project. This will bite you in the ass over and over and over.
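If the toolchain supports C11, the same alignment can be requested without a compiler-specific attribute. This is a sketch under that assumption; BUF_LEN, is_u32_aligned and buf_for_hal are hypothetical names, and the actual call to HAL_CRC_Accumulate is left out because it needs the ST headers.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

#define BUF_LEN 128u  /* hypothetical buffer size */

/* C11 alignas gives the byte buffer the alignment of uint32_t. */
static alignas(uint32_t) uint8_t mybuf[BUF_LEN];

/* Returns nonzero if ptr is suitably aligned for uint32_t access. */
static int is_u32_aligned(const void *ptr)
{
    return ((uintptr_t)ptr % alignof(uint32_t)) == 0;
}

/* Hand the buffer to an API that insists on uint32_t *.
 * The cast only changes the pointer type; the bytes are untouched. */
static uint32_t *buf_for_hal(void)
{
    assert(is_u32_aligned(mybuf));
    return (uint32_t *)(void *)mybuf;
}
```

The rest of the project can keep using mybuf as a plain uint8_t array; only the call site uses buf_for_hal().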

Problems:
A function which does not intend to access a pointer parameter through the pointed-at type should not be declared with that type. If HAL_CRC_Accumulate never intends to access pBuffer as an (array of) uint32_t then it shouldn't be using that type. It should then have been declared as uint8_t*.
Passing a uint8_t* pointing at the first item of an array to a function accepting uint32_t* is highly fishy code and could indeed cause misalignment problems. Most of your problems boil down to: why does this function use uint32_t[]? It doesn't make any sense.
Casting a uint32_t* to a uint16_t* is similarly bad practice, and it is undefined behavior if the function that this uint16_t* is passed to accesses that parameter as uint16_t. Not only because of possible misalignment but also because of strict aliasing. Again, if that function has no intention of using the uint16_t* to access a uint16_t object, then it shouldn't be using that pointer type. Again, it seems like this whole code should be working on uint8_t* types.
"Chaining" multiple casts like this (uint16_t *)(void *) is nonsense. The void* adds nothing and solves nothing.
Normally a function that calculates CRC would work on a const qualified buffer, because it isn't supposed to change anything. If the so-called "FCS" (calculated checksum) should be appended at the end of the data buffer, then it might be better design to do that separately.
Since MISRA-C is mentioned in comments: all of the above is particularly unacceptable in a mission-critical code base. Auto-generated code is no excuse for sloppy, potentially broken use of types - on the contrary. More importantly than advisory rule 11.5, you have multiple violations of (Required) MISRA-C:2012 rule 11.3 and none of this is MISRA compliant.

Allocate your buffer to the desired size plus sizeof(uint32_t) - 1.
For example, if your buffer is statically allocated to SIZE, then you can do:
uint8_t buff[SIZE + sizeof(uint32_t) - 1];
Set your pointer to the lowest address within your buffer which is aligned to uint32_t (uintptr_t from stdint.h is the integer type intended for holding pointer values):
uintptr_t addr = (uintptr_t)buff + sizeof(uint32_t) - 1;
uint8_t* inBuff = (uint8_t*)(addr / sizeof(uint32_t) * sizeof(uint32_t));
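The same rounding can be wrapped in a small helper. This is a sketch with hypothetical names (SIZE value, align_up_u32); like the answer above, it assumes the alignment requirement of uint32_t equals sizeof(uint32_t). It advances the original pointer by a padding count instead of converting an integer back to a pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SIZE 64u  /* hypothetical payload size */

/* Over-allocate by sizeof(uint32_t) - 1 so an aligned window of SIZE
 * bytes always fits somewhere inside. */
static uint8_t buff[SIZE + sizeof(uint32_t) - 1];

/* Return the first address at or after p that is a multiple of
 * sizeof(uint32_t). */
static uint8_t *align_up_u32(uint8_t *p)
{
    assert(p != NULL);
    uintptr_t addr = (uintptr_t)p;
    size_t pad = (sizeof(uint32_t) - addr % sizeof(uint32_t)) % sizeof(uint32_t);
    return p + pad;
}
```

Usage would then be uint8_t *inBuff = align_up_u32(buff); followed by the normal uint8_t code paths.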

How to convert/align my uint8_t pointer buffer to uint32 for it to work with auto-generated HAL functions
Do nothing.
The problem is that my own inBuff is of type uint8_t * inBuff and I can see in the auto-generated code that it needs a uint32_t as input
So cast the pointer.
HAL_CRC_Accumulate(..., (uint32_t*)your_buffer, ...)
Is there any reason to work around this in an efficient way without damaging my own inBuff content?
No. (?)
Yes, it is unpleasant that it doesn't take a void* pointer or that there aren't separate APIs for each INPUTDATA_FORMAT. Still, there is nothing for you to do; just pass the pointer value. The internal routines will access the data through a proper uint8_t* anyway.
The alignment needed for INPUTDATA_FORMAT_BYTES is 1, at bytes boundaries.

Casting uint64_t on bitfield

I found code where a bit-field is used for network messages. I would like to know what the cast bitfield_struct data = *(bitfield_struct *)&tmp; exactly does and how its syntax works. Won't it violate the strict aliasing rule? Here is part of the code:
typedef struct
{
    unsigned var1 : 1;
    unsigned var2 : 13;
    unsigned var3 : 8;
    unsigned var4 : 10;
    unsigned var5 : 7;
    unsigned var6 : 12;
    unsigned var7 : 7;
    unsigned var8 : 6;
} bitfield_struct;
void print_data(u_int64_t * raw, FILE * f, int no_object)
{
    uint64_t tmp = ntohll(*raw);
    bitfield_struct data = *(bitfield_struct *)&tmp;
    ...
}

Won't it violate the strict aliasing rule?
Yes it will, so the code invokes undefined behavior. It is also highly non-portable:
We don't know the size of the abstract item called "addressable storage unit" that the given system uses. It isn't necessarily 64 bits, so there could in theory be padding and other nasty things hidden in the bit-field. 64 bit unsigned is fishy.
Neither do we know if the bit-field uses the same bit-order as uint64_t. Nor can we know if they use the same endianness.
If individual bits (fields) of the uint64_t need to be accessed, I would recommend doing so using bitwise shifts, as that makes the code fully portable even between architectures of different endianness. Then you don't need the non-portable ntohll call either.

What it does (or attempts to do) is quite straightforward.
uint64_t tmp = ntohll(*raw);
This line takes the value that raw points to, converts it from network byte order to host byte order (a byte swap on little-endian hosts, a no-op on big-endian ones) and copies it into tmp.
bitfield_struct data = *(bitfield_struct *)&tmp;
This line reinterprets the data in tmp (which was a uint64_t) as type bitfield_struct and copies it into data. This is basically the equivalent of doing:
/* Create a bitfield_struct pointer that points to tmp */
bitfield_struct *p = (bitfield_struct *)&tmp;
/* Copy the value in tmp to data */
bitfield_struct data = *p;
This is because bitfield_struct and uint64_t are incompatible types, and you cannot assign one to the other with just bitfield_struct data = tmp;
The code presumably continues to access fields within the bitfield through data, such as data.var1.
Now, like people pointed out, there are several issues which makes this code unreliable and non-portable.
Bit-fields are heavily implementation-dependent. Solution? Read the manual and figure out how your specific compiler variant treats bit-fields. Or don't use bitfields at all.
There is no guarantee that a uint64_t and bitfield_struct have the same size and alignment. This means there could be padding which can completely offset your expectations and make you end up with wrong data. One solution is to use memcpy to copy instead of pointers, which might let you avoid this particular issue. Or specify packed alignment using the mechanism provided by your compiler.
The code invokes UB when strict aliasing rules are applied. Solution? Most compilers will have a no-strict-aliasing flag that can be enabled, at a performance cost. Or even better, create a union type with bitfield_struct and uint64_t and use this to reinterpret between one and the other. This is allowed even with the strict-aliasing rules. Using memcpy is also legal, since it treats the data as an array of chars.
However, the best thing to do is not use this piece of code at all. As you may have noticed, it relies too much on compiler and platform specific stuff. Instead, try to accomplish the same thing using bit masks and shifts. This gets rid of all three problems mentioned above, without needing special compiler flags or having to face any real question of portability. Most importantly, it saves other developers reading your code, from having to worry about such things in the future.
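A mask-and-shift version might look like the sketch below. The field positions are an assumption made for illustration (var1 in the most significant bit, the remaining fields following in declaration order); the real wire layout has to come from the protocol definition. message_fields, load_be64, field and decode are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded fields held as ordinary integers, not bit-fields. */
typedef struct {
    unsigned var1, var2, var3, var4, var5, var6, var7, var8;
} message_fields;

/* Assemble a 64-bit value from 8 bytes in network (big-endian) order.
 * Works identically on any host, so no ntohll is needed. */
static uint64_t load_be64(const uint8_t *p)
{
    uint64_t v = 0;
    assert(p != NULL);
    for (size_t i = 0; i < 8; i++) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Extract `width` bits whose least significant bit sits at `shift`. */
static unsigned field(uint64_t word, unsigned shift, unsigned width)
{
    assert(width > 0 && width <= 16 && shift + width <= 64);
    return (unsigned)((word >> shift) & ((UINT64_C(1) << width) - 1));
}

/* ASSUMED layout: fields packed from bit 63 downward in declared order. */
static message_fields decode(uint64_t w)
{
    message_fields m;
    m.var1 = field(w, 63, 1);
    m.var2 = field(w, 50, 13);
    m.var3 = field(w, 42, 8);
    m.var4 = field(w, 32, 10);
    m.var5 = field(w, 25, 7);
    m.var6 = field(w, 13, 12);
    m.var7 = field(w, 6, 7);
    m.var8 = field(w, 0, 6);
    return m;
}
```

Because every position is spelled out in the code, the result does not depend on how a particular compiler orders or pads bit-fields.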

Right to left:
&tmp Take address of tmp
(bitfield_struct *)&tmp Address of tmp is address to data of type bitfield_struct
*(bitfield_struct *)&tmp Extract value out of tmp, assuming it's bitfield_struct data
bitfield_struct data = *(bitfield_struct *)&tmp; Store tmp to data, assuming that tmp is bitfield_struct
So it's just copy using extra pointers to avoid compilation errors/warnings of incompatible types.
What you may not understand is bit-addressing of structure.
unsigned var1 : 1;
unsigned var2 : 13;
Here you will find some more info about it: https://www.tutorialspoint.com/cprogramming/c_bit_fields.htm

How does the compiler handle the misalignment?

The SO question Does GCC's __attribute__((__packed__))…? mentions that __attribute__((__packed__)) does "packing which introduces alignment issues when accessing the fields of a packed structure. The compiler will account for that when the fields are accessed directly, but not when they are accessed via pointers".
How does the compiler make sure that the fields are accessed correctly when they are accessed directly? I suppose it internally adds some padding or does some pointer magic. In the case below, how does the compiler make sure that y is accessed correctly, compared to the access through the pointer?
struct packet {
    uint8_t x;
    uint32_t y;
} __attribute__((packed));
int main ()
{
    uint8_t bytes[5] = {1, 0, 0, 0, 2};
    struct packet *p = (struct packet *)bytes;
    // compiler handles misalignment because it knows that
    // "struct packet" is packed
    printf("y=%"PRIX32", ", ntohl(p->y));
    // compiler does not handle misalignment - py does not inherit
    // the packed attribute
    uint32_t *py = &p->y;
    printf("*py=%"PRIX32"\n", ntohl(*py));
    return 0;
}

When the compiler sees the notation p->y, it knows you're accessing a structure member, and that the structure is packed, because of the declaration of p. It translates this into code that reads byte by byte, and performs the necessary bit shifting to combine them into a uint32_t variable. Essentially, it treats the expression p->y as if it were something like:
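(uint32_t)*((uint8_t*)p+1) | ((uint32_t)*((uint8_t*)p+2) << 8) | ((uint32_t)*((uint8_t*)p+3) << 16) | ((uint32_t)*((uint8_t*)p+4) << 24)
(This assumes a little-endian target; y starts at byte offset 1 because x occupies offset 0 of the packed structure.) But when you indirect through *py, the compiler doesn't know where the value of that variable came from. It doesn't know that it points into a packed structure, so it doesn't know that it would need to perform this shifting. py is declared to point to uint32_t, which can normally be accessed using an instruction that reads an entire 32-bit word at once. On architectures where this instruction expects the pointer to be aligned to a 4-byte boundary, you can get a bus error due to the misalignment; on architectures that tolerate unaligned loads it may appear to work, but the behavior is still undefined.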
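If a value really has to be read from a possibly misaligned address, copying the bytes avoids the problem without relying on the packed attribute. This is a sketch; load_u32_unaligned and load_be32 are hypothetical helper names, not part of the question's code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a uint32_t in native byte order from an address that may not be
 * 4-byte aligned. memcpy performs the access byte-wise as far as the
 * language is concerned. */
static uint32_t load_u32_unaligned(const void *src)
{
    uint32_t value;
    assert(src != NULL);
    memcpy(&value, src, sizeof value);
    return value;
}

/* Decode 4 bytes in network (big-endian) order, which gives the same
 * result as applying ntohl to the raw native-order value. */
static uint32_t load_be32(const uint8_t *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}
```

With the bytes array from the question, load_be32(bytes + 1) reads the four bytes that p->y covers, regardless of where bytes happens to be placed in memory.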

Pointer array of pointers with C?

I want an array of pointers and I want to set byte values in the memory addresses where the pointers (of the array) are pointing.
Would this work:
unsigned int *pointer[4] = {(unsigned int *) 0xFF200020, (unsigned int *) 0xFF20001C, (unsigned int *) 0xFF200018, (unsigned int *) 0xFF200014};
*pointer[0] = 0b0111111; // the value is correct for the address
Or is the syntax somehow different?
EDIT:
I'm coding for an SOC board and these are memory addresses that contain the case of some UI elements.
unsigned int *element1 = (unsigned int *) 0xFF200020;
*element1 = 0b0111111;
works so I'm just interested about the C syntax of this.
EDIT2: There was one 0 too much in ... = 0b0...

Short answer:
Everything you've written is fine.
Thoughts:
I'm a big fan of using the types from stdint.h. This would let you write uint32_t, which is more clearly a 32-bit unsigned number than unsigned int.
You'll often see people write macros to refer to these registers:
#define REG_IRQ (*(volatile uint32_t *)(0xFF200020))
REG_IRQ = 0x42;
It's possible that you actually want these pointers to be to volatile integers. You want it to be volatile if the value can change outside of the execution of your program. That is, if that memory position doesn't act strictly like a piece of memory. (For example, it's a register that stores the interrupt flags).
With most compilers I've used on embedded platforms, you'll have problems from ignoring volatile once optimizations have been enabled.
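0b00111111 is, sadly, non-standard before C23 (binary literals are a common compiler extension in earlier standards). For portable code you can use octal, decimal, or hexadecimal.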
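Putting these points together, the pointer table from the question might be written as below. To keep the sketch runnable on a PC, the pointers refer to a stand-in array (fake_regs, a hypothetical name) instead of the real addresses; on the target board the initializers would be casts of the fixed addresses, such as (volatile uint32_t *)0xFF200020.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_DISPLAYS 4u  /* hypothetical count, matching the four addresses */

/* Stand-in for the memory-mapped registers (assumption for this sketch). */
static volatile uint32_t fake_regs[NUM_DISPLAYS];

/* Constant table of pointers to volatile 32-bit registers. */
static volatile uint32_t *const display[NUM_DISPLAYS] = {
    &fake_regs[0], &fake_regs[1], &fake_regs[2], &fake_regs[3]
};

/* Write a bit pattern to one register, with a bounds check. */
static void display_write(size_t index, uint32_t pattern)
{
    assert(index < NUM_DISPLAYS);
    *display[index] = pattern;
}
```

A call such as display_write(0, 0x3F) corresponds to *pointer[0] = 0b0111111; in the question, written in hexadecimal.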

Sure, this should work, providing you can find addresses in your own segment.
On a hosted operating system you will most probably get a segmentation fault when running this code, because 0xFF200020 is very unlikely to lie within your program's address space.

This will compile without errors and will work, but hard-coding the memory address a pointer refers to is generally not a good idea. Dereferencing an unknown or non-existent memory location will cause a segmentation fault. If you are sure about the memory locations, though, hard-coding them as done here is fine.

C Casting from pointer to uint32_t to a pointer to a union containing uint32_t

I'd like to know if casting a pointer to uint32_t to a pointer of a union containing a uint32_t will lead to defined behavior in C, i.e.
typedef union
{
uint8_t u8[4];
uint32_t u32;
} T32;
void change_value(T32 *t32)
{
t32->u32 = 5678;
}
int main()
{
uint32_t value = 1234;
change_value((T32 *)&value); // value is 5678 afterwards
return EXIT_SUCCESS;
}
Is this valid C? Many thanks in advance.

The general answer to your question is no, this is not defined in general. If the union contains a member that has a larger alignment requirement than uint32_t, the union must have that larger alignment, and converting and accessing the pointer would then lead to UB. This could happen, for example, if you replace uint8_t in your example with double.
In your particular case, though, the behavior is well defined. uint8_t, if it exists, is most likely nothing other than unsigned char and all character types always have the least alignment requirement.
Edit:
As R.. mentions in his comments there are other issues with your approach. First, theoretically, uint8_t could be different from unsigned char if there is an unsigned "extended integer type" of that width. This is very unlikely, I never heard of such an architecture. Second, your approach is subject to aliasing issues, so you should be extremely careful.

At the risk of incurring downvotes... Conceptually, there is nothing wrong with what you are trying to do. That is, define a piece of storage that can be viewed as four bytes or as a 32-bit integer, and then reference and modify that storage using a pointer.
However, I would ask why you would want to write code where its intent is obscured. What you are really doing is forcing the next programmer who reads your code to think for minutes and maybe even try a little test program. Thus, this programming style is "expensive".
You could just as easily have defined value as:
T32 value;
// etc.
change_value(&value);
and then avoid the cast and subsequent angst.
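Spelled out in full, the cast-free variant could look like this sketch, which reuses the T32 type and change_value function from the question and adds only a null check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef union
{
    uint8_t  u8[4];
    uint32_t u32;
} T32;

/* Same function as in the question, taking the union directly. */
static void change_value(T32 *t32)
{
    assert(t32 != NULL);
    t32->u32 = 5678;
}
```

The caller declares T32 value; sets value.u32 = 1234; and calls change_value(&value); so no pointer conversion is needed and value.u8 still gives byte-wise access to the same storage.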

Since all union members are guaranteed to start at the same memory address, your program as written does not lead to undefined behavior.
