Location of function call changes functionality - c

When writing to a GPIO output register (BCM2711 chip on the Raspberry Pi 4), the location of my clear_register function call changes the result completely. The code is below.
void clear_register(long reg) {
    *(unsigned int*)reg = (unsigned int)0;
}
int set_pin(unsigned int pin_number) {
    long reg = SET_BASE; //0xFE200001c
    unsigned int curval = *(unsigned int*)reg;
    unsigned int value = (1 << pin_number);
    value |= curval;
    gpio_write(reg, value);
    return 1;
}
When written like this, the register gets cleared. However, if I call clear_register(SET_BASE) before long reg = SET_BASE, the registers don't get cleared. I can't tell why this would make a difference; any help is appreciated.

I can't reproduce it since I don't have the target hardware which allows writing to that address but you have several bugs and dangerous style issues:
0xF E2 00 00 1c (64 bit signed) does not necessarily fit inside a long (possibly 32 bit) and certainly does not fit inside an unsigned int (16 or 32 bits).
Even if that was a typo and you meant to write 0xFE20001c (32 bit), your program still suffers from "sloppy typing", which is when the programmer just types out long, int etc. without giving it any deeper thought. The outcome is strange, subtle and intermittent bugs. I think everyone can relate to the situation "just cast it to unsigned int and it works" but you have no idea why. Bugs like that are always caused by "sloppy typing".
You should be using the stdint.h types instead. uint32_t and so on.
Also related to "sloppy typing", there exist very few cases where you actually want to use signed numbers in embedded systems - certainly not when dealing with addresses and bits. Negative addresses in Linux are a thing - kernel space - but accidentally changing an address to a negative value is a bug.
Signed types will only cause problems when doing bitwise arithmetic and they come with overflows. So you need to nuke the presence of every signed variable in your code unless it explicitly needs to be signed.
This means no sloppy long but the correct type, which is either uint32_t or uintptr_t if it should hold an address, as is the case in your example. It also means no sloppy hex constants: they must always have a u suffix, 0xFE20001Cu, or they easily boil down to the wrong type.
For these reasons 1 << is always a bug in a C program, because 1 is of type signed int and you may end up shifting into the sign bit of that signed int, which is undefined behavior. There exists no reason why you shouldn't write 1u << , always.
Whenever you are dealing with hardware registers, you must always use volatile qualified pointers or you might end up with very strange bugs caused by the optimizer. No exceptions here either.
(unsigned int)0 is a pointless cast. If you wish to be explicit you could write 0u, but that doesn't change anything in the case of assignment, since the right operand is always converted to the type of the left operand upon assignment. 0u does make some static analysers happy though, MISRA C checkers etc.
With all bug fixes, your clear function should for example look something like this:
void clear_register (uintptr_t reg) {
    *(volatile uint32_t*)reg = 0;
}
Or better yet, don't write functions for such very basic things! *(volatile uint32_t*)SET_BASE = 0; is perfectly clear, readable and self-documented code. While clear_register(SET_BASE) suggests that something more advanced than just a simple write is taking place.
However, you could have placed the cast inside the SET_BASE macro. For details and examples see How to access a hardware register from firmware?
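As a rough sketch of the advice above (fixed-width unsigned types, a volatile-qualified access and an unsigned constant for the shift), the register access could look something like the following. The names REG32, set_pin_at and clear_register_at are hypothetical and not part of the original code. The register address is passed in as a parameter so that the snippet can be tried against an ordinary variable on a host machine; on the target it would be the mapped address of the GPIO register.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: view an address as a 32-bit hardware register.
   volatile keeps the optimizer from caching or dropping the access. */
#define REG32(addr) (*(volatile uint32_t *)(uintptr_t)(addr))

/* Set one bit in the register at reg_addr and keep the other bits. */
void set_pin_at(uintptr_t reg_addr, unsigned int pin_number)
{
    assert(pin_number < 32u);                   /* a shift by 32 or more is undefined */
    REG32(reg_addr) |= (UINT32_C(1) << pin_number); /* unsigned constant, no sign bit involved */
}

/* Write zero to the register at reg_addr. */
void clear_register_at(uintptr_t reg_addr)
{
    REG32(reg_addr) = 0u;
}
```

Passing the address of a plain volatile uint32_t variable, cast to uintptr_t, is enough to exercise both functions without the hardware.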


Are leftshift operators dependent on register size?

Let uint8 and uint16 be datatypes for 8bit and 16bit positive integers.
uint8 a = 1;
uint16 b = a << 8;
I tested this program on 32Bit architecture with result
b = 256
Would the same program on a system with registers of 8bit length yield the result:
b = 0 ?
because all bits in the register get shifted to 0 by a << 8?
Registers are irrelevant. This is about the width of your types.
When you shift a value by as many bits as its (promoted) type has, or by more, the behaviour is undefined. The compiler, the program, the computer, the tax office can legally manifest any results accordingly. And, no, that's not just theoretical.
However, operands in C are promoted before interesting things are done on them. So, your uint8_t becomes an int before the left-shift.
Now it depends on your architecture (as determined by your compiler configuration) as to what happens: is int on your implementation only 8-bit? No, it's not! The result, then — regardless of any "register size" — must abide by the rules of the language, yielding the mathematically appropriate answer (256). And, even if it were, you'd hit that undefined behaviour so the question would be moot.
Under the bonnet, if more than one register is needed to hold a variable, then that's what will and must happen (at whatever performance cost is implied as a result). That's if a register is used at all; remember, you're programming in an abstraction, not hand-crafting machine code. The program snippet you showed can be completely optimised away during compilation and doesn't require any runtime instructions at all.
Would the same program on a system with registers of 8bit length give the result b=0?
In the expression a << 8 the variable a will get promoted to an int before the bit shift. And an int is guaranteed to be at least 16 bits.
b will have the value 256 on all platforms unless there's a bug in the compiler.
However, if you changed the second line to uint32 b = a << 16; you might get strange results. a would still get promoted to an int, but if int is two bytes long, then a << 16 will invoke undefined behavior.
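To illustrate the promotion rule, here is a small sketch. The helper names shift_left_8 and shift_left_16 are made up for this example. The idea is to convert the operand to a type that is wide enough before shifting, so the result does not depend on how wide int happens to be.

```c
#include <stdint.h>

/* unsigned int has at least 16 bits, so any 8-bit value shifted left
   by 8 fits and never touches a sign bit. */
uint16_t shift_left_8(uint8_t a)
{
    return (uint16_t)((unsigned int)a << 8);
}

/* Widen to uint32_t first; with a 16-bit int, a << 16 on the promoted
   int would be undefined. */
uint32_t shift_left_16(uint8_t a)
{
    return (uint32_t)a << 16;
}
```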

Does the return type save space or time

Does the return type matter when returning from a function?
This is kind of a 2-part question.
I believe an 8-bit operation would be the same as a 32-bit operation.
I believe an 8-bit value is operated on in a 32-bit register, so it will be promoted to a 32-bit value. Then it would be casted back down to an 8-bit value.
unsigned char SomeFunc() <- Quickest and less memory.
unsigned short SomeFunc()
unsigned long SomeFunc()
"All operations should be performed on the smallest variable wherever possible, this saves both time and space" True or False?
On a 32 bit operating system, I don't believe it would matter, since the return register is 32 bits anyways, whether it be a variable or an address.
So it would neither save time, nor space.
I do understand that there might be a need to return a char/byte, if that's all you're dealing with, but you could still return a long and cast it.
I think you're still casting either way, whether before or after you leave the function. I almost think it is easier and faster to deal with 32 bit values than 16 or 8 bit values.
Second part.
In the following function, I don't believe it would make it any quicker or save any more space if I were to return an unsigned short instead.
unsigned long SomeFunc(unsigned char a, unsigned char b)
{
    unsigned long c = a + b;
    return c;
}
unsigned long SomeFunc(unsigned char a)
{
    //This will be promoted to a 32-bit value anyways.
    return a & 0x1;
}
The following function would somehow be quicker and take up less memory?
unsigned char SomeFunc(unsigned char a)
{
    //This will be promoted to a 32-bit value anyways.
    return a & 0x1;
}
tl;dr: Trust your optimizer. Don't fight the type system.
Yes, you should use the smallest type.
Depending on your compiler they may compile differently and your compiler might have good reason to do that. If you lie to your compiler about the types you're messing with its ability to optimize your code.
And because you don't know how the return value will be used.
Not all memory is a single variable. Consider arrays and structs. They are allocated in large blocks beyond the 32 or 64 bit native sizes. If you return a larger type than necessary, you're forcing any array or struct that stores it to use more memory. For example, if you return an int where you should be returning a char and I have to store those values in an array, that array will be 4 to 8 times larger.
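The storage argument can be checked directly with sizeof. This is only a minimal sketch: the type names and the element count of 1000 are arbitrary choices for illustration, and the exact ratio depends on the platform.

```c
#include <stddef.h>

#define SAMPLE_COUNT 1000u  /* arbitrary element count, for illustration only */

/* One result per element, stored in the narrow type the data actually needs. */
typedef unsigned char narrow_buffer[SAMPLE_COUNT];

/* The same results stored as unsigned long. */
typedef unsigned long wide_buffer[SAMPLE_COUNT];

/* How many times more memory the wide buffer takes than the narrow one. */
size_t storage_ratio(void)
{
    return sizeof(wide_buffer) / sizeof(narrow_buffer);
}
```

The ratio is simply sizeof(unsigned long), so it comes out as 8 where unsigned long is 64 bits wide and as 4 where it is 32 bits wide.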
I do understand that there might be a need to return a char/byte, if that's all you're dealing with, but you could still return a long and cast it. I think you're still casting either way, whether before or after you leave the function. I almost think it is easier and faster to deal with 32 bit values than 16 or 8 bit values.
The return value tells people reading the code what type to use to store the return value. If you habitually use larger types than necessary and just need to know which ones you can cast to smaller types, that's hidden information only you know. And if it's in your head you're going to forget.
Gratuitous casting defeats the safety of the type system. Type checks let you know if you're putting data of the wrong type or size into the wrong place. Casting tells the compiler "I know this looks wrong, but trust me I know what I'm doing". This should be done only when necessary. If you're gratuitously casting you lose this help from the compiler. If you make a mistake the compiler cannot help you.
Finally, it will befuddle anyone reading your code. They'll scratch their heads and wonder why you're always shoving longs into shorts and ints into chars. They'll never know which are ok and which are mistakes.
As to unsigned long vs unsigned char, compiling your two functions with clang -O3 -S and diffing the assembly reveals a slight difference:
- movq %rdi, %rax
+ movl %edi, %eax
The unsigned char implementation will use a 32-bit register while unsigned long will use a 64-bit register. Does this matter? Dunno, probably not. Definitely not enough to defeat the type system.

AVR uint8_t doesn't get correct value

I have a uint8_t that should contain the result of a bitwise calculation. The debugger says the variable is set correctly, but when I check the memory, the var is always at 0. The code proceeds like the var is 0, no matter what the debugger tells me. Here's the code:
temp = (path_table & (1 << current_bit)) >> current_bit;
//temp is always 0, debugger shows correct value
if (temp > 0) {
    DS18B20_send_bit(pin, 0x01);
} else {
    DS18B20_send_bit(pin, 0x00);
}
Temp's a uint8_t, path_table's a uint64_t and current_bit's a uint8_t. I've tried to make them all uint64_t but nothing changed. I've also tried using unsigned long long int instead. Nothing again.
The code always enters the else clause.
Chip's Atmega4809, and uses uint64_t in other parts of the code with no issues.
Note - If anyone knows a more efficient/compact way to extract a single bit from a variable I would really appreciate it if you could share ^^
1 is an integer constant of type int. The expression 1 << current_bit also has type int, and with a 16-bit int the behavior of that expression is undefined when current_bit is larger than 14. The behavior being undefined in your case, it is plausible that your debugger presents results for the overall expression that seem inconsistent with the observed behavior. Using an unsigned int constant instead, i.e. 1u, would not be enough either: it makes a shift by 15 well defined, but a shift count of 16 or more is still undefined for a 16-bit type, and a 16-bit mask cannot select the upper bits of a uint64_t anyway.
Solve this problem by performing the computation in a type wide enough to hold the result. Here's a compact, correct, and pretty clear way to correct your code to do that:
DS18B20_send_bit(pin, (path_table & (((uint64_t) 1) << current_bit)) != 0);
Or if path_table has an unsigned type then I prefer this, though it's more of a departure from your original:
DS18B20_send_bit(pin, (path_table >> current_bit) & 1);
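The shift-then-mask form can be wrapped in a small helper. The sketch below is hypothetical (get_bit64 is not part of the original code) and assumes the bit index is below 64, since shifting a 64-bit value by 64 or more is undefined.

```c
#include <assert.h>
#include <stdint.h>

/* Return bit number `bit` (0 = least significant) of a 64-bit value as 0 or 1. */
uint8_t get_bit64(uint64_t value, uint8_t bit)
{
    assert(bit < 64u);                      /* larger shift counts are undefined */
    return (uint8_t)((value >> bit) & 1u);  /* the shift is done in uint64_t */
}
```

With such a helper, the call from the question would become DS18B20_send_bit(pin, get_bit64(path_table, current_bit)).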
Realization #1 here is that AVR is a core built on 1980s-1990s technology. It is not an x64 PC that chews 64 bit numbers for breakfast, but an extremely inefficient 8-bit MCU. As such:
It likes 8 bit arithmetic.
It will struggle with 16 bit arithmetic, by doing tricks with 16 bit index registers, double accumulators or whatever 8 bit core tricks it prefers to do.
It will literally take ages to execute 32 bit arithmetic, by invoking software libraries inline.
It will probably melt through the floor if attempting 64 bit arithmetic.
Before you do anything else, you need to get rid of all 64 bit arithmetic and radically minimize the use of 32 bit arithmetic. Period. There should be no single variable of uint64_t in your code or you are doing it very very wrong.
A related point: on practically all 8 bit MCUs, including this one, the int type is 16 bits wide.
In the code 1<<current_bit, the integer constant 1 is of type int. Meaning that if current_bit is 15 or larger, you will shift bits into the sign bit of this temporary int. This is always a bug. Strictly speaking this is undefined behavior. In practice, you might end up with random change of sign of your numbers.
To avoid this, never use any form of bitwise operators on signed numbers. When mixing integer constants such as 1 with bitwise operators, change them to 1u to avoid bugs like the one mentioned.
If anyone knows a more efficient/compact way to extract a single bit from a variable I would really appreciate it if you could share
The most efficient way in C is: uint8_t variable; ... if(variable & (1u << bits)). This should translate to the relevant "branch if bit set" instruction.
My general advice would be to find your tool chain's disassembler and see what machine code the C code actually generated. You don't have to be an assembler guru to read it; peeking at the instruction set should be enough.

What's the point in specifying unsigned integers with "U"?

I have always, for as long as I can remember and ubiquitously, done this:
for (unsigned int i = 0U; i < 10U; ++i)
{
    // ...
}
In other words, I use the U specifier on unsigned integers. Now having just looked at this for far too long, I'm wondering why I do this. Apart from signifying intent, I can't think of a reason why it's useful in trivial code like this?
Is there a valid programming reason why I should continue with this convention, or is it redundant?
First, I'll state what is probably obvious to you, but your question leaves room for it, so I'm making sure we're all on the same page.
There are obvious differences between unsigned ints and regular ints: The difference in their range (-2,147,483,648 to 2,147,483,647 for an int32 and 0 to 4,294,967,295 for a uint32). There's a difference in what bits are put at the most significant bit when you use the right bitshift >> operator.
The suffix is important when you need to tell the compiler to treat the constant value as a uint instead of a regular int. This may be important if the constant is outside the range of a regular int but within the range of a uint. The compiler might throw a warning or error in that case if you don't use the U suffix.
Other than that, Daniel Daranas mentioned in comments the only thing that happens: if you don't use the U suffix, you'll be implicitly converting the constant from a regular int to a uint. That's a tiny bit extra effort for the compiler, but there's no run-time difference.
Should you care? Here's my answer, for those who only want a quick one: there's really no good reason to declare a constant as 10U or 0U. Most of the time, you're within the common range of uint and int, so the value of that constant looks exactly the same whether it's a uint or an int. The compiler will immediately take your const int expression and convert it to a const uint.
That said, here's the only argument I can give you for the other side: semantics. It's nice to make code semantically coherent. And in that case, if your variable is a uint, it doesn't make sense to set that value to a constant int. If you have a uint variable, it's clearly for a reason, and it should only work with uint values.
That's a pretty weak argument, though, particularly because as a reader, we accept that uint constants usually look like int constants. I like consistency, but there's nothing gained by using the 'U'.
I see this often when using defines to avoid signed/unsigned mismatch warnings. I build a code base for several processors using different tool chains and some of them are very strict.
For instance, removing the ‘u’ in the MAX_PRINT_WIDTH define below:
#define MAX_PRINT_WIDTH (384u)
#define IMAGE_HEIGHT (480u) // 240 * 2
#define IMAGE_WIDTH (320u) // 160 * 2 double density
Gave the following warning:
"..\Application\Devices\MartelPrinter\mtl_print_screen.c", line 106: cc1123: {D} warning:
comparison of unsigned type with signed type
for ( x = 1; (x < IMAGE_WIDTH) && (index <= MAX_PRINT_WIDTH); x++ )
You will probably also see ‘f’ for float vs. double.
I extracted this sentence from a comment, because it's a widely believed incorrect statement, and also because it gives some insight into why explicitly marking unsigned constants as such is a good habit.
...it seems like it would only be useful to keep it when I think overflow might be an issue? But then again, haven't I gone some ways to mitigating for that by specifying unsigned in the first place...
Now, let's consider some code:
int something = get_the_value();
// Compute how many 8s are necessary to reach something
unsigned count = (something + 7) / 8;
So, does the unsigned mitigate potential overflow? Not at all.
Let's suppose something turns out to be INT_MAX (or close to that value). Assuming a 32-bit machine, we might expect count to be 2^28, or 268,435,456. But it's not.
Telling the compiler that the result of the computation should be unsigned has no effect whatsoever on the typing of the computation. Since something is an int, and 7 is an int, something + 7 will be computed as an int, and will overflow. Then the overflowed value will be divided by 8 (also using signed arithmetic), and whatever that works out to be will be converted to an unsigned and assigned to count.
With GCC, arithmetic is actually performed in 2s complement so the overflow will be a very large negative number; after the division it will be a not-so-large negative number, and that ends up being a largish unsigned number, much larger than the one we were expecting.
Suppose we had specified 7U instead (and maybe 8U as well, to be consistent). Now it works. It works because now something + 7U is computed with unsigned arithmetic, which doesn't overflow (or even wrap around).
Of course, this bug (and thousands like it) might go unnoticed for quite a lot of time, blowing up (perhaps literally) at the worst possible moment...
(Obviously, making something unsigned would have mitigated the problem. Here, that's pretty obvious. But the definition might be quite a long way from the use.)
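A minimal sketch of the corrected computation follows. The function name count_eights is hypothetical, and the code assumes that something is never negative, because a negative int would be converted to a very large unsigned value.

```c
#include <assert.h>

/* Number of 8s needed to reach `something`, computed in unsigned arithmetic. */
unsigned count_eights(int something)
{
    assert(something >= 0);  /* assumption: negative inputs are not meaningful here */
    /* 7U makes the usual arithmetic conversions turn `something` into unsigned,
       so the addition is no longer done in signed int. */
    return (something + 7U) / 8U;
}
```

For example, count_eights(17) is 3, and with a 32-bit int count_eights(INT_MAX) gives the expected 268,435,456 instead of a value derived from an overflowed sum.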
One reason you should do this for trivial code[1] is that the suffix forces a type on the literal, and the type may be very important to produce the correct result.
Consider this bit of (somewhat silly) code:
#define magic_number(x) _Generic((x), \
    unsigned int : magic_number_unsigned, \
    int : magic_number_signed \
)(x)
unsigned magic_number_unsigned(unsigned value) {
    // ...
}
unsigned magic_number_signed(int value) {
    // ...
}
int main(void) {
    unsigned magic = magic_number(10u);
}
It's not hard to imagine those function actually doing something meaningful based on the type of their argument. Had I omitted the suffix, the generic selection would have produced a wrong result for a very trivial call.
[1] But perhaps not the particular code in your post.
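How a suffix changes the type of a literal can be made visible with a much smaller generic selection. In this sketch the macro name is_unsigned_int and the two wrapper functions are hypothetical, and a C11 compiler is assumed.

```c
/* Evaluates to 1 if the expression has type unsigned int, otherwise 0 (C11). */
#define is_unsigned_int(x) _Generic((x), unsigned int: 1, default: 0)

/* 10 is an int and 10u is an unsigned int: same value, different type. */
int plain_literal_is_unsigned(void)
{
    return is_unsigned_int(10);
}

int suffixed_literal_is_unsigned(void)
{
    return is_unsigned_int(10u);
}
```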
In this case, it's completely useless.
In other cases, a suffix might be useful. For instance:
#include <stdio.h>
int main(void) {
    printf("%zu\n", sizeof(123));
    printf("%zu\n", sizeof(123LL));
    return 0;
}
On my system, it will print 4 then 8.
But back to your code, yes it makes your code more explicit, nothing more.

Safely punning char* to double in C

In an Open Source program I
wrote, I'm reading binary data (written by another program) from a file and outputting ints, doubles,
and other assorted data types. One of the challenges is that it needs to
run on 32-bit and 64-bit machines of both endiannesses, which means that I
end up having to do quite a bit of low-level bit-twiddling. I know a (very)
little bit about type punning and strict aliasing and want to make sure I'm
doing things the right way.
Basically, it's easy to convert from a char* to an int of various sizes:
int64_t snativeint64_t(const char *buf)
{
    /* Interpret the first 8 bytes of buf as a 64-bit int */
    return *(int64_t *) buf;
}
and I have a cast of support functions to swap byte orders as needed, such as:
int64_t swappedint64_t(const int64_t wrongend)
{
    /* Change the endianness of a 64-bit integer */
    return (((wrongend & 0xff00000000000000LL) >> 56) |
            ((wrongend & 0x00ff000000000000LL) >> 40) |
            ((wrongend & 0x0000ff0000000000LL) >> 24) |
            ((wrongend & 0x000000ff00000000LL) >> 8)  |
            ((wrongend & 0x00000000ff000000LL) << 8)  |
            ((wrongend & 0x0000000000ff0000LL) << 24) |
            ((wrongend & 0x000000000000ff00LL) << 40) |
            ((wrongend & 0x00000000000000ffLL) << 56));
}
At runtime, the program detects the endianness of the machine and assigns
one of the above to a function pointer:
int64_t (*slittleint64_t)(const char *);
if(littleendian) {
    slittleint64_t = snativeint64_t;
} else {
    slittleint64_t = sswappedint64_t;
}
Now, the tricky part comes when I'm trying to cast a char* to a double. I'd
like to re-use the endian-swapping code like so:
union {
    double d;
    int64_t i;
} int64todouble;
int64todouble.i = slittleint64_t(bufoffset);
printf("%lf", int64todouble.d);
However, some compilers could optimize away the "int64todouble.i" assignment
and break the program. Is there a safer way to do this, while considering
that this program must stay optimized for performance, and also that I'd
prefer not to write a parallel set of transformations to cast char* to
double directly? If the union method of punning is safe, should I be
re-writing my functions like snativeint64_t to use it?
I ended up using Steve Jessop's answer because the conversion functions, rewritten to use memcpy like so:
int64_t snativeint64_t(const char *buf)
{
    /* Interpret the first 8 bytes of buf as a 64-bit int */
    int64_t output;
    memcpy(&output, buf, 8);
    return output;
}
compiled into the exact same assembler as my original code:
movq (%rdi), %rax
Of the two, the memcpy version more explicitly expresses what I'm trying to do and should work on even the most naive compilers.
Adam, your answer was also wonderful and I learned a lot from it. Thanks for posting!
I highly suggest you read Understanding Strict Aliasing. Specifically, see the sections labeled "Casting through a union". It has a number of very good examples. While the article is on a website about the Cell processor and uses PPC assembly examples, almost all of it is equally applicable to other architectures, including x86.
Since you seem to know enough about your implementation to be sure that int64_t and double are the same size, and have suitable storage representations, you might hazard a memcpy. Then you don't even have to think about aliasing.
Since you're using a function pointer for a function that might easily be inlined if you were willing to release multiple binaries, performance must not be a huge issue anyway, but you might like to know that some compilers can be quite fiendish at optimising memcpy - for small integer sizes a set of loads and stores can be inlined, and you might even find the variables are optimised away entirely and the compiler does the "copy" simply by reassigning the stack slots it's using for the variables, just like a union.
int64_t i = slittleint64_t(bufoffset);
double d;
memcpy(&d,&i,8); /* might emit no code if you're lucky */
printf("%lf", d);
Examine the resulting code, or just profile it. Chances are even in the worst case it will not be slow.
In general, though, doing anything too clever with byteswapping results in portability issues. There exist ABIs with middle-endian doubles, where each word is little-endian, but the big word comes first.
Normally you could consider storing your doubles using sprintf and sscanf, but for your project the file formats aren't under your control. But if your application is just shovelling IEEE doubles from an input file in one format to an output file in another format (not sure if it is, since I don't know the database formats in question, but if so), then perhaps you can forget about the fact that it's a double, since you aren't using it for arithmetic anyway. Just treat it as an opaque char[8], requiring byteswapping only if the file formats differ.
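A minimal sketch of the memcpy approach, combined with a load that does not depend on the host byte order, might look like this. The function names load_le64 and bits_to_double are hypothetical. The code assumes that double is a 64-bit IEEE 754 type whose byte order matches that of 64-bit integers on the host, which holds on common platforms but not on the middle-endian ABIs mentioned above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption stated in the text: double and uint64_t have the same size. */
static_assert(sizeof(double) == sizeof(uint64_t), "double must be 64 bits wide");

/* Build a 64-bit value from 8 little-endian bytes, whatever the host byte order. */
uint64_t load_le64(const unsigned char *buf)
{
    uint64_t value = 0;
    for (int i = 7; i >= 0; --i) {
        value = (value << 8) | buf[i];  /* most significant byte goes in first */
    }
    return value;
}

/* Reinterpret the bits of a 64-bit integer as a double without pointer casts. */
double bits_to_double(uint64_t bits)
{
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}
```

Because the bytes are assembled with shifts, no separate swapping function and no run-time endianness check are needed for the integer part.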
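The standard says that writing to one member of a union and then immediately reading from a different member is undefined behaviour. (That is the rule in C++; C99 and later allow it and reinterpret the stored bytes in the new type.) So if you go strictly by the rule book, the union-based method is not guaranteed to work with every compiler.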
Macros are usually a bad idea, but this might be an exception to the rule. It should be possible to get template-like behaviour in C using a set of macros using the input and output types as parameters.
As a very small sub-suggestion, I suggest you investigate if you can swap the masking and the shifting, in the 64-bit case. Since the operation is swapping bytes, you should be able to always get away with a mask of just 0xff. This should lead to faster, more compact code, unless the compiler is smart enough to figure that one out itself.
In brief, changing this:
(((wrongend & 0xff00000000000000LL) >> 56)
into this:
((wrongend >> 56) & 0xff)
should generate the same result.
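Taken to its conclusion, shifting first means every step can use the same 0xff mask. The following sketch is a hypothetical helper (swap_bytes64 is not in the original code) that works on uint64_t, so all shifts are done on an unsigned type.

```c
#include <stdint.h>

/* Reverse the byte order of a 64-bit value using only a 0xff mask. */
uint64_t swap_bytes64(uint64_t x)
{
    uint64_t result = 0;
    for (unsigned int i = 0; i < 8u; ++i) {
        result = (result << 8) | (x & 0xffu);  /* append the lowest byte of x */
        x >>= 8;                               /* move on to the next byte */
    }
    return result;
}
```

Whether this beats the fully unrolled expression depends on the compiler, so it is worth comparing the generated code for both.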
Removed comments regarding how to effectively store data always big endian and swapping to machine endianness, as the questioner hasn't mentioned that another program writes his data (which is important information). Still, if the data needs conversion from any endian to big and from big to host endian, ntohs/ntohl/htons/htonl are the best methods, most elegant and unbeatable in speed (as they will perform the task in hardware if the CPU supports that, you can't beat that).
Regarding double/float, just store them to ints by memory casting:
double d = 3.1234;
printf("Double %f\n", d);
int64_t i = *(int64_t *)&d;
// Now i contains the double value as int
double d2 = *(double *)&i;
printf("Double2 %f\n", d2);
Wrap it into a function
int64_t doubleToInt64(double d)
{
    return *(int64_t *)&d;
}
double int64ToDouble(int64_t i)
{
    return *(double *)&i;
}
Questioner provided this link:
as proof that casting is bad... Unfortunately I can only strongly disagree with most of this page. Quotes and comments:
As common as casting through a pointer
is, it is actually bad practice and
potentially risky code. Casting
through a pointer has the potential to
create bugs because of type punning.
It is not risky at all and it is also not bad practice. It has only a potential to cause bugs if you do it incorrectly, just like programming in C has the potential to cause bugs if you do it incorrectly, so does any programming in any language. By that argument you must stop programming altogether.
Type punning A form of pointer
aliasing where two pointers and refer
to the same location in memory but
represent that location as different
types. The compiler will treat both
"puns" as unrelated pointers. Type
punning has the potential to cause
dependency problems for any data
accessed through both pointers.
This is true, but unfortunately totally unrelated to my code.
What he refers to is code like this:
int64_t * intPointer;
// Init intPointer somehow
double * doublePointer = (double *)intPointer;
Now doublePointer and intPointer both point to the same memory location, but treat it as different types. This is the situation you should indeed solve with a union; anything else is pretty bad. But that is not what my code does!
My code copies by value, not by reference. I cast a double pointer to an int64 pointer (or the other way round) and immediately dereference it. Once the functions return, there is no pointer held to anything. There is an int64 and a double and these are totally unrelated to the input parameter of the functions. I never copy any pointer to a pointer of a different type (if you saw this in my code sample, you strongly misread the C code I wrote), I just transfer the value to a variable of a different type (in its own memory location). So the definition of type punning does not apply at all, as it says "refer to the same location in memory" and nothing here refers to the same memory location.
int64_t intValue = 12345;
double doubleValue = int64ToDouble(intValue);
// The statement below will not change the value of doubleValue!
// Both are not pointing to the same memory location, both have their
// own storage space on stack and are totally unrelated.
intValue = 5678;
My code is nothing more than a memory copy, just written in C without an external function.
int64_t doubleToInt64(double d)
{
    return *(int64_t *)&d;
}
Could be written as
int64_t doubleToInt64(double d)
{
    int64_t result;
    memcpy(&result, &d, sizeof(d));
    return result;
}
It's nothing more than that, so there is no type punning even in sight anywhere. And this operation is also totally safe, as safe as an operation can be in C. A double is 64 bits wide on practically every current platform (the C standard itself does not fix its size, so it is worth checking with sizeof), hence it will fit into an int64_t sized variable.
