Embedded C: what does var = 0xFF; do?

I'm working with embedded C for the first time. Although my C is rusty, I can read the code, but I don't really have a grasp on why certain lines are the way they are. For example, I want to know if a variable is true or false and send it back to another application. Rather than setting the variable to 1 or 0, the original implementor chose 0xFF.
Is he trying to set it to an address space? Or why else set a boolean variable to 255?

0xFF sets all the bits in a char.
The original implementer probably decided that the standard 0 and 1 wasn't good enough and decided that if all bits off is false then all bits on is true.
That works because in C any value other than 0 is true.
Though this will set all bits in a char, it will also work for any other variable type, since any one bit being set in a variable makes it true.
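For illustration, a minimal sketch of that convention (the variable name is invented):
unsigned char flag = 0xFF;  // all eight bits set
if (flag) {
    // taken: 0xFF is nonzero, hence true
}
if (flag == 1) {
    // NOT taken: "nonzero" does not mean "equal to 1", which is why
    // comparing booleans with == 1 is risky in code that uses 0xFF
}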

If you are in desperate need of memory, you might want to store 8 booleans in one byte (or 32 in a long, or whatever)
This can easily be done by using a flag mask:
// FLAGMASK = 1 << n, for n in 0..7
FLAGMASK = 0x10;    // e.g. n = 4
flags &= ~FLAGMASK; // clear bit
flags |= FLAGMASK;  // set bit
flags ^= FLAGMASK;  // flip bit
flags = (flags & ~FLAGMASK) | (booleanFunction() & FLAGMASK); // clear, then maybe set
The last line only works when booleanFunction() returns 0 (all bits clear) or -1 (all bits set).
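Put together, a self-contained sketch of packing booleans into one byte (flag positions are illustrative):
#include <stdio.h>

int main(void) {
    unsigned char flags = 0;  // room for 8 booleans
    flags |= 1u << 4;         // set flag 4
    flags |= 1u << 2;         // set flag 2
    flags &= ~(1u << 2);      // clear flag 2 again
    printf("flag 4 is %s\n", (flags & (1u << 4)) ? "set" : "clear");
    return 0;
}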

0xFF is the hex representation of ~0 for a single byte (i.e. 11111111).
In, for example, VB and Access, -1 is used as True.

These young guys, what do they know?
In one of the original embedded languages - PL/M (-51 yes as in 8051, -85, -86, -286, -386) - there was no difference between logical operators (!, &&, || in C) and bitwise (~, &, |, ^). Instead PL/M has NOT, AND, OR and XOR taking care of both categories. Are we better off with two categories? I'm not so sure. I miss the logical ^^ operator (xor) in C, though. Still, I guess it would be possible to construct programs in C without having to involve the logical category.
In PL/M False is defined as 0. Booleans are usually represented in byte variables. True is defined as NOT False which will give you 0ffh (PL/M-ese for C's 0xff).
To see how the conversion of the carry status flag took place before being stored in a byte variable (boolean wasn't available as a type), PL/M could use the assembly instruction "sbb al,al" before storing. If carry was set, al would contain 0ffh; if it wasn't, it would contain 0h. If the opposite value was required, PL/M would insert a "cmc" before the sbb or append a "not al" after (actually xor - one or the other).
So the 0xff for TRUE is a direct compatibility port from PL/M. Necessary? Probably not, unless you're unsure of your skills (in C) AND playing it super safe.
As I would have.
PL/M-80 (used for the 8080, 8085 and Z80) did not have support for integers or floats, and I suspect it was the same for PL/M-51. PL/M-86 (used for the 8086, 8088, 80188 and 80186) added integers, single precision floating point, segment:offset pointers and the standard memory models small, medium, compact and large. For those so inclined there were special directives to create do-it-yourself hybrid memory models. Microsoft's huge memory model was equivalent to Intel's large. MS also sported tiny, small, compact, medium and large models.

Often in embedded systems there is one programmer who writes all the code and his/her idiosyncrasies are throughout the source. Many embedded programmers were HW engineers and had to get a system running as best they could. There was no requirement nor concept of "portability". Another consideration in embedded systems is the compiler is specific for the CPU HW. Refer to the ISA for this CPU and check all uses of the "boolean".

As others have said, it's setting all the bits to 1. And since this is embedded C, you might be storing this into a register where each bit is important for something, so you want to set them all to 1. I know I did similar when writing in assembler.

What's really important to know about this question is the type of "var". You say "boolean", but is that C++'s/C99's bool, or is it (highly likely, being an embedded C app) something of a completely different type that's being used as a boolean?

Also, adding 1 to 0xFF wraps it to 0 (assuming unsigned char), so the check might have been used in a loop where an increment signals when to break.
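A quick sketch of that wrap-around (unsigned overflow is well defined in C):
unsigned char counter = 0xFF;
counter++;  // wraps to 0: unsigned char arithmetic is modulo 256
if (counter == 0) {
    // a loop could use this as its exit condition
}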

Here's a likely reason: 0xff is the binary complement of 0. It may be that on your embedded architecture, storing 0xff into a variable is more efficient than storing, say, 1 which might require extra instructions or a constant stored in memory.
Or perhaps the most efficient way to check the "truth value" of a register in your architecture is with a "check bit set" instruction. With 0xff as the TRUE value, it doesn't matter which bit gets checked... they're all set.
The above is just speculation, of course, without knowing what kind of embedded processor you're using. 8-bit, 16-bit, 32-bit? PIC, AVR, ARM, x86???
(As others have pointed out, any integer value other than zero is considered TRUE for the purposes of boolean expressions in C.)

Related

Optimized way to find new bits on

We need to find new bits turned ON in interlock status received from the device compared to the last status read. This is for firing error codes for bits that are newly set. I am using the following statement.
bits_on = ~last_status & new_status;
Is there any better ways to do this?
It's only 2 operations and an assignment, so the only way to improve it would be to do it in 1 operation and an assignment. It doesn't correspond to any of the simple C bit manipulation operators, so doing it in 1 operation is not possible.
However, depending on your architecture your compiler might actually already be compiling it to a single instruction.
ANDN (Logical AND NOT) is part of the BMI1 instruction set, and is equivalent to ~x & y.
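A minimal, self-contained demonstration of the idiom (status values are hypothetical):
#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint8_t last_status = 0x5A;  // 0101 1010
    uint8_t new_status  = 0x7A;  // 0111 1010
    uint8_t bits_on = (uint8_t)(~last_status & new_status);  // bit 5 is newly set
    printf("newly set bits: 0x%02X\n", bits_on);  // prints 0x20
    return 0;
}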

AVR: if statement

I am new to AVR programming. I would like to check whether a variable (uint8_t received_msg) is equal to 0xFF. Would it be correct to do:
if (!(received_msg ^ 0xFF))
or do I need to compare bit by bit
uint8_t test = 0;
test = received_msg ^ 0xFF
for (i =0; i<8; i++){
test = 0 & (1<<received_msg)
}
if(test==0)
If you want to know if a variable is equal to 0xff, just test for equality:
if (received_message == 0xff)
Your question had fairly little to do with the AVR but some mistaken ideas about how compilers and microcontrollers work. That's not a complaint that it's a bad question - any question that helps you learn is good!
(TLDR: "use bitwise operators" is only in contrast to AVR specific stuff, feel absolutely free to use all your normal operations.)
First, you've expressed what you want to do - an equality test - in English. The whole point of a programming language like C is to allow you to express computed operations in a fairly readable manner, so use the most obvious (and thus clear) translation of received_msg == 0xFF - it is the compiler's job to convert this into code for the specific computer (AVR), and even if it does a horrible job of it it will waste no more than a few microseconds. (It doesn't, but if you make the code convoluted enough it can fail to do an excellent job.)
Second, you've attempted to express the same operation - comparing every bit against a set value, and collecting the result to see if they were all equal - in two other manners. This gets tricky both to read and write, as is shown by the bugs in the second version, but more importantly the second version shows a misunderstanding of what C's bitwise operators do. Bitwise here means each bit of a value is processed independently of the other bits; they are still all processed. Therefore splitting it into a loop is not needed, and only makes the job of both programmer and compiler harder. The technique used to make bitwise operators affect only the bits you select is known as masking; it relies on properties like "0 or n = n", "1 and n = n", and "0 xor n = n".
I'm also getting the impression this was based around the idea that a microcontroller like the AVR would be working on individual bits all the time. This is extremely rare, but frequently emulated by PLCs. What we do have is operations making single bit work less costly than on general purpose CPUs. For instance, consider "PORTB |= 1<<3". This can be read as a few fundamental operations:
v0 := 1 // load immediate
v1 := 3
v2 := v0 shiftleft v1 // shift left
v3 := PORTB // load I/O register
v4 := v3 or v2
PORTB := v4 // store back to I/O register
This interpretation would be an extremely reduced instruction set, where loads and stores never combine with ALU operations such as shift and or. You may even get such code out of the compiler if you ask it not to optimize at all. But since it's such a common operation for a microcontroller, the AVR has a single instruction to do this without spending registers on holding v0-v4:
SBI PORTB, 3 // (set bit in I/O register)
This brings us from needing two registers (from reusing vN which are no longer needed) and six instructions to zero registers and one instruction. Further gains are possible because once it's a single instruction, one can use a skip instead of a branch. But it relies on a few things being known, such as 1<<3 setting only a single, fixed bit, and PORTB being among the lowest 32 I/O registers. If the compiler did not know these things, it could never use the SBI instructions, and there was such a time. This is why we have the advice "use the bitwise operators" - you no longer need to write sbi(PORTB,PB3);, which is not obvious to people who don't know the AVR instruction set, but can now write PORTB |= 1<<3; which is standard C, and therefore clearer while being just as effective. Arguably better macro naming might make more readable code too, but many of these macros came along as typing shorthands instead - for instance _BV(x) which is equal to 1<<x.
Sadly some of the standard C formulations become rather tricky, like clearing bit N: port &= ~(1<<N); It makes a pretty good case for a "clear_bit(port, bit)" macro, like Arduino's digitalWrite. Some microcontrollers (such as 8051) provide specific addresses for single bit work, and some compilers provide syntax extensions such as port.3. I sometimes wonder why AVR Libc doesn't declare bitfields for bit manipulation. Pardon the rant. There also remain some optimizations the compiler doesn't know of, such as converting PORTB ^= x; into PINB = x; (which really looks weird - PIN registers aren't writable, so they used that operation for another function).
See also the AVR Libc manual section on bit manipulation, particularly "Porting programs that use the deprecated sbi/cbi macros".
You can also use a switch/case statement, like:
#define OTHER_CONST_VALUE 0x19

switch (received_msg) {
case 0xff:
    do_this();
    break;
case 0x0f:
    do_that();
    break;
case OTHER_CONST_VALUE:
    do_other_thing();
    break;
case 1:
case 2:
    received_1_or_2();
    break;
default:
    received_something_else();
    break;
}
This code executes a command depending on the value of received_msg. It is important to place a constant value after the case keyword, and to be careful with the break statement, which controls where execution jumps out of the { } block.
I'm unsure of what received_msg will be representing. If it is a numerical value, then by all means use a switch-case, if-else, or other comparison structure; no need for a bitmask.
However, if received_msg contains binary data and you only want to look at certain elements and exclude others, a bitmask would be the appropriate approach.
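For example, a sketch of the bitmask approach (mask and expected value are hypothetical):
#define LOW_NIBBLE 0x0F  // look only at bits 0-3

if ((received_msg & LOW_NIBBLE) == 0x0F) {
    // the four low bits are all set; bits 4-7 are ignored
}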

How do I work with bit data in C

In class I've been tasked with writing a C program that decompresses a text file and prints out the characters it contains. Each character in the file is represented by 2 bits (4 possible characters).
I've recently been informed that a byte is not necessarily 8 bits on all systems, and a char is not necessarily 1 byte. This then makes me wonder how on earth I'm supposed to know how many bits got loaded from a file when I loaded 1 byte. Also, how am I supposed to keep the loaded data in memory when there are no data types that can guarantee a set number of bits?
How do I work with bit data in C?
A byte is not necessarily 8 bits. That much is certainly true. A char, on the other hand, is defined to be a byte - C does not differentiate between the two things.
However, the systems you will write for will almost certainly have 8-bit bytes. Bytes of different sizes are essentially non-existent outside of really, really old systems, or certain embedded systems.
If you have to write your code to work for multiple platforms, and one or more of those have differently sized chars, then you write code specifically to handle that platform - using e.g. CHAR_BIT to determine how many bits each byte contains.
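A common compile-time guard for that, using CHAR_BIT from <limits.h>:
#include <limits.h>

#if CHAR_BIT != 8
#error "This code assumes 8-bit bytes"
#endif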
Given that this is for a class, assume 8-bit bytes, unless told otherwise. The point is not going to be extreme platform independence, the point is to teach you something about bit fiddling (or possibly bit fields, but that depends on what you've covered in class).
This then makes me wonder how on earth I'm supposed to know how many bits got loaded from a file when I loaded 1 byte.
You'll be hard pressed to find a platform where a byte is not 8 bits. (though as noted above CHAR_BIT can be used to verify that). Also clarify the portability requirements with your instructor or state your assumptions.
Usually bits are extracted using shifts and bitwise operations, e.g. (x & 3) are the rightmost 2 bits of x, and ((x >> 2) & 3) are the next two bits. Pick the right data type for the platforms you are targeting, or, as others say, use something like uint8_t if it is available for your compiler.
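Applied to the assignment, a minimal sketch (assuming 8-bit bytes and a made-up 4-character alphabet):
#include <stdio.h>
#include <stdint.h>

static const char alphabet[4] = { 'a', 'b', 'c', 'd' };  // hypothetical character set

// print the four 2-bit codes packed into one byte, most significant pair first
static void print_packed_byte(uint8_t byte) {
    for (int shift = 6; shift >= 0; shift -= 2) {
        putchar(alphabet[(byte >> shift) & 3]);  // mask out one 2-bit code
    }
}

int main(void) {
    print_packed_byte(0x1B);  // bits 00 01 10 11 -> prints "abcd"
    putchar('\n');
    return 0;
}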
Also see:
Type to use to represent a byte in ANSI (C89/90) C?
I would recommend not using bit fields. Also see here:
When is it worthwhile to use bit fields?
You can use bit fields in C. These let you explicitly specify the number of bits in each part of the field, if you are truly concerned about width. This page gives a discussion: http://msdn.microsoft.com/en-us/library/yszfawxh(v=vs.80).aspx
As an example, check out ieee754.h for usage of bit fields in the context of implementing IEEE 754 floats.
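For instance, a small sketch of a bit-field declaration (field names invented; note that bit order and padding within the struct are implementation-defined, which is one reason bit fields are risky for portable file formats):
struct status_bits {
    unsigned int ready : 1;  // 1-bit flag
    unsigned int error : 1;  // 1-bit flag
    unsigned int code  : 6;  // 6-bit value, range 0..63
};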

Are bitwise operations still practical?

Wikipedia, the one true source of knowledge, states:
On most older microprocessors, bitwise operations are slightly faster than addition and subtraction operations and usually significantly faster than multiplication and division operations. On modern architectures, this is not the case: bitwise operations are generally the same speed as addition (though still faster than multiplication).
Is there a practical reason to learn bitwise operation hacks or it is now just something you learn for theory and curiosity?
Bitwise operations are worth studying because they have many applications. It is not their main use to substitute arithmetic operations. Cryptography, computer graphics, hash functions, compression algorithms, and network protocols are just some examples where bitwise operations are extremely useful.
The lines you quoted from the Wikipedia article just tried to give some clues about the speed of bitwise operations. Unfortunately the article fails to provide some good examples of applications.
Bitwise operations are still useful. For instance, they can be used to create "flags" using a single variable, saving on the number of variables you would otherwise use to indicate various conditions. Concerning performance on arithmetic operations, it is better to let the compiler do the optimization (unless you are some sort of guru).
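A sketch of the flags idea with named bits (names are illustrative):
enum {
    FLAG_CONNECTED = 1 << 0,
    FLAG_READY     = 1 << 1,
    FLAG_ERROR     = 1 << 2
};

unsigned flags = 0;
flags |= FLAG_READY;                  // set a condition
if (flags & FLAG_READY) { /* ... */ } // test it
flags &= ~FLAG_READY;                 // clear it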
They're useful for getting to understand how binary "works"; otherwise, no. In fact, I'd say that even if the bitwise hacks are faster on a given architecture, it's the compiler's job to make use of that fact — not yours. Write what you mean.
The only case where it makes sense to use them is if you're actually using your numbers as bitvectors. For instance, if you're modeling some sort of hardware and the variables represent registers.
If you want to perform arithmetic, use the arithmetic operators.
Depends what your problem is. If you are controlling hardware you need ways to set single bits within an integer.
Buy an OGD1 PCI board (open graphics card) and talk to it using libpci. http://en.wikipedia.org/wiki/Open_Graphics_Project
It is true that in most cases when you multiply an integer by a constant that happens to be a power of two, the compiler optimises it to use a bit shift. However, when the shift amount is a variable, the compiler cannot deduce it, unless you explicitly use the shift operation.
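To make the distinction concrete (the function and variables are hypothetical):
int scale(int n, int k) {
    int a = n * 8;   // constant power of two: compilers typically emit n << 3
    int b = n << k;  // variable amount: the shift must be written explicitly
    return a + b;
}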
Funny nobody saw fit to mention the ctype[] array in C/C++ - also implemented in Java. This concept is extremely useful in language processing, especially when using different alphabets, or when parsing a sentence.
ctype[] is an array of 256 short integers, and in each integer, there are bits representing different character types. For example, ctype['A'] - ctype['Z'] have bits set to show they are upper-case letters of the alphabet; ctype['0'] - ctype['9'] have bits set to show they are numeric. To see if a character x is alphanumeric, you can write something like if (ctype[x] & (UC | LC | NUM)), which is somewhat faster and much more elegant than writing if ('A' <= x && x <= 'Z' || ...).
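A sketch of such a classification table (bit values and names invented; real <ctype.h> implementations differ between libraries):
#define UC  0x01  // upper-case letter
#define LC  0x02  // lower-case letter
#define NUM 0x04  // decimal digit

static unsigned short char_class[256];  // filled in once at startup

int is_alnum_fast(unsigned char x) {
    return char_class[x] & (UC | LC | NUM);  // one table lookup, one AND
}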
Once you start thinking bitwise, you find lots of places to use it. For instance, I had two text buffers. I wrote one to the other, replacing all occurrences of FINDstring with REPLACEstring as I went. Then for the next find-replace pair, I simply switched the buffer indices, so I was always writing from buffer[in] to buffer[out]. 'in' started as 0, 'out' as 1. After completing a copy I simply wrote 'in ^= 1; out ^= 1;'. And after handling all the replacements I just wrote buffer[out] to disk, not needing to know what 'out' was at that time.
If you think this is low-level, consider that certain mental errors such as deja-vu and its twin jamais-vu are caused by cerebral bit errors!
Working with IPv4 addresses frequently requires bit-operations to discover if a peer's address is within a routable network or must be forwarded onto a gateway, or if the peer is part of a network allowed or denied by firewall rules. Bit operations are required to discover the broadcast address of a network.
Working with IPv6 addresses requires the same fundamental bit-level operations, but because they are so long, I'm not sure how they are implemented. I'd wager money that they are still implemented using the bit operators on pieces of the data, sized appropriately for the architecture.
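As a concrete IPv4 example, a sketch of those tests (addresses hard-coded for illustration):
#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint32_t addr    = (192u << 24) | (168u << 16) | (1u << 8) | 42u;  // 192.168.1.42
    uint32_t network = (192u << 24) | (168u << 16) | (1u << 8);        // 192.168.1.0
    uint32_t mask    = 0xFFFFFF00u;                                    // /24 netmask
    uint32_t bcast   = network | ~mask;                                // 192.168.1.255
    printf("local: %d, broadcast: 0x%08X\n",
           (addr & mask) == network, (unsigned)bcast);
    return 0;
}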
Of course (to me) the answer is yes: there can be practical reasons to learn them. The fact that nowadays, e.g., an add instruction on typical processors is as fast as an or/xor or an and just means that: an add is as fast as, say, an or on those processors.
The improvements in speed of instructions like add, divide, and so on just mean that now on those processors you can use them while worrying less about the performance impact; but it is as true now as in the past that you usually won't replace adds with bitwise operations. That is, it may depend on the hack in question: some hacks must now be considered educational rather than practical; others still have practical applications.

Simple bitwise manipulation for little-endian integer, in big-endian machine?

For a specific need I am building a four-byte integer out of four one-byte chars, using nothing too special (on my little-endian platform):
return (( v1 << 24) | (v2 << 16) | (v3 << 8) | v4);
I am aware that an integer stored on a big-endian machine would look like AB BC CD DE instead of the little-endian DE CD BC AB, but would it affect my operation completely in that I will be shifting incorrectly, or will it just produce a correct result that is stored in reverse and needs to be reversed?
I was wondering whether to create a second version of this function to do (yet unknown) bit manipulation for a big-endian machine, or possibly to use an ntohl-related function, though I am unclear how that would know whether my number is in the correct order or not.
What would be your suggestion to ensure compatibility, keeping in mind I do need to form integers in this manner?
As long as you are working at the value level, there will be absolutely no difference in the results you obtain regardless of whether your machine is little-endian or big-endian. I.e. as long as you are using language-level operators (like | and << in your example), you will get exactly the same arithmetical result from the above expression on any platform. The endianness of the machine is not detectable and not visible at this level.
The only situations when you need to care about endianness are when the data you are working with is examined at the object representation level, i.e. in situations when its raw memory representation is important. What you said above about "AB BC CD DE instead of DE CD BC AB" is specifically about the raw memory layout of the data. That's what functions like ntohl do: they convert one memory layout to another memory layout. So far you gave no indication that the actual raw memory layout is in any way important to you. Is it?
Again, if you only care about the value of the above expression, it is fully and totally endianness-independent. Basically, you are not supposed to care about endianness at all when you write C programs that don't attempt to access and examine the raw memory contents.
although would it affect my operation completely in that I will be shifting incorrectly (?)
No.
The result will be the same regardless of the endian architecture. Bit shifting and twiddling are just like regular arithmetic operations. Is 2 + 2 the same on little endian and big endian architectures? Of course. 2 << 2 would be the same as well.
Little and big endian problems arise when you are dealing directly with the memory. You will run into problems when you do the following:
char bytes[] = {1, 0, 0, 0};
int n = *(int*)bytes;
On little endian machines, n will equal 0x00000001. On big endian machines, n will equal 0x01000000. This is when you will have to swap the bytes around.
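If you do need a fixed byte order (say, parsing a file or packet), composing the value from individual bytes sidesteps the problem entirely; a sketch:
#include <stdint.h>

// read a 32-bit big-endian value from a byte buffer, on any host
uint32_t read_be32(const unsigned char *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}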
ntohl (and ntohs, etc.) is used primarily for moving data from one machine to another. If you're simply manipulating data on one machine, then it's perfectly fine to do bit-shifting without any further ceremony -- bit-shifting (at least in C and C++) is defined in terms of multiplying/dividing by powers of 2, so it works the same whether the machine is big-endian or little-endian.
When/if you need to (at least potentially) move data from one machine to another, it's typically sensible to use htonl before you send it and ntohl when you receive it. These may be complete no-ops (in the case of BE to BE), two identical transformations that cancel each other out (LE to LE), or may actually result in swapping bytes around (LE to BE or vice versa).
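Typical usage, assuming the POSIX <arpa/inet.h> header:
#include <arpa/inet.h>
#include <stdint.h>

uint32_t to_wire(uint32_t host_value) {
    return htonl(host_value);  // host order -> network (big-endian) order
}

uint32_t from_wire(uint32_t wire_value) {
    return ntohl(wire_value);  // network order -> host order
}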
FWIW, I think a lot of what has been said here is correct. However, if the programmer has coded with endianness in mind, say using masks for bitwise inspection and manipulation, then cross-platform results could be unexpected.
You can determine 'endianness' at runtime as follows:
#define LITTLE_ENDIAN 0
#define BIG_ENDIAN 1

int endian() {
    int i = 1;
    char *p = (char *)&i;
    if (p[0] == 1)
        return LITTLE_ENDIAN;
    else
        return BIG_ENDIAN;
}
... and proceed accordingly.
I borrowed the code snippet from here: http://www.ibm.com/developerworks/aix/library/au-endianc/index.html?ca=drs- where there is also an excellent discussion of these issues.
hth -
Perry
