can you address single bits of an int? - c

As I understand it, addressing a single bit in an int variable seems possible, if it is passed as a pointer. Am I correct?
uint8_t adcState[8];
uint16_t adcMessageId[8];

void adcEnable(uint8_t* data) {
    // Set ADC Input as enabled
    adcState[(uint8_t) data[0]] = 1;
    // Get ADC Message ID
    adcMessageId[(uint8_t) data[0]] = data[2] << 8 | data[1];
}
So far, this is what I have figured out:
The function receives a pointer to an 8-bit int as an argument
It takes the least significant digit of that int (the pointer is treated as an array, and its first field is being read), and uses it as a field number for the adcState array, which then is set to 1. For example, this would mean that if data was 729, data[0] would be '9' and therefore adcState[9] becomes 1.
Is it possible? Can you use the pointers like this?
For the adcMessageId array a similar approach is taken. However, here the value assigned depends on the third and second digit of the data int.
I don't understand the shift over here. Being a uint8_t value, it has only 8 bits, so shifting it by 8 bits always gives 0000 0000. Therefore an OR with data[1] would be just data[1] itself...
In our example, adcMessageId[9] would become ('7' << 8) bitwise ORed with '2', so just '2'.
Something in my logic seems wrong.

It would seem data is pointing to an array, not a single 8-bit int, and that:
The first element of the array is a pointer into the arrays adcState and adcMessageId
The second and third elements of the array comprise a data value for the array adcMessageId
As commenter @Eugene Sh. pointed out, data[2] is promoted to an int before shifting, so no bits are lost.
The pointer notation uint8_t * is as valid as array notation uint8_t [] in a function signature for passing an array; it's often how char * strings are passed, and arrays decay to a pointer to their first element when passed to functions anyway.
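For example, here is a minimal sketch (function names made up for illustration) showing that the two spellings are interchangeable:

#include <stdint.h>
#include <stdio.h>

/* Both parameter declarations mean the same thing: each function
   receives a pointer to the first element of the caller's array. */
static void viaPointer(uint8_t *data) { printf("%d\n", data[0]); }
static void viaArray(uint8_t data[])  { printf("%d\n", data[0]); }

int main(void) {
    uint8_t msg[3] = { 4, 0x01, 0x02 };
    viaPointer(msg);  /* msg decays to &msg[0] */
    viaArray(msg);    /* identical call */
    return 0;
}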

The function receives a pointer to an 8-bit int as an argument
Yes, roughly. And the function implementation assumes that the pointed-to uint8_t can be treated as the first element of an array of at least three uint8_t.
It takes the least significant digit of that int (the pointer is treated as an array, and its first field is being read), and uses it as a field number for the adcState array, which then is set to 1. For example, this would mean that if data was 729, data[0] would be '9' and therefore adcState[9] becomes 1. Is it possible? Can you use the pointers like this?
No, you have completely misunderstood. data[0] means exactly the same thing as *data. It refers to the 8-bit unsigned integer to which data points. The number 729 is too large to be represented as a uint8_t, but if the object to which data pointed had the value 129 then data[0] would evaluate to 129.
You are perhaps confused by the appearance later in the function of data[1] and data[2]. These refer to two uint8_t objects following *data in memory, such as will reliably be present if data points to the first element of an array, as I alluded to above. Indexing with [] does not have the effect of slicing the uint8_t to which data points.
Pay attention also that I am saying "the object to which data points". One does not pass a uint8_t value directly as this function's parameter. It is not anticipating that an integer value would be reinterpreted as a pointer. You pass a pointer to the data you want the function to work with.
For the adcMessageId array a similar approach is taken. However, here the value assigned depends on the third and second digit of the data int.
In the adcMessageId case, again data[0] refers to the uint8_t to which data points. data[1] refers to another whole uint8_t following that in memory, and data[2] to the next after that.
I don't understand the shift over here. Being a uint8_t value, it has only 8 bits, so shifting it by 8 bits always gives 0000 0000.
uint8_t has only 8 bits, but for arithmetic operations all integer values narrower than int are converted to an integer type at least as wide as int (or, depending on the other operand, perhaps to a floating-point type). The specific promoted type depends in part on what the other operand is, and the result has that promoted type. Type int is at least 16 bits wide on all conforming C implementations. Thus this ...
data[2] << 8 | data[1]
... intends to pack the two uint8_t values data[2] and data[1] into one 16-bit integer, data[2] in the most-significant position. It's not entirely safe because the elements of data will be promoted to (signed) int instead of unsigned int, but that will present an issue only on implementations where int is 16 bits wide (which are uncommon these days), and even then, only if the value of data[2] is larger than 127. A safer way to express it would involve explicit casts:
(unsigned int) data[2] << 8 | (unsigned int) data[1]
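For example, here is a minimal sketch of the promotion at work (the byte values are made up):

#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint8_t lo = 0x01, hi = 0x02;
    // Each operand is promoted to int (at least 16 bits) before the
    // shift, so hi << 8 yields 0x0200 rather than 0.
    uint16_t id = (unsigned int) hi << 8 | (unsigned int) lo;
    printf("0x%04x\n", (unsigned int) id);  // prints 0x0201
    return 0;
}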

You have a few misconceptions. Or maybe just wrong wording.
The function receives a pointer to 8bit int as an argument
More precisely, it gets a pointer to the first element of an array of 8-bit integers. Otherwise your usage would be invalid. Probably it gets a pointer to a string.
It takes the least significant digit of that int (the pointer is treated as an array, and its first field is being read),
That is wrong. You seem to use it as a pointer to a string holding a number.
In that case you access the first character, which is the MOST significant decimal digit.
and uses it as a field number for the adcState array, which then is set to 1. For example, this would mean that if data was 729, data[0] would be '9' and therefore adcState[9] becomes 1. Is it possible? Can you use the pointers like this?
You are messing up things a bit.
If you want to access decimal digits, we are talking about strings, and there the first element is '7', which is not to be confused with 7.
For the adcMessageId array a similar approach is taken. However, here the value assigned depends on the third and second digit of the data int.
Maybe you should not talk about int if you are using strings.
I don't understand the shift over here. Being a uint8_t value, it has only 8 bits, so shifting it by 8 bits always gives 0000 0000. Therefore an OR with data[1] would be just data[1] itself... In our example, adcMessageId[9] would become ('7' << 8) bitwise ORed with '2', so just '2'.
That was already addressed in comments and Govind Parmar's answer: Integer promotion takes place before shifting.
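To illustrate the difference between the character '7' and the number 7, here is a small sketch (assuming an ASCII-compatible encoding):

#include <stdio.h>

int main(void) {
    const char *text = "729";       // three characters, not the number 729
    printf("%c\n", text[0]);        // '7', the MOST significant digit
    printf("%d\n", text[0]);        // 55, the character code of '7'
    printf("%d\n", text[0] - '0');  // 7, the numeric value of that digit
    return 0;
}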

Related

c programming question on reinterpret_cast

What is the reinterpret_cast of (char) doing here?
unsigned int aNumber = 258; // 4 bytes in allocated memory [02][01][00][00]
printf("\n is printing out the first byte %02i",(char)aNumber); // Outputs the first byte[02]
Why am I getting the first byte without pointing to it, such as with (char*)&aNumber?
Is the %02i doing this: (char)*&aNumber?
Or is the (char) cast cutting out the other 3 bytes, since a char occupies only one of those 4 bytes?
First, reinterpret_cast is a C++ operator. What you've shown is not that but a C-style cast.
The cast is converting a value of type unsigned int to a value of type char. Conversion of an out-of-range value is implementation-defined, but in most implementations you're likely to come across, this is implemented as keeping the low-order byte as the converted value.
In this particular case, the low-order byte of aNumber has the value 0x02, so that's what the result is when cast to a char.
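A minimal sketch of that conversion (the exact result of converting an out-of-range value to a signed char is implementation-defined; the modulo-256 behavior shown is what common implementations do):

#include <stdio.h>

int main(void) {
    unsigned int aNumber = 258;      // 0x00000102
    // Conversion to char keeps the value modulo 256 on common
    // implementations, i.e. the low-order byte: 258 % 256 == 2.
    printf("%d\n", (char) aNumber);  // prints 2
    return 0;
}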

What is forbidden after pointer-casting a big type to a smaller type in C

Say I have a bigger type.
uint32_t big = 0x01234567;
Then what can I do for (char*)&big, the pointer interpreted as a char type after casting?
Is that an undefined behavior to shift the address of (char*)&big to (char*&big)+1, (char*&big)+2, etc.?
Is that an undefined behavior to both shift and edit (char*)&big+1? Like the example below. I think this example should be an undefined behavior because after casting to (char*), we have limited our view to a char-type pointer, and we ought not access, let alone change, values outside this scope.
uint32_t big = 0x01234567;
*((char*)&big + 1) = 0xff;
printf("%02x\n\n\n", *((char*)&big+1));
printf("%02x\n\n\n", big);
(This passes my Visual C++ compiler. By the way, I want to ask a forked question: why does the first printf in this example give ffffffff? Shouldn't it be ff?)
I have seen code like this, and this is what I usually do when I need to achieve a similar task. Is this UB or not? Why or why not? What is the standard way to achieve this?
uint8_t catcher[8] = { 0 };
uint64_t big = 0x1234567812345678;
memcpy(catcher, (uint8_t*)&big, sizeof(uint64_t));
Then what can I do for (char*)&big, the pointer interpreted as a char type after casting?
If a char is eight bits, which it is in most modern C implementations, then there are four bytes in the uint32_t big, and you can do arithmetic on the address from (char *) &big + 0 to (char *) &big + 4. You can also read and write the bytes from (char *) &big + 0 to (char *) &big + 3, and those will access individual bytes in the representation of big. Although arithmetic is defined to work up to (char *) &big + 4, that is only an endpoint. There is no defined byte there, and you should not use that address to read or write anything.
Is that an undefined behavior to shift the address of (char*)&big to (char*&big)+1, (char*&big)+2, etc.?
These are additions, not shifts, and the syntax is (char *) &big + 1, not (char*&big)+1. Arithmetic is defined for the offsets from +0 to +4.
Is that an undefined behavior to both shift and edit (char*)&big+1?
It is allowed to read and write the bytes in big using a pointer to char. This is a special rule for character types. Generally, the bytes of an object should not be accessed using an unrelated type. For example, a float object could not be accessed using an int type. However, the character types are special; you may access the bytes of any object using a character type.
However, it is preferable to use unsigned char for this, as it avoids complications with signed values.
I have seen code like this.
It is allowed to read or write the bytes of an object using memcpy. memcpy is defined to work as if by copying characters.
Note that, while accessing the bytes of an object is defined by the C standard, how bytes represent values is partly implementation-defined. Different C implementations may use different orders for the bytes within an object, and there can be other differences.
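Putting those rules together, here is a minimal sketch that inspects the bytes of an object both ways; the byte values in the comments assume a little-endian machine:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void) {
    uint32_t big = 0x01234567;
    unsigned char copy[sizeof big];

    // Direct access through an unsigned char pointer is allowed
    // for an object of any type.
    unsigned char *p = (unsigned char *) &big;
    for (size_t i = 0; i < sizeof big; i++)
        printf("%02x ", (unsigned int) p[i]);  // 67 45 23 01 on little-endian
    printf("\n");

    // memcpy is defined to copy as if character by character,
    // so this is equally well-defined.
    memcpy(copy, &big, sizeof big);
    printf("%02x\n", (unsigned int) copy[0]);  // 67 on little-endian
    return 0;
}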
By the way, I want to ask a forked question: why does the first printf in this example give ffffffff? Shouldn't it be ff?
In your C implementation, char is signed and can represent values from −128 to +127. In *((char*)&big + 1) = 0xff;, 0xff is 255 and is too big to fit into a char. It is converted to a char value in an implementation-defined way. Your C implementation converts it to −1. (The eight-bit two’s complement representation of −1, bits 11111111, uses the same bits as the binary representation of 255, again bits 11111111.)
Then printf("%02x\n\n\n", *((char*)&big+1)); passes this value, −1, to printf. Since it is a char, it is promoted to int to be passed to printf. This produces the same value, −1, but it has 32 bits, 11111111111111111111111111111111. Then you are passing an int, but printf expects an unsigned int for %02x. The behavior of this is not defined by the C standard, but your C implementation reads the 32 bits as if they were an unsigned int. As an unsigned int, the 32 bits 11111111111111111111111111111111 represent the value 4,294,967,295 or 0xffffffff, so that is what printf prints.
You can print the correct value by using printf("%02hhx\n\n\n", *((unsigned char *) &big + 1));. As an unsigned char, the bits 11111111 represent 255 or 0xff, and converting that to an int produces 255 or 0x000000ff.
For variadic functions (like printf) all arguments undergo the default argument promotions, which promote smaller integer types to int.
This conversion includes sign extension if the smaller type is signed, so the value is preserved.
So if char is a signed type (which is implementation defined) with a value of -1 then it will be promoted to the int value -1. Which is what you see.
If you want to print a smaller type you need to first of all cast to the correct type (unsigned char) then use the proper format (like %hhx for printing unsigned char values).
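Both behaviors side by side, in a small sketch (the first line's output is what a typical implementation with a signed, two's complement char produces; strictly speaking it is not defined):

#include <stdio.h>

int main(void) {
    char c = (char) 0xff;                 // -1 if char is signed
    printf("%x\n", c);                    // ffffffff: sign-extended to int
    printf("%hhx\n", (unsigned char) c);  // ff: just the one byte
    return 0;
}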

Typecasting of pointers in c

How does integer pointer to char pointer conversion work?
I have a program with the integer value 320, and I'm casting its address to char*. It shows the output 64. I want to know how this works.
#include <stdio.h>

int main()
{
    int i = 320;
    char *p = (char*)&i;
    printf("%d", *p);
    return 0;
}
Well, on your little-endian system, let's assume sizeof (int) is 4.
Then the memory for i looks like:
   +--+--+--+--+
i: |64| 1| 0| 0|
   +--+--+--+--+
This is because 320 is 0x00000140, i.e. 320 = 1 * 256 + 64.
So you set p to point at the first byte (64), and then dereference it so that single byte is read.
Your final line is wrong, you meant:
printf("%d\n", *p);
Quoting C11, chapter §6.3.2.3, emphasis mine:
A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined. Otherwise, when converted back again, the result shall compare equal to the original pointer. When a pointer to an object is converted to a pointer to a character type, the result points to the lowest addressed byte of the object. Successive increments of the result, up to the size of the object, yield pointers to the remaining bytes of the object.
So, the binary representation would look like this (little-endian architecture assumed, based on the output you presented):

00000001 01000000
^^^^^^^^ ^^^^^^^^
  HAB       LAB       (HAB - High Address Byte, LAB - Low Address Byte)

And, by the cast, you are essentially pointing to the

01000000

part, so the dereference will produce that value as the integer result: (01000000)₂ == (64)₁₀.
Note: Only a character-type pointer is capable of aliasing any other pointer type. Don't try it with other target types that are not compatible with the source type.
The different value is due to truncation; it also depends on the endianness of the platform. The value 320, if stored in an int of, say, 16 bits, has the following binary pattern.
0000 0001 0100 0000
If a pointer to these two bytes is cast to a pointer to char, it will refer to the lower byte (on a little-endian machine), which is as follows.
0100 0000
However, this bit pattern has a numerical value of 64, which is the output of the program.
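The same trick is commonly used to observe a machine's byte order; a minimal sketch:

#include <stdio.h>

int main(void) {
    int i = 320;  // 0x00000140
    unsigned char *p = (unsigned char *) &i;

    // Print every byte of i in address order.
    for (size_t k = 0; k < sizeof i; k++)
        printf("%02x ", (unsigned int) p[k]);  // 40 01 00 00 on little-endian
    printf("\n");

    printf("%s\n", p[0] == 0x40 ? "little-endian" : "big-endian");
    return 0;
}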

Copying int to different memory location, receiving extra bytes than expected

I'm trying to prepend a 2-byte message length after getting the length in a 4-byte int. I use memcpy to copy 2 bytes of the int. When I look at the second byte I copied, it is as expected, but accessing the first byte actually prints 4 bytes.
I would expect dest[0] and dest[1] each to contain one byte of the int. Whether or not it's a significant byte, or the order is switched, I can throw in an offset on the memcpy or reverse 0 and 1. It does not have to be portable; I would just like it to work.
The same error is happening in Windows with LoadRunner and Ubuntu with GCC - so I have at least tried to rule out portability as a cause.
I'm not sure where I'm going wrong. I suspect it's related to not having used pointers recently. Is there a better approach to cast an int to a short and then put it in the first 2 bytes of a buffer?
char* src;
char* dest;
int len = 2753; // hex 0xAC1

src = (char*)malloc(len);
dest = (char*)malloc(len + 2);
memcpy(dest, &len, 2);
memcpy(dest + 2, src, len);

printf("dest[0]: %02x", dest[0]);
// expected result: c1
// actual result:   ffffffc1
printf("dest[1]: %02x", dest[1]);
// expected result: 0a
// actual result:   0a
You cannot just take a random two bytes out of a four byte object and call it a cast to short.
You will need to copy your int into a two byte int before doing your memcpy.
But actually, that isn't the best way to do it either, because you have no control over the byte order of an integer.
Your code should look like this:
dest[0] = ((unsigned)len >> 8) & 0xFF;
dest[1] = ((unsigned)len) & 0xFF;
That should write it out in network byte order aka big endian. All of the standard network protocols use this byte order.
And I'd add something like:
assert( ((unsigned)len & 0xFFFF0000) == 0 ); // should be nothing in the high bytes
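A round-trip sketch of that scheme (a small fixed buffer stands in for the real destination):

#include <assert.h>
#include <stdio.h>

int main(void) {
    unsigned char dest[2];
    int len = 2753;  // hex 0xAC1

    assert(((unsigned) len & 0xFFFF0000) == 0);  // nothing in the high bytes
    dest[0] = ((unsigned) len >> 8) & 0xFF;      // 0x0a, high byte first
    dest[1] = (unsigned) len & 0xFF;             // 0xc1

    // Reading it back is the mirror image of the packing.
    unsigned recovered = ((unsigned) dest[0] << 8) | dest[1];
    printf("%u\n", recovered);                   // 2753
    return 0;
}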
Firstly, you are using printf incorrectly. This
printf("dest[0]: %02x", dest[0]);
uses the x format specifier, which requires an argument of type unsigned int. Not char, but unsigned int and only unsigned int (or alternatively an int with a non-negative value).
The immediate argument you supplied has type char, which is probably signed on your platform. This means that your dest[0] contains -63. A variadic argument of type char is automatically promoted to type int, which turns 0xc1 into 0xffffffc1 (as a signed representation of -63 in type int). Since printf expects an unsigned int value and you are passing a negative int value instead, the behavior is undefined. The printout that you see is nothing more than a manifestation of that undefined behavior. It is meaningless.
One proper way to print dest[0] in this case would be
printf("dest[0]: %02x", (unsigned) dest[0]);
I'm pretty sure the output will still be ffffffc1, but in this case 0xffffffc1 is the perfectly expected result of integer conversion from the negative value -63 to unsigned int type. Nothing unusual here.
Alternatively you can do
printf("dest[0]: %02x", (unsigned char) dest[0]);
which should give you your desired c1 output. Note that the conversion to int takes place in this case as well, but since the original value is positive (193), the result of the conversion to int is positive too and printf works properly.
Finally, if you want to work with raw memory directly, the proper type to use would be unsigned char from the very beginning. Not char, but unsigned char.
Secondly, an object of type int may easily occupy more than two 8-bit bytes. Depending on the platform, the 0xA and 0xC1 values might end up in completely different portions of the memory region occupied by that int object. You should not expect that copying the first two bytes of an int object will copy the 0xAC1 portion specifically.
You make the assumption that an "int" is two bytes. What justification do you have for that? Your code is highly unportable.
You make another assumption that "char" is unsigned. What justification do you have for that? Again, your code is highly unportable.
You make another assumption about the ordering of bytes in an int. What justification do you have for that? Again, your code is highly unportable.
Instead of the literal 2, use sizeof(int). Never hard-code the size of a type.
If this code is to be portable, you should not use int but a fixed-size data type.
If you need 16 bits, you can use int16_t.
Also, printing the chars requires a cast to unsigned. As written, the char is promoted to an int and sign-extended, which produces the leading FFFFs.
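Combining this advice, here is a sketch of a fixed-width, promotion-safe version of the original snippet (variable names kept from the question):

#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint8_t dest[2];
    uint16_t len = 2753;  // 0x0AC1, a fixed 16-bit width

    // Explicit shifts pin down the byte order regardless of endianness.
    dest[0] = (uint8_t) (len >> 8);
    dest[1] = (uint8_t) (len & 0xFF);

    // unsigned char / uint8_t elements print without sign extension.
    printf("dest[0]: %02x\n", (unsigned int) dest[0]);  // 0a
    printf("dest[1]: %02x\n", (unsigned int) dest[1]);  // c1
    return 0;
}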

short type variable automatically extended to integer type?

I want to print the value of b[0xFFFC], like below:
short var = 0xFFFC;
printf("%d\n", b[var]);
But it actually prints the value of b[0xFFFFFFFC].
Why does this happen?
My computer runs Windows XP on a 32-bit architecture.
short is a signed type. It's 16 bits on your implementation. 0xFFFC represents the integer constant 65,532, but when converted to a 16 bit signed value, this is resulting in -4.
So, your line short var = 0xFFFC; sets var to -4 (on your implementation).
0xFFFFFFFC is a 32 bit representation of -4. All that's happening is that your value is being converted from one type to a larger type, in order to use it as an array index. It retains its value, which is -4.
If you actually want to access the 65,533rd element of your array, then you should either:
use a larger type for var. int will suffice on 32-bit Windows, but in general size_t is an unsigned type guaranteed big enough for any non-negative array index.
use an unsigned short, which gives you just enough room for this example, but will go wrong if you want to go another 4 steps forward (see the sketch below).
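A sketch contrasting the index types (array size shrunk so the example stays self-contained):

#include <stdio.h>

int main(void) {
    int b[8] = { 10, 11, 12, 13, 14, 15, 16, 17 };

    short svar = (short) 0xFFFC;   // -4 with a 16-bit two's complement short
    unsigned short uvar = 0xFFFC;  // 65532

    printf("%d\n", svar);             // -4: b[svar] would read before b[0]
    printf("%u\n", (unsigned) uvar);  // 65532: valid only if b were big enough

    size_t idx = 4;                   // size_t is the idiomatic index type
    printf("%d\n", b[idx]);           // 14
    return 0;
}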
With current compilers, a short (16-bit) value is promoted to a 32-bit int in contexts like this. For example, I compiled this code with gcc 4 on 32-bit Ubuntu Linux:
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char** argv)
{
    short var = 0xFFFC;
    printf("%x\n", var);
    printf("%d\n", var);
    return (EXIT_SUCCESS);
}
and the output is:
fffffffc
-4
You can see that the short is widened to 32 bits with sign extension, as expected for two's complement.
As a refresher on the C data types available, have a look here.
There is a rule in C: some data types are promoted to an integral type. For instance:
char ch = '2';
int j = ch + 1;
Now look at the RHS (Right Hand Side) of the expression and notice that ch will automatically get promoted to an int in order to produce the desired result on the LHS (Left Hand Side) of the expression. What would the value of j be? The ASCII code for '2' is 50 decimal or 0x32 hexadecimal; add 1 to it and the value of j would be 51 decimal or 0x33 hexadecimal.
It is important to understand that rule; it explains why a data type gets 'promoted' to another data type.
What is b? That is an array, I presume, with at least 65,533 elements, correct?
Anyway, the %d format specifier is for type int. The short var gets promoted to an int, both as a printf argument and as an array subscript; since the size of an int here is 4 bytes, the value is sign-extended, and hence you are seeing 0xFFFFFFFC rather than 0xFFFC.
This is where casting comes in: telling the compiler to convert a data type to another, which ties in with Gregory Pakosz's answer above.
Hope this helps,
Best regards,
Tom.
Use %hx or %hd instead to indicate that you have a short variable, e.g.:
printf("short hex: %hx\n", var); /* tell printf that var is short and print out as hex */
EDIT: Oops, I got the question wrong. It was not about printf() as I thought, so this answer might be a little bit OT.
New: Because you are using var as an index into an array, you should declare it as unsigned short (instead of short):
unsigned short var = 0xFFFC;
printf("%d\n", b[var]);
The 'short var' could be interpreted as a negative number.
To be more precise:
You are "underflowing" into the negative value range: Values in the range from 0x0000 upto 0x7FFF will be OK. But values from 0x8000 upto 0xFFFF will be negative.
Here are some examples of var used as an index to array b[]:
short var = 0x0000;  // leads to b[0]      => OK
short var = 0x0001;  // leads to b[1]      => OK
short var = 0x7FFF;  // leads to b[32767]  => OK
short var = 0x8000;  // leads to b[-32768] => wrong
short var = 0xFFFC;  // leads to b[-4]     => wrong
short var = 32767;   // same as b[0x7FFF]  => OK
short var = 32768;   // compile warning or error => overflow into the 32-bit range
You were expecting to store and read back just a 16-bit variable, but at the printf call (and in the array subscript) the value is widened to a 32-bit int.
The extra FFFF comes from the fact that short is a signed type: when a two's complement value is sign-extended from 16 to 32 bits, the extension replicates the sign bit into all the new upper bits. Of course, you did not intend that.
So, since in this case you're interested in absolute array positions, you should declare your index as unsigned.
In the subject of your question you have already guessed what is happening here: yes, a value of type short is "automatically extended" to a value of type int. This process is called integral promotion. That's how it always works in the C language: every time you use an integral value smaller than int, that value is implicitly promoted to a value of type int (unsigned values can be promoted to unsigned int). The value itself does not change, of course; only the type of the value is changed. In your example the 16-bit short value represented by the pattern 0xFFFC is the same as the 32-bit int value represented by the pattern 0xFFFFFFFC, which is -4 in decimal. This, BTW, makes the rest of your question sound rather strange: promotion or not, your code is trying to access b[-4]. The promotion to int doesn't change anything.
