I fail to understand the following assignment - c

char errorString[20];
*(UInt32*)(errorString + 1) = CFSwapInt32HostToBig(statusCode);
I found this in a book about audio programming and, considering CFSwapInt32HostToBig returns an Int32, I can't understand why does it need to make that strange cast and why it assigns starting with the address of the second element (+1) in the char buffer.
What will errorString contain after this assignment?

errorString + 1 (which is of type char *) is cast to a pointer to UInt32 and then dereferenced. Hence the four consecutive bytes of errorString, from the second to the fifth (errorString[1] ... errorString[4]), will contain the binary representation of the integer returned by CFSwapInt32HostToBig(statusCode), that is, statusCode in big-endian byte order.

I can't understand why does it need to make that strange cast
The cast is necessary to avoid truncating the data to a single char: if you drop the cast, like this
*(errorString + 1) = CFSwapInt32HostToBig(statusCode);
the assignment will modify a single char. Effectively, it's this:
*(errorString + 1) = (char)CFSwapInt32HostToBig(statusCode);
which is not what the author of the code wanted.
As far as adding a byte goes, the answer depends on the use of errorString: most likely, some other piece of data is supposed to go there.

CFSwapInt32HostToBig returns a value of a 32-bit type but errorString is an array of char.
The programmer wants to store the 4 bytes into the array of char starting from position &errorString[1].
Note that this is not safe and should be avoided, as it breaks the aliasing rules and may violate alignment requirements.
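A way to get the same bytes into the buffer without the pointer cast is to build them explicitly and copy them in. This is only a sketch: store_be32 is a hypothetical helper, not something from the book, and it replaces both the cast and the CFSwapInt32HostToBig call by writing the most significant byte first.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the book): writes `value` into dst[0..3]
   in big-endian byte order without casting dst to a wider pointer type. */
static void store_be32(char *dst, uint32_t value)
{
    unsigned char bytes[4];
    bytes[0] = (unsigned char)(value >> 24); /* most significant byte first */
    bytes[1] = (unsigned char)(value >> 16);
    bytes[2] = (unsigned char)(value >> 8);
    bytes[3] = (unsigned char)value;         /* least significant byte last */
    memcpy(dst, bytes, sizeof bytes);        /* byte-wise copy: no alignment or aliasing problem */
}
```

With the buffer from the question, the call would be store_be32(errorString + 1, statusCode); the bytes land in errorString[1] through errorString[4], and errorString[0] is left untouched.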

Related

Using + to check if multiple pointers are all NULL

Syntactically it makes sense (Although it looks like some other language, which I don't particularly enjoy), it can save a lot of typing and code space, but how bad is it?
if(p1 + (unsigned)p2 + (unsigned)p3 == NULL)
{
// all pointers are NULL, exit
}
Using pointer arithmetic with a pointer rvalue, I don't see how it could give a false result (the entire expression to evaluate to NULL even though not all pointers are NULL), but I don't exactly know how much evilness this potentially hides, so is it bad to do this, not-common way of checking if plenty of pointers are all NULL?
Regarding the original version of the question, which omitted the casts ...
it can save a lot of typing and code space, but how bad is it?
Very, very bad. Its behavior is altogether undefined, and if your compiler fails to reject it then you should get yourself a better one. Subtraction of one pointer from another is defined under some circumstances (and yields an integer result), but it is never meaningful to add two pointers.
Inasmuch as it shouldn't even compile, every keystroke used to type it instead of something that works is wasted, so no, it doesn't save typing or code space.
I don't see how it could give a false result.
If the compiler actually accepts it, the result can be anything at all. It is undefined.
so is it bad to do this, not-common way of checking if plenty of pointers are all NULL?
Yes.
Regarding the modified question in which all but one of the pointers are cast to integer:
The casts do not rescue the code -- multiple problems remain.
If the remaining pointer does not point to a valid object, or if the sum of the integers is negative or greater than the number of elements in the array to which the pointer points then the result of the pointer addition is still undefined (where a pointer to a scalar is treated as a pointer to a one-element array). Of course, the integer sum can't be negative in this particular case, but that's of minimal advantage.
C does not guarantee that casting a null pointer to an integer yields the value 0. It is common for it to do so, but the language does not require it.
C does not guarantee that non-null pointers convert to nonzero integers, and with your particular code that's a genuine risk. The type unsigned is not necessarily large enough to afford a distinct value to every distinct pointer.
Even if all of the foregoing were not a problem for some particular implementation -- that is, if you could safely perform arithmetic on a NULL pointer, and NULL pointers reliably converted to integers as zero, and non-NULL pointers reliably converted to nonzero -- the test could still go wrong because two nonzero unsigned integers can sum to zero. That happens where the arithmetic sum of the two is equal to UINT_MAX + 1.
There are multiple reasons why this is not a reliable method.
First, when you add an integer to a pointer, the C standard does not say what happens if the result is outside of the array into which the pointer points. (For these purposes, pointing just one past the last element, the end of the array, counts as inside, not outside. Also, a pointer to a single object counts as an array of one object.) Note that the C standard does not just not say what the result of the addition is; it does not say what the behavior of the entire program is. So, once you execute an addition that goes outside of an array, you cannot predict (from the C standard) what your program will do at all.
One likely result is that the compiler will see pointer + integer + integer and reason (or, more technically, apply transformations as if this reasoning were used) that pointer + integer is valid only if pointer is not NULL, and then the result is never NULL, so the expression pointer + integer is never NULL. Similarly, pointer + integer + integer is never NULL. Therefore pointer + integer + integer == NULL is always false, and we can optimize the program by removing this code completely. Thus, the code to handle the case when all pointers are NULL will be silently removed from your program.
Second, even if the C standard did guarantee a result of the addition, this expression could, hypothetically, evaluate to NULL even if none of the pointers were NULL. For example, consider a 16-bit address space where the first pointer were represented with the address 0x7000, the second were 0x6000, and the third were 0x3000. (I will also suppose these are char * pointers, so one element is one byte.) If we add these, the mathematical result is 0x10000. In 16-bit arithmetic, that wraps, so the computed result is 0x0000. Thus, the expression could evaluate to zero, which is likely used for NULL.
Third, unsigned may be narrower than pointers (for example, it may be 32 bits while pointers are 64), so the cast may lose information—there may be non-zero bits in the bits that were lost during the conversion, so the test will fail to detect them.
There are situations where we want to optimize pointer tests, and there are legitimate but non-standard ways to do it. On some processors, branching can be expensive, so doing some arithmetic with one test and one branch may be faster than doing three tests and three branches. C provides an integer type intended for working with pointer representations: uintptr_t, declared in <stdint.h>. With that, we can write this code:
if (((uintptr_t) p1 | (uintptr_t) p2 | (uintptr_t) p3) == 0) …
What this does is convert each pointer to an unsigned integer of a width suitable for working with pointer representations. The C standard does not say what the result of this conversion is, but it is intended to be unsurprising, and C implementations for flat address spaces may document that the result is the memory address. They may also document that NULL is the zero address. Once we have these integers, we OR them together instead of adding them. The result of an OR has a bit set if either of the corresponding bits in its operands was set. Thus, if any one of the addresses is not zero, then the result will not be zero either. So this code, if executed in a suitable C implementation, will perform the test you desire.
(I have used such tests in special high-performance code to test whether all pointers were aligned as desired, rather than to test for NULL. In that case, I had direct access to the compiler developers and could ensure the compiler would behave as desired. This is not standard C code.)
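The 16-bit wrap-around example above can be checked numerically. In this minimal sketch the addresses are modelled as plain uint16_t values rather than real pointers, and sum_is_zero16 and or_is_zero16 are hypothetical helper names chosen for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Models the addition-based test in a hypothetical 16-bit address space. */
static bool sum_is_zero16(uint16_t a, uint16_t b, uint16_t c)
{
    return (uint16_t)(a + b + c) == 0;   /* the sum is reduced modulo 0x10000 */
}

/* Models the OR-based test on the same values. */
static bool or_is_zero16(uint16_t a, uint16_t b, uint16_t c)
{
    return (uint16_t)(a | b | c) == 0;   /* zero only if every operand is zero */
}
```

For 0x7000, 0x6000 and 0x3000 the sum wraps to zero, so the addition-based test would wrongly report "all NULL", while the OR of the same three values is nonzero.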
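Pointer arithmetic whose result falls outside the array the pointer points into (or beyond one past its end) is undefined behavior in C, and a pointer to a non-array object counts as a pointer to a one-element array.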
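For comparison, the fully portable test needs no arithmetic at all. A minimal sketch, with all_null as a hypothetical helper name:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true only when all three pointers are null. */
static bool all_null(const void *p1, const void *p2, const void *p3)
{
    return p1 == NULL && p2 == NULL && p3 == NULL;
}
```

Each comparison against NULL is well defined for any pointer value, so nothing here depends on how the implementation represents pointers.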

C Operator Precedence with pointer increments

I am trying to understand a line of C-code which includes using a pointer to struct value (which is a pointer to something as well).
Example C-code:
// Given
typedef struct {
uint8 *output;
uint32 bottom;
} myType;
myType *e;
// Then at some point:
*e->output++ = (uint8) (e->bottom >> 24);
Source: https://www.rfc-editor.org/rfc/rfc6386#page-22
My question is:
What exactly does that line of C-code do?
"What exactly does that line of C-code do?"
Waste a lot of time, because you have to read it carefully instead of understanding it at a glance. If I were doing code review of that, I'd throw it back to the author and ask for it to be broken up into two lines.
It does two things: it stores something at the location e->output points to, then advances e->output to the next byte. If you need two parts to describe what a piece of code does, it should be written as two separate statements on two lines.
As pointed out by Deduplicator in the comments above, looking at an operator precedence table might help.
*e->output++ = ... means "assign the value ... to the location e->output is pointing to, and afterwards let e->output point to the next location, one byte (8 bits) further on (because output is a pointer to uint8)".
(uint8) (e->bottom >> 24) is evaluated to get the value for ...
The line
*e->output++ = (uint8) (e->bottom >> 24);
does the following:
Find the field bottom of the structure pointed to by the pointer e.
Fetch the 32-bit value from that field.
Shift that value right 24 bits.
Convert that value to uint8_t, which keeps only the low 8 bits; these now hold the original high-order byte.
Find the field output of the structure. It's a pointer to uint8_t.
Store the uint8_t we computed earlier into the address pointed to by output.
And finally, add 1 to output, causing it to point to the next uint8_t.
The order of some of those things might be rearranged a bit, as long as the result behaves as if they had been done in that order. Operator precedence is a completely separate question from the order in which operations are performed, and not really relevant here.
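The steps listed above can be written out as separate statements. This is a sketch under assumptions: the question's snippet does not define uint8 and uint32, so they are assumed here to be aliases for the <stdint.h> types, and put_top_byte is a hypothetical function name.

```c
#include <stdint.h>

/* Assumed typedefs: the question's snippet uses uint8/uint32 without defining them. */
typedef uint8_t  uint8;
typedef uint32_t uint32;

typedef struct {
    uint8  *output;
    uint32  bottom;
} myType;

/* Same effect as: *e->output++ = (uint8) (e->bottom >> 24); */
static void put_top_byte(myType *e)
{
    uint8 top = (uint8)(e->bottom >> 24); /* high-order byte of bottom */
    *e->output = top;                     /* store it where output points */
    e->output++;                          /* then advance output by one uint8 */
}
```

The one-line original parses as *((e->output)++) = ..., so the value is stored through the old pointer and the pointer itself, not the byte it points at, is incremented.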

Using bit operations to "turn off" binary digits of a pointer

I was able to use bit operations to "turn off" binary digits of a number.
Ex:
x = x & ~(1<<0)
x = x & ~(1<<1)
(and repeat until desired number of digits starting from the right are changed to 0)
I would like to apply this technique to a pointer's address.
Unfortunately, the & operator cannot be used with pointers. Using the same lines of code as above, where x is a pointer, the compiler says "invalid operands to binary & (have int and int)."
I tried to typecast the pointers as ints, but that doesn't work as I assume the ints are too small (and I just realized I'm not allowed to cast).
(note: though this is part of a homework problem, I've already reasoned out why I need to turn off some digits after a good couple hours, so I'm fine in that regard. I'm simply trying to see if I can get a clever technique to do what I want to do here).
Restrictions: I cannot use loops, conditionals, any special functions, constants greater than 255, division, mod.
(edit: added restrictions to the bottom)
Use uintptr_t from <stdint.h>. You should always use unsigned types for bit twiddling, and (u)intptr_t is specifically chosen to be able to hold a pointer's value.
Note however that adjusting a pointer manually and dereferencing it is undefined behaviour, so watch your step. You must recover the exact original value of the pointer (or another valid pointer) before doing so.
Edit: from your comment I understand that you don't plan on dereferencing the twiddled pointer at all, so no undefined behaviour for you. Here is how you can check whether your pointers share the same 64-byte block:
uintptr_t p1 = (uintptr_t)yourPointer1;
uintptr_t p2 = (uintptr_t)yourPointer2;
uintptr_t mask = ~(uintptr_t)63u; // Clear the 6 low-order bits (63 == 0x3F)
return (p1 & mask) == (p2 & mask);
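Wrapped into a complete function, the check might look like the sketch below. It assumes that uintptr_t is available and that converting a pointer to it yields a flat memory address, which the C standard leaves implementation-defined; same_block64 is a hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper. Assumes uintptr_t exists and that converting a pointer
   to it yields a flat memory address (implementation-defined behaviour). */
static bool same_block64(const void *a, const void *b)
{
    const uintptr_t mask = ~(uintptr_t)63u;  /* clears the 6 low-order bits */
    return ((uintptr_t)a & mask) == ((uintptr_t)b & mask);
}
```

Under that assumption, two addresses compare equal after masking exactly when they agree in every bit above the low six, which is the "same 64-byte block" condition.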
The C standard library includes the (optional) type intptr_t, for which there is a guarantee that "any valid pointer to void can be converted to this type, then converted back to pointer to void, and the result will compare equal to the original pointer".
Of course, if you perform bitwise operations on the integer and then convert the modified value back to a pointer, that guarantee no longer applies, and dereferencing the result is undefined behaviour unless it happens to be a valid pointer.
Edit:
How unfortunate haha. I need a function to show two pointers are in
the same 64-byte block of memory. This holds true so long as every
digit but the least significant 6 digits of their binary
representations are equal. By making sure the last 6 digits are all
the same (ex: 0), I can return true if both pointers are equal. Well,
at least I hope so.
You should be able to check if they're in the same 64-byte block of memory with something like this:
if ((char *)high_pointer - (char *)low_pointer < 64) {
// do stuff
}
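Edit2: This is likely to be undefined behaviour, as pointed out by chris. It also measures the distance between the two pointers rather than whether they fall in the same aligned 64-byte block.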
Original post:
You're probably looking for intptr_t or uintptr_t. The standard says you can cast to and from these types to pointers and have the value equal to the original.
However, despite it being a standard type, it is optional so some library implementations may choose not to implement it. Some architectures might not even represent pointers as integers so such a type wouldn't make sense.
It is still better than casting to and from an int or a long since it is guaranteed to work on implementations that supply it. Otherwise, at least you'll know at compile time that your program will break on a certain implementation/architecture.
(Oh, and as other answers have stated, manually changing the pointer once it has been cast to an integer type, and then dereferencing it, is undefined behaviour.)

What does casting char* do to a reference of an int? (Using C)

In my course for intro to operating systems, our task is to determine if a system is big or little endian. There's plenty of results I've found on how to do it, and I've done my best to reconstruct my own version of a code. I suspect it's not the best way of doing it, but it seems to work:
#include <stdio.h>
int main() {
int a = 0x1234;
unsigned char *start = (unsigned char*) &a;
int len = sizeof( int );
if( start[0] > start[ len - 1 ] ) {
//biggest in front (Little Endian)
printf("1");
} else if( start[0] < start[ len - 1 ] ) {
//smallest in front (Big Endian)
printf("0");
} else {
//unable to determine with set value
printf( "Please try a different integer (non-zero). " );
}
}
I've seen this line of code (or some version of) in almost all answers I've seen:
unsigned char *start = (unsigned char*) &a;
What is happening here? I understand casting in general, but what happens if you cast the address of an int to a char pointer? I know:
unsigned int *p = &a;
assigns the memory address of a to p, and that you can affect the value of a by dereferencing p. But I'm totally lost as to what's happening with the char and, more importantly, not sure why my code works.
Thanks for helping me with my first SO post. :)
When you cast between pointers of different types, the result is generally implementation-defined (it depends on the system and the compiler). There are no guarantees that you can access the data through the new pointer or that it is correctly aligned.
But for the special case when you cast to a pointer to character, the standard actually guarantees that you get a pointer to the lowest addressed byte of the object (C11 6.3.2.3 §7).
So the compiler will implement the code you have posted in such a way that you get a pointer to the lowest-addressed byte of the int. As we can tell from your code, that byte may contain different values depending on endianness.
If int is 16 bits wide, the char pointer will point at memory containing 0x12 in the case of big endian, or 0x34 in the case of little endian.
If int is 32 bits wide, it contains 0x00001234, so you would get 0x00 in the case of big endian and 0x34 in the case of little endian.
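The same idea can be packaged with a fixed-width type so that the byte values do not depend on the width of int. A minimal sketch: lowest_addressed_byte and is_little_endian are hypothetical helper names, and the probe value 0x01020304 is an arbitrary choice whose four bytes are all different.

```c
#include <stdint.h>

/* Returns the byte stored at the lowest address of *v. */
static unsigned char lowest_addressed_byte(const uint32_t *v)
{
    const unsigned char *bytes = (const unsigned char *)v;
    return bytes[0];
}

/* Hypothetical helper: 1 on a little-endian host, 0 otherwise. */
static int is_little_endian(void)
{
    const uint32_t probe = 0x01020304u;
    return lowest_addressed_byte(&probe) == 0x04; /* least significant byte stored first */
}
```

Reading an object through an unsigned char pointer is always permitted, which is why this cast is safe where a cast to a wider pointer type would not be.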
If you dereference an int pointer you will get sizeof(int) bytes of data (4 on typical gcc targets). If you want only one byte, cast that pointer to a character pointer and dereference it; you will get one byte of data. The cast tells the compiler to read that many bytes instead of the size of the original data type.
Values stored in memory are a set of 1s and 0s which by themselves do not mean anything. Data types are used for recognizing and interpreting what the values mean. So let's say that, at a particular memory location, the data stored is the following sequence of bits: 01001010 ..... By itself this data is meaningless.
A pointer (other than a void pointer) contains 2 pieces of information. It contains the starting position of a set of bytes, and the way in which the set of bits are to be interpreted. For details, you can see: http://en.wikipedia.org/wiki/C_data_types and references therein.
So if you have
a char *c,
an short int *i,
and a float *f
which point at the bits mentioned above, c, i, and f hold the same address, but *c takes the first 8 bits and interprets them in a certain way. So you can do things like printf("The character is %c", *c). On the other hand, *i takes the first 16 bits (assuming a 16-bit short) and interprets them in a certain way. In this case, it is meaningful to say printf("The integer is %d", *i). Again, for *f, printf("The float is %f", *f) is meaningful.
The real differences come when you do math with these. For example,
c++ advances the pointer by 1 byte,
i++ advances it by sizeof(short) bytes (typically 2),
and f++ advances it by sizeof(float) bytes (typically 4).
More importantly, for
(*c)++, (*i)++, and (*f)++ the algorithm used for doing the addition is totally different.
In your question, when you cast from one pointer type to another, you already know that the algorithm you are going to use for manipulating the bits at that location will be easier if you interpret those bits as an unsigned char rather than an unsigned int. The same operators (+, -, etc.) act differently depending on the data type they are applied to. If you have worked on physics problems in which a coordinate transformation made the solution very simple, this is the closest analogue to that operation: you are transforming one problem into another that is easier to solve.
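How far a pointer moves when incremented can be measured by viewing both addresses through char pointers. A small sketch, with short_stride and float_stride as hypothetical helper names:

```c
#include <stddef.h>

/* Bytes between p and p + 1 for a short pointer, measured via char pointers. */
static size_t short_stride(void)
{
    short s[2] = {0, 0};
    short *p = s;
    return (size_t)((char *)(p + 1) - (char *)p);
}

/* The same measurement for a float pointer. */
static size_t float_stride(void)
{
    float f[2] = {0.0f, 0.0f};
    float *p = f;
    return (size_t)((char *)(p + 1) - (char *)p);
}
```

Each function returns the size of the pointed-to type, so the exact numbers depend on the platform.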

C array address confusion

Say we have the following code:
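#include <stdio.h>
int main(void){
int a[3]={1,2,3};
printf(" E: %p\n", (void *)a); // %p is the conversion specifier for pointers
printf(" &E[2]: %p\n", (void *)&a[2]);
printf("&E[2]-E: 0x%x\n", (unsigned)(&a[2] - a)); // the difference is a ptrdiff_t, converted for %x
return 0;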
}
When compiled and run the results are follows:
E: 0xbf8231f8
&E[2]: 0xbf823200
&E[2]-E: 0x2
I understand the result of &E[2], which is the array's address plus 8, since it is indexed by 2 and the elements are of type int (4 bytes on my 32-bit system), but I can't figure out why the last line is 2 instead of 8.
In addition, what should the type of the last expression be - an integer or an integer pointer?
I wonder if it is the C type system (some kind of casting) that causes this quirk?
You have to remember what the expression a[2] really means. It is exactly equivalent to *(a+2). So much so, that it is perfectly legal to write 2[a] instead, with identical effect.
For that to work and make sense, pointer arithmetic takes into account the type of the thing pointed at. But that is taken care of behind the scenes. You get to simply use natural offsets into your arrays, and all the details just work out.
The same logic applies to pointer differences, which explains your result of 2.
Under the hood, in your example the index is multiplied by sizeof(int) to get a byte offset which is added to the base address of the array. You expose that detail in your two prints of the addresses.
When subtracting pointers of the same type the result is number of elements and not number of bytes. This is by design so that you can easily index arrays of any type. If you want number of bytes - cast the addresses to char*.
When you increment a pointer by 1 (p+1), the result points to the next element: the address advances by sizeof(Type) bytes (by sizeof(int) bytes if Type is int).
Similar logic holds for p-1 (subtracting, of course, in this case).
If you just apply those principles here:
In simple terms:
&a[2] can be represented as (a+2)
&a[2] - a ==> (a+2) - (a) ==> 2
So, behind the scenes, treating a as a byte address,
&a[2] - &a[0]
==> {(a + (2 * sizeof(int))) - (a + 0)} / sizeof(int)
==> 2 * sizeof(int) / sizeof(int) ==> 2
The line &E[2]-E is doing pointer subtraction, not integer subtraction. Pointer subtraction (when both pointers point to elements of the same array) returns the difference of the addresses divided by the size of the type they point to. The result has the signed integer type ptrdiff_t, defined in <stddef.h>.
To answer your "update" question, once again pointer arithmetic (this time pointer addition) is being performed. It's done this way in C to make it easier to "index" a chunk of contiguous data pointed to by the pointer.
You may be interested in the "Pointer Arithmetic In C" question and its answers.
Basically, the + and - operators take element size into account when used on pointers.
When adding and subtracting pointers in C, you use the size of the data type rather than absolute addresses.
If you have an int pointer and add the number 2 to it, it will advance 2 * sizeof(int). In the same manner, if you subtract two int pointers, you will get the result in units of sizeof(int) rather than the difference of the absolute addresses.
(Having pointers using the size of the data type is quite convenient, so that you for example can simply use p++ instead of having to specify the size of the type every time: p+=sizeof(int).)
Re: "In addition, what type of the last line should be - an integer or an integer pointer?"
An integer (strictly, a ptrdiff_t), not a pointer. By the same token, Today - April 1 is a number of days, not a date.
If you want to see the byte difference, you'll have to cast to a pointer type whose elements are 1 byte in size, like this:
printf("&E[2]-E:\t0x%x\n", (unsigned)((char*)(&a[2]) - (char*)(&a[0])));
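The two kinds of difference discussed in this thread can be put side by side. A minimal sketch, with element_distance and byte_distance as hypothetical helper names; both pointers are assumed to point into the same array.

```c
#include <stddef.h>

/* Number of int elements between two pointers into the same array. */
static ptrdiff_t element_distance(const int *hi, const int *lo)
{
    return hi - lo;
}

/* Number of bytes between the same two pointers. */
static ptrdiff_t byte_distance(const int *hi, const int *lo)
{
    return (const char *)hi - (const char *)lo;
}
```

For the array in the question, element_distance(&a[2], a) is 2, while byte_distance(&a[2], a) is 2 * sizeof(int), which is 8 on the asker's system.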

Resources