C bit manipulation char array - c

I have a pointer to an unsigned char array, e.g. unsigned char *uc_array. If I shift the content that the address points to right by 8 bits will they be in uc_array + 1?

Shifting the content will modify its value, not move it in memory.

No. Note the difference between (*uc_array)++ and uc_array++. With (*uc_array)++ you increment the value that the pointer points to. With uc_array++ you increment the pointer itself, so that it points to the next neighbouring element. Written without parentheses, *uc_array++ parses as *(uc_array++): it yields the current element and then advances the pointer.
Don't forget that pointer arithmetic is scaled by the size of the type the pointer points to: for character pointers the step is 1 byte, for int pointers it is sizeof(int) bytes, commonly 4, depending on the platform and compiler used.

Your question only makes sense to me when interpreted such as
memmove(uc_array + 1, uc_array, bytesize_of_array);
I'm assuming you are on a platform with 8-bit bytes, and that by shifting you mean shifting the bits of the array interpreted as one long bit sequence of consecutive bytes (and there needs to be room for one more char after the array to receive the shifted data). Then indeed the value stored at address uc_array will afterwards be stored at uc_array + 1.
However if you do a loop like this
for(unsigned char *x = uc_array; x != uc_array + byte_count; ++x)
*x >>= 8;
then, assuming 8-bit bytes, you will just zero everything there, shifting all the bits out of each byte, one byte at a time.

No. Modifications to a value affect only that value, and not adjacent values. This includes the shift operators.
The bits shifted out by a shift operator are "lost".

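It depends on how you shift your data. If you read two adjacent bytes as one 16-bit value, e.g. *(quint16 *)uc_array >> 8, then the bits of one byte do end up in the position of the other within the result (which one depends on the byte order). But if you just do (*uc_array) >> 8, or (quint16)(*uc_array) >> 8, only a single byte is read, so, as the others say, you will empty your data.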

Related

Explanation for cryptic double pointer assignment

I'm reading the c code:
void **alignedData = (void **)(((size_t)temp + aligned - 1)&-aligned);
I don't know what it means, especially the &- part.
Can anyone explain it?
Thanks!
When using this, aligned should be an unsigned type (or the C implementation should be using two’s complement) and have a value that is a power of two. Then this code calculates an amount of memory to be allocated:
(size_t) temp converts temp to the unsigned type size_t, which is suitable for working with sizes. This will be a number of bytes to be allocated.
(size_t) temp + aligned - 1 adds enough bytes to guarantee a multiple of aligned falls somewhere between the numbers temp and temp + aligned - 1, inclusive. For example, if temp is 37 and aligned is 8, then between 37 and 44 (37+8−1), there is a multiple of 8 (40).
-aligned makes a bit mask with 1 in each bit position that is a multiple of aligned and 0 in the lower bits. For example, if aligned is 8, then the bits that represent -aligned are 111…111000, because the 000 bits at the end represent values of 1, 2, and 4, while the other bits represent values of 8, 16, 32, and so on.
The & (bitwise AND) of (size_t) temp + aligned - 1 with -aligned then clears the low bits, leaving only bits that are multiples of aligned. Thus, it produces the multiple of aligned that is in the interval. For example, with the values of 37 and 8 mentioned before, ((size_t) temp + aligned - 1) & -aligned produces 40.
Thus, this expression produces the value of temp rounded up to the next multiple of aligned. It says “Calculate the number of bytes we need to allocate that is at least temp bytes and is a multiple of aligned.”
After this, the code converts this number to the type void ** and uses it to initialize void **alignedData. That is bad C code. There is generally no good reason for it. A number of bytes like this should not be used as any kind of pointer. The code may be attempting to “smuggle” this value through a data type it is compelled to use by some other software, but there is likely a better way to do it, such as by allocating memory to hold the value and supplying a pointer to that memory instead of trying to convert the value directly. Finding a better solution requires knowing more context of the code.

Why is this int not taking up two bytes of my array? [duplicate]

So basically I am really confused as to why when I assign some value of an array to an int is it not taking up two indices within that array.
I have tried to change my code to use pointers/addresses instead of directly setting the array spot to the int, but none of those fix my problem/confusion.
I have declared my array in my header .h file like this
char arr[4096] = {'\0'};
Then I assign some value of this array to 16.
arr[0]=16;
Then I test to see how much space in the array 16 (an int) has taken up:
p = 0;
while(arr[p]!='\0'){
printf("testing\n");
p++;
}
printf("%d\n",p);
However, it always prints 1 for the value of p instead of 2, indicating that only arr[0] is occupied by the int. I am so confused as to how this memory stuff works and how can I get an int to take up two spots of an array in memory. Why is the value 1 instead of 2? And why does the int only take up 1 spot of the array?
Assigning a value of type int to a member of a char array does not take up multiple elements of the array. Section 6.5.16.1p2 of the C standard regarding the assignment operator = states:
In simple assignment(=), the value of the right operand is
converted to the type of the assignment expression and
replaces the value stored in the object designated by the
left operand.
So the int value is converted to type char before assignment. Because the value 16 falls in the range of possible values for a char, the conversion leaves the value unchanged.
If you really want to assign to multiple values of a char array like this, the proper way would be to assign the int value to an int variable and use memcpy:
int value = 16;
memcpy(&arr[0], &value, sizeof(value));
But even then the result would not be any different. Assuming an int is 4 bytes the contents of value are 0x00000010, so only one of the 4 bytes that make up this value contains a value other than 0.
Because 16 is converted to char and placed only in the first cell. That is why you always get 1.
If you set 16 as a pattern for several elements, e.g.:
memset(&arr[0], 16, 4);
or store a bigger value (my int is 4 bytes):
*(int*)arr = 0xFFFFFFFF;
then you get 4 elements that differ from '\0'.
When you assign a value to a variable, the value is first converted to the type of the variable. That's just the way C (and basically all other languages) works.
Another thing that might affect the result (depending on architecture) is that 16 is small enough to fit in one byte. So even if your code were writing to both arr[0] and arr[1], it would not be strange if it wrote 16 to arr[0] and 0 to arr[1]. So you should use larger numbers in order to be sure to detect something like this.
In order to achieve what you expect, you can do something like this:
int *p = (int*) &arr[0];
*p = 12345;
But unless you really know what you're doing, you should not do stuff like that. It's very tricky to get it right.
Also, the usual size for int is 4 bytes, not 2.

Why address of i+2 is not 653064?

I'm learning pointers in C. I'm having confusion in Pointer arithmetic. Have a look at below program :
#include <stdio.h>
int main()
{
int a[] = {2,3,4,5,6};
int *i = a;
/* the original used %d just for the sake of simplicity; %p with a (void *) cast is the portable way to print a pointer */
printf("value of i = %p\n", (void *)i);
printf("value of i+2 = %p\n", (void *)(i+2));
return 0;
}
My question is: if the value of i is 653000, why is the value of i+2 653008? As far as I know, every bit in memory has its own address, so the value of i+2 should be 653064, because 1 byte = 8 bits. Why is pointer arithmetic scaled in bytes and not in bits?
THANKS in advance and sorry for my bad English!
"As far as I know every bit in memory has its address specified"
Wrong.
"Why pointer arithmetic is scaled with byte why not with bit?"
The byte is the minimal addressable unit of storage on a computer, not the bit. Addresses refer to bytes - you cannot create a pointer that points to a specific bit in memory [1].
  Addresses refer to *bytes*
    |
    |
    v        _______________
0x1000      |_|_|_|_|_|_|_|_| \
0x1001      |_|_|_|_|_|_|_|_|  > Each row is one byte
0x1002      |_|_|_|_|_|_|_|_| /
            \_______ _______/
                    v
          Each column is one bit
As others have explained, this is basic pointer arithmetic in action. When you add n to a pointer p, you're adding n elements, not n bytes. You're effectively adding n * sizeof(*p) bytes to the pointer's address.
[1] Without using architecture-specific tricks like bit-banding on ARM, as myaut pointed out.
You should read about pointer arithmetic; a link is given in the comments.
When a pointer is incremented, it advances by an amount that depends on the type it points to. In this case i+2 advances the address by eight bytes.
An int is four bytes here (this is system-defined). So i+2 acts as the address in i plus (2*sizeof(int)) bytes, i.e. the address plus 8. That is why the result is larger by eight.
Addresses are counted in bytes, not bits. Take a character pointer. Each byte has 8 bits and can hold one of 256 values.
Consider the string "hi". It will be stored like this:
h    i
1001 1002
The ASCII value of h is 104, and it is stored in one byte. A signed char can hold the positive values 0 to 127, so one value of the character data type needs exactly one byte. A single bit cannot hold such a value, so pointer arithmetic is based on bytes.
When you compute PTR + n, the address arithmetic is simply
PTR + sizeof(*PTR)*n.
Here the size of the pointed-to int is 4 bytes.

What does casting char* do to a reference of an int? (Using C)

In my course for intro to operating systems, our task is to determine if a system is big or little endian. There's plenty of results I've found on how to do it, and I've done my best to reconstruct my own version of a code. I suspect it's not the best way of doing it, but it seems to work:
#include <stdio.h>
int main() {
int a = 0x1234;
unsigned char *start = (unsigned char*) &a;
int len = sizeof( int );
if( start[0] > start[ len - 1 ] ) {
//biggest in front (Little Endian)
printf("1");
} else if( start[0] < start[ len - 1 ] ) {
//smallest in front (Big Endian)
printf("0");
} else {
//unable to determine with set value
printf( "Please try a different integer (non-zero). " );
}
}
I've seen this line of code (or some version of) in almost all answers I've seen:
unsigned char *start = (unsigned char*) &a;
What is happening here? I understand casting in general, but what happens if you cast a pointer to an int to a char pointer? I know:
unsigned int *p = &a;
assigns the memory address of a to p, and that you can affect the value of a through dereferencing p. But I'm totally lost with what's happening with the char and, more importantly, not sure why my code works.
Thanks for helping me with my first SO post. :)
When you cast between pointers of different types, the result is generally implementation-defined (it depends on the system and the compiler). There are no guarantees that you can access the data through the converted pointer or that it is correctly aligned, etc.
But for the special case when you cast to a pointer to character, the standard actually guarantees that you get a pointer to the lowest addressed byte of the object (C11 6.3.2.3 §7).
So the compiler will implement the code you have posted in such a way that you get a pointer to the lowest-addressed byte of the int. As we can tell from your code, that byte may contain different values depending on endianness.
If you have a 16-bit CPU, the char pointer will point at memory containing 0x12 in case of big endian, or 0x34 in case of little endian.
For a 32-bit CPU, the int would contain 0x00001234, so you would get 0x00 in case of big endian and 0x34 in case of little endian.
If you dereference an int pointer you get sizeof(int) bytes of data (4 with gcc on common platforms; it depends on the compiler). But if you want only one byte, cast that pointer to a character pointer and dereference it; you will get one byte of data. The cast tells the compiler to read that many bytes instead of the size of the original data type.
Values stored in memory are a set of '1's and '0's which by themselves do not mean anything. Datatypes are used for recognizing and interpreting what the values mean. So lets say, at a particular memory location, the data stored is the following set of bits ad infinitum: 01001010 ..... By itself this data is meaningless.
A pointer (other than a void pointer) contains 2 pieces of information. It contains the starting position of a set of bytes, and the way in which the set of bits are to be interpreted. For details, you can see: http://en.wikipedia.org/wiki/C_data_types and references therein.
So if you have
a char *c,
a short int *i,
and a float *f
which look at the bits mentioned above, c, i, and f hold the same address, but *c takes the first 8 bits and interprets them in a certain way. So you can do things like printf("The character is %c", *c). On the other hand, *i takes the first 16 bits and interprets them in a certain way. In this case, it is meaningful to say printf("The character is %d", *i). Again, for *f, printf("The character is %f", *f) is meaningful.
The real differences come when you do math with these. For example,
c++ advances the pointer by 1 byte,
i++ advances it by sizeof(short) bytes (typically 2),
and f++ advances it by sizeof(float) bytes (typically 4).
More importantly, for
(*c)++, (*i)++, and (*f)++ the algorithm used for doing the addition is totally different.
In your question, when you cast from one pointer type to another, you already know that the algorithm you are going to use for manipulating the bits at that location will be easier if you interpret those bits as an unsigned char rather than an unsigned int. The same operators +, -, etc. act differently depending on the datatype they are applied to. If you have worked on physics problems in which a coordinate transformation made the solution very simple, that is the closest analogue to this operation. You are transforming one problem into another that is easier to solve.

C array address confusion

Say we have the following code:
#include <stdio.h>
int main(){
int a[3]={1,2,3};
printf(" E: %p\n", (void *)a);
printf(" &E[2]: %p\n", (void *)&a[2]);
printf("&E[2]-E: 0x%x\n", (unsigned)(&a[2] - a));
return 0;
}
When compiled and run the results are follows:
E: 0xbf8231f8
&E[2]: 0xbf823200
&E[2]-E: 0x2
I understand the result of &E[2] which is 8 plus the array's address, since indexed by 2 and of type int (4 bytes on my 32-bit system), but I can't figure out why the last line is 2 instead of 8?
In addition, what is the type of the expression in the last line: an integer or an integer pointer?
I wonder if it is the C type system (some kind of casting) that causes this quirk?
You have to remember what the expression a[2] really means. It is exactly equivalent to *(a+2). So much so, that it is perfectly legal to write 2[a] instead, with identical effect.
For that to work and make sense, pointer arithmetic takes into account the type of the thing pointed at. But that is taken care of behind the scenes. You get to simply use natural offsets into your arrays, and all the details just work out.
The same logic applies to pointer differences, which explains your result of 2.
Under the hood, in your example the index is multiplied by sizeof(int) to get a byte offset which is added to the base address of the array. You expose that detail in your two prints of the addresses.
When subtracting pointers of the same type the result is number of elements and not number of bytes. This is by design so that you can easily index arrays of any type. If you want number of bytes - cast the addresses to char*.
When you add 1 to a pointer (p+1), the result points to the next element: its address is the address in p plus sizeof(Type) bytes (if Type is int, plus sizeof(int) bytes).
Similar logic holds good for p-1 also ( of course subtract in this case).
If you apply those principles here:
In simple terms:
&a[2] can be represented as (a+2)
&a[2]-a ==> (a+2) - (a) ==> 2
So, behind the scenes, in terms of byte addresses,
&a[2] - &a[0]
==> {((char *)a + (2 * sizeof(int))) - ((char *)a + 0)} / sizeof(int)
==> 2 * sizeof(int) / sizeof(int) ==> 2
The line &E[2]-E is doing pointer subtraction, not integer subtraction. Pointer subtraction (when both pointers point into the same array of the same type) returns the difference of the addresses in bytes divided by the size of the type they point to. The result has the signed integer type ptrdiff_t (defined in <stddef.h>).
To answer your "update" question, once again pointer arithmetic (this time pointer addition) is being performed. It's done this way in C to make it easier to "index" a chunk of contiguous data pointed to by the pointer.
You may be interested in Pointer Arithmetic In C question and answers.
basically, + and - operators take element size into account when used on pointers.
When adding and subtracting pointers in C, you use the size of the data type rather than absolute addresses.
If you have an int pointer and add the number 2 to it, it will advance 2 * sizeof(int). In the same manner, if you subtract two int pointers, you will get the result in units of sizeof(int) rather than the difference of the absolute addresses.
(Having pointers using the size of the data type is quite convenient, so that you for example can simply use p++ instead of having to specify the size of the type every time: p+=sizeof(int).)
Re: "In addition, what type of the last line should be? An integer, or an integer pointer?"
An integer (a plain number), in the same way that Today - April 1 is a number of days, not a date.
If you want to see the byte difference, you'll have to cast to a pointer to a type that is 1 byte in size, like this:
printf("&E[2]-E:\t0x%x\n", (unsigned)((char*)(&a[2])-(char*)(&a[0])));
