I recently did an assignment using bit masking and shifting to manipulate a 4 byte int.
I got to wondering if it was possible to set a char pointer to the start of the int variable and then step through the int as if it was a 1 byte char by using the char pointer.
Is there a way to do this or something similar? I tried to set the char pointer to an int but when I step ahead by 1 it jumps 4 bytes instead.
Just trying to think of alternative ways of doing the same thing.
Of course you can, this code shows the behavior:
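#include <stdio.h>
int main(void)
{
int value = 1234567;
char *pt = (char *) &value;
/* %p expects a void *, so cast the pointers before printing */
printf("first char: %p, second char: %p\n", (void *) pt, (void *) (pt + 1));
return 0;
}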
This outputs:
first char: 0x7fff5fbff448, second char: 0x7fff5fbff449
As you can see, the difference is just 1 byte, as intended. This is because pointer arithmetic is done in units of the pointed-to type, and after the cast that type is char, which is 1 byte.
I imagine this should do what you want:
int x = 42;
char *c = (char *) &x;
char byte0 = c[0];
char byte1 = c[1];
char byte2 = c[2];
char byte3 = c[3];
Yes, a char pointer steps 1 byte at a time; you probably declared or cast it as an int pointer by accident.
Another complication is the order of the bytes within an int: on Intel processors, at least, the least significant byte comes first (little-endian).
I am trying to understand, what's going on in this program. The output is -121 3. How do we get this output?
#include <stdio.h>
int main(void) {
int a = 903;
char *p = (char *) &a;
printf("%d ",*p++);
printf("%d",*p);
return 0;
}
Well what happens...
903 equals 0x387 in hex.
int a = 903;
You make a pointer to it and cast it to a char pointer (plain char is signed on this platform):
char *p = (char *) &a;
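That now points to the least significant byte of 0x387 (the machine is little-endian), which reads 0x87; treated as a signed char, that is -121. The post-increment then advances the pointer to the next byte.
printf("%d ",*p++);
Now you read the second byte, which is 0x03, i.e. 3. Note that this is not the most significant byte of the int: with a 4-byte int, the two remaining bytes are both 0.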
printf("%d",*p);
However, relying on this is not a very good idea: the result depends on the machine's byte order and on whether plain char is signed.
#include<stdio.h>
int main()
{
char arr[] = "somestring";
char *ptr1 = arr;
char *ptr2 = ptr1 + 3;
printf("ptr2 - ptr1 = %ld\n", ptr2 - ptr1);
printf("(int*)ptr2 - (int*) ptr1 = %ld", (int*)ptr2 - (int*)ptr1);
return 0;
}
I understand
ptr2 - ptr1
gives 3 but cannot figure out why second printf prints 0.
It's because when you subtract two pointers, you get the distance between them in number of elements, not in bytes.
(char*)ptr2-(char*)ptr1 // distance is 3 bytes / sizeof(char), i.e. 3 elements
(int*)ptr2-(int*)ptr1 // distance is 3 bytes / sizeof(int), i.e. 0.75, truncated to 0
EDIT: I was wrong by saying that the cast forces the pointer to be aligned
If you want to check the distance between addresses, don't cast to (int *) or (void *); subtract the char pointers directly. ptrdiff_t is a type able to represent the result of any valid pointer subtraction operation.
#include <stdio.h>
#include <stddef.h>
int main(void)
{
char arr[] = "somestring";
char *ptr1 = arr;
char *ptr2 = ptr1 + 3;
ptrdiff_t diff = ptr2 - ptr1;
printf ("ptr2 - ptr1 = %td\n", diff);
return 0;
}
EDIT: As pointed out by @chux, use the "%td" conversion specifier for ptrdiff_t.
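Casting a char pointer to int * does not change the address it holds, but it does change the unit that pointer arithmetic counts in (4 bytes, considering int is 4 bytes here). ptr1 and ptr2 are only 3 bytes apart, which is less than one int, hence the result is 0. Strictly speaking, the cast itself is undefined behavior if the address is not suitably aligned for int.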
This is because sizeof(int) == 4
Each char takes 1 byte. Your array of chars looks like this in memory:
[s][o][m][e][s][t][r][i][n][g][0]
When you have an array of ints, each int occupies four bytes. Storing 1 and 2 conceptually looks more like this:
[0][0][0][1][0][0][0][2]
Subtracting two int pointers therefore counts in 4-byte steps. Three bytes is less than one full step, so the result is 0. You'll note that if you use 4 instead of 3, this works as you expected.
The reason you have to perform the subtraction on the cast pointers to see this (just passing the cast pointers to printf doesn't do it) is that printf is not strictly typed, i.e. the %ld format carries no information that the argument came from an int pointer.
The program below tests for little/big endian on an Intel processor. Little-endian is the correct output. In the first test I cast the address of the int to char * and read the value directly, without first assigning it to an int *. I don't understand the second part of the output: there, the int pointer is assigned from a char * cast, so why doesn't it behave like a char *?
00000000 00000000 00000011 01111111 = 895
0 0 3 127
int main() {
int num = 895;
if(*(char *)&num == 127)
{
printf("\nLittle-Endian\n");
}
else
{
printf("Big-Endian\n");
}
int *p = (char *)&num ;
if(*p == 127)
{
printf("\nLittle-Endian\n");
}
else
{
printf("Big-Endian\n");
}
printf("%d\n",*p);
}
Output:
Little-Endian
Big-Endian
895
The first half of your program using this comparison:
if(*(char *)&num == 127)
looks fine.
The second half of your program contains this assignment:
int *p = (char *)&num ;
Which isn't valid code. You can't convert pointer types without an explicit cast. In this case, your compiler might be letting you get away with it, but strictly speaking, it's incorrect. This line should read:
int *p = (int *)(char *)&num;
or simply this equivalent statement:
int *p = &num;
From this example, I'm sure you can see why your second test doesn't work the way you'd like it to - you're still operating on the whole int, not on the single byte you were interested in. If you made p a char *, it would work the way you expected:
char *p = (char *)&num;
Can int pointer be cast to char *?
Yes. It's only the inverse that can invoke undefined behavior: more precisely, using the result of a cast from char * to int * when the address is not suitably aligned for int. Since char is 1-byte aligned, any data pointer type can safely be cast to char *.
I have a problem where I have a pointer to an area in memory. I would like to use this pointer to create an integer array.
Essentially this is what I have, a pointer to a memory address of size 100*300*2 = 60000 bytes
unsigned char *ptr = 0x00000000; // fictional point in memory goes up to 0x0000EA60
What I would like to achieve is to examine this memory as an integer array of size 100*150 = 15000 ints = 60000 bytes, like this:
unsigned int array[ 100 ][ 150 ];
I'm assuming it involves some casting, though I'm not sure exactly how to formulate it. Any help would be appreciated.
You can cast the pointer to unsigned int (*)[150]. It can then be used as if it is a 2D array ("as if", since behavior of sizeof is different).
unsigned int (*array)[150] = (unsigned int (*)[150]) ptr;
Starting with your ptr declaration
unsigned char *ptr = 0x00000000; // fictional point in memory goes up to 0x0000EA60
You can cast ptr to a pointer to whatever type you're treating the block as, in this case array of array of unsigned int. We'll declare a new pointer:
unsigned int (*array_2d)[100][150] = (unsigned int (*)[100][150])ptr;
Then, access elements by dereferencing and then indexing just as you would for a normal 2d array.
(*array_2d)[50][73] = 27;
Some typedefs would help clean things up, too.
typedef unsigned int my_2d_array_t[100][150];
typedef my_2d_array_t *my_2d_array_ptr_t;
my_2d_array_ptr_t array_2d = (my_2d_array_ptr_t)ptr;
(*array_2d)[26][3] = 357;
...
And sizeof should work properly.
sizeof(array_2d); //4, given 32-bit pointer
sizeof(*array_2d); //60000, given 32-bit ints
sizeof((*array_2d)[0]); //600, size of array of 150 ints
sizeof((*array_2d)[0][1]); //4, size of 1 int
Here is the full code of it
#include <stdio.h>
#include <string.h>
void reverse_string(unsigned short *buf, int length)
{
int i;
unsigned short temp;
for (i = 0; i < length / 2; i++)
{
temp = buf[i];
buf[i] = buf[length - i - 1];
buf[length - i - 1] = temp;
}
}
int main(int argc, char **argv)
{
unsigned short* tmp = (unsigned short*)argv[1];
reverse_string(tmp,strlen(argv[1]) / 2);
printf("%s",argv[1]);
return 0;
}
As you can see, in main, we have
unsigned short* tmp = (unsigned short*)argv[1];
Aren't pointers supposed to point to "the address of" a variable? The one above doesn't use the ampersand, yet the program works as intended.
Why is it like that?
And what does this part mean?
(unsigned short*)argv[1]
argv is a pointer to the first element of an array of pointers:
argv[0][0] (a char)
argv[0] (a char*)
argv (a char**)
unsigned char* tmp = (unsigned char*)argv[1];
...works, because you're referencing the second "string" in that set.
Note that in this case "char" and "unsigned short" might be roughly equivalent depending on the compiler and platform, but it is probably not a good idea to assume that. For example, if you compiled with a "unicode" command line enabled, you might get "short" instead of "char" passed to you from the command line. But that may be a dangerous assumption, as these days a "short" is usually 16 bits and a "char" is usually 8 bits.
Addressing the original questions:
argv is an array of pointers, each of which point to a character array. argv[1] is a pointer to the character array with the first argument (i.e. if you run ./program arg1 arg2, the pointer argv[1] points to the string arg1).
The ampersand is the address-of operator: &x yields a pointer to the variable x. You need it when you have a variable and want a pointer to it; argv[1] already is a pointer, so no ampersand is required. The common example is using scanf.
int x = 1;
scanf(..., &x, ...)
is equivalent to
int x = 1;
int *p = &x;
scanf(..., p, ...)
The program itself is designed to flip endianness. It's not sufficient to go character by character, because you have to move two bytes at a time (i.e. short by short), which is why it works on shorts.
(unsigned short*)argv[1] instructs the compiler to treat the address as if it pointed to an array of shorts. To give an example:
unsigned char *c = (unsigned char *)argv[1];
c[1]; /* the byte located one byte after the address in argv[1] */
unsigned short *s = (unsigned short *)argv[1];
s[1]; /* the short located two bytes after the address in argv[1], given 2-byte shorts */
Take a look at a primer on type casting.