In the main I call this function:
expects_unsigned_int("some text");
Defined like this:
expects_unsigned_int(unsigned int val)
I would like to print the string passed inside the function. Is it possible to do it in the way expects_unsigned_int() is defined?
This is what I tried:
expects_unsigned_int(unsigned int val) {
unsigned int* string = 0;
string = (unsigned int*) val;
printf("%s", (char*)string);
}
But it doesn't print anything.
The string when given as argument decays to the address of its first element, which is then converted to an unsigned int. If that integer is large enough to hold the address without losing bits, you could convert it back:
char* pointer1 = "abcde";
unsigned int integer = (unsigned int)pointer1; /* may drop bits if pointers are wider than unsigned int */
char* pointer2 = (char*)integer;
if (pointer1 == pointer2) {
printf("Works, kindof.\n");
}
However, as others pointed out in the comments, the very approach is bad and you shouldn't use this to solve whatever problem you have. Instead, first read about the meaning of an "XY problem" and then ask another question that addresses the actual problem here.
The whole point of a type in a language like C is that it describes some well-defined, useful set of values.
A value of type unsigned int can hold any integer in a range defined by your compiler and processor. This is typically a 32-bit integer, meaning that an unsigned int can hold any integer from 0 to 4294967295. But an unsigned int can not hold the value 5000000000 (it's too big), or the value 123.456 (it's not an integer) or the value "hello, world" (strings aren't integers).
A value of type char * can hold a pointer to a character anywhere in the usable address space on your computer. So it can hold a pointer to a single character, or it can hold a pointer to a null-terminated array of characters like "hello, world", or it can hold a NULL pointer. But it is not intended to hold an integer, or a floating-point value.
Sometimes, under constrained or unusual circumstances, programmers try to bend the rules, by wedging a value of one type into a variable of a different type. Sometimes you can make this work, sometimes you can't. It's almost always a significantly bad idea. Even if it can be made to work, it's often the case that it works properly on one machine, but not others.
Let's look more carefully at what you're doing. (I'm filling in a few details you left out.)
void expects_unsigned_int(unsigned int);
Here we tell the compiler that there's going to be a function named expects_unsigned_int that accepts one argument of type unsigned int and returns nothing.
#include <stdio.h>
int main()
{
expects_unsigned_int("some text");
}
Here we call that function, passing an argument of type char *. We're in trouble already, of course. You can't wedge a char * into an unsigned int sized slot. A proper compiler will give you a serious warning, if not an outright error, here. Mine says
warning: passing argument 1 of ‘expects_unsigned_int’ makes integer from pointer without a cast
expected ‘unsigned int’ but argument is of type ‘char *’
These warnings make sense, and are consistent with my explanations so far of what we should and shouldn't do with types.
As you may know, a pointer is "just" an address, and on most machines an address is "just" a bit pattern of some size, so you can convince yourself that it ought to be possible to jam a pointer into an integer. The key question, which we'll return to in a minute, is whether type unsigned int is literally big enough to hold all possible values of type char *.
void expects_unsigned_int(unsigned int val) {
Here we begin defining the details of function expects_unsigned_int. Again we say that it accepts one argument of type unsigned int and returns nothing. That's consistent with the earlier prototype declaration. All right so far.
unsigned int* string = 0;
Here we declare a pointer of type unsigned int * and initialize it to the null pointer. We don't really need this intermediate pointer, and in this case it doesn't matter whether we initialize it, since we're about to overwrite it.
string = (unsigned int*) val;
Here's where the trouble begins. We have an unsigned int value, and we attempt to convert it into a pointer. Again, this might seem reasonable, since pointers are "just" addresses and addresses are "just" bit patterns.
The other thing we have is an explicit cast. In this case, surprisingly, the cast is not really "doing" the conversion from unsigned int to unsigned int *. If we wrote the assignment without the cast, like this:
string = val;
the compiler would see an unsigned int value on the right-hand side, and a pointer of type unsigned int * on the left-hand side, and it would attempt to perform the same conversion implicitly. But since it's a dangerous and potentially meaningless conversion, the compiler would warn about it. Mine says
warning: assignment makes pointer from integer without a cast
But when you write an explicit cast, for most compilers what this means is, "trust me, I know what I'm doing, do this conversion and keep your doubts to yourself, I don't want to hear any of your warnings."
Finally,
printf("%s", (char*)string);
Here we do two things. First we explicitly convert the unsigned int * pointer into a char * pointer. That's also a questionable conversion, but of a much lesser concern. On the vast majority of computers today, all pointers (no matter what they point to) have the same size and representation, so a conversion like this is most unlikely to cause any problems.
And then the second thing we do is, finally, try to print the char * pointer using printf and %s. As you've discovered, it doesn't always work. It doesn't work for me on my computer, either.
There are computers where it would work, so the answer to your question "Is it possible to do it?" is "Yes, maybe, but."
Why didn't it work for you? I can't be sure, but it's probably for the same reason it didn't work for me. On my machine, pointers are 64 bits, but regular ints (including unsigned int) are 32 bits. So when we called
expects_unsigned_int("some text");
and attempted to wedge a pointer into an int-sized slot, we scraped off 32 of its 64 bits. That's an information-losing transformation, so it's very likely to be an unrecoverable error.
Let's print some additional information, so we can confirm that this is what's going on. I encourage you to make these modifications to your program on your computer, so you can see what results you get.
Let's rewrite main like this:
int main()
{
char *string = "some text";
printf("string = %p = %s\n", string, string);
printf("int: %d, pointer: %d\n", (int)sizeof(unsigned int), (int)sizeof(string));
expects_unsigned_int(string);
}
We're using the printf format %p to print the pointer. This will show us a representation of the bit pattern that makes up the pointer value (however big it is), typically in hexadecimal. We're also using sizeof() to tell us how big ints and pointers are on the machine we're using.
Let's rewrite expects_unsigned_int like this:
void expects_unsigned_int(unsigned int val) {
char *string = val;
printf("val = %x\n", val);
printf("string = %p\n", string);
printf("string = %s\n", string);
}
Here we're printing both the value of val as it comes in, and the pointer we recover from it (again, using %p). Also, I'm making string of type char *, since there was no point in having it unsigned int *.
When I run the modified program, here's what I get:
string = 0x101295f20 = some text
int: 4, pointer: 8
val = 1295f20
string = 0x1295f20
Segmentation fault: 11
Immediately we see several things:
Pointers are bigger than ints on this machine (as I was saying earlier). There's no way we're going to be able to stuff a pointer into an int without potentially losing data.
We're indeed scraping off some of the bits of the string pointer. It starts out being 101295f20 and ends up as 1295f20.
The program doesn't work. It crashes with a segmentation violation, likely because the mangled pointer value 0x1295f20 points outside its address space.
So how do we fix this? The best way would be to not try to pass a pointer value through a slot that's designed to hold integers.
Or, if we really wanted to, if we were bound and determined to convert pointers to integers and back again, we could try using a bigger integer, such as an unsigned long int. (And if that wasn't big enough, we could also try unsigned long long int.)
I rewrote main like this:
void expects_unsigned_long_int(unsigned long int val);
int main()
{
char *string = "some text";
printf("string = %p = %s\n", string, string);
printf("int: %d, pointer: %d\n", (int)sizeof(unsigned long int), (int)sizeof(string));
expects_unsigned_long_int(string);
}
And then expects_unsigned_long_int looks like this:
void expects_unsigned_long_int(unsigned long int val) {
char *string = val;
printf("val = %x\n", val);
printf("string = %p\n", string);
printf("string = %s\n", string);
}
I still get warnings when I compile it, but now when I run it it prints
string = 0x10a09df20 = some text
int: 8, pointer: 8
val = a09df20
string = 0x10a09df20
string = some text
So it looks like type unsigned long int is big enough (for now), and no bits get scraped off, and the original pointer value is successfully recovered inside expects_unsigned_long_int, and the string prints correctly.
But, in closing, please, figure out a better way of doing this!
Addendum: I said, with some uncertainty, that we could try unsigned long int, or maybe unsigned long long int. If you want more certainty, it turns out that there is a special integer type that's supposed to be the right size for jamming a pointer into it: uintptr_t, as defined in <stdint.h>. So, if you're "bound and determined to convert pointers to integers", that's the right type to use, although you have to beware that it's an optional type, since there can theoretically be an exotic machine out there somewhere that does support C but whose pointers are bigger than any of its integer types.
Related
I'm interested in whether an unsigned long long variable can be treated as an address. I tried it, and the result seemed to be correct:
int var = 0;
// the variable var's address is 0x7fffffffe4a4 now in my system
unsigned long long ull = 0x7fffffffe4a4;
scanf("%d", ull); // I try to use ull as var's address and assign a value to var
printf("%d\n", var); // It seemingly works!
And I also conducted an experiment as follows.
int var = 0;
// the variable var's address is 0x7fffffffe4a4 now in my system
unsigned long long ull = 0x7fffffffe4a4;
int *ptr = &var;
// First, I want to know whether the memory used by unsigned long long and the one used by a pointer differ
printf("%zu\n", sizeof(ull));
printf("%zu\n", sizeof(ptr));
// The results are both "8" in my system
// Next I try to assign two different values to var by two ways
scanf("%d", ull); // I try to use ull as an address to assign var
printf("%d\n", var);
scanf("%d", ptr); // And I also use a normal pointer to assign var
printf("%d\n", var);
// The results were in line with expectations!
Well, it seems an unsigned long long variable can be used as a pointer successfully (although the compiler warned me that the argument was expected to be int* rather than unsigned long long), and I wonder …
What's the difference between a variable and a pointer at the hardware level? How are these two types of objects processed when they are stored and used? Who processes these objects and is it recommended to perform the operations above?
(In the end, is it a feature of C?)
On many (most) platforms, pointers and (unsigned) integers are stored in very similar formats at the hardware level (on your system, both an int* pointer and an unsigned long long are 8 bytes). However, from the point of view of the C language and compiler, they are very different types of variable.
One notable difference in their behaviour concerns arithmetic. For integral types, arithmetic operations like x = x + 1 do exactly what you would naturally expect. However, for pointers, such operations are performed in base units of the size of the pointed-to type.
The following code demonstrates this (on a platform with 8-byte pointers and long long and a 4-byte int):
#include <stdio.h>
int main()
{
int myInt = 42;
int* ptr = &myInt;
unsigned long long ull = (unsigned long long)ptr;
printf("%p %016llX\n", (void*)ptr, ull);
++ptr;
++ull;
printf("%p %016llX\n", (void*)ptr, ull);
return 0;
}
The output is:
0000005B8A4FFC10 0000005B8A4FFC10
0000005B8A4FFC14 0000005B8A4FFC11
For the first line (as you have already noted), the two values are identical, and their binary representations will also be the same (on this platform). However, notice that the ++ increment behaves differently on the two types, so that the second line of output shows that the pointer has been incremented by 4 (the size of an int) but the unsigned integer has been incremented by 1.
Here is a good analogy: using numbers instead of pointers is not wrong, it's dangerous. Think about why you don't ride a bicycle on a highway. Is it because you can't ride it? Of course you can: you have wheels, and the asphalt is good. You don't do it because it's dangerous; the bicycle is not designed for it.
For a more formal explanation:
Congratulations, you discovered that addresses are, ultimately, just numbers. The reason why in languages such as C or C++ (they are different!!) pointers are introduced is to avoid confusion and errors, letting the users and the compiler know specifically when they are dealing with an address or a number.
As I said at the beginning, at the end of the day, an address is a number that needs to tell the hardware where to look in the memory. However a pointer is a number that is treated with special care by the compiler:
You have the guarantee that a pointer has enough bits to store the address. Your example works "fine" on 64-bit systems, but on a 16-bit system a pointer will have the same range as an unsigned short value, and if you try to assign an unsigned long to it you will run into all sorts of problems.
Arithmetic on pointers follows the size of the pointed-to data type. When you are pointing at a short at address 0x100 and you want to go to the next short, you know that you have to look at 0x102.
short* ps = (short*)0x100;
ps += 1; // ps is now 0x102, because the compiler knows that short is 2 bytes
*ps = 0xAABB;
short s = 0x100;
// I need to advance to the next short.. mmm I have to do this:
s += sizeof(short);
*((short*)s) = 0xAABB; // Also, ugly syntax.
// See how easy it is to make errors when you use plain numbers?
Bottom line: pointers are special numbers with special properties, specifically designed to handle memory accesses and address arithmetic.
Pointers and unsigned long long have the same size on certain platforms, and on those platforms one can be cast to the other without discarding any bits. There is also the size_t type, which is the result type of sizeof and is used to store the size in bytes of a variable. size_t can be as small as 2 bytes (sizeof(size_t) == 2) on some platforms, but long long (whose sizeof is the same as that of unsigned long long) is always at least 64 bits wide. A pointer stores, in effect, the number of bytes in memory that precede the first byte it refers to, so it makes sense for pointers to any type to have the same sizeof. So, I'd recommend using size_t if you want to store an address in an integral type.
There are two fundamental issues here:
Interpretation matters. In almost any programming language, the values we work with are not just raw bit patterns, they are bit patterns interpreted as a particular type. For example, consider the binary number 0b111111101000000000000000000000, or in hexadecimal, 0x3fa00000. Interpreted as an int, that's the number 1067450368. Interpreted as a float, it's 1.25. Same bit pattern, two completely different, totally unrelated numbers. You could take the same bit pattern and try to interpret it as a pointer, but that would be a third, completely unrelated interpretation, and it wouldn't necessarily mean anything, either.
Just because you have a pointer value, pointing at (or containing) a certain memory address, doesn't necessarily mean you can actually access that memory. It might be in use by some other variable in your program. It might be in use by the actual code of your program. It might be in use by some other program. Or it might not exist at all — it might refer to a memory address that's higher than the total amount of memory in your computer.
So taking an integer value, converting it to a pointer, and then accessing the resulting memory is sort of like chipping golf balls off the top of a tall building: you have no idea where the balls are going to land, and they might hurt someone, and it's therefore a pretty irresponsible practice.
When I run the following code it gives a segmentation fault:
#include <stdio.h>
int main() {
int i;
char char_array[5] = {'a', 'b', 'c', 'd', 'e'};
int int_array[5] = {1, 2, 3, 4, 5};
unsigned int hacky_nonpointer;
hacky_nonpointer = (unsigned int) char_array;
for(i=0; i < 5; i++) { // Iterate through the char array with hacky_nonpointer.
printf("[hacky_nonpointer] points to %p, which contains the char '%c'\n",
hacky_nonpointer, *((char *) hacky_nonpointer));
hacky_nonpointer = hacky_nonpointer + sizeof(char);
}
hacky_nonpointer = (unsigned int) int_array;
for(i=0; i < 5; i++) { // Iterate through the int array with the int_pointer.
printf("[hacky_nonpointer] points to %p, which contains the integer %d\n",
hacky_nonpointer, *((int *) hacky_nonpointer));
hacky_nonpointer = hacky_nonpointer + sizeof(int);
}
}
I was actually trying to do a typecast example. How can I resolve the segmentation fault?
My guess is that you're on a 64-bit machine, where pointers are 64 bits. That will lead to big problems (and undefined behavior) when you do
hacky_nonpointer = (unsigned int) char_array;
as the type int is typically still only 32 bits.
Once you've experimented with this, throw it all away, and forget all about it as well! This is bad code doing bad things that no real program should ever do.
To expand on Some_programmer_dude’s answer a bit, the safe way to store a pointer in an integral type is
#include <stdint.h>
/* ... */
uintptr_t hacky_nonpointer = (uintptr_t)(void*)p;
To convert back,
const char c = *(char*)(void*)hacky_nonpointer;
On most real-world compilers, a direct cast from any pointer type to uintptr_t will work just fine. However, the standard technically only says that any pointer can be converted to void* and back, and that any void* can be converted to uintptr_t and back.
A round-trip conversion will get you an equivalent pointer back. (See the footnote if you care about the language-lawyering details.) That is, you can convert p to a uintptr_t value and back, and you are guaranteed to get another pointer to the same object. You cannot safely increment the uintptr_t value and convert that back, but you could increment the pointer and convert the incremented pointer to uintptr_t and back. That is how you would safely do what you appear to want.
Converting to an integral type and adding 1 (or equivalently sizeof(char), which is guaranteed to be 1) is not guaranteed to give you anything meaningful. It’s possible to imagine esoteric implementations that will crash if you try to convert that value back to a pointer! However, on mainstream compilers, it will work.
If your compiler didn’t give you a warning about this code, you need to turn on more warnings. If it did, you shouldn’t ignore compiler warnings.
As the Dude said, though, you should never write code like that in the real world. No program should ever do anything like that or will ever need to.
Footnote
There is one extremely pedantic loophole to this: the Standard guarantees that a pointer converted to uintptr_t and back will compare equal to the original pointer, and it forbids two pointers to compare equal unless they can be used the same way. With one exception.
A pointer to the start of an array object might compare equal to a pointer one-past-the-end of a different array object. By my reading of the standard, an implementation that allowed a pointer resulting from a round-trip conversion of either kind of pointer (the beginning of an array object, or one past its end) to be used in only one of those ways could claim to be technically in compliance.
However, any real-world implementation would allow such a pointer to be used in both contexts. That the standard does not spell this out appears to be an oversight.
So I saw a few examples of how the endianness of an architecture could be found. Let's say we have an integer pointer that points to an int data type. And let's say the int value is 0x010A0B12. In a little endian architecture, the least significant byte, i.e., 12, will be stored in the lowest memory address, right? So the lowest byte in a 4-byte integer will be 12.
Now, on to the check. If we declare a char pointer p, and type cast the integer pointer to a char * and store it in p, and print the dereferenced value of p, we will get a clue on the endianness of the architecture. If it's 12, we're little endian; 01 signifies big endian. This sounds really neat...
int a = 0x010A0B12;
int *i = &a;
char *p = (char*)i;
printf("%d",*p); // prints the decimal equivalent of 12h!
Couple of questions here, really. Since pointers are strongly typed, shouldn't a character pointer strictly point to a char data type? And what's up with printing with %d? Shouldn't we rather print with %c, for character?
Since pointers are strongly typed, shouldn't a character pointer strictly point to a char data type?
C has a rule that any pointer can be safely converted to char* and to void*. Converting an int* to char*, therefore, is allowed, and it is also portable. The pointer would be pointing to the initial byte of your int's internal representation.
Shouldn't we rather print with %c, for character?
Another thing is in play here: variable-length argument list of printf. When you pass a char to an untyped parameter of printf, the default conversion applies: char gets converted to int. That is why %d format takes the number just fine, and prints it out as you expect.
You could use %c too. The code that processes %c specifier reads the argument as an int, and then converts it to a char. 0x12 is a special character, though, so you would not see a uniform printout for it.
Since pointers are strongly typed, shouldn't a character pointer strictly point to a char data type?
Reading an object's bytes through a char pointer is allowed, but which byte you see depends on the implementation. Most sane implementations will do what you mean, so most people would say ok to it.
And what's up with printing with %d?
Format %d expects argument of type int, and the actual arg of type char is promoted to int by usual C rules. So this is ok again. You probably don't want to use %c since the content of byte pointed by p may be any byte, not always a valid text character.
In the book Learn C The Hard Way, exercise 15 suggests breaking the program by pointing an integer pointer at an array of strings and using a C cast to force it. How can I do it?
Here is a small example. The result depends on the endianness of your system and the size of int. I would expect the first or fourth character to change to the next character in the alphabet.
#include<stdio.h>
int main(void) {
char string[100] = "Somestring";
int *p;
/* Let p point to the string */
p = (int*)string;
/* modify a value */
(*p)++;
/* Let's see if any character got changed */
printf("%s", string);
return 0;
}
It should be pointed out that not all casts are safe and that the result could be implementation defined or undefined. This example is actually undefined, since int could have stricter alignment constraints than char.
When writing portable code you need to take great care when using casts.
The code above could break on any system where sizeof(int) is greater than the string length regardless of alignment issues. In this case, where the string has size 100, we wouldn't expect that to happen in a long while. Had the string been 4-7 bytes it could happen sooner. The jump from 32- to 64-bit pointers broke a lot of old code that assumed that pointers and int were the same size.
Edit:
Is there an easy fix to the alignment problem? What if we could somehow make sure that the string starts in an address that is also suitable for an int. Fortunately, that is easy. The memory allocation function malloc is guaranteed to return memory aligned at an address that is suitable for any type.
So, instead of
char string[100] = "Somestring";
we can use
char *string = malloc(100);
strcpy(string, "Somestring");
The subsequent cast is now safe alignment-wise and is portable to systems where sizeof(int) is smaller than 100.
Note that malloc is declared in stdlib.h and strcpy in string.h, so we should add the following at the top of our code file:
#include<stdlib.h>
#include<string.h>
That's simply an abusive way of casting.
// setup the pointers to the start of the arrays
int *cur_age = ages;
char **cur_name = names;
What the author of that link meant by "to break program by pointing integer pointer at array of strings and using C cast to force it" is that you can write something like int *cur_age = (int *)names; that is, cast a pointer to pointer to char into a pointer to int. C allows you to cast from one type of pointer to another, but be warned: you need to know what you are doing.
Here the author wanted to show how to break a program by pointing a pointer to a wrong type. His example, however, is probably making you more confused rather than helping you to understand pointers.
To cast, use the cast operator: (type)expression. For example, to cast an expression of type double to int:
(int)sqrt(2);
In your specific case, cast names to int* (the type of cur_age) to break the program:
cur_age = (int*)names;
To point a pointer at an incompatible type in C, you only need to cast it through void*.
//array of string declaration
char aStr[50][50];
int *pint;
//do whatever you need with string array
pint = (int*)(void*)aStr;
I'm writing this from my cell phone.
If you increment your pointer past the allocated memory, you might end up in your program's stack and change values in it.
I'm doing a random exercise where, given an integer array and double array, you are supposed to calculate the size of an integer and a double.
For the integer size, I simply use two pointers to point to two adjacent array elements, then I find their difference. Because pointer arithmetic calculates this as a difference of 1, I cast the pointers to integers, because I assumed the pointers would explicitly refer to 4 bytes of memory, and I got the right result (4).
Now, I try the same exact thing for doubles, but I get this error: pointer value used where a floating point value was expected
A possible guess is that doubles are 8 bytes, whereas pointers are 4 bytes. But I'm not sure if that actually matters.
Any insights?
my exact line:
int doubSize = (double)doubPtr2 - (double)doubPtr1;
//where doubPtr1 and doubPtr2 point to two adjacent indexes of double array
The standard allows casting pointers to integer types, though unless the type is big enough, casting back won't recover the original pointer. Still, the actual mapping is implementation-defined though there is a suggestion it should mimic the underlying memory architecture, which sharply limits the utility of doing so. No such licence is given for floating-point types.
Take a look at intptr_t for a guaranteed-big-enough type.
Anyway, the better (and correct) way is getting pointers to adjacent array elements, casting them to char* (char is guaranteed to be one byte big, though a byte is not guaranteed to be an octet), and then subtracting them. (The subtraction yields a ptrdiff_t; convert it to size_t if you want the same type the sizeof operator gives.)
That presupposes that for your exercise, sizeof was arbitrarily banned, which would otherwise be the solution of choice here.
Casting double pointer to a double isn't doing what you think it's doing, or at least I don't think it is... see for example:
#include <stdio.h>
int main(int argc, char *argv[])
{
double *a;
double *b;
double dummy[3];
a = &dummy[0];
b = &dummy[1];
int *ai;
int *bi;
int dummyi[3];
ai = &dummyi[0];
bi = &dummyi[1];
char *ac;
char *bc;
ac = (char *)a;
bc = (char *)b;
printf("Check for double: %zu vs %td\n",sizeof(double),bc-ac);
ac = (char *)ai;
bc = (char *)bi;
printf("Check for int: %zu vs %td\n",sizeof(int),bc-ac);
}
(Print outs:
Check for double: 8 vs 8
Check for int: 4 vs 4
)
You're overthinking it. A pointer is an address, which is just a number. When you have an array of whatever type, each one is at the next available address. So, to find out the size of the type, you just need the addresses of adjacent entries in the array.
double tmp[2];
int size = (char *)(&(tmp[1])) - (char *)(&(tmp[0]));
The addresses have to be cast to char * so that the subtraction counts bytes rather than elements of the array's type.