When I run the following code it gives a segmentation fault:
#include <stdio.h>
int main() {
int i;
char char_array[5] = {'a', 'b', 'c', 'd', 'e'};
int int_array[5] = {1, 2, 3, 4, 5};
unsigned int hacky_nonpointer;
hacky_nonpointer = (unsigned int) char_array;
for(i=0; i < 5; i++) { // Iterate through the char array with hacky_nonpointer.
printf("[hacky_nonpointer] points to %p, which contains the char '%c'\n",
hacky_nonpointer, *((char *) hacky_nonpointer));
hacky_nonpointer = hacky_nonpointer + sizeof(char);
}
hacky_nonpointer = (unsigned int) int_array;
for(i=0; i < 5; i++) { // Iterate through the int array with hacky_nonpointer.
printf("[hacky_nonpointer] points to %p, which contains the integer %d\n",
hacky_nonpointer, *((int *) hacky_nonpointer));
hacky_nonpointer = hacky_nonpointer + sizeof(int);
}
}
I was actually trying to do a typecast example. How can I resolve the segmentation fault?
My guess is that you're on a 64-bit machine, where pointers are 64 bits. That will lead to big problems (and undefined behavior) when you do
hacky_nonpointer = (unsigned int) char_array;
as the type int is typically still only 32 bits.
Once you've experimented with this, throw it all away, and forget all about it as well! This is bad code doing bad things that no real program should ever do.
To expand on Some_programmer_dude’s answer a bit, the safe way to store a pointer in an integral type is
#include <stdint.h>
/* ... */
uintptr_t hacky_nonpointer = (uintptr_t)(void*)p;
To convert back,
const char c = *(char*)(void*)hacky_nonpointer;
On most real-world compilers, a direct cast from any pointer type to uintptr_t will work just fine. However, the standard technically only says that any pointer can be converted to void* and back, and that any void* can be converted to uintptr_t and back.
A round-trip conversion will get you an equivalent pointer back. (See the footnote if you care about the language-lawyering details.) That is, you can convert p to a uintptr_t value and back, and you are guaranteed to get another pointer to the same object. You cannot safely increment the uintptr_t value and convert that back, but you could increment the pointer and convert the incremented pointer to uintptr_t and back. That is how you would safely do what you appear to want.
Converting to an integral type and adding 1 (or equivalently sizeof(char), which is guaranteed to be 1) is not guaranteed to give you anything meaningful. It’s possible to imagine esoteric implementations that will crash if you try to convert that value back to a pointer! However, on mainstream compilers, it will work.
If your compiler didn’t give you a warning about this code, you need to turn on more warnings. If it did, you shouldn’t ignore compiler warnings.
As the Dude said, though, you should never write code like that in the real world. No program should ever do anything like that or will ever need to.
Footnote
There is one extremely pedantic loophole to this: the Standard guarantees that a pointer converted to uintptr_t and back will compare equal to the original pointer, and it forbids two pointers to compare equal unless they can be used the same way. With one exception.
A pointer to the start of an array object might compare equal to a pointer one-past-the-end of a different array object. By my reading of the standard, an implementation that allowed a pointer resulting from a round-trip conversion of either kind of pointer (the beginning of an array object, or one past its end) to be used in only one of those ways could claim to be technically in compliance.
However, any real-world implementation would allow such a pointer to be used in both contexts. That the standard does not spell this out appears to be an oversight.
Related
In the main I call this function:
expects_unsigned_int("some text");
Defined like this:
expects_unsigned_int(unsigned int val)
I would like to print the string passed inside the function. Is it possible to do it in the way expects_unsigned_int() is defined?
This is what I tried:
expects_unsigned_int(unsigned int val) {
unsigned int* string = 0;
string = (unsigned int*) val;
printf("%s", (char*)string);
}
But it doesn't print anything.
The string when given as argument decays to the address of its first element, which is then converted to an unsigned int. If that integer is large enough to hold the address without losing bits, you could convert it back:
char* pointer1 = "abcde";
unsigned int integer = pointer1;
char* pointer2 = integer;
if (pointer1 == pointer2) {
printf("Works, kindof.\n");
}
However, as others pointed out in the comments, the very approach is bad and you shouldn't use this to solve whatever problem you have. Instead, first read about the meaning of an "XY problem" and then ask another question that addresses the actual problem here.
The whole point of a type in a language like C is that it describes some well-defined, useful set of values.
A value of type unsigned int can hold any integer in a range defined by your compiler and processor. This is typically a 32-bit integer, meaning that an unsigned int can hold any integer from 0 to 4294967295. But an unsigned int can not hold the value 5000000000 (it's too big), or the value 123.456 (it's not an integer) or the value "hello, world" (strings aren't integers).
A value of type char * can hold a pointer to a character anywhere in the usable address space on your computer. So it can hold a pointer to a single character, or it can hold a pointer to a null-terminated array of characters like "hello, world", or it can hold a NULL pointer. But it is not intended to hold an integer, or a floating-point value.
Sometimes, under constrained or unusual circumstances, programmers try to bend the rules, by wedging a value of one type into a variable of a different type. Sometimes you can make this work, sometimes you can't. It's almost always a significantly bad idea. Even if it can be made to work, it's often the case that it works properly on one machine, but not others.
Let's look more carefully at what you're doing. (I'm filling in a few details you left out.)
void expects_unsigned_int(unsigned int);
Here we tell the compiler that there's going to be a function named expects_unsigned_int that accepts one argument of type unsigned int and returns nothing.
#include <stdio.h>
int main()
{
expects_unsigned_int("some text");
}
Here we call that function, passing an argument of type char *. We're in trouble already, of course. You can't wedge a char * into an unsigned int sized slot. A proper compiler will give you a serious warning, if not an outright error, here. Mine says
warning: passing argument 1 of ‘expects_unsigned_int’ makes integer from pointer without a cast
expected ‘unsigned int’ but argument is of type ‘char *’
These warnings make sense, and are consistent with my explanations so far of what we should and shouldn't do with types.
As you may know, a pointer is "just" an address, and on most machines an address is "just" a bit pattern of some size, so you can convince yourself that it ought to be possible to jam a pointer into an integer. The key question, which we'll return to in a minute, is whether type unsigned int is literally big enough to hold all possible values of type char *.
void expects_unsigned_int(unsigned int val) {
Here we begin defining the details of function expects_unsigned_int. Again we say that it accepts one argument of type unsigned int and returns nothing. That's consistent with the earlier prototype declaration. All right so far.
unsigned int* string = 0;
Here we declare a pointer of type unsigned int * and initialize it to the null pointer. We don't really need this intermediate pointer, and in this case it doesn't matter whether we initialize it, since we're about to overwrite it.
string = (unsigned int*) val;
Here's where the trouble begins. We have an unsigned int value, and we attempt to convert it into a pointer. Again, this might seem reasonable, since pointers are "just" addresses and addresses are "just" bit patterns.
The other thing we have is an explicit cast. In this case, surprisingly, the cast is not really "doing" the conversion from unsigned int to unsigned int *. If we wrote the assignment without the cast, like this:
string = val;
the compiler would see an unsigned int value on the right-hand side, and a pointer of type unsigned int * on the left-hand side, and it would attempt to perform the same conversion implicitly. But since it's a dangerous and potentially meaningless conversion, the compiler would warn about it. Mine says
warning: assignment makes pointer from integer without a cast
But when you write an explicit cast, for most compilers what this means is, "trust me, I know what I'm doing, do this conversion and keep your doubts to yourself, I don't want to hear any of your warnings."
Finally,
printf("%s", (char*)string);
Here we do two things. First we explicitly convert the unsigned int * pointer into a char * pointer. That's also a questionable conversion, but of a much lesser concern. On the vast majority of computers today, all pointers (no matter what they point to) have the same size and representation, so a conversion like this is most unlikely to cause any problems.
And then the second thing we do is, finally, try to print the char * pointer using printf and %s. As you've discovered, it doesn't always work. It doesn't work for me on my computer, either.
There are computers where it would work, so the answer to your question "Is it possible to do it?" is "Yes, maybe, but."
Why didn't it work for you? I can't be sure, but it's probably for the same reason it didn't work for me. On my machine, pointers are 64 bits, but regular ints (including unsigned int) are 32 bits. So when we called
expects_unsigned_int("some text");
and attempted to wedge a pointer into an int-sized slot, we scraped off 32 of its 64 bits. That's an information-losing transformation, so it's very likely to be an unrecoverable error.
Let's print some additional information, so we can confirm that this is what's going on. I encourage you to make these modifications to your program on your computer, so you can see what results you get.
Let's rewrite main like this:
int main()
{
char *string = "some text";
printf("string = %p = %s\n", string, string);
printf("int: %d, pointer: %d\n", (int)sizeof(unsigned int), (int)sizeof(string));
expects_unsigned_int(string);
}
We're using the printf format %p to print the pointer. This will show us a representation of the bit pattern that makes up the pointer value (however big it is), typically in hexadecimal. We're also using sizeof() to tell us how big ints and pointers are on the machine we're using.
Let's rewrite expects_unsigned_int like this:
void expects_unsigned_int(unsigned int val) {
char *string = val;
printf("val = %x\n", val);
printf("string = %p\n", string);
printf("string = %s\n", string);
}
Here we're printing both the value of val as it comes in, and the pointer we recover from it (again, using %p). Also, I'm making string of type char *, since there was no point in having it unsigned int *.
When I run the modified program, here's what I get:
string = 0x101295f20 = some text
int: 4, pointer: 8
val = 1295f20
string = 0x1295f20
Segmentation fault: 11
Immediately we see several things:
Pointers are bigger than ints on this machine (as I was saying earlier). There's no way we're going to be able to stuff a pointer into an int without potentially losing data.
We're indeed scraping off some of the bits of the string pointer. It starts out being 101295f20 and ends up as 1295f20.
The program doesn't work. It crashes with a segmentation violation, likely because the mangled pointer value 0x1295f20 points outside its address space.
So how do we fix this? The best way would be to not try to pass a pointer value through a slot that's designed to hold integers.
Or, if we really wanted to, if we were bound and determined to convert pointers to integers and back again, we could try using a bigger integer, such as an unsigned long int. (And if that wasn't big enough, we could also try unsigned long long int.)
I rewrote main like this:
void expects_unsigned_long_int(unsigned long int val);
int main()
{
char *string = "some text";
printf("string = %p = %s\n", string, string);
printf("int: %d, pointer: %d\n", (int)sizeof(unsigned long int), (int)sizeof(string));
expects_unsigned_long_int(string);
}
And then expects_unsigned_long_int looks like this:
void expects_unsigned_long_int(unsigned long int val) {
char *string = val;
printf("val = %x\n", val);
printf("string = %p\n", string);
printf("string = %s\n", string);
}
I still get warnings when I compile it, but now when I run it it prints
string = 0x10a09df20 = some text
int: 8, pointer: 8
val = a09df20
string = 0x10a09df20
string = some text
So it looks like type unsigned long int is big enough (for now), and no bits get scraped off, and the original pointer value is successfully recovered inside expects_unsigned_long_int, and the string prints correctly.
But, in closing, please, figure out a better way of doing this!
Addendum: I said, with some uncertainty, that we could try unsigned long int, or maybe unsigned long long int. If you want more certainty, it turns out that there is a special integer type that's supposed to be the right size for jamming a pointer into it: uintptr_t, as defined in <stdint.h>. So, if you're "bound and determined to convert pointers to integers", that's the right type to use, although you have to beware that it's an optional type, since there can theoretically be an exotic machine out there somewhere that does support C but whose pointers are bigger than any of its integer types.
I've gotten moderately stuck, and googling the right words hasn't gotten me to the right answer. Even worse, I've already done this before, but my own code example is lost somewhere in the source code.
#include <stdio.h>
int main()
{
short x = 0xABCD;
char y[2] = { 0xAB, 0xCD };
printf("%x %x\n", y[0], y[1]);
printf("%x %x\n", (char *)&x[0], (char *)&x[1]);
}
Basically I need to access individual variable bytes via array by pointer arithmetic, without any calculations, just by type casting.
Put parentheses around your cast:
printf("%x %x\n", ((char *)&x)[0], ((char *)&x)[1]);
Note that endian-ness may change your expected result.
In the future, compile with -Wall to see what the warnings or errors are.
It's somewhat supported in C99, by a process known as type punning via a union.
union {
short s;
char c[2];
} pun;
pun.s = 0xABCD;
pun.c[0] // reinterprets the representation of pun.s as char[2].
// And accesses the first byte.
Pointer casting (as long as it's to char*, to avoid strict aliasing violations) is also ok.
short x = 0xABCD;
char *c = (char*)&x;
If you're only bothered about getting the values, you can store the address of the source variable in a char * and increment and dereference the char pointer to print the values of each byte.
Quoting C11, chapter §6.3.2.3
[....] When a pointer to an object is converted to a pointer to a character type,
the result points to the lowest addressed byte of the object. Successive increments of the
result, up to the size of the object, yield pointers to the remaining bytes of the object.
Something like (consider pseudo-code, not tested)
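#include <stdio.h>
int main(void)
{
    int src = 0x12345678;
    unsigned char *t = (unsigned char *)&src; // view the int as raw bytes
    for (size_t i = 0; i < sizeof(src); i++)
        printf("%x\t", t[i]); // bytes appear in the machine's byte order
    return 0;
}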
should do it.
That said, to elaborate on the accepted answer, the why part:
As per the operator precedence table, the array indexing operator has higher precedence than a type cast, so unless forced explicitly, in the expression
(char *)&x[0]
the subscript is applied to x before the cast, so the type of x is not changed as expected. To make the cast apply first, we need to enclose it in an extra pair of parentheses.
I am developing a system (virtual machine? not sure what to call this) in which every data structure is an array (or struct) with three integer fields at the beginning (which must come right after one another, without padding bytes) followed by n generic pointers. I have given this a lot of thought, but it seems hard to find a clear solution.
Below is my first attempt.
void **a = malloc(SOME_SIZE);
a[0] = (void *)1;
a[1] = (void *)0;
a[2] = (void *)1;
a[3] = malloc(SOME_SIZE); a[4] = malloc(SOME_SIZE); //and so on...
I wasn't sure here whether casting the integers to void * is safe. After a little study, I changed the code as follows.
void **a = malloc(SOME_SIZE);
*(uintptr_t *)a = 1;
*((uintptr_t *)a + 1) = 0;
*((uintptr_t *)a + 2) = 1;
a[3] = malloc(SOME_SIZE); a[4] = malloc(SOME_SIZE); //and so on...
But then I found that on some platforms sizeof(void *) may not equal sizeof(uintptr_t). So I decided to change this to a struct.
typedef struct data {
size_t counts[3]; //padding bytes are okay after this line
void *members[]; //flexible array member
} *data;
data a = malloc(sizeof(*a) + SOME_SIZE);
a->counts[0] = 1;
a->counts[1] = 0;
a->counts[2] = 1;
a->members[0] = malloc(SOME_SIZE); a->members[1] = malloc(SOME_SIZE); //and so on...
Here I found the problem that there is no generic pointer in C that is 100% portable, but I cannot find a solution for that, so let me just ignore weird platforms with differing pointer sizes.
So given that a void * can store a pointer to any object, does my last solution satisfy my purpose? I am not very familiar with flexible array members, so I may have made a mistake. Or there might still be some points I missed.
Any help appreciated.
You could use a single array of void*, casting them to uintptr_t to obtain their integer value.
uintptr_t is an unsigned integer type capable of storing a pointer.
See the question "What is uintptr_t data type".
Keep in mind that this is a very ugly, unreadable and dangerous trick.
Assuming you'll have some means of knowing which things are integers and which are pointers, you could legitimately use a union which combines an integer and a pointer. On systems where the integer type in question does not have padding bits or trap representations, storing a pointer and then reading back an integer should always yield a value (as opposed to causing Undefined Behavior) but the value of the integer in question should be considered meaningless. Even if the integer type happens to be uintptr_t, I don't think there's any guarantee that type punning would be a legitimate means of converting between the pointer type and the integer type.
Storing an integer value and then reading the pointer would certainly be Undefined Behavior if the integer value was not one that might be read by reading a legitimate pointer. If one has a uintptr_t that was obtained by type-punning a legitimate pointer, using type punning to convert that value back to a pointer would probably work, but I don't know if it's specified. I would not expect that casting such an integer to the pointer type would work, nor that one could use type punning on a uintptr_t that was obtained via cast.
If we have to hold an address of any data type then we require a pointer of that data type.
But a pointer is simply an address, and an address is always int type. Then why does the holding address of any data type require the pointer of that type?
There are several reasons:
Not all addresses are created equal; in particular, in non-von Neumann (e.g. Harvard) architectures, pointers to code memory (where you often store constants) and pointers to data memory are different.
You need to know the underlying type in order to perform your accesses correctly. For example, reading or writing a char is different from reading or writing a double.
You need additional information to perform pointer arithmetic.
Note that there is a pointer type that means "simply a pointer" in C, called void*. You can use this pointer to transfer an address in memory, but you need to cast it to something useful in order to perform operations in the memory pointed to by void*.
Pointers are not just int. They implicitly have semantics.
Here are a couple of examples:
p->member only makes sense if you know what type p points to.
p = p+1; behaves differently depending on the size of the object you point to (in the sense that 'p' is in fact incremented, when seen as an unsigned integer, by the size of the type it points to).
The following example can help to understand the differences between pointers of different types:
#include <stdio.h>
int main()
{
// Pointer to char
char * cp = "Abcdefghijk";
// Pointer to int
int * ip = (int *)cp; // To the same address
// Try address arithmetic
printf("Test of char*:\n");
printf("address %p contains data %c\n", (void *)cp, *cp);
printf("address %p contains data %c\n", (void *)(cp+1), *(cp+1));
printf("Test of int*:\n");
printf("address %p contains data %c\n", (void *)ip, *ip);
printf("address %p contains data %c\n", (void *)(ip + 1), *(ip + 1));
return 0;
}
The output is:
It is important to understand that the expression address+1 gives a different result depending on the pointer type: +1 means advancing by the size of the addressed data, i.e. sizeof(*address).
So, if on your system (for your compiler) sizeof(int) and sizeof(char) are different (e.g., 4 and 1), the results of cp+1 and ip+1 are also different. On my system it is:
E05859(hex) - E05858(hex) = 14702681(dec) - 14702680(dec) = 1 byte for char
E0585C(hex) - E05858(hex) = 14702684(dec) - 14702680(dec) = 4 bytes for int
Note: specific address values are not important in this case. The only difference is the variable type the pointers hold, which clearly is important.
Update:
By the way, address (pointer) arithmetic is not limited to +1 or ++, so many examples can be made, like:
int arr[] = { 1, 2, 3, 4, 5, 6 };
int *p1 = &arr[1];
int *p4 = &arr[4];
printf("Distance between %d and %d is %d\n", *p1, *p4, (int)(p4 - p1));
printf("But addresses %p and %p have an absolute difference of %d\n", (void *)p1, (void *)p4, (int)((char *)p4 - (char *)p1));
With output:
So, for better understanding, read the tutorial.
You can have a typeless pointer in C very easily -- you just use void * for all pointers. This would be rather foolish though for two reasons I can think of.
First, by specifying the data that is pointed to in the type, the compiler saves you from many silly mistakes, typo or otherwise. If instead you deprive the compiler of this information you are bound to spend a LOT of time debugging things that should never have been an issue.
In addition, you've probably used "pointer arithmetic". For example, int *pInt = &someInt; pInt++; -- that advances the pointer to the next integer in memory; this works regardless of the type, and advances to the proper address, but it can only work if the compiler knows the size of what is being pointed to.
Because your assumption that "address is always int type" is wrong.
It's totally possible to create a computer architecture where, for instance, pointers to characters are larger than pointers to words, for some reason. C will handle this.
Also, of course, pointers can be dereferenced and when you do that the compiler needs to know the type of data you expect to find at the address in question. Otherwise it can't generate the proper instructions to deal with that data.
Consider:
char *x = malloc(sizeof *x);
*x = 0;
double *y = malloc(sizeof *y);
*y = 0;
These two snippets will write totally different amounts of memory (or blow up if the allocations fail, never mind that for now), but the actual literal constant (0 which is of type int) is the same in both cases. Information about the types of the pointers allows the compiler to generate the proper code.
It's mostly for those who read the code after you, so they can know what is stored at that address. Also, if you do any pointer arithmetic in your code, the compiler needs to know how far it is supposed to move forward when you do something like pSomething++; that is given by the type of the pointer, since the size of your data type is known at compile time.
Because the type of a pointer tells the compiler how many bytes each operation through that pointer acts on.
Example: for a char it is only one byte, while for an int it may be two bytes or more.
I am doing some serial communication in C on Linux, using file descriptors. For some reason, after char* s = "Hello world", I can write s to the serial port using the write function, no problem. I am using a serial monitor program to check the other end. However, I cannot send any other sort of data. I get a "Bad Address" error from the write function.
However, I noticed that if I did something very strange, int* x = "5";, I could then send this x. My question is, what in the world does int* x = "5" mean?
int* x = "5";
This is not valid C code. You have to cast the value of the array to an int *, but dereferencing the resulting pointer can still break alignment rules and be undefined behavior.
int *x = (int *) "5";
This last line creates an unnamed array object of type char [2]. The value of "5" is a pointer to its first element, of type char *. The cast converts that char * to an int * and stores it in x.
int* x = "5";
is a constraint violation. That means that any conforming compiler must issue a diagnostic for it. It needn't be treated as a fatal error; a compiler is allowed to issue a warning and then successfully translate the program. But the language does not define the behavior of this declaration.
There is no implicit conversion from char* (the type of "5" after it decays) to int*.
This is as close as C gets to saying that something is illegal.
In practice, compilers that accept this declaration will probably treat it as equivalent to:
int *x = (int*)"5";
i.e., they'll insert a conversion. (This isn't the only possible interpretation, but most compilers will either interpret it this way or reject it.) This takes the char* value that results from the decay of the array expression "5" (i.e., the address of the '5' character at the beginning of the string), and converts to int*.
The resulting int* pointer points to an int object that may or may not be valid. The string "5" is two bytes long ({ '5', '\0' }). If int is two bytes, *x may evaluate to the result of interpreting those two bytes as an int value -- which will depend on the system's endianness. Or, if the string literal isn't correctly aligned for an int object, evaluating *x might terminate your program. And if int is wider than two bytes (as it very commonly is), *x refers to memory past the end of the string literal. In any case, attempting to modify *x has yet another kind of undefined behavior, since attempting to modify a string literal is explicitly undefined.
You should have gotten at least a warning when you compiled that declaration. If so, you definitely should not have ignored it. If you didn't get a warning, you should find out how to coax your compiler to produce more warnings.
TL;DR: Don't do that.
On compilers that accept it, int* x = "5" converts "5" (an array of char that decays to char*; it is const char* only in C++) to an int* and stores it in x. Thus, x will point to sizeof(int) bytes in which the lowest-addressed is 0x35 (the character '5'), the next is 0, and the rest lie beyond the string literal, so reading them is undefined behavior.