Directly declare and assign value to C pointers - c

I have read
Directly assigning values to C Pointers
However, I am trying to understand this different scenario...
int *ptr = 10000;
printf("value: %d\n", ptr);
printf("value: %d\n", *ptr);
I got a segmentation fault on the second printf.
Now, I am under the impression that 10000 is a memory location because pointers point to the address in the memory. I am also aware that 10000 could be anywhere in the memory (which might already be occupied by some other process)
Therefore, I am thinking so the first print is just saying that "ok, just give me the value of the address as some integer value", so, ok, I got 10000.
Then I am saying "ok, now deference it for me", but I have not put anything in it so (or it is uninitialized) so I got a segmentation fault.
Maybe my logic is already totally off the track and this point.
UPDATED::::
Thanks for all the quick responses.. So here is my understanding.
First,
int *ptr = 10000;
is UB because I cannot assign a pointer to a constant value.
Second, the following is also UB because instead of using %p, I am using %d.
printf("value: %d\n", ptr)
Third, I have given an address (although it is UB), but I have not initialized to some value so, the following statement got seg fault.
print("value: %d\n", *ptr)
Is my understanding correct now ?
thanks.

int *ptr = 10000;
This is not merely undefined behavior. This is a constraint violation.
The expression 10000 is of type int. ptr is of type int*. There is no implicit conversion from int to int* (except for the special case of a null pointer constant, which doesn't apply here).
Any conforming C compiler, on processing this declaration, must issue a diagnostic message. It's permitted for that message to be a non-fatal warning, but once it's issued that message, the program's behavior is undefined.
A compiler could treat it as a fatal error and refuse to compile your program. (In my opinion, compilers should do this.)
If you really wanted to assign ptr to point to address 10000, you could have written:
int *ptr = (int*)10000;
There's no implicit conversion from int to int*, but you can do an explicit conversion with a cast operator.
That's a valid thing to do if you happen to know that 10000 is a valid address for the machine your code will run on. But in general the result of converting an integer to a pointer "is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation" (N1570 section 6.3.2.3). If 10000 isn't a valid address (and it very probably isn't), then your program still has undefined behavior, even if you try to access the value of the pointer, but especially if you try to dereference it.
This also assumes that converting the integer value 10000 to a pointer type is meaningful. Commonly such a conversion copies the bits of the numeric value, but the C standard doesn't say so. It might do some strange implementation-defined transformation on the number to produce an address.
Addresses (pointer values) are not numbers.
printf("value: %d\n", ptr);
This definitely has undefined behavior. The %d format requires an int argument. On many systems, int and int* aren't even the same size. You might end up printing, say, the high-order half of the pointer value, or even some complete garbage if integers and pointers aren't passed as function arguments in the same way. To print a pointer, use %p and convert the pointer to void*:
printf("value: %p\n", (void)ptr);
Finally:
printf("value: %d\n", *ptr);
The format string is correct, but just evaluating *ptr has undefined behavior (unless (int*)10000 happens to be a valid address).
Note that "undefined behavior" doesn't mean your program is going to crash. It means that the standard says nothing about what will happen when you run it. (Crashing is probably the best possible outcome; it makes it obvious that there's a bug.)

No, the definition int *ptr = 10000 does not give undefined behaviour.
It converts the literal value 10000 into a pointer, and initialises ptr with that value.
However, in your example
int *ptr = 10000;
printf("value: %d\n", ptr);
printf("value: %d\n", *ptr);
both of the printf() statements give undefined behaviour.
The first gives undefined behaviour because the %d format tells printf() that the corresponding argument is of type int, which ptr is not. In practice (with most compilers/libraries) it will often happily print the value 10000, but that is happenstance. Essentially (and a little over-simplistically), for that to happen, a round-trip conversion (e.g. converting 10000 from int to pointer, and then converting that pointer value to an int) needs to give the same value. Surviving that round trip is NOT guaranteed, although it does happen with some implementations, so the first printf() might APPEAR well behaved, despite involving undefined behaviour.
Part of the problem with undefined behaviour is that one possible result is code behaving as the programmer expects. That doesn't make the behaviour defined. It simply means that a particular set of circumstances (behaviour of compiler, operating system, hardware, etc) happen to conspire to give behaviour that seems sensible to the programmer.
The second printf() statement gives undefined behaviour because it dereferences ptr. The standard gives no basis to expect that a pointer with value 10000 corresponds to anything in particular. It might be a location in RAM. It might be a location in video memory. It might be a value that does not correspond to any location in memory that exists on your computer. It might be a logical or physical memory location that your operating system deems your process is not allowed to access (which is actually what causes an access violation under several operating systems, which then send a signal to the process running your program directing it to terminate).
A lot of C compilers (if appropriately configured) will give a warning on the initialisation of ptr because of this - an initialisation like this is easier for the compiler to detect, and usually indicates problems in subsequent code.

This may cause undefined behavior since the pointer converted from 10000 may be invalid.
Your OS may not allow your program to access the address 10000, so it will raise Segmentation Fault.
int *x = some numerical value (i.e. 10, whatever)
may be for microcomputers or low-level (example: creating OS).

Related

Casual value of a not initialized pointer to pointer

When I define a pointer without initializing it:
int *pi;
it points to a random part of the memory.
What happens when I define a pointer to pointer without initializing it?
int **ppi;
Where does it point? It should points to another pointer, but I didn't define it so maybe it points to a random part of the memory? If possible, could you show the difference with an example please?
To make it clear consider the following declaration
T *ptr;
where T is some type specifier. If the declared variable ptr has automatic storage duration then the pointer is initialized neither explicitly nor implicitly.
So the pointer has an indeterminate value.
T can be any type. You can define T as for example
typedef int T;
or
typedef int *T;
Pointers are scalar objects. Thus in this declaration
typedef int *T;
T *ptr;
the pointer ptr has indeterminate value the same way as in the declaration
typedef int T;
T *ptr;
Any local variable that isn't initialized contains an indeterminate value. Regardless of type. There's no obvious difference here between for example an uninitialized int, int* or int**.
However, there is a rule in C saying that if you don't access the address of such an uninitialized local variable, but use its value, you invoke undefined behavior - meaning a bug, possibly a crash etc. The rationale is likely that such variables may be allocated in registers and not have an addressable memory location. See https://stackoverflow.com/a/40674888/584518 for details.
So these examples below are all bad and wrong, since the address of the local variables themselves are never used:
{
int i;
int* ip;
int** ipp;
printf("%d\n, i); // undefined behavior
printf("%p\n, (void*)ip); // undefined behavior
printf("%p\n, (void*)ipp); // undefined behavior
}
However, if you take the address of a variable somewhere, C is less strict. In such a case you end up with the variable getting an indeterminate value, which means that it could contain anything and the value might not be consistent if you access it multiple times. This could be a "random address" in case of pointers, but not necessarily so.
An indeterminate value may be what's known as a "trap representation", a forbidden binary sequence for that type. In such cases, accessing the variable (read or write) invokes undefined behavior. This isn't likely to happen for plain int unless you have a very exotic system that is not using 2's complement - because in standard 2's complement systems, all value combinations of an int are valid and there are no padding bits, negative zero etc.
Example (assuming 2's complement):
{
int i;
int* ip = &i;
printf("%d\n", *ip); // unspecified behavior, might print anything
}
Unspecified behavior meaning that the compiler need not document the behavior. You can get any kind of output and it need not be consistent. But at least the program won't crash & burn as might happen in the case of undefined behavior.
But trap representations is more likely to be a thing for pointer variables. A specific CPU could have a restricted address space or at the low level initialization, the MMU could be set to have certain regions as virtual, some regions to only contain data or some regions to only contain code etc. It might be possible that such a CPU generates a hardware exception even when you read an invalid address value into an index register. It is certainly very likely that it does so if you attempt to access memory through an invalid address.
For example, the MMU might block runaway code which tries to execute code from the data segment of the memory, or to access the contents of the code memory as if it was data.

What harm would arise by pointer arithmetic beyond a valid memory range?

I followed the discussion on One-byte-off pointer still valid in C?.
The gist of that discussion, as far as I could gather, was that if you have:
char *p = malloc(4);
Then it is OK to get pointers up to p+4 by using pointer arithmetic. If you get a pointer by using p+5, then the behavior is undefined.
I can see why dereferencing p+5 could cause undefined behavior. But undefined behavior using just pointer arithmetic?
Why would the arithmetic operators + and - not be valid operations? I don’t see any harm by adding or subtracting a number from a pointer. After all, a pointer is a represented by a number that captures the address of an object.
Of course, I was not in the standardization committee :) I am not privy to the discussions they had before codifying the standard. I am just curious. Any insight will be useful.
The simplest answer is that it is conceivable that a machine traps integer overflow. If that were the case, then any pointer arithmetic which wasn't confined to a single storage region might cause overflow, which would cause a trap, disrupting execution of the program. C shouldn't be obliged to check for possible overflow before attempting pointer arithmetic, so the standard allows a C implementation on such a machine to just allow the trap to happen, even if chaos ensues.
Another case is an architecture where memory is segmented, so that a pointer consists of a segment address (with implicit trailing 0s) and an offset. Any given object must fit in a single segment, which means that valid pointer arithmetic can work only on the offset. Again, overflowing the offset in the course of pointer arithmetic might produce random results, and the C implementation is under no obligation to check for that.
Finally, there may well be optimizations which the compiler can produce on the assumption that all pointer arithmetic is valid. As a simple motivating case:
if (iter - 1 < object.end()) {...}
Here the test can be omitted because it must be true for any pointer iter whose value is a valid position in (or just after) object. The UB for invalid pointer arithmetic means that the compiler is not under any obligation to attempt to prove that iter is valid (although it might need to ensure that it is based on a pointer into object), so it can just drop the comparison and proceed to generate unconditional code. Some compilers may do this sort of thing, so watch out :)
Here, by the way, is the important difference between unspecified behaviour and undefined behaviour. Comparing two pointers (of the same type) with == is defined regardless of whether they are pointers into the same object. In particular, if a and b are two different objects of the same type, end_a is a pointer to one-past-the-end of a and begin_b is a pointer to b, then
end_a == begin_b
is unspecified; it will be 1 if and only if b happens to be just after a in memory, and otherwise 0. Since you can't normally rely on knowing that (unless a and b are array elements of the same array), the comparison is normally meaningless; but it is not undefined behaviour and the compiler needs to arrange for either 0 or 1 to be produced (and moreover, for the same comparison to consistently have the same value, since you can rely on objects not moving around in memory.)
One case I can think of where the result of a + or - might give unexpected results is in the case of overflow or underflow.
The question you refer to points out that for p = malloc(4) you can do p+4 for comparison. One thing this needs to guarantee is that p+4 will not overflow. It doesn't guarantee that p+5 wont overflow.
That is to say that the + or - themselves wont cause any problems, but there is a chance, however small, that they will return a value that is unsuitable for comparison.
Performing basic +/- arithmetic on a pointer will not cause a problem. The order of pointer values is sequential: &p[0] < &p[1] < ... &p[n] for a type n objects long. But pointer arithmetic outside this range is not defined. &p[-1] may be less or greater than &p[0].
int *p = malloc(80 * sizeof *p);
int *q = p + 1000;
printf("p:%p q:%p\n", p, q);
Dereferencing pointers outside their range or even inside the memory range, but unaligned is a problem.
printf("*p:%d\n", *p); // OK
printf("*p:%d\n", p[79]); // OK
printf("*p:%d\n", p[80]); // Bad, but &p[80] will be greater than &p[79]
printf("*p:%d\n", p[-1]); // Bad, order of p, p[-1] is not defined
printf("*p:%d\n", p[81]); // Bad, order of p[80], p[81] is not defined
char *r = (char*) p;
printf("*p:%d\n", *((int*) (r + 1)) ); // Bad
printf("*p:%d\n", *q); // Bad
Q: Why is p[81] undefined behavior?
A: Example: memory runs 0 to N-1. char *p has the value N-81. p[0] to p[79] is well defined. p[80] is also well defined. p[81] would need to the value N to be consistent, but that overflows so p[81] may have the value 0, N or who knows.
A couple of things here, the reason p+4 would be valid in such a case is because iteration to one past the last position is valid.
p+5 would not be a problem theoretically, but according to me the problem will be when you will try to dereference (p+5) or maybe you will try to overwrite that address.

Assigning an int value to a pointer in C

According to this Berkeley course and this Stanford course, assigning an int to a pointer like so should crash my program:
#include<stdio.h>
int main() {
int *p;
*p = 5;
printf("p is %d\n", *p);
return 0;
}
Yet when I compile this on OSX 10.9.2 using GCC {Apple LLVM version 5.0 (clang-500.2.79) (based on LLVM 3.3svn)} I get the following output:
Chris at iMac in ~/Source
$ ./a.out
p is 5
How come? I see the reverse question has been asked (Allocating a value to a pointer in c) but this only confuses me further as my program is nearly identical to the code posted in that question.
You are not allocating int to a pointer. If p is a pointer, *p points to its value. You are assigning *p = 5; which is plain integer to integer assignment.
In this case, p is not initialized and may give rise to segmentation fault.
According to this Berkeley course and this Stanford course, allocating an int to a pointer like so should crash my program:
First off let's get your terminology correct. You are not "allocating an int to a pointer". Allocation is the creation of a storage location. An int is a value. A pointer is a value. A pointer may be dereferenced. The result of dereferencing a valid pointer is to produce a storage location. The result of dereferencing an invalid pointer is undefined, and can literally do anything. A value may be assigned to a storage location.
So you have:
int *p;
You have allocated a (short-lived) storage location called p. The value of that storage location is a pointer. Since you did not initialize it to a valid pointer, its value is either a valid pointer or an invalid pointer; which is it? That's up to your compiler to decide. It could always be invalid. It could always be valid. Or it could be left up to chance.
Then you have:
*p = 5;
You dereferenced a pointer to produce a storage location, then you stuck 5 in that storage location. If it was a valid pointer, congratulations, you stuck 5 into a valid storage location. If it was an invalid pointer then you've got undefined behaviour; again, anything could happen.
printf("p is %d\n", *p); // 5!
Apparantly you got lucky and it was a valid storage location this time.
According to this Berkeley course and this Stanford course [this] should crash my program:
Then those courses are wrong, or you are misquoting them or misunderstanding them. That program is permitted to do anything whatsoever. It is permitted to send an email inviting the president of Iran to your movie night, it is permitted to print anything whatsoever, it is permitted to crash. It is not required to do anything, because "permitted to do anything" and "required to do something" are opposites.
My answers: You're lucky!
The fact is, the value you happen to have in p is a valid memory address, that you can actually write to, this is why it works...
Try playing with optimising options (-O3) in your compiler, than you'll get different results...
Your program is not guaranteed to fail, neither is it guaranteed to succeed. You are dereferencing a pointer which could hold any address. It is undefined behaviour.
It's not that it should crash, but it is undefined behavior.
Depending on the system and the runtime circumstances, assigning to a dereferenced and uninitialized pointer can either segfault or continue running "normally," which is to say that it appears work but may have unusual side effects.

Differences between int variable and pointer variable in C [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking for code must demonstrate a minimal understanding of the problem being solved. Include attempted solutions, why they didn't work, and the expected results. See also: Stack Overflow question checklist
Closed 9 years ago.
Improve this question
#include <stdio.h>
int j;
int *ptr;
int main(void)
{
j = 13232323;//adress I want to assign
ptr = j;//assign the adress to the pointer
printf("%d",ptr);
}
OUTPUT:13232323
Am I doing wrong as assigning adress directly to a pointer? Pointer is nothing but a variable contains value in the address form, so I constructed an address and then assign it to the pointer,it works as it was supposed to be, except that I can not construct an address containing characters ABCDEF,so,what's the big difference between int and pointer?
EDIT:
This code means nothing but solely for testing purpose
Actually what your trying is out of your eagerness., I agree, I too often do this way.
The Very first thing is if you store 13232323 in a pointer variable, the hex value of it is OXC9E8C3., so really at the time of your assigning the pointer variable (ptr) doesnot know whether really it is a valid address or invalid address. But when you dereference this address with *ptr, then comes the problem. It tries to seek the value in the address.
Then there are 2 cases..
Really if what you assigned is a valid address , then it will return the value. (Practically impossible case)
Mostly your address will be a invalid one, hence program will crash (Segmentation fault).
So, even though your program compiles, runs, until and unless you store a valid address in ptr., ptr has no use.
Your Q: It works as it was surposed to be,except that I can not construct an
address containing characters ABCDEF,so,what's the big difference
between int and pointer? printf("%d",ptr);
I think you are asking, whatever the case, I cannot store ABCDEF, hence ptr works same as int type, so what is the difference between integer and pointer?
Here it is :
You cannot deference an integer value, where as pointer can do it.
Hence it is called as pointer :)
You are seeing only numbers because you are printing the address with %d, trying printing with %x or %p.
Atlast, do you notice the compiler warning, warning: assignment makes pointer from integer without a cast , because ptr = j; in that ptr is of int* type, and j is of int type.
you need to use %x, or you can send *ptr to the printf
printf("%d", *ptr);// if you are using this you need to know what are you pointing at, and that it will be an integer
or
printf("%x", ptr);
as the comments below says, it is always better to use %p instead of %x because %x invoke undefined behaver. also when using %p one should cast his pointer to (void *)
Your code is invalid. C language does not allow this assignment
ptr = j;
Integer values cannot be converted to pointer types without an explicit cast (constant zero being the only exception).
If the code was accepted by your compiler, it simply means that your compiler extends the language beyond its formal bounds. Many compilers do that in one way or another. However, if your compiler accepted this code without issuing a diagnostic message (a warning or an error), then your compiler is seriously broken.
In other words, the behavior of our program has very little (or nothing) to do with C language.
In C you can do most of the type conversion, even the types are totally unrelated. It may cause warning though. make sure that your type conversion make sense.
Syntactically, you are not doing anything wrong here. You can anyways, assign one pointer to another, and since they only contain addresses like the one you assigned here, this is what actually happens during the pointer assignment.
But you should never assign a random address and try to dereference it, because your program doesn't have access to a memory location unless alloted by the machine. You should request the memory from the machine using a function like malloc().
As for differences between pointers and integers, the biggest one is that you can't use a dereference operator (*) on an integer to access the value at the address it contains.
And you can construct addresses using ABCDEF by prefixing your address value with 0X or 0x to signify a hexadecimal notation.
Am I doing wrong as assigning adress directly to a pointer?
Yes, the only integer types that are supposed to work directly with pointers as you do are intptr_t and uintptr_t, if they exist. If they don't exist you are not supposed to do this at all. Even then you would have to use an explicit cast to do the conversion.
Pointer is nothing but a variable contains value in the address
form,so I constructed an adress and then assign it to the pointer,it
works as it was surposed to be,except that I can not construct an
address containing characters ABCDEF,so,what's the big difference
between int and pointer?
No, on some architectures pointers can be more complex beasts than that. Memory can e.g be segmented, so there would be one part of the pointer value that encodes the segment and another that encodes the offset in that segment. Such architectures are rare nowadays, nevertheless this is something that is still permitted by the C standard.
you need ptr to hold the address of j by doing the following
ptr = &j;
ptr hold, equal to, the address of j
and you want to print the content of whats ptr is pointing to by the following
printf("%d",*ptr);
and to get the address of ptr do the following
printf("%x",ptr);
x for the hex representation

Pointer arithmetic in c and array bounds

I was browsing through a webpage which had some c FAQ's, I found this statement made.
Similarly, if a has 10 elements and ip
points to a[3], you can't compute or
access ip + 10 or ip - 5. (There is
one special case: you can, in this
case, compute, but not access, a
pointer to the nonexistent element
just beyond the end of the array,
which in this case is &a[10].
I was confused by the statement
you can't compute ip + 10
I can understand accessing the element out of bounds is undefined, but computing!!!.
I wrote the following snippet which computes (let me know if this is what the website meant by computing) a pointer out-of-bounds.
#include <stdio.h>
int main()
{
int a[10], i;
int *p;
for (i = 0; i<10; i++)
a[i] = i;
p = &a[3];
printf("p = %p and p+10 = %p\n", p, p+10);
return 0;
}
$ ./a.out
p = 0xbfa53bbc and p+10 = 0xbfa53be4
We can see that p + 10 is pointing to 10 elements(40 bytes) past p. So what exactly does the statement made in the webpage mean. Did I mis-interpret something.
Even in K&R (A.7.7) this statement is made:
The result of the + operator is the
sum of the operands. A pointer to an
object in an array and a value of any
integral type may be added. ... The
sum is a pointer of the same type as
the original pointer, and points to
another object in the same array,
appropriately offset from the original
object. Thus if P is a pointer to an
object in an array, the expression P+1
is a pointer to the next object in the
array. If the sum pointer points
outside the bounds of the array,
except at the first location beyond
the high end, the result is
undefined.
What does being "undefined" mean. Does this mean the sum will be undefined, or does it only mean when we dereference it the behavior is undefined. Is the operation undefined even when we do not dereference it and just calculate the pointer to element out-of-bounds.
Undefined behavior means exactly that: absolutely anything could happen. It could succeed silently, it could fail silently, it could crash your program, it could blue screen your OS, or it could erase your hard drive. Some of these are not very likely, but all of them are permissible behaviors as far as the C language standard is concerned.
In this particular case, yes, the C standard is saying that even computing the address of a pointer outside of valid array bounds, without dereferencing it, is undefined behavior. The reason it says this is that there are some arcane systems where doing such a calculation could result in a fault of some sort. For example, you might have an array at the very end of addressable memory, and constructing a pointer beyond that would cause an overflow in a special address register which generates a trap or fault. The C standard wants to permit this behavior in order to be as portable as possible.
In reality, though, you'll find that constructing such an invalid address without dereferencing it has well-defined behavior on the vast majority of systems you'll come across in common usage. Creating an invalid memory address will have no ill effects unless you attempt to dereference it. But of course, it's better to avoid creating those invalid addresses so that your code will work perfectly even on those arcane systems.
The web page wording is confusing, but technically correct. The C99 language specification (section 6.5.6) discusses additive expressions, including pointer arithmetic. Subitem 8 specifically states that computing a pointer one past the end of an array shall not cause an overflow, but beyond that the behavior is undefined.
In a more practical sense, C compilers will generally let you get away with it, but what you do with the resulting value is up to you. If you try to dereference the resulting pointer to a value, as K&R states, the behavior is undefined.
Undefined, in programming terms, means "Don't do that." Basically, it means the specification that defines how the language works does not define an appropriate behavior in that situation. As a result, theoretically anything can happen. Generally all that happens is you have a silent or noisy (segfault) bug in your program, but many programmers like to joke about other possible results from causing undefined behavior, like deleting all of your files.
The behaviour would be undefined in the following case
int a[3];
(a + 10) ; // this is UB too as you are computing &a[10]
*(a+10) = 10; // Ewwww!!!!

Resources