Why can't we use * on non-pointers? - c

Let's say I have the code:
int x = 5;
int* p = &x;
then writing *p will return 5 and allow me to modify x (as expected). Say, for whatever reason that I then write:
int y = p; // y holds x's address
*y = 3; // this is invalid and throws an error when compiling
*((int*)y) = 3; // this is okay
(when compiling on gcc 9.2)
My question is: why does C not allow us to use * on non-pointer types?

C is a strongly typed language, which means that the operations which are allowed on an object (and the interpretation of those operations) are a function of the object's type. That's literally what it means for an object to have a type: the type determines the operations you can do with the object.
Unary * (the pointer indirection operator) is defined for pointer types, and it's not defined for integer types.
If you want to treat an integer's value as if it were a pointer, you can use an explicit cast, as in the *((int *)y) = 3; example you mentioned in your question.
There are two reasons the unary * operator is not defined for integers:
1. Taking an integer and pretending it's a pointer is generally a bad idea, not something to be encouraged. If you really want to do it, the extra cost imposed on you -- namely that you have to use that pointer cast -- is appropriate.
2. The bare expression *y doesn't contain enough information to know how big the pointed-to object might be. If you write *y = 3 and it were legal, how would the compiler know to assign an int, a short, or a char?
Point 2 is key. It's important to remember that C does not have one "pointer" type. Every pointer type incorporates a specification of the type of object which the pointer will point to. That's no accident, it's fundamental, and there's no way around it.
So you can't implicitly treat an integer as if it were a pointer, and even if you do it explicitly -- that is, with a cast, as in *((int *)y) = 3, you may still be on shaky ground, especially if integers and pointers don't have the same size on your machine.
These days, this is all generally such a bad idea that the compilers are slowly dropping their old "the programmer must know what he's doing" attitude, and getting somewhat hissy with warnings. For example, int y = p will generally get you a warning about a pointer-to-int assignment, and even with the explicit cast, *((int *)y) = 3 might get you a warning about "cast to pointer from integer of different size".

Boring answer - because that's how the language is defined:
6.5.3.2 Address and indirection operators
Constraints
1 The operand of the unary & operator shall be either a function designator, the result of a
[] or unary * operator, or an lvalue that designates an object that is not a bit-field and is
not declared with the register storage-class specifier.
2 The operand of the unary * operator shall have pointer type.
C 2011 Online Draft
Slightly-less boring answer:
Pointers are not integers; they do not behave like integers. The operations on pointers and integers are different. While there is such a thing as pointer arithmetic, it does not behave like integer arithmetic. Pointers are abstractions of memory addresses, which do not have to have integer representation.
Type matters in C (not as much as in some other languages, but it does matter). Operations on integer types do not apply to pointer types and vice versa, just like operations on aggregate (struct or array types) do not apply to integer types.
You can't use * on an integer operand for the same reason you can't use [] or () or . or -> on an integer operand; those operations are not defined for integer types.

We can't use * on an integer because pointers and integers are different types.
You should not use int y to store an address: an int is meant to hold an integer value and is not guaranteed to be able to hold an address.
int y = p; // y holds x's address
*y = 3; // this is invalid and throws an error when compiling
*((int*)y) = 3; // this is okay
Pointers, on the other hand, are designed to store the address of a variable, so you can use * to access the value stored at that address, whereas an integer is not designed to store an address.
The operations defined for integers and for pointers are different from each other.

"*((int*)y) = 3; // this is okay": no, this is not OK; it causes UB. Just because the compiler does not complain does not mean the code is correct or free from UB.

Related

What does this statement about pointer operators mean? "& can be used only with a variable, * can be used with variable, constant or expression."

'Address of' operator gives memory location of variables. So it can be used with variables.
I tried compiling this code.
#include<stdio.h>
int main()
{
int i=889,*j,*k;
j=&889;
k=*6422296;
printf("%d\n",j);
return 0;
}
It showed this error error: lvalue required as unary '&' operand for j=&889.
And I was expecting this error: invalid type argument of unary '*' (have 'int')| for k=*6422296.
6422296 is the memory location of variable i.
Can someone give examples of when '*' is used with constants and expressions?
P.S:- I have not yet seen any need for this But....
All constants in a program are also assigned some memory. Is it possible to determine their address with &? (Just wondering).
An expression that is a value (an rvalue in C idiom) may not represent a variable with a defined lifetime, and for that reason you cannot take its address.
In the opposite direction, it is legal (and common) to dereference an expression:
int a[] = {1,2,3};
int *pt = a + 1; // pt points to the second element of the array
int first = *(pt - 1); // perfectly legal C: reads a[0]
Dereferencing a constant is not common in C code. It only makes sense when dealing directly with the hardware, that is in kernel mode, or when programming for some embedded systems. Then you can have special registers that are mapped at well known addresses.
first_byte_of_screen = *((char *) 0xC0000); // may bring back memories for old MS-DOS programmers
But best practice would recommend defining a constant:
#define SCREEN ((unsigned char *) 0xC0000)
first_byte = *SCREEN; // or even SCREEN[0] because it is the same thing
k=*6422296 is meant to say: go to address 6422296, read what is stored there and assign it to k. That is a meaningful thing to ask for, but unary * requires an operand of pointer type, so the integer has to be cast first, as in *(int *)6422296.
j=&889 means get me the address of 889. 889 is an rvalue: a temporary that in theory exists only briefly in a CPU register and might never be stored in memory, so asking for its memory address makes no sense.
All constants in a program are also assigned some memory.
That's not necessarily the case for numeric literals; they're often hardcoded into the machine code instructions with no storage allocated for them.
A good rule of thumb is that anything that can't be the target of an assignment (such as a numeric literal) cannot be the operand of the unary & operator (there are exceptions; see the note on non-modifiable lvalues at the end of this answer).
6.5.3.2 Address and indirection operators
Constraints
1 The operand of the unary & operator shall be either a function designator, the result of a
[] or unary * operator, or an lvalue that designates an object that is not a bit-field and is
not declared with the register storage-class specifier.
2 The operand of the unary * operator shall have pointer type.
C 2011 Online Draft
*(int *)6422296 "works" (as in, doesn't result in a diagnostic from the compiler) because integer expressions can be converted to pointers with a cast, whereas the bare *6422296 violates constraint 2 above:
6.3.2.3 Pointers
...
5 An integer may be converted to any pointer type. Except as previously specified, the
result is implementation-defined, might not be correctly aligned, might not point to an
entity of the referenced type, and might be a trap representation.67)
67) The mapping functions for converting a pointer to an integer or an integer to a pointer are intended to
be consistent with the addressing structure of the execution environment.
Like all rules of thumb, there are exceptions; array expressions are lvalues, but cannot be the target of an assignment; if you declare an array like int a[10];, then you can't reassign a such as a = some_other_array_expression ;. Such lvalues are known as non-modifiable lvalues.

tricky Pointer arithmetic in C: **k

Assume k is a pointer to an integer in C.
For the expression **k, when we try to evaluate this on the right side of an assignment operator("="), would the value be illegal?
Here is my thought:
**k is actually *(*k). When we dereference k, we get the value of an integer. Then we try to dereference an integer, which is an illegal operation.
But my textbook says this expression on the right side is actually legal.
Why so?
The C 2018 standard says, in clause 6.5.3.2, paragraph 2, “The operand of the unary * operator shall have pointer type.” If k is a pointer to an integer, then *k is an integer, which is not a pointer type, so it cannot be the operand of a unary * operator. Thus an expression such as x = **k violates this rule.
The rule in 6.5.3.2 paragraph 2 is a constraint, meaning that a conforming compiler is required to produce a diagnostic message for a violation and that the C standard does not define the behavior.
Technically, a C compiler could, in addition to issuing the diagnostic message, accept the expression and define its behavior as it pleases. I am not aware of any that do so, and no common compiler does so.
It is possible the characters **k might appear in some larger expression where they do not both act as unary * operators, such as in x = y**k, which is equivalent to x = y * *k, in which the first * is a binary multiplication operator. You should show the exact text shown in your textbook.
Beyond what the standard says, dereferencing an int fundamentally makes no sense. Let's say the compiler is willing to assume that ints can be converted directly to pointers (a BIG assumption). What TYPE of pointer will the compiler assume it is? The only safe assumption is void*. And dereferencing a void* makes no sense because even when assigning to a known type, the type of pointer still matters:
unsigned int n = 0xFFFFFFFF;
void *pN = &n;
unsigned int fromIntPtr = *(unsigned int*)pN;
unsigned int fromCharPtr = *(unsigned char*)pN;
printf("%X\n", fromIntPtr);
printf("%X\n", fromCharPtr);
Output:
FFFFFFFF
FF
Lacking a type of pointer, the compiler could perhaps infer unsigned int* based on the LHS expression. That's 1) a wildly stupid inference and 2) C doesn't really infer types. (Shoving a typed RHS into a differently typed LHS value isn't inference :))

De-referencing pointer to a volatile int after increment

unsigned int addr = 0x1000;
int temp = *((volatile int *) addr + 3);
Does it treat the incremented pointer (ie addr + 3 * sizeof(int)), as a pointer to volatile int (while dereferencing). In other words can I expect the hardware updated contents of say (0x1012) in temp ?
Yes.
Pointer arithmetic does not affect the type of the pointer, including any type qualifiers. Given an expression of the form A + B, if A has the type qualified pointer to T and B is an integral type, the expression A + B will also be a qualified pointer to T -- same type, same qualifiers.
From 6.5.6.8 of the C spec (draft n1570):
When an expression that has integer type is added to or subtracted from a pointer, the
result has the type of the pointer operand.
This presumes addr is an integer (variable or constant) with a value your implementation can safely convert to an int * (see below).
Consider
volatile int a[4] = { 1,2,3,4 };
int i = a[3];
This is exactly the same, except for the explicit conversion from integer to volatile int * (a pointer to ...). For the index operator, the name of the array decays to a pointer to the first element of a. This is volatile int * (type qualifiers in C apply to the elements of an array, never the array itself).
This is the same as the cast. That leaves two differences:
The conversion from integer to pointer. This is implementation-defined, so if your compiler supports it correctly (it should document this) and the value is correct, it is fine.
Finally, the access. The underlying object is not volatile; only the pointer (that is, the access) is. This actually is a defect in the standard (see DR476), which requires the object to be volatile, not the access. This is in contrast to the documented intention (read the link) and to C++ semantics (which should be identical). Luckily, almost all implementations generate code as one would expect and perform the access as intended. Note this is a common idiom on embedded systems.
So if the prerequisites are fulfilled, the code is correct. But please see below for better(in terms of maintainability and safety) options.
Notes: A better approach would be to
use uintptr_t to guarantee the integer can hold a pointer, or - better -
#define ARRAY_ADDR ((volatile int *)0x1000)
The latter avoids accidental modification of the integer and states the implications clearly. It is also easier to use. It is a typical construct in low-level peripheral register definitions.
Re. your incrementing: addr is not a pointer! Thus you increment an integer, not a pointer. Apart from being more to type than using a true pointer, it is also error-prone and obfuscates your code. If you need a pointer, use a pointer:
volatile int *p = ARRAY_ADDR + 3;
As a personal note: Everybody passing such code (the one with the integer addr) in a company with at least some quality standards would have a very serious talk with her team leader.
First note that conversions from integers to pointers are not necessarily safe. It is implementation-defined what will happen. In some cases such conversions can even invoke undefined behavior, in case the integer value cannot be represented as a pointer, or in case the pointer ends up with a misaligned address.
It is safer to use the integer type uintptr_t to store pointers and addresses, as it is guaranteed to be able to store a pointer for the given system.
Given that your compiler implements a safe conversion for this code (for example, most embedded systems compilers do), then the code will indeed behave as you expect.
Pointer arithmetic will be done on a type that is volatile int, and therefore + 3 means increase the address by sizeof(volatile int) * 3 bytes. If an int is 4 bytes on your system, you will end up reading the contents of address 0x100C. Not sure where you got 0x1012 from, mixing up decimal and hex notation?

Pointers, casting and different compilers

I am now taking an ANSI C programming language course and trying to run this code from lecturer's slide:
#include<stdio.h>
int main()
{
int a[5] = {10, 20, 30, 40, 50};
double *p;
for (p = (double*)a; p<(double*)(a+5); ((int*)p)++)
{
printf("%d",*((int*)p));
}
return 0;
}
Unfortunately it doesn't work. On MacOS, XCode, Clang I get an error:"Assignment to cast is illegal, lvalue casts are not supported" and on Ubuntu gcc I get the next error: "lvalue required as increment operand"
I suspect the issue is the compiler, since we are learning ANSI C and it has its own requirements, which can violate other standards.
Regarding ((int*)p)++:
The result of (int *) p is a value. This is different from an lvalue. An lvalue (potentially) designates an object. For example, after int x = 3;, the name x designates the object that we defined. We can use it in an expression generally, such as y = 2*x, and then the x is used for its value. But we can also use it in an assignment, such as x = 5, and then the x is used for the object.
The expression (int *) p takes p and converts it to a pointer to int. The result is only a value. It is not an lvalue, so it does not represent an object that can be modified.
The ++ operator modifies an object. So it can only be applied to an lvalue. Since (int *) p is not an lvalue, ++ cannot be applied to it.
The code from the slide, as you have shown it, is incorrect, and I would not expect it to work in any C implementation. (C does permit implementations to make many extensions, but extending C to permit this operation would be unusual.)
Regarding (double*)a and (int*)p:
C does permit you to convert pointers to objects to pointers to different kinds of objects. There are various rules about this. An important one is that the resulting pointer must have the correct alignment for the type it points to.
Objects have various alignment requirements, meaning they must be placed at particular addresses in memory. Commonly, char objects may have any address, int objects must be at multiples of four bytes, and double objects must be at multiples of eight bytes. These requirements vary from C implementation to C implementation. I will use these values for illustration.
When you convert a proper double * to an int *, we know that the resulting pointer is a multiple of four, because it started as a multiple of eight (assuming the requirements stated above). So that is a safe conversion. When you convert an int * to a double *, it may have the wrong alignment. In particular, given an array a of int, we know that either a[0] or a[1] must be improperly aligned for a double, because, if one of them is at a multiple of eight bytes, the other one must be off by four bytes from a multiple of eight. Therefore, the conversions from int * to double * in this code are not defined by the C standard.
They might work in many C implementations, but you should not rely on them.
The C rules also state that when a pointer to an object is converted back to its original type, the result is equal to the original pointer. So, the round-trip conversions in the sample code would be okay if the alignment rules had been obeyed: You can convert an int * to a double * and back to an int *, provided alignment requirements are obeyed.
If the double * had been used to access a double, that would violate aliasing rules. Generally, C does not define the behavior when an object of one type is accessed as if it were another type. There are some exceptions, notably with character types. However, simply converting pointers back and forth without using them to access objects is okay, except for the alignment problem.

Assigning a pointer to an integer

Can I assign a pointer to an integer variable? Like the following.
int *pointer;
int array1[25];
int addressOfArray;
pointer = &array1[0];
addressOfArray = pointer;
Is it possible to do like this?
Not without an explicit cast, i.e.
addressOfArray = (int) pointer;
There's also this caveat:
6.3.2.3 Pointers
...
6 Any pointer type may be converted to an integer type. Except as previously specified, the
result is implementation-defined. If the result cannot be represented in the integer type,
the behavior is undefined. The result need not be in the range of values of any integer
type.
So the result isn't guaranteed to be a meaningful integer value.
No, it is not valid to assign a pointer to an integer. It's a constraint violation of the assignment operator (C99, 6.5.16.1p1). A compiler has the right to refuse to translate a program with a pointer to integer assignment.
It's done frequently in embedded programming with some caveats:
the operation (most likely) requires casting, and
generally implies that you are working without an operating system or working very closely with the RTOS (i.e, driver-level development)
Nope.
You are attempting to assign a "pointer to int" value to an "int" variable. You will get a compiler warning for sure. You could do:
int *pointer;
int array1[25];
int *addressOfArray;
pointer = &array1[0];
//The following commented lines are equivalent to the pointer assignment above and are also valid
//pointer = array1
//pointer = &array1[0]
addressOfArray = pointer;
This is known as a shallow copy. If you are not already familiar with the concept, I highly recommend you read it (Google "deep vs shallow copying").
You could do it like this:
memcpy(&addressOfArray, &pointer, sizeof(int));
Which would directly copy the underlying bytes and supplant the casting operator, but I definitely wouldn't recommend it, unless you know for sure that sizeof(int) == sizeof(int *) on your system.
As others have commented, it's possible to convert a pointer to an int with a cast. Not recommended, but possible. And it might turn out bad.
But there are more integer types than just int. For example, we have intptr_t and uintptr_t which are for this exact purpose.
So you can do this and it's 100% safe.
uintptr_t addressOfArray;
addressOfArray = (uintptr_t) pointer;
And then, if you really really want to (cannot see a reason why) you could do this:
int myInt;
if(addressOfArray <= INT_MAX) {
myInt = addressOfArray;
} else {

Resources