C sizes of pointer and value being pointed mismatch

I am using IAR Embedded Workbench for MSP430, with the C99 standard selected.
Hello, I am new to pointers and stuck in one place. Here is a part of code:
void read_SPI_CS_P2(uint8_t read_from, int save_to, uint8_t bytes_to_read)
{
uint8_t * ptr;
ptr = save_to;
...
From what I read about pointers I assume that:
uint8_t * ptr; - here I declare what type of data the pointer points to (I want to save a uint8_t value)
ptr = save_to; - here I assign the address of the memory I would like to write to (it's 0xF900, so an int)
It gives me an Error[Pe513]: a value of type "int" cannot be assigned to an entity of type "uint8_t *"
The question is: why? Can't the size of the data that will be saved (to save_to) and the size of the memory address be different?

Syntactically, you can cast an int directly to a pointer, which sets the pointer's value, but that is probably not a good idea and may not help you in this case.
You have to use compiler and linker support to tell them that you want some data at a specific location in memory. With IAR toolchains you can usually do this with the @ operator or the #pragma location directive, using something like:
__no_init volatile uint8_t g_u8SavingLocation @ 0xF900;
You cannot simply set the value of a pointer and start writing at that location. The linker is in charge of deciding where things go in memory, and you can use #pragma directives and linker configuration files to tell it what you want to achieve.

C language has no immediate feature that would allow you to create pointers to arbitrary numerical addresses. In C language pointers are obtained by taking addresses of existing objects (using unary & operator) or by receiving pointers from memory allocation functions like malloc. Additionally, you can create null pointers. In all such cases you never know and never need to know the actual numerical value of the pointer.
If you want to make your pointer to point to a specific address, then formally the abstract C language cannot help you with it. There's simply no such feature in the language. However, in the implementation-dependent portions of the language a specific compiler can guarantee, that if you forcefully convert an integral value (containing a numerical address) to pointer type, the resultant pointer will point to that address. So, if in your case the numerical address is stored in save_to variable, you can force it into a pointer by using an explicit cast
ptr = (uint8_t *) save_to;
The cast is required. Assigning integral values to pointers directly is not allowed in standard C. (It used to be allowed in pre-standard C though).
A better type to store integral addresses would be uintptr_t. int, or any other signed type, is certainly not a good choice for that purpose.
P.S. I'm not sure what point you are trying to make when you talk about sizes. There's no relation between the "size of data that will be saved" and size of memory address. Why?
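As a rough illustration of the explicit cast and the uintptr_t suggestion above, here is a minimal sketch. The function name store_byte_at is hypothetical, and the sketch assumes an implementation where converting an integer address to a pointer yields a pointer to that address, which the standard leaves implementation-defined.

```c
#include <stdint.h>

/* Hypothetical helper: writes one byte to the location whose numerical
 * address is passed in as an integer. The integer-to-pointer conversion
 * is implementation-defined, so this is only a sketch. */
void store_byte_at(uintptr_t address, uint8_t value)
{
    /* The explicit cast is required; a plain assignment would not compile. */
    volatile uint8_t *ptr = (volatile uint8_t *)address;
    *ptr = value;
}
```

On the MSP430 from the question, a call might look like store_byte_at(0xF900u, 0x55), provided that address is really writable on the target. On a hosted system it is safer to try it with the address of an ordinary variable, for example store_byte_at((uintptr_t)&some_byte, 0x55).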

Related

How can one use pointer constants to point to a specific memory address (e.g. an address like 0x0001a000) in ANSI C?

I am experimenting with pointers in C programming for a project and was looking to get some guidance on whether there are other ways to initialize a pointer constant to the memory address 0x0001a000.
The following was my approach:
volatile int *firstAddress = (volatile int *)0x0001a000;
printf("First Memory address is: %p\n", (void *)firstAddress);
Are there shorter ways to initialize the above in C programming?

This is exactly how you would initialize such a constant, however the results are very implementation specific.
If the given address isn't one explicitly documented as valid, you'll likely invoke undefined behavior.
You also can't really make it any more concise than this. Conversions between integers and pointers require a cast.

Does C always have to use pointers to handle addresses?

As I understand it, all of the cases where C has to handle an address involve the use of a pointer. For example, the & operator produces a pointer for the program to use, instead of just giving the bare address as data (i.e. it never gives the address without using a pointer first):
scanf("%d", &foo);
Or when using the & operator:
int i; //a variable
int *p; //a variable that stores an address
p = &i; //The & operator returns a pointer to its operand, and p is set to that pointer.
My question is: Is there a reason why C programs always have to use a pointer to manage addresses? Is there a case where C can handle a bare address (the numerical value of the address) on its own or with another method? Or is that completely impossible? (Being because of system architecture, memory allocation changing during and in each runtime, etc). And finally, would that be useful being that addresses change because of memory management? If that was the case, it would be a reason why pointers are always needed.
I'm trying to figure out if the use pointers is a must in C standardized languages. Not because I want to use something else, but because I want to know for sure that the only way to use addresses is with pointers, and just forget about everything else.
Edit: Since part of the question was answered in the comments by Eric Postpischil, Michał Marszałek, user3386109, Mike Holt and Gecko, I'll group those bits here: Yes, bare addresses are of little to no use because of several factors (pointers allow a number of operations, addresses may change each time the program is run, etc.). As Michał Marszałek pointed out (no pun intended), scanf() uses a pointer because C functions can only work with copies of their arguments, so a pointer is needed to change the caller's variable, i.e.
int foo;
scanf("%d", foo); //Wrong: passes a copy of foo's value instead of an address, so foo can't be changed (undefined behavior)
scanf("%d", &foo); //Now foo can be changed, since we pass its address.
Finally, as Gecko mentioned, pointers are there to represent indirection, so that the compiler can tell the difference between data and address.
John Bode covers most of those topics in his answer, so I'll mark that one.

A pointer is an address (or, more properly, it’s an abstraction of an address). Pointers are how we deal with address values in C.
Outside of a few domains, a “bare address” value simply isn’t useful on its own. We’re less interested in the address than the object at that address. C requires us to use pointers in two situations:
When we want a function to write to a parameter
When we need to track dynamically allocated memory
In these cases, we don’t really care what the address value actually is; we just need it to access the object we’re interested in.
Yes, in the embedded world specific address values are meaningful. But you still use pointers to access those locations. Like I said above, a pointer is an address for our purposes.
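The two situations listed above can be sketched as follows. The function names divide and make_squares are made up for illustration and are not taken from the answer.

```c
#include <stdlib.h>

/* Writes both results through pointers, since a C function can only
 * return one value directly. Returns 0 on success, -1 if den is zero. */
int divide(int num, int den, int *quotient, int *remainder)
{
    if (den == 0) {
        return -1;
    }
    *quotient = num / den;
    *remainder = num % den;
    return 0;
}

/* Allocates an array of n ints holding 0, 1, 4, 9, ...; the returned
 * pointer is the only handle to that memory, and the caller must free() it.
 * Returns NULL if the allocation fails. */
int *make_squares(size_t n)
{
    int *squares = malloc(n * sizeof *squares);
    if (squares == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < n; i++) {
        squares[i] = (int)(i * i);
    }
    return squares;
}
```

In neither function does the code care what the address values are; the pointers are only a means of reaching the objects.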

C allows you to convert pointers to integers. The <stdint.h> header provides a uintptr_t type with the property that any pointer to void can be converted to uintptr_t and back, and the result will compare equal to the original pointer.
Per C 2018 6.3.2.3 6, the result of converting a pointer to an integer is implementation-defined. Non-normative note 69 says “The mapping functions for converting a pointer to an integer or an integer to a pointer are intended to be consistent with the addressing structure of the execution environment.”
Thus, on a machine where addresses are a simple numbering scheme, converting a pointer to a uintptr_t ought to give you the natural machine address, even though the standard does not require it. There are, however, environments where addresses are more complicated, and the result of converting a pointer to an integer may not be straightforward.
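A small sketch of the round-trip property described above. The function name survives_round_trip is hypothetical, and the code assumes the implementation provides uintptr_t, which the standard makes optional.

```c
#include <stdint.h>

/* Returns 1 if converting p to uintptr_t and back gives a pointer that
 * compares equal to the original, which the standard guarantees for
 * valid pointers to void when uintptr_t is available. */
int survives_round_trip(void *p)
{
    uintptr_t as_integer = (uintptr_t)p;   /* implementation-defined value */
    void *back = (void *)as_integer;
    return back == p;
}
```

What the integer in as_integer looks like is up to the implementation; only the comparison after converting back is guaranteed.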

int i; //a variable
int *p; //a variable that stores an address
i = 10; //now i is set to 10
p = &i; //now p is set to the address of i
*p = 20; //we store 20 at the given address, so i is now 20
int tab[10]; // an array
p = tab; //p is set to the address of tab[0]
p++; //operate on the address and move it to the next element, tab[1]
Through pointers we can operate on addresses, moving forward or backward. We can write to and read from a given address.
In C, if we want to get values back from a function through its parameters, we must use pointers. We can also use the function's return value, but that way we can only get one value.
C has no references, therefore we must use pointers.
void fun(int j){
j = 10;
}
void fun2(int *j){
*j = 10;
}
int i;
i = 5; // now i is set to 5
fun(i);
//printing i will print 5, because fun() only changed its own copy
fun2(&i);
//printing i will print 10, because fun2() wrote through the pointer

Accessing struct member of pointer's address in C

I'm reading this book, and I found this code snippet in Chapter 14.
struct kobject *cdev_get(struct cdev *p)
{
struct module *owner = p->owner;
struct kobject *kobj;
if (owner && !try_module_get(owner))
return NULL;
kobj = kobject_get(&p->kobj);
if (!kobj)
module_put(owner);
return kobj;
}
I understand that this dereferences p, a cdev pointer then accesses its owner member
p->owner // (*p).owner
However, how does this work? It seems like it dereferences the memory address of a cdev pointer then access the kobj member of the pointer itself?
&p->kobj // (*(&p)).kobj
I thought pointers weren't much more than memory addresses so I don't understand how they can have members. And if it was trying to access a member of the pointer itself, why not just do p.kobj?

As per p being defined as struct cdev *p, p is very much a "memory address" but that's not all it is - it also has a type attached to it.
Since the expression *ptr is "the object pointed to by ptr", that also has the type attached, so you can logically do (*ptr).member.
And, since ptr->member is identical to (*ptr).member, it too is valid.
Bottom line is, your contention that "pointers [aren't] much more than memory addresses" is correct. But they are a little bit more :-)
In terms of &ptr->member, you seem to be reading that as (&ptr)->member, which is not correct.
Instead, as per C precedence rules, it is actually &(ptr->member), which means the address of the member of that structure.
These precedence rules are actually specified by the ISO C standard (C11 in this case). From 6.5 Expressions, footnote 85:
The syntax specifies the precedence of operators in the evaluation of an expression, which is the same as the order of the major subclauses of this subclause, highest precedence first.
And, since 6.5.2 Postfix operators (the bit covering ->) comes before 6.5.3 Unary operators (the bit covering &), that means -> evaluates first.
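The parse described above can be checked with a small sketch. The struct names outer and inner are hypothetical stand-ins for cdev and kobject, not the kernel's definitions.

```c
struct inner { int value; };
struct outer { int id; struct inner member; };

/* &p->member parses as &(p->member): the address of the member inside
 * the struct that p points to. */
struct inner *address_of_member(struct outer *p)
{
    return &p->member;
}
```

For a variable declared as struct outer o, the call address_of_member(&o) returns the same pointer as &o.member.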

A pointer variable contains a memory address. What you need to consider is how the C programming language is used to write source code in a higher level language that is then converted for you into the machine code actually used by the computer.
The C programming language is a language that was designed to make using the hardware of computers easier than using assembly code or machine code. So it has language features to make it easier to write source code that is more readable and easier to understand than assembly code.
When we declare a pointer variable in C as a pointer to a type what we are telling the compiler is the type of the data at the memory location whose address is stored in the pointer. However the compiler does not really know if we are telling it the truth or not. The key thing to remember is that an actual memory address does not have a type, it is just an address. Any type information is lost once the compiler compiles the source code into machine code.
A struct is a kind of template or pattern or stencil that is used to virtually overlay a memory area to determine how the bytes in the memory area are to be interpreted. A programmer can use higher level language features when working with data without having to know about memory addresses and offsets.
If a variable is defined as the struct type then a memory area large enough to hold the struct is allocated and the compiler will figure out member offsets for you. If a variable is defined as a pointer to a memory area that is supposed to contain the data for that type again the compiler will figure out member offsets for you. However it is up to you to have the pointer variable containing the correct address.
So if you have a struct something like the following:
struct _tagStruct {
short sOne;
short sTwo;
};
And you then use it such as:
struct _tagStruct one; // allocate a memory area large enough for a struct
struct _tagStruct two; // allocate a memory area large enough for a struct
struct _tagStruct *three; // a pointer to a memory area to be interpreted as a struct
one.sOne = 5; // assign a value to this memory area interpreted as a short
one.sTwo = 7; // assign a value to this memory area interpreted as a short
two = one; // make a copy of the one memory area in another
three = &one; // assign an address of a memory area to our pointer
three->sOne = 405; // modify the memory area pointed to, one.sOne in this case
You do not need to worry about the details of the memory layout of the struct and offsets to the struct members. And assigning one struct to another is merely an assignment statement. So this all works at a human level rather than a machine level of thinking.
However what if I have a function, short funcOne (short *inoutOne), that I want to use with the sOne member of the struct one? I can just do this funcOne(&one.sOne) which calls the function funcOne() with the address of the sOne member of the struct _tagStruct variable one.
A typical implementation of this in machine code is to load the address of the variable one into a register, add the offset to the member sOne and then call the function funcOne() with this calculated address.
I could also do something similar with a pointer, funcOne(&three->sOne).
A typical implementation of this in machine code is to load the contents of the pointer variable three into a register, add the offset to the member sOne and then call the function funcOne() with this calculated address.
So in one case we load the address of a variable into a register before adding the offset and in the second case we load the contents of a variable into a register before adding the offset. In both cases the compiler is using an offset which is usually the number of bytes from the beginning of the struct to the member of the struct. In the case of the first member, sOne of struct _tagStruct this offset would be zero bytes since it is the first member of the struct. For many compilers the offset of the second member, sTwo, would be two bytes since the size of a short is two bytes.
However the compiler is free to make choices about the layout of a struct unless explicitly told otherwise so on some computers the offset of member sTwo may be four bytes in order to generate more efficient machine code.
So using the C programming language allows us some degree of independence from the underlying computer hardware unless there is some reason for us to actually deal with those details.
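The offset arithmetic described above can be made visible with the standard offsetof macro. This is only a sketch: the struct is renamed tag_struct here (a hypothetical stand-in for the struct in this answer), and the helper function names are made up.

```c
#include <stddef.h>

/* Hypothetical stand-in for the struct in the answer. */
struct tag_struct {
    short sOne;
    short sTwo;
};

/* Byte offset the compiler chose for sTwo; often 2, but padding is allowed. */
size_t offset_of_sTwo(void)
{
    return offsetof(struct tag_struct, sTwo);
}

/* Recomputes &p->sTwo by hand: start of the struct plus the member offset. */
short *sTwo_by_offset(struct tag_struct *p)
{
    return (short *)((unsigned char *)p + offsetof(struct tag_struct, sTwo));
}
```

The first member always sits at offset zero, so offsetof(struct tag_struct, sOne) is 0 on every conforming compiler, while the value returned by offset_of_sTwo() depends on the compiler and target.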
The C language standard specifies operator precedence, meaning that when different operators are mixed together in a statement and parentheses are not used to specify an exact order of evaluation for the expression, the compiler will use these standard rules to determine how to turn the C language expression into the proper machine code (see the operator precedence table for the C programming language).
Both the dot (.) operator and the arrow (->) operator have equal precedence, which is also the highest precedence of the operators. So when you write an expression such as &three->sOne, the compiler turns it into an expression that looks like &(three->sOne). This uses the address-of operator to calculate the address of the sOne member of the memory area pointed to by the pointer variable three.
A different expression would be (&three)->sOne which actually should throw a compiler error since &three is not a pointer to a memory area holding a struct _tagStruct value but is instead a pointer to a pointer since three is a pointer to a variable of type struct _tagStruct and not a variable of type struct _tagStruct.

->member has higher precedence than &.
&p->kobj
parses as
&(p->kobj)
i.e. it's taking the address of the kobj member of the struct pointed to by p.

You have the order of operations wrong:
&p->kobj // &(p->kobj)

Pointer to Array of Bytes

I'm having some trouble with a pointer declaration that one of my co-workers wants to use because of MISRA C requirements. MISRA (a safety-critical guideline) won't let us mere programmers use pointers, but will let us operate on arrays of bytes. He intends to procure a pointer to an array of bytes (so we don't pass the actual array on the stack).
// This is how I would normally do it
//
void Foo(uint8_t* pu8Buffer, uint16_t u16Len)
{
}
// This is how he has done it
//
void Foo(uint8_t (*pu8Buffer)[], uint16_t u16Len)
{
}
The calling function looks something like:
void Bar(void)
{
uint8_t u8Payload[1024];
uint16_t u16PayloadLen;
// ...some code to fill said array...
Foo(u8Payload, u16PayloadLen);
}
But, when pu8Buffer is accessed in Foo(), the array is wrong. Obviously not passing what it is expecting. The array is correct in the calling function, but not inside Foo()
I think he has created an array of pointers to bytes, not a pointer to an array of bytes.
Anyone care to clarify? Foo(&u8Payload, u16PayloadLen); doesn't work either.

In void Foo(uint8_t (*pu8Buffer)[], uint16_t u16Len), pu8Buffer is a pointer to an (incomplete) array of uint8_t. The pointed-to array type is incomplete because its size is unknown, so pu8Buffer may not be used in expressions where that size is required (such as pointer arithmetic; pu8Buffer+1 is not allowed).
Then *pu8Buffer is an array whose size is unknown. Since it is an array, it is automatically converted in most situations to a pointer to its first element. Thus, *pu8Buffer becomes a pointer to the first uint8_t of the array. The type of the converted *pu8Buffer is complete; it is a pointer to uint8_t, so it may be used in address arithmetic; *(*pu8Buffer + 1), (*pu8Buffer)[1], and 1[*pu8Buffer] are all valid expressions for the uint8_t one beyond *pu8Buffer.
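A minimal sketch of a function that takes such a pointer to an array and reads it through (*buffer)[i]. The name sum_bytes is made up for illustration and this is not the Foo() from the question.

```c
#include <stdint.h>

/* buffer is a pointer to an array of unknown size, so the caller passes
 * the address of the whole array and the callee writes (*buffer)[i]. */
uint32_t sum_bytes(uint8_t (*buffer)[], uint16_t len)
{
    uint32_t total = 0;
    for (uint16_t i = 0; i < len; i++) {
        total += (*buffer)[i];
    }
    return total;
}
```

With uint8_t payload[4] = {1, 2, 3, 4}; the call is sum_bytes(&payload, 4). Passing payload without the & hands the function a uint8_t *, which is a different pointer type, and compilers will usually diagnose that.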

I take it you are referring to MISRA-C:2004 rule 17.4 (or 2012 rule 18.4). Even someone like me who is a fan of MISRA finds this rule to be complete nonsense. The rationale for the rule is this (MISRA-C:2012 18.4):
"Array indexing using the array subscript syntax, ptr[expr], is the
preferred form of pointer arithmetic because it is often clearer and
hence less error prone than pointer manipulation. Any explicitly
calculated pointer value has the potential to access unintended or
invalid memory addresses. Such behavior is also possible with array
indexing, but the subscript syntax may ease the task of manual review.
Pointer arithmetic in C can be confusing to the novice. The expression
ptr+1 may be mistakenly interpreted as the addition of 1 to the
address held in ptr. In fact the new memory address depends on the
size in bytes of the pointer's target. This misunderstanding can lead
to unexpected behaviour if sizeof is applied incorrectly."
So it all boils down to MISRA worrying about beginner programmers confusing ptr+1 to have the outcome we would have when writing (uint8_t*)ptr + 1. The solution, in my opinion, is to educate the novice programmers, rather than to restrict the professional ones (but then if you hire novice programmers to write safety-critical software with MISRA compliance, understanding pointer arithmetic is probably the least of your problems anyhow).
Solve this by writing a permanent deviation from this rule!
If you for reasons unknown don't want to deviate, but to make your current code MISRA compliant, simply rewrite the function as
void Foo(uint8_t pu8Buffer[], uint16_t u16Len)
and then replace all pointer arithmetic with pu8Buffer[something]. Then suddenly the code is 100% MISRA compatible according to the MISRA:2004 exemplar suite. And it is also 100% functionally equivalent to what you already have.
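As a sketch of that rewrite, here is a function declared with array syntax whose body uses only subscripts. The function xor_checksum is a made-up example, and no claim is made that it satisfies every other MISRA rule.

```c
#include <stdint.h>

/* The parameter is declared with array syntax; the compiler adjusts it to
 * uint8_t *, and the body uses only subscripts, no explicit pointer arithmetic. */
uint8_t xor_checksum(uint8_t pu8Buffer[], uint16_t u16Len)
{
    uint8_t u8Sum = 0;
    for (uint16_t i = 0; i < u16Len; i++) {
        u8Sum ^= pu8Buffer[i];
    }
    return u8Sum;
}
```

The caller passes the array by name as usual, for example xor_checksum(u8Payload, u16PayloadLen).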

How does the OS know how much to increment different pointers?

With a 32-bit OS, we know that the pointer size is 4 bytes, so sizeof(char*) is 4 and sizeof(int*) is 4, etc. We also know that when you increment a char*, the byte address (offset) changes by sizeof(char); when you increment an int*, the byte address changes by sizeof(int).
My question is:
How does the OS know how much to increment the byte address for sizeof(YourType)?

The compiler only knows how to increment a pointer of type YourType * if it knows the size of YourType, which is the case if and only if the complete definition of YourType is known to the compiler at this point.
For example, if we have:
struct YourType *a;
struct YourOtherType *b;
struct YourType {
int x;
char y;
};
Then you are allowed to do this:
a++;
but you are not allowed to do this:
b++;
..since struct YourType is a complete type, but struct YourOtherType is an incomplete type.
The error given by gcc for the line b++; is:
error: arithmetic on pointer to an incomplete type

The OS doesn't really have anything to do with that - it's the compiler's job (as @zneak mentioned).
The compiler knows because it just compiled that struct or class - the size is, in the struct case, pretty much the sum of the sizes of all the struct's contents, plus any padding the compiler inserts for alignment.

It is primarily an issue for the C (or C++) compiler, and not primarily an issue for the OS per se.
The compiler knows its alignment rules for the basic types, and applies those rules to any type you create. It can therefore establish the alignment requirement and size of YourType, and it will ensure that it increments any YourType* variable by the correct value. The alignment rules vary by hardware (CPU), and the compiler is responsible for knowing which rules to apply.
One key point is that the size of YourType must be such that when you have an array:
YourType array[20];
then &array[1] == &array[0] + 1. The byte address of &array[1] must be incremented by sizeof(YourType), and (assuming YourType is a structure), each of the elements of array[1] must be properly aligned, just as the elements of array[0] must be properly aligned.
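The relationship between element addresses and sizeof can be checked with a short sketch. The struct your_type and the function stride_in_bytes are hypothetical names chosen for illustration.

```c
#include <stddef.h>

/* Hypothetical example type; any complete type would do. */
struct your_type {
    int x;
    char y;
};

/* Number of bytes between p and p + 1, measured through char pointers. */
size_t stride_in_bytes(struct your_type *p)
{
    return (size_t)((char *)(p + 1) - (char *)p);
}
```

For any element of a struct your_type array, the result equals sizeof(struct your_type), which includes whatever trailing padding the compiler added to keep the next element aligned.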

Also remember that the sizes of types are fixed in your compiled code to match the hardware you are targeting. It is entirely up to the compiler and the source code to work this out.
So a C program targeting a low-end 16-bit chipset might need to define its types differently from one targeting a 32-bit system.
The programming language and compiler are what govern your types. Not the OS or hardware.
Although of course trying to stick a 32 bit number into a 16 bit register could be a problem!

C pointers are typed, unlike those of some old languages like PL/1. This not only allows the size of the object to be known, but also allows widening operations and formatting to be carried out. For example, when getting the data at *p, is that a float, a double, or a char? The compiler needs to know (think of division, for example).
Of course we do have a typeless pointer, a void *, which you cannot do any arithmetic with simply because the compiler has no idea how much to add to the address.
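When a void * does need to be moved by a number of bytes, the usual workaround is to convert it to a character pointer first. A minimal sketch, with the hypothetical name advance_bytes:

```c
#include <stddef.h>

/* Standard C does not allow arithmetic on void *, so the pointer is
 * converted to unsigned char * first, where each step is exactly one byte. */
void *advance_bytes(void *p, size_t n)
{
    return (unsigned char *)p + n;
}
```

For int arr[2], advance_bytes(arr, sizeof(int)) gives the address of arr[1]; the caller supplies the element size that the compiler would otherwise take from the pointer's type.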
