C programming pointer, bytes and memory allocation have 8 question - c

I am trying to understand bytes and pointers and how they are stored. Can anyone explain or answer some of my questions? Thank you.
int num = 513; <-- allocating 4 bytes of memory by initializing
//[01][02][00][00] <-- (bytes are shown in little-endian order)
char * ptr = &num; //char is (one byte)
↓
//[01][02][00][00]
// the pointer always starts at index [0] of these bytes
// in the allocated memory, so ptr[0] is in this case = [01]
// (printed with %02x: "printf("the byte %02x\n",ptr[0]);" - if it's only a
// single digit 1, a zero is added in front so it prints as 01)
int * ptr = &num; //now creating a pointer with the type of int (four bytes)
↓ ↓ ↓ ↓
//[01][02][00][00]
How can I access the first byte of this int pointer? [question01]
Is there a way to see the bites inside of the first byte ([01])? [question02]
Where does the pointer save the address? Does it have to allocate memory space in RAM to save the address, such as 0x233828ff21, and if so, does this address (0x233828ff21) require a lot of bytes? [question03]
Where does this int pointer store its type length (4 bytes)? [question05]
What happens if I declare a pointer to a type with a longer byte length, such as long long * ptr = &num; [01][02][00][00][00][00][00][00]
Since I am pointing a long long at a 4-byte int, can those last 4 bytes already have been allocated by another program and be in use? Can I read them? [question06]
Binary is only 0 and 1; is one of those (0 or 1) called a bite? [question07]
One byte is 8 bits, right? Why am I getting 16 bits (0000000000000001) when converting the number 1 on this website (https://www.rapidtables.com/convert/number/decimal-to-binary.html)? Shouldn't it be 8? [question08]

Note: char * ptr = &num; should really be unsigned char * ptr = (unsigned char *)&num; to avoid compiler warnings and to ensure that the bytes are treated as unsigned values.
how can i access the first byte of this int pointer? [question01]
If you really want to access the first byte of a pointer, you can use:
unsigned char *ptr2 = (unsigned char *)&ptr;
then ptr2[0] is the first byte of the pointer ptr.
is there a way to see the bites inside the of the first byte([01])? [question02]
I assume you mean the bits inside the first byte. Bits are not directly addressable, so you need an expression (usually with bit-wise operators) to get the value of each bit. For example, (ptr[m] >> n) & 1 will be the value of the nth bit of the mth byte of an object (where ptr is an unsigned char * pointing to the start of the object, and bit 0 is the least significant bit).
where does the pointer save the address? does it have to allocate a memory space in the ram to save whe address such as 0x233828ff21 and if so this(0x233828ff21) address requires a lot of bytes? [question03]
Addresses are stored in pointer variables in the same way as numbers are stored in variables of numeric type. At the CPU instruction level, there is no difference between a stored pointer value and a stored integer value, other than the width.
The most typical sizes of pointer types are 8 bytes or 4 bytes, depending on the target architecture of the compiler.
(There is no question04.)
where does this int pointer stores it's type length (4bytes)? [question05]
It doesn't store the length of the type, but the compiler knows that a TYPE * points to an object that is sizeof(TYPE) bytes long.
what happens if i declare a type with longer byte memory allocation such as long long * ptr = &num; [01][02][00][00][00][00][00][00] since i am pointing a long long to a 4 byte int, can those 4 last already been allocated by another program and in use? can i read it? [question06]
If the pointer is not correctly aligned for the referenced type (long long) then the behavior is undefined. Otherwise it can be converted back to the original pointer type int *. In any case, accessing *ptr will result in undefined behavior (unless long long is the same width as int, which is not typical).
binary are only 0 and 1 and whether one of those(0 or 1) is called a bite? [question07]
It is called a bit. There is also a type called _Bool. Expressions of type _Bool always have the value 0 or 1.
one byte is 8 bits right? why am i getting 16 bits 0000000000000001 when converting the number 1 in this website (https://www.rapidtables.com/convert/number/decimal-to-binary.html) shouldn't it be 8? [question08]
Who cares what some random web-site displays?
What C calls a "byte" is any type where sizeof(type) is 1, including char, signed char and unsigned char. It is at least 8 bits wide, but is wider than 8 bits on some exotic systems.
A pointer of character type (char *, signed char * or unsigned char *) can be used to access the individual bytes within any object, but that might not be true for pointers of other size 1 types, and is certainly not true for pointer to _Bool (_Bool *)!

• how can i access the first byte of this int pointer? [question01]
Generally, it is preferable to use unsigned char rather than char to access arbitrary bytes, so let’s do that.
After unsigned char *ptr = (unsigned char *) &num;, ptr is a pointer to unsigned char, and you could access the first byte of the int with *ptr or ptr[0], as in printf("The first byte, in hexadecimal, is 0x%02hhx.\n", *ptr);.
If instead you have int *ptr = &num;, there is no direct way to access the first byte. ptr here is a pointer to an int, and, to access an individual byte, you need a pointer to an unsigned char or other single-byte type. You could convert ptr to a pointer to unsigned char, as with (unsigned char *) ptr, and then you can access the individual byte with * (unsigned char *) ptr.
• is there a way to see the bites inside the of the first byte([01])? [question02]
The C standard does not provide a way to display the individual bits of a byte. Commonly programmers print the values in hexadecimal, as above, and read the bits from the hexadecimal digits. You can also write your own routine to write binary output from a byte.
• where does the pointer save the address? does it have to allocate a memory space in the ram to save whe address such as 0x233828ff21 and if so this(0x233828ff21) address requires a lot of bytes? [question03]
A pointer is a variable like your other int and char variables. It has space of its own in memory where its value is stored. (This model of variables having memory is used to specify the behavior of C programs. When a program is optimized by a compiler, it may change this.)
In current systems, pointers are commonly 32 or 64 bits (four or eight 8-bit bytes), depending on the target architecture. You can find out which with printf("The size of a 'char *' is %zu bytes.\n", sizeof (char *));. (The C standard allows pointers of different types to be different sizes, but that is rare in modern C implementations.)
• where does this int pointer stores it's type length (4bytes)? [question05]
The compiler knows the sizes of pointers. The pointer itself does not store the length of the thing it is pointing to. The compiler simply generates appropriate code when you use the pointer. If you use *ptr to get the value that a pointer points to, the compiler will generate a load-byte instruction if the type of ptr is char *, and it will generate a load-four-byte instruction if the type of ptr is int * (and int is four bytes in your C implementation).
• what happens if i declare a type with longer byte memory allocation such as long long * ptr = &num; [01][02][00][00][00][00][00][00] since i am pointing a long long to a 4 byte int, can those 4 last already been allocated by another program and in use? can i read it? [question06]
When long long is an eight-byte integer, and you have a long long *ptr that is pointing to a four-byte integer, the C standard does not define the behavior when you attempt to use *ptr.
In general-purpose multi-user operating systems, the memory after the int cannot be allocated by another program (unless this program and the other program have both arranged to share memory). Each process is given its own virtual address space, and their memory is kept separate.
Using this long long *ptr in your program may access memory beyond that of the int. This can cause various types of bugs in your program, including corrupting data and alignment errors.
• binary are only 0 and 1 and whether one of those(0 or 1) is called a bite? [question07]
One binary digit is a “bit”. Multiple binary digits are “bits”.
The smallest group of bits that a particular computer operates on as a unit is a “byte”. The size of a byte can vary; early computers had bytes of different sizes. Modern computers almost all use eight-bit bytes.
If your program includes the header <limits.h>, it defines a macro named CHAR_BIT that provides the number of bits in a byte. It is eight in almost all modern C implementations.
• one byte is 8 bits right? why am i getting 16 bits 0000000000000001 when converting the number 1 in this website (https://www.rapidtables.com/convert/number/decimal-to-binary.html) shouldn't it be 8? [question08]
The web site is not merely converting to one byte.
It seems to show at least 16 bits, choosing the least of 16, 32, or 64 bits that the value fits in as a signed integer type.

Related

What addresses do the pointer store?

I am currently in the learning process of pointers in C.
I came to know that, a pointer is a variable which stores the address of another variable.
So when I did something like,
#include <stdio.h>
int main()
{
    int x = 10;
    int *ptr;
    ptr = &x;
    printf("%d" ,ptr);
}
The above gave me the address in integer values.
My question is, the pointer variable ptr stores address of variable of type int.
As per my PC, int is taking 4 bytes which is 32 bits. As per my understanding each bit has a separate memory address.
So what is the address pointer will point to? Will it point to the first bits memory address or something else? Please let me know.
Please correct me if my understanding is wrong.
A memory address is the location of one byte. A 32-bit location is the location on a four byte boundary.
So memory address 0x0000 is equal to the first 32-bit memory location. Address 0x0004 would be equal to the next four-byte boundary or, in other words, the next 32-bit location.
Then that just leaves the issue of big-endian and little-endian.
On systems like x86, each individual byte has its own address. For multi-byte objects like ints or doubles, the address of the object is the address of its first byte. On a little-endian system like x86, the first byte is the least-significant byte, while on a big-endian system like Power the first byte is the most significant byte:
int x = 0x01234567;
    A      A+1    A+2    A+3      big-endian
+------+------+------+------+
| 0x01 | 0x23 | 0x45 | 0x67 |
+------+------+------+------+
   A+3    A+2    A+1     A        little-endian
Most architectures have alignment restrictions such that multi-byte entities must have an address that is a multiple of 2 or 4. This is why struct types may have "padding" bytes between members.
Addressing and byte ordering are a function of the underlying architecture, not the C language. There are some word-addressed systems where each individual byte does not have its own address, so pointers to smaller types like char may need to include an offset into the word. Representation of pointer types can vary.
Unless you’re working on bare metal, the address values you’re working with are virtual addresses, not physical.
As per my understanding each bit has a separate memory address.
No, every byte has a memory address. You'll have to use an additional offset to get individual bits. You cannot make a pointer point at a single bit.
So what is the address pointer will point to? Will it point to the first bits memory address or something else? Please let me know.
Nope. It points to the object as a whole, and its value is the address of the object's first (lowest-addressed) byte. Which part of the number that byte holds depends on endianness.
Furthermore, when compiled with -Wall -Wextra this gives a warning.
warning: format ‘%d’ expects argument of type ‘int’, but argument 2 has type ‘int *’
You're using the wrong format specifier for printing a pointer. %d is for int. Your pointer, however, has the type int* which is not int. If you want to print the address, use this instead:
printf("%p", (void*)ptr);
The minimum addressable unit is the byte. Regardless of how many bytes an object of type int occupies, a pointer to such an object points to the first byte of the memory the object occupies.
From the C Standard
3.5
1 bit
unit of data storage in the execution environment large enough to hold
an object that may have one of two values
2 NOTE It need not be possible to express the address of each
individual bit of an object.
3.6
1 byte
addressable unit of data storage large enough to hold any member of
the basic character set of the execution environment
Note that this call
printf("%d" ,ptr);
invokes undefined behavior.
If you want to output the value of a pointer you should use the conversion specifier p. For example
printf("%p\n" , ( void * )ptr);

Why can't you store an extremely large single value in a dynamically allocated block of memory?

Why can't I store a large integer in this allocated block of memory?
int *dyn = malloc(16);
*dyn = 9999999999;
printf("%lli\n", *dyn);
free(dyn);
At compile-time, GCC warns me that an integer overflow will occur. And sure enough, when printed out, it has overflowed.
Why can't I use the entire block of memory to store a single value?
*dyn = 9999999999; does not instruct the computer to use all the memory that was allocated for dyn to store the value 9999999999.
In C, dyn has a type. The type is “pointer to int”. Then *dyn has type int, which has a specific, fixed number of bits. It does not have a type meaning “all the memory that was allocated for dyn”.
Since *dyn is an int, *dyn = 9999999999; tells the computer to put 9999999999 into an int. Since 9999999999 is too big for an int in your C implementation, an overflow occurs.
We can program computers and design programming languages to manage integers of arbitrary sizes. Some languages, such as Python, do this. However that requires extra software, particularly software that has to do some variable amount of work when the program is running in order to handle whatever sizes of numbers come along. C is designed to be an elementary language. It works with objects of specific sizes and generally translates C code to fixed amounts of work in processor instructions. These provide building blocks for programmers to build bigger software. So, in C, int objects have fixed sizes. Allocating 16 bytes of memory provides space for several int objects, but it does not provide a big integer object.
The size of int is usually 4 bytes (32 bits). And, it can take 2^32 distinct states from -2147483648 to 2147483647.
So when you try to store this *dyn = 9999999999; , integer overflow occurs.
*dyn does not refer to the whole allocated block; it refers to the single int object stored at the start of that block.
Why can't I use the entire block of memory to store a single value?
Because the size of int is almost certainly not 16 bytes, and when you dereference an int pointer in the *dyn = 9999999999; expression, that access is limited to a single int, whose maximum value is likely 2^31 - 1.
Please note that the integer constant 9999999999 has a type too, which the compiler determines at compile time from the magnitude of the number. In this case, very likely long or long long. So the actual bug here is your attempt to do int x = 9999999999; which isn't in the slightest related to malloc or pointers. It's a simple overflow.
To use numbers larger than 2.14 billion, you must use a 64 bit type instead. Use int64_t/uint64_t from stdint.h.
You cannot allocate 16 bytes, memcpy some value in there and then access the data through pointers to some arbitrary integer type. This is because of the somewhat dysfunctional C type system. Simplified explanation: the chunk of data returned from malloc has no type internally, until you store something there. Then it gets the type you used when storing, and all subsequent accesses have to use that same type too; everything else invokes undefined behavior according to "the strict aliasing rule".
dyn is a pointer to int (here pointing to 16 bytes of reserved memory, which can be treated as an array of int). *dyn is an int (the first element of that array). This is similar to an array:
int dyn[4];
dyn[0]=9999999999;
Assigning 9999999999 to an int overflows, since on modern platforms int typically covers only the range [−2 147 483 648, +2 147 483 647] (and is guaranteed to cover at least [−32767, +32767]).
9999999999, or more readably 9,999,999,999, is not in the range of the int that dyn points to, no matter how much memory malloc has allocated:
int *dyn = malloc(16); // `dyn` points to an `int` object, no matter
// how many bytes malloc allocates.
*dyn = 9999999999; // Attempt to assign an out-of-range value to an `int` object.
On most modern systems, an object of type int occupies 4 bytes of memory.
4 Bytes can hold a maximum of 2^(8*4) = 2^32 = 4,294,967,296 values.
Now you have the type of int which is equivalent to the type of signed int.
signed int stores both positive and negative numbers, so those 2^32 values are split between the two.
signed int has the range of -2,147,483,648 to 2,147,483,647, and so does int.
So you cannot hold 9,999,999,999 in an int object, because the maximum value an int object can store is 2,147,483,647.
If you want to store 9,999,999,999 in an object, use e.g. long long int. Do not rely on long int: it is not guaranteed to be wider than int (it is 32 bits on 64-bit Windows, for example):
long long int *dyn = malloc(16); // `dyn` points to a `long long int` object.
*dyn = 9999999999; // Fine, because 9999999999 is in the range of a `long long int` object.

how is the size of every pointer variable being 8 bytes on a 64-bit machine justified?

I went through pointer arithmetic and the fact that you can't assign a pointer of one data type to a pointer of another data type. For example, the declaration below is incorrect.
double x = 10;
int *ptrInt = &x;
We've assigned the address of double variable to a "pointer to integer". Double takes 8 bytes as compared to an integer, that takes 4 bytes and therefore an integer pointer will truncate those extra 4 bytes.
But how come, the size of any pointer variable is 8 bytes and that also means it will not truncate those extra 4 bytes and should work correctly(even though it doesn't).
I have this doubt. Can anybody help me with the clarification?
how come, the size of any pointer variable is 8 bytes?
A pointer variable holds an address, and on a 64-bit system an address is 8 bytes wide regardless of whether the pointer points to an int, char, or float object. The pointer's size is the size of the address, not the size of the object it points to.

How to create a pointer of a specific size, then have it point to a specific address in memory

I want to move through a hexdump one byte at a time, using a pointer, until I find a specific sequence of bytes that is X bytes long. To do this, I need to cast a pointer to a size of X bytes. For example, a pointer for a size of 3 bytes.
I know that I could simply use something like uint16_t if I wanted it to be 2 bytes, or uint32_t if I wanted it to be 4 bytes. But neither of these work for this.
I have to start by pointing to the start of the block of memory that I have the location of, so that I can move through it one byte at a time. How can I do this without losing that position?
This is not possible. If your system doesn't have any types that use 3 bytes of storage then you cannot have a pointer to such a type. Instead you should use an unsigned char * and read bytes out of where it is pointing to, and do something with them.
For example if those 3 bytes are the least significant 3 bytes of an int, in big-endian format, you could write:
unsigned char *ptr = /* find the right location */;
unsigned int foo = ptr[0] * 0x10000u + ptr[1] * 0x100u + ptr[2];

Can unsigned int replace Pointers in C

The range of "unsigned int" is equal to the range of "int *" or any pointer, since both take 4 bytes on a 64-bit platform. Can pointers in C be replaced by unsigned int?
There is no such guarantee for unsigned int. A special type uintptr_t was introduced in C99 to hold a pointer, regardless of the platform. You need to include <stdint.h> header to use this type.
Absolutely not. Pointers may be eight bytes, not four, on a 64-bit platform! You can sometimes get away with casting between long long and a 64-bit pointer, but even that is questionable.
You can't even be sure that int is 4 bytes on a 64-bit architecture, as the type limits are entirely implementation- and environment-defined. The standard only specifies the minimum range each type must be able to represent. That's all. int could be 8 bytes on a 64-bit platform, or 10 bytes, or 13 bytes, or whatever size the environment chooses. So of course you can replace it, but doing so could result in invalid alignment or data loss, as any assignment between types of different sizes can.
At least you can assign a pointer value to an int value, but that is also the only conversion between value type and pointer type that ISO/IEC 9899 allows.
Short answer: no.
Slightly longer answer: there's no guarantee that unsigned int will be able to hold a valid pointer value for a given platform (several people have already pointed out that 64-bit platforms will likely have address values that fall outside of the range of unsigned int). This is further complicated by the fact that pointers to different types may have different sizes and representations.
Pointers don't just encode an address value; they also encode a type. This matters for pointer arithmetic and array subscripting. For example, assume the declarations:
char *cp = (char *) 0x4000;
int *ip = (int *) 0x4000;
float (*dap)[10] = (float (*)[10]) 0x4000;
All three pointers (cp, ip, dap) start out with the same value: 0x4000. However, adding 1 to each pointer will give different results. Assuming 32-bit int and float types, we'd get:
cp + 1 == 0x4001
ip + 1 == 0x4004
dap + 1 == 0x4028
Adding 1 to cp gives us the address of the next char object (0x4001), adding 1 to ip gives us the address of the next int object (0x4004), and adding 1 to dap gives us the address of the next 10-element array of float object (0x4028). This allows us to use the [] operator on each pointer: we can write cp[i] and ip[i] and get the result we expect (the i'th element following the pointer). If you typed all those pointers as unsigned int, however, you wouldn't be able to use the subscript operator, and adding 1 to them would only give you the next byte address, not necessarily the address of the next object.
