I have a pointer char *a pointing to a block of memory. I have another pointer char *b that points to a memory block. Let's say b points to address 0x10001111. I want to write this address into the next 8 bytes of the memory block pointed to by a. In short, when I dereference a, I should get the next 8 bytes of memory with the value 0x10001111. How do I do that? This is an x86_64 machine.
My current code:
static void write_add(void *a, char *b)
{
    *(unsigned long *)a = (unsigned long)b;
    return;
}
I see only 0x00001111 on dereferencing a. Why am I not able to see the complete address?
Why would you involve a separate, unrelated type of uncertain size? If you want to store the pointer value in exactly 8 bytes (and supposing that the value fits in 8 bytes), you would spell that like so:
#include <stdint.h>
static void write_add(void *a, char *b) {
*(uint64_t *) a = (uint64_t) b;
}
You could also use memcpy(), but that seems a bit heavy-handed.
Do note, however, that C guarantees nothing about the size of the representation of pointer values. Although it is likely that 64 bits is enough on your system, you cannot safely assume that it is enough on every system.
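For what it's worth, the memcpy() variant mentioned above could look roughly like the sketch below. The names store_pointer and load_pointer are made up for illustration, and the code copies sizeof(char *) bytes rather than assuming the pointer is exactly 8 bytes wide.

```c
#include <string.h>

/* Hypothetical helper: copy the bytes of the pointer value b into the
   memory at a. The destination needs room for sizeof(char *) bytes
   (8 on typical x86_64 systems) but does not have to be aligned. */
void store_pointer(void *a, char *b)
{
    memcpy(a, &b, sizeof b);
}

/* Hypothetical helper: read the stored pointer value back out. */
char *load_pointer(const void *a)
{
    char *b;
    memcpy(&b, a, sizeof b);
    return b;
}
```

Because memcpy() works byte by byte, this also behaves when a is not suitably aligned for a 64-bit integer, which the cast-and-assign version quietly assumes.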
unsigned long is not necessarily 64 bits on your platform. You can't make that assumption simply because you're on an x86_64 platform. I'm willing to bet it's actually 32 bits, as it is on 64-bit Windows, for example.
You should use uintptr_t instead, as suggested by cad. Where <stdint.h> provides it (it is formally optional, but mainstream platforms define it), this type is big enough to contain a pointer on your platform.
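A small sketch of the uintptr_t approach, assuming the implementation provides that type. The function names write_addr and print_addr are invented for the example. Printing with the PRIxPTR macro from <inttypes.h> shows every digit of the stored value, however wide unsigned long happens to be.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Store the pointer value as an integer wide enough to hold it.
   Assumes a is suitably aligned for uintptr_t and has room for one. */
void write_addr(void *a, char *b)
{
    *(uintptr_t *)a = (uintptr_t)b;
}

/* Print the stored value in hexadecimal using the matching format macro. */
void print_addr(const void *a)
{
    printf("stored address: 0x%" PRIxPTR "\n", *(const uintptr_t *)a);
}
```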
I am trying to get the bytes and pointers and how they are stored can any one explain or answer some of my questions. Thank you
int num = 513; <-- allocating 4 bytes of memory by initializing
//[01][02][00][00] <-- (numbers are sorted and shown as little endian)
char * ptr = &num; //char is (one byte)
↓
//[01][02][00][00]
// pointer always start from the [0] (as in this array byte length)
// in the allocated address in the memory ptr[0] is in this case = [01]
// (printed as %x02 "printf("the byte %02x\n",ptr[0]);" - if it's only
//single number 1 a zero will be added on the length so it prints out as 01)
int * ptr = &num; //now creating a pointer with the type of int (four bytes)
↓ ↓ ↓ ↓
//[01][02][00][00]
how can i access the first byte of this int pointer? [question01]
is there a way to see the bites inside the of the first byte([01])? [question02]
where does the pointer save the address? does it have to allocate a memory space in the ram to save whe address such as 0x233828ff21 and if so this(0x233828ff21) address requires a lot of bytes? [question03]
where does this int pointer stores it's type length (4bytes)? [question05]
what happens if i declare a type with longer byte memory allocation such as long long * ptr = &num; [01][02][00][00][00][00][00][00]
since i am pointing a long long to a 4 byte int, can those 4 last already been allocated by another program and in use? can i read it? [question06]
binary are only 0 and 1 and whether one of those(0 or 1) is called a bite? [question07]
one byte is 8 bits right? why am i getting 16 bits 0000000000000001 when converting the number 1 in this website (https://www.rapidtables.com/convert/number/decimal-to-binary.html) shouldn't it be 8? [question08]
Note: char * ptr = &num; should really be unsigned char * ptr = (unsigned char *)&num; to avoid compiler warnings and to ensure that the bytes are treated as unsigned values.
how can i access the first byte of this int pointer? [question01]
If you really want to access the first byte of a pointer, you can use:
unsigned char *ptr2 = (unsigned char *)&ptr;
then ptr2[0] is the first byte of the pointer ptr.
is there a way to see the bites inside the of the first byte([01])? [question02]
I assume you mean the bits inside the first byte. Bits are not directly addressable, so you need an expression (usually with bit-wise operators) to get the value of each bit. For example, (ptr[m] >> n) & 1 will be the value of the nth bit of the mth byte of an object (where ptr is an unsigned char * pointing to the start of the object, and bit 0 is the least significant bit).
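The expression above can be wrapped in a small function. This is only a sketch; get_bit is a name chosen for the example, and bit 0 is taken to be the least significant bit of the byte.

```c
#include <stddef.h>

/* Return bit n (0 = least significant) of byte m of the object at obj. */
int get_bit(const void *obj, size_t m, unsigned n)
{
    const unsigned char *ptr = obj;
    return (ptr[m] >> n) & 1;
}
```

For the bytes [01][02] from the question, get_bit(bytes, 0, 0) and get_bit(bytes, 1, 1) are both 1, and every other bit is 0.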
where does the pointer save the address? does it have to allocate a memory space in the ram to save whe address such as 0x233828ff21 and if so this(0x233828ff21) address requires a lot of bytes? [question03]
Addresses are stored in pointer variables in the same way as numbers are stored in variables of numeric type. At the CPU instruction level, there is no difference between a stored pointer value and a stored integer value, other than the width.
The most typical sizes of pointer types are 8 bytes or 4 bytes, depending on the target architecture of the compiler.
(There is no question04.)
where does this int pointer stores it's type length (4bytes)? [question05]
It doesn't store the length of the type, but the compiler knows that a TYPE * points to an object that is sizeof(TYPE) bytes long.
what happens if i declare a type with longer byte memory allocation such as long long * ptr = &num; [01][02][00][00][00][00][00][00] since i am pointing a long long to a 4 byte int, can those 4 last already been allocated by another program and in use? can i read it? [question06]
If the pointer is not correctly aligned for the referenced type (long long) then the behavior is undefined. Otherwise it can be converted back to the original pointer type int *. In any case, accessing *ptr will result in undefined behavior (unless long long is the same width as int, which is not typical).
binary are only 0 and 1 and whether one of those(0 or 1) is called a bite? [question07]
It is called a bit. There is also a type called _Bool. Expressions of type _Bool always have the value 0 or 1.
one byte is 8 bits right? why am i getting 16 bits 0000000000000001 when converting the number 1 in this website (https://www.rapidtables.com/convert/number/decimal-to-binary.html) shouldn't it be 8? [question08]
Who cares what some random web-site displays?
What C calls a "byte" is any type where sizeof(type) is 1, including char, signed char and unsigned char. It is at least 8 bits wide, but is wider than 8 bits on some exotic systems.
A pointer of character type (char *, signed char * or unsigned char *) can be used to access the individual bytes within any object, but that might not be true for pointers of other size 1 types, and is certainly not true for pointer to _Bool (_Bool *)!
• how can i access the first byte of this int pointer? [question01]
Generally, it is preferable to use unsigned char rather than char to access arbitrary bytes, so let’s do that.
After unsigned char *ptr = (unsigned char *) &num;, ptr is a pointer to unsigned char, and you could access the first byte of the int with *ptr or ptr[0], as in printf("The first byte, in hexadecimal, is 0x%02hhx.\n", *ptr);.
If instead you have int *ptr = &num;, there is no direct way to access the first byte. ptr here is a pointer to an int, and, to access an individual byte, you need a pointer to an unsigned char or other single-byte type. You could convert ptr to a pointer to unsigned char, as with (unsigned char *) ptr, and then you can access the individual byte with * (unsigned char *) ptr.
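Put together, the conversion described above might be used like this. It is a sketch only; dump_bytes and first_byte are hypothetical helper names.

```c
#include <stddef.h>
#include <stdio.h>

/* Print each byte of the object at obj in memory order, in hexadecimal. */
void dump_bytes(const void *obj, size_t size)
{
    const unsigned char *bytes = obj;
    for (size_t i = 0; i < size; i++)
        printf("[%02x]", (unsigned)bytes[i]);
    printf("\n");
}

/* Return the first byte (lowest address) of an int through a
   character-type pointer. */
unsigned char first_byte(const int *ptr)
{
    return *(const unsigned char *)ptr;
}
```

On a little-endian machine with a 4-byte int, dump_bytes(&num, sizeof num) for num = 513 should print [01][02][00][00], matching the diagram in the question.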
• is there a way to see the bites inside the of the first byte([01])? [question02]
The C standard does not provide a way to display the individual bits of a byte. Commonly programmers print the values in hexadecimal, as above, and read the bits from the hexadecimal digits. You can also write your own routine to write binary output from a byte.
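One way to write such a routine is sketched below. byte_to_binary is a made-up name, and the caller is assumed to supply a buffer of at least CHAR_BIT + 1 characters.

```c
#include <limits.h>

/* Write the bits of byte into out as '0'/'1' characters, most significant
   bit first, followed by a terminating null character. */
void byte_to_binary(unsigned char byte, char *out)
{
    for (int i = 0; i < CHAR_BIT; i++) {
        int bit = (byte >> (CHAR_BIT - 1 - i)) & 1;
        out[i] = bit ? '1' : '0';
    }
    out[CHAR_BIT] = '\0';
}
```

With 8-bit bytes, byte_to_binary(0x01, text) leaves "00000001" in text, which can then be printed with %s.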
• where does the pointer save the address? does it have to allocate a memory space in the ram to save whe address such as 0x233828ff21 and if so this(0x233828ff21) address requires a lot of bytes? [question03]
A pointer is a variable like your other int and char variables. It has space of its own in memory where its value is stored. (This model of variables having memory is used to specify the behavior of C programs. When a program is optimized by a compiler, it may change this.)
In current systems, pointers are commonly 32 or 64 bits (four or eight 8-bit bytes), depending on the target architecture. You can find out which with printf("The size of a 'char *' is %zu bytes.\n", sizeof (char *));. (The C standard allows pointers of different types to be different sizes, but that is rare in modern C implementations.)
• where does this int pointer stores it's type length (4bytes)? [question05]
The compiler knows the sizes of pointers. The pointer itself does not store the length of the thing it is pointing to. The compiler simply generates appropriate code when you use the pointer. If you use *ptr to get the value that a pointer points to, the compiler will generate a load-byte instruction if the type of ptr is char *, and it will generate a load-four-byte instruction if the type of ptr is int * (and int is four bytes in your C implementation).
• what happens if i declare a type with longer byte memory allocation such as long long * ptr = &num; [01][02][00][00][00][00][00][00] since i am pointing a long long to a 4 byte int, can those 4 last already been allocated by another program and in use? can i read it? [question06]
When long long is an eight-byte integer, and you have a long long *ptr that is pointing to a four-byte integer, the C standard does not define the behavior when you attempt to use *ptr.
In general-purpose multi-user operating systems, the memory after the int cannot be allocated by another program (unless this program and the other program have both arranged to share memory). Each process is given its own virtual address space, and their memory is kept separate.
Using this long long *ptr in your program may access memory beyond that of the int. This can cause various types of bugs in your program, including corrupting data and alignment errors.
• binary are only 0 and 1 and whether one of those(0 or 1) is called a bite? [question07]
One binary digit is a “bit”. Multiple binary digits are “bits”.
The smallest group of bits that a particular computer operates on as a unit is a “byte”. The size of a byte can vary; early computers had bytes of different sizes. Modern computers almost all use eight-bit bytes.
If your program includes the header <limits.h>, it defines a macro named CHAR_BIT that provides the number of bits in a byte. It is eight in almost all modern C implementations.
• one byte is 8 bits right? why am i getting 16 bits 0000000000000001 when converting the number 1 in this website (https://www.rapidtables.com/convert/number/decimal-to-binary.html) shouldn't it be 8? [question08]
The web site is not merely converting to one byte.
It seems to show at least 16 bits, choosing the least of 16, 32, or 64 bits that the value fits in as a signed integer type.
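To make that rule concrete, here is a guess at the selection logic, written as a sketch. It is inferred only from the observed output described above, not from the site's actual code, and display_width is a name invented for the example.

```c
#include <stdint.h>

/* Assumed rule: the smallest of 16, 32, or 64 bits whose signed
   range contains value. */
int display_width(int64_t value)
{
    if (value >= INT16_MIN && value <= INT16_MAX)
        return 16;
    if (value >= INT32_MIN && value <= INT32_MAX)
        return 32;
    return 64;
}
```

Under this rule the number 1 gets 16 digits, which would explain the 0000000000000001 in the question.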
#include <stdio.h>
int main(){
int a = 5;
int *p = &a;
int **pp = &p;
char **cp = (char **)pp;
cp++; // This still moves 8 bytes
return 0;
}
Since the size of a pointer is 64 bits on 64 bit machines, doing a pp++ will always move 8 bytes. Is there a way to make it move only 1 byte?
Is there a way to make it move only 1 byte?
Maybe.
All object pointers can be converted to void * and since char * has the same representation, to char *. ++ increments a char * by 1.
#include <stdio.h>
int main() {
int a = 5;
int *p = &a;
int **pp = &p;
char **cp = (char **)pp;
char *character_pointer = (char *) cp;
character_pointer++; // Increment by 1
Now comes the tricky part: can that incremented pointer be converted back to a char **? C allows that unless:
If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined. C17dr § 6.3.2.3 7
cp = (char **) character_pointer;
return 0;
}
Reading *cp can readily cause undefined behavior, as cp is not certain to point to a valid char *. OP's goal at this point is unclear.
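If the goal is just to step through the pointer's storage one byte at a time, the conversion back to char ** is not needed at all. A sketch, assuming that is what OP is after (copy_pointer_bytes is a hypothetical name):

```c
#include <stddef.h>

/* Copy the pointer object *pp into out, one byte at a time. Walking an
   object's storage through an unsigned char * is well defined, and
   nothing is converted back to a pointer-to-pointer type. */
void copy_pointer_bytes(int **pp, unsigned char *out)
{
    const unsigned char *src = (const unsigned char *)pp;
    for (size_t i = 0; i < sizeof *pp; i++) {
        out[i] = *src;
        src++; /* advances by exactly one byte */
    }
}
```

Here out must have room for sizeof(int *) bytes. The misaligned intermediate addresses only ever exist as unsigned char * values, so the alignment rule quoted above never comes into play.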
C is not assembly. What you are trying to do is undefined behavior: the compiler might not do what you ask, and the program might do anything, including possibly what you think it should do if C were just "assembly" with different syntax.
That being said, you can do this:
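#include <stdint.h>
#include <string.h>
int main(void) {
    int a = 5;
    int *p = &a;
    int **pp = &p;
    uintptr_t temp;
    memcpy(&temp, &pp, sizeof temp); // assumes sizeof temp == sizeof pp
    temp++;
    memcpy(&pp, &temp, sizeof temp);
    return 0;
}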
The code above is likely to do what you want, even though that last memcpy already triggers undefined behavior, because it copies an invalid value into a pointer (that alone is enough for it to be UB). Actually using pp, which now holds an invalid value, increases the chance of messing things up.
To understand why any UB really is UB: the compiler is free to decide that code which can be proven to have UB does nothing, or is never reached. So if that last memcpy is inside an if, and the compiler can prove UB occurs whenever the condition is true, it may just assume the condition is never true and optimize the whole if away. The compiler presumes that the C programmer wrote the condition so that it would never result in UB, so this optimization can already be made at compile time.
Yeah, it is a bit crazy. C is not just assembly with different syntax!
Incrementing pointer to pointer by one byte
If you find an implementation where the size of a pointer to pointer variable contains only 8 bits, (i.e. one that uses 1 byte addressing, btw, very unlikely), then it will be doable, and only then would it be safe to do so. Otherwise it would not be considered a practical or safe thing to do.
For an implementation that uses 64 bit addressing, 64 bits are needed to represent each natural pointer location. Note, however, that "[t]he smallest incremental change is [available as a by-product of] the alignment needs of the referenced type. For performance, this often matches the width of the ref type, yet systems can allow less." (per @chux in comments), but dereferencing these locations could, and likely would, lead to undefined behavior.
And in this statement
char **cp = (char **)pp; //where pp is defined as int **
the cast, although allowing a compile without complaining, is simply masking a problem. With the exception of void *, pointer variables are created using the same base type of the object they are to point to for the reason that the sizeof different types can be different, so the pointers designed to point to a particular type can represent its locations accurately.
It is also important to note the following:
          sizeof(char **) == sizeof(char *) == sizeof(char ***) != sizeof(char)
32-bit    4 bytes            4 bytes           4 bytes             1 byte
64-bit    8 bytes            8 bytes           8 bytes             1 byte

          sizeof(int **)  == sizeof(int *)  == sizeof(int ***)  vs sizeof(int)
32-bit    4 bytes            4 bytes           4 bytes             4 bytes (typically)
64-bit    8 bytes            8 bytes           8 bytes             4 bytes (typically)
So, unlike the type of a pointer, its size has little to do with its ability to point to a location containing an object that is smaller, or even larger, than the pointer used to point to it.
The purpose of a pointer ( eg char * ) is to store an address to an object of the same base type, in this case char. If targeting 32bit addressing, then the size of the pointer indicates it can point to 4,294,967,296 different locations (or if 64 bits to 18,446,744,073,709,551,616 locations.) and because in this case it is designed to point to char, each address differs by one byte.
But this really has nothing to do with your observation that when you increment a pointer to pointer to char that you see 8 bytes, and not 1 byte. It simply has to do with the fact that pointers, in 64bit addressing, require 8 bytes of space, thus the successive printf statements below will always show an increment of 8 bytes between the 1st and 2nd calls:
char **cp = (char **)pp;
size_t size = sizeof(cp);
printf("value of cp before increment: %p\n", (void *)cp);
cp++; // This still moves 8 bytes
printf("value of cp after increment: %p\n", (void *)cp);
return 0;
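The same observation can be checked without printing anything. The sketch below measures the step size directly; both function names are made up for the example.

```c
#include <stddef.h>

/* Number of bytes between cp and cp + 1, measured with character pointers.
   cp must point to an actual char * object, so that cp + 1 is a valid
   one-past-the-end pointer. */
size_t stride_of_char_pp(char **cp)
{
    return (size_t)((char *)(cp + 1) - (char *)cp);
}

/* The same measurement for a plain char *, for comparison. */
size_t stride_of_char_p(char *c)
{
    return (size_t)((c + 1) - c);
}
```

The first function returns sizeof(char *), which is 8 with 64-bit addressing, and the second returns 1. The step depends on the size of the pointed-to type, not on the size of the pointer being incremented.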
I have this sample program which, when compiled with -fstack-protector-all, reports stack smashing.
#include <stdio.h>
#include <stdint.h>
int func(int* value)
{
uint8_t port = 1;
*value = port; //Canary value changes at this point when seen in GDB
return 1;
}
int main()
{
uint16_t index = 0;
int ret = func((int*)&index);
}
I don't understand what is wrong with the line. Is any typecasting required?
It's because the size of int and the size of uint16_t are different. The size of int is (usually) 32 bits (four bytes) while uint16_t is 16 bits (two bytes).
So when you write an int to a uint16_t variable you write two bytes too many, which leads to undefined behavior (and in this case "smashes" the stack).
The problem is more specifically that you call the function with a pointer to index, which is a 16-bit variable, but the function expects (and uses) its argument as a pointer to a 32-bit variable. You should not do the cast in the call, as that hides the problem but doesn't solve it. It doesn't matter that you only write an 8-bit value through the dereferenced pointer inside the function: the destination is still treated as a 32-bit variable, and the compiler will convert the 8-bit value to a 32-bit value before writing it to memory.
Since the type of index is uint16_t, only 16 bits are allocated for it. By casting the address of index to int*, you are pretending you have access to more than 16 bits -- 32 bits in most cases.
In
*value = port;
you are trying to set the value in those bits that haven't been allocated. Since unauthorized memory gets used in that line, anything can happen after that.
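One way to avoid the problem is to hand func() a real int and narrow the result afterwards. This is a sketch: read_index is a hypothetical caller, and func is copied from the question.

```c
#include <stdint.h>

/* Same function as in the question: it writes a full int through value. */
int func(int *value)
{
    uint8_t port = 1;
    *value = port;
    return 1;
}

/* Hypothetical caller: give func() an int to write into, then narrow
   the result into the 16-bit variable explicitly. */
uint16_t read_index(void)
{
    int tmp = 0;
    uint16_t index = 0;
    if (func(&tmp))
        index = (uint16_t)tmp;
    return index;
}
```

No cast is needed in the call any more, so the compiler can again complain if the types stop matching.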
I found this declaration in a C program
char huge * far *p;
Explanation: p is huge pointer, *p is far pointer and **p is char type data variable.
Please explain declaration in more detail.
PS: I'm not asking about huge or far pointer here. I'm a newbie to programming
**p is a character. A pointer holding the address of this character has the value &(**p), which is *p. If you then take a pointer to that pointer, you get &(*p), which is simply p.
If you read below sentence from right to left, you will get it all
p is huge pointer, *p is far pointer and **p is char type data variable.
In a nutshell, virtual addresses on an Intel x86 chip have two components: a selector and an offset. The selector is an index into a table of base addresses and the offset is added onto that base address. This was designed to let the processor access 20-bit (on an 8086/8, 186), 30-bit (286) or 46-bit (386 and later) virtual address spaces without needing registers that big.
'far' pointers have an explicit selector. However when you do pointer arithmetic on them the selector isn't modified.
'huge' pointers have an explicit selector. When you do pointer arithmetic on them though the selector can change.
Huge and far pointers are not part of standard C. They are compiler extensions to the C language (offered by Borland, among others) for managing segmented memory in DOS and Windows 16/32bit. Functionally, what the declaration says is that **p is a char. That means "dereference p to get a pointer to char, dereference that to get a char".
In order to understand C pointer declarator semantics try this expression instead:
int* p;
It means p is a pointer to int. (hint: read from right to left)
int* const p;
This means p is a const pointer to int. (so you can't change the value of p)
Here's a proof of that:
p= 42; // error: assignment of read-only variable ‘p’
Another example:
int* const* lol;
This means lol is a pointer to const pointer to int. So the pointer which lol points at cannot point at another int.
lol= &p; // and yes, p cannot be reassigned, so we are correct.
In most cases reading from right to left makes sense. Now read the expression in question from right to left:
char huge * far *p;
Now, huge and far are just behaviour specifiers for pointers, introduced as a Borland-era extension. Ignoring them, what the declaration actually means is
char** p;
"p is a pointer to pointer to char"
That means whatever p points to, points to a char.
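In standard C, without the huge and far qualifiers, the two levels of indirection look like this. It is only a sketch, and deref_twice is a name chosen for illustration.

```c
/* Follow both levels of indirection: p -> pointer to char -> char. */
char deref_twice(char **p)
{
    char *inner = *p; /* *p has type char * */
    return *inner;    /* **p has type char */
}
```

Given char c = 'x'; char *q = &c; char **p = &q;, the call deref_twice(p) yields 'x', the same as writing **p directly.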
Back in the 16-bit days on 8086, it would have declared a 32-bit pointer to a "normalized" 32-bit pointer to char (or to the first of an array thereof).
The difference exists because 32-bit pointers were composed of a segment number and an offset within that segment, and segments overlapped (which meant two different pointers could point to the same physical address; example: 0x1200:1000 and 0x1300:0000). The huge qualifier forced a normalization using the highest segment number (and therefore, the lowest possible offset).
However, this normalization had a cost performance-wise, because after each operation that modified a pointer, the compiler had to automatically insert a code like this:
ptr = normalize(ptr);
with:
void huge * normalize(void huge *input)
{
    unsigned long input2 = (unsigned long)input;
    unsigned short segment = (unsigned short)(input2 >> 16);
    unsigned short offset = (unsigned short)input2;
    segment += (offset >> 4);   /* fold whole 16-byte paragraphs into the segment */
    offset &= 0x000F;           /* keep only the remainder, 0..15 */
    return (void huge *)((((unsigned long)segment) << 16) | offset);
}
The upside was the advantage of using your memory like it was flat, without worrying about segments and offsets.
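Since huge is not available on current compilers, here is the same normalization modelled in standard C, with the segment and offset held as plain integers. This is a sketch under that assumption; the struct and function names are invented for the example.

```c
#include <stdint.h>

/* A real-mode segment:offset pair, modelled as two 16-bit integers. */
struct seg_off {
    uint16_t segment;
    uint16_t offset;
};

/* 20-bit physical address on an 8086: segment * 16 + offset. */
uint32_t linear_address(struct seg_off p)
{
    return (((uint32_t)p.segment << 4) + p.offset) & 0xFFFFFu;
}

/* Move as much as possible into the segment, leaving an offset of 0..15. */
struct seg_off normalize_seg_off(struct seg_off p)
{
    struct seg_off n;
    n.segment = (uint16_t)(p.segment + (p.offset >> 4));
    n.offset = (uint16_t)(p.offset & 0x000F);
    return n;
}
```

Normalizing 0x1200:1000 gives 0x1300:0000, and both map to the physical address 0x13000, which is the pair used as the example above.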
Clarification to the other answers:
The far keyword is non-standard C, but it is not just an obsolete extension from ancient PC days. Today, there are many modern 8- and 16-bit CPUs that use "banked" memory to extend the amount of addressable memory beyond 64 KiB. Typically they use a special register to pick a memory bank, effectively ending up with 24-bit addresses. All small microcontrollers on the market with more than 64 KiB of RAM plus flash use such features.
You have an operating system, which has 2 functions dealing with memory allocation:
void *malloc( int sz ) // allocates a memory block sz bytes long
void free( void *addr ) // frees a memory block starting at addr
// (previously allocated by malloc)
Using these functions, implement the following 2 functions:
void *malloc_aligned( int sz ) // allocates a memory block sz bytes long,
// aligned to an address divisible by 16
void free_aligned( void *addr ) // frees a memory block starting at addr
// (previously allocated by malloc_aligned)
in the solution there is the following part:
void * aligned_malloc(size_t size){
unsigned char *res=malloc(size+16);
unsigned char offset=16-((long)res%16);
What I don't understand is: why do we need to use unsigned char, what do we achieve with 16-((long)res%16), and what is the purpose of (long)res in this case?
You can't do pointer arithmetic on "void *", because void has no size.
When adding to a pointer or subtracting from it, the arithmetic is always done in units of sizeof(*p). That is, if you add one to an int pointer, its value grows by sizeof(int), which is typically 4. So when you add to a void pointer, it should grow by the size of void. But void has no size.
However, some compilers are willing to do arithmetic on void *, and they treat it like char *. With these compilers, you could implement these functions without casting. But it isn't standard.
Another point is that not all operators are applicable for pointers. Addition and subtraction are, but multiplication, division and modulus are not. So if you want to test the low bits of a pointer, to know if it's aligned, you cast it to a long.
Why long? The assumption is that long is as large as a pointer, which is true on Linux, but not on 64-bit Windows. The right type is uintptr_t. However, if you're only interested in the low bits, it doesn't matter if you lose the high bits while casting. So a cast to unsigned int would have worked too (an unsigned type is safer here, because a negative value of a signed type gives a negative remainder with % 16).
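For completeness, here is one way the two functions could be finished. It is a sketch and not necessarily the solution the question quotes from: it uses uintptr_t and size_t instead of long and int, and it assumes the offset is remembered in the byte just before the aligned address.

```c
#include <stdint.h>
#include <stdlib.h>

/* Allocate size bytes at an address divisible by 16. The byte just before
   the returned address records how far that address is from the start of
   the underlying malloc() block. */
void *malloc_aligned(size_t size)
{
    unsigned char *res = malloc(size + 16);
    if (res == NULL)
        return NULL;
    /* offset is in 1..16, so there is always room for the bookkeeping byte */
    unsigned char offset = (unsigned char)(16 - ((uintptr_t)res % 16));
    unsigned char *aligned = res + offset;
    aligned[-1] = offset;
    return aligned;
}

/* Free a block returned by malloc_aligned(). */
void free_aligned(void *addr)
{
    if (addr == NULL)
        return;
    unsigned char *aligned = addr;
    free(aligned - aligned[-1]);
}
```

This also suggests why the quoted solution works with unsigned char: the pointer is moved in single-byte steps, and the offset (at most 16) fits in one byte. The sketch does not guard against size + 16 overflowing.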