I have written a simple C code for pointers. As per my understanding, Pointer is a variable which holds the address of another variable.
Eg :
int x = 25; // address - 1024
int *ptr = &x;
printf("%d", *ptr); // *ptr will give value at address of x i.e, 25 at 1024 address.
However when I try below code I'm getting segmentation fault
#include "stdio.h"
int main()
{
int *ptr = 25;
printf("%d", *ptr);
return 0;
}
What's wrong in this? Why can't a pointer variable return the value at address 25? Shouldn't I be able to read the bytes at that address?
Unless you're running on an embedded system with specific known memory locations, you can't assign an arbitrary value to a pointer an expect to be able to dereference it successfully.
Section 6.5.3.2p4 of the C standard states the following regarding the indirection operator *:
The unary
* operator denotes indirection. If the operand points to a function, the result is a function designator; if it points to an
object, the result is an lvalue designating the object. If
the operand has type "pointer to type", the result has
type "type". If an invalid value has been assigned to
the pointer, the behavior of the unary
* operator is undefined.
As mentioned in the passage above, the C standard only allows for pointers to point to known objects or to dynamically allocated memory (or NULL), not arbitrary memory locations. Some implementations may allow that in specific situations, but not in general.
Although the behavior of your program is undefined according to the C standard, your code is actually correct in the sense that it is doing exactly what you intend. It is attempting to read from memory address 25 and print the value at that address.
However, in most modern operating systems, such as Windows and Linux, programs use virtual memory and not physical memory. Therefore, you are most likely attempting to access a virtual memory address that is not mapped to a physical memory address. Accessing an unmapped memory location is illegal and causes a segmentation fault.
Since the memory address 0 (which is written in C as NULL) is normally reserved to specify an invalid memory address, most modern operating systems never map the first few kilobytes of virtual memory addresses to physical memory. That way, a segmentation fault will occur when an invalid NULL pointer is dereferenced (which is good, because it makes it easier to detect bugs).
For this reason, you can be reasonably certain that also the address 25 (which is very close to address 0) is never mapped to physical memory and will therefore cause a segmentation fault if you attempt to access that address.
However, most other addresses in your program's virtual memory address space will most likely have the same problem. Since the operating system tries to save physical memory if possible, it will not map more virtual memory address space to physical memory than necessary. Therefore, trying to guess valid memory addresses will fail, most of the time.
If you want to explore the virtual address space of your process to find memory addresses that you can read without a segmentation fault occuring, you can use the appropriate API supplied by your operating system. On Windows, you can use the function VirtualQuery. On Linux, you can read the pseudo-filesystem /proc/self/maps. The ISO C standard itself does not provide any way of determining the layout of your virtual memory address space, as this is operating system specific.
If you want to explore the virtual memory address layout of other running processes, then you can use the VirtualQueryEx function on Windows and read /proc/[pid]/maps on Linux. However, since other processes have a separate virtual memory address space, you can't access their memory directly, but must use the ReadProcessMemory and WriteProcessMemory functions on Windows and use /proc/[pid]/mem on Linux.
Disclaimer: Of course, I don't recommend messing around with the memory of other processes, unless you know exactly what you are doing.
However, as a programmer, you normally don't want to explore the virtual memory address space. Instead, you normally work with memory that has been assigned to your program by the operating system. If you want the operating system to give you some memory to play around with, which you are allowed to read from and write to at will (i.e. without segmentation faults), then you can just declare a large array of chars (bytes) as a global variable, for example char buffer[1024];. Be careful with declaring larger arrays as local variables, as this may cause a stack overflow. Alternatively, you can ask the operating system for dynamically allocated memory, for example using the malloc function.
You should consider all warnings that the compiler issues.
This statement
int *ptr = 25;
is incorrect. You are trying to assign an integer to a pointer as an address of memory. Thus in this statement
printf("%d", *ptr);
there is an attempt to access memory at address 25 that does not belong to your program.
What you mean is the following
#include "stdio.h"
int main( void )
{
int x = 25;
int *ptr = &x;
printf("%d", *ptr);
return 0;
}
Or
#include "stdio.h"
#include <stdlib.h>
int main( void )
{
int *ptr = malloc( sizeof( int ) );
*ptr = 25;
printf("%d", *ptr);
free( ptr );
return 0;
}
Related
I have this code:
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <inttypes.h>
int main (int argc, char** argv) {
*(volatile uint8_t*)0x12345678u = 1;
int var = *(volatile uint8_t*)0x12345678;
printf("%i", var);
printf("%i", &var);
return (EXIT_SUCCESS);
}
I want to see a 1 and the address of that int, which i specified previously. But when compiled by gcc in bash, only "command terminated" without any error will be shown. Does anyone know why so?
PS: I am newbie to C, so just experimenting.
What you are doing:
*(volatile uint8_t*)0x12345678u = 1;
int var = *(volatile uint8_t*)0x12345678;
is totally wrong.
You have no guarantee whatsoever that an arbitrary address like 0x12345678 will be accessible, not to mention writable by your program. In other words, you cannot set a value to an arbitrary address and expect it to work. It's undefined behavior to say the least, and will most likely crash your program due to the operating system stopping you from touching memory you don't own.
The "command terminated" that you get when trying to run your program happens exactly because the operating system is preventing your program from accessing a memory location it is not allowed to access. Your program gets killed before it can do anything.
If you are on Linux, you can use the mmap function to request a memory page at an (almost) arbitrary address before accessing it (see man mmap). Here's an example program which achieves what you want:
#include <sys/mman.h>
#include <stdio.h>
#define WANTED_ADDRESS (void *)0x12345000
#define WANTED_OFFSET 0x678 // 0x12345000 + 0x678 = 0x12345678
int main(void) {
// Request a memory page starting at 0x12345000 of 0x1000 (4096) bytes.
unsigned char *mem = mmap(WANTED_ADDRESS, 0x1000, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
// Check if the OS correctly granted your program the requested page.
if (mem != WANTED_ADDRESS) {
perror("mmap failed");
return 1;
}
// Get a pointer inside that page.
int *ptr = (int *)(mem + WANTED_OFFSET); // 0x12345678
// Write to it.
*ptr = 123;
// Inspect the results.
printf("Value : %d\n", *ptr);
printf("Address: %p\n", ptr);
return 0;
}
The operating system and loader do not automatically make every possible address available to your program. The virtual address space of your process is constructed on demand by various operations of the program loader and of services inside the process. Although every address “exists” in the sense of being a potential address of memory, what happens when a process attempts to access an address is controlled by special data structures in the system. Those data structures control whether a process can read, write, or execute various portions of memory, whether the virtual addresses are currently mapped to physical memory, and whether the virtual addresses are not currently mapped to memory but will be provide with physical memory when needed. Initially, much of a process’ address space is marked not in use (or at least implicitly marked, in that none of the explicit records for the address space apply to it).
In the executions of your program you have attempted so far, the address 0x12345678 has not been mapped and marked available to your process, so, when your process attempted to use it, the system detected a fault and terminated your process.
(Some systems randomize the layout of the address space when a program is being loaded, to make it harder for an attacker to exploit bugs in a program. Because of this, it is possible that 0x12345678 will be accessible in some executions of your program and not others.)
The quote from C11 standard 6.5.3.2p4:
4 The unary * operator denotes indirection. [...] If an invalid value has been assigned to the pointer, the behavior of the unary * operator is undefined.
You use * operator on (volatile uint8_t*)0x12345678u pointer. Is this a valid pointer? Is it invalid pointer? What is an "invalid value" of a pointer?
There is no check that allows to find out which particilar pointer values are valid, which aren't. It is not implemented in C language. A random pointer may just happen to be a valid pointer. But most, most probably it is an invalid pointer. In which case - the behavior is undefined.
Dereferencing an invalid pointer is undefined behavior. But - outside of C scope and into operating system - on *unix systems trying to access memory that you are not allowed to, should raise a signal SIGSEGV on your program and terminate your program. Most probably this is what happens. Your program is not allowed to access memory location that is behind 0x12345678 value, the operating system specifically protects against that.
Also note, that systems use ASLR, so that pointer values within your program are indeed in some degree random. There are not linear, ie. *(char*)0x01 will not access the first byte in your ram. Operating system (or more exact, the underlying hardware as configured by the operating system) translates pointer values in your program to physical location in ram using what is called virtual memory. The same pointer values may just happen to be valid on the second run of your program. But most probably, because pointers can have so many values, most probably it isn't a valid pointer. Your operating system kills your program, as it detects an invalid memory access.
This question already has answers here:
What's the point of malloc(0)?
(17 answers)
Closed 4 years ago.
I know that malloc(size_t size) allocates size bytes and returns a pointer to the allocated memory. .
So how come when I allocate zero bytes to the integer pointer p, I am still able to initialize it?
#include <stdio.h>
#include <stdlib.h>
int main()
{
int *p = malloc(0);
*p = 10;
printf("Pointer address is: %p\n", p);
printf("The value of the pointer: %d\n", *p);
return 0;
}
Here is my program output, I was expecting a segmentation fault.
Pointer address is: 0x1ebd260
The value of the pointer: 10
The behavior of malloc(0) is implementation defined, it will either return a pointer or NULL. As per C Standard, you should not use the pointer returned by malloc when requested zero size space1).
Dereferencing the pointer returned by malloc(0) is undefined behavior which includes the program may execute incorrectly (either crashing or silently generating incorrect results), or it may fortuitously do exactly what the programmer intended.
1) From C Standard#7.22.3p1 [emphasis added]:
1 The order and contiguity of storage allocated by successive calls to the aligned_alloc, calloc, malloc, and realloc functions is unspecified. The pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to a pointer to any type of object with a fundamental alignment requirement and then used to access such an object or an array of such objects in the space allocated (until the space is explicitly deallocated). The lifetime of an allocated object extends from the allocation until the deallocation. Each such allocation shall yield a pointer to an object disjoint from any other object. The pointer returned points to the start (lowest byte address) of the allocated space. If the space cannot be allocated, a null pointer is returned. If the size of the space requested is zero, the behavior is implementation-defined: either a null pointer is returned, or the behavior is as if the size were some nonzero value, except that the returned pointer shall not be used to access an object.
When you call malloc(0) and write to the returned buffer, you invoke undefined behavior. That means you can't predict how the program will behave. It might crash, it might output strange results, or (as in this case) it may appear to work properly.
Just because the program could crash doesn't mean it will.
In general, C does not prevent you from doing incorrect things.
After int *p = malloc(0);, p has some value. It might be a null pointer, or it might point to one or more bytes of memory. In either case, you should not use it.1 But the C language does not stop you from doing so.
When you execute *p = 10;, the compiler may have generated code to write 10 to the place where p points. p may be pointing at actual writable memory, so the store instruction may execute without failing. And then you have written 10 to a place in memory where you should not. At this point, the C standard no longer specifies what the behavior of your program is—by writing to an inappropriate place, you have broken the model of how C works.
It is also possible your compiler recognizes that *p = 10; is incorrect code in this situation and generates something other than the write to memory described above. A good compiler might give you a warning message for this code, but the compiler is not obligated to do this by the C standard, and it can allow your program to break in other ways.
Footnote
1 If malloc returns a null pointer, you should not write to *p because it is not pointing to an object. If it returns something else, you should not write to *p because C 2018 7.22.3.1 says, for this of malloc(0), “the returned pointer shall not be used to access an object.”
I think it is good idea to look at segmentation fault definition https://en.wikipedia.org/wiki/Segmentation_fault
The following are some typical causes of a segmentation fault:
Attempting to access a nonexistent memory address (outside process's address space)
Attempting to access memory the program does not have rights to (such as kernel structures in process context)
Attempting to write read-only memory (such as code segment)
So in general, access to any address in program's data segment will not lead to seg fault. This approach makes sence as C doesn't have any framework with memory management in it (like C# or Java). So as malloc in this example returns some address (not NULL) it returns it from program's data segment which can be accessed by program.
However if program is complex, such action (*p = 10) may overwrite data belonging to some other variable or object (or even pointer!) and lead to undefined behaviour (including seg fault).
But please take in account, that as described in other answers such program is not best practice and you shouldn't use such approach in production.
This is a source code of malloc() (maybe have a litter difference between kernel versions but concept still like this), It can answer your question:
static void *malloc(int size)
{
void *p;
if (size < 0)
error("Malloc error");
if (!malloc_ptr)
malloc_ptr = free_mem_ptr;
malloc_ptr = (malloc_ptr + 3) & ~3; /* Align */
p = (void *)malloc_ptr;
malloc_ptr += size;
if (free_mem_end_ptr && malloc_ptr >= free_mem_end_ptr)
error("Out of memory");
malloc_count++;
return p;
}
When you assigned size is 0, it already gave you a pointer to first address of memory
int main()
{
int *p,*q;
p=(int *)1000;
q=(int *)2000;
printf("%d:%d:%d",q,p,(q-p));
}
output
2000:1000:250
1.I cannot understand p=(int *)1000; line, does this mean that p is pointing to 1000 address location? what if I do *p=22 does this value is stored at 1000 address and overwrite the existing value? If it overwrites the value, what if another program is working with 1000 address space?
how q-p=250?
EDIT: I tried printf("%u:%u:%u",q,p,(q-p)); the output is the same
int main()
{
int *p;
int i=5;
p=&i;
printf("%u:%d",p,i);
return 0;
}
the output
3214158860:5
does this mean the addresses used by compiler are integers? there is no difference between normal integers and address integers?
does this mean that p is pointing to 1000 address location?
Yes.
what if I do *p=22
It's invoking undefined behavior - your program will most likely crash with a segfault.
Note that in modern OSes, addresses are virtual - you can't overwrite an other process' adress space like this, but you can attempt writing to an invalid memory location in your own process' address space.
how q-p=250?
Because pointer arithmetic works like this (in order to be compatible with array indexing). The difference of two pointers is the difference of their value divided by sizeof(*ptr). Similarly, adding n to a pointer ptr of type T results in a numeric value ptr + n * sizeof(T).
Read this on pointers.
does this mean the addresses used by compiler are integers?
That "used by compiler" part is not even necessary. Addresses are integers, it's just an abstraction in C that we have nice pointers to ease our life. If you were coding in assembly, you would just treat them as unsigned integers.
By the way, writing
printf("%u:%d", p, i);
is also undefined behavior - the %u format specifier expects an unsigned int, and not a pointer. To print a pointer, use %p:
printf("%p:%d", (void *)p, i);
Yes, with *p=22 you write to 1000 address.
q-p is 250 because size of int is 4 so it's 2000-1000/4=250
The meaning of p = (int *) 1000 is implementation-defined. But yes, in a typical implementation it will make p to point to address 1000.
Doing *p = 22 afterwards will indeed attempt to store 22 at address 1000. However, in general case this attempt will lead to undefined behavior, since you are not allowed to just write data to arbitrary memory locations. You have to allocate memory in one way or another in order to be able to use it. In your example you didn't make any effort to allocate anything at address 1000. This means that most likely your program will simply crash, because it attempted to write data to a memory region that was not properly allocated. (Additionally, on many platforms in order to access data through pointers these pointers must point to properly aligned locations.)
Even if you somehow succeed succeed in writing your 22 at address 1000, it does not mean that it will in any way affect "other programs". On some old platforms it would (like DOS, fro one example). But modern platforms implement independent virtual memory for each running program (process). This means that each running process has its own separate address 1000 and it cannot see the other program's address 1000.
Yes, p is pointing to virtual address 1000. If you use *p = 22;, you are likely to get a segmentation fault; quite often, the whole first 1024 bytes are invalid for reading or writing. It can't affect another program assuming you have virtual memory; each program has its own virtual address space.
The value of q - p is the number of units of sizeof(*p) or sizeof(*q) or sizeof(int) between the two addresses.
Casting arbitrary integers to pointers is undefined behavior. Anything can happen including nothing, a segmentation fault or silently overwriting other processes' memory (unlikely in the modern virtual memory models).
But we used to use absolute addresses like this back in the real mode DOS days to access interrupt tables and BIOS variables :)
About q-p == 250, it's the result of semantics of pointer arithmetic. Apparently sizeof int is 4 in your system. So when you add 1 to an int pointer it actually gets incremented by 4 so it points to the next int not the next byte. This behavior helps with array access.
does this mean that p is pointing to 1000 address location?
yes. But this 1000 address may belong to some other processes address.In this case, You illegally accessing the memory of another process's address space. This may results in segmentation fault.
After doing lot of research in Google, I found this program:
#include <stdio.h>
int main()
{
int val;
char *a = (char*) 0x1000;
*a = 20;
val = *a;
printf("%d", val);
}
But it is throwing a run time error, at *a = 20.
So how can I write and read a specific memory location?
You are doing it except on your system you cannot write to this memory causing a segmentation fault.
A segmentation fault (often shortened to segfault), bus error or access violation is generally an attempt to access memory that the CPU cannot physically address. It occurs when the hardware notifies an operating system about a memory access violation. The OS kernel then sends a signal to the process which caused the exception. By default, the process receiving the signal dumps core and terminates. The default signal handler can also be overridden to customize how the signal is handled.
If you are interested in knowing more look up MMU on wikipedia.
Here is how to legally request memory from the heap. The malloc() function takes a number of bytes to allocate as a parameter. Please note that every malloc() should be matched by a free() call to that same memory after you are done using it. The free() call should normally be in the same function as where you called malloc().
#include <stdio.h>
int main()
{
int val;
char *a;
a = (char*)malloc(sizeof(char) * 1);
*a = 20;
val = (int)*a;
printf("%d", val);
free(a);
return 0;
}
You can also allocate memory on the stack in a very simple way like so:
#include <stdio.h>
int main()
{
int val;
char *a;
char b;
a = &b;
*a = 20;
val = (int)*a;
printf("%d", val);
return 0;
}
This is throwing a segment violation (SEGFAULT), as it should, as you don't know what is put in that address. Most likely, that is kernel space, and the hosting environment doesn't want you willy-nilly writing to another application's memory. You should only ever write to memory that you KNOW your program has access to, or you will have inexplicable crashes at runtime.
If you are running your code in user space(which you are), then all the addresses you get are virtual addresses and not physical addresses. You cannot just assume and write to any virtual address.
In fact with virtual memory model you cannot just assume any address to be a valid address.It is up to the memory manager to return valid addresses to the compiler implementation which the handles it to your user program and not the other way round.
In order that your program be able to write to an address:
It should be a valid virtual address
It should accessible to the address space of your program
You can't just write at any random address. You can only modify the contents of memory where your program can write.
If you need to modify contents of some variable, thats why pointers are for.
char a = 'x';
char* ptr = &a; // stored at some 0x....
*ptr = 'b'; // you just wrote at 0x....
Issue of permission, the OS will protect memory space from random access.
You didn't specify the exact error, but my guess is you are getting a "segmentation fault" which would clearly indicate a memory access violation.
You can write to a specific memory location.
But only when you have the rights to write to that location.
What you are getting, is the operating system forbidding you to write to 0x1000.
That is the correct way to write data to memory location 0x1000. But in this day and age of virtual memory, you almost never know the actual memory location you want to write to in advance, so this type of thing is never used.
If you can tell us the actual problem you're trying to solve with this, maybe we can help.
You cannot randomly pick a memory location and write to. The memory location must be allocated to you and must be writable.
Generally speaking, you can get the reference/address of a variable with & and write data on it. Or you can use malloc() to ask for space on heap to write data to.
This answer only covers how to write and read data on memory. But I don't cover how to do it in the correct way, so that the program functions normally. Other answer probably covers this better than mine.
First you need to be sure about memory location where you want to write into.
Then, check if you have enough permission to write or not.
Suppose I do the following
int *p = 1;
printf("%d", *p);
I get a segfault. Now, as I understand it, this memory location 1 is in the address space of my program. Why should there be a problem in reading this memory location ?
Your pointer doesn't point to anything valid. All you do is assign the value 1 to a pointer-to-int, but 1 isn't a valid memory location.
The only valid way to obtain an pointer value is either take the address-of a variable or call an allocation function:
int a;
int * p1 = &a; // OK
int * p2 = malloc(sizeof(int)); // also OK
*p1 = 2;
*p2 = 3;
As for "why there should be a problem": As far as the language is concerned, if you dereference an invalid pointer, you have undefined behaviour, so anything can happen -- this is really the only sensible way to specify the language if you don't want to introduce any arbitrary restrictions, and C is all about being easy-to-implement.
Practically, modern operating systems will usually have a clever virtual memory manager that needs to request memory when and as needed, and if the memory at address 1 isn't on a committed page yet, you'll actually get an error from the OS. If you try a pointer value near some actual address, you might not get an error (until you overstep the page boundary, perhaps).
To answer this properly will really depend upon a number of factors. However, it is quite likely that the address 0x00000001 (page 0) is not mapped and/or read protected.
Typically, addresses in the page 0 range are disallowed from a user application for one reason or another. Even in kernel space (depending upon the processor), addresses in page 0 are often protected and require that the page be both mapped and access enabled.
EDIT:
Another possible reason is that it could be segfaulting is that integer access is not aligned.
Your operating system won't let you access memory locations that don't belong to your program. It would work on kernel mode though...
No, address 1 is not in the address space of your program. You are trying to access a segment you do not own and receive segmentation fault.
You're trying to print 1, not the memory location. 1 isn't a valid memory location. To print the actual memory location do:
printf("%p", p);
Note the %p as icktoofay pointed out.