simulating a Null Pointer undefined behaviour in C [closed] - c

I am trying to simulate the null pointer undefined behaviour.
What changes should be made to the code below to introduce null pointer undefined behaviour?
void foo( int * d )
{
printf("hello\n");
}
int main(void)
{
int a = 7 ;
int *b = malloc(sizeof(int)) ;
foo(b) ;
}

Dereferencing a NULL pointer (or some address outside your current address space, often in virtual memory) in C is not an exception, but some undefined behavior (often a segmentation fault). You really should avoid UB.
By definition of undefined behavior, we cannot explain it without going down into very specific implementation details (compiler, runtime, optimization, ASLR, machine code, phase of the moon, ...).
The malloc library function can (and does) fail. You should always test it, at least as:
int *b = malloc(sizeof(int));
if (!b) { perror("malloc of int"); exit(EXIT_FAILURE); }
To trigger failure of malloc (but very often the first few calls to malloc would still succeed) you might lower the available address space to your program. On Linux, use ulimit in the parent shell, or call setrlimit(2).
BTW, you could even link with your own malloc which is always failing:
// a silly, standard-conforming malloc which always fails:
void* malloc(size_t sz) {
if (sz == 0) return NULL;
errno = ENOMEM;
return NULL;
}
The C programming language does not have exceptions. C++ (and Ocaml, Java, ....) does (with catch & throw statements). Raising an exception is a non-local change of control flow. In standard C you might use longjmp for that purpose. In C++ dereferencing a nullptr is UB (and does not raise any null-pointer exception which does not exist in C++).

Based on your code, we can simulate Null Pointer UB like this,
#include<stdio.h>
void foo( int * d )
{
printf("hello, it is %d\n", *d);//dereference d (produces "Segmentation fault" if d is NULL)
}
int main(void)
{
int a = 7 ;
int *b = NULL; // simulate failed to malloc(sizeof(int))
foo(&a); // output is "hello, it is 7"
foo(b); // will trigger something like "Segmentation fault"
}
As pointed out by @Basile Starynkevitch, there are no exceptions in C, so it is more accurate to say "NULL pointer UB (Undefined Behaviour)" than "NULL pointer exception".

Here's how to dereference a null pointer:
int *b = 0;
*b = 3;

Related

Ambiguous behavior of function returning pointer? [duplicate]

#include<stdio.h>
int* add(int *a, int *b){
int c = *a + *b ;
return &c;
}
int main(void) {
int a=3,b=2 ;
int *ptr = add(&a,&b); // doubt in this line as it returns 5
printf("%d",*ptr);
}
I have a doubt about the commented line.
I am using the Code::Blocks IDE (GNU gcc compiler). If *ptr in my main points to the address of c in the add function, then it should print garbage: after the function add completes its execution, it is popped from the stack, and the memory for it on the stack should be deallocated. So technically, ptr should be pointing to garbage. How, then, is it printing the correct value?
You have undefined behavior, as you are returning the address of a function-local variable. A good compiler with warnings enabled would tell you this.
When you have undefined behavior you don't get to complain about the results. There is no need to wonder why it gives you the "correct" result because it does not--it gives you some number which may or may not be correct depending on any number of factors you don't control.
I think this is undefined behavior. On my system this example crashed with a segmentation fault. When a stack frame is deallocated, the stack pointer is typically just moved, without zeroing out the memory, so the old value may still be sitting there.

Malloc(0)ing an array in Windows Visual Studio for C allows the program to run perfectly fine

The C program is a Damerau-Levenshtein algorithm that uses a matrix to compare two strings. On the fourth line of main(), I want to malloc() the memory for the matrix (2d array). In testing, I called malloc(0) and it still runs perfectly. It seems that whatever I put in malloc(), the program still works. Why is this?
I compiled the code with the "cl" command in the Visual Studio developer command prompt, and got no errors.
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <assert.h>
int main(){
char y[] = "felkjfdsalkjfdsalkjfdsa;lkj";
char x[] = "lknewvds;lklkjgdsalk";
int xl = strlen(x);
int yl = strlen(y);
int** t = malloc(0);
int *data = t + yl + 1; //to fill the new arrays with pointers to arrays
for(int i=0;i<yl+1;i++){
t[i] = data + i * (xl+1); //fills array with pointer
}
for(int i=0;i<yl+1;i++){
for(int j=0;j<xl+1;j++){
t[i][j] = 0; //nulls the whole array
}
}
printf("%s", "\nDistance: ");
printf("%i", distance(y, x, t, xl, yl));
for(int i=0; i<yl+1;i++){
for(int j=0;j<xl+1;j++){
if(j==0){
printf("\n");
printf("%s", "| ");
}
printf("%i", t[i][j]);
printf("%s", " | ");
}
}
}
int distance(char* y, char* x, int** t, int xl, int yl){
int isSub;
for(int i=1; i<yl+1;i++){
t[i][0] = i;
}
for(int j=1; j<xl+1;j++){
t[0][j] = j;
}
for(int i=1; i<yl+1;i++){
for(int j=1; j<xl+1;j++){
if(*(y+(i-1)) == *(x+(j-1))){
isSub = 0;
}
else{
isSub = 1;
}
t[i][j] = minimum(t[i-1][j]+1, t[i][j-1]+1, t[i-1][j-1]+isSub); //looks left, above, and diagonal top-left for minimum
if((*(y+(i-1)) == *(x+(i-2))) && (*(y+(i-2)) == *(x+(i-1)))){ //looks at neighbor characters, if equal
t[i][j] = minimum(t[i][j], t[i-2][j-2]+1, 9999999); //since minimum needs 3 args, i include a large number
}
}
}
return t[yl][xl];
}
int minimum(int a, int b, int c){
if(a < b){
if(a < c){
return a;
}
if(c < a){
return c;
}
return a;
}
if(b < a){
if(b < c){
return b;
}
if(c < b){
return c;
}
return b;
}
if(a==b){
if(a < c){
return a;
}
if(c < a){
return c;
}
}
}
Regarding malloc(0) part:
From the man page of malloc(),
The malloc() function allocates size bytes and returns a pointer to the allocated memory. The memory is not initialized. If size is 0, then malloc() returns either NULL, or a unique pointer value that can later be successfully passed to free().
So the returned pointer is either NULL or a pointer which can only be passed to free(); you cannot expect to dereference that pointer and store something into the memory location.
In either of the above cases, you're trying to use a pointer which is invalid, and that invokes undefined behavior.
Once a program hits UB, its output cannot be justified anyway.
One of the major outcomes of UB is "working fine" (as "wrongly" expected), too.
That said, following the analogy
"you can allocate a zero-sized allocation, you just must not dereference it"
some memory debugger applications hint that usage of malloc(0) is potentially unsafe and red-zone the statements that include a call to malloc(0).
Here's a nice reference related to the topic, if you're interested.
Regarding malloc(<any_size>) part:
In general, accessing out-of-bounds memory is UB, again. If you happen to access memory outside the allocated region, you invoke UB, and the result you speculate about cannot be defined.
FWIW, C itself does not impose or perform any bounds checking on its own. So you're not "restricted" (read as "compiler error") from accessing out-of-bounds memory, but doing so invokes UB.
It seems that whatever I put in malloc(), the program still works. Why is this?
int** t = malloc(0);
int *data = t + yl + 1;
t + yl + 1 is undefined behavior (UB). The rest of the code does not matter.
If t == NULL, adding 1 to it is UB, as adding 1 to a null pointer is invalid pointer math.
If t != NULL, adding 1 to it is UB, as the result points more than one element beyond the allocated space.
With UB, the pointer math may have worked as hoped, since a typical malloc() allocates larger chunks, not necessarily the small size requested. It may crash on another platform/machine, on another day, or in another phase of the moon. The code is not reliable even if it works with light testing.
You just got lucky. C does not do rigorous bounds checking because it has a performance cost. Think of a C program as a raucous party happening in a private building, where the OS police are stationed outside. If somebody throws a rock that stays inside the club (an example of an invalid write that violates the ownership convention within the process but stays within the club boundaries) the police don't see it happening and take no action. But if the rock is thrown and it goes flying dangerously out the window (an example of a violation that is noticed by the operating system) the OS police step in and shut the party down.
The C standard says:
If the size of the space requested is zero, the behavior is implementation-defined; the value returned shall be either a null pointer or a unique pointer. [7.10.3]
So we have to check what your implementation says. The question says "Visual Studio," so let's check Visual C++'s page for malloc:
If size is 0, malloc allocates a zero-length item in the heap and returns a valid pointer to that item.
So, with Visual C++, we know that you're going to get a valid pointer rather than a null pointer.
But it's just a pointer to a zero-length item, so there's not really anything safe you can do with that pointer except pass it to free. If you dereference the pointer, the code is allowed to do anything it wants. That's what's meant by "undefined behavior" in the language standards.
So why does it appear to work? Probably because malloc returned a pointer to at least a few bytes of valid memory since the easiest way for malloc to give you a valid pointer to a zero-length item is to pretend you really asked for at least one byte. And then the alignment rules would round that up to something like 8 bytes.
When you dereference the beginning of that allocation, you likely have some valid memory. What you're doing is strictly illegal and non-portable but, with this implementation, likely to work. When you index farther into it, you'll likely start corrupting other data structures (or metadata) in the heap. If you index even farther into it, you're increasingly likely to crash due to hitting an unmapped page.
Why does the standard allow malloc(0) to be implementation-defined instead of just requiring it to return a null pointer?
With pointers, it's sometimes handy to have special values. The most obvious is the null pointer. The null pointer is just a reserved address that will never be used for valid memory. But what if you wanted another special pointer value that had some meaning to your program?
In the dark days before the standard, some mallocs allowed you to effectively reserve additional special pointer values by calling malloc(0). They could have used malloc(1) or any other very small size, but malloc(0) made it clear that you just wanted to reserve an address rather than actual space. So there were many programs that depended on this behavior.
Meanwhile, there were programs that expected malloc(0) to return a null pointer, since that's what their library had always done. When the standards people looked at the existing code and how it used the library, they decided they couldn't choose one method over the other without "breaking" some of the code out there. So they allowed malloc's behavior to remain "implementation-defined."

A function that takes string parameter and returns integer pointer [closed]

I am wondering how to write a function in C that returns an int pointer for any string input.
The function below is my attempt to solve the requirement.
int* methodname(char* param)
{
int *a;
int b=3;
a=&b;
return a;
}
Please correct me for any mistakes.
Within the definition of the question, which places no functionality on the function, the following would be a proper implementation without mallocs or undefined behavior:
int* methodname(char* param)
{
return ((int *)param); // just return the param as a pointer to int
}
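Returning the address of a local variable is a mistake: using that pointer after the function returns is undefined behavior. Allocate the storage with malloc instead, as in the code below.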
int* methodname(char* param)
{
int *p = malloc(5*sizeof(int));
. . .
return p;
}
Your declaration is OK, but the definition makes no sense. If an interviewer asks this kind of question, their main intention is to check your programming skills, and your definition will not impress them. Just write a rough body (concentrating mainly on syntax, on how you return the value, and on avoiding undefined behavior) instead of implementing useless code.
As for what you asked: this code takes in a char* (char pointer) and returns an int* (pointer to int).
int* methodname(char* param)
{
int* b = malloc(sizeof(int));
*b = 3;
return b;
}
NOTE
The difference between your code and this one is that your code was returning an int pointer to a local variable. The local variable is destroyed when the function exits. In this example, the variable is allocated through malloc(). Memory allocated through malloc() stays valid across function calls until it is passed to free().
I would suggest going through scope rules in C.
Edit
There is nothing called string* in C. An array of chars with a terminating '\0' (NUL) is treated as a string.
Since the memory for the int is allocated by malloc(), it has to be freed to avoid any memory leaks. This can be done when the memory is not needed anymore.

Pointers and the ways of de-referencing it [closed]

I have a doubt about dereferencing pointers.
I wrote some simple code, but I don't know why it fails under certain conditions. Can someone please tell me the reason for the failure? Also, if we have char *ptr = "stack overflow", then the compiler itself will allocate memory for it.
int main()
{
int *ptr = 10;
double *dptr = 111.111
printf("%d", *ptr); //will give segmentation violation as we are trying to access the content in location 10
printf("%d", ptr);//o/p will be 10
printf("%lf", dptr); // will give segmentation violation
printf("%lf", *dptr); // will give segmentation violation
}
int *ptr = 10;
double *dptr = 111.111
The problem lies in the above two lines.
ptr points to the address 10, and there is no telling where dptr points; a floating-point value cannot even be converted to a pointer, so a conforming compiler must at least diagnose that line.
Dereferencing these pointers will certainly yield undefined behavior, usually a segmentation fault.
Fix:
#include <stdio.h>
int main(void){
int iVal = 10;
double dVal = 111.11;
int *ptr = &iVal;
double *dptr = &dVal;
printf("%d", *ptr); // ok
printf("%p", (void *)ptr);// ok
printf("%p", (void *)dptr); // ok
printf("%lf", *dptr); // ok
return 0;
}
Theory: A pointer is a variable that holds an address. Or, as Alexey Frunze says:
The C standard does not define what a pointer is internally and how it
works internally. This is intentional so as not to limit the number of
platforms, where C can be implemented as a compiled or interpreted
language.
A pointer value can be some kind of ID or handle or a combination of
several IDs (say hello to x86 segments and offsets) and not
necessarily a real memory address. This ID could be anything, even a
fixed-size text string. Non-address representations may be especially
useful for a C interpreter.
When you do
int *ptr = 10;
you tell the compiler that ptr is a pointer to the address 10. Dereferencing this address will cause undefined behavior and may cause a crash.
You would most likely want something like:
int ival = 10;
int *ptr = &ival;
Similar for the double pointer.
char *ptr = "stack overflow"
This buffer is allocated in the text segment (typically read-only), hence there is no issue with dereferencing it or taking its address, but if you modify the contents it's a violation [Segmentation Fault].
Note:
Look up online how many segments are created for a program (process).
[Generally: heap segment, stack segment, data segment, and text segment]
The reason char *ptr = "stack overflow" works and int *ptr = 10 does not is that 10 is just the number 10, but "stack overflow" evaluates to a pointer to the characters.
String literals are special kinds of constants. The compiler puts the characters of the string somewhere in memory, and the value of the string literal is a pointer to those characters. Conversely, integer and floating-point constants are just their values; they are not pointers.

Behavior of bad pointers at runtime

I've been doing some tests with pointers and came across the following two scenarios. Can anybody explain to me what's happening?
void t ();
void wrong_t ();
void t () {
int i;
for (i=0;i<1;i++) {
int *p;
int a = 54;
p = &a;
printf("%d\n", *p);
}
}
void wrong_t() {
int i;
for (i=0;i<1;i++) {
int *p;
*p = 54;
printf("%d\n", *p);
}
}
Consider these two versions of main:
int main () {
t();
wrong_t();
}
prints:
54\n54\n, as expected
int main () {
wrong_t();
}
yields:
Segmentation fault: 11
I think that the issue arises from the fact that "int *p" in "wrong_t()" is a "bad pointer" as it's not correctly initialized (cfr.: cslibrary.stanford.edu/102/PointersAndMemory.pdf, page 8). But I don't understand why such problem arises just in some cases (e.g.: it does not happen if I call t() before wrong_t() or if I remove the for loop around the code in wrong_t()).
Because dereferencing an uninitialised pointer (as you correctly guessed) invokes undefined behaviour. Anything could happen.
If you want to understand the precise behaviour you're observing, then the only way is to look at the assembler code that your compiler produced. But this is normally not very productive.
What happens is almost certainly:
In both t and wrong_t, the definition int *p allocates space for p on the stack. When you call only wrong_t, this space contains data left over from previous activity (e.g., from the code that sets up the environment before main is called). It happens to be some value that is not valid as a pointer, so using it to access memory causes a segmentation fault.
When you call t, t initializes this space for p to contain a pointer to a. When you call wrong_t after this, wrong_t fails to initialize the space for p, but it already contains the pointer to a from when t executed, so using it to access memory results in accessing a.
This is obviously not behavior you may rely on. You may find that compiling with optimization turned on (e.g., -O3 with GCC) alters the behavior.
In the wrong_t function, the statement *p = 54; is the interesting one. You are trying to store a value through the pointer p, which has not been made to point to any allocated memory, hence the error.

Resources