Strange memory allocation in case of pointers using malloc

Strange memory allocation in case of pointers using malloc - c

#include <stdio.h>
#include <stdlib.h>
int main()
{
int *a = (int *)malloc(sizeof(int));
// int a[1];
int i;
for(i = 0; i < 876; ++i)
a[i] = i;
printf("%d",a[799]);
}
Why is this code working, even if, I am allocating only 1 int's space using malloc()?

Why is this code working? Even if, I am allocating only 1 int's space using malloc ? In such case answer in undefined behavior.
Allocating block of 4 bytes like
int *a = (int *)malloc(sizeof(int)); /* No need to cast the malloc result */
and accessing beyond that like
a[i] = i; /* upto i<1 behavior is guaranteed as only 4byte allocated, not after */
results in undefined behavior i.e anything can happen and you shouldn't depend on it doing the same thing twice.
Side note, type casting the result of malloc() is not required as malloc() return type void* & its automatically promoted safely into required type. Read Do I cast the result of malloc?
And always check the return value malloc(). for e.g
int *a = malloc(sizeof(int));
if( a != NULL) {
/* allocated successfully & do_something()_with_malloced_memory() */
}
else {
/* error handling.. malloc failed */
}

It seems to be working. There is absolutely zero guarantee it will work the same way after a recompile, or in a different environment.
Basically, here you're trying to access memory address, which is not allocated to your program (using any index other than 0). So, from your program point of view, the memory address is invalid. Accessing invalid memory location invokes undefined behaviour.

As others have explained, the behavior while accessing a memory region beyond what is allocated, is undefined. Run the same program on a system which is running memory intensive applications. You might see a SIGSEGV. Run your code through coverity static analysis and you will see it catching the buffer overrun.

Related

When writing a function that returns pointer, why should I allocate memory to the pointer I'm going to return?

I'm a bit weak when it comes to memory allocation and pointers.
So, I want to understand why do I have to allocate memory to pointers in functions as follow:
char *cstring(char c, int n)
{
int i = 0;
char * res;
res = malloc ((n+1)*sizeof(char));
while (i<n)
{
res[i]=c;
i++;
}
res[i] ='\0';
return res;
}
and why is the following not valid?
char *cstring(char c, int n)
{
int i = 0;
char * res;
while (i<n)
{
res[i]=c;
i++;
}
res[i] ='\0';
return res;
}
I understand that I should allocate memory (generally) to pointers so that they have defined memory.
However, I want to mainly understand how is it related to the concept of stack and heap memories!
Thanks in advance!

Pointers need to point to a valid memory location before they can be dereferenced.
In your first example, res is made to point at a block of allocated memory which can subsequently be written to and read from.
In your second example, res remains uninitialized when you attempt to dereference it. This causes undefined behavior. The most likely outcome in this case is that whatever garbage value it happens to contain will not be a valid memory address, so when you attempt to dereference that invalid address your program will crash.

If you declare a variable like that,
int A = 5;
then that means the variable will be on the stack. When functions are called, their local variables are pushed to the stack. The main function is also an example of that. So you don't have to allocate memory manually, your compiler will do this for you in the background before it calls your main function. And that also means if you examine the stack during the execution of the function you can see the value 5.
With this,
int A = 5;
int *PtrToA = &A;
The pointer will be on the stack again. But this time, the value on the stack just shows the memory address of the actual integer value we want. It points to the address of the memory block that holds the value 5. Since A is held in the stack here, pointer will show a memory address on the stack.
Like the case in your question you can allocate memory dynamically. But you have to initialize it before you read it. Because when you request to allocate the memory, your operating system searches for a valid memory field in your programs heap and reserves that for you. Than it gives you back its adddress and gives you the read write permissions so you can use it. But the values in it won't contain what you want. When compiler allocates on stack, the initial values will be unset again. If you do this,
char *res;
res[1] = 3;
variable res will be on the stack and it will contain some random value. So accessing it is just like that,
(rand())[1] = 3;
You can get an access violation error because you may not have permission to write to that memory location.
An important note; after your function call returns, values of local variables on the stack are no more valid. So be careful with that. Do not dereference them after the function call ends.
In conclusion; if you want to use a pointer, be sure it points to a valid memory location. You can allocate it yourself or make it point another memory address.

The second version of your code declares a pointer, but does not initialize it to point to a valid memory address.
Then, the code dereferences that pointer -- during the loop. So, your code would access uninitialized memory in this case. Remember, array indexing is just syntactic sugar for dereferencing -- so your code accesses memory its not supposed to.
The first version of your code initializes the pointer to actually point to something, and hence when you dereference it during the loop, it works.
Of course, in either case, you return the pointer from the function -- its just that in the first version it points to something valid, whereas in the second version it points anywhere.
The moral here is to always initialize your variables. Not doing so could result in undefined behavior in your code (even if it appears to work sometimes). The general advice here is to always compile your code using at least some compilation flags. For example in gcc/clang, consider -Wall -Werror -Wextra. Such options often pick up on simple cases of not initializing variables.
Also, valgrind is a brilliant tool for memory profiling. It can easily detect uses of uninitialized memory at runtime, and also memory leaks.

Simple: because you do not have any allocated memory for the data you wite. In your example you define pointer, you do not initialize it so it will reference random (or rather not possible to predict) place in the memory, then you try to write to this random memory location.
You have 2 Undefined Behaviours here in 5 lines example. Your pointer is not initialized, and you did not allocate any valid memory this pointer to reference.
EDIT:
VALID
char *cstring(char c, int n)
{
char * res;
res = malloc ((n+1)*sizeof(char));
char *cstring(char c, int n)
{
char * res;
static char buff[somesize];
res = buff;
char buff[somesize];
char *cstring(char c, int n)
{
char * res;
res = buff;
INVALID
char *cstring(char c, int n)
{
char * res;
char buff[somesize];
res = buff;

malloc function in c?

I can't understand the reason of the output pf the following program
#include <stdio.h>
#include <stdlib.h>
int main(){
int* ip = (int*)malloc(100*sizeof(int));
if (ip){
int i;
for (i=0; i < 100; i++)
ip[i] = i*i;
}
free(ip);
int* ip2 = (int*)malloc(10*sizeof(int));
printf("%d\n", ip[5]);
printf("%d\n", ip2[5]);
ip[5] = 10;
printf("%d\n", ip2[5]);
return 0;
}
the output shows that ip & ip2 will have the same reference in the heap.
when we allocate a 100 int in(ip2) how did it return the same reference for ip1.
is it a feature of malloc function?
I know that malloc work like "new",right?
if it does then it should return some random reference,right?

All your outputs are undefined behavior, so they're not indicative of anything:
printf("%d\n", ip[5]); // ip was freed, so the memory it points to may not be accessed
printf("%d\n", ip2[5]); // reading uninitialized memory
ip[5] = 10; // writing to freed memory
printf("%d\n", ip2[5]); // still reading uninitialized memory
Generally, it's entirely possible that ip2 gets the same address that ip had. But you're not testing that in your code. You could do that like this, for instance:
#include <stdio.h>
#include <stdlib.h>
int main() {
int* ip = (int*)malloc(100 * sizeof(int));
printf("%p\n", ip);
free(ip);
int* ip2 = (int*)malloc(100 * sizeof(int));
printf("%p\n", ip2);
free(ip2);
}
When I run that, I do actually get the same address twice (but there's no guarantee that this happens).

I don't really see what your problem is.
It is completely up to the library where it allocates the memory which is returned to you. Especially, it may or may not use the most recently freed memory block, as long as it is still free.
Your first malloc() reserves some memory, you write to it and then you free it.
Your second malloc() reserves some memory, you read from it then you free it.
What you read in the second step is completely arbitrary and only by coincidence the same as wou wrote to it in the first step.
You code also shows undefined behaviour, because you are not allowed to access ip[5] after the free(it).

This is a undefined behaviour which you shouldn't rely on. When you allocate a memory typically from a page memory, when you free it isn't freed to OS immediately and when try allocating again most likely you get the same which is hot in the cache. Typical OS memory unit allocates the same size in next iteration from same memory freed blocks in LIFO fashion.

Malloc(0)ing an array in Windows Visual Studio for C allows the program to run perfectly fine

The C program is a Damereau-Levenshtein algorithm that uses a matrix to compare two strings. On the fourth line of main(), I want to malloc() the memory for the matrix (2d array). In testing, I malloc'd (0) and it still runs perfectly. It seems that whatever I put in malloc(), the program still works. Why is this?
I compiled the code with the "cl" command in the Visual Studio developer command prompt, and got no errors.
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <assert.h>
int main(){
char y[] = "felkjfdsalkjfdsalkjfdsa;lkj";
char x[] = "lknewvds;lklkjgdsalk";
int xl = strlen(x);
int yl = strlen(y);
int** t = malloc(0);
int *data = t + yl + 1; //to fill the new arrays with pointers to arrays
for(int i=0;i<yl+1;i++){
t[i] = data + i * (xl+1); //fills array with pointer
}
for(int i=0;i<yl+1;i++){
for(int j=0;j<xl+1;j++){
t[i][j] = 0; //nulls the whole array
}
}
printf("%s", "\nDistance: ");
printf("%i", distance(y, x, t, xl, yl));
for(int i=0; i<yl+1;i++){
for(int j=0;j<xl+1;j++){
if(j==0){
printf("\n");
printf("%s", "| ");
}
printf("%i", t[i][j]);
printf("%s", " | ");
}
}
}
int distance(char* y, char* x, int** t, int xl, int yl){
int isSub;
for(int i=1; i<yl+1;i++){
t[i][0] = i;
}
for(int j=1; j<xl+1;j++){
t[0][j] = j;
}
for(int i=1; i<yl+1;i++){
for(int j=1; j<xl+1;j++){
if(*(y+(i-1)) == *(x+(j-1))){
isSub = 0;
}
else{
isSub = 1;
}
t[i][j] = minimum(t[i-1][j]+1, t[i][j-1]+1, t[i-1][j-1]+isSub); //kooks left, above, and diagonal topleft for minimum
if((*(y+(i-1)) == *(x+(i-2))) && (*(y+(i-2)) == *(x+(i-1)))){ //looks at neighbor characters, if equal
t[i][j] = minimum(t[i][j], t[i-2][j-2]+1, 9999999); //since minimum needs 3 args, i include a large number
}
}
}
return t[yl][xl];
}
int minimum(int a, int b, int c){
if(a < b){
if(a < c){
return a;
}
if(c < a){
return c;
}
return a;
}
if(b < a){
if(b < c){
return b;
}
if(c < b){
return c;
}
return b;
}
if(a==b){
if(a < c){
return a;
}
if(c < a){
return c;
}
}
}

Regarding malloc(0) part:
From the man page of malloc(),
The malloc() function allocates size bytes and returns a pointer to the allocated memory. The memory is not initialized. If size is 0, then malloc() returns either NULL, or a unique pointer value that can later be successfully passed to free().
So, the returned pointer is either NULL or a pointer which can only be pasxed to free(), you cannot expect to dereference that pointer and store something into the memory location.
In either of the above cases, you're trying to to use a pointer which is invalid, it invokes undefined behavior.
Once a program hits UB, the output of that cannot be justified anyway.
One of the major outcome of UB is "working fine" (as "wrongly" expected), too.
That said, follwing the analogy
"you can allocate a zero-sized allocation, you just must not dereference it"
some of the memory debugger applications hints that usage of malloc(0) is potentially unsafe and red-zones the statements including a call to malloc(0).
Here's a nice reference related to the topic, if you're interested.
Regarding malloc(<any_size>) part:
In general, accessing out of bound memory is UB, again. If you happen to access outside the allocated memory region, you'll invoke UB anyways, and the result you speculate cannot be defined.
FWIW, C itself does not impose/ perform any boundary checking on it's own. So, you're not "restricted" (read as "compiler error") from accessing out of bound memory, but doing so invokes UB.

It seems that whatever I put in malloc(), the program still works. Why is this?
int** t = malloc(0);
int *data = t + yl + 1;
t + yl + 1 is undefined behavior (UB). Rest of code does not matter.
If t == NULL, adding 1 to it is UB as adding 1 to a null pointer is invalid pointer math.
If t != NULL, adding 1 to it is UB as adding 1 to that pointer is more than 1 beyond the allocating space.
With UB, the pointer math may worked as hope as typical malloc() allocates larges chunks, not necessarily the small size requested. It may crash on another platform/machine or another day or phase of the moon. The code is not reliable even if it works with light testing.

You just got lucky. C does not do rigorous bounds checking because it has a performance cost. Think of a C program as a raucous party happening in a private building, where the OS police are stationed outside. If somebody throws a rock that stays inside the club (an example of an invalid write that violates the ownership convention within the process but stays within the club boundaries) the police don't see it happening and take no action. But if the rock is thrown and it goes flying dangerously out the window (an example of a violation that is noticed by the operating system) the OS police step in and shut the party down.

The C standard says:
If the size of the space requested is zero, the behavior is implementation-defined; the value returned shall be either a null pointer or a unique pointer. [7.10.3]
So we have to check what your implementation says. The question says "Visual Studio," so let's check Visual C++'s page for malloc:
If size is 0, malloc allocates a zero-length item in the heap and returns a valid pointer to that item.
So, with Visual C++, we know that you're going to get a valid pointer rather than a null pointer.
But it's just a pointer to a zero-length item, so there's not really anything safe you can do with that pointer except pass it to free. If you dereference the pointer, the code is allowed to do anything it wants. That's what's meant by "undefined behavior" in the language standards.
So why does it appear to work? Probably because malloc returned a pointer to at least a few bytes of valid memory since the easiest way for malloc to give you a valid pointer to a zero-length item is to pretend you really asked for at least one byte. And then the alignment rules would round that up to something like 8 bytes.
When you dereference the beginning of that allocation, you likely have some valid memory. What you're doing is strictly illegal, non-portable, but, with this implementation, likely to work. When you index farther into it, you'll likely start corrupting other data structures (or metadata) in the heap. If you index even father into it, you're increasingly likely to crash due to hitting an unmapped page.
Why does the standard allow malloc(0) to be implementation-defined instead of just requiring it to return a null pointer?
With pointers, it's sometimes hand to have special values. The most obvious being the null pointer. The null pointer is just a reserved address that will never be used for valid memory. But what if you wanted another special pointer value that had some meaning to your program?
In the dark days before the standard, some mallocs allowed you to effectively reserve additional special pointer values by calling malloc(0). They could have used malloc(1) or any other very small size, but malloc(0) made it clear that you just wanted to reserve and address rather than actual space. So there were many programs that depended on this behavior.
Meanwhile, there were programs that expected malloc(0) to return a null pointer, since that's what their library had always done. When the standards people looked at the existing code and how it used the library, they decided they couldn't choose one method over the other without "breaking" some of the code out there. So they allowed malloc's behavior to remain "implementation-defined."

How much memory does calloc actually allocate?

The following code when tested, gives output as
1
0
0
2
0
which is amazing because ptr[3], ptr[4] did not have any memory allocation. Although they stored value in them and prints it. I tried the same code for few larger i's in ptr[i] which again compiled successfully and gives result but for very large value of i, of the order of 100000,program get crashed. if calloc() allocates such a large memory on single call then it is not worth effective. So how calloc() works? Where is this discrepancy?
#include <stdio.h>
void main() {
int * ptr = (int *)calloc(3,sizeof(int));//allocates memory to 3 integer
int i = 0;
*ptr = 1;
*(ptr+3) = 2;//although memory is not allocated but get initialized
for( i =0 ; i<5 ; ++i){
printf("%d\n",*ptr++);
}
}
After that i tried this code which continuously runs without any output
#include <stdio.h>
void main() {
int * ptr = (int *)calloc(3,sizeof(int));
int i = 0;
*ptr = 1;
*(ptr+3) = 2;
//free(ptr+2);
for( ; ptr!=NULL ;)
{
//printf("%d\n",*ptr++);
i++;
}
printf("%d",i);
}

You are confounding two kinds of memory allocation: the kernel's and libc's.
When you ask malloc or calloc for 5 bytes, it does not turn around and ask the kernel for 5 bytes. That would take forever. Instead, the libc heap system obtains larger blocks of memory from the kernel, and subdivides it.
Therefore, when you ask for a small amount of memory, there is usually plenty of more accessible memory right after it that has not been allocated yet by libc, and you can access it. Of course, accessing it is an instant recipe for bugs.
The only time that referencing off the end will get a SIGnal is if you happen to be at the very end of the region acquired from the kernel.
I recommend that you try running your test case under valgrind for additional insight.

The code that you have has undefined behavior. However, you do not get a crash because malloc and calloc indeed often allocate more memory than you ask.
One way to tell how much memory you've got is to call realloc with increasing size, until the pointer that you get back is different from the original. Although the standard does not guarantee that this trick is going to work, very often it would produce a good result.
Here is how you can run this experiment:
int *ptr = calloc(1,sizeof(int));
// Prevent expansion of the original block
int *block = calloc(1, sizeof(int));
int *tmp;
int k = 1;
do {
tmp = realloc(ptr, k*sizeof(int));
k++;
} while (tmp == ptr);
printf("%d\n", k-1);
This prints 4 on my system and on ideone (demo on ideone). This means that when I requested 4 bytes (i.e. one sizeof(int) from calloc, I got enough space for 16 bytes (i.e. 4*sizeof(int)). This does not mean that I can freely write up to fourints after requesting memory for a singleint`, though: writing past the boundary of the requested memory is still undefined behavior.

calloc is allocating memory for 3 int's in the given snippet. Actually you are accessing unallocated memory. Accessing unallocated memory invokes undefined behavior.

calloc allocates only the amount of memory that you asked, which in your case in for 3 int variables
but it doesnt create a bound on the pointer that it has created (in your case ptr). so you can access the unallocated memory just by incrementing the pointer. thats exactly whats happening in your case..

Allocating and initialising a pointer in C

So, I am new to pointers to in C.
I am facing a confusion.
If I have,
int a;
Here, I dont allocate memory manually for a. It's done automatically by the compiler.
Now, if in a similar fashion, if I do,
char * a;
Do I need to allocate memory for the pointer?
Secondly, I made this code,
#include <stdio.h>
int main (void)
{
int *s=NULL;
*s=100;
printf("%d\n",*s);
return 0;
}
Why do I get a seg fault in this code? Is it because I havent allocated memory for the pointer? But as asked in the above question, I can simply declare it as well without manually allocating the memory.
PS: I am new to pointers and I am facing confusion in this. Spare me if it is a bad question. Thanks.
Edit: I read the post for malloc on SO.
http://stackoverflow.com/questions/1963780/when-should-i-use-malloc-in-c-and-when-dont-i
It doesnt really solve my doubt.

You don't need to allocate memory for the pointer itself. That's automatic, like the int in your first code snippet.
What you do need to allocate is the memory that the pointer should point to, and you need to initialize the pointer to point to that.
Since you're not allocating any space, the *s= assignment is undefined behavior. s itself (the pointer) is allocated, and initialized to NULL. You can't dereference (*s - look at what the pointer points to) a null pointer.
#include <stdio.h>
#include <stdlib.h>
int main (void)
{
int *s = NULL; // s is created as a null pointer, doesn't point to any memory
s = malloc(sizeof(int)); // allocate one int's worth of memory
*s = 100; // store the int value 100 in that allocated memory
printf("%d\n",*s); // read the memory back
free(s); // release the memory
// (you can't dereference s after this without
// making it point to valid memory first)
return 0;
}

You have to allocate memory for the variable s you are declaring. Malloc is a nice way to do that. With int *s=NULL the pointer does not point to any address. And after that you trying to give a value to that address (*s=100;). If you do not want to allocate memory manualy (with malloc) you simply declare an int variable and then make s pointing to that variable.
With malloc:
#include <stdio.h> int main (void)
{
int *s=NULL;
s=(int *)malloc(sizeof(int));
*s=100;
printf("%d\n",*s);
return 0;
}
Without malloc:
#include <stdio.h>
int main (void)
{
int *s=NULL;
int var;
s=&var;
*s=100;
printf("%d\n",*s);//100
printf("%d\n",var);//also 100
return 0;
}

As pointers are also special type of variables which stores the address of other variables,As the address is also some value(number), Hence memory is also needed to store this address,so compiler automatically allocates memory(exactly 4 bytes on 32-bit machine) to a pointer variable,when you are allocating the memory using malloc() or calloc()your are actually allocating the memory to the data which is to be stored,and simply assigning the starting address of this memory to the pointer.
In your example at the line
int *s=NULL; //this line
Compiler actually allocates memory of 4 bytes (if your machine is 32-bit)
see this small snippet
int main(void)
{
int *s=NULL;
printf("%d\n",sizeof(s)); //Will output 4 if you are using is 32-bit OS or else 2 if you are using using 16 bit OS.
return 0;
}
And NULL is simply just another way of assigning Zero to a pointer.So technically your actually using 4 bytes to store zeros in that pointer. And don't be confused, assigning zero or NULL means that the pointer is not pointing to any data(as it doesn't hold a valid address,Zero is not valid address).

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight