Why do I need to allocate memory? - c

#include<stdio.h>
#include<stdlib.h>
void main()
{
char *arr;
arr=(char *)malloc(sizeof (char)*4);
scanf("%s",arr);
printf("%s",arr);
}
In the above program, do I really need to allocate the arr?
It is giving me the result even without using the malloc.
My second doubt is ' I am expecting an error in 9th line because I think it must be
printf("%s",*arr);
or something.

do I really need to allocate the arr?
Yes, otherwise you're dereferencing an uninitialised pointer (i.e. writing to a random chunk of memory), which is undefined behaviour.

do I really need to allocate the arr?
You need to set arr to point to a block of memory you own, either by calling malloc or by setting it to point to another array. Otherwise it points to a random memory address that may or may not be accessible to you.
In C, casting the result of malloc is discouraged1; it's unnecessary, and in some cases can mask an error if you forget to include stdlib.h or otherwise don't have a prototype for malloc in scope.
I usually recommend malloc calls be written as
T *ptr = malloc(N * sizeof *ptr);
where T is whatever type you're using, and N is the number of elements of that type you want to allocate. sizeof *ptr is equivalent to sizeof (T), so if you ever change T, you won't need to duplicate that change in the malloc call itself. Just one less maintenance headache.
It is giving me the result even without using the malloc
Because you don't explicitly initialize it in the declaration, the initial value of arr is indeterminate2; it contains a random bit string that may or may not correspond to a valid, writable address. The behavior on attempting to read or write through an invalid pointer is undefined, meaning the compiler isn't obligated to warn you that you're doing something dangerous. On of the possible outcomes of undefined behavior is that your code appears to work as intended. In this case, it looks like you're accessing a random segment of memory that just happens to be writable and doesn't contain anything important.
My second doubt is ' I am expecting an error in 9th line because I think it must be printf("%s",*arr); or something.
The %s conversion specifier tells printf that the corresponding argument is of type char *, so printf("%s", arr); is correct. If you had used the %c conversion specifier, then yes, you would need to dereference arr with either the * operator or a subscript, such as printf("%c", *arr); or printf("%c", arr[i]);.
Also, unless your compiler documentation explicitly lists it as a valid signature, you should not define main as void main(); either use int main(void) or int main(int argc, char **argv) instead.
1. The cast is required in C++, since C++ doesn't allow you to assign void * values to other pointer types without an explicit cast
2. This is true for pointers declared at block scope. Pointers declared at file scope (outside of any function) or with the static keyword are implicitly initialized to NULL.

Personally, I think this a very bad example of allocating memory.
A char * will take up, in a modern OS/compiler, at least 4 bytes, and on a 64-bit machine, 8 bytes. So you use four bytes to store the location of the four bytes for your three-character string. Not only that, but malloc will have overheads, that add probably between 16 and 32 bytes to the actual allocated memory. So, we're using something like 20 to 40 bytes to store 4 bytes. That's a 5-10 times more than it actually needs.
The code also casts malloc, which is wrong in C.
And with only four bytes in the buffer, the chances of scanf overflowing is substantial.
Finally, there is no call to free to return the memory to the system.
It would be MUCH better to use:
int len;
char arr[5];
fgets(arr, sizeof(arr), stdin);
len = strlen(arr);
if (arr[len] == '\n') arr[len] = '\0';
This will not overflow the string, and only use 9 bytes of stackspace (not counting any padding...), rather than 4-8 bytes of stackspace and a good deal more on the heap. I added an extra character to the array, so that it allows for the newline. Also added code to remove the newline that fgets adds, as otherwise someone would complain about that, I'm sure.

In the above program, do I really need to allocate the arr?
You bet you do.
It is giving me the result even without using the malloc.
Sure, that's entirely possible... arr is a pointer. It points to a memory location. Before you do anything with it, it's uninitialized... so it's pointing to some random memory location. The key here is wherever it's pointing is a place your program is not guaranteed to own. That means you can just do the scanf() and at that random location that arr is pointing to the value will go, but another program can overwrite that data.
When you say malloc(X) you're telling the computer that you need X bytes of memory for your own usage that no one else can touch. Then when arr captures the data it will be there safely for your usage until you call free() (which you forgot to do in your program BTW)
This is a good example of why you should always initialize your pointers to NULL when you create them... it reminds you that you don't own what they're pointing at and you better point them to something valid before using them.
I am expecting an error in 9th line because I think it must be printf("%s",*arr)
Incorrect. scanf() wants an address, which is what arr is pointing to, that's why you don't need to do: scanf("%s", &arr). And printf's "%s" specificier wants a character array (a pointer to a string of characters) which again is what arr is, so no need to deference.

Related

do I need to allocate space for pointer as well as space for memory area whose address will be kept in pointer in pointer to pointer and realloc

I have this code
int main(int argc, char *argv[])
{
int i=1;
char **m=malloc(sizeof(char *)*i);
printf("%zu\n",sizeof *m);
m[0]=malloc(strlen("hello")+1);
strcpy(m[0],"hello");
printf("%s\n", m[0]);
i=2;
m=(char **)realloc(m,sizeof (char *)*i);
m[1]=malloc(strlen("hi")+1);
strcpy(m[1],"hi");
printf("%s %s \n",m[0],m[1] );
// TODO: write proper cleanup code just for good habits.
return 0;
}
this is how I am allocating pointer char **m 8 byte single char pointer
int i=1;
char **m=malloc(sizeof(char *)*i);
and this is how I am allocating area of space whose address will be kept in m[0]
m[0]=malloc(strlen("hello")+1);
strcpy(m[0],"hello");
printf("%s\n", m[0]);
I like to know is this normally how its done. I mean allocating space for pointer and then allocating space in memory that the pointer will hold.
Does m[0]=malloc(strlen("hello")+1); is same as this *(m+0)=malloc(strlen("hello")+1); and does this m[1]=malloc(strlen("hi")+1); this *(m+1)=malloc(strlen("hi")+1);
And I am increasing pointer to pointer numbers like this in allocation m=(char **)realloc(m,sizeof (char *)*i); before m[1]=malloc(strlen("hi")+1);
is there anything wrong with above code. I seen similar code on this Dynamic memory/realloc string array
can anyone please explain with this statement char **m=malloc(sizeof(char *)*i); I am allocating 8 byte single pointer of type char but with this statement m=(char **)realloc(m,sizeof (char *)*i); why I am not getting stack smaching detected error. How exactly realloc works. can anyone give me the link of realloc function or explain a bit on this please
I like to know is this normally how its done. I mean allocating space for pointer and then allocating space in memory that the pointer will hold.
It depends on what you are trying to achieve. If you wish to allocate an unspecified amount of strings with individual lengths, then your code is pretty much the correct way to do it.
If you wish to have a fixed amount of strings with individual lengths, you could just do char* arr [n]; and then only malloc each arr[i].
Or if you wish to have a fixed amount of strings with a fixed maximum length, you could use a 2D array of characters, char arr [x][y];, and no malloc at all.
Does m[0]=malloc(strlen("hello")+1); is same as this *(m+0)=malloc(strlen("hello")+1);
Yes, m[0] is 100% equivalent to *((m)+(0)). See Do pointers support "array style indexing"?
is there anything wrong with above code
Not really, except stylistic and performance issues. It could optionally be rewritten like this:
char** m = malloc(sizeof(*m) * i); // subjective style change
m[0]=malloc(sizeof("hello")); // compile-time calculation, better performance
why I am not getting stack smaching detected error
Why would you get that? The only thing stored on the stack here is the char** itself. The rest is stored on the heap.
How exactly realloc works. can anyone give me the link of realloc function or explain a bit on this please
It works pretty much as you've used it, though pedantically you should not store the result in the same pointer as the one passed, in case realloc fails and you wish to continue using the old data. That's a very minor remark though, since in case realloc fails, it either means that you made an unrealistic request for memory, or that the RAM on your system is toast and you will unlikely be able to continue execution anyway.
The canonical documentation for realloc would be the C standard C17 7.22.3.5:
#include <stdlib.h>
void *realloc(void *ptr, size_t size);
The realloc function deallocates the old object pointed to by ptr and returns a
pointer to a new object that has the size specified by size. The contents of the new
object shall be the same as that of the old object prior to deallocation, up to the lesser of
the new and old sizes. Any bytes in the new object beyond the size of the old object have
indeterminate values.
If ptr is a null pointer, the realloc function behaves like the malloc function for the
specified size. Otherwise, if ptr does not match a pointer earlier returned by a memory
management function, or if the space has been deallocated by a call to the free or
realloc function, the behavior is undefined. If memory for the new object cannot be
allocated, the old object is not deallocated and its value is unchanged.
Returns
The realloc function returns a pointer to the new object (which may have the same value as a pointer to the old object), or a null pointer if the new object could not be allocated.
Notably there is no guarantee that the returned pointer always has the same value as the old pointer, so correct use would be:
char* tmp = realloc(arr, size);
if(tmp == NULL)
{
/* error handling */
}
arr = tmp;
(Where tmp has the same type as arr.)
Your code looks fine to me. Yes, if you are storing an array of strings, and you don't know how many strings will be in the array in advance, then it is perfectly fine to allocate space for an array of pointers with malloc. You also need to somehow get memory for the strings themselves, and it is perfectly fine for each string to be allocated with its own malloc call.
The line you wrote to use realloc is fine; it expands the memory area you've allocated for pointers so that it now has the capacity to hold 2 pointers, instead of just 1. When the realloc function does this, it might need to move the memory allocation to a different address, so that is why you have to overwrite m as you did. There is no stack smashing going on here. Also, please note that pointers are not 8 bytes on every platform; that's why it was wise of you to write sizeof(char *) instead of 8.
To find more documentation about realloc, you can look in the C++ standard, or the POSIX standard, but perhaps the most appropriate place for this question is the C standard, which documents realloc on page 314.

C or C++ sprintf and value in struct

code:
sprintf(tmp, "xbitmap_width %d\n", symbol->scale);
Output:
xbitmap_width 1075052544
expected output - value of scale which is 5 so it should be:
xbitmap_width 5
What am i missing??? Why is sprintf taking pointer value?
Update:
If symbol->scale is indeed not a pointer, then also ensure tmp is big enough, to avoid overflow. I hope tmp is at least 18 chars big, but best make it big enough (like 30 or bigger), and if it's allocated on the heap: initialize it to zeroes: memset or calloc(30, sizeof *tmp) would be preferable.
You may also want to ensure that symbol is not a stack value, returned by a function. This, too, would be undefined behaviour. However, given that you say you're using new or malloc (which _does not initialize the struct, BTW), that can't be the issue.
The not-initializing bit here (when using malloc) might be, though: malloc merely reserves enough memory to store a given object one or more times. The memory is not initialized, though:
char *str = malloc(100);
Is something like that thing where you give a bunch of monkeys type-writers: eventually one of them might wind up punching in a line of Shakespeare: well, if you malloc strings like this, and print them, eventually one of them might end up containing the string "Don't panic".
Now, this isn't exactly true, but you get the point...
To ensure your struct is initialized, either use calloc or memset those members that str giving you grief.
if your struct looks like this:
struct symbol
{
int *scale;
}
Then you are passing the value of scale to sprintf. This value is a memory address, not an int. An int, as you may no is guaranteed to be at least 2 bytes in size (most commonly it's 4 though). A pointer is 4 or 8 bytes in size, so passing a pointer, and have sprintf interpret it as an int, you get undefined behaviour.
To print 5 in your case:
struct symbol *symbol = malloc(sizeof *symbol);
int s = 5;
symbol->scale = &s;
printf("%d\n", *(symbol->scale));//dereference the scale pointer
But this is undefined behaviour:
printf("%d\n", symbol->scale);//passing pointer VALUE ==> memory address
//for completeness & good practices' sake:
free(symbol);
Oh, and as stated in the comments: snprintf is to sprintf what strncpy is to strcpy and strncat is to strcat: it's safer to use the function which allows you to specify a maximum of chars to set

C char* pointers pointing to same location where they definitely shouldn't

I'm trying to write a simple C program on Ubuntu using Eclipse CDT (yes, I'm more comfortable with an IDE and I'm used to Eclipse from Java development), and I'm stuck with something weird. On one part of my code, I initialize a char array in a function, and it is by default pointing to the same location with one of the inputs, which has nothing to do with that char array. Here is my code:
char* subdir(const char input[], const char dir[]){
[*] int totallen = strlen(input) + strlen(dir) + 2;
char retval[totallen];
strcpy(retval, input);
strcat(retval, dir);
...}
Ok at the part I've marked with [*], there is a checkpoint. Even at that breakpoint, when I check y locals, I see that retval is pointing to the same address with my argument input. It not even possible as input comes from another function and retval is created in this function. Is is me being unexperienced with C and missing something, or is there a bug somewhere with the C compiler?
It seems so obvious to me that they should't point to the same (and a valid, of course, they aren't NULL) location. When the code goes on, it literally messes up everything; I get random characters and shapes in console and the program crashes.
I don't think it makes sense to check the address of retval BEFORE it appears, it being a VLA and all (by definition the compiler and the debugger don't know much about it, it's generated at runtime on the stack).
Try checking its address after its point of definition.
EDIT
I just read the "I get random characters and shapes in console". It's obvious now that you are returning the VLA and expecting things to work.
A VLA is only valid inside the block where it was defined. Using it outside is undefined behavior and thus very dangerous. Even if the size were constant, it still wouldn't be valid to return it from the function. In this case you most definitely want to malloc the memory.
What cnicutar said.
I hate people who do this, so I hate me ... but ... Arrays of non-const size are a C99 extension and not supported by C++. Of course GCC has extensions to make it happen.
Under the covers you are essentially doing an _alloca, so your odds of blowing out the stack are proportional to who has access to abuse the function.
Finally, I hope it doesn't actually get returned, because that would be returning a pointer to a stack allocated array, which would be your real problem since that array is gone as of the point of return.
In C++ you would typically use a string class.
In C you would either pass a pointer and length in as parameters, or a pointer to a pointer (or return a pointer) and specify the calls should call free() on it when done. These solutions all suck because they are error prone to leaks or truncation or overflow. :/
Well, your fundamental problem is that you are returning a pointer to the stack allocated VLA. You can't do that. Pointers to local variables are only valid inside the scope of the function that declares them. Your code results in Undefined Behaviour.
At least I am assuming that somewhere in the ..... in the real code is the line return retval.
You'll need to use heap allocation, or pass a suitably sized buffer to the function.
As well as that, you only need +1 rather than +2 in the length calculation - there is only one null-terminator.
Try changing retval to a character pointer and allocating your buffer using malloc().
Pass the two string arguments as, char * or const char *
Rather than returning char *, you should just pass another parameter with a string pointer that you already malloc'd space for.
Return bool or int describing what happened in the function, and use the parameter you passed to store the result.
Lastly don't forget to free the memory since you're having to malloc space for the string on the heap...
//retstr is not a const like the other two
bool subdir(const char *input, const char *dir,char *retstr){
strcpy(retstr, input);
strcat(retstr, dir);
return 1;
}
int main()
{
char h[]="Hello ";
char w[]="World!";
char *greet=(char*)malloc(strlen(h)+strlen(w)+1); //Size of the result plus room for the terminator!
subdir(h,w,greet);
printf("%s",greet);
return 1;
}
This will print: "Hello World!" added together by your function.
Also when you're creating a string on the fly you must malloc. The compiler doesn't know how long the two other strings are going to be, thus using char greet[totallen]; shouldn't work.

How much memory is reserved when i declare a string?

What exactly happens, in terms of memory, when i declare something like:
char arr[4];
How many bytes are reserved for arr?
How is null string accommodated when I 'strcpy' a string of length 4 in arr?
I was writing a socket program, and when I tried to suffix NULL at arr[4] (i.e. the 5th memory location), I ended up replacing the values of some other variables of the program (overflow) and got into a big time mess.
Any descriptions of how compilers (gcc is what I used) manage memory?
sizeof(arr) bytes are saved* (plus any padding the compiler wants to put around it, though that isn't for the array per se). On an implementation with a stack, this just means moving the stack pointer sizeof(arr) bytes down. (That's where the storage comes from. This is also why automatic allocation is fast.)
'\0' isn't accommodated. If you copy "abcd" into it, you get a buffer overrun, because that takes up 5 bytes total, but you only have 4. You enter undefined behavior land, and anything could happen.
In practice you'll corrupt the stack and crash sooner or later, or experience what you did and overwrite nearby variables (because they too are allocated just like the array was.) But nobody can say for certain what happens, because it's undefined.
* Which is sizeof(char) * 4. sizeof(char) is always 1, so 4 bytes.
What exactly happens, in terms of
memory, when i declare something like:
char arr[4];
4 * sizeof(char) bytes of stack memory is reserved for the string.
How is null string accommodated when I
'strcpy' a string of length 4 in arr?
You can not. You can only have 3 characters, 4th one (i.e. arr[3]) should be '\0' character for a proper string.
when I tried to suffix NULL at arr[4]
The behavior will be undefined as you are accessing a invalid memory location. In the best case, your program will crash immediately, but it might corrupt the stack and crash at a later point of time also.
In C, what you ask for is--usually--exactly what you get. char arr[4] is exactly 4 bytes.
But anything in quotes has a 'hidden' null added at the end, so char arr[] = "oops"; reserves 5 bytes.
Thus, if you do this:
char arr[4];
strcpy(arr, "oops");
...you will copy 5 bytes (o o p s \0) when you've only reserved space for 4. Whatever happens next is unpredictable and often catastrophic.
When you define a variable like char arr[4], it reserves exactly 4 bytes for that variable. As you've found, writing beyond that point causes what the standard calls "undefined behavior" -- a euphemism for "you screwed up -- don't do that."
The memory management of something like this is pretty simple: if it's a global, it gets allocated in a global memory space. If it's a local, it gets allocated on the stack by subtracting an appropriate amount from the stack pointer. When you return, the stack pointer is restored, so they cease to exist (and when you call another function, will normally get overwritten by parameters and locals for that function).
When you make a declaration like char arr[4];, the compiler allocates as many bytes as you asked for, namely four. The compiler might allocate extra in order to accommodate efficient memory accesses, but as a rule you get exactly what you asked for.
If you then declare another variable in the same function, that variable will generally follow arr in memory, unless the compiler makes certain optimizations again. For that reason, if you try to write to arr but write more characters than were actually allocated for arr, then you can overwrite other variables on the stack.
This is not really a function of gcc. All C compilers work essentially the same way.

Pointer initialization and string manipulation in C

I have this function which is called about 1000 times from main(). When i initialize a pointer in this function using malloc(), seg fault occurs, possibly because i did not free() it before leaving the function. Now, I tried free()ing the pointer before returning to main, but its of no use, eventually a seg fault occurs.
The above scenario being one thing, how do i initialize double pointers (**ptr) and pointer to array of pointers (*ptr[])?
Is there a way to copy a string ( which is a char array) into an array of char pointers.
char arr[]; (Lets say there are fifty such arrays)
char *ptr_arr[50]; Now i want point each such char arr[] in *ptr_arr[]
How do i initialize char *ptr_arr[] here?
What are the effects of uninitialized pointers in C?
Does strcpy() append the '\0' on its own or do we have to do it manually? How safe is strcpy() compared to strncpy()? Like wise with strcat() and strncat().
Thanks.
Segfault can be caused by many things. Do you check the pointer after the malloc (if it's NULL)? Step through the lines of the code to see exactly where does it happen (and ask a seperate question with more details and code)
You don't seem to understand the relation of pointers and arrays in C. First, a pointer to array of pointers is defined like type*** or type**[]. In practice, only twice-indirected pointers are useful. Still, you can have something like this, just dereference the pointer enough times and do the actual memory allocation.
This is messy. Should be a separate question.
They most likely crash your program, BUT this is undefined, so you can't be sure. They might have the address of an already used memory "slot", so there might be a bug you don't even notice.
From your question, my advice would be to google "pointers in C" and read some tutorials to get an understanding of what pointers are and how to use them - there's a lot that would need to be repeated in an SO answer to get you up to speed.
The top two hits are here and here.
It's hard to answer your first question without seeing some code -- Segmentation Faults are tricky to track down and seeing the code would be more straightforward.
Double pointers are not more special than single pointers as the concepts behind them are the same. For example...
char * c = malloc(4);
char **c = &c;
I'm not quite sure what c) is asking, but to answer your last question, uninitialized pointers have undefined action in C, ie. you shouldn't rely on any specific result happening.
EDIT: You seem to have added a question since I replied...
strcpy(..) will indeed copy the null terminator of the source string to the destination string.
for part 'a', maybe this helps:
void myfunction(void) {
int * p = (int *) malloc (sizeof(int));
free(p);
}
int main () {
int i;
for (i = 0; i < 1000; i++)
myfunction();
return 0;
}
Here's a nice introduction to pointers from Stanford.
A pointer is a special type of variable which holds the address or location of another variable. Pointers point to these locations by keeping a record of the spot at which they were stored. Pointers to variables are found by recording the address at which a variable is stored. It is always possible to find the address of a piece of storage in C using the special & operator. For instance: if location were a float type variable, it would be easy to find a pointer to it called location_ptr
float location;
float *location_ptr,*address;
location_ptr = &(location);
or
address = &(location);
The declarations of pointers look a little strange at first. The star * symbol which stands in front of the variable name is C's way of declaring that variable to be a pointer. The four lines above make two identical pointers to a floating point variable called location, one of them is called location_ptr and the other is called address. The point is that a pointer is just a place to keep a record of the address of a variable, so they are really the same thing.
A pointer is a bundle of information that has two parts. One part is the address of the beginning of the segment of memory that holds whatever is pointed to. The other part is the type of value that the pointer points to the beginning of. This tells the computer how much of the memory after the beginning to read and how to interpret it. Thus, if the pointer is of a type int, the segment of memory returned will be four bytes long (32 bits) and be interpreted as an integer. In the case of a function, the type is the type of value that the function will return, although the address is the address of the beginning of the function executable.
Also get more tutorial on C/C++ Programming on http://www.jnucode.blogspot.com
You've added an additional question about strcpy/strncpy.
strcpy is actually safer.
It copies a nul terminated string, and it adds the nul terminator to the copy. i.e. you get an exact duplicate of the original string.
strncpy on the other hand has two distinct behaviours:
if the source string is fewer than 'n' characters long, it acts just as strcpy, nul terminating the copy
if the source string is greater than or equal to 'n' characters long, then it simply stops copying when it gets to 'n', and leaves the string unterminated. It is therefore necessary to always nul-terminate the resulting string to be sure it's still valid:
char dest[123];
strncpy(dest, source, 123);
dest[122] = '\0';

Resources