I understand how malloc() works. My question is, I'll see things like this:
#define A_MEGABYTE (1024 * 1024)
char *some_memory;
size_t size_to_allocate = A_MEGABYTE;
some_memory = (char *)malloc(size_to_allocate);
sprintf(some_memory, "Hello World");
printf("%s\n", some_memory);
free(some_memory);
I omitted error checking for the sake of brevity. My question is, can't you just do the above by initializing a pointer to some static storage in memory? perhaps:
char *some_memory = "Hello World";
At what point do you actually need to allocate the memory yourself instead of declaring/initializing the values you need to retain?
char *some_memory = "Hello World";
is creating a pointer to a string literal. That means the string "Hello World" is typically stored in a read-only part of memory and you just have a pointer to it. You can read the string, but you must not modify it. Example:
some_memory[0] = 'h';
is asking for trouble: modifying a string literal is undefined behavior, and on many systems it crashes the program.
On the other hand
some_memory = (char *)malloc(size_to_allocate);
is allocating a char array (a variable), and some_memory points to that allocated memory. This array can be both read and written. You can now do:
some_memory[0] = 'h';
and the array contents change to "hello World"
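To make the contrast concrete, here is a minimal sketch of copying the literal into heap memory before modifying it. The helper name make_lowercase_hello is made up for illustration and is not part of the question:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a writable heap copy of "Hello World"
 * with its first letter lowercased, or NULL if malloc fails.
 * The caller owns the result and must free() it. */
char *make_lowercase_hello(void)
{
    const char *literal = "Hello World";      /* read-only: never write through this */
    char *copy = malloc(strlen(literal) + 1); /* +1 for the terminating '\0' */

    if (copy == NULL)
        return NULL;

    strcpy(copy, literal); /* the heap block belongs to us, so writing is fine */
    copy[0] = 'h';
    return copy;
}
```

Declaring the pointer to the literal as const char * lets the compiler reject accidental writes through it.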
For that exact example, malloc is of little use.
The primary reason malloc is needed is when you have data that must have a lifetime that is different from code scope. Your code calls malloc in one routine, stores the pointer somewhere and eventually calls free in a different routine.
A secondary reason is that C has no way of knowing whether there is enough space left on the stack for an allocation. If your code needs to be 100% robust, it is safer to use malloc because then your code can know the allocation failed and handle it.
malloc is a wonderful tool for allocating, reallocating and freeing memory at runtime, compared to static declarations like your hello world example, which are processed at compile-time and thus cannot be changed in size.
Malloc is therefore always useful when you deal with arbitrary sized data, like reading file contents or dealing with sockets and you're not aware of the length of the data to process.
Of course, in a trivial example like the one you gave, malloc is not the magical "right tool for the right job", but for more complex cases ( creating an arbitrary sized array at runtime for example ), it is the only way to go.
If you don't know the exact size of the memory you need to use, you need dynamic allocation (malloc). An example might be when a user opens a file in your application. You will need to read the file's contents into memory, but of course you don't know the file's size in advance, since the user selects the file on the spot, at runtime. So basically you need malloc when you don't know the size of the data you're working with in advance. At least that's one of the main reasons for using malloc. In your example with a simple string that you already know the size of at compile time (plus you don't want to modify it), it doesn't make much sense to dynamically allocate that.
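As a rough sketch of the "file of unknown size" case, the function below reads a stream into a buffer that grows with realloc. The name read_all, the starting capacity of 64 and the doubling strategy are assumptions chosen for illustration:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads everything remaining in fp into one heap buffer.
 * Returns a NUL-terminated buffer (caller frees) and stores the byte count
 * in *out_len, or returns NULL on allocation or read failure. */
char *read_all(FILE *fp, size_t *out_len)
{
    size_t cap = 64, len = 0;
    char *buf = malloc(cap);

    if (buf == NULL)
        return NULL;

    for (;;) {
        /* Always keep one byte spare for the terminator. */
        size_t n = fread(buf + len, 1, cap - len - 1, fp);
        len += n;
        if (n == 0)
            break;                    /* end of file or read error */
        if (len == cap - 1) {         /* buffer full: double it */
            char *bigger = realloc(buf, cap * 2);
            if (bigger == NULL) {
                free(buf);
                return NULL;
            }
            buf = bigger;
            cap *= 2;
        }
    }

    if (ferror(fp)) {
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    *out_len = len;
    return buf;
}
```

The caller never has to guess an upper bound, which is exactly what a fixed-size array cannot offer.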
Slightly off-topic, but... you have to be very careful not to create memory leaks when using malloc. Consider this code:
int do_something() {
uint8_t* someMemory = (uint8_t*)malloc(1024);
// Do some stuff
if ( /* some error occurred */ ) return -1;
// Do some other stuff
free(someMemory);
return result;
}
Do you see what's wrong with this code? There's a conditional return statement between malloc and free. It might seem okay at first, but think about it. If there's an error, you're going to return without freeing the memory you allocated. This is a common source of memory leaks.
Of course this is a very simple example, and it's very easy to see the mistake here, but imagine hundreds of lines of code littered with pointers, mallocs, frees, and all kinds of error handling. Things can get really messy really fast. This is one of the reasons I much prefer modern C++ over C in applicable cases, but that's a whole nother topic.
So whenever you use malloc, make sure every path out of the function, including the error paths, either frees the memory or hands ownership of it to someone else.
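One common way to get there in C is a single cleanup label that every exit path passes through. This is only a sketch: the fail parameter is a hypothetical stand-in for "some error occurred", and the value stored in the buffer is arbitrary:

```c
#include <stdint.h>
#include <stdlib.h>

/* Sketch of a leak-free version. `fail` is a made-up flag that
 * simulates the error condition from the example above. */
int do_something(int fail)
{
    int result = -1;                     /* assume failure until proven otherwise */
    uint8_t *some_memory = malloc(1024);

    if (some_memory == NULL)
        return -1;                       /* nothing was allocated, nothing to free */

    /* Do some stuff */
    if (fail)
        goto cleanup;                    /* the error path still reaches free() */

    /* Do some other stuff */
    some_memory[0] = 42;
    result = some_memory[0];

cleanup:
    free(some_memory);
    return result;
}
```

With this shape, adding another early exit later means adding another goto cleanup rather than remembering another free().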
char *some_memory = "Hello World";
sprintf(some_memory, "Goodbye...");
is undefined behavior: string literals must not be modified, even though in C their type is not const-qualified.
This will allocate a 12-byte char array on the stack or globally (depending on where it's declared).
char some_memory[] = "Hello World";
If you want to leave room for further manipulation, you can specify that the array should be sized larger. (Please don't put 1MB on the stack, though.)
#define LINE_LEN 80
char some_memory[LINE_LEN] = "Hello World";
strcpy(some_memory, "Goodbye, sad world...");
printf("%s\n", some_memory);
One reason it is necessary to allocate the memory is that you want to modify it at runtime. In that case, either malloc or a buffer on the stack can be used. The simple example of assigning "Hello World" to a pointer defines memory that typically cannot be modified at runtime.
I am pretty new to C programming and I have several functions returning type char *
Say I declare char a[some_int];, and I fill it later on. When I attempt to return it at the end of the function, it will only return the char at the first index. One thing I noticed, however, is that it will return the entirety of a if I call any sort of function on it prior to returning it. For example, my function to check the size of a string (calling something along the lines of strLength(a);).
I'm very curious what the situation is with this exactly. Again, I'm new to C programming (as you probably can tell).
EDIT: Additionally, if you have any advice concerning the best method of returning this, please let me know. Thanks!
EDIT 2: For example:
I have char ret[my_strlen(a) + my_strlen(b)]; in which a and b are strings and my_strlen returns their length.
Then I loop through filling ret using ret[i] = a[i]; and incrementing.
When I call my function that prints an input string (as a test), it prints out how I want it, but when I do
return ret;
or even
char *ptr = ret;
return ptr;
it never supplies me with the full string, just the first char.
One way that does not work for returning a chunk of char data is to return it in memory temporarily allocated on the stack during the execution of your function. That memory will (most probably) already be reused for another purpose after the function has returned.
A working alternative would be to allocate the chunk of memory on the heap. Make sure you read up about and understand the difference between stack and heap memory! The malloc() family of functions is your friend if you choose to return your data in a chunk of memory allocated on the heap (see man malloc).
char* a = (char*) malloc(some_int * sizeof(char)) should help in your case. Make sure you don't forget to free up memory once you don't need it any more.
char* ret = (char*) malloc((my_strlen(a) + my_strlen(b) + 1) * sizeof(char)) for the second example given (the + 1 leaves room for the terminating '\0'). Again, don't forget to free the memory once it isn't used any more.
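Put together, a concatenation function along those lines might look like the sketch below. The name concat is made up, and the standard strlen is used in place of the asker's my_strlen:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a new heap string holding a followed by b,
 * or NULL if malloc fails. The caller must free() the result. */
char *concat(const char *a, const char *b)
{
    size_t len_a = strlen(a);
    size_t len_b = strlen(b);
    char *ret = malloc(len_a + len_b + 1); /* +1 for the terminating '\0' */

    if (ret == NULL)
        return NULL;

    memcpy(ret, a, len_a);
    memcpy(ret + len_a, b, len_b + 1);     /* also copies b's '\0' */
    return ret;
}
```

Because the result lives on the heap, it stays valid after concat returns, which is what the stack array in the question could not do.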
As MByD correctly pointed out, it is not forbidden in general to use memory allocated on the stack to pass chunks of data in and out of functions. As long as the chunk is not allocated on the stack of the function that is returning, this works quite well.
In the scenario below, function b works on a chunk of memory allocated in the stack frame that is created when function a is entered and that lives until a returns. So everything is fine even though no heap memory is involved.
void b(char input[]){
/* do something useful here */
}
void a(){
char buf[BUFFER_SIZE];
b(buf);
/* use data filled in by b here */
}
As yet another option, you may choose to leave the memory allocation to the compiler by using a global variable (which has static storage duration rather than living on the heap). I'd count this option in the last-resort category at least, since global variables that are not handled properly are the main culprits in causing problems with reentrancy and multithreaded applications.
Happy hacking and good luck on your learning C mission.
Most examples using structs in C use malloc to assign the required size block of memory to a pointer to that struct. However, variables with basic types (int, char etc.) are allocated to the stack and it is assumed that enough memory will be available.
I understand the idea behind this is that memory may not be available for larger structs so we use malloc to ensure we do indeed have enough memory but in the case of our struct being small is this really necessary? For example if a struct only consists of three ints, surely I am always fine to assume there is enough memory?
So really my question boils down to what are the best practises in C regarding when it is necessary to malloc variables and what is the justification?
The only time you don't have to allocate memory yourself is when the compiler allocates it for you (automatic or static storage), which is what happens when you have a statement like:
int number = 5;
You can always write it as:
int *pNumber = malloc(sizeof(int));
but you have to make sure to free it or you will be leaking memory.
You can do the same thing with a struct (instead of dynamically allocating memory for it, statically allocate):
struct some_struct_t myStruct;
and access members by:
myStruct.member1 = 0;
etc...
The big difference between dynamic allocation and a local (automatic) variable is whether that data remains available outside of your current scope. With a local variable, it does not. With dynamic allocation it does, but you have to make sure to free it.
Where you run into trouble is when you have to return a structure (or a pointer to it) from a function. You either have to dynamically allocate inside the function which is returning it or you have to pass in a pointer to an externally (dynamically or statically) allocated structure which the function can then work with.
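The two options can be sketched side by side. The struct point type and both function names are hypothetical, picked to match the "three ints" struct from the question:

```c
#include <stdlib.h>

struct point { int x, y, z; }; /* hypothetical three-int struct */

/* Option 1: the function allocates; the caller must free() the result. */
struct point *point_new(int x, int y, int z)
{
    struct point *p = malloc(sizeof *p);

    if (p == NULL)
        return NULL;
    p->x = x;
    p->y = y;
    p->z = z;
    return p;
}

/* Option 2: the caller supplies the storage (stack, static, or heap). */
void point_init(struct point *p, int x, int y, int z)
{
    p->x = x;
    p->y = y;
    p->z = z;
}
```

For a struct this small there is also a third possibility, returning it by value, which needs no malloc at all.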
Good code gets re-used. Good code has few size limitations. Write good code.
Use malloc() whenever there is anything more than trivial buffer sizes.
Buffer size to write an int: The needed buffer size is at most sizeof(int)*CHAR_BIT/3 + 3. Use a fixed buffer.
Buffer size to write a double as in sprintf(buf, "%f",...: The needed buffer size could be thousands of bytes: use malloc(). Or use sprintf(buf, "%e",... and use a fixed buffer.
Forming a file path name could involve thousands of char. Use malloc().
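For the "%f" case, one way to avoid guessing the size is to ask snprintf for the required length first (standard since C99) and then malloc exactly that much. The helper name format_double is an assumption for illustration:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: formats value with "%f" into an exactly sized
 * heap buffer. Returns NULL on failure; the caller must free() the result. */
char *format_double(double value)
{
    /* With a NULL buffer and size 0, snprintf only reports the length
     * the output would need, not counting the terminating '\0'. */
    int needed = snprintf(NULL, 0, "%f", value);
    char *buf;

    if (needed < 0)
        return NULL;

    buf = malloc((size_t)needed + 1);
    if (buf == NULL)
        return NULL;

    snprintf(buf, (size_t)needed + 1, "%f", value);
    return buf;
}
```

The same two-pass pattern works for path names or any other formatted string whose length depends on runtime data.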
I don't understand how dynamically allocated strings in C work. Below, I have an example where I think I have created a pointer to a string and allocated it 0 memory, but I'm still able to give it characters. I'm clearly doing something wrong, but what?
#include <stdlib.h>
#include <stdio.h>
int main(int argc, char *argv[])
{
char *str = malloc(0);
int i = 0;
str[i++] = 'a';
str[i++] = 'b';
str[i++] = '\0';
printf("%s\n", str);
return 0;
}
What you're doing is undefined behavior. It might appear to work now, but is not required to work, and may break if anything changes.
malloc normally returns a block of memory of the given size that you can use. In your case, it just so happens that there's valid memory outside of that block that you're touching. That memory is not supposed to be touched; malloc might use that memory for internal housekeeping, it might give that memory as the result of some malloc call, or something else entirely. Whatever it is, it isn't yours, and touching it produces undefined behavior.
Section 7.20.3 of the C99 standard (7.22.3 in C11) states in part:
"If the size of the space requested is zero, the behavior is
implementation defined: either a null pointer is returned, or the
behavior is as if the size were some nonzero value, except that the
returned pointer shall not be used to access an object."
So the result is implementation-defined: malloc(0) returns either a NULL pointer or, as mentioned, a pointer that must not be dereferenced.
You are overwriting memory that was not allocated to you. This might look like it works, but you are in trouble when you call free() and the heap functions try to give the memory block back.
In many implementations, each chunk of memory returned by malloc() has a header and sometimes a trailer. These structures hold at least the size of the allocated memory, and sometimes there are additional guards. You are overwriting these internal heap structures. That's why free() may complain or crash.
So you have an undefined behavior.
By doing malloc(0) you are creating a NULL pointer or a unique pointer that can be passed to free. Nothing wrong with that line. The problem lies when you perform pointer arithmetic and assign values to memory you have not allocated. Hence:
str[i++] = 'a'; // Invalid (undefined).
str[i++] = 'b'; // Invalid (undefined).
str[i++] = '\0'; // Invalid (undefined).
printf("%s\n", str); // Well-formed call, but the behavior is already undefined.
It's always good to do two things:
Do not malloc 0 bytes.
Check to ensure the block of memory you malloced is valid.
... to check to see if a block of memory requested from malloc is valid, do the following:
if ( str == NULL ) exit( EXIT_FAILURE );
... after your call to malloc.
Your malloc(0) is wrong. As other people have pointed out that may or may not end up allocating a bit of memory, but regardless of what malloc actually does with 0 you should in this trivial example allocate at least 3*sizeof(char) bytes of memory.
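A corrected version of the allocation might look like this sketch, wrapped in a made-up function make_ab so that it can be reused:

```c
#include <stdlib.h>

/* Hypothetical helper: builds the string "ab" in a heap buffer that is
 * large enough for it. Returns NULL if malloc fails; the caller frees. */
char *make_ab(void)
{
    char *str = malloc(3 * sizeof(char)); /* 'a', 'b', and '\0' */
    size_t i = 0;                         /* initialized before use */

    if (str == NULL)
        return NULL;

    str[i++] = 'a';
    str[i++] = 'b';
    str[i++] = '\0';
    return str;
}
```

Every write now lands inside the block that malloc handed out, so the behavior is defined.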
So here we have a right nuisance. Say you allocated 20 bytes for your string, and then filled it with 19 characters and a null, thus filling the memory. So far so good. However, consider the case where you then want to add more characters to the string; you can't just put them in place because you had allocated only 20 bytes and you had already used them. All you can do is allocate a whole new buffer (say, 40 bytes), copy the original 19 characters into it, then add the new characters on the end and then free the original 20 bytes. Sounds inefficient, doesn't it? And it is inefficient; it's a whole lot of work to allocate memory, and it sounds like an especially large amount of work compared to other languages (e.g. C++) where you just concatenate strings with nothing more than str1 + str2.
Except that underneath the hood those languages are having to do exactly the same thing of allocating more memory and copying existing data. If one cares about high performance C makes it clearer where you are spending time, whereas the likes of C++, Java, C# hide the costly operations from you behind convenient-to-use classes. Those classes can be quite clever (eg allocating more memory than strictly necessary just in case), but you do have to be on the ball if you're interested in extracting the very best performance from your hardware.
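The "allocate more than strictly necessary" trick can be sketched in C as a small growable buffer. The struct strbuf type, the function name and the starting capacity of 16 are all assumptions made for illustration:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string: len bytes used out of cap allocated.
 * Start from { NULL, 0, 0 }. */
struct strbuf {
    char *data;
    size_t len;
    size_t cap;
};

/* Appends text, doubling the capacity when needed so that most appends
 * do not reallocate. Returns 0 on success, -1 if realloc fails (the
 * buffer is left unchanged in that case). */
int strbuf_append(struct strbuf *sb, const char *text)
{
    size_t add = strlen(text);
    size_t need = sb->len + add + 1;      /* +1 for the terminating '\0' */

    if (need > sb->cap) {
        size_t new_cap = sb->cap ? sb->cap : 16;
        char *p;

        while (new_cap < need)
            new_cap *= 2;
        p = realloc(sb->data, new_cap);   /* realloc(NULL, n) acts like malloc(n) */
        if (p == NULL)
            return -1;
        sb->data = p;
        sb->cap = new_cap;
    }
    memcpy(sb->data + sb->len, text, add + 1);
    sb->len += add;
    return 0;
}
```

The owner of the struct calls free(sb.data) when done. realloc takes care of the copy-and-release step described above, and the doubling keeps the number of reallocations small.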
This sort of problem is what lies behind the difficulties that operations like Facebook and Twitter had in growing their services. Sooner or later those convenient but inefficient class methods add up to something unsustainable.
I am just learning C (reading Sams Teach Yourself C in 24 Hours). I've gotten through pointers and memory allocation, but now I'm wondering about them inside a structure.
I wrote the little program below to play around, but I'm not sure if it is OK or not. It compiled on a Linux system with gcc and the -Wall flag with nothing amiss, but I'm not sure that is 100% trustworthy.
Is it ok to change the allocation size of a pointer as I have done below or am I possibly stepping on adjacent memory? I did a little before/after variable in the structure to try to check this, but don't know if that works and if structure elements are stored contiguously in memory (I'm guessing so since a pointer to a structure can be passed to a function and the structure manipulated via the pointer location). Also, how can I access the contents of the pointer location and iterate through it so I can make sure nothing got overwritten if it is contiguous? I guess one thing I'm asking is how can I debug messing with memory this way to know it isn't breaking anything?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
struct hello {
char *before;
char *message;
char *after;
};
int main (){
struct hello there= {
"Before",
"Hello",
"After",
};
printf("%zu\n", strlen(there.message));
printf("%s\n", there.message);
printf("%zu\n", sizeof(there));
there.message = malloc(20 * sizeof(char));
there.message = "Hello, there!";
printf("%zu\n", strlen(there.message));
printf("%s\n", there.message);
printf("%s %s\n", there.before, there.after);
printf("%zu\n", sizeof(there));
return 0;
}
I'm thinking something is not right because the size of my there didn't change.
Not really ok, you have a memory leak, you could use valgrind to detect it at runtime (on Linux).
You are coding:
there.message = malloc(20 * sizeof(char));
there.message = "Hello, there!";
The first assignment calls malloc(3). When calling malloc you should always test whether it fails, even though it usually succeeds. So better code would be, at least:
there.message = malloc(20 * sizeof(char));
if (!there.message)
{ perror("malloc of 20 failed"); exit (EXIT_FAILURE); }
The second assignment puts the address of the constant literal string "Hello, there!" into the same pointer there.message, and you have lost the first value. You probably want to copy that constant string instead:
strncpy (there.message, "Hello, there!", 20*sizeof(char));
(you could use just strcpy(3) but beware of buffer overflows)
You could get a fresh copy (in heap) of some string using strdup(3) (and GNU libc has also asprintf(3) ...)
there.message = strdup("Hello, There");
if (!there.message)
{ perror("strdup failed"); exit (EXIT_FAILURE); }
Finally, it is good taste to free heap memory at program end.
(But the operating system would reclaim the process's address space at _exit(2) time anyway.)
Read more about C programming, memory management, garbage collection. Perhaps consider using Boehm's conservative GC
A C pointer is just the address of a memory zone. It carries no size information, so applications need to keep track of the size themselves.
PS. manual memory management in C is tricky, even for seasoned veteran programmers.
there.message = "Hello, there!" does not copy the string into the buffer. It sets the pointer to a new (generally static) buffer holding the string "Hello, there!". Thus, the code as written has a memory leak (allocated memory that never gets freed until the program exits).
But, yes, the malloc is fine in its own right. You'd generally use a strncpy, sprintf, or similar function to copy content into the buffer thus allocated.
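A sketch of the copy-instead-of-reassign fix follows. The helper name hello_set_message is made up, and it assumes the message field does not already own heap memory when it is called:

```c
#include <stdlib.h>
#include <string.h>

struct hello {
    char *before;
    char *message;
    char *after;
};

/* Hypothetical helper: stores a heap copy of text in h->message.
 * Returns 0 on success, -1 if malloc fails. Assumes h->message does not
 * currently point at heap memory (otherwise free it first). */
int hello_set_message(struct hello *h, const char *text)
{
    char *copy = malloc(strlen(text) + 1); /* sized to fit the text exactly */

    if (copy == NULL)
        return -1;

    strcpy(copy, text);                    /* copy the characters, not the address */
    h->message = copy;
    return 0;
}
```

Note that sizeof(struct hello) stays the same whatever the message length is: the struct holds three pointers, and the characters they point to live outside it. The caller calls free(there.message) once the message is no longer needed.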
Is it ok to change the allocation size of a pointer [...] ?
Huh? What do you mean by "changing the allocation size of a pointer"? Currently all your code does is leak the 20 bytes you malloc()ed by assigning a different address to the pointer.
A similar question may already exist on SO, but I didn't find it. Here is the scenario:
Case 1
void main()
{
char g[10];
char a[10];
scanf("%[^\n] %[^\n]",a,g);
swap(a,g);
printf("%s %s",a,g);
}
Case 2
void main()
{
char *g=malloc(sizeof(char)*10);
char *a=malloc(sizeof(char)*10);
scanf("%[^\n] %[^\n]",a,g);
swap(a,g);
printf("%s %s",a,g);
}
I'm getting the same output in both cases. So my question is: when should I prefer malloc() over an array, or vice versa, and why? I found the common definition that malloc() provides dynamic allocation. Is that the only difference between them? Please, can anyone explain with an example what the meaning of dynamic is, although we are specifying the size in malloc()?
The principal difference relates to when and how you decide the array length. Using fixed length arrays forces you to decide your array length at compile time. In contrast, using malloc allows you to decide the array length at runtime.
In particular, deciding at runtime allows you to base the decision on user input, on information not known at the time you compile. For example, you may allocate the array to be a size big enough to fit the actual data input by the user. If you use fixed length arrays, you have to decide at compile time an upper bound, and then force that limitation onto the user.
Another more subtle issue is that allocating very large fixed length arrays as local variables can lead to stack overflow runtime errors. And for that reason, you sometimes prefer to allocate such arrays dynamically using malloc.
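Here is a small sketch of what "deciding the length at runtime" means in practice. The function make_squares and its contents are invented for illustration. The point is only that n is an ordinary variable, for example a number the user typed in:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: returns a heap array of n ints holding 0, 1, 4, 9, ...
 * The length n is only known at runtime. Returns NULL if n is 0, if the
 * byte count would overflow, or if malloc fails. The caller frees. */
int *make_squares(size_t n)
{
    int *a;
    size_t i;

    if (n == 0 || n > SIZE_MAX / sizeof(int))
        return NULL;

    a = malloc(n * sizeof *a);
    if (a == NULL)
        return NULL;              /* unlike a stack overflow, this failure is detectable */

    for (i = 0; i < n; i++)
        a[i] = (int)(i * i);
    return a;
}
```

With char a[10] the limit of 10 is baked in when you compile. Here the same code serves 5 elements or 5 million, and reports failure instead of crashing when the request is too large.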
Please any one explain with example, what is the meaning of dynamic although we are specifying the size.
I suspect this was significant before C99. Before C99, you couldn't have dynamically-sized auto arrays:
void somefunc(size_t sz)
{
char buf[sz];
}
is valid C99 but invalid C89. However, using malloc(), you can specify any value, you don't have to call malloc() with a constant as its argument.
Also, to clear up what other purpose malloc() has: you can't return stack-allocated memory from a function, so if your function needs to return allocated memory, you typically use malloc() (or some other member of the malloc family, including realloc() and calloc()) to obtain a block of memory. To understand this, consider the following code:
char *foo()
{
char buf[13] = "Hello world!";
return buf;
}
Since buf is a local variable, it's invalidated at the end of its enclosing function, so using the returned pointer results in undefined behavior. The function above is erroneous. However, a pointer obtained using malloc() remains valid across function calls (until you call free() on it):
char *bar()
{
char *buf = malloc(13);
strcpy(buf, "Hello World!");
return buf;
}
This is absolutely valid.
I would add that in this particular example, malloc() is very wasteful, as there is more memory allocated for the array than what would appear [due to overhead in malloc] as well as the time it takes to call malloc() and later free() - and there's overhead for the programmer to remember to free it - memory leaks can be quite hard to debug.
Edit: Case in point, your code is missing the free() at the end of main() - may not matter here, but it shows my point quite well.
So small structures (less than 100 bytes) should typically be allocated on the stack. If you have large data structures, it's better to allocate them with malloc (or, if it's the right thing to do, use globals - but this is a sensitive subject).
Clearly, if you don't know the size of something beforehand, and it MAY be very large (kilobytes in size), it is definitely a case of "consider using malloc".
On the other hand, stacks are pretty big these days (for "real computers" at least), so allocating a couple of kilobytes of stack is not a big deal.