I am working on a short program that reads a .txt file. Intially, I was playing around in main function, and I had gotten to my code to work just fine. Later, I decided to abstract it to a function. Now, I cannot seem to get my code to work, and I have been hung up on this problem for quite some time.
I think my biggest issue is that I don't really understand what is going on at a memory/hardware level. I understand that a pointer simply holds a memory address, and a pointer to a pointer simply holds a memory address to an another memory address, a short breadcrumb trail to what we really want.
Yet, now that I am introducing malloc() to expand the amount of memory allocated, I seem to lose sight of whats going on. In fact, I am not really sure how to think of memory at all anymore.
So, a char takes up a single byte, correct?
If I understand correctly, then by a char* takes up a single byte of memory?
If we were to have a:
char* str = "hello"
Would it be say safe to assume that it takes up 6 bytes of memory (including the null character)?
And, if we wanted to allocate memory for some "size" unknown at compile time, then we would need to dynamically allocate memory.
int size = determine_size();
char* str = NULL;
str = (char*)malloc(size * sizeof(char));
Is this syntactically correct so far?
Now, if you would judge my interpretation. We are telling the compiler that we need "size" number of contiguous memory reserved for chars. If size was equal to 10, then str* would point to the first address of 10 memory addresses, correct?
Now, if we could go one step further.
int size = determine_size();
char* str = NULL;
file_read("filename.txt", size, &str);
This is where my feet start to leave the ground. My interpretation is that file_read() looks something like this:
int file_read(char* filename, int size, char** buffer) {
// Set up FILE stream
// Allocate memory to buffer
buffer = malloc(size * sizeof(char));
// Add characters to buffer
int i = 0;
char c;
while((c=fgetc(file))!=EOF){
*(buffer + i) = (char)c;
i++;
}
Adding the characters to the buffer and allocating the memory is what is I cannot seem to wrap my head around.
If **buffer is pointing to *str which is equal to null, then how do I allocate memory to *str and add characters to it?
I understand that this is lengthy, but I appreciate the time you all are taking to read this! Let me know if I can clarify anything.
EDIT:
Whoa, my code is working now, thanks so much!
Although, I don't know why this works:
*((*buffer) + i) = (char)c;
So, a char takes up a single byte, correct?
Yes.
If I understand correctly, by default a char* takes up a single byte of memory.
Your wording is somewhat ambiguous. A char takes up a single byte of memory. A char * can point to one char, i.e. one byte of memory, or a char array, i.e. multiple bytes of memory.
The pointer itself takes up more than a single byte. The exact value is implementation-defined, usually 4 bytes (32bit) or 8 bytes (64bit). You can check the exact value with printf( "%zd\n", sizeof char * ).
If we were to have a char* str = "hello", would it be say safe to assume that it takes up 6 bytes of memory (including the null character)?
Yes.
And, if we wanted to allocate memory for some "size" unknown at compile time, then we would need to dynamically allocate memory.
int size = determine_size();
char* str = NULL;
str = (char*)malloc(size * sizeof(char));
Is this syntactically correct so far?
Do not cast the result of malloc. And sizeof char is by definition always 1.
If size was equal to 10, then str* would point to the first address of 10 memory addresses, correct?
Yes. Well, almost. str* makes no sense, and it's 10 chars, not 10 memory addresses. But str would point to the first of the 10 chars, yes.
Now, if we could go one step further.
int size = determine_size();
char* str = NULL;
file_read("filename.txt", size, &str);
This is where my feet start to leave the ground. My interpretation is that file_read() looks something like this:
int file_read(char* filename, int size, char** buffer) {
// Set up FILE stream
// Allocate memory to buffer
buffer = malloc(size * sizeof(char));
No. You would write *buffer = malloc( size );. The idea is that the memory you are allocating inside the function can be addressed by the caller of the function. So the pointer provided by the caller -- str, which is NULL at the point of the call -- needs to be changed. That is why the caller passes the address of str, so you can write the pointer returned by malloc() to that address. After your function returns, the caller's str will no longer be NULL, but contain the address returned by malloc().
buffer is the address of str, passed to the function by value. Allocating to buffer would only change that (local) pointer value.
Allocating to *buffer, on the other hand, is the same as allocating to str. The caller will "see" the change to str after your file_read() returns.
Although, I don't know why this works: *((*buffer) + i) = (char)c;
buffer is the address of str.
*buffer is, basically, the same as str -- a pointer to char (array).
(*buffer) + i) is pointer arithmetic -- the pointer *buffer plus i means a pointer to the ith element of the array.
*((*buffer) + i) is dereferencing that pointer to the ith element -- a single char.
to which you are then assigning (char)c.
A simpler expression doing the same thing would be:
(*buffer)[i] = (char)c;
with char **buffer, buffer stands for the pointer to the pointer to the char, *buffer accesses the pointer to a char, and **buffer accesses the char value itself.
To pass back a pointer to a new array of chars, write *buffer = malloc(size).
To write values into the char array, write *((*buffer) + i) = c, or (probably simpler) (*buffer)[i] = c
See the following snippet demonstrating what's going on:
void generate0to9(char** buffer) {
*buffer = malloc(11); // *buffer dereferences the pointer to the pointer buffer one time, i.e. it writes a (new) pointer value into the address passed in by `buffer`
for (int i=0;i<=9;i++) {
//*((*buffer)+i) = '0' + i;
(*buffer)[i] = '0' + i;
}
(*buffer)[10]='\0';
}
int main(void) {
char *b = NULL;
generate0to9(&b); // pass a pointer to the pointer b, such that the pointer`s value can be changed in the function
printf("b: %s\n", b);
free(b);
return 0;
}
Output:
0123456789
Related
Can someone explain to me why my call to malloc with a string size of 6 returns a sizeof of 4 bytes? In fact, any integer argument I give malloc I get a sizeof of 4. Next, I am trying to copy two strings. Why is my ouput of the copied string (NULL)?
Following is my code:
int main()
{
char * str = "string";
char * copy = malloc(sizeof(str) + 1);
printf("bytes allocated for copy: %d\n", sizeof(copy));
while(*str != '\0'){
*copy = *str;
str++;
copy++;
}
copy = '\0';
printf("%s\n", copy);
}
sizeof(str) returns the size of a pointer of type char*. What you should do is to malloc the size of the string it self:
char * copy = malloc(strlen(str) + 1);
Also, these lines:
while(*str != '\0'){
*copy = *str;
str++;
copy++;
}
copy = '\0';
Can be rewritten easily in C like this:
while(*copy++ = *str++);
First you should understand that sizeof(xxx) where xxx is any left value expression (a variable) is always equivalent to do sizeof(type of xxx). Hence what is really doing your sizeof(str) is returning the size of a char *, that is the size of any other pointer. On a 32 bits architecture you'll get 4, on a 64 bits architecture it'll be 8, etc.
So, as others also explained you have to know the length of the string you want to allocate, and then add 1 to store the terminal \0, C implicitly use to put at the end of strings.
But to do what you want (copy a string and allocate necessary space) it will be more simple and more efficient to use strdup, that does exactly that : a malloc and a strcopy.
You should also not forget to free space you allocated yourself (using malloc, calloc, strdup or any other allocation function). In C it won't go away when allocated variable go out of scope. It will stay used until the end of the program. That's what you call a memory leak.
#include <string.h> /* for strdup, strlen */
#include <stdio.h> /* for printf */
int main()
{
char * str = "string";
char * copy = strdup(str);
printf("bytes at least allocated for copy: %d\n", strlen(copy)+1);
printf("%s\n", copy);
free(copy);
}
One last point : I changed message to bytes at least allocated because you don't really know the size allocated when calling malloc. It quite often allocates a slighly more space that what you asked for. One reason is that in many memory managers free blocks are linked together using some hidden data structure and any allocated block should be able to contain at least such structure, another is that allocated blocks are always aligned in such a way to be compatible with any type alignment.
Hope it will help you to understand C a little better.
You're getting the size of the str pointer (4 bytes), not what it's pointing to?
sizeof(str) returns the space necessary to store the pointer to the string, not the string itself. You can see the size of the string with strlen(str) for example.
Then you affect your copy pointer to an integer which has the value 0 (the character '\0'). It is the same as copy = NULL, which is what the printf() function shows you.
sizeof() returns the size of the actual type of the variable. So, when you define your type as char *, it returns the size of a pointer.
But if you made your variable an array, sizeof would return the size of the array itself, which would do what you want to do:
char *ptr = "moo to you";
char arr[] = "moo to you";
assert(sizeof(ptr) == 4); // assuming 32 bit
assert(sizeof(arr) == 11); // sizeof array includes terminating NUL
assert(strlen(arr) == 10); // strlen does not include terminating NUL
To tackle your second questions, by executing the statement copy++ you have changed the value of copy (that is, the address in memory that holds a char array) so that by the time you print it out, it is pointing at the end of the array rather than the beginning (the value returned by malloc()). You will need an extra variable to update the string and be able to access the beginning of the string:
Edit to repair malloc/sizeof issue - thanks CL.
char * str = "string";
/* char * copy = malloc(sizeof(str) + 1); Oops */
char * copy = malloc(strlen(str) + 1);
char * original_copy = copy;
printf("bytes allocated for copy: %d\n", sizeof(copy));
while(*str != '\0'){
*copy = *str;
str++;
copy++;
}
copy = '\0';
printf("%s\n", original_copy);
sizeof() returns you the size of the pointer and not the amount of allocated bytes. You don't need to count the allocated bytes, just check if the returned pointer is not NULL.
The line copy = '\0'; resets the pointer and makes it NULL.
You can use:
size_t malloc_usable_size (void *ptr);
instead of : sizeof
But it returns the real size of the allocated memory block! Not the size you passed to malloc!
Can someone explain to me why my call to malloc with a string size of 6 returns a sizeof of 4 bytes? In fact, any integer argument I give malloc I get a sizeof of 4. Next, I am trying to copy two strings. Why is my ouput of the copied string (NULL)?
Following is my code:
int main()
{
char * str = "string";
char * copy = malloc(sizeof(str) + 1);
printf("bytes allocated for copy: %d\n", sizeof(copy));
while(*str != '\0'){
*copy = *str;
str++;
copy++;
}
copy = '\0';
printf("%s\n", copy);
}
sizeof(str) returns the size of a pointer of type char*. What you should do is to malloc the size of the string it self:
char * copy = malloc(strlen(str) + 1);
Also, these lines:
while(*str != '\0'){
*copy = *str;
str++;
copy++;
}
copy = '\0';
Can be rewritten easily in C like this:
while(*copy++ = *str++);
First you should understand that sizeof(xxx) where xxx is any left value expression (a variable) is always equivalent to do sizeof(type of xxx). Hence what is really doing your sizeof(str) is returning the size of a char *, that is the size of any other pointer. On a 32 bits architecture you'll get 4, on a 64 bits architecture it'll be 8, etc.
So, as others also explained you have to know the length of the string you want to allocate, and then add 1 to store the terminal \0, C implicitly use to put at the end of strings.
But to do what you want (copy a string and allocate necessary space) it will be more simple and more efficient to use strdup, that does exactly that : a malloc and a strcopy.
You should also not forget to free space you allocated yourself (using malloc, calloc, strdup or any other allocation function). In C it won't go away when allocated variable go out of scope. It will stay used until the end of the program. That's what you call a memory leak.
#include <string.h> /* for strdup, strlen */
#include <stdio.h> /* for printf */
int main()
{
char * str = "string";
char * copy = strdup(str);
printf("bytes at least allocated for copy: %d\n", strlen(copy)+1);
printf("%s\n", copy);
free(copy);
}
One last point : I changed message to bytes at least allocated because you don't really know the size allocated when calling malloc. It quite often allocates a slighly more space that what you asked for. One reason is that in many memory managers free blocks are linked together using some hidden data structure and any allocated block should be able to contain at least such structure, another is that allocated blocks are always aligned in such a way to be compatible with any type alignment.
Hope it will help you to understand C a little better.
You're getting the size of the str pointer (4 bytes), not what it's pointing to?
sizeof(str) returns the space necessary to store the pointer to the string, not the string itself. You can see the size of the string with strlen(str) for example.
Then you affect your copy pointer to an integer which has the value 0 (the character '\0'). It is the same as copy = NULL, which is what the printf() function shows you.
sizeof() returns the size of the actual type of the variable. So, when you define your type as char *, it returns the size of a pointer.
But if you made your variable an array, sizeof would return the size of the array itself, which would do what you want to do:
char *ptr = "moo to you";
char arr[] = "moo to you";
assert(sizeof(ptr) == 4); // assuming 32 bit
assert(sizeof(arr) == 11); // sizeof array includes terminating NUL
assert(strlen(arr) == 10); // strlen does not include terminating NUL
To tackle your second questions, by executing the statement copy++ you have changed the value of copy (that is, the address in memory that holds a char array) so that by the time you print it out, it is pointing at the end of the array rather than the beginning (the value returned by malloc()). You will need an extra variable to update the string and be able to access the beginning of the string:
Edit to repair malloc/sizeof issue - thanks CL.
char * str = "string";
/* char * copy = malloc(sizeof(str) + 1); Oops */
char * copy = malloc(strlen(str) + 1);
char * original_copy = copy;
printf("bytes allocated for copy: %d\n", sizeof(copy));
while(*str != '\0'){
*copy = *str;
str++;
copy++;
}
copy = '\0';
printf("%s\n", original_copy);
sizeof() returns you the size of the pointer and not the amount of allocated bytes. You don't need to count the allocated bytes, just check if the returned pointer is not NULL.
The line copy = '\0'; resets the pointer and makes it NULL.
You can use:
size_t malloc_usable_size (void *ptr);
instead of : sizeof
But it returns the real size of the allocated memory block! Not the size you passed to malloc!
Can someone explain to me why my call to malloc with a string size of 6 returns a sizeof of 4 bytes? In fact, any integer argument I give malloc I get a sizeof of 4. Next, I am trying to copy two strings. Why is my ouput of the copied string (NULL)?
Following is my code:
int main()
{
char * str = "string";
char * copy = malloc(sizeof(str) + 1);
printf("bytes allocated for copy: %d\n", sizeof(copy));
while(*str != '\0'){
*copy = *str;
str++;
copy++;
}
copy = '\0';
printf("%s\n", copy);
}
sizeof(str) returns the size of a pointer of type char*. What you should do is to malloc the size of the string it self:
char * copy = malloc(strlen(str) + 1);
Also, these lines:
while(*str != '\0'){
*copy = *str;
str++;
copy++;
}
copy = '\0';
Can be rewritten easily in C like this:
while(*copy++ = *str++);
First you should understand that sizeof(xxx) where xxx is any left value expression (a variable) is always equivalent to do sizeof(type of xxx). Hence what is really doing your sizeof(str) is returning the size of a char *, that is the size of any other pointer. On a 32 bits architecture you'll get 4, on a 64 bits architecture it'll be 8, etc.
So, as others also explained you have to know the length of the string you want to allocate, and then add 1 to store the terminal \0, C implicitly use to put at the end of strings.
But to do what you want (copy a string and allocate necessary space) it will be more simple and more efficient to use strdup, that does exactly that : a malloc and a strcopy.
You should also not forget to free space you allocated yourself (using malloc, calloc, strdup or any other allocation function). In C it won't go away when allocated variable go out of scope. It will stay used until the end of the program. That's what you call a memory leak.
#include <string.h> /* for strdup, strlen */
#include <stdio.h> /* for printf */
int main()
{
char * str = "string";
char * copy = strdup(str);
printf("bytes at least allocated for copy: %d\n", strlen(copy)+1);
printf("%s\n", copy);
free(copy);
}
One last point : I changed message to bytes at least allocated because you don't really know the size allocated when calling malloc. It quite often allocates a slighly more space that what you asked for. One reason is that in many memory managers free blocks are linked together using some hidden data structure and any allocated block should be able to contain at least such structure, another is that allocated blocks are always aligned in such a way to be compatible with any type alignment.
Hope it will help you to understand C a little better.
You're getting the size of the str pointer (4 bytes), not what it's pointing to?
sizeof(str) returns the space necessary to store the pointer to the string, not the string itself. You can see the size of the string with strlen(str) for example.
Then you affect your copy pointer to an integer which has the value 0 (the character '\0'). It is the same as copy = NULL, which is what the printf() function shows you.
sizeof() returns the size of the actual type of the variable. So, when you define your type as char *, it returns the size of a pointer.
But if you made your variable an array, sizeof would return the size of the array itself, which would do what you want to do:
char *ptr = "moo to you";
char arr[] = "moo to you";
assert(sizeof(ptr) == 4); // assuming 32 bit
assert(sizeof(arr) == 11); // sizeof array includes terminating NUL
assert(strlen(arr) == 10); // strlen does not include terminating NUL
To tackle your second questions, by executing the statement copy++ you have changed the value of copy (that is, the address in memory that holds a char array) so that by the time you print it out, it is pointing at the end of the array rather than the beginning (the value returned by malloc()). You will need an extra variable to update the string and be able to access the beginning of the string:
Edit to repair malloc/sizeof issue - thanks CL.
char * str = "string";
/* char * copy = malloc(sizeof(str) + 1); Oops */
char * copy = malloc(strlen(str) + 1);
char * original_copy = copy;
printf("bytes allocated for copy: %d\n", sizeof(copy));
while(*str != '\0'){
*copy = *str;
str++;
copy++;
}
copy = '\0';
printf("%s\n", original_copy);
sizeof() returns you the size of the pointer and not the amount of allocated bytes. You don't need to count the allocated bytes, just check if the returned pointer is not NULL.
The line copy = '\0'; resets the pointer and makes it NULL.
You can use:
size_t malloc_usable_size (void *ptr);
instead of : sizeof
But it returns the real size of the allocated memory block! Not the size you passed to malloc!
Fairly simple question regarding malloc. What is the max that I can set within the allocated area. For instance:
char *buffer;
buffer = malloc(20);
buffer[19] = 'a'; //Is this the highest spot I can set?
buffer[20] = 'a'; //Or is this the highest spot I can set?
free(buffer);
The phrasing of your question is a bit off. You mean "what is the maximum index I can use for an allocated block of memory". The answer is the same as for arrays.
If you are reading or writing the memory, you may safely use indices between (and including) 0 and one less than the size of the block (in your case, that means index 19). All up, that means you can access the 20 values that you asked for.
If you are simply obtaining the pointer for comparison with other pointers inside the same block (and you are not going to read or write to it), you may additionally obtain the pointer one-past-the-end (in your case that means index 20).
To clarify these things with examples:
Yes, buffer[19] = 'a'; is the very last value you may access in a read or write capacity. Don't forget that if you want to store a string in this memory, and hand it to functions that expect a null-terminated string, this slot is your last chance to put that value of '\0'.
You are allowed to access buffer[20] in the following manner:
char *p;
for( p = &buffer[0]; p != &buffer[20]; ++p )
{
putc( *p, stdout );
}
This is useful because of the way we tend to iterate over memory and store sizes. It would make our code quite less readable if we had to subtract 1 all over the place.
Oh, and it gives you the neat trick:
size_t buf_size = 20;
char *buffer = malloc(buf_size);
char *start = buffer;
char *end = buffer + buf_size;
size_t oops_i_forgot_the_size = end - start;
malloc(x) will allocate x bytes.
So by accessing buffer[0] you access the first byte, by accessing buffer[1] you access the second.
e.g
char * buffer = (char *) malloc(1);
buffer[0] = 0; // legal
buffer[1] = 0; // illegal
char *t = malloc(2);
t = "as";
t = realloc(t,sizeof(char)*6);
I am getting error "invalid pointer: 0x080488d4 *"..
I am getting strange errors in using memory allocation functions. Is there any good tuts/guides which could explain me memory allocation functions.
I am using linux..
Please help..
This is your problem:
char *t = malloc(2);
t = "as";
You probably thought this would copy the two-character string "as" into the buffer you just allocated. What it actually does is throw away (leak) the buffer, and change the pointer to instead point to the string constant "as", which is stored in read-only memory next to the machine code, not on the malloc heap. Because it's not on the heap, realloc looks at the pointer and says "no can do, that's not one of mine". (The computer is being nice to you by giving you this error; when you give realloc a pointer that wasn't returned by malloc or realloc, the computer is allowed to make demons fly out of your nose if it wants.)
This is how to do what you meant to do:
char *t = malloc(3);
strcpy(t, "as");
Note that you need space for three characters, not two, because of the implicit NUL terminator.
By the way, you never need to multiply anything by sizeof(char); it is 1 by definition.
That is not how you assign strings in C.
The correct syntax is:
char* t = malloc(3); // Reserve enough space for the null-terminator \0
strncpy(t, "as", 3);
// Copy up to 3 bytes from static string "as" to char* t.
// By specifying a maximum of 3 bytes, prevent buffer-overruns
Allocating 2-bytes is NOT enough for "as".
C-strings have a 1-byte null-terminator, so you need at least 3 bytes to hold "as\0".
(\0 represents the null-terminator)
The code you wrote: t = "as"; makes the pointer t "abandon" the formerly allocated memory, and instead point to the static string "as". The memory allocated with malloc is "leaked" and cannot be recovered (until the program terminates and the OS reclaims it).
After this, you can call realloc as you originally did.
However, you should not do t = realloc(t,6);. If realloc fails for any reason, you've lost your memory.
The preferred method is:
new_t = realloc(t, 6);
if (new_t != NULL) // realloc succeeded
{ t = new_t;
}
else
{ // Error in reallocating, but at least t still points to good memory!
}
Your code reassigns t, making it point elsewhere
char *t = malloc(2); //t=0xf00ba12
t = "as"; //t=0xbeefbeef
t = realloc(t,sizeof(char)*6); //confused because t is 0xbeefbeef, not 0xf00b412.
Instead use strcpy
char *t = malloc(3); //don't forget about the '\0'
strcpy(t, "as");
t = realloc(t, 6); //now the string has room to breathe
First off, don't do that:
char *t = malloc(2);
Do this instead:
char *t = malloc(2 * sizeof(char));
/* or this: */
char *t = calloc(2, sizeof(char));
It may not seem worth the effort, but otherwise you may run into problems later when you deal with types larger than 1 byte.
In this line:
t = "as";
You're assigning the address of the string literal "as", so your pointer no longer points to the memory you allocated. You need to copy the contents of the literal to your allocated memory:
char *t = calloc(3, sizeof(char));
/* "ar" is 3 char's: 'a', 'r' and the terminating 0 byte. */
strncpy(t, "ar", 3);
/* then later: */
t = realloc(t,sizeof(char)*6);
You can also just use strdup, which is safer:
#include <string.h>
char *t = strdup("ar");
t = realloc(t,sizeof(char)*6);
And don't forget to free the memory
free(t);
char *t = malloc(2);
this means you have created a pointer to a memory location that can hold 2 bytes
+-+-+
t -> | | |
+-+-+
when you do
t = "as";
now you made t point to somewhere else than what it originally was pointing to. now it no longer points to the heap
t = realloc(t,sizeof(char)*6);
now you are taking the pointer pointing to read only memory and try to realloc it.
when you use malloc you allocate space on the heap. t in this case is a pointer to that location, an address of where the block is.
in order to put something in that spot you need to copy the data there by dereferencing t, this is done by writing * in front of t:
*t = 'a'; // now 'a' is where t points
*(t+1)='s'; // now 's' is behind a, t still pointing to 'a'
however in C, a string is always terminated with a 0 (ASCII value) written as '\0' so in order to make it a string you need to append a \0
+-+-+--+
t -> |a|s|\0|
+-+-+--+
in order to do this you need to malloc 3 bytes instead, than you can add the \0 by writing *(t+2)='\0';
now t can be treated as pointing to a string and used in functions that takes strings as arguments e.g. strlen( t ) returns 2