Understanding pointers in a structure and malloc

I am just learning C (reading Sams Teach Yourself C in 24 Hours). I've gotten through pointers and memory allocation, but now I'm wondering about them inside a structure.
I wrote the little program below to play around, but I'm not sure if it is OK or not. It compiled on a Linux system with gcc and the -Wall flag with nothing amiss, but I'm not sure that is 100% trustworthy.
Is it ok to change the allocation size of a pointer as I have done below or am I possibly stepping on adjacent memory? I did a little before/after variable in the structure to try to check this, but don't know if that works and if structure elements are stored contiguously in memory (I'm guessing so since a pointer to a structure can be passed to a function and the structure manipulated via the pointer location). Also, how can I access the contents of the pointer location and iterate through it so I can make sure nothing got overwritten if it is contiguous? I guess one thing I'm asking is how can I debug messing with memory this way to know it isn't breaking anything?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
struct hello {
char *before;
char *message;
char *after;
};
int main (){
struct hello there= {
"Before",
"Hello",
"After",
};
printf("%ld\n", strlen(there.message));
printf("%s\n", there.message);
printf("%d\n", sizeof(there));
there.message = malloc(20 * sizeof(char));
there.message = "Hello, there!";
printf("%ld\n", strlen(there.message));
printf("%s\n", there.message);
printf("%s %s\n", there.before, there.after);
printf("%d\n", sizeof(there));
return 0;
}
I'm thinking something is not right because the size of my there didn't change.
Kind regards,

Not really OK: you have a memory leak. You could use valgrind to detect it at runtime (on Linux).
You are coding:
there.message = malloc(20 * sizeof(char));
there.message = "Hello, there!";
The first assignment calls malloc(3). When calling malloc you should always test whether it failed, even though it usually succeeds. So better code would be at least:
there.message = malloc(20 * sizeof(char));
if (!there.message)
{ perror("malloc of 20 failed"); exit (EXIT_FAILURE); }
The second assignment puts the address of the constant string literal "Hello, there!" into the same pointer there.message, so you have lost the first value (the address malloc returned). You probably want to copy that constant string instead:
strncpy (there.message, "Hello, there!", 20*sizeof(char));
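(You could use plain strcpy(3), but beware of buffer overflows. Note that strncpy does not write a terminating '\0' when the source is at least as long as the size you pass.)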
You could also get a fresh heap copy of a string using strdup(3) (GNU libc also has asprintf(3)):
there.message = strdup("Hello, There");
if (!there.message)
{ perror("strdup failed"); exit (EXIT_FAILURE); }
Finally, it is good practice to free heap memory before the program ends.
(The operating system reclaims the process's address space at _exit(2) time anyway.)
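Putting the pieces above together, here is a minimal sketch of how the struct from the question could get a heap copy of a new message. The helper name hello_set_message is hypothetical (it is not in the original program), and it assumes the caller frees the result later.

```c
#include <stdlib.h>
#include <string.h>

struct hello {
    char *before;
    char *message;
    char *after;
};

/* Hypothetical helper: points h->message at a fresh heap copy of text.
   Returns 0 on success, -1 if malloc fails (h is then left unchanged).
   It does not free the old pointer, because it cannot know whether the
   old value came from malloc or from a string literal. */
int hello_set_message(struct hello *h, const char *text)
{
    size_t len = strlen(text) + 1;   /* +1 for the terminating '\0' */
    char *copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, text, len);         /* copy the bytes instead of repointing */
    h->message = copy;
    return 0;
}
```

A caller would do hello_set_message(&there, "Hello, there!") and later free(there.message). Note that sizeof(there) stays the same throughout: it measures the three pointers inside the struct, not the memory they point to.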
Read more about C programming, memory management, and garbage collection. Perhaps consider using Boehm's conservative GC.
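A C pointer is just the address of a memory zone. It carries no size information, so the application needs to keep track of the size of each zone itself. This is also why sizeof(there) did not change: it measures the three pointers in the struct, not the memory they point to.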
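One common way to "know the size" is to store it next to the pointer. The following is only a sketch under that assumption; the type sized_buf and both function names are made up for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical type: a heap buffer that remembers its own capacity. */
struct sized_buf {
    char  *data;
    size_t cap;   /* number of bytes allocated for data */
};

/* Allocates cap zeroed bytes. Returns 0 on success, -1 on failure. */
int sized_buf_init(struct sized_buf *b, size_t cap)
{
    b->data = calloc(cap, 1);
    b->cap  = (b->data != NULL) ? cap : 0;
    return (b->data != NULL) ? 0 : -1;
}

/* Copies text into the buffer, truncating if it does not fit.
   The result is always '\0'-terminated.
   Returns the number of characters stored, not counting the '\0'. */
size_t sized_buf_set(struct sized_buf *b, const char *text)
{
    if (b->cap == 0)
        return 0;
    size_t n = strlen(text);
    if (n > b->cap - 1)
        n = b->cap - 1;
    memcpy(b->data, text, n);
    b->data[n] = '\0';
    return n;
}
```

Because the capacity travels with the pointer, the copy can never run past the allocation, which is exactly the "stepping on adjacent memory" worry from the question. The caller still releases the memory with free(b.data).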
PS. manual memory management in C is tricky, even for seasoned veteran programmers.

there.message = "Hello, there!" does not copy the string into the buffer. It sets the pointer to a new (generally static) buffer holding the string "Hello, there!". Thus, the code as written has a memory leak (allocated memory that never gets freed until the program exits).
But, yes, the malloc is fine in its own right. You'd generally use a strncpy, sprintf, or similar function to copy content into the buffer thus allocated.

Is it ok to change the allocation size of a pointer [...] ?
Huh? What do you mean by "changing the allocation size of a pointer"? Currently all your code does is leak the 20 bytes you malloc()ed by assigning a different address to the pointer.

Related

Using realloc to reduce the size of a memory block

A little more than 20 years ago I had some grasp of writing something small in C, but even at that time, I probably didn't really do things right all the time. Now I'm trying to learn C again, so I'm really a newbie.
Based on the article "Using realloc to shrink the allocated memory", I made this test, which works, but troubles me:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int test (char *param) {
char *s = malloc(strlen(param));
strcpy(s, param);
printf("original string : [%4d] %s \n", strlen(s), s);
// reduce size
char *tmp = realloc(s, 5);
if (tmp == NULL) {
printf("Failed\n");
free(s);
exit(1);
} else {
tmp[4] = 0;
}
s = tmp;
printf("the reduced string : [%4d] %s\n", strlen(s), s );
free(s);
}
void main(void){
test("This is a string with a certain length!");
}
If I leave out "tmp[4] = 0", then I still get back the whole string. Does this mean the rest of the string is still in memory, but not allocated anymore?
how does c free memory anyway? Does it keep track of memory by itself or is it something that is handled by the OS?
I free the s string "free(s)", do I also need to free the tmp str (it does point to the same memory block, yet the (same) address it holds is probably stored on another memory location?
These are most likely just basics, but none of what I have read so far has given me a clear answer (including mentioned article).

If I leave out "tmp[4] = 0", then I still get back the whole string.
You've invoked undefined behavior. All the string operations require the argument to be a null-terminated array of characters. If you reduce the size of the allocation so it doesn't include the null terminator, you're accessing outside the allocation when it tries to find it.
Does this mean the rest of the string is still in memory, but not allocated anymore?
In practice, many implementations don't actually re-allocate anything when you shrink the size. They simply update the bookkeeping information to say that the allocated length is shorter, and return the original pointer. So the remainder of the string stays the same unless you do another allocation that happens to use that memory.
This can even happen when you grow the size. Some designs always allocate memory in specific granularities (e.g. powers of 2), so if you grow the allocation but it doesn't exceed the granularity, it doesn't need to copy the data.
how does c free memory anyway? Does it keep track of memory by itself or is it something that is handled by the OS?
Heap management is part of the C runtime library. It can use a variety of strategies.
I free the s string "free(s)", do I also need to free the tmp str (it does point to the same memory block, yet the (same) address it holds is probably stored on another memory location?
After s = tmp;, both s and tmp point to the same allocated memory block. You only need to free one of them.
BTW, the initial allocation should be:
char *s = malloc(strlen(param)+1);
You need to add 1 for the null terminator, since strlen() doesn't count this.
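As a sketch of the points above (room for the terminator, terminating before shrinking, and a single pointer to free), the test could be restructured like this. The function name copy_truncated is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of src, cut down to at most
   max_len characters plus the terminator. Returns NULL if the initial
   malloc fails. The caller frees the result exactly once. */
char *copy_truncated(const char *src, size_t max_len)
{
    size_t len = strlen(src);
    char *s = malloc(len + 1);            /* +1 for the '\0' */
    if (s == NULL)
        return NULL;
    memcpy(s, src, len + 1);

    if (len > max_len) {
        s[max_len] = '\0';                /* terminate while the byte is still inside the block */
        char *tmp = realloc(s, max_len + 1);
        if (tmp != NULL)
            s = tmp;                      /* if realloc fails, the old larger block stays valid */
    }
    return s;
}
```

Here the string is terminated inside the part of the block that survives the shrink, so no string function ever reads past the allocation. Only the returned pointer needs to be passed to free.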

Is there a way to reliably malloc the same block of memory as a previously freed block, then access the content that was previously in it?

I have the following C program which requests some memory (str1), reads the content of a file into that space then frees it. Next, a block of the same size (str2) is requested, and the content is printed to stdout.
What I want is for str2 to contain the content of str1 so that the output is always the content of the file.
I am aware that what I am doing is undefined behaviour, in that I can't guarantee what the content of memory that has been allocated will contain. However, I'm trying to do some underhanded stuff for a demonstration where data from a file can be exfiltrated without it being obvious in a code review.
Almost all the time, I receive a block of memory at the same address for both str1 and str2, and most of the time when I run the program on macOS and Windows, the content of the file is printed. It seems to never happen on Linux (on Linux, calling free() seems to zero out the memory block).
Is there a way of making this more reliable on Windows and macOS, and is there any explanation for why it doesn't work at all on Linux?
My code is:
#include <stdlib.h>
#include <stdio.h>
int main() {
FILE *file = fopen("data.txt", "r");
char *str1 = malloc(4096*sizeof(char));
fread(str1, 1, 4096, file);
free(str1);
char *str2 = malloc(4096);
printf("Content: %s\n", str2);
free(str2);
}

Essentially, what happens when you allocate and free is a black box to you. There is absolutely no reliable way to get the same address. Calling free means that you tell the OS that you're done with the memory, and there's no undo for this.
What I want is for str2 to contain the content of str1 so that the output is always the content of the file.
You basically have three options here.
Wait with the call to free
Copy the buffer before you call free
Write your very own implementation of malloc and free
From comments:
imagine that I'm going to allocate some memory then read something sensitive (e.g. a private key) into that block and do something with it. Later, I allocate some memory of the same size and stick some data into it that will be saved to a file. If I don't overwrite all the data in the block then it may contain some sensitive info that would get saved to the file. In that case, it may not be obvious from a code review that some sensitive data exfiltration is possible. I want to demonstrate that sensitive data can be exfiltrated in a non-obvious manner.
Nice thing, but these kinds of exploits almost always rely on undefined behavior. As you say yourself, it's a security concern. So there's really no point in providing a reliable way to do this.
Here is a snippet that worked for me on Fedora Linux.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main(void) {
char *s = malloc(100);
const char str[] = "Hello, World. Prepare to meet your doom.";
strcpy(s, str);
free(s);
for(int i=0; i<strlen(str); i++)
putchar(s[i]);
puts("");
}
My output:
$ ./a.out
��epare to meet your doom.
As you can see, I got parts of the data, but not all. And to demonstrate the undefined behavior of this, here is output with different optimizations:
$ gcc k.c -O1
$ ./a.out
0Separe to meet your doom.
$ gcc k.c -O2
$ ./a.out
#�
$ gcc k.c -O3
$ ./a.out
�
Your method is very unreliable for this, because printf with %s prints only until the first 0 byte in the string. Here is code that outputs nothing, which can fool you into thinking that the data has been wiped. That's why I used putchar in a loop above.
char str[] = "Hello, World";
str[0] = '\0';
printf("%s", str); // Will print nothing, but only first character is wiped
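To inspect a buffer without being fooled by an early 0 byte, one option is to walk it by length rather than as a string. This is a minimal sketch with hypothetical helper names; it is meant for memory you still own, and it does not make reading freed memory any more defined.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: counts the bytes in buf[0..n) that are not zero. */
size_t count_nonzero(const void *buf, size_t n)
{
    const unsigned char *p = buf;
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (p[i] != 0)
            count++;
    return count;
}

/* Hypothetical helper: prints every byte of buf[0..n) as two hex digits,
   separated by spaces, followed by a newline. */
void dump_hex(const void *buf, size_t n)
{
    const unsigned char *p = buf;
    for (size_t i = 0; i < n; i++)
        printf("%02x%c", (unsigned)p[i], (i + 1 == n) ? '\n' : ' ');
}
```

For the str example above, printf("%s", str) prints nothing, while count_nonzero(str, sizeof str) still reports 11 surviving characters.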

Is there a way to reliably malloc the same block of memory as a previously freed block
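Yes, use realloc instead of free + malloc: it preserves the contents up to the smaller of the old and new sizes, although the block may move to a different address. Otherwise there's no reliable or safe way to get the exact amount at the same address.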
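A small sketch of the realloc route, assuming the goal is to keep the contents rather than the exact address (realloc is allowed to move the block). The helper name resize_keep is made up for illustration.

```c
#include <stdlib.h>

/* Hypothetical helper: resizes *buf to new_size bytes while keeping the
   old contents (up to the smaller of the old and new sizes).
   Returns 0 on success. Returns -1 on failure, in which case *buf is
   untouched and still valid. */
int resize_keep(char **buf, size_t new_size)
{
    if (new_size == 0)
        return -1;                 /* avoid realloc(ptr, 0), whose behavior varies */
    char *tmp = realloc(*buf, new_size);
    if (tmp == NULL)
        return -1;
    *buf = tmp;                    /* may or may not be the same address as before */
    return 0;
}
```

Assigning the result to a temporary first matters: writing buf = realloc(buf, n) directly would lose the only pointer to the old block if realloc fails.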
I receive a block of memory at the same address for both str1 and str2
Well, there's not much else going on in this simple program, so perhaps it's no wonder. There are no guarantees, though. Also, unless you actually write to the heap memory, the allocation might not actually happen. So str2 could just be some random address in case the whole malloc call gets optimized away. Or alternatively, malloc is called but the OS never backs it with actual memory.
is there any explanation for why it doesn't work at all on Linux?
I don't know, but I suspect ASLR might have something to do with it. Some Linux guru will have to answer that part.

malloc function crash

I have a problem with memory allocation using malloc.
Here is a fragment from my code:
printf("DEBUG %d\n",L);
char *s=(char*)malloc(L+2);
if(s==0)
{
printf("DEBUGO1");
}
printf("DEBUGO2\n");
It outputs "DEBUG 3", and then an error msgbox appears with this message:
The instruction at 0x7c9369aa referenced memory at "0x0000000". The
memory could not be read
For me such behavior is very strange.
What can be wrong here?
The application is single threaded.
I'm using mingw C compiler that is built in code::blocks 10.05
I can provide all the code if it is needed.
Thanks.
UPD1:
There is more code:
char *concat3(char *str1,char *str2,char *str3)
{
/*concatenate three strings and frees the memory allocated for substrings before*/
/* returns a pointer to the new string*/
int L=strlen(str1)+strlen(str2)+strlen(str3);
printf("DEBUG %d\n",L);
char *s=(char*)malloc(L+2);
if(s==0)
{
printf("DEBUGO1");
}
printf("DEBUGO2\n");
sprintf(s,"%s%s%s",str1,str2,str3);
free(str1);
free(str2);
free(str3);
return s;
}
UPD2:
It seems the problem is more complicated than I thought. In case somebody has enough time to help me out:
Here is all the code:
Proj
(it is a Code::Blocks 10.05 project, but you may compile the sources without an IDE; it is pure C without any libraries).
Call the program as
"cbproj.exe s.pl" (the s.pl file is in the root of the archive)
and you will see it crash when it calls the function "malloc" on the 113th line of "parser.tab.c" (where the function concat3 is written).
I am doing the project for educational purposes; you may use the source code without any restrictions.
UPD3:
The problem was that not enough memory was allocated for one of the strings in the program, but it seemed to work until the next malloc. Oh, I hate C now :)
I agree with the comments about bad coding style; I need to improve in this.

The problem with this exact code is that when malloc fails, you don't return from the function but use this NULL-pointer further in sprintf call as a buffer.
I'd also suggest you to free memory allocated for str1, str2 and str3 outside this function, or else you might put yourself into trouble somewhere else.
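A minimal sketch of concat3 with both suggestions applied: it returns early when malloc fails, and it leaves freeing the inputs to the caller. Taking const char * parameters is an assumption made here to signal that the inputs are not modified or freed.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Concatenates three strings into a new heap buffer.
   Returns NULL if allocation fails. The caller frees the result and
   remains responsible for str1, str2 and str3. */
char *concat3(const char *str1, const char *str2, const char *str3)
{
    size_t len = strlen(str1) + strlen(str2) + strlen(str3);
    char *s = malloc(len + 1);               /* +1 for the '\0' */
    if (s == NULL)
        return NULL;                         /* never pass NULL on to snprintf */
    snprintf(s, len + 1, "%s%s%s", str1, str2, str3);
    return s;
}
```

With this version the function also works on string literals and stack buffers, which the original would have passed to free.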
EDIT: after running your program under valgrind, two real problems were revealed (in parser.tab.c):
In yyuserAction,
char *applR=(char*)malloc(strlen(ruleName)+7);
sprintf(applR,"appl(%s).",ruleName);
+7 is insufficient: "appl(" and ")." already take 7 characters, and you also need space for the \0 char at the end of the string. Making it +8 helped.
In SplitList,
char *curstr=(char*)malloc(leng);
there's a possibility of allocating zero bytes. leng + 1 helps.
After aforementioned changes, everything runs fine (if one could say so, since I'm not going to count memory leaks).
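Hand-counted sizes like +7 are easy to get wrong. One way to avoid them, sketched here with a hypothetical function name, is to let snprintf compute the length first: called with a NULL buffer and size 0, C99 snprintf returns the number of characters the output would need.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: builds "appl(<rule_name>)." in a heap buffer of
   exactly the right size. Returns NULL on failure; caller frees. */
char *make_appl(const char *rule_name)
{
    /* First pass: measure only. The return value excludes the '\0'. */
    int n = snprintf(NULL, 0, "appl(%s).", rule_name);
    if (n < 0)
        return NULL;

    char *out = malloc((size_t)n + 1);       /* +1 for the '\0' */
    if (out == NULL)
        return NULL;

    /* Second pass: actually write the string. */
    snprintf(out, (size_t)n + 1, "appl(%s).", rule_name);
    return out;
}
```

This assumes a C99-conforming snprintf; some older C runtimes on Windows behaved differently, so it is worth checking with the toolchain in use.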

From the error message it actually looks like your if statement is not quite what you have posted here. It suggests that your if statement might be something like this:
if(s=0) {
}
Note the single = (assignment) instead of == (equality).

You cannot use free on pointers that were not created by malloc, calloc or realloc. From the Manpage:
free() frees the memory space pointed to by ptr, which must have been returned by a previous call to malloc(), calloc() or realloc(). Otherwise, or if free(ptr) has already been called before, undefined behavior occurs. If ptr is NULL, no operation is performed.

How to know if C function free is working?

I have seen some differences in the result of the following code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
char* ptr;
ptr = (char*)malloc(sizeof(char) * 12);
strcpy(ptr, "Hello World");
printf("%s\n", ptr);
printf("FREEING ?\n");
free(ptr);
printf("%s\n", ptr);
}
Let me explain:
In the third call to printf I get different results depending on the OS: garbage characters on Windows, nothing on Linux, and on a Unix system "Hello World" is printed.
Is there a way to check the status of the pointer to know when memory has been freed?
I think this printing mechanism cannot be trusted all the time.
Thanks.
Greetings.

Using a pointer after it has been freed results in undefined behavior.
That means the program may print garbage, print the former contents of the string, erase your hard drive, or make monkeys fly out of your bottom, and still be in compliance with the C standard.
There's no way to look at a pointer and "know when memory has been freed". It's up to you as a programmer to keep track.

If you call free(), the memory will have been freed when that function returns. That doesn't mean that the memory will be overwritten immediately, but it's nevertheless unsafe to use any pointer on which you've called free().
It's a good idea to always assign NULL to a pointer variable once you've freed it. That way, you know that non-NULL pointers are safe to use.
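The free-then-clear habit can be wrapped in a small macro. This is only a sketch, and the macro name FREE_AND_NULL is made up; it clears just the one variable you pass, so any other copies of the pointer still dangle.

```c
#include <stdlib.h>

/* Hypothetical macro: frees p and then sets it to NULL.
   Its argument is evaluated twice, so pass a plain variable. */
#define FREE_AND_NULL(p) \
    do {                 \
        free(p);         \
        (p) = NULL;      \
    } while (0)
```

After FREE_AND_NULL(ptr), a check such as if (ptr != NULL) tells you whether this variable still owns memory, and calling the macro a second time is harmless because free(NULL) does nothing.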

Simple answer: you can't check whether a pointer has already been freed in C. The different behaviors are probably due to different compilers and C libraries; since using a pointer after freeing it is undefined, you can get all sorts of behavior (including a segfault and program termination).
If you want to check that you use free properly and that your program is free of memory leaks, use a tool like Valgrind.

When should I use malloc in C and when don't I?

I understand how malloc() works. My question is, I'll see things like this:
#define A_MEGABYTE (1024 * 1024)
char *some_memory;
size_t size_to_allocate = A_MEGABYTE;
some_memory = (char *)malloc(size_to_allocate);
sprintf(some_memory, "Hello World");
printf("%s\n", some_memory);
free(some_memory);
I omitted error checking for the sake of brevity. My question is, can't you just do the above by initializing a pointer to some static storage in memory? perhaps:
char *some_memory = "Hello World";
At what point do you actually need to allocate the memory yourself instead of declaring/initializing the values you need to retain?

char *some_memory = "Hello World";
is creating a pointer to a string constant. That means the string "Hello World" will be somewhere in the read-only part of the memory and you just have a pointer to it. You can use the string as read-only. You cannot make changes to it. Example:
some_memory[0] = 'h';
Is asking for trouble.
On the other hand
some_memory = (char *)malloc(size_to_allocate);
is allocating a char array ( a variable) and some_memory points to that allocated memory. Now this array is both read and write. You can now do:
some_memory[0] = 'h';
and the array contents change to "hello World"

For that exact example, malloc is of little use.
The primary reason malloc is needed is when you have data that must have a lifetime that is different from code scope. Your code calls malloc in one routine, stores the pointer somewhere and eventually calls free in a different routine.
A secondary reason is that C has no way of knowing whether there is enough space left on the stack for an allocation. If your code needs to be 100% robust, it is safer to use malloc because then your code can know the allocation failed and handle it.

malloc is a wonderful tool for allocating, reallocating and freeing memory at runtime, compared to static declarations like your hello world example, which are processed at compile-time and thus cannot be changed in size.
malloc is therefore always useful when you deal with arbitrarily sized data, such as file contents or socket input, where you don't know the length of the data in advance.
Of course, in a trivial example like the one you gave, malloc is not the magical "right tool for the right job", but for more complex cases ( creating an arbitrary sized array at runtime for example ), it is the only way to go.

If you don't know the exact size of the memory you need to use, you need dynamic allocation (malloc). An example might be when a user opens a file in your application. You will need to read the file's contents into memory, but of course you don't know the file's size in advance, since the user selects the file on the spot, at runtime. So basically you need malloc when you don't know the size of the data you're working with in advance. At least that's one of the main reasons for using malloc. In your example with a simple string that you already know the size of at compile time (plus you don't want to modify it), it doesn't make much sense to dynamically allocate that.
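The file-reading case above might look like the following sketch, which grows the buffer with realloc as data arrives. The function name read_all, the starting capacity of 64 and the doubling strategy are all illustrative choices, not anything prescribed by the answer.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads fp until EOF into a growing heap buffer.
   Returns a '\0'-terminated buffer that the caller must free, or NULL
   on allocation or read error. If out_len is not NULL it receives the
   number of bytes read. */
char *read_all(FILE *fp, size_t *out_len)
{
    size_t cap = 64, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    for (;;) {
        /* Always keep one byte spare for the terminator. */
        len += fread(buf + len, 1, cap - 1 - len, fp);
        if (len < cap - 1)
            break;                        /* short read: EOF or error */

        char *tmp = realloc(buf, cap * 2);
        if (tmp == NULL) {
            free(buf);                    /* the old block is still ours to free */
            return NULL;
        }
        buf = tmp;
        cap *= 2;
    }

    if (ferror(fp)) {
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    if (out_len != NULL)
        *out_len = len;
    return buf;
}
```

A fixed-size array could not handle this, because the amount of data is only known once the file has been read.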
Slightly off-topic, but... you have to be very careful not to create memory leaks when using malloc. Consider this code:
int do_something() {
uint8_t* someMemory = (uint8_t*)malloc(1024);
// Do some stuff
if ( /* some error occurred */ ) return -1;
// Do some other stuff
free(someMemory);
return result;
}
Do you see what's wrong with this code? There's a conditional return statement between malloc and free. It might seem okay at first, but think about it. If there's an error, you're going to return without freeing the memory you allocated. This is a common source of memory leaks.
Of course this is a very simple example, and it's very easy to see the mistake here, but imagine hundreds of lines of code littered with pointers, mallocs, frees, and all kinds of error handling. Things can get really messy really fast. This is one of the reasons I much prefer modern C++ over C in applicable cases, but that's a whole nother topic.
So whenever you use malloc, always make sure that every path out of the function frees the memory (or hands the pointer to code that will).
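One common C idiom for this is a single cleanup label that every exit path goes through. The sketch below reworks do_something that way; the fail_midway parameter is a hypothetical stand-in for "some error occurred", and the stored value 42 is just placeholder work.

```c
#include <stdint.h>
#include <stdlib.h>

/* fail_midway is a hypothetical stand-in for "some error occurred". */
int do_something(int fail_midway)
{
    int result = -1;                     /* pessimistic default */
    uint8_t *some_memory = malloc(1024);
    if (some_memory == NULL)
        return -1;                       /* nothing allocated yet, nothing to free */

    some_memory[0] = 42;                 /* do some stuff */

    if (fail_midway)
        goto cleanup;                    /* the error path still frees the memory */

    result = some_memory[0];             /* do some other stuff */

cleanup:
    free(some_memory);
    return result;
}
```

Both the success path and the error path now reach free, so adding more early exits later only requires another goto cleanup.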

char *some_memory = "Hello World";
sprintf(some_memory, "Goodbye...");
is illegal: modifying a string literal is undefined behavior, so treat string literals as const.
This will allocate a 12-byte char array on the stack or globally (depending on where it's declared).
char some_memory[] = "Hello World";
If you want to leave room for further manipulation, you can specify that the array should be sized larger. (Please don't put 1MB on the stack, though.)
#define LINE_LEN 80
char some_memory[LINE_LEN] = "Hello World";
strcpy(some_memory, "Goodbye, sad world...");
printf("%s\n", some_memory);

One case where it is necessary to allocate the memory is when you want to modify it at runtime. In that case, malloc or a buffer on the stack can be used. The simple example of assigning "Hello World" to a pointer defines memory that "typically" cannot be modified at runtime.