C Structures: Initialized Strings become invalid

I have the following code example which gets a pointer to a structure in two different ways. While the first one ("Test1") succeeds, the second one fails with a Segmentation fault when trying to output the string (title), while the number (type) is printed properly:
#include <stdio.h>
#include <stdlib.h>
typedef struct{
unsigned char type;
char* title;
} MenuItem;
typedef struct{
unsigned short itemCount;
MenuItem *items;
} Menu;
Menu* createMenu(unsigned short itemCount, MenuItem items[]){
Menu *menu = malloc(sizeof(Menu));
menu->itemCount = itemCount;
menu->items = items;
return menu;
}
Menu* getSampleMenu(void){
return createMenu(2,(MenuItem[]){
{3,"Foo2"},
{4,"Bar2"}
});
}
void showMenu(const Menu *menu){
for(unsigned short i = 0; i < menu->itemCount; i++)
printf("Item %d: %d/%s\n",i,menu->items[i].type,menu->items[i].title);
}
int main(void){
//Test 1
Menu *menu = createMenu(2,(MenuItem[]){
{1,"Foo"},
{2,"Bar"}
});
showMenu(menu);
//Result: 1/Foo\n 2/Bar
//Test 2
showMenu(getSampleMenu());
//Result: 3/ [segmentation fault]
}
Do you have any idea what the problem might be? The example is compiled and tested on Debian using gcc 4.6.3 in C99 mode.
Thanks in advance!

The array you're passing to createMenu has "automatic storage duration". It dies, and any pointers to it become invalid, once getSampleMenu ends.
(Edit: How long the array lives depends on the language. In C99, a compound literal inside a function has automatic storage duration tied to the enclosing block, so later statements in getSampleMenu could still use the menu. In C++, where GCC accepts compound literals as an extension, the array is a temporary that dies at the end of the statement that created it, so even subsequent statements in getSampleMenu would be following invalid pointers. Either way, the array is gone once getSampleMenu returns.)
You'll need to dynamically allocate (malloc) some memory and copy the array into it. (Of course, then you should also have a destroyMenu or similar function to properly free the memory once the menu's no longer needed.)
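A minimal sketch of that fix, assuming the Menu and MenuItem types from the question. createMenu copies the caller's array into heap memory, and destroyMenu (the name suggested above, not an existing function) releases it. Only the array is copied; the titles here are string literals, which have static storage duration, so a real menu with dynamically built titles would need to copy those too.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    unsigned char type;
    char *title;
} MenuItem;

typedef struct {
    unsigned short itemCount;
    MenuItem *items;
} Menu;

/* Copies the caller's array so the menu does not depend on its lifetime. */
Menu *createMenu(unsigned short itemCount, const MenuItem items[]) {
    assert(items != NULL || itemCount == 0);

    Menu *menu = malloc(sizeof *menu);
    if (menu == NULL)
        return NULL;

    menu->itemCount = itemCount;
    menu->items = NULL;
    if (itemCount > 0) {
        menu->items = malloc(itemCount * sizeof *menu->items);
        if (menu->items == NULL) {
            free(menu);
            return NULL;
        }
        memcpy(menu->items, items, itemCount * sizeof *menu->items);
    }
    return menu;
}

/* Releases the copied array and the menu itself; safe to call with NULL. */
void destroyMenu(Menu *menu) {
    if (menu != NULL) {
        free(menu->items);
        free(menu);
    }
}

Menu *getSampleMenu(void) {
    /* The compound literal dies when this function returns,
       but createMenu has already copied it by then. */
    return createMenu(2, (MenuItem[]){
        {3, "Foo2"},
        {4, "Bar2"}
    });
}
```

With this version, showMenu(getSampleMenu()) reads the heap copy rather than the dead array; the caller keeps the returned pointer and passes it to destroyMenu when done.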

Variables that are declared locally, also called "automatic", are usually stored on the stack frame of the current function - so that when you return from the function in which they were declared, they are popped from the stack, and a function called later could write over them. malloc allocates a range of memory on the heap, which remains allocated to your use until you call free, regardless of the scope your code is in.

The pointer menu->items is no longer valid once the function getSampleMenu() returns because "MenuItem[]" is locally defined in this function.
So in test 2, your program segfaults when accessing menu->items in showMenu().

Related

Check if a pointer to function is initialized

How can I check if a pointer to function was initialized?
I can check for NULL, but if not null could be garbage, right?
I have the following:
#include <stdio.h>
#include <stdlib.h>
typedef struct client_struct
{
char *name;
char *email;
void (*print)(struct client_struct *c);
} client;
void print_client(client *c)
{
if (c->print != NULL)
c->print(c);
}
int main()
{
client *c = (client *)malloc(sizeof(client));
c->email = (char *)malloc(50 * sizeof(char));
sprintf(c->email, "email@server.com");
c->name = (char *)malloc(50 * sizeof(char));
sprintf(c->name, "some name");
//Uncommenting line below work as expected, otherwise segmentation fault
//c->print = NULL;
print_client(c);
printf("\nEOF\n");
int xXx = getchar();
return 0;
}
How can I check if this pointer really points to function "void (*f)(client *)"?
Comparing size doesn't work because could garbage in same size, correct?
I would like a way to accomplish that preferably according to C standard.
As described in the comments, it is impossible to determine with 100% certainty whether a pointer is garbage.
To avoid such situation, you can provide a "constructor" function, like this:
struct client_struct* client_allocate()
{
struct client_struct* object = malloc(sizeof *object);
if (object)
{
object->name = NULL;
object->email = NULL;
object->print = NULL;
}
return object;
}
Then write in your documentation that the only valid way to create "clients" is by using your function. If you do this, you should also provide a destroy function, where you call free.
Suppose you add a new pointer to your struct one day. Then you update your client_allocate function, where you set this pointer to NULL, and the new pointer will always be properly initialized. There is no need to update all places in code where your struct is allocated, because now there is only one such place.
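A sketch of the matching destroy function, repeating the constructor so the block stands alone. The names client_destroy, client_print and client_demo are hypothetical, and the sketch assumes that name and email, when set, were obtained from malloc and are owned by the client.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct client_struct {
    char *name;
    char *email;
    void (*print)(struct client_struct *c);
} client;

/* The only supported way to create a client: every pointer starts as NULL. */
client *client_allocate(void) {
    client *object = malloc(sizeof *object);
    if (object) {
        object->name = NULL;
        object->email = NULL;
        object->print = NULL;
    }
    return object;
}

/* Frees the members and the client itself; free(NULL) is a no-op. */
void client_destroy(client *object) {
    if (object == NULL)
        return;
    free(object->name);
    free(object->email);
    free(object);
}

/* Calls the callback only if one was set. Returns 1 if it was called. */
int client_print(client *c) {
    if (c != NULL && c->print != NULL) {
        c->print(c);
        return 1;
    }
    return 0;
}

/* Usage: a freshly allocated client has no callback, so nothing is called. */
void client_demo(void) {
    client *c = client_allocate();
    assert(c != NULL);
    assert(client_print(c) == 0);
    client_destroy(c);
}
```

Because allocation and cleanup each live in one place, the NULL check in client_print is reliable for every client created this way.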
Caveats
Checking whether a function pointer is initialized with a valid function is not an easily solvable problem. Any solution will not be portable across platforms, and it also depends on the binary format (statically or dynamically linked) that you end up with. There are ways to do this, with varying success, on different binary formats; however, I am not going to go over every permutation. Hopefully this will get you going down that rabbit hole :-) and you can figure out the particular solution that works for you in your circumstances.
In order for some of the solutions to work you have to ensure that the linked binaries have exported symbols (it's possible to do it without, but it's a lot harder and I don't have the time). So when you're linking your program ensure that you have dynamic symbols enabled.
Having said that, here's an approach you can use on systems using dlfcn functions. (See History below)
More Caveats
As @Deduplicator points out in his comment below, there may be situations where 0xdeadbeef happens to point to a valid function, in which case you may end up calling the wrong valid function. There are ways to mitigate that situation at either compile time or runtime, but you'll have to build the solution by hand. For example, C++ does it by mangling the namespace into the symbol names. You could require that to happen. (I'll think of an interesting way to do this and post it)
Linux / SysV variants (Mac OSX included)
Use dladdr (SysV) (GNU has a dladdr1 as well) to determine which function the address you provide falls within:
Example:
#define _GNU_SOURCE /* glibc declares dladdr and Dl_info only when this is defined */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <dlfcn.h>
int is_valid_function_ptr( void *func) {
Dl_info info;
int rc;
rc = dladdr(func, &info);
if (!rc) {
/* really should do more checking here */
return 0;
}
return 1; /* you can print out function names and stuff here */
}
void print(const char *value) {
fprintf(stdout, "%s", value);
}
void call_the_printer(void (*foo)(), const char *value)
{
if(is_valid_function_ptr(foo)) {
foo(value);
}
else {
fprintf(stderr, "The Beef is Dead!\n");
}
}
int main()
{
void (*funcptr)() = (void (*)()) 0xdeadbeef; /* some dead pointer */
call_the_printer(funcptr, "Hello Dead Beef\n");
funcptr = print; /* actually a function */
call_the_printer(funcptr, "Hello Printer\n");
return 0;
}
NOTE Enable dynamic symbols for this to work
GCC/LLVM etc.
use -rdynamic or -Wl,--export-dynamic during the link process, so compile with:
gcc -o ex1 -rdynamic ex1.c
Windows
Windows does its own thing (as always) and I haven't tested any of these, but the basic concept should work:
Use GetModuleHandle and EnumCurrentProcess together to get loaded symbol information, and run through the pointers in a loop to see whether they match any of the addresses therein.
The other way would be to use VirtualQuery and then cast mbi.AllocationBase to (HMODULE) and see if you get the path of your own binary back.
In C, function pointers are no different from regular pointers in this respect: by the standard, they have one value that says the pointer should not be used, and this is NULL.
The way you should work with pointers is to set them only to a valid value or NULL. There is no other way you can be sure a value is OK. And by definition every value that is not NULL should be considered valid.
As pointed out in other comments and answers, there is no way to check whether a variable is initialized. That's why initializing variables to NULL and then checking is considered good practice.
If you really want to validate your function pointer is pointing to the correct place, you could export the function and load your pointer from the ELF symbols (see: http://gcc.gnu.org/wiki/Visibility)
Always check for null parameters first of all.
void print_client(client *c)
{
if ((c != NULL) && (c->print != NULL))
{
c->print(c);
}
}
As for your question, nullify your client struct after it's been malloc'd. This way you can ensure that an unassigned function pointer shall indeed ==NULL.
client* create_client(void)
{
client *c = malloc(sizeof(client));
if (c != NULL)
{
memset(c, 0, sizeof *c); /* sizeof *c is the struct size; sizeof(c) would be the pointer size */
}
return c;
}

Variable scope inside while loop

This is perhaps one of the oddest things I've ever encountered. I don't program much in C, but from what I know to be true, plus checking with different sources online, variables macroName and macroBody are only defined in the scope of the while loop. So every time the loop runs, I'm expecting macroName and macroBody to get new addresses and be completely new variables. However, that is not true.
What I'm finding is that even though the loop is running again, both variables share the same address and this is causing me serious headache for a linked list where I need to check for uniqueness of elements. I don't know why this is. Shouldn't macroName and macroBody get completely new addresses each time the while loop runs?
I know this is the problem because I'm printing the addresses and they are the same.
while(fgets(line, sizeof(line), fp) != NULL) // Get new line
{
char macroName[MAXLINE];
char macroBody[MAXLINE];
// ... more code
switch (command_type)
{
case hake_macro_definition:
// ... more code
printf("**********%p | %p\n", &macroName, &macroBody);
break;
// .... more cases
}
}
Code that is part of my linked-list code.
struct macro {
struct macro *next;
struct macro *previous;
char *name;
char *body;
};
Function that checks if element already exists inside linked-list. But since *name has the same address, I always end up inside the if condition.
static struct macro *macro_lookup(char *name)
{
struct macro *temp = macro_list_head;
while (temp != NULL)
{
if (are_strings_equal(name, temp->name))
{
break;
}
temp = temp->next;
}
return temp;
}
These arrays are allocated on the stack:
char macroName[MAXLINE];
char macroBody[MAXLINE];
The compiler has pre-allocated space for you that exists from the start of your function. In other words, from the computer's viewpoint, the location of these arrays would be the same as if you had defined them outside the loop body, at the top of your function body.
The scope in C merely indicates where an identifier is visible. So the compiler (but not the computer) enforces the semantics that macroName and macroBody cannot be referenced before or after the loop body. But from the computer's viewpoint, the actual storage for these arrays typically exists once the function starts and only goes away when the function ends.
If you were to look at the assembly dump of your code, you'd likely see that your machine's stack pointer is decremented by an amount big enough for your function's stack frame to hold all of your local variables, including these arrays.
What I need to mention in addition to chrisaycock's answer: you should never use pointers to local variables outside the function in which these variables were defined. Consider this example:
int * f()
{
int local_var = 0;
return &local_var;
}
int g(int x)
{
return (x > 0) ? x : 0;
}
int main()
{
int * from_f = f(); //
*from_f = 100; //Undefined behavior
g(15); //some function call to change stack
printf("%d", *from_f); //Will print some random value
return 0;
}
The same actually applies to a block. Technically, the storage of block-local variables can be reclaimed after the block ends, so on each iteration of a loop the old addresses may be invalid. In practice you rarely see this, because C compilers usually keep these variables at the same address for performance reasons, but you cannot rely on it.
What you need to understand is how memory is allocated. If you want to implement a list, it is a structure that grows. Where does the memory come from? You can not allocate much memory from the stack, plus the memory is invalidated once you return from a function. So, you will need to allocate it from the heap (using malloc).
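A sketch of how the list could own heap copies of the strings, assuming the struct macro and macro_list_head from the question. copy_string and macro_add are hypothetical helpers, and strcmp stands in for the question's are_strings_equal, whose definition is not shown.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct macro {
    struct macro *next;
    struct macro *previous;
    char *name;
    char *body;
};

static struct macro *macro_list_head = NULL;

/* Hypothetical helper: returns a heap copy of s, or NULL on failure. */
static char *copy_string(const char *s) {
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* Hypothetical helper: pushes a node that owns its own copies of name and
   body, so the caller may reuse its buffers on the next loop iteration. */
struct macro *macro_add(const char *name, const char *body) {
    assert(name != NULL && body != NULL);

    struct macro *m = malloc(sizeof *m);
    if (m == NULL)
        return NULL;
    m->name = copy_string(name);
    m->body = copy_string(body);
    if (m->name == NULL || m->body == NULL) {
        free(m->name);
        free(m->body);
        free(m);
        return NULL;
    }
    m->previous = NULL;
    m->next = macro_list_head;
    if (macro_list_head != NULL)
        macro_list_head->previous = m;
    macro_list_head = m;
    return m;
}

/* Compares string contents, never buffer addresses. */
struct macro *macro_lookup(const char *name) {
    struct macro *temp = macro_list_head;
    while (temp != NULL && strcmp(name, temp->name) != 0)
        temp = temp->next;
    return temp;
}
```

Inside the loop, macro_add(macroName, macroBody) would then store independent copies, so it no longer matters that the two arrays keep the same addresses on every iteration.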

Modifying struct members through a pointer passed to a function

for instance this code:
struct test{
int ID;
bool start;
};
struct test * sTest;
void changePointer(struct test * t)
{
t->ID = 3;
t->start = false;
}
int main(void)
{
sTest->ID = 5;
sTest->start = true;
changePointer(sTest);
return 0;
}
If I was to execute this code, then what would the output be? (i.e. if I pass a pointer like this, does it change the reference or is it just a copy?)
Thanks in advance!
Your program doesn't have any output, so there would be none.
It also never initializes the sTest pointer to point at some valid memory, so the results are totally undefined. This program invokes undefined behavior, and should/might/could crash when run.
IF the pointer had been initialized to point at a valid object of type struct test, the fields of that structure would have been changed so that at the end of main(), ID would be 3. The changes done inside changePointer() are done on the same memory as the changes done in main().
An easy fix would be:
int main(void)
{
struct test aTest;
sTest = &aTest; /* Notice the ampersand! */
sTest->start = true;
changePointer(sTest);
return 0;
}
Also note that C before C99 has no built-in true or false; in C99 and later, include <stdbool.h> to get bool, true, and false.
The first question is why you need a test pointer in the global namespace. Second, your code has no memory allocation, so sTest never points to a valid structure. Finally, your function takes a pointer as its input parameter, so the structure it points to will be changed in "changePointer".
1) First, your code will crash because you are not allocating memory to hold the structure. You might need to add
sTest = malloc(sizeof(struct test));
2) After correcting the crash, you can pass the structure pointer, and the changes you make in the changePointer function will be reflected in main and vice versa.
3) But since you are not printing anything, there won't be any output from your program.
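The three points above can be sketched as follows. changeCopy and demo_id_after_change are hypothetical names added for contrast: the first receives the structure by value, so its changes are lost, while changePointer from the question changes the caller's object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct test {
    int ID;
    bool start;
};

/* Receives a pointer: the caller's structure is modified. */
void changePointer(struct test *t) {
    t->ID = 3;
    t->start = false;
}

/* Receives a copy: the caller's structure is left untouched. */
void changeCopy(struct test t) {
    t.ID = 99;
    t.start = false;
}

/* Returns the ID that the caller sees after both calls, or -1 if
   the allocation fails. */
int demo_id_after_change(void) {
    struct test *sTest = malloc(sizeof *sTest);
    if (sTest == NULL)
        return -1;

    sTest->ID = 5;
    sTest->start = true;

    changeCopy(*sTest);
    assert(sTest->ID == 5);   /* the copy was changed, not our object */

    changePointer(sTest);     /* now our object is changed */
    int id = sTest->ID;
    free(sTest);
    return id;
}
```

So passing the pointer copies only the address; both main and changePointer then work on the same memory.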

Code that crashed a framework works when reproduced in a single file

I have a question regarding this code. I wrote it in my framework, and it caused the framework to crash. But when I rewrote the code below in a single file, it worked just fine. I was wondering: is the code below correct with respect to allocating and freeing memory? (especially the part msg->context_var.type = f;)
Thank you
#include <stdio.h>
#include <stdlib.h>
typedef struct
{
int value;
int price;
int old;
} type_t;
typedef struct {
type_t *type;
} context_t;
typedef struct {
context_t context_var;
} send_request;
void send_Message(send_request *msg)
{
type_t *f = 0;
f = malloc(sizeof(f));
msg->context_var.type = f;
msg->context_var.type->price = 1;
msg->context_var.type->value = 100;
msg->context_var.type->old =120;
printf("value of %d\n", msg->context_var.type->price);
free(f);
}
int main()
{
send_request *msg = 0;
msg = (send_request *) malloc(sizeof(send_request));
send_Message(msg);
free(msg);
return 0;
}
It's wrong.
f = malloc(sizeof(f)); /* Wrong */
f = malloc(sizeof(*f)); /* Better ? */
sizeof(f) will give you the size of a pointer on your machine; sizeof(*f) will give you the size of the object pointed to.
EDIT As requested by @Perception
When you allocate less than you need you're eliciting Undefined Behavior. Anything can happen (even the desired behavior) and it all depends on the platform, the environment (the moon phase, etc).
msg->context_var.type->value = 100; /* Writes beyond what's allocated. */
So, depending on the memory layout of the "framework" this might simply overwrite some memory and "work", or it could crash. Frankly I prefer when it crashes straight away.
You allocate memory for a type_t on the heap, and then msg->context_var.type gets the value of the resulting pointer f.
Since msg is a pointer parameter to the send_Message function, no reliable assumptions can be made about what is done with msg and its contents after your function exits. As such, when you go on to free the memory pointed to by f, you leave a dangling pointer in msg->context_var.type.
If the memory it points to is accessed after send_Message exits, there's a fair chance that you corrupt something vital (or read something crazy, like a pointer to 0xdeadbeef), as it might contain something completely different now.
Not only are you allocating the wrong size (see cnicutar's answer): if you are attaching f to a message that is passed in by the framework, you probably don't want to free it before the function returns. You'll need to free it later, though, probably through some other facility provided by the framework?
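A sketch that combines both corrections, using the types from the question. release_Message is a hypothetical cleanup function standing in for whatever facility the framework provides, and returning an int status from send_Message is likewise an assumption.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    int value;
    int price;
    int old;
} type_t;

typedef struct {
    type_t *type;
} context_t;

typedef struct {
    context_t context_var;
} send_request;

/* Allocates a full type_t and hands ownership to msg.
   Returns 0 on success, -1 if the allocation fails. */
int send_Message(send_request *msg) {
    assert(msg != NULL);

    type_t *f = malloc(sizeof *f);   /* size of the object, not of the pointer */
    if (f == NULL)
        return -1;

    f->price = 1;
    f->value = 100;
    f->old = 120;
    msg->context_var.type = f;       /* not freed here: msg still refers to it */
    return 0;
}

/* Hypothetical cleanup, called once the message is no longer needed. */
void release_Message(send_request *msg) {
    assert(msg != NULL);
    free(msg->context_var.type);
    msg->context_var.type = NULL;    /* no dangling pointer is left behind */
}
```

The caller pairs each successful send_Message with one release_Message before freeing msg itself.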

Returning local data from functions in C and C++ via pointer

I'm having an argument with my friend. He says that I can return a pointer to local data from a function. This is not what I have learned, but I can't find a counterargument to convince him.
Here is illustrated case:
char *name() {
char n[10] = "bodacydo!";
return n;
}
And it's used as:
int main() {
char *n = name();
printf("%s\n", n);
}
He says this is perfectly OK because after a program calls name, it returns a pointer to n, and right after that it just prints it. Nothing else happens in the program meanwhile, because it's single threaded and execution is serial.
I can't find a counter-argument. I would never write code like that, but he's stubborn and says this is completely ok. If I was his boss, I would fire him for being a stubborn idiot, but I can't find a counter argument.
Another example:
int *number() {
int n = 5;
return &n;
}
int main() {
int *a = number();
int b = 9;
int c = *a * b;
printf("%d\n", c);
}
I will send him this link after I get some good answers, so he at least learns something.
Your friend is wrong.
name is returning a pointer to the call stack. Once you invoke printf, there's no telling how that stack will be overwritten before the data at the pointer is accessed. It may work on his compiler and machine, but it won't work on all of them.
Your friend claims that after name returns, "nothing happens except printing it". printf is itself another function call, with who knows how much complexity inside it. A great deal is happening before the data is printed.
Also, code is never finished; it will be amended and added to. Code that "does nothing" now will do something once it's changed, and your closely-reasoned trick will fall apart.
Returning a pointer to local data is a recipe for disaster.
You will get a problem when you call another function that itself uses the stack between name() and printf():
char *fun(char *what) {
char res[10];
strncpy(res, what, 9);
return res;
}
main() {
char *r1 = fun("bla");
char *r2 = fun("blubber");
printf("'%s' is bla and '%s' is blubber", r1, r2);
}
As soon as the scope of the function ends, i.e. after the closing brace } of the function, the memory allocated (on the stack) for all the local variables is released. So using a returned pointer to memory that is no longer valid invokes undefined behavior.
You can also say that the lifetime of a local variable ends when the function finishes execution.
My counter-arguments would be:
it's never OK to write code with undefined behavior,
how long before somebody else uses that function in different context,
the language provides facilities to do the same thing legally (and possibly more efficiently)
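One such legal facility is to let the caller supply the buffer. The sketch below changes the signature of name() from the question, which is an assumption made for illustration; name_demo is a hypothetical usage function.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes the string into the caller's buffer of the given size.
   Returns buf on success, or NULL if the buffer is missing or too small. */
char *name(char *buf, size_t size) {
    static const char text[] = "bodacydo!";
    if (buf == NULL || size < sizeof text)
        return NULL;
    memcpy(buf, text, sizeof text);   /* sizeof includes the terminating NUL */
    return buf;
}

/* Usage: the array lives in the caller, so it is still valid when printed. */
void name_demo(void) {
    char n[10];
    char *p = name(n, sizeof n);
    assert(p == n);
    printf("%s\n", p);
}
```

The storage now belongs to the function that uses the result, so nothing is read after its lifetime has ended.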
It's undefined behavior and the value could easily be destroyed before it is actually printed. printf(), which is just a normal function, could use some local variables or call other functions before the string is actually printed. Since these actions use the stack they could easily corrupt the value.
Whether the code happens to print the correct value depends on the implementation of printf() and how function calls work on the compiler/platform you are using (which parameters/addresses/variables are put where on the stack,...). Even if the code happens to "work" on your machine with certain compiler settings, it's far from certain that it will work anywhere else or under slightly different boundary conditions.
You are correct - n lives on the stack and so could go away as soon as the function returns.
Your friend's code might work only because the memory location that n is pointing to has not been corrupted (yet!).
As the others have already pointed out, the compiler will accept this, but it is a bad idea because the returned data resides on the unused part of the stack and may get overwritten at any time by other function calls.
Here is a counter-example that crashes on my system if compiled with optimizations turned on:
char * name ()
{
char n[] = "Hello World";
return n;
}
void test (char * arg)
{
// msg and arg will reside roughly at the same memory location.
// so changing msg will change arg as well:
char msg[100];
// this will overwrite whatever arg points to.
strcpy (msg, "Logging: ");
// here we access the overwritten data. A bad idea!
strcat (msg, arg);
strcat (msg, "\n");
printf (msg);
}
int main ()
{
char * n = name();
test (n);
return 0;
}
gcc : main.c: In function ‘name’:
main.c:4: warning: function returns address of local variable
However, it could be done like this (but it's not sexy code :p):
char *name()
{
static char n[10] = "bodacydo!";
return n;
}
int main()
{
char *n = name();
printf("%s\n", n);
}
Warning: it's not thread-safe.
You're right, your friend is wrong. Here's a simple counterexample:
char *n = name();
printf("(%d): %s\n", 1, n);
Returning a pointer to a local variable is always wrong, even if it appears to work in some rare situations.
A local (automatic) variable can be allocated either from stack or from registers.
If it is allocated from stack, it will be overwritten as soon as next function call (such as printf) is executed or if an interrupt occurs.
If the variable is allocated from a register, it is not even possible to have a pointer pointing to it.
Even if the application is "single threaded", the interrupts may use the stack. In order to be relatively safe, you should disable the interrupts. But it is not possible to disable the NMI (Non Maskable Interrupt), so you can never be safe.
While it is true that you cannot return pointers to local stack variables declared inside a function, you can however allocate memory inside a function using malloc and then return a pointer to that block. Maybe this is what your friend meant?
#include<stdio.h>
#include<string.h>
#include<stdlib.h>
char* getstr(){
char* ret=malloc(sizeof(char)*15);
strcpy(ret,"Hello World");
return ret;
}
int main(){
char* answer=getstr();
printf("%s\n", answer);
free(answer);
return 0;
}
The way I see it you have three main options because this one is dangerous and utilizes undefined behavior:
replace: char n[10] = "bodacydo!"
with: static char n[10] = "bodacydo!"
This will give undesirable results if you use the same function more than once in a row while trying to keep the values from earlier calls.
replace:
char n[10] = "bodacydo!"
with:
char *n = new char[10];
strcpy(n, "bodacydo!");
This will fix the aforementioned problem, but you will then need to delete[] the heap memory or start incurring memory leaks.
Or finally:
replace: char n[10] = "bodacydo!";
with: std::shared_ptr<char> n(new char[10], std::default_delete<char[]>()); strcpy(n.get(), "bodacydo!");
Which relieves you from having to delete the heap memory, but you will then have to change the return type and the char *n in main to a shared_ptr as well in order to hand off the management of the pointer. If you don't hand it off, the shared_ptr goes out of scope when the function returns, the memory is freed, and any raw pointer you returned is left dangling.
If we take the code segment u gave....
char *name() {
char n[10] = "bodacydo!";
return n;
}
int main() {
char *n = name();
printf("%s\n", n);
}
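It may appear okay to use that local var in printf() in main, but note that n is a local array holding a copy of the string literal, not the literal itself, so the returned pointer is already dangling. (Had name() declared const char *n = "bodacydo!" instead, returning it would be fine, because the literal isn't something local to name().)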
But now lets look at a slightly different code
class SomeClass {
int *i;
public:
SomeClass() {
i = new int();
*i = 23;
}
~SomeClass() {
delete i;
i = NULL;
}
void print() {
printf("%d", *i);
}
};
SomeClass *name() {
SomeClass s;
return &s;
}
int main() {
SomeClass *n = name();
n->print();
}
In this case, when the name() function returns, the SomeClass destructor is called, and the member var i is deallocated and set to NULL.
So when we call print() in main, even if the memory pointed to by n hasn't been overwritten (I am assuming that), the print call will crash when it tries to dereference a NULL pointer.
So your code segment may happen not to fail, but code like this will most likely fail when the object's destructor releases resources and we use the object afterwards.
Hope it helps
