malloc operation clears other allocated memory - c

I'm trying to make a program that basically picks a specific piece of source code and adds some other specific code into it. The program is too big to include in full in my question, but the core of it is this "actors" struct:
typedef struct actors_s {
    int num;
    char *src_path;
    char *project_path;
    int *papify;
    char *actor_path[];
} actors_s;
As you can see these are almost all pointers and the last one is an array of strings. This needs to be done this way because the number of "actor elements" depends on the input every time.
The problem: In one specific test case, the actor_path array has 'num' members. I first call malloc only once, this way:
*actors->actor_path = malloc(actors->num);
My logic tells me I shouldn't be using the '*' operator here, but without it I get an error; this is possibly where the problem is. Then a function is called that allocates a new block of memory for every new member (never going beyond 'num' members):
int size = strlen(name)+strlen(actors->project_path)+strlen("/src/")+strlen(".c")+4;
actors->actor_path[i] = malloc(size);
(In the actual program, every malloc call is checked for success.)
This is called inside a function that is called for every "actor_path" element. In this test example I have three actors.
Mysteriously enough, on the third call of this malloc, the src_path element of the struct, which was properly allocated and set to a string once at the beginning of the program (and never touched again), is freed (at least I think so: it changes into random numbers and symbols when I watch it in debug mode).
Does anyone have any idea how and why this is possible? How do I fix it?
Thanks in advance.
EDIT:
Here are some screenshots from the debug watch window: http://imgur.com/a/aB1uv
First call to malloc: all OK.
Second call to malloc: all OK.
Third call to malloc: src_path gets erased!!

The [] on the last member declares a flexible array member. It means the structure has an array that starts right after the structure itself, and whose size is unspecified. You have to allocate memory for it manually, e.g.:
actors_s *actor = malloc(sizeof(*actor) + sizeof(char*) * num);
Then just assign at most num elements into actor_path (each element is a pointer to char).
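As a minimal sketch of that allocation, assuming the struct from the question: the helper names actors_create, actors_set_path and actors_destroy are hypothetical and not part of the original program. The struct and its pointer array come from a single malloc, and each path then gets its own block.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct actors_s {
    int num;
    char *src_path;
    char *project_path;
    int *papify;
    char *actor_path[];   /* flexible array member: its storage comes from the malloc below */
} actors_s;

/* Hypothetical helper: allocate the struct together with room for `num` path pointers. */
actors_s *actors_create(int num)
{
    assert(num >= 0);
    actors_s *a = malloc(sizeof *a + (size_t)num * sizeof a->actor_path[0]);
    if (a == NULL)
        return NULL;
    a->num = num;
    a->src_path = NULL;
    a->project_path = NULL;
    a->papify = NULL;
    for (int i = 0; i < num; i++)
        a->actor_path[i] = NULL;   /* each slot is filled later with its own malloc */
    return a;
}

/* Hypothetical helper: store a private copy of `path` in slot i. Returns 0 on success. */
int actors_set_path(actors_s *a, int i, const char *path)
{
    assert(a != NULL && path != NULL);
    if (i < 0 || i >= a->num)
        return -1;                           /* never write past the num slots */
    char *copy = malloc(strlen(path) + 1);   /* +1 for the terminating '\0' */
    if (copy == NULL)
        return -1;
    strcpy(copy, path);
    free(a->actor_path[i]);                  /* free(NULL) is a no-op */
    a->actor_path[i] = copy;
    return 0;
}

/* Frees the paths and the struct; src_path, project_path and papify are
   left alone because this sketch does not know who owns them. */
void actors_destroy(actors_s *a)
{
    if (a == NULL)
        return;
    for (int i = 0; i < a->num; i++)
        free(a->actor_path[i]);
    free(a);
}
```

With this layout, a call such as actors_set_path(a, 2, "...") writes only into memory that belongs to the struct's own block, so it cannot disturb whatever src_path points to.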

About *actors->actor_path = malloc(actors->num);
actors->actor_path is an array of pointers, so *actors->actor_path is the first pointer in actors->actor_path, i.e. actors->actor_path[0].
When doing this, you actually allocate actors->num bytes of memory for actors->actor_path[0]. It does not reserve any room for the array of pointers itself.
Now, accessing actors->actor_path[0] may be OK, while accessing actors->actor_path[1], actors->actor_path[2], actors->actor_path[3], ... may cause problems, for example overwriting src_path.
About the solution
@keltar is right. Allocated that way, the storage for actors->actor_path[0], actors->actor_path[1], actors->actor_path[2], ..., actors->actor_path[num - 1] is correctly reserved.

Why do variables declared with the same name in different scopes get assigned the same memory addresses?

I know that declaring a char[] variable in a while loop is scoped, having seen this post: Redeclaring variables in C.
Going through a tutorial on creating a simple web server in C, I'm finding that I have to manually clear memory assigned to responseData in the example below, otherwise the contents of index.html are just continuously appended to the response and the response contains duplicated contents from index.html:
while (1)
{
    int clientSocket = accept(serverSocket, NULL, NULL);
    char httpResponse[8000] = "HTTP/1.1 200 OK\r\n\n";
    FILE *htmlData = fopen("index.html", "r");
    char line[100];
    char responseData[8000];
    while(fgets(line, 100, htmlData) != 0)
    {
        strcat(responseData, line);
    }
    strcat(httpResponse, responseData);
    send(clientSocket, httpResponse, sizeof(httpResponse), 0);
    close(clientSocket);
}
Corrected by:
while (1)
{
    ...
    char responseData[8000];
    memset(responseData, 0, strlen(responseData));
    ...
}
Coming from JavaScript, this was surprising. Why would I want to declare a variable and have access to the memory contents of a variable declared in a different scope with the same name? Why wouldn't C just reset that memory behind the scenes?
Also... Why is it that variables of the same name declared in different scopes get assigned the same memory addresses?
According to this question: Variable declared interchangebly has the same pattern of memory address that ISN'T the case. However, I'm finding that this is occurring pretty reliably.
Not completely correct. You don't need to clear the whole responseData array - clearing its first byte is just enough:
responseData[0] = 0;
As Gabriel Pellegrino notes in the comment, a more idiomatic expression is
responseData[0] = '\0';
It explicitly denotes the character whose code is zero, while the former uses the int constant zero. In both cases the right-hand operand has type int, which is implicitly converted (truncated) to char for the assignment. (Paragraph fixed thanks to pmg's comment.)
You can learn this from the strcat documentation: the function appends its second argument string to the first one. If you need the very first chunk to get stored into the buffer, you want to append it to an empty string, so you need to ensure the string in the buffer is empty. That is, it consists of the terminating NUL character only. memset-ting the whole array is overkill, hence a waste of time.
Additionally, using strlen on the array is asking for trouble. You can't know what the memory block allocated for the array actually contains. If it has not been used yet, or was overwritten with some other data since your last use, it may contain no NUL character. Then strlen will run past the end of the array, causing undefined behavior. And even if it returns successfully, it may report a length bigger than the size of the array. As a result, memset will write past the end of the array, possibly overwriting some vital data!
Use sizeof whenever you memset an array!
memset(responseData, 0, sizeof(responseData));
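To illustrate the point about starting from an empty string, here is a small sketch. The function join_lines and its parameters are hypothetical; the lines array stands in for the chunks that fgets would deliver in the question's loop. It clears only the first byte and also checks the remaining room before each strcat, which the original loop does not do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: concatenate `count` strings into `out` (capacity `cap`),
   stopping before the first one that would not fit. Returns the resulting length. */
size_t join_lines(char *out, size_t cap, const char *const lines[], size_t count)
{
    assert(out != NULL && cap > 0);
    size_t used = 0;

    out[0] = '\0';                       /* make the buffer an empty string */
    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(lines[i]);
        if (len > cap - 1 - used)        /* keep one byte for the terminator */
            break;
        strcat(out, lines[i]);           /* appends to a valid string with enough room */
        used += len;
    }
    return used;
}
```

Because the first byte is reset on every call, the result does not depend on whatever the buffer held before, whether that was leftover stack contents or the output of a previous call.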
EDIT
In the above I tried to explain how to fix the issue with your code, but I didn't answer your questions. Here they are:
Why do variables (...) in different scopes get assigned the same memory addresses?
As far as execution is concerned, each iteration of the while(1) { ... } loop does create a new scope. However, each scope ends before the next one is created, so the compiler reserves one appropriately sized block of memory on the stack and the loop reuses it in every iteration. That also simplifies the compiled code: every iteration is executed by exactly the same code, which simply jumps from the end back to the beginning. All instructions within the loop that access local variables use exactly the same addressing (relative to the stack) in each iteration. So, each variable in the next iteration has precisely the same location in memory as in all previous iterations.
I'm finding that I have to manually clear memory
Yes, automatic variables, allocated on the stack, are not initialized in C by default. We always need to explicitly assign an initial value before using such a variable; otherwise its value is indeterminate and may be invalid (for example, a floating-point variable can appear not-a-number, a character array may appear not terminated, an enum variable may have a value out of the enum's definition, a pointer variable may not point at a valid, accessible location, etc.).
otherwise the contents (...) are just continuously appended
This one was answered above.
Coming from JavaScript, this was surprising
Yes, JavaScript apparently creates new variables at the new scope, hence each time you get a brand new array – and it is empty. In C you just get the same area of a previously allocated memory for an automatic variable, and it's your responsibility to initialize it.
Additionally, consider two consecutive loops:
void test()
{
    int i;
    for (i=0; i<5; i++) {
        char buf1[10];
        sprintf(buf1, "%d", i);
    }
    for (i=0; i<1; i++) {
        char buf2[10];
        printf("%s\n", buf2);
    }
}
The first one prints a single-digit, character representation of five numbers into the character array, overwriting it each time - hence the last value of buf1[] (as a string) is "4".
What output do you expect from the second loop? Generally speaking, we can't know what buf2[] will contain, and printf-ing it causes UB. However, we may suppose that the same set of variables (namely a single 10-element character array) in both disjoint scopes gets allocated the same way in the same part of the stack. If this is the case, we'll get the digit 4 as output from a (formally uninitialized) array.
This result depends on the compiler construction and should be considered a coincidence. Do not rely on it as this is UB!
Why wouldn't C just reset that memory behind the scenes?
Because it's not told to. The language was created to compile to efficient, compact code. It does as little 'behind the scenes' as possible. Among other things, it does not initialize automatic variables unless it's told to. This means you need to add an explicit initializer to a local variable declaration or add an initializing statement (e.g. an assignment) before the first use. (This does not apply to global, module-scope variables; those are initialized to zeros by default.)
In higher-level languages some or all variables are initialized on creation, but not in C. That's its feature and we must live with it – or just not use this language.
With this line:
char responseData[8000];
You are saying to your compiler: Hey big C, give me an 8000-byte chunk and name it responseData.
At runtime, unless you ask for it, no one will ever clean the chunk or give you a "brand-new" one. That means the 8000-byte chunk you get on each execution can hold any possible combination of bits. What can also happen is that you get the same memory region on every execution, and thus the same bits that were left in those 8000 bytes the previous time. So, if you don't clean it, you get the impression that you're using the same variable, but you're not! You're just using the same (never cleaned) memory region.
I'd add that it's part of the programmer's responsibilities to clean, if needed, the memory being allocated, whether it is allocated dynamically or not.
Why would I want to declare a variable and have access to the memory contents of a variable declared in a different scope with the same name? Why wouldn't C just reset that memory behind the scenes?
Objects with auto storage duration (i.e., block-scope variables) are not automatically initialized - their initial contents are indeterminate. Remember that C is a product of the early 1970s, and errs on the side of runtime speed over convenience. The C philosophy is that the programmer is in the best position to know whether something should be initialized to a known value or not, and is smart enough to do it themselves if needed.
While you're logically creating and destroying a new instance of responseData on each loop iteration, it turns out the same memory location is being reused each time through. We like to think that space is allocated for each block-scope object as we enter the block and released as we leave it, but in practice that's (usually) not the case - space for all block-scope objects within a function is allocated on function entry, and released on function exit (with the exception noted at the end of this answer).
Different objects in different scopes may map to the same memory behind the scenes. Consider something like
void bletch( void )
{
    if ( some_condition )
    {
        int foo = some_function();
        printf( "%d\n", foo );
    }
    else
    {
        int bar = some_other_function();
        printf( "%d\n", bar );
    }
}
It's impossible for both foo and bar to exist at the same time, so there's no reason to allocate separate space for both - the compiler will (usually) allocate space for one int object at function entry, and that space gets used for either foo or bar depending on which branch is taken.
So, what happens with responseData is that space for one 8000-character array is allocated on function entry, and that same space gets used for each iteration of the loop. That's why you need to clear it out on each iteration, either with a memset call or with an initializer like
char responseData[8000] = {0};
As M.M points out in a comment, this isn't true for variable-length arrays (and potentially other variably modified types) - space for those is set aside as needed, although where that space is taken from isn't specified by the language definition. For all other types, though, the usual practice is to allocate all necessary space on function entry.
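The initializer form mentioned above is applied every time execution reaches the declaration, so it gives a clean buffer on each iteration even though the storage is reused. A small sketch to check this, using a hypothetical function name and a smaller buffer than the question's:

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if the buffer was an empty string at the start of every iteration. */
int buffer_starts_empty_each_time(int iterations)
{
    assert(iterations >= 0);
    int always_empty = 1;
    for (int i = 0; i < iterations; i++) {
        char responseData[64] = {0};          /* initializer runs on every pass */
        if (responseData[0] != '\0')
            always_empty = 0;
        /* Dirty the buffer so the next pass would see it if it were not reset. */
        strcat(responseData, "text written during this pass");
    }
    return always_empty;
}
```

Unlike the uninitialized case, reading responseData[0] here is well defined, because the initializer has set every element to zero before the first read.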

C - Multiple malloc's for a struct-array inside a function

I've got a theoretical question on allocating memory for structs. Consider the following code IN THE MAIN FUNCTION:
I have the following struct:
typedef struct {
    char *descr = NULL;
    DWORD id = 0x00FFFF00;
    int start_byte = 0;
    int end_byte = 0;
    double conversion_factor = 0.0;
} CAN_ID_ENTRY;
I want an array of these structs, so I'm allocating a pointer to the first struct:
can_id_list = (CAN_ID_ENTRY **)malloc(sizeof(CAN_ID_ENTRY));
And then I'm allocating memory for the first struct can_id_list[0]:
can_id_list[0] = (CAN_ID_ENTRY *)malloc(sizeof(CAN_ID_ENTRY));
Now the problem is, that I don't know HOW MANY of these structs I need (because I'm reading a CSV-File and I don't know the amount of lines/entries). So I need to enlarge the struct-pointer can_id_list for a second one:
can_id_list = (CAN_ID_ENTRY **)malloc(sizeof(CAN_ID_ENTRY));
And then I'm allocating the second struct can_id_list[1]:
can_id_list[1] = (CAN_ID_ENTRY *)malloc(sizeof(CAN_ID_ENTRY));
can_id_list[1]->id = 6;
Obviously, this works. But why? My point is the following: Normally, malloc allocates memory in one block in the memory (without gaps). But if another malloc is done BEFORE I'm allocating memory for the next struct, there is a gap between the first and the second struct. So, why can I access the second struct via can_id_list[1]? Does the index [1] store the actual address of the struct, or does it just calculate the size of the struct and jumps to this address beginning on the offset of the struct-pointer can_id_list (-> can_id_list+<2*sizeof(CAN_ID_ENTRY))?
Well, my real problem is, that I need to do this inside a function and therefore I need to pass the pointer of the struct to the function. But I don't know how to do this, because can_id_list is already a pointer ... and the changes must also be visible in the main method (that's the reason i need to use pointers).
The mentioned function is this one:
int load_can_id_list(char *filename, CAN_ID_ENTRY **can_id_list);
But is the parameter CAN_ID_ENTRY **can_id_list correct? And how do i pass the struct-array into this function? And how can i modify it inside??
Any help would be great!
EDIT: Casting malloc returns - Visual Studio forces me to do that! (Because it's a C++ project i think)
As the comments already said, the source of your confusion is that can_id_list = (CAN_ID_ENTRY **)malloc(sizeof(CAN_ID_ENTRY)); allocates the wrong amount of memory. Because the struct is larger than a pointer, it probably gave you space for a few pointers to be stored, not just one, which is why can_id_list[1] appears to work. For a single pointer it should be can_id_list = (CAN_ID_ENTRY **)malloc(sizeof(CAN_ID_ENTRY*));.
To answer the question at the end,
But is the parameter CAN_ID_ENTRY **can_id_list correct? And how do i pass the struct-array into this function? And how can i modify it inside??
If you want to enlarge the array within another function, you need to pass a CAN_ID_ENTRY*** ptr so you can set *ptr = realloc(...) inside as needed. realloc may give you the new chunk of memory at a different address, so you can't simply pass in a CAN_ID_ENTRY** ptr and then do realloc(ptr): the caller would never see the new address. See https://www.tutorialspoint.com/c_standard_library/c_function_realloc.htm
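A minimal sketch of that triple-pointer approach, written as plain C. The function names can_id_list_append and can_id_list_free are hypothetical, the separate count parameter is an assumption (the question does not say how the length is tracked), and unsigned long stands in for DWORD so the sketch builds without the Windows headers. In a C++ build the results of calloc and realloc would need casts, as the question notes.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    char *descr;
    unsigned long id;          /* stand-in for DWORD (assumption, see lead-in) */
    int start_byte;
    int end_byte;
    double conversion_factor;
} CAN_ID_ENTRY;

/* Hypothetical helper: append one zero-filled entry to the array of pointers.
   `list` is the address of the caller's CAN_ID_ENTRY **, so the (possibly new)
   address returned by realloc becomes visible to the caller. Returns 0 on success. */
int can_id_list_append(CAN_ID_ENTRY ***list, size_t *count)
{
    assert(list != NULL && count != NULL);

    CAN_ID_ENTRY *entry = calloc(1, sizeof *entry);
    if (entry == NULL)
        return -1;

    /* Grow by one element; sizeof **list is sizeof(CAN_ID_ENTRY *), not the struct size. */
    CAN_ID_ENTRY **grown = realloc(*list, (*count + 1) * sizeof **list);
    if (grown == NULL) {
        free(entry);
        return -1;             /* *list is unchanged and still valid */
    }
    grown[*count] = entry;
    *list = grown;
    *count += 1;
    return 0;
}

/* Releases every entry and then the array of pointers itself. */
void can_id_list_free(CAN_ID_ENTRY **list, size_t count)
{
    for (size_t i = 0; i < count; i++)
        free(list[i]);
    free(list);
}
```

A caller would start with CAN_ID_ENTRY **can_id_list = NULL; and size_t count = 0; and call can_id_list_append(&can_id_list, &count) once per CSV line, after which can_id_list[count - 1] is the new entry.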

Initially mallocate 0 elements to later reallocate and measure size

I have a function that will add a new position to an array by reallocating new memory every time it is called.
The problem is that, for each call I need it to add one position to the array, starting from 1 at first call, but I understand that I have to mallocate before reallocating.
So my question is, can I initially do something like p = malloc(0) and then reallocate for example using p = (int *)realloc(p,sizeof(int)) inside my function? p is declared as int *p.
Maybe with a different syntax?
Of course I could make a condition in my function that would mallocate if memory hasn't been allocated before and reallocate if it has, but I am looking for a better way.
And the second problem I have is... Once reallocated more positions, I want to know the size of the array.
I know that if, for example, I declare an array a[10], the number of elements would be defined by sizeof(a)/sizeof(a[0]), but for some reason that doesn't work with arrays declared as pointers and then reallocated.
Any advice?
You could initialize your pointer to NULL, so that the first time you call realloc(yourPointer, yourSize), it will return the same value as malloc(yourSize).
For your second problem, you could use a struct that contains your pointer and a count member.
struct MyIntVector {
    int * ptr;
    size_t count;
};
Then you will probably want to define wrapper functions for malloc, realloc, and free (where you could reset ptr to NULL) that take your struct as one of the parameters and update the struct as needed.
If you want to optimize this for pushing 1 element at a time, you could add an allocatedCount member, and only realloc if count == allocatedCount, with the new allocatedCount equal to (for example) twice the old allocatedCount.
You should implement this in a MyIntVector_Push(MyIntVector *, int ) function.
You will then have a simplified C version of C++ std::vector<int> (but without automatic deallocation when the object goes out of scope).
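One way the suggested vector could look, as a sketch rather than a definitive implementation. The typedef, the MyIntVector_Init and MyIntVector_Free functions, and the choice to start at a capacity of 1 are assumptions added for illustration; only the struct members and the MyIntVector_Push name come from the answer.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct MyIntVector {
    int *ptr;
    size_t count;            /* elements in use */
    size_t allocatedCount;   /* elements the current block can hold */
} MyIntVector;

void MyIntVector_Init(MyIntVector *v)
{
    assert(v != NULL);
    v->ptr = NULL;           /* realloc(NULL, n) behaves like malloc(n) */
    v->count = 0;
    v->allocatedCount = 0;
}

/* Appends `value`, doubling the capacity when it is exhausted. Returns 0 on success. */
int MyIntVector_Push(MyIntVector *v, int value)
{
    assert(v != NULL);
    if (v->count == v->allocatedCount) {
        size_t newCap = v->allocatedCount ? v->allocatedCount * 2 : 1;
        int *grown = realloc(v->ptr, newCap * sizeof *v->ptr);
        if (grown == NULL)
            return -1;       /* the old block is still owned by v->ptr */
        v->ptr = grown;
        v->allocatedCount = newCap;
    }
    v->ptr[v->count++] = value;
    return 0;
}

void MyIntVector_Free(MyIntVector *v)
{
    assert(v != NULL);
    free(v->ptr);
    v->ptr = NULL;           /* reset so the vector can be reused safely */
    v->count = 0;
    v->allocatedCount = 0;
}
```

The number of elements is then simply v.count, which replaces the sizeof(a)/sizeof(a[0]) idiom that only works for true arrays.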
As ThreeStarProgrammer57 said, just use realloc.
realloc(NULL, nr_of_bytes) is equivalent to malloc(nr_of_bytes)
so
p = realloc(p, your_new_size)
will work just fine the first time if p is initialized to NULL. But be sure to pass the total number of bytes you need after resizing, not just the additional space that you want, as you have written in your question.
Regarding the size, you have to keep track of it yourself. That's the way C was designed.
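Put together, the grow-by-one function from the question might look like the following sketch. The name add_position and the separate element counter n are hypothetical; the temporary pointer is an addition so that a failed realloc does not lose the old block.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: grow *p by one int and store `value` in the new slot.
   *p may be NULL on the first call, in which case realloc acts like malloc. */
int add_position(int **p, size_t *n, int value)
{
    assert(p != NULL && n != NULL);
    /* Pass the TOTAL size after resizing, not just one extra sizeof(int). */
    int *tmp = realloc(*p, (*n + 1) * sizeof **p);
    if (tmp == NULL)
        return -1;           /* *p is untouched and still owns the old block */
    tmp[*n] = value;
    *p = tmp;
    *n += 1;                 /* the caller's counter is the only record of the size */
    return 0;
}
```

A caller would declare int *p = NULL; and size_t n = 0; once, call add_position(&p, &n, x) as often as needed, and finish with free(p).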

Array memory allocating override

I wrote code that allocates an array sized according to a string, and stores a pointer to the allocated array, inside a for loop:
int Note;
int ifd;
char **pointer[ir];
for (Note = 0; Note < ir; ++Note) {
    char ** Temp=malloc(Count(' ',Sentences[Note])*sizeof(char *));
    ifd=StoreArr(Sentences[Note],Temp," ");
    pointer[Note]=&Temp;
    printer(*(pointer[Note]),ifd);
}
char **Temp should create a new array each time, and pointer should store a pointer to the created array. When I print the created arrays (printer(*(pointer[Note]),ifd)), the output is correct:
hello
ola
hiya
howdy
eitan
eitanon
eitanya
But after exiting the for loop and trying to print the first array of strings, I only receive
eitan
eitanon
eitanya
Hence, I presume that the Temp allocation does not create new memory, but simply overwrites the existing allocation.
My question is how to solve the problem, so that new memory is allocated for Temp each time and there is room for all the arrays in Sentences.
Thanks
It is quite hard to figure out what you meant, unless you post the whole code. However I suspect the following line of code to cause your issue:
char **pointer[ir]; //three pointers here
When you print the array with the following line, you are accessing the "local memory" and it works fine. However, once you leave the function, the "local" variables are no longer accessible, so using them causes undefined behavior.
printer(*(pointer[Note]),ifd);
I would recommend you to change your pointer to something like this:
char*** pointer;
Bottom line, never return an array declared with []
NB: Give your variables, methods, modules a descriptive name
Cheers !
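One reading of the symptom that fits the point about local variables: pointer[Note] = &Temp stores the address of the loop-local variable Temp, not the address of the allocated array, so every element of pointer ends up referring to the same variable. A sketch that stores the heap pointer itself instead is shown below. Since Count, StoreArr and printer are not shown in the question, split_words is a hypothetical stand-in that splits on spaces, and the counts array is an assumption about how the lengths are kept.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Free an array of n heap-allocated words and the array itself. */
void free_words(char **words, size_t n)
{
    if (words == NULL)
        return;
    for (size_t i = 0; i < n; i++)
        free(words[i]);
    free(words);
}

/* Hypothetical stand-in for the question's Count/StoreArr pair: splits `sentence`
   on spaces into a new array of new strings and stores the word count in *n. */
char **split_words(const char *sentence, size_t *n)
{
    assert(sentence != NULL && n != NULL);
    size_t cap = strlen(sentence) / 2 + 1;      /* upper bound on the word count */
    char **words = malloc(cap * sizeof *words);
    *n = 0;
    if (words == NULL)
        return NULL;
    for (const char *s = sentence; *s != '\0'; ) {
        size_t len = strcspn(s, " ");           /* length of the next word */
        if (len > 0) {
            char *w = malloc(len + 1);
            if (w == NULL) {
                free_words(words, *n);
                *n = 0;
                return NULL;
            }
            memcpy(w, s, len);
            w[len] = '\0';
            words[(*n)++] = w;
        }
        s += len;
        s += strspn(s, " ");                    /* skip the separator(s) */
    }
    return words;
}

/* Fill pointer[i] and counts[i] for each of the ir sentences. Returns 0 on success. */
int split_all(const char *const sentences[], size_t ir,
              char **pointer[], size_t counts[])
{
    for (size_t i = 0; i < ir; i++) {
        char **temp = split_words(sentences[i], &counts[i]);
        if (temp == NULL) {
            while (i-- > 0)                     /* undo the earlier allocations */
                free_words(pointer[i], counts[i]);
            return -1;
        }
        pointer[i] = temp;   /* store the heap pointer itself, not &temp */
    }
    return 0;
}
```

Each pointer[i] now refers to its own heap block, which stays valid after the loop ends and until free_words is called on it.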

C call from Smalltalk

I'm trying to call EnumServicesStatus from within VisualWorks. For the first call I set the parameters to the required values to know how many bytes the returned information will require (pcbBytesNeeded).
Now I need to allocate memory for the lpServices buffer using malloc:, which expects the number of instances as an argument. How can I calculate this easily? Just dividing pcbBytesNeeded by the size of an LPENUM_SERVICE_STATUS struct makes my code crash when freeing the memory.
/Edit
I solved the crash when freeing the memory. (I accidentally manipulated the variable holding the pointer.) However, my question in the comment to Karsten is still valid. Why doesn't the size of ENUM_SERVICE_STATUS divide pcbBytesNeeded? Is this because of the LPTSTR lpServiceName and LPTSTR lpDisplayName members?
You can send #sizeOf to the ENUM_SERVICE_STATUS structure, similar to sizeof(ENUM_SERVICE_STATUS) in C.
Something like:
numItems := pcbBytesNeeded / self ENUM_SERVICE_STATUS sizeOf.
Please also make sure that you call the EnumServicesStatusW function, because EnumServicesStatus is a macro that resolves to EnumServicesStatusW in Unicode builds (and to EnumServicesStatusA otherwise).
