two mallocs returning same pointer value


I'm filling a structure with data from a line. The line can take one of 3 forms:
1.- "LD " (just one word)
2.- "LD A " (just 2 words)
3.- "LD A,B " (the second word is split by a comma).
The structure, called instruccion, has only the 3 pointers that point to each part (mnemo, op1 and op2), but when allocating memory for the second word, malloc sometimes returns the same value that it gave for the first word. Here is the code, with the mallocs in question marked:
instruccion sepInst(char *linea){
instruccion nueva;
char *et;
while(linea[strlen(linea)-1]==32||linea[strlen(linea)-1]==9)//Eliminating spaces and tabs at the end of the line
linea[strlen(linea)-1]=0;
et=nextET(linea);//Save the address of the next space or tab
if(*et==0){//If there is none, I save everything in mnemo
nueva.mnemo=malloc(strlen(linea)+1);
strcpy(nueva.mnemo,linea);
nueva.op1=malloc(2);
nueva.op1[0]='k';nueva.op1[1]=0;//And set a "K" for op1
nueva.op2=NULL;
return nueva;
}
nueva.mnemo=malloc(et-linea+1);/* <--- first malloc in question */
strncpy(nueva.mnemo,linea,et-linea);
nueva.mnemo[et-linea]=0;printf("\nj%pj",(void*)nueva.mnemo);
linea=et;
while(*linea==9||*linea==32)//Move pointer to the second word
linea++;
if(strchr(linea,',')==NULL){//Check if there is a comma
nueva.op1=malloc(strlen(linea)+1);//Do this if there wasn't any comma
strcpy(nueva.op1,linea);
nueva.op2=NULL;
}
else{//Do this if there was a comma
nueva.op1=malloc(strchr(linea,',')-linea+1);/* <--- second malloc in question */
strncpy(nueva.op1,linea,strchr(linea,',')-linea);
nueva.op1[strchr(linea,',')-linea]=0;
linea=strchr(linea,',')+1;
nueva.op2=malloc(strlen(linea)+1);
strcpy(nueva.op2,linea);printf("\n2j%pj2",(void*)nueva.op2);
}
return nueva;
}
When I print the pointers, they turn out to be the same number.
Note: the function char *nextET(char *line) returns the address of the first space or tab in the line; if there is none, it returns the address of the end of the line.
sepInst() is called several times in the program, and it only starts failing after several calls. These mallocs across my whole program are giving me such a headache.

There are two main possibilities.
Either you are freeing the memory somewhere else in your program (search for calls to free or realloc). In this case the effect that you see is completely benign.
Or, you might be suffering from memory corruption, most likely a buffer overflow. The short term cure is to use a specialized tool (a memory debugger). Pick one that is available on your platform. The tool will require recompilation (relinking) and will eventually tell you exactly where your code is stepping beyond previously defined buffer limits. There may be multiple offending code locations. Treat each one as a serious defect.
Once you get tired of this kind of research, learn to use the const qualifier and use it with all variable/parameter declarations where you can do it cleanly. This cannot completely prevent buffer overflows, but it will restrict them to variables intended to be writable buffers (which, for example, those involved in your question apparently are not).
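To illustrate the first possibility (memory freed elsewhere): once a block has been freed, the allocator is allowed to hand out the same address again, so seeing the same pointer value twice is harmless as long as the first block is no longer in use. A minimal sketch, where the function name show_reuse is made up for illustration and whether the two addresses actually match depends entirely on the allocator:

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Allocates, frees, and allocates again, printing both addresses.
 * Returns 0 on success, -1 if an allocation failed.
 * Many allocators reuse the freed block, but nothing guarantees it. */
int show_reuse(void)
{
    char *first = malloc(16);
    if (first == NULL)
        return -1;

    /* Save the address as an integer: the pointer itself must not be
     * used after free(). */
    uintptr_t first_addr = (uintptr_t)first;
    printf("first : %p\n", (void *)first);   /* %p, not %x, for pointers */
    free(first);

    char *second = malloc(16);
    if (second == NULL)
        return -1;
    printf("second: %p\n", (void *)second);

    if ((uintptr_t)second == first_addr)
        puts("same address reused (benign: the first block was freed)");

    free(second);
    return 0;
}
```

If the two pointers are equal while both blocks are still supposed to be alive, that points to the second possibility instead.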

On a side note, personally, I think you should work harder to call malloc less. It's good for performance, and fewer allocations also mean fewer chances for corruption.
nueva.mnemo=malloc(strlen(linea)+1);
strcpy(nueva.mnemo,linea);
nueva.op1=malloc(2);
should be
// strlen has to traverse your string to get the length,
// so if you need it more than once, save its value.
size_t cbLineA = strlen(linea);
// malloc for the string, and the 2 bytes you need for op1.
nueva.mnemo=malloc(cbLineA + 3);
// strcpy checks for \0 again, so use memcpy
memcpy(nueva.mnemo, linea, cbLineA);
nueva.mnemo[cbLineA] = 0;
// here we avoid a second malloc by pointing op1 to the space we left after linea
nueva.op1 = nueva.mnemo + cbLineA + 1;
Whenever you can reduce the number of mallocs by pre-calculation....do it. You are using C! This is not some higher level language that abuses the heap or does garbage collection!
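Put together, the single-allocation idea could look like the sketch below. The struct layout is an assumption (the question only says it holds three pointers), and fill_single_word is a hypothetical helper name, not something from the original program:

```c
#include <stdlib.h>
#include <string.h>

/* Assumed layout: the question only says the struct has three pointers. */
typedef struct {
    char *mnemo;
    char *op1;
    char *op2;
} instruccion;

/* One-word case: store the mnemonic and the placeholder "k" for op1 in a
 * single block. Returns 0 on success, -1 if malloc fails. */
int fill_single_word(instruccion *out, const char *linea)
{
    size_t len = strlen(linea);

    /* len bytes + NUL for mnemo, then 'k' + NUL for op1 */
    char *block = malloc(len + 1 + 2);
    if (block == NULL)
        return -1;

    memcpy(block, linea, len + 1);   /* copies the terminator too */
    out->mnemo = block;

    out->op1 = block + len + 1;      /* points into the same block */
    out->op1[0] = 'k';
    out->op1[1] = '\0';

    out->op2 = NULL;
    return 0;
}
```

The price of sharing one block is that only mnemo may be passed to free(); op1 points into the middle of that block and must never be freed on its own.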

Related

Is this code vulnerable to buffer overflow?

Fortify reported a buffer overflow vulnerability in below code citing following reason -
In this case we are primarily concerned with the case "Depends upon properties of the data that are enforced outside of the immediate scope of the code.", because we cannot verify the safety of the operation performed by memcpy() in abc.cpp
void create_dir(const char *sys_tmp_dir, const char *base_name,
size_t base_name_len)
{
char *tmp_dir;
size_t sys_tmp_dir_len;
sys_tmp_dir_len = strlen(sys_tmp_dir);
tmp_dir = (char*) malloc(sys_tmp_dir_len + 1 + base_name_len + 1);
if(NULL == tmp_dir)
return;
memcpy(tmp_dir, sys_tmp_dir, sys_tmp_dir_len);
tmp_dir[sys_tmp_dir_len] = FN_LIBCHAR;
memcpy(tmp_dir + sys_tmp_dir_len + 1, base_name, base_name_len);
tmp_dir[sys_tmp_dir_len + base_name_len + 1] = '\0';
..........
..........
}
It appears to me to be a false positive, since we get the size of the data first, allocate that much space, and then call memcpy with that size.
But I am looking for good reasons to convince a fellow developer to get rid of the current implementation and use C++ strings instead. This issue has been assigned to him. He sees it as a false positive, so he doesn't want to change anything.
Edit: I see quick, valid criticism of the current code. Hopefully, I'll be able to convince him now. Otherwise, I'll hold the baton. :)

Take a look at strlen(): it takes an input string but has no upper bound, so it will keep searching until it finds \0. That is a vulnerability because you then perform memcpy() trusting its result (if it hasn't already crashed with an access violation while searching). Imagine:
create_dir((const char*)12345, baseDir, strlen(baseDir));
You tagged both C and C++...if you're using C++ then std::string will protect you from these issues.

It appears to me a false positive since we are getting the size of data first, allocating that much amount of space
This assumption is a problem that matches the warning/error. In your code, you're assuming that malloc successfully allocated the requested memory. If your system has no memory to spare, malloc will fail and return NULL. When you try to memcpy into tmp_dir, you'd be copying to NULL which would be bad news.
You should check to guarantee that the value returned by malloc is not NULL before considering it as a valid pointer.
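A defensive variant of the path-building step might look like the sketch below. It adds NULL checks on the inputs and guards the size arithmetic against wrap-around. The names join_path and PATH_SEP are invented here (PATH_SEP stands in for FN_LIBCHAR, whose definition is not shown), and the function still has to trust that dir is terminated and that base_len matches base, which is exactly the "enforced outside the immediate scope" part the tool complains about:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PATH_SEP '/'   /* stand-in for FN_LIBCHAR, which is not shown */

/* Returns a newly allocated "dir<sep>base" string, or NULL on bad input,
 * size overflow, or allocation failure. The caller frees the result.
 * base need not be NUL-terminated; exactly base_len bytes are copied. */
char *join_path(const char *dir, const char *base, size_t base_len)
{
    if (dir == NULL || base == NULL)
        return NULL;

    size_t dir_len = strlen(dir);   /* still trusts dir to be terminated */

    /* dir_len + 1 (separator) + base_len + 1 (NUL) must not wrap around */
    if (dir_len > SIZE_MAX - 2 || base_len > SIZE_MAX - 2 - dir_len)
        return NULL;

    char *out = malloc(dir_len + 1 + base_len + 1);
    if (out == NULL)
        return NULL;

    memcpy(out, dir, dir_len);
    out[dir_len] = PATH_SEP;
    memcpy(out + dir_len + 1, base, base_len);
    out[dir_len + 1 + base_len] = '\0';
    return out;
}
```

This does not make the warning provably false, but it narrows what the caller has to get right to two documented preconditions.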

memset() not setting memory in c

I apologize if my formatting is incorrect, as this is my first post. I couldn't find a post on the site that dealt with the same issue I am running into. I'm using plain C on Ubuntu 12.04 server. I'm trying to concatenate several strings together into a single string, separated by Ns. The string sizes and the space between strings may vary, however. A struct was made to store the positional data as several integers that can be passed to multiple functions:
typedef struct pseuInts {
int pseuStartPos;
int pseuPos;
int posDiff;
int scafStartPos;
} pseuInts;
As well as a string struct:
typedef struct string {
char *str;
int len;
} myString;
Since there are break conditions for the concatenated string multiple nodes of a dynamically linked list were assembled containing an identifier and the concatenated string:
typedef struct entry {
myString title;
myString seq;
struct entry *next;
} entry;
The memset call is as follows:
} else if ((*pseuInts)->pseuPos != (*pseuInts)->scafStartPos) {
(*pseuEntry)->seq.str = realloc ((*pseuEntry)->seq.str, (((*pseuEntry)->seq.len) + (((*pseuInts)->scafStartPos) - ((*pseuInts)->pseuPos)))); //realloc the string being extended to account for the Ns
memset (((*pseuEntry)->seq.str + ((*pseuEntry)->seq.len)), 'N', (((*pseuInts)->scafStartPos) - ((*pseuInts)->pseuPos))); //insert the correct number of Ns
(*pseuEntry)->seq.len += (((*pseuInts)->scafStartPos) - ((*pseuInts)->pseuPos)); //Update the length of the now extended string
(*pseuInts)->pseuPos += (((*pseuInts)->scafStartPos) - ((*pseuInts)->pseuPos)); //update the position values
}
These are all being dereferenced as this else if decision is in a function being called by a function called from main, but the changes to the pseuEntry struct need to be updated in main so as to be passed to another function for further processing.
I've double checked the numbers being used in pseuInts by inserting some printf commands and they are correct in the positioning of how many Ns need to be added, even as they change between different short strings. However, when the program is run the memset only inserts Ns the first time it's called. IE:
GATTGT and TAATTTGACT are separated by 4 spaces and they become:
GATTGTNNNNTAATTTGACT
The second time it is called on the same concatenated string it doesn't work though. IE:
TAATTTGACT and TCTCC are separated by 6 spaces so the long string should become:
GATTGTNNNNTAATTTGACTNNNNNNTCTCC
but it only shows:
GATTGTNNNNTAATTTGACTTCTCC
I've added printfs to display the concatenated string immediately before and after the memset, and they are identical in output.
Sometimes the insertion is adding extra character spaces, but not initializing them so they print nonsense (as would be expected). IE:
GAATAAANNNNNNNNNNNNNNNNN¬GCTAATG
should be
GAATAAANNNNNNNNNNNNNNNNNGCTAATG
I've replaced the memset with a for or a while loop and I get the same result. I used an intermediate char * for the realloc and still get the same result. I'm looking for suggestions as to where I should look to try to detect the error.

If you are okay with considering a completely different approach, I would like to offer this:
I understand your intent to be: Replace existing spaces between two strings with an equal number of "N"s. memset() (and associated memory allocations) is the primary method to perform the concatenations.
The problems you have described with your current concatenation attempts are :
1) garbage embedded in resulting string.
2) writing "N" in some unintended memory locations.
3) "N" not being written in other intended memory locations.
Different approach:
First: verify that the memory allocated to the string being modified is sufficient to contain the results.
Second: verify that all strings to be concatenated are \0-terminated before attempting concatenation.
Third: use strcat() and a for(;;) loop to append all the "N"s and, eventually, the subsequent strings.
eg.
for(i=0;i<numNs;i++)//compute numNs with your existing variables
{
strcat(firstStr, "N");//Note: "N" is already NUL-terminated, and strcat() keeps the result terminated.
}
strcat(firstStr, lastStr); //a null terminated concatenation
I know this approach is vastly different from what you were doing, but it does address at least the issues identified from your problem statement. If this makes no sense, please let me know and I will address questions as I am able to. (currently have other projects going on)
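For comparison, here is a sketch that keeps the realloc/memset style but reserves one extra byte and rewrites the terminator after every append. One plausible cause of the symptoms in the question is that the realloc leaves no room for a terminator, so anything that later treats seq.str as a C string reads past the data. This sketch assumes the myString struct from the question, with len counting characters only (terminator excluded); grow, append_fill and append_str are invented helper names:

```c
#include <stdlib.h>
#include <string.h>

typedef struct string {
    char *str;
    int len;          /* number of characters, not counting the NUL */
} myString;

/* Grows s so it can hold extra more characters plus the terminator.
 * Returns 0 on success; on failure s is left unchanged. */
static int grow(myString *s, size_t extra)
{
    char *p = realloc(s->str, (size_t)s->len + extra + 1);
    if (p == NULL)
        return -1;
    s->str = p;
    return 0;
}

/* Appends count copies of fill (for example 'N'). */
int append_fill(myString *s, char fill, size_t count)
{
    if (grow(s, count) != 0)
        return -1;
    memset(s->str + s->len, fill, count);
    s->len += (int)count;
    s->str[s->len] = '\0';
    return 0;
}

/* Appends the NUL-terminated string src. */
int append_str(myString *s, const char *src)
{
    size_t n = strlen(src);
    if (grow(s, n) != 0)
        return -1;
    memcpy(s->str + s->len, src, n);
    s->len += (int)n;
    s->str[s->len] = '\0';
    return 0;
}
```

Starting from a myString of { NULL, 0 } works, because realloc with a NULL pointer behaves like malloc. Unlike the strcat loop, each append touches only the new bytes instead of rescanning the whole string.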

Looking at your memset:
memset (((*pseuEntry)->seq.str + ((*pseuEntry)->seq.len)), ...)
That's the destination. Shouldn't it be:
memset (((*pseuEntry)->seq.str + ((*pseuEntry)->seq.len) + ((*pseuInts)->pseuStartPos)), ...)
Otherwise I'm missing the meaning of pseuInts.

Concatenate with memcpy

I'm trying to join two strings together using memcpy. The first memcpy does copy the data I require. The second one, however, does not append anything. Any idea why?
if (strlen(g->db_cmd) < MAX_DB_CMDS )
{
memcpy(&g->db_cmd[strlen(g->db_cmd)],l->db.param_value.val,strlen(l->db.param_value.val));
memcpy(&g->db_cmd[strlen(g->db_cmd)],l->del_const,strlen(l->del_const));
g->cmd_ctr++;
}

size_t len = strlen(l->db.param_value.val);
memcpy(g->db_cmd, l->db.param_value.val, len);
memcpy(g->db_cmd + len, l->del_const, strlen(l->del_const)+1);
This gains you the following:
Less redundant calls to strlen. Each of those must traverse the string, so it's a good idea to minimize these calls.
The 2nd memcpy needs to actually append, not replace. So the first argument has to differ from the previous call.
Note the +1 in the 3rd arg of the 2nd memcpy. That is for the NUL terminator.
I'm not sure your if statement makes sense either. Perhaps a more sane thing to do would be to make sure that g->db_cmd has enough space for what you are about to copy. You would do that via either sizeof (if db_cmd is an array of characters) or by tracking how big your heap allocations are (if db_cmd was acquired via malloc). So perhaps it would make most sense as:
size_t param_value_len = strlen(l->db.param_value.val),
del_const_len = strlen(l->del_const);
// Assumption is that db_cmd is a char array and hence sizeof(db_cmd) makes sense.
// If db_cmd is a heap allocation, replace the sizeof() with how many bytes you
// asked malloc for.
//
if (param_value_len + del_const_len < sizeof(g->db_cmd))
{
memcpy(g->db_cmd, l->db.param_value.val, param_value_len);
memcpy(g->db_cmd + param_value_len, l->del_const, del_const_len + 1);
}
else
{
// TODO: your buffer is not big enough. handle that.
}

You're not copying the null terminator, you're only copying the raw string data. That leaves your string non-null-terminated, which can cause all sorts of problems. You're also not checking to make sure you have enough space in your buffer, which can result in buffer overflow vulnerabilities.
To make sure you copy the null terminator, just add 1 to the number of bytes you're copying -- copy strlen(l->db.param_value.val) + 1 bytes.

One possible problem is that your first memcpy() call won't necessarily result in a null terminated string, since you're not copying the '\0' terminator from l->db.param_value.val.
So when strlen(g->db_cmd) is called in the second call to memcpy() it might be returning something completely bogus. Whether this is a problem depends on whether the g->db_cmd buffer is initialized to zeros beforehand or not.
Why not use the strcat(), which was made to do exactly what you're trying to do with memcpy()?
if (strlen(g->db_cmd) < MAX_DB_CMDS )
{
strcat( g->db_cmd, l->db.param_value.val);
strcat( g->db_cmd, l->del_const);
g->cmd_ctr++;
}
That'll have the advantage of being easier for someone to read. You might think it would be less performant - but I don't think so since you're making a bunch of strlen() calls explicitly. In any case, I'd concentrate on getting it right first, then worry about performance. Incorrect code is as unoptimized as you can get - get it right before getting it fast. In fact, my next step wouldn't be to improve the code performance-wise, it would be to improve the code to be less likely to have a buffer overrun (I'd probably switch to using something like strlcat() instead of strcat()).
For example, if g->db_cmd is a char array (and not a pointer), the result might look like:
size_t orig_len = strlen(g->db_cmd);
size_t result = strlcat( g->db_cmd, l->db.param_value.val, sizeof(g->db_cmd));
result = strlcat( g->db_cmd, l->del_const, sizeof(g->db_cmd));
g->cmd_ctr++;
if (result >= sizeof(g->db_cmd)) {
// the new stuff didn't fit, 'roll back' to what we started with
g->db_cmd[orig_len] = '\0';
g->cmd_ctr--;
}
If strlcat() isn't part of your platform it can be found on the net pretty easily. If you're using MSVC there's a strcat_s() function which you could use instead (but note that it's not equivalent to strlcat() - you'd have to change how the results from calling strcat_s() are checked and handled).
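If you would rather write the helper than hunt for one, a minimal stand-in could look like the sketch below. It is named my_strlcat to make clear that it is not the platform function; it follows the BSD return convention as I understand it (the length the result would have had with unlimited space, so a return value >= size means truncation):

```c
#include <stddef.h>
#include <string.h>

/* Appends src to the string in dst, never writing more than size bytes
 * in total (terminator included). Returns the length the result would
 * have had with unlimited space; a value >= size means truncation. */
size_t my_strlcat(char *dst, const char *src, size_t size)
{
    size_t src_len = strlen(src);
    size_t dst_len = 0;

    /* Length of dst, but never look past size bytes */
    while (dst_len < size && dst[dst_len] != '\0')
        dst_len++;

    if (dst_len == size)            /* no terminator found: do not write */
        return size + src_len;

    size_t room = size - dst_len - 1;            /* bytes free before the NUL */
    size_t n = src_len < room ? src_len : room;  /* copy what fits */
    memcpy(dst + dst_len, src, n);
    dst[dst_len + n] = '\0';
    return dst_len + src_len;
}
```

It can be dropped into the rollback pattern above unchanged, since the check against sizeof(g->db_cmd) relies only on the return value.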

String concats onto another without an assignment, why is this?

Below is a function from a program:
//read the specified file and check for the input ssn
int readfile(FILE *fptr, PERSON **rptr){
int v=0, i, j;
char n2[MAXS+1], b[1]=" ";
for(i=0; i<MAXR; i++){
j=i;
if(fscanf(fptr, "%c\n%d\n%19s %19s\n%d\n%19s\n%d\n%19s\n%19s\n%d\n%d\n%19s\n\n",
&rptr[j]->gender, &rptr[j]->ssn, rptr[j]->name, n2, &rptr[j]->age,
rptr[j]->job, &rptr[j]->income, rptr[j]->major, rptr[j]->minor,
&rptr[j]->height, &rptr[j]->weight, rptr[j]->religion)==EOF) {
i=MAXR;
}
strcat(rptr[j]->name, b);
//strcat(rptr[j]->name, n2);
if(&rptr[MAXR]->ssn==&rptr[j]->ssn)
v=j;
}
return v;
}
The commented line is like that because, for some reason, the array 'b' contains the string 'n2' despite an obvious lack of an assignment. This occurs before the first strcat call, but after/during the fscanf call.
It does accomplish the desired goal, but why is n2 concatenated onto the end of b, especially when b only has reserved space for 1 array element?
Here is a snippet of variable definitions after the fscanf call:
*rptr[j]->name = "Rob"
b = " Low"
n2= "Low"

It works, because you got lucky. b and n2 happened to be next to each other in memory, in the right order. C doesn't do boundary checking on arrays and will quite happily let you overflow them. So you can declare an array like this:
char someArray[1] = "lots and lots of characters";
A C compiler (certainly an old one) will typically accept this, usually with a warning, even though there clearly isn't enough space in someArray to store that many characters. Strictly speaking, an initializer string longer than the array is a constraint violation in C, so the compiler has to diagnose it; on my compiler it limits the population to the size of the array, so it doesn't overflow the boundary (someArray=={'l'}).
Your situation is the same (although less extreme). char b[1] creates an array with enough room to store 1 byte. You're putting a space in that byte, so there's no room for the null terminator (C allows this exact fit and simply leaves the terminator out). strcat keeps copying memory until it gets to a null terminator, so it keeps going until it finds one, even if that's not until the end of the next string (which is what's happening in your case).
If you had been using a C++ compiler, it would have reported an error (or at least a warning) to tell you that you were trying to put too many items into the array.

b needs to be size 2: 1 for the space, 1 for the null.
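A small sketch of the fix, assuming the name buffers are plain char arrays whose capacity is known to the caller. The helper name append_last_name is made up for illustration. Declaring the separator without an explicit size lets the compiler count the terminator for you:

```c
#include <string.h>

/* Appends " " and last to the string in name, which has room for cap bytes.
 * Returns 0 on success, -1 if the result would not fit (name is unchanged). */
int append_last_name(char *name, size_t cap, const char *last)
{
    const char sep[] = " ";   /* sizeof sep == 2: the space and the NUL */
    size_t need = strlen(name) + (sizeof sep - 1) + strlen(last) + 1;

    if (need > cap)
        return -1;
    strcat(name, sep);        /* safe: sep is properly terminated */
    strcat(name, last);
    return 0;
}
```

With a terminated separator, strcat stops after the space instead of running on into whatever happens to follow it in memory.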

Coredump when parsing chapters

I am doing a homework assignment that reads in a book. First, a line is read in and a pointer made to point at that line. Then a paragraph function reads in lines and stores their address into a array of pointers. Now, I am on reading a chapter (a paragraph recognized by the next line being broke). It should call get_paragraph() and store the address of paragraphs until it comes to a new chapter.
A new chapter is the only time in the book where the first character in the line is not a space. I think this is where I am having problems in my code. All functions up to this point work. I hope I have provided enough information. The code compiles but core dumps when started.
I am a student and learning so please be kind. Thanks.
char*** get_chapter(FILE * infile){
int i=0;
char **chapter[10000];//an array of pointers
// Populate the array
while(chapter[i]=get_paragraph(infile)) { //get address store into array
if(!isspace(**chapter[0])){ //check to see if it is a new chapter<---problem line?
// save paragraph not used in chapter using static to put into next chapter
break;
}
i++;//increment array
}
//add the null
chapter[++i]='\0';//put a null at the end to signify end of array
//Malloc the pointer
char**(*chap) = malloc(i * sizeof(*chap));//malloc space
//Copy the array to the pointer
i=0;//reset address
while(chapter[i]){//while there are addresses in chapter
chap[i] = chapter[i++];//change addresses into chap
}
chap[i]='\0';//null to signify end of chapter
//Return the pointer
return(chap);//return pointer to array
}
For those who would rather see without comments:
char*** get_chapter(FILE * infile){
int i=0;
char **chapter[10000];
while(chapter[i]=get_paragraph(infile)) {
if(!isspace(**chapter[0])){
break;
}
i++;
}
chapter[++i]='\0';
char**(*chap) = malloc(i * sizeof(*chap));//malloc space
i=0;
while(chapter[i]){
chap[i] = chapter[i++];
}
chap[i]='\0';
return(chap);
}

Comments inline.
char*** get_chapter(FILE * infile) {
int i=0;
// This is a zero length array!
// (The comma operator returns its right-hand value.)
// Trying to modify any element here can cause havoc.
char **chapter[10,000];
while(chapter[i]=get_paragraph(infile)) {
// Do I read this right? I translate it as "if the first character of
// the first line of the first paragraph is not whitespace, we're done."
// Not the paragraph just read in -- the first paragraph. So this will exit
// immediately or else loop forever and walk off the end of the array
// of paragraphs. I think you mean **chapter[i] here.
if(!isspace(**chapter[0])){
break;
}
i++;
}
// Using pre-increment here means you leave one item in the array uninitialized
// which can also cause a fault later on. Use post-increment instead.
// Also '\0' here is the wrong sort of zero; I think you need NULL instead.
chapter[++i]='\0';
char**(*chap) = malloc(i * sizeof(*chap));
i=0;
while(chapter[i]) {
// Reading i and incrementing it in the same statement, with nothing to order
// the two, is undefined behavior in C. Do the increment in a separate statement.
chap[i] = chapter[i++];
}
// Wrong flavor of zero again.
chap[i]='\0';
return(chap);
}

Can I suggest that you use for loops instead of whiles? You need to stop if you run out of space, so you might as well use the appropriate construct.
I suspect you have a bug in this code:
while(chapter[i]=get_paragraph(infile)) {
if(!isspace(**chapter[0])){
break;
}
i++;
}
chapter[++i]='\0';
Firstly, shouldn't it be chapter[i] instead of chapter[0]? You want to know if the pointer at chapter[i] points to a space, not the first pointer in chapter. So this will probably loop indefinitely - hence the need for a for loop, so you don't just loop forever accidentally.
Secondly, you increment i at the end of the while block, and then again in the chapter[++i] assignment. i has already been incremented by the final loop execution before the while condition breaks, so it is already the correct position to use. ++i increments before yielding the value, so presumably you meant to have i++ here, so that it would increment after yielding the current value of i. Either way, it's confusing one of us as to what you mean, so maybe just put the increment on a separate line for clarity. The compiler will sort out any available optimisation.
Finally, why are you setting the value to '\0'? That's a null character, but your array is of pointers. It happens to work, because '\0' is an integer constant with the value 0 and therefore converts to a null pointer, but writing NULL states the intent much more clearly.
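The copy step at the end of the function can be sketched on its own, with the terminator slot counted in the allocation, NULL used for the end marker, and the index incremented separately from the assignment. The name copy_paragraph_list is hypothetical, and get_paragraph is not needed here because the sketch only deals with an array that is already NULL-terminated:

```c
#include <stdlib.h>

/* Copies a NULL-terminated array of paragraphs (each a char **) into a
 * heap block of exactly the right size. Returns NULL if malloc fails.
 * Only the pointers are copied, not the paragraphs themselves. */
char ***copy_paragraph_list(char ***src)
{
    size_t count = 0;
    while (src[count] != NULL)
        count++;

    /* count entries plus one slot for the terminating NULL */
    char ***dst = malloc((count + 1) * sizeof *dst);
    if (dst == NULL)
        return NULL;

    for (size_t i = 0; i < count; i++)
        dst[i] = src[i];        /* index and increment kept separate */
    dst[count] = NULL;
    return dst;
}
```

The for loop also gives a natural place to add an upper bound when filling the fixed-size array, so a missing terminator cannot run off the end.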

Have you tried single stepping through it in gdb, and occasionally dumping the local variables to see the current state? It's a good way to learn. You may want to add a few extra intermediate variables that "info locals" will automatically dump as well (pointers to the current XXX, where XXX is various items in your hierarchy).
I assume a GNU environment:
% gcc -g homework.c -o hw
% gdb hw
(gdb) b 10
(gdb) r
(gdb) info locals
(gdb) n
(gdb) info locals
...
Replace "10" with a suitable line number near the beginning of the function.

Shouldn't it be:
if(!isspace(**chapter[i])){
Each chapter[i] is a pointer to a pointer to a char; that char is the first character of paragraph i. So **chapter[i] is the first character of the paragraph just read, whereas chapter[0] only ever looks at the first paragraph.