Having trouble diagnosing a heap buffer overflow - c

I'm trying to analyze a function that a project partner wrote to try and figure out why it's giving me a heap buffer overflow. I have run this through valgrind and cppcheck, but they didn't bring me any closer.
The function is designed to parse through a string and pull out a value to sort on. Special cases to handle were null values, and movie titles with commas in them. (Since we're parsing a CSV file, if data has a comma in it, that data needs to be handled differently.) Here is the relevant snippet:
char* findKey(char lineBuffer[], int columnNumber ){
char tempArray[512];
int commasCounted = 0;
int i =0;
for(i = 0; i< 1024; i++){
if (commasCounted == columnNumber){
commasCounted = i;
break;
}
if (lineBuffer[i] == '\"'){
while(lineBuffer[i] != '\"'){
i++;
}
}
if (lineBuffer[i] == ','){
commasCounted++;
}
}
if(lineBuffer[commasCounted] == ','){
tempArray[0] = '0';
tempArray[1] = '0';
tempArray[2] = '0';
tempArray[3] = '0';
tempArray[4] = '\0';
}else{
int j = 0;
for(i = commasCounted; i < 1024; i++){
if(lineBuffer[i] == '\"'){
i++;
while(lineBuffer[i] != '\"'){
tempArray[j] = lineBuffer[i];
i++;
j++;
}
break;
}else if(lineBuffer[i] == ','){
break;
}else
tempArray[j] = lineBuffer[i];
j++;
}
tempArray[j] = '\0';
}
char* tempString = strtok(tempArray, "\n");
return tempString;
}
I believe the part that's blowing up is this section:
while(lineBuffer[i] != '\"'){
tempArray[j] = lineBuffer[i];
i++;
j++;
}
I just can't figure out exactly why. Is there a way to fix this? I don't know if this is the cause, but this is the input for lineBuffer when it breaks:
Color,Glenn Ficarra,310,118,43,7000,Emma Stone,33000,84244877,Comedy|Drama|Romance,Ryan Gosling,"Crazy, Stupid, Love. ",375456,57426,Steve Carell,7,bar|divorce|friend|girl|male objectification,http://www.imdb.com/title/tt1570728/?ref_=fn_tt_tt_1,292,English,USA,PG-13,50000000,2011,15000,7.4,2.39,44000,
Any help would be appreciated. Thank you!

You have a few problems here.
First, you are returning a pointer to a local variable: strtok returns a pointer into tempArray, which stops existing when findKey returns. Using that pointer invokes undefined behavior. You should allocate the string with malloc, or duplicate tempArray with strdup, and return that instead.
Second, you should not declare tempArray with size 512, which seems to be an arbitrary number. tempArray should be at least large enough to hold lineBuffer, so you could declare it as char tempArray[strlen(lineBuffer)+1];
You iterate from 0 to 1023 when searching for commas, which may access lineBuffer out of bounds. A better option would be to iterate from 0 to strlen(lineBuffer)-1. Since you also increment i inside the loop, you should always check that your index is within bounds.
But none of those things seems to be a problem for the provided test string, since it fits inside all the buffers.
I think the problem is here:
if (lineBuffer[i] == '\"')
{
while(lineBuffer[i] != '\"')
{
i++;
}
}
If you find a '\"' you want to skip everything until you find the next one. If you think about it, your code has no effect at all, since the while loop can never be entered: lineBuffer[i] is still the opening quote when the condition is tested. You should change this to:
if (lineBuffer[i] == '\"')
{
i++;
while(lineBuffer[i] && lineBuffer[i] != '\"')
{
i++;
}
}

Make sure index i doesn't move past the end of lineBuffer:
if (lineBuffer[i] && lineBuffer[i] == '\"'){...
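Putting the points above together, a minimal sketch could look like the one below. It is not the asker's code: the name find_key_copy is made up for illustration, and it assumes that lineBuffer is NUL-terminated, that columnNumber is zero-based, and that the caller frees the result. The "0000" placeholder for an empty field is taken from the question; treating a missing column the same way is an assumption.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical rewrite of findKey: returns a malloc'd copy of the requested
 * column (the caller must free it), or NULL if allocation fails. */
char *find_key_copy(const char *lineBuffer, int columnNumber)
{
    size_t len = strlen(lineBuffer);
    size_t i = 0;
    int commasCounted = 0;

    /* Skip columnNumber fields, ignoring commas inside double quotes.
     * Every access is guarded by i < len. */
    while (i < len && commasCounted < columnNumber) {
        if (lineBuffer[i] == '"') {
            i++;                                /* step over the opening quote */
            while (i < len && lineBuffer[i] != '"')
                i++;                            /* stop on the closing quote */
        }
        if (i < len && lineBuffer[i] == ',')
            commasCounted++;
        if (i < len)
            i++;
    }

    /* Empty or missing field: mirror the question's "0000" placeholder. */
    if (i >= len || lineBuffer[i] == ',' || lineBuffer[i] == '\n') {
        char *empty = malloc(5);
        if (empty != NULL)
            memcpy(empty, "0000", 5);
        return empty;
    }

    /* The field can never be longer than the rest of the line. */
    char *out = malloc(len - i + 1);
    if (out == NULL)
        return NULL;

    size_t j = 0;
    if (lineBuffer[i] == '"') {
        i++;                                    /* step over the opening quote */
        while (i < len && lineBuffer[i] != '"')
            out[j++] = lineBuffer[i++];
    } else {
        while (i < len && lineBuffer[i] != ',' && lineBuffer[i] != '\n')
            out[j++] = lineBuffer[i++];
    }
    out[j] = '\0';
    return out;
}
```

With the line from the question, find_key_copy(lineBuffer, 11) should give back the quoted title with its commas intact, and the caller is responsible for calling free on it afterwards.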

Related

Why do the values in my dynamically-allocated memory look like they're being overwritten

I'm working on an assignment that is supposed to parse a string into separate tokens without the use of the c string library and while dynamically allocating any necessary memory. I thought I had everything working correctly, except now it looks like every value is being overwritten every time I write a new value.
Here's my code. Sorry it's a mess; I've been in a hurry and have reluctantly been working with functions I don't fully understand. The problem is probably something dumb, but I'm out of time and it's clear I probably won't be able to figure it out myself.
int makearg(char s[], char **args[]);
int main() {
char **tokenArray;
char strInput[MAXSTRING];
int tokenResult;
int i = 0;
printf("Input String to be Parsed: ");
scanf("%[^\n]%*c", strInput);
tokenResult = makearg(strInput, &tokenArray);
printf("argc: %d\n", tokenResult);
for (i = 0; i < tokenResult; i++) {
printf("arg(%d): %s\n", i, tokenArray[i]);
}
}
int makearg(char s[], char **args[]) {
int numTokens = 0;
int lastSpace = 0;
int i;
int fakeI;
char token[MAXSTRING];
int subFromPos = 0;
int firstToken = 1;
*args = NULL;
while ((s[i] != '\n') && (s[i] != '\0') && (s[i] != '\r')) {
fakeI = i;
if ((s[i + 1] == '\n') || (s[i + 1] == '\0'))
{
fakeI = i + 1;
}
token[i - lastSpace - subFromPos] = s[i];
if ((s[fakeI] == ' ') || (s[fakeI] == '\n') || (s[fakeI] == '\0') || (s[fakeI] == '\r'))
{
if (firstToken == 1)
{
token[fakeI - lastSpace] = '\0';
firstToken = 0;
} else if (firstToken == 0){
token[i - lastSpace] = '\0';
printf("Saved Token 1: %s\n", *args[numTokens - 1]); //test to see if the token got written properly
if (numTokens > 1){
printf("Prior Saved Token: %s\n", *args[numTokens - 2]); //test to see if the tokens are overwritten
}
if (numTokens > 2){
printf("Prior Saved Token 2: %s\n", *args[numTokens - 3]); //test to see if the tokens are overwritten
}
}
*args = realloc(*args, (numTokens + 1));
args[numTokens] = NULL;
args[numTokens] = realloc(args[numTokens], (fakeI - lastSpace + 1));
*args[numTokens] = token;
printf("Saved Token: %s\n", *args[numTokens]); //test to see if the token got written properly
numTokens++;
lastSpace = fakeI;
subFromPos = 1;
}
i++;
}
numTokens++;
return numTokens;
}
For whatever reason Saved Token, Saved Token 1, Prior Saved Token, and Prior Saved Token 2 all print the same value every time they run (by which I mean that if one of them prints the word "hello", they all print the word "hello"). That seems to tell me that the previous data is being overwritten.
Additionally, the for-loop in the main function is supposed to go through and print every value in the array, but instead it only prints the following (in this scenario I was testing with the string "hello my one true friend"):
arg(0): friend
arg(1): (null)
arg(2): (null)
What am I doing wrong here? I'm sure it's something dumb that I'm overlooking, but I just can't find it. Am I writing in the data incorrectly? Is it actually not being overwritten and just being printed incorrectly? At this point any advice at all would be greatly appreciated.
OK, well, my dev environment picked this up immediately:
int i; <<<<=====
int fakeI;
char token[255];
int subFromPos = 0;
int firstToken = 1;
*args = NULL;
while ((s[i] != '\n') && (s[i] != '\0') && (s[i] != '\r')) { <<<<<=
which gave
C4700 uninitialized local variable 'i' used
After that, all bets are off.
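Beyond the uninitialized i, two more things in the posted code look suspect: realloc takes a size in bytes (so the pointer array needs numTokens + 1 times sizeof(char *)), and every stored pointer refers to the same local token buffer, which would explain why all the saved tokens print the same text. A minimal sketch of one way to allocate the storage follows. The name makearg_sketch is made up for illustration, it splits on spaces only, it avoids the string library as the assignment requires, and its error handling is deliberately minimal.

```c
#include <stdlib.h>

/* Hypothetical rewrite of makearg: splits s on spaces, stores a heap copy of
 * each token in *args, and returns the token count (or -1 if an allocation
 * fails). The caller frees each token and then the array itself. */
int makearg_sketch(const char s[], char ***args)
{
    int numTokens = 0;
    size_t i = 0;                   /* initialized, unlike the original */

    *args = NULL;
    while (s[i] != '\0' && s[i] != '\n' && s[i] != '\r') {
        if (s[i] == ' ') {          /* skip separators, including runs of them */
            i++;
            continue;
        }

        /* Find the end of the current token. */
        size_t start = i;
        while (s[i] != '\0' && s[i] != '\n' && s[i] != '\r' && s[i] != ' ')
            i++;
        size_t len = i - start;

        /* Grow the pointer array; realloc takes a size in bytes. */
        char **grown = realloc(*args, (size_t)(numTokens + 1) * sizeof **args);
        if (grown == NULL)
            return -1;
        *args = grown;

        /* Give each token its own storage instead of a shared local buffer. */
        char *copy = malloc(len + 1);
        if (copy == NULL)
            return -1;
        for (size_t k = 0; k < len; k++)
            copy[k] = s[start + k];
        copy[len] = '\0';

        (*args)[numTokens++] = copy;
    }
    return numTokens;
}
```

Note the parentheses in (*args)[numTokens]: without them, *args[numTokens] indexes args itself before dereferencing, which is not what the original code intended.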

How can I get the next element of an array in C?

I am new to C, and I have to create a transliterator for my homework assignment at university. In Polish, the sound [tsh] as in "chair" is represented by two letters: "cz". I have to create a program that will turn every "cz" into 4 (e.g. zaskoczony = zasko4ony). I have a char array (defined at the beginning of the program) and I can get "c" and change it to anything I want, but I'm struggling with checking for "z", because I cannot get the next (+1) element in my string array.
I've tried putting i+1 into the array's brackets, tried using a variable, but nothing seems to work.
while(i<100){
intText[i] = someString[i];
if(intText[i] == 'c'){
int increasedI=i+1;
printf(" %d", increasedI);
if(intText[increasedI] == 'z'){
printf("4");
}
}else{
putchar(intText[i]);
}
i++;
}
How can I get the next element of an array in C?
someString[i + 1]
A problem with OP's code was that it neither advanced i by an extra 1 nor populated intText[increasedI] before using it.
if(intText[i] == 'c'){
int increasedI=i+1;
printf(" %d", increasedI);
// if(intText[increasedI] == 'z'){
if(someString[increasedI] == 'z'){
printf("4");
}
i++; //add
....
i++;
Also, string processing should stop when the null character is reached, not when i == 100.
// while(i<100){
while(someString[i]){
Keep separate indexes for reading and writing. Walk down the string, making the desired substitution.
As long as the substitution string (for example "4") is no longer than the source ("cz"), we can do an in-place substitution.
// in-place substitution
size_t in_index = 0;
size_t out_index = 0;
// Loop until end-of-string
while (someString[in_index] != '\0') {
// Test for special combination
if (someString[in_index] == 'c' && someString[in_index + 1] == 'z') {
in_index += 2;
someString[out_index++] = '4';
} else{
someString[out_index++] = someString[in_index++];
}
}
someString[out_index] = '\0';
puts(someString);
Thank you, everyone, for your helpful advice, which helped me find my own solution. This is the code that meets my expectations and looks quite simple.
int main()
{
char someString[100] = "cenczetkczacczka";
int i = 0;
while(someString[i] != '\0'){
if(someString[i] == 'c' && someString[i + 1] == 'z'){
printf("4");
i++;
}else{
putchar(someString[i]);
}
i++;
}
return 0;
}

Find palindromes in sentence

I am trying to write a piece of C code that takes a sentence and returns all the palindromes in that sentence, each in a new line. For example, the sentence "I like to race a civic racecar" would return:
civic
racecar
I've tried to use some debugging software (lldb, as I'm a mac user), but found it a bit confusing. The code below is what I have written. It's returning a segmentation fault, and I'm having trouble identifying it within my program.
int is_palin(char c[], int length)
{
int front = 0;
int back = length - 1; /* account for length starting at 0 */
if (length % 2 == 0){ /* check for even palindromes */
int middle = (length /2) -1 ;
while (front< middle + 1){
if (c[front] != c[back]){
return 0;}
front = front + 1;
back = back -1;
}
}
else { /* check for odd palindromes */
int middle = ((back - 2) / 2 ) + 1;
while (front != middle){
if (c[front] != c[back]){
return 0;}
front = front + 1;
back = back -1;}
}
return 1;
}
int is_delimiting_char(char ch)
{
if(ch == ' ') //White space
return 1;
else if(ch == ',') //Comma
return 1;
else if(ch == '.') //Period
return 1;
else if(ch == '!') //Exclamation
return 1;
else if(ch == '?') //Question mark
return 1;
else if(ch == '_') //Underscore
return 1;
else if(ch == '-') //Hyphen
return 1;
else if(ch == '(') //Opening parentheses
return 1;
else if(ch == ')') //Closing parentheses
return 1;
else if(ch == '\n') //Newline (the input ends with it)
return 1;
else
return 0;
}
/////////////////////////////////////////////
//---------------------------------------------------------------------------
// MAIN function
//---------------------------------------------------------------------------
int main (int argc, char** argv) {
char input_sentence[100];
int i=0;
char current_char;
int delimiting_char;
char word[20];
int word_length;
int have_palindrome = 0;
/////////////////////////////////////////////
/////////////////////////////////////////////
/* Infinite loop
* Asks for input sentence and prints the palindromes in it
* Terminated by user (e.g. CTRL+C)
*/
while(1) {
i=0;
print_char('\n');
print_string("input: ");
/* Read the input sentence.
* It is just a sequence of character terminated by a new line (\n) character.
*/
do {
current_char=read_char();
input_sentence[i]=current_char;
i++;
} while (current_char != '\n');
/////////////////////////////////////////////
print_string("output:\n");
int char_index = 0;
for(int k=0; k<i; k++) {
palin = 1;
current_char = input_sentence[k];
delimiting_char = is_delimiting_char(current_char);
if(delimiting_char) {
if (char_index > 0) { //Avoids printing a blank line in case of consecutive delimiting characters.
word[char_index++] = '\n'; //Puts an newline character so the next word in printed in a new line.
word_length = word_length + 1;
if (is_palin(word, word_length) && word_length > 1){
have_palindrome = 1;
for(int j=0; j<char_index; j++) {
print_char(word[j]);
}
word_length = 0;
char_index = 0;
}
} }
else {
word[char_index++] = current_char;
word_length = word_length + 1;
}
}
if (have_palindrome == 0){
print_string("Sorry! No palindromes found!"); }
}
return 0;
}
Also wondering if anyone has good videos or sites for learning how to use lldb, when one has never used anything of the sort before. Thanks!
There are several things wrong here:
word_length is uninitialised at first use, so statements like word_length = word_length + 1 lead to undefined behaviour. In fact, you have two different variables, char_index and word_length, that should always have the same value. Instead of going through the hassle to keep them in sync, use just one variable.
You reset both char_index and word_length to zero only if a palindrome was found. You should reset them after every word, of course.
The line palin = 1; is probably a leftover from older code. You should also reset have_palindrome after each line. In general, you should take more care when defining variables.
By adding a newline to your word you make printing a bit easier, but you will never find a palindrome, because the newline at the end is taken into account when checking for the palindrome.
Your code for reading with read_char, which is probably an alias to getchar, needs to check for the end of input.
You don't need to distinguish between even and odd sized palindromes. Just make the condition that front < back and be done with it. The middle character of an odd sized palindrome doesn't matter. (That's not an error, your code is just needlessly complicated.)
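The last point can be sketched as follows. This keeps the name is_palin from the question but is only an illustration of the front < back idea, and it assumes that length counts just the letters of the word (no trailing newline).

```c
/* Returns 1 if the first `length` characters of c read the same in both
 * directions, 0 otherwise. Words of length 0 or 1 count as palindromes here,
 * so the caller still has to reject them if they are not wanted. */
int is_palin(const char c[], int length)
{
    int front = 0;
    int back = length - 1;

    /* Walk inward from both ends; the middle character of an odd-length
     * word is never compared with anything. */
    while (front < back) {
        if (c[front] != c[back])
            return 0;
        front++;
        back--;
    }
    return 1;
}
```

For the example sentence, is_palin("civic", 5) and is_palin("racecar", 7) return 1, while is_palin("race", 4) returns 0.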

Can't assign value to a variable inside a for loop

Here is what I want to do:
Read all characters from a '.c' file and store that into an array.
When a character from that array is '{', it will be pushed into a stack. And count of pushed characters will be increased by 1.
When a character from that array is '}', stack will pop and the count of popped characters will be increased by 1.
Compare those two counts to check whether there is a missing '{' or '}'
Here is my code:
int getLinesSyntax(char s[], int limit, FILE *cfile)
{
int i, c, push_count = 0, pop_count = 0;
int state = CODE;
int brackets[limit];
char braces[limit];
for(i = 0; i < 100; i++)
{
braces[i] = 0;
}
for(i = 0; i < limit - 1 && (c = getc(cfile)) != EOF && c != '\n'; i++)
{
s[i] = c;
if(s[i] == '{')
{
braces[0] = s[i];
//push(s[i], braces);
++push_count;
}
else if(s[i] == '}')
{
pop(braces);
++pop_count;
}
}
//Mor shiljih uyed array -n togsgold 0-g zalgana
if(c == '\n')
{
s[i] = c;
i++;
}
s[i] = '\0';
i = i -1; //Suuld zalgasan 0 -g toonoos hasna
if(c == EOF)
{
//just checking
for(i = 0; i < 100; i++)
{
printf("%d", braces[i]);
}
if(push_count != pop_count)
{
printf("%d and %d syntax error: braces", push_count, pop_count);
}
return -1;
}
else
{
return i;
}
}
Here is the output
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
The problem is:
Assignments inside the for loop are not working. (They work when I put them outside of the loop.)
I would like to know if there's something wrong with my code :).
There are several problems.
Let's go through it step by step.
1) Your array initialization loop:
int brackets[limit];
char braces[limit];
for(i = 0; i < 100; i++)
{
braces[i] = 0;
}
You declare the array having size of limit but only initialize 100 items. Change 100 to limit to fully initialize it depending on the parameter of the function.
2) The conditional statement of the main for loop:
i < limit - 1 && (c = getc(cfile)) != EOF && c != '\n'
Although the first subexpression is correct, I have two remarks:
Firstly, (c = getc(cfile)) != EOF might be one reason why the loop body is never entered and everything is still 000000.... Check that the file exists, that the pointer is not NULL, and that no other silent errors occurred.
Secondly, the c != '\n'. What if a newline occurs? In that case you won't continue with the next iteration but break out of the entire for loop. Remove it from the condition and put it in the first line of the body like this:
if(c == '\n')
{
i -= 1; // to really skip the character and maintain the index.
continue;
}
3) s[i] = c;
Can you be certain that the array s is indeed of size limit?
4) Checking for curly braces
if(s[i] == '{')
{
braces[0] = s[i];
//push(s[i], braces);
++push_count;
}
else if(s[i] == '}')
{
pop(braces);
++pop_count;
}
You assign to braces[0] always, why?
5) Uninitialized access
if(c == '\n')
{
s[i] = c;
i++;
}
s[i] = '\0';
i = i -1; //Suuld zalgasan 0 -g toonoos hasna
You're now using the function-scope variable i, which is never reinitialized for this block. Reusing one variable everywhere is not a problem from a memory point of view, but here you rely on the value left over from the loop. Is this done on purpose? If not, reinitialize i properly. I have to ask since I can't read the comments in your code.
What I'm quite unhappy about is that you rely entirely on one variable in all the loops and statements. Usually a loop index should not be altered from inside the loop body. Maybe you can come up with a cleaner design for the function, such as an additional index variable that you increment in parallel without altering i. The additional index would be used for array access where appropriate, whereas i remains just a counter.
I think the problem is in this condition "c != '\n'" which is breaking the for loop right after the first line, before it reaches any brackets. And hence the output.
For the task of counting whether there are balanced braces in the data, the code is excessively complex. You could simply use:
int l_brace = 0;
int r_brace = 0;
int c;
while ((c = getchar()) != EOF)
{
if (c == '{')
l_brace++;
else if (c == '}')
r_brace++;
}
if (l_brace != r_brace)
printf("Number of { = %d; number of } = %d\n", l_brace, r_brace);
Of course, this can be confused by code such as:
/* This is a comment with an { in it */
char string[] = "{{{";
char c = '{';
There are no braces that mark control-of-flow statement grouping in that fragment, for all there are 5 left braces ({) in the source code. Parsing C properly is hard work.
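To give a feel for what handling those cases involves, here is a rough sketch of a scanner that ignores braces inside comments, string literals, and character literals. The function name count_code_braces and the choice to scan a NUL-terminated string rather than read with getchar are assumptions made for illustration. It still ignores things a real C lexer must handle, such as backslash-newline line continuations, trigraphs, and preprocessor directives.

```c
#include <stddef.h>

enum scan_state { CODE, BLOCK_COMMENT, LINE_COMMENT, STRING_LIT, CHAR_LIT };

/* Counts '{' and '}' that appear in code, skipping comments, string
 * literals, and character literals. src must be NUL-terminated. */
void count_code_braces(const char *src, int *left, int *right)
{
    enum scan_state state = CODE;

    *left = 0;
    *right = 0;
    for (size_t i = 0; src[i] != '\0'; i++) {
        char c = src[i];
        char next = src[i + 1];     /* at worst this is the terminating NUL */

        switch (state) {
        case CODE:
            if (c == '/' && next == '*') {
                state = BLOCK_COMMENT;
                i++;                /* consume the '*' as well */
            } else if (c == '/' && next == '/') {
                state = LINE_COMMENT;
                i++;
            } else if (c == '"') {
                state = STRING_LIT;
            } else if (c == '\'') {
                state = CHAR_LIT;
            } else if (c == '{') {
                (*left)++;
            } else if (c == '}') {
                (*right)++;
            }
            break;
        case BLOCK_COMMENT:
            if (c == '*' && next == '/') {
                state = CODE;
                i++;                /* consume the '/' as well */
            }
            break;
        case LINE_COMMENT:
            if (c == '\n')
                state = CODE;
            break;
        case STRING_LIT:
            if (c == '\\' && next != '\0')
                i++;                /* skip the escaped character */
            else if (c == '"')
                state = CODE;
            break;
        case CHAR_LIT:
            if (c == '\\' && next != '\0')
                i++;                /* skip the escaped character */
            else if (c == '\'')
                state = CODE;
            break;
        }
    }
}
```

Even this small state machine is several times longer than the simple counter, which illustrates the remark that parsing C properly is hard work.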

Quick question regarding this issue: why doesn't it print out the second (converted) value from the string?

Quick question: what have I done wrong here? The purpose of this code is to read the input into a string, the input being "12 34" with a space between the "12" and the "34", and to convert and print the two separate numbers using an integer variable named number. Why doesn't the second call to the function copyTemp produce the value 34? I have an index_counter variable which keeps track of the string index, and it's meant to skip the space character. What have I done wrong?
Thanks.
#include <stdio.h>
#include <string.h>
int index_counter = 0;
int number;
void copyTemp(char *expr,char *temp);
int main(){
char exprstn[80]; //as global?
char tempstr[80];
gets(exprstn);
copyTemp(exprstn,tempstr);
printf("Expression: %s\n",exprstn);
printf("Temporary: %s\n",tempstr);
printf("number is: %d\n",number);
copyTemp(exprstn,tempstr); //second call produces same output shouldnt it now produce 34 in the variable number?
printf("Expression: %s\n",exprstn);
printf("Temporary: %s\n",tempstr);
printf("number is: %d\n",number);
return 0;
}
void copyTemp(char *expr,char *temp){
int i;
for(i = index_counter; expr[i] != '\0'; i++){
if (expr[i] == '0'){
temp[i] = expr[i];
}
if (expr[i] == '1'){
temp[i] = expr[i];
}
if (expr[i] == '2'){
temp[i] = expr[i];
}
if (expr[i] == '3'){
temp[i] = expr[i];
}
if (expr[i] == '4'){
temp[i] = expr[i];
}
if (expr[i] == '5'){
temp[i] = expr[i];
}
if (expr[i] == '6'){
temp[i] = expr[i];
}
if (expr[i] == '7'){
temp[i] = expr[i];
}
if (expr[i] == '8'){
temp[i] = expr[i];
}
if (expr[i] == '9'){
temp[i] = expr[i];
}
if (expr[i] == ' '){
temp[i] = '\0';
sscanf(temp,"%d",&number);
index_counter = i+1; //skips?
}
}
// is this included here? temp[i] = '\0';
}
There are a few problems in your program:
You are using the same index into the expr and temp arrays. This works the first time since both will be 0 to start with, but when you want to process the 2nd number, you need to reset the index into the temp array back to 0. Clearly this cannot be done using a single index. You'll have to use two indices, i and j.
By the time you complete the processing of the 2nd number (34 in "12 34") you'll reach the end of the string, and hence the sscanf never gets run on the second occasion (in general, on the last occasion). So after the for loop you need another sscanf to extract the last number. Also, you should return from the function once you've extracted the number from the string and incremented i.
You should avoid using gets() and use fgets() instead, for security reasons.
You can combine the multiple tests for the digits into a single test, as shown:
Something like this.
void copyTemp(char *expr,char *temp){
int i;
int j = 0;
for(i = index_counter; expr[i] != '\0'; i++){
if (expr[i] >= '0' && expr[i]<='9'){
temp[j++] = expr[i]; // copy the digit into temp..increment j.
}
else if (expr[i] == ' '){ // space found..time to extract number.
temp[j] = '\0'; // terminate the temp.
sscanf(temp,"%d",&number); // extract.
index_counter = i+1; // skip the space.
return; // done converting...return..must not continue.
}
}
// have reached the end of the input string..and still need to extract a
// the last number from temp string.
temp[j] = '\0';
sscanf(temp,"%d",&number);
}
After these changes it works as expected:
$ gcc b.c 2> /dev/null && ./a.out
12 34
Expression: 12 34
Temporary: 12
number is: 12
Expression: 12 34
Temporary: 34
number is: 34
Your approach is very fragile...if a user gives multiple spaces between the input numbers..your program will fail.
The main problem is that copyTemp writes to temp[i], but each call to copyTemp initializes i to index_counter, not to 0. This means that each call to copyTemp appends to the existing temp buffer instead of overwriting the old contents, and sscanf thus always re-reads the same string. You need to use separate indices to keep track of where to read from the input buffer and where to write to the output buffer.
Additional problems:
* Never use gets. Ever. Use fgets instead.
* You duplicate a lot of code in copyTemp. You instead could do:
if (expr[i] == '0' || expr[i] == '1' || ...)
or better:
if (isdigit(expr[i]))
copyTemp should take some precautions to not overflow its destination buffer. (Note that copyTemp shouldn't even need to take a destination buffer as an argument.)
You should avoid using global variables. It'd be better for copyTemp to take an argument specifying where to start reading from the input string and if it returned the index where it left off.
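A minimal sketch of that last suggestion is shown below. The name parse_next_number and the -1 return value for "no number found" are assumptions made for illustration. It converts the digits directly, so no temporary buffer is needed, and it does not check for integer overflow.

```c
#include <ctype.h>

/* Hypothetical replacement for copyTemp: parses one non-negative integer
 * from expr starting at index start, stores it in *number, and returns the
 * index just past the number. Returns -1 (leaving *number untouched) if no
 * digits are found. No overflow checking is done. */
int parse_next_number(const char *expr, int start, int *number)
{
    int i = start;

    /* Skip any run of whitespace, so "12   34" also works. */
    while (isspace((unsigned char)expr[i]))
        i++;

    if (!isdigit((unsigned char)expr[i]))
        return -1;

    int value = 0;
    while (isdigit((unsigned char)expr[i])) {
        value = value * 10 + (expr[i] - '0');
        i++;
    }
    *number = value;
    return i;
}
```

The caller keeps the position itself: call it once with start 0, then pass the returned index back in for the next number, and stop when it returns -1. That replaces both global variables from the original program.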
