Can't fix heap-buffer-overflow error on my C code - c

I need help fixing an -fsanitize=address error in this code.
If I compile my .c program with the flags "-fsanitize=address -g", I get the following error:
==93042==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x000107903a7c at pc 0x0001052aa780 bp 0x00016b2af490 sp 0x00016b2aec48
READ of size 1 at 0x000107903a7c thread T0
#0 0x1052aa77c in wrap_strchr+0x18c (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x1677c)
#1 0x104b50b70 in processData front.c:50
#2 0x104b509d0 in main front.c:27
#3 0x104eb5088 in start+0x204 (dyld:arm64e+0x5088)
The function I'm having problems with is called "processData". It receives a "char * data" that contains an entire CSV file (which has been copied into it as a string) and splits the CSV file into lines. Each line is then passed to the "loadData" callback.
The "processData()" function starts by declaring two pointers: "string", which points to the character string passed as an argument, and "line", which is set to the next '\n' character (new line) in the "data" string.
Then the function enters a loop that runs while there are lines left in the "data" string. Inside the loop, the function calculates the size of the current line by subtracting the value of "string" from the value of "line". Then it creates an "aux" variable to store the current line and copies the line into "aux" using "strncpy()".
Next, the function adds a null character at the end of "aux" to mark the end of the string. Then it sends the line to the "loadData" callback (of type "loadLine") passed as an argument for processing. Finally, it updates the "string" pointer to point to the beginning of the next line in the "data" string.
Once all lines in the "data" string have been processed, the "processData()" function ends and returns control to the caller.
This is what the processData function looks like (I have highlighted line 50):
void processData(sensorADT s, char * data, loadLine loadData) {
    // Pointer to the "data" string
    char * string = data;
    // Pointer to the '\n' character (new line) in the "data" string
    char * line;
    // Loop that runs while there are lines left in the "data" string
    // THE FOLLOWING LINE IS LINE 50:
    while (string != NULL && (line = strchr(string,'\n')) != NULL) {
        // Calculates the size of the current line
        int len = line - string;
        // Creates an "aux" variable to store the current line
        char aux[len + 1];
        // Copies the current line into the "aux" variable
        strncpy(aux, string, len);
        // Adds a null character at the end of the line to indicate the end of the string
        aux[len] = 0;
        // Sends the line to the "loadData" callback for processing
        loadData(s, aux);
        // Updates the "string" pointer to point to the beginning of the next line
        string = line + 1;
    }
}
If I try compiling and running the code without the sanitizer on, it works as intended.
Thanks!
I tried compiling my program with the sanitize flag on, and I get that error. If I compile it without the sanitizer flag, it runs flawlessly and gives me the expected results.

According to your error message, the faulting read happens in wrap_strchr() (AddressSanitizer's interceptor for strchr()), which is reading past the end of the memory allocated for the string.
Since strchr() should stop at the final '\0' of the string and return NULL, my guess is that your data is not null-terminated at all.
By the way, that also means that strlen() will trigger the same error.
There is no easy way to fix this inside the function. Either add a size_t len parameter, or ensure in the caller that the string is null-terminated.
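As a rough illustration of the first option, here is a minimal sketch of a line splitter that takes an explicit length and uses memchr() instead of strchr(), so it never reads past the buffer even when no '\0' is present. The names for_each_line, line_fn and count_nonempty are hypothetical and not part of the original code. Unlike the original loop, this sketch also delivers a final line that has no trailing '\n'.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback type: receives one NUL-terminated line. */
typedef void (*line_fn)(void *ctx, const char *line);

/* Splits data[0..size) into lines without ever reading past data + size.
 * Returns the number of lines delivered, or -1 if an allocation fails. */
int for_each_line(const char *data, size_t size, line_fn fn, void *ctx) {
    if (data == NULL)
        return 0;

    const char *cur = data;
    const char *end = data + size;
    int count = 0;

    while (cur < end) {
        /* memchr is bounded by an explicit length, unlike strchr */
        const char *nl = memchr(cur, '\n', (size_t)(end - cur));
        size_t len = nl ? (size_t)(nl - cur) : (size_t)(end - cur);

        char *line = malloc(len + 1);
        if (line == NULL)
            return -1;
        memcpy(line, cur, len);
        line[len] = '\0';
        fn(ctx, line);
        free(line);
        count++;

        if (nl == NULL)
            break;          /* last line had no trailing newline */
        cur = nl + 1;
    }
    return count;
}

/* Example callback: counts the non-empty lines; ctx points to an int counter. */
void count_nonempty(void *ctx, const char *line) {
    if (line[0] != '\0')
        ++*(int *)ctx;
}
```

The other option is usually simpler when the caller reads the file itself: allocate size + 1 bytes and store '\0' at index size before calling processData().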

Related

Removing the Rest of a String

I just started experimenting with C, coming from a Java background, and I am having an issue trying to remove a section of a string. The basic logic is this: I have a string (which I found out is an array of chars in C, very cool!), and once a certain condition is met while going through this string, I want to delete the rest of it. For example, if my string was "hello world!" and I set the condition as a blank space, I would like to delete everything following that blank space and just return "hello ". I had an idea of noting the index where the condition was met, creating a second array and filling it, then deleting the previous one; however, I'm certain there is a better way of doing this. If anyone can help, that would be greatly appreciated. Thank you all in advance!
edit:
The idea is that I want to take in user input. The specific case I have is a single dot "." that has a newline as both the previous and the next element, or a newline as the previous element and the null terminator as the next one. So basically:
if (string[n] == '.')
{
    if ((string[n-1] == '\n' && string[n+1] == '\n') || (string[n-1] == '\n' && string[n+1] == '\0'))
    {
        /* then remove everything past this point */
    }
}
input:
hello world
this is ok.
.
Everything here will be deleted.
Output:
hello world
this is ok.
.
Edit 2:
Thank you all for some great advice so far. I am still running into issues with the program, however, so here I will post the code for the main method so far (just testing the delete-rest-of-string part; I have not added user input yet).
//main method
int main(void)
{
    char test = "This is a sample text.\
The file will be terminated by a single dot: .\
The program continues processing the lines because the dot (.)\
did not appear at the beginning.\
. even though this line starts with a dot, it is not a single dot.\
The program stops processing lines right here.\
.\
You wont be able to feed any more lines to the program.";
    int n =0;
    while(test[n] != NULL)
    {
        if (test[n]=='.')
        {
            if ((test[n-1]=='\n' && test[n+1]=='\n') || (test[n-1]=='\n' && test[n+1]==NULL))
            {
                test[n] = '\0';
            }
        }
        n++;
    }
    printf("%c\n",test);
    return 0;
}
The idea here is that I will eventually send the string word by word to an insertion sort linked list function and sort the string alphabetically after removing everything after the dot as specified. The problem now is I am encountering errors for some reason, If anybody could help sort them out, I would greatly appreciate the help.
Errors in main:
345500375/source.c: In function ‘main’:
345500375/source.c:64:17: warning: initialization makes integer from pointer without a cast [-Wint-conversion]
char test = "This is a sample text.\
^~~~~~~~~~~~~~~~~~~~~~~~
The file will be terminated by a single dot: .\
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The program continues processing the lines because the dot (.)\
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
did not appear at the beginning.\
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
. even though this line starts with a dot, it is not a single dot.\
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The program stops processing lines right here.\
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.\
~~
You wont be able to feed any more lines to the program.";
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
345500375/source.c:73:15: error: subscripted value is neither array nor pointer nor vector
while(test[n] != NULL)
^
345500375/source.c:75:17: error: subscripted value is neither array nor pointer nor vector
if (test[n]=='.')
^
345500375/source.c:77:21: error: subscripted value is neither array nor pointer nor vector
if ((test[n-1]=='\n' && test[n+1]=='\n') || (test[n-1]=='\n' && test[n+1]==NULL))
^
345500375/source.c:77:40: error: subscripted value is neither array nor pointer nor vector
if ((test[n-1]=='\n' && test[n+1]=='\n') || (test[n-1]=='\n' && test[n+1]==NULL))
^
345500375/source.c:77:61: error: subscripted value is neither array nor pointer nor vector
if ((test[n-1]=='\n' && test[n+1]=='\n') || (test[n-1]=='\n' && test[n+1]==NULL))
^
345500375/source.c:77:80: error: subscripted value is neither array nor pointer nor vector
if ((test[n-1]=='\n' && test[n+1]=='\n') || (test[n-1]=='\n' && test[n+1]==NULL))
^
345500375/source.c:79:19: error: subscripted value is neither array nor pointer nor vector
test[n] = '\0';
^
This would do the trick (assuming all indices are in bounds):
if (string[n]=='.') {
    if (string[n-1]=='\n' && string[n+1]=='\n') {
        string[n+1] = '\0';
    }
}
I presume that you are using char* as your "string." In truth, this is not comparable to Java's strings, or to strings in most languages in general.
A number of developers and libraries (and languages, except maybe Pascal) use something closer to a struct in order to store a string, e.g.:
struct string {
    char * pointer;
    unsigned short length;
};
This offers a few advantages, namely O(1) length lookups.
In your case, it would allow you to quickly create substrings or slices of your old string without modifying memory at all.
If we were to use the very rudimentary struct that I provided as an example:
// note: you need the keyword struct before every usage of a struct type; structs are often type-aliased (typedef) solely because of that
struct string userInput = {
    .pointer = "words in str.ing",
    .length = 16
};
// ...
unsigned short whereDotIs = 13;
struct string result = {
    .pointer = userInput.pointer,
    .length = whereDotIs
};
At this point you could still read past the dot (.), but what follows is not a part of the abstract "string" that we have now established; as far as result is concerned, it is just unrelated memory.
There are, however, cases in which you need to work with null-terminated character pointers. There, @Mustafa Quraish's answer will suffice, unless the string is const-qualified, in which case you would have to stick to your original solution: copying the array into a new one.
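To make the slice idea concrete, here is a small sketch that builds a slice and formats it without needing a '\0' at the end of the slice. It reuses the struct string layout from above, with pointer made const so that it can safely refer to a string literal. The helper names slice_before and slice_format are hypothetical.

```c
#include <stdio.h>
#include <string.h>

struct string {
    const char *pointer;    /* const here, so literals can be referenced safely */
    unsigned short length;
};

/* Returns the prefix of s that ends just before the first c,
 * or s itself if c does not occur in it. No memory is modified. */
struct string slice_before(struct string s, char c) {
    const char *hit = memchr(s.pointer, c, s.length);
    if (hit != NULL)
        s.length = (unsigned short)(hit - s.pointer);
    return s;
}

/* Formats the slice into out (capacity cap) and returns the slice length.
 * "%.*s" prints at most s.length characters starting at s.pointer. */
int slice_format(char *out, size_t cap, struct string s) {
    return snprintf(out, cap, "%.*s", (int)s.length, s.pointer);
}
```

The same "%.*s" conversion works with printf(), which is how a slice can be printed directly even though the underlying memory continues past its end.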
First of all, your compilation errors stem from the fact that you have declared a char variable, not a char array. You can declare a char array with char variable[] and initialize it with a string literal (in this case you get an array of n + 1 elements, where n is the string length in characters and the extra element holds the final \0). Or you can specify a length between the brackets and initialize it as well (the unused part of the array is filled with \0 chars, as in char variable[30] = "hello"; /* the five chars of "hello" plus 25 '\0' chars */).
In Java, Strings are immutable. You can extract a substring from them, but the result is a different instance of the class String. In C, a string is simply an array of chars. For C functions dealing with strings, a string extends until the function encounters a '\0' character, and all processing of the array (which keeps its original length) stops when the '\0' is found. So the best way to cut a string at some point is to put a '\0' character there.
BTW, don't use a trailing \ to continue a string literal on the next line. The compiler removes the backslash and the newline from the source, so the continuation line is joined to the string literal as if you had written it directly after the end of the previous line, which is, IMHO, not what you want. That style has been superseded by a newer syntax (which is older than some of the readers here): you can continue a string on the next line by terminating it (with ") and starting again on the next line (again with "), as below (so this code is equivalent to what you have written):
char test[] = /* now test is a char array, you need the pair of [] brackets */
"This is a sample text."
"The file will be terminated by a single dot: ."
"The program continues processing the lines because the dot (.)"
"did not appear at the beginning."
". even though this line starts with a dot, it is not a single dot."
"The program stops processing lines right here."
"."
"You wont be able to feed any more lines to the program.";
and also equivalent to this one:
char test[] = /* now test is a char array, you need the pair of [] brackets */
"This is a sample text.The file will be terminated by a single dot: .The program continues processing the lines because the dot (.)did not appear at the beginning.. even though this line starts with a dot, it is not a single dot.The program stops processing lines right here..You wont be able to feed any more lines to the program.";
But if you want the newlines to be included in the string literal, then you have to include explicit \n characters in it, as below:
char test[] = /* now test is a char array, you need the pair of [] brackets */
"This is a sample text.\n"
"The file will be terminated by a single dot: .\n"
"The program continues processing the lines because the dot (.)\n"
"did not appear at the beginning.\n"
". even though this line starts with a dot, it is not a single dot.\n"
"The program stops processing lines right here.\n"
".\n"
"You wont be able to feed any more lines to the program.\n";
If you want to end the string at the single dot that is preceded and followed by a \n, then you can use the strstr() function to find the sequence you are looking for and put a '\0' in the appropriate place.
char *p = strstr(test, "\n.\n");
/* p (if found, e.g. not NULL) will point to the first \n, so we must use the
* address of the next char */
/* we can do the following as we know that the string extends past the
* position in which the dot is, because we have found (in the string) the
* sequence, that extends past the place we are going to put it. */
if (p) /* this is the same as if (p != NULL) */
p[1] = '\0'; /* put a \0 in the position of the dot */
printf("The cut text is: %s", test);
Your code has yet another error, and this one is serious: it can lead to runtime problems that the compiler will not detect and that may stay hidden until the code has been in use for a long time, because C doesn't check array bounds the way Java does. If you happen to find a dot character in the first position of the array, n will be 0, and when you evaluate test[n-1] in your if statement you'll be accessing the element before the first one in the array (that is, test[-1]). This would throw an ArrayIndexOutOfBoundsException in Java, but C has no such protection. The problem does not occur at the end of the string (even though you also access the char after the dot), because even if the dot is the last character of the string, the following char (and there must be one) is the terminating \0, so no problem arises there (contrary to what some other answers suggest).
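A minimal sketch of a loop that avoids the test[-1] access is shown below. It follows the asker's expected output, where the lone dot itself is kept and everything after it is dropped, which differs slightly from the strstr() version above (that one overwrites the dot). The function name cut_after_lone_dot is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Truncates s right after the first line that consists of a single '.'.
 * Returns true if such a line was found.
 * s must be a writable, NUL-terminated array (not a string literal). */
bool cut_after_lone_dot(char *s) {
    for (size_t n = 0; s[n] != '\0'; n++) {
        if (s[n] != '.')
            continue;
        /* n == 0 is tested first, so s[n - 1] is never read out of bounds */
        bool at_line_start = (n == 0) || (s[n - 1] == '\n');
        /* s[n + 1] is always readable: at worst it is the terminating '\0' */
        bool at_line_end = (s[n + 1] == '\n') || (s[n + 1] == '\0');
        if (at_line_start && at_line_end) {
            s[n + 1] = '\0';
            return true;
        }
    }
    return false;
}
```

A dot on the very first line counts as a lone dot here; whether that is desired is an assumption about the input format.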

Function with string as return type in C Error: cannot access at memory ''

I want to program a function that gives me a string as return type. With this code I always get the Error: 'word' cannot access the memory at '0x63736e41'.
Main:
FILE *datei;
datei = fopen("C:\\Users\\user\\Documents\\dictionary.txt", "r");
char *word = getRandomWord(datei);
Function:
char* getRandomWord(FILE *datei)
{
    int length = fsize("C:\\Users\\user\\Documents\\dictionary.txt"); //Get length of file
    srand(time(NULL));
    int randIndex = rand() % length+1; //Random number between 0 and the length of the file
    char *cTest = malloc(20);
    fseek(datei, randIndex, SEEK_SET); //I want to read only one random line from the file
    fscanf(datei, "%s\0", &cTest);
    fscanf(datei, "%s\0", &cTest);
    return cTest;
}
You have multiple problems with the code you show.
The most serious and the probable cause of your crash is that you pass a pointer to the pointer when you use the address-of operator & in &cTest.
Remember that the scanf format %s expects an argument of type char *. When you do &cTest you get a value of type char **, which is very wrong and will lead to undefined behavior (and likely crashes).
The simple solution is to just pass cTest as it is, since it's already the correct type:
fscanf(datei, "%19s", cTest);
Note that in the call to fscanf shown above I specified a field width of 19, so that fscanf will not read too much and write out of bounds of your allocated memory (19 characters plus the terminating '\0' fill the 20 bytes exactly).
I also do not have the string-terminator \0 in the format string, it isn't needed and the fscanf function will just see it as the end of the format string.
And are you sure you want to call fscanf twice, with the same cTest destination? The second call will overwrite the contents you read with the first call (making you lose the string read by the first call).
You seek to a random position in the file. There's no guarantee that it will be the start of a word. There's not even a guarantee that it will be inside the file: your calculation could return a position one beyond the last byte of the file.
And lastly, talking about the randomness: On any modern PC-like system the time(NULL) call will return an integer with the number of seconds since an epoch. That means if you call your function twice (or more) within a single second, then each call will set the same seed, making rand() return the exact same value. So each of those calls to your function will cause it to read the exact same data. Call srand only once at the beginning of your program.
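Putting these points together, one possible shape for the function is sketched below. It is only an illustration under several assumptions: the caller passes the file size in bytes (the original used an fsize helper that is not shown), the file was opened in binary mode so that seeking to an arbitrary offset is meaningful, srand() has already been called once at program start, and the dictionary holds one word per line. The name get_random_word is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads one word that follows a random offset in fp.
 * Returns a malloc'd string the caller must free, or NULL on failure. */
char *get_random_word(FILE *fp, long file_size) {
    if (fp == NULL || file_size <= 0)
        return NULL;

    char *word = malloc(20);
    if (word == NULL)
        return NULL;

    long offset = rand() % file_size;       /* 0 .. file_size - 1 */
    if (fseek(fp, offset, SEEK_SET) != 0) {
        free(word);
        return NULL;
    }

    /* The offset probably lands mid-word: skip the rest of that line. */
    int ch;
    while ((ch = fgetc(fp)) != EOF && ch != '\n')
        ;

    /* Read the next word; if we ran off the end, wrap to the first word. */
    if (fscanf(fp, "%19s", word) != 1) {
        rewind(fp);
        if (fscanf(fp, "%19s", word) != 1) {
            free(word);
            return NULL;
        }
    }
    return word;
}
```

Note that this selection is not uniform: a word that follows a long line is more likely to be picked than one that follows a short line.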

Strncpy (String C programming) doesn't perform well the second time

Why does the second strncpy give incorrect weird symbols when printing?
Do I need something like fflush(stdin) ?
Note that I used scanf("%s",aString); to read an entire string, the input that is entered starts first off with a space so that it works correctly.
void stringMagic(char str[], int index1, int index2)
{
    char part1[40],part2[40];
    strncpy(part1,&str[0],index1);
    part1[sizeof(part1)-1] = '\0';//strncpy doesn't always put '\0' where it should
    printf("\n%s\n",part1);
    strncpy(part2,&str[index1+1],(index2-index1));
    part2[sizeof(part2)-1] = '\0';
    printf("\n%s\n",part2);
}
EDIT
The problem seems to lie in
scanf("%s",aString);
because when I use printf("\n%s",aString); and I have entered something like "Enough with the hello world" I only get as output "Enough" because of the '\0'.
How can I correctly input the entire sentence with whitespace stored? Reading characters?
Now I use: fgets (aString, 100, stdin);
(Reading string from input with space character?)
In order to print a char sequence correctly using %s, it needs to be null-terminated, and the terminator has to come immediately after the last character to be printed. However, this statement in your code: part2[sizeof(part2)-1] = '\0'; always sets the last cell of part2 (index 39) to the 0 character. Note that sizeof(part2) always yields the size of the memory allocated for part2, which in this case is 40. sizeof is an operator evaluated on the array type, so its value does not depend on the contents of part2.
What you should do instead is set the character at index (index2-index1) of part2 to 0. You know you've copied that many characters to part2, so you know the expected length of the resulting string. The same applies to part1, whose terminator belongs at index index1.
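A small helper along these lines could look like the following sketch. The name copy_range is hypothetical; it clamps the requested range to both the source length and the destination capacity, then places the terminator right after the copied characters.

```c
#include <stddef.h>
#include <string.h>

/* Copies up to count characters from src + start into dst (capacity cap)
 * and always terminates dst right after the last copied character.
 * Returns the number of characters copied. */
size_t copy_range(char *dst, size_t cap, const char *src, size_t start, size_t count) {
    if (cap == 0)
        return 0;
    size_t src_len = strlen(src);
    if (start > src_len)
        start = src_len;                /* nothing left to copy */
    if (count > src_len - start)
        count = src_len - start;        /* do not read past the source's '\0' */
    if (count > cap - 1)
        count = cap - 1;                /* leave room for the terminator */
    memcpy(dst, src + start, count);
    dst[count] = '\0';                  /* terminator follows the data, not the buffer end */
    return count;
}
```

With this, the two halves in stringMagic could be produced by copy_range(part1, sizeof part1, str, 0, index1) and copy_range(part2, sizeof part2, str, index1 + 1, index2 - index1).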

Getting strange characters from strncpy() function

I am supposed to load a list of names from a file, and then find those names in a second file and load them into a structure with some other data (for simplicity, I will load them into another array called "test").
The first part works fine: I open a file and load all the names into a two-dimensional array called namesArr.
The second part is where unexpected characters occur, and I can't understand why. Here is the code of the function:
void loadStructure(void){
    char line[MAX_PL_LENGTH], *found;
    int i, j=0;
    char test[20][20];
    FILE *plotPtr=fopen(PLOT_FILE_PATH, "r");
    if (plotPtr==NULL){
        perror("Error 05:\nError opening a file in loadStructure function. Check the file path");
        exit(-5);
    }
    // Load each line from the file into the array "line" until the end of the file is reached.
    while(fgets(line, MAX_PL_LENGTH, plotPtr)!=NULL){
        // Loop through the "namesArr" array, which contains the list of 20 character names.
        for(i=0; i<numOfNames; i++){
            // Use strstr() to find out whether any of those names appears in this line.
            if((found=strstr(line, namesArr[i]))!=NULL){
                printf("** %s", found); // Used for debugging.
                // Copy the newly found name to test[j] (only the name, by passing its length, which is calculated by strlen).
                strncpy(test[j], found, strlen(namesArr[i]));
                j++;
            }
        }
    }
    fclose(plotPtr);
    printf("%s\n", test[0]);
    printf("%s\n", test[1]);
    printf("%s\n", test[2]);
}
This is the output I get:
...20 names were loaded from the "../Les-Mis-Names-20.txt".
** Leblanc, casting
** Fabantou seems to me to be better," went on M. Leblanc, casting
** Jondrette woman, as she stood
Leblanct╕&q
Fabantou
Jondretteⁿ  └
Process returned 0 (0x0) execution time : 0.005 s
Press any key to continue.
The question is, why am I getting characters like "╕&q" and "ⁿ  └" in the newly created array? And also, is there any other more efficient way to achieve what I am trying to do?
The problem is that strncpy does not store a null in the target array if the specified length is not greater than the length of the source string (as is always the case here). So whatever garbage happened to be in the test array will remain there.
You can fix this specific problem by zeroing the test array, either when you declare it:
char test[20][20] = { { 0 } };
or as you use it:
memset(test[j], 0, 20);
strncpy(test[j], found, strlen(namesArr[i]));
but in general, it is best to avoid strncpy for this reason.
The length limitation for strncpy should be based on the target size, not the source length: that's the point of using it over strcpy, which uses only the source length. In your code
strncpy(test[j], found, strlen(namesArr[i]));
the length parameter is from the source array, which defeats the purpose of using strncpy. In addition, the nul terminator will not be present if the function copies the full limit of bytes, so the code should be
strncpy(test[j], found, 19); // limit to target size, leaving room for terminator
test[j][19] = '\0'; // add terminator (if copy did not complete)
Whether you loaded namesArr[] from file correctly is another potential issue, since you do not show the code.
Edited:
Slight modification to a previous answer:
1) Since you are working with C strings, make sure that you null-terminate the buffer (strncpy(...) does not always do it for you).
2) When using strncpy, the length argument should be the target buffer's byte capacity - 1 (leaving space for the null terminator), not the source string length.
...
size_t len = strlen(found);
memset(test[j], 0, 20);
strncpy(test[j], found, 19); // maximum length (19) is the size of the target test[j] minus 1
if (len > 19) len = 19; // in case found is longer than the target string
test[j][len] = 0; // terminator goes right after the last copied character
...
In addition to what Chris Dodd said, quoted from man strncpy:
The strncpy() function is similar [to the strcpy() function], except that at most n bytes of src are copied. Warning: If there is no null byte among the first n bytes of src, the string placed in dest will not be null-terminated.
Since the size parameter in your strncpy call is the length of the string, this will not include the null byte at the end of the string and thus your destination string will not be null-terminated from this call.
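If the goal is to copy only the matched name, one alternative that avoids strncpy altogether is snprintf() with a "%.*s" conversion, which always writes a terminator when the capacity is non-zero. The sketch below assumes the 20-byte rows of test from the question; store_name and NAME_CAP are hypothetical names.

```c
#include <stdio.h>
#include <string.h>

#define NAME_CAP 20     /* matches char test[20][20] in the question */

/* Copies the first name_len characters of found into dst, truncating to fit.
 * The precision in "%.*s" limits how many characters are read from found. */
void store_name(char dst[NAME_CAP], const char *found, size_t name_len) {
    if (name_len > NAME_CAP - 1)
        name_len = NAME_CAP - 1;
    snprintf(dst, NAME_CAP, "%.*s", (int)name_len, found);
}
```

In the question's loop this would be called as store_name(test[j], found, strlen(namesArr[i])), ideally together with a check that j stays below 20 before writing.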

insert text inside a line

I have a file pointer which I am using with fgets() to give me a complete line along with the new line in the buffer.
I want to replace 1 char and add another character before the new line. Is that possible?
For example:
buffer is "12345;\n"
output buffer is "12345xy\n"
This is the code:
buff = fgets((char *)newbuff, IO_BufferSize , IO_handle[i_inx]->fp);
nptr = IO_handle[i_inx]->fp;
if(feof(nptr))
{
    memcpy((char *)o_rec_buf+(strlen((char *)newbuff)-1),"E",1);
}
else
{
    memcpy((char *)o_rec_buf+(strlen((char *)newbuff)-1),"R",1);
}
As you can see I am replacing the new line here (example line is shown above).
I want to insert the text and retain the new line instead of what I am doing above.
You can't insert one character the way you want to. If you are sure the o_rec_buf has enough space, and that the line will always end in ";\n", then you can do something like:
size_t n = strlen(newbuff);
if (n >= 2)
strcpy(o_rec_buf + n - 1, "E\n");
/* memcpy(o_rec_buf+n-1, "E\n", 3); works too */
Note that using feof() the way you do is an error most of the time. feof() reports an end-of-file condition only after a read has already hit it. If you are running the above code in a loop, then when feof() returns 'true', no line will have been read by fgets: buff will be NULL and newbuff will be unchanged. In other words, newbuff will still contain the data from the previous fgets call, and you will process the last line twice. See the comp.lang.c FAQ (CLC FAQ), question 12.2, for more and for a solution.
Finally, why all the casts? Are o_rec_buf and newbuff not of type char *?
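The usual idiom is to let the return value of fgets() drive the loop and to consult feof() or ferror() only afterwards. A minimal sketch, with a hypothetical count_lines function standing in for the real per-line processing:

```c
#include <stdio.h>

/* Counts lines by testing the return value of fgets instead of feof().
 * Assumes no line is longer than the buffer; a longer line would be
 * counted in several pieces. */
long count_lines(FILE *fp) {
    char buf[256];
    long n = 0;
    while (fgets(buf, sizeof buf, fp) != NULL)
        n++;            /* buf holds a freshly read line on every iteration */
    /* only here do feof(fp) and ferror(fp) tell why the loop ended */
    return n;
}
```

Because the loop body runs only after a successful read, the last line cannot be processed twice.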
If the buffer has enough space, you'll need to move the tail of the string one character further using memmove, and then update the char you need.
Make sure you do not forget to memmove the trailing '\0' as well.
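A sketch of that memmove approach, assuming the caller knows the capacity of the buffer (the name insert_char is hypothetical):

```c
#include <stdbool.h>
#include <string.h>

/* Inserts ch at index pos of the NUL-terminated string buf (capacity cap).
 * Returns false if pos is out of range or there is no room for one more char. */
bool insert_char(char *buf, size_t cap, size_t pos, char ch) {
    size_t len = strlen(buf);
    if (pos > len || len + 2 > cap)
        return false;
    /* shift the tail, including the trailing '\0', one place to the right */
    memmove(buf + pos + 1, buf + pos, len - pos + 1);
    buf[pos] = ch;
    return true;
}
```

For the example in the question, with n = strlen(buf), setting buf[n - 2] = 'x' and then calling insert_char(buf, cap, n - 1, 'y') turns "12345;\n" into "12345xy\n".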

Resources