I am fairly new to C-coding and I have a task where we run libFuzzer on a basic C-program to exploit problems and fix them. This is my C-program (it takes a string input and changes "&" to "&amp", ">" to "&gt" and "<" to "&lt"):
char *checkString(char str[50]) {
int i;
char *newstr;
newstr = (char *)malloc(200);
for (i = 0; i < strlen(str); i++) {
if (str[i] == '&') {
const char *ch = "&amp";
strncat(newstr, ch, 4);
} else if (str[i] == '<') {
const char *ch = "&lt";
strncat(newstr, ch, 3);
} else if (str[i] == '>') {
const char *ch = "&gt";
strncat(newstr, ch, 3);
} else {
const char ch = str[i];
strncat(newstr, &ch, 1);
}
}
return newstr;
}
This is the error message from libFuzzer:
SUMMARY: AddressSanitizer: heap-buffer-overflow (/path/to/a.out+0x50dc14) in strncat
Anybody who knows how to possibly fix this heap buffer overflow problem? Thanks!
After newstr = (char *)malloc(200);, newstr is not yet properly initialized so you must not call strncat( newstr, ... ).
You can solve that e.g. by calling strcpy( newstr, "" ); after malloc() or by replacing malloc(200) with calloc(200,1) which fills the entire buffer with NUL.
Besides, as @stevesummit has mentioned, despite its declaration there is no guarantee that strlen(str) < 50. So instead of allocating a fixed number of 200 characters, you should allocate strlen(str)*4 + 1
... or strlen(str)*5 + 1 if what you're doing is HTML escaping and you realize that & should be replaced by &amp;
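Putting the two answers together, a minimal sketch of a fixed version might look like the following. It assumes the intent really is HTML escaping with the full entities (&amp;, &lt;, &gt;); the name escape_html is made up for illustration and is not from the original post. It sizes the buffer from the input and writes through a pointer, so no uninitialized memory is ever passed to strncat.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated, HTML-escaped copy of str, or NULL on
 * allocation failure. The caller must free() the result.
 * escape_html is a hypothetical name, not from the original post. */
char *escape_html(const char *str)
{
    size_t len = strlen(str);
    /* Worst case: every character becomes "&amp;" (5 bytes), plus the NUL. */
    char *out = malloc(len * 5 + 1);
    if (out == NULL)
        return NULL;

    char *p = out;                      /* write position inside out */
    for (size_t i = 0; i < len; i++) {
        const char *rep;
        switch (str[i]) {
        case '&': rep = "&amp;"; break;
        case '<': rep = "&lt;";  break;
        case '>': rep = "&gt;";  break;
        default:
            *p++ = str[i];              /* ordinary character: copy as-is */
            continue;                   /* next iteration of the for loop */
        }
        size_t n = strlen(rep);
        memcpy(p, rep, n);
        p += n;
    }
    *p = '\0';
    return out;
}
```

Calling escape_html("a<b>&c") should yield "a&lt;b&gt;&amp;c"; the caller is responsible for freeing the returned pointer.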
So I am writing a little function to parse paths, it looks like this:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
int parse_path() {
char *pathname = "this/is/a/path/hello";
int char_index = 0;
char current_char = pathname[char_index];
char *buffer = malloc(2 * sizeof(char));
char *current_char_str = malloc(2 * sizeof(char));
while (current_char != '\0' && (int)current_char != 11) {
if (char_index == 0 && current_char == '/') {
char_index++; current_char = pathname[char_index];
continue;
}
while (current_char != '/' && current_char != '\0') {
current_char_str[0] = current_char;
current_char_str[1] = '\0';
buffer = (char *)realloc(buffer, (strlen(buffer) + 2) * sizeof(char));
strcat(buffer, current_char_str);
char_index++; current_char = pathname[char_index];
}
if (strlen(buffer)) {
printf("buffer(%s)\n", buffer);
current_char_str[0] = '\0';
buffer[0] = '\0';
}
char_index++; current_char = pathname[char_index];
}
};
int main(int argc, char *argv[]) {
parse_path();
printf("hello\n");
return 0;
}
Now, there is undefined behavior in my code, it looks like the printf call inside the main method is changing the buffer variable... as you can see, the output of this program is:
buffer(this)
buffer(is)
buffer(a)
buffer(path)
buffer(hello)
buffer(buffer(%s)
)
buffer(hello)
hello
I have looked at other posts where the same sort of problem is mentioned and people have told me to use a static char array etc. but that does not seem to help.
Any suggestions?
For some reason, at one time in this program the "hello" string from printf is present in my buffer variable.
Sebastian, if you are still having problems after @PaulOgilvie's answer, then it is most likely due to not understanding his answer. Your problem is due to buffer being allocated but not initialized. When you call malloc, it allocates a block of at least the size requested, and returns a pointer to the beginning address for the new block -- but does nothing with the contents of the new block -- meaning the block is full of random values that just happened to be in the range of addresses for the new block.
So when you call strcat(buffer, current_char_str); the first time and there is nothing but random garbage in buffer and no nul-terminating character -- you do invoke Undefined Behavior. (there is no end-of-string in buffer to be found)
To fix the error, you simply need to make buffer an empty-string after it is allocated by setting the first character to the nul-terminating character, or use calloc instead to allocate the block which will ensure all bytes are set to zero.
For example:
int parse_path (const char *pathname)
{
int char_index = 0, ccs_index = 0;
char current_char = pathname[char_index];
char *buffer = NULL;
char *current_char_str = NULL;
if (!(buffer = malloc (2))) {
perror ("malloc-buffer");
return 0;
}
*buffer = 0; /* make buffer empty-string, or use calloc */
...
Also do not hardcode paths or numbers (that includes the 0 and 2, but we will let those slide for now). Hardcoding "this/is/a/path/hello" within parse_path() makes it a rather un-useful function. Instead, make your pathname variable your parameter so it can take any path you want to send to it...
While the whole idea of realloc'ing 2-characters at a time is rather inefficient, you always need to realloc with a temporary pointer rather than the pointer itself. Why? realloc can and does fail and when it does, it returns NULL. If you are using the pointer itself, you will overwrite your current pointer address with NULL in the event of failure, losing the address to your existing block of memory forever creating a memory leak. Instead,
void *tmp = realloc (buffer, strlen(buffer) + 2);
if (!tmp) {
perror ("realloc-tmp");
goto alldone; /* use goto to break nested loops */
}
buffer = tmp; /* realloc succeeded, so it is now safe to update buffer */
...
}
alldone:;
/* return something meaningful, your function is type 'int' */
}
A short example incorporating the fixes and temporary pointer would be:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int parse_path (const char *pathname)
{
int char_index = 0, ccs_index = 0;
char current_char = pathname[char_index];
char *buffer = NULL;
char *current_char_str = NULL;
if (!(buffer = malloc (2))) {
perror ("malloc-buffer");
return 0;
}
*buffer = 0; /* make buffer empty-string, or use calloc */
if (!(current_char_str = malloc (2))) {
perror ("malloc-current_char_str");
return 0;
}
while (current_char != '\0' && (int) current_char != 11) {
if (char_index == 0 && current_char == '/') {
char_index++;
current_char = pathname[char_index];
continue;
}
while (current_char != '/' && current_char != '\0') {
current_char_str[0] = current_char;
current_char_str[1] = '\0';
void *tmp = realloc (buffer, strlen(buffer) + 2);
if (!tmp) {
perror ("realloc-tmp");
goto alldone;
}
buffer = tmp; /* realloc succeeded, so it is now safe to update buffer */
strcat(buffer, current_char_str);
char_index++;
current_char = pathname[char_index];
}
if (strlen(buffer)) {
printf("buffer(%s)\n", buffer);
current_char_str[0] = '\0';
buffer[0] = '\0';
}
if (current_char != '\0') {
char_index++;
current_char = pathname[char_index];
}
}
alldone:;
return ccs_index;
}
int main(int argc, char* argv[]) {
parse_path ("this/is/a/path/hello");
printf ("hello\n");
return 0;
}
(note: your logic is quite tortured above and you could just use a fixed buffer of PATH_MAX size (include limits.h) and dispense with allocating. Otherwise, you should allocate some anticipated number of characters for buffer to begin with, like strlen (pathname) + 1 (the + 1 leaves room for the nul-terminating character), which would ensure sufficient space for each path component without reallocating. I'd rather over-allocate by 1000-characters than screw up indexing worrying about reallocating 2-characters at a time...)
Example Use/Output
> bin\parsepath.exe
buffer(this)
buffer(is)
buffer(a)
buffer(path)
buffer(hello)
hello
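The single up-front allocation suggested in the note above could be sketched roughly as follows. This is only an illustration under the assumption that printing each component is all that is needed; parse_path_once is a made-up name, and the return value (the number of components) is an addition for convenience.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Prints each non-empty path component and returns how many were found,
 * or -1 if the allocation fails. parse_path_once is a hypothetical name. */
int parse_path_once(const char *pathname)
{
    /* One component can never be longer than the whole path (+1 for NUL). */
    char *buffer = malloc(strlen(pathname) + 1);
    if (buffer == NULL) {
        perror("malloc-buffer");
        return -1;
    }

    int count = 0;
    size_t len = 0;                     /* characters currently in buffer */
    for (const char *p = pathname; ; p++) {
        if (*p == '/' || *p == '\0') {
            if (len > 0) {              /* skip empty components */
                buffer[len] = '\0';
                printf("buffer(%s)\n", buffer);
                count++;
                len = 0;
            }
            if (*p == '\0')
                break;                  /* stop at the terminator, never past it */
        } else {
            buffer[len++] = *p;
        }
    }

    free(buffer);
    return count;
}
```

Because the buffer is sized once from the input, there is no realloc to get wrong, and the block is freed before returning.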
A More Straight-Forward Approach Without Allocation
Simply using a buffer of PATH_MAX size or an allocated buffer of at least strlen (pathname) + 1 size will allow you to simply step down your string without any reallocations, e.g.
#include <stdio.h>
#include <limits.h> /* for PATH_MAX - but VS doesn't provide it, so we check */
#ifndef PATH_MAX
#define PATH_MAX 2048
#endif
void parse_path (const char *pathname)
{
const char *p = pathname;
char buffer[PATH_MAX], *b = buffer;
while (*p) {
if (*p == '/') {
if (p != pathname) {
*b = 0;
printf ("buffer (%s)\n", buffer);
b = buffer;
}
}
else
*b++ = *p;
p++;
}
if (b != buffer) {
*b = 0;
printf ("buffer (%s)\n", buffer);
}
}
int main (int argc, char* argv[]) {
char *path = argc > 1 ? argv[1] : "this/is/a/path/hello";
parse_path (path);
printf ("hello\n");
return 0;
}
Example Use/Output
> parsepath2.exe
buffer (this)
buffer (is)
buffer (a)
buffer (path)
buffer (hello)
hello
Or
> parsepath2.exe another/path/that/ends/in/a/filename
buffer (another)
buffer (path)
buffer (that)
buffer (ends)
buffer (in)
buffer (a)
buffer (filename)
hello
Now you can pass any path you would like to parse as an argument to your program and it will be parsed without having to change or recompile anything. Look things over and let me know if you have questions.
You strcat something to buffer but buffer has never been initialized. strcat will first search for the first null character and then copy the string to concatenate there. You are now probably overwriting memory that is not yours.
Before the outer while loop do:
*buffer= '\0';
There are 2 main problems in your code:
the arrays allocated by malloc() are not initialized, so you have undefined behavior when you call strlen(buffer) before setting a null terminator inside the array buffer points to. The program could just crash, but in your case whatever content is present in the memory block and after it is retained up to the first null byte.
just before the end of the outer loop, you should only take the next character from the path if the current character is a '/'. In your case, you skip the null terminator and the program has undefined behavior as you read beyond the end of the string constant. Indeed, the parse continues through another string constant "buffer(%s)\n" and through yet another one "hello". The string constants seem to be adjacent without padding on your system, which is just a coincidence.
Here is a corrected version:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
void parse_path(const char *pathname) {
int char_index = 0;
char current_char = pathname[char_index];
char *buffer = calloc(1, 1);
char *current_char_str = calloc(2, 1); /* 2 bytes: one character plus the null terminator */
while (current_char != '\0' && current_char != 11) {
if (char_index == 0 && current_char == '/') {
char_index++; current_char = pathname[char_index];
continue;
}
while (current_char != '/' && current_char != '\0') {
current_char_str[0] = current_char;
current_char_str[1] = '\0';
buffer = (char *)realloc(buffer, strlen(buffer) + 2);
strcat(buffer, current_char_str);
char_index++; current_char = pathname[char_index];
}
if (strlen(buffer)) {
printf("buffer(%s)\n", buffer);
current_char_str[0] = '\0';
buffer[0] = '\0';
}
if (current_char == '/') {
char_index++; current_char = pathname[char_index];
}
}
}
int main(int argc, char *argv[]) {
parse_path("this/is/a/path/hello");
printf("hello\n");
return 0;
}
Output:
buffer(this)
buffer(is)
buffer(a)
buffer(path)
buffer(hello)
hello
Note however some remaining problems:
allocation failure is not tested, resulting in undefined behavior,
allocated blocks are not freed, resulting in memory leaks,
it is unclear why you test current_char != 11: did you mean to stop at TAB or newline?
Here is a much simpler version with the same behavior:
#include <stdio.h>
#include <string.h>
void parse_path(const char *pathname) {
int i, n;
for (i = 0; pathname[i] != '\0'; i += n) {
if (pathname[i] == '/') {
n = 1; /* skip path separators and empty items */
} else {
n = strcspn(pathname + i, "/"); /* get the length of the path item */
printf("buffer(%.*s)\n", n, pathname + i);
}
}
}
int main(int argc, char *argv[]) {
parse_path("this/is/a/path/hello");
printf("hello\n");
return 0;
}
I wrote this code, but it inserts garbage at the start of the string:
void append(char *s, char c) {
int len = strlen(s);
s[len] = c;
s[len + 1] = '\0';
}
int main(void) {
char c, *s;
int i = 0;
s = malloc(sizeof(char));
while ((c = getchar()) != '\n') {
i++;
s = realloc(s, i * sizeof(char));
append(s, c);
}
printf("\n%s",s);
}
How can I do it?
There are multiple problems in your code:
you iterate until you read a newline ('\n') from the standard input stream. This will cause an endless loop if the end of file occurs before you read a newline, which would happen if you redirect standard input from an empty file.
c should be defined as int so you can test for EOF properly.
s should be null terminated at all times, you must set the first byte to '\0' after malloc() as this function does not initialize the memory it allocates.
i should be initialized to 1 so the first realloc() extends the array by 1 etc. As coded, your array is one byte too short to accommodate the extra character.
you should check for memory allocation failure.
for good style, you should free the allocated memory before exiting the program
main() should return an int, preferably 0 for success.
Here is a corrected version:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/* append a character to a string, assuming s points to an array with enough space */
void append(char *s, char c) {
size_t len = strlen(s);
s[len] = c;
s[len + 1] = '\0';
}
int main(void) {
int c;
char *s;
size_t i = 1;
s = malloc(i * sizeof(char));
if (s == NULL) {
printf("memory allocation failure\n");
return 1;
}
*s = '\0';
while ((c = getchar()) != EOF && c != '\n') {
i++;
s = realloc(s, i * sizeof(char));
if (s == NULL) {
printf("memory allocation failure\n");
return 1;
}
append(s, c);
}
printf("%s\n", s);
free(s);
return 0;
}
When you call strlen it searches for a '\0' char to end the string. You don't have this char inside your string, so the behavior of strlen is unpredictable.
Your append function is actually good.
Also, a minor thing, you need to add return 0; to your main function. And i should start from 1 instead of 0.
Here is how it should look:
int main(void){
char *s;
size_t i = 1;
s = malloc (i * sizeof(char));//Just for fun. The i is not needed.
if(s == NULL) {
fprintf(stderr, "Could not allocate enough memory\n");
return 1;
}
s[0] = '\0';
for(int c = getchar(); c != '\n' && c != EOF; c = getchar()) {//c must be an int so that EOF can be told apart from a valid character.
i++;
s = realloc (s,i * sizeof(char) );
if(s == NULL) {
fprintf(stderr, "Could not allocate enough memory\n");
return 1;
}
append (s,c);
}
printf("%s\n",s);
return 0;
}
Thanks for the comments that helped me improve the code (and my English). I am not perfect :)
The inner realloc needs to allocate one element more (for the trailing \0) and you have to initialize s[0] = '\0' before starting the loop.
Btw, you can replace your append by strcat() or write it like
size_t i = 0;
s = malloc(1);
/* TODO: check for s != NULL */
while ((c = getchar()) != '\n') {
s[i] = c;
i++;
s = realloc(s, i + 1);
/* TODO: check for s != NULL */
}
s[i] = '\0';
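Growing the buffer by one byte per character works, but it calls realloc once for every character read. A common alternative is to track the capacity separately and double it when it runs out. The sketch below is one way to do that, not the approach from any of the answers above; read_line and the initial capacity of 16 are assumptions chosen for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads one line from fp (without the trailing newline) into a newly
 * allocated string. Returns NULL on allocation failure, or if end of file
 * is reached before any character is read. The caller must free() the
 * result. read_line is a hypothetical helper name. */
char *read_line(FILE *fp)
{
    size_t cap = 16, len = 0;           /* 16 is an arbitrary starting size */
    char *s = malloc(cap);
    if (s == NULL)
        return NULL;

    int c;                              /* int so EOF is distinguishable */
    while ((c = fgetc(fp)) != EOF && c != '\n') {
        if (len + 1 >= cap) {           /* keep room for the terminator */
            char *tmp = realloc(s, cap * 2);
            if (tmp == NULL) {
                free(s);                /* the old block is still valid */
                return NULL;
            }
            s = tmp;
            cap *= 2;
        }
        s[len++] = (char)c;
    }
    if (c == EOF && len == 0) {         /* nothing to return */
        free(s);
        return NULL;
    }
    s[len] = '\0';
    return s;
}
```

Typical use would be char *line = read_line(stdin); followed by a NULL check and, eventually, free(line).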
I'm trying to allocate memory only if I need it for the next while.
char *str = malloc(sizeof(char));
int i = 0;
while(something == true){
str[i] = fgetc(fp);
str = realloc(str, strlen(str)+1);
i++;
}
free(str);
But for some reason the code above gives me an "Invalid read of size 1" at strlen().
strlen will not determine the size of the allocated char array even if it contains a null terminated string. See proposed fix although I do not like the code structure overall: You will always end up with an extra allocated character.
char *str = malloc(sizeof(char));
int i = 0;
while(something == true){
str[i] = fgetc(fp);
str = realloc(str, (i+2)*sizeof(char));
i++;
}
// str[i*sizeof(char)]='\0'; <-- Add this if you want a null terminated string
free(str);
I would propose the following code that would avoid allocating the extra character:
char *str = NULL;
int i = 0;
while(something == true){
str = realloc(str, (i+1)*sizeof(char));
str[i] = fgetc(fp);
i++;
}
free(str);
As per documentation, "In case that ptr is a null pointer, the function behaves like malloc, assigning a new block of size bytes and returning a pointer to its beginning."
This is in case you are not reading text and not planning to use such functions as strlen, strcat...
Chunk at a time allocation:
char *str = malloc(sizeof(char));
int i = 0;
const int chunk_size = 100;
while(something == true){
str[i] = fgetc(fp);
if (i % chunk_size == 0)
str = realloc(str, (i+1+chunk_size)*sizeof(char));
i++;
}
// str[i*sizeof(char)]='\0'; <-- Add this if you want a null terminated string
free(str);
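For completeness, here is a hedged sketch of the chunked approach wrapped in a function, with the loop condition, the EOF check and the realloc failure path filled in. The function name read_all, the out_len parameter and the CHUNK_SIZE macro are assumptions added for illustration. The length is tracked in a variable, so strlen is never called on a buffer that is not yet terminated.

```c
#include <stdio.h>
#include <stdlib.h>

#define CHUNK_SIZE 100  /* growth step; the value is an arbitrary assumption */

/* Reads everything from fp into a newly allocated, NUL-terminated buffer.
 * Stores the number of bytes read in *out_len (if not NULL).
 * Returns NULL on allocation failure. read_all is a hypothetical name. */
char *read_all(FILE *fp, size_t *out_len)
{
    size_t cap = CHUNK_SIZE, len = 0;
    char *str = malloc(cap);
    if (str == NULL)
        return NULL;

    int c;                              /* int so EOF is distinguishable */
    while ((c = fgetc(fp)) != EOF) {
        if (len + 1 == cap) {           /* full, apart from the NUL slot */
            char *tmp = realloc(str, cap + CHUNK_SIZE);
            if (tmp == NULL) {
                free(str);
                return NULL;
            }
            str = tmp;
            cap += CHUNK_SIZE;
        }
        str[len++] = (char)c;
    }
    str[len] = '\0';
    if (out_len != NULL)
        *out_len = len;
    return str;
}
```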
I am trying to merge two strings of variable length in C. The result should be 1st character from str1 then 1st character from str2 then 2nd character from str1, 2nd character from str2, etc. When it reaches the end of one string it should append the rest of the other string.
For Example:
str1 = "abcdefg";
str2 = "1234";
outputString = "a1b2c3d4efg";
I'm pretty new to C, my first idea was to convert both strings to arrays then try to iterate through the arrays but I thought there might be an easier method. Sample code would be appreciated.
UPDATE:
I've tried to implement the answer below. My function looks like the following.
void strMerge(const char *s1, const char *s2, char *output, unsigned int ccDest)
{
printf("string1 is %s\n", s1);
printf("string2 is %s\n", s2);
while (*s1 != '\0' && *s2 != '\0')
{
*output++ = *s1++;
*output++ = *s2++;
}
while (*s1 != '\0')
*output++ = *s1++;
while (*s2 != '\0')
*output++ = *s2++;
*output = '\0';
printf("merged string is %s\n", *output);
}
But I get a warning when compiling:
$ gcc -g -std=c99 strmerge.c -o strmerge
strmerge2.c: In function ‘strMerge’:
strmerge2.c:41:5: warning: format ‘%s’ expects argument of type ‘char *’, but argument 2 has type ‘int’ [-Wformat]
And when I run it it doesnt work:
./strmerge abcdefg 12314135
string1 is abcdefg
string2 is 12314135
merged string is (null)
Why does it think argument 2 is an int and how do I fix it to be a char? If I remove the "*" off output in the printf it doesn't give a compile error but the function still doesn’t work.
The code below ensures that the strings can't overflow by making the output string as long as the two input strings, and using fgets() to ensure that there is no overflow of the input strings. One alternative design would do dynamic memory allocation (malloc() et al), at the cost of the calling code having to free() the allocated space. Another design would pass the length of the output buffer to the function so that it could ensure no overflow occurs.
The test program doesn't emit prompts: it would not be hard to add a function to do so.
Code
#include <stdio.h>
#include <string.h>
void interleave_strings(const char *s1, const char *s2, char *output)
{
while (*s1 != '\0' && *s2 != '\0')
{
*output++ = *s1++;
*output++ = *s2++;
}
while (*s1 != '\0')
*output++ = *s1++;
while (*s2 != '\0')
*output++ = *s2++;
*output = '\0';
}
int main(void)
{
char line1[100];
char line2[100];
char output[200];
if (fgets(line1, sizeof(line1), stdin) != 0 &&
fgets(line2, sizeof(line2), stdin) != 0)
{
char *end1 = line1 + strlen(line1) - 1;
char *end2 = line2 + strlen(line2) - 1;
if (*end1 == '\n')
*end1 = '\0';
if (*end2 == '\n')
*end2 = '\0';
interleave_strings(line1, line2, output);
printf("In1: <<%s>>\n", line1);
printf("In2: <<%s>>\n", line2);
printf("Out: <<%s>>\n", output);
}
}
Example output
$ ./interleave
abcdefgh
1234
In1: <<abcdefgh>>
In2: <<1234>>
Out: <<a1b2c3d4efgh>>
$
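The alternative design mentioned above, where the function is told how large the output buffer is, could be sketched like this. The name interleave_bounded and the 0 / -1 return convention are assumptions for illustration only.

```c
#include <stddef.h>

/* Interleaves s1 and s2 into output, writing at most size bytes including
 * the terminating NUL. Returns 0 on success, or -1 if the result had to be
 * truncated (or size is 0). interleave_bounded is a hypothetical name. */
int interleave_bounded(const char *s1, const char *s2, char *output, size_t size)
{
    if (size == 0)
        return -1;

    size_t n = 0;                       /* bytes written so far */
    while (*s1 != '\0' || *s2 != '\0') {
        if (*s1 != '\0') {
            if (n + 1 >= size)          /* keep one byte for the NUL */
                break;
            output[n++] = *s1++;
        }
        if (*s2 != '\0') {
            if (n + 1 >= size)
                break;
            output[n++] = *s2++;
        }
    }
    output[n] = '\0';
    /* Anything left unread in either input means the output was cut short. */
    return (*s1 == '\0' && *s2 == '\0') ? 0 : -1;
}
```

With a buffer declared as char out[32], a call such as interleave_bounded("abcdefg", "1234", out, sizeof out) fills out with "a1b2c3d4efg" and returns 0.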
char* getMerged(const char* str1, const char* str2) {
char* str = malloc(strlen(str1)+strlen(str2)+1);
int k=0,i;
for(i=0;str1[i] !='\0' && str2[i] !='\0';i++) {
str[k++] = str1[i];
str[k++] = str2[i];
}
str[k]='\0';
if (str1[i] != '\0') {
strcpy(&str[k], &str1[i]);
} else if (str2[i] != '\0') {
strcpy(&str[k], &str2[i]);
}
return str;
}
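Your strMerge prints (null) because you pass *output to printf, that is, the single character output points to, which was set to '\0' in the previous statement, rather than a pointer to the start of the string. Note also that output has been advanced to the end of the merged string by then, so printing output itself would show an empty string.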
#include<stdio.h>
#include<string.h>
#include<stdlib.h> //for malloc
//strMerge merges two string as per user requirement
void strMerge(const char *s1, const char *s2, char *output)
{
printf("string1 is %s\n", s1);
printf("string2 is %s\n", s2);
while (*s1 != '\0' && *s2 != '\0')
{
*output++= *s1++;
*output++ = *s2++;
}
while (*s1 != '\0')
*output++=*s1++;
while (*s2 != '\0')
*output++ = *s2++;
*output='\0';
}
int main()
{
char *str1="abcdefg";
char *str2="1234";
char *output=malloc(strlen(str1)+strlen(str2)+1); //allocate memory 7+4+1 = 12 in this case
strMerge(str1,str2,output);
printf("%s",output);
return 0;
}
OUTPUT:
string1 is abcdefg
string2 is 1234
a1b2c3d4efg
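If the merged string has to be printed inside the function itself, one option is to remember where output started before the pointer is advanced. The following is a small sketch of that idea; strMergeAndPrint is a made-up name, and returning the start pointer is an addition for convenience.

```c
#include <stdio.h>

/* Same merge as above, but remembers where output starts so the finished
 * string can be printed (and returned) after the write pointer has moved.
 * strMergeAndPrint is a hypothetical name. */
char *strMergeAndPrint(const char *s1, const char *s2, char *output)
{
    char *start = output;               /* output itself is advanced below */
    while (*s1 != '\0' && *s2 != '\0') {
        *output++ = *s1++;
        *output++ = *s2++;
    }
    while (*s1 != '\0')
        *output++ = *s1++;
    while (*s2 != '\0')
        *output++ = *s2++;
    *output = '\0';
    printf("merged string is %s\n", start);   /* start, not *output */
    return start;
}
```

As in the code above, the caller must still supply a buffer of at least strlen(s1) + strlen(s2) + 1 bytes.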
In C, both strings are already arrays that can be accessed by their pointers. You just need to create a new buffer that's large enough, then copy into it.
E.g. something like this:
size_t str1Length = strlen(str1);
size_t str2Length = strlen(str2);
char* output = (char*) malloc(str1Length + str2Length + 1);
size_t j = 0;
for (size_t i = 0; i < str1Length; i++)
{
output[j++] = str1[i];
if (i < str2Length) /* only while str2 still has characters left */
output[j++] = str2[i];
}
/* append whatever remains of str2 once str1 is exhausted */
for (size_t i = str1Length; i < str2Length; i++)
{
output[j++] = str2[i];
}
output[j] = '\0'; /* terminate the merged string */
Pseudocode below, but does anyone have any idea why this would be breaking the heap? The urlencode function is a standard library function downloaded elsewhere, and appears to function as designed. In the actual code I'm using dynamic size char arrays, thus the reason for the malloc requirement in main.
/* Returns a url-encoded version of str */
/* IMPORTANT: be sure to free() the returned string after use */
char *urlencode(char *str) {
//char *pstr = str, *buf = malloc(strlen(str) * 3 + 1), *pbuf = buf;
char *pstr = str, *buf = malloc(strlen(str) * 3 + 1), *pbuf = buf;
while (*pstr) {
if (isalnum(*pstr) || *pstr == '-' || *pstr == '_' || *pstr == '.' || *pstr == '~')
*pbuf++ = *pstr;
else if (*pstr == ' ')
*pbuf++ = '+';
else
*pbuf++ = '%', *pbuf++ = to_hex(*pstr >> 4), *pbuf++ = to_hex(*pstr & 15);
pstr++;
}
*pbuf = '\0';
return buf;
}
int testFunction(char *str) {
char *tmpstr;
tmpstr = urlencode(str);
// Now we do a bunch of stuff
// that doesn't use str
free(tmpstr);
return 0;
// At the end of the function,
// the debugger shows str equal
// to "This is a test"
}
int main() {
char *str = NULL;
str = malloc(100);
strcpy(str, "This is a test");
testFunction(str);
free(str); // Debugger shows correct value for str, but "free" breaks the heap
return 0;
}
Thanks.
I would guess that str was already freed by free(tmpstr); - please have a look at the behavior of the urlencode-function. It seems like it does not generate a new string as return value, but passes the (changed) input string back.
The problem turned out to be that the size calculation for the initial malloc of str evaluated to 0. Thank you for the comments; unfortunately there is no way to mark a comment as the answer.
If this is an improper way to close this out, please let me know.
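The sizing code that evaluated to 0 was not shown, so the following is only a guess at a safer pattern: derive the allocation size from the string being stored, so that the buffer cannot end up smaller than its contents. copy_string is a hypothetical helper, not part of the original program.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a heap copy of src, sized from the string itself so the
 * allocation can never be too small. copy_string is a hypothetical helper;
 * the original sizing code was not shown. */
char *copy_string(const char *src)
{
    size_t size = strlen(src) + 1;      /* +1 for the terminating NUL */
    char *dst = malloc(size);
    if (dst == NULL)
        return NULL;
    memcpy(dst, src, size);             /* copies the NUL as well */
    return dst;
}
```

In main this would replace the malloc/strcpy pair with char *str = copy_string("This is a test"); followed by a NULL check.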