This subprogram takes three user inputs: a text string, a path to a file, and a 1 digit flag. It loads the file into a buffer, then appends both the flag and the file buffer, in that order, to a char array that serves as a payload. It returns the payload and the original user string.
I received a bug where some of my string operations on the file buffer, flag, and payload appeared to corrupt the memory that the user_string was located in. I fixed the bug by swapping strcat(flag, buffer) to strcpy(payload, flag), (which is what I intended to write originally), but I'm still perplexed as to what caused this bug.
My guess from reading the documentation (https://www.gnu.org/software/libc/manual/html_node/Concatenating-Strings.html) is that strcat extends the to string by strlen(to) bytes into unprotected memory, which the file contents loaded into the buffer then overwrote in a buffer overflow.
My questions are:
Is my guess correct?
Is there a way to reliably prevent this from occurring? Catching this sort of thing with an if(){} check is kind of unreliable, as it doesn't consistently return something obviously wrong; you expect a string of length filelength+1 and get a string of filelength+1.
bonus/unrelated: is there any computational cost/drawbacks/effects with calling a variable without operating on it?
/*
user inputs:
argv[0] = tendigitaa/four
argv[1] = ~/Desktop/helloworld.txt
argv[2] = 1
helloworld.txt is a text file containing (no quotes) : "Hello World"
*/
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <unistd.h>
#include <string.h>
int main (int argc, char **argv) {
char user_string[100] = "0";
char file_path[100] = "0";
char flag[1] = "0";
strcpy(user_string, argv[1]);
strcpy(file_path, argv[2]);
strcpy(flag, argv[3]);
/*
at this point printfs of the three declared variables return the same as the user inputs.
======
======
a bunch of other stuff happens...
======
======
and then this point printfs of the three declared variables return the same as the user inputs.
*/
FILE *file;
char * buffer = 0;
long filelength;
file = fopen(file_path, "r");
if (file) {
fseek(file, 0, SEEK_END);
filelength = ftell(file);
fseek(file, 0, SEEK_SET);
buffer = malloc(filelength);
printf("stringcheck1: %s \n", user_string);
if (buffer) {
fread(buffer, 1, filelength, file);
}
}
long payloadlen = filelength + 1;
char payload[payloadlen];
printf("stringcheck2: %s \n", user_string);
strcpy(payload, flag);
printf("stringcheck3: %s \n", user_string);
strcat(flag, buffer);
printf("stringcheck4: %s \n", user_string); //bug here
free(buffer);
printf("stringcheck5: %s \n", user_string);
payload; user_string; //bonus question: does this line have any effect on the program or computational cost?
return 0;
}
/*
printf output:
stringcheck1: tendigitaa/four
stringcheck2: tendigitaa/four
stringcheck3: tendigitaa/four
stringcheck4: lo World
stringcheck5: lo World
*/
note: taking this section out of the main program caused stringcheck 4 to segfault instead of returning "lo World". The behavior was otherwise equivalent.
strcat does exactly what documentation says:
char *strcat(char *restrict s1, const char *restrict s2); The
strcat() function shall append a copy of the string pointed to by s2
(including the terminating null byte) to the end of the string pointed
to by s1. The initial byte of s2 overwrites the null byte at the end
of s1. If copying takes place between objects that overlap, the
behavior is undefined.
s1 has to have enough memory allocated to accommodate both strings plus the terminating null byte.
The linked article is about writing your own string concatenation functions. How to write such a function depends on the application, as the article states. There are many ways.
In your program the destination char array is not big enough, so the result is undefined behaviour. It is not even big enough to hold a one-character string.
I strongly advise you to learn some C string basics.
If you want safer strcat you can write your own one for example:
char *mystrcat(const char *str1, const char *str2)
{
    char *dest = NULL;
    size_t str1_length, str2_length;
    if(str1 && str2)
    {
        dest = malloc((str1_length = strlen(str1)) + (str2_length = strlen(str2)) + 1);
        if(dest)
        {
            memcpy(dest, str1, str1_length);
            /* + 1 so the terminating null byte of str2 is copied too */
            memcpy(dest + str1_length, str2, str2_length + 1);
        }
    }
    return dest;
}
But safety always comes at a price: the code is longer and less efficient. The C language was designed to be as efficient as possible, sacrificing safety and introducing the idea of undefined behaviour.
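If allocating a new string for every concatenation is more than you need, another option is a bounds-checked append into a caller-supplied array. The following is only a minimal sketch; checked_strcat is a hypothetical helper name, not a standard library function, and it assumes dst already holds a valid null-terminated string.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a libc function): appends src to the string
 * already stored in dst, where dst_size is the total size of the dst array.
 * Returns 0 on success, or -1 (leaving dst unchanged) if it would not fit. */
int checked_strcat(char *dst, size_t dst_size, const char *src)
{
    size_t dst_len = strlen(dst);   /* dst must already hold a valid string */
    size_t src_len = strlen(src);

    /* Need room for both strings plus the terminating null byte. */
    if (dst_len >= dst_size || src_len >= dst_size - dst_len) {
        return -1;
    }
    memcpy(dst + dst_len, src, src_len + 1);  /* + 1 copies the terminator */
    return 0;
}
```

The caller passes sizeof on the destination array (or the allocated size) and checks the return value instead of inspecting the resulting string afterwards.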
You can't store a non-empty string in a 1-character array. A string needs room for the string contents and a null terminator.
So when you declare
char flag[1] = "1";
you've only allocated one byte, which contains the character 1. There's no null terminator.
Using this with any string functions will result in undefined behavior, because they look for the null terminator to find the end of the string.
strcat(flag, buffer) will search for the null terminator, which will be outside the array, and then append buffer after that. So this clearly causes a buffer overflow when writing.
strcpy(payload, flag) is also wrong. It will look for a null terminator after the flag bytes to know when to stop copying to payload, so it will copy more than just flag (unless there happens to be a null byte after it).
You can resolve the strcpy() problem by increasing the size:
char flag[2] = "1";
You can also leave the size empty, the compiler will make it large enough to hold the string that initializes it, including the null byte:
char flag[] = "1";
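Note also that fread does not add a null terminator, so a buffer from malloc(filelength) is not a C string, and a payload of filelength + 1 bytes has room for the flag and the file contents but not for a terminator. One way to avoid the string functions entirely is to build the payload from explicit lengths. This is a rough sketch under those assumptions; build_payload is a hypothetical name, not something from the question.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated string consisting of the
 * single character flag followed by the first data_len bytes of data.
 * The caller must free() the result. Returns NULL if allocation fails. */
char *build_payload(char flag, const char *data, size_t data_len)
{
    /* 1 byte for the flag, data_len bytes of data, 1 byte for the terminator. */
    char *payload = malloc(data_len + 2);
    if (payload == NULL) {
        return NULL;
    }
    payload[0] = flag;
    memcpy(payload + 1, data, data_len);  /* data need not be null-terminated */
    payload[data_len + 1] = '\0';
    return payload;
}
```

With the question's variables this would be called as build_payload(flag[0], buffer, filelength), using the byte count reported by ftell or fread rather than strlen.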
The line that causes the problem is strcat(flag, buffer): it tries to cram buffer into flag, which is only one character long, and you haven't allocated any more space to fit buffer.
If you want to put buffer into flag, flag needs room for both. The size of an array is fixed, so you would have to allocate flag with malloc() and then use realloc() to grow it to include the length of buffer.
Also the only thing you ever print is user_string. I'm not sure if you're trying to print the other string you're working with.
I have a string, and in it I need to find a substring and replace it. The one to be found and the one that'll replace it are of different length. My code, partially:
char *source_str = "aaa bbb CcCc dddd kkkk xxx yyyy";
char *pattern = "cccc";
char *new_sub_s = "mmmmm4343afdsafd";
char *sub_s1 = strcasestr(source_str, pattern);
printf("sub_s1: %s\r\n", sub_s1);
printf("sub_str before pattern: %s\r\n", sub_s1 - source_str); // Memory corruption
char *new_str = (char *)malloc(strlen(source_str) - strlen(pattern) + strlen(new_sub_s) + 1);
strcat(new_str, '\0');
strcat(new_str, "??? part before pattern ???");
strcat(new_str, new_sub_s);
strcat(new_str, "??? part after pattern ???");
Why do I have memory corruption?
How do I effectively extract and replace pattern with new_sub_s?
There are multiple problems in your code:
you do not test if sub_s1 was found in the string. What if there is no match?
printf("sub_str before pattern: %s\r\n", sub_s1 - source_str); passes a difference of pointers for %s that expects a string. The behavior is undefined.
strcat(new_str, '\0'); has undefined behavior because the destination string is uninitialized and you pass a null pointer as the string to concatenate. strcat expects a string pointer as its second argument, not a char, and '\0' is a character constant with type int (in C) and value 0, which the compiler will convert to a null pointer, with or without a warning. You probably meant to write *new_str = '\0';
You cannot compose the new string with strcat as posted: because the string before the match is not a C string, it is a fragment of a C string. You should instead determine the lengths of the different parts of the source string and use memcpy to copy fragments with explicit lengths.
Here is an example:
char *patch_string(const char *source_str, const char *pattern, const char *replacement) {
char *match = strcasestr(source_str, pattern);
if (match != NULL) {
size_t len = strlen(source_str);
size_t n1 = match - source_str; // # bytes before the match
size_t n2 = strlen(pattern); // # bytes in the pattern string
size_t n3 = strlen(replacement); // # bytes in the replacement string
size_t n4 = len - n1 - n2; // # bytes after the pattern in the source string
char *result = malloc(n1 + n3 + n4 + 1);
if (result != NULL) {
// copy the initial portion
memcpy(result, source_str, n1);
// copy the replacement string
memcpy(result + n1, replacement, n3);
// copy the trailing bytes, including the null terminator
memcpy(result + n1 + n3, match + n2, n4 + 1);
}
return result;
} else {
return strdup(source_str); // always return an allocated string
}
}
Note that the above code assumes that the match in the source string has the same length as the pattern string (in the example, the strings "cccc" and "CcCc" have the same length). Given that strcasestr is expected to perform a case-independent search, which is confirmed by the example strings in the question, this assumption might fail, for example if the encodings of upper and lower case letters have different lengths, or if accents are matched by strcasestr as would be expected in French: "é" and "E" should match but have different lengths when encoded in UTF-8. If strcasestr has this advanced behavior, it is not possible to determine the length of the matched portion of the source string without a more elaborate API.
printf("sub_str before pattern: %s\r\n", sub_s1 - source_str); // Memory corruption
You're taking the difference of two pointers, and printing it as though it was a pointer to a string. In practice, on your machine, this probably calculates a meaningless number and interprets it as a memory address. Since this is a small number, when interpreted as an address, on your system, this probably points to unmapped memory, so your program crashes. Depending on the platform, on the compiler, on optimization settings, on what else there is in your program, and on the phase of the Moon, anything could happen. It's undefined behavior.
Any half-decent compiler would tell you that there's a type mismatch between the %s directive and the argument. Turn those warnings on. For example, with GCC:
gcc -Wall -Wextra -Werror -O my_program.c
char *new_str = (char *)malloc(…);
strcat(new_str, '\0');
strcat(new_str, "…");
The first call to strcat attempts to append '\0'. This is a character, not a string. It happens that since this is the character 0, and C doesn't distinguish between characters and numbers, this is just a weird way of writing the integer 0. And any integer constant with the value 0 is a valid way of writing a null pointer constant. So strcat(new_str, '\0') is equivalent to strcat(new_str, NULL) which will probably crash due to attempting to dereference the null pointer. Depending on the compiler optimizations, it's possible that the compiler will think that this block of code is never executed, since it's attempting to dereference a null pointer, and this is undefined behavior: as far as the compiler is concerned, this can't happen. This is a case where you can plausibly expect that the undefined behavior causes the compiler to do something that looks preposterous, but makes perfect sense from the way the compiler sees the program.
Even if you'd written strcat(new_str, "\0") as you probably intended, that would be pointless. Note that "\0" is a pointless way of writing "": there's always a null terminator at the end of a string literal¹. And appending an empty string to a string wouldn't change it.
And there's another problem with the strcat calls. At this point, the content of new_str is not initialized. But strcat (if called correctly, even for strcat(new_str, ""), if the compiler doesn't optimize this away) will explore this uninitialized memory and look for the first null byte. Because the memory is uninitialized, there's no guarantee that there is a null byte in the allocated memory, so strcat may attempt to read from an unmapped address when it runs out of buffer, or it may corrupt whatever. Or it may make demons fly out of your nose: once again it's undefined behavior.
Before you do anything with the newly allocated memory area, make it contain the empty string: set the first character to 0. And before that, check that malloc succeeded. It will always succeed in your toy program, but not in the real world.
char *new_str = malloc(…);
if (new_str == NULL) {
return NULL; // or whatever you want to do to handle the error
}
new_str[0] = 0;
strcat(new_str, …);
¹ The only time there isn't a null terminator at the end of a "…" is when you use this to initialize an array and the characters that are spelled out fill the whole array without leaving room for a null terminator.
snprintf can be used to calculate the memory needed and then print the string to the allocated pointer.
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main ( void) {
    char *source_str = "aaa bbb CcCc dddd kkkk xxx yyyy";
    char *pattern = "cccc";
    char *new_sub_s = "mmmmm4343afdsafd";
    char *sub_s1 = strcasestr(source_str, pattern);
    if ( sub_s1 == NULL) {
        return 1; // pattern not found
    }
    int span = (int)( sub_s1 - source_str);
    char *tail = sub_s1 + strlen ( pattern);
    // first call only measures: it returns the length excluding the terminator
    int size = snprintf ( NULL, 0, "%.*s%s%s", span, source_str, new_sub_s, tail);
    char *new_str = malloc( size + 1);
    if ( new_str == NULL) {
        return 1;
    }
    // pass size + 1 so the last character is not truncated
    snprintf ( new_str, size + 1, "%.*s%s%s", span, source_str, new_sub_s, tail);
    printf ( "%s\n", new_str);
    free ( new_str);
    return 0;
}
I am trying to concatenate two strings in C and receive a "Thread 1: signal SIGABRT" error.
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include <string.h>
int main() {
char name[50];
ifile = fopen("stats.list", "r");
for(;;) {
fscanf(ifile, "%s%f%f", name, &sky, &stddev);
if (feof(ifile))
break;
char ext[5] = ".par";
dataparsFile = strcat(name, ext);
dataparsFile = fopen(dataparsFile, "w");
fprintf(dataparsFile, "%s\n",
"stuff gets read in to file named after new string";
fprintf(ofile, "phot ");
fprintf(ofile, "%s%s%s%s%s%s \n",
", datapars=", dataparsFile);
}
fclose(ifile);
fclose(ofile);
The goal of the code is to take an image name that is read in and add on the .par extension. Then, I want to open a file with that name of image+.par and write into it. Since I will have a couple hundred such files, I need to loop through them with the name changing each time.
The problem is that name is not initialized. In C, strings follow a convention: a string is any sequence of characters (ASCII in principle, though other printable characters work too) that must be followed by a '\0' byte marking the end of the string.
Your name array doesn't have this '\0', so strcat() searches for it and may read beyond the end of the array; in any case, reading uninitialized data is undefined behavior.
The way strcat(dst, src) works is pretty much like this
char *
strcat(char *const dst, const char *src)
{
    // Make a pointer to keep dst's address
    // unchanged and return it
    char *ptr = dst;
    // Search for the end of the destination
    // string to start copying there
    while (*ptr != '\0')
        ptr++;
    // Copy all the characters from `src' until the '\0'
    // occurs
    while (*src != '\0')
        *(ptr++) = *(src++);
    *ptr = '\0';
    return dst;
}
As you see, this is very inefficient if you call strcat() many times, and it will certainly not work if you pass either of the parameters before initializing it.
In fact, it's terribly unsafe because there is no bound checking, the caller has to make sure that the destination array is large enough to hold both strings.
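For the stated goal of appending ".par" to each image name, one bounded alternative is to let snprintf build the file name in a separate buffer and report when it does not fit. This is only a sketch; make_par_name is a hypothetical helper, and the buffer sizes are up to the caller.

```c
#include <stdio.h>

/* Hypothetical helper: writes name followed by ".par" into out, which has
 * room for out_size bytes. Returns 0 on success, -1 if the result would
 * not fit (or on an output error). */
int make_par_name(char *out, size_t out_size, const char *name)
{
    int n = snprintf(out, out_size, "%s.par", name);
    /* snprintf returns the length it wanted to write, excluding the
     * terminator; anything >= out_size means the output was truncated. */
    if (n < 0 || (size_t)n >= out_size) {
        return -1;
    }
    return 0;
}
```

Writing into a second buffer also leaves name untouched, so it can still be printed or reused after the file has been opened.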
This is just a small program I wrote to find a problem with a larger one. Everything changes when I add the line with scanf. I know it is not safe, I read other threads concerning printf errors that suggest other functions. Anything but cin is fine. Btw, I didn't choose the type definitions of the 'messages', that came from my teachers, so I cannot change them.
#include <stdio.h>
#include <string.h>
char message1 [] = "amfdalkfaklmdklfamd.";
char message2 [] = "fnmakajkkjlkjs.";
char initializer [] = ".";
char* com;
char* word;
int main()
{
com = initializer;
int i = 1;
while (i !=4)
{
printf ("%s \n", com);
scanf("%s",word);
i++;
};
return 0;
}
The problem: after a single iteration the program exits, nothing is printed.
The reason scanf crashes is that the buffer is not initialized: word has not been assigned a value, so it does not point to any valid memory.
You can fix it by allocating some memory to your buffer, and limiting scanf to a certain number of characters, like this:
char word[20];
...
scanf("%19s", word);
Note that the number between % and s, which signifies the maximum number of characters in a string, is less by 1 than the length of the actual buffer. This is because of null terminator, which is required for C strings.
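It is also worth checking the return value, since scanf reports how many items it actually stored. A small sketch of both points, written against a FILE * so it can be used with stdin or a file; read_word and WORD_SIZE are hypothetical names chosen for this example.

```c
#include <stdio.h>

#define WORD_SIZE 20  /* assumed buffer size, matching the 20-byte example */

/* Hypothetical helper: reads one whitespace-delimited word of at most
 * WORD_SIZE - 1 characters from in. Returns 1 if a word was stored,
 * 0 on end of input or read error. */
int read_word(FILE *in, char word[WORD_SIZE])
{
    /* "%19s" leaves one byte for the terminator in a 20-byte buffer. */
    return fscanf(in, "%19s", word) == 1;
}
```

A word longer than 19 characters is not lost; the remainder simply stays in the stream and is returned by the next call.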
com is a pointer whose value is the address of the literal string initializer. Literal strings are contained within read-only memory areas, but the scanf function will attempt to write into the address given to it, this is an access-violation and causes the OS to kill your process, hence the crash you're seeing.
Change your scanf code to resemble this, note the addition of width limit in the %s placeholder, as well as the use of the scanf_s version to ensure there is no buffer overflow.
static int const BufferLength = 2048; // 2KiB should be sufficient
char* buffer = calloc( BufferLength , 1 );
if( buffer == NULL ) exit(1);
int fieldCount = scanf_s("%2047s", buffer, BufferLength );
if( fieldCount == 1 ) {
// do stuff with `buffer`
}
free( buffer );
Note that calloc zeroes memory before returning, which means that buffer can serve as a null-terminated string directly, whereas a string allocated with malloc cannot (unless you zero it yourself).
word has no memory associated with it.
char* word;
scanf("%s",word);
Could use
char word[100];
word[0] = '\0';
scanf("%99s",word);
If available, use getline().
Although not standard C (it is specified by POSIX), getline() will dynamically allocate memory for arbitrarily long user input.
char *line = NULL;
size_t len = 0;
ssize_t read;
while ((read = getline(&line, &len, stdin)) != -1) {
printf("%s", line);
}
free(line);
Linux Programmer's Manual GETLINE(3)
In the code below, I hope you can see that I have a char* variable and that I want to read in a string from a file. I then want to pass this string back from the function. I'm rather confused by pointers so I'm not too sure what I'm supposed to do really.
The purpose of this is to then pass the array to another function to be searched for a name.
Unfortunately the program crashes as a result and I've no idea why.
char* ObtainName(FILE *fp)
{
char* temp;
int i = 0;
temp = fgetc(fp);
while(temp != '\n')
{
temp = fgetc(fp);
i++;
}
printf("%s", temp);
return temp;
}
Any help would be vastly appreciated.
fgetc returns an int, not a char*. This int is a character from the stream, or EOF if you reach the end of the file.
You're implicitly converting the int to a char*, i.e., interpreting it as an address (turn your warnings on). When you call printf it reads that address and continues to read a character at a time looking for the null terminator which ends the string, but that address is almost certainly invalid. This is undefined behavior.
I've taken some liberties with what you wanted to accomplish. Rather than deal with pointers, you can just use a fixed-size array as long as you can set a maximum length. I've also included several checks so that you don't run off the end of the buffer or the end of the file. Also important is to make sure that you have a null terminator '\0' at the end of the string.
#define MAX_LEN 100
char* ObtainName(FILE *fp)
{
    static char temp[MAX_LEN];
    int i = 0;
    while(i < MAX_LEN-1)
    {
        int c = fgetc(fp);   // fgetc returns an int so that EOF can be detected
        if (c == EOF || c == '\n')
        {
            break;
        }
        temp[i] = (char)c;
        i++;
    }
    temp[i] = '\0';
    printf("%s", temp);
    return temp;
}
So, there are several problems here:
You're not setting aside any storage for the string contents;
You're not storing the string contents correctly;
You're attempting to read memory that doesn't belong to you;
The way you're attempting to return the string is going to give you heartburn.
1. You're not setting aside storage for the string contents
The line
char *temp;
declares temp as a pointer to char; its value will be the address of a single character value. Since it's declared at local scope without the static keyword, its initial value will be indeterminate, and that value may not correspond to a valid memory address.
It does not set aside any storage for the string contents read from fp; that would have to be done as a separate step, which I'll get to below.
2. You're not storing the string contents correctly
The line
temp = fgetc(fp);
reads the next character from fp and assigns it to temp. First of all, this means you're only storing the last character read from the stream, not the whole string. Secondly, and more importantly, you're assigning the result of fgetc() (which returns a value of type int) to an object of type char * (which is treated as an address). You're basically saying "I want to treat the letter 'a' as an address into memory." This brings us to...
3. You're attempting to read memory that doesn't belong to you
In the line
printf("%s", temp);
you're attempting to print out the string beginning at the address stored in temp. Since the last thing you wrote to temp was most likely a character whose value is < 127, you're telling printf to start at a very low and most likely not accessible address, hence the crash.
4. The way you're attempting to return the string is guaranteed to give you heartburn
Since you've defined the function to return a char *, you're going to need to do one of the following:
Allocate memory dynamically to store the string contents, and then pass the responsibility of freeing that memory on to the function calling this one;
Declare an array with the static keyword so that the array doesn't "go away" after the function exits; however, this approach has serious drawbacks;
Change the function definition;
Allocate memory dynamically
You could use dynamic memory allocation routines to set aside a region of storage for the string contents, like so:
char *temp = malloc( MAX_STRING_LENGTH * sizeof *temp );
or
char *temp = calloc( MAX_STRING_LENGTH, sizeof *temp );
and then return temp as you've written.
Both malloc and calloc set aside the number of bytes you specify; calloc will initialize all those bytes to 0, which takes a little more time, but can save your bacon, especially when dealing with text.
The problem is that somebody has to deallocate this memory when it's no longer needed; since you return the pointer, whoever calls this function now has the responsibility to call free() when it's done with that string, something like:
void Caller( FILE *fp )
{
...
char *name = ObtainName( fp );
...
free( name );
...
}
This spreads the responsibility for memory management around the program, increasing the chances that somebody will forget to release that memory, leading to memory leaks. Ideally, you'd like to have the same function that allocates the memory free it.
Use a static array
You could declare temp as an array of char and use the static keyword:
static char temp[MAX_STRING_SIZE];
This will set aside MAX_STRING_SIZE characters in the array when the program starts up, and it will be preserved between calls to ObtainName. No need to call free when you're done.
The problem with this approach is that by creating a static buffer, the code is not re-entrant; if ObtainName called another function which in turn called ObtainName again, that new call will clobber whatever was in the buffer before.
Why not just declare temp as
char temp[MAX_STRING_SIZE];
without the static keyword? The problem is that when ObtainName exits, the temp array ceases to exist (or rather, the memory it was using is available for someone else to use). That pointer you return is no longer valid, and the contents of the array may be overwritten before you can access it again.
Change the function definition
Ideally, you'd like for ObtainName to not have to worry about the memory it has to write to. The best way to achieve that is for the caller to pass the target buffer as a parameter, along with the buffer's size:
int ObtainName( FILE *fp, char *buffer, size_t bufferSize )
{
...
}
This way, ObtainName writes data into the location that the caller specifies (useful if you want to obtain multiple names for different purposes). The function will return an integer value, which can be a simple success or failure, or an error code indicating why the function failed, etc.
Note that if you're reading text, you don't have to read character by character; you can use functions like fgets() or fscanf() to read an entire string at a time.
Use fscanf if you want to read whitespace-delimited strings (i.e., if the input file contains "This is a test", fscanf( fp, "%s", temp); will only read "This"). If you want to read an entire line (delimited by a newline character), use fgets().
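For the whole-line case, a sketch with fgets might look like the following. ObtainLine is a hypothetical name used only for this example, and the size parameter is an int here because that is the type fgets takes.

```c
#include <stdio.h>
#include <string.h>

/* Sketch of a line-oriented variant: reads one line from fp into buffer
 * (bufsize bytes) and strips the trailing newline, if any.
 * Returns 1 on success, 0 on end of file or read error. */
int ObtainLine(FILE *fp, char *buffer, int bufsize)
{
    if (fgets(buffer, bufsize, fp) == NULL) {
        return 0;
    }
    /* strcspn returns the index of the first '\n', or the string length
     * if there is none, so this assignment is safe either way. */
    buffer[strcspn(buffer, "\n")] = '\0';
    return 1;
}
```

fgets never stores more than bufsize - 1 characters, so a line that is too long is split across successive calls rather than overflowing the buffer.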
Assuming you want to read an individual string at a time, you'd use something like the following (assumes C99):
#define FMT_SIZE 20
...
int ObtainName( FILE *fp, char *buffer, size_t bufsize )
{
int result = 1; // assume success
int scanfResult = 0;
char fmt[FMT_SIZE];
sprintf( fmt, "%%%zus", bufsize - 1 );
scanfResult = fscanf( fp, fmt, buffer );
if ( scanfResult == EOF )
{
// hit end-of-file before reading any text
result = 0;
}
else if ( scanfResult == 0 )
{
// did not read anything from input stream
result = 0;
}
else
{
result = 1;
}
return result;
}
So what's this noise
char fmt[FMT_SIZE];
sprintf( fmt, "%%%zus", bufsize - 1 );
about? There is a very nasty security hole in fscanf() when you use the %s or %[ conversion specifiers without a maximum length specifier. The %s conversion specifier tells fscanf to read characters until it sees a whitespace character; if there are more non-whitespace characters in the stream than the buffer is sized to hold, fscanf will store those extra characters past the end of the buffer, clobbering whatever memory is following it. This is a common malware exploit. So we want to specify a maximum length for the input; for example, %20s says to read no more than 20 characters from the stream and store them to the buffer.
Unfortunately, since the buffer length is passed in as an argument, we can't write something like %20s, and fscanf doesn't give us a way to specify the length as an argument the way fprintf does. So we have to create a separate format string, which we store in fmt. Because the code passes bufsize - 1 to leave room for the null terminator, an input buffer length of 10 gives the format string %9s, and an input buffer length of 1000 gives %999s.
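The format-building step can be pulled out and checked on its own. The sketch below uses snprintf instead of sprintf so that fmt itself cannot overflow; build_scan_format is a hypothetical helper name. Since the width is bufsize - 1, as in the sprintf call above, a 10-byte buffer produces "%9s".

```c
#include <stdio.h>

/* Hypothetical helper: builds a "%<N>s" format with N = bufsize - 1 so
 * that fscanf never stores more than bufsize bytes, terminator included.
 * Returns 0 on success, -1 if bufsize is too small or fmt is too short. */
int build_scan_format(char *fmt, size_t fmt_size, size_t bufsize)
{
    if (bufsize < 2) {
        return -1;  /* no room for even one character plus the terminator */
    }
    int n = snprintf(fmt, fmt_size, "%%%zus", bufsize - 1);
    return (n < 0 || (size_t)n >= fmt_size) ? -1 : 0;
}
```

The bufsize < 2 check matters because a field width of zero is not valid in a scanf format.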
The following code expands on that in your question, and returns the string in allocated storage:
char* ObtainName(FILE *fp)
{
    int temp;
    size_t len = 0;              // number of characters stored so far
    char *string = malloc(1);
    if(NULL == string)
    {
        fprintf(stderr, "malloc() failed\n");
        return NULL;
    }
    *string = '\0';
    temp = fgetc(fp);
    while(temp != '\n' && temp != EOF)
    {
        // room for the characters so far, the new one, and the terminator
        char *newMem = realloc(string, len + 2);
        if(NULL == newMem)
        {
            fprintf(stderr, "realloc() failed.\n");
            break;               // return what has been read so far
        }
        string = newMem;
        string[len++] = (char)temp;
        string[len] = '\0';
        temp = fgetc(fp);
    }
    printf("%s", string);
    return(string);
}
Take care to 'free()' the string returned by this function, or a memory leak will occur.
I am getting "Bus Error" trying to read stdin into a char* variable.
I just want to read whole stuff coming over stdin and put it first into a variable, then continue working on the variable.
My Code is as follows:
char* content;
char* c;
while( scanf( "%c", c)) {
strcat( content, c);
}
fprintf( stdout, "Size: %d", strlen( content));
But somehow I always get "Bus error" returned by calling cat test.txt | myapp, where myapp is the compiled code above.
My question is how do i read stdin until EOF into a variable? As you see in the code, I just want to print the size of input coming over stdin, in this case it should be equal to the size of the file test.txt.
I thought just using scanf would be enough, maybe buffered way to read stdin?
First, you're passing uninitialized pointers, which means scanf and strcat will write to memory you don't own. Second, strcat expects two null-terminated strings, while c is just a character. This will again cause it to read memory you don't own. You don't need scanf, because you're not doing any real processing. Finally, reading one character at a time is needlessly slow. Here's the beginning of a solution, using a resizable buffer for the final string and a fixed buffer for the fgets call:
#define BUF_SIZE 1024
char buffer[BUF_SIZE];
size_t contentSize = 1; // includes the NUL terminator
/* Preallocate space. We could just allocate one char here,
but that wouldn't be efficient. */
char *content = malloc(sizeof(char) * BUF_SIZE);
if(content == NULL)
{
perror("Failed to allocate content");
exit(1);
}
content[0] = '\0'; // make null-terminated
while(fgets(buffer, BUF_SIZE, stdin))
{
char *old = content;
contentSize += strlen(buffer);
content = realloc(content, contentSize);
if(content == NULL)
{
perror("Failed to reallocate content");
free(old);
exit(2);
}
strcat(content, buffer);
}
if(ferror(stdin))
{
free(content);
perror("Error reading from stdin.");
exit(3);
}
EDIT: As Wolfer alluded to, a NUL byte in your input will cause the string to be terminated prematurely when using fgets. getline is a better choice if available, since it handles memory allocation and does not have issues with NUL input.
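If getline is not available, a standard C alternative that is also unaffected by NUL bytes is to read raw chunks with fread and track the length separately. The following is a sketch only; read_all is a hypothetical helper, and the starting capacity of 1024 and the doubling strategy are arbitrary choices.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads everything from in until end of file.
 * Returns a malloc'd buffer (caller frees) and stores the byte count in
 * *out_len; returns NULL on allocation or read failure. A terminating
 * '\0' is appended for convenience, but *out_len is the real size. */
char *read_all(FILE *in, size_t *out_len)
{
    size_t cap = 1024, len = 0;
    char *data = malloc(cap);
    if (data == NULL) {
        return NULL;
    }
    for (;;) {
        /* Always keep one spare byte for the terminator. */
        size_t n = fread(data + len, 1, cap - len - 1, in);
        len += n;
        if (n == 0) {
            break;  /* end of file or error; told apart below */
        }
        if (len == cap - 1) {
            /* Buffer is full: double it before reading more. */
            char *bigger = realloc(data, cap * 2);
            if (bigger == NULL) {
                free(data);
                return NULL;
            }
            data = bigger;
            cap *= 2;
        }
    }
    if (ferror(in)) {
        free(data);
        return NULL;
    }
    data[len] = '\0';
    *out_len = len;
    return data;
}
```

Called as read_all(stdin, &size), the reported size is the number of bytes read, which for the question's use case should equal the size of test.txt even if the file contains '\0' bytes.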
Since you don't care about the actual content, why bother building a string? I'd also use getchar():
int c;
size_t s = 0;
while ((c = getchar()) != EOF)
{
s++;
}
printf("Size: %zu\n", s);
This code will correctly handle cases where your file has '\0' characters in it.
Your problem is that you've never allocated c and content, so they're not pointing anywhere defined -- they're likely pointing to some unallocated memory, or something that doesn't exist at all. And then you're putting data into them. You need to allocate them first. (That's what a bus error typically means; you've tried to do a memory access that's not valid.)
(Alternately, since c is always holding just a single character, you can declare it as char c and pass &c to scanf. No need to declare a string of characters when one will do.)
Once you do that, you'll run into the issue of making sure that content is long enough to hold all the input. Either you need to have a guess of how much input you expect and allocate it at least that long (and then error out if you exceed that), or you need a strategy to reallocate it in a larger size if it's not long enough.
Oh, and you'll also run into the problem that strcat expects a string, not a single character. Even if you leave c as a char*, the scanf call doesn't make it a string. A single-character string is (in memory) a character followed by a null character to indicate the end of the string. scanf, when scanning for a single character, isn't going to put in the null character after it. As a result, strcat isn't going to know where the end of the string is, and will go wandering off through memory looking for the null character.
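The char c suggestion can be sketched as below. It reads from a FILE * (pass stdin for the original use case), and count_chars is a hypothetical name. One extra detail, stated as an assumption about the original loop: scanf returns EOF, which is nonzero, at end of input, so a bare while (scanf(...)) would not stop there; comparing the result against 1 does.

```c
#include <stdio.h>

/* Sketch: counts the characters available on in, one at a time.
 * fscanf returns the number of items converted, or EOF at end of input,
 * so the loop compares against 1 rather than testing for nonzero. */
size_t count_chars(FILE *in)
{
    char c;            /* a single char object, not an uninitialized pointer */
    size_t count = 0;
    while (fscanf(in, "%c", &c) == 1) {
        count++;
    }
    return count;
}
```

The %c conversion does not skip whitespace, so spaces and newlines are counted like any other character.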
The problem here is that you are using pointer variables that have no memory allocated to them (via malloc, for example), so the results are undefined. On top of that, by using strcat on an uninitialized pointer that could be pointing to anything, you ended up with a bus error!
This would be the fixed code required....
char* content = malloc (100 * sizeof(char));
int c;              /* int, so that EOF can be told apart from a character */
size_t len = 0;     /* number of characters stored so far */
if (content != NULL){
    content[0] = '\0'; // Thanks David!
    while ((c = getchar()) != EOF)
    {
        if (len < 99){              /* keep one byte for the terminator */
            content[len++] = (char)c;
            content[len] = '\0';
        }
    }
}
/* When done with the variable */
free(content);
The code highlights the programmer's responsibility to manage memory: for every malloc there must be a free; if not, you have a memory leak!
Edit: Thanks to David Gelhar for his point-out at my glitch! I have fixed up the code above to reflect the fixes...of course in a real-life situation, perhaps the fixed value of 100 could be changed to perhaps a #define to make it easy to expand the buffer by doubling over the amount of memory via realloc and trim it to size...
Assuming that you want to get (shorter than MAXL-1 chars) strings and not to process your file char by char, I did as follows:
#include <stdio.h>
#include <string.h>
#define MAXL 256
int main(void){
    char s[MAXL];
    s[0]=0;
    scanf("%255s",s);   /* width limit: MAXL-1 characters plus the terminator */
    while(strlen(s)>0){
        printf("Size of %s : %zu\n",s,strlen(s));
        s[0]=0;
        scanf("%255s",s);
    }
    return 0;
}