Read text file, save all digits into character string - c

I am trying to read a text file containing the string "a3rm5t?7!z*&gzt9v" and put all the numeric characters into a character string to later convert into an integer.
I am currently trying to do this by using sscanf on the buffer after reading the file, and then using sprintf to save all characters found using %u in a character string called str.
However, the integer printed when I call printf on str is different each time I run the program. What am I doing right and what am I doing wrong?
This code works when the text file contains a string like "23dog" and returns 23 but not when the string is something like 23dog2.
EDIT: I now realize that I should be putting the numeric characters in a character ARRAY rather than just one string.
int main(int argc, const char **argv)
{
int in;
char buffer[128];
char *str;
FILE *input;
in = open(argv[1], O_RDONLY);
read(in, buffer, 128);
unsigned x;
sscanf(buffer, "%u", &x);
sprintf(str,"%u\n", x);
printf("%s\n",str);
close (in);
exit(0);
}

If you simply want to filter out any non-digits from your input, you need not use scanf, sprintf and the like. Simply loop over the buffer and copy the characters that are digits.
The following program only works for a single line of input read from standard input and only if it is less than 512 characters long but it should give you the correct idea.
#include <stdio.h>
#define BUFFER_SIZE 512
int
main()
{
char buffer[BUFFER_SIZE]; /* Here we read into. */
char digits[BUFFER_SIZE]; /* Here we insert the digits. */
char * pos;
size_t i = 0;
/* Read one line of input (max BUFFER_SIZE - 1 characters). */
if (!fgets(buffer, BUFFER_SIZE, stdin))
{
perror("fgets");
return 1;
}
/* Loop over the (NUL terminated) buffer. */
for (pos = buffer; *pos; ++pos)
{
if (*pos >= '0' && *pos <= '9')
{
/* It's a digit: copy it over. */
digits[i++] = *pos;
}
}
digits[i] = '\0'; /* NUL terminate the string. */
printf("%s\n", digits);
return 0;
}

A good approach to any problem like this is to read the entire line into a buffer and then assign a pointer to the buffer. You can then use the pointer to step through the buffer reading each character and acting on it appropriately. The following is one example of this approach. getline is used to read the line from the file (it has the advantage of allocating space for buffer and returning the number of characters read). You then allocate space for the character string based on the size of buffer as returned by getline. Remember, when done, you are responsible for freeing the memory allocated by getline.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main (int argc, const char **argv)
{
char *buffer = NULL; /* forces getline to allocate required space */
ssize_t read = 0; /* number of characters read by getline */
size_t n = 0; /* buffer size; 0 with a NULL buffer lets getline allocate */
char *str = NULL; /* string to hold digits read from file */
char *p = NULL; /* ptr to use with buffer (could use buffer) */
int idx = 0; /* index for adding digits to str */
int number = 0; /* int to hold number parsed from file */
FILE *input;
/* validate input */
if (argc < 2) { printf ("Error: insufficient input. Usage: %s filename\n", argv[0]); return 1; }
/* open and validate file */
input = fopen(argv[1], "r");
if (!input) { printf ("Error: failed to open file '%s'\n", argv[1]); return 1; }
/* read line from file with getline */
if ((read = getline (&buffer, &n, input)) != -1)
{
str = malloc ((size_t)read + 1); /* allocate memory for str (+1 for nul-terminator) */
if (!str) { perror ("malloc"); free (buffer); fclose (input); return 1; }
p = buffer; /* set pointer to buffer */
while (*p) /* read each char in buffer */
{
if (*p > 0x2f && *p < 0x3a) /* if char is digit 0-9 */
{
str[idx] = *p; /* copy to str at idx */
idx++; /* increment idx */
}
p++; /* increment pointer */
}
str[idx] = 0; /* null-terminate str */
number = atoi (str); /* convert str to int */
printf ("\n string : %s number : %d\n\n", buffer, number);
} else {
printf ("Error: nothing read from file '%s'\n", argv[1]);
return 1;
}
if (input) fclose (input); /* close input file stream */
if (buffer) free (buffer); /* free memory allocated by getline */
if (str) free (str); /* free memory allocated to str */
return 0;
}
datafile:
$ cat dat/fwintstr.dat
a3rm5t?7!z*&gzt9v
output:
$ ./bin/prsint dat/fwintstr.dat
string : a3rm5t?7!z*&gzt9v
number : 3579
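Since the goal is to convert the collected digits into an integer later, it may help to separate the two steps and check the conversion. The following is only a sketch: the helper names extract_digits and digits_to_long are made up for illustration, and it swaps atoi for strtol so that an empty or out-of-range result can be detected.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: copy only the decimal digits of src into dst
 * (capacity dstsz bytes, including the terminator). Stops early if dst
 * is full. Returns the number of digits stored. */
size_t extract_digits(const char *src, char *dst, size_t dstsz)
{
    size_t n = 0;
    assert(src != NULL && dst != NULL && dstsz > 0);
    for (; *src != '\0' && n + 1 < dstsz; ++src) {
        if (isdigit((unsigned char)*src))
            dst[n++] = *src;
    }
    dst[n] = '\0';
    return n;
}

/* Hypothetical helper: convert a numeric string to long with strtol.
 * Returns 0 on success, -1 if nothing was converted, if characters are
 * left over, or if the value does not fit in a long. */
int digits_to_long(const char *digits, long *out)
{
    char *end;
    long value;
    assert(digits != NULL && out != NULL);
    errno = 0;
    value = strtol(digits, &end, 10);
    if (end == digits || *end != '\0' || errno == ERANGE)
        return -1;
    *out = value;
    return 0;
}
```

Calling extract_digits on the line read from the file and then digits_to_long on the result keeps the filtering and the conversion independently checkable, which atoi alone does not allow.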

Related

Finding Longest Word in a String

I am very new to C. I have written code to find the longest word in a string. It compiles without errors, but it prints a word with strange characters that is not in the string. Can you tell me what is wrong with my code?
Thank you
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char LongestWord (char GivenString[100]);
int main()
{
char input[100];
char DesiredWord[20];
printf("please give a string:\n");
gets(input);
DesiredWord[20]=LongestWord(input);
printf("longest Word is:%s\n",DesiredWord);
return 0;
}
char LongestWord (char GivenString[100]){
//It is a predefined function, by using this function we can clear the data from console (Monitor).
//clrscr()
int position1=0;
int position2=0;
int longest=0;
int word=0;
int Lenght=strlen(GivenString);
char Solution[20];
int p=0;
for (int i=1; i<=Lenght; i++){
if (GivenString[i-1]!=' '){
word=word++;
}
if(GivenString[i-1]=' '){
if (word>longest){
//longest stores the length of longer word
longest=word;
position2=i-1;
position1=i-longest;
word=0;
}
}
}
for (int j=position1; j<=position2; j++){
Solution[p]=GivenString[j];
p=p++;
}
return (Solution[20]);
}
This should work:
#include <stdio.h>
#include <string.h>
void LongestWord(char string[100])
{
char word[20], max[20] = ""; /* max starts empty in case no word is found */
int i = 0, j = 0, flag = 0;
for (i = 0; i < strlen(string); i++)
{
while (i < strlen(string) && string[i]!=32 && string[i]!=0)
{
word[j++] = string[i++];
}
if (j != 0)
{
word[j] = '\0';
if (!flag)
{
flag = !flag;
strcpy(max, word);
}
if (strlen(word) > strlen(max))
{
strcpy(max, word);
}
j = 0;
}
}
printf("The largest word is '%s' .\n", max);
}
int main()
{
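char string[100];
printf("Enter string: ");
if (!fgets(string, sizeof string, stdin)) return 1; /* gets() is unsafe and was removed in C11 */
string[strcspn(string, "\n")] = '\0'; /* strip the trailing newline kept by fgets */
LongestWord(string);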
}
Aside from invoking Undefined Behavior by returning a pointer to a locally declared array in LongestWord, using gets (gets() is so dangerous it should never be used!), and writing beyond the end of the Solution array, you are missing the logic for identifying the longest word.
To identify the longest word, you must obtain the length of each word as you work your way down the string. You must keep track of the longest word seen, and only if the current word is longer than the longest seen so far do you copy it to valid memory that will survive the function return (and nul-terminate).
There are a number of ways to do this. You can use strtok to tokenize all words in the string, you can use a combination of strcspn and strspn to bracket the words, you can use sscanf and an offset to the beginning of each word, or what I find easiest is just to use a pair of pointers sp (start-pointer) and ep (end-pointer) to work down the string.
There you just move sp to the first character in each word and keep moving ep until you find a space (or end of string). The word length is ep - sp and then if it is the longest, you can simply use memcpy to copy length characters to your longest word buffer and nul-terminate (repeat until you run out of characters).
To create valid storage, you have two choices: either pass an array of sufficient size (see comment), or allocate a block of memory within your function using malloc (or calloc or realloc) and return a pointer to that block of memory.
An example passing an array of sufficient size to hold the longest word could be:
#include <stdio.h>
#include <ctype.h>
#include <string.h>
#define MAXW 256 /* longest word buffer size */
#define MAXC 1024 /* input string buffer size */
size_t longestword (char *longest, const char *str)
{
int in = 0; /* flag reading (in/out) of word */
size_t max = 0; /* word max length */
const char *sp = str, /* start-pointer for bracketing words */
*ep = str; /* end-pointer for bracketing words */
*longest = 0; /* initialize longest as empty-string */
for (;;) { /* loop over each char in str */
if (isspace (*ep) || !*ep) { /* is it a space or end? */
if (in) { /* are we in a word? */
size_t len = ep - sp; /* if so, get word length */
if (len > max) { /* is it longest? */
max = len; /* if so, set max to len */
memcpy (longest, sp, len); /* copy len chars to longest */
longest[len] = 0; /* nul-terminate longest */
}
in = 0; /* it's a space, no longer in word */
}
if (!*ep) /* if end of string - done */
break;
}
else { /* not a space! */
if (!in) { /* if we are not in a word */
sp = ep; /* set start-pointer to current */
in = 1; /* set in flag */
}
}
ep++; /* increment end-pointer to next char */
}
return max; /* return max length */
}
int main (void) {
char str[MAXC] = "", /* storage for input string */
word[MAXW] = ""; /* storage for longest word */
size_t max = 0; /* longest word length */
fputs ("enter string: ", stdout); /* prompt */
if (!fgets (str, MAXC, stdin)) { /* validate input */
fputs ("(user canceled input)\n", stderr);
return 1;
}
if ((max = longestword (word, str))) /* get length and longest word */
printf ("longest word: %s (%zu-chars)\n", word, max);
}
(note: by using this method you ignore all leading, trailing and intervening whitespace, so strings like " my little dog has 1 flea . " do not present problems.)
Example Use/Output
$ ./bin/longest_word
enter string: my dog has fleas
longest word: fleas (5-chars)
$ ./bin/longest_word
enter string: my little dog has 1 flea .
longest word: little (6-chars)
There are many, many ways to do this. This is one of the most basic, using pointers. You could do the same thing using indexes, e.g. string[i], etc.. That just requires you maintain an offset to the start of each word and then do the subtraction to get the length. strtok is convenient, but modifies the string being tokenized so it cannot be used with string literals or other constant strings.
Best way to learn is work the problem 3-different ways, and pick the one that you find the most intuitive. Let me know if you have further questions.
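For comparison, the strspn/strcspn approach mentioned earlier can be combined with the second storage choice (returning allocated memory). This is a rough sketch rather than a reference implementation; the function name longest_word_dup and the whitespace set are assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc'd copy of the longest
 * whitespace-delimited word in str, or NULL if str contains no word or
 * the allocation fails. The caller must free the result. On a tie the
 * first word of that length wins. */
char *longest_word_dup(const char *str)
{
    static const char ws[] = " \t\r\n\v\f";  /* assumed delimiter set */
    const char *best = NULL;
    size_t bestlen = 0;
    char *copy;

    assert(str != NULL);
    for (;;) {
        size_t len;
        str += strspn(str, ws);     /* skip whitespace before the word */
        len = strcspn(str, ws);     /* length of the word that follows */
        if (len == 0)               /* reached the end of the string */
            break;
        if (len > bestlen) {
            best = str;
            bestlen = len;
        }
        str += len;                 /* continue after this word */
    }
    if (best == NULL)
        return NULL;
    copy = malloc(bestlen + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, best, bestlen);
    copy[bestlen] = '\0';
    return copy;
}
```

Unlike strtok, this never writes to the input, so it also works on string literals; the cost is that the caller now owns the returned block and has to free it.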
Please declare a proper main entry point: int main( int argc, const char* argv[] )
Use fgets instead of gets, as gets does not check the bounds of your string (consider what happens when you enter a 120-character line).
Pass the length of the expected string to LongestWord.
If available, prefer strnlen to plain strlen; there might be scenarios where your string is not properly terminated.
Better yet, use the suggested length parameter to limit your loop and break when a terminating character is encountered.
Your Solution is a stack-allocated array, so it no longer exists once the function returns and using it afterwards is undefined behavior; you are better off returning a heap-allocated array (using malloc) or a pointer into the caller's input.
Suggested changes
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char* getLongestWord(char* input, size_t input_length, size_t *result_length);
int main( int argc, const char* argv[] )
{
const size_t max_length = 100;
char input[max_length]; // consider using LINE_MAX from limits.h
printf("please give a string:\n");
if ( fgets( input, max_length, stdin ) == NULL ) return EXIT_FAILURE; // some failure happened with fgets.
size_t longestWord_length = 0;
char* longestWord = getLongestWord(input, max_length , &longestWord_length);
printf("longest Word is %.*s\n", (int)longestWord_length, longestWord ? longestWord : "" ); // %.*s expects an int precision
return EXIT_SUCCESS;
}
char* getLongestWord(char* input, size_t input_length, size_t *result_length) {
char* result = NULL;
size_t length = 0;
size_t word_start = 0, word_end = 0;
for(int i = 0; i < input_length; ++i) {
if( (input[i] == ' ') || (input[i] == '\n') || (input[i] == 0) ) { // fgets keeps the newline, so treat it as a delimiter
if( i == 0 ) { // first space
word_start = 1;
continue;
}
word_end = i-1;
size_t word_length = word_end - word_start+1;
if( word_length <= length ) {
word_start = i + 1; // next word start
continue;
}
// new max length
length = word_length;
result = &input[word_start];
word_start = i + 1; // next word start
}
if( input[i] == 0 ) break; // end of string
}
*result_length = length;
return result;
}

How to separate binary strings from text file and store them in 1d or 2d char array?

"01110011 01100001 01100100 " This would be one line having the 8 bits separated by spaces in the file.
Currently I have:
if (fr != NULL) //see if file opens or not
{
char chter[500]; //char to get string from text
char *ptr; //pointer to char
//char store[100][32];
fgets(chter, 1000, fr); //gets text from file
printf("%s", chter); //prints current text to cmd from textfile
puts("\n");
for (int i = 0; i < 1; i++)
{
li1 = strtol(chter, &ptr, 2); //convert 1st binary set to alphabet
printf("%c", li1); //Not sure how to get the rest of the sets from here on
}
puts("\n");
fclose(fr);
}
I was thinking of using a 2d array to have multiple strings stored, however I'm stuck, as I don't know how to separately extract the binary groups from the rest of the string.
After each call to strtol, ptr will point to the first byte in the string that's not part of the parsed integer. That will be your starting point on the next iteration. Since fgets returns a string with a newline, loop until ptr points to either the newline or the null byte at the end of the string.
char *ptr, *tmp;
tmp = chter;
do
{
li1 = strtol(tmp, &ptr, 2);
if (ptr == tmp) break; /* nothing converted (e.g. trailing space): avoid looping forever */
printf("%c", (int)li1);
tmp = ptr;
} while ((*ptr != '\n') && (*ptr != '\0'));
The primary problem you have is failing to use separate pointers for strtol (ptr, &endptr, 2), which would then allow you to work through all values held in chter. Secondly, you risk Undefined Behavior by potentially reading 1000 characters where chter will only hold 500. Thirdly, you need to properly validate the results of the strtol conversion by checking (1) whether digits were converted; and (2) whether overflow/underflow occurred by checking errno.
Putting those together, you could do:
#include <errno.h>
...
if (fr != NULL) //see if file opens or not
{
char chter[500]; //char to get string from text
char *ptr = chter; /* assign chter to the pointer */
char *endptr; /* separate end-pointer for strtol */
fgets(chter, sizeof chter, fr); /* properly limit read size */
printf("%s", chter); //prints current text to cmd from textfile
puts("\n");
errno = 0; /* set errno zero */
while (*ptr && *ptr != '\n') /* loop over all values */
{
long li1 = strtol (ptr, &endptr, 2); /* convert to long */
if (ptr == endptr) { /* validate digits converted */
fputs ("error: no digits converted.\n", stderr);
/* handle error */
break;
}
else if (errno) { /* validate no over/underflow */
perror ("strtol-conversion_failed");
/* handle error */
break;
}
printf (" %ld", li1); /* output value */
ptr = endptr; /* advance pointer */
}
// puts("\n");
putchar ('\n'); /* use putchar for single-char output */
fclose(fr);
}
(note: not compiled, so drop a note if you have problems)
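To address the part of the question about storing the results, the same strtol loop can write each decoded value into a 1d char array instead of printing it. The sketch below assumes each group should fit in one byte; the name decode_binary_line and its return convention are invented for the example.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: decode whitespace-separated base-2 groups from
 * line into out. Stores at most outsz - 1 characters plus a terminating
 * '\0' and stops at the first token that is not a binary number.
 * Returns the number of characters stored, or (size_t)-1 if a group
 * does not fit in an unsigned char. */
size_t decode_binary_line(const char *line, char *out, size_t outsz)
{
    size_t n = 0;
    assert(line != NULL && out != NULL && outsz > 0);
    while (n + 1 < outsz) {
        char *end;
        long value;
        errno = 0;
        value = strtol(line, &end, 2);
        if (end == line)            /* no digits left to convert */
            break;
        if (errno == ERANGE || value < 0 || value > UCHAR_MAX) {
            out[n] = '\0';
            return (size_t)-1;      /* group too large for one byte */
        }
        out[n++] = (char)value;
        line = end;                 /* continue after this group */
    }
    out[n] = '\0';
    return n;
}
```

For several lines, one such array per line (for example a row of a 2d array) could be filled by calling the function once per fgets result.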

Text line character count dynamically in c

char *line = NULL;
int count=0;
line = (char*)malloc(sizeof(char));
while(fgets(line,sizeof(line),file)){
line = realloc(line,sizeof(char*)); // dynamically increase allocate memory
count++;
}
printf("count number :%d\n",count);
free(line);
I am trying to count the characters in every line of a text file, but for now I am testing with a single line. The count is always 4, even when I give it a longer string. I am confused. Please help!
Some issues :
First, you want a line :
line = (char*)malloc(sizeof(char));
This is equivalent to allocate one byte - sizeof(char) - and store its address to line. Maybe you want to get a larger buffer to get some characters from a file.
One of the way to do that is to define a constant size :
#define BUFFER_SIZE 256
line = (char *)malloc(sizeof(char) * BUFFER_SIZE);
Next, you run the counting loop.
while(fgets(line,sizeof(line),file))
is also wrong, because it reads at most sizeof(line) bytes, and line is a pointer, so that is sizeof(char *): typically 4 or 8 bytes depending on your system architecture.
You want to read at most the size of your buffer, which means at most BUFFER_SIZE characters. So it's better to do :
while(fgets(line,sizeof(char) * BUFFER_SIZE, file))
{
/* do stuff */
}
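A word of caution : fgets stops at every newline and returns a long line in several pieces, so a loop that counts fgets calls does not count characters. If you want to get bytes from the file and also count them, you can use fread like :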
size_t tmp;
while(tmp = fread(line, sizeof(char), BUFFER_SIZE, file))
{
count += tmp;
/* do stuff on line */
}
But if you only want to get the size of your file, go check this other post.
One way to do this without tying yourself in knots over memory allocation, etc, is:
FILE *f;
int n;
char c;
int line_number = 1;
int line_length = 0;
f = fopen("whatever", "r");
while (n = fread(&c, 1, 1, f))
{
if (c != '\n')
line_length += 1;
else
{
printf("Length of line %d = %d\n", line_number , line_length);
line_number += 1;
line_length = 0;
}
}
fclose(f);
i.e. read the file one character at a time, counting characters as you go. Let the OS and the runtime library worry about buffering - that's what they're there for. Perhaps not the most efficient, but sometimes simplicity is beneficial.
Best of luck.
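A variation on the character-at-a-time idea, sketched here with fgetc, also reports a final line that has no trailing newline, which the loop above would silently drop. The function name line_lengths and the choice to return the lengths in a caller-supplied array are assumptions for the sake of the example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read stream f one character at a time and store
 * the length of each line (not counting the '\n') in
 * lengths[0..max_lines-1]. A last line without a trailing newline is
 * counted as well. Returns the number of lines stored. */
size_t line_lengths(FILE *f, size_t *lengths, size_t max_lines)
{
    size_t nlines = 0, len = 0;
    int c;                  /* int rather than char so EOF is distinct */

    assert(f != NULL && lengths != NULL);
    while (nlines < max_lines && (c = fgetc(f)) != EOF) {
        if (c == '\n') {
            lengths[nlines++] = len;    /* line complete */
            len = 0;
        } else {
            len++;
        }
    }
    if (len > 0 && nlines < max_lines)  /* unterminated last line */
        lengths[nlines++] = len;
    return nlines;
}
```

Keeping c as an int matters: fgetc returns EOF as a value outside the range of unsigned char, and storing the result in a char can make a valid byte look like end of file.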
Here is a function mfgets that reads a line into a dynamically allocated buffer. It should be reasonably bomb-proof.
Like fgets it returns NULL if no characters were read. However, it can also return NULL if the initial buffer allocation failed before any characters were read.
It sets errno to ENOMEM if a buffer allocation or reallocation failed at any point, but if any characters have been read, then a buffer is still returned.
As a bonus, the second parameter can be used to obtain the length of the string in the buffer.
The returned buffer can be freed by calling the free function.
mfgets.h:
#ifndef MFGETS_H__INCLUDED__
#define MFGETS_H__INCLUDED__
#include <stdio.h>
char *mfgets(FILE *stream, size_t *stringlen);
#endif
mfgets.c:
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <limits.h>
#include <errno.h>
#include "mfgets.h"
/**
* Read a line into allocated memory.
*
* Reads a line from a stream into memory allocated by \b malloc() or
* \b realloc() until an \b EOF or a newline is read. If a newline is
* read, it is stored into the memory. A terminating null byte is
* stored after the last character in the memory. The memory can be
* freed with \b free().
*
* \param stream The stream pointer.
* \param[out] stringlen If non-null, set to length of string read.
*
* \return A pointer to the memory if at least one character was read,
* otherwise \c NULL.
*
* \remark \c errno is set to \c ENOMEM on failure to allocate memory
* of sufficient size to store the whole line. If the line has been
* partially read, memory is still returned even if \c errno is set to
* \c ENOMEM.
*/
char *mfgets(FILE *stream, size_t *stringlen)
{
size_t buflen = 256; /* initial allocation size */
size_t slen = 0; /* string length */
int err = 0; /* new error */
int olderr = errno; /* old error propagation */
char *buf; /* allocated buffer */
char *newbuf; /* reallocated buffer */
/* allocate initial buffer */
buf = malloc(buflen);
if (!buf) {
err = ENOMEM;
} else {
/* read remainder of line into new part of buffer */
while (fgets(buf + slen, buflen - slen, stream)) {
/* update string length */
slen += strlen(buf + slen);
if (slen < buflen - 1 || buf[slen - 1] == '\n') {
/* fgets() did not run out of space */
break;
}
/* need to increase buffer size */
if (buflen == SIZE_MAX) {
/* cannot increase buffer size */
err = ENOMEM;
break;
}
if (SIZE_MAX - buflen >= buflen && buflen <= INT_MAX) {
/* double buffer size */
buflen *= 2;
} else if (SIZE_MAX - buflen > INT_MAX) {
/* increase buffer size by maximum amount
* that can be passed to fgets() */
buflen += INT_MAX;
} else {
/* increase buffer size to maximum amount */
buflen = SIZE_MAX;
}
/* reallocate buffer with new size */
newbuf = realloc(buf, buflen);
if (!newbuf) {
err = ENOMEM;
break;
}
buf = newbuf;
}
/* finished reading line (or reached EOF or stream error) */
if (slen) {
/* reallocate buffer to actual string size */
newbuf = realloc(buf, slen + 1);
if (newbuf) {
buf = newbuf;
}
} else {
/* no characters read, so do not return a buffer */
free(buf);
buf = NULL;
}
}
if (stringlen) {
/* caller wants actual length of string */
*stringlen = slen;
}
/* set new error or propagate old error */
errno = err ? err : olderr;
/* return buffer or NULL */
return buf;
}
Test program:
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include "mfgets.h"
int main(void)
{
size_t slen;
char *line;
errno = 0;
while ((line = mfgets(stdin, &slen)) != NULL) {
printf("(%zu) %s", slen, line);
free(line);
errno = 0;
}
if (errno) {
perror("");
}
return 0;
}

Program that reads multiple lines from a file and converts to a single line string

I'm trying to create a C program that takes a file path as a command line argument. My program then reads that file line by line, removing the endline character and adding it to a string so at the end the string is the input file but on only one line.
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
int main(int argc, char **argv){
FILE *input = fopen(argv[1], "r");
char *line = NULL;
char buffer[100];
char temp[200];
line = fgets(buffer, 100 *sizeof(char), input);
while(line != NULL){
line[strlen(line)-1] = '\0';
strcat(temp, line);
line = fgets(buffer, 100 *sizeof(char), input);
}
printf("%s\n", temp);
fclose(input);
}
I input a file with data:
This is Line 1
line 2
line 3
and this is line 4
and expect the string to be
This is Line 1line 2line 3and this is line 4
but instead it is
and this is line 1
Any help is appreciated!
You risk Undefined Behavior for (1) failing to validate input is actually a valid file stream opened for reading, and (2) failing to track the maximum number of characters stored in temp allowing more characters to be stored than you have room for (you would invoke undefined behavior by failing to check that the argument count is greater than 1, but as used in fopen the sentinel NULL at the end of the argument vector (argv) simply causes fopen to fail -- but since you fail to check that...).
Before looking at possible fixes, let's look at some general coding points. Don't use magic numbers in your code, e.g. 100, 200,
/* if you need constants, #define them or use an enum */
enum { BUFSZ = 100, TMPSZ = 200 };
int main (int argc, char **argv){
char buffer[BUFSZ] = "", /* read buffer */
temp[TMPSZ] = ""; /* buffer for combined lines */
By defining constants at the beginning of each source file, if you need to adjust them over time, you don't have to go picking through each declaration, loop variable, test clause, etc. to make sure you have made all required changes.
Before using command line arguments, validate your argument count (argc):
if (argc < 2 ) { /* validate at least 1 argument given */
fprintf (stderr, "error: insufficient input, usage: %s file\n", argv[0]);
return 1;
}
Before reading from any file you open:
if (!input) { /* validate file open for reading */
fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
return 1;
}
When reading with fgets, unless you are discarding the first line, there is no need for a separate fgets before entering your while loop to read the file,
while (fgets(buffer, BUFSZ, input)) { /* read each line */
...
}
Validate whether the last character read in buffer is the '\n' character:
size_t len = strlen (buffer); /* get length */
if (len && buffer[len - 1] == '\n') /* check if last is '\n' */
buffer[--len] = 0; /* overwrite with nul-char */
Now to the meat and potatoes of preventing Undefined Behavior. You only have TMPSZ characters available, and if you plan to use temp as a string, you must make sure you leave room for the nul-terminating character '\0' as the final character. ('\0' is equivalent to 0 (zero)). So check the length, the number of characters you have stored so far, and write at most one less than the number of characters remaining in temp,
if (TMPSZ <= nchr + len + 1) { /* room to fit in buffer? */
strncat (temp, buffer, TMPSZ - nchr - 1); /* only what fits */
nchr = TMPSZ - 1; /* nchr is max (+1 for nul-char) */
temp[nchr] = 0; /* affirmatively nul-terminate */
break; /* bail - all full */
}
If you didn't reach the end of temp in the test above, then buffer will fit in temp, so just copy it and update the total character count:
strcat (temp, buffer); /* copy buffer to temp */
nchr += len; /* update nchr */
}
That's it. Putting it all together, you can read from the file given as the first argument (or from stdin by default if no argument is given) and combine all lines into a single line up to TMPSZ - 1 characters, e.g.
#include <stdio.h>
#include <string.h>
/* if you need constants, #define them or use an enum */
enum { BUFSZ = 100, TMPSZ = 200 };
int main (int argc, char **argv){
char buffer[BUFSZ] = "", /* read buffer */
temp[TMPSZ] = ""; /* buffer for combined lines */
size_t nchr = 0; /* total character count */
FILE *input = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!input) { /* validate file open for reading */
fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
return 1;
}
while (fgets(buffer, BUFSZ, input)) { /* read each line */
size_t len = strlen (buffer); /* get length */
if (len && buffer[len - 1] == '\n') /* check if last is '\n' */
buffer[--len] = 0; /* overwrite with nul-char */
if (TMPSZ <= nchr + len + 1) { /* room to fit in buffer? */
strncat (temp, buffer, TMPSZ - nchr - 1); /* only what fits */
nchr = TMPSZ - 1; /* nchr is max (+1 for nul-char) */
temp[nchr] = 0; /* affirmatively nul-terminate */
break; /* bail - all full */
}
strcat (temp, buffer); /* copy buffer to temp */
nchr += len; /* update nchr */
}
if (input != stdin) fclose (input); /* close file if not stdin */
printf ("'%s'\n(%zu chars)\n", temp, nchr); /* output temp */
return 0;
}
Example Use/Output
$ ./bin/cmblines <dat/cmblines.txt
'This is Line 1line 2line 3and this is line 4'
(44 chars)
Example with TMPSZ = 20
Let's perform a test. Let's set TMPSZ = 20 and verify our code properly protects against writing beyond the end of temp,
$ ./bin/cmblines <dat/cmblines.txt
'This is Line 1line '
(19 chars)
Look things over and let me know if you have further questions.
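If the same bounded-append logic is needed in more than one place, it could be pulled out into a small helper. This is one possible sketch, not part of the program above: append_bounded is a made-up name, and it uses memcpy with an explicit length instead of strncat.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: append src to the string in dst, which has a
 * capacity of dstsz bytes and currently holds used characters (not
 * counting the '\0'). Copies only what fits, always nul-terminates,
 * and returns the new length. */
size_t append_bounded(char *dst, size_t dstsz, size_t used, const char *src)
{
    size_t room, len;

    assert(dst != NULL && src != NULL && used < dstsz);
    room = dstsz - used - 1;    /* free bytes, keeping one for '\0' */
    len = strlen(src);
    if (len > room)
        len = room;             /* truncate to what fits */
    memcpy(dst + used, src, len);
    dst[used + len] = '\0';
    return used + len;
}
```

A caller can detect a full buffer by checking whether the returned length equals dstsz - 1, which corresponds to the break in the loop above.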
The following program works for me.
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
int main(int argc, char **argv){
FILE *input = fopen(argv[1], "r");
char *line = NULL;
char buffer[100]={0};
char temp[200]={0};
line = fgets(buffer, 100 *sizeof(char), input);
while(line != NULL){
strcat(temp, line);
temp[strlen(temp)-1] = '\0';
line = fgets(buffer, 100 *sizeof(char), input);
}
printf("%s\n", temp);
fclose(input);
}

strcat() causes segmentation fault only after program is finished?

Here's a program that summarizes text. Up to this point, I'm counting the number of occurrences of each word in the text. But, I'm getting a segmentation fault in strcat.
Program received signal SIGSEGV, Segmentation fault.
0x75985629 in strcat () from C:\WINDOWS\SysWOW64\msvcrt.dll
However, while stepping through the code, the program runs the strcat() function as expected. I don't receive the error until line 75, when the program ends.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <ctype.h>
#define MAXTEXT 1000
#define MAXLINE 200
#define MAXWORDS 200
#define MAXWORD 32
char *strtolower(char *d, const char *s, size_t len);
/* summarizer: summarizes text */
int main(int argc, char *argv[]) {
/* argument check */
char *prog = argv[0];
if (argc < 1) {
fprintf(stderr, "%s: missing arguments, expected 1", prog);
exit(1);
}
/* attempt to open file */
FILE *fp;
if (!(fp = fopen(argv[1], "r"))) {
fprintf(stderr, "%s: Couldn't open file %s", prog, argv[1]);
exit(2);
}
/* read file line by line */
char line[MAXLINE], text[MAXTEXT];
while ((fgets(line, MAXTEXT, fp))) {
strncat(text, line, MAXLINE);
}
/* separate into words and count occurrences */
struct wordcount {
char *word;
int count;
};
struct wordcount words[MAXWORDS];
char *token, *delim = " \t\n.,!?";
char word[MAXWORD], textcpy[strlen(text)+1]; /*len of text and \0 */
int i, j, is_unique_word = 1;
strcpy(textcpy, text);
token = strtok(textcpy, delim);
for (i = 0; i < MAXWORDS && token; i++) {
strtolower(word, token, strlen(token));
token = strtok(NULL, delim);
/* check if word exists */
for (j = 0; words[j].word && j < MAXWORDS; j++) {
if (!strcmp(word, words[j].word)) {
is_unique_word = 0;
words[j].count++;
}
}
/* add to word list of unique */
if (is_unique_word) {
strcpy(words[i].word, word);
words[i].count++;
}
is_unique_word = 1;
}
return 0;
}
/* strtolower: copy str s to dest d, returns pointer of d */
char *strtolower(char *d, const char *s, size_t len) {
int i;
for (i = 0; i < len; i++) {
d[i] = tolower(*(s+i));
}
return d;
}
The problem is in the loop: while ((fgets(line, MAXTEXT, fp))) strncat(text, line, MAXLINE);. It is incorrect for multiple reasons:
text is uninitialized, concatenating a string to it has undefined behavior. Undefined behavior may indeed cause a crash after the end of the function, for example if the return address was overwritten.
there is no reason to use strncat() with a length of MAXLINE, the string read by fgets() has at most MAXLINE-1 bytes.
you do not check if there is enough space at the end of text to concatenate the contents of line. strncat(dest, src, n) copies at most n bytes from src to the end of dest and always sets a null terminator. It is not a safe version of strcat(). If the line does not fit at the end of text, you have unexpected behavior, and again you can observe a crash after the end of the function, for example if the return address was overwritten.
You could just try and read the whole file with fread:
/* read the file into the text array */
char text[MAXTEXT];
size_t text_len = fread(text, 1, sizeof(text) - 1, fp);
text[text_len] = '\0';
If text_len == sizeof(text) - 1, the file is potentially too large for the text array and the while loop would have caused a buffer overflow.
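The truncation check described above can be made explicit by peeking at the stream after the fread call. The following is a sketch under the assumption that the caller wants a flag rather than an error; read_text and its truncated parameter are names invented for the example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read up to bufsz - 1 bytes from fp into buf and
 * nul-terminate. Sets *truncated to 1 if more data remained in the
 * stream, otherwise 0. Returns the number of bytes stored. */
size_t read_text(FILE *fp, char *buf, size_t bufsz, int *truncated)
{
    size_t len;

    assert(fp != NULL && buf != NULL && bufsz > 0 && truncated != NULL);
    len = fread(buf, 1, bufsz - 1, fp);
    buf[len] = '\0';
    *truncated = 0;
    if (len == bufsz - 1) {
        int c = fgetc(fp);      /* peek: is anything left to read? */
        if (c != EOF) {
            ungetc(c, fp);      /* put it back for the caller */
            *truncated = 1;
        }
    }
    return len;
}
```

With the flag, a file that exactly fills the buffer is not misreported as too large, which the bare text_len == sizeof(text) - 1 comparison cannot distinguish.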
There is at least one problem: you create line with size MAXLINE (200), then you fgets() up to MAXTEXT (1000) chars into it.
The destination string of strncat must be null terminated, so you need to null terminate text before passing it to strncat. You must also limit each append to the space still free in text, leaving room for the '\0' that strncat appends, to prevent a buffer overflow.
char line[MAXLINE], text[MAXTEXT] = {'\0'};
while ((fgets(line, MAXLINE, fp)))
{
strncat(text, line, MAXTEXT - strlen(text) - 1);
}

Resources