Reading a txt file into separate strings in C

I have been trying to read characters from a txt file (in which the words are separated by spaces) and store each word as a string in my code. So far I can only print the words. How can I store them as strings?
The code that prints the words is the following, but I also need it to save the strings into arrays or pointers if possible.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main(){
FILE *fp;
int i=0;
char *words=NULL,*word=NULL,c;
if ((fp=fopen("monologue.txt","r"))==NULL){ /*Where monologue txt is a normal file with plain text*/
printf("Error Opening File\n");
exit(1);}
while ((c = fgetc(fp))!= EOF){
if (c=='\n'){ c = ' '; }
words = (char *)realloc(words, ++i*sizeof(char));
words[i-1]=c;}
word=strtok(words," ");
while(word!= NULL){
printf("%s\n",word);
word = strtok(NULL," ");}
exit(0);
}

Your code is rather hard to read. Here is almost identical code that is (I submit) considerably more readable:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main()
{
const char filename[] = "monologue.txt";
FILE *fp;
int i = 0;
char *words = NULL;
char *word = NULL;
int c;
if ((fp = fopen(filename, "r")) == NULL)
{
/*Where monologue txt is a normal file with plain text*/
fprintf(stderr, "Error opening file %s\n", filename);
exit(1);
}
while ((c = fgetc(fp)) != EOF)
{
if (c == '\n')
c = ' ';
words = (char *)realloc(words, ++i * sizeof(char));
words[i-1] = c;
}
word = strtok(words, " ");
while (word != NULL)
{
printf("%s\n", word);
word = strtok(NULL, " ");
}
return(0);
}
This shows us that you are slurping the entire file into the string pointed to by words, but you are doing so rather inefficiently in that you are reallocating memory one byte at a time for each byte read. You should be looking to do things much more effectively, by reading bigger chunks of the file into memory. For example, you might allocate an initial buffer of 32 KiB; you could read into that buffer using fread(); if you don't encounter EOF, you could then reallocate the space, doubling the amount available to you. (For testing, you'd start with a much smaller block - maybe 16 bytes, maybe even as small as 4 bytes; this ensures you test the memory reallocation code, whereas 32 KiB would probably seldom exercise the reallocation code.)
You also need to ensure that your string is null terminated; as it stands, it is not. You would need to do a final realloc() to make space for the null terminator too.
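A minimal sketch of the chunked approach described above, assuming the whole file fits in memory. The name slurp_file and its initial_cap parameter are illustrative choices, not part of the original code. It reads with fread(), doubles the buffer whenever it fills, and always keeps one byte in reserve for the null terminator:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read all of fp into a null-terminated heap buffer.
 * Returns NULL on allocation or read failure; the caller frees the result.
 * initial_cap is the starting capacity (use a tiny value when testing). */
char *slurp_file(FILE *fp, size_t initial_cap, size_t *out_len)
{
    size_t cap = initial_cap > 1 ? initial_cap : 2;
    size_t len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    for (;;)
    {
        /* Keep one byte in reserve for the terminator. */
        size_t want = cap - 1 - len;
        size_t got = fread(buf + len, 1, want, fp);
        len += got;
        if (got < want)
            break;                      /* EOF or read error */

        /* Buffer is full: double it, without losing buf on failure. */
        char *bigger = realloc(buf, cap * 2);
        if (bigger == NULL)
        {
            free(buf);
            return NULL;
        }
        buf = bigger;
        cap *= 2;
    }
    if (ferror(fp))
    {
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    if (out_len != NULL)
        *out_len = len;
    return buf;
}
```

Calling it as slurp_file(fp, 4, &n) exercises the reallocation path on almost any input, while a larger starting value such as 32 * 1024 suits normal use.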
You can avoid mapping newlines during input since strtok() can be given a list of characters on which to split, so you can add newline to that list.
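As a small illustration of that point, the sketch below passes a delimiter list containing space, tab and newline to strtok(), so no characters need to be rewritten during input. The function name count_words is hypothetical:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: print and count the words in text, splitting on
 * spaces, tabs and newlines in one pass. strtok() modifies text in place. */
size_t count_words(char *text)
{
    size_t n = 0;
    for (char *w = strtok(text, " \t\n"); w != NULL; w = strtok(NULL, " \t\n"))
    {
        printf("%s\n", w);
        n++;
    }
    return n;
}
```

Runs of adjacent delimiters are treated as a single separator, so blank lines and doubled spaces do not produce empty words.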
To generate a list of words, you need to adapt the loop around strtok(). You might simply count the spaces and newlines and then allocate enough pointers to point to that many words; you might have an overestimate if there are adjacent spaces or newlines, but better over than under. Alternatively, you can allocate, for the sake of argument, 16 pointers. As you process the first 16 words, you use these pointers; when you run out of space, you double the number of pointers allocated, and use the new supply until that runs out. You can use any algorithm that allocates a significant number of pointers (meaning 'more than one' and 'increasing as the number already used goes up') instead of simple doubling, but doubling has its merits (notably, it is simple).
One word of caution: you should never assign the result of realloc() to the variable that is its first argument:
words = (char *)realloc(words, ++i * sizeof(char)); // Bad!
The trouble is that if realloc() fails, you've just wiped out the only pointer to the previously allocated memory, so you have leaked it all. Always assign to a new variable, test that it worked, then copy the result:
char *new_space = (char *)realloc(words, ++i * sizeof(char));
if (new_space == 0)
{
fprintf(stderr, "Memory allocation failed at size %d\n", i);
exit(1);
}
words = new_space;
I assembled this code yesterday. Notice that it uses functions to do repeated jobs, such as checking that memory allocation succeeded. There is room to improve it (there always is). It still does character-at-a-time input (and therefore newline mapping) but allocates increasingly large chunks of memory so that it does not do memory allocation on every character read. The err_exit() function is a useful skeleton; you can flesh it out into a much more complex system, but the basic idea of a function to report errors and exit (with a behaviour similar to fprintf() + exit()) can simplify programs a lot (and error checking and reporting is important, but needs to be simple when it can be).
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
static void err_exit(const char *format, ...);
static void *emalloc(size_t nbytes);
static void *erealloc(void *old_space, size_t nbytes);
int main(void)
{
const char filename[] = "monologue.txt";
FILE *fp;
size_t i = 0;
size_t len_data = 4;
char *data = emalloc(len_data);
int c;
/* Read data from file */
if ((fp = fopen(filename, "r")) == NULL)
err_exit("Error opening file %s\n", filename);
while ((c = fgetc(fp)) != EOF)
{
if (c == '\n')
c = ' ';
if (i >= len_data)
{
assert(i == len_data);
data = erealloc(data, 2 * len_data);
len_data *= 2;
}
data[i++] = c;
}
if (i >= len_data)
{
assert(i == len_data);
data = erealloc(data, len_data + 1);
len_data++;
}
data[i] = '\0';
fclose(fp);
/* Split file into words */
size_t len_wordlist = 16;
size_t num_words = 0;
char **wordlist = emalloc(len_wordlist * sizeof(char *));
char *location = data;
char *word;
for (num_words = 0; (word = strtok(location, " ")) != NULL; num_words++)
{
if (num_words >= len_wordlist)
{
assert(num_words == len_wordlist);
wordlist = erealloc(wordlist, 2 * len_wordlist * sizeof(char *));
len_wordlist *= 2;
}
wordlist[num_words] = word;
location = NULL;
}
/* Print the word list - one per line */
for (i = 0; i < num_words; i++)
printf("%zu: %s\n", i, wordlist[i]);
/* Release allocated space */
free(data);
free(wordlist);
return(0);
}
static void err_exit(const char *format, ...)
{
va_list args;
va_start(args, format);
vfprintf(stderr, format, args);
va_end(args);
exit(1);
}
static void *emalloc(size_t nbytes)
{
void *new_space = malloc(nbytes);
if (new_space == 0)
err_exit("Failed to allocate %zu bytes of memory\n", nbytes);
return(new_space);
}
static void *erealloc(void *old_space, size_t nbytes)
{
void *new_space = realloc(old_space, nbytes);
if (new_space == 0)
err_exit("Failed to reallocate %zu bytes of memory\n", nbytes);
return(new_space);
}

Try this. I've modified very little about your code, just to keep it close to your starting point. The main thing I did was add allwords which is an array of char * (this is where I store each string one by one). Then right after printing each version of word (what you were already doing), I also copied it into the next open slot in the allwords array. At the end I added another printing loop to display the contents of each string.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#define MAXWORDS 999
int main(){
FILE *fp;
int i=0, j;
char *words=NULL,*word=NULL,c;
char *allwords[MAXWORDS];
if ((fp=fopen("monologue.txt","r"))==NULL){ /*Where monologue txt is a normal file with plain text*/
printf("Error Opening File\n");
exit(1);}
while ((c = fgetc(fp))!= EOF){
if (c=='\n'){ c = ' '; }
words = (char *)realloc(words, ++i*sizeof(char));
words[i-1]=c;}
word=strtok(words," ");
i=0;
while(word!= NULL && i < MAXWORDS){
printf("%s\n",word);
allwords[i] = malloc(strlen(word) + 1); /* +1 for the terminating null */
strcpy(allwords[i], word);
word = strtok(NULL," ");
i++;
}
printf("\nNow printing each saved string:\n");
for (j=0; j<i; j++)
printf("String %d: %s\n", j, allwords[j]);
exit(0);
}

Related

Input a char string with any size [duplicate]

If I don't know how long the word is, I cannot write char m[6];.
The word might be ten or twenty characters long.
How can I use scanf to get input from the keyboard?
#include <stdio.h>
int main(void)
{
char m[6];
printf("please input a string with length=5\n");
scanf("%s",&m);
printf("this is the string: %s\n", m);
return 0;
}
please input a string with length=5
input: hello
this is the string: hello
Read the input while allocating the buffer dynamically.
For example:
#include <stdio.h>
#include <stdlib.h>
char *inputString(FILE* fp, size_t size){
//size is the initial capacity; the buffer is extended as more input arrives
char *str;
int ch;
size_t len = 0;
str = realloc(NULL, sizeof(*str)*size);//size is start size
if(!str)return str;
while(EOF!=(ch=fgetc(fp)) && ch != '\n'){
str[len++]=ch;
if(len==size){
str = realloc(str, sizeof(*str)*(size+=16));
if(!str)return str;
}
}
str[len++]='\0';
return realloc(str, sizeof(*str)*len);
}
int main(void){
char *m;
printf("input string : ");
m = inputString(stdin, 10);
printf("%s\n", m);
free(m);
return 0;
}
With the computers of today, you can get away with allocating very large strings (hundreds of thousands of characters) while hardly making a dent in the computer's RAM usage. So I wouldn't worry too much.
However, in the old days, when memory was at a premium, the common practice was to read strings in chunks. fgets reads up to a maximum number of chars from the input, but leaves the rest of the input buffer intact, so you can read the rest from it however you like.
In this example, I read in chunks of 200 chars, but you can of course use whatever chunk size you want.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char* readinput()
{
#define CHUNK 200
char* input = NULL;
char tempbuf[CHUNK];
size_t inputlen = 0, templen = 0;
do {
fgets(tempbuf, CHUNK, stdin);
templen = strlen(tempbuf);
input = realloc(input, inputlen+templen+1);
strcpy(input+inputlen, tempbuf);
inputlen += templen;
} while (templen==CHUNK-1 && tempbuf[CHUNK-2]!='\n');
return input;
}
int main()
{
char* result = readinput();
printf("And the result is [%s]\n", result);
free(result);
return 0;
}
Note that this is a simplified example with no error checking; in real life you will have to make sure the input is OK by verifying the return value of fgets.
Also note that at the end of the readinput routine, no bytes are wasted; the string has the exact memory size it needs to have.
I've seen only one simple way of reading an arbitrarily long string, but I've never used it. I think it goes like this:
char *m = NULL;
printf("please input a string\n");
scanf("%ms",&m);
if (m == NULL)
fprintf(stderr, "That string was too long!\n");
else
{
printf("this is the string %s\n",m);
/* ... any other use of m */
free(m);
}
The m between % and s tells scanf() to measure the string and allocate memory for it and copy the string into that, and to store the address of that allocated memory in the corresponding argument. Once you're done with it you have to free() it.
This isn't supported on every implementation of scanf(), though.
As others have pointed out, the easiest solution is to set a limit on the length of the input. If you still want to use scanf() then you can do so this way:
char m[100];
scanf("%99s", m);
Note that the size of m[] must be at least one byte larger than the number between % and s.
If the string entered is longer than 99, then the remaining characters will wait to be read by another call or by the rest of the format string passed to scanf().
Generally scanf() is not recommended for handling user input. It's best applied to basic structured text files that were created by another application. Even then, you must be aware that the input might not be formatted as you expect, as somebody might have interfered with it to try to break your program.
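The width-limited read described above can be sketched as follows. The helper name read_word is hypothetical, and it uses fscanf() on a FILE * rather than scanf() so that the same code works on stdin or on a file. Checking the return value shows whether a word was actually stored:

```c
#include <stdio.h>

/* Hypothetical helper: read one whitespace-delimited word of at most 99
 * characters from fp into out, which must hold at least 100 bytes.
 * Returns 1 on success, 0 on end of file or a read failure. */
int read_word(FILE *fp, char out[100])
{
    return fscanf(fp, "%99s", out) == 1;
}
```

If the word in the input is longer than 99 characters, the first call stores the first 99 and the next call picks up where it left off, which matches the behaviour described above.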
POSIX (since POSIX.1-2008) provides the getline function for reading a line without specifying its size in advance; note that it is not part of the C standard. getline allocates a buffer of the required size automatically, so there is no need to guess the string's size. The following code demonstrates its usage:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
char *line = NULL;
size_t len = 0;
ssize_t read;
while ((read = getline(&line, &len, stdin)) != -1) {
printf("Retrieved line of length %zd :\n", read);
printf("%s", line);
}
if (ferror(stdin)) {
/* handle error */
}
free(line);
return 0;
}
If I may suggest a safer approach:
Declare a buffer big enough to hold the string:
char user_input[255];
Get the user input in a safe way:
fgets(user_input, 255, stdin);
This is a safe way to get the input: the first argument is a pointer to the buffer where the input will be stored, the second is the maximum number of bytes the function may store there (including the terminating null), and the third is a pointer to the standard input, i.e. where the user input comes from.
Safety in particular comes from the second argument limiting how much will be read which prevents buffer overruns. Also, fgets takes care of null-terminating the processed string.
EDIT: If you need to do any formatting (e.g. convert a string to a number), you can use atoi once you have the input.
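Because atoi() has no way to report a failed conversion, a sketch that pairs fgets() with strtol() may be more useful in practice. The helper name read_long is hypothetical, and the 255-byte buffer simply mirrors the size used above:

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read one line from fp and parse it as a long.
 * Returns 1 and stores the value in *out on success, 0 otherwise. */
int read_long(FILE *fp, long *out)
{
    char line[255];
    if (fgets(line, sizeof line, fp) == NULL)
        return 0;                       /* end of file or read error */
    line[strcspn(line, "\n")] = '\0';   /* drop the newline fgets keeps */

    char *end;
    errno = 0;
    long value = strtol(line, &end, 10);
    if (end == line || *end != '\0' || errno == ERANGE)
        return 0;                       /* no digits, trailing junk, or overflow */
    *out = value;
    return 1;
}
```

This version rejects lines such as "abc" or "12x" instead of silently returning 0 or 12 as atoi() would.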
Safer and faster (doubling capacity) version:
char *readline(char *prompt) {
size_t size = 80;
char *str = malloc(sizeof(char) * size);
int c;
size_t len = 0;
printf("%s", prompt);
while (EOF != (c = getchar()) && c != '\r' && c != '\n') {
str[len++] = c;
if(len == size) str = realloc(str, sizeof(char) * (size *= 2));
}
str[len++]='\0';
return realloc(str, sizeof(char) * len);
}
Read directly into allocated space with fgets().
Special care is needed to distinguish a successful read, end-of-file, input error and out-of-memory. Proper memory management is needed on EOF.
This method retains a line's '\n'.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define FGETS_ALLOC_N 128
char* fgets_alloc(FILE *istream) {
char* buf = NULL;
size_t size = 0;
size_t used = 0;
do {
size += FGETS_ALLOC_N;
char *buf_new = realloc(buf, size);
if (buf_new == NULL) {
// Out-of-memory
free(buf);
return NULL;
}
buf = buf_new;
if (fgets(&buf[used], (int) (size - used), istream) == NULL) {
// feof or ferror
if (used == 0 || ferror(istream)) {
free(buf);
buf = NULL;
}
return buf;
}
size_t length = strlen(&buf[used]);
if (length + 1 != size - used) break;
used += length;
} while (buf[used - 1] != '\n');
return buf;
}
Sample usage
int main(void) {
FILE *istream = stdin;
char *s;
while ((s = fgets_alloc(istream)) != NULL) {
printf("'%s'", s);
free(s);
fflush(stdout);
}
if (ferror(istream)) {
puts("Input error");
} else if (feof(istream)) {
puts("End of file");
} else {
puts("Out of memory");
}
return 0;
}
I know that I have arrived four years late, but I think I have another way that someone can use. I used the getchar() function like this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
// The main function is placed below this function.
// d is the prompt string; f points to the string pointer that receives the result
void GetStr(char *d,char **f)
{
printf("%s",d);
for(int i =0;1;i++)
{
int ch;
if(i)//I.e if i!=0
*f = (char*)realloc((*f),i+1);
else
*f = (char*)malloc(i+1);
ch = getchar();
if(ch == '\n' || ch == EOF) // stop at end of line or end of input
{
(*f)[i]= '\0';
break;
}
(*f)[i]=ch;
}
}
int main()
{
char *s =NULL;
GetStr("Enter the String:- ",&s);
printf("Your String:- %s \nAnd It's length:- %zu\n",s,(strlen(s)));
free(s);
}
Here is a sample run of this program:
Enter the String:- I am Using Linux Mint XFCE 18.2 , eclispe CDT and GCC7.2 compiler!!
Your String:- I am Using Linux Mint XFCE 18.2 , eclispe CDT and GCC7.2 compiler!!
And It's length:- 67
Use a char pointer to hold the string. If you have some idea of the maximum size of the string, read it with the function
char *fgets (char *str, int size, FILE* file);
Otherwise, you can allocate memory at run time with malloc(), which provides the requested amount of memory dynamically.
I also have a solution with standard input and output:
#include<stdio.h>
#include<stdlib.h>
int main()
{
char *str,ch;
int size=10,len=0;
str=realloc(NULL,sizeof(char)*size);
if(!str)return 1;
while(scanf("%c",&ch)==1 && ch!='\n')
{
str[len++]=ch;
if(len==size)
{
str = realloc(str,sizeof(char)*(size+=10));
if(!str)return 1;
}
}
str[len++]='\0';
printf("%s\n",str);
free(str);
}
I have a solution using standard libraries of C and also creating a string type (alias of char*) like in C++
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef char* string;
typedef struct __strstr {
char ch;
struct __strstr *next;
}Strstr;
void get_str(char **str) {
int ch; /* int, so that EOF can be told apart from a real character */
char *buffer, a;
Strstr *new = NULL;
Strstr *head = NULL, *tmp = NULL;
int c = 0, k = 0;
while ((ch = getchar()) != '\n' && ch != EOF) {
new = malloc(sizeof(Strstr));
if(new == NULL) {
printf("\nError!\n");
exit(1);
}
new->ch = ch;
new->next = NULL;
new->next = head;
head = new;
}
tmp = head;
while (tmp != NULL) {
c++;
tmp = tmp->next;
}
if(c == 0) {
*str = "";
} else {
buffer = malloc(sizeof(char) * (c + 1));
*str = malloc(sizeof(char) * (c + 1));
if(buffer == NULL || *str == NULL) {
printf("\nError!\n");
exit(1);
}
tmp = head;
while (tmp != NULL) {
buffer[k] = tmp->ch;
k++;
tmp = tmp->next;
}
buffer[k] = '\0';
for (int i = 0, j = strlen(buffer)-1; i < j; i++, j--) {
a = buffer[i];
buffer[i] = buffer[j];
buffer[j] = a;
}
strcpy(*str, buffer);
// Dealloc
free(buffer);
while (head != NULL) {
tmp = head;
head = head->next;
free(tmp);
}
}
}
int main() {
string str;
printf("Enter text: ");
get_str(&str);
printf("%s\n", str);
return 0;
}

Why is strcmp not working when I am reading a line from a file using malloc and determining the strings size independently

Alright, I am seriously puzzled.
I used a snippet of code that would open and read a file and store it in dynamic memory. The output is executed in a void function, where the file is given line by line. The size of the buffer character array was specified because the number of bytes that each line contains is unknown. I want to compare the string that is read with a user input (given by char *word) to see if the program works or not.
In my mind, the strings can only be the same if both their sizes and their character sequences are equivalent, and the dynamic string size is not the same as the file's string size. By using the counter in the main () loop (i.e. pos), I can determine the correct size of the string being read from the file and pass it on to the void function. In the void handle_line function I copied in the characters from the buffer into char *temp because I know the number of characters that temp needs (i.e. pos) and can therefore initialize its size and copy the contents of the dynamic memory into the char *temp. I proceeded to print out the bytes of the string sizes for line, temp, and word. While char *temp and char *word gave their expected values, char *line was always four bytes. Furthermore, the strcmp() did not give the expected result when H2O was read from the CompoundLib file--this in spite of the fact that I know that their sizes and character sequences are the same.
How can this be? What am I missing??? Many thanks in advance.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void handle_line(char *line, int a) {
char temp[a];
int i=0;
char temp_2[a];
char *word="H2O";
for(i=0;i<=a;i++)
temp[i]=line[i];
temp_2[i]=temp[i];
printf("size of temp= %d\n",sizeof temp_2);
printf("size of word= %d\n",sizeof word);
printf("size of line= %d\n",sizeof line);
if (strcmp(temp_2, word) == 0)
puts("Strings equal");
else
puts("Strings do not equal");
printf("%s\n", line);
printf("%d\n", a);
printf("%s\n",temp);
}
int main(int argc, char *argv[]){
char c[1000];
FILE *fptr;
int size = 1024, pos;
int c;
char *buffer = (char *)malloc(size);
FILE *f = fopen("CompoundLib.txt", "r");
if(f) {
do { // read all lines in file
pos = 0;
do{ // read one line
c = fgetc(f);
if(c != EOF) buffer[pos++] = (char)c;
if(pos >= size - 1) { // increase buffer length - leave room for 0
size *=2; //size = size*2
}
}while(c != EOF && c != '\n');
buffer[pos] = 0;
handle_line(buffer, pos);
} while(c != EOF);
fclose(f);
}
free(buffer);
return 0;
}
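Two details in the code above are worth isolating, among its other issues. First, sizeof applied to a char * parameter such as line yields the size of the pointer (four bytes on the build described), not the length of the text it points to; strlen() gives the length. Second, the read loop stores the '\n' in buffer, so the line read from the file is "H2O\n", which strcmp() does not consider equal to "H2O". A minimal sketch of a comparison that accounts for both, with line_matches as a hypothetical helper name:

```c
#include <string.h>

/* Hypothetical helper: compare a line read from a file with word,
 * ignoring one trailing newline (and a carriage return before it). */
int line_matches(const char *line, const char *word)
{
    size_t len = strlen(line);          /* text length, not sizeof(char *) */
    if (len > 0 && line[len - 1] == '\n')
        len--;
    if (len > 0 && line[len - 1] == '\r')
        len--;
    return len == strlen(word) && strncmp(line, word, len) == 0;
}
```

With this helper, handle_line could call line_matches(line, "H2O") directly on the buffer, without copying into temporary arrays.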

array of strings access error

When I print char** surname and char** first, I get some strange outputs. I am not sure if I am doing the malloc correctly or if I'm doing something else incorrectly.
The Input -> names1.txt
The outputs
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main ()
{
int size, i;
char **surname, **first, *middle_init, dummy, str[80];
FILE *fp_input = fopen("names1.txt", "r");
fscanf(fp_input, "%d%c", &size, &dummy); // gets size of array from file
/* dynamic memory allocation */
middle_init = (char*)malloc(size * sizeof(char));
surname = (char**)malloc(size * sizeof(char*));
first = (char**)malloc(size * sizeof(char*));
for (i = 0; i < size; i++)
{
surname[i] = (char*)malloc(17 * sizeof(char));
first[i] = (char*)malloc(17 * sizeof(char));
} // for
/* reads from file and assigns value to arrays */
i = 0;
strcpy(middle_init, "");
while (fgets(str, 80, fp_input) != NULL)
{
surname[i] = strtok(str, ", \n");
first[i] = strtok(NULL, ". ");
strcat(middle_init, strtok(NULL, ". "));
i++;
} // while
/* prints arrays */
for (i = 0; i < size; i++)
printf("%s %s\n", surname[i], first[i]);
return 0;
} // main
A casual look at the code suggests:
You must use strcpy() or a variant on the theme to copy the string found by strtok() into the surname, etc.
The way you've written it, you throw away your allocated memory.
You get the repeated output because you're storing pointers to the string you use to hold the line in the surname and first arrays. That string only holds the last line when you do the printing. This and the previous point are corollaries of the first point.
You only allocate a single character for the middle initials. You then use strcat() to treat them as strings. I recommend treating middle initials as strings, much like the other names. Or, since you aren't required to print them, you might decide to ignore middle initials altogether.
Using 17 instead of enum { NAME_LENGTH = 17 }; or equivalent is not a good idea.
There are undoubtedly other issues too.
I guess you've not reached structures in your course of study yet. If you have covered structures, you should probably use a structure type to represent a complete name, and use a single array of names instead of parallel arrays. This will likely simplify memory management too; you'd use fixed size array elements in the structure, so you'd only have to make one allocation for each name.
The code below produces the output:
Ryan Elizabeth
McIntyre O
Cauble-Chantrenne Kristin
Larson Lois
Thorpe Trinity
Ruiz Pedro
In this code, the err_exit() function is vastly valuable because it makes error reporting into a one-line call, rather than a 4-line paragraph, which means you're more likely to do the error checking. It is a basic use of variable length argument lists, and you may not understand it yet, but it is extremely convenient and powerful. The only functions that could be error checked but aren't are the fclose() and printf(). If you're reading a file, there's little benefit to checking fclose(); if you're writing and fclose() fails, you may have run out of disk space or something like that and it is probably appropriate to report the error. You could add <errno.h> to the list of headers and report on errno and strerror(errno) if you wanted to improve the error reporting more. The code frees the allocated memory; valgrind gives it a clean bill of health.
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
static void err_exit(const char *fmt, ...);
int main(void)
{
enum { NAME_SIZE = 25 };
const char *file = "names1.txt";
int size, i;
char **surname, **first, str[80];
FILE *fp_input = fopen(file, "r");
if (fp_input == NULL)
err_exit("Failed to open file %s\n", file);
if (fgets(str, sizeof(str), fp_input) == 0)
err_exit("Unexpected EOF on file %s\n", file);
if (sscanf(str, "%d", &size) != 1)
err_exit("Did not find integer in line: %s\n", str);
if (size <= 0 || size > 1000)
err_exit("Integer %d out of range 1..1000\n", size);
if ((surname = (char**)malloc(size * sizeof(char*))) == 0 ||
(first = (char**)malloc(size * sizeof(char*))) == 0)
err_exit("Memory allocation failure\n");
for (i = 0; i < size; i++)
{
if ((surname[i] = (char*)malloc(NAME_SIZE * sizeof(char))) == 0 ||
(first[i] = (char*)malloc(NAME_SIZE * sizeof(char))) == 0)
err_exit("Memory allocation failure\n");
}
for (i = 0; i < size && fgets(str, sizeof(str), fp_input) != NULL; i++)
{
char *tok_s = strtok(str, ",. \n");
char *tok_f = strtok(NULL, ". ");
if (tok_s == 0 || tok_f == 0)
err_exit("Failed to read surname and first name from: %s\n", str);
if (strlen(tok_s) >= NAME_SIZE || strlen(tok_f) >= NAME_SIZE)
err_exit("Name(s) %s and %s are too long (max %d)\n", tok_s, tok_f, NAME_SIZE-1);
strcpy(surname[i], tok_s);
strcpy(first[i], tok_f);
}
if (i != size)
err_exit("Only read %d names\n", i);
fclose(fp_input);
/* prints arrays */
for (i = 0; i < size; i++)
printf("%s %s\n", surname[i], first[i]);
for (i = 0; i < size; i++)
{
free(surname[i]);
free(first[i]);
}
free(surname);
free(first);
return 0;
}
static void err_exit(const char *fmt, ...)
{
va_list args;
va_start(args, fmt);
vfprintf(stderr, fmt, args);
va_end(args);
exit(1);
}
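The errno and strerror() improvement mentioned above might look something like this. It is only a sketch: err_vformat(), err_format() and err_sysexit() are hypothetical names, and the 512-byte message buffer is an arbitrary choice. The formatting is split out from the exiting function so that it can be exercised on its own.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Format the message into buf; a non-zero errnum appends ": (n) text". */
void err_vformat(char *buf, size_t size, int errnum, const char *fmt, va_list args)
{
    assert(buf != NULL && fmt != NULL);
    int n = vsnprintf(buf, size, fmt, args);
    if (errnum != 0 && n >= 0 && (size_t)n < size)
        snprintf(buf + n, size - (size_t)n, ": (%d) %s", errnum, strerror(errnum));
}

/* Variadic front end to err_vformat(). */
void err_format(char *buf, size_t size, int errnum, const char *fmt, ...)
{
    va_list args;
    va_start(args, fmt);
    err_vformat(buf, size, errnum, fmt, args);
    va_end(args);
}

/* Like err_exit(), but also reports errno; omit the trailing newline in fmt. */
void err_sysexit(const char *fmt, ...)
{
    int errnum = errno;   /* capture before any library call can change it */
    char msg[512];
    va_list args;
    va_start(args, fmt);
    err_vformat(msg, sizeof(msg), errnum, fmt, args);
    va_end(args);
    fprintf(stderr, "%s\n", msg);
    exit(EXIT_FAILURE);
}
```

A call such as err_sysexit("Failed to open file %s", file) right after a failed fopen() would then include the system's reason for the failure.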
Here:
surname[i] = (char*)malloc(17 * sizeof(char));
first[i] = (char*)malloc(17 * sizeof(char));
..
surname[i] = strtok(str, ", \n");
first[i] = strtok(NULL, ". ");
You allocate memory for surname[i] and first[i] but never use it, because you immediately overwrite those pointers with the values returned by strtok() (which also leaks the allocations). You should not store those pointers anyway: strtok() returns pointers into the buffer being parsed (str), and that buffer is overwritten by the next fgets(). You could use strdup() instead:
while (fgets(str, 80, fp_input) != NULL) {
surname[i] = strdup(strtok(str, ", \n"));
first[i] = strdup(strtok(NULL, ". "));
middle_init[i] = strtok(NULL, ". ")[0];
i++;
} // while
/* prints arrays */
for (i = 0; i < size; i++)
printf("%s %s %c\n", surname[i], first[i], middle_init[i]);
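strdup() allocates memory and copies the string, so you also avoid hard-coding the string length; you should free that memory when you're done. Note that strtok() returns NULL when there are no more tokens, so real code should check each result before passing it to strdup() or indexing it. Also note that middle_init is a char array, so I just assign one char.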
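A more defensive version of the same idea could be sketched like this. The helpers dup_token() and parse_name() are hypothetical, and the input format "Surname, First M." is an assumption based on the delimiters used above. Unlike the loop above, the sketch also treats '\n' as a delimiter for the first name and treats the middle initial as optional.

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup() on strict-C compilers */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Fetch the next token and return a heap copy, or NULL if there is
   no token or the allocation fails. */
char *dup_token(char *str, const char *delims)
{
    assert(delims != NULL);
    char *tok = strtok(str, delims);
    return (tok != NULL) ? strdup(tok) : NULL;
}

/* Split a line into copies that the caller must free().
   Returns 0 on success, -1 if the surname or first name is missing. */
int parse_name(char *line, char **surname, char **first, char *middle)
{
    *surname = dup_token(line, ", \n");
    *first = dup_token(NULL, ". \n");
    char *mid = strtok(NULL, ". \n");
    *middle = (mid != NULL) ? mid[0] : '\0';  /* middle initial is optional */
    if (*surname == NULL || *first == NULL)
    {
        free(*surname);                        /* free(NULL) is a no-op */
        free(*first);
        *surname = *first = NULL;
        return -1;
    }
    return 0;
}
```

In the reading loop you would call parse_name(str, &surname[i], &first[i], &middle_init[i]) and stop with an error message when it returns -1.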

Using realloc to expand buffer while reading from file crashes

I am writing some code that needs to read fasta files, so part of my code (included below) is a fasta parser. As a single sequence can span multiple lines in the fasta format, I need to concatenate multiple successive lines read from the file into a single string. I do this by realloc'ing the string buffer after reading every line, so that it holds the current length of the sequence plus the length of the line just read. I do some other stuff, like stripping white space etc. All goes well for the first sequence, but fasta files can contain multiple sequences. So, similarly, I have a dynamic array of structs, each with two strings (title and actual sequence) of type "char *". Again, as I encounter a new title (introduced by a line beginning with '>'), I increment the number of sequences and realloc the sequence list buffer. The realloc segfaults on allocating space for the second sequence with
*** glibc detected *** ./stackoverflow: malloc(): memory corruption: 0x09fd9210 ***
Aborted
For the life of me I can't see why. I've run it through gdb and everything seems to be working (i.e. everything is initialised and the values seem sane)... Here's the code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <ctype.h>
#include <math.h>
#include <errno.h>
//a structure to keep a record of sequences read in from file, and their titles
typedef struct {
char *title;
char *sequence;
} sequence_rec;
//string convenience functions
//checks whether a string consists entirely of white space
int empty(const char *s) {
int i;
i = 0;
while (s[i] != 0) {
if (!isspace(s[i])) return 0;
i++;
}
return 1;
}
//substr allocates and returns a new string which is a substring of s from i to
//j exclusive, where i < j; If i or j are negative they refer to distance from
//the end of the s
char *substr(const char *s, int i, int j) {
char *ret;
if (i < 0) i = strlen(s)-i;
if (j < 0) j = strlen(s)-j;
ret = malloc(j-i+1);
strncpy(ret,s,j-i);
return ret;
}
//strips white space from either end of the string
void strip(char **s) {
int i, j, len;
char *tmp = *s;
len = strlen(*s);
i = 0;
while ((isspace(*(*s+i)))&&(i < len)) {
i++;
}
j = strlen(*s)-1;
while ((isspace(*(*s+j)))&&(j > 0)) {
j--;
}
*s = strndup(*s+i, j-i);
free(tmp);
}
int main(int argc, char**argv) {
sequence_rec *sequences = NULL;
FILE *f = NULL;
char *line = NULL;
size_t linelen;
int rcount;
int numsequences = 0;
f = fopen(argv[1], "r");
if (f == NULL) {
fprintf(stderr, "Error opening %s: %s\n", argv[1], strerror(errno));
return EXIT_FAILURE;
}
rcount = getline(&line, &linelen, f);
while (rcount != -1) {
while (empty(line)) rcount = getline(&line, &linelen, f);
if (line[0] != '>') {
fprintf(stderr,"Sequence input not in valid fasta format\n");
return EXIT_FAILURE;
}
numsequences++;
sequences = realloc(sequences,sizeof(sequence_rec)*numsequences);
sequences[numsequences-1].title = strdup(line+1); strip(&sequences[numsequences-1].title);
rcount = getline(&line, &linelen, f);
sequences[numsequences-1].sequence = malloc(1); sequences[numsequences-1].sequence[0] = 0;
while ((!empty(line))&&(line[0] != '>')) {
strip(&line);
sequences[numsequences-1].sequence = realloc(sequences[numsequences-1].sequence, strlen(sequences[numsequences-1].sequence)+strlen(line)+1);
strcat(sequences[numsequences-1].sequence,line);
rcount = getline(&line, &linelen, f);
}
}
return EXIT_SUCCESS;
}
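You should use strings that look something like this, tracking both the length in use and the allocated capacity:
struct string {
size_t len; /* bytes in use */
size_t cap; /* bytes allocated */
char *ptr;
};
This prevents strncpy bugs like the one it seems you hit, and lets you do strcat and friends faster because the length is already known.
You should also use a doubling array for each string. This prevents too many allocations and memcpys. Something like this (it needs <errno.h>, <stdlib.h> and <string.h>):
int sstrcat(struct string *a, const struct string *b)
{
size_t len = a->len + b->len;
if (a->cap < len) {
size_t cap = a->cap ? a->cap : 16; /* doubling from 0 would never grow */
while (cap < len) {
cap *= 2;
}
char *tmp = realloc(a->ptr, cap); /* a->ptr stays valid if this fails */
if (tmp == NULL) {
return ENOMEM;
}
a->ptr = tmp;
a->cap = cap;
}
memcpy(&a->ptr[a->len], b->ptr, b->len);
a->len = len;
return 0;
}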
I now see you are doing bioinformatics, which means you probably need more performance than I thought. You should use strings like this instead:
struct string {
int len;
char ptr[]; /* C99 flexible array member; ptr[0] is a GNU extension */
};
This way, when you allocate a string object, you call malloc(sizeof(struct string) + len) and avoid a second call to malloc. It's a little more work but it should help measurably, in terms of speed and also memory fragmentation.
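As a rough sketch of that single-allocation layout (struct fstring and fstring_new() are hypothetical names, chosen so they don't clash with struct string above; the extra terminator byte is my addition so the data can still be passed to the usual string functions):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct fstring
{
    size_t len;    /* bytes stored in data, excluding the terminator */
    char data[];   /* C99 flexible array member; must be the last member */
};

/* Allocate header and characters in one block; returns NULL on failure. */
struct fstring *fstring_new(const char *src, size_t len)
{
    assert(src != NULL);
    struct fstring *s = malloc(sizeof(*s) + len + 1);   /* +1 for '\0' */
    if (s == NULL)
        return NULL;
    s->len = len;
    memcpy(s->data, src, len);
    s->data[len] = '\0';
    return s;
}
```

The trade-off is that growing such a string means realloc'ing the whole object, so the address of the struct itself can change and any stored pointers to it must be updated.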
Finally, if this isn't actually the source of error, it looks like you have some corruption. Valgrind should help you detect it if gdb fails.
One potential issue is here:
strncpy(ret,s,j-i);
return ret;
ret might not get a null terminator. See man strncpy:
char *strncpy(char *dest, const char *src, size_t n);
...
The strncpy() function is similar, except that at most n bytes of src
are copied. Warning: If there is no null byte among the first n bytes
of src, the string placed in dest will not be null terminated.
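One way to avoid the missing terminator is to copy a known number of bytes and write the null byte explicitly. The sketch below uses a hypothetical substr_fixed(); it also copies from s + i rather than from s, and adds (rather than subtracts) negative indices, which is what the comment on the original substr() appears to intend.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a malloc'd copy of s[i..j), or NULL on a bad range or
   allocation failure. Negative i or j count back from the end of s. */
char *substr_fixed(const char *s, long i, long j)
{
    assert(s != NULL);
    long len = (long)strlen(s);
    if (i < 0) i += len;
    if (j < 0) j += len;
    if (i < 0 || j > len || i > j)
        return NULL;
    char *ret = malloc((size_t)(j - i) + 1);
    if (ret == NULL)
        return NULL;
    memcpy(ret, s + i, (size_t)(j - i));
    ret[j - i] = '\0';   /* the step strncpy() does not guarantee */
    return ret;
}
```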
There's also a bug here:
j = strlen(*s)-1;
while ((isspace(*(*s+j)))&&(j > 0)) {
What if strlen(*s) is 0? You'll end up reading (*s)[-1].
You also don't check in strip() that the string doesn't consist entirely of spaces. If it does, you'll end up with j < i.
edit: Just noticed that your substr() function doesn't actually get called.
I think the memory corruption problem might be the result of how you're handling the data used in your getline() calls. Basically, line is reallocated via strndup() in the calls to strip(), so the buffer size being tracked in linelen by getline() will no longer be accurate. getline() may overrun the buffer.
while ((!empty(line))&&(line[0] != '>')) {
strip(&line); // <-- assigns a `strndup()` allocation to `line`
sequences[numsequences-1].sequence = realloc(sequences[numsequences-1].sequence, strlen(sequences[numsequences-1].sequence)+strlen(line)+1);
strcat(sequences[numsequences-1].sequence,line);
rcount = getline(&line, &linelen, f); // <-- the buffer `line` points to might be
// smaller than `linelen` bytes
}
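One possible way around this is to strip the line in place, so the buffer that getline() manages is never replaced. The following is a sketch only; strip_in_place() is a hypothetical replacement for strip(), and it also copes with empty and all-space strings.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Strip leading and trailing white space without changing the allocation.
   Returns the new length of the string. */
size_t strip_in_place(char *s)
{
    assert(s != NULL);
    size_t len = strlen(s);
    size_t start = 0;
    while (start < len && isspace((unsigned char)s[start]))
        start++;                          /* skip leading white space */
    while (len > start && isspace((unsigned char)s[len - 1]))
        len--;                            /* drop trailing white space */
    memmove(s, s + start, len - start);   /* regions may overlap */
    s[len - start] = '\0';
    return len - start;
}
```

Calling strip_in_place(line) instead of strip(&line) leaves line pointing at the same allocation, so the size recorded in linelen stays accurate for the next getline() call.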

Resources