Storing words from text file into char array using feof - c

I have a text file that looks like this:
zero three two one five zero zero five seven .. etc
There is a lot of it: 9054 words, to be exact.
My idea was to create a char array with 9054 slots and store the words in it. This is what I have done so far:
#include <stdio.h>
int main(void)
{
char tmp;
int i = 0;
int j = 0;
char array[44000];
FILE *in_file;
in_file = fopen("in.txt", "r");
// Read file in to array
while (!feof(in_file))
{
fscanf(in_file,"%c",&tmp);
array[i] = tmp;
i++;
}
// Display array
while (j<i)
{
printf("%c",array[j]);
j++;
}
fclose(in_file);
while(1);
return 0;
}
The problem is that I don't know how to store words: what I have so far stores each character in the array, so it ends up as an array of around 44000 characters. How can I make the array hold words instead?
Also, I have no idea what the feof function does, especially in the line
while (!feof(in_file))
What exactly does this line mean? Sorry, I am still in the baby stages of learning C. I tried looking up what feof does, but there is not much to find.

Rather than checking feof(), which only tells you whether end-of-file was hit by a previous input operation, check the result of fscanf().
Read "words" with "%s" and limit the maximum number of chars to be read:
char buf[100];
fscanf(in_file,"%99s",buf);
Putting that together:
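#define WORD_SIZE_MAX 20
#define WORD_COUNT_MAX 10000
char word_list[WORD_COUNT_MAX][WORD_SIZE_MAX];
unsigned word_i;
for (word_i = 0; word_i < WORD_COUNT_MAX; word_i++) {
    // "%19s" leaves room for the terminating '\0' in a 20-byte slot
    if (fscanf(in_file, "%19s", word_list[word_i]) != 1) {
        break;
    }
}
// word_i now holds the number of words read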
Another approach is to use OP code nearly as is. Read the whole file into 1 array. Then on printing, skip white-space.
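To make the point about feof() concrete: feof() only becomes true after a read has already failed, so a `while (!feof(f))` loop runs its body one extra time on a failed read. Below is a minimal sketch of the fscanf-based loop wrapped as a function. The name read_words and its parameters are my own, not something from the question, and it assumes no word is longer than 19 characters (longer words get split).

```c
#include <stdio.h>

#define WORD_SIZE_MAX 20

/* Reads whitespace-separated words from in until end of file, a read
   failure, or max_words is reached. Returns the number of words stored. */
size_t read_words(FILE *in, char words[][WORD_SIZE_MAX], size_t max_words)
{
    size_t n = 0;
    /* fscanf returns 1 only when a word was actually stored; at end of
       file it returns EOF, so a failed read is never processed. */
    while (n < max_words && fscanf(in, "%19s", words[n]) == 1) {
        n++;
    }
    return n;
}
```

With the file from the question you would declare `char words[10000][WORD_SIZE_MAX];` (preferably static or heap-allocated, since it is about 200 KB) and call `read_words(in_file, words, 10000)`.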

Usually you may use the following steps:
Dump the whole text file to a char buffer.
Use strtok to split the char buffer to multiple tokens or words.
Use an array of pointer to char to store individual words.
Something along these lines would do. Note that I used your question title as the text file. You will need to replace 20 with a limit appropriate for your file.
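#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
    FILE *in_file = fopen("in.txt", "r");
    if (in_file == NULL)
        return 1;
    fseek(in_file, 0, SEEK_END);
    long fsize = ftell(in_file);
    fseek(in_file, 0, SEEK_SET);
    if (fsize < 0) { fclose(in_file); return 1; }
    char *buf = malloc(fsize + 1);
    if (buf == NULL) { fclose(in_file); return 1; }
    size_t nread = fread(buf, 1, fsize, in_file); // Dump the whole file to a char buffer.
    buf[nread] = '\0'; // strtok needs a null-terminated string
    fclose(in_file);
    char *items[20] = { NULL };
    char *pch = strtok(buf, " \t\n");
    int i = 0;
    while (pch != NULL && i < 20) // stop before overflowing items
    {
        items[i++] = pch;
        pch = strtok(NULL, " \t\n");
    }
    for (i = 0; i < 20; i++)
    {
        if (items[i] != NULL)
        {
            printf("items[%d] = %s\n", i, items[i]);
        }
    }
    free(buf); // items[] point into buf, so free it only when done with them
    return 0;
}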
Output:
items[0] = Storing
items[1] = words
items[2] = from
items[3] = textfile
items[4] = into
items[5] = char
items[6] = array
items[7] = using
items[8] = feof?
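If you do not know the number of words in advance (the fixed 20 above), the pointer array can grow as tokens are found. This is only a sketch under my own assumptions: split_words is a made-up helper name, the buffer is already null-terminated, and the doubling growth policy is an arbitrary choice.

```c
#include <stdlib.h>
#include <string.h>

/* Splits buf in place on whitespace and returns a malloc'd array of
   pointers into buf; *count receives the number of words.
   Returns NULL if there are no words or if allocation fails. */
char **split_words(char *buf, size_t *count)
{
    char **items = NULL;
    size_t used = 0, cap = 0;

    for (char *tok = strtok(buf, " \t\r\n"); tok != NULL;
         tok = strtok(NULL, " \t\r\n")) {
        if (used == cap) {
            /* grow the pointer array: 16 slots first, then double */
            size_t new_cap = cap ? cap * 2 : 16;
            char **tmp = realloc(items, new_cap * sizeof *tmp);
            if (tmp == NULL) {
                free(items);
                *count = 0;
                return NULL;
            }
            items = tmp;
            cap = new_cap;
        }
        items[used++] = tok;
    }
    *count = used;
    return items;
}
```

The returned pointers refer to memory inside buf, so free the pointer array first and buf afterwards, and do not touch the words once buf is gone.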

Related

C, values in array of pointers disappear (pointers)

I seem to be losing the reference to my pointers here. I don't know why, but I suspect it's the pointer returned by fgets that messes this up.
I was told a good way to read words from a file was to get each line and then separate the words with strtok, but how can I do this if the pointers in words[i] keep disappearing?
text
Natural Reader is
john make tame
Result I'm getting:
array[0] = john
array[1] = e
array[2] =
array[3] = john
array[4] = make
array[5] = tame
int main(int argc, char *argv[]) {
FILE *file = fopen(argv[1], "r");
int ch;
int count = 0;
while ((ch = fgetc(file)) != EOF){
if (ch == '\n' || ch == ' ')
count++;
}
fseek(file, 0, SEEK_END);
size_t size = ftell(file);
fseek(file, 0, SEEK_SET);
char** words = calloc(count, size * sizeof(char*) +1 );
int i = 0;
int x = 0;
char ligne [250];
while (fgets(ligne, 80, file)) {
char* word;
word = strtok(ligne, " ,.-\n");
while (word != NULL) {
for (i = 0; i < 3; i++) {
words[x] = word;
word = strtok(NULL, " ,.-\n");
x++;
}
}
}
for (i = 0; i < count; ++i)
if (words[i] != 0){
printf("array[%d] = %s\n", i, words[i]);
}
free(words);
fclose(file);
return 0;
}
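strtok does not allocate any memory; it returns a pointer into the buffer you passed it, after overwriting the delimiter that ends the token with '\0'. Because every fgets call reuses ligne, the pointers saved from earlier lines end up pointing at the new line's text.
Therefore you need to allocate memory for each word if you want to keep it between loop iterations,
e.g.
words[x] = strdup(word); // instead of words[x] = word; strdup is POSIX (and C23), and each copy must be free()d when done
You could also handle this by using a separate ligne buffer for each line read, so make it an array of strings like so: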
char ligne[20][80]; // no need to make the string 250 since fgets limits it to 80
Then your while loop changes to:
int lno = 0;
while (fgets(ligne[lno], 80, file)) {
char *word;
word = strtok(ligne[lno], " ,.-\n");
while (word != NULL) {
words[x++] = word;
word = strtok(NULL, " ,.-\n");
}
lno++;
}
Adjust the first subscript as needed for the maximum number of lines in the file, or dynamically allocate the line buffer during each iteration if you don't want such a low limit. You could also use getline (a POSIX function) instead of fgets, if your implementation supports it; it can handle the allocation for you, though you then need to free the blocks when you are done.
If you are processing real-world prose, you might want to include other delimiters in your list, like colon, semicolon, exclamation point, and question mark.
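The output in the question follows directly from the buffer reuse: the three pointers saved from the first line sit at offsets 0, 8 and 15 of ligne, and once "john make tame" has been read and tokenized those offsets hold "john", "e" and an empty string. Here is a minimal sketch of the copy-each-token fix that avoids strdup, in case your library lacks it. The name copy_tokens and its parameter list are my own invention for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Copies each token of line into freshly allocated storage, so the
   results survive later changes to line. line itself is modified by
   strtok. Returns the number of tokens stored in out (at most max). */
size_t copy_tokens(char *line, const char *delims, char **out, size_t max)
{
    size_t n = 0;
    for (char *tok = strtok(line, delims); tok != NULL && n < max;
         tok = strtok(NULL, delims)) {
        size_t len = strlen(tok) + 1;      /* include the '\0' */
        char *copy = malloc(len);
        if (copy == NULL)
            break;                         /* out of memory: stop early */
        memcpy(copy, tok, len);
        out[n++] = copy;
    }
    return n;
}
```

Each stored pointer has to be passed to free() once you are finished with the words.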

ascii file processing in C

I have a hard time understanding how you process ASCII files in C. I have no problem opening and closing files, or reading files with one value on each line. However, when the data is separated by characters, I really don't understand what the code is doing at a lower level.
Example: I have a file containing names separated by commas that looks like this:
"MARY","PATRICIA","LINDA","BARBARA","ELIZABETH","JENNIFER"
I have created an array to store them:
char names[6000][20];
And now, my code to process it is while (fscanf(data, "\"%s\",", names[index]) != EOF) { index++; }
The code executes for the 1st iteration and names[0] contains the whole file.
How can I separate all the names?
Here is the full code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main() {
char names[6000][20]; // an array to store 6k names of max length 19
FILE * data = fopen("./022names.txt", "r");
int index = 0;
int nbNames;
while (fscanf(data, "\"%s\",", names[index]) != EOF) {
index++;
}
nbNames = index;
fclose(data);
printf("%d\n", index);
for (index=0; index<nbNames; index++) {
printf("%s \n", names[index]);
}
printf("\n");
return 0;
}
PS: I am thinking this might also be because of the data structure of my array.
If you want a simple solution, you can read the file character by character using fgetc. Since there are no newlines in the file, just ignore quotation marks and move to the next index when you find a comma.
char names[6000][20]; // an array to store 6k names of max length 19
FILE * data = fopen("./022names.txt", "r");
int name_count = 0, current_name_ind = 0;
int c;
while ((c = fgetc(data)) != EOF) {
if (c == ',') {
names[name_count][current_name_ind] = '\0';
current_name_ind = 0;
++name_count;
} else if (c != '"') {
names[name_count][current_name_ind] = c;
++current_name_ind;
}
}
names[name_count][current_name_ind] = '\0';
fclose(data);
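As for why names[0] ended up with the whole file: "%s" keeps reading until it meets whitespace, and the file contains none, so the first conversion swallows everything after the opening quote (and overruns the 20-byte slot). If you would rather stay with fscanf, a scanset can stop at the closing quote instead. This is a sketch, not a drop-in replacement; read_quoted_names is a hypothetical helper, and it assumes every name is non-empty and at most 19 characters long.

```c
#include <stdio.h>

#define NAME_LEN 20

/* Reads names written as "A","B","C" from fp into names[].
   Returns the number of names stored (at most max_names).
   Parsing stops at the first name that is empty or too long. */
size_t read_quoted_names(FILE *fp, char names[][NAME_LEN], size_t max_names)
{
    size_t n = 0;
    /* The leading space skips whitespace, \" matches the opening quote,
       %19[^\"] stores up to 19 characters that are not a quote, and the
       final \" consumes the closing quote. */
    while (n < max_names && fscanf(fp, " \"%19[^\"]\"", names[n]) == 1) {
        n++;
        if (fscanf(fp, " ,") == EOF)   /* skip the separator, if present */
            break;
    }
    return n;
}
```

Called as `read_quoted_names(data, names, 6000)`, it fills names[0], names[1], and so on, and the return value replaces the manual index counter.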
"The code executes for the 1st iteration and names[0] contains the whole file...., How can I separate all the names?"
Regarding the first few statements:
char names[6000][20]; // an array to store 6k names of max length 19
FILE * data = fopen("./022names.txt", "r");
What if there are 6001 names? Or what if one of the names has more than 19 characters?
Or what if there are far fewer than 6000 names?
The point is that by taking some effort to enumerate the tasks you have listed, and some time to map out what information the code needs in order to meet your criteria, you can create a better product. The following is derived from your post:
Process ascii files in c
Read file content that is separated by characters
input is a comma separated file, with other delimiters as well
Choose a method best suited to parse a file of variable size
As mentioned in the comments under your question, there are ways to write your algorithms so that they flexibly allow for extra-long names or a variable number of names. This can be done using a few C standard functions commonly used in parsing files. (Although fscanf() has its place, it is not the best option for parsing file contents into array elements.)
The following approach performs these steps to accomplish the needs enumerated above:
Read file to determine number of, and longest element
Create array sized to contain exact contents of file using count of elements and longest element using variable length array (VLA)
Create function to parse file contents into array. (using this technique of passing VLA as function argument.)
Following is a complete example of how to implement each of these, while breaking the tasks into functions when appropriate...
Note, code below was tested using the following input file:
names.txt
"MARY","PATRICIA","LINDA","BARBARA","ELIZABETH","JENNIFER",
"Joseph","Bart","Daniel","Stephan","Karen","Beth","Marcia",
"Calmazzothoulumus"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
//Prototypes
int count_names(const char *filename, size_t *count);
size_t filesize(const char *fn);
void populateNames(const char *fn, int longest, char arr[][longest]);
char *filename = ".\\names.txt";
int main(void)
{
size_t count = 0;
int longest = count_names(filename, &count);
char names[count][longest+1];//VLA - See linked info
// +1 is room for null termination
memset(names, 0, sizeof names);
populateNames(filename, longest+1, names);
return 0;
}
//populate VLA with names in file
void populateNames(const char *fn, int longest, char names[][longest])
{
char line[80] = {0};
char *delim = "\",\n ";
char *tok = NULL;
FILE * fp = fopen(fn, "r");
if(fp)
{
int i=0;
while(fgets(line, sizeof line, fp))
{
tok = strtok(line, delim);
while(tok)
{
strcpy(names[i], tok);
tok = strtok(NULL, delim);
i++;
}
}
fclose(fp);
}
}
//passes back count of tokens in file, and return longest token
int count_names(const char *filename, size_t *count)
{
int len=0, lenKeep = 0;
FILE *fp = fopen(filename, "r");
if(fp)
{
char *tok = NULL;
char *delim = "\",\n ";
int cnt = 0;
size_t fSize = filesize(filename);
char *buf = calloc(fSize + 1, 1); // +1 so a one-line file fits together with its terminator
while(buf && fgets(buf, (int)(fSize + 1), fp)) //goes to newline for each get
{
tok = strtok(buf, delim);
while(tok)
{
cnt++;
len = strlen(tok);
if(lenKeep < len) lenKeep = len;
tok = strtok(NULL, delim);
}
}
*count = cnt;
fclose(fp);
free(buf);
}
return lenKeep;
}
//return file size in bytes (binary read)
size_t filesize(const char *fn)
{
size_t size = 0;
FILE*fp = fopen(fn, "rb");
if(fp)
{
fseek(fp, 0, SEEK_END);
size = ftell(fp);
fseek(fp, 0, SEEK_SET);
fclose(fp);
}
return size;
}
You can use the standard strtok() function, which is easy to use.
I use tok+1 instead of tok to skip the opening " and copy strlen(tok)-2 characters to drop the closing ".
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main() {
char names[6000][20]; // an array to store 6k names of max length 19
FILE * data = fopen("./022names.txt", "r");
int index = 0;
int nbNames;
char *str = (char*)malloc(120000*sizeof(char));
while (fscanf(data, "%s", str) != EOF) {
char *tok = strtok(str, ",");
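while(tok != 0){
size_t len = strlen(tok);
if (len >= 2 && len - 2 < sizeof names[0] && index < 6000) {
memcpy(names[index], tok+1, len-2);
names[index][len-2] = '\0'; // strncpy alone would leave the name unterminated
index++;
}
tok = strtok(0, ",");
}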
}
nbNames = index;
fclose(data);
free(str); // just to free the memory occupied by the str variable in the heap.
printf("%d\n", index);
for (index=0; index<nbNames; index++) {
printf("%s \n", names[index]);
}
printf("\n");
return 0;
}
Also, the 120000 passed to malloc is meant as an upper bound on the number of characters in the file: 6000 names * 20 characters, the figures from your question.

Find text between the opening and closing parentheses in a text file and read/print it into a buffer, in C

I am new to C and am getting very frustrated with learning this language. Currently I'm trying to write a program that reads in a program text file and prints all the string literals and tokens, each on a separate line. I have most of it except for one snag: within the text file there are lines such as (..text..). I need to be able to find, read, and print all the text inside the parentheses on its own line. Here is the idea I have so far:
#define KEY 32
#define BUFFER_SIZE 500
FILE *fp, *fp2;
int main()
{
char ch, buffer[BUFFER_SIZE], operators[] = "+-*%=", separators[] = "(){}[]<>,";
char *pus;
char source[200 + 1];
int i, j = 0, k = 0;
char *words = NULL, *word = NULL, c;
fp = fopen("main.txt", "r");
fp2 = fopen ("mynewfile.txt","w") ;
while ((ch = fgetc(fp)) != EOF)
{
// pus[k++] = ch;
if( ch == '(')
{
for ( k = 0;, k < 20, K++){
buffer[k] = ch;
buffer[k] = '\0';
}
printf("%s\n", buffer)
}
....
The textfile is this:
#include <stdio.h>
int main(int argc, char **argv)
{
for (int i = 0; i < argc; ++i)
{
printf("argv[%d]: %s\n", i, argv[i]);
}
}
So far I've been able to read char by char and place it into a buffer. But this idea just isn't working, and I'm stumped. I've tried dabbling with strcpy() and strtok(), but they all take char arrays. Any ideas would be appreciated, thank you.
Most likely the best way would be to use fgets() with a file to read in each line as a string (char array) and then delimit that string. See the short example below:
char buffer[BUFFER_SIZE];
int current_line = 0;
//Continually read in lines until nothing is left...
while(fgets(buffer, BUFFER_SIZE - 1, fp) != NULL)
{
//Line from file is now in buffer. We can delimit it.
char copy[BUFFER_SIZE];
//Copy as strtok will overwrite a string.
strcpy(copy, buffer);
printf("Line: %d - %s", current_line, buffer); //Print the line.
char * found = strtok(copy, separators); //Will delmit based on the separators.
while(found != NULL)
{
printf("%s", found);
found = strtok(NULL, separators);
}
current_line++;
}
strtok returns a pointer to the start of the next token, not to the delimiter. It overwrites the delimiter that ends the token with a null terminator, thereby making a "new" string inside the buffer it was given (which is why the code works on a copy). Passing NULL tells strtok to continue where it left off. Using this, we can parse a file line by line based on multiple delimiters. You could save these individual strings (copy them first, since the buffer is reused for every line) or evaluate them further.

C loop to read lines of input

I want to create a program in C that takes an arbitrary number of lines of arbitrary length as input and then prints to console the last line that was inputted. For example:
input:
hi
my name is
david
output: david
I figured the best way to do this would be to have a loop that takes each line as input and stores it in a char array, so at the end of the loop the last line ends up being what is stored in the char array and we can just print that.
I have only had one lecture in C so far so I think I just keep setting things up wrong with my Java/C++ mindset since I have more experience in those languages.
Here is what I have so far but I know that it's nowhere near correct:
#include <stdio.h>
int main()
{
printf("Enter some lines of strings: \n");
char line[50];
for(int i = 0; i < 10; i++){
line = getline(); //I know this is inproper syntax but I want to do something like this
}
printf("%s",line);
}
I also have i < 10 in the loop because I don't know how to find the total number of lines in the input, which would be the proper number of times to loop. Also, the input is fed in all at once via the
./program < test.txt
command in Unix shell, where test.txt has the input.
Use fgets():
while (fgets(line, sizeof line, stdin)) {
// don't need to do anything here
}
printf("%s", line);
You don't need a limit on the number of iterations. At the end of the file, fgets() returns NULL and doesn't modify the buffer, so line will still hold the last line that was read.
I'm assuming you know the maximum length of the input line.
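One caveat with the loop above: if the input is empty, line is never written and printing it is undefined. A small sketch that reports whether anything was read and also strips the trailing newline follows; read_last_line is my own name for it. It still assumes no line is longer than the buffer, because fgets would otherwise hand a long line back in pieces and the "last line" would be the last piece.

```c
#include <stdio.h>
#include <string.h>

/* Leaves the last line of fp in line, without its trailing newline.
   Returns 1 if at least one line was read, 0 on empty input. */
int read_last_line(FILE *fp, char *line, size_t size)
{
    int got_line = 0;
    if (size == 0)
        return 0;
    line[0] = '\0';
    /* Each successful fgets overwrites line; the failing call at end of
       file leaves the previous contents alone. */
    while (fgets(line, (int)size, fp) != NULL) {
        got_line = 1;
    }
    if (got_line)
        line[strcspn(line, "\n")] = '\0';   /* strip the newline, if any */
    return got_line;
}
```

For the question's use case: `char line[256]; if (read_last_line(stdin, line, sizeof line)) printf("%s\n", line);`.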
Something like this should do the job:
static char *getLine( char * const b , size_t bsz ) {
    return fgets(b, bsz, stdin);
}
But remember that fgets also stores the '\n' at the end of the buffer (when the whole line fits), so perhaps something like this:
static char *getLine( char * const b , size_t bsz ) {
if( fgets(b, bsz, stdin) ){
/* Optional code to strip NextLine */
size_t size = strlen(b);
if( size > 0 && b[size-1] == '\n' ) {
b[--size] = '\0';
}
/* End of Optional Code */
return b;
}
return NULL;
}
and your code needs to be altered a bit where it calls getLine:
#define BUF_SIZE 256
char line[BUF_SIZE];
for(int i = 0; i < 10; i++){
if( getLine(line, BUF_SIZE ) ) {
fprintf(stdout, "line : '%s'\n", line);
}
}
It is, however, quite possible to create a function like
char *getLine();
but then you need to define its behavior. For instance, if getLine() allocates memory dynamically, the caller has to free() the pointer it returns,
in which case the function may look like
char *getLine( size_t bsz ) {
    char *b = malloc( bsz );
    if( b && fgets(b, bsz, stdin) ){
        return b;
    }
    free( b ); /* nothing was read: release the buffer instead of leaking it */
    return NULL;
}
Depending on how small your function is, you could consider making it inline, but that is a little off topic for now.
To handle an arbitrary number of lines of arbitrary length, you have to keep reallocating your buffer whenever the input outgrows it. To store the last line, you need a second pointer to keep track of it, and to end input typed at the terminal you have to send end-of-file (Ctrl+D on Unix-like systems, Ctrl+Z then Enter on Windows). This should do the job.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *get_last_line(FILE *fp, size_t size){
    // size is the initial capacity; the buffer grows as the input requires
    char *str, *last_str = NULL, *tmp;
    int ch;
    size_t len = 0;
    if (size == 0) size = 16;
    str = malloc(size);
    if (!str) return NULL;
    while ((ch = fgetc(fp)) != EOF) {
        if (ch == '\n') {
            str[len] = '\0';
            // store the line just completed; len + 1 bytes include the terminator
            tmp = realloc(last_str, len + 1);
            if (!tmp) { free(last_str); free(str); return NULL; }
            last_str = tmp;
            memcpy(last_str, str, len + 1);
            len = 0; // reuse str for the next line
        }
        else {
            str[len++] = ch;
            if (len == size) { // always keep room for the terminator
                tmp = realloc(str, size += 16);
                if (!tmp) { free(last_str); free(str); return NULL; }
                str = tmp;
            }
        }
    }
    if (len > 0) { // the final line had no trailing newline
        str[len] = '\0';
        free(last_str);
        return str;
    }
    free(str);
    return last_str; // NULL if the input was empty
}
int main(void){
char *m;
printf("input strings : ");
m = get_last_line(stdin, 10);
printf("last string :");
printf("%s\n", m ? m : "(no input)"); // m is NULL when nothing was read
free(m);
return 0;
}

Buffer to array (segmentation fault)

I'm trying to open a file, read the content line by line (excluding the empty lines) and store all these lines in an array, but seems I cannot come to the solution.
#include <stdio.h>
#include <stdlib.h>
int main()
{
char buffer[500];
FILE *fp;
int lineno = 0;
int n;
char topics[lineno];
if ((fp = fopen("abc.txt","r")) == NULL){
printf("Could not open abc.txt\n");
return(1);
}
while (!feof(fp))
{
// read in the line and make sure it was successful
if (fgets(buffer,500,fp) != NULL){
if(buffer[0] == '\n'){
}
else{
strncpy(topics[lineno],buffer, 50);
printf("%d: %s",lineno, topics[lineno]);
lineno++;
printf("%d: %s",lineno, buffer);
}
}
}
return(0);
}
Considering "abc.txt" contains four lines (the third one is empty) like the following:
ab
2
4
I have been trying several ways but all I'm getting now is segmentation fault.
The crash comes mostly from trying to store the line you read in a zero-length array:
int lineno = 0;
int n;
char topics[lineno]; //lineno is 0 here
There are more mistakes in your program that will show up once you correct the one above.
strncpy() needs a char* as its first parameter, and you are passing it a char. It is also declared in <string.h>, which you have not included.
If you want to store all the lines so that array[0] is the first line, array[1] is the next one, and so on, then you need an array of char pointers.
Something like this:
char* topics[100];
.
.
.
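if (fgets(buffer,500,fp) != NULL){
    if(buffer[0] == '\n'){
    }
    else if (lineno < 100 && (topics[lineno] = malloc(128)) != NULL){
        strncpy(topics[lineno], buffer, 127);
        topics[lineno][127] = '\0'; // strncpy does not terminate a truncated copy
        printf("%d: %s",lineno, topics[lineno]);
        lineno++;
        printf("%d: %s",lineno, buffer);
    }
}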
NOTE:
Use the standard definition of main()
int main(void) //if no command line arguments.
Bonus
Since you have accidentally stumbled onto zero-length arrays, do read up on them.
This declaration of a variable length array
int lineno = 0;
char topics[lineno];
is invalid because the size of an array may not be 0, and a zero-length array would make no sense in this program anyway.
You could dynamically allocate an array of pointers to char (that is, of type char *) and reallocate it each time a new record is added.
For example
int lineno = 0;
int n;
char **topics = NULL;
//...
char **tmp = realloc( topics, ( lineno + 1 ) * sizeof( char * ) );
if ( tmp != NULL )
{
topics = tmp;
topics[lineno] = malloc( 50 * sizeof( char ) );
//... copy the string and so on
++lineno;
}
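Putting the pieces together, here is one way the whole read loop could look. Treat it as a sketch rather than the definitive fix: read_nonempty_lines is a name I chose, lines longer than 499 characters would be split by fgets, and each string is allocated to its exact length instead of a fixed 50 bytes.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Reads every non-empty line of fp into a realloc-grown array of strings.
   Sets *count to the number of lines stored and returns the array.
   The caller frees each string and then the array itself. */
char **read_nonempty_lines(FILE *fp, size_t *count)
{
    char buffer[500];
    char **topics = NULL;
    size_t lineno = 0;

    while (fgets(buffer, sizeof buffer, fp) != NULL) {
        buffer[strcspn(buffer, "\n")] = '\0';   /* drop the newline */
        if (buffer[0] == '\0')
            continue;                           /* skip empty lines */

        /* make room for one more pointer */
        char **tmp = realloc(topics, (lineno + 1) * sizeof *tmp);
        if (tmp == NULL)
            break;
        topics = tmp;

        /* store a copy of the line, terminator included */
        size_t len = strlen(buffer) + 1;
        topics[lineno] = malloc(len);
        if (topics[lineno] == NULL)
            break;
        memcpy(topics[lineno], buffer, len);
        lineno++;
    }
    *count = lineno;
    return topics;
}
```

Note that the loop tests the result of fgets directly, so no feof() call is needed.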
