I am doing a project where I have to read text from a file, extract every word that is 4 characters long, and store those words in a dynamic array. My approach is to write an int function that counts the 4-letter words and returns that number, then write another function that takes that number and creates a dynamic array with that many elements. The problem with this approach is how to populate the array with the words that meet the requirement.
int func1(FILE *pFile){
int counter = 0;
int words = 0;
char inputWords[length];
while(fscanf(pFile,"%s",inputWords) != EOF){
if(strlen(inputWords)==4){
#counting 4 letter words
counter++;
}
}
}
return counter;
}
int main(){
#creating pointer to a textFile
FILE *pFile = fopen("smallDictionary.txt","r");
int line = 0;
#sending pointer into a function
func1(pFile);
fclose(pFile);
return 0;
}
I would suggest reading lines of input with fgets(), and breaking each line into tokens with strtok(). As each token is found, the length can be checked, and if the token is four characters long it can be saved to an array using strdup().
In the code below, storage is allocated for pointers to char which will store the addresses of four-letter words. num_words holds the number of four-letter words found, and max_words holds the maximum number of words that can currently be stored. When a new word needs to be added, num_words is incremented, and if there is not enough storage, more space is allocated. Then strdup() is used to duplicate the token, and the address is assigned to the next pointer in words.
Note that strdup() was not part of the C Standard Library before C23, but it is specified by POSIX. The feature test macro in the first line of the program may be needed to enable this function. Also note that strdup() allocates memory for the duplicated string, which must be freed by the caller.
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define BUF_SZ 1000
#define ALLOC_INC 100
int main(void)
{
FILE *fp = fopen("filename.txt", "r");
if (fp == NULL) {
perror("Unable to open file");
exit(EXIT_FAILURE);
}
char buffer[BUF_SZ];
char **words = NULL;
size_t num_words = 0;
size_t max_words = 0;
char *token;
char *delims = " \t\r\n";
while (fgets(buffer, sizeof buffer, fp) != NULL) {
token = strtok(buffer, delims);
while (token != NULL) {
if (strlen(token) == 4) {
++num_words;
if (num_words > max_words) {
max_words += ALLOC_INC;
char **temp = realloc(words, sizeof *temp * max_words);
if (temp == NULL) {
perror("Unable to allocate memory");
exit(EXIT_FAILURE);
}
words = temp;
}
words[num_words-1] = strdup(token);
if (words[num_words-1] == NULL) {
perror("Unable to allocate memory");
exit(EXIT_FAILURE);
}
}
token = strtok(NULL, delims);
}
}
if (fclose(fp) != 0) {
perror("Unable to close file");
exit(EXIT_FAILURE);
}
for (size_t i = 0; i < num_words; i++) {
puts(words[i]);
}
/* Free allocated memory */
for (size_t i = 0; i < num_words; i++) {
free(words[i]);
}
free(words);
return 0;
}
Update
OP has mentioned that nonstandard functions are not permitted in solving this problem. Though strdup() is POSIX, and both common and standard in this sense, it is not always available. In such circumstances it is common to simply implement strdup(), as it is straightforward to do so. Here is the above code, modified so that now the function my_strdup() is used in place of strdup(). The code is unchanged, except that the feature test macro has been removed, the call to strdup() has been changed to my_strdup(), and of course now there is a function prototype and a definition for my_strdup():
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define BUF_SZ 1000
#define ALLOC_INC 100
char * my_strdup(const char *);
int main(void)
{
FILE *fp = fopen("filename.txt", "r");
if (fp == NULL) {
perror("Unable to open file");
exit(EXIT_FAILURE);
}
char buffer[BUF_SZ];
char **words = NULL;
size_t num_words = 0;
size_t max_words = 0;
char *token;
char *delims = " \t\r\n";
while (fgets(buffer, sizeof buffer, fp) != NULL) {
token = strtok(buffer, delims);
while (token != NULL) {
if (strlen(token) == 4) {
++num_words;
if (num_words > max_words) {
max_words += ALLOC_INC;
char **temp = realloc(words, sizeof *temp * max_words);
if (temp == NULL) {
perror("Unable to allocate memory");
exit(EXIT_FAILURE);
}
words = temp;
}
words[num_words-1] = my_strdup(token);
if (words[num_words-1] == NULL) {
perror("Unable to allocate memory");
exit(EXIT_FAILURE);
}
}
token = strtok(NULL, delims);
}
}
if (fclose(fp) != 0) {
perror("Unable to close file");
exit(EXIT_FAILURE);
}
for (size_t i = 0; i < num_words; i++) {
puts(words[i]);
}
/* Free allocated memory */
for (size_t i = 0; i < num_words; i++) {
free(words[i]);
}
free(words);
return 0;
}
char * my_strdup(const char *str)
{
size_t sz = strlen(str) + 1;
char *dup = malloc(sizeof *dup * sz);
if (dup) {
strcpy(dup, str);
}
return dup;
}
Final Update
OP had not posted code in the question when the above solution was written. The posted code does not compile as is. In addition to missing #includes and various syntax errors (an extra closing brace, # used for comments instead of // or /* */), there are a couple of more significant issues. In func1(), the length variable is used as an array size but is never declared or given a value. The array size should be large enough that inputWords[] can hold any expected word. Also, a width specifier should be used with %s in fscanf() format strings to avoid buffer overflow. The code should also check whether the file opened successfully. Finally, func1() returns a value, but the calling function never assigns this value to a variable.
To complete the task, the value returned from func1() should be used to declare a 2d array (a variable-length array, so the compiler must support those) to store the four-letter words. The file can be rewound, and this time, as fscanf() retrieves words in a loop, strcpy() is used to copy each word of length 4 into the array.
Here is a modified version of OP's code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_WORD 100
int func1(FILE *pFile){
int counter = 0;
char inputWords[MAX_WORD];
while(fscanf(pFile,"%99s",inputWords) != EOF) {
if(strlen(inputWords) == 4) {
counter++;
}
}
return counter;
}
int main(void)
{
FILE *pFile = fopen("filename.txt","r");
if (pFile == NULL) {
perror("Unable to open file");
exit(EXIT_FAILURE);
}
char inputWords[MAX_WORD];
int num_4words = func1(pFile);
char words[num_4words][MAX_WORD];
int counter = 0;
rewind(pFile);
while(fscanf(pFile,"%99s",inputWords) != EOF) {
if(strlen(inputWords) == 4) {
strcpy(words[counter], inputWords);
counter++;
}
}
if (fclose(pFile) != 0) {
perror("Unable to close file");
}
for (int i = 0; i < num_4words; i++) {
puts(words[i]);
}
return 0;
}
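The variable-length array in the code above lives on the stack, and its size must be greater than zero, so a file with no four-letter words (or with a very large number of them) can cause trouble. Here is a minimal sketch of the same two-pass idea with the array allocated on the heap instead. The names collect_len4, word_row and MAX_WORD_LEN are made up for this illustration and are not part of OP's code:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_WORD_LEN 100   /* assumed upper bound on word length */

/* One fixed-width row of the 2d array. */
typedef char word_row[MAX_WORD_LEN];

/* Hypothetical helper: makes two passes over fp and returns a
   heap-allocated array of rows holding every 4-character word.
   *count receives the number of rows. Returns NULL if no such word
   was found or if allocation failed. The caller frees the result
   with a single free(). */
word_row *collect_len4(FILE *fp, size_t *count)
{
    char word[MAX_WORD_LEN];
    size_t n = 0;

    *count = 0;

    /* Pass 1: count the matching words. */
    while (fscanf(fp, "%99s", word) == 1) {
        if (strlen(word) == 4) {
            n++;
        }
    }
    if (n == 0) {
        return NULL;
    }

    word_row *rows = malloc(n * sizeof *rows);
    if (rows == NULL) {
        return NULL;
    }

    /* Pass 2: rewind and copy the matching words. */
    rewind(fp);
    size_t i = 0;
    while (i < n && fscanf(fp, "%99s", word) == 1) {
        if (strlen(word) == 4) {
            strcpy(rows[i++], word);
        }
    }
    *count = i;
    return rows;
}
```

Since all rows live in one block, a single free(rows) releases everything, which is simpler than freeing each word separately.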
Related
I am reading a file that contains several lines of strings (max length 50 characters). To store those strings I created a char double pointer using calloc. The way my code works is that, as it finds a line in the file, it adds one new row (char *) and 50 columns (char) and then stores the value.
My understanding is that I can call this method and get this pointer back with the values in it. However, I was not getting the values, so I checked where I am losing them and found that the memory does not persist after the while loop. I am able to print the strings using the print 1 statement, but print 2 gives me null.
Please let me know what I am doing wrong here.
char **read_file(char *file)
{
FILE *fp = fopen(file, "r");
char line[50] = {0};
char **values = NULL;
int index = 0;
if (fp == NULL)
{
perror("Unable to open file!");
exit(1);
}
// read both sequence
while (fgets(line, 50, fp))
{
values = (char **)calloc(index + 1, sizeof(char *));
values[index] = (char *)calloc(50, sizeof(char));
values[index] = line;
printf("%s",values[index]); // print 1
index++;
}
fclose(fp);
printf("%s", values[0]); // print 2
return values;
}
line content is overwritten on each loop iteration (by fgets()).
values is overwritten (data loss), and the previous allocation leaks, on every iteration after the first (index > 0).
values[index] is allocated memory on each iteration, which leaks because you overwrite it with the address of line on the following line.
line is a local variable, so you cannot return its address to the caller; it no longer exists once the function returns.
caller has no way to tell how many entries values contains.
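Here is a working implementation with a few changes. On error it closes the file, frees the allocated memory and returns NULL instead of exiting. The returned array ends with a NULL pointer so the caller can tell where it stops. The printf() calls have moved to the caller: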
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#define BUF_LEN 50
char **read_file(char *file) {
FILE *fp = fopen(file, "r");
if(!fp) {
perror("Unable to open file!");
return NULL;
}
char **values = NULL;
char line[BUF_LEN];
unsigned index;
for(index = 0;; index++) {
char **values2 = realloc(values, (index + 1) * sizeof(char *));
if(!values2) {
perror("realloc failed");
goto err;
}
values = values2;
if(!fgets(line, BUF_LEN, fp)) break;
values[index] = strdup(line);
if(!values[index]) {
perror("strdup failed");
goto err;
}
}
fclose(fp);
values[index] = NULL;
return values;
err:
fclose(fp);
for(unsigned i = 0; i < index; i++) {
free(values[i]);
}
free(values);
return NULL;
}
int main() {
char **values = read_file("test.txt");
if(!values) {
return 1;
}
for(unsigned i = 0; values[i]; i++) {
printf("%s", values[i]);
free(values[i]);
}
free(values);
return 0;
}
fgets() returns a line ending in '\n', or at most BUF_LEN - 1 characters of data if the line is longer or the file ends without a newline. This means a given values[i] may or may not end with a \n. You may want this behavior, or you may want values[i] to be consistent and never contain a trailing \n regardless of the input.
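If you prefer the second behavior, the newline can be removed right after fgets() and before the line is duplicated. A small sketch of one common way to do it; strip_newline is a hypothetical helper name, not something from the code above:

```c
#include <string.h>

/* Hypothetical helper: cut the string at the first '\r' or '\n', which
   removes the "\n" (or "\r\n") that fgets() leaves at the end of a line.
   Returns the new length. Safe on empty strings and on strings with no
   newline at all. */
size_t strip_newline(char *s)
{
    size_t len = strcspn(s, "\r\n");   /* index of first CR/LF, or of the '\0' */
    s[len] = '\0';
    return len;
}
```

Calling strip_newline(line) on the buffer just before it is copied would make every stored entry newline-free.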
strdup() is declared only when _POSIX_C_SOURCE >= 200809L (it is POSIX, and part of standard C only since C23), so if you build with -std=c11 without defining that macro, the symbol would not be declared.
Hi I was trying to create an array of string of an undetermined length in c.
This is my code :
int main()
{
int lineCount=linesCount();
char text[lineCount][10];
printf("%d",lineCount);
FILE * fpointer = fopen("test.txt","r");
fgets(text,10,fpointer);
fclose(fpointer);
printf("%s",text);
return 0;
}
I would like to replace 10 in
char text[lineCount][10];
My code reads out a file I already made the amount of lines dynamic.
Since the line length is unpredictable I would like to replace 10 by a something dynamic.
Thanks in advance.
To do this cleanly, we want a char * array rather than a 2D char array:
char *text[lineCount];
And, we need to use memory from the heap to store the individual lines.
Also, don't "hardwire" so-called "magic" numbers like 10. Use an enum or #define (e.g. #define MAXWID 10). Note that the solution below avoids the need for the magic number altogether.
Also, note the use of sizeof(buf) below instead of a magic number.
And, we want [separate] loops when reading and printing.
Anyway, here's the refactored code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int
linesCount(void)
{
return 23;
}
int
main(void)
{
int lineCount = linesCount();
char *text[lineCount];
char buf[10000];
printf("%d", lineCount);
// open file and _check_ the return
const char *file = "test.txt";
FILE *fpointer = fopen(file, "r");
if (fpointer == NULL) {
perror(file);
exit(1);
}
int i = 0;
while (fgets(buf, sizeof(buf), fpointer) != NULL) {
// strip newline
buf[strcspn(buf,"\n")] = 0;
// store line -- we must allocate this
text[i++] = strdup(buf);
}
fclose(fpointer);
for (i = 0; i < lineCount; ++i)
printf("%s\n", text[i]);
return 0;
}
UPDATE:
The above code is derived from your original code. But it assumes that the linesCount function correctly predicts the number of lines: it doesn't check against overflow of the fixed-length text array if the file has more lines, and it prints uninitialized pointers if the file has fewer.
Here is a more generalized version that will allow an arbitrary number of lines with varying line lengths:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int
main(void)
{
int lineCount = 0;
char **text = NULL;
char buf[10000];
// open file and _check_ the return
const char *file = "test.txt";
FILE *fpointer = fopen(file, "r");
if (fpointer == NULL) {
perror(file);
exit(1);
}
int i = 0;
while (fgets(buf, sizeof(buf), fpointer) != NULL) {
// strip newline
buf[strcspn(buf,"\n")] = 0;
++lineCount;
// increase number of lines in array
text = realloc(text,sizeof(*text) * lineCount);
if (text == NULL) {
perror("realloc");
exit(1);
}
// store line -- we must allocate this
text[lineCount - 1] = strdup(buf);
}
fclose(fpointer);
// print the lines
for (i = 0; i < lineCount; ++i)
printf("%s\n", text[i]);
// more processing ...
// free the lines
for (i = 0; i < lineCount; ++i)
free(text[i]);
// free the list of lines
free(text);
return 0;
}
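The version above still limits each line to sizeof(buf) - 1 characters; a longer line would be split across several entries. If the line length is truly unpredictable, the buffer itself can grow. The following is only a sketch of that idea in standard C; read_line is a made-up name, and the starting capacity of 64 is an arbitrary choice:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read one line of any length from fp.
   Returns a malloc()ed string without the trailing newline, or NULL
   at end of file or on allocation failure. The caller frees it. */
char *read_line(FILE *fp)
{
    size_t cap = 64;     /* current buffer capacity */
    size_t len = 0;      /* characters stored so far */
    char *line = malloc(cap);

    if (line == NULL) {
        return NULL;
    }

    /* fgets() fills the free tail of the buffer; keep going until a
       newline arrives or the input ends. */
    while (fgets(line + len, (int)(cap - len), fp) != NULL) {
        len += strlen(line + len);
        if (len > 0 && line[len - 1] == '\n') {
            line[--len] = '\0';          /* strip the newline and stop */
            return line;
        }
        if (cap - len < 2) {             /* buffer is full: double it */
            char *tmp = realloc(line, cap * 2);
            if (tmp == NULL) {
                free(line);
                return NULL;
            }
            line = tmp;
            cap *= 2;
        }
    }

    if (len == 0) {      /* nothing was read: end of file or error */
        free(line);
        return NULL;
    }
    return line;         /* last line had no trailing newline */
}
```

With such a helper, the loop could store the returned pointer directly in text[lineCount - 1] instead of calling strdup() on a fixed buffer. On POSIX systems, getline() offers the same service.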
So I have this bit of code
int main(int argc, char *argv[]) {
char *vendas[1];
int size = 1;
int current = 0;
char buffer[50];
char *token;
FILE *fp = fopen("Vendas_1M.txt", "r");
while(fgets(buffer, 50, fp)) {
token = strtok(buffer, "\n");
if (size == current) {
*vendas = realloc(*vendas, sizeof(vendas[0]) * size * 2);
size *= 2;
}
vendas[current] = strdup(token);
printf("%d - %d - %s\n", current, size, vendas[current]);
current++;
}
}
Here's the thing... Using GDB it's giving a segmentation fault on
vendas[current] = strdup(token);
but the weirdest thing is that it works up until the size is at 1024. The size grows up to 1024 and then it just spits out a segmentation fault at around the 1200th element.
I know the problem is on the memory reallocation, because it worked when I had a static array. Just can't figure out what.
You cannot reallocate a local array, you want vendas to be a pointer to an allocated array of pointers: char **vendas = NULL;.
You should also include the proper header files and check for fopen() and realloc() failure.
Here is a modified version:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void free_array(char **array, size_t count) {
while (count > 0) {
free(array[--count]);
}
free(array);
}
int main(int argc, char *argv[]) {
char buffer[50];
char **vendas = NULL;
size_t size = 0;
size_t current = 0;
char *token;
FILE *fp;
fp = fopen("Vendas_1M.txt", "r");
if (fp == NULL) {
printf("cannot open file Vendas_1M.txt\n");
return 1;
}
while (fgets(buffer, sizeof buffer, fp)) {
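token = strtok(buffer, "\n");
if (token == NULL) {
continue; /* empty line: strtok() found no token, and strdup(NULL) is undefined */
}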
if (current >= size) {
char **savep = vendas;
size = (size == 0) ? 4 : size * 2;
vendas = realloc(vendas, sizeof(*vendas) * size);
if (vendas == NULL) {
printf("allocation failure\n");
free_array(savep, current);
return 1;
}
}
vendas[current] = strdup(token);
if (vendas[current] == NULL) {
printf("allocation failure\n");
free_array(vendas, current);
return 1;
}
printf("%zu - %zu - %s\n", current, size, vendas[current]);
current++;
}
/* ... */
/* free allocated memory (for cleanliness) */
free_array(vendas, current);
return 0;
}
You only have room for one (1) pointer in your array char *vendas[1]. So the second time around you are outside the limits of the array and in undefined behavior land.
Also, the first call to realloc passes in a pointer that was not allocated by malloc so there is another undefined behavior.
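Both problems go away when the array starts out as a NULL pointer and only ever grows through realloc(). Below is a minimal sketch of that pattern wrapped in a small struct. The type str_vec and its functions are hypothetical names for illustration, and the copy is made with malloc() and memcpy() so that only standard C is needed:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable array of strings; all names are illustrative. */
struct str_vec {
    char **items;   /* block obtained from realloc(), or NULL when empty */
    size_t count;   /* slots in use */
    size_t cap;     /* slots allocated */
};

/* Append a copy of s. Returns 0 on success, or -1 on allocation
   failure, in which case no element is added. */
int str_vec_push(struct str_vec *v, const char *s)
{
    if (v->count == v->cap) {
        size_t new_cap = (v->cap == 0) ? 4 : v->cap * 2;
        /* realloc(NULL, n) behaves like malloc(n), so the first call is valid. */
        char **tmp = realloc(v->items, new_cap * sizeof *tmp);
        if (tmp == NULL) {
            return -1;          /* v->items is still valid and unchanged */
        }
        v->items = tmp;
        v->cap = new_cap;
    }

    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy == NULL) {
        return -1;
    }
    memcpy(copy, s, n);
    v->items[v->count++] = copy;
    return 0;
}

/* Release every string and the array itself. */
void str_vec_free(struct str_vec *v)
{
    for (size_t i = 0; i < v->count; i++) {
        free(v->items[i]);
    }
    free(v->items);
    v->items = NULL;
    v->count = 0;
    v->cap = 0;
}
```

A vector declared as struct str_vec v = { NULL, 0, 0 }; can then be filled with str_vec_push(&v, token) inside the read loop.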
I have a file with tab-delimited data. I want to read every line into a structure. I have code that reads the data into a char buffer, but I want to load the data into a structure instead.
This is My sample data.
empname1\t001\t35\tcity1
empname2\t002\t35\tcity2
My Structure definition .
struct employee
{
char *empname;
char *empid;
int age;
char *addr;
};
My sample program to read data to a char array buffer
char buffer[BUF_SIZE]; /* Character buffer */
input_fd = open (fSource, O_RDONLY);
if (input_fd == -1) {
perror ("open");
return 2;
}
while((ret_in = read (input_fd, &buffer, BUF_SIZE)) > 0){
// Do Some Process
}
Here I want to load the content into a structure variable instead of the character buffer. How can I achieve that?
Well, a possible solution could be
Read a complete line from the file using fgets().
tokenize the input buffer based on the required delimiter [tab in your case] using strtok().
allocate memory (malloc()/ realloc()) to a pointer variable of your structure.
copy the tokenized inputs into the member variables.
Note:
1. fgets() reads and stores the trailing \n.
2. Please check carefully how to use strtok(). The input string should be mutable.
3. Allocate memory to pointers before using them. IMO, use statically allocated array as struct employee member variables.
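The steps and notes above can be sketched roughly as follows. This is only an illustration under some assumptions: the struct uses fixed-size arrays as note 3 suggests, the field widths and the names employee_rec, parse_employee and read_employees are invented here, and the fields are assumed to be separated by real tab characters:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative record with fixed-size members (widths are assumptions). */
struct employee_rec {
    char empname[64];
    char empid[16];
    int  age;
    char addr[128];
};

/* Parse one tab-delimited line in place (strtok() modifies it).
   Returns 1 on success, 0 if a field is missing or the age is not a
   number. Note that strtok() skips empty fields (consecutive tabs). */
int parse_employee(char *line, struct employee_rec *out)
{
    char *name, *id, *age, *addr, *end;

    line[strcspn(line, "\r\n")] = '\0';   /* drop the newline kept by fgets() */

    name = strtok(line, "\t");
    id   = strtok(NULL, "\t");
    age  = strtok(NULL, "\t");
    addr = strtok(NULL, "\t");
    if (name == NULL || id == NULL || age == NULL || addr == NULL) {
        return 0;
    }

    long value = strtol(age, &end, 10);
    if (end == age || *end != '\0') {
        return 0;
    }

    /* snprintf() truncates instead of overflowing the member arrays. */
    snprintf(out->empname, sizeof out->empname, "%s", name);
    snprintf(out->empid, sizeof out->empid, "%s", id);
    snprintf(out->addr, sizeof out->addr, "%s", addr);
    out->age = (int)value;
    return 1;
}

/* Read every well-formed line of fp into a realloc()ed array.
   *count receives the number of records; the caller frees the array. */
struct employee_rec *read_employees(FILE *fp, size_t *count)
{
    char buf[512];
    struct employee_rec *list = NULL;
    size_t n = 0;

    while (fgets(buf, sizeof buf, fp) != NULL) {
        struct employee_rec rec;
        if (!parse_employee(buf, &rec)) {
            continue;                     /* skip malformed lines */
        }
        struct employee_rec *tmp = realloc(list, (n + 1) * sizeof *tmp);
        if (tmp == NULL) {
            break;                        /* keep what was read so far */
        }
        list = tmp;
        list[n++] = rec;
    }
    *count = n;
    return list;
}
```

Because the members are arrays rather than pointers, each record can be copied by plain assignment and the whole list is released with one free().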
You can use the fscanf function. Open the file as a stream, then use fscanf to read the input from it.
int fscanf(FILE *stream, const char *format, ...);
FILE *fp=fopen(fSource,"r");
struct employee detail;
fscanf(fp,"%s %s %d %s",detail.empname,detail.empid,&detail.age,detail.addr);
Make sure memory has been allocated for the pointer members (empname, empid and addr) before this call, and check that fscanf returns 4. Note that %s stops at any whitespace, so this only works when no field contains a space.
Alternatively, you can split each line with the strtok function; in that case use sscanf to convert the tokens.
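As a hedged illustration of the scanf-family approach, here is a sketch that parses one already-read line with sscanf, using scansets with explicit widths so that fields may contain spaces and cannot overflow. The struct employee_fixed, its member sizes and the function scan_employee are assumptions made for this example only:

```c
#include <stdio.h>

/* Illustrative record with fixed-size members (sizes are assumptions). */
struct employee_fixed {
    char empname[50];
    char empid[16];
    int  age;
    char addr[100];
};

/* Parse one tab-delimited line. Returns 1 if all four fields were
   converted, 0 otherwise.
   %49[^\t]   reads up to 49 characters that are not a tab,
   %99[^\r\n] reads the rest of the line up to the line ending.
   Each width is one less than the size of the destination array. */
int scan_employee(const char *line, struct employee_fixed *e)
{
    int n = sscanf(line, "%49[^\t]\t%15[^\t]\t%d\t%99[^\r\n]",
                   e->empname, e->empid, &e->age, e->addr);
    return n == 4;
}
```

A typical use would be to read each line with fgets() and pass the buffer to scan_employee(), skipping or reporting any line for which it returns 0.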
You can use fscanf to read each line from file, strtok to tokenize the line read.
Since your structure members are pointers, allocate memory appropriately.
The following minimal code does exactly what you want.
#define SIZE 50
FILE *fp = NULL;
int i = 0;
struct employee var = {NULL, NULL, 0, NULL};
char line[SIZE] = {0}, *ptr = NULL;
/* 1. Open file for Reading */
if (NULL == (fp = fopen("file.txt","r")))
{
perror("Error while opening the file.\n");
exit(EXIT_FAILURE);
}
/* 2. Allocate Memory */
var.empname = malloc(SIZE);
var.empid = malloc(SIZE);
var.addr = malloc(SIZE);
/* 3. Read each line from the file (at most SIZE - 1 characters, stopping at whitespace) */
while (1 == fscanf(fp, "%49s", line))
{
/* 4. Tokenise the read line, using "\" delimiter*/
ptr = strtok(line, "\\");
if (NULL == ptr)
continue;
/* Copy each token into the memory allocated in step 2; assigning ptr directly would leak that memory */
strcpy(var.empname, ptr);
while (NULL != (ptr = strtok(NULL, "\\")))
{
i++;
/* 5. Store the tokens as per structure members , where (i==0) is first member and so on.. */
if(i == 1)
strcpy(var.empid, ptr);
else if(i == 2)
var.age = atoi(ptr);
else if (i == 3)
strcpy(var.addr, ptr);
}
i = 0; /* Reset value of i */
printf("After Reading: Name:[%s] Id:[%s] Age:[%d] Addr:[%s]\n", var.empname, var.empid, var.age, var.addr);
}
Working Demo: http://ideone.com/Kp9mzN
Few things to Note here:
This works as long as your structure definition (and the order of its members) remains the same (see the manipulation of the value i).
In strtok(line, "\\");, the first \ in the second argument escapes the second one, so the delimiter is the single character \.
Clarification from the OP:
In your structure definition, the third member is an int; however, because the line is split at each \ character, the token you read into it is t35 (which is a string, not a number).
So var.age = atoi(ptr); will give you 0.
You could change the structure definition, making third member as char * and allocating memory like other members.
Or change file contents, making sure an int is present as the third value.
I think this may be what you are looking for
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/stat.h>
struct employee
{
char *empname;
char *empid;
int age;
char *addr;
};
int readEmploee(char *line, struct employee *employee)
{
char *token;
char *saveptr;
char *endptr;
if ((employee == NULL) || (line == NULL))
return 0;
token = strtok_r(line, "\t", &saveptr);
if (token == NULL)
return 0;
employee->empname = strdup(token);
token = strtok_r(NULL, "\t", &saveptr);
if (token == NULL)
return 0;
employee->empid = strdup(token);
token = strtok_r(NULL, "\t", &saveptr);
if (token == NULL)
return 0;
employee->age = strtol(token, &endptr, 10);
if (*endptr != '\0')
return 0;
token = strtok_r(NULL, "\t", &saveptr);
if (token == NULL)
return 0;
employee->addr = strdup(token);
return 1;
}
char *mygetline(int fd)
{
char *line;
size_t length;
size_t capacity;
char character;
capacity = 128;
line = malloc(capacity);
if (line == NULL)
return NULL;
length = 0;
for (;;)
{
if (read(fd, &character, 1) != 1) /* end of file (or read error) */
{
if (length == 0)
{
free(line);
return NULL;
}
break; /* return the last line even if it has no trailing '\n' */
}
if (character == '\n')
break;
if (length + 1 >= capacity) /* always keep one byte free for the terminator */
{
char *temp;
temp = realloc(line, capacity + 128);
if (temp == NULL)
{
free(line);
return NULL;
}
line = temp;
capacity += 128;
}
line[length++] = character;
}
line[length] = '\0';
return line;
}
struct employee *readFile(const char *const fSource, size_t *count)
{
struct employee *employees;
int employeeCount;
int input_fd;
char *line;
if ((count == NULL) || (fSource == NULL))
return NULL;
*count = 0;
employees = NULL;
employeeCount = 0;
input_fd = open (fSource, O_RDONLY);
if (input_fd == -1)
{
perror ("open");
return NULL;
}
while ((line = mygetline(input_fd)) != NULL)
{
struct employee employee;
if (readEmploee(line, &employee) != 0)
{
struct employee *temp;
temp = realloc(employees, (1 + employeeCount) * sizeof(struct employee));
if (temp == NULL) /* out of memory: stop reading and keep what we have */
{
free(employee.empname);
free(employee.empid);
free(employee.addr);
free(line);
break;
}
employees = temp;
employees[employeeCount++] = employee;
}
free(line);
}
close(input_fd);
*count = employeeCount;
return employees;
}
int
main()
{
size_t count;
size_t index;
struct employee *employees;
employees = readFile("somesamplefile.txt", &count);
if (employees == NULL)
return 1;
for (index = 0 ; index < count ; index++)
{
struct employee current;
current = employees[index];
fprintf(stderr, "%s, %s, %d, %s\n", current.empname, current.empid, current.age, current.addr);
if (current.empname != NULL)
free(current.empname);
if (current.empid != NULL)
free(current.empid);
if (current.addr != NULL)
free(current.addr);
}
free(employees);
return 0;
}
I'm reading a file and want to put each line into a string in an array. The length of the file is arbitrary and the length of each line is arbitrary (albeit assume it will be less than 100 characters).
Here's what I've got and it's not compiling. Essentially this is an array to an array of characters, right? So shouldn't it be char** words = (**char)malloc(sizeof(*char));?
#include <stdio.h>
#include <stdlib.h>
int main(){
int BUFSIZE = 32767;//max number of lines to read
char** words = (**char)malloc(sizeof(*char));//gives error: expected expression before 'char'
FILE *fp = fopen("coll.txt", "r");
if (fp == 0){
fprintf(stderr, "Error opening file");
exit(1);
}
int i = 0;
words[i] = malloc(BUFSIZE);
while(fscanf(fp, "%100s", words[i]) == 1)//no line will be longer than 100
{
i++;
words[i] = realloc(words, sizeof(char*)*i);
}
int j;
for(j = 0; j < i; j++)
printf("%s\n", words);
return 0;
}
Note: I've read "Reading from a file and storing in array" but it doesn't answer my question.
There are a few issues with your program. The realloc() statement is not used correctly. I also prefer fgets() for getting a line. Here is my solution. This also uses realloc() to increase the allocation of the buffer lines so that you neither have to know the number of lines in advance nor do you have to read the file in two passes (faster that way). This is a common technique to use when you don't know how much memory you'll have to allocate in advance.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
int lines_allocated = 128;
int max_line_len = 100;
/* Allocate lines of text */
char **words = (char **)malloc(sizeof(char*)*lines_allocated);
if (words==NULL)
{
fprintf(stderr,"Out of memory (1).\n");
exit(1);
}
FILE *fp = fopen("coll.txt", "r");
if (fp == NULL)
{
fprintf(stderr,"Error opening file.\n");
exit(2);
}
int i;
for (i=0;1;i++)
{
int j;
/* Have we gone over our line allocation? */
if (i >= lines_allocated)
{
int new_size;
/* Double our allocation and re-allocate */
new_size = lines_allocated*2;
words = (char **)realloc(words,sizeof(char*)*new_size);
if (words==NULL)
{
fprintf(stderr,"Out of memory.\n");
exit(3);
}
lines_allocated = new_size;
}
/* Allocate space for the next line */
words[i] = malloc(max_line_len);
if (words[i]==NULL)
{
fprintf(stderr,"Out of memory (3).\n");
exit(4);
}
if (fgets(words[i],max_line_len-1,fp)==NULL)
break;
/* Get rid of CR or LF at end of line */
for (j=strlen(words[i])-1;j>=0 && (words[i][j]=='\n' || words[i][j]=='\r');j--)
;
words[i][j+1]='\0';
}
/* Close file */
fclose(fp);
int j;
for(j = 0; j < i; j++)
printf("%s\n", words[j]);
/* Good practice to free memory */
for (;i>=0;i--)
free(words[i]);
free(words);
return 0;
}
You should change the line:
char** words = (**char)malloc(sizeof(*char));
into this, where Max_Lines is the maximum number of lines you intend to store:
char** words=(char **)malloc(sizeof(char *)*Max_Lines);
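For what it is worth, the cast is not required in C, and writing the size in terms of the pointer itself avoids repeating the type. A small sketch of that form, with the result checked and every slot set to NULL so that unused entries can be told apart later; alloc_line_table is a hypothetical name and max_lines stands in for Max_Lines:

```c
#include <stdlib.h>

/* Hypothetical helper: allocate a table of max_lines string pointers,
   each initialised to NULL. Returns NULL on allocation failure.
   The caller frees the table (and any strings stored in it). */
char **alloc_line_table(size_t max_lines)
{
    /* sizeof *table is sizeof(char *); no cast is needed in C. */
    char **table = malloc(max_lines * sizeof *table);
    if (table == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < max_lines; i++) {
        table[i] = NULL;
    }
    return table;
}
```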