xv6 C skipping blank lines - c

I am trying to write a program in C (for xv6) that prints the last n lines of input text or a file (essentially tail), except that it should not print blank lines ("\n"). At the moment, my code correctly ignores most blank lines, but it still prints a blank line if the first line to be printed is blank.
For example, this is my output if only 2 lines are to be printed:
\n (a blank space)
the quick brown fox jumped over the lazy dog
But the output should look like:
the quick brown fox jumped over the lazy dog
Things I have tried:
Checking if the current index, AND the next index in buf[] contains \n
Checking if the current index AND the previous index in buf[] does not contain \n
I feel like the solution is simple, but I can't figure it out. Does anybody have any ideas?
edit - provided full code
#include "types.h"
#include "stat.h"
#include "user.h"
#include "fcntl.h"
char buf [1024];
void tail (int fd, char* name, int lines){
int chunk_index; //keeps track of the chunk index
int chunk_size; //keeps track of the size of the chunk
int lines_in_doc = 0; //keeps track of the total number of lines
int current_line_num = 0; //keeps track of the number of lines passed so far
int temp_line = open("temporary_file", O_CREATE | O_RDWR);
while ((chunk_size = read(fd, buf, sizeof(buf))) > 0){
write(temp_line, buf, chunk_size);
for (chunk_index = 0; chunk_index < chunk_size; chunk_index++){
if (buf[chunk_index] != '\n'){
continue;
}else{
lines_in_doc++;
}
}
}
close(temp_line);
if (chunk_size < 0){
printf(1, "tail - read error \n");
exit();
}
//int total_chunks_read = 0;
temp_line = open("temporary_file", 0);
while((chunk_size = read(temp_line, buf, sizeof(buf))) > 0){
for (chunk_index = 0; chunk_index < chunk_size; chunk_index++){
if (current_line_num >= (lines_in_doc - lines)){
if ((buf[chunk_index] == '\n') && (buf[chunk_index+1] == '\n') && (buf[chunk_index-1] == '\n')){
printf(1,"haha!");
}
else{
printf(1, "%c", buf[chunk_index]);
}
}
else if (buf[chunk_index] == '\n'){
current_line_num++;
}
}
}
close(temp_line);
unlink("temporary_file");
}
//main function
int
main(int argc, char *argv[]){
int i;
int fd = 0;
int x = 10;
char *file;
char a;
file = "";
if (argc <= 1){
tail(0, "", 10);
exit();
} else{
for (i = 1; i < argc; i++){
a = *argv[i];
if (a == '-'){
argv[i]++;
x = atoi(argv[i]++);
}else{
if ((fd = open(argv[i], 0)) < 0){
printf(1, "tail: cannot open %s \n", argv[i]);
exit();
}
}
}
tail(fd, file, x);
close(fd);
exit();
}
}

The requirements for my course's tail assignment were to count blank lines but not print them. Filtering them out while writing to the temporary file would have meant reworking that code, which I didn't want to do, since that is not what the official tail.c does.
I realized that my logic was missing a case: a blank line between two lines of text appears in buf[] as a newline, another newline, and then a non-newline character. So I needed to update my if statement to the following:
if ((buf[chunk_index] == '\n') && (buf[chunk_index + 1] != '\n') &&
(buf[chunk_index - 1] == '\n')) {
printf(1, "");
}
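Peeking at buf[chunk_index - 1] and buf[chunk_index + 1] reads outside the data at chunk boundaries, so an alternative is to remember whether the current character starts a line. The following is only a minimal sketch in standard C rather than xv6: the function name tail_skip_blank and its in-memory interface are assumptions made for illustration, not part of the original program.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: writes the last `lines` lines of text[0..len) to
 * `out`. Blank lines count toward the total but are not printed.
 * Returns the number of non-blank lines printed. */
int tail_skip_blank(const char *text, size_t len, int lines, FILE *out)
{
    int total = 0;
    for (size_t i = 0; i < len; i++)
        if (text[i] == '\n')
            total++;
    if (len > 0 && text[len - 1] != '\n')
        total++;                    /* final line without a newline */

    int first = total - lines;      /* index of the first line to show */
    if (first < 0)
        first = 0;

    int current = 0;                /* index of the line being scanned */
    int at_line_start = 1;          /* no character seen on this line yet */
    int printed = 0;

    for (size_t i = 0; i < len; i++) {
        char c = text[i];
        /* A newline seen at the start of a line is a blank line: skip it. */
        if (current >= first && !(c == '\n' && at_line_start)) {
            if (at_line_start)
                printed++;
            fputc(c, out);
        }
        if (c == '\n') {
            current++;
            at_line_start = 1;
        } else {
            at_line_start = 0;
        }
    }
    return printed;
}
```

The same at_line_start flag should carry over to the chunked read loop in the question, because it survives across calls to read and never indexes outside buf.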

Related

Replacing characters from buffer array

I am attempting to create a text file and fill it with text that is enclosed in double quotes. Then I want to open this file with read-write access and read it. After reading the file, I iterate over each character and replace every double quote: an opening double quote should become two backquotes and a closing one should become two single quotes.
My attempt was to insert the second character one index position higher than the matching double quote, which meant I needed to push all following characters one step higher.
When given text input like:
//sometext.txt
"some text here"
"read it"
I get:
"some text here"
"read it"``````````````````````````````````````````````````
When using the following:
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#define MAX_READ 50
int main(){
int fd;
fd = open("test.txt", O_RDWR);
if (fd == -1){
printf("Failed");
exit(EXIT_FAILURE);
}
char buffer[MAX_READ+1];
ssize_t numRead;
numRead = read(fd, buffer, MAX_READ);
if (numRead == -1){
printf("Failed");
exit(EXIT_FAILURE);
}
buffer[numRead] = '\0';
int j = 1;
size_t chars = sizeof(buffer)/sizeof(buffer[0]);
for(int i = 0; i < chars; i++){
if(buffer[i] == '"'){
if(j % 2 != 0){
buffer[i] = '`';
chars++;
while(i < chars){
i++;
buffer[i] = buffer[i-1];
}
}else{
buffer[i] = '\'';
chars++;
while(i < chars){
i++;
buffer[i] = buffer[i-1];
}
}
j++;
}
printf("\nString characters: %c", buffer[i]);
}
ssize_t numWrite;
numWrite = write(fd, buffer, MAX_READ);
if (numWrite == -1){
printf("Failed");
exit(EXIT_FAILURE);
}
return 0;
}
Expected output:
``some text here''
``read it''
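In the code above, the inner while loop copies buffer[i-1] into buffer[i] repeatedly, which spreads the same backquote through the rest of the buffer, and the final write lands after the bytes that were just read, so the result is appended to the file. A simpler route is to build the converted text in a second buffer. The following is a minimal sketch under that assumption; the name convert_quotes and its error convention are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: copies the NUL-terminated string src into dst,
 * turning the 1st, 3rd, ... double quote into `` and the 2nd, 4th, ...
 * into ''. Returns the length written (excluding the NUL), or
 * (size_t)-1 if dst is too small. */
size_t convert_quotes(const char *src, char *dst, size_t dst_size)
{
    size_t out = 0;
    int opening = 1;                /* next quote opens a quoted span */

    if (dst_size == 0)
        return (size_t)-1;

    for (size_t i = 0; src[i] != '\0'; i++) {
        if (src[i] == '"') {
            char q = opening ? '`' : '\'';
            if (out + 2 >= dst_size)    /* two chars plus the NUL */
                return (size_t)-1;
            dst[out++] = q;
            dst[out++] = q;
            opening = !opening;
        } else {
            if (out + 1 >= dst_size)    /* one char plus the NUL */
                return (size_t)-1;
            dst[out++] = src[i];
        }
    }
    dst[out] = '\0';
    return out;
}
```

To overwrite the file with the result, one option is to call lseek(fd, 0, SEEK_SET) before write, and to write the returned length rather than MAX_READ bytes.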

Reading, a set line range from a file in C

I am writing a C program which lets the user dynamically specify the name of the file from which the data is to be read. Next, the user enters a lower bound and an upper bound. The lines between those bounds are to be printed.
For this the main function makes a call: readValues(cTargetName, iLower, iHiger);
The function readValues is supposed to work as follows:
Check if file exist, if yes. Open it with fopen
Read with feof and fgets line by line the whole file, and store each line in char string
With a for loop, print the correct range of lines from the string
I'm not sure why, but the while loop doesn't seem to exit even though I use feof, which should end the loop once the end of the file is reached.
The code looks as follows:
#include <stdio.h>
#include <stdlib.h>
void readValues(char cFileName[75], int n, int m)
{
//Variable declaration;
char strArray[50][50];
char *parser;
int i = 0;
FILE *Data;
if(Data = fopen(cFileName, "rt") == NULL){
printf("File could not be opened");
return 1; //Can you return 1 in a void function?
}
//Read the file line by line
while(feof(Data)==0){
fgets(strArray[i], 200, Data);
i++;
}
//Reading the specified lines
for(n; n<=m; n++){
printf("%s", strArray[n]);
}
}
int main()
{
char cTargetName[75] = {"C:/Users/User1/Desktop/C_Projects_1/TestData.txt"};
int iLower = 2;
int iHiger = 4;
readValues(cTargetName, iLower, iHiger);
return 0;
}
All help is appreciated. Thanks in advance!
Here is my solution to your question:
#include <stdio.h>
#include <stdlib.h>
#include <limits.h> // for PATH_MAX
#define MIN_LINE_LENGTH 64
typedef enum {
false, true
} bool;
int main() {
char filename[PATH_MAX] = {0};
printf("Enter filename:\n");
fgets(filename, PATH_MAX, stdin); // get filename from stdin
char *ptr = filename;
while (*ptr) { // remove trailing newline at the end of filename (fgets() includes newline)
if (*ptr == '\n') {
*ptr = 0;
}
++ptr;
}
printf("Enter starting line and end line, separated by a space:\n");
size_t startLine = 0;
size_t endLine = 0;
bool hasFirstNum = false;
bool hasSecondNum = false;
bool hasMiddleSpace = false;
bool hasLastSpace = false;
size_t numCount = 0;
int ch;
while ((ch = fgetc(stdin)) != EOF && ch != '\n') { // continually receive chars from stdin
if (ch != 32 && !(ch >= 48 && ch <= 57)) { // if not a space or number, raise error
fprintf(stderr, "Only numerical values (and spaces) can be entered.\n");
return 1;
}
if (ch == 32) {
if (hasFirstNum) {
hasMiddleSpace = true;
}
if (hasSecondNum) {
hasLastSpace = true;
}
continue;
}
else if (!hasFirstNum) {
++numCount;
hasFirstNum = true;
}
else if (!hasSecondNum && hasMiddleSpace) {
++numCount;
hasSecondNum = true;
}
else if (hasLastSpace) {
++numCount;
}
if (numCount == 1) {
startLine *= 10;
startLine += ch - 48; // '0' character in ASCII is 48
}
else if (numCount == 2){
endLine *= 10;
endLine += ch - 48;
}
else {
break;
}
}
FILE *fp = fopen(filename, "r");
if (fp == NULL) {
fprintf(stderr, "Error opening file.\n");
return 1;
}
char **lines = malloc(sizeof(char *));
char *line = malloc(MIN_LINE_LENGTH);
*lines = line;
int c;
size_t char_count = 0;
size_t line_count = 1;
while ((c = fgetc(fp)) != EOF) { // continually get chars from file stream
if (c == '\n') { // expand lines pointer if a newline is encountered
*(line + char_count) = 0;
++line_count;
lines = realloc(lines, line_count*sizeof(char *));
line = (*(lines + line_count - 1) = malloc(MIN_LINE_LENGTH));
char_count = 0;
continue;
}
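if ((char_count + 1) % MIN_LINE_LENGTH == 0 && char_count != 0) { // expand line pointer if needed
line = realloc(line, char_count + 1 + MIN_LINE_LENGTH); // +1 leaves room for the null byte
*(lines + line_count - 1) = line; // realloc may move the block, so update the stored pointer
}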
*(line + char_count) = c;
++char_count;
}
*(line + char_count) = 0; // to ensure the last line always ends with the null byte
if (startLine >= line_count) { // raise error if starting line specified is greater than num. of lines in doc.
fprintf(stderr, "Specified starting line is beyond the last line of the document.\n");
return 1;
}
if (endLine > line_count) { // adjust ending line if it is greater than number of lines in doc.
endLine = line_count;
}
if (startLine == 0) { // we will be using the starting index of 1 as the first line
startLine = 1;
}
char **linesPtr = lines + startLine - 1;
while (startLine++ <= endLine) { // print lines
printf("%s\n", *linesPtr++);
}
for (size_t i = 0; i < line_count; ++i) { // free all memory
free(*(lines + i));
}
free(lines);
return 0;
}
It is a little more convoluted, but because it uses dynamic memory allocation, it can handle lines of any length within a text file.
If there is anything unclear, please let me know and I would be happy to explain.
Hope this helps!!
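There are several issues here.
First, you limit the line length to 200, which may not match your actual input, and each row of strArray holds only 50 characters, so fgets can write past the end of a row. fgets reads up to the specified length minus one and stops early at a newline character (which it keeps in the buffer); this should be taken into account.
Additionally, fgets returns NULL when it hits EOF, so there is no real need for feof. Note also that the condition Data = fopen(cFileName, "rt") == NULL needs extra parentheses: == binds tighter than =, so Data receives the result of the comparison rather than the file pointer.
Second, you could save yourself a lot of pain by simply counting the lines as you read them and printing each one immediately while the count is within the range. That saves a fair amount of overhead.
Like this: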
#include <stdio.h>
#include <stdlib.h>
#define MAXLINE 200 // or anything else you want
void readValues(char cFileName[75], int n, int m)
{
    // Variable declaration
    char line[MAXLINE];
    int i = 0;
    FILE *Data;
    if ((Data = fopen(cFileName, "rt")) == NULL) {
        printf("File could not be opened\n");
        return; // a void function cannot return a value
    }
    // Read the file line by line and print the lines within the range
    while (fgets(line, MAXLINE, Data) != NULL) { // fgets returns NULL at EOF
        if (++i >= n && i <= m)
            printf("%s", line); // fgets keeps the trailing newline
    }
    fclose(Data);
}

Count lines of a file using file descriptor in C

I'm trying to count the number of lines of a file that I'm reading through a file descriptor, but I don't know what I'm doing wrong because it does not work.
This is the code:
fd_openedFile = open(filename, O_RDONLY);
char *miniBuffer[1];
int lineCounter = 0;
while( read(fd_openedFile, miniBuffer, 1) >0) {
if (*miniBuffer[0] == '\n')
lineCounter++;
}
The software never enters the "if" and I've tested a lot of variants that I thought that could work but none of them worked (this is just the one that makes more sense to me).
Any help would be highly appreciated.
Thank you very much!
I have added the full code below:
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
void err_sys(const char* cadena, int continueExecuting) {
perror(cadena);
if (continueExecuting == 0)
exit(1);
}
int main()
{
//vars
char filename[200];
int fd_output = 1;
int fd_openedFile = -1;
int fd_newFile = -1;
//Ask for the file and open it
while (fd_openedFile < 0){
write(fd_output, "Write the filename: ", 20);
scanf("%s", filename);
if ((fd_openedFile = open(filename, O_RDONLY)) < 0)
err_sys("Error opening the original file", 1);
}
//Construct the new file's name
char *partOfOldFilename = strtok(filename, ".");
char newFileName[208], finalPart[8];
strcpy(newFileName, partOfOldFilename);
strcpy(finalPart, "OUT.txt");
strcat(newFileName, finalPart);
//Create the new file
if ((fd_newFile = open(newFileName, O_WRONLY | O_CREAT)) < 0)
err_sys("Error opening the new file", 1);
//Count the number of lines
char miniBuffer[1];
int lineCounter = 0;
while( read(fd_openedFile, &miniBuffer[0], 1) >0) {
write(fd_output, "R", 1); //To debug
if (miniBuffer[0] == '\n') {
lineCounter++;
write(fd_output, "1", 1); //To debug
} else {
write(fd_output, "0", 1); //To debug
write(fd_output, miniBuffer, 1); //To debug
}
}
lseek(fd_openedFile,0,SEEK_SET);
write(fd_output, "=========\n", 10); //To debug
//Count the number of chars per line
char* charsPerLine[lineCounter];
lineCounter = 0;
int charCounter = 0;
while( read(fd_openedFile, miniBuffer, 1) >0){
write(fd_output, "C", 1); //To debug
if (miniBuffer[0] == '\n') {
*(charsPerLine[lineCounter]) = charCounter +'0';
lineCounter++;
charCounter = 0;
write(fd_output, "1", 1); //To debug
} else {
write(fd_output, "0", 1); //To debug
write(fd_output, miniBuffer, 1); //To debug
charCounter ++;
}
}
lseek(fd_openedFile,0,SEEK_SET);
write(fd_output, "END", 4); //To debug
//Write a copy of the original file starting each line with the number of chars in it
lineCounter = 0;
int bufSize = 1;
char buffer[bufSize];
//First number write
if (write(fd_newFile,charsPerLine[lineCounter], bufSize)!=bufSize)
err_sys("write_error", 0);
lineCounter++;
while( read(fd_openedFile, buffer, bufSize) >0){
if (write(fd_newFile,buffer, bufSize)!=bufSize)
err_sys("write_error", 0);
if (buffer[0] == '\n') {
if (write(fd_newFile,charsPerLine[lineCounter], bufSize)!=bufSize)
err_sys("write_error", 0);
lineCounter++;
}
}
//Finish program
if (close(fd_openedFile)!=0) err_sys("error closing original file's file descriptor", 0);
if (close(fd_newFile)!=0) err_sys("error closing new file's file descriptor", 0);
return 0;
}
This code assumes that the file is a .txt file and that each line ends with a line break. It is still in development.
Thanks again.
You're not allocating any memory for miniBuffer which is an array of char pointers. Which isn't really the problem - the problem is that it shouldn't be an array of char pointers in the first place. You only need it to be an array of char like the following.
char miniBuffer[1];
And the other change then is to check that single element of the array for it being a \n character.
if (miniBuffer[0] == '\n')
You might find it more efficient to read in larger chunks by increasing the size of the array and using a function like strchr to find each \n in the data. You would need to store the byte count that read returns so that you can properly NUL-terminate the string, though.
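The chunked approach could look roughly like the sketch below. It swaps strchr for memchr, which takes an explicit length and therefore needs no NUL terminator. The function name count_lines_fd and the 4096-byte buffer size are assumptions chosen for illustration.

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: counts the '\n' bytes readable from fd, reading
 * up to 4096 bytes per call. Returns -1 if read() reports an error. */
long count_lines_fd(int fd)
{
    char buf[4096];
    ssize_t n;
    long count = 0;

    while ((n = read(fd, buf, sizeof buf)) > 0) {
        const char *p = buf;
        const char *end = buf + n;
        /* memchr only looks at the bytes that read() actually returned. */
        while (p < end && (p = memchr(p, '\n', (size_t)(end - p))) != NULL) {
            count++;
            p++;                    /* continue after the newline found */
        }
    }
    return n < 0 ? -1 : count;
}
```

A call such as count_lines_fd(fd_openedFile) would replace the byte-at-a-time loop; lseek(fd_openedFile, 0, SEEK_SET) is still needed before reading the file again.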

Replacing a line in a text file with a string

I am learning file handling in C. I wrote code to replace a line in a file with a string input by the user. The replacement itself works well, but the first line is always empty and I am unable to understand what goes wrong.
I also have some questions about file handling itself and about tracking down the mistakes in my code. I understand by now that I should have used perror() and errno; that is the next thing I will read up on.
Why shouldn't I use "w+" when opening the file stream? (A user on here told me not to use it, but unfortunately I didn't get an explanation.)
I tried to use gdb to find the mistake, but when I display my fileStored array I get only numbers, since it's an int array. How could I improve the display of the variable?
What would be a good approach in gdb to track down the mistake I made in the code?
The code:
#include <stdio.h>
#include <stdlib.h>
#define MAXLENGTH 100
int main(int argc, char *argv[]){
FILE *fileRead;
char fileName[MAXLENGTH],newLine[MAXLENGTH];
int fileStored[MAXLENGTH][MAXLENGTH];
short lineNumber, lines = 0;
int readChar;
printf("Input the filename to be opened:");
int i = 0;
while((fileName[i] = getchar()) != '\n' && fileName[i] != EOF && i < MAXLENGTH){
i++;
}
fileName[i] = '\0';
if((fileRead = fopen(fileName, "r")) == NULL){
printf("Error: File not found!\n");
return EXIT_FAILURE;
}
i = 0;
while((readChar = fgetc(fileRead)) != EOF){
if(readChar == '\n'){
fileStored[lines][i] = readChar;
i = 0;
lines++;
}
fileStored[lines][i] = readChar;
i++;
}
fclose(fileRead);
fileRead = fopen(fileName, "w");
printf("Input the content of the new line:");
i = 0;
while((newLine[i] = getchar()) != '\n' && newLine[i] != EOF && i < MAXLENGTH){
i++;
}
newLine[i] = '\0';
printf("There are %d lines.\nInput the line number you want to replace:",lines);
scanf("%d",&lineNumber);
if((lineNumber > lines) || (lineNumber <=0)){
printf("Error: Line does not exist!");
return EXIT_FAILURE;
}
int j = 0;
for(i = 0; i < lines; i++){
if(i == lineNumber-1){
fprintf(fileRead,"\n%s",newLine);
continue;
}
do{
fputc(fileStored[i][j],fileRead);
j++;
}while(fileStored[i][j] != '\n');
j = 0;
}
fclose(fileRead);
return EXIT_SUCCESS;
}
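A likely cause of the stray empty line: when the read loop sees '\n' it stores it at the end of the current row and, because nothing skips the second assignment, again at index 0 of the next row, while the replacement is written as "\n%s". One way to sidestep the two-dimensional array is to copy the file line by line and substitute the target line on the way. The following is a minimal sketch; the name replace_line, the 256-byte buffer, and the return convention are assumptions for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies `in` to `out`, writing `text` in place of
 * line number `target` (1-based). Returns 0 if the line was replaced,
 * -1 if the input has no such line. */
int replace_line(FILE *in, FILE *out, int target, const char *text)
{
    char buf[256];
    int line = 1;
    int replaced = 0;
    int at_line_start = 1;          /* buf holds the start of a line */

    while (fgets(buf, sizeof buf, in) != NULL) {
        size_t len = strlen(buf);
        int ends_line = (len > 0 && buf[len - 1] == '\n');

        if (line == target) {
            /* Emit the new text once, even if the old line was longer
             * than buf and arrives in several pieces. */
            if (at_line_start) {
                fputs(text, out);
                replaced = 1;
            }
            if (ends_line)
                fputc('\n', out);
        } else {
            fputs(buf, out);
        }

        at_line_start = ends_line;
        if (ends_line)
            line++;
    }
    return replaced ? 0 : -1;
}
```

In a full program, `out` would typically be a temporary file that is renamed over the original with rename() once the copy succeeds, so the original is not truncated before its contents are safely written elsewhere.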

Can't eliminate one character in my array while parsing it even though I handle that character

So this is my second time adapting my code to fscanf to get what I want. I threw some comments next to the output. The main issue I am having is that the one null character or space is getting added into the array. I have tried to check for the null char and the space in the string variable and it does not catch it. I am a little stuck and would like to know why my code is letting that one null character through?
The part where it slips up is "Pardon, O King," for which the output is: King -- 1; -- 1
Here it parses King as a word; then ," goes through the strip function and becomes an empty string (just \0), yet my check further down lets it through.
Input: a short story containing apostrophes and commas (the lion's rock. First, the lion woke up)
//Output: Every unique word that shows up with how many times it shows up.
//Lion -- 1
//s - 12
//lion -- 8
//tree -- 2
//-- 1 //this is the line that prints a null char?
//cub -- //3 it is not a space! I even check if it is \0 before entering
//it into the array. Any ideas (this is my 2nd time)?
//trying to rewrite my code around a fscanf function.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <ctype.h>
//Remove non-alpha numeric characters
void strip_word(char* string)
{
char* string_two = calloc(80, sizeof(char));
int i;
int c = 0;
for(i = 0; i < strlen(string); i++)
{
if(isalnum(string[i]))
{
string_two[c] = string[i];
++c;
}
}
string_two[i] = '\0';
strcpy(string, string_two);
free(string_two);
}
//Parse through file
void file_parse(FILE* text_file, char*** word_array, int** count_array, int* total_count, int* unique_count)
{
int mem_Size = 8;
int is_unique = 1;
char** words = calloc(mem_Size, sizeof(char *)); //Dynamically allocate array of size 8 of char*
if (words == NULL)
{
fprintf(stderr, "ERROR: calloc() failed!");
}
int* counts = calloc(mem_Size, sizeof(int)); //Dynamically allocate array of size 8 of int
if (counts == NULL)
{
fprintf(stderr, "ERROR: calloc() failed!");
}
printf("Allocated initial parallel arrays of size 8.\n");
fflush(stdout);
char* string;
while('A')
{
is_unique = 1;
fscanf(text_file, " ,");
fscanf(text_file, " '");
while(fscanf(text_file, "%m[^,' \n]", &string) == 1) //%m length modifier
{
is_unique = 1;
strip_word(string);
if(string == '\0') continue; //if the string is empty move to next iteration
else
{
int i = 0;
++(*total_count);
for(i = 0; i < (*unique_count); i++)
{
if(strcmp(string, words[i]) == 0)
{
counts[i]++;
is_unique = 0;
break;
}
}
if(is_unique)
{
++(*unique_count);
if((*unique_count) >= mem_Size)
{
mem_Size = mem_Size*2;
words = realloc(words, mem_Size * sizeof(char *));
counts = realloc(counts, mem_Size * sizeof(int));
if(words == NULL || counts == NULL)
{
fprintf(stderr, "ERROR: realloc() failed!");
}
printf("Re-allocated parallel arrays to be size %d.\n", mem_Size);
fflush(stdout);
}
words[(*unique_count)-1] = calloc(strlen(string) + 1, sizeof(char));
strcpy(words[(*unique_count)-1], string);
counts[(*unique_count) - 1] = 1;
}
}
free(string);
}
if(feof(text_file)) break;
}
printf("All done (successfully read %d words; %d unique words).\n", *total_count, *unique_count);
fflush(stdout);
*word_array = words;
*count_array = counts;
}
int main(int argc, char* argv[])
{
if(argc < 2 || argc > 3) //Checks if too little or too many args
{
fprintf(stderr, "ERROR: Invalid Arguements\n");
return EXIT_FAILURE;
}
FILE * text_file = fopen(argv[1], "r");
if (text_file == NULL)
{
fprintf(stderr, "ERROR: Can't open file");
}
int total_count = 0;
int unique_count = 0;
char** word_array;
int* count_array;
file_parse(text_file, &word_array, &count_array, &total_count, &unique_count);
fclose(text_file);
int i;
if(argv[2] == NULL)
{
printf("All words (and corresponding counts) are:\n");
fflush(stdout);
for(i = 0; i < unique_count; i++)
{
printf("%s -- %d\n", word_array[i], count_array[i]);
fflush(stdout);
}
}
else
{
printf("First %d words (and corresponding counts) are:\n", atoi(argv[2]));
fflush(stdout);
for(i = 0; i < atoi(argv[2]); i++)
{
printf("%s -- %d\n", word_array[i], count_array[i]);
fflush(stdout);
}
}
for(i = 0; i < unique_count; i++)
{
free(word_array[i]);
}
free(word_array);
free(count_array);
return EXIT_SUCCESS;
}
I'm not sure quite what's going wrong with your code. I'm working on macOS Sierra 10.12.3 with GCC 6.3.0, and the local fscanf() does not support the m modifier. Consequently, I modified the code to use a fixed size string of 80 bytes. When I do that (and only that), your program runs without obvious problem (certainly on the input "the lion's rock. First, the lion woke up").
I also think that the while ('A') loop (which should be written conventionally while (1) if it is used at all) is undesirable. I wrote a function read_word() which gets the next 'word', including skipping blanks, commas and quotes, and use that to control the loop. I left your memory allocation in file_parse() unchanged. I did get rid of the memory allocation in strip_word() (eventually — it worked OK as written too).
That left me with:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <ctype.h>
static void strip_word(char *string)
{
char string_two[80];
int i;
int c = 0;
int len = strlen(string);
for (i = 0; i < len; i++)
{
if (isalnum(string[i]))
string_two[c++] = string[i];
}
string_two[c] = '\0';
strcpy(string, string_two);
}
static int read_word(FILE *fp, char *string)
{
if (fscanf(fp, " ,") == EOF ||
fscanf(fp, " '") == EOF ||
fscanf(fp, "%79[^,' \n]", string) != 1)
return EOF;
return 0;
}
static void file_parse(FILE *text_file, char ***word_array, int **count_array, int *total_count, int *unique_count)
{
int mem_Size = 8;
char **words = calloc(mem_Size, sizeof(char *));
if (words == NULL)
{
fprintf(stderr, "ERROR: calloc() failed!");
}
int *counts = calloc(mem_Size, sizeof(int));
if (counts == NULL)
{
fprintf(stderr, "ERROR: calloc() failed!");
}
printf("Allocated initial parallel arrays of size 8.\n");
fflush(stdout);
char string[80];
while (read_word(text_file, string) != EOF)
{
int is_unique = 1;
printf("Got [%s]\n", string);
strip_word(string);
if (string[0] == '\0')
continue;
else
{
int i = 0;
++(*total_count);
for (i = 0; i < (*unique_count); i++)
{
if (strcmp(string, words[i]) == 0)
{
counts[i]++;
is_unique = 0;
break;
}
}
if (is_unique)
{
++(*unique_count);
if ((*unique_count) >= mem_Size)
{
mem_Size = mem_Size * 2;
words = realloc(words, mem_Size * sizeof(char *));
counts = realloc(counts, mem_Size * sizeof(int));
if (words == NULL || counts == NULL)
{
fprintf(stderr, "ERROR: realloc() failed!");
exit(EXIT_FAILURE);
}
printf("Re-allocated parallel arrays to be size %d.\n", mem_Size);
fflush(stdout);
}
words[(*unique_count) - 1] = calloc(strlen(string) + 1, sizeof(char));
strcpy(words[(*unique_count) - 1], string);
counts[(*unique_count) - 1] = 1;
}
}
}
printf("All done (successfully read %d words; %d unique words).\n", *total_count, *unique_count);
fflush(stdout);
*word_array = words;
*count_array = counts;
}
int main(int argc, char *argv[])
{
if (argc < 2 || argc > 3)
{
fprintf(stderr, "ERROR: Invalid Arguements\n");
return EXIT_FAILURE;
}
FILE *text_file = fopen(argv[1], "r");
if (text_file == NULL)
{
fprintf(stderr, "ERROR: Can't open file");
return EXIT_FAILURE;
}
int total_count = 0;
int unique_count = 0;
char **word_array = 0;
int *count_array = 0;
file_parse(text_file, &word_array, &count_array, &total_count, &unique_count);
fclose(text_file);
if (argv[2] == NULL)
{
printf("All words (and corresponding counts) are:\n");
fflush(stdout);
for (int i = 0; i < unique_count; i++)
{
printf("%s -- %d\n", word_array[i], count_array[i]);
fflush(stdout);
}
}
else
{
printf("First %d words (and corresponding counts) are:\n", atoi(argv[2]));
fflush(stdout);
for (int i = 0; i < atoi(argv[2]); i++)
{
printf("%s -- %d\n", word_array[i], count_array[i]);
fflush(stdout);
}
}
for (int i = 0; i < unique_count; i++)
free(word_array[i]);
free(word_array);
free(count_array);
return EXIT_SUCCESS;
}
When run on the data file:
the lion's rock. First, the lion woke up
the output was:
Allocated initial parallel arrays of size 8.
Got [the]
Got [lion]
Got [s]
Got [rock.]
Got [First]
Got [the]
Got [lion]
Got [woke]
Got [up]
All done (successfully read 9 words; 7 unique words).
All words (and corresponding counts) are:
the -- 2
lion -- 2
s -- 1
rock -- 1
First -- 1
woke -- 1
up -- 1
When the code was run on your text, including double quotes, like this:
$ echo '"Pardon, O King,"' | cw37 /dev/stdin
Allocated initial parallel arrays of size 8.
Got ["Pardon]
Got [O]
Got [King]
Got ["]
All done (successfully read 3 words; 3 unique words).
All words (and corresponding counts) are:
Pardon -- 1
O -- 1
King -- 1
$
It took a little finagling of the code. If a word contains no alphanumeric character, your code still counts it (because of subtle problems in strip_word()). That would need to be handled by checking strip_word() more carefully; you test if (string == '\0'), which checks (belatedly) whether memory was allocated, where you need if (string[0] == '\0') to test whether the string is empty.
Note that the code in read_word() would be confused into reporting EOF if there were two commas in a row, or an apostrophe followed by a comma (though it handles a comma followed by an apostrophe OK). Fixing that is fiddlier; you'd probably be better off using a loop with getc() to read a string of characters. You could even use that loop to strip non-alphabetic characters without needing a separate strip_word() function.
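A getc()-based reader along those lines might look like the sketch below. The name read_alnum_word is hypothetical, and it assumes size is at least 1. Unlike strip_word(), it ends a word at any non-alphanumeric character, so "ro.ck" would yield two words rather than one.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical replacement for read_word() plus strip_word(): skips
 * everything that is not alphanumeric, then stores the following run of
 * alphanumeric characters in word (truncated to size - 1 characters).
 * Returns the stored length, or EOF if no further word exists. */
int read_alnum_word(FILE *fp, char *word, size_t size)
{
    int c;
    size_t len = 0;

    /* Skip blanks, commas, quotes, and any other separators. */
    while ((c = getc(fp)) != EOF && !isalnum(c))
        ;
    if (c == EOF)
        return EOF;

    /* Collect characters until the first non-alphanumeric one. */
    do {
        if (len + 1 < size)
            word[len++] = (char)c;
    } while ((c = getc(fp)) != EOF && isalnum(c));

    word[len] = '\0';
    return (int)len;
}
```

With this, the loop in file_parse() could become while (read_alnum_word(text_file, string, sizeof string) != EOF), and the empty-string check would no longer be needed because a returned word always has at least one character.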
I am assuming you've not covered structures yet. If you had covered structures, you'd use an array of a structure such as struct Word { char *word; int count; }; and allocate the memory once, rather than needing two parallel arrays.
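As a rough illustration of the structure-based layout, the sketch below keeps each word and its count together and grows a single array. Only struct Word comes from the suggestion above; struct WordList, word_list_add, and word_list_free are hypothetical names introduced for this example.

```c
#include <stdlib.h>
#include <string.h>

struct Word { char *word; int count; };

/* Hypothetical container: one growable array instead of two parallel ones. */
struct WordList { struct Word *items; size_t used; size_t capacity; };

/* Adds text to the list or bumps its count if already present.
 * Returns 0 on success, -1 if an allocation fails. */
int word_list_add(struct WordList *list, const char *text)
{
    for (size_t i = 0; i < list->used; i++) {
        if (strcmp(list->items[i].word, text) == 0) {
            list->items[i].count++;
            return 0;
        }
    }
    if (list->used == list->capacity) {
        size_t new_cap = list->capacity ? list->capacity * 2 : 8;
        struct Word *grown = realloc(list->items, new_cap * sizeof *grown);
        if (grown == NULL)
            return -1;              /* the old array is still valid */
        list->items = grown;
        list->capacity = new_cap;
    }
    char *copy = malloc(strlen(text) + 1);
    if (copy == NULL)
        return -1;
    strcpy(copy, text);
    list->items[list->used].word = copy;
    list->items[list->used].count = 1;
    list->used++;
    return 0;
}

/* Releases every stored word and the array itself. */
void word_list_free(struct WordList *list)
{
    for (size_t i = 0; i < list->used; i++)
        free(list->items[i].word);
    free(list->items);
    list->items = NULL;
    list->used = list->capacity = 0;
}
```

Starting from struct WordList list = {0};, each parsed word goes through word_list_add(&list, string), and the printing loop reads list.items[i].word and list.items[i].count.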
