How can I break this up into more than two functions? - c

How can I divide these two functions into more than two?
The functions read a file line by line.
Each line of the file holds one instruction, terminated by a newline character. When the program starts, it reads the instructions line by line, decodes the required action and its parameters, and calls the function that performs the action with the appropriate parameters.
I tried to move the for loop, the array, and getc() into another function, but it doesn't work.
void read_line(FILE *fp, char *orders, char *book_name, char *book_name_2, int *num)
{
int i = 0;
char c ;
*num = 0;
c = getc(fp);
while ((c != '\n') && (!feof(fp))) {
for (i = 0; (c != '$') && (c != '\n') && (!feof(fp)); i++) {
orders[i] = c;
c = getc(fp);
}
orders[i] = '\0';
if (c != '\n' && (!feof(fp))) {
fseek(fp, 3, 1);
c = getc(fp);
}
else break;
for (i = 0; (c != '$') && (c != '\n'); i++) {
book_name[i] = c;
c = getc(fp);
}
book_name[i] = '\0';
if (c != '\n' && (!feof(fp))) {
fseek(fp, 3, 1);
c = getc(fp);
}
else break;
if (strcmp(orders, "Rename ") != 0) {
for (i = 0; c != '\n'; i++) {
*num = (*num) * 10 + (c - '0');
c = getc(fp);
}
}
else {
for (i = 0; c != '\n'; i++) {
book_name_2[i] = c;
c = getc(fp);
}
book_name_2[i] = ' ';
book_name_2[i + 1] = '\0';
}
return;
}
}
Book* read_file_books(FILE *fp, Book *head, char *book_name, int *copies)
{
int i = 0;
char c ;
*copies = 0;
c = getc(fp);
while ((c != '\n') && (!feof(fp))) {
for (i = 0; (c != '$') && (c != '\n'); i++) {
book_name[i] = c;
c = getc(fp);
}
book_name[i] = '\0';
if (c != '\n') {
fseek(fp, 3, 1);
c = getc(fp);
}
else break;
for (i = 0; (c != '\n') && (!feof(fp)); i++) {
*copies = (*copies) * 10 + (c - '0');
c = getc(fp);
}
return add(head, book_name, *copies);
}
return head;
}

The most streamlined way of extracting a code block is to basically just copy the block to a function and then use pointers for variables that are declared outside that block. Let's take the for loop:
for (i = 0; (c != '$') && (c != '\n'); i++) {
book_name[i] = c;
c = getc(fp);
}
book_name[i] = '\0';
So, we will need access to i, c, book_name and fp. The simplest (but not the best) is this:
void foo(int *i, char *c, char *book_name, FILE *fp)
{
for (*i = 0; (*c != '$') && (*c != '\n'); (*i)++) {
book_name[*i] = *c;
*c = getc(fp);
}
book_name[*i] = '\0';
}
And then replace the for loop with:
foo(&i, &c, book_name, fp);
That's an easy procedure, but it gives you quite a lot of pointers. There's actually nothing wrong with the method itself, but you can get rid of some of them by considering the scope in which you declare the variables. For instance, you can declare variables inside the for header, and you should unless you want to keep the last value. Here the last value of i is needed to write the terminator, so it is declared just before the loop, but it can still be local to the new function. That removes one parameter:
void foo(char *c, char *book_name, FILE *fp)
{
int i; // local to foo: the caller never needs the index
for (i = 0; (*c != '$') && (*c != '\n'); i++) {
book_name[i] = *c;
*c = getc(fp);
}
book_name[i] = '\0'; // i is used after the loop, so it cannot live in the for header
}
Note that I had to peek forward to the next usage of i to determine that this was safe. Your code is simply not written in a way that makes it suitable for block extraction.
You should also try to make the c variable local. Speaking of which, it should be an int and not a char. Read the documentation of getc to understand why.
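To make the int-versus-char point concrete, here is a minimal sketch (the name count_bytes is made up for illustration and is not part of the original code). getc returns an int so that EOF, a negative value, can be told apart from every valid byte. Once the result is stored in a char, the byte 0xFF can compare equal to EOF where char is signed, and the comparison with EOF can never succeed where char is unsigned.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: counts the bytes left in fp.
   c is an int, so EOF stays distinct from the byte 0xFF. */
long count_bytes(FILE *fp)
{
    assert(fp != NULL);
    long n = 0;
    int c;                      /* int, not char */
    while ((c = getc(fp)) != EOF)
        n++;
    return n;
}
```

With a file containing the three bytes 'a', 0xFF, 'b', this loop counts all three instead of stopping at the 0xFF byte.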
Using a for loop here is not wrong per se, but it's non-idiomatic. For loops are typically used when you know before the loop starts how many times it should execute. When you want to loop until something happens, a while is more suitable. But what would be much better here is do-while, because when reading files you want to try to read first and then check whether you succeeded. You are technically doing that, but what makes this code hard to extract is that each block ends by reading a character that the next loop is supposed to process.
To correct this, we need to start from the beginning. First, remove the very first call to getc. Also, remove the conditions from the outer loop header and let each inner loop read its own input.
while (1) {
i=0;
// Each block takes care of 100% of their input
while((c = getc(fp)) != '$' && c != '\n' && c != EOF) {
orders[i] = c;
i++;
}
orders[i] = '\0';
// Negated your condition to make a cleaner if and get rid of else
if (!(c != '\n' && (!feof(fp)))) break;
fseek(fp, 3, 1);
...
The main thing I have corrected here is that every code block starts from scratch. It does not care about what happened before. In your code, each block started by checking the values left over from the previous block. This change makes it MUCH easier to extract code. In most cases when people ask how to extract code, the question they really should have asked is how to change the code so that extraction becomes trivial.
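Once each block reads its own input, the block can be lifted into a function almost verbatim. The following is only a sketch under assumptions: the name read_field and the size parameter are not in the original code, and the bounds check is an addition.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads one field into buf (at most size - 1 chars)
   and returns the character that ended it: '$', '\n', or EOF. */
int read_field(FILE *fp, char *buf, size_t size)
{
    assert(fp != NULL && buf != NULL && size > 0);
    size_t i = 0;
    int c;
    while ((c = getc(fp)) != '$' && c != '\n' && c != EOF) {
        if (i + 1 < size)       /* drop characters that do not fit */
            buf[i++] = (char)c;
    }
    buf[i] = '\0';
    return c;
}
```

The caller no longer needs i or c. It only looks at the return value to decide whether another field follows on the same line, and skips the rest of the separator itself, as the original code does with fseek.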
Also, read Why is while(!feof(fp)) always wrong?

Related

Difficulties with an example 1.9 of The C Programming Language

I am working my way through the exercises of the first chapter of The C Programming Language, and while I understand most of what is said and shown, there is one example that I don't understand.
In 1.9, there is a function shown to return the length of a line while setting a char array, passed as an argument, to the contents.
int get_line(char s[], int lim)
{
int c, i, l;
for (i = 0, l = 0; (c = getchar()) != EOF && c != '\n'; ++i) {
if (i < lim - 1)
s[l++] = c;
}
if (c == '\n')
if (l < lim - 1)
s[l++] = c;
s[l] = '\0';
return l;
}
The thing I do not understand, is why we need this: if (c == '\n') {...}. Could this not be combined in the for-loop? Where we explicitly check that c is not equal to '\n'? I'm having trouble wrapping my head around why this needs to be an external condition.
Any light shed would be helpful!
Thanks!
The for loop is exited if either c equals EOF or c equals '\n'. Therefore, immediately after the for loop, if you want to know which value c has, you must test.
is why we need this: if (c == '\n') {...}.
get_line() is structurally:
get_line() {
initialize
while get, A not true and B not true
perform X
if B
perform X
finalize
The loop quits under 2 conditions. With one of those (c == '\n'), we still want to perform X somewhere, as that is part of the function's goal.
Could this not be combined in the for-loop?
It could be combined, yet then we have 2 locations that exit the loop.
Typical coding guidelines promote a single location to quit the loop. If we set aside that goal, then:
get_line() {
initialize
while get, A not true
perform X
if B quit the loop
finalize
As below, with the same number of condition checks, yet 2 loop exit points.
int get_line(char s[], int lim) {
int c, i, l;
for (i = 0, l = 0; (c = getchar()) != EOF; ++i) {
if (i < lim - 1)
s[l++] = c;
if (c == '\n')
break;
}
s[l] = '\0';
return l;
}
We could contort the code to get the 2 checks back on the same line and avoid that pesky after-the-loop if (c == '\n'). Stylistically, this may be harder to follow.
int get_line(char s[], int lim) {
int c, i, l;
for (i = 0, l = 0, c = 0; c != '\n' && (c = getchar()) != EOF; ++i) {
if (i < lim - 1)
s[l++] = c;
}
s[l] = '\0';
return l;
}
Lastly, the code could use some improvements:
There is no need for both i and l as index counters. One is enough.
Array sizes and indexes are best expressed with the size_t type. Warning: size_t is an unsigned type.
Using a leading size parameter allows for better static code analysis and self-documenting code: it shows that lim relates to s[].
Avoid arithmetic on input parameters so as not to incur overflow. We have more range control over local objects.
Be careful when lim is at an extreme or zero.
Rather than assigning after the declaration, initialize where practical, e.g. int i = 0;
get_line() {
initialize
while B not true, get, A not true
perform X
finalize
or
#include <stdio.h>
#include <stdlib.h>
size_t get_line(size_t size, char s[size]) {
int ch = 0;
size_t i = 0;
while (ch != '\n' && (ch = getchar()) != EOF) {
if (i + 1 < size)
s[i++] = (char) ch;
}
// size might have been pathologically 0, so no room for \0
if (i < size) {
s[i] = '\0';
}
return i;
}
If you want to put it in the loop, you have to do something like this:
int get_line(char s[], int lim)
{
int c, i, l;
for (i = 0, l = 0; (c = getchar()) != EOF; ++i) {
if ((i < lim - 1) && (c != '\n'))
s[l++] = c;
else if (c == '\n') {
if (l < lim - 1)
s[l++] = c;
break;
}
}
s[l] = '\0';
return l;
}
So, as you see, moving the condition inside the loop leads to more condition checks and a break statement.
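To see the two exit paths of the loop concretely, here is a sketch of the book's logic reading from a FILE * instead of stdin. The name get_line_from and the fp parameter are assumptions made here so the function can be run on any stream, and it uses a single counter, as one of the answers suggests.

```c
#include <assert.h>
#include <stdio.h>

/* Same logic as the book's get_line, but reading from fp (an assumption
   made here so the function can be exercised on any stream). */
int get_line_from(FILE *fp, char s[], int lim)
{
    assert(fp != NULL && s != NULL && lim > 0);
    int c, l = 0;
    while ((c = getc(fp)) != EOF && c != '\n') {
        if (l < lim - 1)
            s[l++] = (char)c;
    }
    /* The loop ended on EOF or on '\n'; only '\n' belongs in s. */
    if (c == '\n' && l < lim - 1)
        s[l++] = (char)c;
    s[l] = '\0';
    return l;
}
```

For the input "hi\nyo" the first call stores "hi\n" and returns 3 (the loop ended on '\n'), the second stores "yo" and returns 2 (the loop ended on EOF, so nothing is appended), and the third returns 0.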

What is the difference between this loop using array bracket vs pointer notation in C?

In the C programming language, I have the following code:
void rm_newline(char input[])
{
assert(input);
size_t i;
for(i = 0; input[i] != '\0'; ++i)
{
if(input[i] == '\n') input[i] = '\0';
}
return;
}
This code works as intended by replacing a '\n' char with '\0'. However I had a previous version shown below:
void rm_newline(char input[])
{
assert(input);
char *input_ptr = input;
while(*input_ptr != '\0')
{
if(*input_ptr++ == '\n')
{
*input_ptr = '\0';
}
}
return;
}
This second code was not properly replacing the '\n' with '\0' but I'm not sure why. Would someone please explain how the second code is functionally different from the first code?
In the second case,
if(*input_ptr++ == '\n')
input_ptr is incremented before the body of the conditional executes. You need to increment after the replacement is done, something like
while(*input_ptr != '\0')
{
if(*input_ptr == '\n')
{
*input_ptr = '\0';
}
input_ptr++; // do the increment here
}
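The difference is easiest to see with both versions side by side. This is a small sketch; the function names rm_newline_buggy and rm_newline_fixed are made up for the comparison.

```c
#include <assert.h>

/* The version from the question: the pointer moves on before the write. */
void rm_newline_buggy(char input[])
{
    assert(input);
    char *p = input;
    while (*p != '\0') {
        if (*p++ == '\n')   /* compares *p, then advances p */
            *p = '\0';      /* overwrites the character AFTER the newline */
    }
}

/* The corrected version: write first, advance afterwards. */
void rm_newline_fixed(char input[])
{
    assert(input);
    for (char *p = input; *p != '\0'; ++p) {
        if (*p == '\n')
            *p = '\0';
    }
}
```

Given the array "ab\ncd", the buggy version leaves "ab\n" (the 'c' is overwritten and the newline survives), while the fixed version leaves "ab".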

using comma operator in the body of a loop

I'm trying to write a while loop in two different ways and noticed that using ',' in the condition twice doesn't work the way I think it does. I'm just curious about what actually happens. Here's the code :
#include <stdio.h>
int main()
{
int i = 0, lim = 1000, c;
char s[lim];
while (i < lim - 1, (c = getchar()) != '\n')
{
if (c != EOF)
{
s[i] = c;
}
else
{
break;
}
++i;
}
printf("%s\n", s);
return 0;
}
#include <stdio.h>
int main()
{
int i = 0, lim = 1000, c;
char s[lim];
while (i < lim - 1, (c = getchar()) != '\n', c != EOF)
{
s[i] = c;
++i;
}
printf("%s\n", s);
return 0;
}
Let's look at the while condition:
i < lim - 1, (c = getchar()) != '\n', c != EOF
The first part, i < lim -1 has no effect whatsoever. The second part will execute c = getchar(), but the comparison with '\n' is completely discarded. The return value of the whole condition is that of c != EOF. So the condition is equivalent to:
(c = getchar()) != EOF
What you can do is replace the comma operators with &&:
i < lim - 1 && (c = getchar()) != '\n' && c != EOF
The comma operator does not combine conditions: it evaluates its left operand, discards the result, and yields the value of its right operand. To combine conditions, you need the logical operators.
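As a sketch of the && version in a complete function (the name read_line_bounded and the FILE * parameter are assumptions, not part of the question), note that it also terminates the string, which the two programs in the question never do before calling printf("%s").

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads one line from fp into s, storing at most
   lim - 1 characters, and returns how many were stored. */
size_t read_line_bounded(FILE *fp, char *s, size_t lim)
{
    assert(fp != NULL && s != NULL && lim > 0);
    size_t i = 0;
    int c;
    /* && short-circuits: once the buffer is full, getc is not called. */
    while (i + 1 < lim && (c = getc(fp)) != '\n' && c != EOF)
        s[i++] = (char)c;
    s[i] = '\0';    /* terminate so that printf("%s") is safe */
    return i;
}
```

Because all three tests take part in the result, the loop stops when the buffer is full, at a newline, or at end of file, whichever comes first.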

Using fgetc to read words?

I want to read a text file, character by character, and then do something with the characters and something with the words. This is my implementation:
char c;
char* word="";
fp = fopen("text.txt","rt");
do
{
c = (char)fgetc(fp);
if(c == ' ' || c == '\n' || c == '\0' || c == '\t')
{
//wordfunction(word)
word = ""; //Reset word
}
else
{
strcat(word, &c); //Keeps track of current word
}
//characterfunction(c);
}while(c != EOF);
fclose(fp);
However, when I try to run this my program instantly crashes. Is there a problem with setting word to ""? If so, what should I do instead?
In the initial assignment, word points to a string literal of length 0. A string literal must not be modified, and it has no room for extra characters anyway, so when strcat tries to write into it your program breaks. You need to reserve space for your words instead.
Where you have
char* word="";
use instead
char word[100];
This reserves space for 100 chars (a word of up to 99 characters plus the terminating '\0').
int c; // int rather than char, so that EOF can be told apart from real characters
char word[100];
fp = fopen("text.txt","rt");
int index = 0;
word[0] = 0;
do {
c = fgetc(fp);
if(c == ' ' || c == '\n' || c == '\0' || c == '\t' || c == EOF) {
//wordfunction(word)
word[0] = 0; //Reset word
index = 0;
} else if(index < 99) { // keep room for the terminator
word[index++] = (char)c;
word[index] = 0;
//strcat(word, &c); //Keeps track of current word
}
//characterfunction(c);
} while(c != EOF);
fclose(fp);
word points to a constant area that strcat is not allowed to write to, which is why your program crashes.
word needs enough space to hold the characters: use a fixed-size array, or allocate the buffer dynamically and grow it with realloc.
This compiles with gcc:
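#include <assert.h>
#include <stdio.h>
int main(void){
int c; // int, not char, so that EOF can be detected
char word[1000] = {0}; //enough space
FILE* fp = fopen("text.txt","rt");
int index = 0;
assert(0 != fp);
do
{
c = fgetc(fp);
if(c == ' ' || c == '\n' || c == '\0' || c == '\t' || c == EOF)
{
word[index] = 0; // terminate the current word before using it
//wordfunction(word)
index = 0;
}
else if(index < 999) // leave room for the terminator
{
word[index++] = (char)c;
}
//characterfunction(c);
}while(c != EOF);
fclose(fp);
return 0;
}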
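The same idea can be packaged as a reusable function. This is only a sketch: the name next_word and its signature are assumptions, and it uses isspace from <ctype.h> instead of the explicit list of delimiters in the question. It also sidesteps strcat(word, &c) altogether, since &c points to a single character rather than a null-terminated string.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: stores the next whitespace-delimited word from fp
   in word (at most size - 1 chars). Returns 1 if a word was read, 0 at
   end of file. */
int next_word(FILE *fp, char *word, size_t size)
{
    assert(fp != NULL && word != NULL && size > 1);
    size_t n = 0;
    int c;                                  /* int, so EOF is detectable */

    while ((c = fgetc(fp)) != EOF && isspace(c))
        ;                                   /* skip leading whitespace */
    while (c != EOF && !isspace(c)) {
        if (n + 1 < size)                   /* never write past the buffer */
            word[n++] = (char)c;
        c = fgetc(fp);
    }
    word[n] = '\0';
    return n > 0;
}
```

A caller would loop with while (next_word(fp, word, sizeof word)) and pass each word to its own processing function.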

C Random, having problems

void getS(char *fileName){
FILE *src;
if((src = fopen(fileName, "r")) == NULL){
printf("%s %s %s", "Cannot open file ", fileName, ". The program is now ending.");
exit(-1);
}
//char *get = " ";
int c = 1;
char ch = 'x';
while(ch!=EOF) {
ch = fgetc(src);
if(ch == '\n') c++;
}
fseek(src, 0, SEEK_SET);
int random = rand() % c;
int i = 0;
for(i = 0; i < random; i++){
while(ch != '\n'){
ch = fgetc(src);
}
}
do{
ch = fgetc(src);
if(ch != '\n' && ch != EOF){
printf("%c", ch);
}
}while(ch != '\n' && ch != EOF);
printf("%c", '\n');
fclose(src);
}
So this is my function that grabs a file and prints out a random word in the file if each word is separated by a new line.
Question 1:
Why is the random having preference to the first 2 words?
Question 2: How would I make it so I can use this function multiple times without doing the printf("%c", '\n'); because if I don't have that in the end the previous function call just overwrites the old one.
Thanks in advance, I've been asking a bit today thanks for all the help stackoverflow! :)
P.S. using srand(time(NULL));
Look at the logic here:
for(i = 0; i < random; i++){
while(ch != '\n'){
ch = fgetc(src);
}
}
Once you hit a newline, you won't read any more characters, so you're always going to print either the first or second line.
You can fix it like this:
for(i = 0; i < random; i++){
ch = fgetc(src); // start by reading the first character on the line
while(ch != '\n' && ch != EOF){ // also stop at end of file, so the loop cannot spin forever
ch = fgetc(src);
}
}
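Jim Balter also notes that ch would best be declared as an int. This is because fgetc returns an int: EOF is not a regular character but a negative value outside the range of unsigned char, so it cannot be reliably detected once the result has been stored in a char.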
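Putting both points together, the skip-then-read logic could be sketched as follows. The name read_nth_line and its signature are assumptions; unlike the question's function, it copies the line into a buffer instead of printing it.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: rewinds fp, skips n lines, then copies the next
   line (without its newline) into buf. Returns 1 on success, 0 if the
   file has fewer than n + 1 lines. */
int read_nth_line(FILE *fp, int n, char *buf, size_t size)
{
    assert(fp != NULL && buf != NULL && size > 0 && n >= 0);
    int ch = 0;                         /* int, so EOF is detectable */
    size_t len = 0;

    buf[0] = '\0';
    rewind(fp);
    for (int i = 0; i < n; i++) {
        /* Consume one whole line, including its newline. */
        while ((ch = fgetc(fp)) != '\n' && ch != EOF)
            ;
        if (ch == EOF)
            return 0;
    }
    while ((ch = fgetc(fp)) != '\n' && ch != EOF) {
        if (len + 1 < size)
            buf[len++] = (char)ch;
    }
    buf[len] = '\0';
    return len > 0 || ch == '\n';
}
```

Each iteration of the for loop starts with a fresh read, so a newline left in ch from the previous line cannot end the next skip early, which is the bug described above.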
Without the printf("%c", '\n'); line at the end, it works fine.
