I bought a C book called "The C (ANSI C) PROGRAMMING LANGUAGE" to try to teach myself, well, C. The book includes a lot of examples and exercises to follow across the chapters, which is nice.
The code below is my answer to the book's "count the longest line" type of program. The authors use a for loop in the function getLine(char s[], int lim), which allows the string line to be displayed properly inside main(). With a while loop, however, it doesn't work, for a reason unknown to me. Perhaps someone can shed some light on what my error is.
EDIT: To summarize the above: printf("%s\n", line); won't display anything.
Thankful for any help.
#include <stdio.h>
#define MAXLINE 1024
getLine(char s[], int lim) {
int c, i = 0;
while((c = getchar()) != EOF && c != '\n' && i < lim) {
s[++i] = c;
}
if(c == '\n' && i != 0) {
s[++i] = c;
s[++i] = '\0';
}
return i;
}
main(void) {
int max = 0, len;
char line[MAXLINE], longest[MAXLINE];
while((len = getLine(line,MAXLINE)) > 0) {
if(len > max) {
max = len;
printf("%s\n", line);
}
}
return 0;
}
You have a number of serious bugs. Here are the ones I found and how to fix them.
Change your code to post-increment i, to avoid leaving the first array member uninitialised and to avoid double printing the final character:
s[++i] = c;
...
s[++i] = c;
s[++i] = '\0';
to
s[i++] = c;
...
// s[++i] = c; see below
...
s[i++] = '\0';
and fix your EOF bug:
if(c == '\n' && i != 0) {
s[++i] = c;
s[++i] = '\0';
}
to
if(c == '\n')
{
s[i++] = '\n';
}
s[i] = '\0';
Theory
When writing programs that deal with strings, arrays or other vector-type structures it is vitally important that you check the logic of your program. You should do this by hand, and run a few sample cases through it, providing sample inputs to your program and thinking out what happens.
The cases you need to run through it are:
a couple general cases
all the edge cases
In this case, your edge cases are:
first character ever is EOF
first character is 'x', second character ever is EOF
first character is '\n', second character is EOF
first character is 'x', second character is '\n', third character is EOF
a line has equal to lim characters
a line has one less than lim characters
a line has one more than lim characters
Sample edge case
first character is 'x', second character is '\n', third character is EOF
getLine(line[MAXLINE], MAXLINE)
(s := line[MAXLINE] = '!!!!!!!!!!!!!!!!!!!!!!!!...'
c := undef, i := 0
while(...)
c := 'x'
i := 1
s[1] := 'x' => s == '!x!!!!...' <- first bug found
while(...)
c := '\n'
end of while(...)
if (...)
(c== '\n' (T) && i != 0 (T)) = T
i := i + 1 = 2
s[2] = '\n' => s == '!x\n!!!!'
i := i + 1 = 3
s[3] = '\0' => s == '!x\n\0!!!' <- good, it's terminated
return i = 3
((len = i = 3) > 0) = T (the while eval)
if (...)
len (i = 3) > max (= 0) = T
max = 3 <- does this make sense? is '!x\n' a line 3 chars long? perhaps. why did we keep the '\n' character? this is likely to be another bug.
printf("%s\n", line) <- oh, we are adding ANOTHER \n character? it was definitely a bug.
outputs "!x\n\n\0" <- oh, I expected it to print "x\n". we know why it didn't.
while(...)
getLine(...)
(s := line[MAXLINE] = '!x\n\0!!!!!!!!!!!!!!!!!!!...' ; <- oh, that's fun.
c := undef, i := 0
while(...)
c := EOF
while terminates without executing body
(c == '\n' && i != 0) = F
if body not executed
return i = 0
((len = i = 0) > 0) = F
while terminates
program stops.
So you see this simple process, that can be done in your head or (preferably) on paper, can show you in a matter of minutes whether your program will work or not.
By following through the other edge cases and a couple general cases you will discover the other problems in your program.
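The edge cases listed above can also be written down as executable checks. The following is only a minimal sketch, not part of the original program: read_line is a hypothetical variant of getLine that takes a FILE * (an assumption, so that input can come from a temporary file instead of the keyboard) and assumes lim >= 1, while case_ok and run_edge_cases are illustrative helper names.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical variant of getLine: reads from `in` instead of stdin.
 * Assumes lim >= 1 so there is always room for the terminator. */
static int read_line(FILE *in, char s[], int lim)
{
    int c = EOF, i = 0;

    while (i < lim - 1 && (c = getc(in)) != EOF && c != '\n')
        s[i++] = (char)c;
    if (c == '\n')
        s[i++] = '\n';
    s[i] = '\0';
    return i;
}

/* Feeds `input` to read_line with a limit of `lim` bytes and reports
 * whether the first line read equals `expect`. */
static int case_ok(const char *input, int lim, const char *expect)
{
    char buf[16];
    FILE *f;
    int ok;

    if (lim < 1 || lim > (int)sizeof buf)
        return 0;
    f = tmpfile();
    if (f == NULL)
        return 0;
    fputs(input, f);
    rewind(f);
    ok = read_line(f, buf, lim) == (int)strlen(expect)
         && strcmp(buf, expect) == 0;
    fclose(f);
    return ok;
}

/* Returns the number of failing edge cases (0 means all passed). */
int run_edge_cases(void)
{
    int failures = 0;

    failures += !case_ok("",       8, "");     /* first character is EOF */
    failures += !case_ok("x",      8, "x");    /* 'x', then EOF */
    failures += !case_ok("\n",     8, "\n");   /* '\n', then EOF */
    failures += !case_ok("x\n",    8, "x\n");  /* 'x', '\n', then EOF */
    failures += !case_ok("ab\n",   4, "ab\n"); /* line just fits in the buffer */
    failures += !case_ok("abc\n",  4, "abc");  /* text fills the buffer, '\n' left unread */
    failures += !case_ok("abcd\n", 4, "abc");  /* line longer than the buffer */
    return failures;
}
```

Calling run_edge_cases() from a small test main and checking for 0 gives the same confidence as the paper trace, and the cases can be rerun after every change.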
It's not clear from your question exactly what problem you're having with getLine (compile error? runtime error?), but there are a couple of bugs in your implementation. Instead of
s[++i] = something;
You should be using the postfix operator:
s[i++] = something;
The difference is that the first version stores 'something' at index i+1, but the second version stores it at index i. In C, arrays are indexed from 0, so you need to make sure the character is stored in s[0] on the first pass through your while loop, in s[1] on the second pass, and so on. With the code you posted, s[0] is never assigned to, which causes printf() to print uninitialised data.
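To make the difference concrete, here is a small sketch. copy_pre and copy_post are hypothetical helpers, not functions from the question; each copies src into s and returns the final value of i. The caller is assumed to provide at least strlen(src) + 2 bytes.

```c
/* Pre-increment: i is bumped before the store, so s[0] is never written. */
int copy_pre(char s[], const char *src)
{
    int i = 0;

    while (*src != '\0')
        s[++i] = *src++;   /* first store lands in s[1] */
    s[++i] = '\0';
    return i;
}

/* Post-increment: the store uses the old value of i, starting at s[0]. */
int copy_post(char s[], const char *src)
{
    int i = 0;

    while (*src != '\0')
        s[i++] = *src++;   /* first store lands in s[0] */
    s[i] = '\0';
    return i;
}
```

If a buffer is filled with '!' beforehand, copy_pre leaves the '!' in s[0] in front of the copied text. If s[0] happens to hold 0 instead, printf("%s") treats the buffer as an empty string, which would explain seeing no output at all.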
The following implementation of getline works for me:
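int getLine(char s[], int lim) {
    int c = EOF;   /* initialised in case the loop body never runs */
    int i = 0;

    /* Stop at lim - 1 so there is always room for the terminator. */
    while(i < lim - 1 && (c = getchar()) != EOF && c != '\n') {
        s[i++] = c;
    }
    if(c == '\n') {
        s[i++] = c;   /* keep the newline in the line */
    }
    s[i] = '\0';      /* always terminate; not counted in the length */
    return i;
}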
By doing ++i instead of i++, you are not assigning anything to s[0] in getLine()!
Also, you are unnecessarily incrementing when assigning '\0' at the end of the loop. By the way, you should always assign it, so take it out of the conditional.
Also add return types to the functions (int main and int getLine).
Watch out for overflow as well: the loop can leave i == lim, so the assignments after it may write to s[MAXLINE] and beyond. This would be a (wait for it) stack buffer overflow, yup.
Related
I'm trying to write a program which reads one or more input lines and, if a line is too long, folds it at a maximum number of characters. My approach is to write the input characters into a first array of a given length. When the maximum length is reached or '\n' is read, I copy the content into a bigger array, which will be the final string to print, and then read the next line of input. The problem is that it doesn't work and I can't figure out why. Thanks for the help.
#include <stdio.h>
#define MAXCOL 10
#define FINAL_LENGTH 300
char line[MAXCOL];
char final_string[FINAL_LENGTH];
extern int fnlstr_pos = 0;
int main()
{
int pos, c;
pos = 0;
while(c=getchar() != EOF)
{
line[pos] = c;
if (pos + 1 >= MAXCOL || c == '\n'){
to_printandi(pos);
pos = 0;
}
++pos;
}
printf("%s", final_string);
}
to_printandi(pos)
int pos;
{
int i;
for(i = 0; i <= pos; ++i){
final_string[fnlstr_pos] = line[i];
++fnlstr_pos;
}
if (final_string[fnlstr_pos] != '\n'){
final_string[++fnlstr_pos] = '\n';
}
++fnlstr_pos;
}
There are several problems in the code. Others have already pointed out the bug in the getchar() line.
More variables, more functions and more code only tie you in knots. If you take some time to think about what you want to achieve and go slowly, you can get your results with much less effort. Less code, full of helpful comments, makes for better programs.
EDIT
Looking at the code with fresh eyes, I realised that the two lines explicitly setting the 'trailing' byte to '\0' were writing 0 over bytes already initialised to 0. I have commented out those two lines as they are superfluous.
#include <stdio.h>
int main () {
char buf[ 1024 ] = { 0 }; // buffer initialised
int ch, cnt = 0, ccnt = 0; // input char and counters
while( ( ch = getchar() ) != EOF ) { // get a character
ccnt++; // count this character
buf[ cnt++ ] = (char)ch; // assign this character
// buf[ cnt ] = '\0'; // string always terminated
if( buf[ cnt-1 ] == '\n' ) // user supplied LF?
ccnt = 0; // reset the counter (for width)
else
if( ccnt == 10 ) { // reached max output width?
buf[ cnt++ ] = '\n'; // inject a LF
// buf[ cnt ] = '\0'; // string always terminated
ccnt = 0; // reset the counter (for width)
}
}
puts( buf ); // output the whole shebang
return 0;
}
0123456789abcdefghijklmnop
qrs
tuv
wxyz
^D // end of input
0123456789
abcdefghij
klmnop
qrs
tuv
wxyz
Like the OP code, this does not test for overrunning the buffer. An easy addition left as an exercise for the reader.
EDIT2:
Then again, why have a buffer to overrun?
#include <stdio.h>
int main( void ) {
for( int ch, n = 0; ( ch = getchar() ) != EOF; /**/ )
if( (n = putchar( ch ) == '\n' ? 0 : n + 1) == 10 )
putchar( '\n' ), n = 0;
}
This question is very broad and it would be helpful if you said what the problem is, but I can see one issue: you don't null-terminate the final_string variable. Add
final_string[fnlstr_pos] = '\0';
before the printf.
Maybe that fixes the problem you are having.
For starters this statement
while(c=getchar() != EOF)
is equivalent to
while( c = ( getchar() != EOF ) )
So c is always equal to 1 as long as EOF is not encountered.
You need to write
while( ( c=getchar() ) != EOF)
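The effect of the missing parentheses can be observed directly. This is a small sketch: next_char is a hypothetical stand-in for getchar() that reads from a string, so the result does not depend on keyboard input, and the two first_c_* function names are illustrative only.

```c
#include <stdio.h>

/* Hypothetical stand-in for getchar(): returns the next character of the
 * string, or EOF once the terminator is reached. */
static int next_char(const char **p)
{
    return **p ? (unsigned char)*(*p)++ : EOF;
}

/* Value of c after one read when the parentheses are missing. */
int first_c_unparenthesized(const char *text)
{
    int c;

    c = next_char(&text) != EOF;   /* parsed as c = (next_char(...) != EOF) */
    return c;
}

/* Value of c after one read with the assignment parenthesized. */
int first_c_parenthesized(const char *text)
{
    int c;

    if ((c = next_char(&text)) != EOF)
        return c;                  /* c holds the character itself */
    return EOF;
}
```

For the input "A", the first function yields 1 and the second yields 'A', which is why every element of line ended up holding the value 1 rather than the typed character.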
And you need to append the terminating zero character '\0' to the input sequence to form a string.
Another problem is this code snippet with the for loop
for(i = 0; i <= pos; ++i){
final_string[fnlstr_pos] = line[i];
++fnlstr_pos;
}
Because the loop has already advanced fnlstr_pos past the last copied character, this if statement
if (final_string[fnlstr_pos] != '\n'){
final_string[++fnlstr_pos] = '\n';
}
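does not test the character that was just copied. It reads the next element, which has not been written yet. Since final_string is a global array, that element is still zero, so the condition is always true: the ++fnlstr_pos skips over the zero byte and stores '\n' after it. The zero left behind terminates the string early, so printf stops at the first chunk.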
The main problem is here:
while(c=getchar() != EOF)
Given the operator precedence, this will result in the same as:
while(c= (getchar() != EOF))
So c will be 1 (true) inside the loop.
Change this to:
while((c=getchar()) != EOF)
This is the main problem. Just as @Hogan suggested, there are other issues, such as not null-terminating the strings.
Because you declare them as globals, they are zero-initialized, so you can get away with that, except when the user provides a string of maximum length.
Also it would greatly improve the code if you could use string manipulation functions from <string.h> instead of copying byte by byte.
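As an illustration of the <string.h> suggestion, here is a sketch of folding with strcspn and memcpy instead of a byte-by-byte loop. fold_text is a hypothetical helper, not part of the question's code; its signature, the "return 0 when dst is too small" convention and the choice not to add a second '\n' after a line that is exactly `width` characters long are all assumptions.

```c
#include <string.h>

/* Copies src into dst, inserting '\n' after every `width` characters of a
 * line that is longer than that. Returns the number of bytes written
 * (excluding '\0'), or 0 if width is 0 or dst is too small. */
size_t fold_text(char *dst, size_t dstsize, const char *src, size_t width)
{
    size_t out = 0;

    if (dstsize == 0 || width == 0)
        return 0;
    dst[0] = '\0';

    while (*src != '\0') {
        size_t n = strcspn(src, "\n");   /* characters before the next '\n' */
        size_t skip = n;                 /* how far to advance in src */
        size_t add_newline = 1;

        if (n > width)
            skip = n = width;            /* line too long: fold it here */
        else if (src[n] == '\n')
            skip = n + 1;                /* consume the line's own '\n' */
        else
            add_newline = 0;             /* last line had no '\n' */

        if (out + n + add_newline + 1 > dstsize) {
            dst[0] = '\0';
            return 0;                    /* would overrun dst */
        }
        memcpy(dst + out, src, n);       /* copy the whole chunk in one call */
        out += n;
        if (add_newline)
            dst[out++] = '\n';
        src += skip;
    }
    dst[out] = '\0';
    return out;
}
```

Because the bounds check happens before each memcpy, the output array cannot be overrun, and the result is always null-terminated.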
I tried to implement a solution for the exercise on the C language of K&R's book. I wanted to ask here if this could be considered a legal "solution", just modifying the main without changing things inside external functions.
Revise the main routine of the longest-line program so it will
correctly print the length of arbitrarily long input lines, and as much
as possible of the text.
#include <stdio.h>
#define MAXLINE 2 ////
int get_line1(char s[], int lim)
{
int c, i;
for (i = 0; i < lim - 1 && ((c = getchar()) != EOF) && c != '\n'; i++) {
s[i] = c;
}
if (c == '\n') {
s[i] = c;
i++;
}
s[i] = '\0';
return i;
}
int main()
{
int len;
int max = MAXLINE;
char line[MAXLINE];
int tot = 0;
int text_l = 0;
while ((len = get_line1(line, max)) > 0) {
if (line[len - 1] != '\n') {
tot = tot + len;
}
if (line[1] == '\n' || line[0] == '\n') {
printf("%d\n", tot + 1);
text_l = text_l + (tot + 1);
tot = 0;
}
}
printf("%d\n", text_l);
}
The idea is to set the maximum length of the string held in the array line to 2.
For a string such as abcdef\n, the array line will first hold a single character, because with a limit of 2 one slot is reserved for the terminator. Since the last element of the array is not \n (so the line we are considering is not over), we save the length so far and repeat the cycle. We keep getting one character at a time, and at the end we get an array holding just \n. Then the second if condition is true, since the first element of this array is \n, and we print the tot length obtained from the previous additions. We add +1 in order to count the newline character \n as well. This works the same way for strings of odd length such as abcdefg\n.
Outside the loop then we print the total amount of text.
Is this a correct way to do the exercise?
The exercise says to “Revise the main routine,” but you altered the definition of MAXLINE, which is outside of main, so that is not a valid solution.
Also, your code does not have the copy or getline routines of the original. Your get_line1 appears to be identical to getline except for the name. However, a correct solution would use identical source code except for the code inside main.
Additionally, the exercise says to print “as much as possible of the text.” That is unclearly stated, but I expect it means to keep a buffer of MAXLINE characters (with MAXLINE at its original value of 1000) and use it to print the first MAXLINE−1 characters of the longest line.
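One possible reading of that interpretation is sketched below. It is an assumption about what the exercise intends, not the book's solution. To keep it testable, the revised main loop is wrapped in a hypothetical function longest_line, and get_line is the book's routine changed only to read from a FILE * (also an assumption). When a chunk fills the buffer without a '\n', the loop keeps reading to count the rest of the line but keeps only the first chunk's text.

```c
#include <stdio.h>
#include <string.h>

#define MAXLINE 1000   /* the original value mentioned above */

/* Same logic as the book's line reader, except for the FILE * parameter. */
static int get_line(FILE *in, char s[], int lim)
{
    int c = EOF, i;

    for (i = 0; i < lim - 1 && (c = getc(in)) != EOF && c != '\n'; ++i)
        s[i] = (char)c;
    if (c == '\n')
        s[i++] = (char)c;
    s[i] = '\0';
    return i;
}

/* Hypothetical stand-in for the revised main loop: returns the full length
 * of the longest line and stores its first MAXLINE-1 characters. */
long longest_line(FILE *in, char longest[MAXLINE])
{
    char line[MAXLINE], chunk[MAXLINE];
    long max = 0;
    int len;

    longest[0] = '\0';
    while ((len = get_line(in, line, MAXLINE)) > 0) {
        long total = len;
        int ended = (line[len - 1] == '\n');

        /* The line did not fit: count the remainder chunk by chunk. */
        while (!ended && (len = get_line(in, chunk, MAXLINE)) > 0) {
            total += len;
            ended = (chunk[len - 1] == '\n');
        }
        if (total > max) {
            max = total;
            strcpy(longest, line);   /* as much of the text as fits */
        }
    }
    return max;
}
```

In the actual exercise the body of longest_line would sit directly in main, with getline and copy left untouched.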
I could not understand what mgetline does in this code.
Can anyone help me?
int mgetline(char s[],int lim)
{
int c, i;
for(i = 0; i < lim - 1 && (c = getchar()) != EOF && c != '\n'; ++i)
s[i] = c;
if(c == '\n')
{
s[i] = c;
++i;
}
s[i] = '\0';
return i;
}
The function basically reads characters one by one from the standard input stream stdin until you enter a \n (newline), the end of input is reached, or the array limit of s, lim, is reached. The characters are stored in the char s[] and the length of what was read is returned.
It's hard to answer with more detail since it's a little unclear what it is you don't understand, but I've tried to annotate the code to make it somewhat clearer.
This is the same code, only reformatted to fit my comments.
int mgetline(char s[], int lim) {
int c, i;
for(i = 0; // init-statement, start with `i` at zero
i < lim - 1 // condition, `i` must be less than `lim - 1`
&& // condition, logical AND
(c = getchar()) !=EOF // (function call, assignment) condition, `c` must not be EOF
&& // condition, logical AND
c != '\n'; // condition, `c` must not be `\n` (newline)
++i) // iteration_expression, increase i by one
{
s[i] = c; // store the value of `c` in `s[i]`
}
if(c == '\n') { // if a newline was the last character read
s[i] = c; // store it
++i; // and increase i by one
}
s[i] = '\0'; // store a null terminator last
return i; // return the length of the string stored in `s`
}
The condition part of the for loop contains three conditions that must all be true for the loop to execute its body. I've made that body into a code block to make it easier to see the scope. EOF is a special value that getchar() returns at the end of input (for example when stdin is closed) or on a read error.
Note: If you pass an array of one char (lim == 1) this function will cause undefined behavior. Any program reading uninitialized variables has undefined behavior - and that's a bad thing. In this case, if lim == 1, you will read c after the loop and c will then still be uninitialized.
Either initialize it:
int mgetline(char s[], int lim) {
int c = 0, i;
or bail out of the function:
int mgetline(char s[], int lim) {
if(lim < 2) {
if(lim == 1) s[0] = '\0';
return 0;
}
int c, i;
I'm working towards solving the exercise, but halfway through I hit something weird that I cannot figure out.
Below is the code snippet. I know it is still some steps away from finished, but I think it's worth figuring out why the result is like this!
#define MAXLINE 1000
int my_getline(char line[], int maxline);
int main(){
int len;
char line[MAXLINE];/* current input line */
int j;
while((len = my_getline(line, MAXLINE)) > 0 ){
for (j = 0 ; j <= len-1 && line[j] != ' ' && line[j] != '\t'; j++){
printf("%c", line[j]);
}
}
return 0;
}
int my_getline(char s[], int limit){
int c,i;
for (i = 0 ; i < limit -1 && (c = getchar()) != EOF && c != '\n'; i++)
s[i] = c;
if (c == '\n'){
s[i] = c;
++i;
}
s[i] = '\0';
return i;
}
It compiles successfully with cc: cc code.c. But the result is odd!
It works for lines without \t and blanks:
hello
hello
but it does not work for the line in the picture:
I typed hel[blank][blank]lo[blank]\n:
Could anyone help me a bit? many thanks!
The problem is that you get stuck because you try to read a full line and then process it. It is better to process the input character by character (most of the K&R exercises work this way). Don't print blanks and tabs as soon as you see them. Save them in a buffer instead, and print them only when the next character you read is a nonblank one. A newline discards the buffered blanks, which removes the trailing whitespace. You should also remember the last nonblank character read (blanks are eliminated before a newline anyway) to see whether it was a newline: in that case the newline you have just read is not printed, so out of a sequence of two or more newlines only the first is printed. This is a complete sample program that does this:
#include <stdio.h>
#include <stdlib.h>
#define F(_f) __FILE__":%d:%s: "_f, __LINE__, __func__
int main()
{
char buffer[1000];
int bs = 0;
int last_char = '\n', c;
unsigned long
eliminated_spntabs = 0,
eliminated_nl = 0;
while((c = getchar()) != EOF) {
switch(c) {
case '\t': case ' ':
if (bs >= sizeof buffer) {
/* full buffer, cannot fit more blanks/tabs */
fprintf(stderr,
"we can only hold up to %d blanks/tabs in"
" sequence\n", (int)sizeof buffer);
exit(1);
}
/* add to buffer */
buffer[bs++] = c;
break;
default: /* normal char */
/* print intermediate spaces, if any */
if (bs > 0) {
printf("%.*s", bs, buffer);
bs = 0;
}
/* and the read char */
putchar(c);
/* we only update last_char on nonspaces and
* \n's. */
last_char = c;
break;
case '\n':
/* eliminate the accumulated spaces */
if (bs > 0) {
eliminated_spntabs += bs;
/* this trace to stderr to indicate the number of
* spaces/tabs that have been eliminated.
* Erase it when you are happy with the code. */
fprintf(stderr, "<<%d>>", bs);
bs = 0;
}
if (last_char != '\n') {
putchar('\n');
} else {
eliminated_nl++;
}
last_char = '\n';
break;
} /* switch */
} /* while */
fprintf(stderr,
F("Eliminated tabs: %lu\n"),
eliminated_spntabs);
fprintf(stderr,
F("Eliminated newl: %lu\n"),
eliminated_nl);
return 0;
}
The program prints (on stderr, so as not to interfere with the normal output) the number of tabs/spaces eliminated at the end of each line, surrounded by << and >>. At the end it also prints the total number of eliminated tabs/spaces and the number of eliminated newlines. A line containing only spaces is considered a blank line, and so it is eliminated. If you don't want lines containing only spaces to be eliminated (the spaces themselves will be eliminated anyway, as they are at the end of the line), just assign the spaces/tabs seen to the variable last_char.
In addition to the good answer by @LuisColorado, there are several ways you can look at your problem that may simplify things for you. Rather than using multiple conditionals to check for c == ' ' and c == '\t' and c == '\n', include ctype.h and use the isspace() macro to determine whether the current character is whitespace. It is a much clearer way to go.
As for the return type, POSIX getline uses ssize_t as a signed return type, allowing it to return -1 on error. While that type is a bit awkward, you can do the same with long (or int64_t for a guaranteed exact width).
While I am a bit unclear on what you are trying to accomplish, you appear to want to read the line of input and ignore whitespace. (While POSIX getline() and fgets() both include the trailing '\n' in the count, it may be more advantageous to read (consume) the '\n' but not include it in the buffer filled by my_getline(). That is up to you.) So from the example output provided above, it looks like you want both "hello" and "hel lo " to be read and stored as "hello".
If that is the case, then you can simplify your function as:
long my_getline (char *s, size_t limit)
{
int c = 0;
long n = 0;
while ((size_t)n + 1 < limit && (c = getchar()) != EOF && c != '\n') {
if (!isspace (c))
s[n++] = c;
}
s[n] = 0;
return n ? n : c == EOF ? -1 : 0;
}
The return statement is just the combination of two ternary clauses which will return the number of characters read, including 0 if the line was all whitespace, or it will return -1 if EOF is encountered before a character is read. (a ternary simply being a shorthand if ... else ... statement in the form test ? if_true : if_false)
Also note the choice made above for handling the '\n' was to read the '\n' but not include it in the buffer filled. You can change that by simply removing the && c != '\n' from the while() test and including it as a simple if (c == '\n') break; at the very end of the while loop.
Putting together a short example, you would have:
#include <stdio.h>
#include <ctype.h>
#define MAXC 1024
long my_getline (char *s, size_t limit)
{
int c = 0;
long n = 0;
while ((size_t)n + 1 < limit && (c = getchar()) != EOF && c != '\n') {
if (!isspace (c))
s[n++] = c;
}
s[n] = 0;
return n ? n : c == EOF ? -1 : 0;
}
int main (void) {
char str[MAXC];
long nchr = 0;
fputs ("enter line: ", stdout);
if ((nchr = my_getline (str, MAXC)) != -1)
printf ("%s (%ld chars)\n", str, nchr);
else
puts ("EOF before any valid input");
}
Example Use/Output
With your two input examples, "hello" and "hel lo ", you would have:
$ ./bin/my_getline
enter line: hello
hello (5 chars)
Or with included whitespace:
$ ./bin/my_getline
enter line: hel lo
hello (5 chars)
Testing the error condition by pressing Ctrl + d (or Ctrl + z on windows):
$ ./bin/my_getline
enter line: EOF before any valid input
There are many ways to put these pieces together, this is just one possible solution. Look things over and let me know if you have further questions.
So I'm working through the K&R C book and there was a bug in my code that I simply cannot figure out.
The program is supposed to remove all the comments from a C program. Obviously I'm just using stdin
#include <stdio.h>
int getaline (char s[], int lim);
#define MAXLINE 1000 //maximum number of characters to put into string[]
#define OUTOFCOMMENT 0
#define INASINGLECOMMENT 1
#define INMULTICOMMENT 2
int main(void)
{
int i;
int isInComment;
char string[MAXLINE];
getaline(string, MAXLINE);
for (i = 0; string[i] != EOF; ++i) {
//finds whether loop is in a comment or not
if (string[i] == '/') {
if (string[i+1] == '/')
isInComment = INASINGLECOMMENT;
if (string[i+1] == '*')
isInComment = INMULTICOMMENT;
}
//fixes the problem of print messing up after the comment
if (isInComment == INASINGLECOMMENT && string[i] == '\0')
printf("\n");
//if the line is done, restates all the variables
if (string[i] == '\0') {
getaline(string, MAXLINE);
i = 0;
if (isInComment != INMULTICOMMENT)
isInComment = OUTOFCOMMENT;
}
//prints current character in loop
if(isInComment == OUTOFCOMMENT && string[i] != EOF)
printf("%c", string[i]);
//checks to see of multiline comment is over
if(string[i] == '*' && string[i+1] == '/' ) {
++i;
isInComment = OUTOFCOMMENT;
}
}
return 0;
}
So this works great except for one problem. Whenever a line starts with a comment, it prints that comment.
So for instance, if I had a line that was simply
//this is a comment
without anything before the comment begins, it will print that comment even though it's not supposed to.
I thought I was making good progress, but this bug has really been holding me up. I hope this isn't some super easy thing I've missed.
EDIT: Forgot the getaline function
//puts line into s[], returns length of that line
int getaline(char s[], int lim)
{
int c, i;
for (i = 0; i < lim-1 && (c = getchar()) != '\n'; ++i)
s[i] = c;
if (c == '\n') {
s[i] = c;
++i;
}
s[i] = '\0';
return i;
}
There are many problems in your code:
isInComment is not initialized in function main.
As pointed out by others, string[i] != EOF is wrong. You need to test for end of file more precisely, especially for files that do not end with a linefeed. This test only works if the char type is signed and EOF is a valid signed char value. It will nonetheless mistakenly stop on a stray \377 character, which is legal in a string or in a comment.
When you detect the end of line, you read another line and reset i to 0, but i will be incremented by the for loop before you test again for single line comment... hence the bug!
You do not handle special cases such as /* // */ or // /*
You do not handle strings. This is not a comment: "/*", nor this: '//'
You do not handle \ at end of line (escaped linefeed). This can be used to extend single line comments, strings, etc. There are more subtle cases related to \ handling and if you really want completeness, you should handle trigraphs too.
Your implementation has a limit for line size, this is not needed.
The problem you are assigned is a bit tricky. Instead of reading and parsing lines, read one character at a time and implement a state machine to parse escaped linefeeds, strings, and both comment styles. The code is not too difficult if you do it right with this method.
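A minimal sketch of such a state machine follows. It is an illustration under stated assumptions, not a complete solution: strip_comments is a hypothetical function that works on a string rather than on stdin (so it can be tested), dst is assumed to have room for strlen(src) + 1 bytes, a block comment is replaced by one space, and trigraphs and backslash-newline outside of line comments and literals are not handled.

```c
#include <stddef.h>

enum state { CODE, SLASH, LINE_COMMENT, LINE_ESC, BLOCK_COMMENT, BLOCK_STAR,
             LITERAL, LITERAL_ESC };

/* Copies src to dst without comments; returns the number of bytes written. */
size_t strip_comments(const char *src, char *dst)
{
    enum state st = CODE;
    char quote = 0;          /* '"' or '\'' while inside a literal */
    size_t n = 0;

    for (; *src != '\0'; ++src) {
        char c = *src;

        switch (st) {
        case CODE:
            if (c == '/') { st = SLASH; break; }   /* hold it: maybe a comment */
            if (c == '"' || c == '\'') { st = LITERAL; quote = c; }
            dst[n++] = c;
            break;
        case SLASH:
            if (c == '/')      st = LINE_COMMENT;
            else if (c == '*') st = BLOCK_COMMENT;
            else { dst[n++] = '/'; st = CODE; --src; }  /* plain '/': reprocess c */
            break;
        case LINE_COMMENT:
            if (c == '\\')      st = LINE_ESC;
            else if (c == '\n') { dst[n++] = c; st = CODE; }
            break;
        case LINE_ESC:          /* backslash-newline continues the comment */
            if (c != '\\') st = LINE_COMMENT;
            break;
        case BLOCK_COMMENT:
            if (c == '*') st = BLOCK_STAR;
            break;
        case BLOCK_STAR:
            if (c == '/')      { dst[n++] = ' '; st = CODE; }
            else if (c != '*') st = BLOCK_COMMENT;
            break;
        case LITERAL:           /* inside "..." or '...': copy verbatim */
            dst[n++] = c;
            if (c == '\\')       st = LITERAL_ESC;
            else if (c == quote) st = CODE;
            break;
        case LITERAL_ESC:       /* character after a backslash in a literal */
            dst[n++] = c;
            st = LITERAL;
            break;
        }
    }
    if (st == SLASH)
        dst[n++] = '/';         /* input ended right after a lone '/' */
    dst[n] = '\0';
    return n;
}
```

Because the state is carried from one character to the next, a comment at the very start of a line needs no special treatment, and the same switch could be driven by getchar() instead of a string.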
if (string[i] == '\0') {
getaline(string, MAXLINE);
i = 0;
if (isInComment != INMULTICOMMENT)
isInComment = OUTOFCOMMENT;
}
When you start a new line, you initialize i to 0. But then in the next iteration:
for (i = 0; string[i] != EOF; ++i)
i will be incremented, so you'll begin the new line with index 1. Therefore there is a bug when the line begins with //.
You can see that it solves the problem if you write instead:
if (string[i] == '\0') {
    getaline(string, MAXLINE);
    i = -1;      /* the ++i of the for loop brings it back to 0 */
    if (isInComment != INMULTICOMMENT)
        isInComment = OUTOFCOMMENT;
    continue;    /* skip the rest of the body so string[-1] is never read */
}
though it is usually considered bad style to modify a for loop's index inside the loop. You may want to redesign your implementation in a more readable way.