K&R exercise 1-18 gives segmentation fault when EOF is encountered - c

I've been stuck at this problem:
Write a Program to remove the trailing blanks and tabs from each input line and to delete entirely blank lines.
for the last couple of hours, it seems that I can not get it to work properly.
#include<stdio.h>
#define MAXLINE 1000
int mgetline(char line[],int lim);
int removetrail(char rline[]);
//==================================================================
int main(void)
{
int len;
char line[MAXLINE];
while((len=mgetline(line,MAXLINE))>0)
if(removetrail(line) > 0)
printf("%s",line);
return 0;
}
//==================================================================
int mgetline(char s[],int lim)
{
int i,c;
for(i = 0; i < lim - 1 && (c = getchar()) != EOF && c != '\n'; ++i)
s[i] = c;
if( c == '\n')
{
s[i]=c;
++i;
}
s[i]='\0';
return i;
}
/* To remove Trailing Blanks,tabs. Go to End and proceed backwards removing */
int removetrail(char s[])
{
int i;
for(i = 0; s[i] != '\n'; ++i)
;
--i; /* To consider raw line without \n */
for(i > 0; ((s[i] == ' ') || (s[i] == '\t')); --i)
; /* Removing the Trailing Blanks and Tab Spaces */
if( i >= 0) /* Non Empty Line */
{
++i;
s[i] = '\n';
++i;
s[i] = '\0';
}
return i;
}
I am using gedit text editor in debian.
Anyway when I type text into the terminal and hit enter it just copies the whole line down, and if I type text with blanks and tabs and I press EOF(ctrl+D) I get the segmentation fault.
I guess the program is running out of memory and/or using memory 'blocks' out of its array, I am still really new to all of this.
Any kind of help is appreciated, thanks in advance.
P.S.: I tried using both the code from the solutions book and code from random sites on internet but both of them give me the segmentation fault message when EOF is encountered.

The cause is simple.
mgetline returns with the buffer filled in two cases:
when a newline character is encountered
when EOF is encountered.
In the first case it puts the newline character into the buffer; in the second it does not.
You then pass the buffer to removetrail, which first looks for the newline character:
for(i=0; s[i]!='\n'; ++i)
;
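But there is no newline character in the buffer when you hit Ctrl-D! The loop therefore walks past the terminating '\0' and beyond the end of the array, and the segmentation fault occurs once it reaches memory that is not mapped.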
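One way to avoid running off the end of the buffer is to stop at the terminating '\0' instead of searching for '\n'. The following is only a minimal sketch, not the book's solution. It keeps the removetrail name from the question, but using strlen from string.h is my own assumption (chapter 1 of K&R has not introduced it yet), and so is the choice to return 0 for an entirely blank line.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Strip trailing blanks and tabs from s in place.
 * Works whether or not s ends in '\n' (a line cut short by EOF has none).
 * Returns the new length, or 0 if the line held nothing but blanks. */
int removetrail(char s[])
{
    assert(s != NULL);
    size_t end = strlen(s);             /* stops at '\0', never runs past it */
    int had_newline = (end > 0 && s[end - 1] == '\n');

    if (had_newline)
        --end;                          /* look at the text before the '\n' */
    while (end > 0 && (s[end - 1] == ' ' || s[end - 1] == '\t'))
        --end;                          /* index is checked before each read */
    if (end == 0) {                     /* entirely blank line: drop it */
        s[0] = '\0';
        return 0;
    }
    if (had_newline)
        s[end++] = '\n';                /* put the newline back */
    s[end] = '\0';
    return (int)end;
}
```

The main function from the question can call this unchanged: it already prints the line only when the return value is greater than 0.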

Related

K&R C programming book exercise 1-18

I'm working on this exercise and am only halfway through, but I've hit a result so strange that I cannot figure it out.
The code snippet is below. I know it is still a few steps from finished, but I think it's worth understanding why it behaves like this.
#define MAXLINE 1000
int my_getline(char line[], int maxline);
int main(){
int len;
char line[MAXLINE];/* current input line */
int j;
while((len = my_getline(line, MAXLINE)) > 0 ){
for (j = 0 ; j <= len-1 && line[j] != ' ' && line[j] != '\t'; j++){
printf("%c", line[j]);
}
}
return 0;
}
int my_getline(char s[], int limit){
int c,i;
for (i = 0 ; i < limit -1 && (c = getchar()) != EOF && c != '\n'; i++)
s[i] = c;
if (c == '\n'){
s[i] = c;
++i;
}
s[i] = '\0';
return i;
}
It compiles successfully with cc (cc code.c), but the result is puzzling.
It works for lines without tabs and blanks:
hello
hello
but it does not work for the line in the picture:
I typed hel[blank][blank]lo[blank]\n:
Could anyone help me a bit? many thanks!
The problem is that you are stuck because you try to read a full line and then process it. It's better to process the input character by character (most of the K&R exercises work this way). Don't print blanks and tabs as you read them; save them in a buffer instead, and print them only when the next character read after them is a non-blank character. If a newline arrives first, discard them. The same idea handles empty lines: remember the last character printed (ignoring blanks, since blanks before a newline are eliminated) and check whether it was a newline. If it was, the newline you have just read is not printed, so in a sequence of two or more newlines only the first is printed. This is a complete sample program that does this:
#include <stdio.h>
#include <stdlib.h>
#define F(_f) __FILE__":%d:%s: "_f, __LINE__, __func__
int main()
{
char buffer[1000];
int bs = 0;
int last_char = '\n', c;
unsigned long
eliminated_spntabs = 0,
eliminated_nl = 0;
while((c = getchar()) != EOF) {
switch(c) {
case '\t': case ' ':
if (bs >= sizeof buffer) {
/* full buffer, cannot fit more blanks/tabs */
fprintf(stderr,
"we can only hold upto %d blanks/tabs in"
" sequence\n", (int)sizeof buffer);
exit(1);
}
/* add to buffer */
buffer[bs++] = c;
break;
default: /* normal char */
/* print intermediate spaces, if any */
if (bs > 0) {
printf("%.*s", bs, buffer);
bs = 0;
}
/* and the read char */
putchar(c);
/* we only update last_char on nonspaces and
* \n's. */
last_char = c;
break;
case '\n':
/* eliminate the accumulated spaces */
if (bs > 0) {
eliminated_spntabs += bs;
/* this trace to stderr to indicate the number of
* spaces/tabs that have been eliminated.
* Erase it when you are happy with the code. */
fprintf(stderr, "<<%d>>", bs);
bs = 0;
}
if (last_char != '\n') {
putchar('\n');
} else {
eliminated_nl++;
}
last_char = '\n';
break;
} /* switch */
} /* while */
fprintf(stderr,
F("Eliminated tabs: %lu\n"),
eliminated_spntabs);
fprintf(stderr,
F("Eliminated newl: %lu\n"),
eliminated_nl);
return 0;
}
The program prints (on stderr, so as not to interfere with the normal output) the number of tabs/spaces eliminated from each line, surrounded by << and >>. At the end it also prints the total number of eliminated blanks/tabs and the total number of eliminated newlines (blank lines). A line containing only spaces is considered a blank line, and so it is eliminated. If you don't want lines containing only spaces to be eliminated (the spaces themselves will be eliminated anyway, as they are at the end of the line), just assign the spaces/tabs seen to the variable last_char.
In addition to the good answer by @LuisColorado, there are several ways you can look at your problem that may simplify things for you. Rather than using multiple conditionals to check for c == ' ' and c == '\t' and c == '\n', include ctype.h and use isspace() to determine whether the current character is whitespace. It is a much clearer way to go.
As for the return type: POSIX getline uses ssize_t, a signed type, which allows it to return -1 on error. While ssize_t is a bit of an awkward type, you can do the same with long (or int64_t for a guaranteed exact width).
While I am a bit unclear on what you are trying to accomplish, you appear to want to read the line of input and ignore whitespace. (While POSIX getline() and fgets() both include the trailing '\n' in the buffer, it may be more advantageous to read (consume) the '\n' but not include it in the buffer filled by my_getline(); that is up to you.) So from your example output provided above it looks like you want both "hello" and "hel lo " to be read and stored as "hello".
If that is the case, then you can simplify your function as:
long my_getline (char *s, size_t limit)
{
    int c = 0;
    long n = 0;

    while ((size_t)n + 1 < limit && (c = getchar()) != EOF && c != '\n') {
        if (!isspace (c))
            s[n++] = c;
    }
    s[n] = 0;

    return n ? n : c == EOF ? -1 : 0;
}
The return statement is just the combination of two ternary clauses which will return the number of characters read, including 0 if the line was all whitespace, or it will return -1 if EOF is encountered before a character is read. (a ternary simply being a shorthand if ... else ... statement in the form test ? if_true : if_false)
Also note that the choice made above for handling the '\n' was to read the '\n' but not store it in the buffer. You can change that by removing the && c != '\n' from the while() test and adding a check for c == '\n' inside the loop that stores the character and then breaks. The '\n' has to be stored explicitly, because isspace() reports it as whitespace and the if (!isspace (c)) test would otherwise skip it.
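As a rough sketch of the variant that keeps the '\n' in the buffer: the name my_getline_nl and the FILE * parameter are my own assumptions (the original reads with getchar(); pass stdin to get the same behavior). Taking a stream just makes the function easy to try against a file.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical variant of my_getline: drops whitespace inside the line
 * but stores the terminating '\n'. Reads from fp (pass stdin to match
 * the original). Returns the number of characters stored, or -1 if EOF
 * is reached before anything is stored. */
long my_getline_nl (FILE *fp, char *s, size_t limit)
{
    int c = 0;
    long n = 0;

    assert (s != NULL && limit > 0);
    while ((size_t)n + 1 < limit && (c = getc (fp)) != EOF) {
        if (c == '\n') {            /* isspace('\n') is true: store by hand */
            s[n++] = c;
            break;
        }
        if (!isspace (c))
            s[n++] = c;
    }
    s[n] = 0;

    return n ? n : c == EOF ? -1 : 0;
}
```

With this version an empty input line returns 1 (just the '\n'), so a return of -1 is the only signal for end of input.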
Putting together a short example, you would have:
#include <stdio.h>
#include <ctype.h>

#define MAXC 1024

long my_getline (char *s, size_t limit)
{
    int c = 0;
    long n = 0;

    while ((size_t)n + 1 < limit && (c = getchar()) != EOF && c != '\n') {
        if (!isspace (c))
            s[n++] = c;
    }
    s[n] = 0;

    return n ? n : c == EOF ? -1 : 0;
}

int main (void) {

    char str[MAXC];
    long nchr = 0;

    fputs ("enter line: ", stdout);
    if ((nchr = my_getline (str, MAXC)) != -1)
        printf ("%s (%ld chars)\n", str, nchr);
    else
        puts ("EOF before any valid input");
}
Example Use/Output
With your two input examples, "hello" and "hel lo ", you would have:
$ ./bin/my_getline
enter line: hello
hello (5 chars)
Or with included whitespace:
$ ./bin/my_getline
enter line: hel lo
hello (5 chars)
Testing the error condition by pressing Ctrl + d (or Ctrl + z on Windows):
$ ./bin/my_getline
enter line: EOF before any valid input
There are many ways to put these pieces together, this is just one possible solution. Look things over and let me know if you have further questions.

K&R exercise 1-22 Hints

The exercise states as follows
Write a program to "fold" long input lines into two or more shorter lines after the last non-blank character that occurs before the n-th column of input. Make sure your program does something intelligent with very long lines, and if there are no blanks or tabs before the specified column.
My question is on how to implement the foldStrings function. I have tried some things but none of them worked.
Can you give me some hints on how to do this, but please don't write the solution down I want to figure it myself.
I have written some code but I am stuck at the folding part
#include <stdio.h>
#include <string.h>
int getline(char s[], int lim);
void emptystring(char s[]);
void foldStrings(char s[],int len);
int main(){
int len ;
char line[255];
while((len = getline(line,255))>0)
{
foldStrings(line,len);
}
return 0;
}
int getline(char s[],int lim)
{
int c , i ;
for( i = 0 ; i < lim-1 && ( c = getchar()) != EOF && c !='\n';++i)
s[i] = c;
if ( c == '\n')
{
s[i] = c;
++i;
}
return i;
}
void foldStrings(char s[], int len)
{
}
void emptystring(char s[])
{
int i;
int len = strlen(s);
for( i = 0 ; i < len ; ++i){
s[i] = 0 ;
}
}
I am stuck at the foldStrings function.
P.S I am using the empty string function to print the lines, so print a segmented line and then empty it, fill it up again and print it and so on.
Update
I have tried doing the foldStrings, here is one of my implementations
void foldStrings(char s[], int len)
{
int i ;
char temp[255];
for(i = 1;i < len-1 ;++i)
{
if( i % 16 != 0)
{
temp[i-1] = s[i-1];
}
else if(i%16 == 0)
{
printf("%s",temp)
emptystring(temp);
}
}
}
When getline() is done, s is not necessarily null character terminated.
// for( i = 0 ; i < lim-1 && ...
for( i = 0 ; i < lim-2 && ( c = getchar()) != EOF && c !='\n';++i)
...
s[i] = '\0'; // add
return i;
The same applies to foldStrings(): temp is missing its null character.
temp[i-1] = s[i-1];
temp[i] = '\0'; // add
Other problems may exist
This exercise is a little challenging.
First, you don't need to buffer at all: you only read two kinds of characters (blank and non-blank), and a non-blank character always has to be printed, so your main loop can be something like
while((c = getchar()) != EOF) {
...
}
(which is the style in which most K&R exercises are written)
Note that when blank characters are input, you only have to count them, and you reset the counter on \n input.
As you asked me not to reveal the final solution, I'll respect that, but the trick is to count characters as you read the line, outputting (and counting) non-blanks and only counting, not outputting, blanks. If the character read is a blank and you have passed the limit, you fold the line (emit a \n).
EDIT 1
In my first attempt to write the code I discovered that a blank -> non-blank pattern crossing the maximum line length boundary makes it necessary to break the line at the first blank character and to remember all the non-blank characters read since then. For that I need storage of at most the maximum output line length: that is enough to hold the non-blank characters pending when we reach the maximum line length and have to break the line, because if more than that accumulate, the line must certainly be broken before that point.
My first attempt will be to store the number of blank characters read so far, followed by a buffer of maximum output line length (not input line length, which is unbounded, as specified in the problem). The possible states will be: (to follow in the next edit)

C Program won't remove comments that take up the whole line

So I'm working through the K&R C book and there was a bug in my code that I simply cannot figure out.
The program is supposed to remove all the comments from a C program. Obviously I'm just using stdin
#include <stdio.h>
int getaline (char s[], int lim);
#define MAXLINE 1000 //maximum number of characters to put into string[]
#define OUTOFCOMMENT 0
#define INASINGLECOMMENT 1
#define INMULTICOMMENT 2
int main(void)
{
int i;
int isInComment;
char string[MAXLINE];
getaline(string, MAXLINE);
for (i = 0; string[i] != EOF; ++i) {
//finds whether loop is in a comment or not
if (string[i] == '/') {
if (string[i+1] == '/')
isInComment = INASINGLECOMMENT;
if (string[i+1] == '*')
isInComment = INMULTICOMMENT;
}
//fixes the problem of print messing up after the comment
if (isInComment == INASINGLECOMMENT && string[i] == '\0')
printf("\n");
//if the line is done, restates all the variables
if (string[i] == '\0') {
getaline(string, MAXLINE);
i = 0;
if (isInComment != INMULTICOMMENT)
isInComment = OUTOFCOMMENT;
}
//prints current character in loop
if(isInComment == OUTOFCOMMENT && string[i] != EOF)
printf("%c", string[i]);
//checks to see of multiline comment is over
if(string[i] == '*' && string[i+1] == '/' ) {
++i;
isInComment = OUTOFCOMMENT;
}
}
return 0;
}
So this works great except for one problem. Whenever a line starts with a comment, it prints that comment.
So for instance, if I had a line that was simply
//this is a comment
without anything before the comment begins, it will print that comment even though it's not supposed to.
I thought I was making good progress, but this bug has really been holding me up. I hope this isn't some super easy thing I've missed.
EDIT: I forgot the getaline function
//puts line into s[], returns length of that line
int getaline(char s[], int lim)
{
int c, i;
for (i = 0; i < lim-1 && (c = getchar()) != '\n'; ++i)
s[i] = c;
if (c == '\n') {
s[i] = c;
++i;
}
s[i] = '\0';
return i;
}
There are many problems in your code:
isInComment is not initialized in function main.
As pointed out by others, string[i] != EOF is wrong. You need to test for end of file more precisely, especially for files that do not end with a linefeed. This test only works if the char type is signed and EOF is a valid signed char value. It will nonetheless mistakenly stop on a stray \377 character, which is legal in a string or in a comment.
When you detect the end of line, you read another line and reset i to 0, but i will be incremented by the for loop before you test again for single line comment... hence the bug!
You do not handle special cases such as /* // */ or // /*
You do not handle strings. This is not a comment: "/*", nor this: '//'
You do not handle \ at end of line (escaped linefeed). This can be used to extend single line comments, strings, etc. There are more subtle cases related to \ handling and if you really want completeness, you should handle trigraphs too.
Your implementation has a limit for line size, this is not needed.
The problem you are assigned is a bit tricky. Instead of reading and parsing lines, read one character at a time and implement a state machine to parse escaped linefeeds, strings, and both comment styles. The code is not too difficult if you do it right with this method.
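To make the state machine idea concrete, here is a minimal sketch. It works on a string instead of calling getchar() so that it is easy to try out, but the per-character logic is the same. The name strip_comments and the state names are mine, not from the question. It deliberately does not handle escaped linefeeds or trigraphs, and it turns each block comment into a single space (which is what the C translation phases do with comments).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum state { CODE, SLASH, LINE_COMMENT, BLOCK_COMMENT, BLOCK_STAR,
             QUOTED, QUOTED_ESC };

/* Copy src to dst without C comments; dst needs strlen(src) + 1 bytes.
 * Assumption: escaped linefeeds and trigraphs are NOT handled. */
void strip_comments(const char *src, char *dst)
{
    enum state st = CODE;
    char quote = 0;                 /* which quote opened QUOTED: " or ' */

    assert(src != NULL && dst != NULL);
    for (; *src != '\0'; ++src) {
        char c = *src;
        switch (st) {
        case CODE:
            if (c == '/') {
                st = SLASH;         /* may start a comment: hold it back */
            } else {
                if (c == '"' || c == '\'') {
                    quote = c;
                    st = QUOTED;
                }
                *dst++ = c;
            }
            break;
        case SLASH:
            if (c == '/') {
                st = LINE_COMMENT;
            } else if (c == '*') {
                st = BLOCK_COMMENT;
            } else {
                *dst++ = '/';       /* plain '/' (e.g. division) */
                *dst++ = c;
                if (c == '"' || c == '\'') {
                    quote = c;
                    st = QUOTED;
                } else {
                    st = CODE;
                }
            }
            break;
        case LINE_COMMENT:
            if (c == '\n') {
                *dst++ = c;         /* keep the line break */
                st = CODE;
            }
            break;
        case BLOCK_COMMENT:
            if (c == '*')
                st = BLOCK_STAR;
            break;
        case BLOCK_STAR:
            if (c == '/') {
                *dst++ = ' ';       /* whole comment becomes one space */
                st = CODE;
            } else if (c != '*') {
                st = BLOCK_COMMENT;
            }
            break;
        case QUOTED:
            *dst++ = c;
            if (c == '\\')
                st = QUOTED_ESC;
            else if (c == quote)
                st = CODE;
            break;
        case QUOTED_ESC:
            *dst++ = c;             /* escaped char never closes the quote */
            st = QUOTED;
            break;
        }
    }
    if (st == SLASH)
        *dst++ = '/';               /* input ended right after a '/' */
    *dst = '\0';
}
```

Because the state survives from one character to the next, there is no line buffer and no index to reset, which is exactly where the original program went wrong.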
if (string[i] == '\0') {
getaline(string, MAXLINE);
i = 0;
if (isInComment != INMULTICOMMENT)
isInComment = OUTOFCOMMENT;
}
When you start a new line, you initialize i to 0. But then in the next iteration:
for (i = 0; string[i] != EOF; ++i)
i will be incremented, so you'll begin the new line with index 1. Therefore there is a bug when the line begins with //.
You can see that it solves the problem if you write instead:
if (string[i] == '\0') {
    getaline(string, MAXLINE);
    i = -1; /* the for loop's ++i brings it back to 0 */
    if (isInComment != INMULTICOMMENT)
        isInComment = OUTOFCOMMENT;
    continue; /* skip the rest of the body, which would read string[-1] */
}
though modifying a for loop index inside the loop is usually considered bad style. You may want to redesign your implementation in a more readable way.

The C Programming Language by K&R Examples CH1

I am going through this book and have hit some examples that I am not sure how to test from chapter 1. They have you reading in lines and looking for different characters but I have no idea how to test the code in C that I have made.
For example:
/* K&R2: 1.9, Character Arrays, exercise 1.17
STATEMENT:
write a programme to print all the input lines
longer thans 80 characters.
*/
#include<stdio.h>
#define MAXLINE 1000
#define MAXLENGTH 81
int getline(char [], int max);
void copy(char from[], char to[]);
int main()
{
int len = 0; /* current line length */
char line[MAXLINE]; /* current input line */
while((len = getline(line, MAXLINE)) > 0)
{
if(len > MAXLENGTH)
printf("LINE-CONTENTS: %s\n", line);
}
return 0;
}
int getline(char line[], int max)
{
int i = 0;
int c = 0;
for(i = 0; ((c = getchar()) != EOF) && c != '\n' && i < max - 1; ++i)
line[i] = c;
if(c == '\n')
line[i++] = c;
line[i] = '\0';
return i;
}
I have no idea how to create a file with varying line lengths to test this on. After doing some research I saw someone try it this way:
[arch@voodo kr2]$ gcc -ansi -pedantic -Wall -Wextra -O ex_1-17.c
[arch@voodo kr2]$ ./a.out
like htis
and
this line has more than 80 characters in it so it will get printed on the terminal right
now without any troubles. you can see for yourself
LINE-CONTENTS: this line has more than 80 characters in it so it will get printed on the
terminal right now without any troubles. you can see for yourself
but this will not get printed
[arch@voodo kr2]$
But I have no idea how he manages it. Any help would be greatly appreciated.
for(i = 0; ((c = getchar()) != EOF) && c != '\n' && i < max - 1; ++i)
This is the line that tells you everything you need to know about your getline() function.
It reads character by character and stores each one in the array for as long as all of the following hold:
You have not pressed ^D (Linux) / ^Z (Windows) on the terminal (^ = Control)
You have not pressed the Enter key on your keyboard
Fewer than max - 1 characters have been stored. Any further characters are not copied. In your example max = 1000, hence at most 999 characters are stored.
That program reads the standard input. If you just type exactly what is shown in that example, you'll see the same output. Enter a ^D to end your program.
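If you would rather feed the program a file than type, a few lines of C can generate the test input. This is only a sketch under my own assumptions: make_line and write_test_lines are hypothetical helper names, and the 256-byte buffer is an arbitrary limit.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: fill buf with len letters, then '\n' and '\0'.
 * buf must have room for len + 2 bytes. */
void make_line(char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL);
    for (i = 0; i < len; ++i)
        buf[i] = (char)('a' + i % 26);  /* cycle through a..z */
    buf[len] = '\n';
    buf[len + 1] = '\0';
}

/* Write one line to out for each length in lens[]. */
void write_test_lines(FILE *out, const size_t lens[], size_t count)
{
    char buf[256];
    size_t i;

    for (i = 0; i < count; ++i) {
        assert(lens[i] + 2 <= sizeof buf);  /* line must fit in buf */
        make_line(buf, lens[i]);
        fputs(buf, out);
    }
}
```

A small main that calls write_test_lines(stdout, lens, 4) with lengths such as 5, 80, 81 and 120 gives lines on both sides of the 80-character threshold. Redirect its output to a file and run the exercise program with that file on standard input (./a.out < lines.txt).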

Why is the End-of-File character not being interpreted correctly?

#include <stdio.h>
#define MAXLENGTH 77
int getline(char[], int);
main() {
char in_line[MAXLENGTH];
int count, in_line_length;
in_line_length = 0;
for (count = 0; count < MAXLENGTH; ++count)
in_line[count] = 0;
in_line_length = getline(in_line, MAXLENGTH);
printf("%s", in_line);
}
int getline(char line[], int max_length) {
int count, character;
character = 0;
for (count = 0; count < (max_length -2) && (character = getchar()) != EOF && character != '\n'; count++) {
line[count] = character;
}
if (character = '\n') {
line[count++] = '\n';
}
line[count] = '\0';
return count;
}
I'm having a peculiar issue with the above code. As a little background, I'm compiling the above C code with the Tiny C Compiler (TCC) and running the executable in a Windows XP command prompt (cmd.exe).
The problem that appears to be occurring has to do with the EOF character. When running, if I issue the EOF character (CTRL+Z) and then hit return, the loop seems to exit correctly, but then character is populated with a newline character (\n).
So, something like this (as input):
test^Ztesttwo
Would output:
test\n
This is because the if statement following the for loop is always executed. What I'm wondering is why the loop successfully exits upon recognition of the EOF character if it continues to grab another character from the stream?
Your problem is this line:
if (character = '\n') {
It is an assignment instead of a comparison:
if (character == '\n') {
You can prevent errors like this by putting the constant on the left; that way you will get a compiler error if you make the mistake.
if ('\n' = character) {
// ^ won't compile
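For reference, here is a sketch of the function with the comparison fixed. I renamed it read_line and gave it a FILE * parameter so it can be tried on any stream; both are my own assumptions, and passing stdin gives the behavior of the original. As a side note, gcc with -Wall warns about an assignment used as a condition; I have not checked what TCC does.

```c
#include <assert.h>
#include <stdio.h>

/* Sketch of the corrected function (hypothetical name read_line).
 * Reads from fp; pass stdin to match the original program. */
int read_line(FILE *fp, char line[], int max_length)
{
    int count, character = 0;

    assert(line != NULL && max_length >= 2);
    for (count = 0;
         count < max_length - 2 && (character = getc(fp)) != EOF
                                && character != '\n';
         count++)
        line[count] = (char)character;
    if (character == '\n')          /* comparison, not assignment */
        line[count++] = '\n';
    line[count] = '\0';
    return count;
}
```

With the == in place, input that ends at EOF without a newline is returned as typed, with no '\n' appended.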

Resources