K&R Exercise 1.16 - Limitation on line length - c

I'm learning C from K&R's "The C Programming Language" book. I'm doing the exercises specified in the book. I'm on exercise number 1.16, but I don't understand it.
Exercise 1.16:
Revise the main routine of the longest-line program so it will
correctly print the length of arbitrarily long input lines, and as
much as possible of the text.
My questions:
"...as much as possible of the text..." - is there some limitation on string length? Maybe the standard headers define a constant with the maximum allowed string length?
"...the length of arbitrarily long input lines..." - but in the code MAXLINE is defined as 1000, so the size is limited too. I have seen some solutions here, but in my opinion they do not solve the problem, since they still restrict the length of a line (1000 characters).
Maybe I have misunderstood the task. My understanding is that I must remove the 1000-character limitation.

It's a pretty early exercise in K&R; you're just supposed to make some minor changes to the code, not redesign it completely.
"...as much as possible of the text..."
is up to you to interpret. I'd do it by printing what's stored in the longest buffer. i.e. print out up to 1000 characters of the line. Again, it's an early exercise, with little introduction to dynamically allocated memory yet. And at the time K&R was written, storing away arbitrarily long text lines wasn't as feasible as it is today.
"...the length of arbitrarily long input lines..."
is a hard requirement. You're supposed to find the correct length no matter how long the line is (at least within the bounds of an int).
One way to solve this problem is:
After the call to getline(), check whether the last character read into the line buffer is a newline ('\n').
If it is, you read a complete line. The len variable (the return value of getline()) is the correct length of the line, and no special consideration is needed compared to the original code.
If it is not, you did not read the entire line and need to hunt for the end of this line. You add a while loop, calling getchar() until it returns a newline (or EOF), and count the number of characters you read in that loop. Just do len++ to count.
When the while loop is done, the new len is now the actual length of the line, but our buffer just has the first 999 characters of it.
As before, you store away (the copy() function call) the current line buffer (max 1000 chars) if this line is the longest so far.
When you're done, you print out the stored line as before (the longest buffer) and the max variable for the length.
Due to the above mentioned while loop that max length is now correct.
If the longest line was indeed longer than 1000 chars, you at least print out those first 999 chars, which is "as much as possible".
I won't spoil it by posting the code you need to accomplish this, but it is just 6 lines of code that you need to add to the longest-line program of exercise 1-16.

On modern machines "as much as possible of the text" is likely to be all of the text, thanks to automatically line-wrapping terminal programs. That book was written when teletype terminals were still in use. There is no limitation on string length other than perhaps memory limitations of the machine you're working on.
They're expecting you to add some kind of loop to read characters and look for newlines rather than assuming that a read into the MAXLINE sized buffer is going to contain a newline for sure.

here is my version:
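int getline(char s[], int lim)
{
    int c, i, stored;

    for (i = 0; i < lim-1 && (c = getchar()) != EOF && c != '\n'; ++i)
        s[i] = c;
    if (c == '\n')
    {
        s[i] = c;
        ++i;
    }
    stored = i;                   /* number of characters actually stored in s */
    if (c != EOF && c != '\n')    /* the buffer filled up before the line ended */
    {
        while ((c = getchar()) != EOF && c != '\n')
            i++;                  /* keep counting, but do not store */
    }
    s[stored] = '\0';             /* terminate inside the buffer, not at index i */
    return i;
}

#define MAXLINE 1000

int main(void)
{
    int len;
    int max;
    char line[MAXLINE];
    char longest[MAXLINE];

    max = 0;
    while ((len = getline(line, MAXLINE)) > 1)    /* an empty line (just '\n') ends the loop */
    {
        if (len > max)
        {
            max = len;
            copy(longest, line);    /* copy() as in the book's program */
        }
    }
    if (max > 0)
    {
        printf("%d:%s", max, longest);
    }
    return 0;
}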
For some unknown reason, the book's example code doesn't work on my PC.
In particular, when the condition is 'len>0', the loop never ends.
I think the main reason is that even when you type nothing you still have to press Enter, which is received as '\n', so len is 1.
I think this satisfies the requirement to print the length of arbitrarily long input lines, and as much as possible of the text.
And it works like this

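#include <stdio.h>

int main(void)
{
    long tlength = 0;    /* length of the entire text */
    long llength = 1;    /* length of the current line; long, so that long lines do not overflow */
    int input;           /* int, to match the return type of getchar() */

    while (llength > 0) {
        llength = 0;
        while ((input = getchar()) != EOF) {
            ++llength;
            if (input == '\n')
                break;
        }
        tlength = tlength + llength;
        printf("\nLength of just above line : %5ld\n\n", llength);
    }
    printf("\n\tLength of entire text : %8ld\n", tlength);
    return 0;
}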
As I understand it, the question only asks for the length of each arbitrarily long line and, at the end, the length of the entire text.
Try running this code and tell me whether it is correct according to the question, because I am confused by this problem too.

I want to offer that this exercise actually makes more sense if you imagine that the limit on the number of characters you can copy is very small -- say, 100 characters -- and that your program is supposed to judge between lines that are longer than that limit.
(If you actually change the limit so that it's very small, the code becomes easier to test: if it picks out the first line that hits that small limit, you'll know your code isn't working, whereas if it returns the first however-many characters of the longest line, it's working.)
Keep the part of the code that copies and counts characters until it hits a newline or EOF or the line-size-limit. Add code that picks up where this counting and copying leaves off, and which will keep counting even after the copying has stopped, so long as getchar() still hasn't returned an EOF or a newline.
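The idea in this answer can be sketched as follows. This is only a minimal sketch, not the book's solution: the name get_line and the FILE * parameter are assumptions made here so that the function can be tried on a file as well as on stdin (and so that it does not clash with the POSIX getline declared in <stdio.h>); the book's program reads with getchar().

```c
#include <stdio.h>

/* Hypothetical variant of K&R's getline: stores at most lim-1 characters
   in s, but keeps counting until the end of the line or EOF.
   Returns the full length of the line, including the newline if present.
   Assumes lim >= 1. */
int get_line(FILE *in, char s[], int lim)
{
    int c;
    int len = 0;       /* characters seen on this line */
    int stored = 0;    /* characters actually copied into s */

    while ((c = getc(in)) != EOF) {
        ++len;
        if (stored < lim - 1)
            s[stored++] = (char)c;    /* copy only while there is room */
        if (c == '\n')
            break;
    }
    s[stored] = '\0';
    return len;
}
```

Calling get_line(stdin, line, 10) with a deliberately tiny limit makes the behaviour easy to check by hand: the returned length keeps growing with the line, while line never holds more than the first 9 characters.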

My solution: just below the call to getline
if (len > 0 && line[len-1] != '\n') //if the end of the line wasn't found within the max length
{
    int c;
    while ((c = getchar()) != '\n' && c != EOF)
        len++; //keep counting length until end of line or file is found
}
To test it, change MAXLINE to 25.

Related

I have two char arrays and when their sizes are different the program doesn't read/write them properly

I have a program to do basic things with two char arrays. Everything works fine when the size limit of the first is equal to size limit of the second, but when the size of the first char array is different to the size of the other, the program starts to read/write the strings in a strange way.
For example, if the limit of the first is 31 and the limit of the other is 5, and more than 8 or so characters are typed into the first, the program won't let the user type anything into the second array, as if it were already full.
I tried to fix it without using the functions of string.h, but the programs still did the same when the size limit of the two char arrays were different.
#include <stdio.h>
#include <string.h>
#define LIMIT1 31
#define LIMIT2 5
/*Function: void copy_string(char *pointer_destination, char *pointer_source)
Precondition: it needs a pointer to the direction of memory of the first element of two char vectors and the size limit of the 'destination' vector
Postcondition: it puts all the elements of the 'source' vector into the other until the last element that */
void copy_string(char *pointer_destination, char *pointer_source, int LIMd){
//Variable declaration
int i = 0;
/*Cycle for replacing the element of the 'destination' vector by the element of the 'source' vector.
When the element of the 'destination' OR of the 'source' is the null character, this cycle ends*/
for(; i < LIMd && *(pointer_source + i) != '\0'; i++){
*(pointer_destination + i) = *(pointer_source + i);
}
*(pointer_destination + i) = '\0';
}
int main(){
//Variable declaration
int restart;
char username[LIMIT1], string2[LIMIT2];//Here we define the limit for obvious reasons
//Restart cycle starts here
do{
//Data input
printf("Type your username (maximum 30 characters)\n");
fgets(username, LIMIT1 - 1, stdin);
fflush(stdin);
printf("Type a string of maximum 30 characters\n");
fgets(string2, LIMIT2 - 1, stdin);
fflush(stdin);
printf("Your typed username and your typed second string are, respectively:\n");
fputs(username, stdout);
fputs(string2, stdout);
printf("Concatenating, the username is now\n");
strcat(username, string2);
fputs(username, stdout);
printf("Now I'll copy what is in your username and I'll put it in the second string,\n");
copy_string(string2, username, LIMIT2 - 1);
fputs(string2, stdout);
//Restart cycle switch
printf("Type '0' to close this program, otherwise it'll restart itself\n");
scanf("%d", &restart);
fflush(stdin);
//Restart cycle ends here
}while(restart);
return 0;
}
I expected that if the sizes of the two arrays were different, the program would still read and write them properly (if the size of the first is 3, read only the first three characters from the user and put a \0 after them, and if the size of the other is 25, do the same but with 25 as the size limit).
You're not very specific about your actual and expected output, but I imagine it's this:
Steps to reproduce:
Run the program as posted
Get the prompt Type your username (maximum 30 characters)
Enter this is an especially long string
Expected result:
A prompt that says Type a string of maximum 30 characters and lets you enter a new string
Actual result:
Type a string of maximum 30 characters is written to screen but the program continues immediately without letting you enter another string.
This happens because the first fgets is set up to read no more than 29 characters from the user (fgets stores at most one character fewer than the size it is given, and the call passes LIMIT1 - 1 = 30). If you enter more, it will only consume the first 29.
The next fgets will then consume the remainder of that line instead of a new line, giving the appearance of skipping the prompt.
You should use a large enough buffer to accommodate the line, so that this is not an issue. Alternatively, you can manually read and discard one character at a time until you find a \n, effectively draining stdin of the rest of the line.
You seem to be relying on fflush(stdin) to clear any unread input. This is undefined behaviour in standard C, and only works on some platforms as a non-standard extension. I suspect it doesn't work on yours, and either breaks input altogether or does nothing and causes the next fgets to read the rest of the input intended for the previous one.
Instead of fflush, you can check whether the string read by fgets ends in a newline ('\n', which you probably want to remove if it is there). If not, keep reading (and discarding) input until either a newline '\n' or EOF is encountered.
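A minimal sketch of that check-and-discard approach is shown here. The helper name read_line and its return convention are assumptions made for illustration; only fgets, getc and strlen come from the standard library.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line into buf (at most size-1 characters),
   strips the trailing newline, and discards whatever part of the line did
   not fit. Returns 1 on success, 0 on EOF or read error. */
int read_line(char *buf, size_t size, FILE *in)
{
    size_t len;
    int c;

    if (fgets(buf, (int)size, in) == NULL)
        return 0;

    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';    /* the whole line fit: drop the newline */
    } else {
        /* the line was too long (or input ended): drain up to the newline */
        while ((c = getc(in)) != '\n' && c != EOF)
            ;
    }
    return 1;
}
```

With this helper, read_line(username, sizeof username, stdin) followed by read_line(string2, sizeof string2, stdin) would each start on a fresh line of input, whatever the two buffer sizes are.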
(In general I would also recommend not using scanf for user input - it's a lot easier to read into a temporary buffer with fgets and parse that with sscanf as needed.)
Another obvious, but unrelated, problem is strcat(username, string2); – this may exceed the length of username. You need to leave at least LIMIT2 - 1 extra space (that you don't allow fgets to use), or simply allocate a new array of the correct size after you know the lengths of each.
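One way to make the concatenation safe is to write the result into a separate, sized buffer with snprintf (C99) instead of appending in place with strcat. The helper name concat below is hypothetical, and this is a sketch of one option rather than the only fix.

```c
#include <stdio.h>

/* Hypothetical helper: writes a followed by b into dest, never writing more
   than size bytes (including the terminating null character).
   Returns 1 if everything fit, 0 if the result was truncated.
   Assumes size >= 1. */
int concat(char *dest, size_t size, const char *a, const char *b)
{
    int needed = snprintf(dest, size, "%s%s", a, b);

    return needed >= 0 && (size_t)needed < size;
}
```

In the program above this would mean declaring a third array of at least LIMIT1 + LIMIT2 - 1 bytes and calling concat(result, sizeof result, username, string2), so that username itself is never written past its end.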
Warning: fgets stores the \n if there is room for it. I think your problem arises because, in your examples, the newline is stored at least in username, so you need to remove it if present.
Warning: you pass the size minus 1 to fgets, but fgets already reads at most the given length minus 1, to leave room for the null character at the end.
Note that the prompt for the second string is wrong, because it states a maximum length of 30 rather than 4.

Stack Smashing and using malloc

I'm making a program that counts the number of words contained within a file. My code works for certain test cases with files that have less than a certain amount of words/characters...But when I test it with, let's say, a word like:
"loooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooong", (this is not random--this is an actual test case I'm required to check), it gives me this error:
*** stack smashing detected ***: ./wcount.out terminated
Abort (core dumped)
I know what the error means and that I have to implement some sort of malloc line of code to be able to allocate the right amount of memory, but I can't figure out where in my function to put it or how to do it:
int NumberOfWords(char* argv[1]) {
FILE* inFile = NULL;
char temp_word[20]; <----------------------I think this is the problem
int num_words_in_file;
int words_read = 0;
inFile = fopen(argv[1], "r");
while (!feof(inFile)) {
fscanf(inFile, "%s", temp_word);
words_read++;
}
num_words_in_file = words_read;
printf("There are %d word(s).\n", num_words_in_file - 1);
fclose(inFile);
return num_words_in_file;
}
As you've correctly identified by rendering your source code invalid (future tip: /* put your arrows in comments */), the problem is that temp_word only has enough room for 20 characters (one of which must be a terminal null character).
In addition, you should check the return value of fopen. I'll leave that as an exercise for you. I've answered this question in other questions (such as this one), but I don't think just shoving code into your face will help you.
In this case, I think it may pay to better analyse the problem you have, to see if you actually need to store words to count them. As we define a word (the kind read by scanf("%s", ...)) as a sequence of non-whitespace characters followed by a sequence of (zero or more) whitespace characters, we can see that a counting program such as yours needs to follow this procedure:
Read as much whitespace as possible
Read as much non-whitespace as possible
Increment the "word" counter if all was successful
You don't need to store the non-whitespace any more than you do the whitespace, because once you've read it you'll never revisit it. Thus you could write this as two loops embedded into one: one loop which reads as much whitespace as possible, another which reads non-whitespace, followed by your incrementation and then the outer loop repeats the whole lot... until EOF is reached...
This will be best achieved using the %*s directive, which tells scanf-related functions not to try to store the word. For example:
size_t word_count = 0;
/* fscanf returns EOF only when the input ends before a word could be read */
while (fscanf(inFile, "%*s") != EOF) {
    ++word_count;
}
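The two-loops-in-one procedure described above can also be written directly with getc, without scanf at all. This is a minimal sketch; the function name count_words is an assumption, and "whitespace" here means whatever isspace() reports in the current locale.

```c
#include <ctype.h>
#include <stdio.h>

/* Counts maximal runs of non-whitespace characters without storing them. */
size_t count_words(FILE *in)
{
    size_t words = 0;
    int c = getc(in);

    while (c != EOF) {
        while (c != EOF && isspace(c))     /* read as much whitespace as possible */
            c = getc(in);
        if (c == EOF)
            break;                         /* only trailing whitespace was left */
        while (c != EOF && !isspace(c))    /* read as much non-whitespace as possible */
            c = getc(in);
        ++words;                           /* one complete word was consumed */
    }
    return words;
}
```

Because no word is ever stored, the length of a single word no longer matters, and the fixed-size temp_word buffer disappears.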
You are limited by the size of your array. A simple solution would be to increase the size of your array. But you are always susceptible to stack smashing if someone enters a long word.
A word is delimited by spaces.
You can simply keep a counter variable initialized to zero and a variable that holds the character you are currently looking at. Every time the character you read with fgetc(inFile) is a space, you increment the counter.
In your current code you simply want to count the words. Therefore you are not interested in the words themselves. You can suppress the assignment with the optional * character:
fscanf(inFile, "%*s");

How to XOR two byte streams in C?

I've been reading through SO for the past couple of days trying to figure this out, I am stumped. I want to read in two 32 bit byte arrays (from stdin, input will be hex) and xor them, then print the result.
So far I've tried using scanf, fgets, and gets. My thought was to read the large hex numbers into a char buffer and perform the xor in a for loop until I hit an EOL (with fgets) or a null terminator. So far my output is not even close. I tried lots of variations, but I will only post my latest fail below. The challenge I've been trying to complete is: http://cryptopals.com/sets/1/challenges/2/
I am trying it in C because I'm really trying to learn C, but I'm really getting frustrated with none of these attempts working.
#include <stdio.h>
#include <math.h>
int main()
{
char buff1[100];
char buff2[100];
char buff3[100];
int size = sizeof(buff1);
puts("Enter value\n");
fgets(buff1, size, stdin);
puts(buff1);
puts("Enter value\n");
fgets(buff2, size, stdin);
puts(buff2);
for (int i = 0; i != '\n'; i++) {
buff3[i] = buff2[i] ^ buff1[i];
printf("%x", buff3[i]);
}
return 0;
}
sizeof can be applied to objects as well as to types; sizeof(buff1) gives the size of the array in bytes, which is 100 here because sizeof(char) is always 1. fgets() will work, but I prefer to use this:
int getchar(void)
Just stop when the user enters a newline/terminator character. Since you don't know how many characters will come in from stdin, you need to dynamically increase the size of your buffer or it will overflow. For the purposes of this question you can just make it a very big array, check whether it's about to overflow, and then terminate the program. So, to recap the steps:
1.) Create a big array.
2.) Loop over getchar() with a while loop and stop when the result is the terminator; take note of how many chars you read.
3.) Since both buffers are guaranteed to have an equal number of chars, make your final array that many chars in size.
4.) Loop over getchar() with a for loop and, as the chars come in, XOR them with the first array and put the result into the final array. You should try doing this with one array afterwards to get some more C practice.
Good luck!
EDIT:
fgets() can be used but depending on the implementation it is useful to know how many chars have been read in.
#include <string.h>
#include <ctype.h>
static inline unsigned char hc2uc(char d){
const char *table = "0123456789abcdef";
return strchr(table, tolower(d)) - table;
}
...
for(int i=0;buff1[i]!='\n';i++){
buff3[i]=hc2uc(buff2[i])^hc2uc(buff1[i]);
printf("%x",buff3[i]);
}
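Putting the pieces together, XORing two hex strings digit by digit could look like the sketch below. The names hex_value and xor_hex are hypothetical, and the code assumes a character set such as ASCII in which the letters a to f are contiguous.

```c
#include <stdio.h>

/* Converts one hex digit to its value, or returns -1 if c is not a hex digit. */
int hex_value(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical helper: XORs the hex strings a and b digit by digit and
   writes the result to out as lowercase hex. Stops at the first character
   in either input that is not a hex digit (for example the '\n' left by
   fgets, or the terminating null character). out must have room for the
   digits plus a terminator. Returns the number of digits written. */
size_t xor_hex(const char *a, const char *b, char *out)
{
    static const char digits[] = "0123456789abcdef";
    size_t i = 0;
    int x, y;

    while ((x = hex_value((unsigned char)a[i])) >= 0 &&
           (y = hex_value((unsigned char)b[i])) >= 0) {
        out[i] = digits[x ^ y];    /* XOR the 4-bit values, not the characters */
        ++i;
    }
    out[i] = '\0';
    return i;
}
```

After the two fgets calls in the question, xor_hex(buff1, buff2, buff3) followed by puts(buff3) would print the XOR of the two inputs. The point of the conversion is that the loop must stop on the characters read, not on the index, and that '1' ^ '6' on the raw characters is not the same as 1 ^ 6 on the digit values.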

In K&R 1.9 longest line example, what is getchar() doing?

I seem to understand the program now, except the getline function is not very intuitive as it seems to copy everything getchar() returns to a character array s[] which is never really used for anything important.
int getline(char s[], int lim)
{
int c, i;
for(i=0; i < lim - 1 && (c = getchar()) != EOF && c != '\n'; ++i)
s[i] = c;
if(c == '\n')
{
s[i] = c;
++i;
}
s[i] = '\0';
return i;
}
The function could just as easily ignore the line s[i] = c; because all the function is really doing is counting the number of characters until it reaches EOF or '\n' returns from getchar()
What I really do not understand is why the program progressed forward as the main loop is as follows:
main()
{
int len; /* current line length */
int max; /* maximum length seen so far */
char line[MAXLINE]; /* current input line */
char longest[MAXLINE]; /* longest line saved here */
max = 0;
while ((len = getline(line, MAXLINE)) > 0)
if (len > max)
{
max = len;
copy(longest, line);
}
if (max > 0) /* there was a line */
printf("%s", longest);
return 0;
}
The only explanation would be that the getchar() function does its magic after the user has entered in a full line of text, and hits the enter key. So it would appear to work during run-time is my guess.
Is this how the program progresses? Does the program first enter the while loop, and then wait for a user to enter a line of text, and once the user hits enter, the getline function's for-loop is iterated? I feel like this would be the case, since the user can enter backspace during input.
My question is, how exactly does the program move forward at all? Is it all because of the getchar() function?
When I hit ctrl-D in the terminal, some other confusing stuff happens. If I hit ctrl-D at the start of a newline, the program will terminate. If I hit ctrl-D at the end of a line filled with some text, it does not terminate and it does not act the same way as hitting enter. If I hit ctrl-D a few times in a line with text, the program will finally end.
Is this just the way my terminal is treating the session, or is this all stuff I should not be worrying about if I just want to learn C?
The reason why I ask is that I like to trace the program to get a good understanding of it, but the getchar() function makes that tricky.
In a parameter declaration (and only in that context), char s[] really means char *s. The way the C standard describes this is that:
A declaration of a parameter as "array of type" shall be adjusted to "qualified pointer to type".
So s really is a pointer, of type char*, and when the function modifies s[i] it's modifying the ith element of line.
On the call:
getline(line, MAXLINE)
line is an array, but in most contexts an array expression is implicitly converted to a pointer to the array's first element.
These two rules almost seem to be part of a conspiracy to make it look like arrays and pointers are really the same thing in C. They most definitely are not. A pointer object contains the address of some object (or a null pointer that doesn't point to any object); an array object contains an ordered sequence of elements. But most manipulation of arrays in C is done via pointers to the array's elements, with pointer arithmetic used to advance from one element to the next.
Suggested reading (I say this a lot): section 6 of the comp.lang.c FAQ.
getchar reads a character from standard input. So if that's you sitting at the terminal, it blocks the program until it receives a character you've typed, then it's done. But standard input is line buffered when it's interactive, so what you type isn't processed by the program until you press enter. That means that getchar will be able to keep reading all the characters you typed, as they're read from the buffer.
You're mistaken about the function. The array is passed to the function*, and it stores each character read by getchar (except for EOF or newline) in successive elements. That's the point of it - not to count the characters, but to store them in the array.
(*a pointer is actually passed, but the function here can still treat it like an array.)
The array is used for something important: it is provided by the caller and returned modified with the new content. From a reasonable point of view, filling in the array is the purpose of calling the function.
That array (an array reference) is actually a pointer; as a parameter, char s[] is the same as char *s. So the function builds its result in that array, which is why it's copied later in main. There is rarely any "magic" in K&R.
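The point made in the answers above, that the parameter is a pointer into the caller's array, can be seen in a few lines. This is an illustrative sketch; the function name param_size is made up for the example.

```c
#include <stdio.h>

/* s is declared like an array, but the parameter really has type char *. */
size_t param_size(char s[])
{
    char *p = s;        /* no conversion needed: s already is a char * */

    p[0] = 'X';         /* writes into the caller's array */
    return sizeof p;    /* size of a pointer, not of the caller's array */
}
```

If main declares char line[1000] and calls param_size(line), the first element of line is changed by the call, sizeof line in main is still 1000, and the value returned is the size of a char * on that platform.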

Please Explain this Example C Code

This code comes from K&R. I have read it several times, but it still seems to escape my grasp.
#define BUFSIZE 100
char buf[BUFSIZE];
int bufp = 0;
int getch(void)
{
return(bufp>0)?buf[--bufp]:getchar();
}
int ungetch(int c)
{
if(bufp>=BUFSIZE)
printf("too many characters");
else buf[bufp++]=c;
}
The purpose of these two functions, so K&R says, is to prevent a program from reading too much input. i.e. without this code a function might not be able to determine it has read enough data without first reading too much. But I don't understand how it works.
For example, consider getch().
As far as I can see, these are the steps it takes:
check if bufp is greater than 0.
if so then return the char value of buf[--bufp].
else return getchar().
I would like to ask a more specific question, but I literally don't know how this code achieves what it is intended to achieve, so my question is: What is (a) the purpose and (b) the reasoning of this code?
Thanks in advance.
NOTE: For any K&R fans, this code can be found on page 79 (depending on your edition, I suppose)
(a) The purpose of this code is to be able to read a character and then "un-read" it if it turns out you accidentally read a character too many (with a max. of 100 characters to be "un-read"). This is useful in parsers with lookahead.
(b) getch reads from buf if it has contents, indicated by bufp>0. If buf is empty, it calls getchar. Note that it uses buf as a stack: it reads it from right-to-left.
ungetch pushes a character onto the stack buf after doing a check to see if the stack isn't full.
The code is not really for "reading too much input"; instead, it is there so you can put back characters already read.
For example, you read one character with getch, see if it is a letter, put it back with ungetch and read all letters in a loop. This is a way of predicting what the next character will be.
This block of code is intended for use by programs that make decisions based on what they read from the stream. Sometimes such programs need to look at a few characters from the stream without actually consuming the input. For example, if your input looks like abcde12xy789 and you must split it into abcde, 12, xy, 789 (i.e. separate groups of consecutive letters from groups of consecutive digits) you do not know that you have reached the end of a group of letters until you see a digit. However, you do not want to consume that digit at the time you see it: all you need is to know that the group of letters is ending; you need a way to "put back" that digit. An ungetch comes in handy in this situation: once you see a digit after a group of letters, you put the digit back by calling ungetch. Your next iteration will pick that digit back up through the same getch mechanism, sparing you the need to preserve the character that you read but did not consume.
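The abcde12xy789 example can be sketched as below. Two things are assumptions made only to keep the sketch self-contained: getch takes its fresh characters from a string (src) rather than from getchar(), and next_group is a hypothetical function name. For simplicity, every character that is not a digit is treated as belonging to a letter group.

```c
#include <ctype.h>
#include <stdio.h>

#define BUFSIZE 100

const char *src = "";    /* assumed input source, standing in for getchar() */
char buf[BUFSIZE];       /* pushed-back characters */
int bufp = 0;            /* next free position in buf */

int getch(void)
{
    if (bufp > 0)
        return buf[--bufp];
    return *src ? (unsigned char)*src++ : EOF;
}

void ungetch(int c)
{
    if (bufp >= BUFSIZE)
        printf("too many characters\n");
    else
        buf[bufp++] = (char)c;
}

/* Reads one group of consecutive digits or consecutive non-digits into out
   (at most size-1 characters, size >= 1). Returns the group's length,
   or 0 at the end of the input. */
size_t next_group(char *out, size_t size)
{
    size_t n = 0;
    int c = getch();
    int want_digits;

    if (c == EOF) {
        out[0] = '\0';
        return 0;
    }
    want_digits = isdigit(c) != 0;
    while (c != EOF && (isdigit(c) != 0) == want_digits) {
        if (n < size - 1)
            out[n++] = (char)c;
        c = getch();
    }
    if (c != EOF)
        ungetch(c);    /* one character too many was read: put it back */
    out[n] = '\0';
    return n;
}
```

With src pointing at "abcde12xy789", successive calls to next_group yield abcde, 12, xy and 789; each call only discovers the end of its group by reading the first character of the next one, which ungetch then hands back.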
1. The idea shown here can also be described as a very primitive I/O stack management system; the code gives the implementation of the functions getch() and ungetch().
2. To go a step further, suppose you want to design an operating system: how can you handle the memory which stores all the keystrokes?
This is solved by the above code snippet. An extension of this concept is used in file handling, especially in editing files. In that case, instead of using getchar(), which takes input from standard input, a file is used as the source of input.
I have a problem with the code given in the question. Using a buffer in the form of a stack is not correct here: when more than one extra character is read and pushed onto the stack, later processing (getting input from the buffer) is affected in an undesired way.
This is because, when later processing reads input, the buffer (stack) returns the extra characters in reverse order (the last one pushed is returned first).
Because of the LIFO (last in, first out) property of a stack, the buffer in this code should be a queue, as it will work better when there is more than one extra character.
This mistake in the code confused me; the buffer should be a queue, as shown below.
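#define BUFSIZE 100
char buf[BUFSIZE];
int bufr = 0;    /* index of the next character to read */
int buff = 0;    /* index of the next free slot */
int bufn = 0;    /* number of characters currently buffered */

int getch(void)
{
    int c;

    if (bufn == 0)              /* nothing buffered: read fresh input */
        return getchar();
    c = buf[bufr];
    bufr = (bufr + 1) % BUFSIZE;
    --bufn;
    return c;
}

void ungetch(int c)
{
    if (bufn >= BUFSIZE)
        printf("too many characters");
    else {
        buf[buff] = c;
        buff = (buff + 1) % BUFSIZE;
        ++bufn;
    }
}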

Resources