I tried to write a program that reads lines of user input and prints the longest line that is over 80 characters long. I wrote the program, but when I ran it, it output some very weird symbols. Could you please tell me what might be wrong with my code?
#include <stdio.h>
#define MINLINE 80
#define MAXLINE 1000
int getline(char current[]);
void copy(char from[], char to[]);
int main(void)
{
int len; // current input line length
int max; // the length of the longest line that's over 80 characters
char current[MAXLINE]; // current input line
char over80[MAXLINE]; // input line that's over 80 characters long
while (len = (getline(current)) > 0) {
if (len > MINLINE) {
max = len;
copy(current, over80);
}
}
if (max > 0) {
printf("%s", over80);
}
else {
printf("No input line was over 80 characters long");
}
return 0;
}
int getline(char current[]) {
int i = 0, c;
while (((c = getchar()) != EOF) && c != '\n') {
current[i] = c;
++i;
}
if (i == '\n') {
current[i] = c;
++i;
}
current[i] = '\0';
return i;
}
void copy(char from[], char to[]) {
int i = 0;
while ((to[i] = from[i]) != '\0') {
++i;
}
}
Thank you very much for your help !
max may be left uninitialized if no long line is found. Using it in if (max > 0) is then undefined behavior.
This line:
while (len = (getline(current)) > 0) {
assigns the value of (getline(current)) > 0 to len, which is not what you want (len will be 0 or 1 afterwards).
EDIT: Just saw AusCBloke's comment, you should also check for both len > max and len > MINLINE or you'll just get the latest line longer than 80 chars, not the longest overall line.
You should also initialize max to 0, so it should be
max = 0;
while ((len = getline(current)) > 0) {
if ((len > MINLINE) && (len > max)) {
Other minor errors/tips:
The standard library functions strcpy and strncpy (declared in string.h) do what your copy function does; there's no need to reinvent the wheel.
In your getline function, use MAXLINE to prevent buffer overflows.
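Putting the points above together, here is a minimal sketch of how the loop could look. The names read_line and longest_over are made up for illustration, and the functions take a FILE * (an assumption, so they can be tried on any stream rather than only stdin). Lines longer than the buffer are truncated instead of overflowing it.

```c
#include <stdio.h>
#include <string.h>

#define MINLINE 80
#define MAXLINE 1000

/* Read one line (without the trailing '\n') into buf, storing at most
   lim - 1 characters; the rest of an over-long line is discarded.
   Returns the number of characters stored, or -1 at end of input. */
int read_line(FILE *in, char buf[], int lim)
{
    int c, i = 0;

    while ((c = fgetc(in)) != EOF && c != '\n') {
        if (i < lim - 1)            /* bounds check: never write past buf */
            buf[i++] = (char)c;
    }
    buf[i] = '\0';
    if (c == EOF && i == 0)
        return -1;                  /* nothing left to read */
    return i;
}

/* Copy the longest line that is longer than minlen into out
   (capacity lim). Returns its length, or 0 if no line qualified. */
int longest_over(FILE *in, int minlen, char out[], int lim)
{
    char current[MAXLINE];
    int len;
    int max = 0;                    /* initialized, so the final test is safe */

    while ((len = read_line(in, current, MAXLINE)) >= 0) {
        if (len > minlen && len > max) {
            max = len;
            strncpy(out, current, (size_t)lim - 1);
            out[lim - 1] = '\0';    /* strncpy may not terminate */
        }
    }
    return max;
}
```

In main you would call longest_over(stdin, MINLINE, over80, MAXLINE) and print over80 only when the result is greater than 0.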
Assuming that this is homework, here's a hint: this piece of code looks very suspicious:
if (i == '\n') {
current[i] = c;
++i;
}
Since i represents a position and is never assigned a character, you are effectively checking if the position is equal to the ASCII code of '\n'.
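Your copy function never writes the terminator explicitly; it relies on the loop condition copying the '\0' before the comparison. Writing it out makes the intent obvious:
void copy(char from[], char to[]) {
int i = 0;
while ((to[i] = from[i]) != '\0') {
++i;
}
to[i] = '\0'; // redundant (the loop already copied it), but explicit
}
Note that the original loop does copy the '\0', so on its own this is unlikely to explain the weird characters being printed.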
You could use the builtin strcpy() to make life easier.
I can't test your code right now, but it may be caused by character arrays not being cleaned. Try memset-ing the char arrays to 0.
If you supply input data that has lines with more than 1000 characters you will overflow your fixed size buffers. By feeding in such input I was able to achieve the following output:
╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠
╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠
╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠
There are a number of problems with your code. Mostly they are due to wheel-reinvention.
int getline(char current[]);
You don't need to define your own getline(); on POSIX systems there is already one declared in stdio.h.
void copy(char from[], char to[]);
There are also a number of functions for copying strings in string.h.
It's also a good idea to initialise all of your variables, like this:
int len = 0; // current input line length
...this can prevent problems later, like comparisons to max when you haven't initialised it.
If you initialise max like this...
int max = MINLINE; // the length of a longest line that's over 80 characters
...then it's easier to do the length comparison later on.
char* current = NULL;
size_t allocated = 0;
If current is NULL, then getline() will allocate a buffer for storing the line, which should be freed by the user program. getline() also takes a pointer to a size_t, which holds the size of that buffer in bytes and is updated whenever getline() has to grow it.
while (len = (getline(current)) > 0) {
Should be replaced by the following...
while ((len = getline(&current, &allocated, stdin)) > 0) {
...which updates and compares len to 0.
Now, instead of...
if (len > MINLINE) {
...you need to compare with the last longest line, which we initialised earlier...
if (len > max) {
...and then you're good to update max as you were...
max = len;
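Where you called your copy(), use strncpy(), which will prevent you from writing more than 1,000 characters into over80. Note that strncpy() does not add a terminator when the source fills the whole buffer, so set the last byte yourself:
strncpy(over80, current, MAXLINE);
over80[MAXLINE - 1] = '\0';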
Because we initialised max, you'll need to change your check at the end from if (max > 0) to if (max > MINLINE).
One more tip, changing the following line...
printf("No input line was over 80 characters long");
...to...
printf("No input line was over %d characters long", MINLINE);
...will mean that you only have to change the #define at the top of the file to increase or decrease the minimum length.
Don't forget to...
free(current);
...to prevent memory leaks!
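For reference, here is a rough sketch of the whole approach using the POSIX getline(). It assumes a POSIX system; the function name longest_line_over and the choice to return a malloc()ed copy (rather than filling a fixed over80 buffer) are my own for illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for getline(); assumes a POSIX system */
#endif
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>            /* ssize_t */

/* Return a malloc()ed copy of the longest line that is longer than
   minlen characters (newline not counted), or NULL if there is none.
   The caller frees the result. */
char *longest_line_over(FILE *in, size_t minlen)
{
    char *current = NULL;         /* getline() allocates and grows this */
    size_t allocated = 0;         /* buffer size, maintained by getline() */
    char *best = NULL;
    size_t max = minlen;          /* only strictly longer lines qualify */
    ssize_t len;

    while ((len = getline(&current, &allocated, in)) != -1) {
        size_t n = (size_t)len;
        if (n > 0 && current[n - 1] == '\n')
            current[--n] = '\0';  /* drop the newline before measuring */
        if (n > max) {
            char *copy = malloc(n + 1);
            if (copy == NULL)
                break;            /* out of memory: keep the previous best */
            memcpy(copy, current, n + 1);
            free(best);
            best = copy;
            max = n;
        }
    }
    free(current);                /* the buffer getline() allocated */
    return best;
}
```

Because getline() returns -1 at end of input and a length otherwise, the loop compares against -1 and stores the result in an ssize_t rather than an int.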
I'm trying to write this little program defensively, but I'm struggling to do it without the Loop/goto construct, which I know is considered bad programming. I tried while and do...while loops, and for a single check that works without problems. The trouble begins when I add another do...while for the second case (the user enters only a space or just presses Enter). I also tried nested do...while loops, but the result was even more complicated.
#include <ctype.h>
#include <stdlib.h>
#include <string.h>
int main()
{
int i;
int length;
char giventext [25];
Loop:
printf("String must have 25 chars lenght:\n");
gets(giventext);
length = strlen(giventext);
if (length > 25) {
printf("\nString has over %d chars.\nMust give a shorter string\n", length);
goto Loop;
}
/* Here i trying to not give space or nothing*/
if (length < 1) {
printf("You dont give anything as a string.\n");
goto Loop;
} else {
printf("Your string has %d\n",length);
printf("Letter in lower case are: \n");
for (i = 0; i < length; i++) {
if (islower(giventext[i])) {
printf("%c",giventext[i]);
}
}
}
return 0;
}
Note that your code is not defensive at all. You have no way to avoid a buffer overflow because:
you check the length of the string after it has been read into your program, that is, after the buffer overflow has already occurred, and
you use gets(), which doesn't check the input length and is therefore very prone to buffer overflow.
Use fgets() instead and just discard the extra characters.
I think you need to understand that strlen() doesn't count the number of characters of input but instead the number of characters in a string.
If you want to ensure that fewer than N characters are accepted, you can read the input one character at a time:
#include <stdio.h>
int
readinput(char *const buffer, int maxlen)
{
int count;
int next;
fputc('>', stdout);
fputc(' ', stdout);
count = 0;
while ((next = fgetc(stdin)) && (next != EOF) && (next != '\n')) {
// We need space for the terminating '\0';
if (count == maxlen - 1) {
// Discard extra characters before returning
// read until EOF or '\n' is found
while ((next = fgetc(stdin)) && (next != EOF) && (next != '\n'))
;
return -1;
}
buffer[count++] = next;
}
buffer[count] = '\0';
return count;
}
int
main(void)
{
char string[8];
int result;
while ((result = readinput(string, (int) sizeof(string))) == -1) {
fprintf(stderr, "you cannot input more than `%d' characters\n",
(int) sizeof(string) - 1);
}
fprintf(stdout, "accepted `%s' (%d)\n", string, result);
}
Note that by using a function, the flow control of this program is clear and simple. That's precisely why goto is discouraged, not because it's an evil thing but instead because it can be misused like you did.
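If you would rather follow the fgets() suggestion literally, a sketch along these lines should work. The name read_bounded and the return codes -1 and -2 are arbitrary choices of mine, and the function takes a FILE * so it can be used with stdin or any other stream.

```c
#include <stdio.h>
#include <string.h>

/* Read one line into buf (capacity size) with fgets().
   Returns the length without the newline,
   -1 if the line did not fit (the remainder is discarded),
   -2 at end of input or on a read error. */
int read_bounded(FILE *in, char *buf, size_t size)
{
    if (size == 0 || fgets(buf, (int)size, in) == NULL)
        return -2;

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[--len] = '\0';        /* whole line fit; strip the newline */
        return (int)len;
    }

    /* No newline stored: peek at the next character to find out why. */
    int c = fgetc(in);
    if (c == '\n' || c == EOF)
        return (int)len;          /* exact fit, or last line without '\n' */

    /* The line was too long: throw away the rest of it. */
    while ((c = fgetc(in)) != EOF && c != '\n')
        ;
    return -1;
}
```

The caller can then loop while the result is -1, much like the readinput() loop above, and stop on -2.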
Try using functions that label logical steps that your program needs to execute:
char * user_input() - returns the user's input as a pointer to char (using something other than gets()! For example, look at scanf)
bool validate_input(char * str_input) - takes the user input from the above function and performs checks, such as validate the length is between 1 and 25 characters.
str_to_lower(char * str_input) - if validate_input() returns true you can then call this function and pass it the user input. The body of this function can then print the user input back to console in lower case. You could use the standard library function tolower() here to lower case each character.
The body of your main function will then be much simpler and perform a logical series of steps that tackle your problem. This is the essence of defensive programming - modularising your problem into separate steps that are self contained and easily testable.
A possible structure for the main function could be:
char * user_input();
bool validate_input(char *);
void str_to_lower(char *);
int main()
{
char * str_input = user_input();
//continue to get input from the user until it satisfies the requirements of 'validate_input()'
while(!validate_input(str_input)) {
str_input = user_input();
}
//user input now satisfied 'validate_input' so lower case and print it
str_to_lower(str_input);
return 0;
}
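As a rough idea of what two of those helpers could contain, here is a sketch. The 1 to 25 character limits come from the question; taking a const pointer in validate_input() and lower-casing the string in place (rather than printing inside the function) are my own assumptions, chosen so the helpers are easy to test in isolation.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

enum { MIN_LEN = 1, MAX_LEN = 25 };   /* limits taken from the question */

/* True if the string length is within the accepted range. */
bool validate_input(const char *str_input)
{
    size_t len = strlen(str_input);
    return len >= MIN_LEN && len <= MAX_LEN;
}

/* Convert the string to lower case in place. */
void str_to_lower(char *str_input)
{
    for (; *str_input != '\0'; ++str_input) {
        /* Cast to unsigned char: tolower() is undefined for negative values. */
        *str_input = (char)tolower((unsigned char)*str_input);
    }
}
```

Because neither function touches stdin or stdout, each can be checked on its own with a handful of fixed strings.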
I've just started to read K&R, and on pages 32-33 there is code that finds the longest line among the inputs.
I nearly completely copy-pasted the code given in the book, just added some comment lines to make the code more understandable for me. But it isn't working.
Edit: I'm sorry for bad questioning. It seems the program does not act properly when I press Ctrl + Z, in order to terminate it. No matter how many lines I type and how many times I press Ctrl + Z, it just does nothing.
The following is my version of the code:
/* Find the longest line among the giving inputs and print it */
#include <stdio.h>
#define MAXLINE 1000 /* maximum input line length */
int getLine(char line[], int maxLine);
void copy(char to[], char from[]);
int main(void) {
int len; /* current line length */
int max; /* maximum length seen so far */
char line[MAXLINE]; /* current input line */
char longest[MAXLINE]; /* longest line saved here*/
max = 0;
/* getLine function takes all the input from user, returns its size and assigns it to the variable len
* Then, len is checked to be greater than zero because if there's no input, no need to do any calculation
* EDGE CASE
*/
while ((len = getLine(line, MAXLINE)) > 0)
/* If the length of input is larger than the previous max length, set max as the new length value and copy that input */
if (len > max) {
max = len;
copy(longest, line);
}
if (max > 0) /* there was a line, EDGE CASE */
printf("%s", longest);
return 0;
}
/* Read a line into s, return length.
* Since the input length is unknown, there should be a limit */
int getLine(char s[], int lim) {
int c, i;
/* The loop's first condition is whether the input length is below the limit. EDGE CASE
* If it's not, omit the rest because it would cause a BUFFER OVERFLOW. Next, take the input as long as it's not an EOF command.
* Finally, if the input is end of line, finish the loop, don't take it.
*/
for (i = 0; i < lim - 1 && (c = getchar()) != EOF && c != '\n'; i++)
s[i] = c;
if (c == '\n')
s[i++] = c;
s[i++] = '\0'; // always put a '\0' character to a string array ending, so that the compiler knows it's a string.
return i;
}
void copy(char to[], char from[]) {
int i = 0;
// This loop assigns all chars from the source array to the target array until it reaches the ending char.
while ((to[i] = from[i]) != '\0')
++i;
}
Thanks in advance!
Okay, here's the error:
s[i++] = '\0'; // always put a '\0' character to a string array ending, so that the compiler knows it's a string.
This will cause it to terminate the string even for no input (when it got EOF directly), and since it increments i before returning it, getLine() will never return 0 and thus main() will never stop. Removing the ++ fixed it, for my simple test.
Also, the comment is misleading, the compiler doesn't know anything. The compiler is no longer around when the code runs; the in-memory format of strings is something that's needed to keep the run-time libraries happy since that's what they expect.
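For completeness, this is roughly what the function looks like with the ++ removed. It is a sketch: the name get_line and the FILE * parameter are my additions so that it can be run against any stream, while the book's version reads with getchar().

```c
#include <stdio.h>

/* Read a line into s (including the '\n' if it fits) and return its
   length. Returns 0 only when there is no more input. */
int get_line(FILE *in, char s[], int lim)
{
    int c = EOF;   /* initialized in case the loop body never runs */
    int i;

    for (i = 0; i < lim - 1 && (c = fgetc(in)) != EOF && c != '\n'; i++)
        s[i] = (char)c;
    if (c == '\n')
        s[i++] = (char)c;
    s[i] = '\0';   /* terminate, but do not count the terminator */
    return i;
}
```

With this version an immediate EOF leaves i at 0, so the while loop in main() ends as intended.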
I have a small issue with a K&R example (sort line example, page 108).
I do not understand the behaviour I see when I uncomment the line in readlines which removes the newline character added when reading input with getline.
int main()
{
int nlines;
if ((nlines = readlines(lineptr, MAXLINES)) >= 0) {
my_qsort(lineptr, 0, nlines-1);
writelines(lineptr, nlines);
return 0;
} else {
printf("error: input too big \n");
return 1;
}
}
int readlines(char *lineptr[], int maxlines)
{
int len, nlines;
char *p, line[MAXLEN];
nlines = 0;
while ((len = my_getline(line, MAXLEN)) > 0)
if (nlines >= maxlines || (p = alloc(len)) == NULL)
return -1;
else {
line[len-1] = '\0'; //delete newline.
my_strcpy(p, line);
lineptr[nlines++] = p;
}
return nlines;
}
void writelines(char *lineptr[], int nlines)
{
while (nlines-- > 0)
printf("%s\n", *lineptr++);
}
For example, if I then pipe in the following:
linje1
linje2
linje3
linje4
then writelines will output:
linje1
linje2
linje3
linje4
linje2
linje3
linje4
linje3
linje4
linje4
"and one last newline..."
From which I deduce that lineptr[0] points to all the lines. lineptr[1] points to all but the first line, ... , lineptr[3] points just to "linje4"
I do not understand how we get this behaviour from storing lines as "linje1\n", instead of "linje1".
Clarification:
in writelines (when lineptr points to the start of the array)
how can the call printf("%s", *lineptr) print all the lines?
Edit 2:
Ah, I see, but here is the getline function from K&R
int my_getline(char s[], int lim)
{
int c, i;
for (i=0; i < lim-1 && (c=getchar()) != EOF && c != '\n'; i++)
s[i] = c;
if (c == '\n') {
s[i] = c;
++i;
}
s[i] = '\0';
return i;
}
And I was sure that would always give me a null-terminated string, regardless of whether it ended with a newline or not?
and here is K&R's alloc:
#define ALLOCSIZE 10000
static char allocbuf[ALLOCSIZE]; // Storeage for alloc
static char *allocp = allocbuf; // Next free position
char *alloc(int n) // Return pointer to n characters
{
if (allocbuf + ALLOCSIZE - allocp >= n) { // it fits
allocp += n;
return allocp - n;
} else
return 0;
}
Edit 3:
Thank you for all the comments. But the entire program is typed up exactly as in K&R and works perfectly (I have compared the output with grep), so all the peripheral functions do as intended (my_strcpy for example works exactly like strcpy and copies the string up to and including the null terminator). The alloc function is just a pointer to K&R's big char array.
What I still don't understand is:
The program reads in some lines of text and copies each one, which stores line i somewhere in memory, and has lineptr[i] point to that memory location:
The lines are read with my_getline, which reads in the entire line (including newline character) and then terminates the string with the nullcharacter.
If I skip the line[len-1] = '\0'; step, readlines then stores a pointer to a copy of this line in lineptr[i]. And in memory, I thought the string (for i=1) looked like this "linje1\n\0"
But as @DanJAB pointed out, the nullcharacter is most likely missing, so the string is stored as "linje1\n", and so when writelines prints (via the pointer in the first entry in lineptr) this line, it prints everything following this in memory, since the nullcharacter is missing, which happens to be the rest of the lines.
But what I just can't wrap my head around is why line[len-1] = '\0'; is evidently needed for the string (i=1) to be stored as "linje1\0", when my_getline always returns a null-terminated string.
Thanks again, and sorry for any potential unclarities.
Final edit
The entire problem was with alloc(len) not allocating space for the final nullcharacter! Thank you for helping me out.
1) With char line[MAXLEN]; ... my_getline(line, MAXLEN) ... my_getline(char s[], int lim), lim is the size of the buffer.
But the function my_getline() is designed so that lim is the maximum string length.
The length of a C string is 1 less than the minimum size of the char array it resides in.
Use char line[MAXLEN+1]; or alternatively change my_getline(line, MAXLEN) code to i < lim-2
2) The result of my_getline(line, MAXLEN) can be "" (but the len > 0 test takes care of that) , further the line may not end with a '\n'.
line[len-1] = '\0'; //delete newline.
Better to use
if (len > 0 && line[len-1] == '\n') {
line[len-1] = '\0'; //delete newline.
}
3) p = alloc(len) is inadequate. Use p = alloc(len+1u)
4) Recommend commenting out my_qsort(lineptr, 0, nlines-1); until all other code is working.
5) All this makes me suspect the unposted my_strcpy()/my_qsort() too.
Code may have additional problems, but what is posted is not compilable.
If you are talking about the line line[len-1] = '\0'; it's not deleting a newline, it's replacing it with a null terminator. This means that if you don't have that line, then you don't have the thing that marks the end of the string, so when you print it, you also get whatever follows it in memory (the next strings).
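To make the alloc(len + 1) point concrete, here is a small sketch built on the alloc() from the question. The helper name store_line is hypothetical; it reserves room for the terminator explicitly, so each stored string ends where it should regardless of whether the newline was stripped.

```c
#include <stddef.h>
#include <string.h>

#define ALLOCSIZE 10000
static char allocbuf[ALLOCSIZE];   /* storage for alloc */
static char *allocp = allocbuf;    /* next free position */

/* K&R-style allocator: return a pointer to n characters, or NULL. */
char *alloc(int n)
{
    if (allocbuf + ALLOCSIZE - allocp >= n) {   /* it fits */
        allocp += n;
        return allocp - n;
    }
    return NULL;
}

/* Store a copy of line without its trailing newline.
   Reserves len + 1 bytes so the '\0' has a place to live. */
char *store_line(const char *line)
{
    size_t len = strlen(line);
    if (len > 0 && line[len - 1] == '\n')
        len--;                      /* do not store the newline */

    char *p = alloc((int)len + 1);  /* +1 for the terminator */
    if (p == NULL)
        return NULL;
    memcpy(p, line, len);
    p[len] = '\0';
    return p;
}
```

With alloc(len) instead, the next stored line starts exactly where the terminator of the previous one should be and overwrites it, which is why printing the first pointer shows every following line as well.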
I have a simple function which is supposed to read a line from standard input and put it into a char array, and I call this function in a loop until EOF is reached. The problem is that for extremely long lines (more than 10k characters) fgets reads only some of the characters and stops, although it has not encountered any \n and the buffer has sufficient space, so the next call of this function reads the rest of the line. Is there a reason for this behaviour (wrongly written code, some buffers I am unaware of)? Is it possible to fix it? If I have something wrong in the code, I will be grateful if you point it out.
static int getLine(char** line){
if(feof(stdin)) return 0;
int len=0;
char* pointer=NULL;
int max = 1;
while(1){
max+=400;
*line=(char*)realloc( *line,max);
if(pointer==NULL)
pointer=*line;
if(fgets(pointer, 401, stdin)==NULL)break;
int len1=strlen(pointer);
len+=len1;
if(len1!=400 || pointer[len1]=='\n')break;
pointer+=len1;
}
if(len==0)return 0;
if((*line)[len-1]=='\n'){
*line=(char*)realloc(*line, len);
(*line)[len-1]='\0';
return len-1;}//without \n
return len;
}
I think it likely that your problem is the way you use pointer:
char* pointer=NULL;
int max = 1;
while(1){
max+=400;
*line=(char*)realloc( *line,max);
if(pointer==NULL)
pointer=*line;
if(fgets(pointer, 401, stdin)==NULL)
break;
int len1=strlen(pointer);
len+=len1;
if(len1!=400 || pointer[len1]=='\n')
break;
pointer+=len1;
}
The trouble is that realloc() can change where the data is stored, but you fix it to the location you are first given. It is more likely that you'll have data move on reallocation if you handle large quantities of data. You can diagnose this by tracking the value of *line (print it after the realloc() on each iteration).
The fix is fairly simple: use an offset instead of a pointer as the authoritative write position, and recompute pointer from it on each iteration:
enum { EXTRA_LEN = 400 };
size_t offset = 0;
int max = 1;
while (1)
{
max += EXTRA_LEN;
char *space = (char*)realloc(*line, max); // Leak prevention
if (space == 0)
return len;
*line = space;
char *pointer = *line + offset;
if (fgets(pointer, EXTRA_LEN + 1, stdin) == NULL)
break;
int len1 = strlen(pointer);
len += len1;
if (len1 != EXTRA_LEN || pointer[len1 - 1] == '\n') // pointer[len1] is always '\0'
break;
offset += len1;
}
I have reservations about the use of 401 rather than 400 in the call to fgets(), but I haven't the energy to expend establishing whether it is correct or not. I've done about the minimum changes to your code that I can; I would probably make more extensive changes if it were code I was polishing. (In particular, max would start at 0, not 1, and I would not use the +1 in the call to fgets().)
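For comparison, here is a sketch of what a more thoroughly reworked version might look like. It is not the code from the question: the name read_long_line, the CHUNK constant, and the decision to return the buffer (instead of filling a char **) are all my own assumptions. The key idea is unchanged: keep an offset, and derive the write position from the current buffer address after every realloc().

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum { CHUNK = 400 };   /* growth step, same size as in the question */

/* Read one line of any length from in. Returns a malloc()ed string
   without the trailing newline, or NULL at end of input or on
   allocation failure. The caller frees the result. */
char *read_long_line(FILE *in)
{
    char *line = NULL;
    size_t len = 0;     /* characters stored so far; also the write offset */
    size_t cap = 0;     /* current capacity of line */

    for (;;) {
        char *space = realloc(line, cap + CHUNK);
        if (space == NULL) {
            free(line);
            return NULL;
        }
        line = space;   /* the block may have moved */
        cap += CHUNK;

        /* line + len is recomputed here, so a moved block is harmless. */
        if (fgets(line + len, (int)(cap - len), in) == NULL)
            break;
        len += strlen(line + len);
        if (len > 0 && line[len - 1] == '\n') {
            line[--len] = '\0';
            return line;
        }
    }
    if (len == 0) {     /* end of input before any character was read */
        free(line);
        return NULL;
    }
    return line;        /* last line had no trailing newline */
}
```

An empty line comes back as an empty string, while end of input comes back as NULL, so the caller can tell the two apart.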
I have a file like this:
...
words 13
more words 21
even more words 4
...
(General format is a string of non-digits, then a space, then any number of digits and a newline)
and I'd like to parse every line, putting the words into one field of the structure, and the number into the other. Right now I am using an ugly hack of reading the line while the chars are not numbers, then reading the rest. I believe there's a clearer way.
Edit: You can use pNum-buf to get the length of the alphabetical part of the string, and use strncpy() to copy that into another buffer. Be sure to add a '\0' to the end of the destination buffer. I would insert this code before the pNum++.
int len = pNum-buf;
strncpy(newBuf, buf, len);
newBuf[len] = '\0';
You could read the entire line into a buffer and then use:
char *pNum;
if (pNum = strrchr(buf, ' ')) {
pNum++;
}
to get a pointer to the number field.
fscanf(file, "%s %d", word, &value);
This gets the values directly into a string and an integer, and copes with variations in whitespace and numerical formats, etc.
Edit
Ooops, I forgot that you had spaces between the words.
In that case, I'd do the following. (Note that it truncates the original text in 'line')
// Scan to find the last space in the line
char *p = line;
char *lastSpace = NULL;
while(*p != '\0')
{
if (*p == ' ')
lastSpace = p;
p++;
}
if (lastSpace == NULL)
return("parse error");
// Replace the last space in the line with a NUL
*lastSpace = '\0';
// Advance past the NUL to the first character of the number field
lastSpace++;
char *word = line;
int number = atoi(lastSpace);
You can solve this using stdlib functions, but the above is likely to be more efficient as you're only searching for the characters you are interested in.
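For what it's worth, the standard-library route is quite short as well. Below is a sketch using strrchr() to find the last space and strtol() to validate the number; the function name split_words_number and its 0 / -1 return convention are invented for this example. Like the loop above, it modifies the line in place.

```c
#include <stdlib.h>
#include <string.h>

/* Split a line such as "even more words 4" in place.
   On success stores the text part in *words and the value in *number
   and returns 0; returns -1 if there is no space or no valid number. */
int split_words_number(char *line, char **words, long *number)
{
    line[strcspn(line, "\n")] = '\0';           /* drop a trailing newline */

    char *last_space = strrchr(line, ' ');
    if (last_space == NULL || last_space[1] == '\0')
        return -1;                              /* nothing after the space */

    char *end;
    long value = strtol(last_space + 1, &end, 10);
    if (end == last_space + 1 || *end != '\0')
        return -1;                              /* last field is not a number */

    *last_space = '\0';                         /* terminate the text part */
    *words = line;
    *number = value;
    return 0;
}
```

Unlike atoi(), strtol() lets you detect a last field that is not a number at all, at the cost of one extra pointer.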
Given the description, I think I'd use a variant of this (now tested) C99 code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <ctype.h>
struct word_number
{
char word[128];
long number;
};
int read_word_number(FILE *fp, struct word_number *wnp)
{
char buffer[140];
if (fgets(buffer, sizeof(buffer), fp) == 0)
return EOF;
size_t len = strlen(buffer);
if (buffer[len-1] != '\n') // Error if line too long to fit
return EOF;
buffer[--len] = '\0';
char *num = &buffer[len-1];
while (num > buffer && !isspace((unsigned char)*num))
num--;
if (num == buffer) // No space in input data
return EOF;
char *end;
wnp->number = strtol(num+1, &end, 0);
if (*end != '\0') // Invalid number as last word on line
return EOF;
*num = '\0';
if (num - buffer >= sizeof(wnp->word)) // Non-number part too long
return EOF;
memcpy(wnp->word, buffer, num - buffer + 1); // +1 also copies the terminating '\0'
return(0);
}
int main(void)
{
struct word_number wn;
while (read_word_number(stdin, &wn) != EOF)
printf("Word <<%s>> Number %ld\n", wn.word, wn.number);
return(0);
}
You could improve the error reporting by returning different values for different problems.
You could make it work with dynamically allocated memory for the word portion of the lines.
You could make it work with longer lines than I allow.
You could scan backwards over digits instead of non-spaces - but this allows the user to write "abc 0x123" and the hex value is handled correctly.
You might prefer to ensure there are no digits in the word part; this code does not care.
You could try using strtok() to tokenize each line, and then check whether each token is a number or a word (a fairly trivial check once you have the token string - just look at the first character of the token).
Assuming that the number is immediately followed by '\n', you can read each line into a char buffer, use sscanf() with "%d" to get the number, and then calculate how many characters that number takes up at the end of the text string.
Depending on how complex your strings become you may want to use the PCRE library. At least that way you can compile a perl'ish regular expression to split your lines. It may be overkill though.
Given the description, here's what I'd do: read each line as a single string using fgets() (making sure the target buffer is large enough), then split the line using strtok(). To determine if each token is a word or a number, I'd use strtol() to attempt the conversion and check the error condition. Example:
#include <stdlib.h>
#include <stdio.h>
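#include <string.h>
#define MAX_LINE_LENGTH 256 /* assumed limits; adjust to your data */
#define MAX_STR_SIZE 64
#define MAX_NUM_STRINGS 16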
/**
* Read the next line from the file, splitting the tokens into
* multiple strings and a single integer. Assumes input lines
* never exceed MAX_LINE_LENGTH and each individual string never
* exceeds MAX_STR_SIZE. Otherwise things get a little more
* interesting. Also assumes that the integer is the last
* thing on each line.
*/
int getNextLine(FILE *in, char (*strs)[MAX_STR_SIZE], int *numStrings, int *value)
{
char buffer[MAX_LINE_LENGTH];
int rval = 1;
if (fgets(buffer, sizeof buffer, in))
{
char *token = strtok(buffer, " ");
*numStrings = 0;
while (token)
{
char *chk;
*value = (int) strtol(token, &chk, 10);
if (*chk != 0 && *chk != '\n')
{
strcpy(strs[(*numStrings)++], token);
}
token = strtok(NULL, " ");
}
}
else
{
/**
* fgets() hit either EOF or error; either way return 0
*/
rval = 0;
}
return rval;
}
/**
* sample main
*/
int main(void)
{
FILE *input;
char strings[MAX_NUM_STRINGS][MAX_STR_SIZE];
int numStrings;
int value;
input = fopen("datafile.txt", "r");
if (input)
{
while (getNextLine(input, strings, &numStrings, &value))
{
/**
* Do something with strings and value here
*/
}
fclose(input);
}
return 0;
}