K&R: Array of pointers to strings with newlines - c

I have a small issue with a K&R example (sort line example, page 108).
I do not understand the behaviour I see when I uncomment the line in readlines which removes the newline character added when reading input with getline.
int main()
{
int nlines;
if ((nlines = readlines(lineptr, MAXLINES)) >= 0) {
my_qsort(lineptr, 0, nlines-1);
writelines(lineptr, nlines);
return 0;
} else {
printf("error: input too big \n");
return 1;
}
}
int readlines(char *lineptr[], int maxlines)
{
int len, nlines;
char *p, line[MAXLEN];
nlines = 0;
while ((len = my_getline(line, MAXLEN)) > 0)
if (nlines >= maxlines || (p = alloc(len)) == NULL)
return -1;
else {
line[len-1] = '\0'; //delete newline.
my_strcpy(p, line);
lineptr[nlines++] = p;
}
return nlines;
}
void writelines(char *lineptr[], int nlines)
{
while (nlines-- > 0)
printf("%s\n", *lineptr++);
}
For example, if I then pipe in the following:
linje1
linje2
linje3
linje4
then writelines will output:
linje1
linje2
linje3
linje4
linje2
linje3
linje4
linje3
linje4
linje4
"and one last newline..."
From which I deduce that lineptr[0] points to all the lines. lineptr[1] points to all but the first line, ... , lineptr[3] points just to "linje4"
I do not understand how we get this behaviour from storing lines as "linje1\n", instead of "linje1".
Clarification:
in writelines (when lineptr points to the start of the array)
how can the call printf("%s", *lineptr) print all the lines?
Edit 2:
Ah, I see, but here is the getline function from K&R
int my_getline(char s[], int lim)
{
int c, i;
for (i=0; i < lim-1 && (c=getchar()) != EOF && c != '\n'; i++)
s[i] = c;
if (c == '\n') {
s[i] = c;
++i;
}
s[i] = '\0';
return i;
}
And I was sure that would always give me a null-terminated string, regardless of whether it ended with a newline or not?
and here is K&R's alloc:
#define ALLOCSIZE 10000
static char allocbuf[ALLOCSIZE]; // Storage for alloc
static char *allocp = allocbuf; // Next free position
char *alloc(int n) // Return pointer to n characters
{
if (allocbuf + ALLOCSIZE - allocp >= n) { // it fits
allocp += n;
return allocp - n;
} else
return 0;
}
Edit 3:
Thank you for all the comments. But the entire program is typed up exactly as in K&R and works perfectly (I have compared the output with grep), so all the peripheral functions do as intended (my_strcpy for example works exactly like strcpy and copies the string up to and including the null terminator). The alloc function is just a pointer to K&R's big char array.
What I still don't understand is:
The program reads in some lines of text and copies each one, so that line i is stored somewhere in memory and lineptr[i] points to that memory location:
The lines are read with my_getline, which reads in the entire line (including the newline character) and then terminates the string with the null character.
If I skip the line[len-1] = '\0'; step, readlines then stores a pointer to a copy of this line in lineptr[i]. And in memory, I thought the string (for i=1) looked like this "linje1\n\0"
But as #DanJAB pointed out, the null character is most likely missing, so the string is stored as "linje1\n", and so when writelines prints this line (via the pointer in the first entry in lineptr), it prints everything following it in memory, since the null character is missing, which happens to be the rest of the lines.
But what I just can't wrap my head around is why line[len-1] = '\0'; is evidently needed for the string (i=1) to be stored as "linje1\0", when my_getline always returns a null-terminated string?
Thanks again, and sorry for any potential unclarities.
Final edit
The entire problem was with alloc(len) not allocating space for the final nullcharacter! Thank you for helping me out.
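The off-by-one named in the final edit can be reproduced in isolation. The sketch below reuses the alloc shown above; store() and its extra parameter are hypothetical names introduced only for this illustration. With extra = 0 it mimics alloc(len): the copy writes strlen + 1 bytes into a block of strlen bytes, so the terminator lands in the first byte of the next block and is overwritten by the next line.

```c
#include <stdio.h>
#include <string.h>

#define ALLOCSIZE 10000
static char allocbuf[ALLOCSIZE];   /* storage for alloc */
static char *allocp = allocbuf;    /* next free position */

/* K&R-style allocator: hands out n consecutive bytes of allocbuf. */
char *alloc(int n)
{
    if (allocbuf + ALLOCSIZE - allocp >= n) {
        allocp += n;
        return allocp - n;
    }
    return NULL;
}

/* Hypothetical helper: copy s into a block of strlen(s) + extra bytes.
   extra = 0 leaves no room for the terminator, extra = 1 does. */
char *store(const char *s, int extra)
{
    char *p = alloc((int)strlen(s) + extra);
    if (p != NULL)
        strcpy(p, s);   /* writes strlen(s) + 1 bytes, terminator included */
    return p;
}
```

Calling store("ab\n", 0) and then store("cd\n", 0) leaves the first pointer reading as "ab\ncd\n", which matches the output in the question. Replacing the newline with '\0' first makes each string one byte shorter, so it happens to fit in len bytes, and that is why the extra line appeared to be needed.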

1) With char line[MAXLEN]; ... my_getline(line, MAXLEN) ... my_getline(char s[], int lim), lim is the size of the buffer.
But the function my_getline() is designed so that lim is the maximum string length.
The length of a C string is 1 less than the minimum size of the char array it resides in.
Use char line[MAXLEN+1]; or alternatively change my_getline(line, MAXLEN) code to i < lim-2
2) The result of my_getline(line, MAXLEN) can be "" (but the len > 0 test takes care of that) , further the line may not end with a '\n'.
line[len-1] = '\0'; //delete newline.
Better to use
if (len > 0 && line[len-1] == '\n') {
line[len-1] = '\0'; //delete newline.
}
3) p = alloc(len) is inadequate. Use p = alloc(len+1u)
4) Recommend commenting out my_qsort(lineptr, 0, nlines-1); until all other code is working.
5) All this makes me suspect the unposted my_strcpy()/my_qsort() too.
Code may have additional problems, but what is posted is not compilable.
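Points 2 and 3 can be combined into one small helper. This is only a sketch; strip_newline is a hypothetical name, not part of the K&R program.

```c
#include <string.h>

/* Hypothetical helper: remove one trailing newline, if there is one.
   Returns the new length of the string. */
size_t strip_newline(char *line)
{
    size_t len = strlen(line);
    if (len > 0 && line[len - 1] == '\n')
        line[--len] = '\0';   /* overwrite the newline with the terminator */
    return len;
}
```

In readlines the returned length could then be used as alloc(len + 1), so the copy always has room for its terminator whether or not a newline was removed.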

If you are talking about the line line[len-1] = '\0'; it's not deleting a newline, it's replacing it with a null terminator. This means that if you don't have that line then you don't have the thing that marks the end of the string, so when you print it, you also get whatever follows it in memory (the next strings).

Related

How can I use a for loop to read until the scanned character is a newline?

I am a C beginner who has been assigned to write a program that uses pointers to reverse a message. I am having trouble getting the for loop that reads the characters to break after it reads a newline and I don't want to use a while loop.
Below is my code:
#include <stdio.h>
#include <string.h>
int main(){
//declare string
char reverse[100];
//declare pointer
char *first;
//set pointer to point to first element of array
first = &reverse[0];
//get chars until end of input
printf("Enter a message:");
for (first = reverse; *first != '\n'; first++){
scanf("%c", first);
printf("%c", *first);
}
//reverse chars one by one
printf("Reversal: ");
for (first; first >= reverse; first--){
printf("%c", *first);
}
printf("\n");
return 0;
}
Thank you! Any help is appreciated :)
You have effectively implemented gets, a function so dangerous it was removed from the language, with a side of undefined behaviour because *first != '\n' is executed at the start of each iteration of the loop, and *first is an indeterminate value at this point.
first-- and first >= reverse are also a problem as holding a pointer value to one before the start of an object is also undefined behaviour. It is valid C to hold (but not use with unary *) an address one past the end of an array object (C11 §6.5.6).
Note, using getchar would be preferable to scanf.
If you do not want the newline in the buffer:
#include <stdio.h>
#define SIZE 100
int main(void) {
char buffer[SIZE];
char *dest = buffer;
for (; dest - buffer < sizeof buffer - 1; dest++)
if (1 != scanf("%c", dest) || '\n' == *dest)
break;
*dest = 0;
while (dest > buffer)
putchar(*--dest);
putchar('\n');
}
If you want the newline in the buffer:
#include <stdio.h>
#define SIZE 100
int main(void) {
char buffer[SIZE];
char *dest = buffer;
for (; dest - buffer < sizeof buffer - 1;)
if (1 != scanf("%c", dest) || '\n' == *dest++)
break;
*dest = 0;
while (dest > buffer)
putchar(*--dest);
putchar('\n');
}
Note the difference in when dest is incremented.
while (dest > buffer)
putchar(*--dest);
works by starting at the null-terminating character, and printing each element that comes before the current one, until our current element is the start of the array.
This can also get extremely terse, and you may be tempted to write something like
/* newline not kept in buffer */
for (; dest - buffer < sizeof buffer - 1 && 1 == scanf("%c", dest) && '\n' != *dest; dest++);
but use of the break keyword can help to keep it readable.
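The pointer rule mentioned above (a pointer may sit one past the end of an array, but never before its start) can be shown on its own. The sketch below is not from the answer; reverse_copy is a hypothetical helper that walks backwards with the same `*--p` pattern.

```c
#include <stddef.h>

/* Hypothetical helper: write the n characters of src into dst in reverse
   order and null-terminate. dst must have room for n + 1 bytes. */
void reverse_copy(char *dst, const char *src, size_t n)
{
    const char *p = src + n;   /* one past the last character: valid to hold */
    while (p > src)            /* the loop stops before p could go below src */
        *dst++ = *--p;         /* step back first, then read */
    *dst = '\0';
}
```

Because the decrement happens before the read and the test is p > src, the pointer never takes a value below src, unlike the `first >= reverse; first--` loop in the question.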

Infinite Loop on Get_Next_Line in C

I have to create a C function that returns a line read from a file descriptor. I have to define a macro READ_SIZE (that can be editable). This READ_SIZE indicates the number of characters to read at each call of read(). The number can only be positive.
I also have to use one or several static variables to save the characters that were read but not sent back to the calling function. One .C file (5 functions max, 25 lines max per function) and one .h file only.
My function get_next_line shall return its result without the '\n'. If there is nothing more to read on the file descriptor, or if an error occurs while reading, the function returns NULL.
Here is the prototype of the function:
char *get_next_line(const int fd)
FUNCTIONS ALLOWED: malloc, free, read, write (to use with my_putchar, my_putstr, etc).
Here is what I have, but it doesn't work. It loops forever and I am trying to find out why.
char *my_strcat(char *str1, char *str2)
{
int i;
int j;
int s;
char *strfinal;
i = 0;
j = 0;
s = 0;
if ((strfinal = malloc(sizeof(char) * (my_strlen(str1) + my_strlen(str2)
+ 1))) == NULL)
return (NULL);
while (str1[i] != '\0')
{
strfinal[j] = str1[i];
i++;
j++;
}
while (str2[s] != '\0')
{
strfinal[j] = str2[s];
s++;
j++;
}
free(str1);
strfinal[j] = '\0';
return (strfinal);
}
char *get_next_line(const int fd)
{
int n;
int i;
char *str_to_return;
static char buff[READ_SIZE] = {'\0'};
n = 1;
i = 0;
str_to_return = NULL;
while (n)
{
if (i == 0 && buff == '\0')
{
if ((read(fd, buff, READ_SIZE)) <= 0)
return(str_to_return);
if (i == READ_SIZE - 1 || buff[i] == '\n')
{
n = 0;
str_to_return = my_strcat(buff, str_to_return);
i = -1;
}
}
i++;
}
printf("%s\n", str_to_return);
return (str_to_return);
}
In this code:
while (str1[i] != '\0')
{
strfinal[j] = str1[i];
i++;
j++;
}
what guarantee do you have that there will be a null character \0 somewhere in str1[]?
The same goes for the str2 while loop.
If no null character is encountered, the loop keeps reading past the end of the array, which is undefined behaviour and may well never terminate.
Verify that the functions you use to put characters into str1[] and str2[] also store the null character. Since the only one you use beforehand is read(), the answer is no.
The problem with your two while loops for str1[] and str2[] is that you are relying on the null character already being there in memory. That raises the question: who put that data in memory, and were they required to terminate the character data with a null character?
You therefore need some control over any loop you write so that it cannot run away; in this case, for example, pass a count and stop after that many accesses to str1[i], even if no null character has been seen yet.
For comparison, the fgets() function reads up to a given number of characters from a FILE stream into an array and always terminates it with the null character.
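One way to put such a bound on the loops is to pass the lengths explicitly. The sketch below is an assumption-laden illustration, not the asker's function: concat_n is a hypothetical name, and it uses memcpy, which is outside the list of functions the assignment allows (a counted loop would do the same job).

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: join the first alen bytes of a and the first blen
   bytes of b into a new null-terminated string. Neither input has to be
   null-terminated. Returns NULL if malloc fails; the caller frees the result. */
char *concat_n(const char *a, size_t alen, const char *b, size_t blen)
{
    char *out = malloc(alen + blen + 1);   /* +1 for the terminator */
    if (out == NULL)
        return NULL;
    if (alen > 0)
        memcpy(out, a, alen);
    if (blen > 0)
        memcpy(out + alen, b, blen);
    out[alen + blen] = '\0';               /* the only terminator we rely on */
    return out;
}
```

With read(), the byte count it returns is the natural value to pass as one of the lengths.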
if (i == 0 && buff == '\0')
is always false because your definition of buff is
static char buff[READ_SIZE] = {'\0'};
You are attempting to test if buff is empty when i is 0. However, buff is an array; in this expression it decays to the address of its first element, which is never null. You mean to write the if as
if (i == 0 && buff[0] == '\0')
in order to check whether the first character is the null character.
However, once i is incremented, the test always fails, even if you test against
if (i == 0 && buff[i] == '\0')
in order to find the null character within the buffer. Since you enter the while with i = 0 and are checking if buff is empty, you do not need the while.
If you want to just fill the buffer and keep reading until it is full, you need a different type of test. You also need a way of checking if you need to exit the while loop if the if fails (put in an else to determine what to do).
You also do not need to check each character in buff against '\0' as long as the buffer still ends with one (it does after initialization, but not after read() has filled all READ_SIZE bytes). In that case strlen(buff) would be valid.
Another point is that when you call my_strcat() you have already verified that buff is empty.
Also, since buff holds what read() returned, my_strcat() will not always find a '\0' at the end of it, and str_to_return is still NULL at that point. You should pass the number of characters read so that my_strcat() knows how many to use.
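Putting the points above together, one possible shape for the function is sketched below. It is not the asker's code and makes several assumptions: READ_SIZE is given a default of 16, only one file descriptor is read at a time (the state is static), and realloc is used for brevity even though the assignment only allows malloc and free. It also ignores the 25-line limit.

```c
#include <stdlib.h>
#include <unistd.h>

#ifndef READ_SIZE
#define READ_SIZE 16   /* assumed value; the assignment lets it vary */
#endif

/* Sketch: return the next line from fd without its '\n', or NULL when
   nothing is left to read. The caller frees the result. */
char *get_next_line(const int fd)
{
    static char buff[READ_SIZE];
    static ssize_t len = 0;   /* number of valid bytes in buff */
    static ssize_t pos = 0;   /* index of the next unread byte */
    char *line = NULL;
    size_t n = 0;             /* characters stored in line so far */
    size_t cap = 0;           /* bytes allocated for line */

    for (;;) {
        if (pos >= len) {                 /* buffer exhausted: refill it */
            len = read(fd, buff, READ_SIZE);
            pos = 0;
            if (len <= 0) {               /* end of input or read error */
                len = 0;
                break;
            }
        }
        if (n + 1 >= cap) {               /* keep room for one char + '\0' */
            size_t ncap = cap ? cap * 2 : 32;
            char *tmp = realloc(line, ncap);
            if (tmp == NULL) {
                free(line);
                return NULL;
            }
            line = tmp;
            cap = ncap;
        }
        if (buff[pos] == '\n') {          /* end of line: consume it and stop */
            pos++;
            line[n] = '\0';
            return line;
        }
        line[n++] = buff[pos++];
    }
    if (line != NULL)
        line[n] = '\0';                   /* last line had no newline */
    return line;
}
```

The count returned by read() is what bounds every access to buff, so the function never depends on a null character being present in the buffer.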

Return a string made with a line read from input

I am trying to write a C function which returns a line read from the input as a char*. I am on Windows and I test my program on the command line by giving files as input and output of my program like this:
cl program.c
program < test_in.txt > test_out.txt
This is my (not working) function:
char* getLine(void)
{
char* result = "";
int i, c;
i = 1;
while((c = getchar()) != EOF)
{
*result++ = c;
i++;
if(c == '\n')
return result - i;
}
return result - i;
}
I was expecting it to work because i previously wrote:
char* getString(char* string)
{
//char* result = string; // the following code achieve this.
char* result = "";
int i;
for(i = 1; *result++ = *string++; i++);
return result - i;
}
And these lines of code have a correct behaviour.
Every answer will be appreciated, but I would be really thankful
if any of you could explain to me why my getString() function works while my getLine() function doesn't.
Your function does not allocate enough space for the string being read. The variable char* result = "" defines a char pointer to a string literal ("", empty string), and you store some arbitrary number of characters into the location pointed to by result.
char* getLine(void)
{
char* result = ""; //you need space to store input
int i, c;
i = 1;
while((c = getchar()) != EOF)
{
*result++ = c; //you should check space
i++;
if(c == '\n')
return result - i; //you should null-terminate
}
return result - i; //you should null-terminate
}
You need to allocate space for your string, which is challenging because you don't know a priori how much space you are going to need. So you need to decide whether to limit how much you read (à la fgets) or dynamically reallocate space as you read more. Also, how do you indicate that you have finished input (reached EOF)?
The following alternative assumes dynamic reallocation is your chosen strategy.
char* getLine(void)
{
int ch; size_t size=100; size_t pos=0;
char* result = malloc(size);
if(!result) return NULL; //malloc failed
while( (ch=getchar()) != EOF )
{
result[pos++] = ch;
if( pos >= size ) { //grow, keeping room for the terminator
char* tmp = realloc(result,size+=100);
//or, realloc(result,size*=2);
if(!tmp) { free(result); return NULL; } //realloc failed
result = tmp; //realloc may have moved the block
}
if( ch=='\n' ) break;
}
result[pos] = '\0'; //null-terminate
return result;
}
When you are done with the string returned from the above function, please remember to free() the allocated space.
This alternative assumes you provide a buffer to store the string (and specifies the size of the buffer).
char* getLine(char* buffer, size_t size)
{
int ch;
size_t pos=0;
if( size==0 ) return buffer; //no room even for the terminator
while( pos+1 < size && (ch=getchar()) != EOF ) //stop reading when full
{
buffer[pos++] = ch;
if( ch=='\n' ) break;
}
buffer[pos] = '\0'; //null-terminate
return buffer;
}
Both avoid the subtle interaction between detecting EOF, and having enough space to store a character read. The solution is to buffer a character if you read and there is not enough room, and then inject that on a subsequent read. You will also need to null-ter
Both functions have undefined behaviour since you are modifying string literals. It just seems to work in one case. Basically, result needs to point to memory that can be legally accessed, which is not the case in either of the snippets.
On the same subject, you might find this useful: What Every C Programmer Should Know About Undefined Behavior.
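For comparison, a version of getString that writes only into memory it owns could look like the sketch below. dup_string is a hypothetical name; the copy loop is the same one used in the question, only the destination has changed.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical replacement for getString: copy src into freshly allocated,
   writable memory. Returns NULL if malloc fails; the caller frees the result. */
char *dup_string(const char *src)
{
    char *result = malloc(strlen(src) + 1);   /* +1 for the terminator */
    char *dst = result;
    if (result == NULL)
        return NULL;
    while ((*dst++ = *src++) != '\0')
        ;                                     /* copies the terminator too */
    return result;
}
```

Because result keeps pointing at the start of the block, there is no need to count characters and subtract at the end.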
Think of it this way.
When you say
char* result = "";
you are setting up a pointer 'result' that points to a 1-byte null-terminated string (just the null). The pointer itself is a local variable, but the string literal it points to lives in static storage that you are not allowed to modify.
Then when you say
*result++ = c;
you are storing the value 'c' at that address and then advancing the pointer by one.
So, where are you putting it?
Well, the first write lands on the literal's only byte, and every later write lands on whatever happens to follow it in memory; on many systems that region is read-only and the program crashes, and otherwise you are silently overwriting unrelated data.
That is why you have to be very careful with pointers.
When you expect to return a string from a function, you have two options: (1) provide a string to the function with adequate space to hold the string (including the null-terminating character), or (2) dynamically allocate memory for the string within the function and return a pointer. Within your function you must also have a way to ensure you are not writing beyond the end of the space available and that you are leaving room for the null-terminating character. That requires passing a maximum size if you are providing the array to the function, and keeping count of the characters read.
Putting that together, you could do something similar to:
#include <stdio.h>
#define MAXC 256
char* getLine (char *s, int max)
{
int i = 0, c = 0;
char *p = s;
while (i + 1 < max && (c = getchar()) != '\n' && c != EOF) {
*p++ = c;
i++;
}
*p = 0;
return s;
}
int main (void) {
char buf[MAXC] = {0};
printf ("\ninput : ");
getLine (buf, MAXC);
printf ("output: %s\n\n", buf);
return 0;
}
Example/Output
$ ./bin/getLine
input : A quick brown fox jumps over the lazy dog.
output: A quick brown fox jumps over the lazy dog.

Printing a string due to a new line

Is there an efficient way (in terms of performance) to print an arbitrary string, but only up to the first newline character in it (excluding the newline character)?
Example:
char *string = "Hello\nWorld\n";
printf(foo(string + 6));
Output:
World
If you are concerned about performance this might help (untested code):
void MyPrint(const char *str)
{
int len = strlen(str) + 1; // room for the terminator
char *temp = alloca(len);
int i;
for (i = 0; i < len - 1; i++) // stop before the terminator so temp[i] below stays in bounds
{
char ch = str[i];
if (ch == '\n')
break;
temp[i] = ch;
}
temp[i] = 0;
puts(temp);
}
strlen is fast, alloca is fast, copying the string up to the first \n is fast, and puts is faster than printf, but it is most likely far slower than the three operations mentioned before put together.
size_t writetodelim(char const *in, int delim)
{
char *end = strchr(in, delim);
if (!end)
return 0;
return fwrite(in, 1, end - in, stdout);
}
This can be generalized somewhat (pass the FILE* to the function), but it's already flexible enough to terminate the output on any chosen delimiter, including '\n'.
Warning: Do not use printf without format specifier to print a variable string (or from a variable pointer). Use puts instead or "%s", string.
C strings are terminated by '\0' (NUL), not by newline. So, the functions print until the NUL terminator.
You can, however, use your own loop with putchar. Whether that carries any performance penalty would have to be tested. Normally printf does much the same in the library and might be even slower, as it has to handle additional constraints, so your own loop might very well be even faster.
for ( char *sp = string + 6 ; *sp != '\0'; sp++ ) {
if ( *sp == '\n' ) break; // newline will not be printed
putchar(*sp);
}
(Move the if-line to the end of the loop if you want newline to be printed.)
An alternative would be to limit the length of the string to print, but that would require finding the next newline before calling printf.
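The length-limited alternative can be sketched with strcspn and the "%.*s" precision specifier, both standard C. format_first_line is a hypothetical helper that formats into a buffer so the result can be inspected.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format the part of s before the first '\n' into out.
   Returns the number of characters in that part. */
int format_first_line(char *out, size_t outsz, const char *s)
{
    int n = (int)strcspn(s, "\n");   /* length up to, not including, '\n' */
    return snprintf(out, outsz, "%.*s", n, s);
}
```

To print directly instead, the same idea is `printf("%.*s\n", (int)strcspn(s, "\n"), s);`, which scans the string once to find the newline and once to print.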
I don't know if it is fast enough, but there is a way to build a string containing the source string up to a newline character using only one standard function.
char *string = "Hello\nWorld\nI love C"; // Example of your string
static char newstr [256]; // String large enough to contain the result; static storage is zero-filled, so it is always null-terminated
sscanf(string + 6, "%255[^\n]", newstr); // copies at most 255 characters, stopping at the first newline
puts(newstr); // printing the string
I guess there is no more efficient way than simply looping over your string until you find the first \n in it. As Olaf mentioned, a string in C ends with a terminating \0, so if you want to use printf to print the string you need to make sure it contains the terminating \0, or you could use putchar to print the string character by character.
If you want to provide a function creating a string up to the first found new line you could do something like that:
#include <stdio.h>
#include <string.h>
#define MAX 256
void foo(const char* string, char *ret)
{
int len = (strlen(string) < MAX) ? (int) strlen(string) : MAX - 1; // leave room for the terminator
int i = 0;
for (i = 0; i < len; i++)
{
if (string[i] == '\n') break;
ret[i] = string[i];
}
ret[i] = '\0'; // terminate right after the last copied character
}
int main()
{
const char* string = "Hello\nWorld\n";
char ret[MAX];
foo(string, ret);
printf("%s\n", ret);
foo(string+6, ret);
printf("%s\n", ret);
}
This will print
Hello
World
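Another fast way (if the newline character is truly unwanted and the string is in writable memory, not a string literal)
Simply:
char *nl = strchr(string, '\n');
if (nl) *nl = '\0'; // strchr returns NULL when there is no newline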

small C program doesnt work

I tried to make a program that gets user input (lines) and prints the longest line that is over 80 characters long. I wrote the program, but when I ran it, it output some very weird symbols. Could you please tell me what might be wrong with my code?
#include <stdio.h>
#define MINLINE 80
#define MAXLINE 1000
int getline(char current[]);
void copy(char from[], char to[]);
int main(void)
{
int len; // current input line lenght
int max; // the lenght of a longest line that's over 80 characters
char current[MAXLINE]; // current input line
char over80[MAXLINE]; // input line that's over 80 characters long
while (len = (getline(current)) > 0) {
if (len > MINLINE) {
max = len;
copy(current, over80);
}
}
if (max > 0) {
printf("%s", over80);
}
else {
printf("No input line was over 80 characters long");
}
return 0;
}
int getline(char current[]) {
int i = 0, c;
while (((c = getchar()) != EOF) && c != '\n') {
current[i] = c;
++i;
}
if (i == '\n') {
current[i] = c;
++i;
}
current[i] = '\0';
return i;
}
void copy(char from[], char to[]) {
int i = 0;
while ((to[i] = from[i]) != '\0') {
++i;
}
}
Thank you very much for your help !
max may be left uninitialized if no long line is found. Reading it in if (max > 0) is then undefined behavior.
This line:
while (len = (getline(current)) > 0) {
assigns the value of (getline(current)) > 0 to len, which is not what you want (len will be 0 or 1 afterwards).
EDIT: Just saw AusCBloke's comment, you should also check for both len > max and len > MINLINE or you'll just get the latest line longer than 80 chars, not the longest overall line.
You should also initialize max to 0, so it should be
max = 0;
while ((len = getline(current)) > 0) {
if ((len > MINLINE) && (len > max)) {
Other minor errors/tips:
The built in functions strcpy and strncpy do what your copy function does, there's no need to reinvent the wheel.
In your getline function, use MAXLINE to prevent buffer overflows.
Assuming that this is a homework, here's a hint: this piece of code looks very suspicious:
if (i == '\n') {
current[i] = c;
++i;
}
Since i represents a position and is never assigned a character, you are effectively checking if the position is equal to the ASCII code of '\n'.
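For reference, a bounded version with the comparison fixed might look like the sketch below. It is an illustration rather than the expected homework answer: read_line is a hypothetical name (chosen to avoid clashing with POSIX getline), and it takes a FILE * and a limit, which the original function does not.

```c
#include <stdio.h>

/* Sketch: read one line from in into s, storing at most lim - 1 characters
   plus the terminator. The newline is kept when it fits. Returns the length.
   lim must be at least 1. */
int read_line(FILE *in, char s[], int lim)
{
    int c = EOF, i = 0;

    while (i < lim - 1 && (c = fgetc(in)) != EOF && c != '\n')
        s[i++] = (char)c;
    if (c == '\n')          /* compare the character read, not the index */
        s[i++] = (char)c;
    s[i] = '\0';
    return i;
}
```

A call such as read_line(stdin, current, MAXLINE) would then replace getline(current); a line longer than the limit is simply returned in several pieces.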
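Your copy method does null-terminate the string, because the assignment to[i] = from[i] is performed before the comparison with '\0'. An explicit terminator after the loop is redundant but harmless:
void copy(char from[], char to[]) {
int i = 0;
while ((to[i] = from[i]) != '\0') {
++i;
}
to[i] = '\0';
}
so copy is unlikely to be what produces the weird characters; the uninitialized max and the misplaced parentheses in the while condition are more likely causes.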
You could use the builtin strcpy() to make life easier.
I can't test your code right now, but it may be caused by character arrays not being cleaned. Try memset-ing the char arrays to 0.
If you supply input data that has lines with more than 1000 characters you will overflow your fixed size buffers. By feeding in such input I was able to achieve the following output:
╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠
╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠
╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠
There are a number of problems with your code. Mostly they are due to wheel-reinvention.
int getline(char current[]);
You don't need to define your own getline(); on POSIX systems there is already one in stdio.h, and reusing the name can clash with it.
void copy(char from[], char to[]);
There are also a number of functions for copying strings in string.h.
It's also a good idea to initialise all of your variables, like this:
int len = 0; // current input line length
...this can prevent problems later, like comparisons to max when you haven't initialised it.
If you initialise max like this...
int max = MINLINE; // the length of a longest line that's over 80 characters
...then it's easier to do the length comparison later on.
char* current = NULL;
size_t allocated = 0;
If current is NULL, then getline() will allocate a buffer for storing the line, which should be freed by the user program. getline() also takes a pointer to a size_t, which it updates with the size in bytes of that buffer.
while (len = (getline(current)) > 0) {
Should be replaced by the following...
while ((len = getline(&current, &allocated, stdin)) > 0) {
...which updates and compares len to 0.
Now, instead of...
if (len > MINLINE) {
...you need to compare with the last longest line, which we initialised earlier...
if (len > max) {
...and then you're good to update max as you were...
max = len;
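Where you called your copy(), use strncpy(), which will prevent you writing more than MAXLINE characters into over80. Note that strncpy() does not add a terminator when the source is too long, so set the last byte yourself:
strncpy(over80, current, MAXLINE - 1);
over80[MAXLINE - 1] = '\0';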
Because we initialised max, you'll need to change your check at the end from if (max > 0) to if (max > MINLINE).
One more tip, changing the following line...
printf("No input line was over 80 characters long");
...to...
printf("No input line was over %d characters long", MINLINE);
...will mean that you only have to change the #define at the top of the file to increase or decrease the minimum length.
Don't forget to...
free(current);
...to prevent memory leaks!

Resources