Word Wrap Program C - c

At the end of Chapter 1 of The C Programming Language, there are a few exercises to complete. The one I am doing now asks you to make a program that wraps a long string of text into multiple lines at a specific length. The following function works 100%, aside from the last line which does not get wrapped, no matter the specified maximum width of a line.
// wrap: take a long input line and wrap it into multiple lines
void wrap(char s[], const int wrapline)
{
int i, k, wraploc, lastwrap;
lastwrap = 0; // saves character index after most recent line wrap
wraploc = 0; // used to find the location for next word wrap
for (i = 0; s[i] != '\0'; ++i, ++wraploc) {
if (wraploc >= wrapline) {
for (k = i; k > 0; --k) {
// make sure word wrap doesn't overflow past maximum length
if (k - lastwrap <= wrapline && s[k] == ' ') {
s[k] = '\n';
lastwrap = k+1;
break;
}
}
wraploc = 0;
}
} // end main loop
for (i = 0; i < wrapline; ++i) printf(" ");
printf("|\n");
printf("%s\n", s);
}
I have found the issue to be with the variable wraploc, which is incremented until it is greater than wrapline (the maximum index of a line). Once it is greater than wrapline, a newline is inserted at the appropriate location and wraploc is reset to 0.
The problem is that on the last line, wraploc is never greater than wrapline, even when it should be. It increments perfectly throughout iteration of the string, until the last line. Take this example:
char s[] = "This is a sample string the last line will surely overflow";
wrap(s, 15);
$ ./a.out
|
this is a
sample string
the last line
will surely overflow
The line represents the location where it should be wrapped. In this case, wraploc has the value 14, when there are clearly more characters than that.
I have no idea why this is happening, can someone help me out?
(Also I'm a complete beginner to C and I have no experience with pointers so please stay away from those in your answers, thanks).

You increment wraploc with i until it reaches wrapline (15 in the example).
When you wrap, you backtrack from i, back to the last whitespace.
That means that in your next line you already have some characters between the lastwrap location and i, i.e., you can't reset wraploc to 0 there.
Try setting wraploc = i-lastwrap instead.

Anybody who might, like me, find this question and run into a problem with new-lines in the source string.
This is my answer:
inline int wordlen(const char * str){
int tempindex=0;
while(str[tempindex]!=' ' && str[tempindex]!=0 && str[tempindex]!='\n'){
++tempindex;
}
return(tempindex);
}
void wrap(char * s, const int wrapline){
int index=0;
int curlinelen = 0;
while(s[index] != '\0'){
if(s[index] == '\n'){
curlinelen=0;
}
else if(s[index] == ' '){
if(curlinelen+wordlen(&s[index+1]) >= wrapline){
s[index] = '\n';
curlinelen = 0;
}
}
curlinelen++;
index++;
}
}

Related

How do I replace all occurrences in an array with another array in C

I want to replace all occurrences in an array (string) with another array.
I have a code that:
stores the string in an array in which the replacing is to take place output[],
another array that stores the string to be searched for as replace[] and a third array called toBeReplacedBy and the replacing of the first occurrence works just fine but it skips the other occurrences in the output
for example:
replace[]:
abc
toBeReplacedBy[]:
xyz
output[]:
abcdefabc
becomes
xyzdefabc
but it should become:
xyzdefxyz
I suspect the problem lies with the replacer code :
//the replacer
for (i = 0; i<80; i++) {
if (output[i] == replace[i])
output[i] = toBeReplacedBy[i];
}
//debug purpose
puts("output[]:\n");
puts(output);
return 0;
}
What have I done wrong here and how could I get it to replace all occurrences in the array.
please be aware that I only wish to use stdio.h to do this
thabks in advance
Never iterate further than the array length. This leads to undefined and possibly dangerous behaviour. If you only expect strings, use something like:
int i = 0;
while(output[i] != '\0')
{
// your logic here
i++;
}
Additionally you want to check for concurrent appearances of the same characters. But in your code you only check the first three characters. Everything after that is undefinded behaviour, because you cannot know what replace[3] returns.
Something similar to this could work:
int i = 0;
int j = 0;
int k;
while(output[i] != '\0')
{
if (output[i] == replace[j])
j++;
else
j = 0;
// replace 3 with the array length of the replace[] array
if (j == 3)
{
for(k = i; j >= 0; k-- )
{
output[k] = toBeReplacedBy[j]
j--
}
j = 0;
}
i++;
}
But please check the array boundaries.
edit: Additionally as Nellie states using a debugger would help you to understand what went wrong. Go through your program step by step and look how and when values change.
First advice is to try to debug your program if it does not work.
for (i = 0; i<80; i++) {
if (output[i] == replace[i])
output[i] = toBeReplacedBy[i];
}
There are two problems in this loop.
The first is that are iterating until i is 80. Let's look what happens when i becomes 3. output[3] in case of abcdefabc is d, but what is replace[3]? Your replacement array had only 3 letters, so you have to go back in the replacement array once you finish with one occurrence of it in the original string.
The second is that you check letter by letter.
Say you original array, which you named output somehow was abkdefabc, first three letters do not match your replacement string, but you will check the first two letters they will match with the replacement's first two letters and you will incorrectly change them.
So you need to first check that the whole replacement string is there and only then replace.
You should use strlen() to know length of your array or iterate until you reach the end of a your array ('\0').
'\0' and strlen are only available for array of char.
Your loop should looks like this :
int i = 0;
int len = strlen(my_string);
while (i < len)
{
//logic here
i = i + 1;
}
OR
int i = 0;
while (my_string[i] != '\0')
{
// logic here
i = i + 1;
}

Why is this array being initialized in an odd way?

I am reading K&R 2nd Edition and I am having trouble understanding exercise 1-13. The answer is this code
#include <stdio.h>
#define MAXHIST 15
#define MAXWORD 11
#define IN 1
#define OUT 0
main()
{
int c, i, nc, state;
int len;
int maxvalue;
int ovflow;
int wl[MAXWORD];
state = OUT;
nc = 0;
ovflow = 0;
for (i = 0; i < MAXWORD; i++)
wl[i] = 0;
while ((c = getchar()) != EOF)
{
if(c == ' ' || c == '\n' || c == '\t')
{
state = OUT;
if (nc > 0)
{
if (nc < MAXWORD)
++wl[nc];
else
++ovflow;
}
nc = 0;
}
else if (state == OUT)
{
state = IN;
nc = 1;
}
else
++nc;
}
maxvalue = 0;
for (i = 1; i < MAXWORD; ++i)
{
if(wl[i] > maxvalue)
maxvalue = wl[i];
}
for(i = 1; i < MAXWORD; ++i)
{
printf("%5d - %5d : ", i, wl[i]);
if(wl[i] > 0)
{
if((len = wl[i] * MAXHIST / maxvalue) <= 0)
len = 1;
}
else
len = 0;
while(len > 0)
{
putchar('*');
--len;
}
putchar('\n');
}
if (ovflow > 0)
printf("There are %d words >= %d\n", ovflow, MAXWORD);
return 0;
}
At the top, wl is being declared and initialized. What I don't understand is why is it looping through it and setting everything to zero if it just counts the length of words? It doesn't keep track of how many words there are, it just keeps track of the word length so why is everything set to 0?
I know this is unclear it's just been stressing me out for the past 20 minutes and I don't know why.
The ith element of the array wl[] is the number of words of length i that have been found in an input file. The wl[] array needs to be zero-initialized first so that ++wl[nc]; does not cause undefined behavior by attempting to use an uninitialized variable, and so that array elements that represent word lengths that are not present reflect that no such word lengths were found.
Note that ++wl[nc] increments the value wl[nc] when a word of length nc is encountered. If the array were not initialized, the first time the code attempts to increment an array element, it would be attempting to increment an indeterminate value. This attempt would cause undefined behavior.
Further, array indices that represent counts of word lengths that are not found in the input should hold values of zero, but without the zero-initialization, these values would be indeterminate. Even attempting to print these indeterminate values would cause undefined behavior.
The moral: initialize variables to sensible values, or store values in them, before attempting to use them.
It would seem simpler and be more clear to use an array initializer to zero-initialize the wl[] array:
int wl[MAXWORD] = { 0 };
After this, there is no need for the loop that sets the array values to zero (unless the array is used again) for another file. But, the posted code is from The C Answer Book by Tondo and Gimpel. This book provides solutions to the exercises found in the second edition of K&R in the style of K&R, and using only ideas that have been introduced in the book before each exercise. This exercise, 1.13, occurs in "Chapter 1 - A Tutorial Introduction". This is a brief tour of the language lacking many details to be found later in the book. At this point, assignment and arrays have been introduced, but array initializers have not (this has to wait until Chapter 4), and the K&R code that uses arrays has initialized arrays using loops thus far. Don't read too much into code style from the introductory chapter of a book that is 30+ years old.
Much has changed in C since K&R was published, e.g., main() is no longer a valid function signature for the main() function. Note that the function signature must be one of int main(void) or int main(int argc, char *argv[]) (or alternatively int main(int argc, char **argv)), with a caveat for implementation-defined signatures for main().
Everything is set to 0 because if you dont initialize the array, the array will be initialize with random number in it. Random number will cause error in your program. Instead of looping in every position of your array you could do this int wl[MAXWORD] = {0}; at the place of int wl[MAXWORD]; this will put 0 at every position in your array so you dont hava to do the loop.
I edited your code and put some comments in as I was working through it, to explain what's going on. I also changed some of your histogram calculations because they didn't seem to make sense to me.
Bottom line: It's using a primitive "state machine" to count up the letters in each group of characters that isn't white space. It stores this in wl[] such that wl[i] contains an integer that tells you how many groups of characters (sometimes called "tokens") has a word length of i. Because this is done by incrementing the appropriate element of w[], each element must be initialized to zero. Failing to do so would lead to undefined behavior, but probably would result in nonsensical and absurdly large counts in each element of w[].
Additionally, any token with a length that can't be reflected in w[] will be tallied in the ovflow variable, so at the end there will be an accounting of every token.
#include <stdio.h>
#define MAXHIST 15
#define MAXWORD 11
#define IN 1
#define OUT 0
int main(void) {
int c, i, nc, state;
int len;
int maxvalue;
int ovflow;
int wl[MAXWORD];
// Initializations
state = OUT; //Start off not assuming we're IN a word
nc = 0; //Start off with a character count of 0 for current word
ovflow = 0; //Start off not assuming any words > MAXWORD length
// Start off with our counters of words at each length at zero
for (i = 0; i < MAXWORD; i++) {
wl[i] = 0;
}
// Main loop to count characters in each 'word'
// state keeps track of whether we are IN a word or OUTside of one
// For each character in the input stream...
// - If it's whitespace, set our state to being OUTside of a word
// and, if we have a character count in nc (meaning we've just left
// a word), increment the counter in the wl (word length) array.
// For example, if we've just counted five characters, increment
// wl[5], to reflect that we now know there is one more word with
// a length of five. If we've exceeded the maximum word length,
// then increment our overflow counter. Either way, since we're
// currently looking at a whitespace character, reset the character
// counter so that we can start counting characters with our next
// word.
// - If we encounter something other than whitespace, and we were
// until now OUTside of a word, change our state to being IN a word
// and start the character counter off at 1.
// - If we encounter something other than whitespace, and we are
// still in a word (not OUTside of a word), then just increment
// the character counter.
while ((c = getchar()) != EOF) {
if (c == ' ' || c == '\n' || c == '\t') {
state = OUT;
if (nc > 0) {
if (nc < MAXWORD) ++wl[nc];
else ++ovflow;
}
nc = 0;
} else if (state == OUT) {
state = IN;
nc = 1;
} else {
++nc;
}
}
// Find out which length has the most number of words in it by looping
// through the word length array.
maxvalue = 0;
for (i = 1; i < MAXWORD; ++i) {
if(wl[i] > maxvalue) maxvalue = wl[i];
}
// Print out our histogram
for (i = 1; i < MAXWORD; ++i) {
// Print the word length - then the number of words with that length
printf("%5d - %5d : ", i, wl[i]);
if (wl[i] > 0) {
len = wl[i] * MAXHIST / maxvalue;
if (len <= 0) len = 1;
} else {
len = 0;
}
// This is confusing and unnecessary. It's integer division, with no
// negative numbers. What we want to have happen is that the length
// of the bar will be 0 if wl[i] is zero; that the bar will have length
// 1 if the bar is otherwise too small to represent; and that it will be
// expressed as some fraction of MAXHIST otherwise.
//if(wl[i] > 0)
// {
// if((len = wl[i] * MAXHIST / maxvalue) <= 0)
// len = 1;
// }
// else
// len = 0;
// Multiply MAXHIST (our histogram maximum length) times the relative
// fraction, i.e., we're using a histogram bar length of MAXHIST for
// our statistical mode, and interpolating everything else.
len = ((double)wl[i] / maxvalue) * MAXHIST;
// Our one special case might be if maxvalue is huge, a word length
// with just one occurrence might be rounded down to zero. We can fix
// that manually instead of using a weird logic structure.
if ((len == 0) && (wl[i] > 0)) len = 1;
while (len > 0) {
putchar('*');
--len;
}
putchar('\n');
}
// If any words exceeded the maximum word length, say how many there were.
if (ovflow > 0) printf("There are %d words >= %d\n", ovflow, MAXWORD);
return 0;
}

2D string array is storing '\0' when it encounters a word with more than one space or digit

I am pretty new to C programming. My program is supposed to take a string and move it into a 2D array. With the words either being separated by a white-space or a digit. This works perfectly fine if there is one space or digit separating it. However, as soon as there is more than one it starts adding '\0' to my array.
//Move the string into a 2D array
for(i = 0; i < total + 1; i++)
{
if(isalpha( *(tempString + i) ))
{
sortingArray[n][j++] = tempString[i];
input++;
}
else
{
sortingArray[n][j++] = '\0';
n++;
j = 0;
}
if(tempString[i] == '\0')
break;
}
This is a sample of what happens (n = number of rows placed)
./a.out "one more way"
5 inputs
before
one
more
way
After
one
more
way
You need to skip consecutive delimiters:
for(i = 0; i < total; i++)
{
if(isalpha(tempString[i]))
{
sortingArray[n][j] = tempString[i];
++j;
++input;
}
else
{
// skip consecutive delimiters
while (i < total && !isalpha(tempString[i]))
++i;
sortingArray[n][j] = '\0';
++j
++n;
j = 0;
}
}
Disclaimer: not verified by a compiler. Use caution!
I also took the liberty of some improvements to your original code.
there is no sense to check for \0 if you have the length of the string.
changed *(tempString + i) to the clear tempString[i]
moved the increments out of the larger expressions into their own full expression. It is clearer this way.
It's a simple logic failure for which a debugger is ideal for identifying.
Imagine you have the string "hello world".
It stores "hello" into sortingArray[0] easily enough. When it gets to the first space it increments n and starts looking for the next word. But the next character it finds is another space so it increments n again.
A slight change is required to your logic
if(isalpha( *(tempString + i) ))
{
sortingArray[n][j++] = tempString[i];
input++;
}
else if(j>0)
{
sortingArray[n][j++] = '\0';
n++;
j = 0;
}
Now the code will only increment n if the previous character was a letter (by virtue of j being more than 0). Otherwise if it doesn't care and will keep going.
You should also check to see if j is non-zero after the loop as that means there is a new entry in sortingArray that needs a NUL added.
One thing also to note is that the way you're doing the for loop is a little odd. You have this
for(i = 0; i < total + 1; i++)
but also this inside the loop
if(tempString[i] == '\0')
break;
Typically, the way to terminate the for loop would be to write it like this
for(i = 0; tempString[i]!='\0'; i++)
as that way you firstly don't care about the length of the string, but the loop will finish when it hits the NUL character.

Parsing character array to words held in pointer array (C-programming)

I am trying to separate each word from a character array and put them into a pointer array, one word for each slot. Also, I am supposed to use isspace() to detect blanks. But if there is a better way, I am all ears. At the end of the code I want to print out the content of the parameter array.
Let's say the line is: "this is a sentence". What happens is that it prints out "sentence" (the last word in the line, and usually followed by some random character) 4 times (the number of words). Then I get "Segmentation fault (core dumped)".
Where am I going wrong?
int split_line(char line[120])
{
char *param[21]; // Here I want to put one word for each slot
char buffer[120]; // Word buffer
int i; // For characters in line
int j = 0; // For param words
int k = 0; // For buffer chars
for(i = 0; i < 120; i++)
{
if(line[i] == '\0')
break;
else if(!isspace(line[i]))
{
buffer[k] = line[i];
k++;
}
else if(isspace(line[i]))
{
buffer[k+1] = '\0';
param[j] = buffer; // Puts word into pointer array
j++;
k = 0;
}
else if(j == 21)
{
param[j] = NULL;
break;
}
}
i = 0;
while(param[i] != NULL)
{
printf("%s\n", param[i]);
i++;
}
return 0;
}
There are many little problems in this code :
param[j] = buffer; k = 0; : you rewrite at the beginning of buffer erasing previous words
if(!isspace(line[i])) ... else if(isspace(line[i])) ... else ... : isspace(line[i]) is either true of false, and you always use the 2 first choices and never the third.
if (line[i] == '\0') : you forget to terminate current word by a '\0'
if there are multiple white spaces, you currently (try to) add empty words in param
Here is a working version :
int split_line(char line[120])
{
char *param[21]; // Here I want to put one word for each slot
char buffer[120]; // Word buffer
int i; // For characters in line
int j = 0; // For param words
int k = 0; // For buffer chars
int inspace = 0;
param[j] = buffer;
for(i = 0; i < 120; i++) {
if(line[i] == '\0') {
param[j++][k] = '\0';
param[j] = NULL;
break;
}
else if(!isspace(line[i])) {
inspace = 0;
param[j][k++] = line[i];
}
else if (! inspace) {
inspace = 1;
param[j++][k] = '\0';
param[j] = &(param[j-1][k+1]);
k = 0;
if(j == 21) {
param[j] = NULL;
break;
}
}
}
i = 0;
while(param[i] != NULL)
{
printf("%s\n", param[i]);
i++;
}
return 0;
}
I only fixed the errors. I leave for you as an exercise the following improvements :
the split_line routine should not print itself but rather return an array of words - beware you cannot return an automatic array, but it would be another question
you should not have magic constants in you code (120), you should at least have a #define and use symbolic constants, or better accept a line of any size - here again it is not simple because you will have to malloc and free at appropriate places, and again would be a different question
Anyway good luck in learning that good old C :-)
This line does not seems right to me
param[j] = buffer;
because you keep assigning the same value buffer to different param[j] s .
I would suggest you copy all the char s from line[120] to buffer[120], then point param[j] to location of buffer + Next_Word_Postition.
You may want to look at strtok in string.h. It sounds like this is what you are looking for, as it will separate words/tokens based on the delimiter you choose. To separate by spaces, simply use:
dest = strtok(src, " ");
Where src is the source string and dest is the destination for the first token on the source string. Looping through until dest == NULL will give you all of the separated words, and all you have to do is change dest each time based on your pointer array. It is also nice to note that passing NULL for the src argument will continue parsing from where strtok left off, so after an initial strtok outside of your loop, just use src = NULL inside. I hope that helps. Good luck!

How to solve this output in C?

I have to search a substring in a string & display the complete word as given below everytime the substring is found-
eg:
Input: excellent
Output: excellent,excellently
I cannot figure out how to make the output like the one above.
My output:
excellent,excellently,
It always give me a comma in the end.
Prog: desc
Iteratively convert every words in the dictionary into lowercase,
and store the converted word in lower.
Use strncmp to compare the first len characters of input_str and lower.
If the return value of strncmp is 0, then the first len characters
of the two strings are the same.
void complete(char *input_str)
{
int len = strlen(input_str);
int i, j, found;
char lower[30];
found = 0;
for(i=0;i<n_words;i++)
{
for(j=0;j<strlen(dictionary[i]);j++)
{
lower[j] = tolower(dictionary[i][j]);
}
lower[j+1]='\0';
found=strncmp(input_str,lower,len);
if(found==0)//found the string n print out
{
printf("%s",dictionary[i]);
printf(",");
}
}
if (!found) {
printf("None.\n");
} else {
printf("\n");
}
}
Check if you have already printed a word before printing a second one:
char printed = 0;
for (i=0; i < n_words; i++)
{
for(j = 0; j < strlen(dictionary[i]); j++)
lower[j] = tolower(dictionary[i][j]);
lower[j + 1] = '\0';
found = strncmp(input_str, lower, len);
if (found == 0)//found the string n print out
{
if (printed)
printf(",");
printf("%s", dictionary[i]);
printed = 1;
}
}
There are two approaches that I tend to use for the comma-printing problem:
At the start of the loop, print the comma if i > 0.
At the end (after printing the real value), print the comma if i < (n - 1).
You can use either, the first is simpler since the comparison is simpler, but it can be slightly less convenient since it moves the comma-printing in time (!). On each loop iteration, you're printing the comma that belongs to the previous iteration. At least that how it's feels to me, but of course it's rather subjective.

Resources