So my exercise is to sort words in 1D char array. My code is almost working, but it always skips the last char of the last word. Here is my code. I've added some comments to make it somehow readable. I know it's not brilliant code but I've just started with programming.
int main(void) {
char input[] = "If you are working on something that you really care about you dont have to be pushed The vision pulls you Steve Jobs";
sort_alphabetically(input);
printf("%s", input);
}
int sort_alphabetically(char tab[]) {
int j = 0, k = 0, i = 0, g = 0, f = 0, l = 0;
char tmp[1001];
char tmp2[501][1001];
while (tab[i] == ' ') // skipping leading whitespaces
i++;
for (j = i; tab[j] != '\0'; j++) {
if (tab[j] != ' ' && tab[j + 1] != '\0')
k++; // counting word length
else if (tab[j] == ' ' || tab[j + 1] == '\0' || tab[j + 1] == '\0') {
// copying word t0 2d array
for (g = k; g > 0; g--) {
tmp[l] = tab[j - g];
l++;
}
tmp[l] = 0;
strcpy(tmp2[f], tmp); // copying
f++; //words ++ in tmp2
k = 0;
l = 0;
tmp[0] = 0;
}
}
tab[0] = 0;
tmp[0] = 0;
for (j = 0; j < f; j++) {
for (i = 0; i < f - 1; i++) {
if (strcmp(tmp2[i], tmp2[i + 1]) > 0) { //sorting words in alphabeticall order
strcpy(tmp, tmp2[i]);
strcpy(tmp2[i], tmp2[i + 1]);
strcpy(tmp2[i + 1], tmp);
}
}
}
for (i = 0; i < f; i++) {
strcat(tab, tmp2[i]); // copying to tab
strcat(tab, " "); //adding spaces after each word
}
// removing whitespaces
for (i = 0; tab[i] == ' ' || tab[i] == '\t'; i++);
for (j = 0; tab[i]; i++) {
tab[j++] = tab[i];
}
tab[j] = '\0';
}
;
After running this code it cuts the s in last word (Jobs). If someone can help me with this spaghetti I would be so happy.
The problem was with how you were handling the null byte vs the space. In the space case, you were actually on the space when you copied the string. But in the null byte case, you were one before the null byte. This leads to an off-by-one error. You need to modify the code to avoid handling it differently for spaces and null bytes:
for (j = i; tab[j] != '\0'; j++) {
//In the space case, you are on the space, but in the \0 case
//you were one before it.
//Changed this if statement so that you always copy the string
//when you're at the last character.
if (tab[j + 1] == ' ' || tab[j + 1] == '\0') {
//k is a length, but we're using it as an index
//so we will need to adjust by one
for (g = k; g > 0; g--) {
tmp[l] = tab[j - g + 1];
l++;
}
}
else
{
k++;
}
}
I worked this out by putting print statements that showed me the value of tab[j] and the value of k at each cycle. Watching your program execute, either with print statements or a debugger, is usually the best way to diagnose these sorts of issues.
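One way to avoid the space vs. null byte asymmetry altogether is to run the same test on both and remember where the current word started. The following is only a minimal sketch of that idea, not the code from the question; split_words, MAX_WORDS and MAX_LEN are names made up for illustration.

```c
#include <string.h>

#define MAX_WORDS 64   /* hypothetical limits for this sketch */
#define MAX_LEN   32

/* Split text on spaces into words[]; returns the number of words found.
   A word ends at a space OR at the terminating '\0', handled by one test. */
int split_words(const char *text, char words[MAX_WORDS][MAX_LEN])
{
    int count = 0;
    size_t start = 0;                 /* index where the current word began */

    for (size_t j = 0; ; j++) {
        if (text[j] == ' ' || text[j] == '\0') {
            size_t len = j - start;   /* j is always one past the word */
            if (len > 0 && count < MAX_WORDS) {
                if (len >= MAX_LEN)
                    len = MAX_LEN - 1;    /* truncate overlong words */
                memcpy(words[count], text + start, len);
                words[count][len] = '\0';
                count++;
            }
            if (text[j] == '\0')
                break;
            start = j + 1;            /* next word starts after the space */
        }
    }
    return count;
}
```

Because the index j is "one past the word" in both cases, there is no separate adjustment for the last word, and runs of spaces simply produce zero-length words that are skipped.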
The problem you have is in copying characters to the tmp buffer when you reach the end of the input (tab) string; that is, when tab[j + 1] == '\0' is true. In this case, you aren't copying the last character in this for loop:
for (g = k; g > 0; g--) {
tmp[l] = tab[j - g];
l++;
}
To fix the issue, simply change the loop's 'condition' to include when g is zero, and skip this 'iteration' when you encounter a space character:
for (g = k; g >= 0; g--) { // Make sure to include any 'last' character
if (tab[j - g] != ' ') { // ... but skip if this is a space
tmp[l] = tab[j - g];
l++;
}
}
Note also that you have a redundant test in this line:
else if (tab[j] == ' ' || tab[j + 1] == '\0' || tab[j + 1] == '\0') {
which could just as well be written without the third test (which is the same as the second), thus:
else if (tab[j] == ' ' || tab[j + 1] == '\0') {
Caveat: Most of the other responders have pointed out the major bugs in your code, but this has some smaller ones and some simplification.
Before doing strcat back to tab, we should do tab[0] = 0 so the initial strcat works correctly.
Doing strcat(tab," ") after the one that copies the word goes one beyond the end of tab and is, therefore, undefined behavior. It also requires an unnecessary cleanup loop to remove the extra space that should not have been there in the first place.
The initial "split into words" loop can be [greatly] simplified.
There are some standard speedups to the bubble sort
I realize that you're just starting out [and some schools actually advocate for i, j, etc], but it's better to use some [more] descriptive names
Anyway, here's a somewhat refactored version:
#include <stdio.h>
#include <string.h>
int opt_dbg;
#define dbg(...) \
do { \
if (opt_dbg) \
printf(__VA_ARGS__); \
} while (0)
void
sort_alphabetically(char tab[])
{
char tmp[1001];
char words[501][1001];
char *src;
char *dst;
char *beg;
int chr;
int wordidx;
int wordcnt;
wordidx = 0;
dst = words[wordidx];
beg = dst;
// split up string into individual words
src = tab;
for (chr = *src++; chr != 0; chr = *src++) {
switch (chr) {
case ' ':
case '\t':
// wait until we've seen a non-white char before we start a new
// word
if (dst <= beg)
break;
// finish prior word
*dst = 0;
// point to start of next word
dst = words[++wordidx];
beg = dst;
break;
default:
*dst++ = chr;
break;
}
}
// finish last word
*dst = 0;
// get number of words (trailing whitespace leaves an empty last slot; skip it)
wordcnt = (dst > beg) ? wordidx + 1 : wordidx;
if (opt_dbg) {
for (wordidx = 0; wordidx < wordcnt; ++wordidx)
dbg("SPLIT: '%s'\n",words[wordidx]);
}
// in bubble sort, after a given pass, the _last_ element is guaranteed to
// be the largest, so we don't need to examine it again
for (int passlim = wordcnt - 1; passlim >= 1; --passlim) {
int swapflg = 0;
// sorting words in alphabetical order
for (wordidx = 0; wordidx < passlim; ++wordidx) {
char *lhs = words[wordidx];
char *rhs = words[wordidx + 1];
if (strcmp(lhs,rhs) > 0) {
dbg("SWAP/%d: '%s' '%s'\n",passlim,lhs,rhs);
strcpy(tmp,lhs);
strcpy(lhs,rhs);
strcpy(rhs,tmp);
swapflg = 1;
}
}
// if nothing got swapped, we can stop early (i.e. everything is in
// sort)
if (! swapflg)
break;
}
// clear out destination so [first] strcat will work
tab[0] = 0;
// copy back words into original string
// adding the space as a _prefix_ before a word eliminates the need for a
// cleanup to remove the last space
for (wordidx = 0; wordidx < wordcnt; ++wordidx) {
dbg("SORTED: '%s'\n",words[wordidx]);
// adding spaces before each word
if (wordidx > 0)
strcat(tab, " ");
// copying to tab
strcat(tab,words[wordidx]);
}
}
int
main(int argc,char **argv)
{
char input[] = "If you are working on something that you really care"
" about you dont have to be pushed The vision pulls you Steve Jobs";
--argc;
++argv;
for (; argc > 0; --argc, ++argv) {
char *cp = *argv;
if (*cp != '-')
break;
switch (cp[1]) {
case 'd':
opt_dbg = ! opt_dbg;
break;
}
}
sort_alphabetically(input);
printf("%s\n", input);
return 0;
}
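If swapping whole 1001-byte strings with strcpy bothers you, an alternative is to sort an array of pointers to the words with the standard qsort function. This is a different technique from the bubble sort above, shown only as a sketch; cmp_words and sort_word_ptrs are illustrative names.

```c
#include <stdlib.h>
#include <string.h>

/* qsort comparator: each element is a `char *`, so we receive `char **` */
int cmp_words(const void *a, const void *b)
{
    const char *lhs = *(const char *const *)a;
    const char *rhs = *(const char *const *)b;
    return strcmp(lhs, rhs);
}

/* Sort an array of `count` string pointers in strcmp (byte value) order. */
void sort_word_ptrs(char **ptrs, size_t count)
{
    qsort(ptrs, count, sizeof(ptrs[0]), cmp_words);
}
```

As with strcmp in the code above, upper-case letters sort before lower-case ones, so "If" and "Steve" come before "are". Only the pointers move, so the cost of a swap no longer depends on the word buffer size.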
I'm writing a program to count the length of each word in an array of characters. I was wondering if you could help me, because I've been struggling with it for at least two hours now and I don't know how to do it properly.
It should go like that:
(number of letters) - (number of words with this many letters)
2 - 1
3 - 4
5 - 1
etc.
char tab[1000];
int k = 0, x = 0;
printf("Enter text: ");
fgets(tab, 1000, stdin);
for (int i = 2; i < (int)strlen(tab); i++)
{
for (int j = 0; j < (int)strlen(tab); j++)
{
if (tab[j] == '\0' || tab[j]=='\n')
break;
if (tab[j] == ' ')
k = 0;
else k++;
if (k == i)
{
x++;
k = 0;
}
}
if (x != 0)
{
printf("%d - %d\n", i, x);
x = 0;
k = 0;
}
}
return 0;
By using two for loops, you're doing len**2 character scans. (e.g.) For a buffer of length 1000, instead of 1000 character comparisons, you're doing 1,000,000 comparisons.
This can be done in a single for loop if we use a word length histogram array.
The basic algorithm is the same as your inner loop.
When we have a non-space character, we increment a current length value. When we see a space, we increment the histogram cell (indexed by the length value) by 1. We then set the length value to 0.
Here's some code that works:
#include <stdio.h>
int
main(void)
{
int hist[1000] = { 0 }; // one cell per possible word length (buf holds at most 999 chars)
char buf[1000];
char *bp;
int chr;
int curlen = 0;
printf("Enter text: ");
fflush(stdout);
fgets(buf,sizeof(buf),stdin);
bp = buf;
for (chr = *bp++; chr != 0; chr = *bp++) {
if (chr == '\n')
break;
// end of word -- increment the histogram cell
if (chr == ' ') {
hist[curlen] += 1;
curlen = 0;
}
// got an alpha char -- increment the length of the word
else
curlen += 1;
}
// catch the final word on the line
hist[curlen] += 1;
for (curlen = 1; curlen < sizeof(hist) / sizeof(hist[0]); ++curlen) {
int count = hist[curlen];
if (count > 0)
printf("%d - %d\n",curlen,count);
}
return 0;
}
UPDATE:
and i don't really understand pointers. Is there any simpler method to do this?
Pointers are a very important [essential] tool in the C arsenal, so I hope you get to them soon.
However, it is easy enough to convert the for loop (Removing the char *bp; and bp = buf;):
Change:
for (chr = *bp++; chr != 0; chr = *bp++) {
Into:
for (int bufidx = 0; ; ++bufidx) {
chr = buf[bufidx];
if (chr == 0)
break;
The rest of the for loop remains the same.
Here's another version of the loop which [without optimization by the compiler] fetches the char twice:
for (int bufidx = 0; buf[bufidx] != 0; ++bufidx) {
chr = buf[bufidx];
Here is a single line version. Note this is not recommended practice because of the embedded assignment of chr inside the loop condition clause, but is for illustration purposes:
for (int bufidx = 0; (chr = buf[bufidx]) != 0; ++bufidx) {
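Putting the index-based loop together with the histogram, the whole thing might look like the sketch below. It is not the exact program above: word_length_hist and MAXLEN are names chosen for illustration, it treats '\n' as just another separator instead of stopping there, and it only counts a word when at least one letter was seen, so repeated spaces are harmless.

```c
#include <stddef.h>

#define MAXLEN 100  /* hypothetical cap on tracked word length */

/* Fill hist[n] with the number of words of length n found in buf.
   hist must have MAXLEN elements; longer words land in the last cell. */
void word_length_hist(const char *buf, int hist[MAXLEN])
{
    int curlen = 0;

    for (int i = 0; i < MAXLEN; ++i)
        hist[i] = 0;

    for (size_t idx = 0; ; ++idx) {
        char chr = buf[idx];
        if (chr != ' ' && chr != '\n' && chr != '\0') {
            ++curlen;               /* still inside a word */
            continue;
        }
        if (curlen > 0) {           /* a word just ended */
            if (curlen >= MAXLEN)
                curlen = MAXLEN - 1;
            hist[curlen] += 1;
            curlen = 0;
        }
        if (chr == '\0')
            break;
    }
}
```

A caller would read a line with fgets, pass it to word_length_hist, and then print every non-zero cell as "length - count".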
Write a program to "fold" long input lines into two or more shorter lines after the last non-blank character that occurs before the n-th column of input. Make sure your program does something intelligent with very long lines, and if there are no blanks or tabs before the specified column.
The algorithm I decided to follow for this was as follows:
If length of input line < maxcol (the column after which one would have to fold), then print the line as it is.
If not, starting from maxcol, I check to its left and to its right to find the closest non-space character on each side, and save their positions as 'first' and 'last'. I then print the character array from line[0] to line[first], and the rest of the array, from line[last] to line[len], becomes the new line array.
Here's my code:
#include <stdio.h>
#define MAXCOL 5
int getline1(char line[]);
int main()
{
char line[1000];
int len, i, j, first, last;
len = getline1(line);
while (len > 0) {
if (len < MAXCOL) {
printf("%s\n", line);
break;
}
else {
for (i = MAXCOL - 1; i >= 0; i--) {
if (line[i] != ' ') {
first = i;
break;
}
}
for (j = MAXCOL - 1; j <= len; j++) {
if (line[j] != ' ') {
last = j;
break;
}
}
//printf("first %d last %d\n", first, last);
for (i = 0; i <= first; i++)
putchar(line[i]);
putchar('\n');
for (i = 0; i < len - last; i++) {
line[i] = line[last + i];
}
len -= last;
first = last = 0;
}
}
return 0;
}
int getline1(char line[])
{
int c, i = 0;
while ((c = getchar()) != EOF && c != '\n')
line[i++] = c;
if (c == '\n')
line[i++] = '\n';
line[i] = '\0';
return i;
}
Here are the problems:
It does not do something intelligent with very long lines (this is fine, as I can add it as an edge case).
It does not do anything for tabs.
I cannot understand a part of the output.
For example, with the input:
asd de def deffff
I get the output:
asd
de
def
defff //Expected until here
//Unexpected lines below
ff
fff
deffff
deffff
deffff
Question 1 - Why do the unexpected lines print? How do I make my program/algorithm better?
Eventually, after spending quite some time with this question, I gave up and decided to check the clc-wiki for solutions. Every program here did NOT work, save one (The others didn't work because they did not cover certain edge cases). The one that worked was the largest one, and it did not make any sense to me. It did not have any comments, and neither could I properly understand the variable names, and what they represented. But it was the ONLY program in the wiki that worked.
#include <stdio.h>
#define YES 1
#define NO 0
int main(void)
{
int TCOL = 8, ch, co[3], i, COL = 19, tabs[COL - 1];
char bls[COL - 1], bonly = YES;
co[0] = co[1] = co[2] = 0;
while ((ch = getchar()) != EOF)
{
if (ch != '\t') {
++co[0];
++co[2];
}
else {
co[0] = co[0] + (TCOL * (1 + (co[2] / TCOL)) - co[2]);
i = co[2];
co[2] = TCOL + (co[2] / TCOL) * TCOL;
}
if (ch != '\n' && ch != ' ' && ch != '\t')
{
if (co[0] >= COL) {
putchar('\n');
co[0] = 1;
co[1] = 0;
}
else
for (i = co[1]; co[1] > 0; --co[1])
{
if (bls[i - co[1]] == ' ')
putchar(bls[i - co[1]]);
else
for (; tabs[i - co[1]] != 0;)
if (tabs[i - co[1]] > 0) {
putchar(' ');
--tabs[i - co[1]];
}
else {
tabs[i - co[1]] = 0;
putchar(bls[i - co[1]]);
}
}
putchar(ch);
if (bonly == YES)
bonly = NO;
}
else if (ch != '\n')
{
if (co[0] >= COL)
{
if (bonly == NO) {
putchar('\n');
bonly = YES;
}
co[0] = co[1] = 0;
}
else if (bonly == NO) {
bls[co[1]] = ch;
if (ch == '\t') {
if (TCOL * (1 + ((co[0] - (co[2] - i)) / TCOL)) -
(co[0] - (co[2] - i)) == co[2] - i)
tabs[co[1]] = -1;
else
tabs[co[1]] = co[2] - i;
}
++co[1];
}
else
co[0] = co[1] = 0;
}
else {
putchar(ch);
if (bonly == NO)
bonly = YES;
co[0] = co[1] = co[2] = 0;
}
}
return 0;
}
Question 2 - Can you help me make sense of this code and how it works?
It fixes all the problems with my solution, and also works by reading character to character, and therefore seems more efficient.
Question 1 - Why do the unexpected lines print? How do I make my program/algorithm better?
You are getting the unexpected lines in the output because, after shifting the remaining characters to the front of the array, you are not terminating the new line array with the null character \0.
Here you are copying len - last characters, starting from index last, to the beginning of the array to create the new line array:
for (i = 0; i < len - last; i++) {
line[i] = line[last + i];
}
You have copied the characters but the null terminating character is still at its original position. Assume the input string is:
asd de def deffff
So, initially the content of line array will be:
"asd de def deffff\n"
^
|
null character is here
Now, after printing asd, you are copying the characters from index last up to the end of the string into the line array itself, starting at index 0. So, after copying, the content of the line array will be:
"de def deffff\n deffff\n"
|____ _____|
\/
This is causing the unexpected output
(null character is still at the previous location)
So, after for loop you should add the null character just after the last character copied, like this:
line [len - last] = '\0';
With this the content of line array that will be processed in the next iteration of while loop will be:
"de def deffff\n"
One more thing: in the line array you can see the \n (newline) character at the end. You may want to remove it before processing the input; you can do:
line[strcspn(line, "\n")] = 0;
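Going back to the shifting step: the copy loop and the missing terminator can be handled in one call if the null character is moved along with the text. A minimal sketch using memmove (which, unlike memcpy, is safe for overlapping regions); drop_prefix is a made-up helper name:

```c
#include <string.h>

/* Drop the first `n` characters of `line` in place.
   Copying strlen(line) - n + 1 bytes moves the '\0' along with the text. */
void drop_prefix(char *line, size_t n)
{
    size_t len = strlen(line);
    if (n > len)
        n = len;                       /* never read past the terminator */
    memmove(line, line + n, len - n + 1);
}
```

With the example input, drop_prefix(line, 4) turns "asd de def deffff\n" into "de def deffff\n" with the terminator in the right place.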
Improvements that you can do in your program:
1. One very obvious improvement is to use a pointer into the input string while processing it. With a pointer, you don't need to copy the unprocessed rest of the array back to the start of the same array on every pass. Initialize the pointer to the start of the input string, and in every iteration just move it to the appropriate location and continue processing from where it points.
2. Since you read the whole input into a buffer first and then process it, you may consider fgets() for reading the input. It gives better control over the input from the user.
3. Add a check for overflow of the line array in case of very long input. With fgets() you can specify the maximum number of characters to be copied from the input stream into the line array.
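The pointer idea from point 1 could look roughly like the sketch below. It is one possible folding strategy, not a fix of the posted program: fold_line is a hypothetical helper that writes the folded rows into a caller-supplied buffer, breaks at the last blank at or before column maxcol, and falls back to a hard break when a word is longer than maxcol. A caller would typically fill the input with fgets and strip the newline first.

```c
#include <string.h>

/* Fold `line` (no trailing newline) so that no output row exceeds `maxcol`
   characters. Rows are written to `out`, separated by '\n'.
   Assumes maxcol > 0 and room in `out` for 2 * strlen(line) + 1 bytes. */
void fold_line(const char *line, size_t maxcol, char *out)
{
    const char *p = line;              /* start of the unprocessed part */

    while (strlen(p) > maxcol) {
        size_t cut = maxcol;           /* default: hard break, no blank found */
        size_t next = maxcol;          /* where the following row starts */

        /* look for the last blank at or before column maxcol */
        for (size_t b = maxcol; b > 0; --b) {
            if (p[b] == ' ' || p[b] == '\t') {
                cut = b;
                next = b;
                break;
            }
        }
        /* trim blanks at the end of this row ... */
        while (cut > 0 && (p[cut - 1] == ' ' || p[cut - 1] == '\t'))
            --cut;
        /* ... and skip blanks at the start of the next one */
        while (p[next] == ' ' || p[next] == '\t')
            ++next;

        if (cut > 0) {                 /* don't emit rows made only of blanks */
            memcpy(out, p, cut);
            out += cut;
            *out++ = '\n';
        }
        p += next;                     /* advance the pointer; nothing is copied back */
    }
    strcpy(out, p);                    /* the remainder fits in one row */
}
```

For "asd de def deffff" with maxcol 5 this produces the rows asd, de, def, defff and f. Calling strlen on every pass is wasteful for very long lines; tracking the remaining length in a variable would avoid that.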
Question 2 - Can you help me make sense of this code and how it works?
The program is very simple, try to understand it at least once by yourself. Either use a debugger or take a pen and paper, dry run it once for small size input and check the output. Increase the input size and add some variations like multiple space characters and check the program code path and output. This way you can understand it very easily.
Here's another (and I think better) solution to this exercise :
#include <stdio.h>
#define MAXCOL 10
void my_flush(char buf[]);
int main()
{
int c, prev_char, i, j, ctr, spaceleft, first_non_space_buf;
char buf[MAXCOL+2] = { 0 }; // zero-initialized: buf[0] is read before anything is stored
prev_char = -1;
i = first_non_space_buf = ctr = 0;
spaceleft = MAXCOL;
printf("Just keep typing once the output has been printed");
while ((c = getchar()) != EOF) {
if (buf[0] == '\n') {
i = 0;
my_flush(buf);
}
//printf("Prev char = %c and Current char = %c and i = %d and fnsb = %d and spaceleft = %d and j = %d and buf = %s \n", prev_char, c, i, first_non_space_buf, spaceleft, j, buf);
if ((((prev_char != ' ') && (prev_char != '\t') && (prev_char != '\n')) &&
((c == ' ') || (c == '\t') || (c == '\n'))) ||
(i == MAXCOL)) {
if (i <= spaceleft) {
printf("%s", buf);
spaceleft -= i;
}
else {
putchar('\n');
spaceleft = MAXCOL;
for (j = first_non_space_buf; buf[j] != '\0'; ++j) {
putchar(buf[j]);
++ctr;
}
spaceleft -= ctr;
}
i = 0;
my_flush(buf);
buf[i++] = c;
first_non_space_buf = j = ctr = 0;
}
else {
if (((prev_char == ' ') || (prev_char == '\t') || (prev_char == '\n')) &&
((c != ' ') && (c != '\t') && (c != '\n'))) {
first_non_space_buf = i;
}
buf[i++] = c;
buf[i] = '\0';
}
prev_char = c;
}
printf("%s", buf);
return 0;
}
void my_flush(char buf[])
{
int i;
for (i = 0; i < MAXCOL; ++i)
buf[i] = '\0';
}
Below is my solution. I know the thread is no longer active, but my code might help someone who is having trouble grasping the code snippets already presented.
Edit: explanation
Keep reading input until it contains '\n' or '\t', or until there have been
at least MAXCOL chars.
In case of '\t', use expandTab to replace it with the required spaces, and use printLine if it doesn't exceed MAXCOL.
In case of '\n', directly use printLine and reset the index.
If index reaches MAXCOL (10):
find the last blank using findBlank and get a new index.
use printLine to print the current line.
get the new index, which is 0 or the index just past the characters carried over by the newIndex function.
Code
/* fold long lines after last non-blank char */
#include <stdio.h>
#define MAXCOL 10 /* maximum column of input */
#define TABSIZE 8 /* tab size */
char line[MAXCOL]; /* input line */
int expandTab(int index);
int findBlank(int index);
int newIndex(int index);
void printLine(int index);
int main(void) {
int c, index;
index = 0;
while((c = getchar()) != EOF) {
line[index] = c; /* store current char */
if (c == '\t')
index = expandTab(index);
else if (c == '\n') {
printLine(index); /* print current input line */
index = 0;
} else if (++index == MAXCOL) {
index = findBlank(index);
printLine(index);
index = newIndex(index);
}
}
}
/* expand tab into blanks */
int expandTab(int index) {
line[index] = ' '; /* tab is atleast one blank */
for (++index; index < MAXCOL && index % TABSIZE != 0; ++index)
line[index] = ' ';
if (index > MAXCOL)
return index;
else {
printLine(index);
return 0;
}
}
/* find last blank position */
int findBlank(int index) {
while( index > 0 && line[index] != ' ')
--index;
if (index == 0)
return MAXCOL;
else
return index - 1;
}
/* re-arrange line with new position */
int newIndex(int index) {
int i, j;
if (index <= 0 || index >= MAXCOL)
return 0;
else {
i = 0;
for (j = index; j < MAXCOL; ++j) {
line[i] = line[j];
++i;
}
return i;
}
}
/* print line until passed index */
void printLine(int index) {
int i;
for(i = 0; i < index; ++i)
putchar(line[i]);
if (index > 0)
putchar('\n');
}
I basically have a sentence in a string and want to break it down word per word. Every word should go into an array of strings. I am not allowed to use strtok. I have this code but it doesn't work. Can someone help?
There is surely something similar on the internet, but I couldn't find anything...
int main(){
char s[10000]; // sentence
char array[100][100]; // array where I put every word
printf("Insert sentence: "); // receive the sentence
gets(s);
int i = 0;
int j = 0;
for(j = 0; s[j] != '\0'; j++){ // loop until I reach the end
for(i = 0; s[i] != ' '; i++){ // loop until the word is over
array[j][i] = s[i]; // put every char in the array
}
}
return 0;
}
Every word should go into an array of strings. I am not allowed to use
strtok.
Interesting problem which could be resolved in a compact algorithm.
It handles multiple spaces and punctuation marks specified in check(char c).
The most difficult part of the problem is properly handling the corner cases. We may have a situation where a word is longer than WORD_LEN, or where the number of words exceeds the capacity of the array.
Both cases are properly handled. The algorithm truncates the excessive words and parses only to the capacity of the array.
(BTW. Do not use gets: Why is the gets function so dangerous that it should not be used?)
Edit: The fully tested find_tokens function has been presented.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define WORD_LEN 3 // 100 // MAX WORD LEN
#define NR_OF_WORDS 3 // 100 // MAX NUMBER OF WORDS
#define INPUT_SIZE 10000
int is_delimiter(const char * delimiters, char c) // check for a delimiter
{
char *p = strchr (delimiters, c); // if not NULL c is separator
if (p) return 1; // delimiter
else return 0; // not a delimiter
}
int skip(int *i, char *str, int skip_delimiters, const char *delimiters)
{
while(1){
if(skip_delimiters) {
if( (str[(*i)+1] =='\0') || (!is_delimiter(delimiters, str[(*i)+1])) )
break; // break on nondelimeter or '\0'
else (*i)++; // advance to next character
}
else{ // skip excess characters in the token
if( is_delimiter(delimiters, str[(*i)]) )
{
if( (str[(*i)+1] =='\0') || !is_delimiter(delimiters, str[(*i)+1]) )
break; // break on non delimiter or '\0'
else (*i)++; // skip delimiters
}
else (*i)++; // skip non delimiters
}
}
if ( str[(*i)+1] =='\0') return 0;
else return 1;
}
int find_tokens(int max_tokens, int token_len, char *str, char array[][token_len+1], const char *delimiters, int *nr_of_tokens)
{
int i = 0;
int j = 0;
int l = 0;
*nr_of_tokens = 0;
int status = 0; // all OK!
int skip_leading_delimiters = 1;
int token = 0;
int more;
for(i = 0; str[i] != '\0'; i++){ // loop until I reach the end
// skip leading delimiters
if( skip_leading_delimiters )
{
if( is_delimiter( delimiters, str[i]) ) continue;
skip_leading_delimiters = 0;
}
if( !is_delimiter(delimiters,str[i]) && (j < token_len) )
{
array[l][j] = str[i]; // put char in the array
//printf("%c!\n", array[l][j] );
j++;
array[l][j] = 0;
token = 1;
}
else
{
//printf("%c?\n", str[i] );
array[l][j] = '\0'; // token terminations
if (j < token_len) {
more = skip(&i, str, 1, delimiters); // skip delimiters
}
else{
more = skip(&i, str, 0, delimiters); // skip excess of the characters in token
status = status | 0x01; // token has been truncated
}
j = 0;
//printf("more %d\n",more);
if(token){
if (more) l++;
}
if(l >= max_tokens){
status = status | 0x02; // more tokens than expected
break;
}
}
}
if(l>=max_tokens)
*nr_of_tokens = max_tokens;
else{
if(l<=0 && token)
*nr_of_tokens = 1;
else
{
if(token)
*nr_of_tokens = l+1;
else
*nr_of_tokens = l;
}
}
return status;
}
int main(void){
char input[INPUT_SIZE+1]; // sentence
char array[NR_OF_WORDS][WORD_LEN+1]; // array where I put every word, remember to include null terminator!!!
int number_of_words;
const char * delimiters = " .,;:\t"; // word delimiters
char *p;
printf("Insert sentence: "); // receive the sentence
fgets(input, INPUT_SIZE, stdin);
if ( (p = strchr(input, '\n')) != NULL) *p = '\0'; // remove '\n'
int ret = find_tokens(NR_OF_WORDS, WORD_LEN, input, array, delimiters, &number_of_words);
printf("tokens= %d ret= %d\n", number_of_words, ret);
for (int i=0; i < number_of_words; i++)
printf("%d: %s\n", i, array[i]);
printf("End\n");
return 0;
}
Test:
Insert sentence: ..........1234567,,,,,,abcdefgh....123::::::::::::
tokens= 3 ret= 1
0: 123
1: abc
2: 123
End
You are not '\0'-terminating the strings, and you are scanning the source from
the beginning every time you've found a space character.
You only need one loop, without the inner loop, and its condition must be s[i] != '\0':
int j = 0; // index for array
int k = 0; // index for array[j]
for(i = 0; s[i] != '\0'; ++i)
{
if(k == 99)
{
// word longer than array[j] can hold, aborting
array[j][99] = 0; // 0-terminating string
break;
}
if(j == 99)
{
// more words than array can hold, aborting
break;
}
if(s[i] == ' ')
{
array[j][k] = 0; // 0-terminating string
j++; // for the next entry in array
k = 0;
} else
array[j][k++] = s[i];
}
array[j][k] = 0; // 0-terminating the last word
Note that this algorithm doesn't handle multiple spaces and punctuation marks.
This can be solved by using a variable that stores the last state.
int j = 0; // index for array
int k = 0; // index for array[j]
int sep_state = 0; // 0 normal mode, 1 separation mode
for(i = 0; s[i] != '\0'; ++i)
{
if(k == 99)
{
// word longer than array[j] can hold, aborting
array[j][99] = 0; // 0-terminating string
break;
}
if(j == 99)
{
// more words than array can hold, aborting
break;
}
// check for usual word separators
if(s[i] == ' ' || s[i] == '.' || s[i] == ',' || s[i] == ';' || s[i] == ':')
{
if(sep_state == 1)
continue; // skip multiple separators
array[j][k] = 0; // 0-terminating string
j++; // for the next entry in array
k = 0;
sep_state = 1; // enter separation mode
} else {
array[j][k++] = s[i];
sep_state = 0; // leave separation mode
}
}
array[j][k] = 0; // 0-terminating the last word
As you can see, using the sep_state variable I'm able to check whether multiple
separators come one after the other and skip the subsequent separators. I also
check for common punctuation marks.
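The same state idea can be packaged as a function that takes the separator set as a string and tests membership with strchr, so adding a separator does not mean adding another || clause. This is only a sketch under my own naming (split_on, MAX_WORDS, MAX_WORD_LEN); unlike the loops above, it truncates an overlong word instead of aborting.

```c
#include <string.h>

#define MAX_WORDS 100      /* hypothetical limits, matching array[100][100] */
#define MAX_WORD_LEN 100

/* Split `s` at any character from `seps`; runs of separators count once.
   Returns the number of words stored; overlong words are truncated. */
int split_on(const char *s, const char *seps,
             char words[MAX_WORDS][MAX_WORD_LEN])
{
    int count = 0;      /* words completed so far */
    int k = 0;          /* write position inside the current word */
    int in_word = 0;    /* 1 while we are inside a word */

    for (size_t i = 0; s[i] != '\0'; ++i) {
        if (strchr(seps, s[i]) != NULL) {
            if (in_word) {                    /* a word just ended */
                words[count++][k] = '\0';
                k = 0;
                in_word = 0;
            }
        } else {
            if (!in_word && count == MAX_WORDS)
                break;                        /* no room for another word */
            in_word = 1;
            if (k < MAX_WORD_LEN - 1)
                words[count][k++] = s[i];
        }
    }
    if (in_word)
        words[count++][k] = '\0';             /* terminate the last word */
    return count;
}
```

Calling split_on(s, " .,;:", array) covers the same separators as the loop above, and leading separators no longer create an empty first word.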
#include <stdio.h>
int main()
{
char s[10000]; // sentence
char array[100][100]; // array where i put every word
printf("Insert sentence: "); // receive the sentece
gets(s);
printf("%s",s);
int i = 0;
int j = 0;
int k = 0;
for(j = 0; s[j] != '\0'; j++){ // loop until i reach the end
if ( s[j] != ' ' || s[j] == '\0' )
{
array[i][k] = s[j];
k++;
}
else {
array[i][k] = '\0'; // terminate the finished word
i++;
k = 0;
}
}
array[i][k] = '\0'; // terminate the last word
return 0;
}
Please note that the gets function is very unsafe and should never be used; use fgets (or scanf with a field width) instead.
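For reference, a gets replacement built on fgets might look like the sketch below. The helper names strip_newline and read_line are my own; the one behavioral difference to keep in mind is that fgets stores the trailing newline, so it has to be removed by hand.

```c
#include <stdio.h>
#include <string.h>

/* Cut the string at the first newline, if there is one. */
void strip_newline(char *s)
{
    s[strcspn(s, "\n")] = '\0';   /* strcspn returns strlen(s) when no '\n' */
}

/* Read one line from `in` into buf (at most size - 1 characters) and strip
   the trailing newline. Returns 1 on success, 0 on end-of-file or error. */
int read_line(FILE *in, char *buf, int size)
{
    if (fgets(buf, size, in) == NULL)
        return 0;
    strip_newline(buf);
    return 1;
}
```

In the program above, gets(s) would then become read_line(stdin, s, sizeof(s)), which can never write past the end of s.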
My program is designed to allow the user to input a string and my program will output the number of occurrences of each letters and words. My program also sorts the words alphabetically.
My issue is: I output the words seen (first unsorted) and their occurrences as a table, and in my table I don't want duplicates. SOLVED
For example, if the word "to" was seen twice I just want the word "to" to appear only once in my table outputting the number of occurrences.
How can I fix this? Also, why is it that i can't simply set string[i] == delim to apply to every delimiter rather than having to assign it manually for each delimiter?
Edit: Fixed my output error. But how can I set a condition for string[i] to equal any of the delimiters in my code, rather than having it work only for the space character? For example, if I enter "you, you", it outputs "you, you" rather than just "you". How can I write it so that it removes the comma and treats "you, you" as the same word twice?
Any help is appreciated. My code is below:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
const char delim[] = ", . - !*()&^%$##<> ? []{}\\ / \"";
#define SIZE 1000
void occurrences(char s[], int count[]);
void lower(char s[]);
int main()
{
char string[SIZE], words[SIZE][SIZE], temp[SIZE];
int i = 0, j = 0, k = 0, n = 0, count;
int c = 0, cnt[26] = { 0 };
printf("Enter your input string:");
fgets(string, 256, stdin);
string[strlen(string) - 1] = '\0';
lower(string);
occurrences(string, cnt);
printf("Number of occurrences of each letter in the text: \n");
for (c = 0; c < 26; c++){
if (cnt[c] != 0){
printf("%c \t %d\n", c + 'a', cnt[c]);
}
}
/*extracting each and every string and copying to a different place */
while (string[i] != '\0')
{
if (string[i] == ' ')
{
words[j][k] = '\0';
k = 0;
j++;
}
else
{
words[j][k++] = string[i];
}
i++;
}
words[j][k] = '\0';
n = j;
printf("Unsorted Frequency:\n");
for (i = 0; i < n; i++)
{
strcpy(temp, words[i]);
for (j = i + 1; j <= n; j++)
{
if (strcmp(words[i], words[j]) == 0)
{
for (a = j; a <= n; a++)
strcpy(words[a], words[a + 1]);
n--;
}
} //inner for
}
i = 0;
/* find the frequency of each word */
while (i <= n) {
count = 1;
if (i != n) {
for (j = i + 1; j <= n; j++) {
if (strcmp(words[i], words[j]) == 0) {
count++;
}
}
}
/* count - indicates the frequecy of word[i] */
printf("%s\t%d\n", words[i], count);
/* skipping to the next word to process */
i = i + count;
}
printf("ALphabetical Order:\n");
for (i = 0; i < n; i++)
{
strcpy(temp, words[i]);
for (j = i + 1; j <= n; j++)
{
if (strcmp(words[i], words[j]) > 0)
{
strcpy(temp, words[j]);
strcpy(words[j], words[i]);
strcpy(words[i], temp);
}
}
}
i = 0;
while (i <= n) {
count = 1;
if (i != n) {
for (j = i + 1; j <= n; j++) {
if (strcmp(words[i], words[j]) == 0) {
count++;
}
}
}
printf("%s\n", words[i]);
i = i + count;
}
return 0;
}
void occurrences(char s[], int count[]){
int i = 0;
while (s[i] != '\0'){
if (s[i] >= 'a' && s[i] <= 'z')
count[s[i] - 'a']++;
i++;
}
}
void lower(char s[]){
int i = 0;
while (s[i] != '\0'){
if (s[i] >= 'A' && s[i] <= 'Z'){
s[i] = (s[i] - 'A') + 'a';
}
i++;
}
}
I have the solution to your problem, and its name is Wall. No, not the type you bang your head against when you encounter a problem that you can't seem to solve, but the compiler's Warnings: ALL OF THEM.
If you compile C code without using -Wall, then you can commit all the errors that people say make C so dangerous. But once you enable warnings, the compiler will tell you about them.
I get 4 for your program:
for (c; c< 26; c++) { That first c doesn't do anything; this could be written for (; c < 26; c++) { or perhaps better as for (c = 0; c <26; c++) {
words[i] == NULL "Statement with no effect". Well that probably isn't what you wanted to do. The compiler tells you that that line doesn't do anything.
"Unused variable 'text'." That is pretty clear too: you have defined text as a variable but then never used it. Perhaps you meant to or perhaps it was a variable you thought you needed. Either way it can go now.
"Control reaches end of non-void function". In C main is usually defined as int main, i.e. main returns an int. Standard practice is to return 0 if the program successfully completed and some other value on error. Adding return 0; at the end of main will work.
You can simplify your delimiters. Anything that is not a-z (after lower casing it), is a delimiter. You don't [need to] care which one it is. It's the end of a word. Rather than specify delimiters, specify chars that are word chars (e.g. if words were C symbols, the word chars would be: A-Z, a-z, 0-9, and _). But, it looks like you only want a-z.
Here are some [untested] examples:
void
scanline(char *buf)
{
int chr;
char *lhs;
char *rhs;
char tmp[5000];
lhs = tmp;
for (rhs = buf; *rhs != 0; ++rhs) {
chr = *rhs;
if ((chr >= 'A') && (chr <= 'Z'))
chr = (chr - 'A') + 'a';
if ((chr >= 'a') && (chr <= 'z')) {
*lhs++ = chr;
char_histogram[chr] += 1;
continue;
}
*lhs = 0;
if (lhs > tmp)
count_string(tmp);
lhs = tmp;
}
if (lhs > tmp) {
*lhs = 0;
count_string(tmp);
}
}
void
count_string(char *str)
{
int idx;
int match;
match = -1;
for (idx = 0; idx < word_count; ++idx) {
if (strcmp(words[idx],str) == 0) {
match = idx;
break;
}
}
if (match < 0) {
match = word_count++;
strcpy(words[match],str);
}
word_histogram[match] += 1;
}
Using separate arrays is ugly. Using a struct might be better:
#define STRMAX 100 // max string length
#define WORDMAX 1000 // max number of strings
struct word {
int word_hist; // histogram value
char word_string[STRMAX]; // string value
};
int word_count; // number of elements in wordlist
struct word wordlist[WORDMAX]; // list of known words
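With that struct, the two [untested] functions above could be folded into something like the following sketch. It repeats the definitions so that it compiles on its own; count_word and scan_words are illustrative names, and treating everything outside a-z as a delimiter is the assumption stated earlier, not something the question requires.

```c
#include <string.h>

#define STRMAX 100      /* max string length */
#define WORDMAX 1000    /* max number of strings */

struct word {
    int word_hist;              /* histogram value */
    char word_string[STRMAX];   /* string value */
};

int word_count;                 /* number of elements in wordlist */
struct word wordlist[WORDMAX];  /* list of known words */

/* Record one occurrence of str; returns its index, or -1 if the list is full. */
int count_word(const char *str)
{
    for (int idx = 0; idx < word_count; ++idx) {
        if (strcmp(wordlist[idx].word_string, str) == 0) {
            wordlist[idx].word_hist += 1;
            return idx;
        }
    }
    if (word_count == WORDMAX)
        return -1;
    strncpy(wordlist[word_count].word_string, str, STRMAX - 1);
    wordlist[word_count].word_string[STRMAX - 1] = '\0';
    wordlist[word_count].word_hist = 1;
    return word_count++;
}

/* Lower-case buf on the fly and feed every run of a-z letters to count_word(). */
void scan_words(const char *buf)
{
    char tmp[STRMAX];
    size_t len = 0;

    for (;; ++buf) {
        int chr = (unsigned char)*buf;
        if (chr >= 'A' && chr <= 'Z')
            chr = chr - 'A' + 'a';
        if (chr >= 'a' && chr <= 'z') {
            if (len < STRMAX - 1)       /* silently truncate overlong words */
                tmp[len++] = (char)chr;
            continue;
        }
        if (len > 0) {                  /* any other char ends the word */
            tmp[len] = '\0';
            count_word(tmp);
            len = 0;
        }
        if (chr == '\0')
            break;
    }
}
```

Since each distinct word is stored once with its count, the "no duplicates in the table" requirement falls out for free: printing wordlist[0..word_count-1] gives one row per word, in first-seen order.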
I'm currently doing the exercises from "Cracking the Coding Interview". Even though many versions of the answer to this problem seem to be out there, I'm not able to get mine to work properly.
I've checked the other versions in stackoverflow but I'm not able to find the difference or what's causing me this problem.
THE PROBLEM: it does change spaces for '%20' but it does not print the last word.
EXAMPLE:
Input: "Mr John Smith"
Output: "Mr%20John%20"
I leave you here my code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/* Maximum sentence size + 1. */
#define MAX_SENTENCE_SZ 256
int amountOfSpaces(char* str){
int cnt=0;
while(*str!='\0'){
if(*str == ' '){
cnt++;
}
str++;
}
return cnt;
}
int main(){
char sentence[100]={'\0'};
/* Get the sentence, with size limit. */
fgets (sentence, MAX_SENTENCE_SZ, stdin);
/* Remove trailing newline, if there. */
if ((strlen(sentence)>0) && (sentence[strlen (sentence) - 1] == '\n'))
sentence[strlen (sentence) - 1] = '\0';
int cant = amountOfSpaces(sentence);
int newleng = sizeof(sentence) + cant*2;
char *ans = (char *)malloc(newleng+sizeof(char));
int pos=0;
for(int i=0; i<= sizeof(sentence)-1; i++,pos++){
if (sentence[i] == ' ') {
ans[pos] = '%';
ans[pos + 1] = '2';
ans[pos + 2] = '0';
pos += 3;
} else {
ans[pos] = sentence[i];
}
}
ans[pos+1]='\0';
printf("%lu\n", sizeof(sentence));
printf("%lu\n", sizeof(ans));
printf("%s",ans);
/* Free memory and exit. */
free (ans);
return 0;
}
There are several errors in your code
Don't call strlen every time you need the length; it recomputes the length on each call, so it's expensive. Call it once and store the result:
sentenceLength = strlen(sentence);
/* Remove trailing newline, if there. */
if ((sentenceLength > 0) && (sentence[sentenceLength - 1] == '\n'))
sentence[--sentenceLength] = '\0';
int cant = amountOfSpaces(sentence);
You already have the length of the string, and sizeof sentence is 100, that is not the length of the string
int newleng = sentenceLength + 2 * cant;
You don't need to cast malloc, and you need one extra character for the terminating '\0'
char *ans = malloc(1 + newleng);
size_t pos = 0;
Since you already have the length, use it in the loop condition too; this is the same sizeof error:
for(size_t i=0 ; i < sentenceLength ; i++, pos++){
if (sentence[i] == ' ') {
ans[pos] = '%';
ans[pos + 1] = '2';
ans[pos + 2] = '0';
You are incrementing pos at every iteration so here you increment it by 2
pos += 2;
} else {
ans[pos] = sentence[i];
}
}
pos + 1 is wrong, you already increment it in the loop.
ans[pos] = '\0';
printf("%lu\n", (unsigned long)sentenceLength);
printf("%lu\n", (unsigned long)pos);
printf("%s",ans);
/* Free memory and exit. */
free (ans);
return 0;
The complete fixed code
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/* Maximum sentence size + 1. */
#define MAX_SENTENCE_SZ 256
int amountOfSpaces(char* str) {
int cnt=0;
while(*str!='\0') {
if(*str == ' ') {
cnt++;
}
str++;
}
return cnt;
}
int main(){
char sentence[100];
size_t sentenceLength;
/* Get the sentence, with size limit. */
fgets(sentence, sizeof(sentence), stdin);
sentenceLength = strlen(sentence);
/* Remove trailing newline, if there. */
if ((sentenceLength > 0) && (sentence[sentenceLength - 1] == '\n'))
sentence[--sentenceLength] = '\0';
int cant = amountOfSpaces(sentence);
int newleng = sentenceLength + 2 * cant;
char *ans = malloc(1 + newleng);
size_t pos = 0;
for(size_t i=0 ; i < sentenceLength ; i++, pos++) {
if (sentence[i] == ' ') {
ans[pos] = '%';
ans[pos + 1] = '2';
ans[pos + 2] = '0';
pos += 2;
} else {
ans[pos] = sentence[i];
}
}
ans[pos] = '\0';
printf("%lu\n", (unsigned long)sentenceLength);
printf("%lu\n", (unsigned long)pos);
printf("%s",ans);
/* Free memory and exit. */
free (ans);
return 0;
}
...I'm not able to find [...] what's causing me this problem.
Start using a debugger, get familiar with it.
Anyway:
if (sentence[i] == ' ') {
ans[pos] = '%';
ans[pos + 1] = '2';
ans[pos + 2] = '0';
pos += 2; // <<<<< BUG HERE -- increment by 2 not 3
} else {
Because pos is already incremented by 1 at each for iteration.
Btw: instead of using sizeof(sentence) store the actual length of the string in a variable and use that:
int l = strlen(sentence);
Remember l won't include the trailing null character.
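One way to sidestep the double increment is to let the loop header advance only the read index and to advance the write position explicitly in each branch. A minimal sketch, with encode_spaces as a made-up function name:

```c
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated copy of `src` with every space replaced by "%20".
   The caller must free() the result; returns NULL if allocation fails. */
char *encode_spaces(const char *src)
{
    size_t len = strlen(src);      /* actual string length, not sizeof */
    size_t spaces = 0;

    for (size_t i = 0; i < len; i++)
        if (src[i] == ' ')
            spaces++;

    /* each space grows from 1 to 3 characters; +1 for the terminator */
    char *dst = malloc(len + 2 * spaces + 1);
    if (dst == NULL)
        return NULL;

    size_t pos = 0;
    for (size_t i = 0; i < len; i++) {
        if (src[i] == ' ') {
            memcpy(dst + pos, "%20", 3);
            pos += 3;              /* the only place pos advances for a space */
        } else {
            dst[pos++] = src[i];
        }
    }
    dst[pos] = '\0';
    return dst;
}
```

Here pos is never touched by the for header, so "advance by 3" is correct again, and for the input "Mr John Smith" the result is "Mr%20John%20Smith".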