Rotate words around vowels in C

I am trying to write a program that reads the stdin stream looking for words (consecutive alphabetic characters) and for each word rotates it left to the first vowel (e.g. "friend" rotates to "iendfr") and writes this sequence out in place of the original word. All other characters are written to stdout unchanged.
So far, I have managed to reverse the letters, but have been unable to do much more. Any suggestions?
#include <stdio.h>
#include <ctype.h>
#include <string.h>
#define MAX_STK_SIZE 256
char stk[MAX_STK_SIZE];
int tos = 0; // next available place to put char
void push(int c) {
if (tos >= MAX_STK_SIZE) return;
stk[tos++] = c;
}
void putStk() {
while (tos >= 0) {
putchar(stk[--tos]);
}
}
int main (int charc, char * argv[]) {
int c;
do {
c = getchar();
if (isalpha(c) && (c == 'a' || c == 'A' || c == 'e' || c == 'E' || c == 'i' || c == 'o' || c == 'O' || c == 'u' || c == 'U')) {
push(c);
} else if (isalpha(c)) {
push(c);
} else {
putStk();
putchar(c);
}
} while (c != EOF);
}
-Soul

I am not going to write the whole program for you, but this example shows how to rotate a word from the first vowel (if any). The function strcspn returns the index of the first character matching any in the set passed, or the length of the string if no matches are found.
#include <stdio.h>
#include <string.h>
void vowelword(const char *word)
{
size_t len = strlen(word);
size_t index = strcspn(word, "aeiou");
size_t i;
for(i = 0; i < len; i++) {
printf("%c", word[(index + i) % len]);
}
printf("\n");
}
int main(void)
{
vowelword("friend");
vowelword("vwxyz");
vowelword("aeiou");
return 0;
}
Program output:
iendfr
vwxyz
aeiou

There are a number of ways you can approach the problem. You can use a stack, but that just adds the overhead of handling the stack operations. You can use mathematical reindexing, or you can use a copy-and-fill solution where you copy from the first vowel to a new string and then simply append the initial characters to the end of that string.
While you can read/write a character at a time, you are probably better served by creating the rotated string in a buffer to allow use of the string within your code. Regardless which method you use, you need to validate all string operations to prevent reading/writing beyond the end of your input and/or rotated strings. An example of a copy/fill approach to rotating to the first vowel in your input could be something like the following:
/* rotate 's' from first vowel with results to 'rs'.
* if 's' contains a vowel, 'rs' contains the rotated string,
* otherwise, 'rs' contains 's'. a pointer to 'rs' is returned
* on success, NULL otherwise and 'rs' is an empty-string.
*/
char *rot2vowel (char *rs, const char *s, size_t max)
{
if (!rs || !s || !max) /* validate params */
return NULL;
char *p = strpbrk (s, "aeiou");
size_t i, idx, len = strlen (s);
if (len > max - 1) { /* validate length */
fprintf (stderr, "error: insufficient storage (len > max - 1).\n");
*rs = 0; /* leave 'rs' an empty-string on failure */
return NULL;
}
if (!p) { /* if no vowel, copy s to rs, return rs */
strcpy (rs, s);
return rs;
}
idx = p - s; /* set index offset */
strcpy (rs, p); /* copy from 1st vowel */
for (i = 0; i < idx; i++) /* rotate beginning to end */
rs[i+len-idx] = s[i];
rs[len] = 0; /* nul-terminate */
return rs;
}
Above, strpbrk is used to return a pointer to the first occurrence of a vowel in string 's'. The function takes as parameters a pointer to an adequately sized string to hold the rotated string 'rs', the input string 's', and the allocated size of 'rs' in 'max'. The parameters are validated and s is checked for a vowel with strpbrk, which returns a pointer to the first vowel in s if one exists, NULL otherwise. The length is checked against max to ensure adequate storage.
If no vowels are present, s is copied to rs and a pointer to rs returned, otherwise the pointer difference is used to set the offset index to the first vowel, the segment of the string from the first vowel-to-end is copied to rs and then the preceding characters are copied to the end of rs with the loop. rs is nul-terminated and a pointer is returned.
While I rarely recommend the use of scanf for input, (a fgets followed by sscanf or strtok is preferable), for purposes of a short example, it can be used to read individual strings from stdin. Note: responding to upper/lower case vowels is left to you. A short example setting the max word size to 32-chars (31-chars + the nul-terminating char) will work for all known words in the unabridged dictionary (longest word is 28-chars):
#include <stdio.h>
#include <string.h>
enum { BUFSZ = 32 };
char *rot2vowel (char *rs, const char *s, size_t max);
int main (void)
{
char str[BUFSZ] = {0};
char rstr[BUFSZ] = {0};
while (scanf ("%31s", str) == 1) /* field width keeps input within BUFSZ - 1 */
printf (" %-8s => %s\n", str, rot2vowel (rstr, str, sizeof rstr));
return 0;
}
Example Use/Output
(shamelessly borrowing the example strings from WeatherVane :)
$ echo "friend vwxyz aeiou" | ./bin/str_rot2vowel
friend => iendfr
vwxyz => vwxyz
aeiou => aeiou
Look it over and let me know if you have any questions. Note: you can call the rot2vowel function prior to the printf statement and print the results with rstr, but since the function returns a pointer to the string, it can be used directly in the printf statement. How you use it is up to you.
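Neither answer above handles uppercase vowels or the "all other characters are written unchanged" part of the question. A minimal sketch of how the pieces could fit together follows. The names rotate_to_vowel, rotate_stream and the MAXWORD limit are assumptions made for this illustration, not part of either answer, and a word longer than MAXWORD is simply processed in chunks.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define MAXWORD 256   /* assumed upper bound on a word's length */

/* Rotate word[0..len) in place so its first vowel (either case) leads.
 * A word with no vowel is left unchanged. Requires len <= MAXWORD. */
void rotate_to_vowel (char *word, size_t len)
{
    char tmp[MAXWORD];
    size_t first = 0;

    while (first < len && !memchr ("aeiouAEIOU", word[first], 10))
        first++;
    if (first == 0 || first == len)     /* already leading, or no vowel */
        return;
    memcpy (tmp, word, first);                    /* save leading consonants */
    memmove (word, word + first, len - first);    /* vowel..end to the front */
    memcpy (word + len - first, tmp, first);      /* consonants to the back */
}

/* Copy in to out, rotating each run of alphabetic characters. */
void rotate_stream (FILE *in, FILE *out)
{
    char word[MAXWORD];
    size_t len = 0;
    int c;

    while ((c = getc (in)) != EOF) {
        if (isalpha (c) && len < MAXWORD) {
            word[len++] = (char)c;      /* still inside a word */
            continue;
        }
        rotate_to_vowel (word, len);    /* word ended (or buffer is full) */
        fwrite (word, 1, len, out);
        len = 0;
        if (isalpha (c))
            word[len++] = (char)c;      /* overlong word: start a new chunk */
        else
            putc (c, out);              /* non-letters pass through unchanged */
    }
    rotate_to_vowel (word, len);        /* flush a word that ends at EOF */
    fwrite (word, 1, len, out);
}
```

Calling rotate_stream (stdin, stdout) from main gives the filter described in the question. No stack is needed because the word is buffered and rotated as a whole.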

Related

Writing a C program that removes every occurrence of a char except the last one

I'm trying to write a C program that removes all occurrences of repeating chars in a string except the last occurrence. For example, if I had the string
char word[]="Hihxiivaeiavigru";
output should be:
printf("%s",word);
hxeavigru
What I have so far:
#include <stdio.h>
#include <string.h>
int main()
{
char word[]="Hihxiiveiaigru";
for (int i=0;i<strlen(word);i++){
if (word[i+1]==word[i]);
memmove(&word[i], &word[i + 1], strlen(word) - i);
}
printf("%s",word);
return 0;
}
I am not sure what I am doing wrong.
With short strings, any algorithm will do. OP's attempt is O(n*n), as are the other working answers, including the one from @David C. Rankin that identified OP's shortcomings.
But what if the string was thousands, millions in length?
Consider the following algorithm: @paulsm4
Form a `bool` array used[CHAR_MAX - CHAR_MIN + 1] and set each false.
i,unique = n - 1;
From the end of the string (n-1 to 0) to the front:
if (character never seen yet) { // used[] look-up
array[unique] = array[i];
unique--;
}
Mark used[array[i]] as true (index from CHAR_MIN)
i--;
Shift the string "to the left" (unique - i) places
Solution is O(n)
Coding goal is too fun to just post a fully coded answer.
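For readers who want to check the idea, the backward scan outlined above could be sketched as follows. This is one possible reading of the pseudocode, not the answerer's code: the name keep_last is made up for illustration, the lookup table is indexed by unsigned char rather than offset from CHAR_MIN, and the comparison is case-sensitive (so 'H' and 'h' count as different characters, unlike the output requested in the question).

```c
#include <limits.h>
#include <stdbool.h>
#include <string.h>

/* Keep only the last occurrence of each character in s (case-sensitive).
 * Works in place in one backward pass plus one memmove; returns the new
 * length. */
size_t keep_last (char *s)
{
    bool used[UCHAR_MAX + 1] = { false };   /* seen-so-far table */
    size_t n = strlen (s);
    size_t unique = n;                       /* survivors live in s[unique..n) */

    for (size_t i = n; i-- > 0; ) {          /* scan from the end to the front */
        unsigned char ch = (unsigned char)s[i];
        if (!used[ch]) {                     /* first sighting from the back */
            used[ch] = true;
            s[--unique] = s[i];              /* pack survivors at the back */
        }
    }
    memmove (s, s + unique, n - unique + 1); /* shift left, including '\0' */
    return n - unique;
}
```

The write position never drops below the read position, so survivors are never overwritten before they are read. Lowercasing ch before the table lookup would make the scan case-insensitive.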
I would first write a function to determine whether a char ch at a given position p is the last occurrence of ch in a given char *. Like,
bool isLast(char *word, char ch, int p) {
p++;
ch = tolower(ch);
while (word[p] != '\0') {
if (tolower(word[p]) == ch) {
return false;
}
p++;
}
return true;
}
Then you can use that to iteratively emit your desired characters like
int main() {
char *word = "Hihxiivaeiavigru";
for (int i = 0; word[i] != '\0'; i++) {
if (isLast(word, word[i], i)) {
putchar(word[i]);
}
}
putchar('\n');
}
And (for completeness) I used
#include <stdio.h>
#include <ctype.h>
#include <stdbool.h>
Outputs (as requested)
hxeavigru
Additional areas where you are currently hurting yourself.
Your for loop must NOT increment the index, e.g. for (int i=0; word[i];). This is because when you memmove() by 1, you have just incremented the indexes. That also means the value to save for last is now i - 1.
there should only be one call to strlen() in the program. You can simply subtract one from length each time memmove() is called.
only increment your loop counter variable when memmove() is not called.
Additionally, avoid hardcoding strings. You shouldn't have to recompile your code just to test the results of "Hihxiivaeiaigrui" instead of "Hihxiivaeiaigru". You shouldn't have to recompile just to remove all but the last 'a' instead of the 'i'. Either pass the string and character to find as arguments to your program (that's what int argc, char **argv are for), or prompt the user for input.
Putting it all together you could do (presuming word is 1023 characters or less):
#include <stdio.h>
#include <string.h>
#define MAXC 1024
int main (int argc, char **argv) {
char word[MAXC]; /* storage for word */
strcpy (word, argc > 1 ? argv[1] : "Hihxiivaeiaigru"); /* copy to word */
int find = argc > 2 ? *argv[2] : 'i', /* character to find */
last = -1; /* last index where find found */
size_t len = strlen (word); /* only compute strlen once */
printf ("%s (removing all but last %c)\n", word, find);
for (int i=0; word[i];) { /* loop over each char -- do NOT increment */
if (word[i] == find) { /* is this my character to find? */
if (last != -1) { /* if last is set */
/* overwrite last with rest of word */
memmove (&word[last], &word[last + 1], (int)len - last);
last = i - 1; /* last now i - 1 (we just moved it) */
len = len - 1;
}
else { /* last not set */
last = i; /* set it */
i++; /* increment loop counter */
}
}
else /* all other chars */
i++; /* just increment loop counter */
}
puts (word); /* output result -- no need for printf (no conversions) */
}
Example Use/Output
$ ./bin/rm_all_but_last_occurrence
Hihxiivaeiaigru (removing all but last i)
Hhxvaeaigru
What if you want to use "Hihxiivaeiaigrui"? Just pass it as the 1st argument:
$ ./bin/rm_all_but_last_occurrence Hihxiivaeiaigrui
Hihxiivaeiaigrui (removing all but last i)
Hhxvaeagrui
What if you want to use "Hihxiivaeiaigrui" and remove duplicate 'a' characters? Just pass the string to search as the 1st argument and the character to find as the second:
$ ./bin/rm_all_but_last_occurrence Hihxiivaeiaigrui a
Hihxiivaeiaigrui (removing all but last a)
Hihxiiveiaigrui
Nothing removed if only one of the characters:
$ ./bin/rm_all_but_last_occurrence Hihxiivaeiaigrui H
Hihxiivaeiaigrui (removing all but last H)
Hihxiivaeiaigrui
Let me know if you have further questions.
Im trying to write a C program that removes all occurrences of repeating chars in a string except the last occurrence.
Process the string (or word) from its last character toward its first. Viewed from that direction, the problem becomes removing every occurrence of a character except the first one encountered. Because the scan runs from the end, the characters that remain are collected at the end of the buffer, so once the whole string has been processed they must be moved to the start of the string (this is only needed if duplicates were found). The complexity of this algorithm is O(n).
Implementation:
#include <stdio.h>
#include <string.h>
#include <ctype.h>
#define INDX(x) (tolower(x) - 'a')
void remove_dups_except_last (char str[]) {
int map[26] = {0}; /* to keep track of a character processed */
size_t len = strlen (str);
char *p = str + len; /* pointer pointing to null character of input string */
size_t i = 0;
for (i = len; i != 0; --i) {
if (map[INDX(str[i - 1])] == 0) {
map[INDX(str[i - 1])] = 1;
*--p = str[i - 1];
}
}
/* if there were duplicates characters then only copy
*/
if (p != str) {
for (i = 0; *p; ++i) {
str[i] = *p++;
}
str[i] = '\0';
}
}
int main(int argc, char* argv[])
{
if (argc != 2) {
printf ("Invalid number of arguments\n");
return -1;
}
char str[1024] = {0};
/* Assumption: the input string/word will contain characters A-Z and a-z
* only and size of input will not be more than 1023.
*
* Leaving it up to you to check the valid characters in input string/word
*/
strcpy (str, argv[1]);
printf ("Original string : %s\n", str);
remove_dups_except_last (str);
printf ("Removed duplicated characters except the last one, modified string : %s\n", str);
return 0;
}
Testcases output:
# ./a.out Hihxiivaeiavigru
Original string : Hihxiivaeiavigru
Removed duplicated characters except the last one, modified string : hxeavigru
# ./a.out aa
Original string : aa
Removed duplicated characters except the last one, modified string : a
# ./a.out a
Original string : a
Removed duplicated characters except the last one, modified string : a
# ./a.out TtYyuU
Original string : TtYyuU
Removed duplicated characters except the last one, modified string : tyU
You can iterate over each character of your string and copy it to a new string if it is not 'i', or if it is the last occurrence of 'i'.
#include <stdio.h>
#include <string.h>
int main() {
char word[]="Hihxiiveiaigru";
char newword[10000];
char* ptr = strrchr(word, 'i');
int index=0;
int index2=0;
while (index < strlen(word)) {
if (word[index]!='i' || index ==(ptr - word)) {
newword[index2]=word[index];
index2++;
}
index++;
}
newword[index2] = '\0'; /* terminate the copy before printing it */
printf("%s",newword);
return 0;
}

Trim function in C

I am writing my own trim() in C. There is a structure which contains all string values, the structure is getting populated from the data coming from a file which contains spaces before and after the beginning of a word.
char *trim(char *string)
{
int stPos,endPos;
int len=strlen(string);
for(stPos=0;stPos<len && string[stPos]==' ';++stPos);
for(endPos=len-1;endPos>=0 && string[endPos]==' ';--endPos);
char *trimmedStr = (char*)malloc(len*sizeof(char));
strncpy(trimmedStr,string+stPos,endPos+1);
return trimmedStr;
}
int main()
{
char string1[]=" a sdf ie ";
char *string =trim(string1);
printf("%s",string);
return 0;
}
The above code is working fine, but I don't want to declare a new variable that stores the trimmed word, as the structure contains around 100 variables.
Is there any way to do something like the line below, where I don't need a second variable to print the trimmed string?
printf("%s",trim(string1));
I believe above print can create dangling pointer situation.
Also, is there any way where I don't have to change the original string as well? For example, if I print trim(string) it will print the trimmed string, and when I print only string, it will print the original string.
elcuco was faster. but it's done so here we go:
char *trim(char *string)
{
char *end = NULL;
while (*string == ' ') string++; // chomp away space at the start
end = string + strlen(string); // jump to the terminating '\0'
while (end > string && end[-1] == ' ') end--; // back up over trailing spaces
*end = '\0'; // cut the string after the last non-space char
return string; // return pointer to the modified start
}
If you don't want to alter the original string I'd write a special print instead:
void trim_print(char *string)
{
char *end = NULL;
while (*string == ' ') string++; // chomp away space at the start
end = string + strlen(string); // jump to the terminating '\0'
while (end > string && end[-1] == ' ') end--; // find end of trimmed text
while (string < end) { putchar(*string++); } // you get the picture
}
something like that.
You could modify the original string in order to do this. For trimming the prefix I just advance the pointer, and for the suffix, I actually add \0. If you want to keep the original starting address as is, you will have to move memory (which makes this an O(n^2) time complexity solution, from the O(n) one I provided).
#include <stdio.h>
char *trim(char *string)
{
// trim prefix
while ((*string) == ' ' ) {
string ++;
}
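// find end of original string
char *c = string;
while (*c) {
c ++;
}
// trim suffix (c points one past the last character, so an
// empty or all-space string never reads before the start)
while (c > string && c[-1] == ' ' ) {
c--;
*c = '\0';
}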
return string;
}
int main()
{
char string1[] = " abcdefg abcdf ";
char *string = trim(string1);
printf("String is [%s]\n",string);
return 0;
}
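(Re-thinking: it is not really O(n^2). Shifting the string once with memmove is a second O(n) pass, so the total stays O(n). It would only become O(n^2) if the string were shifted one position at a time for each leading space.)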
You can modify the function by giving the output in the same input string
void trim(char *string)
{
int i;
int stPos,endPos;
int len=strlen(string);
for(stPos=0;stPos<len && string[stPos]==' ';++stPos);
for(endPos=len-1;endPos>=0 && string[endPos]==' ';--endPos);
for (i=0; i<=(endPos-stPos); i++)
{
string[i] = string[i+stPos];
}
string[i] = '\0'; // terminate the string and discard the remaining spaces.
}
...is there any way where i don't have to change original string as well, like if i do trim(string) it will print trimmed string and when i print only string, it will print original string – avinashse
Yes, though it gets silly.
You could modify the original string.
trim(string);
printf("trimmed: %s\n", string);
The advantage is you have the option of duplicating the string if you want to retain the original.
char *original = strdup(string);
trim(string);
printf("trimmed: %s\n", string);
If you don't want to modify the original string, that means you need to allocate memory for the modified string. That memory then must be freed. That means a new variable to hold the pointer so you can free it.
char *trimmed = trim(original);
printf("trimmed: %s\n", trimmed);
free(trimmed);
You can get around this by passing a function pointer into trim and having trim manage all the memory for you.
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
void trim(char *string, void(*func)(char *) )
{
// Advance the pointer to the first non-space char
while( *string == ' ' ) {
string++;
}
// Shrink the length to the last non-space char.
size_t len = strlen(string);
while(len > 0 && string[len-1]==' ') { // len > 0 guards the empty / all-space case
len--;
}
// Copy the string to stack memory
char trimmedStr[len + 1];
strncpy(trimmedStr,string, len);
// strncpy does not add a null byte, add it ourselves.
trimmedStr[len] = '\0';
// pass the trimmed string into the user function.
func(trimmedStr);
}
void print_string(char *str) {
printf("'%s'\n", str);
}
int main()
{
char string[]=" a sdf ie ";
trim(string, print_string);
printf("original: '%s'\n", string);
return 0;
}
Ta da! One variable, the original is left unmodified, no memory leaks.
While function pointers have their uses, this is a bit silly.
It's C. Get used to managing memory. ¯\_(ツ)_/¯
Also, is there any way where I don't have to change original string as
well, like if I print trim(string) it will print trimmed string and
when i print only string, it will print original string
Yes you can, but you cannot allocate new memory in the trim function as you will not be holding the return memory.
You can have a static char buffer in the trim function and operate on it.
Updated version of @elcuco's answer.
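#include <stdio.h>
#define TRIM_MAX 1024 // assumed maximum length; pick one that fits your data
char *trim(char *string)
{
static char buff[TRIM_MAX];
// trim prefix
while ((*string) == ' ' ) {
string++;
}
// copy the rest of the original string (bounded by the buffer size)
int i = 0;
while (*string && i < TRIM_MAX - 1) {
buff[i++] = *string;
string++;
}
// trim suffix: back up over trailing spaces, then terminate
while (i > 0 && buff[i - 1] == ' ') {
i--;
}
buff[i] = '\0';
return buff;
}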
int main()
{
char string1[] = " abcdefg abcdf ";
char *string = trim(string1);
printf("String is [%s]\n",string);
return 0;
}
With this you don't need to worry about holding reference to trim function return.
Note: Previous values of buff will be overwritten with new call to trim function.
If you don't want to change the original, then you will need to make a copy, or pass a second array of sufficient size as a parameter to your function for filling. Otherwise a simple in-place trimming is fine -- so long as the original string is mutable.
An easy way to approach trimming on leading and trailing whitespace is to determine the number of leading whitespace characters to remove. Then simply use memmove to move from the first non-whitespace character back to the beginning of the string (don't forget to move the nul-character with the right portion of the string).
That leaves only removing trailing whitespace. An easy approach there is to loop from the end of the string toward the beginning, overwriting each character of trailing whitespace with a nul-character until your first non-whitespace character denoting the new end of string is found.
A simple implementation for that could be:
#include <stdio.h>
#include <string.h>
#include <ctype.h>
#define DELIM " \t\n" /* whitespace constant delimiters for strspn */
/** trim leading and trailing whitespace from s, (s must be mutable) */
char *trim (char *s)
{
size_t beg = strspn (s, DELIM), /* no of chars of leading whitespace */
len = strlen (s); /* length of s */
if (beg == len) { /* string is all whitespace */
*s = 0; /* make s the empty-string */
return s;
}
memmove (s, s + beg, len - beg + 1); /* shift string to beginning */
for (int i = (int)(len - beg - 1); i >= 0; i--) { /* loop from end */
if (isspace(s[i])) /* checking if char is whitespace */
s[i] = 0; /* overwrite with nul-character */
else
break; /* otherwise - done */
}
return s; /* Return s */
}
int main (void) {
char string1[] = " a sdf ie ";
printf ("original: '%s'\n", string1);
printf ("trimmed : '%s'\n", trim(string1));
}
(note: additional intervening whitespace was added to your initial string to show that multiple intervening whitespace is left unchanged, the output is single-quoted to show the remaining text boundaries)
Example Use/Output
$ ./bin/strtrim
original: ' a sdf ie '
trimmed : 'a sdf ie'
Look things over and let me know if you have further questions.
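The other option mentioned at the start of this answer, passing a second array for the function to fill so the original stays untouched, could look roughly like the sketch below. The name trim_copy and its parameter order are assumptions made for this illustration.

```c
#include <ctype.h>
#include <string.h>

/* Copy src to dst without leading/trailing whitespace; src is untouched.
 * Returns dst, or NULL if dst (of size dstsz) is too small. */
char *trim_copy (char *dst, size_t dstsz, const char *src)
{
    const char *end;

    while (isspace ((unsigned char)*src))       /* skip leading whitespace */
        src++;
    end = src + strlen (src);                   /* one past the last char */
    while (end > src && isspace ((unsigned char)end[-1]))
        end--;                                  /* back over trailing whitespace */

    size_t len = (size_t)(end - src);
    if (len + 1 > dstsz)                        /* need room for the '\0' */
        return NULL;
    memcpy (dst, src, len);
    dst[len] = '\0';
    return dst;
}
```

Because the function returns dst, a call such as printf ("'%s'\n", trim_copy (buf, sizeof buf, string1)); works directly as long as buf is large enough, and string1 still prints in its original form afterwards.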

Printf not printing - returns NULL

Beginner here. So I'm trying to write some code that takes a sentence and returns the longest word. When I debug the program everything looks correct, as I'd expect, including the char array. However, when I come to print the output I invariably get a NULL...
I've put in the entire code because I think one of the loops must be affecting the array string pointer in some way?
#include <stdio.h>
#include <string.h>
void LongestWord(char sen1[500]) {
/*
steps:
1. char pointer. Each Byte holds array position of each space or return value Note space = 32 & return = 10.
2. Once got above asses biggest word. Biggest word stored in short int (starting position)
3. Once got biggest word start - move to sen using strncpy
*/
char sen[500];
char *ptr = sen;
int i = 0;
int space_position[500];
int j = 0;
int k = 0;
int word_size_prior_to_each_position[500];
int l = 0;
int largest = 0;
int largest_end_position = 0;
int largest_start_position =0;
memset(&sen[0], 0, 500);
memset(&space_position[0], 0, 2000);
memset(&word_size_prior_to_each_position[0], 0, 2000);
while (i < 500) { //mark out where the spaces or final return is
if ((sen1[i] == 0b00100000) ||
(sen1[i] == 0b00001010))
{
space_position[j] = i;
j = j+1;
}
i = i+1;
}
while (k < 500) {
if (k == 0) {
word_size_prior_to_each_position[k] = (space_position[k]);
}
//calculate word size at each position
if ((k > 0) && (space_position[k] != 0x00)) {
word_size_prior_to_each_position[k] = (space_position[k] - space_position[k-1]) -1;
}
k = k+1;
}
while (l < 500) { //find largest start position
if (word_size_prior_to_each_position[l] > largest) {
largest = word_size_prior_to_each_position[l];
largest_end_position = space_position[l];
largest_start_position = space_position[l-1];
}
l = l+1;
}
strncpy(ptr, sen1+largest_start_position+1, largest);
printf("%s", *ptr);
return 0;
}
int main(void) {
char stringcapture[500];
fgets(stringcapture, 499, stdin);
LongestWord(stringcapture); //this grabs input and posts into the longestword function
return 0;
}
In the function LongestWord replace
printf("%s", *ptr);
with
printf("%s\n", ptr);
*ptr denotes a single character, but you want to print a string (see %s specification), so you must use ptr instead. It makes sense to also add a line break (\n).
Also remove the
return 0;
there, because it's a void function.
Returning the longest word
To return the longest word from the function as pointer to char, you can change the function signature to
char *LongestWord(char sen1[500])
Since your pointer ptr points to a local array in LongestWord it will result in a dangling reference as soon as the function returns.
Therefore you need to do something like:
return strdup(ptr);
Then in main you can change your code to:
char *longest_word = LongestWord(stringcapture);
printf("%s\n", longest_word);
free(longest_word);
Some more Hints
You have a declaration
int space_position[500];
There you are calling:
memset(&space_position[0], 0, 2000);
Here you are assuming that an int is 4 bytes. That assumption leads to not-portable code.
You should rather use:
memset(&space_position[0], 0, sizeof(space_position));
You can even write:
memset(space_position, 0, sizeof(space_position));
since space_position is the address of the array anyway.
Applied to your memsets, it would look like this:
memset(sen, 0, sizeof(sen));
memset(space_position, 0, sizeof(space_position));
memset(word_size_prior_to_each_position, 0, sizeof(word_size_prior_to_each_position));
Instead of using some binary numbers for space and return, you can alternatively use the probably more readable notation of ' ' and '\n', so that you could e.g. write:
if ((sen1[i] == ' ') ||
(sen1[i] == '\n'))
instead of
if ((sen1[i] == 0b00100000) ||
(sen1[i] == 0b00001010))
The variable largest_end_position is assigned but never used anywhere, so it can be removed.
The following line
strncpy(ptr, sen1 + largest_start_position + 1, largest);
would omit the first letter of the word if the first word were also the longest. It seems largest_start_position is the position of the space, but in case of the first word (largest_start_position == 0) you start to copy from index 1. This special case needs to be handled.
You have a local array in main that is not initialized.
So instead of
char stringcapture[500];
you must write
char stringcapture[500];
memset(stringcapture, 0, sizeof(stringcapture));
alternatively you could use:
char stringcapture[500] = {0};
Finally in this line:
largest_start_position = space_position[l - 1];
You access the array outside the boundaries if l==0 (space_position[-1]). So you have to write:
if (l > 0) {
largest_start_position = space_position[l - 1];
}
else {
largest_start_position = 0;
}
While Stephan has provided you with a good answer addressing the problems you were having with your implementation of your LongestWord function, you may be over-complicating what you are doing to find the longest word.
To be useful, think about what you need to know when getting the longest word from a sentence. You want to know (1) what the longest word is; and (2) how many characters does it contain? You can always call strlen again when the function returns, but why? You will have already handled that information in the function, so you might as well make that information available back in the caller.
You can write your function in a number of ways to either return the length of the longest word, or a pointer to the longest word itself, etc. If you want to return a pointer to the longest word, you can either pass an array of sufficient size as a parameter to the function for filling within the function, or you can dynamically allocate storage within the function so that the storage survives the function return (allocated storage duration versus automatic storage duration). You can also declare an array static and preserve storage that way, but that will limit you to one use of the function in any one expression. If returning a pointer to the longest word, to also make the length available back in the caller, you can pass a pointer as a parameter and update the value at that address within your function, making the length available back in the calling function.
So long as you are simply looking for the longest word, the longest word in the unabridged dictionary (non-medical) is 29-characters (taking 30-characters storage total), or for medical terms the longest word is 45-character (taking 46-characters total). So it may make more sense to simply pass an array to fill with the longest word as a parameter since you already know what the max-length needed will be (an array of 64-chars will suffice -- or double that to not skimp on buffer size, your call).
Rather than using multiple arrays, a simple loop and a pair of pointers is all you need to walk down your sentence buffer bracketing the beginning and end of each word to pick out the longest one. (and the benefit there, as opposed to using a strtok, etc. is the original sentence is left unchanged allowing it to be passed as const char * allowing the compiler to further optimize the code)
A longest_word function that passes the sentence and word to fill as parameters returning the length of the longest string is fairly straight forward to do in a single loop. Loosely referred to as a State Loop, where you use a simple flag to keep track of your read state, i.e. whether you are in a word within the sentence or whether you are in whitespace before, between or after the words in the sentence. A simple In/Out state flag.
Then you simply use a pointer p to locate the beginning of each word, and an end-pointer ep to advance down the sentence to locate the end of each word, checking for the word with the max-length as you go. You can use the isspace() macro provided in ctype.h to locate the spaces between each word.
The loop itself does nothing more than loop continually while you keep track of each pointer and then check which word is the longest by the simple pointer difference ep - p when the end of each word is found. If a word is longer than the previous max, then copy that to your longest word array and update max with the new max-length.
A short implementation could be similar to:
size_t longest_word (const char *sentence, char *word)
{
const char *p = sentence, *ep = p; /* pointer & end-pointer */
size_t in = 0, max = 0; /* in-word flag & max len */
if (!sentence || !*sentence) /* if NULL or empty, set word empty */
return (*word = 0);
for (;;) { /* loop continually */
if (isspace (*ep) || !*ep) { /* check whitespace & end of string */
if (in) { /* if in-word */
size_t len = ep - p; /* get length */
if (len > max) { /* if greater than max */
memcpy (word, p, len); /* copy to word */
word[len] = 0; /* nul-terminate word */
max = len; /* update max */
}
p = ep; /* update pointer to end-pointer */
in = 0; /* zero in-word flag */
}
if (!*ep) /* if end of string, bail */
break;
}
else { /* non-space character */
if (!in) { /* if not in-word */
p = ep; /* update pointer to end-pointer */
in = 1; /* set in-word flag */
}
}
ep++; /* advance end-pointer */
}
return max; /* return max length */
}
A complete example taking the sentence to be read as user-input could be similar to:
#include <stdio.h>
#include <string.h>
#include <ctype.h>
#define MAXWRD 64 /* longest word size */
#define MAXC 2048 /* max characters in sentence */
size_t longest_word (const char *sentence, char *word)
{
const char *p = sentence, *ep = p; /* pointer & end-pointer */
size_t in = 0, max = 0; /* in-word flag & max len */
if (!sentence || !*sentence) /* if NULL or empty, set word empty */
return (*word = 0);
for (;;) { /* loop continually */
if (isspace (*ep) || !*ep) { /* check whitespace & end of string */
if (in) { /* if in-word */
size_t len = ep - p; /* get length */
if (len > max) { /* if greater than max */
memcpy (word, p, len); /* copy to word */
word[len] = 0; /* nul-terminate word */
max = len; /* update max */
}
p = ep; /* update pointer to end-pointer */
in = 0; /* zero in-word flag */
}
if (!*ep) /* if end of string, bail */
break;
}
else { /* non-space character */
if (!in) { /* if not in-word */
p = ep; /* update pointer to end-pointer */
in = 1; /* set in-word flag */
}
}
ep++; /* advance end-pointer */
}
return max; /* return max length */
}
int main (void) {
char buf[MAXC], word[MAXWRD];
size_t len;
if (!fgets (buf, MAXC, stdin)) {
fputs ("error: user canceled input.\n", stderr);
return 1;
}
len = longest_word (buf, word);
printf ("longest: %s (%zu-chars)\n", word, len);
return 0;
}
Example Use/Output
Entered string has 2-character leading whitespace as well as 2-characters trailing whitespace:
$ ./bin/longest_word
1234 123 12 123456 1234 123456789 12345678 1 1234
longest: 123456789 (9-chars)
This isn't intended to be a substitute for Stephan's answer helping with the immediate issues in your implementation, rather this is an example providing you with an alternative way to think about approaching the problem. Generally the simpler you can keep any coding task, the less error prone it will be. Look it over and let me know if you have any further questions about the approach.
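The answer also mentions returning a pointer to the longest word while handing the length back through a pointer parameter. A minimal sketch of that variant is below. The name longest_word_span is made up for illustration, sentence is assumed to be non-NULL, and nothing is copied, so the returned pointer refers into the original sentence and is not nul-terminated at the end of the word.

```c
#include <ctype.h>
#include <stddef.h>

/* Return a pointer to the first longest whitespace-delimited word in
 * sentence and store its length in *len (0 if there are no words). */
const char *longest_word_span (const char *sentence, size_t *len)
{
    const char *best = sentence, *p = sentence;
    *len = 0;

    while (*p) {
        while (isspace ((unsigned char)*p))         /* skip whitespace */
            p++;
        const char *start = p;                      /* possible word start */
        while (*p && !isspace ((unsigned char)*p))  /* walk to end of word */
            p++;
        if ((size_t)(p - start) > *len) {           /* new maximum */
            best = start;
            *len = (size_t)(p - start);
        }
    }
    return best;
}
```

With size_t n; const char *w = longest_word_span (buf, &n); the word can be printed using a precision, e.g. printf ("longest: %.*s (%zu-chars)\n", (int)n, w, n);, which avoids the fixed-size word buffer at the cost of the caller having to carry the length around.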

sscanf doesnt seem to capture the correct parts of my strings

I've tried to run this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char a[1000];
void eliminatesp() {
char buff1[1000], buff2[1000];
LOOP: sscanf(a,"%s %s",buff1,buff2);
sprintf(a,"%s%s", buff1, buff2);
for(int i=0; i<strlen(a); ++i) {
if(a[i]==' ') goto LOOP;
}
}
void eliminateline() {
char buff1[1000]; char buff2[1000];
LOOP: sscanf(a,"%s\n\n%s",buff1,buff2);
sprintf(a,"%s\n%s", buff1, buff2);
for(int i=0; i<strlen(a)-1; ++i) {
if(a[i]=='\n'&&a[i+1]=='\n') goto LOOP;
}
}
int main() {sprintf(a,"%s\n\n%s", "hello world","this is my program, cris");
eliminatesp();
eliminateline();
printf("%s",a); return 0;
return 0;
}
but the output was:
hello world
world
How can I correct it? I was trying to remove spaces and empty lines.
Going with your idea of using sscanf and sprintf you can actually eliminate both spaces and newlines in a single function, as sscanf will ignore all whitespace (including newlines) when reading the input stream. So something like this should work:
void eliminate() {
char buff1[1000], buff2[1000], b[1000];
char* p = a, *q = b, *pq = b;
sprintf(q, "%s", p);
while (q != NULL && *q != '\0')
{
if (isspace((unsigned char)*q)) /* isspace() requires <ctype.h> */
{
sscanf(pq, "%s %s", buff1, buff2);
sprintf(p, "%s%s", buff1, buff2);
p += strlen(buff1);
pq = ++q;
}
q++;
}
}
Pedro, while the %s format specifier does stop conversion at the first whitespace it encounters, that isn't the only drawback to parsing with sscanf. To walk through a string with sscanf you also need the %n conversion specifier (which stores the number of characters consumed up to the point where %n appears) and must save that value as an integer (say offset). Your next conversion then begins at a + offset, and you repeat until you have exhausted all words in 'a'. This can be a tedious process.
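To make the offset bookkeeping concrete, here is a minimal sketch that only counts the words in a string. The helper name count_words and the 63-character word limit are assumptions made for illustration, they are not part of your code:

```c
#include <stdio.h>

/* Hypothetical helper: count the whitespace-separated words in s by
 * calling sscanf repeatedly and advancing by the characters consumed.
 */
int count_words (const char *s)
{
    char word[64];      /* assumed limit: longer words are split at 63 chars */
    int n = 0,          /* chars consumed by the last conversion */
        offset = 0,     /* where the next conversion begins */
        count = 0;      /* words seen so far */

    /* %63s skips leading whitespace and reads one word; %n stores the
     * number of characters consumed so far. %n is not counted in the
     * return value, so a successful read returns 1.
     */
    while (sscanf (s + offset, "%63s%n", word, &n) == 1) {
        count++;
        offset += n;    /* next conversion starts at s + offset */
    }
    return count;
}
```

Called on your test string, count_words ("hello world\n\nthis is my program, cris") should return 7. The same loop shape works for copying words instead of counting them.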
A better approach can simply be to loop over all characters in 'a' copying non-whitespace and single-delimiting whitespace to the new buffer as you go. (I often find it easier to copy the full string to a new buffer (say 'b') and then read from 'b' writing the new compressed string back to 'a').
As you work your way down the original string, you use simple if else logic to determine whether to store the current (or last) character or whether to just skip it and get the next. There are many ways to do this, no one way more right than the other as long as they are reasonably close in efficiency. Making use of the <ctype.h> functions like isspace() makes things easier.
Also, in your code, avoid the use of global variables. There is no reason you can't declare 'a' in main() and pass it as a parameter to your eliminate functions. If you need a constant in your code, like 1000, then #define a constant and avoid sprinkling magic numbers throughout your code.
Below is an example putting all those pieces together, and combining both your eliminatesp and eliminateline functions into a single eliminatespline function that does both trim whitespace and eliminate blank lines. This will handle blank lines and considers lines containing only whitespace characters as blank.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#define MAXL 1000 /* if you need a constant, define one (or more) */
/** trim leading, compress included, and trim trailing whitespace.
* given non-empty string 'a', trim all leading whitespace, remove
* multiple included spaces and empty lines, and trim all trailing
* whitespace.
*/
void eliminatespline (char *a)
{
char b[MAXL] = "", /* buffer to hold copy of a */
*rp = b, /* read pointer to copy of a */
*wp = a, /* write pointer for a */
last = 0; /* last char before current */
if (!a || !*a) /* a NULL or empty - return */
return;
strcpy (b, a); /* copy a to b */
while (isspace (*rp)) /* skip leading whitespace */
rp++;
last = *rp++; /* fill last with 1st non-whitespace */
while (*rp) { /* loop over remaining chars in b */
/* last '\n' and current whitespace - advance read pointer */
if (last == '\n' && isspace(*rp)) {
rp++;
continue;
} /* last not current or last not space */
else if (last != *rp || !isspace (last))
*wp++ = last; /* write last, advance write pointer */
last = *rp++; /* update last, advance read pointer */
}
if (!isspace (last)) /* if last not space */
*wp++ = last; /* write last, advance write pointer */
*wp = 0; /* nul-terminate at write pointer */
}
int main() {
char a[] = " hello world\n \n\nthis is my program, cris ";
eliminatespline (a);
printf ("'%s'\n", a);
return 0;
}
note: the line being trimmed has both leading and trailing whitespace as well as embedded blank lines and lines containing only whitespace, e.g.
" hello world\n \n\nthis is my program, cris "
Example Use/Output
$ ./bin/elimspaceline
'hello world
this is my program, cris'
(note: the printf statement wraps the output in single-quotes to confirm all leading and trailing whitespace was eliminated.)
If you did want to use sscanf, you could do essentially the same thing with sscanf (using the %n specifier to report characters consumed) and an array of two characters that lets you treat the character following each word as a string, and do something like the following:
void eliminatespline (char *a)
{
char b[MAXL] = "", /* string to hold build w/whitespace removed */
word[MAXL] = "", /* string for each word */
c[2] = ""; /* string made of char after word */
int n = 0, /* number of chars consumed by sscanf */
offset = 0, /* offset from beginning of a */
rtn; /* sscanf return (number of conversions) */
size_t len; /* length of final string in b */
/* sscanf each word and char that follows, reporting consumed */
while ((rtn = sscanf (a + offset, "%s%c%n", word, &c[0], &n)) >= 1) {
strcat (b, word); /* concatenate word */
if (rtn == 1) /* final word with no char after it */
break;
strcat (b, c); /* concatenate next char */
offset += n; /* update offset with n */
}
len = strlen (b); /* get final length of b */
if (len && isspace(b[len - 1])) /* if last char is whitespace */
b[len - 1] = 0; /* remove last char */
strcpy (a, b); /* copy b to a */
}
Look things over, try both approaches and let me know if you have further questions.

Parsing text in C

I have a file like this:
...
words 13
more words 21
even more words 4
...
(General format is a string of non-digits, then a space, then any number of digits and a newline)
and I'd like to parse every line, putting the words into one field of the structure, and the number into the other. Right now I am using an ugly hack of reading the line while the chars are not numbers, then reading the rest. I believe there's a clearer way.
Edit: You can use pNum-buf to get the length of the alphabetical part of the string, and use strncpy() to copy that into another buffer. Be sure to add a '\0' to the end of the destination buffer. I would insert this code before the pNum++.
int len = pNum-buf;
strncpy(newBuf, buf, len); /* pNum still points at the space, so len is the text length */
newBuf[len] = '\0';
You could read the entire line into a buffer and then use:
char *pNum;
if ((pNum = strrchr(buf, ' ')) != NULL) {
pNum++;
}
to get a pointer to the number field.
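Putting the strrchr() idea together with the copy step, a rough sketch could look like the following. The function name split_line, its parameters and its 0/-1 return convention are assumptions for illustration only. Unlike the snippets above, it leaves buf unmodified and copies the text part into a caller-supplied buffer:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: split "some words 13" at its last space.
 * Copies the text part into words (capacity cap) and stores the
 * number in *value. Returns 0 on success, -1 on failure.
 */
int split_line (const char *buf, char *words, size_t cap, long *value)
{
    const char *pNum = strrchr (buf, ' ');  /* last space in the line */
    char *end;
    size_t len;

    if (pNum == NULL)                       /* no space: nothing to split */
        return -1;
    len = (size_t)(pNum - buf);             /* length of the text part */
    if (len >= cap)                         /* text part would not fit */
        return -1;
    memcpy (words, buf, len);
    words[len] = '\0';                      /* always nul-terminate */

    *value = strtol (pNum + 1, &end, 10);   /* convert the number field */
    if (end == pNum + 1)                    /* no digits were converted */
        return -1;
    return 0;
}
```

A trailing '\n' left by fgets() is harmless here, because strtol() simply stops converting at it.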
fscanf(file, "%s %d", word, &value);
This gets the values directly into a string and an integer, and copes with variations in whitespace and numerical formats, etc.
Edit
Ooops, I forgot that you had spaces between the words.
In that case, I'd do the following. (Note that it truncates the original text in 'line')
// Scan to find the last space in the line
char *p = line;
char *lastSpace = NULL;
while(*p != '\0')
{
if (*p == ' ')
lastSpace = p;
p++;
}
if (lastSpace == NULL)
return("parse error");
// Replace the last space in the line with a NUL
*lastSpace = '\0';
// Advance past the NUL to the first character of the number field
lastSpace++;
char *word = line; // 'line' now ends where the last space was
int number = atoi(lastSpace);
You can solve this using stdlib functions, but the above is likely to be more efficient as you're only searching for the characters you are interested in.
Given the description, I think I'd use a variant of this (now tested) C99 code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <ctype.h>
struct word_number
{
char word[128];
long number;
};
int read_word_number(FILE *fp, struct word_number *wnp)
{
char buffer[140];
if (fgets(buffer, sizeof(buffer), fp) == 0)
return EOF;
size_t len = strlen(buffer);
if (buffer[len-1] != '\n') // Error if line too long to fit
return EOF;
buffer[--len] = '\0';
char *num = &buffer[len-1];
while (num > buffer && !isspace((unsigned char)*num))
num--;
if (num == buffer) // No space in input data
return EOF;
char *end;
wnp->number = strtol(num+1, &end, 0);
if (*end != '\0') // Invalid number as last word on line
return EOF;
*num = '\0';
if (num - buffer >= sizeof(wnp->word)) // Non-number part too long
return EOF;
memcpy(wnp->word, buffer, num - buffer + 1); // +1 copies the terminating '\0' too
return(0);
}
int main(void)
{
struct word_number wn;
while (read_word_number(stdin, &wn) != EOF)
printf("Word <<%s>> Number %ld\n", wn.word, wn.number);
return(0);
}
You could improve the error reporting by returning different values for different problems.
You could make it work with dynamically allocated memory for the word portion of the lines.
You could make it work with longer lines than I allow.
You could scan backwards over digits instead of non-spaces - but this allows the user to write "abc 0x123" and the hex value is handled correctly.
You might prefer to ensure there are no digits in the word part; this code does not care.
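As a sketch of the first suggestion (different return values for different problems), the parsing step could report a status code. Everything named here (the enum wn_status, its members, and parse_word_number) is hypothetical and only illustrates the idea. It works on a line already read into memory and keeps the base-0 strtol() conversion so "abc 0x123" is still accepted:

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

enum wn_status {            /* hypothetical status codes */
    WN_OK = 0,
    WN_NO_SPACE,            /* nothing separates word and number */
    WN_BAD_NUMBER,          /* last field is not a valid number */
    WN_WORD_TOO_LONG        /* text part does not fit in word */
};

enum wn_status parse_word_number (const char *line, char *word,
                                  size_t cap, long *number)
{
    size_t len = strlen (line), wlen;
    const char *num;
    char *end;

    /* ignore trailing whitespace such as the '\n' left by fgets */
    while (len > 0 && isspace ((unsigned char)line[len - 1]))
        len--;
    /* back up to the first character of the last field */
    num = line + len;
    while (num > line && !isspace ((unsigned char)num[-1]))
        num--;
    if (num == line)
        return WN_NO_SPACE;

    *number = strtol (num, &end, 0);
    if (end == num || end != line + len)    /* nothing or only part converted */
        return WN_BAD_NUMBER;

    /* text part is everything before the separating whitespace */
    wlen = (size_t)(num - line);
    while (wlen > 0 && isspace ((unsigned char)line[wlen - 1]))
        wlen--;
    if (wlen >= cap)
        return WN_WORD_TOO_LONG;
    memcpy (word, line, wlen);
    word[wlen] = '\0';
    return WN_OK;
}
```

The caller can then switch on the result to print a specific diagnostic instead of treating every failure as EOF.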
You could try using strtok() to tokenize each line, and then check whether each token is a number or a word (a fairly trivial check once you have the token string - just look at the first character of the token).
Assuming that the number is immediately followed by '\n', you can read each line into a char buffer, use sscanf on the entire line to get the number (a bare "%d" fails on the leading words, so the format has to skip them first, e.g. "%*[^0-9]%d"), and then calculate the number of chars that this number takes at the end of the text string.
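A minimal sketch of the sscanf route, assuming (as the question states) that the text part contains no digits. The function name scan_line and the requirement that word has room for at least 128 bytes are assumptions for this example:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: word must point to at least 128 bytes.
 * Returns 0 on success, -1 if the line does not match.
 */
int scan_line (const char *line, char *word, long *number)
{
    size_t len;

    /* %127[^0-9] stores up to 127 non-digit characters in word,
     * then %ld converts the number that follows them.
     */
    if (sscanf (line, "%127[^0-9]%ld", word, number) != 2)
        return -1;

    /* the scanset also captured the separating space(s): trim them */
    len = strlen (word);
    while (len > 0 && word[len - 1] == ' ')
        word[--len] = '\0';
    return 0;
}
```

Note that a minus sign is not a digit, so it would end up in the text part. This sketch only suits non-negative numbers.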
Depending on how complex your strings become you may want to use the PCRE library. At least that way you can compile a perl'ish regular expression to split your lines. It may be overkill though.
Given the description, here's what I'd do: read each line as a single string using fgets() (making sure the target buffer is large enough), then split the line using strtok(). To determine if each token is a word or a number, I'd use strtol() to attempt the conversion and check the error condition. Example:
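#include <stdlib.h>
#include <stdio.h>
#include <string.h>
/* sizes below are assumed values; adjust them to your data */
#define MAX_LINE_LENGTH 256
#define MAX_STR_SIZE 64
#define MAX_NUM_STRINGS 16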
/**
* Read the next line from the file, splitting the tokens into
* multiple strings and a single integer. Assumes input lines
* never exceed MAX_LINE_LENGTH and each individual string never
* exceeds MAX_STR_SIZE. Otherwise things get a little more
* interesting. Also assumes that the integer is the last
* thing on each line.
*/
int getNextLine(FILE *in, char (*strs)[MAX_STR_SIZE], int *numStrings, int *value)
{
char buffer[MAX_LINE_LENGTH];
int rval = 1;
if (fgets(buffer, sizeof buffer, in))
{
char *token = strtok(buffer, " ");
*numStrings = 0;
while (token)
{
char *chk;
*value = (int) strtol(token, &chk, 10);
if (*chk != 0 && *chk != '\n')
{
strcpy(strs[(*numStrings)++], token);
}
token = strtok(NULL, " ");
}
}
else
{
/**
* fgets() hit either EOF or error; either way return 0
*/
rval = 0;
}
return rval;
}
/**
* sample main
*/
int main(void)
{
FILE *input;
char strings[MAX_NUM_STRINGS][MAX_STR_SIZE];
int numStrings;
int value;
input = fopen("datafile.txt", "r");
if (input)
{
while (getNextLine(input, strings, &numStrings, &value))
{
/**
* Do something with strings and value here
*/
}
fclose(input);
}
return 0;
}

Resources