Splitting a string into chunks. - c

I'm trying to split a string into chunks of 6 using C and I'm having a rough time of it. If you input a 12 character long string it just prints two unusual characters.
#include <stdio.h>
#include <string.h>
void stringSplit(char string[50])
{
int counter = 0;
char chunk[7];
for (unsigned int i = 0; i < strlen(string); i++)
{
if (string[i] == ' ')
{
continue;
}
int lastElement = strlen(chunk) - 1;
chunk[lastElement] = string[i];
counter++;
if (counter == 6)
{
printf(chunk);
memset(chunk, '\0', sizeof chunk);
counter = 0;
}
}
if (chunk != NULL)
{
printf(chunk);
}
}
int main()
{
char string[50];
printf("Input string. \n");
fgets(string, 50, stdin);
stringSplit(string);
return(0);
}
I appreciate any help.

Your problem is at
int lastElement = strlen(chunk) - 1;
Firstly, strlen counts the number of characters up to the NUL character. Your array is initially uninitialized, so calling strlen on it reads indeterminate values, which is undefined behavior.
Assuming your array is filled with NULs, and you have, let's say, 2 characters at the beginning and you are looking to place the third one. Remember that your 2 characters are at positions 0 and 1, respectively. So, strlen will return 2 (your string has 2 characters), you subtract one, so the lastElement variable has the value 1 now. And you place the third character at index 1, thus overwriting the second character you already had.
Also, this is extremely inefficient, since you compute the number of characters each time. But wait, you already know how many characters you have (you count them in counter, don't you?). So why not use counter to compute the index where the new character should be placed? (Be careful not to make the same mistake and overwrite something else.)
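A minimal sketch of that idea, assuming a helper called next_chunk (a hypothetical name, not from the question) that fills one chunk per call and uses the running count as the write index. The caller is assumed to provide a chunk buffer of at least size + 1 bytes.

```c
#include <stddef.h>

/* Hypothetical helper: copies up to `size` non-space characters from `src`
   into `chunk`, using the running count as the write index, then
   NUL-terminates the chunk. `chunk` must hold at least size + 1 bytes.
   Returns a pointer to the first character of `src` not yet consumed. */
const char *next_chunk(const char *src, char *chunk, size_t size)
{
    size_t counter = 0;

    while (*src != '\0' && counter < size) {
        if (*src != ' ' && *src != '\n')
            chunk[counter++] = *src;   /* counter is always the next free index */
        src++;
    }
    chunk[counter] = '\0';
    return src;
}
```

Called repeatedly on "abc defghi" with size 6, this should fill chunk with "abcdef" and then "ghi". Each chunk is best printed with puts(chunk) or printf("%s\n", chunk) rather than printf(chunk).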

The function is wrong.
This statement
int lastElement = strlen(chunk) - 1;
can result in undefined behavior because, firstly, the array chunk is not initialized
char chunk[7];
and secondly after this statement
memset(chunk, '\0', sizeof chunk);
the value of the variable lastElement will be equal to -1.
This if statement
if (chunk != NULL)
{
printf(chunk);
}
does not make sense because the address of the first character of the array chunk is never equal to NULL.
It seems that what you mean is the following.
#include <stdio.h>
#include <ctype.h>
void stringSplit( const char s[] )
{
const size_t N = 6;
char chunk[N + 1];
size_t i = 0;
for ( ; *s; ++s )
{
if ( !isspace( ( unsigned char )*s ) )
{
chunk[i++] = *s;
if ( i == N )
{
chunk[i] = '\0';
i = 0;
puts( chunk );
}
}
}
if ( i != 0 )
{
chunk[i] = '\0';
puts( chunk );
}
}
int main(void)
{
char s[] = " You and I are beginners in C ";
stringSplit( s );
}
The program output is
Youand
Iarebe
ginner
sinC
You can modify the function so that the chunk length is specified as a function parameter.
For example
#include <stdio.h>
#include <ctype.h>
void stringSplit( const char s[], size_t n )
{
if ( n )
{
char chunk[n + 1];
size_t i = 0;
for ( ; *s; ++s )
{
if ( !isspace( ( unsigned char )*s ) )
{
chunk[i++] = *s;
if ( i == n )
{
chunk[i] = '\0';
i = 0;
puts( chunk );
}
}
}
if ( i != 0 )
{
chunk[i] = '\0';
puts( chunk );
}
}
}
int main(void)
{
char s[] = " You and I are beginners in C ";
for ( size_t i = 3; i < 10; i++ )
{
stringSplit( s, i );
puts( "" );
}
}
The program output will be
You
and
Iar
ebe
gin
ner
sin
C
Youa
ndIa
rebe
ginn
ersi
nC
Youan
dIare
begin
nersi
nC
Youand
Iarebe
ginner
sinC
YouandI
arebegi
nnersin
C
YouandIa
rebeginn
ersinC
YouandIar
ebeginner
sinC

Related

How to write a function in c that counts words

I had an exam yesterday in which one of the questions was about counting words in a given string.
The definition of word would be a portion of a string that is divided by spaces and/or the beginning/end of the string
I am new to C, and was not able to create a condition where it increases the counter word when you find “space (characters) space”
int count_words(char *str)
int i = 0;
int word = 2;
while (str[i])
{
if (str[i] == ' ')
{
int l = 1;
while (str[i + l]
{
l++;
}
if (l != 1)
{
word++;
}
}
}
For starters the function should be declared like
size_t count_words(const char *str);
The parameter should be declared with the qualifier const because the function does not change the passed string, and the return type should be size_t, the same return type used by, for example, the standard string function strlen.
It is unclear why the variable word in your function is initialized to 2
int word = 2;
Also, the variable i is never changed within the function, so the outer loop never advances.
The function can look the following way as shown in the demonstration program below
#include <ctype.h>
#include <string.h>
#include <stdio.h>
size_t count_words( const char *s )
{
size_t n = 0;
while ( *s )
{
while ( isspace( ( unsigned char )*s ) ) ++s;
if ( *s )
{
++n;
while ( *s && !isspace( ( unsigned char )*s ) ) ++s;
}
}
return n;
}
int main( void )
{
const char *s = "How to write a function in c that counts words";
size_t n = count_words( s );
printf( "The string \"%s\"\ncontains %zu words\n", s, n );
}
The program output is
The string "How to write a function in c that counts words"
contains 10 words
If only the space character ' ' is used as a delimiter, then the header <ctype.h> is not needed and the function will look like
size_t count_words( const char *s )
{
size_t n = 0;
while ( *s )
{
while ( *s == ' ' ) ++s;
if ( *s )
{
++n;
while ( *s && *s != ' ' ) ++s;
}
}
return n;
}
A more general function that can process any delimiters can look the following way
size_t count_words( const char *s, const char *delim )
{
size_t n = 0;
while (*s)
{
s += strspn( s, delim );
if (*s)
{
++n;
s += strcspn( s, delim );
}
}
return n;
}
The function has a second parameter that specifies the delimiters. For example, the function can be called like
size_t n = count_words( s, " \t?!:;,." );
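To illustrate how the delimiter set changes the count, here is the same strspn/strcspn function again in self-contained form. The sample strings in the note below are my own and do not come from the question.

```c
#include <stddef.h>
#include <string.h>

/* Counts maximal runs of characters that are not in `delim`. */
size_t count_words(const char *s, const char *delim)
{
    size_t n = 0;

    while (*s) {
        s += strspn(s, delim);        /* skip leading delimiters */
        if (*s) {
            ++n;                      /* a word starts here */
            s += strcspn(s, delim);   /* skip to the end of the word */
        }
    }
    return n;
}
```

With this definition, count_words("one,two;three four", " ") should give 2, while count_words("one,two;three four", " ,;") should give 4. A string made only of delimiters counts as 0 words.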
Try this way:
#include <ctype.h>
#include <stdio.h>
#include <string.h>
#define MaxBufferSize 50
int main(void){
int count = 0, size;
char str[MaxBufferSize+2] = ""; // zero-initialized so strlen is safe if fgets fails
printf("Enter string(max char 50 only): ");
if(fgets(str,sizeof(str), stdin)) {
str[strcspn(str, "\n")] = '\0'; // drop the newline before measuring the length
if(strlen(str) > MaxBufferSize) {
printf("Max chars exceeded so chars before the limit were used!!!\n");
str[MaxBufferSize] = '\0'; // keep only the first 50 characters
}
}
size = (int)strlen(str);
int i=0;
if(size == 0) printf("Nothing entered\n");
else {
while(i<size){
while(i<size && !isalnum((unsigned char)str[i])) i++;// ignores all spaces and other chars
if(str[i] == '\0') break;
while(i<size && isalnum((unsigned char)str[i])) i++;// includes all words and numbers
count++;
}
printf("Words in string: %d\n",count);
}
return 0;
}
hope it helps...! ;)
#include <stdio.h>
int count_words(char *str) {
int i, count=0;
int in_word = 0; // Flag to track if we're currently inside a word or not
// Loop through each character in the string
for(i=0; str[i]!='\0'; i++) {
// If current character is not a space or newline and we're not already inside a word, increment word count
if((str[i]!=' ' && str[i]!='\n') && !in_word) {
count++;
in_word = 1; // Set flag to indicate we're currently inside a word
}
// If current character is a space or newline, set flag to indicate we're not inside a word
else if(str[i]==' ' || str[i]=='\n') {
in_word = 0;
}
}
return count;
}
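#include <stdio.h>
int main(void)
{
char s[200] = "";
int count = 0, i;
printf("Enter the string:\n");
/* the width limit prevents overflow; an empty line leaves s empty */
if (scanf("%199[^\n]", s) != 1)
s[0] = '\0';
for (i = 0; s[i] != '\0'; i++)
{
/* a word starts at a non-space that is first or follows a space */
if (s[i] != ' ' && (i == 0 || s[i-1] == ' '))
count++;
}
printf("Number of words in given string are: %d\n", count);
return 0;
}
Use this code; it should work fine for words separated by spaces.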

Count and return the number of letters of each word in C

I'm learning C and I've created some small "challenges" for myself to solve. I have to create a program that reads an input string which consists of words separated by underscore and returns the last letter of each odd word followed by the number of chars of that word.
The input won't be empty. The words are separated by exactly 1 underscore. The first and last chars won't be underscores (so no _this_is_a_sentence or this_is_a_sentence_ or _this_is_a_sentence_).
Example:
input: we_had_a_lot_of_rain_in_today
output: e2a1f2n2
Explanation:
We only consider words in an odd position, so we just need to consider: we, a, of and in. Now, for each of those words, we get the last char and append the total number of chars of the word: we has 2 chars, so it becomes e2. a has 1 char, so it becomes a1, of has 2 chars so it becomes f2 and in has 2 chars so it becomes n2.
This is my code so far
#include <stdio.h>
void str_dummy_encrypt(char *sentence)
{
int currentWord = 1;
int totalChars = 0;
for (int i = 0; sentence[i] != '\0'; i++)
{
if (sentence[i] == '_')
{
if (currentWord % 2 != 0)
{
// I know the last char of the word is on sentence[i-1]
// and the total chars for this word is totalChars
// but how to return it in order to be printed?
}
currentWord++;
totalChars = 0;
} else {
totalChars++;
}
}
}
int main()
{
char sentence[100];
while (scanf("%s", sentence) != EOF)
{
str_dummy_encrypt(sentence);
}
return 0;
}
I think I'm on the right path, but I don't have any clue on how to return the result to the main function so it can be printed.
Thanks in advance
... how to return the result (?)
You have a couple choices:
Pass in the destination
Caller provides an ample destination.
void str_dummy_encrypt(size_t dsize, char *destination, const char *sentence)
Allocate and return the destination
Caller should free the returned pointer when done.
char *str_dummy_encrypt(const char *sentence) {
...
char *destination = malloc()
...
return destination;
}
Over-write the source
This one is tricky as code needs to ensure the destination does not get ahead of the source, but I think you are OK given the task requirements, as long as string length > 1.
void str_dummy_encrypt(char *sentence) {
char *destination = sentence;
...
}
Others
Let us go deeper with pass in the destination and return a flag indicating success/error.
Use snprintf() to form the letter-count.
// Return error flag
int str_dummy_encrypt(size_t dsize, char *destination, const char *sentence) {
...
if (currentWord % 2 != 0) {
int len = snprintf(destination, dsize, "%c%d", sentence[i-1], totalChars);
if (len < 0 || (unsigned) len >= dsize) {
// We ran out of room
return -1; // failure
}
// Adjust to append the next encoding.
dsize -= len;
destination += len;
}
...
return 0;
}
Usage
char sentence[100];
char destination[sizeof sentence + 1]; // I think worst case is 1 more than source.
...
if (str_dummy_encrypt(sizeof destination, destination, sentence)) {
puts("Error");
} else {
puts(destination);
}
Code has other issues:
Does not handle an odd number of words correctly like "abc".
Attempts sentence[i-1] with leading _ like "_abc".
Poor input:
No width limit, weak test.
char sentence[100];
// while(scanf("%s", sentence) != EOF)
while(scanf("%99s", sentence) == 1)
Perhaps other issues.
Consider a test like if(sentence[i+1] == '_' || sentence[i+1] == '\0') to detect end of word and avoid 2 issues mentioned above. (Count and other code will need adjusting too.)
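Putting these suggestions together, a possible sketch of the pass-in-the-destination variant could look like this. It is one way to combine the snprintf() approach with the end-of-word test suggested above, not the only one; the early check for dsize == 0 is my own addition.

```c
#include <stddef.h>
#include <stdio.h>

/* Writes the encoding of `sentence` into `destination` (capacity `dsize`).
   Returns 0 on success, -1 if the destination is too small. */
int str_dummy_encrypt(size_t dsize, char *destination, const char *sentence)
{
    int currentWord = 1;
    int totalChars = 0;

    if (dsize == 0)
        return -1;
    destination[0] = '\0';            /* empty result if no odd word is found */

    for (size_t i = 0; sentence[i] != '\0'; i++) {
        if (sentence[i] == '_')
            continue;                 /* separators are not part of a word */
        totalChars++;
        /* end of word: the next character is a separator or the terminator */
        if (sentence[i + 1] == '_' || sentence[i + 1] == '\0') {
            if (currentWord % 2 != 0) {
                int len = snprintf(destination, dsize, "%c%d",
                                   sentence[i], totalChars);
                if (len < 0 || (size_t)len >= dsize)
                    return -1;        /* ran out of room */
                dsize -= (size_t)len;
                destination += len;
            }
            currentWord++;
            totalChars = 0;
        }
    }
    return 0;
}
```

Because the end of a word is detected by looking at the next character, the last word is handled without a trailing underscore, and sentence[i-1] is never read, so a leading underscore is harmless. For "we_had_a_lot_of_rain_in_today" the destination should receive "e2a1f2n2".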
As follows from the description of the task, the function should return a new string that is built based on the format of the passed source string.
It means that you need to dynamically allocate a character array within the function where the result string will be stored.
As the source string is not changed within the function, the function parameter should have the qualifier const.
And you should always write more general functions. This restriction
The words are separated by exactly 1 underscore. The first and last
chars won't be underscores (so no _this_is_a_sentence or
this_is_a_sentence_ or _this_is_a_sentence_)
for the function does not make it general. The function should also be able to process strings like "_this_is_a_sentence_".
Here is a demonstration program that shows how the function can be implemented.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char * str_dummy_encrypt( const char *s )
{
size_t n = 0;
for (const char *p = s; *p; )
{
size_t length = 0;
while (length == 0 && *p)
{
length = strcspn( p, "_" );
if (length == 0) ++p;
}
if (length != 0)
{
p += length;
n += 1 + snprintf( NULL, 0, "%zu", length );
}
length = 0;
while (length == 0 && *p)
{
length = strcspn( p, "_" );
p += length == 0 ? 1 : length;
}
}
char *result = malloc( n + 1 );
if (result != NULL)
{
result[n] = '\0';
if (n != 0)
{
char *current = result;
for (const char *p = s; *p; )
{
size_t length = 0;
while (length == 0 && *p)
{
length = strcspn( p, "_" );
if (length == 0) ++p;
}
if (length != 0)
{
p += length;
*current++ = p[-1];
current += sprintf( current, "%zu", length );
}
length = 0;
while (length == 0 && *p)
{
length = strcspn( p, "_" );
p += length == 0 ? 1 : length;
}
}
}
}
return result;
}
int main( void )
{
const char *s = "_we__had___a_lot_of_rain_in_today___";
char *result = str_dummy_encrypt( s );
if (result != NULL) puts( result );
free( result );
}
The program output is
e2a1f2n2
The same output is produced for the string shown in your question, that is "we_had_a_lot_of_rain_in_today".
The function would be more general if it had one more parameter that specifies the delimiter, as in
char * str_dummy_encrypt( const char *s, char c );
Or, since the implementation shown uses the standard C string function strcspn, the function could accept a set of delimiters like
char * str_dummy_encrypt( const char *s, const char *delimiters );
You do not actually need to read the word into a buffer, you can just read one character at a time and keep track of the last char, the word number and its length:
#include <stdio.h>
int main() {
int c, lastc = ' ', n = 1, len = 0;
for (;;) {
c = getchar();
if (c == '_' || c == '\n' || c == EOF) {
if (n & 1) {
printf("%c%d", lastc, len);
}
n++;
len = 0;
if (c != '_')
break;
} else {
lastc = c;
len++;
}
}
printf("\n");
return 0;
}

Is this a good way to remove spaces from a string in C? note, I am only using the <stdio.h> header file. Also I am new to computer science

int remove_whitespace(char* string, int size) /* input parameters: the character array, size of that array */
{
int i = 0, j = 0, num = 0; /* i - counter variable | j - counter variable | num - number of white spaces */
for (i; i < size; i++)
{
if (string[i] == ' ')
{
for (j = i; j < size - 1; j++)
{
string[j] = string[j + 1];
}
num++;
}
}
return num; /* returns the number of white spaces */
}
Count the 5 variables used by the function presented. Five! Are they all being incremented correctly? A pair of nested for() loops and an if() conditional. Are the indexes the correct index for the purpose??
Depending on the parameters passed, will whitespace be compacted but leave trailing characters in the array? Is this why the caller receives the "shrinkage" quantity?
Your title suggests "string", yet the code presented indicates an array of characters (without consideration for a C string terminating '\0'.) Taking the approach that you want to remove SP's (0x20) from a null terminated C string, the following will do that:
void stripSP( char *s ) {
for( char *d = s; (*d = *s) != '\0'; s++ ) d += *d != ' ';
}
One line of code utilising two pointers.
You could pass/use a second parameter that is the character to be expunged, or change the d += *d != ' '; to d += !isspace((unsigned char)*d); (with <ctype.h> included) to compact-out any whitespace characters (incl. '\t', '\r' & '\n').
Obviously, this leads to possibilities of stripping out punctuation, or whatever you'd like removed. If the caller needs to know the new length of the string, the caller can invoke strlen().
It's one little line of source code, amenable to customising to suit your needs. For instance, the function could preserve and return the start address for chaining function calls.
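As a rough sketch of those two customisations together, the variant below removes every whitespace character and returns the start address so that calls can be chained. The name strip_whitespace is my own, and the cast to unsigned char is there because passing a negative char value to isspace() is undefined.

```c
#include <ctype.h>

/* Removes all whitespace characters from `s` in place and returns `s`. */
char *strip_whitespace(char *s)
{
    char *start = s;

    /* Copy each character to the write position d; d only advances
       when the character just copied is not whitespace. */
    for (char *d = s; (*d = *s) != '\0'; s++)
        d += !isspace((unsigned char)*d);
    return start;
}
```

Since the function returns its argument, it can be used directly in an expression such as puts(strip_whitespace(buffer)), where buffer is a modifiable array and not a string literal.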
For starters the function should be declared like
size_t remove_char( char *string, size_t size, char c );
That is, you should not have to write a separate function if you want to remove a character other than the space character ' '. Always try to write more general functions.
Also the function should return the size of the character array without the removed character.
Using nested for loops when all characters after the removed character are moved to the left is inefficient.
The function can be defined the following way
size_t remove_char( char *string, size_t size, char c )
{
size_t n = 0;
for ( size_t i = 0; i < size; i++ )
{
if ( string[i] != c )
{
if ( n != i ) string[n] = string[i];
++n;
}
}
return n;
}
Here is a demonstration program.
#include <stdio.h>
size_t remove_char( char *string, size_t size, char c )
{
size_t n = 0;
for ( size_t i = 0; i < size; i++ )
{
if ( string[i] != c )
{
if ( n != i ) string[n] = string[i];
++n;
}
}
return n;
}
int main( void )
{
char s[10] = "H e l l o";
size_t n = remove_char( s, sizeof( s ), ' ' );
printf( "%zu: %s\n", n - 1, s );
}
The program output is
5: Hello
If you want to remove a character from a string then the function can be declared like
char * remove_char( char *string, char c );
because you can determine the length of a string within the function by finding the terminating zero character '\0'.
Here is a demonstration program
#include <stdio.h>
char * remove_char( char* string, char c )
{
if ( *string )
{
char *dsn = string, *src = string;
do
{
if ( *src != c )
{
if ( dsn != src ) *dsn = *src;
++dsn;
}
} while ( *src++ );
}
return string;
}
int main( void )
{
char s[10] = "H e l l o";
puts( remove_char( s, ' ' ) );
}
The program output is
Hello

problem when trying to store the longest word in a string into another string?

This question builds on my previous one: to solve this exercise, I first needed to ask a standalone question, "little question about a thing when it comes to dynamically allocate a string, how can I solve?". (I mention it because the problems are in the heap.)
This is the exercise:
Write a function that finds the longest word in a string and returns it as another string (dynamically allocated on the heap). (A word is defined as a sequence of alphanumeric characters without whitespace.)
This is my code:
#include <stdlib.h>
#include <ctype.h>
#include <string.h>
char* longest_word(const char* sz) {
size_t length = 0;
for (size_t i = 0; sz[i] != 0; i++) {
if (isspace(sz[i])) {
length = 0;
}
else {
length++;
}
}
size_t sum = length + 1;
char* str = malloc(sum);
if (str == NULL) {
return NULL;
}
size_t stringlength = strlen(sz);
size_t sl = stringlength - (sum - 1);
for (size_t i = sl; sz[i] != 0; i++) {
str[i] = sz[i];
}
str[sum - 1] = 0;
return str;
}
int main(void) {
char sz[] = "widdewdw ededudeide sjfsdhiuodsfhuiodfihuodsfihuodsihuodsihuosdihuquesto";
char* str;
str = longest_word(sz);
free(str);
return 0;
}
the final string is the following: "ÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍsjfsdhiuodsfhuiodfihuodsfihuodsi".
This is a good sign, because it means that my thinking process was right (although not entirely).
Here's a detailed explanation:
Find the length of the longest word; if the current character is whitespace, start counting from zero again. This works.
Allocate enough space to store each character, plus the zero terminator. (I've used size_t because of the accepted answer to the linked question.)
Here's the critical part: "sz[i]" is the i-th position in the original string (i.e. "sz"). I start counting from sz[i].
I copy each character into str[i] until the zero terminator is reached.
At the end, I place 0 in str[sum-1] (not str[sum], because I tried that and it turned out to be a buffer overflow).
The function is incorrect.
This for loop
size_t length = 0;
for (size_t i = 0; sz[i] != 0; i++) {
if (isspace(sz[i])) {
length = 0;
}
else {
length++;
}
}
does not find the maximum length of the words in the string. It just yields the last calculated value of the variable length, that is, the length of the last word. For example, if the string ends with a space, then the value of length after the loop will be equal to 0.
And this for loop
for (size_t i = sl; sz[i] != 0; i++) {
str[i] = sz[i];
}
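copies the tail of the string rather than the word with the maximum length. Moreover, it writes to str[i] with i starting at sl instead of 0, so the first sl characters of str are left uninitialized (the garbage at the start of your output) and, when sl is greater than 1, the writes go past the end of the allocated array.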
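A minimal sketch that keeps the question's character-by-character loop but tracks the maximum and where it starts might look like this. The name longest_word_scan and the extra variables max_len and max_start are my own; like the question's code, it treats whitespace as the only separator.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Returns a heap-allocated copy of the longest whitespace-separated word
   in `sz`, or NULL if the allocation fails. The caller frees the result. */
char *longest_word_scan(const char *sz)
{
    size_t length = 0;      /* length of the word currently being scanned */
    size_t max_len = 0;     /* longest length seen so far */
    size_t max_start = 0;   /* index in sz where that longest word begins */

    for (size_t i = 0; sz[i] != '\0'; i++) {
        if (isspace((unsigned char)sz[i])) {
            length = 0;
        } else {
            length++;
            if (length > max_len) {
                max_len = length;
                max_start = i + 1 - length;
            }
        }
    }

    char *str = malloc(max_len + 1);
    if (str == NULL)
        return NULL;
    memcpy(str, sz + max_start, max_len);   /* destination starts at index 0 */
    str[max_len] = '\0';
    return str;
}
```

The source is read starting at max_start while the destination is written starting at index 0, which avoids both the uninitialized prefix and the out-of-bounds writes. When two words have the same length, this sketch keeps the first one.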
The function can be defined the following way as it is shown in the demonstration program below.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char * longest_word( const char *s )
{
const char *delim = " \t";
size_t max_n = 0;
const char *max_word = s;
for ( const char *p = s; *p; )
{
p += strspn( p, delim );
if ( *p )
{
const char *q = p;
p += strcspn( p, delim );
size_t n = p - q;
if ( max_n < n )
{
max_n = n;
max_word = q;
}
}
}
char *result = malloc( max_n + 1 );
if ( result != NULL )
{
result[max_n] = '\0';
memcpy( result, max_word, max_n );
}
return result;
}
int main( void )
{
const char *s = "Find the longest word";
char *p = longest_word( s );
if ( p ) puts( p );
free( p );
}
The program output is
longest

Replacing several characters in string with one in C

I need to replace several characters with one (depending if their count is even or odd). If it's even i should replace + with P, if it's odd with p.
Input: kjlz++zux+++
while(p[i])
{
j=i;
k=i;
length=strlen(p);
if(p[i]=='*')
{
position=i;
}
printf("Position is: %d", position);
while(p[j]=='*')
{
counter++;
j++;
}
}
Output: kjlzPzuxp
I'm not sure how to remove several characters; I only know how to input one.
Basically you can leave the text variable intact until you find a +. In that case you start counting how many consecutive plusses there are. Once you know this, it can be decided if you should add a letter P or p. Keep a separate index to write back to your text variable! Otherwise it would start writing to the wrong index after 2 or 3 plusses are found, try to figure out why ;).
#include <stdio.h>
#include <stdlib.h>
int main (void)
{
char text[] = "kjlz++zux+++";
int len = sizeof(text) / sizeof(text[0]);
int index = 0, count = 0;
for(int i = 0; i < len; i++)
{
if(text[i] == '+')
{
count = 0;
while(text[i] == '+') i++, count++;
i--;
text[index++] = count % 2 ? 'p' : 'P';
}
else
{
text[index++] = text[i];
}
}
text[index] = 0;
printf("%s\n", text);
}
You could allocate space for the text variable with malloc so that you can use realloc afterwards to shrink the array to the size of the output text. This way some memory is saved, this is especially important when you start working with bigger chunks of data.
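A sketch of that malloc/realloc idea, under the assumption that the input arrives as a read-only string and the result is returned to the caller, who then frees it. The function name replace_pluses_alloc is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a heap-allocated copy of `input` in which each run of '+' is
   replaced by 'p' (odd run length) or 'P' (even run length).
   Returns NULL if the initial allocation fails. */
char *replace_pluses_alloc(const char *input)
{
    size_t len = strlen(input);
    char *text = malloc(len + 1);      /* the output is never longer than the input */
    if (text == NULL)
        return NULL;

    size_t index = 0;                  /* separate write index */
    for (size_t i = 0; i < len; ) {
        if (input[i] == '+') {
            size_t count = 0;
            while (input[i] == '+') { i++; count++; }
            text[index++] = (count % 2) ? 'p' : 'P';
        } else {
            text[index++] = input[i++];
        }
    }
    text[index] = '\0';

    /* Shrink to the size actually used; if realloc fails, the original
       block is still valid, so fall back to it. */
    char *shrunk = realloc(text, index + 1);
    return shrunk != NULL ? shrunk : text;
}
```

For the input "kjlz++zux+++" this should return "kjlzPzuxp". For a string this short the memory saved is negligible; the shrink step only starts to matter with larger inputs.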
If I have understood correctly you do not know how to implement a corresponding function.
It can look the following way as it is shown in the demonstrative program.
#include <stdio.h>
char * replace_pluses( char *s )
{
const char plus = '+';
const char odd_plus = 'p';
const char even_plus = 'P';
char *dsn = s;
for ( char *src = s; *src; )
{
if ( *src == plus )
{
int odd = 1;
while ( *++src == plus ) odd ^= 1;
*dsn++ = odd ? odd_plus : even_plus;
}
else
{
if ( dsn != src ) *dsn = *src;
++dsn;
++src;
}
}
*dsn = '\0';
return s;
}
int main(void)
{
char s[] = "kjlz++zux+++";
puts( s );
puts( replace_pluses( s ) );
return 0;
}
The program output is
kjlz++zux+++
kjlzPzuxp
Or you can write a more generic function like this
#include <stdio.h>
char * replace_odd_even_duplicates( char *s, char c1, char c2, char c3 )
{
char *dsn = s;
for ( char *src = s; *src; )
{
if ( *src == c1 )
{
int odd = 1;
while ( *++src == c1 ) odd ^= 1;
*dsn++ = odd ? c2 : c3;
}
else
{
if ( dsn != src ) *dsn = *src;
++dsn;
++src;
}
}
*dsn = '\0';
return s;
}
int main(void)
{
char s[] = "kjlz++zux+++";
puts( s );
puts( replace_odd_even_duplicates( s, '+', 'p', 'P' ) );
return 0;
}

Resources