Issues with Pointer Arithmetic - Trying to tokenize input String

I am working on a program that lets a user enter a string, tokenizes it, and then prints the tokens using an array of pointers. It is "supposed" to do this by calling my tokenize function, which reads the input string until the first separator (' ', ',', '.', '?', '!'), replaces that separator with a null character ('\0'), and returns a pointer to the next character in the string.
In main, after the string has been input, the program should keep calling tokenize() and store each returned pointer in an array of pointers so the tokens can be printed later. Once tokenize() returns a pointer to the null character at the end of the string, the loop ends, and I print the tokens using the array of pointers.
//trying to be detailed
#include <stdio.h>
#include <string.h>
char *tokenize ( char *text, const char *separators );
int main ( void )
{
char text[30];
char separators[6] = { ' ','.',',','?','!','\0'};
char *pch = NULL;
int tokens[15];
int i = 0;
int j = 0;
printf("Enter a string: \n");
fgets( text, 30, stdin );
printf("%s", text );
pch = tokenize ( text, separators );
do
{
pch = tokenize ( pch, separators );
//printf("%c", *pch);
tokens[i] = pch;
i++;
}
while( *pch != NULL );
i--;
while( j != i )
{
printf("%s", tokens[i] );
j++;
}
return 0;
}
char *tokenize ( char *text, const char *separators )
{
while( text != NULL )
{
if( text != NULL )
{
while( separators != NULL )
{
if( text == separators )
{
text = '\0';
}
separators++;
}
}
text++;
}
return text;
}
Three big known problems currently:
1. When I compile and run it, it reads the string, prints it, and then gets stuck in an endless loop with nothing printing, still trying to get input.
2. I'm pretty sure I am using the " * " for my pointers in the wrong place.
3. My function passes in a reference to my arrays, so I assumed I could just increment them as is.
I appreciate any feedback! I will be watching this post constantly. If I left something unclear, I can clarify. Thanks.

You had the right idea for approaching the problem, but there are numerous pointer/int errors throughout your code. Make sure you compile with warnings enabled; the compiler will then tell you where the problems are. (Don't expect your code to run correctly until you have addressed and eliminated all warnings.) At a minimum, compile with the -Wall -Wextra options in your build command.
There are much easier ways to do this, but as a learning experience it is a great exercise. Below is your code with the errors corrected. Where possible, I have left your original code commented out so you can see where the issues were. I have also included a bit of code to remove the newline that fgets leaves at the end of text. While this isn't required, it is good practice not to let stray newlines filter through your code.
Let me know if you have questions:
#include <stdio.h>
#include <string.h>
char *tokenize ( char *text, const char *separators );
int main ( void )
{
char text[30];
char separators[6] = { ' ','.',',','?','!','\0'};
char *pch = NULL;
char *tokens[15] = {0}; /* declare array of pointers */
int i = 0;
int j = 0;
printf("Enter a string: \n");
if (fgets( text, 30, stdin ) == NULL) /* no input: nothing to tokenize */
return 1;
size_t len = strlen (text);
if (len > 0 && text[len-1] == '\n') /* strip newline from text */
text[--len] = 0;
pch = text; /* pch pointer to next string */
char *str = text; /* str pointer to current */
do
{
pch = tokenize ( str, separators ); /* pch points to next */
tokens[i++] = str; /* save ptr to token */
str = pch; /* new start of str */
}
while (i < 14 && *pch != 0); /* stop at end of text; tokens[14] stays NULL as the sentinel */
printf ("\nTokens collected:\n\n");
while (tokens[j]) /* print each token */
{
printf(" token[%d]: %s\n", j, tokens[j] );
j++;
}
printf ("\n");
return 0;
}
char *tokenize ( char *text, const char *separators )
{
const char *s = separators; /* must use pointer to allow reset */
//while( text != NULL )
while( *text != '\0' )
{
s = separators; /* reset s */
while( *s != 0 ) /* 0 is the same as '\0' */
{
//if( text == separators )
if( *text == *s )
{
//text = '\0';
*text = '\0';
return ++text;
}
s++;
}
text++;
}
return text;
}
Example output:
$ ./bin/tokenizestr
Enter a string:
This is a test string
Tokens collected:
token[0]: This
token[1]: is
token[2]: a
token[3]: test
token[4]: string
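One thing to be aware of with the corrected code: two adjacent separators (for example the ", " in "Hello, world") produce an empty token, because each separator ends a token on its own. If that is not what you want, a minimal sketch along the following lines skips runs of separators using the standard strspn() and strcspn() functions. The name split_tokens and its parameters are my own, not part of the code above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the answer above): split text in place,
   store up to max token pointers in out, and return how many were stored.
   Runs of separators count as a single separator. */
size_t split_tokens(char *text, const char *seps, char **out, size_t max)
{
    size_t n = 0;

    assert(text != NULL && seps != NULL && out != NULL);

    text += strspn(text, seps);          /* skip leading separators */
    while (*text != '\0' && n < max) {
        out[n++] = text;                 /* a token starts here */
        text += strcspn(text, seps);     /* advance to the end of the token */
        if (*text == '\0')
            break;                       /* last token is already terminated */
        *text++ = '\0';                  /* terminate it, step past the separator */
        text += strspn(text, seps);      /* skip any further separators */
    }
    return n;
}
```

Call it with the buffer filled by fgets, the separator string (with '\n' added if you do not strip the newline), and the tokens array together with its size; the return value tells you how many entries to print, so no NULL sentinel is needed.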

You may want to take a look at strsep and the post "Split string with delimiters in C".
If you need more reference points, try searching for "split string"; if I understood correctly, that is what you want to do.

Related

how to determine last words of each string?

I have this assignment:
Enter a sequence of sentences from the keyboard into the string array (the end of entering - empty string). Determine the last word of each of these sentences.
The problem is that my program outputs the last word of the last sentence, and I need the last word of each sentence to be output.
Program I have tried:
#include <stdio.h>
#include <string.h>
int main() {
char str[10][100];
int i;
printf("Enter a sequence of sentences:\n");
for (i = 0; i < 10; i++) {
if (*gets(str) == '\0')
break;
}
printf("The last word of each of these sentences is:\n");
for (i = 0; i < 10; i++) {
char *word;
word = strtok(str[i], ".");
while (word != NULL) {
char *last_word = word;
word = strtok(NULL, ".");
}
printf("%s\n", last_word);
}
return 0;
}

The delimiter in this call
word = strtok(str[i], ".");
does not make sense.
It seems you mean
word = strtok(str[i], " \t.");
provided that a sentence can be ended only with a dot and words are separated by spaces or tab characters.
Another problem is that the variable last_word must be declared before the while loop.
For example
char *last_word = NULL;
char *word;
word = strtok(str[i], " \t.");
while (word != NULL) {
last_word = word;
word = strtok(NULL, " \t.");
}
It is also better to use a for loop instead of the while loop, because it limits the scope of word to the loop:
char *last_word = NULL;
for ( char *word = strtok(str[i], " \t." );
word != NULL;
word = strtok(NULL, " \t.") )
{
last_word = word;
}
Note that the function gets is unsafe and is no longer part of the C Standard (it was removed in C11). Use the standard C function fgets instead.
And the condition in the second for loop
for(i=0; i<10; i++)
{
char *word;
//...
is incorrect because the user can enter fewer than 10 sentences.
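Putting the points above together, a minimal sketch might look like the following. The helper name last_word_of is hypothetical, and I have added '\n' to the delimiters on the assumption that the sentences are read with fgets, which keeps the newline.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the last word of sentence, or NULL if it
   contains no words. It modifies sentence in place because strtok()
   overwrites delimiters with '\0'. */
char *last_word_of(char *sentence)
{
    char *last_word = NULL;

    assert(sentence != NULL);
    for (char *word = strtok(sentence, " \t.\n");
         word != NULL;
         word = strtok(NULL, " \t.\n"))
    {
        last_word = word;   /* remember the most recent word seen */
    }
    return last_word;
}
```

In main, count how many sentences were actually read before the empty line and run the second loop up to that count rather than up to 10, calling last_word_of(str[i]) for each one.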

Without repeating the commentary of the accepted answer provided by @Vlad (kudos!), here is an alternative offering (with comments):
#include <stdio.h>
#include <string.h>
int main( void ) {
// A single large buffer allowing very long lines to be entered.
char buf[ 10 * 100 ], *p = buf;
size_t left = sizeof buf;
int i = 0;
// up to 10 'lines' of input, breaking on an empty line, too
while( i++ < 10 && fgets( p, left, stdin ) && p[0] != '\n' ) {
// typical invocation of strtok() to isolate "words"
// and a "wasteful" copy of each word to the current start of the buffer
for( char *tkn = p; ( tkn = strtok( tkn, " .\n" ) ) != NULL; tkn = NULL )
strcpy( p, tkn );
// having copied the last "word", append '\n' and advance the pointer
size_t len = strlen( p );
p += len;
strcpy( p++, "\n" );
left -= len + 1; // eroding the available size of the buffer
}
printf( "%s", buf ); // a single output of "word1\nword2\nword3\n..."
return 0;
}
NB: calling strcpy() on overlapping buffers is fraught with hazards; the C standard makes it undefined behavior. It happens to work in this case, but the practice and its effects must be considered very carefully before using this technique.
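If the overlapping copy is a concern, one option is to locate the last word without copying or modifying anything. The sketch below scans backwards from the end of the line; the function name last_word_span and the separator set are my own assumptions, not part of the program above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: find the last word of s without modifying s.
   Returns a pointer to its first character and stores its length in *len
   (0 if s contains no word). The word is NOT null-terminated. */
const char *last_word_span(const char *s, size_t *len)
{
    static const char seps[] = " \t.\n";
    const char *end;
    const char *start;

    assert(s != NULL && len != NULL);

    end = s + strlen(s);
    while (end > s && strchr(seps, end[-1]) != NULL)
        end--;                           /* drop trailing separators */
    start = end;
    while (start > s && strchr(seps, start[-1]) == NULL)
        start--;                         /* walk back to the word's start */

    *len = (size_t)(end - start);
    return start;
}
```

The result can be printed with printf("%.*s\n", (int)len, word), which is the usual way to print a substring that has no terminator of its own.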

Searching an array for a specific character [duplicate]

I want to write a program in C that displays each word of a whole sentence (taken as input) on a separate line. This is what I have done so far:
void manipulate(char *buffer);
int get_words(char *buffer);
int main(){
char buff[100];
printf("sizeof %d\nstrlen %d\n", sizeof(buff), strlen(buff)); // Debugging reasons
bzero(buff, sizeof(buff));
printf("Give me the text:\n");
fgets(buff, sizeof(buff), stdin);
manipulate(buff);
return 0;
}
int get_words(char *buffer){ // Function that gets the word count, by counting the spaces.
int count;
int wordcount = 0;
char ch;
for (count = 0; count < strlen(buffer); count ++){
ch = buffer[count];
if((isblank(ch)) || (buffer[count] == '\0')){ // if the character is blank, or null byte add 1 to the wordcounter
wordcount += 1;
}
}
printf("%d\n\n", wordcount);
return wordcount;
}
void manipulate(char *buffer){
int words = get_words(buffer);
char *newbuff[words];
char *ptr;
int count = 0;
int count2 = 0;
char ch = '\n';
ptr = buffer;
bzero(newbuff, sizeof(newbuff));
for (count = 0; count < 100; count ++){
ch = buffer[count];
if (isblank(ch) || buffer[count] == '\0'){
buffer[count] = '\0';
if((newbuff[count2] = (char *)malloc(strlen(buffer))) == NULL) {
printf("MALLOC ERROR!\n");
exit(-1);
}
strcpy(newbuff[count2], ptr);
printf("\n%s\n",newbuff[count2]);
ptr = &buffer[count + 1];
count2 ++;
}
}
}
Although the output is what I want, I get a lot of blank space after the final word, and malloc() returns NULL, so MALLOC ERROR! is displayed at the end.
I understand that there is a mistake in how I use malloc(), but I do not know what it is.
Is there a more elegant or generally better way to do it?

http://www.cplusplus.com/reference/clibrary/cstring/strtok/
Take a look at this, and use whitespace characters as the delimiter. If you need more hints let me know.
From the website:
char * strtok ( char * str, const char * delimiters );
On a first call, the function expects a C string as argument for str, whose first character is used as the starting location to scan for tokens. In subsequent calls, the function expects a null pointer and uses the position right after the end of last token as the new starting location for scanning.
Once the terminating null character of str is found in a call to strtok, all subsequent calls to this function (with a null pointer as the first argument) return a null pointer.
Parameters
str
C string to truncate.
Notice that this string is modified by being broken into smaller strings (tokens).
Alternativelly [sic], a null pointer may be specified, in which case the function continues scanning where a previous successful call to the function ended.
delimiters
C string containing the delimiter characters.
These may vary from one call to another.
Return Value
A pointer to the last token found in string.
A null pointer is returned if there are no tokens left to retrieve.
Example
/* strtok example */
#include <stdio.h>
#include <string.h>
int main ()
{
char str[] ="- This, a sample string.";
char * pch;
printf ("Splitting string \"%s\" into tokens:\n",str);
pch = strtok (str," ,.-");
while (pch != NULL)
{
printf ("%s\n",pch);
pch = strtok (NULL, " ,.-");
}
return 0;
}

For the fun of it here's an implementation based on the callback approach:
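#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Predicates with a uniform signature; the cast keeps isspace() well
   defined when plain char is signed. */
static int is_space(int c)     { return isspace((unsigned char)c); }
static int is_not_space(int c) { return !isspace((unsigned char)c); }

/* Return the first position in [s, e) where pred holds, or e if none. */
const char* find(const char* s,
                 const char* e,
                 int (*pred)(int))
{
    while( s != e && !pred(*s) ) ++s;
    return s;
}
void split_on_ws(const char* s,
                 const char* e,
                 void (*callback)(const char*, const char*))
{
    /* skip leading whitespace so the first callback is never empty */
    const char* p = s = find(s, e, is_not_space);
    while( s != e ) {
        s = find(s, e, is_space);
        callback(p, s);
        p = s = find(s, e, is_not_space);
    }
}
void handle_word(const char* s, const char* e)
{
    // handle the word that starts at s and ends just before e
    printf("%.*s\n", (int)(e - s), s);
}
int main(void)
{
    const char some_str[] = "split this into words";
    split_on_ws(some_str, some_str + strlen(some_str), handle_word);
    return 0;
}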

malloc(0) may (optionally) return NULL, depending on the implementation. Do you realize why you may be calling malloc(0)? Or more precisely, do you see where you are reading and writing beyond the size of your arrays?

Consider using strtok_r, as others have suggested, or something like:
void printWords(const char *string) {
// Make a local copy of the string that we can manipulate.
char * const copy = strdup(string);
if (copy == NULL) return; // strdup failed; nothing to print
char *space = copy;
// Find the next space in the string, and replace it with a newline.
while ((space = strchr(space,' ')) != NULL) *space = '\n';
// There are no more spaces in the string; print out our modified copy.
printf("%s\n", copy);
// Free our local copy
free(copy);
}

One thing going wrong is that get_words() always returns one less than the actual word count, so eventually you attempt to:
char *newbuff[words]; /* Words is one less than the actual number,
so this is declared to be too small. */
newbuff[count2] = (char *)malloc(strlen(buffer))
count2, eventually, is always one more than the number of elements you've declared for newbuff[]. Why malloc() isn't returning a valid ptr, though, I don't know.
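As a sketch of one way to get a count that matches the number of words, you can count the places where a word starts instead of counting blanks. The function name count_words is hypothetical; this is an illustration, not a drop-in replacement for the rest of the program.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical replacement for get_words(): counts word starts, so
   leading, trailing and repeated whitespace do not skew the result. */
size_t count_words(const char *s)
{
    size_t words = 0;
    int in_word = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (isspace((unsigned char)*s)) {
            in_word = 0;                 /* left the current word, if any */
        } else if (!in_word) {
            in_word = 1;                 /* first character of a new word */
            words++;
        }
    }
    return words;
}
```

Because the trailing '\n' left by fgets counts as whitespace here, a line such as "one two three\n" yields 3.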

You should be malloc'ing strlen(ptr) + 1 (the extra byte holds the terminating '\0'), not strlen(buffer). Also, your count2 should be limited to the number of words. When you get to the end of your string, you keep going over the zeros in the rest of your buffer and adding zero-size strings to your array.
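To illustrate the sizing, here is a minimal sketch of copying one word into freshly allocated memory. The helper name copy_word is my own; the caller is assumed to pass a length no larger than the number of characters available at s.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a heap copy of the first n characters of s
   as a null-terminated string, or NULL if malloc fails.
   The caller is responsible for calling free() on the result. */
char *copy_word(const char *s, size_t n)
{
    char *copy;

    assert(s != NULL);
    copy = malloc(n + 1);                /* +1 for the terminating '\0' */
    if (copy == NULL)
        return NULL;
    memcpy(copy, s, n);
    copy[n] = '\0';
    return copy;
}
```

Since n + 1 is never zero, this also avoids the malloc(0) case mentioned in another answer.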

Just as an idea of a different style of string manipulation in C, here's an example which does not modify the source string, and does not use malloc. To find spaces I use the libc function strpbrk.
int print_words(const char *string, FILE *f)
{
static const char space_characters[] = " \t";
const char *next_space;
// Find the next space in the string
//
while ((next_space = strpbrk(string, space_characters)))
{
const char *p;
// If there are non-space characters between what we found
// and what we started from, print them.
//
if (next_space != string)
{
for (p=string; p<next_space; p++)
{
if(fputc(*p, f) == EOF)
{
return -1;
}
}
// Print a newline
//
if (fputc('\n', f) == EOF)
{
return -1;
}
}
// Advance next_space until we hit a non-space character
//
while (*next_space && strchr(space_characters, *next_space))
{
next_space++;
}
// Advance the string
//
string = next_space;
}
// Handle the case where there are no spaces left in the string
//
if (*string)
{
if (fprintf(f, "%s\n", string) < 0)
{
return -1;
}
}
return 0;
}

You can scan the char array looking for the delimiter: if you find it, print a newline; otherwise print the character.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
char *s;
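s = malloc(1024 * sizeof(char));
if (s == NULL || scanf("%1023[^\n]", s) != 1) { /* allocation or read failed */
free(s);
return 1;
}
char *t = realloc(s, strlen(s) + 1); /* shrink to fit; keep s if realloc fails */
if (t != NULL) s = t;
int len = strlen(s);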
char delim =' ';
for(int i = 0; i < len; i++) {
if(s[i] == delim) {
printf("\n");
}
else {
printf("%c", s[i]);
}
}
free(s);
return 0;
}

char arr[50] = "";
fgets(arr, sizeof arr, stdin); /* fgets instead of gets(), which is unsafe and was removed in C11 */
int i,l;
l=strlen(arr);
for(i=0;i<l;i++){
if(arr[i]==' '){
printf("\n");
}
else
printf("%c",arr[i]);
}

String parsing in C... strtok() doesn't quite get it done

OK... first question so please forgive me if it isn't quite understandable the first go.
I am attempting to parse a string input to stdin through a couple of different conditions.
Example input string: move this into "tokens that I need" \n
I would like to parse this into tokens as:
Token 1 = move
Token 2 = this
Token 3 = into
Token 4 = tokens that I need
Tokens are separated by whitespace (easy enough) until a quote is encountered; then everything between the opening and closing quotes is treated as a single token.
I've tried several different methods, but I unfortunately feel that I may be in over my head here so any help would be greatly appreciated.
My latest attempt:
fgets(input, BUFLEN, stdin); //gets the input
input[strlen(input)-1] = '\0';//removes the new line
printf("Input string = %s\n",input);//Just prints it out for me to see
char *token = strtok(input,delim);//Tokenizes the input, which unfortunately does not do what I need. delim is just my string of delimiters which currently only has a " " in it.
I tried to scan through the string one character at a time and then place those characters into arrays so that I could have them as I wanted, but that failed miserably.

The ultimate solution, with a customized my_strtok_r, is here. It has an advantage over a solution based on the non-reentrant strtok:
my_strtok_r is re-entrant: you can call it from multiple threads simultaneously, or in nested loops, et cetera.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char * my_strtok_r(char *s, const char *delim1, const char *delim2, char **save_ptr)
{
char *end;
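int delim2found = 0;
if (s == NULL)
s = *save_ptr;
if (*s == '\0'){
*save_ptr = s;
return NULL;
}
/* Skip leading delim1 characters first, then check whether the token
   opens with a delim2 character (a quote). Comparing two strspn()
   counts instead would miss a quote that follows two or more spaces. */
s += strspn (s, delim1);
if (*s != '\0' && strchr (delim2, *s) != NULL){
s++; /* skip the opening quote */
delim2found = 1;
}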
if (*s == '\0'){
*save_ptr = s;
return NULL;
}
/* Find the end of the token. */
if(delim2found)
end = s + strcspn (s, delim2);
else
end = s + strcspn (s, delim1);
if (*end == '\0') {
*save_ptr = end;
return s;
}
/* Terminate the token and make *save_ptr point past it. */
*end = '\0';
*save_ptr = end + 1;
return s;
}
int main (void)
{
char str[] = " 123 abc \"SPLITTING WORKS\" yes! \"GREAT WE HAVE A SOLUTION\" ! ";
char *d1 = " ";
char *d2 = "\"";
char *token;
char *rest = str;
char array[20][80];
printf ("Splitting string \"%s\" into tokens:\n",str);
size_t nr_of_tokens = 0;
while (nr_of_tokens < 20 && (token = my_strtok_r(rest, d1, d2, &rest))) /* stay within array[20] */
{
strcpy (array[nr_of_tokens], token);
nr_of_tokens++;
}
for(size_t i=0; i < nr_of_tokens; i++)
printf ("%s\n",array[i]);
return 0;
}
Test:
Splitting string " 123 abc "SPLITTING WORKS" yes! "GREAT WE HAVE A SOLUTION" ! " into tokens:
123
abc
SPLITTING WORKS
yes!
GREAT WE HAVE A SOLUTION
!
This is another solution (fully tested) which you can use. You can mix any number of tokens delimited by white spaces and '\"'. It can be configured to your needs. Extensive explanations are given in the code itself.
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
#include <ctype.h>
char *get_str_segment(char *output_str, char *input_str, char extDel)
{
/*
Purpose:
Copy the first segment of 'input_str' to 'output_str'.
Two types of delimiters are used to extract the segment:
1. white space
2. 'extDel' -
do not pass white space or '\0' here!
(typically '"' = quote!)
'extDel' allows us to put white spaces inside the segment.
Notice that 'extDel' cannot be embedded inside the segment!
This makes 'extDel' a special character which will never appear
in 'output_str'! The first appearance of 'extDel' starts a new
segment!
Notice that an unbalanced 'extDel' will cause the whole string from
that point on to be copied to the destination!
Return:
Pointer to the first character after the segment
or NULL !!!
We do not allow **empty** segments with an unbalanced 'extDel':
if 'extDel' is unbalanced, the segment has to have at least one character!
It can be white space!
Notice!
"get_str_segment()" on strings filled with white spaces
and on empty strings will return *** NULL *** to indicate that
no conclusive segment has been found!
Example:
input_str = " qwerty"123 45" "samuel" G7 "
output_str = ""
// Call:
char *ptr = get_str_segment(output_str,input_str,'"');
Result:
input_str = " qwerty"123 45" "samuel" G7 "
^
|
ptr----------------------.
output_str = "qwerty"
*/
char *s = input_str;
char *d = output_str;
char i = 0;
if(!s) return NULL; // rule #1 our code never breaks!
if(!d) return NULL;
// eliminate white spaces from front of the block
while(1)
{
if ( *s == '\0')
{
*d = '\0' ; // end the output string
return (NULL) ; // return NULL to indicate that no
// copying has been done.
//
//
// "get_str_segment()" on
// strings filled with white spaces
// and empty strings
// will return NULL to indicate that
// no conclusive segment has been found
//
}
if (isspace(*s)) ++s; // move pointer to next char
else break; // break the loop!
}
// we found first non white character!
if( *s != extDel)
{
// copy block up to end of string first white space or extDel
while( ((*s) != '\0') && (!isspace(*s)) && ((*s) != extDel) )
{
*d = *s; // copy segment characters
++s;
++d;
}
*d = '\0' ; // end the destination string
return (s); // return pointer to end of the string ||
// trailing white space ||
// 'extDel' char
}
else // It is 'extDel' character !
{
++s; // skip opening 'extDel'
while( ((*s) != '\0') && ((*s) != extDel) )
{
i=1; // we executed the loop at least once
*d = *s; // copy segment characters till '\0' or extDel
++s;
++d;
}
*d = '\0' ; // end the destination string
if( *s == extDel ) ++s; // skip *closing* 'extDel'
else
{
// unbalanced 'extDel'!
printf("WARNING:get_str_segment: unbalanced '%c' encountered!\n",extDel);
if (i==0) return NULL; // we do not allow
// **empty** unbalanced 'extDel' segments:
// if 'extDel' is unbalanced, the segment has to have at least one character!
// It can be white space!
}
return (s); // return pointer to next char after 'extDel'
// ( It can be '\0')
// if it is '\0' next pass will return 'NULL'!
}
}
int parse_line_to_table(int firstDim, int secondDim, char *table, char * line, char separator)
{
// Purpose:
// Parse 'line' to 'table'
// Function returns: number of segments
// 'table' has to be passed from outside
char* p;
int i;
if(!table) return (-1);
// parse segments to 'table':
if(line)
{
p = line; // A necessary initialization!
for(i=0; i<firstDim; i++)
{
p = get_str_segment( table+i*secondDim , p , separator );
if(p==NULL) break;
}
}
else
return (-1);
// debug only
// for(int j=0; j<i; j++) { printf(" i=%d %s",j, table+j*secondDim ); }
// printf("\n");
return (i); // notice that i is post incremented
}
int main(void)
{
char table[20][80];
char *line = "move this into \"tokens that I need\"";
int ret = parse_line_to_table(20, 80, &table[0][0], line, '\"'); // pass a char * to the first element
for(int i = 0; i < ret; i++ )
printf("%s\n",table[i]);
return 0;
}
Output:
move
this
into
tokens that I need
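For comparison, here is a shorter sketch of the same idea that tokenizes in place and keeps its state in a caller-supplied cursor, so it is re-entrant like strtok_r. The name next_quoted_token is hypothetical. I have assumed that a quote only starts a quoted token when it is the first character of the token, and that an unbalanced quote simply runs to the end of the string.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the next token starting at *cursor, or NULL
   when none is left. Tokens are separated by white space; a token that
   begins with '"' extends to the closing '"' and may contain spaces.
   The input is modified in place and *cursor is advanced past the token. */
char *next_quoted_token(char **cursor)
{
    char *s;
    char *tok;

    assert(cursor != NULL && *cursor != NULL);
    s = *cursor;

    while (isspace((unsigned char)*s))
        s++;                             /* skip leading white space */
    if (*s == '\0') {
        *cursor = s;
        return NULL;                     /* nothing left */
    }
    if (*s == '"') {
        tok = ++s;                       /* token starts after the quote */
        while (*s != '\0' && *s != '"')
            s++;
    } else {
        tok = s;
        while (*s != '\0' && !isspace((unsigned char)*s))
            s++;
    }
    if (*s != '\0')
        *s++ = '\0';                     /* terminate and step past the delimiter */
    *cursor = s;
    return tok;
}
```

To use it, point a char * at the input buffer (a writable array, not a string literal) and call next_quoted_token(&cursor) in a loop until it returns NULL.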

How to safely parse a tab-delimited string?

How can I safely parse a tab-delimited string? For example:
test\tbla-bla-bla\t2332

strtok() is a standard function for parsing strings with arbitrary delimiters. It is, however, not thread-safe. Your C library of choice might have a thread-safe variant.
Another standard-compliant way (just wrote this up, it is not tested):
#include <string.h>
#include <stdio.h>
int main()
{
char string[] = "foo\tbar\tbaz";
char * start = string;
char * end;
while ( ( end = strchr( start, '\t' ) ) != NULL )
{
// %.*s prints at most the given number of characters; the * takes that
// number from the argument list as an int (your token is not zero-terminated!)
printf( "%.*s\n", (int)( end - start ), start );
start = end + 1;
}
// start points to last token, zero-terminated
printf( "%s", start );
return 0;
}

Use strtok_r instead of strtok (if it is available). It has similar usage, except it is reentrant, and it does not modify the string like strtok does. [Edit: Actually, I misspoke. As Christoph points out, strtok_r does replace the delimiters by '\0'. So, you should operate on a copy of the string if you want to preserve the original string. But it is preferable to strtok because it is reentrant and thread safe]
strtok will leave your original string modified: it replaces the delimiter with '\0'. If your string happens to be a constant stored in read-only memory (some compilers will do that), you may actually get an access violation.

Using strtok() from string.h.
#include <stdio.h>
#include <string.h>
int main ()
{
char str[] = "test\tbla-bla-bla\t2332";
char * pch;
pch = strtok (str," \t");
while (pch != NULL)
{
printf ("%s\n",pch);
pch = strtok (NULL, " \t");
}
return 0;
}
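One caveat with strtok() for tab-delimited data: it treats a run of consecutive delimiters as a single delimiter, so an empty field (two tabs in a row) silently disappears and the remaining fields shift position. If empty fields matter, a sketch like the following keeps them by splitting on every single delimiter with strchr(). The name split_fields and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: split line in place on every occurrence of delim,
   keeping empty fields. Stores up to max field pointers in out and
   returns the number stored. */
size_t split_fields(char *line, char delim, char **out, size_t max)
{
    size_t n = 0;

    assert(line != NULL && out != NULL && delim != '\0');

    while (n < max) {
        char *sep;

        out[n++] = line;                 /* a field starts here, possibly empty */
        sep = strchr(line, delim);
        if (sep == NULL)
            break;                       /* last field */
        *sep = '\0';                     /* terminate this field */
        line = sep + 1;                  /* next field starts after the delimiter */
    }
    return n;
}
```

Like strtok(), this writes into the string, so pass it a writable copy rather than a string literal.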

You can use any regex library or even the GLib GScanner, see here and here for more information.

Yet another version; this one separates the logic into a new function
#include <stdio.h>
static _Bool next_token(const char **start, const char **end)
{
if(!*end) *end = *start; // first call
else if(!**end) // check for terminating zero
return 0;
else *start = ++*end; // skip tab
// advance to terminating zero or next tab
while(**end && **end != '\t')
++*end;
return 1;
}
int main(void)
{
const char *string = "foo\tbar\tbaz";
const char *start = string;
const char *end = NULL; // NULL value indicates first call
while(next_token(&start, &end))
{
// print substring [start,end[
printf("%.*s\n", (int)(end - start), start); // precision argument must be an int
}
return 0;
}

If you need a binary safe way to tokenize a given string:
#include <string.h>
#include <stdio.h>
void tokenize(const char *str, const char delim, const size_t size)
{
const char *start = str, *next;
const char *end = str + size;
while (start < end) {
if ((next = memchr(start, delim, end - start)) == NULL) {
next = end;
}
printf("%.*s\n", (int)(next - start), start); // precision argument must be an int
start = next + 1;
}
}
int main(void)
{
char str[] = "test\tbla-bla-bla\t2332";
int len = strlen(str);
tokenize(str, '\t', len);
return 0;
}
