Segmentation Fault from Split String Function - c

I am required to write a function that splits a string into individual words.
My first parameter is a string. We assume that the words in the string are separated by single spaces, with no spaces before the first word or after the last word. Punctuation is treated as part of a word. My second parameter is the address of an integer, which the function sets to the number of words in the string. The return value is a pointer to an array of strings containing the individual words in the sentence. I need to allocate its memory from the heap and store one word at each index of the array. The strings are copies of the original words, not pointers into the original string. Here is my code:
char** splitString(char theString[], int *arraySize) {
*arraySize = countSpaces(theString) + 1; //Points to the number of words in the string.
char** pointerToArrayOfStrings = malloc(*arraySize * sizeof(char *)); //Allocated memory for '*arraySize' character pointers
int characters = 0;
for (int i = 0; i < *arraySize; i++) {
while (theString[characters] != ' ' || theString[characters] != '\0') {
characters++;
}
characters++;
pointerToArrayOfStrings[i] = (char *)malloc(characters);
pointerToArrayOfStrings[i][characters] = '\0';
}
for (int word = 0; word < *arraySize; word++) {
int ch = 0;
while (ch < strlen(pointerToArrayOfStrings[word])) {
pointerToArrayOfStrings[word][ch] = theString[ch];
}
ch+=2;
}
return pointerToArrayOfStrings;
}
This is immediately giving me segmentation faults. I am very new to pointers, so my method is to first allocate the array with the amount of memory for "numberOfWords" character pointers. Then I allocated each character pointer with the size of the corresponding word. After, I filled the slots with the characters from the original string. I don't know what I'm missing.

The comments have already addressed your questions about seg-faults etc. But since you did not say how you are required to split the string, I wanted to suggest another approach.
Consider these steps:
1) Walk through the string, counting occurrences of white space (words) and tracking the longest word found.
2) Knowing the count and the longest word, you have what you need to allocate memory. Do it.
3) In a loop, use strtok() with the delimiters space, \n, \t etc. to tokenize the string.
4) Use strcpy() (also in the loop) to transfer each token into the string array.
5) Return the array. Use the array. Free all allocated memory.
Example code to do these steps:
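#include <ctype.h>  // isspace
#include <stdlib.h> // calloc, free
#include <string.h> // strlen, strdup, strtok, strcpy
char** splitString(const char theString[], int *arraySize);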
char ** create_str_array(int strings, int longest) ;
int main(void)
{
int size;
char ** string = splitString("here is a string", &size);
return 0;
}
char** splitString(const char theString[], int *arraySize)
{
*arraySize = strlen(theString) + 1; // Scan length: every character plus the terminating null.
char *tok;
char *dup = strdup(theString);//create copy of const char argument
char** pointerToArrayOfStrings = NULL;
int characters = 0;
int len = 0, lenKeep = 0, wordCount = 0, i;
/// get count of words and longest string
for(i=0;i<*arraySize;i++)
{
if((!isspace(theString[i]) && (theString[i])))
{
len++;
if(lenKeep < len)
{
lenKeep = len;
}
}
else
{
wordCount++;
len = 0;
}
}
/// create memory. (array of strings to hold sub-strings)
pointerToArrayOfStrings = create_str_array(wordCount, lenKeep);
if(pointerToArrayOfStrings)// only if memory creation successful, continue
{
/// parse original string into sub-strings
i = 0;
tok = strtok(dup, " \n\t");
if(tok)
{
strcpy(pointerToArrayOfStrings[i], tok);
tok = strtok(NULL, " \n\t");
while(tok)
{
i++;
strcpy(pointerToArrayOfStrings[i], tok);
tok = strtok(NULL, " \n\t");
}
}
}
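free(dup); // release the working copy made by strdup()
*arraySize = wordCount; // report the number of words, not the scan length
/// return array of strings
return pointerToArrayOfStrings;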
}
char ** create_str_array(int strings, int longest)
{
int i;
char ** a = calloc(strings, sizeof(char *));
for(i=0;i<strings;i++)
{
a[i] = calloc(longest+1, 1);
}
return a;
}
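For comparison, the two-pass approach described in the question can also be made to work. The following is only a sketch, under the question's assumption that words are separated by single spaces; the name split_single_spaces is made up for illustration. It differs from the posted code in three places: the scan stops at a space or at the terminator (the original || condition is always true, so that loop never stops), each word gets a buffer of its own length plus one, and the copy reads from the current position in the source string rather than from index 0.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: splits s on single spaces and returns a heap-allocated
   array of heap-allocated copies. Stores the word count in *count.
   Returns NULL if an allocation fails. */
char **split_single_spaces(const char *s, int *count)
{
    /* Pass 1: the number of words is the number of spaces plus one. */
    int words = 1;
    for (const char *p = s; *p != '\0'; p++) {
        if (*p == ' ')
            words++;
    }

    char **result = malloc((size_t)words * sizeof *result);
    if (result == NULL)
        return NULL;

    /* Pass 2: copy each word into a buffer of exactly its length plus one. */
    const char *start = s;
    for (int i = 0; i < words; i++) {
        /* && (not ||): stop at a space or at the terminator. */
        const char *end = start;
        while (*end != ' ' && *end != '\0')
            end++;

        size_t len = (size_t)(end - start);
        result[i] = malloc(len + 1);
        if (result[i] == NULL) {
            /* Undo the allocations made so far. */
            while (i > 0)
                free(result[--i]);
            free(result);
            return NULL;
        }
        memcpy(result[i], start, len);
        result[i][len] = '\0';

        /* Skip the separating space, unless this was the last word. */
        start = (*end == ' ') ? end + 1 : end;
    }

    *count = words;
    return result;
}
```

The caller owns the result: free each result[i] and then result itself once the words are no longer needed.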

Related

What is this weird output after using pointer arithmetic in C?

My goal is to parse some input into words, skipping all spaces while using them to mark the boundary between words. The logic is that whenever it encounters a space, it loops until there is no longer a space character; when it encounters a word, it loops until it reaches a space character or a '\0', copying each character into successive indexes of one of the inner arrays of the 2D array. Before the outer while loop continues, it moves on to the next inner array.
I'm almost certain the logic is implemented well enough to work, but I get the weird output listed below. I've had the same problem before when working with pointers, and I can't get this to work no matter what I do. Any ideas as to why? I'm genuinely curious about the reason.
#include <stdio.h>
#include <stdlib.h>
void print_mat(char **arry, int y, int x){
for(int i=0;i<y;i++){
for(int j=0;j<x;j++){
printf("%c",arry[i][j]);
}
printf("\n");
}
}
char **parse(char *str)
{
char **parsed=(char**)malloc(sizeof(10*sizeof(char*)));
for(int i=0;i<10;i++){
parsed[i]=(char*)malloc(200*sizeof(char));
}
char **pointer = parsed;
while(*str!='\0'){
if(*str==32)
{
while(*str==32 && *str!='\0'){
str++;
}
}
while(*str!=32 && *str!='\0'){
(*pointer) = (str);
(*pointer)++;
str++;
}
pointer++;
}
return parsed;
}
int main(){
char str[] = "command -par1 -par2 thething";
char**point=parse(str);
print_mat(point,10,200);
return 0;
}
-par1 -par2 thethingUP%�W���U�6o� X%��U�v;,���UP%���cNjW��]A�aW�Ӹto�8so�z�
-par2 thethingUP%�W���U�6o� X%��U�v;,���UP%���cNjW��]A�aW�Ӹto�8so�z�
thethingUP%�W���U�6o� X%��U�v;,���UP%���cNjW��]A�aW�Ӹto�8so�z�
UP%�W���U�6o� X%��U�v;,���UP%���cNjW��]A�aW�Ӹto�8so�z�
I also tried simply indexing the 2D array, but to no avail:
char **parse(char *str)
{
int i, j;
i=0;
j=0;
char **parsed=(char**)malloc(sizeof(10*sizeof(char*)));
for(int i=0;i<10;i++){
parsed[i]=(char*)malloc(200*sizeof(char));
}
while(*str!='\0'){
i=0;
if(*str==32)
{
while(*str==32 && *str!='\0'){
str++;
}
}
while(*str!=32 && *str!='\0'){
parsed[j][i] = (*str);
i++;
str++;
}
j++;
}
return parsed;
}
Output:
command�&�v�U`'�v�U0(�v�U)�v�U�)�v�U
-par1
-par2
thething
makefile:5: recipe for target 'build' failed
make: *** [build] Segmentation fault (core dumped)
A couple of problems in your code:
Your program is leaking memory.
Your program is accessing memory which it does not own and this is UB.
Let's discuss them one by one.
First problem - Memory leak:
Check this part of parse() function:
while(*str!=32 && *str!='\0'){
(*pointer) = (str);
In the first iteration of the outer while loop, *pointer gives you the first member of the parsed array, i.e. parsed[0], which is a pointer to char. Note that parse() dynamically allocates memory for the pointers parsed[0], parsed[1] ... parsed[9] before the outer while loop. The inner while loop then points them at str. Hence they lose their reference to the dynamically allocated memory, which leads to a memory leak.
Second problem - Accessing memory which it does not own:
As stated above, the pointers parsed[0], parsed[1] etc. will point to whatever the current value of str was in the inner while loop of parse(). That means they point to elements of the array str (defined in main()). You pass 200 to print_mat(), which then reads indexes 0 to 199 through every pointer of arry. Since those pointers point into the str array, whose size is 29, your program reads beyond the end of the array, which is UB.
Let's fix these problems in your code without making many changes:
For memory leak:
Instead of pointing the pointers to str, assign characters of str to the allocated memory, like this:
int i = 0;
while(*str!=32 && *str!='\0'){
(*pointer)[i++] = (*str);
str++;
}
For accessing memory which it does not own:
A point that you should remember:
In C, strings are actually one-dimensional arrays of characters terminated by a null character \0.
First of all, make each string empty right after dynamically allocating its memory, so that you can identify the unused pointers while printing them:
for(int i=0;i<10;i++){
parsed[i]=(char*)malloc(200*sizeof(char));
parsed[i][0] = '\0';
}
Terminate each string with a null terminator character after writing a word through the parsed array pointers:
int i = 0;
while(*str!=32 && *str!='\0'){
(*pointer)[i++] = (*str);
str++;
}
// Add null terminator
(*pointer)[i] = '\0';
In print_mat(), make sure that once you hit the null terminator character you don't read beyond it. Modify the condition of the inner for loop:
for(int j = 0; (j < x) && (arry[i][j] != '\0'); j++){
printf("%c",arry[i][j]);
}
You don't need to print the strings character by character; you can simply use the %s format specifier to print a string, like this:
for (int i = 0;i < y; i++) {
if (arry[i][0] != '\0') {
printf ("%s\n", arry[i]);
}
}
With the changes suggested above (the minimal changes required for your program to work properly), your code will look like this:
#include <stdio.h>
#include <stdlib.h>
void print_mat (char **arry, int y) {
for (int i = 0; i < y; i++) {
if (arry[i][0] != '\0') {
printf ("%s\n", arry[i]);
}
}
}
char **parse(char *str) {
// Room for 10 pointers. The original sizeof(10*sizeof(char*)) only yields the size of a size_t.
char **parsed = (char**)malloc(10*sizeof(char*));
// check malloc return
for(int i = 0; i < 10; i++){
parsed[i] = (char*)malloc(200*sizeof(char));
// check malloc return
parsed[i][0] = '\0';
}
char **pointer = parsed;
while (*str != '\0') {
if(*str == 32) {
while(*str==32 && *str!='\0') {
str++;
}
}
int i = 0;
while (*str != 32 && *str != '\0') {
(*pointer)[i++] = (*str);
str++;
}
(*pointer)[i] = '\0';
pointer++;
}
return parsed;
}
int main (void) {
char str[] = "command -par1 -par2 thething";
char **point = parse(str);
print_mat (point, 10);
// free the dynamically allocated memory
for (int i = 0; i < 10; i++) {
free(point[i]);
}
free(point);
return 0;
}
Output:
command
-par1
-par2
thething
Many improvements can be made to your implementation, for example:
As shown above, you can use the %s format specifier instead of printing a string character by character. I am leaving it up to you to identify such changes and modify your program.
Allocate memory for a parsed array pointer only where there is a word in str.
Instead of allocating a fixed size (i.e. 200) for each parsed array pointer, allocate only the size of the word.
A few suggestions:
Always check the return value of functions like malloc.
Make sure to free the dynamically allocated memory once your program is done with it.
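The last two improvements and both suggestions can be sketched together as follows. This is one possible shape, not the only one: parse_words, free_words, max_words and out_count are names invented for the illustration, and strspn/strcspn are used in place of the hand-written loops to skip spaces and measure each word.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: frees count words and then the array itself. */
void free_words(char **words, size_t count)
{
    if (words == NULL)
        return;
    for (size_t i = 0; i < count; i++)
        free(words[i]);
    free(words);
}

/* Hypothetical variant of parse(): stores at most max_words words, each in
   a buffer sized to fit. Sets *out_count to the number of words stored.
   Returns NULL if an allocation fails. */
char **parse_words(const char *str, size_t max_words, size_t *out_count)
{
    *out_count = 0;
    char **words = malloc(max_words * sizeof *words);
    if (words == NULL)
        return NULL;

    size_t n = 0;
    while (n < max_words) {
        str += strspn(str, " ");          /* skip a run of spaces */
        if (*str == '\0')
            break;                        /* no more words */
        size_t len = strcspn(str, " ");   /* length of this word */
        words[n] = malloc(len + 1);       /* exactly the word plus '\0' */
        if (words[n] == NULL) {
            free_words(words, n);         /* release what was built so far */
            return NULL;
        }
        memcpy(words[n], str, len);
        words[n][len] = '\0';
        n++;
        str += len;
    }
    *out_count = n;
    return words;
}
```

Only the first *out_count entries of the returned array are initialized, so the caller should loop up to that count and then hand the array back to free_words.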
You can achieve what you want in a simpler way.
First, define a function that checks if a character (separator) is present in a list of characters (separators):
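#include <stdbool.h> // bool, true, false
#include <stdio.h>   // printf
#include <stdlib.h>  // malloc, free
#include <string.h>  // strncpy
// Returns true if c is found in a list of separators, false otherwise.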
bool belongs(const char c, const char *list)
{
for (const char *p = list; *p; ++p)
if (*p == c) return true;
return false;
}
Then, define a function that splits a given string into tokens, separated by one or more separators:
// Splits a string into tokens, separated by one of the separators in sep
bool split(const char *s, const char *sep, char **tokens, size_t *ntokens, const size_t maxtokens)
{
// Start with zero tokens.
*ntokens = 0;
const char *start = s, *end = s;
for (const char *p = s; /*no condition*/; ++p) {
// Can no longer hold more tokens? Exit.
if (*ntokens == maxtokens)
return false;
// Not a token? Continue looping.
if (*p && !belongs(*p, sep))
continue;
// Found a token: calculate its length.
size_t tlength = p - start;
// Empty token?
if (tlength == 0) {
// And reached the end of string? Break.
if (!*p) break;
// Not the end of string? Skip it.
++start;
continue;
}
// Attempt to allocate memory.
char *token = malloc(sizeof(*token) * (tlength + 1));
// Failed? Exit.
if (!token)
return false;
// Copy the token.
strncpy(token, start, tlength+1);
token[tlength] = '\0';
// Put it in tokens array.
tokens[*ntokens] = token;
// Update the number of tokens.
*ntokens += 1;
// Reached the end of string? Break.
if (!*p) break;
// There is more to parse. Set the start to the next char.
start = p + 1;
}
return true;
}
Call it like this:
int main(void)
{
char command[] = "command -par1 -par2 thing";
const size_t maxtokens = 10;
char **tokens = malloc(sizeof *tokens * maxtokens);
if (!tokens) return 1;
size_t ntokens = 0;
split(command, " ", tokens, &ntokens, maxtokens);
// Print all tokens.
printf("Number of tokens = %zu\n", ntokens); // %zu is the conversion for size_t
for (size_t i = 0; i < ntokens; ++i)
printf("%s\n", tokens[i]);
// Release memory when done.
for (size_t i = 0; i < ntokens; ++i)
free(tokens[i]);
free(tokens);
}
Output:
Number of tokens = 4
command
-par1
-par2
thing

How to replace characters by strtok function - C?

I want to replace every space ' ' in my char array with NULL:
#include <string.h>
void ReplaceCharactersInString(char *pcString, char *cOldChar, char *cNewChar) {
char *p = strtok(pcString, cOldChar);
strcpy(pcString, p);
while (p != NULL) {
strcat(pcString, p);
p = strtok(cNewChar, cOldChar);
}
}
int main() {
char pcString[] = "I am testing";
ReplaceCharactersInString(pcString, " ", NULL);
printf(pcString);
}
OUTPUT: Iamtesting
If I simply put the printf(p) function before:
p = strtok(cNewChar, cOldChar);
Then the result is what I need, but how can I store it (directly) in pcString?
Or is there perhaps a simpler way to do it?
While some functions expect a [single] string to be pre-parsed to: I\0am\0testing, that is rare.
And, if you have multiple spaces/delimiters, you'll get (e.g.) foo\0\0bar, which you probably don't want.
And, your printf in main will only print the first token in the string because it will stop on the first EOS (i.e. '\0').
In other words, you probably don't want strcpy/strcat.
More likely, you want to fill an array of char * pointers to the tokens you parse.
So, you'd want to pass down char **argv, then do: argv[argc++] = strtok(...); and finally: return argc;
Here's how I would refactor your code:
#include <stdio.h>
#include <string.h>
#define ARGMAX 100
int
ReplaceCharactersInString(int argmax,char **argv,char *pcString,
const char *delim)
{
char *p;
int argc;
// allow space for NULL termination
--argmax;
for (argc = 0; argc < argmax; ++argc, ++argv) {
// get next token
p = strtok(pcString,delim);
if (p == NULL)
break;
// zap the buffer pointer
pcString = NULL;
// store the token in the [returned] array
*argv = p;
}
*argv = NULL;
return argc;
}
int
main(void)
{
char pcString[] = "I am testing";
int argc;
char **av;
char *argv[ARGMAX];
argc = ReplaceCharactersInString(ARGMAX,argv,pcString," ");
printf("argc: %d\n",argc);
for (av = argv; *av != NULL; ++av)
printf("'%s'\n",*av);
return 0;
}
Here's the output:
argc: 3
'I'
'am'
'testing'
strcat and strcpy should not be used when the source and destination overlap in memory.
Iterate through the array and replace the matching character with the desired character.
Since zeros are part of the string, printf will stop at the first zero and strlen can't be used for the length to print. sizeof can be used as pcString is defined in the same scope.
Note that ReplaceCharactersInString would not work a second time as it would stop at the first zero. The function could be written to accept a length parameter and loop using the length.
#include <stdio.h>
#include <stdlib.h>
void ReplaceCharactersInString(char *pcString, char cOldChar,char cNewChar){
while ( pcString && *pcString) {//not NULL and not zero
if ( *pcString == cOldChar) {//match
*pcString = cNewChar;//replace
}
++pcString;//advance to next character
}
}
int main ( void) {
char pcString[] = "I am testing";
ReplaceCharactersInString ( pcString, ' ', '\0');
for ( int each = 0; each < sizeof pcString; ++each) {
printf ( "pcString[%02d] = int:%-4d char:%c\n", each, pcString[each], pcString[each]);
}
return 0;
}
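The length-based variant mentioned above could look like the following sketch. The name replace_chars_n and its return value (the number of replacements) are choices made for this illustration only. Because the loop is driven by len rather than by the terminator, it can be run again on a buffer that already contains zeros.

```c
#include <stddef.h>

/* Hypothetical length-based variant: visits exactly len bytes, so embedded
   null characters do not end the scan. Returns the number of replacements. */
size_t replace_chars_n(char *buf, size_t len, char old_ch, char new_ch)
{
    size_t replaced = 0;
    if (buf == NULL)
        return 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == old_ch) {
            buf[i] = new_ch;
            replaced++;
        }
    }
    return replaced;
}
```

With char pcString[] = "I am testing"; the call replace_chars_n(pcString, sizeof pcString - 1, ' ', '\0') leaves the final terminator alone, and a second call with the last two arguments swapped restores the original text.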
You want to split the string into individual tokens separated by spaces such as "I\0am\0testing\0". You can use strtok() for this, but this function is error-prone. I suggest you allocate an array of pointers and make them point to the words. Note that splitting the source string in place is sloppy and does not allow adjacent tokens such as in 1+1, because there is no room to insert a terminator between them. You could allocate copies of the tokens instead.
Here is an example:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char **split_string(const char *str, char *delim) {
size_t i, len, count;
const char *p;
/* count tokens */
p = str;
p += strspn(p, delim); // skip initial delimiters
count = 0;
while (*p) {
count++;
p += strcspn(p, delim); // skip token
p += strspn(p, delim); // skip delimiters
}
/* allocate token array */
char **array = calloc(count + 1, sizeof(*array));
p = str;
p += strspn(p, delim); // skip initial delimiters
for (i = 0; i < count; i++) {
len = strcspn(p, delim); // token length
array[i] = strndup(p, len); // allocate a copy of the token
p += len; // skip token
p += strspn(p, delim); // skip delimiters
}
/* array ends with a null pointer */
array[count] = NULL;
return array;
}
int main() {
const char *pcString = "I am testing";
char **array = split_string(pcString, " \t\r\n");
for (size_t i = 0; array[i] != NULL; i++) {
printf("%zu: %s\n", i, array[i]);
}
return 0;
}
The strtok function pretty much does exactly what you want. It basically replaces the next delimiter with a '\0' character and returns the pointer to the current token. The next time you call strtok, you should pass a NULL argument (see the documentation for strtok) and it will point to the next token, which will again be delimited by '\0'. Read some more examples of correct strtok usage.
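A small sketch of that behaviour, with a hypothetical helper named tokenize_in_place: it collects the pointers strtok returns, and afterwards the original buffer can be inspected to see the '\0' characters that replaced the delimiters. The buffer has to be a writable array, not a string literal.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: tokenizes buf in place with strtok() and stores up to
   max token pointers in out. Returns the number of tokens stored.
   buf must be writable (an array, not a string literal). */
size_t tokenize_in_place(char *buf, const char *delims, char **out, size_t max)
{
    size_t n = 0;
    /* The first call passes the buffer; later calls pass NULL to continue. */
    for (char *tok = strtok(buf, delims);
         tok != NULL && n < max;
         tok = strtok(NULL, delims)) {
        out[n++] = tok;   /* points inside buf; nothing is copied */
    }
    return n;
}
```

For char s[] = "I am testing"; and a space delimiter, this stores three pointers (s, s + 2 and s + 5), and s[1] and s[4] become '\0'. Printing s itself afterwards therefore shows only the first token.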

Appending words to an array based on a separator

I am trying to break up the sentence "once upon a time" into an array of words. I am doing this via a for loop, detecting three conditions:
It's the end of the loop (add the \0 and break);
It's the separator character (add the \0 and advance to the next word)
It's anything else (add the character)
Here is what I have now:
#include <stdlib.h>
#include <stdio.h>
char ** split_string(char * string, char sep) {
// Allow single separators only for now
// get length of the split string array
int i, c, array_length = 0;
for (int i=0; (c=string[i]) != 0; i++)
if (c == sep) array_length ++;
// allocate the array
char ** array_of_words = malloc(array_length + 1);
char word[100];
for (int i=0, char_num=0, word_num=0;; i++) {
c = string[i];
// if a newline add the word and break
if (c == '\0') {
word[char_num] = '\0';
array_of_words[word_num] = word;
break;
}
// if the separator, add a NUL, increment the word_num, and reset the character counter
if (c == sep) {
word[char_num] = '\0';
array_of_words[word_num] = word;
word_num ++;
char_num = 0;
}
// otherwise, just add the character in the string and increment the character counter
else {
word[char_num] = c;
char_num ++;
}
}
return array_of_words;
}
int main(int argc, char *argv[]) {
char * input_string = "Once upon a time";
// separate the string into a list of tokens separated by the separator
char ** array_of_words;
array_of_words = split_string(input_string, ' ');
printf("The array of words is: ");
// how to get the size of this array? sizeof(array_of_words) / sizeof(array_of_words[0]) gives 1?!
for (int i=0; i < 4 ;i++)
printf("%s[sep]%d", array_of_words[i], i);
return 0;
}
However, instead of printing "once", "upon", "a", "time" at the end, it's printing "time", "time", "time", "time".
Where is the mistake in my code that is causing this?
Here is a working example of the code: https://onlinegdb.com/S1ss6a4Ur
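You need to allocate memory for each word, not just for one. char word[100]; only sets aside memory for one word, so every entry of array_of_words ends up pointing at the same buffer (which is why the last word is printed four times), and once word goes out of scope when split_string returns, that memory is invalid. Instead, you could allocate the memory dynamically: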
char* word = malloc(100);
And then, when you found a separator, allocate memory for a new word:
if (c == sep) {
word[char_num] = '\0';
array_of_words[word_num] = word;
word = malloc(100);
Also, this here is incorrect:
char ** array_of_words = malloc(array_length + 1);
You want enough memory for all the char pointers, but you only allocate 1 byte per pointer. Instead, do this:
char ** array_of_words = malloc(sizeof(char*)*(array_length + 1));
The expression sizeof(array_of_words) / sizeof(array_of_words[0]) calculates the number of elements only when array_of_words is an array, because then its size is known at compile time (barring VLAs). Here it's just a pointer, so sizeof(array_of_words) gives you the pointer size. Instead, you'll have to keep track of the size on your own. You already calculate it in the split_string function, so you just need to get array_length out to the main function. There are multiple ways of doing this:
Have it be a global variable
Pass an int* to the function via which you can write the value to a variable in main (this is sometimes called an "out parameter")
Return it along with the other pointer you're returning by wrapping them up in a struct
Don't pass it at all and recalculate it
The global variable solution is the simplest for this small program: just put the int array_length = 0; before the split_string function instead of inside it.
Last but not least, since we used malloc to allocate memory, we should free it:
for (int i = 0; i < array_length + 1; i++) { // array_length counts separators, so there is one more word
printf("%s[sep]%d", array_of_words[i], i);
free(array_of_words[i]); // free each word
}
free(array_of_words); // free the array holding the pointers to the words
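Of the options listed above, the struct-based one might look like this. It is a sketch only: struct word_list, split_to_list and free_word_list are hypothetical names, and each word is copied into a buffer of its own size instead of a fixed 100 bytes.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical type: bundles the array with its element count. */
struct word_list {
    char **words;   /* NULL if an allocation failed */
    size_t count;
};

/* Splits s on single occurrences of sep; every word gets its own buffer. */
struct word_list split_to_list(const char *s, char sep)
{
    struct word_list list = { NULL, 0 };

    /* There is one more word than there are separators. */
    size_t n = 1;
    for (const char *p = s; *p != '\0'; p++)
        if (*p == sep)
            n++;

    char **words = malloc(n * sizeof *words);
    if (words == NULL)
        return list;

    for (size_t i = 0; i < n; i++) {
        const char *end = strchr(s, sep);
        if (end == NULL)
            end = s + strlen(s);          /* last word: runs to the terminator */
        size_t len = (size_t)(end - s);
        words[i] = malloc(len + 1);
        if (words[i] == NULL) {
            while (i > 0)
                free(words[--i]);
            free(words);
            return list;
        }
        memcpy(words[i], s, len);
        words[i][len] = '\0';
        s = (*end != '\0') ? end + 1 : end;   /* step over the separator */
    }
    list.words = words;
    list.count = n;
    return list;
}

/* Frees every word and then the array of pointers. */
void free_word_list(struct word_list list)
{
    for (size_t i = 0; i < list.count; i++)
        free(list.words[i]);
    free(list.words);
}
```

The caller then loops with i < list.count, so no global variable or hard-coded 4 is needed.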
Is strtok not suitable?
char str[] = "once upon a time";
const char delim[] = " ";
char* word = strtok(str, delim);
while(word != NULL)
{
printf("%s\n", word);
word = strtok(NULL, delim);
}

Searching an array for a specific character [duplicate]

I want to write a program in C that displays each word of a whole sentence (taken as input) on a separate line. This is what I have done so far:
void manipulate(char *buffer);
int get_words(char *buffer);
int main(){
char buff[100];
printf("sizeof %d\nstrlen %d\n", sizeof(buff), strlen(buff)); // Debugging reasons
bzero(buff, sizeof(buff));
printf("Give me the text:\n");
fgets(buff, sizeof(buff), stdin);
manipulate(buff);
return 0;
}
int get_words(char *buffer){ // Function that gets the word count, by counting the spaces.
int count;
int wordcount = 0;
char ch;
for (count = 0; count < strlen(buffer); count ++){
ch = buffer[count];
if((isblank(ch)) || (buffer[count] == '\0')){ // if the character is blank, or null byte add 1 to the wordcounter
wordcount += 1;
}
}
printf("%d\n\n", wordcount);
return wordcount;
}
void manipulate(char *buffer){
int words = get_words(buffer);
char *newbuff[words];
char *ptr;
int count = 0;
int count2 = 0;
char ch = '\n';
ptr = buffer;
bzero(newbuff, sizeof(newbuff));
for (count = 0; count < 100; count ++){
ch = buffer[count];
if (isblank(ch) || buffer[count] == '\0'){
buffer[count] = '\0';
if((newbuff[count2] = (char *)malloc(strlen(buffer))) == NULL) {
printf("MALLOC ERROR!\n");
exit(-1);
}
strcpy(newbuff[count2], ptr);
printf("\n%s\n",newbuff[count2]);
ptr = &buffer[count + 1];
count2 ++;
}
}
}
Although the output is what I want, I get a lot of blank space after the final word is displayed, and malloc() returns NULL, so MALLOC ERROR! is displayed at the end.
I understand that there is a mistake in my use of malloc(), but I do not know what it is.
Is there another more elegant or generally better way to do it?
http://www.cplusplus.com/reference/clibrary/cstring/strtok/
Take a look at this, and use whitespace characters as the delimiter. If you need more hints let me know.
From the website:
char * strtok ( char * str, const char * delimiters );
On a first call, the function expects a C string as argument for str, whose first character is used as the starting location to scan for tokens. In subsequent calls, the function expects a null pointer and uses the position right after the end of last token as the new starting location for scanning.
Once the terminating null character of str is found in a call to strtok, all subsequent calls to this function (with a null pointer as the first argument) return a null pointer.
Parameters
str
C string to truncate.
Notice that this string is modified by being broken into smaller strings (tokens).
Alternativelly [sic], a null pointer may be specified, in which case the function continues scanning where a previous successful call to the function ended.
delimiters
C string containing the delimiter characters.
These may vary from one call to another.
Return Value
A pointer to the last token found in string.
A null pointer is returned if there are no tokens left to retrieve.
Example
/* strtok example */
#include <stdio.h>
#include <string.h>
int main ()
{
char str[] ="- This, a sample string.";
char * pch;
printf ("Splitting string \"%s\" into tokens:\n",str);
pch = strtok (str," ,.-");
while (pch != NULL)
{
printf ("%s\n",pch);
pch = strtok (NULL, " ,.-");
}
return 0;
}
For the fun of it here's an implementation based on the callback approach:
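#include <ctype.h>  // isspace
#include <string.h> // strlen
// Predicate complementing isspace(); same signature, so both can be passed to find().
int isnotspace(int c)
{
return !isspace(c);
}
// Returns the first position in [s, e) where pred is true, or e if there is none.
const char* find(const char* s,
const char* e,
int (*pred)(int))
{
while( s != e && !pred((unsigned char)*s) ) ++s;
return s;
}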
void split_on_ws(const char* s,
const char* e,
void (*callback)(const char*, const char*))
{
const char* p = s;
while( s != e ) {
s = find(s, e, isspace);
callback(p, s);
p = s = find(s, e, isnotspace);
}
}
void handle_word(const char* s, const char* e)
{
// handle the word that starts at s and ends at e
}
int main()
{
const char some_str[] = "split this string on whitespace"; // sample input
split_on_ws(some_str, some_str + strlen(some_str), handle_word);
}
malloc(0) may (optionally) return NULL, depending on the implementation. Do you realize why you may be calling malloc(0)? Or more precisely, do you see where you are reading and writing beyond the size of your arrays?
Consider using strtok_r, as others have suggested, or something like:
void printWords(const char *string) {
// Make a local copy of the string that we can manipulate.
char * const copy = strdup(string);
char *space = copy;
// Find the next space in the string, and replace it with a newline.
while (space = strchr(space,' ')) *space = '\n';
// There are no more spaces in the string; print out our modified copy.
printf("%s\n", copy);
// Free our local copy
free(copy);
}
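As for strtok_r itself: it is the POSIX variant of strtok that keeps its scan position in a caller-supplied pointer rather than in hidden static state, which is what makes nested tokenization possible. The sketch below assumes a POSIX system; words_per_line and its parameters are hypothetical names chosen for the illustration.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes strtok_r on POSIX systems */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts the words on each line of text using nested
   tokenization. text is modified in place. counts receives at most max_lines
   entries; the return value is the number of entries written. */
size_t words_per_line(char *text, size_t *counts, size_t max_lines)
{
    size_t lines = 0;
    char *line_state;   /* position of the outer (line) scan */

    for (char *line = strtok_r(text, "\n", &line_state);
         line != NULL && lines < max_lines;
         line = strtok_r(NULL, "\n", &line_state)) {

        char *word_state;   /* independent position of the inner (word) scan */
        size_t words = 0;
        for (char *word = strtok_r(line, " \t", &word_state);
             word != NULL;
             word = strtok_r(NULL, " \t", &word_state)) {
            words++;
        }
        counts[lines++] = words;
    }
    return lines;
}
```

The same nesting with plain strtok would not work, because the inner scan would overwrite the position of the outer one. Note that empty lines are skipped, since consecutive delimiters are treated as one.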
One thing going wrong is that get_words() always returns one less than the actual word count, so eventually you attempt this:
char *newbuff[words]; /* Words is one less than the actual number,
so this is declared to be too small. */
newbuff[count2] = (char *)malloc(strlen(buffer))
count2, eventually, is always one more than the number of elements you've declared for newbuff[]. Why malloc() isn't returning a valid ptr, though, I don't know.
You should be malloc'ing strlen(ptr) + 1 (the extra byte holds the terminating null character that strcpy writes), not strlen(buffer). Also, your count2 should be limited to the number of words. When you get to the end of your string, you keep going over the zeros in your buffer and adding zero-size strings to your array.
Just as an idea of a different style of string manipulation in C, here's an example which does not modify the source string, and does not use malloc. To find spaces I use the libc function strpbrk.
int print_words(const char *string, FILE *f)
{
static const char space_characters[] = " \t";
const char *next_space;
// Find the next space in the string
//
while ((next_space = strpbrk(string, space_characters)))
{
const char *p;
// If there are non-space characters between what we found
// and what we started from, print them.
//
if (next_space != string)
{
for (p=string; p<next_space; p++)
{
if(fputc(*p, f) == EOF)
{
return -1;
}
}
// Print a newline
//
if (fputc('\n', f) == EOF)
{
return -1;
}
}
// Advance next_space until we hit a non-space character
//
while (*next_space && strchr(space_characters, *next_space))
{
next_space++;
}
// Advance the string
//
string = next_space;
}
// Handle the case where there are no spaces left in the string
//
if (*string)
{
if (fprintf(f, "%s\n", string) < 0)
{
return -1;
}
}
return 0;
}
You can scan the char array looking for the delimiter: if you find it, print a newline; otherwise, print the character.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
char *s;
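s = malloc(1024 * sizeof(char));
if (s == NULL) return 1; // allocation failed
if (scanf("%1023[^\n]", s) != 1) { free(s); return 1; } // read at most 1023 characters of one line
s = realloc(s, strlen(s) + 1);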
int len = strlen(s);
char delim =' ';
for(int i = 0; i < len; i++) {
if(s[i] == delim) {
printf("\n");
}
else {
printf("%c", s[i]);
}
}
free(s);
return 0;
}
char arr[50];
if (fgets(arr, sizeof arr, stdin) == NULL) arr[0] = '\0'; // gets() cannot limit input and was removed in C11
int c=0,i,l;
l=strlen(arr);
for(i=0;i<l;i++){
if(arr[i]==32){
printf("\n");
}
else
printf("%c",arr[i]);
}

C Language -> Separate words from a paragraph

My function save_words receives armazena and size. armazena is a dynamic array which contains paragraphs, and size is the size of that array. In this function I want to copy the paragraphs word by word into another dynamic array called words. When I run it, it crashes.
I appreciate your help.
char **save_words(char **armazena, int *size)
{
char *token = NULL;
char** armazena_aux = armazena;
int i, count=0;
char **words = (char**) malloc(sizeof(char*)*(10));
for(i=0; i<size; i++)
{
token = strtok(*(armazena+i)," .?!,");
while( token != NULL )
{
int tam = strlen(token);
armazena[count] = (char*) malloc(tam+2);
strcpy(armazena[count],token);
armazena[count][tam+1]='\0';
count++;
token = strtok(NULL, " .?!,");
if (count%10==0)
{
words = realloc(words, sizeof(char*)*(count + 10));
}
}
}
return words;
}
Is armazena[count] = (char*) malloc(tam+2); what you want? I would have thought words[count] = ...;. The first time through the outer loop is ok, because you hoist armazena[0] into strtok, but if it contains more than one word, your second time through the outer loop will be processing strings generated from the first time.
Worse, if that first string contained more words than the armazena vector can accommodate, you will be corrupting something...
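A sketch of the function with that change applied, so that the copies go into words and armazena is only read. Several details here are assumptions rather than the asker's code: the name save_words_sketch, the extra out_count parameter for returning the number of words, and taking size by value (the posted loop compares i against the pointer size itself). The array is also grown before a slot is used rather than after.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical corrected variant: copies every word of every paragraph into
   a new array and stores the word count in *out_count. The paragraphs are
   modified by strtok(). Returns NULL if an allocation fails. */
char **save_words_sketch(char **paragraphs, int size, int *out_count)
{
    const char *delims = " .?!,";
    int count = 0, capacity = 10;
    char **words = malloc((size_t)capacity * sizeof *words);
    if (words == NULL)
        return NULL;

    for (int i = 0; i < size; i++) {
        for (char *token = strtok(paragraphs[i], delims);
             token != NULL;
             token = strtok(NULL, delims)) {
            /* Grow before storing, so words[count] is always a valid slot. */
            if (count == capacity) {
                char **grown = realloc(words, (size_t)(capacity + 10) * sizeof *words);
                if (grown == NULL)
                    goto fail;
                words = grown;
                capacity += 10;
            }
            size_t len = strlen(token);
            words[count] = malloc(len + 1);
            if (words[count] == NULL)
                goto fail;
            memcpy(words[count], token, len + 1);   /* includes the '\0' */
            count++;
        }
    }
    *out_count = count;
    return words;

fail:
    /* Release every word copied so far, then the array. */
    while (count > 0)
        free(words[--count]);
    free(words);
    return NULL;
}
```

The caller frees each of the *out_count words and then the array; the paragraphs in armazena remain owned by whoever allocated them.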

Resources