I usually try hard to solve any bugs I find in my code myself, but this one defies all logic for me. The function works fine with any string and separator character, but only when that useless printf is inside the function's while loop; otherwise it prints
-> Lorem
then
-> ▼
and crashes afterwards. Thanks in advance to anyone who can tell me what is happening.
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include <stdint.h>
char **strsep_(char *str, char ch) {
// Sub-string length
uint8_t len = 0;
// The number of sub-strings found means the same as the position where it will be stored in the main pointer
// Obviously, the number tends to increase over time, and at the end of the algorithm, it means the main pointer length too
uint8_t pos = 0;
// Storage for any found sub-strings and one more byte as the pointer is null-terminated
char **arr = (char**)malloc(sizeof(char **) + 1);
while (*str) {
printf("Erase me and it will not work! :)\n");
if (*str == ch) {
// The allocated memory should be one step ahead of the current usage
arr = realloc(arr, sizeof(char **) * pos + 1);
// Allocates enough memory in the current main pointer position and the '\0' byte
arr[pos] = malloc(sizeof(char *) * len + 1);
// Copies the sub-string size (based in the length number) into the previously allocated space
memcpy(arr[pos], (str - len), len);
// `-_("")_-k
arr[pos][len] = '\0';
len = 0;
pos++;
} else {
len++;
}
*str++;
}
// Is not needed to reallocate additional memory if no separator character was found
if (pos > 0) arr = realloc(arr, sizeof(char **) * pos + 1);
// The last chunk of characters after the last separator character is properly allocated
arr[pos] = malloc(sizeof(char *) * len + 1);
memcpy(arr[pos], (str - len), len);
// To prevent undefined behavior while iterating over the pointer
arr[++pos] = NULL;
return arr;
}
void strsep_free_(char **arr) {
char **aux = arr;
while (*arr) {
free(*arr);
*arr = NULL;
arr++;
}
// One more time to fully deallocate the null-terminated pointer
free(*arr);
*arr = NULL;
arr++;
// Clearing The pointer itself
free(aux);
aux = NULL;
}
int main(void) {
char **s = strsep_("Lorem ipsum four words", ' ');
char **i = s;
while (*i != NULL) {
printf("-> %s\n", *i);
i++;
}
strsep_free_(s);
}
Your program has undefined behavior, which means it may behave in unexpected ways, but could by chance behave as expected. Adding the extra printf changes the behavior in a way that seems to correct the bug, but only by coincidence. On a different machine, or even on the same machine at a different time, the behavior may change again.
There are multiple bugs in your program that lead to undefined behavior:
You are not allocating the array with the proper size: it should have space for pos + 1 pointers, hence sizeof(char **) * (pos + 1). The faulty statements are: char **arr = (char**)malloc(sizeof(char **) + 1); and arr = realloc(arr, sizeof(char **) * pos + 1);.
Furthermore, the space allocated for each substring is incorrect too: arr[pos] = malloc(sizeof(char *) * len + 1); should read arr[pos] = malloc(sizeof(char) * len + 1);, which by definition is arr[pos] = malloc(len + 1);. This does not lead to undefined behavior, you just allocate too much memory. If your system supports it, allocation and copy can be combined in one call to strndup(str - len, len).
You never check for memory allocation failure, causing undefined behavior in case of memory allocation failure.
Using uint8_t for len and pos is risky: what if the number of substrings exceeds 255? pos and len would silently wrap back to 0, producing unexpected results and memory leaks. There is no advantage in using such a small type; use int or size_t instead.
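The wrap-around described in the last point is easy to reproduce. The following is a minimal sketch, not part of the original program; the helper names count_narrow and count_wide are made up for illustration:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: counts occurrences of ch with an 8-bit counter.
   The counter wraps back to 0 after 255, so long inputs give wrong totals. */
uint8_t count_narrow(const char *str, char ch) {
    uint8_t n = 0;
    for (; *str != '\0'; str++) {
        if (*str == ch)
            n++;
    }
    return n;
}

/* Same loop with a size_t counter, which is wide enough for any
   string that fits in memory. */
size_t count_wide(const char *str, char ch) {
    size_t n = 0;
    for (; *str != '\0'; str++) {
        if (*str == ch)
            n++;
    }
    return n;
}
```

For a string containing 300 separators, count_narrow returns 44 (300 modulo 256) while count_wide returns 300.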
Here is a corrected version:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char **strsep_(const char *str, char ch) {
// Sub-string length
int len = 0;
// The number of sub-strings found, index where to store the NULL at the end of the array.
int pos = 0;
// return value: array of pointers to substrings with an extra slot for a NULL terminator.
char **arr = (char**)malloc(sizeof(*arr) * (pos + 1));
if (arr == NULL)
return NULL;
for (;;) {
if (*str == ch || *str == '\0') {
// allocate the substring and reallocate the array
char *p = malloc(len + 1);
char **new_arr = realloc(arr, sizeof(*arr) * (pos + 2));
if (new_arr == NULL || p == NULL) {
// allocation failure: free the memory allocated so far
free(p);
if (new_arr)
arr = new_arr;
while (pos-- > 0)
free(arr[pos]);
free(arr);
return NULL;
}
arr = new_arr;
memcpy(p, str - len, len);
p[len] = '\0';
arr[pos] = p;
pos++;
len = 0;
if (*str == '\0')
break;
} else {
len++;
}
str++;
}
arr[pos] = NULL;
return arr;
}
void strsep_free_(char **arr) {
int i;
// Free the array elements
for (i = 0; arr[i] != NULL; i++) {
free(arr[i]);
arr[i] = NULL; // extra safety, not really needed
}
// Free The array itself
free(arr);
}
int main(void) {
char **s = strsep_("Lorem ipsum four words", ' ');
int i;
for (i = 0; s[i] != NULL; i++) {
printf("-> %s\n", s[i]);
}
strsep_free_(s);
return 0;
}
Output:
-> Lorem
-> ipsum
-> four
-> words
The most likely reason for the crash is this: realloc(arr, sizeof(char **) * pos + 1).
That is the same as realloc(arr, (sizeof(char **) * pos) + 1) which does not allocate enough space for your "array". You need to do realloc(arr, sizeof(char **) * (pos + 1)).
Same with the allocation for arr[pos], you need to use parentheses correctly there too.
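To see the difference the parentheses make, here is a small sketch (the function names are illustrative only) that computes the byte count each expression requests:

```c
#include <stddef.h>

/* Bytes requested by the original expression: (sizeof(char *) * pos) + 1 */
size_t bytes_unparenthesized(size_t pos) {
    return sizeof(char *) * pos + 1;
}

/* Bytes requested by the corrected expression: room for pos + 1 pointers */
size_t bytes_parenthesized(size_t pos) {
    return sizeof(char *) * (pos + 1);
}
```

With pos equal to 0, the first form asks for a single byte, which cannot hold even one pointer, while the second asks for sizeof(char *) bytes.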
Good answer from @chqrlie. For my part, I think it would be better to count everything before copying, which avoids the need for realloc.
#include <string.h>
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
int count_chars(const char *str, const char ch)
{
int i;
int count;
i = 0;
count = 0;
if (*str == ch)
str++;
while (str[i] != ch && str[i] != '\0')
{
count++;
i++;
}
return (count);
}
int count_delimeter(const char *str, const char ch)
{
int i = 0;
int count = 0;
while (str[i])
{
if (str[i] == ch && str[i + 1] != ch)
count++;
i++;
}
return count;
}
char** strsep_(const char *str, const char ch)
{
char **arr;
int index = 0;
int size = 0;
int i = 0;
size = count_delimeter(str, ch) + 1;
if ((arr = malloc(sizeof(char *) * (size + 1))) == NULL)
return (NULL);
arr[size] = NULL;
while (i < size)
{
if (str[index] == ch)
index++;
if (str[index] && str[index] == ch && str[index + 1] == ch)
{
while (str[index] && str[index] == ch && str[index + 1] == ch)
index++;
index++;
}
int len = count_chars(&str[index], ch);
if ((arr[i] = malloc(sizeof(char) * (len + 1))) == NULL)
return NULL;
memcpy(arr[i], &str[index], len);
index += len;
arr[i++][len] = '\0';
}
return arr;
}
int main(void)
{
char *str = "Lorem ipsum ipsum Lorem lipsum gorem insum";
char **s = strsep_(str, ' ');
/* char *str = "Lorem + Ipsum"; */
/* char **s = strsep_(str, '+'); */
/* char *str = "lorem, torem, horem, lorem"; */
/* char **s = strsep_(str, ','); */
while (*s != NULL) {
printf("-> [%s]\n", *s);
s++;
}
/* dont forget to free */
return 0;
}
I'm trying to solve a challenge, but I have no idea of what's wrong with my code!
The challenge is:
Create a function that splits a string of characters into words.
Separators are spaces, tabs and line breaks.
This function returns an array where each box contains a character-string’s address represented by a word. The last element of this array should be equal to 0 to emphasise the end of the array.
There can’t be any empty strings in your array. Draw the necessary conclusions.
The given string can’t be modified.
Note: The only allowed function is malloc()
The bug/problem:
I faced this problem and I tried to solve it but I wasn't able to identify what's wrong.
I created a function named split_whitespaces() to do the job.
When I print the array of strings inside of the split_whitespaces function, I get the following output:
Inside the function:
arr_str[0] = This
arr_str[1] = is
arr_str[2] = just
arr_str[3] = a
arr_str[4] = test!
And when I print the array of string inside the main function, I get the following output:
Inside the main function:
arr_str[0] = #X#?~
arr_str[1] = `X#?~
arr_str[2] = just
arr_str[3] = a
arr_str[4] = test!
I created a function word_count to count how many words in the input string so I can allocate memory using malloc and with word_count + 1 (null pointer).
int word_count(char *str) {
int i;
int w_count;
int state;
i = 0;
w_count = 0;
state = 0;
while (str[i]) {
if (!iswhitespace(str[i])) {
if (!state)
w_count++;
state = 1;
i++;
} else {
state = 0;
i++;
}
}
return (w_count);
}
And another function called strdup_w to mimic the behavior of strdup but just for single words:
char *strdup_w(char *str, int *index) {
char *word;
int len;
int i;
i = *index;
len = 0;
while (str[i] && !iswhitespace(str[i]))
len++, i++;;
word = (char *) malloc(len + 1);
if (!word)
return (NULL);
i = 0;
while (str[*index]) {
if (!iswhitespace(str[*index])) {
word[i++] = str[*index];
(*index)++;
} else
break;
}
word[len] = '\0';
return (word);
}
Here's my full code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char **split_whitespaces(char *str);
char *strdup_w(char *str, int *index);
int word_count(char *str);
int iswhitespace(char c);
int main(void) {
char *str = "This is just a test!";
char **arr_str;
int i;
i = 0;
arr_str = split_whitespaces(str);
printf("\nOutside the function:\n");
while (arr_str[i]) {
printf("arr_str[%d] = %s\n", i, arr_str[i]);
i++;
}
return (0);
}
char **split_whitespaces(char *str) {
char **arr_str;
int i;
int words;
int w_i;
i = 0;
w_i = 0;
words = word_count(str);
arr_str = (char **)malloc(words + 1);
if (!arr_str)
return (NULL);
printf("Inside the function:\n");
while (w_i < words) {
while (iswhitespace(str[i]) && str[i])
if (!str[i++])
break;
arr_str[w_i] = strdup_w(str, &i);
printf("arr_str[%d] = %s\n", w_i, arr_str[w_i]);
w_i++;
}
arr_str[words] = 0;
return (arr_str);
}
char *strdup_w(char *str, int *index) {
char *word;
int len;
int i;
i = *index;
len = 0;
while (str[i] && !iswhitespace(str[i]))
len++, i++;;
word = (char *)malloc(len + 1);
if (!word)
return (NULL);
i = 0;
while (str[*index]) {
if (!iswhitespace(str[*index])) {
word[i++] = str[*index];
(*index)++;
} else
break;
}
word[len] = '\0';
return (word);
}
int word_count(char *str) {
int i;
int w_count;
int state;
i = 0;
w_count = 0;
state = 0;
while (str[i]) {
if (!iswhitespace(str[i])) {
if (!state)
w_count++;
state = 1;
i++;
} else {
state = 0;
i++;
}
}
return (w_count);
}
int iswhitespace(char c) {
if (c == ' ' || c == '\t' || c == '\n' || c == '\r')
return (1);
return (0);
}
I'm sorry if anything is off; this is my first time asking for help.
There are multiple problems in the code:
the size is incorrect in arr_str = (char **)malloc(words + 1); You must multiply the number of elements by the size of the element:
arr_str = malloc(sizeof(*arr_str) * (words + 1));
it is good style to free the array in the main() function after use.
the test while (iswhitespace(str[i]) && str[i]) is redundant: if w_count is computed correctly, testing str[i] should not be necessary. You should use strspn() to skip the white space and strcspn() to skip the word characters.
if (!str[i++]) break; is completely redundant inside the loop: str[i] has already been tested and is not null.
while (str[i] && !iswhitespace(str[i])) len++, i++;; is bad style. Use braces if there is more than a single simple statement in the loop body.
the last loop in strdup_w is complicated, you could simply use memcpy(word, str + *index, len); *index += len;
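As an illustration of the strspn()/strcspn() suggestion, here is a minimal sketch of a word counter. Note that it relies on <string.h>, which the original challenge does not allow, and that the name count_words and the separator set are assumptions made for this example:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts whitespace-separated words in str. */
size_t count_words(const char *str) {
    static const char seps[] = " \t\n\r";
    size_t count = 0;

    /* Skip leading separators. */
    str += strspn(str, seps);
    while (*str != '\0') {
        count++;
        /* Skip the word itself, then the separators that follow it. */
        str += strcspn(str, seps);
        str += strspn(str, seps);
    }
    return count;
}
```

For the string "This is just a test!" this returns 5, and for an empty or all-whitespace string it returns 0.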
Here is a modified version:
#include <stdio.h>
#include <stdlib.h>
char **split_whitespaces(const char *str);
char *strdup_w(const char *str, int *index);
int word_count(const char *str);
int iswhitespace(char c);
int main(void) {
const char *str = "This is just a test!";
char **arr_str;
int i;
arr_str = split_whitespaces(str);
if (arr_str) {
printf("\nOutside the function:\n");
i = 0;
while (arr_str[i]) {
printf("arr_str[%d] = %s\n", i, arr_str[i]);
i++;
}
while (i-- > 0) {
free(arr_str[i]);
}
free(arr_str);
}
return 0;
}
char **split_whitespaces(const char *str) {
char **arr_str;
int i;
int words;
int w_i;
i = 0;
w_i = 0;
words = word_count(str);
arr_str = malloc(sizeof(*arr_str) * (words + 1));
if (!arr_str)
return NULL;
printf("Inside the function:\n");
while (w_i < words) {
while (iswhitespace(str[i]))
i++;
arr_str[w_i] = strdup_w(str, &i);
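if (!arr_str[w_i]) {
// allocation failure: free the words copied so far and the array itself
while (w_i-- > 0)
free(arr_str[w_i]);
free(arr_str);
return NULL;
}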
printf("arr_str[%d] = %s\n", w_i, arr_str[w_i]);
w_i++;
}
arr_str[words] = NULL;
return arr_str;
}
char *strdup_w(const char *str, int *index) {
char *word;
int len;
int start;
int i;
i = *index;
start = i;
while (str[i] && !iswhitespace(str[i])) {
i++;
}
*index = i;
len = i - start;
word = malloc(len + 1);
if (!word)
return NULL;
i = 0;
while (i < len) {
word[i] = str[start + i];
i++;
}
word[i] = '\0';
return word;
}
int word_count(const char *str) {
int i;
int w_count;
int state;
i = 0;
w_count = 0;
state = 0;
while (str[i]) {
if (!iswhitespace(str[i])) {
if (!state)
w_count++;
state = 1;
} else {
state = 0;
}
i++;
}
return w_count;
}
int iswhitespace(char c) {
return (c == ' ' || c == '\t' || c == '\n' || c == '\r');
}
From my top comment ...
In split_whitespaces, try changing:
arr_str = (char **) malloc(words + 1);
into:
arr_str = malloc(sizeof(*arr_str) * (words + 1));
As you have it, words is a count and not a byte length, so you're not allocating enough space, so you have UB.
UPDATE:
But I watched some tutorials and they said that malloc takes one argument, which is the size of the memory to be allocated (in bytes); that's why I allocated memory for 5 bytes! Can you please tell me an alternative to using malloc without the sizeof() function? I'd appreciate it. – Achraf EL Khnissi
There's really no clean way to specify this without sizeof.
sizeof is not a function [despite the syntax]. It is an operator. It "returns" the number of bytes occupied by its operand, normally as a compile time constant (the one exception being variable-length arrays).
If we have char buf[5];, there are 5 bytes, so sizeof(buf) [or sizeof buf] is 5.
If we have: int buf[5];, there are 5 elements, each of size int which is [typically] 4 bytes, so the total space, in bytes, is sizeof(int) * 5 or 4 * 5 which is 20.
But, int can vary depending on the architecture. On Intel 8086's [circa the 1980's], an int was 2 bytes (i.e. 16 bits). So, the above 4 * 5 would be wrong. It should be 2 * 5.
If we use sizeof(int), then sizeof(int) * 5 works regardless of the architecture.
Similarly, on 32 bit machines, a pointer is [usually] 32 bits. So, sizeof(char *) is 4 [bytes]. On a 64 bit machine, a pointer is 64 bits, which is 8 bytes. So, sizeof(char *) is 8.
Because arr_str is: char **arr_str, we could have done:
arr_str = malloc(sizeof(char *) * (words + 1));
But, if the definition of arr_str ever changed (to (e.g.) struct string *arr_str;), then what we just did would break/fail if we forgot to change the assignment to:
arr_str = malloc(sizeof(struct string) * (words + 1));
So, doing:
arr_str = malloc(sizeof(*arr_str) * (words + 1));
is a preferred idiomatic way to write cleaner code. More statements will adjust automatically without having to find all affected lines of code manually.
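A minimal sketch of the idiom, assuming the caller wants a NULL-filled array with one extra slot for the terminator (the helper name alloc_ptr_array is hypothetical):

```c
#include <stdlib.h>

/* Hypothetical helper: allocates words + 1 pointers and sets them all to NULL.
   sizeof(*arr) follows the declared type of arr, so the size expression stays
   correct if the element type is changed later. */
char **alloc_ptr_array(size_t words) {
    char **arr = malloc(sizeof(*arr) * (words + 1));
    if (arr == NULL)
        return NULL;
    for (size_t i = 0; i <= words; i++)
        arr[i] = NULL;
    return arr;
}
```

The operand of sizeof is not evaluated here, so using *arr in the same statement that initializes arr is fine.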
UPDATE #2:
You might just add why you removed the (char **) cast :) -- chqrlie
Note that I removed the (char **) cast. See: Do I cast the result of malloc?
This just adds extra/unnecessary "stuff" as the void * return value of malloc can be assigned to any type of pointer.
If we forgot to do: #include <stdlib.h>, there would be no function prototype for malloc, so the compiler would default the return type to int.
Without the cast, the compiler would issue an error on the statement [which is what we want].
With the cast, this action is masked at compile time [more or less]. On a 64 bit machine, the compiler will use a value that is truncated to 32 bits [because it thinks malloc returns a 32 bit value] instead of the full 64 bit return value of malloc.
This truncation is a "silent killer". What should have been flagged as a compile time error produces a runtime fault (probably segfault or other UB) that is much harder to debug.
My problem now is that I have taken space for different words,but I'm having problems storing this as an array. Even though there are some similar posts like this, nothing seems to work for me and I'm completely stuck here. I want to keep this format(i don't want to change the definition of the function). Grateful for all help and comments!
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
char** split(const char* s, int *n){
int i, len = 0, counter = 0;
char ** p = 0;
for(i = 0; s[i] != '\0'; i++){
len++;
if(s[i] == ' ' || s[i+1] == '\0'){
counter ++;
for(i = 0; i < len; i++){
p[i] = s[i];
}
}
printf("%d\n", len);
printf("%d\n", counter);
return p;
}
int main() {
char *s = "This is a string";
int n;
int i;
for(i = 0; i < n*; i++){
//also not sure how to print this
}
}
I edited your code and it's now working correctly:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
char** split(const char* s, int *n);
char** split(const char* s, int *n) {
int i, len = 0, counter = 0;
char ** p = 0;
for(int i = 0; ; ++i) {
if(s[i] == '\0') {
break;
}
if(s[i] == ' ') {
counter += 1;
}
}
++counter;
p = (char **) malloc(counter * sizeof(char*));
for(int i = 0, c = 0; ; ++i, ++c) {
if(s[i] == '\0') {
break;
}
len = 0;
while(s[len + i + 1] != ' ' && s[len + i + 1] != '\0') {
++len;
}
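// The word spans len + 1 characters, plus one byte for the terminator
p[c] = (char *) malloc(len + 2);
int k = 0;
for(int j = i; j < i + len + 1; ++j) {
p[c][k++] = s[j];
}
p[c][k] = '\0';
i += len + 1;
// Stop if this word ended the string, so the loop increment does not step past the terminator
if(s[i] == '\0') {
break;
}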
}
*n = counter;
return p;
}
int main() {
char *s = "This is a string";
int n;
int i;
char** split_s = split(s, &n);
for(i = 0; i < n; i++) {
printf("%s\n", split_s[i]);
}
}
But I suggest you do a little bit clean-up.
Here is a solution using sscanf. scanf and sscanf treat whitespace as the end of an input field, and I have taken advantage of that here.
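const char *str = "This is a string";
char buffer[50];
char **p = NULL;
int consumed = 0;
// %49s limits each word to the buffer size; %n records how many characters were consumed
for (int i = 0; sscanf(str, "%49s%n", buffer, &consumed) == 1; i++)
{
// grow the pointer array so that it holds i + 1 entries
char **tmp = realloc(p, (i + 1) * sizeof(*p));
if (tmp == NULL)
break;
p = tmp;
p[i] = malloc(strlen(buffer) + 1);
if (p[i] == NULL)
break;
strcpy(p[i], buffer);
printf("%s\n", p[i]);
// continue scanning right after the word just read
str += consumed;
}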
Since the data grows in two dimensions, every time a new string is found we need to resize p itself and then allocate space for the new string that it points to.
My problem now is that I have taken space for different words using malloc, but I'm having problems storing this as an array.
When addressable memory for a collection of strings is needed, you need a collection of pointers as well as memory for each pointer to point to.
In your code:
p = (char**)malloc(counter*sizeof(char*));
You have created the collection of pointers, but you have not yet created memory at those locations to accommodate the strings. (By the way, the cast is not necessary)
Here are the essential steps to both create a collection of pointers, and memory for each:
//for illustration, pick sizes for count of strings needed,
//and length of longest string needed.
#define NUM_STRINGS 5
#define STR_LEN 80
char **stringArray = NULL;
stringArray = malloc(NUM_STRINGS*sizeof(char *));// create collection of pointers
if(stringArray)
{
for(int i=0;i<NUM_STRINGS;i++)
{
stringArray[i] = malloc(STR_LEN + 1);//create memory for each string, +1 room for nul terminator
if(!stringArray[i])
{
//handle error
}
}
}
As a function it could look like this: (replacing malloc with calloc for initialized space)
char ** Create2DStr(size_t numStrings, size_t maxStrLen)
{
size_t i;
char **a = calloc(numStrings, sizeof(char *));
if(!a)
{
return NULL; // could not allocate the collection of pointers
}
for(i=0;i<numStrings; i++)
{
a[i] = calloc(maxStrLen + 1, 1);
}
return a;
}
using this in your split() function:
char** split(const char* s, int *n){
int i, len = 0, counter = 0, lenLongest = 0;
char ** p = 0;
//code to count words and longest word
p = Create2DStr(counter, lenLongest + 1); //+1 for nul termination
if(p)
{
//your searching code
//...
// when finished, free memory
Let's start at the logic.
How does a string like A quick brown fox. get processed? I would suggest:
Count the number of words, and the amount of memory needed to store the words. (In C, each string ends with a terminating nul byte, \0.)
Allocate enough memory for the pointers and the words.
Copy each word from the source string.
We have a string as an input, and we want an array of strings as output. The simplest option is
char **split_words(const char *source);
where the return value is NULL if an error occurs, or an array of pointers terminated by a NULL pointer otherwise. All of it is dynamically allocated at once, so calling free() on the return value will free both the pointers and their contents.
Let's start implementing the logic according to the bullet points above.
#include <stdlib.h>
char **split_words(const char *source)
{
size_t num_chars = 0;
size_t num_words = 0;
size_t w = 0;
const char *src;
char **word, *data;
/* Sanity check. */
if (!source)
return NULL; /* split_words(NULL) will return NULL. */
/* Count the number of words in source (num_words),
and the number of chars needed to store
a copy of each word (num_chars). */
src = source;
while (1) {
/* Skip any leading whitespace (not just spaces). */
while (*src == '\t' || *src == '\n' || *src == '\v' ||
*src == '\f' || *src == '\r' || *src == ' ')
src++;
/* No more words? */
if (*src == '\0')
break;
/* We have one more word. Account for the pointer itself,
and the string-terminating nul char. */
num_words++;
num_chars++;
/* Count and skip the characters in this word. */
while (*src != '\0' && *src != '\t' && *src != '\n' &&
*src != '\v' && *src != '\f' && *src != '\r' &&
*src != ' ') {
src++;
num_chars++;
}
}
/* If the string has no words in it, return NULL. */
if (num_chars < 1)
return NULL;
/* Allocate memory for both the pointers and the data.
One extra pointer is needed for the array-terminating
NULL pointer. */
word = malloc((num_words + 1) * sizeof (char *) + num_chars);
if (!word)
return NULL; /* Not enough memory. */
/* Since 'word' is the return value, and we use
num_words + 1 pointers in it, the rest of the memory
we allocated we use for the string contents. */
data = (char *)(word + num_words + 1);
/* Now we must repeat the first loop, exactly,
but also copy the data as we do so. */
src = source;
while (1) {
/* Skip any leading whitespace (not just spaces). */
while (*src == '\t' || *src == '\n' || *src == '\v' ||
*src == '\f' || *src == '\r' || *src == ' ')
src++;
/* No more words? */
if (*src == '\0')
break;
/* We have one more word. Assign the pointer. */
word[w] = data;
w++;
/* Count and skip the characters in this word. */
while (*src != '\0' && *src != '\t' && *src != '\n' &&
*src != '\v' && *src != '\f' && *src != '\r' &&
*src != ' ') {
*(data++) = *(src++);
}
/* Terminate this word. */
*(data++) = '\0';
}
/* Terminate the word array. */
word[w] = NULL;
/* All done! */
return word;
}
We can test the above with a small test main():
#include <stdio.h>
int main(int argc, char *argv[])
{
char **all;
size_t i;
all = split_words(" foo Bar. BAZ!\tWoohoo\n More");
if (!all) {
fprintf(stderr, "split_words() failed.\n");
exit(EXIT_FAILURE);
}
for (i = 0; all[i] != NULL; i++)
printf("all[%zu] = \"%s\"\n", i, all[i]);
free(all);
return EXIT_SUCCESS;
}
If we compile and run the above, we get
all[0] = "foo"
all[1] = "Bar."
all[2] = "BAZ!"
all[3] = "Woohoo"
all[4] = "More"
The downside of this approach (of using one malloc() call to allocate memory for both the pointers and the data), is that we cannot easily grow the array; we can really just treat it as one big clump.
A better approach, especially if we intend to add new words dynamically, is to use a structure:
typedef struct {
size_t max_words; /* Number of pointers allocated */
size_t num_words; /* Number of words in array */
char **word; /* Array of pointers */
} wordarray;
Unfortunately, this time we need to allocate each word separately. However, if we use a structure to describe each word in a common allocation buffer, say
typedef struct {
size_t offset;
size_t length;
} wordref;
typedef struct {
size_t max_words;
size_t num_words;
wordref *word;
size_t max_data;
size_t num_data;
char *data;
} wordarray;
#define WORDARRAY_INIT { 0, 0, NULL, 0, 0, NULL }
static inline const char *wordarray_word_ptr(wordarray *wa, size_t i)
{
if (wa && i < wa->num_words)
return wa->data + wa->word[i].offset;
else
return "";
}
static inline size_t wordarray_word_len(wordarray *wa, size_t i)
{
if (wa && i < wa->num_words)
return wa->word[i].length;
else
return 0;
}
The idea is that if you declare
wordarray words = WORDARRAY_INIT;
you can use wordarray_word_ptr(&words, i) to get a pointer to the ith word, or a pointer to an empty string if ith word does not exist yet, and wordarray_word_len(&words, i) to get the length of that word (much faster than calling strlen(wordarray_word_ptr(&words, i))).
The underlying reason why we cannot use char * here, is that realloc()ing the data area (where the word pointers would point to) may change its address. If that were to happen, we'd have to adjust every pointer in our array. It is much easier to use offsets to the data area instead.
The only downside to this approach is that deleting words does not mean a corresponding shrinkage in the data area. However, it is possible to write a simple "compactor" function, that repacks the data to a new area, so that holes left by deleted words are "moved" to the end of the data area. Usually, this is not necessary, but you might wish to add a member to the wordarray structure, say the number of lost characters from word deletions, so that the compaction can be done heuristically the next time the data area would be otherwise resized.
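To make the offset-based layout more concrete, here is a sketch of how appending a word might look. The wordarray and wordref types are repeated from above so the example is self-contained; wordarray_append and wordarray_free are hypothetical helpers, and the growth policy (doubling) is an assumption chosen for brevity:

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    size_t offset;
    size_t length;
} wordref;

typedef struct {
    size_t max_words;
    size_t num_words;
    wordref *word;
    size_t max_data;
    size_t num_data;
    char *data;
} wordarray;

/* Hypothetical helper: appends a copy of the len-byte word at src.
   Returns 0 on success, -1 on failure (the array is left usable). */
int wordarray_append(wordarray *wa, const char *src, size_t len)
{
    if (!wa || !src)
        return -1;

    /* Grow the array of word references if it is full. */
    if (wa->num_words >= wa->max_words) {
        size_t new_max = wa->max_words ? 2 * wa->max_words : 8;
        wordref *new_word = realloc(wa->word, new_max * sizeof *new_word);
        if (!new_word)
            return -1;
        wa->word = new_word;
        wa->max_words = new_max;
    }

    /* Grow the data area if the word and its terminator do not fit. */
    if (wa->num_data + len + 1 > wa->max_data) {
        size_t new_max = 2 * wa->max_data + len + 1;
        char *new_data = realloc(wa->data, new_max);
        if (!new_data)
            return -1;
        wa->data = new_data;
        wa->max_data = new_max;
    }

    /* Copy the word and record where it lives as an offset, not a pointer. */
    memcpy(wa->data + wa->num_data, src, len);
    wa->data[wa->num_data + len] = '\0';
    wa->word[wa->num_words].offset = wa->num_data;
    wa->word[wa->num_words].length = len;
    wa->num_words++;
    wa->num_data += len + 1;
    return 0;
}

/* Hypothetical helper: releases both allocations and resets the structure. */
void wordarray_free(wordarray *wa)
{
    if (!wa)
        return;
    free(wa->word);
    free(wa->data);
    wa->word = NULL;
    wa->data = NULL;
    wa->max_words = wa->num_words = 0;
    wa->max_data = wa->num_data = 0;
}
```

Because words are addressed by offset, a realloc() that moves the data area does not invalidate any entry in the word array.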
Could someone please explain the error?
I was getting the error until, on a whim, I changed the line from:
char *tmp = realloc(str, sizeof(char)*length);
// to this, with 1 added:
char *tmp = realloc(str, sizeof(char) * length + 1);
I thought that multiplying sizeof(char) by length would reallocate a new memory area of size=sizeof(char)*length. I'm not understanding why adding 1 fixes the problem.
void edit_print(char *inputStr, size_t space_size) {
size_t ch_position = 0;
size_t space_column_count = 0;
size_t num_spaces_left = 0;
while ((inputStr[ch_position] != '\0')) {
if ((inputStr[ch_position] == '\t') && (space_size !=0)) {
num_spaces_left = (space_size-(space_column_count % space_size));
if (ch_position == 0 || !(num_spaces_left)) {
for (size_t i=1; i <= space_size; i++) {
putchar(' ');
space_column_count++;
}
ch_position++;
} else {
for (size_t i=1; i <= num_spaces_left; i++) {
putchar(' ');
space_column_count++;
}
ch_position++;
}
} else {
putchar(inputStr[ch_position++]);
space_column_count++;
}
}
printf("\n");
}
int main(int argc, char *argv[]) {
size_t space_size_arg = 3;
int inputch;
size_t length = 0;
size_t size = 10;
char *str = realloc(NULL, sizeof(char) * size);
printf("Enter stuff\n");
while ((inputch = getchar()) != EOF) {
if (inputch == '\n') {
str[length++] = '\0';
//changed line below
char *tmp = realloc(str, sizeof(char) * length + 1);
if (tmp == NULL) {
exit(0);
} else {
str = tmp;
}
edit_print(str, space_size_arg);
length = 0;
} else {
str[length++] = inputch;
if (length == size) {
char *tmp = realloc(str, sizeof(char) * (size += 20));
if (tmp == NULL) {
exit(0);
} else {
str = tmp;
}
}
}
}
free(str);
return 0;
}
EDIT: the error message I originally got was the one in the heading of this post. After making the changes suggested by chux, the error is "realloc(): invalid next size: *hexnumber**"
size needs updating when inputch == '\n'.
char *tmp = realloc(str, sizeof(char) * length + 1 /* or no +1 */); can shrink the allocation, which makes a later if (length == size) test invalid (the true allocation size is now smaller), so str[length++] = inputch; can write past the end of the buffer. Update size to close that hole.
+1 not needed - it simply hid the problem as the + 1 did not shrink the allocation as much.
char *tmp = realloc(str, sizeof(char) * length);
if (tmp == NULL) {
exit(0);
} else {
str = tmp;
}
size = length; // add
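One way to keep the recorded capacity and the real allocation from drifting apart is to store them together and update both in the same place. The following is a sketch under that assumption; the linebuf type and its functions are hypothetical and not part of the original program:

```c
#include <stdlib.h>

typedef struct {
    char *data;       /* allocated buffer, or NULL */
    size_t length;    /* bytes in use */
    size_t capacity;  /* bytes allocated */
} linebuf;

/* Appends one character, growing the buffer by 20 bytes when it is full.
   Returns 0 on success, -1 if realloc fails (the buffer is left unchanged). */
int linebuf_push(linebuf *b, char c) {
    if (b->length == b->capacity) {
        size_t new_cap = b->capacity + 20;
        char *tmp = realloc(b->data, new_cap);
        if (tmp == NULL)
            return -1;
        b->data = tmp;
        b->capacity = new_cap;
    }
    b->data[b->length++] = c;
    return 0;
}

/* Shrinks the allocation to the bytes in use and records the new capacity,
   so a later "is it full?" test still matches the real allocation. */
int linebuf_shrink(linebuf *b) {
    if (b->length == 0 || b->length == b->capacity)
        return 0;
    char *tmp = realloc(b->data, b->length);
    if (tmp == NULL)
        return -1;
    b->data = tmp;
    b->capacity = b->length;
    return 0;
}
```

A buffer initialized as { NULL, 0, 0 } can be passed straight to linebuf_push, since realloc(NULL, n) behaves like malloc(n).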
Concerning the sizeof(char) * code: the idea of scaling by the size of the target type is good, yet with char it is not important, as it is always 1. @Lee Daniel Crocker
If the code should reflect that the type of the target may change, do not use sizeof(the_type); use sizeof(*the_pointer). It is easier to code, review and maintain.
// Don't even need to code the type `str` points to
tmp = realloc(str, sizeof *str * length);
I am trying to reverse words in a string and I believe that I have written the right logic, but while debugging I came to know that the value that I am putting in new_array char pointer variable is getting lost? And I don't seem to have any idea why?
Can you tell me what is the mistake I am doing and what can be done to correct it?
#include <stdio.h>
void reverse_words(char *arr, int size) {
char *ptr = arr;
char *new_array = (char*)malloc(sizeof(char*) * size);
while (*ptr != '\0') {
ptr++;
}
ptr--;
for (int i = size - 1; i >= 0; i--) {
if (*ptr != ' ') {
ptr--;
} else {
char *temp = ptr;
temp++;
//
// Problem is in this block new_array value is lost when i increment it
while (*temp != ' ' && *temp != '\0') {
*new_array = *temp;
new_array++;
temp++;
}
if (i != 0) {
*new_array = *ptr;
ptr--;
}
}
}
*new_array = '\0';
strcpy(arr, new_array);
return;
}
int main() {
char arr[] = "My job is coding";
int size = sizeof(arr);
reverse_words(arr, size);
printf("%s", arr);
return 0;
}
Your code is too complicated and has several problems:
you allocate too much memory: (char*)malloc(sizeof(char*) * size); allocates size times the size of a pointer. Use malloc(size); to allocate size bytes.
there are errors in your pointer manipulations,
you forget to free the allocated memory, causing a memory leak.
you do not need to pass the size of the array, the length of the string is computed by scanning for the null terminator, just allocate one extra byte for the null terminator.
There is an alternative solution to reverse the words in the string in place without memory allocation:
for each word, reverse the word
final step: reverse the string
Here is the code using a utility function:
#include <stdio.h>
void reverse_mem(char *str, int size) {
for (int i = 0, j = size; i < --j; i++) {
char c = str[i];
str[i] = str[j];
str[j] = c;
}
}
void reverse_words(char *str) {
int i = 0, j = 0;
for (;; i = j) {
for (; str[i] == ' '; i++)
continue;
if (str[i] == '\0')
break;
for (j = i; str[j] != '\0' && str[j] != ' '; j++)
continue;
reverse_mem(str + i, j - i);
}
// at this point, i is the length of the string
reverse_mem(str, i);
}
int main(void) {
char arr[] = "My job is coding";
reverse_words(arr);
printf("%s\n", arr);
return 0;
}
You are not getting the expected output because of these statements:
*new_array = *temp; // the second time around, new_array no longer points to the previous location, so how will you retrieve the data?
new_array++; // the earlier data is lost because the pointer no longer holds the starting address
When the main loop finishes, new_array points to the end of the buffer, but it should hold the starting address. Instead of incrementing new_array directly, index into it like this:
* (new_array + j) = *temp
I modified your code as follows:
void reverse_words(char *arr,int size)
{
char * ptr = arr;
char *new_array = (char*)malloc(sizeof(char*) * size);
printf("initial : %p \n",(void *)ptr);
while(*ptr != '\0')
{
ptr++;
}
int j=0;
ptr--;
printf("first : %c : %p \n",*ptr,(void *)ptr);
for(int i = size-1;i >=0;i--)
{
if(*ptr != ' ' ) //&& i!=0)
{
ptr--;//from last to back
}
else
{
char * temp = ptr;
temp++;
while(*temp != '\0' && *temp!=' ')
{
new_array[j++]=*temp;
temp++;
}
if(i!=0)
{
new_array[j++] = *ptr;
ptr--;
}
}
}
new_array[j] = '\0';
printf("new = %s \n",new_array);
strcpy(arr,new_array);
free(new_array); // release the temporary buffer
}
I hope you can see where it was going wrong. You still need to modify the condition for the first word, because once i becomes 0 there is no logic to handle it.
My Solution according to your requirements
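void reverse_words(char *arr,int size) {
char *new_array = malloc(size * sizeof(char));
int i,j,k=0;
if(new_array == NULL)
return;
for(i = size-1 ;i >= -1 ; i--) {
// test i first so that arr[-1] is never read
if(i == -1 || arr[i] ==' ')
{
for(j=i+1;arr[j]!=' ' && arr[j]!='\0'; j++)
new_array[k++] = arr[j];
// separate words with a space, but not after the last one
if(i != -1)
new_array[k++] = ' ';
}
}
new_array[k] = '\0';
strcpy(arr,new_array);
free(new_array);
}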
The code below reads characters and splits them into C-style strings whenever a delimiter is encountered. It stores the words (white-space-separated sequences of characters) in an array of strings until a sentinel word is encountered, and then updates the size of the array:
#include <stdio.h> // printf()
#include <stdlib.h> // malloc(); realloc()
#include <string.h> // strcmp()
#include <stddef.h> // size_t
void print_array(char* arr[ ], size_t size); // forward declaration to use in to_array()
char* get_word(char delimiter)
{
size_t size = 8;
size_t index = 0;
int c = 0;
char* word = 0;
char* expand_word = 0;
word = (char*) malloc(sizeof(char) * size);
if (word == NULL)
{
perror("get_word::bad malloc!\n");
exit(-1);
}
while ((c = getchar()) != EOF && c != delimiter && c != '\n')
{
if (index >= size)
{
size *= 2;
expand_word = (char*) realloc(word, sizeof(char) * size);
if (expand_word == NULL)
{
perror("get_word::bad realloc!\n");
exit(-1);
}
word = expand_word;
}
word[index++] = c;
}
word[index] = 0;
return word;
}
//-------------------------------------------------------------------------------------
void to_array(char* arr[ ], size_t* size, char* sentinel)
{
size_t index = 0;
char* word = 0;
char** expand_arr = 0;
char delimiter = ' ';
while ((word = get_word(delimiter)) && strcmp(word, sentinel) != 0)
{
if (index >= (*size))
{
(*size) *= 2;
expand_arr = (char**) realloc(arr, sizeof(char*) * (*size));
if (expand_arr == NULL)
{
perror("to_array::bad realloc!\n");
exit(-1);
}
arr = expand_arr;
}
arr[index++] = word;
}
(*size) = index;
// print_array(arr, *size); // <---- here, all words printed OK.
// getchar();
}
//-------------------------------------------------------------------------------------
void print_array(char* arr[ ], size_t size)
{
size_t i = 0;
printf("{ ");
for (i; i < size; ++i)
{
printf("%s", arr[i]);
if (i < size - 1)
{
printf(", ");
}
}
printf(" }\n");
}
//-------------------------------------------------------------------------------------
int main()
{
size_t size = 4;
char** arr = 0;
char* sentinel = "quit";
arr = (char**) malloc(sizeof(char*) * size);
if (arr == NULL)
{
perror("array of strings::bad malloc!\n");
exit(-1);
}
printf("Type a sentence and get each word as an array element:\n");
to_array(arr, &size, sentinel);
printf("Words:\n");
print_array(arr, size); // <--------- here, error!
getchar();
}
When trying to print the string array, I get:
Access violation reading location 0xcd007361.
Why can't I print the strings in arr at the end?
P.S.: I guess that the problem comes from the pointer arithmetic and the reallocation of the char** arr within the function to_array(). If that is right, what would be the standard way to deal with it?
Problem: the first parameter of to_array(), i.e. char* arr[ ], is really a char** passed by value, so the function works on a copy of the caller's pointer. Any change made to that pointer inside the function does not affect the pointer in main(). In particular, realloc() may move the memory block to a new location, which leaves the pointer in main() pointing to memory that has been freed.
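To make the pass-by-value point concrete, here is a minimal sketch. The names retarget_by_value, retarget_by_pointer and other are made up for this illustration and do not appear in the posted code.

```c
#include <assert.h>
#include <stddef.h>

static int other = 2; /* hypothetical target used only for this demo */

/* Receives a copy of the caller's pointer: the assignment stays local. */
void retarget_by_value(int *p)
{
    p = &other;
    (void)p; /* silence "set but not used" warnings */
}

/* Receives the address of the caller's pointer: the caller sees the change. */
void retarget_by_pointer(int **p)
{
    assert(p != NULL);
    *p = &other;
}
```

Given int x = 1; int *ptr = &x;, a call to retarget_by_value(ptr) leaves ptr pointing at x, whereas retarget_by_pointer(&ptr) makes it point at other. Assigning the result of realloc() to the parameter arr inside to_array() is the first case: only the local copy is updated.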
Solution: either modify to_array() so that it returns the updated arr, or change its first parameter to char** arr[ ], i.e. a pointer to the caller's pointer. The latter was chosen, and the modified code looks like this:
void to_array(char** arr[ ], size_t* size, char* quit)
{
size_t index = 0;
char* word = 0;
char** expand_arr = 0;
char sentinel = ' ';
while ((word = get_word(sentinel)) && strcmp(word, quit) != 0)
{
if (index >= (*size))
{
(*size) *= 2;
expand_arr = (char**) realloc((*arr), sizeof(char*) * (*size));
if (expand_arr == NULL)
{
perror("to_array::bad realloc!\n");
exit(-1);
}
(*arr) = expand_arr;
}
(*arr)[index++] = word;
}
(*size) = index;
}
Then the function must be called with the address of arr (the variable holding the sentinel word is named sentinel in main()):
to_array(&arr, &size, sentinel);
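The other option mentioned in the solution, returning the possibly moved pointer, could look roughly like the sketch below. It is not the code that was chosen above: append_word and its count and capacity parameters are hypothetical names, and the function appends a single word instead of reading from stdin so that it stays short.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper illustrating the "return the pointer" option.
 * Appends word to arr, growing the array when it is full.
 * Returns the (possibly moved) array, or NULL if realloc() fails,
 * in which case the old array is still valid and owned by the caller. */
char **append_word(char **arr, size_t *count, size_t *capacity, char *word)
{
    assert(count != NULL && capacity != NULL);
    if (*count >= *capacity) {
        size_t new_capacity = (*capacity == 0) ? 4 : *capacity * 2;
        char **bigger = realloc(arr, sizeof(char *) * new_capacity);
        if (bigger == NULL)
            return NULL;
        arr = bigger;              /* only the local copy changes here... */
        *capacity = new_capacity;
    }
    arr[(*count)++] = word;
    return arr;                    /* ...so the new address is handed back */
}
```

The caller has to store the result, for example char **tmp = append_word(arr, &count, &capacity, word); if (tmp != NULL) arr = tmp;. Forgetting that assignment reproduces the original bug, which is one argument for the pointer-to-pointer version.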