My goal is to create a function that converts a string into an array of "words" resulting from splitting the initial string by a delimiter. All words should be null-terminated.
For example: strtoarr("**hello***world*", '*') should result in {"hello", "world"}. Here's my function.
char **strtoarr(char const *s, char c)
{
char **arr;
size_t i;
size_t j;
size_t k;
arr = malloc(sizeof(**arr) * (strlen(s) + 2));
if (arr == 0)
return (NULL);
i = 0;
k = 0;
while (s[i] != '\0')
{
j = 0;
while (s[i] != c)
{
arr[k][j] = s[i];
if (s[i + 1] == c || s[i + 1] == '\0')
{
j++;
arr[k][j] = '\0';
k++;
}
i++;
j++;
}
j = 0;
while (s[i] == c)
i++;
}
arr[k] = 0;
return (arr);
}
It works only with empty strings and segfaults with everything else. I believe the problem is here.
arr[k][j] = s[i];
But I don't understand what the problem is.
Thanks in advance
There are a number of problems with your code but the most important is the dynamic allocation. Your code does not allocate memory for saving an array of strings (aka an array of array of char).
This line:
arr = malloc(sizeof(**arr) * (strlen(s) + 2));
allocates memory for a number of chars (i.e. strlen(s) + 2 chars), but that is not what you want, especially since arr is a pointer to pointer to char.
A simple approach that you can use is to allocate an array of char pointers and then for each of these pointers allocate an array of char.
This would be:
char** arr = malloc(sizeof(*arr) * NUMBER_OF_WORDS_IN_INPUT);
arr[0] = malloc(NUMBER_OF_CHARACTERS_IN_WORD0 + 1);
arr[1] = malloc(NUMBER_OF_CHARACTERS_IN_WORD1 + 1);
...
arr[NUMBER_OF_WORDS_IN_INPUT - 1] = malloc(NUMBER_OF_CHARACTERS_IN_LAST_WORD + 1);
Then you can store characters into arr using the syntax
arr[i][j] = SOME_CHARACTER;
without segfaults. (It is of course required that i and j are within the bounds of the allocation.)
The inner while loop needs to end if s[i] is the null terminator: while (s[i] != c && s[i] != '\0')
You check s[i + 1] in your if statement, but you keep looping afterwards.
Also, you are allocating way more bytes than necessary. You could use a buffer of the same size as your input string and, when the delimiter or the null terminator is found, allocate a new row of exactly the needed size in your array and copy the buffer into it.
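Putting the points above together, here is a minimal sketch of one way to do it: count the words first, allocate the array of pointers, then allocate each word at exactly its own size. The names count_words, free_words and split_by_char are made up for this illustration and are not from the question.

```c
#include <stdlib.h>
#include <string.h>

/* Count the words in s, i.e. the maximal runs of characters other than c. */
static size_t count_words(const char *s, char c)
{
    size_t n = 0;
    int in_word = 0;

    for (; *s != '\0'; s++) {
        if (*s != c && !in_word)
            n++;
        in_word = (*s != c);
    }
    return n;
}

/* Free a NULL-terminated array produced by split_by_char(). */
void free_words(char **arr)
{
    if (arr == NULL)
        return;
    for (size_t k = 0; arr[k] != NULL; k++)
        free(arr[k]);
    free(arr);
}

/* Split s on c; returns a NULL-terminated array of malloc'd strings. */
char **split_by_char(const char *s, char c)
{
    size_t k = 0;
    char **arr = malloc((count_words(s, c) + 1) * sizeof *arr);

    if (arr == NULL)
        return NULL;
    while (*s != '\0') {
        while (*s == c)                 /* skip a run of delimiters */
            s++;
        if (*s == '\0')
            break;
        size_t len = 0;
        while (s[len] != '\0' && s[len] != c)
            len++;
        arr[k] = malloc(len + 1);       /* +1 for the terminator */
        if (arr[k] == NULL) {
            free_words(arr);            /* arr[k] is NULL, so this stops there */
            return NULL;
        }
        memcpy(arr[k], s, len);
        arr[k][len] = '\0';
        k++;
        s += len;
    }
    arr[k] = NULL;
    return arr;
}
```

The caller walks the result until it hits the NULL entry and releases everything with free_words().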
I have a function named ft_split(char const *s, char c) that is supposed to take a string s and a delimiter char c and divide s into a bunch of smaller strings.
This is the 3rd or 4th day I have been trying to solve it. My approach:
Calculate the number of characters in the string, counting only one delimiter per run (if space is the delimiter and there are 2 or more spaces in a row, it counts one space and no more. Why? That space is the memory for the '\0' at the end of each split string).
Find the size (k) of the run of characters between delimiters -> malloc memory -> copy from the string to the malloc'd buffer -> copy from that buffer to the other malloc'd buffer -> start over.
But well... the function segfaults. The debugger shows that after allocating the "big" memory it does not go inside the while loop, but straight to big[y][z] = small[z], after which it exits the function.
Any tips appreciated.
#include "libft.h"
#include <stdlib.h>
int ft_count(char const *s, char c)
{
int i;
int j;
i = 0;
j = 0;
while (s[i] != '\0')
{
i++;
if (s[i] == c)
{
i++;
while (s[i] == c)
{
i++;
j++;
}
}
}
return (i - j);
}
char **ft_split(char const *s, char c)
{
int i;
int k;
int y;
int z;
char *small;
char **big;
i = 0;
y = 0;
if (!(big = (char **)malloc((ft_count(s, c) + 1) * sizeof(char))))
return (0);
while (s[i] != '\0')
{
while (s[i] == c)
i++;
k = 0;
while (s[i] != c)
{
i++;
k++;
}
if (!(small = (char *)malloc(k * sizeof(char) + 1)))
return (0);
z = 0;
while (z < k)
{
small[z] = s[i - k + z];
z++;
}
small[k] = '\0';
z = 0;
while (z < k)
{
big[y][z] = small[z];
z++;
}
y++;
free(small);
}
big[y][i] = '\0';
return (big);
}
int main()
{
char a[] = "jestemzzbogiemzalfa";
ft_split(a, 'z');
}
I didn't get everything what the code is doing, but:
You have a char **big, it's a pointer-to-pointer-to-char, so presumably is supposed to point to an array of char *, which then point to strings. That would look like this:
[ big (char **) ] -> [ big[0] (char *) ][ big[1] (char *) ][ big[2] ... ]
                        |                  |
                        v                  v
                     [ big[0][0] (char) ]  ...
                     [ big[0][1] (char) ]
                     [ big[0][2] (char) ]
                     [ ... ]
Here, when you call big = malloc(N * sizeof(char *)), you allocate space for the middle pointers, big[0] to big[N-1], the ones on the top right in the horizontal array. It still doesn't set them to anything, and doesn't reserve space for the final strings (big[0][x] etc.)
Instead, you'd need to do something like
big = malloc(N * sizeof(char *));
for (i = 0; i < N; i++) {
big[i] = malloc(k);
}
for each final string individually, with the correct size etc. Or just allocate a big area in one go, and split it among the final strings.
Now, in your code, it doesn't look like you're ever assigning anything to big[y], so those pointers might hold anything, which very likely explains the segfault when referencing big[y][z]. If you used calloc(), you'd know that big[y] was NULL; with malloc() it might be, or might not.
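The "allocate a big area in one go" option mentioned above could look roughly like this. It is only a sketch: the name split_one_block and the memory layout are my own choices for illustration. The pointer array sits at the start of the block, and a copy of the input, with its delimiters overwritten by '\0', sits right behind it.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Split s on delimiter c using a single allocation.
 * Layout of the returned block: [ char * pointers ..., NULL ][ copy of s ]
 * The caller releases everything with one free() on the returned pointer.
 */
char **split_one_block(const char *s, char c)
{
    size_t len = strlen(s);
    size_t words = 0;

    /* Pass 1: count word starts (non-delimiter at the start or after a delimiter). */
    for (size_t i = 0; i < len; i++)
        if (s[i] != c && (i == 0 || s[i - 1] == c))
            words++;

    char **big = malloc((words + 1) * sizeof *big + len + 1);
    if (big == NULL)
        return NULL;

    /* The character area starts right after the pointer array. */
    char *text = (char *)(big + words + 1);
    memcpy(text, s, len + 1);

    /* Pass 2: turn delimiters into terminators and record word starts. */
    size_t y = 0;
    for (size_t i = 0; i < len; i++) {
        if (text[i] == c)
            text[i] = '\0';
        else if (i == 0 || text[i - 1] == '\0')
            big[y++] = &text[i];
    }
    big[y] = NULL;
    return big;
}
```

The trade-off is that individual words cannot be freed or resized on their own, but cleanup is a single free(big) and there are no partially allocated states to unwind.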
Also, here:
while (s[i] != '\0')
{
while (s[i] == c)
i++;
k = 0;
while (s[i] != c) /* here */
{
i++;
k++;
}
I wonder what happens if the end of string is reached at the while (s[i] != c), i.e. if s[i] is '\0' at that point? The loop should probably stop, but it doesn't look like it does.
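For illustration, a scan that stops at either the delimiter or the end of the string could be factored out like this (word_length is a hypothetical helper name, not something from the question):

```c
#include <stddef.h>

/*
 * Length of the word starting at s: the scan stops at the delimiter c
 * or at the terminating '\0', whichever comes first.
 */
size_t word_length(const char *s, char c)
{
    size_t k = 0;

    while (s[k] != '\0' && s[k] != c)
        k++;
    return k;
}
```

With the input from the question, the last word "alfa" has no trailing 'z', so only the '\0' test keeps the scan from running off the end of the array.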
There are multiple problems in the code:
the ft_count() function is incorrect: you increment i before testing for separators, hence the number is incorrect if the string starts with separators. You should instead count the number of transitions from separator to non-separator:
int ft_count(char const *s, char c)
{
char last;
int i;
int j;
last = c;
i = 0;
j = 0;
while (s[i] != '\0')
{
if (last == c && s[i] != c)
{
j++;
}
last = s[i];
i++;
}
return j;
}
Furthermore, the ft_split() function is incorrect too:
the amount of memory allocated for the big array of pointers is invalid: you should multiply the number of elements by the element size, which is not char but char *.
you add an empty string at the end of the array if the string ends with separators. You should test for a null byte after skipping the separators.
you do not test for the null terminator when scanning for the separator after the item.
you do not store the small pointer into the big array of pointers. Instead of copying the string to big[y][...], you should just set big[y] = small and not free(small).
Here is a modified version:
char **ft_split(char const *s, char c)
{
int i;
int k;
int y;
int z;
char *small;
char **big;
if (!(big = (char **)malloc((ft_count(s, c) + 1) * sizeof(*big))))
return (0);
i = 0;
y = 0;
while (42) // aka 42 for ever :)
{
while (s[i] == c)
i++;
if (s[i] == '\0')
break;
k = 0;
while (s[i + k] != '\0' && s[i + k] != c)
{
k++;
}
if (!(small = (char *)malloc((k + 1) * sizeof(char))))
return (0);
z = 0;
while (z < k)
{
small[z] = s[i];
z++;
i++;
}
small[k] = '\0';
big[y] = small;
y++;
}
big[y] = NULL;
return (big);
}
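One gap remains in the modified version: if malloc() fails for a word, it returns 0 without releasing the words already stored in big. A small cleanup helper along these lines could cover that path. This is a sketch, and ft_free_split is a name I made up for it.

```c
#include <stdlib.h>

/*
 * Hypothetical helper: free the first count strings stored in big,
 * then the array of pointers itself.
 */
void ft_free_split(char **big, size_t count)
{
    if (big == NULL)
        return;
    for (size_t y = 0; y < count; y++)
        free(big[y]);
    free(big);
}
```

In the failure branch one would call ft_free_split(big, y) before return (0), since y is exactly the number of words stored so far. A caller holding a complete result can count the entries up to the NULL terminator and pass that count.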
42 rant:
These coding conventions (the norminette) are counterproductive! for loops are more readable and safer than these while loops, casts on the return value of malloc() are useless and confusing, and the parentheses around the argument of return are childish.
I need a function that deletes a character in a string at an index without brackets[], instead i have to use pointers. I am not allowed to use memmove or another function. This is what I have so far:
void stringDeleteChar(char *s, int index){
int i = 0;
char *hold = NULL;
if (index > strlen(s) || index < 0){
s = '\0';
}
else{
s += index;
hold = s+1;
while (i < strlen(s)){
*s = *hold;
s++;
hold++;
i++;
}
}
}
The only thing you need to fix is the while loop. s changes inside the loop, so strlen(s) shrinks on every iteration while i grows, and the loop stops too early. Instead, just keep copying until hold reaches the terminating \0:
else{
    /* assumes index < strlen(s), so that hold stays inside the string */
    s += index;
    hold = s+1;
    while (*hold){
        *s = *hold;
        s++;
        hold++;
    }
    *s = *hold; /* copy the terminator as well */
}
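Doing s = '\0' has no effect on the caller: it only sets the local copy of the pointer to null. Drop it and simply leave the function when the index is out of range. Keep in mind that index 0 is valid, since you might want to remove the first element of the array as well: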
void stringDeleteChar(char *s, int index){
if (index >= 0 && (size_t)index < strlen(s)){
for (char* str_pointer = s + index; *str_pointer != '\0'; str_pointer++)
*str_pointer = *(str_pointer + 1);
}
}
You start with the pointer (i.e., str_pointer) pointing to the position of the element to be removed (i.e., char* str_pointer = s + index). Then the rest is just shifting all the remaining elements of the array to the left by one. In other words, array[i] = array[i+1] until '\0' is reached.
When you compare the version with pointers against the one (explicitly) without them:
for (int i = index; s[i] != '\0'; i++)
s[i] = s[i+1];
You can see that s[i] is equivalent to *str_pointer
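For completeness, here is a self-contained sketch of the pointer-only version with the range check folded in. The name string_delete_char and the choice of size_t for the index (which removes the need for a negative check) are my assumptions, not part of the original question.

```c
#include <stddef.h>
#include <string.h>

/*
 * Remove the character at position index from s, in place, without [].
 * Out-of-range indices (index >= strlen(s)) leave s unchanged.
 */
void string_delete_char(char *s, size_t index)
{
    if (s == NULL || index >= strlen(s))
        return;
    /* Shift every following character, terminator included, one slot left. */
    for (char *p = s + index; *p != '\0'; p++)
        *p = *(p + 1);
}
```

The string must live in writable memory (an array, not a string literal), because the function modifies it in place.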
I'm currently working on making a shell. I want to separate a simple string into a 2D array. At first it was working perfectly, but now I have a strange problem: my simple string "str" changes after I malloc anything. For example, if I create a new array like
char *tmp = malloc(sizeof(char) * 15);
then my str, which was "ls -l", will become something like " OO" or " A".
I've already tried to change the malloc size, but it didn't solve the problem, though it did make str change differently. Here is my code:
char **mem_alloc_2d_array(int nb_rows, int nb_cols) {
char **map = malloc(nb_rows * sizeof(*map + 1));
for (int i = 0; i < nb_rows; i++) {
map[i] = malloc(nb_cols * sizeof(**map + 1));
}
return map;
}
int what_is_x(char const *str) {
int x = 2;
for (int i = 0; str[i] != '\0'; i++) {
if (str[i] == ' ')
x++;
}
return x;
}
char **try_this(char const *str) {
int size = my_strlen(str);
int x = what_is_x(str);
char **words = mem_alloc_2d_array(x, size);
return words;
}
char **my_str_to_wordtab(char *str) {
int j = 0;
int i = 0;
char **words = try_this(str);
if (str[0] == '\n' || str[0] == '\r')
words[0] = NULL;
for (; str[i] == ' ' || str[i] == '\t'; i++);
for (int x = 0; str[i] != '\n'; i++, x++) {
if (str[i] == ' ' || str[i] == '\t') {
words[j][x] = '\0';
j++;
x = 0;
while (str[i] == ' ' || str[i] == '\t')
i++;
}
words[j][x] = str[i];
}
j++;
words[j] = (char *)0;
return words;
}
What I expect is that in the function try_this(), if my str is something big like "ls -l Makefile" then both my_putstr(str) calls will print the same thing, but they don't.
It's unclear what you're talking about with regard to calls to a function my_putstr(str), as neither that function nor any calls to it appear in the code you've presented. Nevertheless, I can say for sure that your memory-allocation code is screwy, and at least partially incorrect. Consider this:
char **map = malloc(nb_rows * sizeof(*map + 1));
. What exactly is the point of the + 1 there? Note that *map has type char *, and therefore so does *map + 1. The sizeof operator computes a result based on the type of its operand, so your sizeof expression computes the same value as sizeof(*map). I'm guessing you probably want
char **map = malloc((nb_rows + 1) * sizeof(*map));
, which reserves space for nb_rows + 1 pointers.
Similarly, this ...
map[i] = malloc(nb_cols * sizeof(**map + 1));
... does not do what you probably intend. To reserve space for a terminator for each string, that would be better written as
map[i] = malloc((nb_cols + 1) * sizeof(**map));
. But since this code is specific to strings, and the size of a char is 1 by definition, I would actually write it like this, myself:
map[i] = malloc(nb_cols + 1);
Since you have not reserved sufficient space for your data, it is not surprising that you see memory corruption.
Note, too, that checking for memory allocation failure (in which case malloc() returns a null pointer) and handling it appropriately if it occurs are essential for robust code. Do get into the habit of doing that as a matter of routine, although failure to do so is probably not contributing to the particular problem you asked about.
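Combining the corrected sizes with the failure checks, the allocator might be written as follows. Treat it as a sketch: switching the parameters to size_t, adding the NULL sentinel slot, and starting each row as an empty string are my own assumptions about what is wanted.

```c
#include <stdlib.h>

/*
 * Allocate nb_rows strings with room for nb_cols characters each
 * (plus a terminator), and one extra pointer slot holding NULL.
 * Returns NULL if any allocation fails, after freeing what was allocated.
 */
char **mem_alloc_2d_array(size_t nb_rows, size_t nb_cols)
{
    char **map = malloc((nb_rows + 1) * sizeof *map);

    if (map == NULL)
        return NULL;
    for (size_t i = 0; i < nb_rows; i++) {
        map[i] = malloc(nb_cols + 1);   /* sizeof(char) is 1 by definition */
        if (map[i] == NULL) {
            while (i > 0)               /* undo the rows allocated so far */
                free(map[--i]);
            free(map);
            return NULL;
        }
        map[i][0] = '\0';               /* start each row as an empty string */
    }
    map[nb_rows] = NULL;
    return map;
}
```

Note that a caller which later overwrites a row pointer with NULL to mark the end of the list (as my_str_to_wordtab does) loses the only reference to that row, so it should free the row first.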
I'm trying to read a string in an array, and if a character is not any of the excluded characters int a = ('a'||'e'||'i'||'o'||'u'||'y'||'w'||'h'); it should copy the character into a new array, then print it.
The code reads as:
void letter_remover (char b[])
{
int i;
char c[MAX];
int a = ('a'||'e'||'i'||'o'||'u'||'y'||'w'||'h');
for (i = 0; b[i] != '\0'; i++)
{
if (b[i] != a)
{
c[i] = b[i];
}
i++;
}
c[i] = '\0';
printf("New string without forbidden characters: %s\n", c);
}
However it only prints New string without forbidden characters: h, if the inputted array is, for example hello. I'd like the output of this to be ll (with h, e and o removed).
Use this:
if (b[i] != 'a' && b[i] != 'e' && b[i] != 'i' && b[i] != 'o' && b[i] != 'u' && b[i] != 'y' && b[i] != 'w' && b[i] != 'h')
The boolean OR operator just returns 0 or 1, it doesn't create an object that automatically tests against all the parameters to the operator.
You could also use the strchr() function to search for a character in a string.
char a[] = "aeiouywh";
int j = 0; /* separate write index so that c[] has no gaps */
for (i = 0; b[i] != '\0'; i++)
{
    if (!strchr(a, b[i]))
    {
        c[j] = b[i];
        j++;
    }
}
c[j] = '\0';
int a = ('a'||'e'||'i'||'o'||'u'||'y'||'w'||'h');
...has an entirely different meaning than you expect. When you Boolean-OR together all those characters, a becomes 1. Since b[] contains no character value 1, no characters will be excluded. Also, your c[] is going to have empty slots if you had tested correctly.
You can use strcspn() to test if your string contains your forbidden characters. For example...
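// snip
size_t i = 0, j = 0;
const char *a = "aeiouywh";
while (b[i])
{
    // length of the run of allowed characters starting at b[i]
    size_t idx = strcspn(&b[i], a);
    if (idx > 0)
        strncpy(&c[j], &b[i], idx);
    j += idx;
    i += idx;
    if (b[i] != '\0')
        i++; // skip the forbidden character, but never the terminator
}
c[j] = '\0';
// etc...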
Also, you must be sure c[] is large enough to contain all the characters that might be copied.
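One way to make the size requirement explicit is to pass the capacity of the destination buffer and stop when it is full. The following is a sketch under that assumption; remove_letters and its parameter order are hypothetical, not from the question.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst, skipping every character that appears in forbidden.
 * dstsize is the capacity of dst including the terminator; output is
 * truncated if it does not fit. Returns the number of characters written,
 * not counting the terminator.
 */
size_t remove_letters(char *dst, size_t dstsize,
                      const char *src, const char *forbidden)
{
    size_t j = 0;

    if (dstsize == 0)
        return 0;
    for (size_t i = 0; src[i] != '\0' && j + 1 < dstsize; i++) {
        if (strchr(forbidden, src[i]) == NULL)
            dst[j++] = src[i];          /* j only advances on kept characters */
    }
    dst[j] = '\0';
    return j;
}
```

Called as remove_letters(c, sizeof c, b, "aeiouywh") inside letter_remover, it leaves "ll" in c for the input "hello".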
I'm in the process of writing a string tokenizer without using strtok(). This is mainly for my own betterment and for a greater understanding of pointers. I think I almost have it, but I've been receiving the following errors:
myToc.c:25 warning: assignment makes integer from pointer without a cast
myToc.c:35 (same as above)
myToc.c:44 error: invalid type argument of 'unary *' (have 'int')
What I'm doing is looping through the string sent to the method, finding each delimiter, and replacing it with '\0.' The "ptr" array is supposed to have pointers to the separated substrings. This is what I have so far.
#include <string.h>
void myToc(char * str){
int spcCount = 0;
int ptrIndex = 0;
int n = strlen(str);
for(int i = 0; i < n; i++){
if(i != 0 && str[i] == ' ' && str[i-1] != ' '){
spcCount++;
}
}
//Pointer array; +1 for \0 character, +1 for one word more than number of spaces
int *ptr = (int *) calloc(spcCount+2, sizeof(char));
ptr[spcCount+1] = '\0';
//Used to differentiate separating spaces from unnecessary ones
char temp;
for(int j = 0; j < n; j++){
if(j == 0){
/*Line 25*/ ptr[ptrIndex] = &str[j];
temp = str[j];
ptrIndex++;
}
else{
if(str[j] == ' '){
temp = str[j];
str[j] = '\0';
}
else if(str[j] != ' ' && str[j] != '\0' && temp == ' '){
/*Line 35*/ ptr[ptrIndex] = &str[j];
temp = str[j];
ptrIndex++;
}
}
}
int k = 0;
while(ptr[k] != '\0'){
/*Line 44*/ printf("%s \n", *ptr[k]);
k++;
}
}
I can see where the errors are occurring but I'm not sure how to correct them. What should I do? Am I allocating memory correctly or is it just an issue with how I'm specifying the addresses?
Your pointer array is wrong. It looks like you want:
char **ptr = calloc(spcCount+2, sizeof(char*));
Also, if I am reading your code correctly, there is no need for the null byte as this array is not a string.
In addition, you'll need to fix:
while(ptr[k] != '\0'){
/*Line 44*/ printf("%s \n", *ptr[k]);
k++;
}
The dereference is not required and if you remove the null ptr, this should work:
for ( k = 0; k < ptrIndex; k++ ){
/*Line 44*/ printf("%s \n", ptr[k]);
}
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void myToc(char * str){
int spcCount = 0;
int ptrIndex = 0;
int n = strlen(str);
for(int i = 0; i < n; i++){
if(i != 0 && str[i] == ' ' && str[i-1] != ' '){
spcCount++;
}
}
char **ptr = calloc(spcCount+2, sizeof(char*));
//ptr[spcCount+1] = '\0';//0 initialized by calloc
char temp = ' ';//can simplify the code
for(int j = 0; j < n; j++){
if(str[j] == ' '){
temp = str[j];
str[j] = '\0';
} else if(str[j] != '\0' && temp == ' '){//can omit `str[j] != ' ' &&`
ptr[ptrIndex++] = &str[j];
temp = str[j];
}
}
int k = 0;
while(ptr[k] != NULL){//better use NULL
printf("%s \n", ptr[k++]);
}
free(ptr);
}
int main(){
char test1[] = "a b c";
myToc(test1);
char test2[] = "hello world";
myToc(test2);
return 0;
}
Update: I tried this at http://www.compileonline.com/compile_c99_online.php
with the fixes for lines 25, 35, and 44, and with a main function that called
myToc() twice. I initially encountered segfaults when trying to write null characters
to str[], but that was only because the strings I was passing were (apparently
non-modifiable) literals. The code below worked as desired when I allocated a text buffer and wrote the strings there before passing them in. This version also could be modified to return the array of pointers, which then would point to the tokens.
(The code below also works even when the string parameter is non-modifiable, as long as
myToc() makes a local copy of the string; but that would not have the desired effect if the purpose of the function is to return the list of tokens rather than just print them.)
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void myToc(char * str){
int spcCount = 0;
int ptrIndex = 0;
int n = strlen(str);
for(int i = 0; i < n; i++){
if(i != 0 && str[i] == ' ' && str[i-1] != ' '){
spcCount++;
}
}
//Pointer array; +1 for one word more than number of spaces
char** ptr = (char**) calloc(spcCount+2, sizeof(char*));
//Used to differentiate separating spaces from unnecessary ones
char temp;
for(int j = 0; j < n; j++){
if(j == 0){
ptr[ptrIndex] = &str[j];
temp = str[j];
ptrIndex++;
}
else{
if(str[j] == ' '){
temp = str[j];
str[j] = '\0';
}
else if(str[j] != ' ' && str[j] != '\0' && temp == ' '){
ptr[ptrIndex] = &str[j];
temp = str[j];
ptrIndex++;
}
}
}
for (int k = 0; k < ptrIndex; ++k){
printf("%s \n", ptr[k]);
}
}
int main (int n, char** v)
{
char text[256];
strcpy(text, "a b c");
myToc(text);
printf("-----\n");
strcpy(text, "hello world");
myToc(text);
}
I would prefer simpler code, however. Basically you want a pointer to the first non-blank character in str[], then a pointer to each non-blank (other than the first) that is preceded by a blank. Your first loop almost gets this idea except it is looking for blanks preceded by non-blanks. (Also you could start that loop at i = 1 and avoid having to test i != 0 on each iteration.)
I might just allocate an array of (n + 1)/2 char* elements, i.e. sizeof(char*) * ((n + 1)/2) bytes, to hold the pointers rather than looping over the string twice (that is, I'd omit the first loop, which only figures out the size of the array). In any case, if str[0] is non-blank I would write its address to the array; then, looping for (int j = 1; j < n; ++j), write the address of str[j] to the array if str[j] is non-blank and str[j - 1] is blank. That is basically what you are doing, but with fewer ifs and fewer auxiliary variables.
Less code means less opportunity to introduce a bug, as long as the code is clean and makes sense.
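A rough sketch of that single-pass idea follows. The name tokenize_in_place, the count output parameter, and the extra NULL slot at the end are my assumptions. Because blanks are overwritten with '\0' as the loop goes, "preceded by a blank" is tested as "preceded by '\0'".

```c
#include <stdlib.h>
#include <string.h>

/*
 * Tokenize str in place on blanks. Returns a malloc'd, NULL-terminated
 * array of pointers into str and stores the number of tokens in *count.
 * str must be modifiable (an array, not a string literal).
 */
char **tokenize_in_place(char *str, size_t *count)
{
    size_t n = strlen(str);
    /* At most (n + 1) / 2 tokens fit in n characters; +1 for the NULL slot. */
    char **ptr = malloc(((n + 1) / 2 + 1) * sizeof *ptr);
    size_t k = 0;

    if (ptr == NULL)
        return NULL;
    for (size_t j = 0; j < n; j++) {
        if (str[j] == ' ')
            str[j] = '\0';              /* ends the token before it, if any */
        else if (j == 0 || str[j - 1] == '\0')
            ptr[k++] = &str[j];         /* non-blank right after a former blank */
    }
    ptr[k] = NULL;
    *count = k;
    return ptr;
}
```

The caller owns the returned array and frees it with free(); the tokens themselves live inside str and must not be freed separately.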
Previous remarks:
int *ptr = declares a pointer to int, so the allocation is treated as an array of int. For an array of pointers to char, you want
char** ptr = (char**) calloc(spcCount+2, sizeof(char*));
The comment prior to that line also seems to indicate some confusion. There is no terminating null in your array of pointers, and you don't need to allocate space for one, so possibly spcCount+2 could be spcCount + 1.
This also is suspect:
while(ptr[k] != '\0')
It looks like it would work, given the way you used calloc (you do need spcCount+2 to make this work), but I would feel more secure writing something like this:
for (k = 0; k < ptrIndex; ++k)
I do not think that is what caused the segfault; it just makes me a little uneasy to compare a pointer (ptr[k]) with \0 (which you would normally compare against a char).
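The comparison with '\0' compiles because in C a character constant is an int, and an integer constant expression with value 0 is a null pointer constant. Writing NULL states the intent more clearly. A small sketch, with count_tokens as a made-up helper name:

```c
#include <stddef.h>

/* Count the entries of a NULL-terminated array of strings. */
size_t count_tokens(char *const *ptr)
{
    size_t k = 0;

    while (ptr[k] != NULL)  /* a pointer is compared with NULL, not with '\0' */
        k++;
    return k;
}
```

This only works if the array really has a null pointer after the last token, which is what the spcCount+2 size passed to calloc() provides.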