Compare same words - c

I'm new to C programming and I have two problems. I have two strings; I need to split them into words and find out whether both strings contain the same words. I will explain with my code.
Input:
"He said he would do it."
"IT said: 'He would do it.'"
These two strings are stored in two arrays. First I need to separate the words from the other characters.
Call function:
char ** w1 = parse(s1, &len1);
The variable len1 counts the number of rows (words).
Function parse:
char ** parse(char *w, int * i)
{
int j = 0, y, dupl = 0; //variables for cycles and duplicate flag
char d[] = " <>[]{}()/\"\\+-*=:;~!@#$%^&_`'?,.|";
char* token = strtok(w, d);
unsigned len = strlen(token), x;
unsigned plen = len;
char ** f = (char**) malloc(len * sizeof (char*));
while (token != NULL)
{
len = strlen(token);
for (x = 0; x < len; x++)
{
token[x] = tolower(token[x]);
}
for (y = 0; y < *i; y++) //cycle for deleting duplicated words
{
if (equals(token, f[y]) == 1)
{
dupl = 1; break;
}
}
if (dupl == 1)
{
token = strtok(NULL, d);
dupl = 0;
continue;
}
if (len >= plen)
{
f = (char**) realloc(f, (len+1) * sizeof (char*));
plen = len;
}
else
f = (char**) realloc(f, (plen+1) * sizeof (char*));
f[j] = token;
token = strtok(NULL, d);
*i = *i + 1;
j++;
}
free(token);
return f;
}
OK, now I have two 2D arrays. I sort each of them (qsort(w1, len1, sizeof (char*), cmp);) and compare them:
for (i = 0; i < len2; i++)
if (equals(w1[i], w2[i]) == 0)
return 0;
Function equals:
int equals(char *w1, char *w2)
{
if (strcmp(w1, w2) == 0)
return 1;
return 0;
}
I know that all of this could be faster, but first I need to solve my problem. This works for the input I gave at the beginning, but when I enter something long, e.g. 500 characters, the result is Aborted. I think the problem is here:
f = (char**) realloc(f, (len+1) * sizeof (char*));
but I don't know why.
Second thing is, that I can't free my arrays. This
void Clear (char ** w, int max)
{
int i;
for (i = 0; i < max; i++)
free(w[i]);
free(w);
}
gives me a segmentation fault.
Thanks for your time, and I'm sorry for my bad English and bad programming skills.

It seems you have confused the word length with the number of words in your parse function.
char ** f = (char**) malloc(len * sizeof (char*));
In the line above, for example, len should be the number of words rather than the number of characters. The same applies to the realloc calls.
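As a rough sketch of what sizing by word count could look like (not the asker's code; the names unique_words and free_words are made up for illustration), the array below grows whenever the number of stored words reaches its capacity. Each word is also copied into its own allocation. In the question, f[j] points into the original string rather than at malloc'd memory, so calling free(w[i]) on it is invalid, which would explain the crash in Clear. With copies, freeing each element is legitimate.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: frees an array built by unique_words(). */
void free_words(char **words, size_t count)
{
    for (size_t i = 0; i < count; i++)
        free(words[i]);
    free(words);
}

/* Sketch: splits `text` in place on `delims`, lowercases each word, and
 * returns a malloc'd array of unique, individually malloc'd copies.
 * The array grows with the number of words, never with a word's length.
 * Returns NULL with *count == 0 on allocation failure or if no word exists. */
char **unique_words(char *text, const char *delims, size_t *count)
{
    assert(text != NULL && delims != NULL && count != NULL);

    char **words = NULL;
    size_t used = 0, capacity = 0;
    *count = 0;

    for (char *token = strtok(text, delims); token != NULL;
         token = strtok(NULL, delims)) {
        for (char *c = token; *c != '\0'; c++)
            *c = (char)tolower((unsigned char)*c);

        /* Linear scan for a duplicate among the words stored so far. */
        size_t k = 0;
        while (k < used && strcmp(words[k], token) != 0)
            k++;
        if (k < used)
            continue;

        /* Grow the pointer array when it is full. */
        if (used == capacity) {
            size_t new_capacity = capacity ? capacity * 2 : 8;
            char **grown = realloc(words, new_capacity * sizeof *grown);
            if (grown == NULL) {
                free_words(words, used);
                return NULL;
            }
            words = grown;
            capacity = new_capacity;
        }

        /* Store a private copy so every element can be passed to free(). */
        size_t len = strlen(token);
        char *copy = malloc(len + 1);
        if (copy == NULL) {
            free_words(words, used);
            return NULL;
        }
        memcpy(copy, token, len + 1);
        words[used++] = copy;
    }

    *count = used;
    return words;
}
```

The result can be sorted with qsort and compared as in the question, then released with free_words(w1, len1).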

Related

concatenate and add an array of pointers into one index of another array of pointers

Is it possible to concatenate the strings in an array of pointers and store the result at one index of another array of pointers? I'm trying to take the strings inside my *token pointer and make them one string inside the first index of my commands pointer array, and so on.
cmd = strtok(str, " ");
while(n < 5 && (act_token = strtok(NULL, " ")))
{
token[n] = act_token;
n++;
}
token[n] = NULL;
/* Below is where I'm trying to add all the elements of the token array into one index of the comands array */
while( z < len ){
comands[b] = token[z];
z++;
}
b++;
}
To avoid the O(n*n) complexity caused by concatenating in a loop, as in @zzxyz's otherwise good answer, consider copying to the end of the accumulated destination.
char *concat_alloc(const char *token[], size_t n) {
size_t sum = 1;
for (size_t i = 0; i < n; i++) {
size_t len = strlen(token[i]);
sum += len;
if (sum < len) {
return NULL; // Too long
}
}
char *dest = malloc(sum);
if (dest) {
char *p = dest;
for (size_t i = 0; i < n; i++) {
size_t len = strlen(token[i]);
memcpy(p, token[i], len);
p += len; // advance to the end
}
*p = '\0';
}
return dest;
}
I feel your pain. String handling is very bad in C, and almost as bad in C++.
However, once you write the function, all you have to do is call it...
char *GetStringFromStringArray(const char**sourceStrings, size_t nCount)
{
char *destString = NULL;
size_t destLength = 1; //start with room for null-terminator
if (nCount == 0)
return destString;
for (size_t i = 0; i < nCount; i++)
destLength += strlen(sourceStrings[i]);
destString = (char*)malloc(destLength);
strcpy(destString, sourceStrings[0]);
for (size_t i = 1; i < nCount; i++)
strcat(destString, sourceStrings[i]);
return destString;
}
int main()
{
char *tokens[10] = { "bob", "jim", "hank" };
char *destStrings[2];
destStrings[0] = GetStringFromStringArray((const char**)tokens, 2);
destStrings[1] = GetStringFromStringArray((const char**)&tokens[1], 2);
free(destStrings[0]);
free(destStrings[1]);
}
By the way, the way I initialized tokens is not OK; it is purely to keep the example short.

Three level indirection char pointer in C causes a segment fault

I get a segmentation fault at the line marked below with the comment containing many equals signs.
I wrote the function str_split below because I want to split a string on a specific character, such as a comma.
Please help.
int str_split(char *a_str, const char delim, char *** result)
{
int word_length = 0;
int cur_cursor = 0;
int last_cursor = -1;
int e_count = 0;
*result = (char **)malloc(6 * sizeof(char *));
char *char_element_pos = a_str;
while (*char_element_pos != '\0') {
if (*char_element_pos == delim) {
char *temp_word = malloc((word_length + 1) * sizeof(char));
int i = 0;
for (i = 0; i < word_length; i++) {
temp_word[i] = a_str[last_cursor + 1 + i];
}
temp_word[word_length] = '\0';
//
*result[e_count] = temp_word;//==============this line goes wrong :(
e_count++;
last_cursor = cur_cursor;
word_length = 0;
}
else {
word_length++;
}
cur_cursor++;
char_element_pos++;
}
char *temp_word = (char *) malloc((word_length + 1) * sizeof(char));
int i = 0;
for (i = 0; i < word_length; i++) {
temp_word[i] = a_str[last_cursor + 1 + i];
}
temp_word[word_length] = '\0';
*result[e_count] = temp_word;
return e_count + 1;
}
//this is my caller function====================
int teststr_split() {
char delim = ',';
char *testStr;
testStr = (char *) "abc,cde,fgh,klj,asdfasd,3234,adfk,ad9";
char **result;
int length = str_split(testStr, delim, &result);
if (length < 0) {
printf("allocate memory failed, error code is:%d", length);
exit(-1);
}
free(result);
return 0;
}
I think you mean
( *result )[e_count] = temp_word;//
instead of
*result[e_count] = temp_word;//
These two expressions are equivalent only when e_count is equal to 0. :)
[] has higher precedence than *, so parentheses will probably solve THIS problem:
(*result)[e_count] = temp_word;
I didn't check for more problems in the code. Hint: strtok() might do your job just fine.
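A small sketch of the precedence point, using a made-up helper named fill_slots: the out-parameter is dereferenced first and the resulting array is indexed second. Note also that the question allocates room for only 6 pointers while its test string has 8 comma-separated fields, so the array size would need to follow the number of fields as well.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocates `n` slots through a char *** out-parameter
 * and stores `value` in each one. Returns 0 on success, -1 on failure. */
int fill_slots(char ***result, size_t n, char *value)
{
    assert(result != NULL && n > 0);

    *result = malloc(n * sizeof **result);
    if (*result == NULL)
        return -1;

    for (size_t i = 0; i < n; i++) {
        /* (*result)[i]: dereference the out-parameter, then index the array.
         * *result[i] would parse as *(result[i]), which treats `result`
         * itself as an array and is only valid for i == 0. */
        (*result)[i] = value;
    }
    return 0;
}
```

A caller would declare char **slots, call fill_slots(&slots, 3, word), and later free(slots).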

C: string replace in loop (c beginner)

I need to replace some strings in a text. I found this function here on Stack Overflow:
char *replace(const char *s, const char *old, const char *new)
{
char *ret;
int i, count = 0;
size_t newlen = strlen(new);
size_t oldlen = strlen(old);
for (i = 0; s[i] != '\0'; i++) {
if (strstr(&s[i], old) == &s[i]) {
count++;
i += oldlen - 1;
}
}
ret = malloc(i + count * (newlen - oldlen));
if (ret == NULL)
exit(EXIT_FAILURE);
i = 0;
while (*s) {
if (strstr(s, old) == s) {
strcpy(&ret[i], new);
i += newlen;
s += oldlen;
} else
ret[i++] = *s++;
}
ret[i] = '\0';
return ret;
}
This function works fine for a single replacement. But I need to replace every entry of the array "str2rep" with the matching entry of "replacement". This is what I'm trying to do (I'm just a beginner):
#define MAXTEXT 39016
int l;
int j;
char *newsms = NULL;
char text[MAXTEXT];
char *str2rep[] = {":q:",":n:"};
char *replacement[] = {"?","\n"};
strcpy((char *)text,(char *)argv[5]);
l = sizeof(str2rep) / sizeof(*str2rep);
for(j = 0; j < l; j++)
{
newsms = replace(text,(char *)str2rep[j],(char *)replacement[j]);
strcpy(text,newsms);
free(newsms);
}
textlen = strlen(text);
This code even works locally if I build it from a single file. But this is an Asterisk module, and when it is executed, Asterisk stops with:
*** glibc detected *** /usr/sbin/asterisk: double free or corruption (!prev): 0x00007fa720006310 ***
Issues:
ret = malloc(i + count * (newlen - oldlen)); is too small. Need + 1.
Consider what happens with replace("", "", ""). If your SO ref is this, it is wrong too.
Questionable results mixing signed/unsigned. count is signed. newlen, oldlen are unsigned.
I think the original code works OK, but I do not like using the wrap-around nature of unsigned math when it can be avoided which is what happens when newlen < oldlen.
// i + count * (newlen - oldlen)
size_t newsize = i + 1; // + 1 for above reason
if (newlen > oldlen) newsize += count * (newlen - oldlen);
if (newlen < oldlen) newsize -= count * (oldlen - newlen);
ret = malloc(newsize);
Ensure enough space. @hyde Various approaches available here.
// strcpy(text, newsms);
if (strlen(newsms) >= sizeof text) Handle_Error();
strcpy(text, newsms);
Minor
No need for casts
// newsms = replace(text, (char *) str2rep[j], (char *) replacement[j]);
newsms = replace(text, str2rep[j], replacement[j]);
Better to use size_t for i. A pedantic solution would also use size_t count.
// int i;
size_t i;
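Putting the sizing points together, one possible shape is sketched here. This is a strstr-based rewrite rather than the function from the question, and the names replace_all and new_text are made up. It counts the matches first, computes the size without relying on unsigned wrap-around, and reserves one byte for the terminator. An empty search string is treated as "no match", which is an assumption about the desired behavior. Overflow of the size calculation for extremely long inputs is not checked.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Sketch: returns a malloc'd copy of `s` with every occurrence of `old`
 * replaced by `new_text`, or NULL if allocation fails. */
char *replace_all(const char *s, const char *old, const char *new_text)
{
    assert(s != NULL && old != NULL && new_text != NULL);

    size_t oldlen = strlen(old);
    size_t newlen = strlen(new_text);
    size_t count = 0;

    /* Count non-overlapping matches; skipped entirely for an empty pattern. */
    if (oldlen > 0) {
        for (const char *p = s; (p = strstr(p, old)) != NULL; p += oldlen)
            count++;
    }

    /* count * oldlen never exceeds strlen(s), so the subtraction cannot
     * wrap. The + 1 is the terminator the original malloc left out. */
    size_t size = strlen(s) - count * oldlen + count * newlen + 1;
    char *ret = malloc(size);
    if (ret == NULL)
        return NULL;

    char *out = ret;
    const char *hit;
    while (oldlen > 0 && (hit = strstr(s, old)) != NULL) {
        size_t prefix = (size_t)(hit - s);
        memcpy(out, s, prefix);          /* text before the match */
        out += prefix;
        memcpy(out, new_text, newlen);   /* the replacement */
        out += newlen;
        s = hit + oldlen;                /* continue after the match */
    }
    strcpy(out, s);                      /* remaining tail plus terminator */
    return ret;
}
```

In the question's loop, the caller would still need to check that the result fits into text before copying it back, or keep the returned pointer instead of copying.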
I will suggest something that to me looks a bit more clear as an alternative, in place of a proper dynamic string implementation. Exception handling is left as an exercise for the reader to add. :)
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *appendn(char *to, char *from, int length)
{
return strncat(realloc(to, strlen(to) + length + 1), from, length);
}
char *replace(char *string, char *find, char *sub)
{
char *result = calloc(1, 1);
while (1)
{
char *found = strstr(string, find);
if (!found)
break;
result = appendn(result, string, found - string);
result = appendn(result, sub, strlen(sub));
string = found + strlen(find);
}
return appendn(result, string, strlen(string));
}
int main()
{
const char text[] = "some [1] with [2] to [3] with other [2]";
char *find[] = {"[1]", "[2]", "[3]", NULL};
char *sub[] = {"text", "words", "replace"};
char *result, *s;
int i;
result = malloc(sizeof(text));
(void) strcpy(result, text);
for (i = 0; find[i]; i ++)
{
s = replace(result, find[i], sub[i]);
free(result);
result = s;
}
(void) printf("%s\n", result);
free(result);
}

C changing a pointer in a function

I want to write a function that processes a string that looks like this:
|1,2,3,4|(1->2),(2->3),(3->1)|
The result should be a breaking down of the string into these strings:
1
2
3
4
(1->2)
(2->3)
(3->1)
This is my code:
int processPart(char*** dest, char* from) //Processes a half at a time
{
int len = 0;
char* cutout = strtok(from, ",");
while(cutout)
{
(*dest) = (char**)realloc(dest, (len + 1) * sizeof(char*)); <<<<<<<
(*dest)[len] = (char*)calloc(strlen(cutout) + 1, sizeof(char));
memcpy((*dest)[len], cutout, strlen(cutout));
cutout = strtok(NULL, ",");
len++;
}
return len;
}
void processInput(char*** vertices, char*** edges, char* input, int* sizev, int* sizee)
{
int vlen = 0, elen = 0;
char* string = input + 1;
char* raw_vertices;
char* raw_edges;
string[strlen(string)] = '\0';
raw_vertices = strtok(string, "|");
raw_edges = strtok(NULL, "|");
*sizev = processPart(vertices, raw_vertices); //First the vertices
*sizee = processPart(edges, raw_edges); //Then the edges
}
int main()
{
char* in = stInput(); //input function
char** c = NULL, **b = NULL;
int a, d, i;
processInput(&c, &b, in, &a, &d);
for(i = 0; i < a; i++)
{
printf("%s\n", c[i]);
}
printf("++++++++++++++++");
for(i = 0; i < d; i++)
{
printf("%s\n", b[i]);
}
return 0;
}
However, I get heap corruption at the line marked by <<<<<<<
Does anyone know what my mistake is?
In the erroneous line
(*dest) = (char**)realloc(dest, (len + 1) * sizeof(char*));
a * is missing before the dest argument. You could have spotted this easier if you hadn't cluttered the expression with the useless cast. I'd write
*dest = realloc(*dest, (len + 1) * sizeof**dest);
so that the first argument visibly matches the left operand of the assignment.
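For illustration, here is a sketch of the loop with that fix applied. It is a reworked version, not the asker's exact code, and the name process_part is made up. It additionally routes realloc through a temporary so the old block is not lost if the call fails. It assumes *dest is NULL or a pointer previously returned by malloc/realloc when the function is entered.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Sketch: splits `from` in place on commas and appends a copy of each piece
 * to the array *dest. Returns the number of pieces, or -1 on failure. */
int process_part(char ***dest, char *from)
{
    assert(dest != NULL && from != NULL);

    int len = 0;
    for (char *cutout = strtok(from, ","); cutout != NULL;
         cutout = strtok(NULL, ",")) {
        /* realloc the array itself (*dest), not the out-parameter (dest). */
        char **grown = realloc(*dest, ((size_t)len + 1) * sizeof **dest);
        if (grown == NULL)
            return -1;
        *dest = grown;

        size_t n = strlen(cutout) + 1;   /* include the terminator */
        (*dest)[len] = malloc(n);
        if ((*dest)[len] == NULL)
            return -1;
        memcpy((*dest)[len], cutout, n);
        len++;
    }
    return len;
}
```

The caller owns the result and would free each element and then the array.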

Segmentation fault on calling function more than once

Running this function more than once causes a segmentation fault, and I cannot figure out why. I'm not looking for alternative ways to split a string.
splitX keeps splitting for x delimiters (either '|' or '\0') and returns x or the number of substrings it could make.
I should note that I have just returned to coding in C after 3 years of easy JavaScript and PHP, so I could be missing something obvious.
int splitX(char **array, char *string, int x) {
int y;
int z;
int index = 0;
int windex = 0;
for(y = 0; y < x; y++) {
z = index;
while(string[index] != '\0' && string[index] != '|') {
index++;
}
char **tempPtr = realloc(array, (y+1)*sizeof(char *));
if(tempPtr == NULL) {
free(array);
return -3;
}
array = tempPtr;
array[y] = malloc(sizeof(char) * (index - z + 1));
windex = 0;
for(; z < index; z++) {
array[y][windex] = string[z];
windex++;
}
array[y][windex] = '\0';
if(string[index] == '\0')
break;
index++;
}
return y+1;
}
int main() {
char **array;
int array_len = splitX(array, query, 2);
printf("%s %s %d\n", array[0], array[1], array_len);
while(array_len > 0) {
free(array[array_len-1]);
array_len--;
}
free(array);
array_len = splitX(array, "1|2\0", 2);
printf("%s %s %d\n", array[0], array[1], array_len);
while(array_len > 0) {
free(array[array_len-1]);
array_len--;
}
free(array);
}
char **array;
int array_len = splitX(array, query, 2);
This lets splitX() use the uninitialized array, which results in undefined behavior.
Furthermore, C has no pass-by-reference - when you write
array = tempPtr;
inside the function, that has no visible effect outside it.
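A minimal sketch of the difference, with made-up function names: the first function assigns only to its own copy of the pointer, while the second receives the address of the caller's pointer and writes through it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical demo: `array` is a copy of the caller's pointer, so this
 * assignment is invisible to the caller. */
void alloc_by_value(char **array, size_t n)
{
    array = malloc(n * sizeof *array);   /* changes the local copy only */
    free(array);                         /* freed here so the demo does not leak */
}

/* Hypothetical demo: the caller passes &array, so writing through the
 * pointer updates the caller's variable. Returns 0 on success, -1 on failure. */
int alloc_by_address(char ***array, size_t n)
{
    assert(array != NULL && n > 0);

    char **tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    *array = tmp;
    return 0;
}
```

With char **array = NULL; in the caller, alloc_by_value(array, 2) leaves array as NULL, whereas alloc_by_address(&array, 2) makes it point at usable storage that the caller later frees.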
"I'm not looking for alternative ways to split a string."
You really should be. Your current approach is at best non-idiomatic, and it also has some other mistakes (like returning y + 1 where y would certainly do, etc.).
You are also reinventing the wheel: for string and character searching, use strstr() and strchr() from the C standard library, or strtok_r() from POSIX; for duplicating a string, use strdup() (POSIX, and standard C since C23) instead of copying the string manually, and so on.
What else:
use size_t for sizes instead of int;
maintain const correctness by using const char * for input strings.
char **split(const char *s, size_t *sz)
{
char **r = NULL;
size_t n = 0, allocsz = 0;
const char *p = s, *t = p;
int end = 0;
do {
const char *tmp = strchr(p, '|');
if (tmp == NULL) {
p = p + strlen(p);
end = 1;
} else {
p = tmp;
}
if (++n > allocsz) {
if (allocsz == 0)
allocsz = 4;
else
allocsz <<= 1;
char **tmp = realloc(r, sizeof(*r) * allocsz);
if (!tmp) abort(); // or whatever, handle error
r = tmp;
}
r[n - 1] = malloc(p - t + 1);
memcpy(r[n - 1], t, p - t);
r[n - 1][p - t] = 0;
p++;
t = p;
} while (!end);
*sz = n;
return r;
}
