EDIT:
I realize that the code in my OP is long and hard to read. I've highlighted the problem with 4 lines of code.
char *t[] = {"Hello", "World"};
char **a = t;
++(a[0]);
printf("%c\n",**t);
I want to step through the array of strings without losing the pointer to the first character. Therefore, I initialize a new pointer 'a' to point to the first character. After I increment through 'a', though, it seems to change what 't' points to! In the printf statement, I expect t's pointer value to remain unchanged, but it seems to have been incremented along with 'a' and now points to the second character. Why is this happening?
SOLVED:
In the above example, a and t seem to be the same pointer, so if I change one (by incrementing, for example), the change is also reflected in the other. However, if I dereference t into another variable, then I can change that variable without having the change reflected in t. In the above example, this looks like
char *a = t[0];
++a;
printf("a value: %c\n", *a);
printf("t value: %c\n", **t);
I think that I had originally been confused about dereferencing since t points to a pointer. Every response I've gotten is to use array indexing as opposed to pointers, and I can see why.
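To make the difference concrete, here is a minimal sketch, assuming t is an array of string pointers. The two helper names are made up for this illustration. The first increments the pointer stored in the caller's array (what ++(a[0]) does); the second copies that pointer into a local variable and increments only the copy.

```c
/* Hypothetical helper: advances the pointer stored in the caller's array.
 * strings[0] is the very same object as the caller's t[0]. */
void advance_in_place(char **strings)
{
    ++(strings[0]);
}

/* Hypothetical helper: copies the stored pointer and advances only the copy.
 * Returns the character the copy points to afterwards. */
char advance_copy(char **strings)
{
    char *c = strings[0];   /* c is a separate variable */
    ++c;                    /* strings[0] is not touched */
    return *c;
}
```

With char *t[] = {"Hello", "World"}, calling advance_copy(t) returns 'e' and leaves **t at 'H', whereas after advance_in_place(t) the expression **t is 'e'.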
ORIGINAL POST:
Say I have:
array1 {"arp", "live", "strong"}, and
array2 {"lively", "alive", "harp", "sharp", "armstrong"}
I'm trying to find the strings in array1 that are substrings of any string in array2.
To do this, I wrote a helper function (compString) that takes in the string from array1, the entire array2, and the length of array2.
Essentially, what the function does is create local pointer values for both the string pointer and the array pointer. It then extracts the first string from array2 and begins to walk through it to find a match for the first letter of the input string. If no match is found, the function will move on to the next string, until a full match is found or until it walks through the entire array2. It then returns to its calling environment.
I ran into some unexpected behavior. When I call the function (with the same arguments), after having already called it, the array pointer seems to point to exactly where it left off in the previous call.
For example, if I call compString("arp", array2, 5) then the function will flag a match starting at the a in harp.
Then, if I call compString("live", array2, 5), the function begins at the a in harp and goes to the end of the array without flagging a match.
Finally, when I call compString("strong", array2, 5), array2 is now pointing to garbage since it has already been iterated through, and does not flag a match.
Since one of the first things the helper function does is "localize" the pointers being passed (that is, create a local pointer variable, assign to it the value of the pointer being passed to the function, then iterate that local variable), I would assume that subsequent calls to the function wouldn't "save" the previous value of the pointer. Any pointers?
Source attached:
#include <stdio.h>
#include <string.h>
int compString(char *, char **, int);
int main(void)
{
    int sz1 = 3;
    int sz2 = 5;
    char *p, *p2;
    char *array1[] = {"arp\0", "live\0", "strong\0"};
    char *array2[] = {"lively\0", "alive\0", "harp\0", "sharp\0", "armstrong\0"};
    compString("arp\0",array2,5);
    compString("live\0",array2,5);
    compString("strong\0",array2,5);
}
int compString(char *arr1, char **arr2, int sz2)
{
    printf("\n\n\n");
    printf("WORD: %s\n",arr1);
    int i = 0;
    char *a1 = arr1;
    char **a2 = arr2;
    char *p;
    char *p2;
    printf("BEGIN ITERATION %d\n",i);
    printf("Checking against word: %s\n",a2[i]);
    while (i < sz2)
    {
        printf("%c\n",*a2[i]);
        if (*a1 == *a2[i])
        {
            char *p = a1;
            char *p2 = a2[i];
            while ((*p == *p2) && (*p != '\0'))
            {
                ++p;
                ++p2;
            }
            if (*p == '\0')
            {
                return 1;
            }
            else
            {
                *++(a2[i]);
                if (*(a2[i]) == '\0')
                {
                    ++i;
                    printf("BEGIN ITERATION %d\n",i);
                    printf("Checking against word: %s\n",a2[i]);
                }
            }
        }
        else
        {
            *++(a2[i]);
            if (*(a2[i]) == '\0')
            {
                ++i;
                printf("BEGIN ITERATION %d\n",i);
                printf("Checking against word: %s\n",a2[i]);
            }
        }
    }
    return 0;
}
Your loop is causing an off-by-one error. What you want to do is loop through your array of 5 strings, so from index 0 to 4. The problem shows up when you run all three tests, because they somehow depend on each other's results (I didn't look into the comparison logic too much; it seems rather obfuscated).
We can replicate the behavior with just one test:
compString("test", array2, 5);
So the 5 is supposed to tell it to loop from 0 to 4. In the comparison function, you have this:
int i = 0;
printf("BEGIN ITERATION %d\n", i);
printf("Checking against word: %s\n", a2[i]);
while (i < sz2)
So far, so good. The i < sz2 is correct, it supposedly loops from 0 to 4, assuming you increase i correctly.
Then, however, you do this somewhere at the end of the function:
++i;
printf("BEGIN ITERATION %d\n", i);
printf("Checking against word: %s\n", a2[i]);
So when i is 4, you increase it to 5, and at that point the function should stop looping through the array, but at that point you do that print that tries to access a2[5], which doesn't exist. That's where it crashes for me on MSVC.
My suggestion is that you rework your loop logic to something like this:
for (int i = 0; i < sz2; i++){
    printf("BEGIN ITERATION %d\n", i);
    printf("Checking against word: %s\n", a2[i]);
    // do something with a2[i] and don't manually change the value of "i"
}
Also, I would tidy up that string logic, there might be a bug in it somewhere. You don't need all those suspicious dereferencing calls. When you want to access character x of string y in a2, then a2[y][x] does the trick. For instance, if you want to find some letter, simply do:
for (int n = 0; n < strlen(a2[y]); n++){
    if (a2[y][n] == 'a')
        printf("found letter 'a' at position %d\n", n);
}
Furthermore, you don't need to add \0 to string literals. Those are automatically added, so you're just adding a second one. Instead of this:
char *array1[] = {"arp\0", "live\0", "strong\0"};
Do this:
char *array1[] = {"arp", "live", "strong"};
Also, I don't know if you have to implement this function because it's a task you've been given, but if you just want to find substrings, then you don't need to reinvent the wheel as strstr already does that.
Are you looking for something like this:
#include <stdio.h>
#include <string.h>
char *array1[] = {"arp", "live", "strong", NULL};
char *array2[] = {"lively", "alive", "harp", "sharp", "armstrong", NULL};
void findsrings(char **neadles, char **haystack)
{
    while(*neadles)
    {
        char **hay = haystack;
        size_t pos = 0;
        printf("Searching for %s\n", *neadles);
        while(*hay)
        {
            if(strstr(*hay, *neadles))
            {
                printf("Found!! Haystack word is: %s at index %zu in haystack\n", *hay, pos);
            }
            pos++;
            hay++;
        }
        neadles++;
    }
}
int main(void)
{
    findsrings(array1, array2);
    return 0;
}
You do not need the '\0' at the end of the string literals, as it is added automatically by the C compiler. I have added NULL, which terminates each array of string pointers, so you do not need to pass the sizes of the arrays.
As mentioned in the comments, the side effect you've noticed is due to the line *++(a2[i]); which alters the contents of your second array: it advances the pointer stored in the array itself, and that change is still there on the next call. As time progresses you'll eventually end up with the second array having no actual words in it.
Generally your code is overly complicated and you're using while loops when for loops are better suited.
The outer loop for example would work better as:
for(i=0;i<sz2;i++)
{
printf("BEGIN ITERATION %d\n",i);
printf("Checking against word: %s\n",arr2[i]);
And then since you want to check each substring in arr2[i], you can use a for loop for that...
for(wordstart=arr2[i];*wordstart!='\0';wordstart++)
{
Finally, you have an inner loop that compares each character of arr1 with the substring starting at wordstart. You need to make sure that neither p1 nor p2 goes beyond the end of its string and that both point to the same character.
for(p1=arr1,p2=wordstart;(*p1!='\0')&&(*p2!='\0')&&(*p1==*p2);p1++,p2++);
Once any of those 3 conditions is no longer true, if you check that p1 has reached the end of the string, you know that it must have found a substring.
if(*p1=='\0')
{
printf("Matched %s\n",arr2[i]);
return 1;
}
The resulting function looks like:
int compString(char *arr1, char **arr2, int sz2)
{
    printf("\n\n\n");
    printf("WORD: %s\n",arr1);
    int i = 0;
    char *p1;
    char *wordstart;
    char *p2;
    for(i=0;i<sz2;i++)
    {
        printf("BEGIN ITERATION %d\n",i);
        printf("Checking against word: %s\n",arr2[i]);
        for(wordstart=arr2[i];*wordstart!='\0';wordstart++)
        {
            for(p1=arr1,p2=wordstart;(*p1!='\0')&&(*p2!='\0')&&(*p1==*p2);p1++,p2++);
            if(*p1=='\0')
            {
                printf("Matched %s\n",arr2[i]);
                return 1;
            }
        }
    }
    return 0;
}
Another thing to note is that you don't need to explicitly add the \0 to a string literal. The following is just fine.
char *array1[] = {"arp", "live", "strong"};
You could also add NULL as the last element in the list of strings so that you don't need to track how many strings there are.
char *array2[] = {"lively", "alive", "harp", "sharp", "armstrong", NULL};
which means the outer loop could be simplified to
for(i=0;arr2[i];i++)
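Putting these suggestions together, a search over a NULL-terminated array could be sketched as follows. The name find_containing and its return convention (index of the first matching string, or -1) are made up for this illustration. It only reads through local pointers and never writes to the array, so repeated calls see the same data.

```c
#include <stddef.h>

/* Hypothetical helper: returns the index of the first string in the
 * NULL-terminated array haystack that contains needle, or -1 if none does.
 * The array and its strings are never modified. */
int find_containing(const char *needle, char *const *haystack)
{
    for (int i = 0; haystack[i] != NULL; i++) {
        /* Try every starting position inside haystack[i]. */
        for (const char *start = haystack[i]; *start != '\0'; start++) {
            const char *p1 = needle;
            const char *p2 = start;
            /* Walk both strings while they agree and neither has ended. */
            while (*p1 != '\0' && *p2 != '\0' && *p1 == *p2) {
                p1++;
                p2++;
            }
            if (*p1 == '\0')    /* the whole needle was matched */
                return i;
        }
    }
    return -1;
}
```

With array2 terminated by NULL as shown above, find_containing("arp", array2) gives 2 ("harp"), "live" gives 0 ("lively") and "strong" gives 4 ("armstrong"), in whatever order the calls are made.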
My approach does its job (with my compiler at least), but it raises a SIGSEGV signal (postmortem report). I want to understand what's going wrong.
The function prints "an" or "a", depending on the first character of p->field.
example of what function may display (a/an):
Jack was an Astronaut
or
Jack was a Soldier
void display(const struct student_t* p)
{
    const char vovel[] = "aoeiuAOEIU";
    char article[3] = "a";
    for (unsigned int i=0; i<sizeof(vovel); i++) {
        if (p->field[0]==vovel[i]) //<-------here it detects SIGSEGV.
            strcpy(article, "an");
    }
    if (p!=NULL)
        printf("%s was %s %s\n", p->name, article, p->field);
}
The Structure can be like:
struct student_t
{
    char name[20];
    char field[50];
};
What is wrong with that?
I'm thinking that maybe what I got wrong is the way I access the first character of p->field... Or maybe it has something to do with the last character of my vovel array, which is '\0'? Or could the SIGSEGV be caused by another part of my program but only detected here? I used a similar loop elsewhere, and clang reported "Out of bound memory access (access exceeds upper limit of memory block) -- Logic error bug" in that other loop (posted below), which I based my "vovel"-detecting function on...
const char* find(const char * start, char **end, char **next, const char *delims) {
    static const char blanks[] = " \t\r";
    start += strspn(start, blanks);
    *end=NULL;
    *end = strpbrk(start, delims); //<-- cpp check: Either the condition 'end==0' is redundant or there is possible null pointer dereference: end. -- warning
    if (end == NULL) {
        return NULL;
    }
    *next = *(end) + 1;
    while(*end > start) {
        bool found = false;
        for (unsigned int i=0; i<sizeof(blanks); i++) {
            if ((*end)[-1] == blanks[i]) { //<-- clang: Out of bound memory access (access exceeds upper limit of memory block) -- Logic error bug
                --*end ;
                found = true;
                break;
            }
        }
        if (!found) break;
    }
    return start;
}
Thanks a lot for all the help!
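One thing that stands out in display(): p->field[0] is read inside the loop before the p != NULL check runs, so a NULL argument would fault exactly on the marked line. Whether that is the actual cause cannot be told from the posted code alone. In addition, sizeof(vovel) counts the terminating '\0', so an empty field would be treated as starting with a vowel. A minimal sketch that avoids both, assuming the struct shown above (the helper article_for is a made-up name):

```c
#include <stdio.h>
#include <string.h>

struct student_t {
    char name[20];
    char field[50];
};

/* Hypothetical helper: returns "an" when word starts with a vowel, else "a". */
const char *article_for(const char *word)
{
    /* strchr would also match the terminating '\0', so reject an
     * empty word before searching the vowel list. */
    if (word[0] != '\0' && strchr("aoeiuAOEIU", word[0]) != NULL)
        return "an";
    return "a";
}

void display(const struct student_t *p)
{
    if (p == NULL)      /* check before the first p-> access */
        return;
    printf("%s was %s %s\n", p->name, article_for(p->field), p->field);
}
```

If the crash persists with a non-NULL p, the pointer is probably dangling or uninitialized at the call site, which this sketch cannot detect.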
Right now, I'm attempting to familiarize myself with C by writing a function which, given a string, will replace all instances of a target substring with a new substring. However, I've run into a problem with a reallocation of a char* array. To my eyes, it seems as though I'm able to successfully reallocate the array string to a desired new size at the end of the main loop, then perform a strcpy to fill it with an updated string. However, it fails for the following scenario:
Original input for string: "use the restroom. Then I need"
Target to replace: "the" (case insensitive)
Desired replacement value: "th'"
At the end of the loop, the line printf("result: %s\n ",string); prints out the correct phrase "use th' restroom. Then I need". However, string seems to then reset itself: the call to strcasestr in the while() statement is successful, the line at the beginning of the loop printf("string: %s \n",string); prints the original input string, and the loop continues indefinitely.
Any ideas would be much appreciated (and I apologize in advance for my flailing debug printf statements). Thanks!
The code for the function is as follows:
int replaceSubstring(char *string, int strLen, char*oldSubstring,
                     int oldSublen, char*newSubstring, int newSublen )
{
    printf("Starting replace\n");
    char* strLoc;
    while((strLoc = strcasestr(string, oldSubstring)) != NULL )
    {
        printf("string: %s \n",string);
        printf("%d",newSublen);
        char *newBuf = (char *) malloc((size_t)(strLen +
                                       (newSublen - oldSublen)));
        printf("got newbuf\n");
        int stringIndex = 0;
        int newBufIndex = 0;
        char c;
        while(true)
        {
            if(stringIndex > 500)
                break;
            if(&string[stringIndex] == strLoc)
            {
                int j;
                for(j=0; j < newSublen; j++)
                {
                    printf("new index: %d %c --> %c\n",
                           j+newBufIndex, newBuf[newBufIndex+j], newSubstring[j]);
                    newBuf[newBufIndex+j] = newSubstring[j];
                }
                stringIndex += oldSublen;
                newBufIndex += newSublen;
            }
            else
            {
                printf("old index: %d %c --> %c\n", stringIndex,
                       newBuf[newBufIndex], string[stringIndex]);
                newBuf[newBufIndex] = string[stringIndex];
                if(string[stringIndex] == '\0')
                    break;
                newBufIndex++;
                stringIndex++;
            }
        }
        int length = (size_t)(strLen + (newSublen - oldSublen));
        string = (char*)realloc(string,
                                (size_t)(strLen + (newSublen - oldSublen)));
        strcpy(string, newBuf);
        printf("result: %s\n ",string);
        free(newBuf);
    }
    printf("end result: %s ",string);
}
First, the desired behavior and interface should be clarified.
The topic "Char array..." is not clear.
You provide strLen, oldSublen and newSublen, so it looks like you want to work with raw memory buffers of a given length.
However, you use strcasestr, strcpy and string[stringIndex] == '\0', and you also mention printf("result: %s\n ",string);.
So I assume that you want to work with null-terminated strings that the caller can pass as string literals: "abc".
In that case there is no need to pass all those lengths to the function.
It looks like you are trying to implement recursive string replacement: after each replacement you start again from the beginning.
Let's consider more complicated sets of parameters, for example, replace aba by ab in abaaba.
Case 1: single pass through input stream
Both occurrences of the old substring are replaced: "abaaba" => "abab"
That is how the standard sed string replacement works:
> echo "abaaba" | sed 's/aba/ab/g'
abab
Case 2: recursive replacement taking into account possible overlapping
The first replacement: "abaaba" => "ababa"
The second replacement in already replaced result: "ababa" => "abba"
Note that this case is not safe, for example replace "loop" by "loop loop". It is an infinite loop.
Suppose we want to implement a function that takes null terminated strings and the replacement is done in one pass as with sed.
In general the replacement cannot be done in place, in the memory of the input string.
Note that realloc may allocate a new memory block at a new address, so you should return that address to the caller.
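To illustrate the point about realloc: a function that resizes a buffer has to hand the possibly moved address back, either as its return value or through a pointer-to-pointer parameter. A minimal sketch of the second form follows; the helper name set_text is made up, and *buf is assumed to come from malloc/realloc or to be NULL (a string literal cannot be passed to realloc).

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: resizes *buf to fit text and copies text into it.
 * Returns 0 on success, -1 on allocation failure (*buf is left untouched). */
int set_text(char **buf, const char *text)
{
    char *grown = realloc(*buf, strlen(text) + 1);  /* +1 for '\0' */
    if (grown == NULL)
        return -1;
    strcpy(grown, text);
    *buf = grown;   /* publish the (possibly moved) address to the caller */
    return 0;
}
```

A caller would write set_text(&s, "new contents") and keep using s afterwards. Assigning the result of realloc to a by-value parameter would update only the function's local copy.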
For implementation simplicity it is possible to calculate required space for result before memory allocation (Case 1 implementation). So reallocation is not needed:
#define _GNU_SOURCE
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
char* replaceSubstring(const char* string, const char* oldSubstring,
                       const char* newSubstring)
{
    size_t strLen = strlen(string);
    size_t oldSublen = strlen(oldSubstring);
    size_t newSublen = strlen(newSubstring);
    const char* strLoc = string;
    size_t replacements = 0;
    /* count number of replacements */
    while ((strLoc = strcasestr(strLoc, oldSubstring)))
    {
        strLoc += oldSublen;
        ++replacements;
    }
    /* result size: initial size + replacement diff + sizeof('\0') */
    size_t result_size = strLen + (newSublen - oldSublen) * replacements + 1;
    char* result = malloc(result_size);
    if (!result)
        return NULL;
    char* resCurrent = result;
    const char* strCurrent = string;
    strLoc = string;
    while ((strLoc = strcasestr(strLoc, oldSubstring)))
    {
        memcpy(resCurrent, strCurrent, strLoc - strCurrent);
        resCurrent += strLoc - strCurrent;
        memcpy(resCurrent, newSubstring, newSublen);
        resCurrent += newSublen;
        strLoc += oldSublen;
        strCurrent = strLoc;
    }
    strcpy(resCurrent, strCurrent);
    return result;
}
int main()
{
    char* res;
    res = replaceSubstring("use the restroom. Then I need", "the", "th");
    printf("%s\n", res);
    free(res);
    res = replaceSubstring("abaaba", "aba", "ab");
    printf("%s\n", res);
    free(res);
    return 0;
}
I got the following string from the user:
char *abc = "a234bc567d";
but the numbers can have different lengths than in this example (the letters are constant).
How can I extract each number? (Again, it can be 234 or 23743 or something else.)
I tried to use strchr and strncpy, but strncpy needs memory allocated for the copy, and I hope there is a better solution.
Thanks.
You can do something like this:
// Needs <ctype.h>, <stdio.h> and <stdlib.h>
char *abc = "a234bc567d";
char *ptr = abc; // point to start of abc
// While not at the end of the string
while (*ptr != '\0')
{
    // If position is the start of a number
    // (the cast keeps isdigit well-defined for negative char values)
    if (isdigit((unsigned char)*ptr))
    {
        // Get value (assuming base 10), store end position of number in ptr
        long value = strtol(ptr, &ptr, 10);
        printf("Found value %ld\n", value);
    }
    else
    {
        ptr++; // Increase pointer
    }
}
If I understand your question, you are trying to extract the parts of the user input that contain numbers; the digit sequences can vary in length, but the letters are fixed (a, b, c or d). Is that correct? If so, the following program may help. I tried it with the strings "a234bc567d", "a23743bc567d" and "a23743bc5672344d", and it works.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
    char *sUser = "a234bc567d";
    //char *sUser = "a23743bc567d";
    //char *sUser = "a23743bc5672344d";
    int iLen = strlen(sUser);
    // strtok modifies its input, so work on a writable copy
    char *sInput = (char *)malloc((iLen+1) * sizeof(char));
    if (sInput == NULL)
        return 1;
    strcpy(sInput, sUser);
    char *sSeparator = "abcd";
    char *pToken = strtok(sInput, sSeparator);
    while(1)
    {
        if(pToken == NULL)
            break;
        printf("Token = %s\n", pToken);
        pToken = strtok(NULL, sSeparator);
    }
    free(sInput);
    return 0;
}
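Another option, not used in the answers above: since the letters are fixed, sscanf can match them literally in its format string and convert both numbers in one call, with no allocation. This is only a sketch under that assumption; the helper name parse_two_numbers is made up, and the format has to be adapted if the layout of the input changes.

```c
#include <stdio.h>

/* Hypothetical helper: parses "a<num>bc<num>d" into two longs.
 * Returns 1 when the whole string matched that layout, else 0. */
int parse_two_numbers(const char *s, long *first, long *second)
{
    int consumed = 0;
    /* %n stores the number of characters read so far; it is only
     * reached if the trailing literal 'd' matched. */
    if (sscanf(s, "a%ldbc%ldd%n", first, second, &consumed) != 2)
        return 0;
    return consumed > 0 && s[consumed] == '\0';
}
```

Note that %ld also accepts leading whitespace and a sign and does not report overflow, so strtol remains the better choice when the input has to be validated strictly.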
Why can't I get "xxx"? The returned value is a string of very strange symbols. I want the returned value to be xxx, but I don't know what is wrong with this program. Inside the function everything works and it prints "xxx" for me, but once the value is returned to main, outcome no longer displays "xxx". Can somebody tell me the reason?
char* cut_command_head(char *my_command, char *character) {
    char command_arg[256];
    //command_arg = (char *)calloc(256, sizeof(char));
    char *special_position;
    int len;
    special_position = strstr(my_command, character);
    //printf("special_position: %s\n", special_position);
    for (int i=1; special_position[i] != '\0'; i++) {
        //printf("spcial_position[%d]: %c\n", i, special_position[i]);
        command_arg[i-1] = special_position[i];
        //printf("command_arg[%d]: %c\n", i-1, command_arg[i-1]);
    }
    len = (int)strlen(command_arg);
    //printf("command_arg len: %d\n", len);
    command_arg[len] = '\0';
    my_command = command_arg;
    printf("my_command: %s\n", my_command);
    return my_command;
}
int main(int argc, const char * argv[]) {
    char *test = "cd xxx";
    char *outcome;
    outcome = cut_command_head(test, " ");
    printf("outcome: %s\n",outcome);
    return 0;
}
Here
my_command = command_arg;
you assign the address of a local variable to the variable you are returning. This local variable lives on the stack of cut_command_head().
This address is invalid once the function has returned. Accessing the memory returned by cut_command_head() invokes undefined behaviour.
You need to allocate memory at some point, somewhere.
The easiest way is to use strdup() (if available):
my_command = strdup(command_arg);
A portable approach is to use malloc() followed by copying the data in question:
my_command = malloc(strlen(command_arg) + 1); /* +1 for the terminating '\0' */
if (NULL != my_command)
{
    strcpy(my_command, command_arg);
}
Also this looks strange:
len = (int)strlen(command_arg);
//printf("command_arg len: %d\n", len);
command_arg[len] = '\0';
Just remove it and initialise command_arg to all zeros right at the beginning to make sure it is always 0-terminated:
char command_arg[256] = {0};
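Pulling these points together, the function could be sketched as below. This is one possible shape, not the only one. It makes two assumptions that differ from the original: it skips the whole separator rather than exactly one character, and it returns NULL when the separator is not found (the original would dereference a NULL pointer in that case). The fixed 256-byte buffer is no longer needed because the result is sized from the input.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of the text that follows the first
 * occurrence of separator in command, or NULL if the separator is absent
 * or the allocation fails. The caller must free() the result. */
char *cut_command_head(const char *command, const char *separator)
{
    const char *pos = strstr(command, separator);
    if (pos == NULL)
        return NULL;
    pos += strlen(separator);               /* skip past the separator */

    char *tail = malloc(strlen(pos) + 1);   /* +1 for the terminating '\0' */
    if (tail != NULL)
        strcpy(tail, pos);
    return tail;
}
```

In main, outcome = cut_command_head("cd xxx", " ") then yields "xxx", and the caller releases it with free(outcome) when done.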