Compare strings - c

Thanks! Works perfectly now. Java has made me stupid :(
I am having some difficulty comparing strings in C. I get correct output when I don't use my isMorse function, but when I use it the output becomes inaccurate and displays random characters. As far as I can tell, the variable "morse" is actually changed when strcmp is called on it. I am thinking that it has to do with "morse" not being a constant, but I am unsure of how to remedy it.
Thanks!!
char *EnglishToMorse(char english)
{
static char *morse;
int i;
for (i = 0; i < LOOKUP_SIZE; i++)
{
if (lookup[i].character == english)
{
morse = lookup[i].morse;
return morse;
}
}
morse = &english; // Problem was here!!!
return morse;
}

I have a little guess. The function EnglishToMorse() might be returning a pointer to memory from the stack. If so, running another function after EnglishToMorse() will alter that memory. This would be due to a mistake in EnglishToMorse() -- declaring a local array of char and returning a pointer to it.
Without seeing the code for EnglishToMorse(), this is just a stab in the dark. You could provide us more code to look at, and win.

You have a static variable in EnglishToMorse, but it's the wrong one. There's no need for morse to be static -- you simply return it. But you do need english to be static -- rather than on the stack -- since you return its address. Also, it needs to be a NUL-terminated string. Do something like
char *EnglishToMorse(char english)
{
static char save_english[2]; /* initialized to 0's */
int i;
for (i = 0; i < LOOKUP_SIZE; i++)
if (lookup[i].character == english)
return lookup[i].morse;
save_english[0] = english;
return save_english;
}
Note, however, that the caller of EnglishToMorse must use the result or save it before EnglishToMorse is called again, since the second call may overwrite save_english.
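A small sketch of that caveat, assuming a two-entry stand-in for lookup[] and a hypothetical save_morse() helper (neither is from the question). It copies the result into a caller-owned buffer so that a later call cannot clobber it:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical two-entry table standing in for the question's lookup[]. */
struct entry { char character; char *morse; };
static struct entry lookup[] = { { 'A', ".-" }, { 'B', "-..." } };
#define LOOKUP_SIZE (sizeof lookup / sizeof lookup[0])

char *EnglishToMorse(char english)
{
    static char save_english[2]; /* zero-initialized, so [1] stays '\0' */
    size_t i;
    for (i = 0; i < LOOKUP_SIZE; i++)
        if (lookup[i].character == english)
            return lookup[i].morse;
    save_english[0] = english;
    return save_english;
}

/* Hypothetical helper: copies the result for c into out (capacity n),
   so the text survives later calls to EnglishToMorse. */
void save_morse(char c, char *out, size_t n)
{
    const char *m = EnglishToMorse(c);
    assert(n > strlen(m)); /* caller must supply a large enough buffer */
    strcpy(out, m);
}
```

Two unknown characters looked up back to back return the same pointer, so only the copies made by save_morse() keep their own text.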

The reason your morse variable appears to change is because it points to an area on the stack. The reason it points to an area on the stack is because you assigned it the address of your parameter english, which got pushed onto the stack when you called your function then popped off the stack once the function completed.
Now your morse variable will point to whatever memory takes that same location on the stack, which will constantly change throughout the lifetime of your program.
In my opinion, the best way to fix this problem would be to return a NULL pointer from EnglishToMorse if the character is not A-Z... then check for the NULL pointer in your isMorse function. After all, it's good practice to check for NULL pointers in code.
char* EnglishToMorse(char english)
{
int i;
english = toupper(english);
for (i = 0; i < LOOKUP_SIZE; i++)
{
if (lookup[i].character == english)
return lookup[i].morse;
}
return NULL;
}
int isMorse(char* morse)
{
int i;
/* Check for NULL, so strcmp doesn't fail. */
if (morse == NULL) return 0;
for (i = 0; i < LOOKUP_SIZE; i++)
{
if(strcmp(morse, lookup[i].morse) == 0)
return 1;
}
return 0;
}

It looks like the problem is likely in this function:
char *EnglishToMorse(char english) {
static char *morse;
// ...
morse = &english;
return morse;
}
You are returning the address of a parameter (english) that's passed into the function. This parameter ceases to exist after the function returns (and before the caller gets a chance to actually see the value). It appears as though you've attempted to fix this by declaring the morse variable static, but this only makes the morse variable itself static, not whatever it points to.
Also, strings in C must be terminated with a NUL character. By returning a pointer to a single character (as in english), there is no guarantee that the next byte in memory is or is not a NUL character. So the caller, who is expecting to see a NUL-terminated string, may get more than they bargained for.

char *EnglishToMorse(char english)
and
morse = &english;
are the problem.
You should never return a pointer to a local variable or a function parameter.

Related

Why does a function share the same struct instance?

The user specifies the number of lines in the output in the arguments (as the size of the page in pagination), by pressing the key he gets the next lines. How it works now:
Let's say the user chose to receive 1 row at a time:
first string
first string
second string
first string
second string
third string
struct result {
char part[32768];
int is_end_of_file;
};
struct result readLines(int count) {
int lines_readed = 0;
struct result r;
if (count == 0) {
count = -1;
}
while (count != lines_readed) {
while (1) {
char sym[1];
sym[0] = (char) fgetc(file);
if (feof(file)) {
r.is_end_of_file = 1;
return r;
}
strcat(r.part, sym);
if (*"\n" == sym[0]) {
break;
}
}
lines_readed++;
}
return r;
}
int main(int argc, char *argv[]) {
file = fopen(argv[1], "r");
while (1) {
struct result res = readLines(atoi(argv[2]));
printf("%s", res.part);
if (res.is_end_of_file) {
printf("\nEnd of file!\n");
break;
}
getc(stdin);
}
closeFile();
return 0;
}
I know that when I define a struct in the readLines function, it is already filled with previous data. Forgive me if this is a dumb question, I'm a complete newbie to C.
I'm not sure what is the question here, however I'll do my best to address what I understand. I assume the problem lies somewhere around the "previous data" you mentioned in the title and in the comments to the question.
Let's first set an example program:
#include <stdio.h>
struct result {
char part[10];
};
int main (int argc, char *argv[]) {
struct result r;
printf(r.part);
return 0;
}
The variable r has a block scope, so it has automatic storage duration. Since it has automatic storage duration, and no initializer is provided, it is initialized to an indeterminate value (as mentioned by UnholySheep and n. 1.8e9-where's-my-share m. in the comments to the question). I don't yet get all the C intricacies, but based on this, I guess you cannot rely on what the value of r will be.
Now, in the comments to the question you try to understand how it is possible that you can access some data that was not written by the current invocation of your program. I cannot tell you exactly how that is possible, but I suspect it is platform-specific rather than C-specific. Maybe the following will help you:
What is Indeterminate value?
What happens to memory after free()?
Why memory isn't zero out from malloc?
Going further, in the line
printf(r.part);
first we try to access a member part of r, and then we call printf with the value of this member. Accessing a variable of an indeterminate value results in undefined behavior, according to this. So, in general, you cannot rely also on anything that happens after invoking r.part (it doesn't mean there is no way of knowing what will happen).
There is also another problem with this code. printf's first parameter has the type const char *, according to man 3 printf, and r.part (a char[10] that decays to char *) is compatible with it, so there is no type mismatch. The format string, however, is not a string literal, and the following warning is produced when the code is compiled with gcc with the option -Wformat-security:
warning: format not a string literal and no format arguments [-Wformat-security]
The risk is that any % sequence that happens to be in the buffer would be interpreted as a conversion specifier. But as we know that there already happened undefined behavior in the code, this seems less important.
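As a side note, a correct invocation of printf could be in this case:
printf("%p", (void *)r.part);
r.part is an array that decays to a pointer to its first element, therefore I use the %p conversion specifier, and cast the value to (void *). This prints the address of the array, not its contents.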
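One way to sidestep the indeterminate value discussed above is to give the struct an initializer. The following is only a sketch: make_result() and append_char() are hypothetical helpers, and the buffer is shortened to 32 bytes for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct result {
    char part[32];
    int is_end_of_file;
};

/* Returns a struct whose members all start at zero, so part is "". */
struct result make_result(void)
{
    struct result r = {0};
    return r;
}

/* Appends one character to r->part, keeping it NUL-terminated.
   Returns 0 on success, -1 if the buffer is full. */
int append_char(struct result *r, char c)
{
    size_t len;
    assert(r != NULL);
    len = strlen(r->part);
    if (len + 1 >= sizeof r->part)
        return -1;
    r->part[len] = c;
    r->part[len + 1] = '\0';
    return 0;
}
```

With the `= {0}` initializer every call starts from an empty string, and appending through a helper avoids calling strcat with a one-character array that has no terminator.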

"address of stack memory associated with local variable" error when running user-defined function in C [duplicate]

Recently I've been taking the CS50 2020 course from Harvard University as an introduction to C programming. I'm not very experienced with the language or with coding as a whole, so I'm struggling a bit to figure out what is wrong with my code.
I wrote this little function which is supposed to take in a string and, by calling another function, encrypt the text using a Caesar cypher, then return it as a string. Problem is, I can't figure out how to return the character array as a string. I tried adding a NUL char at the end of the array after reading a bit about the problem, and it compiled alright, but when I ran the program I got the following error message:
error: address of stack memory associated with local variable 'result' returned [-Werror,-Wreturn-stack-address]
return result;
^~~~~~
My code:
string encypher(string text)
{
int length = strlen(text);
char result[length];
for(int i = 0; i < length; i++)
{
int letter_c = test_char(text[i]);
result[i] = (char)letter_c;
}
result[length + 1] = '\0';
return result;
}
The problem here is that result, being an array, decays to a pointer to its first element when used in an expression, and that is what is being returned from the function. And because the lifetime of the array ends when the function returns, that pointer now points to an invalid memory location, and attempting to use it invokes undefined behavior.
Instead of creating a local array, use the malloc function to dynamically allocate memory. That memory is valid for the life of the program, or until the returned pointer is passed to free:
string result = malloc(length + 1);
Also, note that you need to set aside one extra byte for the null byte that is used to terminate a string.
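Put together, the malloc approach might look like the sketch below. It assumes CS50's string is just char *, and shift_char() is a hypothetical stand-in for the question's test_char() (a shift of one, assuming an ASCII-style contiguous alphabet).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for test_char(): shifts letters by one place. */
static int shift_char(int c)
{
    if (c >= 'a' && c <= 'z') return 'a' + (c - 'a' + 1) % 26;
    if (c >= 'A' && c <= 'Z') return 'A' + (c - 'A' + 1) % 26;
    return c;
}

/* Returns a newly allocated ciphertext, or NULL if allocation fails.
   The caller owns the result and must free() it. */
char *encypher(const char *text)
{
    size_t length;
    char *result;
    assert(text != NULL);
    length = strlen(text);
    result = malloc(length + 1);  /* +1 for the terminating '\0' */
    if (result == NULL)
        return NULL;
    for (size_t i = 0; i < length; i++)
        result[i] = (char)shift_char(text[i]);
    result[length] = '\0';        /* the terminator goes at index length */
    return result;
}
```

Note that the terminator is written at index length, not length + 1, which would be one past the end of the allocation.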
In the line
return result;
the array decays to a pointer to the first element of the array, so it is effectively the following:
return &result[0];
This array is allocated on the stack in the function encypher, so it will no longer exist when the function returns. Therefore, the returned pointer is a dangling pointer, which means that it is pointing to memory which is no longer allocated and may be overwritten by something else. For this reason, such a pointer should not be used.
In order to allocate memory that will still exist after the function returns, you can either:
Use dynamic memory allocation, such as malloc.
Allocate the memory on the stack of the function calling encypher (instead of in the function encypher itself) and change the parameters of the function encypher to accept a pointer to that array.
In my opinion, the second solution is the cleaner solution, as it allows the caller to decide where and how the memory is allocated. Using that solution, your code would look like this:
void encypher( char *cyphertext, const char *plaintext )
{
int length = strlen(plaintext);
//removed: char result[length];
for(int i = 0; i < length; i++)
{
int letter_c = test_char(plaintext[i]);
cyphertext[i] = (char)letter_c;
}
cyphertext[length] = '\0'; //the terminator belongs at index length, right after the last character
}
The function could now be called like this:
int main( void )
{
char plaintext[23] = "This is the plaintext.";
char cyphertext[23]; //make sure the buffer is large enough to store the cyphertext including the terminating null character
encypher( cyphertext, plaintext );
return 0;
}

printing strings produces garbage even when (I think) they are null terminated

When I run print_puzzle(create_puzzle(input)), I get a bunch of gobbledegook at the bottom of the output, only in the last row. I have no idea why this keeps happening. The output is supposed to be 9 rows of 9 numbers (the input is a sudoku puzzle with zeroes representing empty spaces).
This bunch of code should take that input, make a 2d array of strings and then, with print_puzzle, print those strings out in a grid. They are strings because eventually I will implement a way to display all the values the square could possibly be. But for now, when I print it out, things are screwed up. I even tried putting the null value in every single element of all 81 strings but it still gets screwed up when it goes to print the strings. I'm lost!
typedef struct square {
char vals[10]; // string of possible values
} square_t;
typedef struct puzzle {
square_t squares[9][9];
} puzzle_t;
static puzzle_t *create_puzzle(unsigned char vals[9][9]) {
puzzle_t puz;
puzzle_t *p = &puz;
int i, j, k, valnum;
for (i = 0; i < 9; i++) {
for (j = 0; j < 9; j++) {
puz.squares[i][j].vals[0] = '\0';
puz.squares[i][j].vals[1] = '\0';
puz.squares[i][j].vals[2] = '\0';
puz.squares[i][j].vals[3] = '\0';
puz.squares[i][j].vals[4] = '\0';
puz.squares[i][j].vals[5] = '\0';
puz.squares[i][j].vals[6] = '\0';
puz.squares[i][j].vals[7] = '\0';
puz.squares[i][j].vals[8] = '\0';
puz.squares[i][j].vals[9] = '\0';
valnum = vals[i][j] -'0';
for (k = 0; k < 10; k++){
if ((char)(k + '0') == (char)(valnum + '0')){
char tmpStr[2] = {(char)(valnum +'0'),'\0'};
strcat(puz.squares[i][j].vals, tmpStr);
}
}
}
}
return p;
}
void print_puzzle(puzzle_t *p) {
int i, j;
for (i=0; i<9; i++) {
for (j=0; j<9; j++) {
printf(" %2s", p->squares[i][j].vals);
}
printf("\n");
}
}
In short:
In function create_puzzle(), you are returning a pointer to the local variable puz. A local variable exists only while its own function is executing, so the content referenced by the pointer returned by create_puzzle is indeterminate.
More details:
In C, local variables are usually stored on a "stack" data structure. When the create_puzzle() function is entered, its local variables come alive, and they are dead once the function returns. An implementation of C is not required to leave the garbage you left on the stack untouched so that you can access its original content. C is not a safe language: implementations let you make mistakes and sometimes get away with them. Other memory-safe languages solve this problem by restricting your power. For example in C# you can take the address of a local, but the language is cleverly designed so that it is impossible to use it after the lifetime of the local ends.
This answer is very awesome:
Can a local variable's memory be accessed outside its scope?
In function create_puzzle(), you are returning a pointer of the type puzzle_t. But, the address of variable puz of the type puzzle_t is invalid once you return from the function.
Variables that are declared inside a function are local variables. They can be used only by statements that are inside that function. Returning the address of a local variable doesn't make sense: when the function returns, the local storage it was using on the stack is considered invalid by the program, though it may not get cleared right away. Logically, the value at puz is indeterminate, and accessing it results in undefined behavior.
You can make puz a global variable, and use it the way you are doing right now.
You are returning a local variable here:
return p;
Declare p and puz outside of the function, then it should work.
p points to local memory that is unavailable after the function ends. Returning that leads to problems. Instead, allocate memory.
// puzzle_t puz;
// puzzle_t *p = &puz;
puzzle_t *p = malloc(sizeof *p);
assert(p);
Be sure to free() the memory after the calling code completes using it.
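A fuller sketch of the allocation approach follows. It assumes the input cells hold digit characters '0' to '9', as the question's code implies, and uses calloc so that every vals string starts out empty; other input is simply left as an empty string here.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct square {
    char vals[10]; /* string of possible values */
} square_t;

typedef struct puzzle {
    square_t squares[9][9];
} puzzle_t;

/* Builds a heap-allocated puzzle from digit characters.
   Returns NULL if allocation fails; the caller must free() the result. */
puzzle_t *create_puzzle(unsigned char vals[9][9])
{
    puzzle_t *p;
    assert(vals != NULL);
    p = calloc(1, sizeof *p); /* zero-filled: every vals[] is "" */
    if (p == NULL)
        return NULL;
    for (int i = 0; i < 9; i++)
        for (int j = 0; j < 9; j++)
            if (vals[i][j] >= '0' && vals[i][j] <= '9')
                p->squares[i][j].vals[0] = (char)vals[i][j];
    return p;
}
```

The returned pointer stays valid after create_puzzle returns, so print_puzzle can read it safely before the caller frees it.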

How to make a pointer in a function change the pointee in C

Hee guys,
I have been reading a couple of things about pointers and pointees and started getting curious. The only thing I don't understand is how pointers behave in functions, hence the following code:
#include <stdio.h>
int pointeeChanger(char* writeLocation) {
writeLocation = "something";
return 0;
}
int main(void)
{
char crypted[] = "nothing";
char* cryptedPointer = crypted;
pointeeChanger(cryptedPointer);
printf("The new value is: %s", cryptedPointer);
return 0;
}
My intention is to adjust the pointee, the "crypted" var, through a pointer given to a function. The only thing is that it is not working. Could you please explain to me what is going wrong in my thought process? I am fairly new to C so my errors could be fairly basic.
Thanks in advance!
Greetings,
Kipt Scriddy
C strings are not the best material to learn pointers, because they are implemented as pointers to char. Let's use int instead:
#include <stdio.h>
void pointeeChanger(int* writeLocation) {
// Using dereference operator "*"
*writeLocation = 42; // something
}
int main(void) {
int crypted = 0; // Nothing
pointeeChanger(&crypted); // Taking an address with "&"
printf("The new value is: %d", crypted);
return 0;
}
This works as expected.
Modifying strings in place is a lot harder, because you are forced to deal with memory management issues. Specifically, the string into which you copy must have enough space allocated to fit the new string. This wouldn't work with "nothing" and "something", because the replacement is longer by two characters.
Short answer: writeLocation is a local variable and is a copy of cryptedPointer. When you modify writeLocation, cryptedPointer is not modified.
If you want to modify cryptedPointer, you have to pass a pointer to it, like so:
#include <stdio.h>
int pointeeChanger(char** writeLocation) { /* Note: char** */
*writeLocation = "something"; /* Note: *writeLocation */
return 0;
}
int main(void)
{
char crypted[] = "nothing";
char* cryptedPointer = crypted;
pointeeChanger(&cryptedPointer); /* Note: &cryptedPointer */
printf("The new value is: %s", cryptedPointer);
return 0;
}
There are other issues with this code though. After the call to pointeeChanger(), cryptedPointer no longer points to the crypted array. I suspect you actually wanted to change the contents of that array. This code fails to do that.
To change the value of crypted[] you will need to use strcpy() or (preferably) strncpy(). Also you will need to watch the size of the crypted[] array - "something" is longer than "nothing" and will cause a buffer overflow unless crypted[] is made larger.
This code will modify the original crypted[] array:
#include <stdio.h>
#include <string.h>
#define MAX_STR_LEN 64
/*
* Only char* required because we are not modifying the
* original pointer passed in - we are modifying what it
* points to.
*/
int pointeeChanger(char* writeLocation)
{
/*
* In C, you need to use a function like strcpy or strncpy to overwrite a
* string with another string. Prefer strncpy because it allows you to
* specify a maximum size to copy, which helps to prevent buffer overruns.
*/
strncpy(writeLocation, "something", MAX_STR_LEN);
/* strncpy adds no \0 if the source fills all MAX_STR_LEN bytes */
writeLocation[MAX_STR_LEN] = '\0';
return 0;
}
int main(void)
{
/*
* The +1 is because an extra character is required for the
* null terminator ('\0')
*/
char crypted[MAX_STR_LEN + 1] = "nothing";
pointeeChanger(crypted);
printf("The new value is: %s", crypted);
return 0;
}
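As an alternative sketch (not the answer's method), the destination size can be passed in explicitly and snprintf used instead of strncpy; snprintf always writes a terminator when the size is non-zero. The two-argument signature here is an assumption made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes "something" into dest, which has room for size bytes.
   Returns 0 on success, -1 if the text had to be truncated. */
int pointeeChanger(char *dest, size_t size)
{
    int needed;
    assert(dest != NULL && size > 0);
    needed = snprintf(dest, size, "%s", "something");
    return (needed >= 0 && (size_t)needed < size) ? 0 : -1;
}
```

A caller would pass sizeof crypted as the second argument, so the function never has to guess how large the array is.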
It depends slightly on what you actually want to do:
Do you want to change what cryptedPointer is pointing to, or change the content that cryptedPointer is pointing at?
The second can be done by:
strcpy(writeLocation, "something");
Beware that if "something" is longer than the original string, you'll overflow the buffer, which is a bad thing. So to fix this, you'd have to have char crypted[10] = "nothing";, to make space for the string "something".
You can clearly also do something like:
writeLocation[2] = 'f';
writeLocation[3] = 'f';
and have the printf print "noffing"
but if you want to do the first variant, then you need to pass a pointer to the pointer:
int pointeeChanger(char** writeLocation) {
*writeLocation = "something";
return 0;
}
And then call:
pointeeChanger(&cryptedPointer);
Note that when this returns, cryptedPointer is pointing at a constant string that can't be modified, whereas your original crypted can be modified.
Consider that Tom is hired by Sally to break knuckles for the mafia.
Pass-by-value: If Sally tells Tom to count the number of knuckles he breaks at work today, then Sally has no way of knowing which number Tom has come up with until he returns from the road. They both have a copy of the number "zero" in their heads to begin with, but Tom's number might increase throughout the course of the day.
Note the word "copy". When you pass-by-value to a function, you're passing a copy of the object. When you modify the object within a function, you're modifying the copy instead of the original.
Pass-by-reference: If Sally tells Tom to tally the number of knuckles he breaks in the sky, then she (and anyone else who's interested) can refer to the sky. By changing the sky, Tom would also be changing Sally's number.
edit: C doesn't have pass-by-reference, though it does have pointers, which are reference types. Passing a pointer is still pass-by-value, and a copy with the same pointer value is still formed. Hence, your assignment is to the copy, not the original.

How do I return an array of strings from a recursive function?

How do I return an array of strings from a recursive function?
For example:
char ** jumble( char *jumbStr)//recursive function
{
char *finalJumble[100];
...code goes here...call jumble again..code goes here
return finalJumble;
}
Thanks in advance.
In C, you cannot return a string from a function. You can only return a pointer to a string. Therefore, you have to pass the string you want returned as a parameter to the function (DO NOT use global variables, or function local static variables) as follows:
char *func(char *string, size_t stringSize) {
/* Fill the string as wanted */
return string;
}
If you want to return an array of strings, this is even more complex, above all if the size of the array varies. The best IMHO could be to return all the strings in the same string, concatenating the strings in the string buffer, and an empty string as marker for the last string.
char *string = "foo\0bar\0foobar\0";
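Reading such a packed buffer back could look like this sketch; count_packed() and nth_packed() are hypothetical helper names. Each step skips one string plus its terminator until the empty string is reached.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Counts the strings packed into list, where each string ends with '\0'
   and an empty string marks the end of the list. */
size_t count_packed(const char *list)
{
    size_t count = 0;
    assert(list != NULL);
    while (*list != '\0') {
        count++;
        list += strlen(list) + 1; /* skip this string and its '\0' */
    }
    return count;
}

/* Returns the index-th packed string, or NULL if there are fewer strings. */
const char *nth_packed(const char *list, size_t index)
{
    assert(list != NULL);
    while (*list != '\0') {
        if (index-- == 0)
            return list;
        list += strlen(list) + 1;
    }
    return NULL;
}
```

The string literal above already ends with an implicit '\0' after the explicit one, which is what provides the empty string marker.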
Your current implementation is not correct as it returns a pointer to variables that are defined in the local function scope.
(If you really do C++, then return an std::vector<std::string>.)
Your implementation is not correct since you are returning a pointer to a local variable that goes out of scope as soon as the function returns; you are then left with a dangling pointer and eventually a crash.
If you still want to continue this approach, then pass by reference (&) an array of characters to that function and stop recursing once you have reached the desired end point. Once you are finished, you should have the 'jumbled' characters you need.
You don't :-)
Seriously, your code will create a copy of the finalJumble array on every iteration and you don't want that I believe. And as noted elsewhere finalJumble will go out of scope ... it will sometimes work but other times that memory will be reclaimed and the application will crash.
So you'd generate the jumble array outside the jumble method:
void jumble_client( char *jumbStr)
{
char *finalJumble[100];
jumble(finalJumble, jumbStr);
... use finalJumble ...
}
void jumble( char **jumble, char *jumbStr)
{
...code goes here...call jumble again..code goes here
}
And of course you'd use the stl datatypes instead of char arrays and you might want to examine whether it might be sensible to write a jumble class that has the finalJumble data as a member. But all that is a little further down the road. Nevertheless once you got the original problem solved try to find out how to do that to learn more.
I would pass a vector of strings as a parameter, by reference. You can always use the return value for error checking.
typedef std::vector<std::string> TJumbleVector;
int jumble(char* jumbStr, TJumbleVector& finalJumble) //recurring function
{
int err = 0; // error checking
...code goes here...call jumble again..code goes here
// finalJumble.push_back(aGivenString);
return err;
}
If you want to do it in C, you can keep track of the number of strings, do a malloc at the last recursive call, and fill the array after each recursive call. You should keep in mind that the caller should free the allocated memory. Another option is that the caller does a first call to see how much space he needs for the array, then does the malloc, and the call to jumble:
char** jumble(char* jumbStr)
{
return recursiveJumble(jumbStr, 0);
}
char** recursiveJumble(char* jumbStr, unsigned int numberOfElements)
{
char** ret = NULL;
if (/*baseCase*/)
{
ret = (char**) malloc(numberOfElements * sizeof(char*));
}
else
{
ret = recursiveJumble(/*restOfJumbStr*/, numberOfElements+1);
ret[numberOfElements] = /*aGivenString*/;
}
return ret;
}
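To make the pattern concrete, here is a runnable sketch of the same idea on a simpler, hypothetical task: collecting every suffix of a string. The recursion counts its depth, the base case allocates the array, and each frame fills its own slot on the way back. The NULL terminator slot is an added convenience, not part of the outline above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Recursive helper: each call owns slot `index` of the final array.
   The base case (empty string) allocates index + 1 slots and stores
   a NULL terminator in the last one. */
static const char **collect_suffixes(const char *s, size_t index)
{
    const char **ret;
    if (*s == '\0') {
        ret = malloc((index + 1) * sizeof *ret);
        if (ret != NULL)
            ret[index] = NULL;
        return ret;
    }
    ret = collect_suffixes(s + 1, index + 1);
    if (ret != NULL)
        ret[index] = s; /* points into the caller's string, not a copy */
    return ret;
}

/* Returns a NULL-terminated array of suffixes, or NULL on failure.
   The caller must free() the array (but not the strings). */
const char **suffixes(const char *s)
{
    assert(s != NULL);
    return collect_suffixes(s, 0);
}
```

For "abc" this yields "abc", "bc", "c" and then NULL. Only the array itself is heap memory, so a single free() releases it.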

Resources