I'm writing an nginx module in C and am having some super bizarre results. I've extracted a function from my module to test its output as well as the relevant nginx type/macro definitions.
I'm building a struct in my build_key_hash_pair function, then doing a printf() on the contents in main. When I printf the data inside the inner function, main's output is valid. When I remove the printf inside the inner function, main prints an empty string. This is confusing because after the function call to build_key_hash_pair I am not operating on the data except to display it. Here is the code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
typedef struct ngx_str_t {
size_t len;
char *data;
} ngx_str_t;
typedef uintptr_t ngx_uint_t;
typedef struct key_hash_pair {
ngx_uint_t hash;
ngx_str_t key;
} key_hash_pair;
#define ngx_string(str) { sizeof(str) - 1, (char *) str }
#define ngx_str_set(str, text) \
(str)->len = sizeof(text) - 1; (str)->data = (char *) text
#define ngx_hash(key, c) ((ngx_uint_t) key * 31 + c)
#define ngx_str_null(str) (str)->len = 0; (str)->data = NULL
void build_key_hash_pair(key_hash_pair *h, ngx_str_t api_key, ngx_str_t ip);
int main (int argc, char const *argv[])
{
ngx_str_t api_key = ngx_string("86f7e437faa5a7fce15d1ddcb9eaeaea377667b8");
ngx_str_t ip = ngx_string("123.123.123.123");
key_hash_pair *pair;
pair = malloc(sizeof(key_hash_pair));
build_key_hash_pair(pair, api_key, ip);
printf("api_key = %s\n", api_key.data);
printf("ip = %s\n", ip.data);
printf("pair->key = %s\n", pair->key.data);
printf("pair->hash = %u\n", (unsigned int)pair->hash);
return 0;
}
void build_key_hash_pair(key_hash_pair *h, ngx_str_t api_key, ngx_str_t ip)
{
ngx_str_null(&h->key);
char str[56];
memset(str, 0, sizeof(str));
strcat(str, api_key.data);
strcat(str, ip.data);
ngx_str_set(&h->key, str);
ngx_uint_t i;
for (i = 0; i < 56; i++) {
h->hash = ngx_hash(&h->hash, h->key.data[i]);
}
}
Here is the output when I do a printf("hello") inside the build_key_hash_pair function:
helloapi_key = 86f7e437faa5a7fce15d1ddcb9eaeaea377667b8
ip = 123.123.123.123
pair->key = 86f7e437faa5a7fce15d1ddcb9eaeaea377667b8123.123.123.123
pair->hash = 32509824
And here is the (bizarre) output when I do NOT printf inside build_key_hash_pair:
api_key = 86f7e437faa5a7fce15d1ddcb9eaeaea377667b8
ip = 123.123.123.123
pair->key =
pair->hash = 32509824
As you can see, pair->key has no data. In gdb, if I breakpoint right after the call in main to build_key_hash_pair, pair->key contains the appropriate data. But after the first call to printf, it is blanked out. The memory address stays the same, but the data is just gone. Can anyone tell me what in the world I'm doing wrong?
This line is a problem:
ngx_str_set(&h->key, str);
Here str is a local variable, and you are putting a pointer to it inside h->key, which will be returned to the caller. After build_key_hash_pair returns, the pointer will no longer be valid. When you didn't call any other function, the pointer happened to still point to the same value, but this is not something you can rely on. The call to printf overwrote that part of the stack.
What you need is either to dynamically allocate the string with malloc or strdup, or put an array inside the key_hash_pair struct to hold the key (possible if the key is always the same size).
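A minimal sketch of the heap-allocation option, assuming the key should be the concatenation of api_key and ip. The name build_key_hash_pair_heap and the int return value are made up for illustration. The hash here starts from 0 and folds in the previous hash value (not its address), which is my guess at what the loop was meant to do.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct { size_t len; char *data; } ngx_str_t;
typedef uintptr_t ngx_uint_t;
typedef struct { ngx_uint_t hash; ngx_str_t key; } key_hash_pair;

/* Hypothetical replacement for build_key_hash_pair: the key lives on the
   heap, so it stays valid after the function returns.
   Returns 0 on success, -1 if allocation fails. */
int build_key_hash_pair_heap(key_hash_pair *h, ngx_str_t api_key, ngx_str_t ip)
{
    size_t len = api_key.len + ip.len;
    char *buf = malloc(len + 1);
    if (buf == NULL) {
        return -1;
    }
    memcpy(buf, api_key.data, api_key.len);
    memcpy(buf + api_key.len, ip.data, ip.len);
    buf[len] = '\0';

    h->key.len = len;
    h->key.data = buf;

    /* Same multiply-by-31 step as the ngx_hash macro, starting from 0
       (assumption: the original meant to hash the value, not &h->hash) */
    h->hash = 0;
    for (size_t i = 0; i < len; i++) {
        h->hash = h->hash * 31 + (unsigned char) buf[i];
    }
    return 0;
}
```

The caller owns pair->key.data and has to free() it. Inside a real nginx module you would more likely take the memory from a pool (for example with ngx_palloc) instead of calling malloc directly.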
build_key_hash_pair uses stack-based array str to populate the data field in the key of h. When you exit from the function, that pointer is no longer valid since str goes out of scope.
Your results could be anything from apparently correct operation to a program crash. A printf inside the function prints the key correctly, but reading the key after the function has returned is not reliable. The code that sets the key needs to allocate memory and copy the text string into it (to be freed later, of course), because ngx_str_set only stores the pointer it is given.
I would replace all those macros with functions or inline code, personally.
The problem is in the build_key_hash_pair function, specifically with the stack variable char str[56]; which is assigned to the key_hash_pair via the macro ngx_str_set.
Since the stack frame containing char str[56]; disappears when the function returns, all bets are off for the value of the pair's data once the function ends.
I have tried solving an exercise where we have to return a struct containing the first whitespace-separated word and its length of a given string. Example: "Test string" returns {"Test", 4}.
To solve this problem I have implemented the following function:
struct string whitespace(char* s){
char* t = s;
size_t len = 0;
while(*t != ' '){
len++;
t++;
}
char out[len+1];
strncpy(out, s, len);
if(len>0){
out[len] = '\0';
}
//printf("%d\n",len);
struct string x = {out, len};
return x;
}
with the struct defined as follows:
struct string{
char* str;
size_t len;
};
If I run the following main function:
int main(){
char* s = "Test string";
struct string x = whitespace(s);
printf("(%s, %d)\n", x.str, x.len);
return 0;
}
I get this output:
(, 4)
where when I remove the comment //printf("%d\n",len); I get:
4
(Test, 4)
In fact, the string (Test, 4) is output whenever I print out a given variable in the function whitespace(char* s). Also when using different gcc optimization flags such as -O3 or -Ofast the result is correct even without the printing of the variables in the function.
Did I bump into some kind of undefined behavior? Can somebody explain what is happening here?
The struct you're returning includes a char *, which you point to the local variable out. That variable goes out of scope when the function returns, so dereferencing that pointer invokes undefined behavior.
Rather than using a VLA, declare out as a pointer and allocate memory for it to point to. Then you can safely set the struct member to that address, and the memory stays valid until you free it.
char *out = malloc(len+1);
Also, be sure to free this memory before exiting your program.
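Put together, the function could look roughly like this. It is a sketch, not the only way to do it: the extra check for the terminating '\0' and the NULL result on allocation failure are my additions and were not in the question.

```c
#include <stdlib.h>
#include <string.h>

struct string {
    char *str;
    size_t len;
};

/* Returns the first space-separated word of s in heap memory.
   On allocation failure, str is NULL and len is 0.
   The caller must free() the str member. */
struct string whitespace(const char *s)
{
    size_t len = 0;
    /* Also stop at the terminator so a string without spaces is safe */
    while (s[len] != '\0' && s[len] != ' ') {
        len++;
    }

    struct string x = { NULL, 0 };
    char *out = malloc(len + 1);
    if (out != NULL) {
        memcpy(out, s, len);
        out[len] = '\0';
        x.str = out;
        x.len = len;
    }
    return x;
}
```

In main you would call free(x.str) once you are done with the result. Note also that size_t should be printed with %zu rather than %d.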
I am doing a bit of studying about C pointers and how to transfer them to functions, so I made this program
#include <stdio.h>
char* my_copy(pnt);
void main()
{
int i;
char a[30];
char* p,*pnt;
printf("Please enter a string\n");
gets(a);
pnt = a;
p = my_copy(pnt);
for (i = 0; i < 2; i++)
printf("%c", p[i]);
}
char* my_copy(char* pnt)
{
char b[3];
char* g;
g = pnt;
b[0] = *pnt;
for (; *pnt != 0; pnt++);
pnt--;
b[1] = *pnt;
b[2] = NULL;
return b;
}
It's supposed to take a string using only pointers, pass a pointer to the string to the function my_copy, and return a pointer to a new string which contains the first and the last letter of the original string. Now the problem is that the p value does receive the 2 letters but I can't seem to print them. Does anyone have an idea why?
I see five issues with your code:
char* my_copy(pnt); is wrong. A function prototype specifies the types of the parameters, not their names. It should be char *my_copy(char *).
void main() is wrong. main should return int (and a parameterless function is specified as (void) in C): int main(void).
gets(a); is wrong. Any use of gets is a bug (buffer overflow), and gets itself was removed from the standard library in C11. Use fgets instead.
b[2] = NULL; is a type error. NULL is a pointer, but b[2] is a char. You want b[2] = '\0'; instead.
my_copy returns the address of a local variable (b). By the time the function returns, the variable is gone and the pointer is invalid. To fix this, you can have the caller specify another pointer (which tells my_copy where to store the result, like strcpy or fgets). You can also make the function return dynamically allocated memory, which the caller then has to free after it is done using it (like fopen / fclose).
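Here is a rough sketch of the first option from the last point, where the caller supplies the destination buffer. The parameter names dst and dst_size are made up for illustration, and it uses strlen instead of walking the pointer by hand.

```c
#include <stddef.h>
#include <string.h>

/* Writes the first and last character of src into dst and returns dst.
   dst must have room for at least 3 bytes; returns NULL if dst_size is
   too small. An empty src yields an empty dst. */
char *my_copy(char *dst, size_t dst_size, const char *src)
{
    if (dst_size < 3) {
        return NULL;
    }
    size_t n = strlen(src);
    if (n == 0) {
        dst[0] = '\0';
        return dst;
    }
    dst[0] = src[0];
    dst[1] = src[n - 1];
    dst[2] = '\0';
    return dst;
}
```

The caller would declare something like char out[3]; and call my_copy(out, sizeof out, a). Since out belongs to the caller, it is still alive when the result is printed.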
You're returning an array from my_copy that you declared within the function. This was allocated on the stack and so is invalid when the function returns.
You need to allocate the new string on the heap:
#include <stdlib.h>
char *b = malloc(3); /* replaces the local array char b[3]; */
if (b) {
/* Do your funny copy here */
}
Don't forget to free() the returned string when you've finished with it.
I want to outsource a string operation to a function and then ask the results in main. This doesn't work, and I don't understand why.
#include <stdio.h>
#include <string.h>
void splitRequest(char request[], char method[], char ressource[], char proto[]) {
method = strtok(request, " ");
ressource = strtok(NULL, " ");
proto = strtok(NULL, " ");
printf("\nResult:\n\nmethod:\t\t%s\nressource:\t%s\nproto:\t\t%s\n",method,ressource,proto);
}
int main()
{
char method[50], ressource[50], proto[50], request[50];
memset(method, '\0', 50);
memset(ressource, '\0', 50);
memset(proto, '\0', 50);
memset(request, '\0', 50);
strcpy(request,"Get /index.htm HTTP/1.1");
//rehash query
splitRequest(request, method, ressource, proto);
//check Results
printf("\nResult:\n\nmethod:\t\t%s\nressource:\t%s\nproto:\t\t%s\n",method,ressource,proto);
return 0;
}
In the splitRequest function all the arguments are pointers. And more importantly they are local variables. So all changes to them (like making them point anywhere else) will only affect the local variable, nothing else.
The solution is to copy to the memory instead. Perhaps something like
char *temp;
if ((temp = strtok(request, " ")) != NULL)
{
strcpy(method, temp);
}
// And so on...
A little elaboration about what I mean...
In C all function arguments are passed by value. That means their value is copied and the function only has a local copy of the value. Changing the copy will of course not change the original value.
It is the same with pointers. When you call your splitRequest function the pointers you pass are copied. Inside the function the variable method (for example) is pointing to the memory of the array you defined in the main function. When you assign to this variable, like you do with
method = strtok(...);
you only modify the local variable, the local copy of the pointer. When the function returns the local variable method goes out of scope, and all changes to it are lost.
There are two solutions to this problem. One is to emulate pass by reference (something which C doesn't have, which is why it must be emulated), but that will not work as long as you have an array. Therefore the second and easiest solution: To copy to the memory pointed to by the local variable method, which is what I show above.
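For completeness, the copying approach applied to all three fields might look something like this. The helper copy_token and the extra field_size parameter are my own additions (assumptions, not part of the question) so that a long token gets truncated instead of overflowing the caller's arrays.

```c
#include <stdio.h>
#include <string.h>

/* Copies one token into dst (capacity dst_size); an absent token becomes "".
   snprintf truncates rather than overflowing dst. */
static void copy_token(char *dst, size_t dst_size, const char *token)
{
    snprintf(dst, dst_size, "%s", token != NULL ? token : "");
}

/* Splits "METHOD RESOURCE PROTO" in place; request is modified by strtok.
   Each output buffer is assumed to hold field_size bytes. */
void splitRequest(char *request, char *method, char *ressource, char *proto,
                  size_t field_size)
{
    copy_token(method, field_size, strtok(request, " "));
    copy_token(ressource, field_size, strtok(NULL, " "));
    copy_token(proto, field_size, strtok(NULL, " "));
}
```

With the arrays from the question, the call becomes splitRequest(request, method, ressource, proto, sizeof method); and the printf in main then sees the copied strings.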
An important note though: When you call the splitRequest function, passing the arrays, the arrays themselves are not passed. Instead the arrays decay to pointers to their first elements, and inside the function the variables defined by the arguments are pointers and not arrays.
The call
splitRequest(request, method, ressource, proto);
is equal to
splitRequest(&request[0], &method[0], &ressource[0], &proto[0]);
And the function declaration
void splitRequest(char request[], char method[], char ressource[], char proto[])
is equal to
void splitRequest(char *request, char *method, char *ressource, char *proto)
So no, the strings are not copied, only the pointers. Otherwise you would have gotten an error when assigning to the pointer, since you can't assign to arrays, only copy to them.
Found another workaround that is even more effective:
#include <stdio.h>
#include <string.h>
void splitRequest(char request[], char **method, char **ressource, char **proto) {
*method = strtok(request, " ");
*ressource = strtok(NULL, " ");
*proto = strtok(NULL, " ");
printf("\nResult:\n\nmethod:\t\t%s\nressource:\t%s\nproto:\t\t%s\n",*method,*ressource,*proto);
}
int main()
{
char *method, *ressource, *proto, request[50];
memset(request, '\0', 50);
strcpy(request,"Get /index.htm HTTP/1.1");
//rehash query
splitRequest(request, &method, &ressource, &proto);
//check Results
printf("\nResult:\n\nmethod:\t\t%s\nressource:\t%s\nproto:\t\t%s\n",method,ressource,proto);
return 0;
}
Whenever the environment variable's value is larger than its key in this method, I get a buffer overflow. Target is part of a dynamically allocated two dimensional array for tokens. Whenever I replace the token that's an environment variable with a value longer than it, it flows into the next token. I've tried adding a realloc to try and fix it, but it doesn't work or leads to a segfault.
If anyone has any suggestions or can point me at something I'm overlooking, I'd greatly appreciate it, because I have a feeling I'll be kicking myself when I find it out anyway.
The method is:
void envReplace(ENV *evlist, char *Target)
{
if (Target[0] == '#')
{
memmove(Target, Target+1, strlen(Target));
for(q = 0; q<16; q++)
{
if(evlist[q].envVariable!=NULL)
{
if(strcmp(Target, evlist[q].envVariable)==0)
{
//this is where I'd add the realloc as realloc(Target, strlen(evlist[q].Value))
strcpy(Target, evlist[q].Value);
return;
}
}
}
printf("Variable not found\n");
}
else
{
printf("A value that didn't start with # was an argument\n");
return;
}
}
The data structure ENV is:
typedef struct envStorage
{
char *envVariable;
char *Value;
}ENV;
You can pass a pointer to a pointer, like this, and then you can call realloc() inside the function, modifying the original pointer.
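#include <stdio.h>
#include <stdlib.h>
void func (char **to_change) {
// Changes the char *target in main()
char *tmp = realloc(*to_change, 20);
if (tmp == NULL) return; // on failure the old block is still valid
*to_change = tmp;
sprintf (*to_change, "Blablablablabla\n");
}
int main (int argc, char **argv) {
char *target = malloc (10);
if (target == NULL) return 1;
target[0] = '\0'; // so target is printable even if realloc fails
func(&target);
printf("%s", target);
free(target);
return 0;
}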
In this example, func() writes to the original pointer char *target in the main() function.
What happens in your code is that you realloc() and assign to a copy of the pointer. The pointer is copied when you call envReplace(). When this function returns, the original pointer still contains the old memory address, which may no longer be valid allocated memory (realloc() frees the old block whenever it has to move the data).
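Applied to envReplace, that could look roughly like the sketch below. Several things here are assumptions on my part: the count parameter, the int return value, and that each token is its own malloc'd block (realloc only works on a pointer that came from malloc/realloc). Note the + 1 for the terminator, which the realloc in your comment is missing.

```c
#include <stdlib.h>
#include <string.h>

typedef struct envStorage {
    char *envVariable;
    char *Value;
} ENV;

/* Replaces the token *target ("#NAME") with the matching value.
   count is the number of entries in evlist.
   Returns 0 on success, -1 if the token is not a variable, is unknown,
   or memory runs out; *target is left unchanged on failure. */
int envReplace(const ENV *evlist, size_t count, char **target)
{
    if (*target == NULL || (*target)[0] != '#') {
        return -1;
    }
    const char *name = *target + 1;   /* skip the leading '#' */
    for (size_t q = 0; q < count; q++) {
        if (evlist[q].envVariable != NULL &&
            strcmp(name, evlist[q].envVariable) == 0) {
            /* +1 for the terminating '\0', which strlen does not count */
            char *grown = realloc(*target, strlen(evlist[q].Value) + 1);
            if (grown == NULL) {
                return -1;
            }
            strcpy(grown, evlist[q].Value);
            *target = grown;   /* the caller's pointer now sees the new block */
            return 0;
        }
    }
    return -1;
}
```

The call site then passes the address of the token pointer, for example envReplace(evlist, 16, &tokens[i]), where tokens is a hypothetical name for your token array.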
I have a function which takes a file, reads it line by line, puts every line in a char *[], puts this two-dimensional array in a struct, and returns this struct:
wordlist.h:
#ifndef H_WORDLIST
#define H_WORDLIST
typedef struct {
char **chWordsList;
int listlen;
}Wordlist;
Wordlist getWordlistFromFile(char *chFilename);
char *getRandomWord();
#endif
The function (plus headers):
#include "wordlist.h"
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <string.h>
#define WORDSIZE 100
Wordlist getWordlistFromFile(char *chFilename){
FILE *file = fopen(chFilename,"r");
if (file == NULL){
printf("Unable to open file %s. Check if the file exists and can be read by this user.\n",chFilename);
exit(1);
}
char chWord[WORDSIZE];
int intFileSize = 0;
//First: count the number of lines in the file
while((fgets(chWord,WORDSIZE,file) != NULL)){
++intFileSize;
}
rewind(file);
char *chWordList[intFileSize];
for (int count = 0; (fgets(chWord,WORDSIZE,file) != NULL); ++count){
chWordList[count] = malloc(strlen(chWord) + 1); // +1 for the terminating '\0'
strcpy(chWordList[count],chWord);
chWordList[count][strlen(chWord) -1] = 0;
}
fclose(file);
Wordlist wordlist;
wordlist.chWordsList = chWordList;
wordlist.listlen = intFileSize;
for (int i = 0; i < wordlist.listlen; ++i){
printf("%s\n", wordlist.chWordsList[i]);
}
return wordlist;
}
So far this works great. The last for loop prints exactly every line of the given file, all fully expected behaviour, works perfect. Now, I actually want to use the function. So: in my main.c:
Wordlist list = getWordlistFromFile(strFilePath);
for (int i = 0; i < list.listlen; ++i){
printf("%s\n", list.chWordsList[i]);
}
This gives me the weirdest output:
abacus
wordlist
(null)
(null)
��Ⳏ
E����H�E
gasses
While the output should be:
abacus
amused
amours
arabic
cocain
cursor
gasses
It seems to me almost like some pointers get freed or something, while others stay intact. What is going on? Why is wordlist perfect before the return and broken after?
char *chWordList[intFileSize]
This array of strings is allocated on the stack, since it's declared as a local variable of getWordlistFromFile. When the function exits, its stack frame is released and the array is no longer valid.
You should use the same approach you used for the individual strings: allocate it on the heap.
char **chWordList = malloc(intFileSize*sizeof(char*));
This way the array outlives the function call and you will be able to use it after the function returns.
Because you are returning pointers to objects whose lifetime has expired. In particular, chWordsList inside the return value points to an object whose lifetime ends when the function returns. When you dereference that pointer you get undefined behavior (UB); therefore any result would not be surprising.
What you need to do is malloc memory for the chWordList instead of declaring it as a local array:
char **chWordList = malloc(intFileSize * sizeof(char*));
Change
char *chWordList[intFileSize];
to
char **chWordList = malloc(sizeof(char *) * intFileSize);
i.e. allocate chWordList on the heap and store that pointer in the Wordlist.
Your code returns the array chWordList, which is allocated on the stack, so it is no longer valid once getWordlistFromFile() completes and returns to main().
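A sketch of what the heap-based version might look like as a whole. This is a variation on the answers above, not the same code: it reads from an already opened FILE * and grows the pointer array with realloc in a single pass instead of counting lines first. The names readWordlist and freeWordlist are made up for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define WORDSIZE 100

typedef struct {
    char **chWordsList;
    int listlen;
} Wordlist;

/* Reads one word per line from an open stream into heap memory.
   On allocation failure the list collected so far is returned. */
Wordlist readWordlist(FILE *file)
{
    Wordlist wordlist = { NULL, 0 };
    char chWord[WORDSIZE];

    while (fgets(chWord, WORDSIZE, file) != NULL) {
        chWord[strcspn(chWord, "\n")] = '\0';   /* strip the newline, if any */

        /* Grow the pointer array by one slot; it lives on the heap */
        char **grown = realloc(wordlist.chWordsList,
                               (wordlist.listlen + 1) * sizeof *grown);
        if (grown == NULL) {
            break;
        }
        wordlist.chWordsList = grown;

        char *copy = malloc(strlen(chWord) + 1);
        if (copy == NULL) {
            break;
        }
        strcpy(copy, chWord);
        wordlist.chWordsList[wordlist.listlen++] = copy;
    }
    return wordlist;
}

/* Releases everything readWordlist allocated. */
void freeWordlist(Wordlist *wordlist)
{
    for (int i = 0; i < wordlist->listlen; ++i) {
        free(wordlist->chWordsList[i]);
    }
    free(wordlist->chWordsList);
    wordlist->chWordsList = NULL;
    wordlist->listlen = 0;
}
```

Returning the struct by value is fine here, because the only things copied are a pointer to heap memory and an int. The caller is then responsible for calling freeWordlist(&list) when done.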