I am student and I am writing HTTP proxy application in C. I have trouble with memory management. In all my previous applications I simply wrote a wrapper around malloc which aborted when malloc failed.
void *xmalloc(size_t size)
{
void *ptr;
assert(size);
ptr = malloc(size);
if (!ptr)
abort();
return ptr;
}
This I now find insufficient as I just want to refuse client and continue serving other clients when memory allocation fails due to temporary shortage of memory. If I don't want to clutter my code with checks after each malloc call (I have quite lot of them per function in parsing code), what are other options to handle memory management and which one is the best for my purposes and how what is a common way for server applications to handle memory management and shortage of memory?
Consider this function from my current code which parses one line from header portion of HTTP message (xstrndup calls xmalloc):
int http_header_parse(http_hdr_table *t, const char *s)
{
const char *p;
const char *b;
char *tmp_name;
char *tmp_value;
int ret = -1;
assert(t);
assert(s);
p = b = s;
/* field name */
for (; ; p++) {
if (*p == ':') {
if (p-b <= 0) goto out;
tmp_name = xstrndup(b, p-b);
b = ++p;
break;
}
if (is_ctl_char(*p) || is_sep_char(*p)) goto out;
}
while (*p == ' ' || *p == '\t') {
p++; b++;
}
/* field value */
for (; ; p++) {
if (is_crlf(p)) {
if (p-b <= 0) goto err_value;
tmp_value = xstrndup(b, p-b);
p += 2;
break;
}
if (!*p) goto err_value;
}
http_hdr_table_set(t, tmp_name, tmp_value);
ret = 0;
xfree(tmp_value);
err_value:
xfree(tmp_name);
out:
return ret;
}
I would like to keep things simple and handle memory allocation errors at one place and to not clutter code with malloc error handling code. What should I do? Thank you.
P.S: I am writing the application to run on POSIX/Unix-like systems. Also feel free to criticize my current coding style and practices.
If you want to use a relatively low level language like C, then you shouldn't be too worried about adding something like if(tmp_value == NULL) goto out; in 2 places.
If you can't stand the idea of 2 trivial lines of extra code, then maybe try a language that supports exceptions properly (e.g. C++) and add throw/try/catch instead. Note: I really don't like C++, but using C++ would have to make more sense than implementing your own "exception like" features and an entire layer of automated resource de-allocation in C.
Modern languages give you garbage collection and exceptions. C doesn't, so you have to work hard. There's no magical solution here.
Some tips:
Create a session structure, and keep all your allocated memory pointed from it. When the session is aborted, always call a cleanup function. This way, even if you have to check for failures in many places, at least all failures are handled the same way.
You can even create a session_allocate() function, which allocates memory and keeps it on a linked list pointed from the session structure. Everything you allocate using this function would be freed when the session is destroy.
Try to concentrate all allocations in the beginning of the session. After you've allocated all you need, the rest of your code won't need to worry about failures.
If you're on a system that supports fork(), which linux does, you can run each client connection in it's own process. When a client connection is first established, you fork your main process into a child process to handle the rest of the request. Then you can abort() like you always have and only the specific client connection is affected. This is a classic unix server model.
If you don't want to or can't use fork(), you need to abort the request by throwing an exception. In C, that would be done by using setjump() when the connection is first established and then calling longjump() when out of memory is detected. This will reset execution and the stack back to where setjump() was called.
The problem is, this will leak all the resources allocated up to that point (for example, other memory allocations that had succeeded up to the point of getting out of memory). So additionally, your memory allocator will have to track all the memory allocations for each request. When longjump() is called, the setjump() return location will then have to free all the memory that was associated with the aborted request.
This is what apache does using pools. Apache uses pools to track resource allocations so it can auto free them in the case of an abort or because the code just didn't free it: http://www.apachetutor.org/dev/pools.
You should also consider the pool model and not just simply wrap malloc() so one client can't use up all the memory in the system.
Another possibility would be to use Boehm's GC by using its GC_malloc instead of malloc (you won't need to call free or GC_free); its
GC_oom_fn function pointer (called internally from GC_malloc when no memory is available any more) can be set to your particular out of memory handler (which would deny the incoming HTTP request, perhaps with a longjmp)
The major advantage of using Boehm GC is that you don't care any more about free-ing your dynamically allocated data (provided it was allocated using GC_malloc or friends, e.g. GC_malloc_atomic for data without any pointers inside).
Notice that memory management is not a modular property. The liveness of some given data is a whole program property, see garbage collection wikipage, and RAII programming idiom.
You could of course use alloca, but that has issues that mean it must be used with care. Alternatively, you can write your code so that you minimise and localise the use of malloc. For example your function above could be rewritten to localise the allocations:
static size_t field_name_length(const char *s)
{
const char *p = s;
for ( ; *p != ':'; ++p) {
if (is_ctl_char(*p) || is_sep_char(*p))
return 0;
}
return (size_t) (p - s);
}
static size_t value_length(const char *s)
{
const char *p = s;
for (; *p && !is_crlf(p); p+=2) {
/* nothing */
}
return *p ? (size_t) (p - s) : 0;
}
int http_header_parse(http_hdr_table *t, const char *s)
{
const char *v;
int ret = -1;
size_t v_len = 0;
size_t f_len = field_name_length(s);
if (f_len) {
v = s + f_len + 1;
v = s + strspn(s, " \t");
v_len = value_length(s);
}
if (v_len > 0 && f_len > 0) {
/* Allocation is localised to this block */
const char *name = xstrndup(s, f_len);
const char *value = xstrndup(v, v_len);
if (name && value) {
http_hdr_table_set(t, name, value);
ret = 0;
}
xfree(value);
xfree(name);
}
return ret;
}
Or, even better, you could modify http_hdr_table_set to accept the pointers and lengths and avoid allocation completely.
Related
void main(int argc, char* argv[]) {
char* hostname = (char*)malloc(sizeof(char)*1024);
hostname = getClientHostName("122.205.26.34");
printf("%s\n", hostname);
free(hostname);
}
char* getClientHostName(char* client_ip) {
char hostnames[5][2];
hostnames[0][0] = "122.205.26.34";
hostnames[0][1] = "aaaaa";
hostnames[1][0] = "120.205.36.30";
hostnames[1][1] = "bbbbb";
hostnames[2][0] = "120.205.16.36";
hostnames[2][1] = "ccccc";
hostnames[3][0] = "149.205.36.46";
hostnames[3][1] = "dddddd";
hostnames[4][0] = "169.205.36.33";
hostnames[4][1] = "eeeeee";
for(int i = 0; i<5; i++) {
if(!strcmp(hostnames[i][0], client_ip))
return (char*)hostnames[i][1];
}
return NULL;
}
Beginner in C.
I am not sure if there would be a better way to implement what I am trying to implement. The code is self-explanatory. Is there any way that I can predefine the size of hostname, using some general size of IP addresses, to avoid seg fault? Is there a even better way where I don't have to hardcode the size?
After fixing the compiler errors and warnings you get:
const char* getClientHostName(const char* client_ip) {
const char * hostnames[5][2];
hostnames[0][0] = "122.205.26.34";
hostnames[0][1] = "aaaaa";
hostnames[1][0] = "120.205.36.30";
hostnames[1][1] = "bbbbb";
hostnames[2][0] = "120.205.16.36";
hostnames[2][1] = "ccccc";
hostnames[3][0] = "149.205.36.46";
hostnames[3][1] = "dddddd";
hostnames[4][0] = "169.205.36.33";
hostnames[4][1] = "eeeeee";
for(int i = 0; i<5; i++) {
if(!strcmp(hostnames[i][0], client_ip))
return hostnames[i][1];
}
return NULL;
}
int main(int argc, char* argv[]) {
const char * hostname = getClientHostName("128.205.36.34");
printf("%s\n", hostname);
}
Is there a even better way where I don't have to hardcode the size?
Take the habit to compile with all warnings and debug info: gcc -Wall -Wextra -g with GCC. Improve the code to get no warnings at all.
If you want to get genuine IP addresses, this is operating system specific (since standard C11 don't know about IP addresses; check by reading n1570). On Linux you would use name service routines such as getaddrinfo(3) & getnameinfo(3) or the obsolete gethostbyname(3).
If this is just an exercise without actual relationship to TCP/IP sockets (see tcp(7), ip(7), socket(7)) you could store the table in some global array:
struct myipentry_st {
const char* myip_hostname;
const char* myip_address;
};
then define a global array containing them, with the convention of terminating it by some {NULL, NULL} entry:
const struct myipentry_st mytable[] = {
{"aaaaa", "122.205.26.34"},
{"bbbb", "120.205.36.30"},
/// etc
{NULL, NULL} // end marker
};
You'll better have a global or static variable (not an automatic one sitting on the call stack) because you don't want to fill it on every call to your getClientHostName.
Then your lookup routine (inefficient, since in linear time) would be:
const char* getClientHostName(char* client_ip) {
for (const struct myipentry_st* ent = mytable;
ent->myip_hostname != NULL;
ent++)
// the if below is the only statement of the body of `for` loop
if (!strcmp(ent->myip_address, client_ip))
return ent->myip_hostname;
// this happens after the `for` when nothing was found
return NULL;
}
You could even declare that table as a heap allocated pointer:
const struct myipentry_st**mytable;
then use calloc to allocate it and read its data from some text file.
Read the documentation of every standard or external function that you are using. Don't forget to check against failure (e.g. of calloc, like here). Avoid memory leaks by appropriate calls to free. Use the debugger gdb and valgrind. Beware of undefined behavior.
In the real world, you would have perhaps thousands of entries and you would perform the lookup many times (perhaps millions of times, e.g. once per every HTTP request in a web server or client). Then choose a better data structure (hash table or red-black tree perhaps). Read some Introduction to Algorithms.
Add * to type definition char * hostnames[5][2]. This must be array of pointers, not simple chars. Another necessary change is strcpy instead of = in strcpy( hostname, getClientHostName("122.205.26.34") );.
PS: Always try to compile with 0 compiler warnings, not only 0 errors!
void* password_cracker_thread(void* args) {
cracker_args* arg_struct = (cracker_args*) args;
md5hash* new_hash = malloc (sizeof(md5hash));
while(1)
{
char* password = fetch(arg_struct->in);
if(password == NULL )
{
deposit(arg_struct->out,NULL);
free(new_hash);
pthread_exit(NULL);
}
compute_hash(password,new_hash);
if(compare_hashes(new_hash,(md5hash**)arg_struct->hashes,arg_struct->num_hashes) != -1)
{
printf("VALID_PASS:%s \n",password);
deposit(arg_struct->out,password);
}else{
free(password);
}
}
}
This is a part of a program, where you get char* passwords from a ringbuffer, calculate md5 and compare them and push them into the next buffer if valid.
My problem is now, why can't I free those I don't need?
The whole program will stop if I try to and if I don't, I get memory leaks.
"You", and by this I mean your program, can only free() storage that was got from malloc()-and-friends, and only in the granularity it was got, and only once per chunk of storage.
In the code you show here, you're attempting to free() something got from fetch(). Since we can't see the definition of that function and you have not provided any documentation of it, our best guess is that
fetch() gives you a pointer to something other than a whole chunk
got from malloc()-et-al; and/or
some other part of the program not
shown here free()s the relevant chunk itself.
While working through Zed Shaw's learn C the Hard Way, I encountered the function apr_dir_make_recursive() which according to the documentation here has the type signature
apr_status_t apr_dir_make_recursive(const char *path, apr_fileperms_t perm, apr_pool_t *pool)
Which makes the directory, identical to the Unix command mkdir -p.
Why would the IO function need a memory pool in order to operate?
My first thought was that it was perhaps an optional argument to populate the newly made directory, however the code below uses an initialized but presumptively empty memory pool. Does this mean that the IO function itself needs a memory pool, that we are passing in for it to use? But that doesn't seem likely either; couldn't the function simply create a local memory pool for it to use which is then destroyed upon return or error?
So, what use is the memory pool? The documentation linked is unhelpful on this point.
Code shortened and shown below, for the curious.
int DB_init()
{
apr_pool_t *p = NULL;
apr_pool_initialize();
apr_pool_create(&p, NULL);
if(access(DB_DIR, W_OK | X_OK) == -1) {
apr_status_t rc = apr_dir_make_recursive(DB_DIR,
APR_UREAD | APR_UWRITE | APR_UEXECUTE |
APR_GREAD | APR_GWRITE | APR_GEXECUTE, p);
}
if(access(DB_FILE, W_OK) == -1) {
FILE *db = DB_open(DB_FILE, "w");
check(db, "Cannot open database: %s", DB_FILE);
DB_close(db);
}
apr_pool_destroy(p);
return 0;
}
If you pull up the source, you'll see: apr_dir_make_recursive() calls path_remove_last_component():
static char *path_remove_last_component (const char *path, apr_pool_t *pool)
{
const char *newpath = path_canonicalize (path, pool);
int i;
for (i = (strlen(newpath) - 1); i >= 0; i--) {
if (path[i] == PATH_SEPARATOR)
break;
}
return apr_pstrndup (pool, path, (i < 0) ? 0 : i);
}
This function is creating copies of the path in apr_pstrndup(), each representing a smaller component of it.
To answer your question - because of how it was implemented. Would it be possible to do the same without allocating memory, yes. I think in this case everything came out cleaner and more readable by copying the necessary path components.
The implementation of the function (found here) shows that the pool is used to allocate strings representing the individual components of the path.
The reason the function does not create its own local pool is because the pool may be reused across multiple calls to the apr_*() functions. It just so happens that DB_init() does not have a need to reuse an apr_pool_t.
I'm trying to write a function in C to solve a math problem. In that function, there are several steps, and each step needs to allocate some memory with the size depending on the calculation results in previous steps (so I can't allocate them all at the beginning of the function). The pseudo code looks like:
int func(){
int *p1, *p2, *p3, *p4;
...
p1 = malloc(...);
if(!p1){
return -1; //fail in step 1
}
...
p2 = malloc(...);
if(!p2){
free(p1);
return -2; //fail in step 2
}
...
p3 = malloc(...);
if(!p3){
free(p1);
free(p2);
return -3; //fail in step 3
}
...
p4 = malloc(...);
if(!p4){
free(p1);
free(p2);
free(p3); /* I have to write too many "free"s here! */
return -4; //fail in step 4
}
...
free(p1);
free(p2);
free(p3);
free(p4);
return 0; //normal exit
}
The above way to handle malloc failures is so ugly. Thus, I do it in the following way:
int func(){
int *p1=NULL, *p2=NULL, *p3=NULL, *p4=NULL;
int retCode=0;
...
/* other "malloc"s and "if" blocks here */
...
p3 = malloc(...);
if(!p3){
retCode = -3; //fail in step 3
goto FREE_ALL_EXIT;
}
...
p4 = malloc(...);
if(!p4){
retCode = -4; //fail in step 4
goto FREE_ALL_EXIT;
}
...
FREE_ALL_EXIT:
free(p1);
free(p2);
free(p3);
free(p4);
return retCode; //normal exit
}
Although I believe it's more brief, clear, and beautiful now, my team mate is still strongly against the use of 'goto'. And he suggested the following method:
int func(){
int *p1=NULL, *p2=NULL, *p3=NULL, *p4=NULL;
int retCode=0;
...
do{
/* other "malloc"s and "if" blocks here */
p4 = malloc(...);
if(!p4){
retCode = -4; //fail in step 4
break;
}
...
}while(0);
free(p1);
free(p2);
free(p3);
free(p4);
return retCode; //normal exit
}
Hmmm, it seems a way to avoid the use of 'goto', but this way increases indents, which makes the code ugly.
So my question is, is there any other method to handle many 'malloc' failures in a good code style? Thank you all.
goto in this case is legitimate. I see no particular advantage to the do{}while(0) block as its less obvious what pattern it is following.
First of all, there's nothing wrong with goto—this is a perfectly legitimate use of goto. The do { ... } while(0) with break statements are just gotos in disguise, and it only serves to obfuscate the code. Gotos are really the best solution in this case.
Another option is to put a wrapper around malloc (e.g. call it xmalloc) which kills the program if malloc fails. For example:
void *xmalloc(size_t size)
{
void *mem = malloc(size);
if(mem == NULL)
{
fprintf(stderr, "Out of memory trying to malloc %zu bytes!\n", size);
abort();
}
return mem;
}
Then use xmalloc everywhere in place of malloc, and you no longer need to check the return value, since it will return a valid pointer if it returns at all. But of course, this is only usable if you want allocation failures to be an unrecoverable failure. If you want to be able to recover, then you really do need to check the result of every allocation (though honestly, you'll probably have another failure very soon after).
Ask your teammate how he would re-write this sort of code:
if (!grabResource1()) goto res1failed;
if (!grabResource2()) goto res2failed;
if (!grabResource3()) goto res3failed;
(do stuff)
res3failed:
releaseResource2();
res2failed:
releaseResource1();
res1failed:
return;
And ask how he would generalize it to n resources. (Here, "grabbing a resource" could mean locking a mutex, opening a file, allocating memory, etc. The "free on NULL is OK" hack does not solve everything...)
Here, the alternative to goto is to create a chain of nested functions: Grab a resource, call a function that grabs another resource and calls another function that grabs a resource and calls another function... When a function fails, its caller can free its resource and return failure, so the releasing happens as the stack unwinds. But do you really think this is easier to read than the gotos?
(Aside: C++ has constructors, destructors, and the RAII idiom to handle this sort of thing. But in C, this is the one case where goto is clearly the right answer, IMO.)
There's nothing wrong with goto in error handling and there's actually no code difference between using a do { ... } while(0); with breaks; instead of goto (since they're both jmp instructions). I would say that seems normal. One thing you could do that is shorter is create an array of int * types and iterate through while calling malloc. If one fails free the ones that are non-null and return an error code. This is the cleanest way I can think of so something like
int *arr[4];
unsigned int i;
for (i = 0; i < 4; ++i)
if (!(arr[i] = malloc(sizeof(int))) {
retCode = -(i + 1); //or w/e error
break;
}
if (errorCode)
for (i = 0; i < 4; i++)
if (arr[i])
free(arr[i]);
else
break;
or something along those lines (used brain compiler for this so I might be wrong)
Not only does this shorten your code but also avoids goto's (which I don't see anything wrong with) so you and your teammate can both be happy :D
David Hanson wrote the book C Interfaces and Implementations: Techniques for Creating Reusable Software. His Mem interface provides functions that are "similar to those in the standard C library, but they don't accept zero sizes and never return null pointers." The source code includes a production implementation and a checking implementation.
He also implements an Arena interface. The Arena interface releases you from the obligation to call free() for every malloc(). Instead, there's just a single call to free the entire arena.
CII source code
If an allocation fails, simply assign the error code as normal. conditionalize each malloc like so:
if (retCode < 0) malloc...
and then at the end of your code, add this:
int * p_array[] = { p1, p2, p3, p4};
for (int x = -retCode + 1; x >= 0; x-- )
{
free(p_array[x]);
}
I'm attempting to write a solver for a particular puzzle. It tries to find a solution by trying every possible move one at a time until it finds a solution. The first version tried to solve it depth-first by continually trying moves until it failed, then backtracking, but this turned out to be too slow. I have rewritten it to be breadth-first using a queue structure, but I'm having problems with memory management.
Here are the relevant parts:
int main(int argc, char *argv[])
{
...
int solved = 0;
do {
solved = solver(queue);
} while (!solved && !pblListIsEmpty(queue));
...
}
int solver(PblList *queue) {
state_t *state = (state_t *) pblListPoll(queue);
if (is_solution(state->pucks)) {
print_solution(state);
return 1;
}
state_t *state_cp;
puck new_location;
for (int p = 0; p < puck_count; p++) {
for (dir i = NORTH; i <= WEST; i++) {
if (!rules(state->pucks, p, i)) continue;
new_location = in_dir(state->pucks, p, i);
if (new_location.x != -1) {
state_cp = (state_t *) malloc(sizeof(state_t));
state_cp->move.from = state->pucks[p];
state_cp->move.direction = i;
state_cp->prev = state;
state_cp->pucks = (puck *) malloc (puck_count * sizeof(puck));
memcpy(state_cp->pucks, state->pucks, puck_count * sizeof(puck)); /*CRASH*/
state_cp->pucks[p] = new_location;
pblListPush(queue, state_cp);
}
}
}
free(state->pucks);
return 0;
}
When I run it I get the error:
ice(90175) malloc: *** mmap(size=2097152) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
Bus error
The error happens around iteration 93,000.
From what I can tell, the error message is from malloc failing, and the bus error is from the memcpy after it.
I have a hard time believing that I'm running out of memory, since each game state is only ~400 bytes. Yet that does seem to be what's happening, seeing as the activity monitor reports that it is using 3.99GB before it crashes. I'm using http://www.mission-base.com/peter/source/ for the queue structure (it's a linked list).
Clearly I'm doing something dumb. Any suggestions?
Check the result of malloc. If it's NULL, you might want to print out the length of that queue.
Also, the code snippet you posted didn't include any frees...
You need to free() the memory you've allocated manually after you're done with it; dynamic memory doesn't just "free itself"