I have a question concerning scope and how memory is handled in C. Example:
int main() {
    int some,
        instance,
        variables;

    { // scoped block to do some work
        int more,
            data,
            just,
            forThisBlock;
    }
    // ...
}
Essentially, my question is this: does it make sense to use a block as above to section off areas that instantiate temporarily used data? Memory-wise, what happens to the variables allocated in the block? Are they deallocated after the block exits, ostensibly saving memory further on in the function? I realize a proper function could also serve this purpose, but in some cases adding another function is not as neat when the work is only a couple of lines long.
Edit: my particular use case:
Many functions return an int to indicate whether an error occurred. I have noticed that a common style convention is to 'pile' variable declarations at the beginning of their scope block. If you call a few of these functions, the declarations can get cluttered with variables that only store return values (or, really, any variable that is used only temporarily, whose processing involves many of the more 'permanent' variables that would make for a long declaration section). Essentially, the purpose of the question is to determine the ramifications of scoping off these usages to get a cleaner declaration section. I realize that in some cases one could reuse a single return-value variable, but the point of this question is to better understand my options.
For example:
int infoRtn;
char *serverHost;
...
infoRtn = getaddrinfo(
    serverHost,
    port,
    &addrCriteria,
    &addrList
);
if (infoRtn != 0) {
    fprintf(stderr, "%s # %d: Failed to get address info, error returned: %d\n", __FILE__, __LINE__, infoRtn);
    exit(1);
}
...
vs
char *serverHost;
...
{
    int infoRtn = getaddrinfo(
        serverHost,
        port,
        &addrCriteria,
        &addrList
    );
    if (infoRtn != 0) {
        fprintf(stderr, "%s # %d: Failed to get address info, error returned: %d\n", __FILE__, __LINE__, infoRtn);
        exit(1);
    }
}
...
Does it make sense to use a block as above to section off areas to instantiate temporarily used data?
Personally, I haven't seen many use cases for anonymous blocks like this.
Functions are designed for exactly that, and you should use them. Using a function also signals that you are trying to do something (while hiding its implementation) and is better for readability, e.g.
int averageMarks = getAvgMarks(students);
Here, when I read this, I see that the average marks of the students are requested. If I want to see the implementation, I go look at it; otherwise I just skip over to the main logic (better readability).
Memory-wise, what happens to the variables allocated in the block
Yes, their lifetime ends when the block exits, so the storage can be reused later in the function (most compilers reserve the whole stack frame up front and reuse slots for variables in disjoint blocks). Memory-wise it can be analogous to a function, but I would strongly urge you to use functions for better readability and maintainability.
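For illustration, a minimal sketch (the address reuse shown in the comment is typical but not guaranteed; the compiler is free to lay out the stack frame differently):

#include <stdio.h>

int main(void) {
    {
        int a = 1;
        printf("&a = %p\n", (void *)&a);
    }
    {
        int b = 2;
        /* often prints the same address as &a: the slot was reused */
        printf("&b = %p\n", (void *)&b);
    }
    return 0;
}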
EDIT (after details were added to the question)
I propose a third solution:
char *serverHost;
...
validateConnection(serverHost, port, &addrCriteria, &addrList); // pass whatever params you need
...

void validateConnection(const char *serverHost, const char *port,
                        const struct addrinfo *addrCriteria,
                        struct addrinfo **addrList)
{
    int infoRtn = getaddrinfo(serverHost, port, addrCriteria, addrList);
    if (infoRtn != 0) {
        fprintf(stderr, "%s # %d: Failed to get address info, error returned: %d\n", __FILE__, __LINE__, infoRtn);
        exit(1);
    }
}
The above solution has the advantage that the reader knows "OK, now we validate, then continue." Plus, if your validation logic is modified in the future (for example, you add retries), you don't clutter the original function.
Related
In C, I know it is good practice to always check whether a newly malloc'd pointer is NULL right after allocating. If so, I can output an error (e.g. with perror) and exit the program.
But what about in more complicated programs? E.g. main calls a function f1 (which returns an int), which calls a function f2 (which returns a char*), which calls a function f3 (which returns a double), and the malloc inside f3 fails.
In this case, I can't simply output an error and exit (and I may even leak memory), since f3 still forces me to return a double, then f2 forces me to return a char*, and so on. It seems very painful to keep track of the errors and exit appropriately. What is the proper way to efficiently cover these sorts of errors across functions?
The obvious solution is to design your program with care, so that every function that does dynamic allocation has some means to report errors. Most often the return value of the function is used for this purpose.
In well-designed programs, errors bounce back all the way up the call stack, so that they are dealt with at the application level.
In the specific case of dynamic memory allocation, it is always best to leave the allocation to the caller whenever possible.
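A minimal sketch of that caller-allocates pattern (format_greeting is a made-up example; the caller owns buf and decides how it is allocated):

#include <stdio.h>

/* Fills a caller-provided buffer; returns 0 on success, -1 if it doesn't fit. */
int format_greeting(char *buf, size_t buflen, const char *name) {
    int n = snprintf(buf, buflen, "Hello, %s!", name);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}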
It's always a problem. You need a disciplined approach.
Firstly, every dynamic pointer must be "owned" by someone. C won't help you here; you just have to specify it. Generally the three patterns are:
a) The function calls malloc(), then calls free() itself.
b) We have two matching functions, one which returns a buffer or dynamic structure, and one which destroys it. The function that calls create also calls destroy.
c) We have a set of nodes we are inserting into a graph, at random points throughout the program. It needs to be managed like (b): one function creates the root, and later calls the delete function, which destroys the entire graph.
The rule is: the owner holds and frees.
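A minimal sketch of pattern (b), using a hypothetical Widget type (strdup is POSIX):

#include <stdlib.h>
#include <string.h>

typedef struct { char *name; } Widget;

/* Matching create/destroy pair: whoever calls widget_create
 * owns the result and must call widget_destroy. */
Widget *widget_create(const char *name) {
    Widget *w = malloc(sizeof *w);
    if (!w) return NULL;
    w->name = strdup(name);
    if (!w->name) { free(w); return NULL; }
    return w;
}

void widget_destroy(Widget *w) {
    if (w) { free(w->name); free(w); }
}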
If you return a pointer, return NULL on out-of-memory. If you return an integer, return -1. Errors get propagated up until some high-level code knows which user-level operation has failed and aborts it.
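For the f1/f2/f3 chain from the question, the propagation might look like this sketch (out-parameters keep the return value free for the error code):

#include <stdlib.h>

/* f3 reports failure through its return value and delivers the
 * double via an out-parameter. */
int f3(double *out) {
    double *tmp = malloc(100 * sizeof *tmp);
    if (tmp == NULL)
        return -1;              /* out of memory: propagate */
    *out = 42.0;                /* ... real work ... */
    free(tmp);
    return 0;
}

int f2(char **out) {
    double d;
    if (f3(&d) != 0)
        return -1;              /* bubble the error up to f1 */
    *out = NULL;                /* ... real work ... */
    return 0;
}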
The other answers are right that the correct way to handle this is to make sure that every function that can allocate memory can report failure to its caller, and every caller handles the possibility. And, of course, you have a test malloc shim that arranges to test every possible allocation failure.
But in large C programs, this becomes intractable — the number of cases that need testing increases exponentially with the number of malloc callsites, for starters — so it is very common to see a function like this in a utils.c file:
void *
xmalloc(size_t n)
{
    void *rv = malloc(n);
    if (!rv) {
        /* program_name is assumed to be set elsewhere, e.g. from argv[0] */
        fprintf(stderr, "%s: memory exhausted\n", program_name);
        exit(1);
    }
    return rv;
}
All other code in the program always calls xmalloc, never malloc, and can assume it always succeeds. (And you also have xcalloc, xrealloc, xstrdup, etc.)
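For instance, an xstrdup following the same pattern might look like this (a sketch; strdup is POSIX and needs <string.h>, but it could equally be written with malloc and memcpy):

char *
xstrdup(const char *s)
{
    char *rv = strdup(s);
    if (!rv) {
        fprintf(stderr, "%s: memory exhausted\n", program_name);
        exit(1);
    }
    return rv;
}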
Libraries cannot get away with this, but applications can.
One way to switch across functions is exception handling (note that this requires C++; standard C has no exceptions).
When an exception is thrown, control returns to the scope of the catch block.
But be careful about memory allocated across the intervening functions, since control moves directly to the catch block.
Sample code for reference:
// Example program (C++)
#include <iostream>
#include <cstdlib>
using namespace std;

int f1()
{
    int *p = (int *) malloc(sizeof(int));
    if (p == NULL)
    {
        throw 1;
    }
    // Code flow continues.
    free(p);
    return 0;
}

char *g()
{
    f1();
    cout << "Inside fun g" << endl;
    return NULL; // the original returned an uninitialized pointer; NULL is a safe placeholder
}

int f2()
{
    g();
    cout << "Inside fun f2" << endl;
    return 0;
}

int main()
{
    try
    {
        f2();
    }
    catch (int a)
    {
        cout << "Caught memory exception" << endl;
    }
    return 0;
}
When using malloc in C, I'm always told to check whether any errors occurred by checking if it returned NULL. While I definitely understand why this is important, it is a bother to constantly type out 'if' statements (and whatever I want inside them) to check whether the memory was successfully allocated for each individual use of malloc. To make things quicker, I made a function as follows to check whether it was successful.
#include <stdio.h>
#include <stdlib.h>

#define MAX 25

void MallocCheck(char *Check);

char *Option1, *Option2;

int main(){
    Option1 = (char *)malloc(sizeof(char) * MAX);
    MallocCheck(Option1);
    Option2 = (char *)malloc(sizeof(char) * MAX);
    MallocCheck(Option2);
    return 0;
}

void MallocCheck(char *Check){
    if(Check == NULL){
        puts("Memory Allocation Error");
        exit(1);
    }
}
However, I have never seen someone else doing something like this no matter how much I search, so I assume it is wrong or otherwise something that shouldn't be done.
Is using a user-defined function for this purpose wrong and if so, why is that the case?
Error checking is a good thing.
Making a helper function so you can code more quickly and cleanly is a good thing.
The details depend on coding goals and your group's coding standards.
OP's approach is not bad. I prefer to handle the error together with the allocation. The following prints an informative message on stderr and does not complain about a NULL return when 0 bytes were allocated (which is not an out-of-memory failure).
void *malloc_no_return_on_OOM(size_t size) {
    void *p = malloc(size);
    if (p == NULL && size > 0) {
        // Make messages informative
        fprintf(stderr, "malloc(%zu) failure\n", size);
        // or
        perror("malloc() failure");
        exit(1);
    }
    return p;
}
Advanced: you could code a DEBUG version that reports the caller's function and line by using a macro.
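A minimal sketch of that idea (the names here are hypothetical; __func__ requires C99):

#include <stdio.h>
#include <stdlib.h>

/* The macro captures the call site; the function does the check. */
void *malloc_debug(size_t size, const char *file, int line, const char *func) {
    void *p = malloc(size);
    if (p == NULL && size > 0) {
        fprintf(stderr, "%s:%d: %s: malloc(%zu) failure\n", file, line, func, size);
        exit(1);
    }
    return p;
}

#define MALLOC_DBG(size) malloc_debug((size), __FILE__, __LINE__, __func__)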
This is an addendum to #chux's answer and the comments.
As stated, DRY code is generally a good thing and malloc errors are often handled the same way within a specific implementation.
It's true that some systems (notably, Linux) offer optimistic malloc implementations, meaning malloc always returns a valid pointer (never NULL) and the error is reported using a signal the first time data is written to the returned pointer... which makes error handling slightly more complex than the code in the question.
However, moving the error check to a different function might incur a performance penalty, unless the compiler / linker catches the issue and optimizes the function call away.
This is a classic use case for inline functions (on newer compilers) or macros.
i.e.
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
void handle_no_memory(int sig) {
if (sig == SIGSEGV) {
perror("Couldn't allocate or access memory");
/* maybe use longjmp to stay in the game...? Or not... */
exit(SIGSEGV);
}
}
/* Using a macro: */
#define IS_MEM_VALID(ptr) \
if ((ptr) == NULL) { \
handle_no_memory(SIGSEGV); \
}
/* OR an inline function: */
static inline void *is_mem_valid(void *ptr) {
if (ptr == NULL)
handle_no_memory(SIGSEGV);
return ptr;
}
int main(int argc, char const *argv[]) {
/* consider setting a signal handler - `sigaction` is better, but I'm lazy. */
signal(SIGSEGV, handle_no_memory);
/* using the macro */
void *data_macro = malloc(1024);
IS_MEM_VALID(data_macro);
/* using the inline function */
void *data_inline = is_mem_valid(malloc(1024));
}
Both macros and inline functions avoid an extra function call, since the if statement becomes part of the calling function rather than an external one.
When using inline, the compiler will take the function's assembly code and place it within the caller (instead of performing a function call). For this we must trust the compiler to do its job properly (it usually does it better than we would).
When using macros, the preprocessor takes care of things and we don't need to trust the compiler.
In both cases the function / macro is local to the file (notice the static keyword), allowing any optimizations to be performed by the compiler (not the linker).
Good luck.
Should one check after each malloc() if it was successful? Is it at all possible that a malloc() fails? What happens then?
At school we were told that we should check, i.e.:
arr = (int *) malloc(sizeof(int)*x*y);
if(arr==NULL){
printf("Error. Allocation was unsuccessful. \n");
return 1;
}
What is the practice regarding this? Can I do it this way:
if(!(arr = (int *) malloc(sizeof(int)*x*y)))
    <error>
This mainly adds to the existing answers, but I understand where you are coming from: if you do a lot of memory allocation, your code ends up looking very ugly with all the error checks for malloc.
Personally, I often get around this using a small malloc wrapper which will never fail. Unless your software is a resilient, safety-critical system, you cannot meaningfully work around malloc failing anyway, so I would suggest something like this:
static inline void *MallocOrDie(size_t MemSize)
{
    void *AllocMem = malloc(MemSize);
    /* Some implementations return null on a 0 length alloc,
     * we may as well allow this as it increases compatibility
     * with very few side effects */
    if(!AllocMem && MemSize)
    {
        fprintf(stderr, "Could not allocate memory!\n");
        exit(EXIT_FAILURE);
    }
    return AllocMem;
}
Which will at least ensure you get an error message and clean crash, and avoids all the bulk of the error checking code.
For a more generic solution for functions that can fail, I also tend to implement a simple macro such as this:
#define PrintDie(...) \
do \
{ \
fprintf(stderr, __VA_ARGS__); \
abort(); \
} while(0)
Which then allows you to run a function as:
if(-1 == foo()) PrintDie("Oh no");
Which gives you a one-liner, again avoiding the bulk while enabling proper checks.
There is no need to cast the result of malloc(). However, it is required to check whether malloc() succeeded.
Say malloc() failed and you try to access the pointer thinking memory was allocated: that will lead to a crash, so it is better to catch the allocation failure before accessing the pointer.
int *arr = malloc(sizeof(*arr));
if(arr == NULL)
{
    printf("Memory allocation failed\n");
    return;
}
I observed something in a log file that I cannot explain:
All code in the project is ANSI C; it is a 32-bit exe running on Windows 7 64-bit.
I have a worker function similar to the one below, running in a single-threaded program, with no recursion. During debugging, logging was included as shown:
//This function is called from an event handler
//triggered by a UI timer similar in concept to
//C# `Timer.OnTick` or C++ Timer::OnTick
//with tick period set to a shorter duration
//than this worker function sometimes requires
int LoadState(int state)
{
WriteToLog("Entering ->"); //first call in
//...
//Some additional code - varies in execution time, but typically ~100ms.
//...
WriteToLog("Leaving <-");//second to last call out
return 0;
}
The function above is simplified from our actual code but is sufficient for illustrating the issue.
Occasionally we have seen log entries (time/date stamp on the left, then the message; the last field is the duration in clock() ticks between calls to the logging function) indicating that the function was entered twice in a row before exiting.
Without recursion, and in a single threaded program, how is it (or is it) possible that execution flow can enter a function twice before the first call was completed?
EDIT: (to show top call of logging function)
int WriteToLog(char* str)
{
FILE* log;
char *tmStr;
ssize_t size;
char pn[MAX_PATHNAME_LEN];
char path[MAX_PATHNAME_LEN], base[50], ext[5];
char LocationKeep[MAX_PATHNAME_LEN];
static unsigned long long index = 0;
if(str)
{
if(FileExists(LOGFILE, &size))
{
strcpy(pn,LOGFILE);
ManageLogs(pn, LOGSIZE);
tmStr = calloc(25, sizeof(char));
log = fopen(LOGFILE, "a+");
if (log == NULL)
{
free(tmStr);
return -1;
}
//fprintf(log, "%10llu %s: %s - %d\n", index++, GetTimeString(tmStr), str, GetClockCycles());
fprintf(log, "%s: %s - %d\n", GetTimeString(tmStr), str, GetClockCycles());
//fprintf(log, "%s: %s\n", GetTimeString(tmStr), str);
fclose(log);
free(tmStr);
}
else
{
strcpy(LocationKeep, LOGFILE);
GetFileParts(LocationKeep, path, base, ext);
CheckAndOrCreateDirectories(path);
tmStr = calloc(25, sizeof(char));
log = fopen(LOGFILE, "a+");
if (log == NULL)
{
free(tmStr);
return -1;
}
fprintf(log, "%s: %s - %d\n", GetTimeString(tmStr), str, GetClockCycles());
//fprintf(log, "%s: %s\n", GetTimeString(tmStr), str);
fclose(log);
free(tmStr);
}
}
return 0;
}
I asked the question wondering at the time if there was some obscure part of the C standard that allowed execution flow to enter a function more than once without first exiting (given that multi-threading and recursion were not present).
Your comments, I believe, have clearly answered the question. Borrowing from what #Oli Charlesworth said in one comment, he summarizes it up pretty well:
If the code is truly single-threaded, and the log-function is truly sane, and there's truly no other piece of code that can be outputting to the log, then obviously this can't happen (UB notwithstanding).
But since the actual log files (which I could not post for proprietary reasons) on several occasions have demonstrated this pattern, one of the conditions #Oli Charlesworth listed is not actually true for our software. My best guess at this point, given that the logging function is sane and is the only input to the file, is to consider the alternate context/Fiber possibility suggested by #jxh:
"Primary thread only" can mean multiple things. The library could still possibly use <ucontext.h> on POSIX, or Fibers on Windows.
So, I will post this same question to the supplier of my environment, specifically if their UI Timers are run in such a way as to allow parallel calls due to a fiber or thread.
If any are interested, I will also update this answer with their response.
Edit to show conclusion:
As it turns out, the cause of the double entry of execution flow into a function was implicit recursion. That is, while the worker function did not reference itself explicitly, it was designated as the event handler for two separate event generators. That, coupled with a call to Process System Events (a function available in our environment forcing events in the queue to be processed now) can (and did) result in recursive execution flow into the event handler function. Here is a quote from a person who has expertise in the relationship between UI timers and system events in our environment:
"Timer events being nested" does equate to execution flow entering a function twice before leaving. Basically, it's the same thing as basic recursion: while you're inside one function, you call that same function. The only difference between this case and basic recursion is that the recursion call is implicit (via ProcessSystemEvents) and not explicit. But the end result is the same."
What is the best way for unit testing code paths involving a failed malloc()? In most instances, it probably doesn't matter because you're doing something like
thingy *my_thingy = malloc(sizeof(thingy));
if (my_thingy == NULL) {
fprintf(stderr, "We're so screwed!\n");
exit(EXIT_FAILURE);
}
but in some instances you have choices other than dying, because you've allocated some extra stuff for caching or whatever, and you can reclaim that memory.
However, in those instances where you can try to recover from a failed malloc(), you're likely doing something tricky and error-prone in a code path that's pretty unusual, which makes testing especially important. How do you actually go about doing this?
I saw a cool solution to this problem which was presented to me by S. Paavolainen. The idea is to override the standard malloc(), which you can do just in the linker, by a custom allocator which
reads the current execution stack of the thread calling malloc()
checks if the stack exists in a database that is stored on hard disk
if the stack does not exist, adds the stack to the database and returns NULL
if the stack did exist already, allocates memory normally and returns
Then you just run your unit test many times: this system automatically enumerates through different control paths to malloc() failure and is much more efficient and reliable than e.g. random testing.
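A greatly simplified sketch of the idea (glibc-specific backtrace() from <execinfo.h>, with an in-memory table standing in for the on-disk database; the tests would link test_malloc in place of malloc):

#include <execinfo.h>
#include <stdlib.h>

#define MAX_STACKS 4096
static unsigned long seen[MAX_STACKS];
static int nseen;

/* Hash the return addresses of the current call stack. */
static unsigned long stack_hash(void) {
    void *frames[32];
    int n = backtrace(frames, 32);
    unsigned long h = 5381;
    for (int i = 0; i < n; i++)
        h = h * 33 + (unsigned long)frames[i];
    return h;
}

/* The first time a given call stack is seen, simulate failure;
 * afterwards allocate normally. */
void *test_malloc(size_t size) {
    unsigned long h = stack_hash();
    for (int i = 0; i < nseen; i++)
        if (seen[i] == h)
            return malloc(size);
    if (nseen < MAX_STACKS)
        seen[nseen++] = h;
    return NULL;
}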
I suggest creating a specific function for your special malloc code that you expect could fail and you could handle gracefully. For example:
void* special_malloc(size_t bytes) {
    void* ptr = malloc(bytes);
    if(ptr == NULL) {
        /* Do something crafty */
    }
    return ptr; /* the original fell off the end on failure, which is undefined behavior */
}
Then you could unit-test this crafty business by passing in some bad values for bytes. You could put this in a separate library and make a mock library that behaves specially for testing the functions which call this one.
This is kinda gross, but if you really want unit testing, you could do it with #ifdefs:
thingy *my_thingy = malloc(sizeof(thingy));
#ifdef MALLOC_UNIT_TEST_1
my_thingy = NULL;
#endif
if (my_thingy == NULL) {
fprintf(stderr, "We're so screwed!\n");
exit(EXIT_FAILURE);
}
Unfortunately, you'd have to recompile a lot with this solution.
If you're using Linux, you could also consider running your code under memory pressure by using ulimit, but be careful.
Write your own library that implements malloc by randomly failing or calling the real malloc (either statically linked or explicitly dlopen'd),
then LD_PRELOAD it.
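A sketch of such an interposer (Linux/glibc; the 10% failure rate is an arbitrary example policy, and note that dlsym itself may allocate on some platforms, which a production version would need to handle):

/* failmalloc.c
 * build: gcc -shared -fPIC -o failmalloc.so failmalloc.c -ldl
 * run:   LD_PRELOAD=./failmalloc.so ./your_program */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stddef.h>
#include <stdlib.h>

void *malloc(size_t size) {
    static void *(*real_malloc)(size_t);
    if (!real_malloc)
        real_malloc = (void *(*)(size_t)) dlsym(RTLD_NEXT, "malloc");
    if (rand() % 10 == 0)      /* fail roughly 10% of calls */
        return NULL;
    return real_malloc(size);
}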
In FreeBSD I once simply overrode the C library's malloc.o module (the symbols there were weak) and replaced the malloc() implementation with one that had a controlled probability of failing.
So I linked statically and started testing. srandom() finished the picture with a controlled pseudo-random sequence.
Also look here for a set of good tools that seem to fit your needs, in my opinion. At the very least they overload malloc() / free() to track leaks, so they make a usable starting point for adding anything you want.
You could hijack malloc by using some defines and a global parameter to control it... It's a bit hackish, but it seems to work.
#include <stdio.h>
#include <stdlib.h>
#define malloc(x) fake_malloc(x)
struct {
size_t last_request;
int should_fail;
void *(*real_malloc)(size_t);
} fake_malloc_params;
void *fake_malloc(size_t size) {
    fake_malloc_params.last_request = size;
    if (fake_malloc_params.should_fail) {
        return NULL;
    }
    return (fake_malloc_params.real_malloc)(size);
}

int main(void) {
    /* `malloc` is not followed by `(` here, so the function-like
     * macro does not expand and we capture the real malloc. */
    fake_malloc_params.real_malloc = malloc;
    void *ptr = NULL;
    ptr = malloc(1); /* expands to fake_malloc(1) */
    printf("last: %zu\n", fake_malloc_params.last_request);
    printf("ptr: %p\n", ptr);
    return 0;
}