In one of Apple's header files for libdispatch, queue.h, the following warning appears:
// The declaration of a block allocates storage on the stack.
// Therefore, this is an invalid construct:
dispatch_block_t block;
if (x) {
block = ^{ printf("true\n"); };
} else {
block = ^{ printf("false\n"); };
}
block(); // unsafe!!!
// What is happening behind the scenes:
if (x) {
struct Block __tmp_1 = ...; // setup details
block = &__tmp_1;
} else {
struct Block __tmp_2 = ...; // setup details
block = &__tmp_2;
}
// As the example demonstrates, the address of a stack variable is
// escaping the scope in which it is allocated. That is a classic C bug.
Try as I may, I cannot come up with a test case that exemplifies this bug. I can create blocks that are instantiated on the stack, but they (seem to) always appear at unique addresses on the stack, even when out of scope with respect to each other.
I imagine that the answer to this is simple, but it escapes me. Can anyone fill the gaps in my (limited) understanding?
EDIT: I've seen this response, but I don't quite understand how that instance can translate to my example posted above. Can someone show me an example using if constructs?
In order to crash a stack closure inside a function:
You need to make sure that the closure is indeed a stack closure. As of Apple Clang 2.1, a closure that doesn’t reference variables in its current context (like the one in queue.h) is realised as a global closure. This is an implementation detail that can vary amongst different compilers/compiler versions;
The compiler must emit code that effectively reuses/rewrites the stack area where the closure once lived. Otherwise, every object inside that function lives in a different address in the function stack frame, which means you won’t get a crash inside that function. It seems that Apple Clang 2.1 doesn’t reuse stack memory addresses. GCC 4.6 can reuse them, but it doesn’t support closures.
Since Apple Clang 2.1 doesn’t reuse addresses in a function stack frame and GCC 4.6 doesn’t support closures, from what I can tell it’s not possible to make this particular example — inside a function, invoke an out of scope stack closure — crash.
I wrote a more detailed text about this on my blog.
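Since the crash is hard to reproduce, it may help to look at the same pattern in plain C. The sketch below is only an analogue, not the Blocks runtime: `struct Block` and `describe` are hypothetical names, and the struct merely stands in for the compiler-generated temporary. It shows the safe arrangement, where the temporaries are hoisted to function scope so the pointer taken inside the if/else is still valid when it is used.

```c
#include <stddef.h>

/* Hypothetical stand-in for the compiler-generated block literal. */
struct Block {
    const char *message;
};

/* Safe variant: both temporaries live in the function's outermost scope,
 * so the pointer taken inside the if/else remains valid afterwards. */
const char *describe(int x)
{
    struct Block tmp_true = { "true" };
    struct Block tmp_false = { "false" };
    struct Block *block = NULL;

    if (x) {
        block = &tmp_true;
    } else {
        block = &tmp_false;
    }
    /* The string literals have static storage, so returning one is fine. */
    return block->message;
}
```

With real blocks, the usual remedy is different: copy the block to the heap with Block_copy() before it leaves its scope, and release it with Block_release() afterwards.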
I was using a sample C ALSA program as reference and came across the following piece of code:
...
snd_ctl_event_t *event;
snd_ctl_event_alloca(&event);
...
Based on the ALSA source code, snd_ctl_event_alloca is a macro that calls __snd_alloca which is a macro that finally expands to the following equivalent line for snd_ctl_event_alloca(&event); (with some straightforward simplification):
event = (snd_ctl_event_t *) alloca(snd_ctl_event_sizeof());
memset(event, 0, snd_ctl_event_sizeof());
where snd_ctl_event_sizeof() is only implemented once in the whole library as:
size_t snd_ctl_event_sizeof()
{
return sizeof(snd_ctl_event_t);
}
So my question is, isn't this whole process equivalent to simply doing:
snd_ctl_event_t event = {0};
For reference, these are the macros:
#define snd_ctl_event_alloca(ptr) __snd_alloca(ptr, snd_ctl_event)
#define __snd_alloca(ptr,type) do { *ptr = (type##_t *) alloca(type##_sizeof()); memset(*ptr, 0, type##_sizeof()); } while (0)
Clarifications:
The first block of code above is at the start of the body of a function and not in a nested block
EDIT
As it turns out (from what I understand), doing:
snd_ctl_event_t event;
gives a "storage size of 'event' isn't known" error because snd_ctl_event_t is apparently an opaque struct that's defined privately. Therefore the only option is dynamic allocation.
Since it is an opaque structure, the purpose of all these actions is apparently to implement an opaque data type while keeping all the "pros" and defeating at least some of the "cons".
One prominent problem with opaque data types is that in standard C you are essentially forced to allocate them dynamically in an opaque library function. It is not possible to implicitly declare an opaque object locally. This negatively impacts efficiency and often forces the client to implement additional resource management (i.e. remember to release the object when it is no longer needed). Exposing the exact size of the opaque object (through a function in this case) and relying on alloca to allocate storage is as close as you can get to a more efficient and fairly care-free local declaration.
If function-wide lifetime is not required, alloca can be replaced with VLAs, but the authors probably didn't want/couldn't use VLAs. (I'd say that using VLA would take one even closer to emulating a true local declaration.)
Often, to implement the same technique, the opaque object size is exposed as a compile-time constant in a header file. However, using a function has the added benefit of not having to recompile the entire project if the object size in this isolated library changes (as @R. noted in the comments).
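The technique can be sketched in a single file as follows. Everything here is hypothetical and only mirrors the shape of the ALSA macros: `widget_t`, `widget_sizeof`, `widget_alloca` and the accessors are made-up names, and `<alloca.h>` is assumed to be available (as it is with glibc), since alloca is not part of standard C.

```c
#include <alloca.h>   /* non-standard; assumed available (e.g. glibc) */
#include <stddef.h>
#include <string.h>

/* ---- What the public header would expose: the type stays opaque ---- */
typedef struct widget widget_t;              /* hypothetical opaque type */
size_t widget_sizeof(void);
int widget_get_id(const widget_t *w);
void widget_set_id(widget_t *w, int id);

/* Allocate a zeroed object in the caller's stack frame. */
#define widget_alloca(ptr)                                  \
    do {                                                    \
        *(ptr) = (widget_t *)alloca(widget_sizeof());       \
        memset(*(ptr), 0, widget_sizeof());                 \
    } while (0)

/* ---- What would live privately inside the library ---- */
struct widget {
    int id;
    char name[32];
};

size_t widget_sizeof(void) { return sizeof(struct widget); }
int widget_get_id(const widget_t *w) { return w->id; }
void widget_set_id(widget_t *w, int id) { w->id = id; }

/* ---- Client code: never sees the struct layout ---- */
int demo_widget(int id)
{
    widget_t *w;
    widget_alloca(&w);                 /* storage lives until demo_widget returns */
    int initial = widget_get_id(w);    /* 0, because of the memset */
    widget_set_id(w, id);
    return initial + widget_get_id(w);
}
```

The client gets something close to a local declaration: no free() is needed, and only the library has to be rebuilt if the struct grows.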
Previous version of the answer (the points below still apply, but apparently are secondary):
It is not exactly equivalent, since alloca defies scope-based lifetime rules. Lifetime of alloca-ed memory extends to the end of the function, while lifetime of local object extends only to the end of the block. It could be a bad thing, it could be a good thing depending on how you use it.
In situations like
some_type *ptr;
if (some condition)
{
...
ptr = /* alloca one object */;
...
}
else
{
...
ptr = /* alloca another object */;
...
}
the difference in semantics can be crucial. Whether it is your case or not - I can't say from what you posted so far.
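A minimal sketch of that situation, assuming `<alloca.h>` is available (alloca is non-standard); `pick_value` is a hypothetical name. The pointer is used after the braces close, which is valid only because alloca-ed memory lasts until the function returns. Replacing each alloca with a block-local `int` and taking its address would leave a dangling pointer.

```c
#include <alloca.h>   /* non-standard; assumed available (e.g. glibc) */

int pick_value(int condition)
{
    int *ptr;

    if (condition) {
        ptr = alloca(sizeof *ptr);   /* survives the closing brace */
        *ptr = 1;
    } else {
        ptr = alloca(sizeof *ptr);
        *ptr = 2;
    }
    return *ptr;   /* still valid: the memory is released at function exit */
}
```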
Another, unrelated difference in semantics is that memset will zero out all bytes of the object, while = { 0 } is not guaranteed to zero out padding bytes (if any). This can matter if the object is then used with binary-based APIs (for example, when it is sent to a compressed I/O stream).
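The memset half of that statement can be checked directly. In this sketch `struct padded`, `all_bytes_zero` and `memset_clears_padding` are hypothetical names, and the struct is chosen because most ABIs insert padding after `tag`. No equivalent check is shown for `= { 0 }`, because the result for padding bytes is unspecified there.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical struct; most ABIs insert padding between tag and value. */
struct padded {
    char tag;
    long value;
};

/* Returns 1 if every byte of the object representation is zero. */
int all_bytes_zero(const void *obj, size_t size)
{
    const unsigned char *bytes = obj;
    for (size_t i = 0; i < size; i++) {
        if (bytes[i] != 0) {
            return 0;
        }
    }
    return 1;
}

int memset_clears_padding(void)
{
    struct padded p;
    memset(&p, 0xAA, sizeof p);   /* dirty every byte, padding included */
    memset(&p, 0, sizeof p);      /* clears members and padding alike */
    return all_bytes_zero(&p, sizeof p);
}
```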
Is there any advantage or disadvantage of using a C block scope in a function or specifically inside an Interrupt Handler?
From the link - http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0472i/CJAIIDCG.html - Refer stack usage.
In general, you can lower the stack requirements of your program by:
Writing small functions that only require a small number of variables.
Avoiding the use of large local structures or arrays.
Avoiding recursion, for example, by using an alternative algorithm.
Minimizing the number of variables that are in use at any given time at each point in a function.
Using C block scope and declaring variables only where they are required, so overlapping the memory used by distinct scopes.
The advantage or disadvantage of using C block scope is not very clear to me.
If you write:
{
int foo[1000];
int bar[1000];
… code that uses foo …
… code that uses bar …
}
then foo and bar exist for the entire block and must use different memory. (The compiler/optimizer may recognize they are not used simultaneously and arrange to use the same memory, but various things can interfere with this, so you cannot rely on it.)
If you write:
{
{
int foo[1000];
… code that uses foo …
}
{
int bar[1000];
… code that uses bar …
}
}
Then foo and bar only exist at different times, so the compiler can use the same memory for them.
This just means that the compiler can optimize the space required by the execution of a function if you limit the life-time of variables using more scopes. Considering something like
int x;
// [some code]
int y;
// [some code not using x]
Here x stays alive until the end of the function, so the compiler has to keep its storage unless the optimizer can prove that x is no longer used. If you instead structure it like this:
{
int x;
// [some code]
}
{
int y;
// [some code not using x]
}
x doesn't exist in the code using y, so the compiler may reuse the same space that was earlier occupied by x (and it is quite likely that a compiler will indeed do that).
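A compilable version of the sibling-scope layout, as a sketch: `sum_two_phases` and `PHASE_LEN` are hypothetical names. Whether the two arrays actually share stack space is up to the compiler; the code only guarantees that their lifetimes do not overlap.

```c
#include <stddef.h>

#define PHASE_LEN 1000

long sum_two_phases(void)
{
    long total = 0;

    {
        int foo[PHASE_LEN];
        for (size_t i = 0; i < PHASE_LEN; i++) {
            foo[i] = 1;
        }
        for (size_t i = 0; i < PHASE_LEN; i++) {
            total += foo[i];
        }
    }   /* foo's lifetime ends here */

    {
        int bar[PHASE_LEN];   /* may occupy the memory foo used */
        for (size_t i = 0; i < PHASE_LEN; i++) {
            bar[i] = 2;
        }
        for (size_t i = 0; i < PHASE_LEN; i++) {
            total += bar[i];
        }
    }

    return total;
}
```

With GCC, compiling with -fstack-usage writes a per-function stack size report to a .su file, which is one way to compare this layout against the single-block version on a given target.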
General note: If you have to think about this in an ISR, your ISR is probably already too big. Keep ISRs as minimal and fast as possible -- e.g. enqueuing an event might be a good action for an ISR.
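"Enqueuing an event" could look roughly like the sketch below. It assumes a single-core target with exactly one producer (the ISR) and one consumer (the main loop); all names are hypothetical, and a real system may additionally need memory barriers or atomics.

```c
#include <stdint.h>

#define EVENT_QUEUE_SIZE 8u

static volatile uint8_t event_buf[EVENT_QUEUE_SIZE];
static volatile uint8_t event_head;   /* advanced only by the ISR */
static volatile uint8_t event_tail;   /* advanced only by the main loop */

/* Called from the ISR. Returns 1 on success, 0 if the queue is full. */
int event_push(uint8_t event)
{
    uint8_t next = (uint8_t)((event_head + 1u) % EVENT_QUEUE_SIZE);
    if (next == event_tail) {
        return 0;   /* full: drop the event rather than block in an ISR */
    }
    event_buf[event_head] = event;
    event_head = next;
    return 1;
}

/* Called from the main loop. Returns the event, or -1 if the queue is empty. */
int event_pop(void)
{
    if (event_tail == event_head) {
        return -1;
    }
    uint8_t event = event_buf[event_tail];
    event_tail = (uint8_t)((event_tail + 1u) % EVENT_QUEUE_SIZE);
    return event;
}

/* Hypothetical handler: record what happened and return immediately. */
void example_isr(void)
{
    (void)event_push(42u);
}
```

The handler itself needs almost no locals, so the question of block scope inside the ISR largely disappears; the heavier work happens in the main loop after event_pop().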
Use blocks in your C code to make the code clear and logical. This applies to all C programming - not just in interrupts. It is usually a good idea to define your local variables only within the scope where they will be used. (Don't go overboard on this and define new scopes just for your variables - but if a variable is used only within a loop or a branch of a conditional, define it within that block.)
As long as you are using an optimising compiler (and the optimisation is enabled!), there is rarely a difference in stack space and usage when you have your local variables inside a block or outside. The compiler will be smart enough to see the useful lifetime of the variable, and limit its allocation appropriately.
And if you are not using optimisation with your compiler - you should be!
I am building one of the projects and I am looking at the generated list file (target: x86-64). My code looks like:
int func_1(var1,var2){
asm_inline_(
)
func_2(var1,var2);
return_1;
}
void func_2(var_1,var_2){
asm __inline__(
)
func_3();
}
/**** Jump to kernel ---> System call stub in assembly. This func in .S file***/
void func_3(){
}
When I look at the assembly code, I find that a "jmp" instruction is used instead of a "call-return" pair when calling func_2 and func_3. I am sure this is one of the compiler's optimizations, and I have not yet explored how to disable it. (GCC)
The moment I add some volatile variables to func_2 and func_3 and increment them, the "jmp" gets replaced by a "call-ret" pair.
I am bemused to see the behavior because those variables are useless and they don't serve any purpose.
Can someone please explain the behavior?
Thanks
If code jumps to the start of another function rather than calling it, when the jumped-to function returns, it will return back to the point where the outer function was called from, ignoring any more of the first function after that point. Assuming the behaviour is correct (the first function contributed nothing else to the execution after that point anyway), this is an optimisation because it reduces the number of instructions and stack manipulations by one level.
In the given example, the behaviour is correct; there's no local stack to pop and no value to return, so there is no code that needs to run after the call. (return_1, assuming it's not a macro for something, is a pure expression and therefore does nothing no matter its value.) So there's no reason to keep the stack frame around for the future when it has nothing more to contribute to events.
If you add volatile variables to the function bodies, you aren't just adding variables whose flow the compiler can analyse - you're adding slots that you've explicitly told the compiler could be accessed outside the normal control flow it can predict. The volatile qualifier warns the compiler that even though there's no obvious way for the variables to escape, something outside has a way to get their address and write to it at any time. So it can't reduce their lifetime, because it's been told that code outside the function might still try to write to that stack space; and obviously that means the stack frame needs to continue to exist for its entire declared lifespan.
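The two situations can be put side by side in a small sketch. The function names are hypothetical, and whether a jmp is actually emitted depends on the compiler, the optimization level and inlining, so the comments describe eligibility only.

```c
int leaf(int a, int b)
{
    return a + b;
}

/* The call is the last thing this function does, so its frame is no longer
 * needed and the call is eligible to be turned into a jump. */
int tail_caller(int a, int b)
{
    return leaf(a, b);
}

/* Work remains after the call (the volatile increment), so the frame must
 * survive until leaf() returns and a real call/ret pair is required. */
int non_tail_caller(int a, int b)
{
    volatile int counter = 0;
    int result = leaf(a, b);
    counter++;
    return result;
}
```

In GCC this transformation is controlled by -foptimize-sibling-calls, so compiling with -fno-optimize-sibling-calls should keep the call/ret pair without adding dummy variables.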
C89
gcc (GCC) 4.7.2
Hello,
I am maintaining someone's software and I found this function that returns the address of a static structure. This should be OK, as the static would indicate that it is a global, so the address of the structure will be available until the program terminates.
DRIVER_API(driver_t*) driver_instance_get(void)
{
static struct tag_driver driver = {
/* Elements initialized here */
};
return &driver;
}
Used like this:
driver_t *driver = NULL;
driver = driver_instance_get();
The driver variable is used throughout the program until it terminates.
Some questions:
Is it good practice to do like this?
Is there any difference to declaring it static outside the function at file level?
Why not pass it a memory pool into the function and allocate memory to the structure so that the structure is declared on the heap?
Many thanks for any suggestions,
Generally, no. It makes the function non-reentrant. It can be used with restraint in situations when the code author really knows what they are doing.
Declaring it outside would pollute the file-level namespace with the struct object's name. Since direct access to the object is not needed anywhere else, it makes more sense to declare it inside the function. There's no other difference.
Allocate on the heap? Performance would suffer. Memory fragmentation would occur. And the caller will be burdened with the task of explicitly freeing the memory. Forcing the user to use dynamic memory when it can be avoided is generally not a good practice.
A better idea for a reentrant implementation would be to pass a pointer to the destination struct from the outside. That way the caller has full freedom to allocate the recipient memory in any way they see fit.
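A sketch of that caller-provided variant; `struct driver_state` and its fields are hypothetical and stand in for whatever the real driver struct contains.

```c
/* Hypothetical driver state; the real struct's members are not shown. */
struct driver_state {
    int id;
    int opened;
};

/* Reentrant: touches only the storage the caller hands in. */
void driver_state_init(struct driver_state *out, int id)
{
    out->id = id;
    out->opened = 0;
}

/* The caller decides where each instance lives (here: the stack). */
int demo_two_drivers(void)
{
    struct driver_state first;
    struct driver_state second;

    driver_state_init(&first, 1);
    driver_state_init(&second, 2);
    return first.id * 10 + second.id;
}
```

Unlike the function-static version, two independent instances can exist at once, and no hidden state is shared between callers.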
Of course, what you see here can simply be a C implementation of a singleton-like idiom (and most likely it is, judging by the function's name). This means that the function is supposed to return the same pointer every time, i.e. all callers are supposed to see and share the same struct object through the returned pointer. And, possibly, they might even expect to modify the same object (assuming no concurrency). In that case what you see here is a function-wrapped implementation of a global variable. So, changing anything here in that case would actually defeat the purpose.
As long as you realize that any code that modifies the pointer returned by the function is modifying the same variable as any other code that got the same pointer is referring to, it isn't a huge problem. That 'as long as' can be a fairly important issue, but it works. It usually isn't the best practice — for example, the C functions such as asctime() that return a pointer to a single static variable are not as easy to use as those that put their result into a user-provided variable — especially in threaded code (the function is not reentrant). However, in this context, it looks like you're achieving a Singleton Pattern; you probably only want one copy of 'the driver', so it looks reasonable to me — but we'd need a lot more information about the use cases before pontificating 'this is diabolically wrong'.
There's not really much difference between a function static and a file static variable here. The difference is in the implementation code (a file static variable can be accessed by any code in the file; the function static variable can only be accessed in the one function) and not in the consumer code.
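Side by side, the two placements look like this. The names are hypothetical; the point is only that consumer code sees the same behaviour from both accessors.

```c
/* Hypothetical driver struct for illustration. */
struct driver_sketch {
    int use_count;
};

/* File static: any function in this file can touch it directly. */
static struct driver_sketch file_level_driver;

struct driver_sketch *driver_get_file_static(void)
{
    return &file_level_driver;
}

/* Function static: only this function can name the object. */
struct driver_sketch *driver_get_function_static(void)
{
    static struct driver_sketch driver;
    return &driver;
}
```

Both objects have static storage duration, are zero-initialized before the program starts, and each accessor returns the same address on every call.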
'Memory pool' is not a standard C concept. It would probably be better, in general, to pass in the structure to be initialized by the called function, but it depends on context. As it stands, for the purpose for which it appears to be designed, it is OK.
NB: The code would be better written as:
driver_t *driver = driver_instance_get();
The optimizer will probably optimize the code to that anyway, but there's no point in assigning NULL and then reassigning immediately.
I have the following structure:
struct sys_config_s
{
char server_addr[256];
char listen_port[100];
char server_port[100];
char logfile[PATH_MAX];
char pidfile[PATH_MAX];
char libfile[PATH_MAX];
int debug_flag;
unsigned long connect_delay;
};
typedef struct sys_config_s sys_config_t;
I also have a function defined in a static library (let's call it A.lib):
sys_config_t* sys_get_config(void)
{
static sys_config_t config;
return &config;
}
I then have a program (let's call it B) and a dynamic library (let's call it C). Both B and C link with A.lib. At runtime B opens C via dlopen() and then gets an address to C's function func() via a call to dlsym().
void func(void)
{
sys_get_config()->connect_delay = 1000;
}
The above code is the body of C's func() function and it produces a segmentation fault when reached. The segfault only occurs while running outside of gdb.
Why does that happen?
EDIT: Making sys_config_t config a global variable doesn't help.
The solution is trivial. Somehow, by a header mismatch, the PATH_MAX constant was defined differently in B's and C's compilation units. I need to be more careful in the future. (facepalms)
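Why a different PATH_MAX matters can be seen from the struct layout alone. In this sketch the two array sizes (256 and 4096) are made-up placeholders for the two conflicting PATH_MAX values, and the structs are cut down to one path member.

```c
#include <stddef.h>

/* Layout as seen by a translation unit where PATH_MAX is (say) 256. */
struct config_seen_by_b {
    char logfile[256];
    unsigned long connect_delay;
};

/* Layout as seen by a translation unit where PATH_MAX is (say) 4096. */
struct config_seen_by_c {
    char logfile[4096];
    unsigned long connect_delay;
};

size_t delay_offset_in_b(void)
{
    return offsetof(struct config_seen_by_b, connect_delay);
}

size_t delay_offset_in_c(void)
{
    return offsetof(struct config_seen_by_c, connect_delay);
}
```

If the static object was defined by code using the smaller layout, a write to connect_delay through the larger layout lands past the end of the object, which can either corrupt memory silently or fault, depending on what happens to be mapped there.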
There is no difference between a static local and a static global variable here. A static variable is STATIC: it is not allocated on the stack in the current function's frame when the function is called, but in one of the preexisting memory segments defined in the executable's binary headers.
That much I'm 100% sure of. Which segment the variables are placed in exactly, and whether they are properly shared, is another problem. I've seen similar problems with sharing global/static variables between modules, but usually the core of the problem was very specific to the exact setup.
Please take into consideration that the code sample is small and that I worked on that platform a long time ago. What I've written above might be mis-worded or even plainly wrong at some points!
I think the important thing is that you are getting that segfault in C when touching that line. Setting an integer field to a constant cannot fail, provided that the target address is valid and not write-protected. That leaves two options:
- either your function sys_get_config() has crashed
- or it has returned an invalid pointer.
Since you say that the segfault is raised here, not in sys_get_config, the only thing left is the latter point: broken pointer.
Add a trivial printf to sys_get_config that dumps the address about to be returned, then do the same in the calling function "func". Check that the pointer is not null, and also check that the value inside sys_get_config is the same as the value seen after the return, just to be sure that the calling conventions are right, etc. A good way to double- or triple-check is to also add a copy of sys_get_config (under a different name, of course) inside module "A", and to check whether the addresses returned from sys_get_config and its copy are the same. If they are not, something went very wrong during linking.
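The diagnostic could be sketched as follows. The `_sketch` names are hypothetical stand-ins for the real sys_get_config and sys_config_t, and the struct is reduced to one field.

```c
#include <stdio.h>

/* Cut-down, hypothetical stand-in for sys_config_t. */
typedef struct {
    unsigned long connect_delay;
} sys_config_sketch_t;

sys_config_sketch_t *sys_get_config_sketch(void)
{
    static sys_config_sketch_t config;
    /* Dump the address on the callee side... */
    fprintf(stderr, "sys_get_config_sketch: returning %p\n", (void *)&config);
    return &config;
}

/* ...and on the caller side. Returns 1 if the pointer looks sane. */
int check_config_pointer(void)
{
    sys_config_sketch_t *cfg = sys_get_config_sketch();
    fprintf(stderr, "caller: received %p\n", (void *)cfg);
    return cfg != NULL && cfg == sys_get_config_sketch();
}
```

Printing to stderr keeps the output unbuffered, so the last line before a crash is not lost.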
There is also a very small chance that the module loading has been deferred, and that you are trying to reference memory of a module that was not fully initialized yet. I worked on Linux a very long time ago, but I remember that dlopen has various loading options. But you wrote that you got the address via dlsym, so I suppose the module was loaded, since you've got the symbol's final address.