And is there a way to easily monitor your stack depth in a linux environment?
Consider the case of a basic app in C, compiled with gcc, in Ubuntu.
How about if you do NOT allow dynamic memory allocation (no malloc/free-ing)?
Can you know your max stack depth after you compile?
No. Consider a recursive function that might call itself any number of times, depending on input. You can't know how many times the function might be invoked, one inside the last, without knowing what the input to the program is.
I expect that it might be possible to determine the max stack depth for some programs, but you can't determine that for all programs.
And is there a way to easily monitor your stack depth in a linux environment?
I don't know about an easy way to monitor the stack depth continuously, but you can determine the stack depth at any point using gdb's -stack-info-depth command.
How about if you do NOT allow dynamic memory allocation (no malloc/free-ing)?
That doesn't make a difference. Consider a recursive Fibonacci function -- that wouldn't need to allocate any memory dynamically, but the number of stack frames will still vary depending on input.
It is possible to do a call-graph analysis. The longest path in the graph is the maximum stack depth. However, there are limitations.
With recursive functions all bets are off: the depth of recursion depends on the run-time input, so it is not possible to deduce it by compile-time analysis. [It is possible to detect the presence of recursive functions by analyzing the call graph and looking for self edges, i.e. edges with the same source and destination.]
Furthermore, the same issue is present if there are loops/cycles in the call graph. [As mentioned by @Caleb: a()->b()->c()->d()->e()->f()->g()->c()] [Using algorithms from graph theory, it is possible to detect the presence of cycles as well.]
References for call graph:
http://en.wikipedia.org/wiki/Call_graph
Tools to get a pictorial function call graph of code
I am wondering what the best way is to determine how much stack space a program is using. Are there any techniques or tools to generate the statistics, rather than counting by hand?
The program I am hoping to analyze is a C program in Code Composer, if that makes a difference.
Thank you
You can fill the stack RAM with some pattern (0xDEADBEEF, for example), run the program for a while, and then examine the stack to see how much was used. You would still have to do the analysis to find the deepest paths, and then generate the deepest nested interrupts on top of that, if that is ever possible in the application.
There is some info about running the static analysis tool on TI's website here. Generally, static analysis will tell you how much stack is used by the deepest call tree from main(), but it won't include the ISRs. You need to manually look at the call tree and add in the ISR call depth. If you have several priority levels of ISRs, don't forget that a higher-priority ISR can interrupt a lower-priority one.
I have seen a lot of algorithms for searching a sorted binary tree, but all of them use the same approach: recursion. I know recursion is expensive in comparison to loops, because every time we call the search function, a new stack frame is created for that method, which will eventually use a lot of memory if the binary search tree is very deep.
Why can't we search the binary search tree like this:
while (root != NULL)
{
    if (root->data == data)
        return 0;               /* found */
    else if (data > root->data)
        root = root->right;     /* continue in the right subtree */
    else
        root = root->left;      /* continue in the left subtree */
}
return -1;                      /* not found: we fell off the tree */
I think that this way is faster and more efficient than the recursive way, correct me if I am wrong!
Your way (which is the common way to code that in C) will probably be faster, but you should benchmark, because some C compilers (e.g. recent GCC when invoked with gcc -O2 ...) are able to optimize most tail calls into a jump (passing values in registers). Tail-call optimization means that a call stack frame is reused by the callee (so the call stack stays bounded). See this question.
FWIW, in OCaml (or Scheme, or Haskell, or most Common Lisp implementations) you would code a tail-recursive call and you know that the compiler is optimizing it as a jump.
Recursion is not always slower than loops (in particular for tail-calls). It is a matter of optimization by the compiler.
Read about continuations and continuation passing style. If you know only C, learn some functional language (Ocaml, Haskell, or Scheme with SICP ...) where tail-calls are very often used. Read about call/cc in Scheme.
Yes, that's the normal way of doing it.
Theoretically, both your solution and the recursive solution have the same big-O complexity: for a balanced tree, both are O(log n). If you want performance measured in seconds, you need to get practical: write the code for both methods (iterative and recursive), run them, and measure the run time.
I've got an auxiliary function that does some operations that are pretty costly.
I'm trying to profile the main section of the algorithm, but this auxiliary function gets called a lot within it. Consequently, the measured time takes the auxiliary function's time into account.
To solve this, I decided to save and restore the time so that the auxiliary function appears to be instantaneous. I defined the following macros:
#define TIME_SAVE struct timeval _time_tv; gettimeofday(&_time_tv,NULL);
#define TIME_RESTORE settimeofday(&_time_tv,NULL);
. . . and used them as the first and last lines of the auxiliary function. For some reason, though, the auxiliary function's overhead is still included!
So, I know this is kind of a messy solution, and so I have since moved on, but I'm still curious as to why this idea didn't work.
Can someone please explain why?
If you insist on profiling this way, do not set the system clock. This will break all sorts of things, if you have permission to do it. Basically you should forget you ever heard of settimeofday. What you want to do is call gettimeofday both before and after the function you want to exclude from measurement, and compute the difference. You can then exclude the time spent in this function from the overall time.
With that said, this whole method of "profiling" is highly flawed, because gettimeofday probably (1) takes a significant amount of time compared to what you're trying to measure, and (2) probably involves a transition into kernelspace, which will do some serious damage to your program's cache coherency. This second problem, whereby in attempting to observe your program's performance characteristics you actually change them, is the most problematic.
What you really should do is forget about this kind of profiling (gettimeofday or even gcc's -pg/gmon profiling) and instead use oprofile or perf or something similar. These modern profiling techniques work based on statistically sampling the instruction pointer and stack information periodically; your program's own code is not modified at all, so it behaves as closely as possible to how it would behave with no profiler running.
There are a couple possibilities that may be occurring. One is that Linux tries to keep the clock accurate and adjustments to the clock may be 'smoothed' or otherwise 'fixed up' to try to keep a smooth sense of time within the system. If you are running NTP, it will also try to maintain a reasonable sense of time.
My approach would have been to not modify the clock but instead track time consumed by each portion of the process. The calls to the expensive part would be accumulated (by getting the difference between gettimeofday on entry and exit, and accumulating) and subtracting that from overall time. There are other possibilities for fancier approaches, I'm sure.
I have two questions related to memory. First some background. I am a novice-intermediate c programmer.
I have written several different tree-like data structures with a variable number of nodes at each level. One such structure can have, as its data, a number of integer variables, which themselves are primary data for integer trees. I have written recursive functions for generating trees with random numbers of nodes at different levels. I pass pointers to randomly generated integer trees as parameters when generating the main data structure.
I have also written recursive code for operating on these tree structures, such as printing the tree. Just for my learning, I created a queue and a stack for my nodes and wrote iterative functions for in-order, pre-order and post-order printing of the tree. I think I am beginning to get the hang of it.
Now the question.
(a) I need to write other functions, which are obviously easy and clean if written using pure recursion. I can see how they could be written iteratively. It is not difficult, just tedious. The maximum depth of my trees will be 3-5; however, the number of nodes at each level is large. It is my understanding that every recursive call will store addresses on a stack. If the depth is large, it can run out of memory. But if the depth is shallow, the penalty (memory/speed) of using a recursive function may not be terrible.
Do people have recommendations on criteria for deciding whether an iterative or recursive solution is preferable? I have read various threads on the site about iterative solutions, but could not find anything that directly speaks to this issue.
(b) Second, question relates to requesting memory from the system. I know that some applications can request certain amount of memory. I am using mingw-gcc4.x with Netbeans IDE. How can I specify the maximum amount of memory that the program can use in debug / release mode? Or, does it depend solely on the available RAM and no explicit specification is necessary?!
Thanks in advance,
paras
~RT
"The maximum depth of my trees will be 3-5"
This depth of recursion will not challenge the default stack size of any version of Windows, or any other system you'll ever see that doesn't have "Watch out! The stack is tiny!" plastered all over it. Most programs go a lot more than 3-5 calls deep, without involving any recursion at all.
So as long as your recursion only goes "down" your tree, not "across" its breadth, you're fine. Unless of course you're doing something really unusual like sticking an enormous array on the stack as a local variable.
Am I right that your (post-order) recursion looks something like this?
void doNode(struct node *n) {
for (int i = 0; i < n->num_nodes; ++i) {
doNode(n->nodes[i]);
}
// do some work on this node
}
If so, then for 3-5 levels the only way it'll use a lot of stack is if it looks like this:
void doNode(struct node *n) {
int myRidiculousArray[100*1000] = { 0 };
for (int i = 0; i < n->num_nodes; ++i) {
doNode(n->nodes[i]);
}
// do some work on this node, using myRidiculousArray
}
If you have millions of nodes, then there may be some performance gain to be had from avoiding the function call overhead per node. But write it the easy way first, and then come back and look at it later if you're ever desperate for a bit more performance. It's fairly rare for the overhead of a function call per node to be the reason your code is slow - it does happen, but usually only after you've fixed a bunch of other, even worse, slowdowns.
If you write your function using tail recursion (provided you're compiling with optimization enabled) you won't run into problems with stack or memory space.
In the end you need to program your functions so you can understand them so do whatever is easier for you.
Even an iterative implementation is a recursive algorithm if you're using a stack to store nodes; both use O(f) space, where "f" is a function that's "more" than a constant (c is O(f) but f is not O(1)). You might still wind up using less memory with the iterative version if the elements of your stack are smaller than a call-stack frame. If so, you can look into reducing the size of a call stack by using closures, assuming the language supports them.
Iterative algorithms will have O(1) space requirements. Even a recursive implementation can achieve this using tail calls, as Dashogun mentions.
Spend a little time trying to find an iterative algorithm. If you can't find one, I recommend going with the recursive implementation unless you know for certain that you need to handle a recursive structure that (these days) has a depth of at least 2^13. For a binary tree, that's 2^(2^13) nodes, which I very much doubt you'll see.
(a) Recursion is not bad in itself. However, if writing the iterative algorithm is close in complexity, you should use the iterative one. Before committing to a recursive algorithm, some prerequisites apply:
- You should make sure that the recursion depth (and the local variables in the re-entrant functions) will not make you exceed the stack size. For the depth you mentioned, this would be a problem on Windows in very few cases. Additionally, you can add a safety check on the height of the tree.
(b) If you are asking about the stack size: I see you use mingw, so you probably build for Windows. The stack size on Windows is per thread. Have a look here at how to set up your reserved and initially committed stack size.
If you are asking about heap memory allocation have a look here. But the short story is that you can use all the memory the system can provide for heap allocations.
I want to know what the maximum number of (recursive) function calls allowed in gcc C is. I have a program which can reach a stack depth of 400000 function calls, each with a size of around 200 bytes (so around 80 MB). How can I increase the maximum depth?
The stack limit is not imposed by the compiler, but by the operating system. On Unix, you can try using ulimit(1) to increase it.
I would recommend rewriting the routine as an iterative algorithm. Though nontrivial, the conversion should be straightforward, and it will free you from having to deal with such resource limitations (which, I would guess, vary wildly by architecture, platform, computer details, etc.).
Also, please note: all recursive algorithms can be written iteratively.