Variable inside if block is shown in call stack even though the if statement didn't get evaluated to true - c

I have the following C code.
In a .c file:
1 custom_data_type2 myFunction1(custom_data_type1 a, custom_data_type2 b)
2 {
3 int c=foo();
4 custom_data_type3 t;
5 check_for_ir_path();
6 ...
7 ...
8 }
9
10 custom_data_type4 myFunction2(custom_data_type3 c, const void* d)
11 {
12 custom_data_type4 e;
13 struct custom_data_type5 f;
14 check_for_ir_path();
15 ...
16 temp = myFunction1(...);
17 return temp;
18 }
In a header file:
void CRASH_DUMP(int *i)
    __attribute__((noinline));

#define INTRPT_FORCE_DUMMY_STACK 3

#define check_for_ir_path() { \
    if (checkfunc1() && !checkfunc2()) { \
        int temp = INTRPT_FORCE_DUMMY_STACK; \
        ...
        CRASH_DUMP(&sv); \
    } \
}
Under conditions we have not been able to identify, the process crashes.
After processing the core dump with GDB, we get a call stack like this:
#0 0x00007ffa589d9619 in myFunction1 [...]
(custom_data_type1=0x8080808080808080, custom_data_type2=0x7ff9d77f76b8) at ../xxx/yyy/zzz.c:5
temp = 32761
t = <optimized out>
#1 0x00007ffa589d8f91 in myFunction2 [...]
(custom_data_type3=<optimized out>, d=0x7ff9d77f7748) at ../xxx/yyy/zzz.c:16
temp = 167937677
f = {
...
}
As the code shows, check_for_ir_path is invoked from both myFunction1() and myFunction2().
Inside check_for_ir_path, the if block tests checkfunc1() && !checkfunc2(). If that check evaluates to true, a SIGSEGV is raised and the process is crashed intentionally. The variable temp is declared only inside that if block.
In the call stack, the local variable temp is listed even in frame #1, although the crash did not happen inside myFunction2. How is this possible?
If I declare another variable, say 'int temp', just after the statement int temp = INTRPT_FORCE_DUMMY_STACK;, it is not shown in the bt full output.

Compilers are allowed to reorganise your code in any way that doesn't change the outcome of the program. So if you write:
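void foo()
{
    if (something)
    {
        int sv;
        ...
    }
}
the compiler is allowed to change it into something equivalent to:
void foo()
{
    int sv;
    if (something)
    {
        ...
    }
}
regardless of something being true or false. Storage for sv is then typically reserved in foo's stack frame when the function is entered, so a debugger may list the variable (holding whatever garbage is in that slot) even if the branch never ran.
But the compiler must make sure that this will fail to compile:
void foo()
{
    if (something)
    {
        int sv;
        ...
    }
    sv = whatever; // Compiler error: sv is not in scope here
}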
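A minimal sketch of the pattern from the question, using stand-in names: CHECK_PATH, crash_dump_stub and caller are hypothetical and not part of the original code. The variable temp is in scope only inside the if block and is initialised only when the branch runs, yet an unoptimised build will usually give it a slot in caller's stack frame for the whole call, and that slot is what a debugger reads when it prints locals.

```c
/* Hypothetical stand-in for CRASH_DUMP: records the value instead of crashing. */
static int dump_calls = 0;
static int last_dumped = 0;

static void crash_dump_stub(int *p)
{
    last_dumped = *p;
    dump_calls++;
}

/* Same shape as check_for_ir_path(): temp exists only inside the if block. */
#define CHECK_PATH(cond) { \
    if (cond) { \
        int temp = 3; \
        crash_dump_stub(&temp); \
    } \
}

/* Returns how many times the guarded branch has run so far. */
int caller(int cond)
{
    CHECK_PATH(cond);   /* temp may occupy a frame slot even when cond is 0 */
    return dump_calls;
}
```

With caller(0) the branch is skipped and temp is never written, so inspecting that frame in a debugger would show whatever bytes happen to be in its slot.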

Related

How does the for loop run in such a case?

#include <stdio.h>
int f1(){
    static int val=11;
    return --val;
}
int main()
{
    for( f1(); f1(); f1()){
        printf("%d",f1());
    }
}
The output of the program is :
8 5 2
Could someone explain what for(f1();f1();f1()) does?
#include <stdio.h>
int foo(void) {
    static int val = 11;
    return --val;
}
int main(void) {
    int (*f1)(void) = foo; // 4 function pointers to the same function
    int (*f2)(void) = foo;
    int (*f3)(void) = foo;
    int (*f4)(void) = foo;
    for (f1(); f2(); f3()) {
        printf("%d", f4());
    }
}
The execution goes like this:
val = 11
f1() 10
f2() 9 "true"
f4() print 8
f3() 7
f2() 6 "true"
f4() print 5
f3() 4
f2() 3 "true"
f4() print 2
f3() 1
f2() 0 "false"
To make this easier to follow, I'll label the calls like this:
for (A(); B(); C()) {
    printf("%d", D());
}
A, B, C and D are all calls to f1(); the labels only make it possible to tell them apart.
The first thing to understand is how the three "parameters" of for work.
The 1st is the init part: it is executed only once, before the first evaluation of the condition.
The 2nd is the condition part: it is evaluated before every iteration, to decide whether to keep looping or stop.
The 3rd is the conclusion part: it is executed after every iteration, before the next evaluation of the condition.
Your program therefore executes in this order:
call A() to initialize the for loop => val = 10;
call B() to check condition => val = 9
call D() in your printf => val = 8 gets printed
call C() to end an iteration => val = 7
call B() to check condition => val = 6
call D() in your printf => val = 5 gets printed
call C() to end an iteration => val = 4
call B() to check condition => val = 3
call D() in your printf => val = 2 gets printed
call C() to end an iteration => val = 1
call B() to check condition => val = 0 condition ends
The C for loop contains 3 items:
the initialisation
the condition
the incrementation
A for loop runs until the condition evaluates to 0 (a null pointer also counts as 0).
On entering the loop, f1() is called once for the initialisation, so val equals 10.
In each iteration, f1() is called three times: to check the condition, in the printf call, and in the increment part. So each iteration gives val = val - 3.
Once the condition call returns 0 (val == 0), you exit the loop.
Note that this only works because val is static and is therefore not reinitialised on each call of f1().
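The order described in these answers can be sketched by rewriting the for loop as the equivalent while loop. This is only an illustration: run_loop, out and cap are hypothetical names, the counter is moved to file scope and reset on entry so the sketch can be run more than once, and the values are stored in an array instead of being printed.

```c
#include <stddef.h>

static int val;                 /* persists across calls, like the static local */

static int f1(void)
{
    return --val;
}

/* for (A; B; C) { body }  behaves like  A; while (B) { body; C; } */
size_t run_loop(int *out, size_t cap)
{
    size_t n = 0;
    val = 11;                   /* reset so the sketch is repeatable */
    f1();                       /* A: init, runs once */
    while (f1()) {              /* B: condition, before every iteration */
        int printed = f1();     /* D: the call inside printf */
        if (n < cap)
            out[n++] = printed;
        f1();                   /* C: step, after every iteration */
    }
    return n;                   /* number of values the loop body produced */
}
```

Called with an array of at least three elements, the function stores 8, 5 and 2, matching the trace above.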

Pointer to linkedlist randomly changing

For one of my school assignments I have to make my own stack library and a POSTFIX calculator.
The calculator has to make use of the stack library and do some calculations.
I am pushing two different numbers to my stack. Number 6 and 3. The header should point to the most recently added node (LIFO). So when 6 is added:
HEADER -> 6 -> NULL
When 3 is being added:
HEADER -> 3 -> 6 -> NULL
When I print the value of my header after adding '6' it's good. It's printing 6.
However, when I print the value of my header BEFORE adding '3' it's printing '3'. When it still should print 6.
So a summary of my problem:
When adding another node to my linkedlist, the header suddenly points to the newest node before even changing it.
You may understand me better with some code and debugging results.
Btw: Don't mind the typedefs, I don't like them. My teacher wants us to use it.
typedef struct stackObject* pStackObject_t;
typedef struct stackObject
{
    void* obj;
    pStackObject_t next;
} StackObject_t;
typedef struct stackMEta* pStackMeta_t;
typedef struct stackMEta
{
    pStackObject_t stack;
    size_t objsize;
    int numelem; // number of elements
    int handle; // which stack
    pStackMeta_t next;
} StackMeta_t;
int mystack_push(int handle, void* obj)
{
    **DELETED NON RELATED CODE BASED ON FEEDBACK**
    if (currentMeta->handle == handle)
    {
        pStackObject_t newObject = malloc(sizeof(StackObject_t));
        newObject->obj = obj;
        printf("%s%d\n", "Wanting to push int to stack: ", *(int*)obj);
        // First node
        if (currentMeta->stack == NULL)
        {
            currentMeta->stack = newObject;
            currentMeta->stack->next = NULL;
            printf("%s%d\n", " FIRST Curentmeta->stack pointing to ", *(int*)currentMeta->stack->obj);
            return 0;
        }
        else
        {
            printf("%s%d\n", "NOT FIRST Currentmeta->stack pointing to ", *(int*)currentMeta->stack->obj);
            newObject->next = currentMeta->stack;
            currentMeta->stack = newObject;
            printf("%s%d\n", "Currentmeta->stack ", *(int*)currentMeta->stack->obj);
            printf("%s%d\n", "Currentmeta->stack->next ", *(int*)currentMeta->stack->next->obj);
            printf("%s%d\n", "Succesful pushed int to stack: ", *(int*)currentMeta->stack->obj);
            return 0;
        }
    }
    return -1;
}
Terminal:
Created stack with handle: 1 and objsize 4 bytes
Wanting to push int to stack: 6
FIRST Curentmeta->stack pointing to 6
Wanting to push int to stack: 3
NOT FIRST Currentmeta->stack pointing to 3
Currentmeta->stack 3
Currentmeta->stack->next 3
Succesful pushed int to stack: 3
My unit tests pass with this code. My calculator does not, even though it makes the same function call.
I found out that it only worked about half the time. Using the same input values in another program gave correct results.
I changed the code to this:
pStackObject_t newObject = malloc(sizeof(StackObject_t));
newObject->obj = malloc(currentMeta->objsize); // allocate objsize bytes, not sizeof(objsize)
memcpy(newObject->obj, obj, currentMeta->objsize);
Now it's working fine. The previous code stored only the caller's pointer, so the node kept referring to a variable that had since been overwritten or gone out of scope. Thanks everyone for the help.
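A minimal sketch of a push that copies the object rather than keeping the caller's pointer. The names demo_node, demo_stack, demo_push and demo_pop are hypothetical and much simpler than the assignment's handle-based API. The question does not show the calling code, so the assumption here is that the caller reuses one variable for successive values.

```c
#include <stdlib.h>
#include <string.h>

typedef struct demo_node {
    void *obj;                  /* private copy owned by the node */
    struct demo_node *next;
} demo_node;

typedef struct {
    demo_node *top;
    size_t objsize;             /* size in bytes of every stored object */
} demo_stack;

/* Pushes a copy of *obj; returns 0 on success, -1 on allocation failure. */
int demo_push(demo_stack *s, const void *obj)
{
    demo_node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->obj = malloc(s->objsize);
    if (n->obj == NULL) {
        free(n);
        return -1;
    }
    memcpy(n->obj, obj, s->objsize);
    n->next = s->top;
    s->top = n;
    return 0;
}

/* Copies the top object into *out and removes it; returns -1 if empty. */
int demo_pop(demo_stack *s, void *out)
{
    demo_node *n = s->top;
    if (n == NULL)
        return -1;
    memcpy(out, n->obj, s->objsize);
    s->top = n->next;
    free(n->obj);
    free(n);
    return 0;
}
```

Because each node owns its copy, the caller can overwrite its variable between pushes without changing values already on the stack.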

Writing an LLVM pass to detect malloc function calls, number of bytes assigned and the variable name pointing to that memory

I have recently begun working with LLVM. I am trying to write a pass in LLVM that given the following code
string = (char *)malloc(100);
string = NULL;
and the corresponding LLVM IR
%call = call noalias i8* @malloc(i64 100) #3
store i8* %call, i8** %string, align 8
store i8* null, i8** %string, align 8
detects instructions calling malloc, extracts number of bytes assigned (in this case 100), the address returned and the variable name that the address is assigned to.
std::map<std::string, std::tuple<size_t, int> > mem_addrs; // stores pointer name, address and no. of bytes allocated
Count() : ModulePass(ID) {}
virtual bool runOnModule(Module &M) {
    for (Function &F: M) {
        for (BasicBlock &B: F) {
            for (Instruction &I: B) {
                if(CallInst* call_inst = dyn_cast<CallInst>(&I)) {
                    Function* fn = call_inst->getCalledFunction();
                    StringRef fn_name = fn->getName();
                    errs() << fn_name << " : " << "\n";
                    for(auto args = fn->arg_begin(); args != fn->arg_end(); ++args) {
                        ConstantInt* arg = dyn_cast<ConstantInt>(&(*args));
                        if (arg != NULL)
                            errs() << arg->getValue() << "\n";
                    }
                }
            }
        }
    }
    return false; // analysis only, the module is not modified
}
The output is
-VirtualBox:~/program_analysis$ opt -load $LLVMLIB/CSE231.so -analyze -count < $BENCHMARKS/leaktest/leaktest.bc > $OUTPUTLOGS/welcome.static.log
ok
allocaimw
allocaleak
allocamalloc : 0x2f5d9e0
0 opt 0x0000000001315cf2 llvm::sys::PrintStackTrace(_IO_FILE*) + 34
1 opt 0x0000000001315914
2 libpthread.so.0 0x00007f0b53f12330
3 opt 0x00000000012ec78f llvm::APInt::toString(llvm::SmallVectorImpl<char>&, unsigned int, bool, bool) const + 79
4 opt 0x00000000012ed309 llvm::APInt::print(llvm::raw_ostream&, bool) const + 57
5 CSE231.so 0x00007f0b52f16661
6 opt 0x00000000012ad6cd llvm::legacy::PassManagerImpl::run(llvm::Module&) + 797
7 opt 0x000000000058e190 main + 2752
8 libc.so.6 0x00007f0b5313af45 __libc_start_main + 245
9 opt 0x00000000005ab2ca
Stack dump:
0. Program arguments: opt -load /home/hifza/program_analysis/llvm/build/Release+Asserts/lib/CSE231.so -analyze -count
1. Running pass 'Instruction Counts Pass' on module '<stdin>'.
Segmentation fault (core dumped)
I am able to detect malloc instructions, but I am not able to find the corresponding memory address and the number of bytes allocated. Can anyone guide me on how to go about doing this? Thanks.
You don't check the result of dyn_cast<ConstantInt>(&(*args)). If the value is not a ConstantInt, dyn_cast returns nullptr, and the next line (arg->getValue()) then dereferences it.
I prefer to detect malloc calls by:
first detecting store instructions,
then checking whether the LHS is a pointer,
then finding out what the RHS is (using a stack-based walk to reach the actual value, since LLVM IR is a load-store architecture and the RHS is not always the actual value),
and, if that walk ends at a call instruction, checking whether it calls malloc or not.
Once you have detected the malloc, you can fetch the number of bytes requested with getOperand(0) on the call instruction (tip in the code below).
The variable pointing to the memory is the pointer operand of the store instruction you started with (lhs in the code).
Here is the code snippet, which also works for inter-procedural cases and supports the new operator.
void findOperand(Value *itVal) {
    std::stack<Value *> st;
    st.push(itVal);
    while (!st.empty()) {
        auto ele = st.top();
        st.pop();
        if (isa<Instruction>(ele)) {
            Instruction *tip = cast<Instruction>(ele);
            if (isa<AllocaInst>(tip)) {
                errs() << "others\n";
                // opdSet.insert(ele);
            } else if (isa<LoadInst>(tip)) {
                Value *ti = tip->getOperand(0);
                if (!isa<ConstantData>(ti))
                    st.push(ti);
            } else if (isa<CallInst>(tip)) {
                Function *calledFp = cast<CallInst>(tip)->getCalledFunction();
                if (!calledFp) // indirect call: no statically known callee
                    continue;
                errs() << calledFp->getName() << "\n";
                if (calledFp->getName() == "malloc" || calledFp->getName() == "_Znwm") {
                    errs() << "Dynamic memory allocation!\n";
                    errs() << tip->getNumOperands() << "\n";
                    // print the size argument itself rather than its address
                    errs() << *tip->getOperand(0) << "\n";
                } else {
                    // fetch the last bb of the function
                    auto bb = calledFp->end();
                    if (bb != calledFp->begin()) {
                        --bb;
                        BasicBlock *bp = &(*bb);
                        // fetch the terminator
                        Instruction *term = bp->getTerminator();
                        // skip "ret void", which has no operand
                        if (isa<ReturnInst>(term) && term->getNumOperands() > 0) {
                            // find Operand
                            findOperand(term->getOperand(0));
                            errs() << "done\n";
                        }
                    }
                }
            } else {
                for (int i = 0, numOp = tip->getNumOperands(); i < numOp; i++) {
                    Value *ti = tip->getOperand(i);
                    if (!isa<ConstantData>(ti)) {
                        st.push(ti);
                    }
                }
            }
        } else if (isa<GlobalVariable>(ele)) {
            errs() << "others\n";
        }
    }
} // findOperand
void visitStoreInst(StoreInst &ip) {
    Value *lhs = ip.getOperand(1); // pointer being stored to
    Value *rhs = ip.getOperand(0); // value being stored
    if (lhs->getType()->getContainedType(0)->isPointerTy()) {
        // figure out rhs
        errs() << "pointer assignment!" << lhs->getName() << "\n";
        findOperand(rhs);
    }
}

What is this madness?

I've never seen anything like this; I can't seem to wrap my head around it. What does this code even do? It looks super fancy, and I'm pretty sure this stuff is not described anywhere in my C book. :(
union u;
typedef union u (*funcptr)();
union u {
    funcptr f;
    int i;
};
typedef union u $;
int main() {
    int printf(const char *, ...);
    $ fact =
        ($){.f = ({
            $ lambda($ n) {
                return ($){.i = n.i == 0 ? 1 : n.i * fact.f(($){.i = n.i - 1}).i};
            }
            lambda;
        })};
    $ make_adder = ($){.f = ({
        $ lambda($ n) {
            return ($){.f = ({
                $ lambda($ x) {
                    return ($){.i = n.i + x.i};
                }
                lambda;
            })};
        }
        lambda;
    })};
    $ add1 = make_adder.f(($){.i = 1});
    $ mul3 = ($){.f = ({
        $ lambda($ n) { return ($){.i = n.i * 3}; }
        lambda;
    })};
    $ compose = ($){
        .f = ({
            $ lambda($ f, $ g) {
                return ($){.f = ({
                    $ lambda($ n) {
                        return ($){.i = f.f(($){.i = g.f(($){.i = n.i}).i}).i};
                    }
                    lambda;
                })};
            }
            lambda;
        })};
    $ mul3add1 = compose.f(mul3, add1);
    printf("%d\n", fact.f(($){.i = 5}).i);
    printf("%d\n", mul3.f(($){.i = add1.f(($){.i = 10}).i}).i);
    printf("%d\n", mul3add1.f(($){.i = 10}).i);
    return 0;
}
This example primarily builds on two GCC extensions: nested functions, and statement expressions.
The nested function extension allows you to define a function within the body of another function. Regular block scoping rules apply, so the nested function has access to the local variables of the outer function when it is called:
int outer(int x) {
    int inner(int y) {
        return x + y;
    }
    return inner(6);
}
...
int z = outer(4); // z == 10
The statement expression extension allows you to wrap up a C block statement (any code you would normally be able to place within braces: variable declarations, for loops, etc.) for use in a value-producing context. It looks like a block statement in parentheses:
int foo(int x) {
    return 5 + ({
        int y = 0;
        while (y < 10) ++y;
        x + y;
    });
}
...
int z = foo(6); // z == 21
The last statement in the wrapped block provides the value. So it works pretty much like you might imagine an inlined function body.
These two extensions used in combination let you define a function body with access to the variables of the surrounding scope, and use it immediately in an expression, creating a kind of basic lambda expression. Since a statement expression can contain any statement, and a nested function definition is a statement, and a function's name is a value, a statement expression can define a function and immediately return a pointer to that function to the surrounding expression:
int foo(int x) {
    int (*f)(int) = ({        // statement expression
        int nested(int y) {   // statement 1: function definition
            return x + y;
        }
        nested;               // statement 2 (value-producing): function name
    });                       // f == nested
    return f(6);              // return nested(6) == return x + 6
}
The code in the example is dressing this up further by using the dollar sign as a shortened identifier for a return type (another GCC extension, much less important to the functionality of the example). lambda in the example isn't a keyword or macro (but the dollar is supposed to make it look like one), it's just the name of the function (reused several times) being defined within the statement expression's scope. C's rules of scope nesting mean it's perfectly OK to reuse the same name within a deeper scope (nested "lambdas"), especially when there's no expectation of the body code using the name for any other purpose (lambdas are normally anonymous, so the functions aren't expected to "know" that they're actually called lambda).
If you read the GCC documentation for nested functions, you'll see that this technique is quite limited, though. Nested functions expire when the lifetime of their containing frame ends. That means they can't be returned, and they can't really be stored usefully. They can be passed up by pointer into other functions called from the containing frame that expect a normal function pointer, so they are fairly useful still. But they don't have anywhere near the flexibility of true lambdas, which take ownership (shared or total depends on the language) of the variables they close over, and can be passed in all directions as true values or stored for later use by a completely unrelated part of the program. The syntax is also fairly ungainly, even if you wrap it up in a lot of helper macros.
When this answer was written, true lambdas were proposed for the next version of the language, then called C2x; the proposal was not adopted in C23. The proposed form doesn't really look much like this (it copies the anonymous function syntax and semantics found in Objective-C). The functions created this way have lifetimes that can exceed their creating scope; the function bodies are true expressions, without the need for a statement-containing hack; and the functions themselves are truly anonymous, no intermediate names like lambda required.
Under that proposal, a version of the above example would look something like this:
#include <stdio.h>
int main(void) {
    typedef int (^ F)(int);
    __block F fact; // needs to be mutable - block can't copy-capture
                    // its own variable before initializing it
    fact = ^(int n) {
        return n == 0 ? 1 : n * fact(n - 1);
    };
    F (^ make_adder)(int) = ^(int n) {
        return _Closure_copy(^(int x) { return n + x; });
    };
    F add1 = make_adder(1);
    F mul3 = ^(int n) { return n * 3; };
    F (^ compose)(F, F) = ^(F f, F g) {
        return _Closure_copy(^(int n) { return f(g(n)); });
    };
    F mul3add1 = compose(mul3, add1);
    printf("%d\n", fact(5));
    printf("%d\n", mul3(add1(10)));
    printf("%d\n", mul3add1(10));
    _Closure_free(add1);
    _Closure_free(mul3add1);
    return 0;
}
Much simpler without all that union stuff.
(You can compile and run this modified example in Clang right now - use the -fblocks flag to enable the lambda extension, add #include <Block.h> to the top of the file, and replace _Closure_copy and _Closure_free with Block_copy and Block_release respectively.)

Segmentation Fault with Recursive Function

I am very new to programming in C, and can't seem to locate the cause of the segmentation error that I have been getting. The program I wrote is as follows:
# include <stdio.h>
# include <stdlib.h>
int recursive(int x){
    if(x=0)
    {
        return 2;
    }
    else
    {
        return 3*(x-1)+recursive(x-1)+1;
    }
}
int main(int argc, char *argv[])
{
    int N = atoi(argv[1]);
    return recursive(N);
}
I would appreciate any help.
Thanks a lot
if(x=0){...}
is wrong: = is an assignment, not a comparison. It should be
if(x==0){...}
Note:
if (x = 0)
is the same as:
x = 0; if (x)
This:
if(x=0){
is not a (pure) test, it's an assignment. It works in the if since the assignment also has a value (zero), but that value is always false, so the branch is never taken, i.e. the recursion never stops.
You should enable all compiler warnings (for example -Wall with GCC or Clang); this mistake is very commonly caught by compilers.
Change if(x = 0) to if(0 == x)
It is a good rule of thumb to write 0 == x instead of x == 0, because with a typo like = instead of == the compiler will give an error.
The segfault can also come from the use of argv[1]. Make sure you run your program with an argument, as follows:
$ ./a.out 6
where a.out is the name of your program and 6 is the number you want to apply the function to.
The following line will cause a segfault:
$ ./a.out
because no argument is given: argv[1] is then NULL, and passing NULL to atoi is undefined behavior (typically a crash).
Also, watch out for the second line of the function: use == instead of =
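Putting the answers together, a corrected version might look like the sketch below. The helper parse_arg and its upper bound of 1000 are assumptions added for illustration (the bound just keeps the recursion depth and the result small); the guard x <= 0 is also a deliberate change from x == 0 so that negative input cannot recurse forever.

```c
#include <stdlib.h>

/* r(0) = 2, r(x) = 3*(x-1) + r(x-1) + 1 */
int recursive(int x)
{
    if (x <= 0) {               /* comparison, not assignment */
        return 2;
    }
    return 3 * (x - 1) + recursive(x - 1) + 1;
}

/* Hypothetical helper: reads argv[1] as an int in [0, 1000].
   Returns 0 and stores the value in *out on success, -1 otherwise. */
int parse_arg(int argc, char *argv[], int *out)
{
    char *end;
    long v;

    if (argc < 2) {             /* no argument: argv[1] would be NULL */
        return -1;
    }
    v = strtol(argv[1], &end, 10);
    if (end == argv[1] || *end != '\0' || v < 0 || v > 1000) {
        return -1;
    }
    *out = (int)v;
    return 0;
}
```

In main, one would call parse_arg(argc, argv, &N) first and print a usage message instead of calling recursive(N) when it returns -1.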
