General protection / core dump "boundary" - c

I spoke with a friend about an algorithm in which I need to read data past the end of a "variable".
He said I can always do this in a safe way, but I disagree.
I know this is undefined behavior; however, the person who said it is really experienced in C. As you can see below, it really works for a small number of bytes.
Here is an example of the idea, though it is a bit "over the top".
#include <stdint.h>
#include <stdio.h>
const uint64_t a = 100;
int main(){
const char *s = (const char *) &a;
printf("%d\n", s[1530]); // this works
// printf("%d\n", s[15300]); // this doesn't
}
Access will be read-only and I will mask the result, so I don't care about reading junk.
In reality I need just 16 bytes after the variable.
Also, the variable does not need to be const, but it very well might be.
Is this safe, at least for a small number of bytes?

You and your friend are speaking at cross-purposes.
Undefined behaviour means "not defined in the standard," often to avoid constraining implementations.
Behaviour left undefined by the standard may be defined by a particular implementation, but relying on it may:
not be portable to other systems or implementations
not be portable to future versions of the same implementation
be badly supported anyway, depending on what guarantees the implementation makes
You haven't told us your platform or implementation, so all we can say is "this is undefined behaviour," because we only have the standard to go on.
If your implementation does make some guarantee that's relevant, and you have some platform-specific reason to believe it will keep working ... then at least bear in mind that the same code will not work elsewhere.
If, as seems more likely, the code just happens to do the right thing, at the moment, with the current code on the current version of one specific implementation ... then it could break at any time.
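If the goal is only to hand 16 bytes to a masked operation, one way to stay inside defined behaviour is to copy the object into a zero-padded scratch buffer and read from that instead. The following is a minimal sketch under that assumption; the helper name load_padded and the WINDOW constant are hypothetical and not from the question.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WINDOW 16 /* hypothetical: number of bytes the algorithm wants to read */

/* Copy at most WINDOW bytes of the object at src (whose size is len) into out
   and zero-fill the remainder, so no byte outside the object is ever read. */
static void load_padded(const void *src, size_t len, unsigned char out[WINDOW])
{
    size_t n = len < WINDOW ? len : WINDOW;
    memset(out, 0, WINDOW);
    memcpy(out, src, n);
}
```

The caller then reads all 16 bytes from out. This costs one small copy, but it does not depend on what happens to sit in memory after the variable.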

(Maybe I misunderstand what you are trying to do; if so, I will just delete this answer again :-) ).
If you are trying to read from a memory segment you don't have access to (it could belong to another process), most OS/CPU combinations will send your process a 'Segment violation' signal. This signal will usually terminate your process.
You can (only experimentally, I would say) intercept this signal, e.g.:
#include <stdio.h>
#include <signal.h>
#include <setjmp.h>
#include <stdint.h>
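static sigjmp_buf segv_env;
static void (*old_segv_sig)(int); /* previous handler, as returned by signal() */
/* sigsetjmp/siglongjmp restore the signal mask, so SIGSEGV is not left blocked */
static void segv_handler(int s) { siglongjmp(segv_env,s); }
int safe_read_check(void * mem,size_t sz)
{
int safe;
size_t i;
volatile char r; /* volatile so the reads are not optimized away */
safe = 0;
old_segv_sig = signal( SIGSEGV, segv_handler );
if( sigsetjmp( segv_env, 1 ) == 0 )
{
for( i = 0 ; i < sz ; i++)
r = ((volatile char*)mem)[i];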
// no signals
safe=1;
}
signal( SIGSEGV, old_segv_sig ); // restore signal handler
return safe;
}
const uint64_t a = 100;
int main()
{
const char *s = (const char *) &a;
if( safe_read_check( (void*)&s[153],16 ) )
printf("safe\n");
else
printf("unsafe\n");
if( safe_read_check( (void*)&s[15300000L], 16 ))
printf("safe\n");
else
printf("unsafe\n");
return 0;
}

Related

How to understand the type of storage of a pointer

I have as an homework this task:
Given a void** ptr_addr, write a function that returns 0 if the type of storage of *ptr_addr is static or automatic, and returns 1 if the type of storage of *ptr_addr is dynamic.
The language of the code must be C.
The problem is that I understand in theory what the task is about, but I don't know how to check this condition in code.
Thanks for the help!
Normally I don't do homework, but in cases like this I may make an exception.
Bear in mind that what I'm about to present is horrible code. Also it doesn't meet your requirements as stated — you'll have to adapt it for that. Also it may not meet your instructor's expectations: for an instructor demented enough to be assigning this task, I can't begin to guess his (her? its?) expectations. You may get dinged for using the technique I've presented, or for presenting someone else's work. Also I'm going to get dinged for presenting this code here on Stack Overflow, because no, it's nothing like portable or guaranteed to do anything, let alone to work. I have no idea whether it'll work on your system.
Nevertheless, and may God help me, I tested it, and it does "work" on a modern Debian Linux system.
#include <unistd.h>
extern char etext, edata, end; /* implicit int is rejected by current compilers */
char *
mcat(void *p)
{
int dummy;
if(p < &etext)
return "text";
else if(p < &edata)
return "data";
else if(p < &end)
return "bss";
else if(p < sbrk(0))
return "heap";
else if(p > &dummy)
return "stack";
else return "?";
}
You'll get a good number of warnings if you compile this, which could theoretically be silenced using some explicit casts, but I think the warnings are actually pretty appropriate, given the nefariousness of this code.
How it works: on at least some Unix-like systems, etext, edata, and end are magic symbols corresponding to the ends of the program's text, initialized data, and uninitialized data segments, respectively. sbrk(0) gives you a pointer to the top of the heap that a traditional implementation of malloc is using. And &dummy is a good approximation of the bottom of the stack.
Test program:
#include <stdio.h>
#include <stdlib.h>
int g = 2;
int g2;
int main()
{
int l;
static int s = 3;
static int s2;
int *p = malloc(sizeof(int));
printf("g: %s\n", mcat(&g));
printf("g2: %s\n", mcat(&g2));
printf("main: %s\n", mcat(main));
printf("l: %s\n", mcat(&l));
printf("s: %s\n", mcat(&s));
printf("s2: %s\n", mcat(&s2));
printf("p: %s\n", mcat(p));
}
On my test system this prints
g: data
g2: bss
main: text
l: stack
s: data
s2: bss
p: heap
I'd like to post a different approach to solve the problem:
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
// this function returns 1 if ptr has been allocated by malloc/calloc/realloc, otherwise 0
int is_pointer_heap(void* ptr) {
pid_t p = fork();
if (p == 0) {
(void) realloc(ptr, 1);
exit(0);
}
int status;
(void) waitpid(p, &status, 0);
return (status == 0) ? 1 : 0;
}
I wrote this (bad) code very quickly (and there's lot of room for improvements), but I tested it and it seems to work.
EXPLANATION: on the system I tested, realloc() crashes the process if the argument passed to it is not a malloc/calloc/realloc-allocated pointer. Strictly speaking this is undefined behaviour, so a crash is not guaranteed. Here we create a new child process and let the child call realloc(); if the child process crashes, we return 0, otherwise we return 1.
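The status filled in by waitpid() encodes more than "zero or not zero", and the POSIX macros WIFEXITED and WEXITSTATUS can tell a clean exit apart from death by signal. Below is a hedged sketch of the same fork-and-probe idea with that check added. The names run_in_child, probe_ok and probe_abort are hypothetical, and the probes deliberately just return or abort instead of calling realloc() on a foreign pointer.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run fn(arg) in a child process.
   Returns 1 if the child exited normally with status 0,
   0 if it was killed by a signal or exited with a non-zero status,
   -1 if fork() or waitpid() failed. */
static int run_in_child(void (*fn)(void *), void *arg)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        fn(arg);
        _exit(0); /* leave without running the parent's atexit handlers */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 1 : 0;
}

/* Two illustrative probes: one returns normally, one dies from SIGABRT. */
static void probe_ok(void *arg)    { (void)arg; }
static void probe_abort(void *arg) { (void)arg; abort(); }
```

With these, run_in_child(probe_ok, NULL) reports a clean exit and run_in_child(probe_abort, NULL) reports a crash, without the parent being affected.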

Executing code in mmap to produce executable code segfaults

I'm trying to write a function that copies a function (and ends up modifying its assembly) and returns it. This works fine for one level of indirection, but at two I get a segfault.
Here is a minimum (not)working example:
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#define BODY_SIZE 100
int f(void) { return 42; }
int (*G(void))(void) { return f; }
int (*(*H(void))(void))(void) { return G; }
int (*g(void))(void) {
void *r = mmap(0, BODY_SIZE, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
memcpy(r, f, BODY_SIZE);
return r;
}
int (*(*h(void))(void))(void) {
void *r = mmap(0, BODY_SIZE, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
memcpy(r, g, BODY_SIZE);
return r;
}
int main() {
printf("%d\n", f());
printf("%d\n", G()());
printf("%d\n", g()());
printf("%d\n", H()()());
printf("%d\n", h()()()); // This one fails - why?
return 0;
}
I can memcpy into an mmap'ed area once to create a valid function that can be called (g()()). But if I try to apply it again (h()()()) it segfaults. I have confirmed that it correctly creates the copied version of g, but when I execute that version I get a segfault.
Is there some reason why I can't execute code in one mmap'ed area from another mmap'ed area? From exploratory gdb-ing with x/i checks, it seems like I can call down successfully, but when I return, the function I came from has been erased and replaced with 0s.
How can I get this behaviour to work? Is it even possible?
BIG EDIT:
Many have asked for my rationale, as I am obviously posing an XY problem here. That is true and intentional. You see, a little under a month ago this question was posted on the code golf stack exchange. It also got itself a nice bounty for a C/Assembly solution. I gave some idle thought to the problem and realized that, by copying a function's body while stubbing out an address with some unique value, I could search its memory for that value and replace it with a valid address, thus allowing me to effectively create lambda functions that take a single pointer as an argument. Using this I could get single currying working, but I need the more general currying. My current partial solution is linked here; it is the full code that exhibits the segfault I am trying to avoid. While this is pretty much the definition of a bad idea, I find it entertaining and would like to know whether my approach is viable. The only thing I'm missing is the ability to run a function created from a function, but I can't get that to work.
The code is using relative calls to invoke mmap and memcpy so the copied code ends up calling an invalid location.
You can invoke them through a pointer, e.g.:
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#define BODY_SIZE 100
void* (*mmap_ptr)(void *addr, size_t length, int prot, int flags,
int fd, off_t offset) = mmap;
void* (*memcpy_ptr)(void *dest, const void *src, size_t n) = memcpy;
int f(void) { return 42; }
int (*G(void))(void) { return f; }
int (*(*H(void))(void))(void) { return G; }
int (*g(void))(void) {
void *r = mmap_ptr(0, BODY_SIZE, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
memcpy_ptr(r, f, BODY_SIZE);
return r;
}
int (*(*h(void))(void))(void) {
void *r = mmap_ptr(0, BODY_SIZE, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
memcpy_ptr(r, g, BODY_SIZE);
return r;
}
int main() {
printf("%d\n", f());
printf("%d\n", G()());
printf("%d\n", g()());
printf("%d\n", H()()());
printf("%d\n", h()()()); // the call that segfaulted in the question
return 0;
}
I'm trying to write a function that copies a function
I think that is pragmatically not the right approach, unless you know the machine code for your platform very well (and then you would not ask the question). Be aware of position independent code (useful because in general mmap(2) would use ASLR and give some "randomness" in the addresses). BTW, genuine self-modifying machine code (i.e. changing some bytes of some existing valid machine code) is today cache and branch-predictor unfriendly and should be avoided in practice.
I suggest two related approaches (choose one of them).
Generate some temporary C file (see also this), e.g. /tmp/generated.c, then fork a compilation of it into a plugin using gcc -Wall -g -O -fPIC /tmp/generated.c -shared -o /tmp/generated.so, then dlopen(3) (for dynamic loading) that /tmp/generated.so shared object plugin (and probably use dlsym(3) to find function pointers in it...). For more about shared objects, read Drepper's How To Write Shared Libraries paper. Today, you can dlopen many hundreds of thousands of such shared libraries (see my manydl.c example) and C compilers (like recent GCC) are fast enough to compile a few thousand lines of code in a time compatible with interaction (e.g. less than a tenth of a second). Generating C code is a widely used practice. In practice you would represent the generated C code as some AST in memory before emitting it.
Use some JIT compilation library, such as GCCJIT, or LLVM, or libjit, or asmjit, etc.... which would generate a function in memory, do the required relocations, and give you some pointer to it.
BTW, instead of coding in C, you might consider using some homoiconic language implementation (such as SBCL for Common Lisp, which compiles to machine code at every REPL interaction, or any dynamically constructed S-expr program representation).
The notions of closures and of callbacks are worthwhile to know. Read SICP and perhaps Lisp In Small Pieces (and of course the Dragon Book, for general compiler culture).
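In plain C, a closure is commonly represented as a function pointer paired with an explicit environment, which gives a currying effect without copying any machine code. The sketch below illustrates that representation only; the names closure, add_captured, curry_add and call are hypothetical and do not come from the question or the linked challenge.

```c
#include <stdlib.h>

/* A closure: a code pointer plus the value it has captured. */
struct closure {
    int (*fn)(const struct closure *self, int arg);
    int captured;
};

static int add_captured(const struct closure *self, int arg)
{
    return self->captured + arg;
}

/* "Curry" addition: fix the first operand and return a callable object.
   Returns NULL if allocation fails; the caller frees the result. */
static struct closure *curry_add(int first)
{
    struct closure *c = malloc(sizeof *c);
    if (c != NULL) {
        c->fn = add_captured;
        c->captured = first;
    }
    return c;
}

/* Invoke a closure with one more argument. */
static int call(const struct closure *c, int arg)
{
    return c->fn(c, arg);
}
```

The price is that callers must pass the closure object around explicitly instead of calling a bare function pointer, which is why this does not satisfy the code-golf requirement of returning a plain callable.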
this question was posted on code golf.SE
I updated the 8086 16-bit code-golf answer on the sum-of-args currying question to include commented disassembly.
You might be able to use the same idea in 32-bit code with a stack-args calling convention to make a modified copy of a machine code function that tacks on a push imm32. It wouldn't be fixed-size anymore, though, so you'd need to update the function size in the copied machine code.
In normal calling conventions, the first arg is pushed last, so you can't just append another push imm32 before a fixed-size call target / leave / ret trailer. If writing a pure asm answer, you could use an alternate calling convention where args are pushed in the other order. Or you could have a fixed-size intro, then an ever-growing sequence of push imm32 + call / leave / ret.
The currying function itself could use a register-arg calling convention, even if you want the target function to use i386 System V for example (stack args).
You'd definitely want to simplify by not supporting args wider than 32 bit, so no structs by value, and no double. (Of course you could chain multiple calls to the currying function to build up a larger arg.)
Given the way the new code-golf challenge is written, I guess you'd compare the total number of curried args against the number of args the target "input" function takes.
I don't think there's any chance you can make this work in pure C with just memcpy; you have to modify the machine code.

Crazy macro hack for handling issues with thread cancellations and cleanup handlers

This is a really long question due to the code snippets and the detailed explanations. TL;DR, are there issues with the macros shown below, is this a reasonable solution, and if not then what is the most reasonable way to solve the issues presented below?
I am currently writing a C library which deals with POSIX threads, and must be able to handle thread cancellation cleanly. In particular, the library functions may be called from threads that were set to be cancellable (either PTHREAD_CANCEL_DEFERRED or PTHREAD_CANCEL_ASYNCHRONOUS canceltype) by the user.
Currently the library functions that interface with the user all begin with a call to pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &oldstate), and at each return point, I make sure that a call to pthread_setcancelstate(oldstate, &dummy) is made to restore whatever cancellation settings the thread had previously.
This basically prevents the thread from being cancelled while in the library code, thus ensuring that the global state remains consistent and resources were properly managed before returning.
This method unfortunately has a few drawbacks:
One must be sure to restore the cancelstate at every return point. This makes it somewhat hard to manage if the function has nontrivial control flow with multiple return points. Forgetting to do so may lead to threads that don't get cancelled even after return from the library.
We only really need to prevent cancellations at points where resources are being allocated or global state is inconsistent. A library function may in turn call other internal library functions that are cancel-safe, and ideally cancellations could occur at such points.
Here is a sample illustration of the issues:
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <pthread.h>
static void do_some_long_computation(char *buffer, size_t len)
{
(void)buffer; (void)len;
/* This is really, really long! */
}
int mylib_function(size_t len)
{
char *buffer;
int fd;
int oldstate, oldstate2;
pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &oldstate);
buffer = malloc(len);
if (buffer == NULL) {
pthread_setcancelstate(oldstate, &oldstate2);
return -1;
}
do_some_long_computation(buffer, len);
fd = open("results.txt", O_WRONLY);
if (fd < 0) {
free(buffer);
pthread_setcancelstate(oldstate, &oldstate2);
return -1;
}
write(fd, buffer, len); /* Normally also do error-check */
close(fd);
free(buffer);
pthread_setcancelstate(oldstate, &oldstate2);
return 0;
}
Here it is not so bad because there are only 3 return points. One could possibly even restructure the control flow in such a way as to force all paths to reach a single return point, perhaps with the goto cleanup pattern. But the second issue is still left unresolved. And imagine having to do that for many library functions.
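The goto cleanup restructuring mentioned above might look like the following sketch. It addresses only the first issue (one place where the cancel state is restored); cancellation stays disabled for the whole call. The function name mylib_function_single_exit and the path parameter are assumptions added for illustration, and memset stands in for the long computation.

```c
#include <fcntl.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Every path leaves through the cleanup label, so the cancel state is
   restored in exactly one place. Returns 0 on success, -1 on failure. */
int mylib_function_single_exit(const char *path, size_t len)
{
    char *buffer = NULL;
    int fd = -1;
    int rc = -1;
    int oldstate, ignored;

    pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &oldstate);

    buffer = malloc(len);
    if (buffer == NULL)
        goto cleanup;
    memset(buffer, 'x', len); /* stand-in for the long computation */

    fd = open(path, O_WRONLY);
    if (fd < 0)
        goto cleanup;
    if (write(fd, buffer, len) != (ssize_t)len)
        goto cleanup;
    rc = 0;

cleanup:
    if (fd >= 0)
        close(fd);
    free(buffer); /* free(NULL) is a no-op */
    pthread_setcancelstate(oldstate, &ignored);
    return rc;
}
```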
The second issue may be resolved by wrapping each resource allocation with calls to pthread_setcancelstate that will only disable cancellations during resource allocation. While cancellations are disabled, we also push a cleanup handler (with pthread_cleanup_push). One could also move all resource allocations together (opening the file before doing the long computation).
While solving the second issue, it is still somewhat hard to maintain because each resource allocation needs to be wrapped under these pthread_setcancelstate and pthread_cleanup_[push|pop] calls. Also it might not always be possible to put all resource allocations together, for instance if they depend on the results of the computation. Moreover, the control flow needs to be changed because one cannot return between a pthread_cleanup_push and pthread_cleanup_pop pair (which would be the case if malloc returns NULL for example).
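For a single resource, the wrapping described above could be sketched as follows. Cancellation is disabled only while the buffer is allocated and its handler is registered, and the function avoids returning between the push and the pop by recording the result in rc. The names free_handler and mylib_compute are hypothetical, and memset again stands in for the long computation.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

static void free_handler(void *arg)
{
    free(arg);
}

/* Returns 0 on success, -1 if the allocation failed. */
int mylib_compute(size_t len)
{
    char *buffer;
    int oldstate, ignored;
    int rc = 0;

    pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &oldstate);
    buffer = malloc(len);
    /* Push unconditionally: free(NULL) is a no-op, and we must not
       return between pthread_cleanup_push and pthread_cleanup_pop. */
    pthread_cleanup_push(free_handler, buffer);
    pthread_setcancelstate(oldstate, &ignored);

    if (buffer == NULL)
        rc = -1;
    else
        memset(buffer, 0, len); /* stand-in for the long, cancellable computation */

    pthread_cleanup_pop(1); /* runs free_handler(buffer) */
    return rc;
}
```

Each additional resource needs its own disable, push, restore and pop sequence, nested inside the previous one, which is the maintenance burden the paragraph above describes.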
In order to solve both issues, I came up with another possible method that involves dirty hacks with macros. The idea is to simulate something like a critical section block in other languages, to insert a block of code in a "cancel-safe" scope.
This is what the library code would look like (compile with -c -Wall -Wextra -pedantic):
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <pthread.h>
#include "cancelsafe.h"
static void do_some_long_computation(char *buffer, size_t len)
{
(void)buffer; (void)len;
/* This is really, really long! */
}
static void free_wrapper(void *arg)
{
free(*(void **)arg);
}
static void close_wrapper(void *arg)
{
close(*(int *)arg);
}
int mylib_function(size_t len)
{
char *buffer;
int fd;
int rc;
rc = 0;
CANCELSAFE_INIT();
CANCELSAFE_PUSH(free_wrapper, buffer) {
buffer = malloc(len);
if (buffer == NULL) {
rc = -1;
CANCELSAFE_BREAK(buffer);
}
}
do_some_long_computation(buffer, len);
CANCELSAFE_PUSH(close_wrapper, fd) {
fd = open("results.txt", O_WRONLY);
if (fd < 0) {
rc = -1;
CANCELSAFE_BREAK(fd);
}
}
write(fd, buffer, len);
CANCELSAFE_POP(fd, 1); /* close fd */
CANCELSAFE_POP(buffer, 1); /* free buffer */
CANCELSAFE_END();
return rc;
}
This resolves both issues to some extent. The cancelstate settings and cleanup push/pop calls are implicit in the macros, so the programmer only has to specify the sections of code that need to be cancel-safe and what cleanup handlers to push. The rest is done behind the scenes, and the compiler will make sure each CANCELSAFE_PUSH is paired with a CANCELSAFE_POP.
The implementation of the macros is as follows:
#define CANCELSAFE_INIT() \
do {\
int CANCELSAFE_global_stop = 0
#define CANCELSAFE_PUSH(cleanup, ident) \
do {\
int CANCELSAFE_oldstate_##ident, CANCELSAFE_oldstate2_##ident;\
int CANCELSAFE_stop_##ident;\
\
if (CANCELSAFE_global_stop)\
break;\
\
pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &CANCELSAFE_oldstate_##ident);\
pthread_cleanup_push(cleanup, &ident);\
for (CANCELSAFE_stop_##ident = 0; CANCELSAFE_stop_##ident == 0 && CANCELSAFE_global_stop == 0; CANCELSAFE_stop_##ident = 1, pthread_setcancelstate(CANCELSAFE_oldstate_##ident, &CANCELSAFE_oldstate2_##ident))
#define CANCELSAFE_BREAK(ident) \
do {\
CANCELSAFE_global_stop = 1;\
pthread_setcancelstate(CANCELSAFE_oldstate_##ident, &CANCELSAFE_oldstate2_##ident);\
goto CANCELSAFE_POP_LABEL_##ident;\
} while (0)
#define CANCELSAFE_POP(ident, execute) \
CANCELSAFE_POP_LABEL_##ident:\
pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &CANCELSAFE_oldstate_##ident);\
pthread_cleanup_pop(execute);\
pthread_setcancelstate(CANCELSAFE_oldstate_##ident, &CANCELSAFE_oldstate2_##ident);\
} while (0)
#define CANCELSAFE_END() \
} while (0)
This combines several macro tricks that I have encountered before.
The do { } while (0) pattern is used to have a multiline function-like macro (with semicolon required).
The CANCELSAFE_PUSH and CANCELSAFE_POP macros are forced to come in pairs by the use of the same trick as the pthread_cleanup_push and pthread_cleanup_pop using unmatched { and } braces respectively (here it is unmatched do { and } while (0) instead).
The usage of the for loops is somewhat inspired by this question. The idea is that we want to call the pthread_setcancelstate function after the macro body to restore cancellations after the CANCELSAFE_PUSH block. I use a stop flag that is set to 1 at the second loop iteration.
The ident is the name of the variable that will be released (this needs to be a valid identifier). The cleanup_wrappers will be given its address, which will always be valid in a cleanup handler scope according to this answer. This is done because the value of the variable is not yet initialized at the point of cleanup push (and also doesn't work if the variable is not of pointer type).
The ident is also used to avoid name collisions in the temporary variables and labels by appending it as a suffix with the ## concatenation macro, giving them unique names.
The CANCELSAFE_BREAK macro is used to jump out of the cancelsafe block and right into the corresponding CANCELSAFE_POP_LABEL. This is inspired by the goto cleanup pattern, as mentioned here. It also sets the global stop flag.
The global stop is used to avoid cases where there might be two PUSH/POP pairs in the same scope level. This seems like an unlikely situation, but if this happens then the content of the macros is basically skipped when the global stop flag is set to 1. The CANCELSAFE_INIT and CANCELSAFE_END macros aren't crucial, they just avoid the need to declare the global stop flag ourselves. These could be skipped if the programmer always does all the pushes and then all the pops consecutively.
After expanding the macros, we obtain the following code for the mylib_function:
int mylib_function(size_t len)
{
char *buffer;
int fd;
int rc;
rc = 0;
do {
int CANCELSAFE_global_stop = 0;
do {
int CANCELSAFE_oldstate_buffer, CANCELSAFE_oldstate2_buffer;
int CANCELSAFE_stop_buffer;
if (CANCELSAFE_global_stop)
break;
pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &CANCELSAFE_oldstate_buffer);
pthread_cleanup_push(free_wrapper, &buffer);
for (CANCELSAFE_stop_buffer = 0; CANCELSAFE_stop_buffer == 0 && CANCELSAFE_global_stop == 0; CANCELSAFE_stop_buffer = 1, pthread_setcancelstate(CANCELSAFE_oldstate_buffer, &CANCELSAFE_oldstate2_buffer)) {
buffer = malloc(len);
if (buffer == NULL) {
rc = -1;
do {
CANCELSAFE_global_stop = 1;
pthread_setcancelstate(CANCELSAFE_oldstate_buffer, &CANCELSAFE_oldstate2_buffer);
goto CANCELSAFE_POP_LABEL_buffer;
} while (0);
}
}
do_some_long_computation(buffer, len);
do {
int CANCELSAFE_oldstate_fd, CANCELSAFE_oldstate2_fd;
int CANCELSAFE_stop_fd;
if (CANCELSAFE_global_stop)
break;
pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &CANCELSAFE_oldstate_fd);
pthread_cleanup_push(close_wrapper, &fd);
for (CANCELSAFE_stop_fd = 0; CANCELSAFE_stop_fd == 0 && CANCELSAFE_global_stop == 0; CANCELSAFE_stop_fd = 1, pthread_setcancelstate(CANCELSAFE_oldstate_fd, &CANCELSAFE_oldstate2_fd)) {
fd = open("results.txt", O_WRONLY);
if (fd < 0) {
rc = -1;
do {
CANCELSAFE_global_stop = 1;
pthread_setcancelstate(CANCELSAFE_oldstate_fd, &CANCELSAFE_oldstate2_fd);
goto CANCELSAFE_POP_LABEL_fd;
} while (0);
}
}
write(fd, buffer, len);
CANCELSAFE_POP_LABEL_fd:
pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &CANCELSAFE_oldstate_fd);
pthread_cleanup_pop(1);
pthread_setcancelstate(CANCELSAFE_oldstate_fd, &CANCELSAFE_oldstate2_fd);
} while (0);
CANCELSAFE_POP_LABEL_buffer:
pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &CANCELSAFE_oldstate_buffer);
pthread_cleanup_pop(1);
pthread_setcancelstate(CANCELSAFE_oldstate_buffer, &CANCELSAFE_oldstate2_buffer);
} while (0);
} while (0);
return rc;
}
Now, this set of macros is horrendous to look at and it is somewhat tricky to understand exactly how they work. On the other hand, writing them is a one-time task; once written, they can be left alone and the rest of the project can benefit from them.
I would like to know if there are any issues with the macros that I may have overlooked, and whether there could be a better way to implement similar functionality. Also, which of the solutions proposed do you think would be the most reasonable? Are there other ideas that could work better to resolve these issues (or perhaps, are they really non-issues)?
Unless you use asynchronous cancellation (which is always very problematic), you do not have to disable cancellation around malloc and free (and many other POSIX functions). Deferred cancellation only happens at cancellation points, and these functions are not cancellation points.
You are abusing the POSIX cancellation handling facilities to implement a scope exit hook. In general, if you find yourself doing things like this in C, you should seriously consider using C++ instead. This will give you a much more polished version of the feature, with ample documentation, and programmers will already have experience with it.

Accessing the variable inside another code

Is there a way to access a variable initialized in one code from another code. For eg. my code1.c is as follows,
#include <stdio.h>
#include <unistd.h>
int main()
{
int a=4;
sleep(99);
printf("%d\n", a);
return 0;
}
Now, is there any way that I can access the value of a from inside another C code (code2.c)? I am assuming, I have all the knowledge of the variable which I want to access, but I don't have any information about its address in the RAM. So, is there any way?
I know about the extern, what I am asking for here is a sort of backdoor. Like, kind of searching for the variable in the RAM based on some properties.
Your example has one caveat, setting aside possible optimizations that would make the variable disappear: variable a only exists while the function is being executed and has not yet finished.
Well, given that the function is main() it shouldn't be a problem, at least, for standard C programs, so if you have a program like this:
# include <stdio.h>
int main()
{
int a=4;
printf("%d\n", a);
return 0;
}
Chances are that this code will call some functions. If one of them needs to access a to read and write to it, just pass a pointer to a as an argument to the function.
#include <stdio.h>
void somefunction (int *n);
int main()
{
int a=4;
somefunction(&a);
printf("%d\n", a);
return 0;
}
void somefunction (int *n)
{
/* Whatever you do with *n you are actually
doing it with a */
(*n)++; /* actually increments a (*n++ would advance the pointer instead) */
}
But if the function that needs to access a is deep in the function call stack, all the parent functions need to pass the pointer to a even if they don't use it, adding clutter and lowering the readability of code.
The usual solution is to declare a as global, making it accessible to every function in your code. If that scenario is to be avoided, you can make a visible only to the functions that need to access it. To do that, you need to have a single source code file with all the functions that need to use a. Then, declare a as a static global variable. That way, only the functions written in the same source file will know about a, and no pointer will be needed. It doesn't matter how deeply nested the functions are in the call stack: intermediate functions won't need to pass any additional information for a nested function to know about a.
So, you would have code1.c with main() and all the functions that need to access a
/* code1.c */
# include <stdio.h>
static int a;
void somefunction (void);
int main()
{
a=4;
somefunction();
printf("%d\n", a);
return 0;
}
void somefunction (void)
{
a++;
}
/* end of code1.c */
About trying to figure out where in RAM is a specific variable stored:
Kind of. You can travel across function stack frames from yours to the main() stack frame, and inside those stack frames lie the local variables of each function, but there is no supplementary information in RAM about what variable is located at what position, and the compiler may choose to put it wherever it likes within the stack frame (or even in a register, so there would be no trace of it in RAM, except for pushes and pops from/to general registers, which would be even harder to follow).
So unless that variable has a non trivial value, it's the only local variable in its stack frame, compiler optimizations have been disabled, your code is aware of the architecture and calling conventions being used, and the variable is declared as volatile to stop being stored in a CPU register, I think there is no safe and/or portable way to find it out.
OTOH, if your program has been compiled with -g flag, you might be able to read debugging information from within your program and find out where in the stack frame the variable is, and crawl through it to find it.
code1.c:
#include <stdio.h>
void doSomething(); // so that we can use the function from code2.c
int a = 4; // global variable accessible in all functions defined after this point
int main()
{
printf("main says %d\n", a);
doSomething();
printf("main says %d\n", a);
return 0;
}
code2.c
#include <stdio.h>
extern int a; // gain access to variable from code1.c
void doSomething()
{
a = 3;
printf("doSomething says %d\n", a);
}
output:
main says 4
doSomething says 3
main says 3
You can use extern int a; in every file in which you must use a (code2.c in this case), except for the file in which it is defined without extern (code1.c in this case). For this approach to work you must define your a variable globally (not inside a function).
One approach is to have the separate executable have the same stack layout as the program in question (since the variable is placed on the stack, and we need the relative address of the variable), therefore compile it with the same or similar compiler version and options, as much as possible.
On Linux, we can read the running code's data with ptrace(PTRACE_PEEKDATA, pid, …). Since on current Linux systems the start address of the stack varies, we have to account for that; fortunately, this address can be obtained from the 28th field of /proc/…/stat.
The following program (compiled with cc Debian 4.4.5-8 and no code generator option on Linux 2.6.32) works; the pid of the running program has to be specified as the program argument.
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
void *startstack(char *pid)
{ // The address of the start (i. e. bottom) of the stack.
char str[FILENAME_MAX];
FILE *fp = fopen(strcat(strcat(strcpy(str, "/proc/"), pid), "/stat"), "r");
if (!fp) perror(str), exit(1);
if (!fgets(str, sizeof str, fp)) exit(1);
fclose(fp);
unsigned long address;
int i = 28; char *s = str; while (--i) s += strcspn(s, " ") + 1;
sscanf(s, "%lu", &address);
return (void *)address;
}
static int access(void *a, char *pidstr)
{
if (!pidstr) return 1;
int pid = atoi(pidstr);
if (ptrace(PTRACE_ATTACH, pid, 0, 0) < 0) return perror("PTRACE_ATTACH"), 1;
int status;
// wait for program being signaled as stopped
if (wait(&status) < 0) return perror("wait"), 1;
// relocate variable address to stack of program in question
a = a-startstack("self")+startstack(pidstr);
int val;
if (errno = 0, val = ptrace(PTRACE_PEEKDATA, pid, a, 0), errno)
return perror("PTRACE_PEEKDATA"), 1;
printf("%d\n", val);
return 0;
}
int main(int argc, char *argv[])
{
int a;
return access(&a, argv[1]);
}
Another, more demanding approach would be, as mcleod_ideafix indicated at the end of his answer, to implement the bulk of a debugger and use the debug information (provided it is present) to locate the variable.

How can I throw an exception in C?

I typed this into Google, but I only found how-tos in C++.
How can I do it in C?
There are no exceptions in C. In C, errors are reported through the return value of the function, the exit status of the process, signals sent to the process (see Program Error Signals in the GNU libc manual), or a CPU hardware interrupt or other fault raised by the CPU, if there is one (see "How processor handles the case of division by zero").
Exceptions are defined in C++ and other languages though. Exception handling in C++ is specified in the C++ standard "S.15 Exception handling", there is no equivalent section in the C standard.
In C you could use the combination of the setjmp() and longjmp() functions, defined in setjmp.h. Example from Wikipedia:
#include <stdio.h>
#include <setjmp.h>
static jmp_buf buf;
void second(void) {
    printf("second\n");         // prints
    longjmp(buf,1);             // jumps back to where setjmp
                                // was called - making setjmp now return 1
}
void first(void) {
    second();
    printf("first\n");          // does not print
}
int main() {
    if ( ! setjmp(buf) ) {
        first();                // when executed, setjmp returns 0
    } else {                    // when longjmp jumps back, setjmp returns 1
        printf("main");         // prints
    }
    return 0;
}
Note: I would actually advise you not to use them, as they work badly with C++ (destructors of local objects wouldn't get called) and it is really hard to understand what is going on. Return some kind of error instead.
There's no built-in exception mechanism in C; you need to simulate exceptions and their semantics. This is usually achieved by relying on setjmp and longjmp.
There are quite a few libraries around, and I'm implementing yet another one. It's called exceptions4c; it's portable and free. You may take a look at it, and compare it against other alternatives to see which suits you best.
Plain old C doesn't actually support exceptions natively.
You can use alternative error handling strategies, such as:
returning an error code
returning FALSE and using a last_error variable or function.
See http://en.wikibooks.org/wiki/C_Programming/Error_handling.
C is able to throw C++ exceptions. It is machine code anyway.
For example, in file bar.c:
#include <stdlib.h>
#include <stdint.h>
extern void *__cxa_allocate_exception(size_t thrown_size);
extern void __cxa_throw (void *thrown_exception, void* *tinfo, void (*dest) (void *) );
extern void * _ZTIl; // typeinfo of long
int bar1()
{
    int64_t * p = (int64_t*)__cxa_allocate_exception(8);
    *p = 1976;
    __cxa_throw(p, &_ZTIl, 0);
    return 10;
}
In file a.cc,
#include <stdint.h>
#include <cstdio>
extern "C" int bar1();
void foo()
{
    try
    {
        bar1();
    }
    catch(int64_t x)
    {
        printf("good %ld", x);
    }
}
int main(int argc, char *argv[])
{
    foo();
    return 0;
}
To compile it:
gcc -o bar.o -c bar.c && g++ a.cc bar.o && ./a.out
Output
good 1976
https://itanium-cxx-abi.github.io/cxx-abi/abi-eh.html has more detailed information about __cxa_throw.
I am not sure whether it is portable or not; I tested it with 'gcc-4.8.2' on Linux.
This question is super old, but I just stumbled across it and thought I'd share a technique: divide by zero, or dereference a null pointer.
The question is simply "how to throw", not how to catch, or even how to throw a specific type of exception. I had a situation ages ago where we needed to trigger an exception from C to be caught in C++. Specifically, we had occasional reports of "pure virtual function call" errors, and needed to convince the C runtime's _purecall function to throw something. So we added our own _purecall function that divided by zero, and boom, we got an exception that we could catch in C++, and even use some stack fun to see where things went wrong.
On Windows with Microsoft Visual C++ (MSVC) there's __try ... __except ..., but it's really horrible and you don't want to use it if you can possibly avoid it. Better to say that there are no exceptions.
C doesn't have exceptions.
There are various hacky implementations that try to do it (one example at: http://adomas.org/excc/).
As mentioned in numerous threads, the "standard" way of doing this is using setjmp/longjmp. I posted yet another such solution to https://github.com/psevon/exceptions-and-raii-in-c
This is, to my knowledge, the only solution that relies on automatic cleanup of allocated resources. It implements unique and shared smart pointers, and allows intermediate functions to let exceptions pass through without catching them and still have their locally allocated resources cleaned up properly.
C doesn't support exceptions. You can try compiling your C code as C++ with Visual Studio or G++ and see if it'll compile as-is. Most C applications will compile as C++ without major changes, and you can then use the try... catch syntax.
If you write code with the happy path design pattern (for example, for an embedded device), you may simulate exception-style error processing (also known as deferring or "finally" emulation) with the goto statement.
int process(int port)
{
    int rc = -1;  // assume failure until everything has succeeded
    int fd1 = -1; // -1 marks "not opened", so cleanup can tell
    int fd2 = -1;
    fd1 = open("/dev/...", ...);
    if (fd1 == -1)
        goto out;
    fd2 = open("/dev/...", ...);
    if (fd2 == -1)
        goto out;
    // Do something with fd1 and fd2, for example write(fd2, read(fd1))
    rc = 0;
out:
    // Close only the descriptors that were actually opened.
    if (fd2 != -1)
        (void)close(fd2);
    if (fd1 != -1)
        (void)close(fd1);
    return rc;
}
It is not actually an exception handler, but it gives you a way to handle errors at function exit.
P.S.: You should be careful to use goto only from the same or deeper scopes, and never jump over a variable declaration.
Implementing exceptions in C by Eric Roberts.
Chapter 4 of C Interfaces and Implementations by Hanson.
A Discipline of Error Handling by Doug Moen
Implementing Exceptions in C (details the article of E. Roberts)
In C we can't use try/catch to handle errors,
but if you can use Windows.h, you can do this:
#include <stdio.h>
#include <Windows.h>
#include <setjmp.h>
jmp_buf Buf;
LONG NTAPI Error_Handler(struct _EXCEPTION_POINTERS *ExceptionInfo)
{
    printf("an error occurred, my friend!!!\r\n");
    longjmp(Buf, 1);
    return EXCEPTION_CONTINUE_SEARCH; // not reached
}
int main(void)
{
    AddVectoredExceptionHandler(1, Error_Handler);
    int x = 0;
    printf("start main\r\n");
    if (setjmp(Buf) == 0)
    {
        int y = 1 / x; // integer division by zero raises the exception
    }
    printf("end main\r\n");
    return 0;
}

Resources