I am trying to learn multithreaded programming in C and to understand a basic program. I could not understand the runner function: why does it return a pointer to void and take a parameter that is also a pointer to void? Also, I could not understand the parameters of main.
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

int sum; /* this data is shared by the thread(s) */
void *runner(void *param); /* the thread */

int main(int argc, char *argv[])
{
    pthread_t tid; /* the thread identifier */
    pthread_attr_t attr; /* set of thread attributes */
    if (argc != 2) {
        fprintf(stderr,"usage: a.out <integer value>\n");
        return -1;
    }
    if (atoi(argv[1]) < 0) {
        fprintf(stderr,"%d must be >= 0\n",atoi(argv[1]));
        return -1;
    }
    /* get the default attributes */
    pthread_attr_init(&attr);
    /* create the thread */
    pthread_create(&tid,&attr,runner,argv[1]);
    /* wait for the thread to exit */
    pthread_join(tid, NULL);
    printf("sum = %d\n",sum);
    return 0;
}

/* The thread will begin control in this function */
void *runner(void *param)
{
    int i, upper = atoi(param);
    sum = 0;
    for (i = 1; i <= upper; i++)
        sum += i;
    pthread_exit(0);
}
The parameters to main first. argc is the number of command-line parameters, including the program name. argv is an array of pointers to zero-delimited strings, which are the parameters themselves. So, if you run your program from the command-line like this:
myprog x y z
Then argc will be 4, argv will look like this:
argv[0]: "myprog"
argv[1]: "x"
argv[2]: "y"
argv[3]: "z"
argv[4]: NULL
The final element, argv[argc], is a null pointer, not an empty string. The exact format of the first element (program name) varies depending on the operating system and the exact way the program is called.
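To make the layout of argv concrete, here is a minimal sketch. It relies only on the C standard's guarantee that argv[argc] is a null pointer; count_args and print_args are hypothetical helper names made up for this illustration.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: counts the entries by walking argv until the
   terminating null pointer, without looking at argc at all. */
int count_args(char *argv[])
{
    int n = 0;
    while (argv[n] != NULL)
        n++;
    return n;
}

/* Hypothetical helper: prints every argument together with its index. */
void print_args(int argc, char *argv[])
{
    for (int i = 0; i < argc; i++)
        printf("argv[%d]: \"%s\"\n", i, argv[i]);
}
```

Calling count_args(argv) from main should give the same number as argc, since both describe the same array.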
Your runner function is a type of function generally known as a callback. It is called by someone else (the pthread library). In order for someone else to call your function, it has to know its return type and parameters, so these are fixed, even when they are not used.
So runner has to return a void * (untyped pointer) and take a void * parameter, even when it does not actually use either (it can return NULL). It is that way because that is what the pthread library expects.
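As a rough sketch of why void * is useful here: the same untyped pointer can carry any structure into the thread and back out through pthread_join. The names sum_job, sum_runner and sum_in_thread are hypothetical and not part of the program in the question.

```c
#include <pthread.h>

/* Hypothetical bundle passed through the void * parameter. */
struct sum_job {
    int upper;   /* input: sum the integers 1..upper */
    long result; /* output: filled in by the thread */
};

/* Thread start routine: must take a void * and return a void *. */
static void *sum_runner(void *param)
{
    struct sum_job *job = param;   /* convert back to the real type */
    job->result = 0;
    for (int i = 1; i <= job->upper; i++)
        job->result += i;
    return job;                    /* delivered to pthread_join() */
}

/* Hypothetical wrapper: runs sum_runner in a thread and returns the sum,
   or -1 if the thread could not be created or joined. */
long sum_in_thread(int upper)
{
    struct sum_job job = { upper, 0 };
    pthread_t tid;
    void *ret = NULL;

    if (pthread_create(&tid, NULL, sum_runner, &job) != 0)
        return -1;
    if (pthread_join(tid, &ret) != 0)
        return -1;
    return ((struct sum_job *)ret)->result;
}
```

Passing a struct this way avoids the shared global sum, at the cost of a cast inside the thread function.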
Here is my code for user/sleep.c:
#include "kernel/types.h"
#include "user/user.h"
int
main(int argc, char *argv[])
{
if (argc < 2)
{
printf("You have forgotten to pass an argument.");
exit(1);
}
int arg = atoi(argv[1]);
sleep(arg);
exit(0);
// return 0;
}
All is fine but with the return 0 instead of exit(0), I get the following error:
usertrap(): unexpected scause 0x000000000000000d pid=4
sepc=0x00000000000000f6 stval=0x0000000000003008
Why is that?
It's because xv6 is not standard on this point:
see https://tnallen.people.clemson.edu/2019/03/04/intro-to-xv6.html
Returning after main is a special case that is not supported in XV6, so just remember to always exit() at the end of main instead of return. Otherwise, you will get a “trap 14,” which in this case is XV6-speak for “Segmentation Fault.” Finally, notice that our headers are different. There is no #include <stdio.h> as you might be used to. Again, this is because the standard library is different. Check out these headers to see what kind of user utilities you have available.
So why did we need exit() in xv6?
Because when you build a program with gcc for Linux, for instance, the toolchain adds the required exit call: see https://en.wikipedia.org/wiki/Crt0
In short, on Linux your program doesn't start at main but at _start, which behaves roughly like this:
void _start(void) {
    /* get argc, argv, envp */
    int r = main(argc, argv, envp); /* << _start calls the actual main */
    exit(r);
}
On xv6, by contrast, the real starting point is main. When main tries to return, there is no valid return address to pop, which causes the fault.
Is there a way to access a variable initialized in one program from another program? For example, my code1.c is as follows:
# include <stdio.h>
# include <unistd.h> /* for sleep() */
int main()
{
int a=4;
sleep(99);
printf("%d\n", a);
return 0;
}
Now, is there any way that I can access the value of a from inside another C code (code2.c)? I am assuming, I have all the knowledge of the variable which I want to access, but I don't have any information about its address in the RAM. So, is there any way?
I know about the extern, what I am asking for here is a sort of backdoor. Like, kind of searching for the variable in the RAM based on some properties.
Your example has one caveat, setting aside possible optimizations that would make the variable disappear: the variable a only exists while the function is executing and has not yet finished.
Well, given that the function is main() it shouldn't be a problem, at least, for standard C programs, so if you have a program like this:
# include <stdio.h>
int main()
{
int a=4;
printf("%d\n", a);
return 0;
}
Chances are that this code will call some functions. If one of them needs to access a to read and write to it, just pass a pointer to a as an argument to the function.
# include <stdio.h>
void somefunction (int *n);
int main()
{
    int a=4;
    somefunction(&a);
    printf("%d\n", a);
    return 0;
}
void somefunction (int *n)
{
    /* Whatever you do with *n you are actually
       doing it with a */
    (*n)++; /* actually increments a; *n++ would advance the pointer instead */
}
But if the function that needs to access a is deep in the function call stack, all the parent functions need to pass the pointer to a even if they don't use it, adding clutter and lowering the readability of code.
The usual solution is to declare a as global, making it accessible to every function in your code. If that scenario is to be avoided, you can make a visible only to the functions that need to access it. To do that, you need a single source file containing all the functions that use a. Then declare a as a static global variable. Only the functions written in the same source file will know about a, and no pointer will be needed. It doesn't matter how deeply nested the functions are in the call stack: intermediate functions won't need to pass any additional information for a nested function to know about a.
So, you would have code1.c with main() and all the functions that need to access a
/* code1.c */
# include <stdio.h>
static int a;
void somefunction (void);
int main()
{
a=4;
somefunction();
printf("%d\n", a);
return 0;
}
void somefunction (void)
{
a++;
}
/* end of code1.c */
About trying to figure out where in RAM is a specific variable stored:
Kind of. You can walk the function stack frames from yours up to the main() stack frame, and inside those stack frames lie the local variables of each function, but there is no supplementary information in RAM about which variable is located at which position, and the compiler may choose to put it wherever it likes within the stack frame (or even in a register, so there would be no trace of it in RAM, except for pushes and pops from/to general registers, which would be even harder to follow).
So unless that variable has a non trivial value, it's the only local variable in its stack frame, compiler optimizations have been disabled, your code is aware of the architecture and calling conventions being used, and the variable is declared as volatile to stop being stored in a CPU register, I think there is no safe and/or portable way to find it out.
OTOH, if your program has been compiled with -g flag, you might be able to read debugging information from within your program and find out where in the stack frame the variable is, and crawl through it to find it.
code1.c:
#include <stdio.h>
void doSomething(); // so that we can use the function from code2.c
int a = 4; // global variable accessible in all functions defined after this point
int main()
{
printf("main says %d\n", a);
doSomething();
printf("main says %d\n", a);
return 0;
}
code2.c
#include <stdio.h>
extern int a; // gain access to variable from code1.c
void doSomething()
{
a = 3;
printf("doSomething says %d\n", a);
}
output:
main says 4
doSomething says 3
main says 3
You can use extern int a; in every file in which you must use a (code2.c in this case), except for the file in which it is declared without extern (code1.c in this case). For this approach to work you must declare your a variable globally (not inside a function).
One approach is to have the separate executable have the same stack layout as the program in question (since the variable is placed on the stack, and we need the relative address of the variable), therefore compile it with the same or similar compiler version and options, as much as possible.
On Linux, we can read the running code's data with ptrace(PTRACE_PEEKDATA, pid, …). Since on current Linux systems the start address of the stack varies, we have to account for that; fortunately, this address can be obtained from the 28th field of /proc/…/stat.
The following program (compiled with cc Debian 4.4.5-8 and no code generator option on Linux 2.6.32) works; the pid of the running program has to be specified as the program argument.
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/wait.h>
void *startstack(char *pid)
{ // The address of the start (i. e. bottom) of the stack.
char str[FILENAME_MAX];
FILE *fp = fopen(strcat(strcat(strcpy(str, "/proc/"), pid), "/stat"), "r");
if (!fp) perror(str), exit(1);
if (!fgets(str, sizeof str, fp)) exit(1);
fclose(fp);
unsigned long address;
int i = 28; char *s = str; while (--i) s += strcspn(s, " ") + 1;
sscanf(s, "%lu", &address);
return (void *)address;
}
static int access(void *a, char *pidstr)
{
if (!pidstr) return 1;
int pid = atoi(pidstr);
if (ptrace(PTRACE_ATTACH, pid, 0, 0) < 0) return perror("PTRACE_ATTACH"), 1;
int status;
// wait for program being signaled as stopped
if (wait(&status) < 0) return perror("wait"), 1;
// relocate variable address to stack of program in question
a = a-startstack("self")+startstack(pidstr);
int val;
if (errno = 0, val = ptrace(PTRACE_PEEKDATA, pid, a, 0), errno)
return perror("PTRACE_PEEKDATA"), 1;
printf("%d\n", val);
return 0;
}
int main(int argc, char *argv[])
{
int a;
return access(&a, argv[1]);
}
Another, more demanding approach would be as mcleod_ideafix indicated at the end of his answer to implement the bulk of a debugger and use the debug information (provided its presence) to locate the variable.
I wanna use setjmp()/longjmp() to implement a coroutine system.
Then I decide to code a little .c file to test it. In MinGW, it's OK; I got the result I want.
But when I compile it in MSVC++, the program crashes: "access violation"
#include <stdio.h>
#include <stdlib.h>
#include <setjmp.h>
jmp_buf a;
int is_invoke=0;
void
action_1()
{
for ( ;; ) {
printf("hello~~~A\n");
if(!setjmp(a)) {
is_invoke=1;
return;
}
}
}
void
func()
{
if (is_invoke) {
longjmp(a,1);
}
action_1();
printf("end\n");
}
void
dummy()
{
;
}
int
main(int argc, char *argv[])
{
for ( ;; ) {
func();
dummy();
}
return 0;
}
The man page for setjmp says:
setjmp() saves the stack context/environment in env for later use by
longjmp(). The stack context will be invalidated if the function
which called setjmp() returns.
In a simple implementation you might suppose that a jmp_buf contains an address to reset the stack pointer to and an address to jump to. As soon as you return from the function which saved the jmp_buf, the stack frame pointed to by the jmp_buf is no longer valid and may immediately become corrupted.
Or in other words, you can only rely on longjmp to act as a sort-of super-return statement - never to go deeper.
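For contrast, here is a minimal sketch of the one pattern that is well defined: the function that calls setjmp() is still on the stack when a deeper function calls longjmp(). The names recover_point, checked_divide and safe_divide are hypothetical, chosen only for this illustration.

```c
#include <setjmp.h>

static jmp_buf recover_point;

/* Deeper function: jumps back up to its still-active caller on bad input. */
static int checked_divide(int a, int b)
{
    if (b == 0)
        longjmp(recover_point, 1);   /* does not return */
    return a / b;
}

/* Calls setjmp() and stays on the stack while longjmp() may fire, so the
   saved context remains valid. Returns a / b, or fallback when b is 0. */
int safe_divide(int a, int b, int fallback)
{
    if (setjmp(recover_point) != 0)
        return fallback;             /* we arrived here via longjmp() */
    return checked_divide(a, b);
}
```

Here the jump only ever unwinds toward safe_divide, which has not returned yet; that is the "super-return" use the answer describes.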
I think the reason this works for you in mingw (and for me on Linux) is implementation-specific and possibly down to luck. There is another way - have you read Simon Tatham's evil coroutine macros essay?
Since you're invoking undefined behaviour, it's OK for one compiler to crash and another to appear to work. Both are correct - that's the beauty of undefined behaviour.
The trouble is that a saved context - the jmp_buf - only remains valid as long as the function that called setjmp() to set it has not returned.
The C99 standard (no longer the current standard, but this wording is unlikely to have changed significantly) says:
§7.13.2.1 The longjmp function
The longjmp function restores the environment saved by the most recent invocation of
the setjmp macro in the same invocation of the program with the corresponding
jmp_buf argument. If there has been no such invocation, or if the function containing
the invocation of the setjmp macro has terminated execution [208] in the interim, or if the
invocation of the setjmp macro was within the scope of an identifier with variably
modified type and execution has left that scope in the interim, the behavior is undefined.
[208] For example, by executing a return statement or because another longjmp call has caused a
transfer to a setjmp invocation in a function earlier in the set of nested calls.
Your code is exiting from action_1() almost immediately, rendering the jmp_buf saved by setjmp() worthless.
I created this little demonstration of setjmp() and longjmp() a couple of years ago. It may help you.
/*
@(#)File: $RCSfile: setjmp.c,v $
@(#)Version: $Revision: 1.1 $
@(#)Last changed: $Date: 2009/10/01 16:41:04 $
@(#)Purpose: Demonstrate setjmp() and longjmp()
@(#)Author: J Leffler
@(#)Copyright: (C) JLSS 2009
*/
#include <stdio.h>
#include <setjmp.h>
#include <stdlib.h>
static jmp_buf target_location;
static void do_something(void)
{
static int counter = 0;
if (++counter % 10 == 0)
printf("---- doing something: %3d\n", counter);
if (counter % 1000 == 0)
{
printf("||-- doing_something: calling longjmp() with value -1\n");
longjmp(target_location, -1);
}
}
static void do_something_else(int i, int j)
{
printf("-->> do_something_else: (%d,%d)\n", i, j);
do_something();
if (i > 2 && j > 2 && j % i == 2)
{
printf("||-- do_something_else: calling longjmp() with value %d\n", (i + j) % 100);
longjmp(target_location, (i + j) % 100);
}
printf("<<-- do_something_else: (%d,%d)\n", i, j);
}
static void doing_stuff(void)
{
int i;
printf("-->> doing_stuff()\n");
for (i = rand() % 15; i < 30; i++)
{
int j;
do_something();
for (j = rand() % 10; j < 20; j++)
{
do_something_else(i, j);
}
}
printf("<<-- doing_stuff()\n");
}
static void manage_setjmp(void)
{
printf("-->> manage_setjmp()\n");
switch (setjmp(target_location))
{
case 0:
/* Initial return - get on with doing stuff */
doing_stuff();
break;
case -1:
/* Error return - terminate */
printf("<<-- manage_setjmp() - error return from setjmp()\n");
return;
default:
/* NB: not officially possible to assign the return from setjmp() */
printf("---- manage_setjmp() - non-error return from setjmp()\n");
doing_stuff();
break;
}
printf("<<-- manage_setjmp()\n");
}
int main(void)
{
printf("-->> main()\n");
manage_setjmp();
printf("<<-- main()\n");
return(0);
}
You cannot use setjmp/longjmp for coroutines. Use makecontext/swapcontext on POSIX, or fibers (CreateFiber, etc.) on Windows.
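A minimal sketch of the makecontext/swapcontext route, assuming a glibc-style system where <ucontext.h> is available (these functions are marked obsolescent in newer POSIX editions). The coroutine gets its own stack, which is what setjmp/longjmp cannot give you. All names here (counter, collect_counter, co_stack and so on) are hypothetical.

```c
#include <ucontext.h>

static ucontext_t caller_ctx, co_ctx;
static char co_stack[64 * 1024];   /* the coroutine's own stack */
static int co_value;               /* value handed from coroutine to caller */
static int co_done;                /* set when the coroutine body finishes */

/* Coroutine body: yields 1, 2, 3 and then finishes. */
static void counter(void)
{
    for (int i = 1; i <= 3; i++) {
        co_value = i;
        swapcontext(&co_ctx, &caller_ctx);   /* yield to the caller */
    }
    co_done = 1;   /* returning from here resumes uc_link (caller_ctx) */
}

/* Hypothetical driver: resumes the coroutine until it finishes, storing up
   to max yielded values in out. Returns how many were stored, -1 on error. */
int collect_counter(int out[], int max)
{
    int n = 0;

    co_done = 0;
    if (getcontext(&co_ctx) != 0)
        return -1;
    co_ctx.uc_stack.ss_sp = co_stack;
    co_ctx.uc_stack.ss_size = sizeof co_stack;
    co_ctx.uc_link = &caller_ctx;
    makecontext(&co_ctx, counter, 0);

    for (;;) {
        swapcontext(&caller_ctx, &co_ctx);   /* resume the coroutine */
        if (co_done)
            break;
        if (n < max)
            out[n++] = co_value;
    }
    return n;
}
```

Because each context keeps its own stack, the loop variable inside counter survives across yields, which is exactly what the code in the question could not guarantee.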
I am writing an application that has multiple threads in a Linux environment, using C or Python. I am using pthreads for that. How should the number of threads be accepted via the command line?
In C you handle command line arguments by having main take two arguments,
int main(int argc, char** argv)
in which argc is the number of command-line arguments (including the program name itself) and argv points to an array of argc pointers to strings: argv[0] is the program name, argv[1] through argv[argc-1] are the actual arguments, and argv[argc] is a null pointer. Example:
int main(int argc, char** argv)
{
printf("The program was executed as %s.\n", argv[0]);
printf("The arguments were:\n");
for (int i = 1; i < argc; i++)
printf("%s\n", argv[i]);
return 0;
}
Let's now assume that your program takes a single command line argument, an integer telling you how many threads to spawn. The integer is given as a string, so we have to convert it using atoi:
if (argc != 2)
{
printf("Need exactly one argument!\n");
return 1;
}
int num_threads = atoi(argv[1]); // Convert first argument to integer.
if (num_threads < 1)
{
printf("I'll spawn no less than 1 thread!\n");
return 2;
}
Now what you do is simply create an array of thread handles,
pthread_t* threads = malloc(num_threads*sizeof(pthread_t));
and use it to store the thread handles as you start num_threads number of threads using pthread_create.
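A minimal sketch of that last step, assuming num_threads has already been validated as at least 1. The worker body and the names worker and run_workers are hypothetical placeholders for whatever your threads really do.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical worker: marks its own slot so the caller can see it ran. */
static void *worker(void *arg)
{
    int *slot = arg;
    *slot = 1;
    return NULL;
}

/* Starts num_threads workers, waits for all of them, and returns how many
   actually ran; returns -1 if memory could not be allocated. */
int run_workers(int num_threads)
{
    pthread_t *threads = malloc(num_threads * sizeof *threads);
    int *ran = calloc(num_threads, sizeof *ran);
    int started = 0, count = 0;

    if (threads == NULL || ran == NULL) {
        free(threads);
        free(ran);
        return -1;
    }
    for (int i = 0; i < num_threads; i++) {
        if (pthread_create(&threads[i], NULL, worker, &ran[i]) != 0)
            break;                       /* stop at the first failure */
        started++;
    }
    for (int i = 0; i < started; i++) {
        pthread_join(threads[i], NULL);  /* wait before reading ran[i] */
        count += ran[i];
    }
    free(threads);
    free(ran);
    return count;
}
```

Each thread gets a pointer to its own slot, so no locking is needed in this sketch.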
If you are not familiar with pthreads at all, I recommend this short tutorial.
If you were using a threading framework like OpenMP, then this is all handled automatically simply by setting the environment variable OMP_NUM_THREADS.
But if you're implementing threads "manually", you'll need to do it the way most runtime configuration is done: either by parsing argv[], or by setting an environment variable and using getenv().
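A rough sketch of both options combined, using strtol rather than atoi so that garbage input can be detected. The environment variable name NUM_THREADS, the range 1 to 64, the default of 4, and the function names are all assumptions made up for this example.

```c
#include <errno.h>
#include <stdlib.h>

/* Parses text as a thread count. Returns dflt if text is NULL, is not a
   clean decimal integer, or falls outside [min, max]. */
int parse_thread_count(const char *text, int min, int max, int dflt)
{
    char *end;
    long value;

    if (text == NULL || *text == '\0')
        return dflt;
    errno = 0;
    value = strtol(text, &end, 10);
    if (errno != 0 || *end != '\0' || value < min || value > max)
        return dflt;
    return (int)value;
}

/* The command line wins; otherwise fall back to the (hypothetical)
   NUM_THREADS environment variable, and finally to the default. */
int choose_thread_count(int argc, char *argv[])
{
    const char *text = (argc > 1) ? argv[1] : getenv("NUM_THREADS");
    return parse_thread_count(text, 1, 64, 4);
}
```

Falling back to a default instead of exiting is a design choice; a stricter program might print a usage message and stop.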
Usually, you would just pass it like any other argument. I've used code similar to the following in projects before to specify fixed thread counts. This is fairly simple but suitable for situations where you don't need the full power of thread pooling (though you could just as easily set a minimum and maximum thread count the same way).
#include <stdio.h>
#include <stdlib.h>
#define THRD_DFLT 5
#define THRD_MIN 2
#define THRD_MAX 20
static int numThreads = 0;
int main (int argCount, char *argVal[]) {
if (argCount > 1)
numThreads = atoi (argVal[1]);
if ((numThreads < THRD_MIN) || (numThreads > THRD_MAX)) {
printf ("Number of threads outside range %d-%d, using %d\n",
THRD_MIN, THRD_MAX, THRD_DFLT);
numThreads = THRD_DFLT;
}
:
:
Using a function like this:
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>
void print_trace() {
char pid_buf[30];
sprintf(pid_buf, "--pid=%d", getpid());
char name_buf[512];
name_buf[readlink("/proc/self/exe", name_buf, 511)]=0;
int child_pid = fork();
if (!child_pid) {
dup2(2,1); // redirect output to stderr
fprintf(stdout,"stack trace for %s pid=%s\n",name_buf,pid_buf);
execlp("gdb", "gdb", "--batch", "-n", "-ex", "thread", "-ex", "bt", name_buf, pid_buf, NULL);
abort(); /* If gdb failed to start */
} else {
waitpid(child_pid,NULL,0);
}
}
I see the details of print_trace in the output.
What are other ways to do it?
You mentioned on my other answer (now deleted) that you also want to see line numbers. I'm not sure how to do that when invoking gdb from inside your application.
But I'm going to share with you a couple of ways to print a simple stacktrace with function names and their respective line numbers without using gdb. Most of them came from a very nice article from Linux Journal:
Method #1:
The first method is to disseminate it
with print and log messages in order
to pinpoint the execution path. In a
complex program, this option can
become cumbersome and tedious even if,
with the help of some GCC-specific
macros, it can be simplified a bit.
Consider, for example, a debug macro
such as:
#define TRACE_MSG fprintf(stderr, "%s() [%s:%d] here I am\n", \
__func__, __FILE__, __LINE__)
You can propagate this macro quickly
throughout your program by cutting and
pasting it. When you do not need it
anymore, switch it off simply by
defining it to no-op.
Method #2: (It doesn't say anything about line numbers, but I do on method 4)
A nicer way to get a stack backtrace,
however, is to use some of the
specific support functions provided by
glibc. The key one is backtrace(),
which navigates the stack frames from
the calling point to the beginning of
the program and provides an array of
return addresses. You then can map
each address to the body of a
particular function in your code by
having a look at the object file with
the nm command. Or, you can do it a
simpler way--use backtrace_symbols().
This function transforms a list of
return addresses, as returned by
backtrace(), into a list of strings,
each containing the function name,
offset within the function and the
return address. The list of strings is
allocated from your heap space (as if
you called malloc()), so you should
free() it as soon as you are done with
it.
I encourage you to read it since the page has source code examples. In order to convert an address to a function name you must compile your application with the -rdynamic option.
Method #3: (A better way of doing method 2)
An even more useful application for
this technique is putting a stack
backtrace inside a signal handler and
having the latter catch all the "bad"
signals your program can receive
(SIGSEGV, SIGBUS, SIGILL, SIGFPE and
the like). This way, if your program
unfortunately crashes and you were not
running it with a debugger, you can
get a stack trace and know where the
fault happened. This technique also
can be used to understand where your
program is looping in case it stops
responding
An implementation of this technique is available here.
Method #4:
A small improvement I've done on method #3 to print line numbers. This could be copied to work on method #2 also.
Basically, I followed a tip that uses addr2line to
convert addresses into file names and
line numbers.
The source code below prints line numbers for all local functions. If a function from another library is called, you might see a couple of ??:0 instead of file names.
#include <stdio.h>
#include <signal.h>
#include <stdlib.h>
#include <execinfo.h>
void bt_sighandler(int sig, struct sigcontext ctx) {
void *trace[16];
char **messages = (char **)NULL;
int i, trace_size = 0;
if (sig == SIGSEGV)
printf("Got signal %d, faulty address is %p, "
"from %p\n", sig, ctx.cr2, ctx.eip);
else
printf("Got signal %d\n", sig);
trace_size = backtrace(trace, 16);
/* overwrite sigaction with caller's address */
trace[1] = (void *)ctx.eip;
messages = backtrace_symbols(trace, trace_size);
/* skip first stack frame (points here) */
printf("[bt] Execution path:\n");
for (i=1; i<trace_size; ++i)
{
printf("[bt] #%d %s\n", i, messages[i]);
/* find first occurence of '(' or ' ' in message[i] and assume
* everything before that is the file name. (Don't go beyond 0 though
* (string terminator)*/
size_t p = 0;
while(messages[i][p] != '(' && messages[i][p] != ' '
&& messages[i][p] != 0)
++p;
char syscom[256];
snprintf(syscom, sizeof syscom, "addr2line %p -e %.*s", trace[i], (int)p, messages[i]);
//last parameter is the file name of the symbol
system(syscom);
}
exit(0);
}
int func_a(int a, char b) {
char *p = (char *)0xdeadbeef;
a = a + b;
*p = 10; /* CRASH here!! */
return 2*a;
}
int func_b() {
int res, a = 5;
res = 5 + func_a(a, 't');
return res;
}
int main() {
/* Install our signal handler */
struct sigaction sa;
sa.sa_handler = (void *)bt_sighandler;
sigemptyset(&sa.sa_mask);
sa.sa_flags = SA_RESTART;
sigaction(SIGSEGV, &sa, NULL);
sigaction(SIGUSR1, &sa, NULL);
/* ... add any other signal here */
/* Do something */
printf("%d\n", func_b());
}
This code should be compiled as: gcc sighandler.c -o sighandler -rdynamic
The program outputs:
Got signal 11, faulty address is 0xdeadbeef, from 0x8048975
[bt] Execution path:
[bt] #1 ./sighandler(func_a+0x1d) [0x8048975]
/home/karl/workspace/stacktrace/sighandler.c:44
[bt] #2 ./sighandler(func_b+0x20) [0x804899f]
/home/karl/workspace/stacktrace/sighandler.c:54
[bt] #3 ./sighandler(main+0x6c) [0x8048a16]
/home/karl/workspace/stacktrace/sighandler.c:74
[bt] #4 /lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe6) [0x3fdbd6]
??:0
[bt] #5 ./sighandler() [0x8048781]
??:0
Update 2012/04/28 for recent linux kernel versions, the above sigaction signature is obsolete. Also I improved it a bit by grabbing the executable name from this answer. Here is an up to date version:
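#include <unistd.h> /* getpid(), readlink() */
char* exe = 0;
int initialiseExecutableName()
{
char link[1024];
ssize_t len;
exe = malloc(1024);
if (exe == NULL) exit(1);
snprintf(link,sizeof link,"/proc/%d/exe",getpid());
len = readlink(link,exe,1023);
if(len==-1) {
fprintf(stderr,"ERRORRRRR\n");
exit(1);
}
exe[len] = '\0'; /* readlink() does not null-terminate */
printf("Executable name initialised: %s\n",exe);
return 0;
}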
const char* getExecutableName()
{
if (exe == 0)
initialiseExecutableName();
return exe;
}
/* get REG_EIP from ucontext.h */
#define __USE_GNU
#include <ucontext.h>
void bt_sighandler(int sig, siginfo_t *info,
void *secret) {
void *trace[16];
char **messages = (char **)NULL;
int i, trace_size = 0;
ucontext_t *uc = (ucontext_t *)secret;
/* Do something useful with siginfo_t */
if (sig == SIGSEGV)
printf("Got signal %d, faulty address is %p, "
"from %p\n", sig, info->si_addr,
uc->uc_mcontext.gregs[REG_EIP]);
else
printf("Got signal %d\n", sig);
trace_size = backtrace(trace, 16);
/* overwrite sigaction with caller's address */
trace[1] = (void *) uc->uc_mcontext.gregs[REG_EIP];
messages = backtrace_symbols(trace, trace_size);
/* skip first stack frame (points here) */
printf("[bt] Execution path:\n");
for (i=1; i<trace_size; ++i)
{
printf("[bt] %s\n", messages[i]);
/* find first occurence of '(' or ' ' in message[i] and assume
* everything before that is the file name. (Don't go beyond 0 though
* (string terminator)*/
size_t p = 0;
while(messages[i][p] != '(' && messages[i][p] != ' '
&& messages[i][p] != 0)
++p;
char syscom[256];
snprintf(syscom, sizeof syscom, "addr2line %p -e %.*s", trace[i], (int)p, messages[i]);
//last parameter is the filename of the symbol
system(syscom);
}
exit(0);
}
and initialise like this:
int main() {
/* Install our signal handler */
struct sigaction sa;
sa.sa_sigaction = (void *)bt_sighandler;
sigemptyset (&sa.sa_mask);
sa.sa_flags = SA_RESTART | SA_SIGINFO;
sigaction(SIGSEGV, &sa, NULL);
sigaction(SIGUSR1, &sa, NULL);
/* ... add any other signal here */
/* Do something */
printf("%d\n", func_b());
}
If you're using Linux, the standard C library includes a function called backtrace, which populates an array with frames' return addresses, and another function called backtrace_symbols, which will take the addresses from backtrace and look up the corresponding function names. These are documented in the GNU C Library manual.
Those won't show argument values, source lines, and the like, and they only apply to the calling thread. However, they should be a lot faster (and perhaps less flaky) than running GDB that way, so they have their place.
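A minimal sketch of those two calls, assuming glibc's <execinfo.h>; print_backtrace and MAX_FRAMES are hypothetical names. Function names generally only appear if the program is linked with -rdynamic, as mentioned elsewhere in this thread.

```c
#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_FRAMES 32

/* Prints the calling thread's stack to out, one frame per line.
   Returns the number of frames printed, or 0 if symbols were unavailable. */
int print_backtrace(FILE *out)
{
    void *frames[MAX_FRAMES];
    int count = backtrace(frames, MAX_FRAMES);
    char **symbols = backtrace_symbols(frames, count);

    if (symbols == NULL)
        return 0;
    for (int i = 0; i < count; i++)
        fprintf(out, "#%d %s\n", i, symbols[i]);
    free(symbols);   /* a single free() releases the whole array */
    return count;
}
```

Unlike the gdb approach, this runs entirely in-process and does not fork, but it is not safe to rely on inside a signal handler because backtrace_symbols allocates memory.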
nobar posted a fantastic answer. In short;
So you want a stand-alone function that prints a stack trace with all of the features that gdb stack traces have and that doesn't terminate your application. The answer is to automate the launch of gdb in a non-interactive mode to perform just the tasks that you want.
This is done by executing gdb in a child process, using fork(), and scripting it to display a stack-trace while your application waits for it to complete. This can be performed without the use of a core-dump and without aborting the application.
I believe that this is what you are looking for, @Vi
Isn't abort() simpler?
That way if it happens in the field the customer can send you the core file (I don't know many users who are involved enough in my application to want me to force them to debug it).