Unknown Segmentation Fault (Core Dumped) in multithread initialization

I've had three different lab TAs look at my code and none of them have been able to help me, so I've decided to try here. Unless I delete all code relating to both gettimeofday and any semaphores, I get a "Segmentation fault (core dumped)" error. I've boiled down my code to only the main thread with simple declarations to attempt to get to the root of the problem.
My code:
#include <pthread.h>
#include <semaphore.h>
#include <sys/types.h>
#include <stdio.h>
#include <string.h>
#include <sys/shm.h>
#include <sys/time.h>
void *threadN (void *); /* thread routines */
pthread_t tid[1]; /* array of thread IDs */
int main()
{
int i,j;
/* here create and initialize all semaphores */
int mutex = sem_create(777777, 1);
int MatrixA[6000][3000];
for (i=0; i < 6000; i++) {
for (j=0; j < 3000; j++) {
MatrixA[i][j]=i*j;
}
}
int MatrixB[3000][1000];
for (i=0; i < 3000; i++) {
for (j=0; j < 1000; j++) {
MatrixB[i][j]=i*j;
}
}
int MatrixC[6000][1000];
struct timeval tim;
gettimeofday(&tim, NULL);
/* double, not float: a float cannot hold seconds since the epoch with sub-second precision */
double t1=tim.tv_sec+(tim.tv_usec/1000000.0);
gettimeofday(&tim, NULL);
double t2=tim.tv_sec+(tim.tv_usec/1000000.0);
printf("%.2f seconds elapsed\n", t2-t1);
sem_rm(sem_open(777777, 1));
return 0;
}
I'm completely stumped here.

You are exhausting your stack. See "Radix sort for 10^6 array in C", comment from @Joachim Pileborg:
Local variables are usually stored on the stack, and the stack is usually limited to single-digit megabytes. On Windows for example, the default is 1MB per process, on Linux the default is 8MB...
I tried your code on Windows and it crashed with only MatrixA defined: 6000 * 3000 * 4 bytes (for int) is 72,000,000 bytes, about 69 MiB. MatrixB and MatrixC add roughly another 12 MB and 24 MB.
So you will have to move the matrix data off the stack: define the matrices as static or allocate them on the heap.

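A useful tool for tracking down segmentation faults is gdb.
gcc -g -o a.out program.c
-g generates source-level debug information. Do not pass -c here: it stops after compilation and produces an object file rather than an executable.
gdb a.out
This starts gdb with a.out loaded.
(gdb) run
This runs the program and shows the line where the segmentation fault happens. If you already have a core file, gdb a.out core loads it directly and bt prints the backtrace without re-running the program.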

Related

CTRL + C doesn't kill C script

I'm reading Operating Systems: Three Easy Pieces and I've run into a problem that is not mentioned in the book.
This is the C script:
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
/*#include "common.h"*/
#include <unistd.h>
#include <assert.h>
int
main(int argc, char *argv[])
{
int *p = malloc(sizeof(int));
assert(p != NULL);
printf("(%d) address pointed to by p: %p\n",
(int)getpid(), (void *)p);
*p = 0;
while (1) {
sleep(1);
*p = *p +1;
printf("(%d) p: %d\n", getpid(), *p);
}
return 0;
}
It allocates some memory, prints the address of that memory, stores 0 in it, and then loops, incrementing the value once per second.
I compile it with gcc -o mem mem.c -Wall and have no problem running it with ./mem; if I press Ctrl+C it stops.
The problem comes when I run the program twice in parallel with the command ./mem & ./mem.
No matter how many times I try to kill the process, the program keeps printing.
How do I kill my C program?

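./mem & ./mem starts the first copy in the background and the second in the foreground, and Ctrl-C only interrupts the foreground job. Use fg to bring the backgrounded process to the foreground; then it will respond to Ctrl-C.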
You can also use jobs to see a numbered list of backgrounded jobs, and kill %<number> to kill a specific job, e.g. kill %1.

Program break doesn't change after calling malloc in a loop?

Running this piece of code is supposed to increase the program break by about malloc_counts * _SC_PAGESIZE; instead, I get the same program break every time. Why is that? malloc is supposed to call brk or sbrk, which rounds the size passed up to the next page (with some extra work). So what's happening?
#include <stdio.h>
#include <malloc.h>
#include <unistd.h>
int main(){
const long malloc_counts = 10;
printf("PAGE SIZE: %ld\n", sysconf(_SC_PAGESIZE));
void* allocated_pool[malloc_counts];
for(int counter=0; counter < malloc_counts; counter++)
{
printf("program brk: %p\n",sbrk(0));
allocated_pool[counter] = malloc(127*4096);
}
}

I assume you are compiling with optimizations enabled.
Your compiler optimizes the calls to malloc out, because their results are unused. With the malloc calls removed, nothing changes and the program break does not move.
glibc also overallocates a lot, so the total requested has to be large enough for the break to move visibly. And the default M_MMAP_THRESHOLD seems to be 128 * 1024 bytes; requests at or above it are served by mmap rather than by moving the break, which is the case for the 127*4096 (520,192) bytes requested in the question. So you have to pick a value large enough, but below the mmap threshold, to see a difference with glibc.
Disable compiler optimizations and allocate a lot, and the break will move. Try the following:
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
int main() {
printf("PAGE SIZE: %ld\n", sysconf(_SC_PAGESIZE));
#define malloc_counts 20
void *allocated_pool[malloc_counts];
for(int counter = 0; counter < malloc_counts; counter++) {
printf("program brk: %p\n", sbrk(0));
allocated_pool[counter] = malloc((size_t)127 * 1024);
/* volatile read: keeps the compiler from discarding the malloc result */
*(void *volatile *)&allocated_pool[counter];
}
}

How to read the stack segment of a C program?

I am developing a hobby operating system, and for that I want to understand how memory allocation works in Linux. To do so, I wrote a simple C program that defines an unsigned char array of some hex numbers and then spins in an empty infinite loop to keep the process alive. I then used pmap to get the page-mapping information, so I now know the location of the stack segment. I also wrote a program that uses the process_vm_readv syscall to read the contents of that address range. All I see is a stream of 00 bytes, with some random-looking numbers at the end. How can I figure out how the array is stored in the stack segment?
If that is possible, how can I analyze the hex stream to extract meaningful information?

Here is a demonstration of accessing the address space of a remote process. There are two programs: local.c reads and writes a variable in another program, remote.c. (These programs assume sizeof(int) == 4.)
local.c
#define _GNU_SOURCE
#include <sys/uio.h>
#include <unistd.h>
#include <stdio.h>
#include <sys/syscall.h>
int main()
{
char buf[4];
struct iovec local[1];
struct iovec remote[1];
int pid;
void *addr;
printf("Enter remote pid\n");
scanf("%d",&pid);
printf("Enter remote address\n");
scanf("%p", &addr);
local[0].iov_base = buf;
local[0].iov_len = 4;
remote[0].iov_base = addr;
remote[0].iov_len = 4;
if(syscall(SYS_process_vm_readv,pid,local,1,remote,1,0) == -1) {
perror("");
return -1;
}
printf("read : %d\n",*(int*)buf);
*(int*)buf = 4321;
if(syscall(SYS_process_vm_writev,pid,local,1,remote,1,0) == -1) {
perror("");
return -1;
}
return 0;
}
remote.c
#define _GNU_SOURCE
#include <sys/uio.h>
#include <unistd.h>
#include <stdio.h>
#include <sys/syscall.h>
int main()
{
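/* volatile: the value is changed from outside this process, so the loop must re-read it */
volatile int a = 1234;
printf("%d %p\n",getpid(),(void *)&a);
while(a == 1234);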
printf ("'a' changed to %d\n",a);
return 0;
}
And if you run this on a Linux machine,
[ajith@localhost Desktop]$ gcc remote.c -o remote -Wall
[ajith@localhost Desktop]$ ./remote
4574 0x7fffc4f4eb6c
'a' changed to 4321
[ajith@localhost Desktop]$
[ajith@localhost Desktop]$ gcc local.c -o local -Wall
[ajith@localhost Desktop]$ ./local
Enter remote pid
4574
Enter remote address
0x7fffc4f4eb6c
read : 1234
[ajith@localhost Desktop]$
In a similar way you can read a stack frame into the I/O vectors, but you need to know the stack frame layout to parse the values of local variables out of it. A stack frame contains function parameters, the return address, local variables, and so on.

GSL Segmentation fault: 11 on mac

I am trying to learn GSL (GNU Scientific Library) on Mac OS X (El Capitan), following this tutorial (which is, BTW, the third result of my Google search!).
I installed GSL using Homebrew.
Following the comments on this post, I changed the flags from -lgslblasnative to -lgslcblas.
Now it compiles, but I get a Segmentation fault: 11 error when running the program. There are a number of possibilities. First, a stack overflow, which is very unlikely with such a small program. Second, there is an array somewhere and the program is trying to write to memory that has not been allocated. Or there is something wrong with GSL. I would appreciate any help.
Edit 1: this is the code I'm trying to run:
#include <stdio.h>
#include <gsl_rng.h>
#include <gsl_randist.h>
int main (int argc, char *argv[])
{
/* set up GSL RNG */
gsl_rng *r = gsl_rng_alloc(gsl_rng_mt19937);
/* end of GSL setup */
int i,n;
double gauss,gamma;
n=atoi(argv[1]);
for (i=0;i<n;i++)
{
gauss=gsl_ran_gaussian(r,2.0);
gamma=gsl_ran_gamma(r,2.0,3.0);
printf("%2.4f %2.4f\n", gauss,gamma);
}
return(0);
}

Find program's code address at runtime?

When I use gdb to debug a program written in C, the disassemble command shows the instructions and their addresses in the code segment. Is it possible to know those memory addresses at runtime? I am using Ubuntu. Thank you.
[edit] To be more specific, I will demonstrate with the following example.
#include <stdio.h>
#include <stdlib.h> /* exit */
void myfunction(void); /* defined elsewhere */
int main(int argc,char *argv[]){
myfunction();
exit(0);
}
Now I would like to get the address of myfunction() in the code segment when I run my program.

Above answer is vastly overcomplicated. If the function reference is static, as it is above, the address is simply the value of the symbol name in pointer context:
void* myfunction_address = myfunction;
If you are grabbing the function dynamically out of a shared library, then the value returned from dlsym() (POSIX) or GetProcAddress() (Windows) is likewise the address of the function.
Note that the above code is likely to generate a warning with some compilers, as ISO C technically forbids assignment between code and data pointers (some architectures put them in physically distinct address spaces).
And some pedants will point out that the address returned isn't really guaranteed to be the memory address of the function, it's just a unique value that can be compared for equality with other function pointers and acts, when called, to transfer control to the function whose pointer it holds. Obviously all known compilers implement this with a branch target address.
And finally, note that the "address" of a function is a little ambiguous. If the function was loaded dynamically or is an extern reference to an exported symbol, what you really get is generally a pointer to some fixup code in the "PLT" (a Unix/ELF term, though the PE/COFF mechanism on Windows is similar) that then jumps to the function.

If you know the function name before program runs, simply use
void * addr = myfunction;
If the function name is only known at run time, I once wrote a function that looks up the symbol address dynamically using the bfd library. Here is the x86_64 code; in this example you get the address via find_symbol("a.out", "myfunction").
#include <bfd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/* Returns the value of symname in filename, or -1 on failure. */
long find_symbol(char *filename, char *symname)
{
    bfd *ibfd;
    asymbol **symtab;
    long nsize, nsyms, i;
    long addr = -1;
    symbol_info syminfo;
    char **matching;

    bfd_init();
    ibfd = bfd_openr(filename, NULL);
    if (ibfd == NULL) {
        printf("bfd_openr error\n");
        return -1;
    }
    if (!bfd_check_format_matches(ibfd, bfd_object, &matching)) {
        printf("format_matches\n");
        bfd_close(ibfd);
        return -1;
    }
    /* Upper bound, in bytes, of the space needed for the symbol table. */
    nsize = bfd_get_symtab_upper_bound(ibfd);
    if (nsize <= 0 || (symtab = malloc(nsize)) == NULL) {
        printf("cannot read symbol table\n");
        bfd_close(ibfd);
        return -1;
    }
    nsyms = bfd_canonicalize_symtab(ibfd, symtab);
    for (i = 0; i < nsyms; i++) {
        if (strcmp(symtab[i]->name, symname) == 0) {
            bfd_symbol_info(symtab[i], &syminfo);
            addr = (long) syminfo.value;
            break;
        }
    }
    if (addr == -1)
        printf("cannot find symbol\n");
    free(symtab);
    bfd_close(ibfd);
    return addr;
}

To get a backtrace, use execinfo.h as documented in the GNU libc manual.
For example:
#include <execinfo.h>
#include <stdio.h>
#include <unistd.h>
void trace_pom()
{
const int sz = 15;
void *buf[sz];
// get at most sz entries
int n = backtrace(buf, sz);
// output them right to stderr
backtrace_symbols_fd(buf, n, fileno(stderr));
// but if you want to output the strings yourself
// you may use char ** backtrace_symbols (void *const *buffer, int size)
write(fileno(stderr), "\n", 1);
}
void TransferFunds(int n);
void DepositMoney(int n)
{
if (n <= 0)
trace_pom();
else TransferFunds(n-1);
}
void TransferFunds(int n)
{
DepositMoney(n);
}
int main()
{
DepositMoney(3);
return 0;
}
Compiled with:
gcc a.c -o a -g -Wall -Werror -rdynamic
According to the mentioned website:
Currently, the function name and offset can only be obtained on systems that use the ELF
binary format for programs and libraries. On other systems, only the hexadecimal return
address will be present. Also, you may need to pass additional flags to the linker to
make the function names available to the program. (For example, on systems using GNU
ld, you must pass -rdynamic.)
Output
./a(trace_pom+0xc9)[0x80487fd]
./a(DepositMoney+0x11)[0x8048862]
./a(TransferFunds+0x11)[0x8048885]
./a(DepositMoney+0x21)[0x8048872]
./a(TransferFunds+0x11)[0x8048885]
./a(DepositMoney+0x21)[0x8048872]
./a(TransferFunds+0x11)[0x8048885]
./a(DepositMoney+0x21)[0x8048872]
./a(main+0x1d)[0x80488a4]
/lib/i686/cmov/libc.so.6(__libc_start_main+0xe5)[0xb7e16775]
./a[0x80486a1]

Regarding a comment on another answer (getting the address of an instruction), you can use this very ugly trick:
#include <setjmp.h>
void function() {
printf("in function\n");
printf("%d\n",__LINE__);
printf("exiting function\n");
}
int main() {
jmp_buf env;
int i;
printf("in main\n");
printf("%d\n",__LINE__);
printf("calling function\n");
setjmp(env);
for (i=0; i < 18; ++i) {
printf("%p\n",env[i]);
}
function();
printf("in main again\n");
printf("%d\n",__LINE__);
}
The saved instruction pointer should be env[12] (the eip), but be careful: the layout of jmp_buf is machine dependent, so triple-check this on your platform. This is the output:
in main
13
calling function
0xbfff037f
0x0
0x1f80
0x1dcb
0x4
0x8fe2f50c
0x0
0x0
0xbffff2a8
0xbffff240
0x1f
0x292
0x1e09
0x17
0x8fe0001f
0x1f
0x0
0x37
in function
4
exiting function
in main again
37
have fun!