I've divided a matrix into blocks and multiplied it using Fox's algorithm.
How can I print the result matrix to the screen when it is stored block-wise in different processes, without sending those blocks back to the process with rank 0?
For example, after the multiplication I've got:
Block A:
83 64
112 76
Block B:
118 44
152 34
Block C:
54 68
67 56
Block D:
89 85
114 68
Entire matrix should look like:
83 64 118 44
112 76 152 34
54 68 89 85
67 56 114 68
What I've done so far:
Send the two blocks that together contain one row of the matrix to one process and print that row to the screen. But is it possible to print the entire result matrix without sending more than one block to process 0?
// Function for gathering the result matrix
// pCblock - one block containing part of the entire result matrix
// Size - matrix dimension
// BlockSize - block dimension
void ResultCollection(double* pCblock, int Size,
                      int BlockSize) {
    double* pResultRow = new double[Size*BlockSize];
    for (int i = 0; i < BlockSize; i++) {
        MPI_Gather(&pCblock[i*BlockSize], BlockSize, MPI_DOUBLE,
                   &pResultRow[i*Size], BlockSize, MPI_DOUBLE, 0, RowComm);
    }
    // print two matrix rows from two blocks
    delete[] pResultRow;
}
The answer in "Ordering Output in MPI" doesn't help here,
because for the matrix output I need to print not the entire block A, then B, then C, then D,
but rather
one line from A (in process 0), one line from B (in process 1),
one line from A (in process 0), one line from B (in process 1),
one line from C (in process 2), one line from D (in process 3),
and so on.
How can I print ... without sending these blocks back to the process with rank 0?
Well, it is time to realise
that unless the process with rank 0 is equipped with some sort of clairvoyance, it will never be able to pretty-print results that were computed remotely by a herd of decentralised, distributed processes.
Similarly, it is easy to test,
if you still do not believe what has been published on this, that MPI-distributed code was never promised any guarantee, weak or strong, about how the uncoordinated, asynchronous character streams printed by remote processes get ordered into the one common serial output (the system stdout) and finally put onto the screen.
Even if you played a lot with an "addressable ANSI-coded screen", such design efforts would not yield universally working code, and the tricks needed to inject "absolute" addressing into the ANSI-coded output would be awful both to implement and to operate just to paint a result on screen correctly.
No. Better not to try either of these ideas.
Your MPI-infrastructure advisors / admins will surely help you and show you appropriate tools for collecting the results and post-processing them accordingly.
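If the blocks do end up collected in one place (the gathering itself is not shown here), putting them back into matrix order is only an index calculation. The following is a minimal C++ sketch of that post-processing step, not the asker's code: the names placeBlock and assembleMatrix are hypothetical, and it assumes square blocks stored row-major and a rank layout of rank = gridRow * gridSize + gridCol, which matches the A, B, C, D example in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Copies one blockSize x blockSize block (row-major) into its position
// inside a size x size result matrix (also row-major).
// gridRow / gridCol are the block's coordinates in the process grid.
void placeBlock(std::vector<double>& result, const std::vector<double>& block,
                int size, int blockSize, int gridRow, int gridCol) {
    assert(size % blockSize == 0);
    assert(block.size() == static_cast<std::size_t>(blockSize) * blockSize);
    for (int i = 0; i < blockSize; ++i) {
        for (int j = 0; j < blockSize; ++j) {
            const int row = gridRow * blockSize + i;
            const int col = gridCol * blockSize + j;
            result[static_cast<std::size_t>(row) * size + col] =
                block[static_cast<std::size_t>(i) * blockSize + j];
        }
    }
}

// Assembles the full matrix from blocks listed in rank order.
// Assumption: rank = gridRow * gridSize + gridCol.
std::vector<double> assembleMatrix(
        const std::vector<std::vector<double>>& blocks,
        int size, int blockSize) {
    const int gridSize = size / blockSize;
    assert(blocks.size() == static_cast<std::size_t>(gridSize) * gridSize);
    std::vector<double> result(static_cast<std::size_t>(size) * size, 0.0);
    for (int rank = 0; rank < gridSize * gridSize; ++rank) {
        placeBlock(result, blocks[rank], size, blockSize,
                   rank / gridSize, rank % gridSize);
    }
    return result;
}
```

With the four 2x2 blocks from the question passed in the order A, B, C, D, assembleMatrix(blocks, 4, 2) yields the 4x4 matrix in the row order the question asks for, and it can then be printed by a single process.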
I'm trying to write a simple program (as a pre-cursor to a more complicated one) that stores an array of bytes to progmem, and then reads and prints the array. I've looked through a million blog/forums posts online and think I'm doing everything fine, but I'm still getting utter gibberish as output.
Here is my code, any help would be much appreciated!
void setup() {
byte hello[10] PROGMEM = {1,2,3,4,5,6,7,8,9,10};
byte buffer[10];
Serial.begin(9600);
memcpy_P(buffer, (char*)pgm_read_byte(&hello), 10);
for(int i=0;i<10;i++){
//buffer[i] = pgm_read_byte(&(hello[i])); //output is wrong even if i use this
Serial.println(buffer[i]);
}
}
void loop() {
}
If I use memcpy, I get the output:
148
93
0
12
148
93
0
12
148
93
And if I use the buffer = .... statement in the for loop (instead of memcpy):
49
5
9
240
108
192
138
173
155
173
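You're overcomplicating this by about two orders of magnitude.
memcpy_P wants a destination pointer, a source pointer and a byte count, and the PROGMEM pointer is simply the array itself. So your memcpy_P line should look like
memcpy_P (buffer, hello, 10);
That's it.
memcpy (without the "_P") cannot reach program memory and will copy from data RAM instead. That is not what you want.
Also note that PROGMEM only takes effect on global or static variables, so hello should be moved to file scope or declared static inside setup(). That would explain why the pgm_read_byte variant also prints garbage.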
I have a data file in the format <0:00> - <19321>, <1:00> - <19324>, up to <24:00> - <19648>, so for every hour there is the total power used so far (the total keeps increasing). I am supposed to calculate the power used in each interval, find the average, and find the highest usage and its index (time). (I don't need help with finding the max power used and its time index.) I traced the problem down to line 31, but I don't understand why what I did was wrong. Can someone explain why the code in line 31 isn't saving the value of power used into the array, and how I can fix it? Thanks in advance!
float compute_usage(int num, int vals[], int use[], int *hi_idx)
15 {
16 int i;// i is a general counter for all for loops
17 int r1, r2, u, v, pow_dif, temp;//for loop 1
18 int tot;//for loop 2
19 int max_use, init, fina, diff;//for loop 3 //don't have to worry about this for loop, I am good here
20 float avg;//average power used
21
22 for(r1=r2=i=u=v=0;i<num;i++)//for loop 1
23 {
24 r1= vals[v++];//I set values of every hour as reading 1 & 2(later)
25 #ifdef DEBUG
26 printf("pre-debug: use is %d\n", use[u]);
27 #endif
28 if(r1!=0 && r2!=0)
29 {
30 pow_dif = (r1 - r2);//I take the to readings, and calculate the difference, that difference is the power used in the interval between a time period
31 use[u++] = pow_dif; //I'm suppose to save the power used in the interval in an array here
32 }
33 r2=r1;//the first reading becomes the second after the if statement, this way I always have 2 readings to calculate the power used int the interval
34 #ifdef DEBUG
35 printf("for1-debug3: pow_dif is %d\n", pow_dif);
36 printf("for1-debug4: (%d,%d) \n", u, use[u]);
37 #endif
38
39 }
40 for(tot=i=u=0;i<num;i++)//for loop 2
41 {tot = tot + use[u++];}
42
43 avg = tot/(num-1);
44 #ifdef DEBUG
45 printf("for2-debug1: the tot is %d\n", tot);
46 printf("for2-debug2: avg power usage is %f\n", avg);
47 #endif
Just to understand, how did you figure out that the code in line 31 is problematic? Is it the printf statement in line 36?
When you do this:
use[u++] = pow_dif; //I'm suppose to save the power used in the interval in an array here
printf("for1-debug4: (%d,%d) \n", u, use[u]);
The "u" variable in the printf statement has already been incremented by the previous operation (u++), so you are looking one element past the one you just changed.
use[u++] = pow_dif; // u is 0 when the assignment happens, but u is 1 afterwards
printf("for1-debug4: (%d,%d) \n", u, use[u]); // u is 1 here, so this prints use[1], not use[0]
What is the "i" for in this loop? Why don't you try "u++" in the for statement instead of "i++" and remove the "u++" in the use assignment expression?
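To make the suggestion concrete, here is a minimal C++ sketch of the interval calculation with a single loop index. It is an illustration rather than a drop-in replacement for compute_usage: the names computeUsage and UsageSummary are hypothetical, and it assumes the readings arrive as a vector of cumulative totals. It also divides in floating point, whereas tot/(num-1) on line 43 of the posted code is an integer division, and it sums only the num-1 differences that were actually written.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Result of the calculation: per-interval usage and its average.
struct UsageSummary {
    std::vector<int> use;  // use[k] = vals[k + 1] - vals[k]
    double average;        // mean of all entries in use
};

// vals holds cumulative meter readings, one per hour.
// N readings produce N - 1 intervals.
UsageSummary computeUsage(const std::vector<int>& vals) {
    assert(vals.size() >= 2);
    UsageSummary summary;
    long total = 0;
    for (std::size_t i = 1; i < vals.size(); ++i) {
        const int diff = vals[i] - vals[i - 1];
        summary.use.push_back(diff);  // stored at index i - 1
        total += diff;
    }
    // Floating-point division keeps the fractional part of the average.
    summary.average = static_cast<double>(total) / summary.use.size();
    return summary;
}
```

For the readings 19321, 19324, 19330 this gives the intervals 3 and 6 and an average of 4.5.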
Thing is, we have N pairs of integers, as an example:
23 65
45 66
22 65
80 20
30 11
11 20
We say one pair is bigger than another if both of its numbers are greater than the corresponding numbers of the other pair, or if the first numbers are equal and its second number is bigger, or vice versa. Otherwise the two pairs can't be compared, and you can't establish which one is bigger.
The idea is to know, for each pair, how many pairs it is bigger than (in the example, the first pair is bigger than the third and the last one, therefore the answer for the first is 2).
The trivial solution is O(n^2): simply compare every pair to every other one and add one to a counter for each positive match.
Can anybody come up with a faster idea?
I have implemented the simple O(n^2) solution, which reads from "sumos.in":
#include <iostream>
#include <fstream>
#define forn(i, x, N) for(i=x; i<N; i++)
using namespace std;
ifstream fin("sumos.in");
ofstream fout("sumos.out");
struct sumo{
int peso, altura;
};
bool operator < (sumo A, sumo B) {
    // Equal height: compare by weight only.
    if( A.altura == B.altura )
        return A.peso < B.peso;
    // Equal weight: compare by height only.
    if( A.peso == B.peso )
        return A.altura < B.altura;
    // Otherwise both components must be strictly smaller.
    return (A.altura < B.altura) && (A.peso < B.peso);
}
int L;
sumo T[100000];
int C[100000];
int main()
{
    int i, j;
    fin >> L;
    forn(i, 0, L)
        fin >> T[i].peso >> T[i].altura;
    forn(i, 0, L)
        forn(j, 0, L)
            if( j!=i )
                if( T[j]<T[i] )
                    C[i]++;
    forn(i, 0, L)
        fout << C[i] << endl;
    return 0;
}
Example of input:
10
300 1500
320 1500
299 1580
330 1690
330 1540
339 1500
298 1700
344 1570
276 1678
289 1499
Outputs:
1
2
1
6
3
3
2
5
0
0
I solved this problem by using a segment tree. If you wish to see the implementation: http://pastebin.com/Q3AEF1WY
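For readers who cannot open the link, here is a minimal sketch of one O(n log n) approach. It is not the linked implementation: it swaps the segment tree for a Fenwick (binary indexed) tree, which supports the same prefix-count queries, and the names Fenwick and countSmaller are hypothetical. The idea is to sort the pairs lexicographically, then for each pair count how many already-inserted pairs have a second component that is less than or equal to its own. Identical pairs are handled as one group so they do not count each other, matching the operator< in the question.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

// Fenwick tree over 1-based positions; each position stores a count.
struct Fenwick {
    std::vector<int> tree;
    explicit Fenwick(int n) : tree(n + 1, 0) {}
    void add(int pos) {
        assert(pos >= 1 && pos < static_cast<int>(tree.size()));
        for (; pos < static_cast<int>(tree.size()); pos += pos & -pos) ++tree[pos];
    }
    int prefixCount(int pos) const {  // number of inserted positions <= pos
        int total = 0;
        for (; pos > 0; pos -= pos & -pos) total += tree[pos];
        return total;
    }
};

// For each pair, counts the pairs q != p with q.first <= p.first and
// q.second <= p.second. Results are returned in input order.
std::vector<int> countSmaller(const std::vector<std::pair<int, int>>& pairs) {
    const int n = static_cast<int>(pairs.size());
    // Coordinate-compress the second components to ranks 1..m.
    std::vector<int> ys;
    ys.reserve(n);
    for (const auto& p : pairs) ys.push_back(p.second);
    std::sort(ys.begin(), ys.end());
    ys.erase(std::unique(ys.begin(), ys.end()), ys.end());

    // Process the pairs in lexicographic order.
    std::vector<int> order(n);
    std::iota(order.begin(), order.end(), 0);
    std::sort(order.begin(), order.end(),
              [&](int a, int b) { return pairs[a] < pairs[b]; });

    Fenwick seen(static_cast<int>(ys.size()));
    std::vector<int> result(n, 0);
    for (int i = 0; i < n;) {
        int j = i;  // [i, j) is a group of identical pairs
        while (j < n && pairs[order[j]] == pairs[order[i]]) ++j;
        const int rank = static_cast<int>(
            std::lower_bound(ys.begin(), ys.end(), pairs[order[i]].second) -
            ys.begin()) + 1;
        const int smaller = seen.prefixCount(rank);  // query before inserting
        for (int k = i; k < j; ++k) result[order[k]] = smaller;
        for (int k = i; k < j; ++k) seen.add(rank);
        i = j;
    }
    return result;
}
```

On the ten-sumo input from the question this returns 1 2 1 6 3 3 2 5 0 0, the same values as the quadratic program.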
I think I came up with a solution to this, but it is rather complex. The basic idea is that there are groups in which the pairs can be arranged in dominated order, for example:
11 20    30 11
22 65    80 20
23 65
45 66
If you start thinking about taking your pairs and trying to create these groupings, you realize you will end up with a tree structure. For example, imagine we add the pair 81 19 to the list, plus a root pair (-∞, -∞):
          (-∞, -∞)
          /      \
      11 20      30 11 ---\
      22 65      80 20    81 19
      23 65
      45 66
If you follow the path from a node to the root, you count how many pairs the current pair dominates. From this example it looks like you can use binary search to figure out where to insert a pair into the structure. This is where the complexity troubles start: you can't do a binary search or insertion on a linked list. However, there is a very neat data structure called a skip list that you might use. It lets you search and insert in O(log n) time.
There's still one problem. What if there are tons of these groupings? Imagine a list like
11 20
12 19
13 18
14 17
Your tree structure will look like:
            (-∞, -∞)
       /     /     \     \
   11 20  12 19  13 18  14 17
Again, use skip lists to order these nodes. I think this will require two different kinds of nodes in the tree: a horizontal type like above and a vertical type like in the first examples. When you are done constructing the tree, traverse it with DFS while recording the current depth to associate each pair with the number of nodes it dominates.
If the above algorithm is correct, you could insert a pair into the tree in O(log n) time and thus all the pairs in O(n log n) time. The DFS part takes O(n) time, so constructing the tree and associating each pair with the number it dominates takes O(n log n) time. You can sort the pairs by number of dominations in O(n log n) time, so the whole process takes O(n log n) time.
Again there is probably a simpler way to do this. Good luck!
You can use a sort, like this:
int z[] = {23,65,45,66,22,65,80,20,30,11,11,20};
int n = sizeof z / sizeof z[0]; // number of elements in z
int i, j, k, tmp;
for (i=1; i<n; i++){
    j = n-i-1;
    for (k=0; k<=j; k++)
        // Pay attention to this block: it swaps adjacent elements that are out of order.
        if (z[k]>z[k+1]){
            tmp = z[k];
            z[k] = z[k+1];
            z[k+1] = tmp;
        }
}
I have an audio signal of length 12769. I'm trying to perform an STFT on it by breaking it into windows of 1024 samples. This gives me 12 full windows with 481 points remaining. Since I need 543 (1024 - 481) more points to make up 1024 samples, I used the following code to zero pad.
f = [a zeros(1,542)];
where a is the audio file.
However, I get an error saying
??? Error using ==> horzcat
CAT arguments dimensions are not consistent.
How can I overcome this?
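Your vector a is a column vector and cannot be horizontally concatenated with the row vector zeros(1,542). Use a column of zeros and vertical concatenation instead, [a; zeros(543,1)]. Note that 543 zeros, not 542, are needed to reach 13 full windows.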
However, it is much easier to just use
f = a;
f(1024*ceil(end/1024)) = 0;
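MATLAB will zero pad the vector up to the next multiple of 1024 elements, and this works regardless of whether the array is a column or a row. (If the length is already an exact multiple of 1024, this assignment overwrites the last sample with 0, so guard against that case.)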
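The length arithmetic behind that one-liner can be spelled out explicitly. The following is a small C++ sketch, purely illustrative and unrelated to MATLAB's internals; the names paddedLength and zeroPad are hypothetical. It rounds the length up to the next multiple of the window size and leaves a signal untouched when its length is already an exact multiple.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Smallest multiple of window that is >= n (integer ceiling division).
std::size_t paddedLength(std::size_t n, std::size_t window) {
    assert(window > 0);
    return (n + window - 1) / window * window;
}

// Returns a copy of signal with zeros appended up to the padded length.
// No samples are overwritten; an exact multiple is returned unchanged.
std::vector<double> zeroPad(std::vector<double> signal, std::size_t window) {
    signal.resize(paddedLength(signal.size(), window), 0.0);
    return signal;
}
```

For the signal in the question, paddedLength(12769, 1024) is 13312, that is 13 windows, which means 543 zeros are appended.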
You can either remove the excess 481 samples using
Total_Samples = length(a);
for i = 1 : Total_Samples-481
    a_new(i) = a(i);
end
or you could add an additional 543 zero samples by using
Total_Samples = length(a);
for i = Total_Samples+1 : Total_Samples+543
    a(i) = 0;
end
I'm having a hard time using sscanf to scan the hours and minutes from a list. Below is a small snippet of the list.
1704 86 2:30p 5:50p Daily
1711 17 10:40a 2:15p 5
1712 86 3:10p 6:30p 1
1731 48 6:25a 9:30a 156
1732 100 10:15a 1:30p Daily
1733 6 2:15p 3:39p Daily
I've tried this, but it keeps giving me a segmentation fault. (I'm putting this information into structures.)
for(i=0;i<check_enter;i++){
sscanf(all_flights[i],
"%d %d %d:%d%c %d:%d%c %s",
&all_flights_divid[1].flight_number,
&all_flights_divid[i].route_id,
&all_flights_divid[i].departure_time_hour,
&all_flights_divid[i].departure_time_minute,
&all_flights_divid[i].departure_time_format,
&all_flights_divid[i].arrival_time_minute,
&all_flights_divid[i].arrival_time_minute,
&all_flights_divid[i].arrival_time_format,
&all_flights_divid[i].frequency);
printf("%d ",all_flights_divid[i].flight_number);
printf("%d ",all_flights_divid[i].route_id);
printf("%d ",all_flights_divid[i].departure_time_hour);
printf("%d ",all_flights_divid[i].departure_time_minute);
printf("%c ",all_flights_divid[i].departure_time_format);
printf("%d ",all_flights_divid[i].arrival_time_hour);
printf("%d ",all_flights_divid[i].arrival_time_minute);
printf("%c ",all_flights_divid[i].arrival_time_format);
printf("%s\n",all_flights_divid[i].frequency);
}
This is how I declared it.
struct all_flights{
int flight_number;
int route_id;
int departure_time_hour;
int departure_time_minute;
char departure_time_format;
int arrival_time_hour;
int arrival_time_minute;
char arrival_time_format;
char frequency[10];
};
struct all_flights all_flights_divid[3000];
These are the results I get
0 86 2 30 p 0 50 p Daily
0 17 10 40 a 0 15 p 5
0 86 3 10 p 0 30 p 1
0 48 6 25 a 0 30 a 156
0 100 10 15 a 0 30 p Daily
0 6 2 15 p 0 39 p Daily
Look carefully at your list of output targets in sscanf. Do you see the difference between &all_flights_divid[i].departure_time_minute and all_flights_divid[i].departure_time_format? Similarly for .arrival_time_format and .frequency.
What do you think the & ampersand is for? Hint: what is one way of returning multiple values from a single function call, and what does this have to do with the & ampersand?
A segmentation fault arises when your program tries to write data into memory the operating system has never instructed the CPU to make available to the program. Segmentation faults do not always occur when data is misplaced, because sometimes data is misplaced within available memory. By way of analogy, if you inadvertently put a book in the wrong place on the bookshelf, you'll not easily find the book later, but the book is still on a bookshelf and does not seem to anyone to be out of place. On the other hand, if you inadvertently put the same book in the refrigerator, well, when mother goes to get the milk she's going to issue you a segmentation fault! That's the analogy, anyway.
In general, it is hard to guess whether misplacing data will cause a segmentation fault (as misplaced into the refrigerator) or not (as misplaced on the bookshelf) until you run the program. The segmentation fault (refrigerator) is preferable because it makes the mistake obvious, so the operating system tries to give you as many segmentation faults as it can by affording the program as little memory as possible.
I am avoiding giving a 100 percent direct answer because of your "homework" tag. See if you cannot figure out the & ampersand matter, then come back here if it still does not make sense.
Your pointers are all messed up. Perhaps a series of local variables specifically for reading this stuff in would help you organize this all in your head.
int flightNum, routeID, depHour, depMin, arrHour, arrMin;
char depFormat, arrFormat;
char freq[10]; /* a real buffer; an uninitialized char * would point nowhere */
for(i=0;i<check_enter;i++){
    sscanf(
        all_flights[i], "%d %d %d:%d%c %d:%d%c %9s",
        &flightNum, &routeID,
        &depHour, &depMin, &depFormat,
        &arrHour, &arrMin, &arrFormat,
        freq /* an array already decays to a pointer, so no & is needed */
    );
    all_flights_divid[i].flight_number = flightNum;
    all_flights_divid[i].route_id = routeID;
    all_flights_divid[i].departure_time_hour = depHour;
    all_flights_divid[i].departure_time_minute = depMin;
    all_flights_divid[i].departure_time_format = depFormat;
    all_flights_divid[i].arrival_time_hour = arrHour;
    all_flights_divid[i].arrival_time_minute = arrMin;
    all_flights_divid[i].arrival_time_format = arrFormat;
    strcpy(all_flights_divid[i].frequency, freq);
}
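As a self-contained illustration of the same idea, here is a minimal C++ sketch that scans one line straight into a struct and checks how many fields sscanf actually converted. The names Flight and parseFlight are hypothetical; the field layout mirrors the struct in the question. Note that in the posted question the flight number is written to index [1] and arrival_time_minute is passed twice, which is consistent with the zeros in the first and sixth columns of the posted output.

```cpp
#include <cassert>
#include <cstdio>

// Same fields as struct all_flights in the question.
struct Flight {
    int flight_number;
    int route_id;
    int departure_time_hour;
    int departure_time_minute;
    char departure_time_format;  // 'a' or 'p'
    int arrival_time_hour;
    int arrival_time_minute;
    char arrival_time_format;    // 'a' or 'p'
    char frequency[10];
};

// Parses one line such as "1704 86 2:30p 5:50p Daily".
// Returns true only if all nine fields were converted.
bool parseFlight(const char* line, Flight* out) {
    assert(line != nullptr && out != nullptr);
    const int converted = std::sscanf(
        line, "%d %d %d:%d%c %d:%d%c %9s",  // %9s leaves room for the '\0'
        &out->flight_number, &out->route_id,
        &out->departure_time_hour, &out->departure_time_minute,
        &out->departure_time_format,
        &out->arrival_time_hour, &out->arrival_time_minute,
        &out->arrival_time_format,
        out->frequency);  // array decays to char*, so no & here
    return converted == 9;
}
```

Checking the return value makes a malformed line show up as a failed parse instead of leaving stale or uninitialized values in the struct.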