Summing rows in a file - arrays

I want to sum the rows that share the same value in one column. Is this possible with awk, or in some other simple way?
Date Hour Requests Success Error
10-Apr 11 1 1 0
10-Apr 13 1 1 0
10-Apr 14 1 1 0
10-Apr 18 1 1 0
10-Apr 9 1 1 0
10-Apr 11 1 1 0
10-Apr 12 3 3 0
10-Apr 13 2 1 1
10-Apr 14 2 2 0
10-Apr 15 1 1 0
10-Apr 16 1 1 0
10-Apr 12 3 3 0
10-Apr 13 4 1 3
10-Apr 14 1 1 0
10-Apr 16 2 2 0
10-Apr 18 1 1 0
10-Apr 10 3 3 0
10-Apr 11 1 1 0
10-Apr 12 3 3 0
10-Apr 13 1 1 0
10-Apr 14 2 2 0
10-Apr 15 2 2 0
10-Apr 16 2 2 0
10-Apr 17 2 2 0
From the table above, I want to sum the rows (requests, successes, and errors) for each hour, and the output should look like this:
Date Hour Requests Success Error
10-Apr 9 1 1 0
10-Apr 10 3 3 0
10-Apr 11 3 3 0
10-Apr 12 9 9 0
10-Apr 13 8 4 4
10-Apr 14 6 6 0
10-Apr 15 3 3 0
10-Apr 16 5 5 0
10-Apr 17 2 2 0
10-Apr 18 2 2 0

Using GNU awk for true multi-dimensional arrays and sorted_in:
$ cat tst.awk
NR==1 { print; next }
!seen[$1]++ { dates[++numDates] = $1 }
{ for (i=3;i<=NF;i++) sum[$1][$2][i] += $i }
END {
PROCINFO["sorted_in"] = "@ind_num_asc"
for (dateNr=1; dateNr<=numDates; dateNr++) {
date = dates[dateNr]
for (hr in sum[date]) {
printf "%s %s ", date, hr
for (i=3;i<=NF;i++) {
printf "%s%s", sum[date][hr][i], (i<NF?OFS:ORS)
}
}
}
}
$ awk -f tst.awk file | column -t
Date Hour Requests Success Error
10-Apr 9 1 1 0
10-Apr 10 3 3 0
10-Apr 11 3 3 0
10-Apr 12 9 9 0
10-Apr 13 8 4 4
10-Apr 14 6 6 0
10-Apr 15 3 3 0
10-Apr 16 5 5 0
10-Apr 17 2 2 0
10-Apr 18 2 2 0
I wasn't sure if your fields were space or tab separated so made no attempt to format the output within awk.
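For readers without GNU awk, the same grouping logic can be sketched in Python. This is only an illustration, not part of the awk answer: the function name `sum_by_hour` is hypothetical, and it assumes whitespace-separated fields with a single header line, as in the sample.

```python
from collections import defaultdict

def sum_by_hour(lines):
    """Sum every column after Date and Hour, grouped by (date, hour).

    Assumes whitespace-separated fields and one header line.
    """
    header, *rows = [line.split() for line in lines if line.strip()]
    date_order = {}  # remembers the order in which dates first appear
    totals = defaultdict(lambda: [0] * (len(header) - 2))
    for date, hour, *values in rows:
        date_order.setdefault(date, len(date_order))
        sums = totals[(date, int(hour))]
        for i, value in enumerate(values):
            sums[i] += int(value)
    # Dates keep first-seen order; hours are sorted numerically within a date.
    keys = sorted(totals, key=lambda k: (date_order[k[0]], k[1]))
    return [header] + [
        [date, str(hour), *map(str, totals[(date, hour)])] for date, hour in keys
    ]
```

Calling `sum_by_hour(open("file"))` and joining each returned row with spaces should give the same table as the awk script, assuming the input matches the sample layout.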

Related

Problems with setting array elements in Forth

I am writing code in Forth that should create a 12x12 array of random numbers from 1 to 8.
create big_array 144 allocate drop
: reset_array big_array 144 0 fill ;
reset_array
variable rnd here rnd !
: random rnd @ 31421 * 6927 + dup rnd ! ;
: choose random um* nip ;
: random_fill 144 1 do 8 choose big_array i + c! loop ;
random_fill
: Array_@ 12 * + big_array swap + c@ ;
: show_small_array cr 12 0 do 12 0 do i j Array_@ 5 u.r loop cr loop ;
show_small_array
However, I notice that elements 128 to 131 of my array are always much larger than expected:
0 4 0 4 2 6 0 5 2 5 7 3
6 3 7 3 7 7 3 1 5 0 6 1
0 3 3 0 3 1 0 7 2 0 4 5
3 7 6 6 2 1 0 2 3 4 2 7
4 7 1 5 3 5 7 2 3 5 3 6
3 0 6 4 1 3 3 2 5 4 4 7
3 2 1 4 3 4 3 7 2 6 5 5
2 4 4 3 4 5 4 4 6 5 6 0
2 5 2 7 3 1 5 0 1 4 6 7
2 0 3 3 0 7 3 6 4 1 3 6
0 1 1 6 0 3 0 2 169 112 41 70
7 2 3 1 2 2 7 6 0 5 1 2
Moreover, when I try to change the value of these elements individually, this causes the other three elements to change value. For example, if I code:
9 choose big_array 128 + c!
then the array will become:
0 4 0 4 2 6 0 5 2 5 7 3
6 3 7 3 7 7 3 1 5 0 6 1
0 3 3 0 3 1 0 7 2 0 4 5
3 7 6 6 2 1 0 2 3 4 2 7
4 7 1 5 3 5 7 2 3 5 3 6
3 0 6 4 1 3 3 2 5 4 4 7
3 2 1 4 3 4 3 7 2 6 5 5
2 4 4 3 4 5 4 4 6 5 6 0
2 5 2 7 3 1 5 0 1 4 6 7
2 0 3 3 0 7 3 6 4 1 3 6
0 1 1 6 0 3 0 2 2 12 194 69
7 2 3 1 2 2 7 6 0 5 1 2
Do you have any idea why these specific elements are always impacted and if there is a way to prevent this?
Better readability and less error prone: 144 allocate ⇨ 144 chars allocate
A mistake: create big_array 144 allocate drop ⇨ create big_array 144 chars allot
A mistake: random um* nip ⇨ random swap mod
A mistake: 144 1 do ⇨ 144 0 do
An excessive operation: big_array swap + ⇨ big_array +
And add the stack comments, please. Especially, when you ask for help.
Do you have any idea why these specific elements are always impacted and if there is a way to prevent this?
Because you are using memory in the dictionary space without reserving it. That memory is also used by the Forth system itself.

How to control the format to export discriminant analysis results to word by using asdoc in Stata?

I want to export the results of a discriminant analysis using asdoc in Stata.
I want the output to:
1. Show three decimal digits.
2. Compress the table to fit on one page of the Word document.
However, the format of the results is horrible. I used dec(3), but it does not work. I read "help asdoc" in Stata, but it is all about regression.
Does anyone know how to export the complete results of a discriminant analysis to Word in a nice format?
Thank you in advance.
The following is the sample data and the asdoc code that I use.
input area Revenue age child_number grocery_expense credit_card exercise_week social_week
1 99336 76 1 22453 5 3 1
1 59092 75 4 16995 6 1 3
1 68614 49 0 37709 0 7 5
1 84805 55 3 21642 0 3 1
1 66138 41 3 10490 2 4 7
1 90238 43 2 30254 5 6 4
1 60466 49 2 18136 1 0 4
1 46575 64 0 25053 6 6 7
2 97811 40 4 36925 4 6 5
2 61862 40 0 14480 5 5 6
2 58071 73 0 24754 4 0 1
2 42539 66 2 19903 3 1 6
2 62074 56 3 12560 3 3 7
2 71619 34 2 24523 6 3 6
2 51281 74 2 23625 4 6 3
3 40990 25 3 38943 4 7 4
3 44567 73 2 39898 1 4 7
3 73586 42 2 20159 0 2 3
3 44907 44 3 31378 1 1 6
3 79352 20 3 39968 6 6 1
3 55647 50 1 27122 0 3 6
3 80943 43 1 15177 2 7 4
3 88892 77 2 22537 4 2 7
4 91735 74 3 27505 0 5 2
4 61224 60 5 12374 5 1 0
4 72192 68 4 36817 2 6 1
4 87486 59 0 34846 6 5 1
4 53131 52 4 12584 5 1 1
4 49083 33 5 30652 3 0 5
4 47408 49 0 28938 1 6 0
4 74647 52 2 15291 0 5 6
5 81643 37 0 37993 2 4 2
5 42371 46 1 33436 6 5 4
5 74074 24 3 16618 5 6 7
5 63502 34 3 19887 1 4 3
5 86779 31 5 37290 6 3 4
5 45842 45 5 20383 2 1 5
5 59835 42 5 30708 4 2 1
5 60486 38 2 36167 3 6 2
5 49099 58 0 13157 4 3 7
5 71692 37 5 36317 4 6 3
5 91406 45 5 12451 5 7 1
6 42742 48 1 39088 5 2 0
6 54538 21 2 19657 0 7 3
6 49323 69 4 37173 5 5 5
6 50053 54 4 32193 2 7 7
6 99139 48 1 14647 4 4 1
6 97908 26 0 14319 6 1 4
6 46504 27 1 39478 4 6 2
6 92330 28 3 23676 1 3 0
6 93926 34 3 10871 1 3 3
6 81890 51 2 16914 1 0 1
6 86679 79 1 35967 2 7 6
6 43783 67 2 31009 2 5 0
6 76770 66 5 13220 6 6 7
6 91160 67 2 29346 6 0 3
end
asdoc candisc Revenue age child_number grocery_expense credit_card exercise_week social_week , group (area) dec(3)

SAS: Non-sequential do loop within a data step

I would like to execute a do loop over a non-sequential set of values. As written, my code runs a new data step for each value, so the end product is a data table with a column added only for the final value of the loop. What I want is for the values in varlst to loop through the if/then statements, adding multiple columns to the table, without executing a new data step each time (which results in only one final column being added).
INPUT DATA
DATA have;
INPUT id order Q3 Q5 Q6 Q50 Q75 Q102;
DATALINES;
1 1 2 0 7 2 2 0
1 2 3 0 5 5 3 0
3 1 6 1 7 2 7 1
3 2 6 0 7 5 7 0
6 1 3 1 4 7 7 2
6 2 5 2 7 7 7 1
7 1 3 5 6 5 3 0
7 2 4 1 7 5 2 1
9 1 4 1 6 5 6 1
9 2 1 3 5 7 5 0
;
run;
/********/
%macro test;
%let varlst=2 3 5 6 50 75 102 /*more values*/;
%do i=1 %to %sysfunc(countw(&varlst));
%let value=%scan(&varlst,&i);
data want;
set have;
by id order;
if Q&value ne lag(Q&value) and not first.id then do;
Q&value.Equal = 0;
end;
if Q&value=lag(Q&value) and not first.id then do;
Q&value.Equal = 1;
end;
%end;
run;
%mend;
%test;
/**********/
OUTPUT
id order Q3 Q5 Q6 Q50 Q75 Q102 Q102Equal
1 1 2 0 7 2 2 0 .
1 2 3 0 5 5 3 0 1
3 1 6 1 7 2 7 1 .
3 2 6 0 7 5 7 0 0
6 1 3 1 4 7 7 2 .
6 2 5 2 7 7 7 1 0
7 1 3 5 6 5 3 0 .
7 2 4 1 7 5 2 1 0
9 1 4 1 6 5 6 1 .
9 2 1 3 5 7 5 0 0
Why don't you try using PROC COMPARE?
data have ;
input id order Q3 Q5 Q6 Q50 Q75 Q102;
cards;
1 1 2 0 7 2 2 0 .
1 2 3 0 5 5 3 0 1
3 1 6 1 7 2 7 1 .
3 2 6 0 7 5 7 0 0
6 1 3 1 4 7 7 2 .
6 2 5 2 7 7 7 1 0
7 1 3 5 6 5 3 0 .
7 2 4 1 7 5 2 1 0
9 1 4 1 6 5 6 1 .
9 2 1 3 5 7 5 0 0
;;;;
proc compare
data=have(where=(order=1))
compare=have(where=(order=2))
outdiff out=want
;
id id ;
var q: ;
run;
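The target the question describes, one QnEqual flag per variable produced in a single pass, can be sketched outside SAS. The Python below is only an illustration: the function name `add_equal_flags` and the list-of-dicts row layout are assumptions, and `None` stands in for a SAS missing value.

```python
def add_equal_flags(rows, questions):
    """Add a Q<n>Equal flag for every question number in one pass.

    rows must already be sorted by id and order. The flag is None on the
    first row of each id, 1 when the value equals the previous row's value
    for the same id, and 0 otherwise.
    """
    result = []
    previous = None
    for row in rows:
        flagged = dict(row)
        first_in_id = previous is None or previous["id"] != row["id"]
        for q in questions:
            column = f"Q{q}"
            if first_in_id:
                flagged[f"{column}Equal"] = None
            else:
                flagged[f"{column}Equal"] = int(row[column] == previous[column])
        result.append(flagged)
        previous = row
    return result
```

The key point is that the loop over question numbers sits inside the single pass over the rows, so every flag column survives in the output.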

How can I build a decreasing lower triangular matrix? [closed]

Do you know if it is possible to get the following triangular matrix
[ N:-1:1; (N-1):-1:0; (N-2):-1:0 0; (N-3):-1:0 0 0; ....] without writing every line with horzcat and without using a loop?
thanks all
Fred
Is this what you want?
N = 8;
result = flipud(tril(toeplitz(1:N)));
This gives
result =
8 7 6 5 4 3 2 1
7 6 5 4 3 2 1 0
6 5 4 3 2 1 0 0
5 4 3 2 1 0 0 0
4 3 2 1 0 0 0 0
3 2 1 0 0 0 0 0
2 1 0 0 0 0 0 0
1 0 0 0 0 0 0 0
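The matrix above also has a simple closed form: with 0-based row index i and column index j, the entry is max(N - i - j, 0). A minimal Python sketch of that formula follows, for illustration only; the function name `decreasing_triangular` is hypothetical, and the comprehension is a loop in disguise, so it does not meet the "no loop" requirement in MATLAB terms.

```python
def decreasing_triangular(n):
    """Return the n-by-n matrix whose (i, j) entry (0-based) is max(n - i - j, 0)."""
    return [[max(n - i - j, 0) for j in range(n)] for i in range(n)]
```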
Maybe something like this:
N=10;
M=triu(gallery('circul',N)).'
M =
1 0 0 0 0 0 0 0 0 0
2 1 0 0 0 0 0 0 0 0
3 2 1 0 0 0 0 0 0 0
4 3 2 1 0 0 0 0 0 0
5 4 3 2 1 0 0 0 0 0
6 5 4 3 2 1 0 0 0 0
7 6 5 4 3 2 1 0 0 0
8 7 6 5 4 3 2 1 0 0
9 8 7 6 5 4 3 2 1 0
10 9 8 7 6 5 4 3 2 1
Or did you want this:
M=fliplr(triu(gallery('circul',N)))
M =
10 9 8 7 6 5 4 3 2 1
9 8 7 6 5 4 3 2 1 0
8 7 6 5 4 3 2 1 0 0
7 6 5 4 3 2 1 0 0 0
6 5 4 3 2 1 0 0 0 0
5 4 3 2 1 0 0 0 0 0
4 3 2 1 0 0 0 0 0 0
3 2 1 0 0 0 0 0 0 0
2 1 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0
I couldn't really tell from your code sample which direction you wanted this to go.
The power of bsxfun compels you!
[[N:-1:1]' reshape(repmat([N-1:-1:1]',1,N).*bsxfun(@ge,[1:N-1]',1:N),N,[])]
Sample run -
>> N = 8;
>> [[N:-1:1]' reshape(repmat([N-1:-1:1]',1,N).*bsxfun(@ge,[1:N-1]',1:N),N,[])]
ans =
8 7 6 5 4 3 2 1
7 6 5 4 3 2 1 0
6 5 4 3 2 1 0 0
5 4 3 2 1 0 0 0
4 3 2 1 0 0 0 0
3 2 1 0 0 0 0 0
2 1 0 0 0 0 0 0
1 0 0 0 0 0 0 0
This is inspired by another bsxfun-based solution to a very similar question: "Replicate vector and shift each copy by 1 row down without for-loop". There you can see similar solutions and related benchmarks, since performance seems to be a concern here.

how to add a factor to a sequence?

I'm analysing a dataset with some data-mining tools. The response variable has ten levels, and I'm trying to build a classifier.
Here is the problem: when I use the nnet and bagging functions, the results are not very good, and the 5th level does not appear in the predictions at all.
I want to use a confusion matrix to analyse the classifier, but because the 5th level is missing from the predictions, I can't get a well-formed matrix. How can I get a well-formed matrix, i.e. a 10*10 matrix?
The confusion matrix:
library("mda")#This is where **confusion** comes from
> confusion(pre.bag$class,CLASS)#here confusion acts like table
true
predicted 1 2 3 4 6 7 8 9 10 5
1 338 9 6 0 5 12 10 1 15 46
2 9 549 1 59 18 0 3 0 0 6
3 18 1 44 0 0 0 2 0 0 4
4 0 1 0 21 0 0 0 0 0 0
6 2 13 0 1 299 2 9 0 0 0
7 5 2 1 0 10 231 6 0 1 0
8 0 0 0 0 0 5 76 0 0 0
9 5 1 0 0 0 0 0 62 0 0
10 7 3 1 0 0 2 1 6 181 16
attr(,"error")
[1] 0.1231743
attr(,"mismatch")
[1] 0.03386642
Try this:
pred <- factor(pre.bag$class, levels=levels(CLASS) )
confusion(pred, CLASS)
(Tested with an fda-object.)
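The idea behind fixing the factor levels is that the table's rows and columns come from the full level list, not from the values that happen to occur. A small Python sketch of that idea, for illustration only (the function name `confusion_matrix` and the list-based inputs are assumptions, not the mda API):

```python
def confusion_matrix(predicted, actual, levels):
    """Count (predicted, actual) pairs into a square table over all levels.

    Rows are predicted levels and columns are true levels, both in the
    order given by levels, so a level that is never predicted still gets
    a row of zeros.
    """
    index = {level: k for k, level in enumerate(levels)}
    matrix = [[0] * len(levels) for _ in levels]
    for p, a in zip(predicted, actual):
        matrix[index[p]][index[a]] += 1
    return matrix
```

With ten levels this always returns a 10*10 table, whether or not level 5 is ever predicted.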
