Stata: Reshape command number of j dimensions - loops

After reshaping a dataset from long to wide using
reshape wide v1 v2 v3, i(i1 i2) j(jdimens)
I need to run a loop exactly max(jdimens) times. Example: Assume that the above code creates the new variables jdimens1 jdimens2 and jdimens3. Then I would like to have the loop run three times.
Any ideas how this can be neatly done?

You can count the variables:
foreach i of varlist jdimens* {
di "iteration `i'"
}
reshape also leaves some characteristics behind that you can use if you don't want to specify names:
local myvars: char _dta[ReS_Xij_wide1]
foreach i of local myvars {
di "iteration `i'"
}

Related

looping multiple vectors in a for loop

I'm programming an OBJ loader, and this is a small part of its code. I want to loop through two different vectors in a single for loop. The for loop below doesn't compile, but is it possible to implement this concept somehow, i.e. to make the condition for ((GLdouble* val : container)&&(GLdouble* val2:NContainer)) work?
aclass e;
std::vector<GLdouble*> container = e.function();
Nclass n;
std::vector<GLdouble*> Ncontainer = n.function();
for ((GLdouble* val : container)&&(GLdouble* val2:NContainer))
{
glVertex3dv(val);
glNormal3dv(val2);
}
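One common way to get the effect asked for above is to walk both vectors by index in the same loop. The sketch below is only an illustration, not the poster's code: it uses plain double pointers (assuming GLdouble is an alias for double, as in the usual OpenGL headers), and the helper name forEachPair is hypothetical. It also assumes both vectors hold the same number of entries.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: hands element i of both vectors to the visitor
// in the same iteration. Assumes the two vectors have equal length.
template <typename Visitor>
void forEachPair(const std::vector<double*>& vertices,
                 const std::vector<double*>& normals,
                 Visitor visit) {
    assert(vertices.size() == normals.size());
    for (std::size_t i = 0; i < vertices.size(); ++i) {
        visit(vertices[i], normals[i]);
    }
}
```

In the loader, the visitor would be a lambda whose body calls glVertex3dv on its first argument and glNormal3dv on its second, with container and Ncontainer passed as the two vectors.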

remove empty rows of an Eigen::SparseMatrix

I have built a sparse matrix mat from a list of triplets
Eigen::SparseMatrix<double, Eigen::RowMajor> mat(Nbins,Ndata);
mat.setFromTriplets(tripletList.begin(), tripletList.end());
Now I would like to create a new matrix ret, which only contains the rows of the previous matrix which are not empty. I do it as follows
Eigen::SparseMatrix<double, Eigen::RowMajor> ret(Nbins,Ndata);
unsigned Nrow=0;
for (unsigned i=0; i<Nbins; ++i) {
auto mrow = mat.row(i);
if (mrow.sum()>0) {
ret.row(Nrow++) = mrow;
}
}
ret.conservativeResize(Nrow,Ndata);
However, doing it this way is slow and inefficient. Slow because quick profiling suggests it spends most of its time on ret.row(Nrow++) = mrow;. Inefficient because we are also copying all the data twice.
Is there a better solution? I feel one has to fiddle with the inner vectors but I get confused by them and I don't know how user-proof it is to play with them.
EDIT: In my application, matrices are row major, and I want to remove empty rows. mat is not needed, just ret. All coefficients are positive hence the way I check for nonzero rows. The triplets are sorted but column-major. There are no duplicate triplets.
Found it! Instead of writing a hand-made setFromTriplets, I went with a modification of the tripletList. The interface of Eigen::Triplet makes it very easy.
//get which rows are empty
std::vector<bool> has_value(Nbins,false);
for (auto tr : tripletList) has_value[tr.row()] = true;
//create map from old to new indices
std::map<unsigned,unsigned> row_map;
unsigned new_idx=0;
for (unsigned old_idx=0; old_idx<Nbins; old_idx++)
if(has_value[old_idx])
row_map[old_idx]=new_idx++;
//make new triplet list, dropping empty rows
std::vector<Eigen::Triplet<double> > newTripletList;
newTripletList.reserve(Ndata);
for (auto tr : tripletList)
newTripletList.push_back(
Eigen::Triplet<double>(row_map[tr.row()],tr.col(),tr.value()));
//form new matrix and return
Eigen::SparseMatrix<double, Eigen::RowMajor> ret(new_idx,Ndata);
ret.setFromTriplets(newTripletList.begin(), newTripletList.end());
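Because the old row indices are dense in [0, Nbins), the old-to-new lookup above could also be a plain vector instead of a std::map. The sketch below shows only that index compaction step. To stay free of dependencies it uses a hypothetical Entry struct in place of Eigen::Triplet<double>, and the function name compactRows is likewise made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for Eigen::Triplet<double>: (row, column, value).
struct Entry {
    int row;
    int col;
    double value;
};

// Renumbers rows so that empty rows disappear and returns the new row count.
// Entries are rewritten in place; their relative order is unchanged.
int compactRows(std::vector<Entry>& entries, int numRows) {
    // Mark every row that holds at least one entry.
    std::vector<bool> hasValue(static_cast<std::size_t>(numRows), false);
    for (const Entry& e : entries) {
        assert(e.row >= 0 && e.row < numRows);
        hasValue[static_cast<std::size_t>(e.row)] = true;
    }
    // Build the old-to-new lookup table; -1 marks a dropped (empty) row.
    std::vector<int> newIndex(static_cast<std::size_t>(numRows), -1);
    int next = 0;
    for (int r = 0; r < numRows; ++r) {
        if (hasValue[static_cast<std::size_t>(r)]) {
            newIndex[static_cast<std::size_t>(r)] = next++;
        }
    }
    // Rewrite each entry's row through the table.
    for (Entry& e : entries) {
        e.row = newIndex[static_cast<std::size_t>(e.row)];
    }
    return next;
}
```

The returned count plays the role of new_idx in the code above, i.e. the row dimension of the compacted matrix.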

R parallel code for two arrays

I'm trying to process two 3d arrays using parallel computing in R. I have a function that takes two vectors as input, so I need to loop through the rows and columns of my arrays. Doing this in serial code is simply too slow and R gets stuck as the arrays are large.
I've not found a solution for doing this with parallel functions and would appreciate any suggestions. I've tried parApply but do not know how to incorporate a second input, and mcmapply but it is hard to use over rows/cols. Ideally the output should also be an array of the same dimension.
Below is a reproducible example of what I'm trying to do in serial code. Any help on how this could be written in parallel code would be much appreciated!
fun <- function(a,b)
{
  a*b
}
input1 <- array(data=1:1000, dim=c(10,10,10))
input2 <- array(data=2:1001, dim=c(10,10,10))
result <- array(data=NA, dim=c(10,10,10))
# i runs over the second dimension and j over the third
for(i in 1:dim(input1)[2])
{ for(j in 1:dim(input1)[3]) {
  result[,i,j] <- fun(input1[,i,j], input2[,i,j])
}}
Here's one way to do it with the foreach package:
library(doSNOW)
library(abind)
cl <- makeSOCKcluster(parallel::detectCores())
registerDoSNOW(cl)
fun <- function(a,b) a*b
input1 <- array(rnorm(60), dim=c(4,5,3))
input2 <- array(rnorm(60), dim=c(4,5,3))
rdim <- dim(input1)[1:2]
comb <- function(...) abind(..., along=3)
result <-
foreach(i1=iapply(input1, 3), i2=iapply(input2, 3),
.multicombine=TRUE, .combine='comb') %dopar% {
r <- array(data=NA, dim=rdim)
for (i in 1:ncol(i1)) {
r[,i] <- fun(i1[,i], i2[,i])
}
r
}
The "iapply" function from the iterators package is used to split the 3D input arrays into 2D matrices. The result matrices are combined into a 3D array using the "abind" function from the abind package.
Note that I'm specifically using the "doSNOW" parallel backend because it sends data from the two "iapply" iterators to the workers and processes the results on-the-fly. This reduces the memory needed by the master process. The "doParallel" backend can't work on-the-fly because the "parallel" package doesn't export the necessary functions.

AutoHotKey Adding to Array within function call

I know the newer version is better, but my company does not allow me to use it, so this question is about AutoHotkey version 1.0.47.06.
I am trying to refactor my 400-line program by separating it into functions.
CaseNumberArray := "" ; The array to store all the case numbers
CaseNumberArrayCount := 0
; Helper function to load the case number into the array
ReadInputFile() {
Loop, Read, U:\case.txt
{
global CaseNumberArrayCount
CaseNumberArrayCount += 1 ; Increment the ArrayCount
CaseNumberArray%CaseNumberArrayCount% := A_LoopReadLine
current := CaseNumberArray%CaseNumberArrayCount%
}
}
CreateOutputHeader()
ReadInputFile()
MsgBox, There are %CaseNumberArrayCount% case(s) in the file.
Loop, %CaseNumberArrayCount%
{
case_number := CaseNumberArray%A_Index%
MsgBox, %case_number%
}
The last part of the code is testing if I can retrieve the case numbers I loaded into the array named CaseNumberArray, but it is currently all blank.
I studied this question, the author user1944441 wrote:
Important: YourArray must not be global and the counter in
YourArray%counter% must not be global, the rest doesn't matter.
I experimented by placing the global declarations in different locations, but it still does not work. I know that CaseNumberArrayCount is correctly stored, and the read loop works as well (when it is outside of a function). Is it possible to separate the code into a function?
Usually, global/local declarations are placed right below the function header, not somewhere in a subsequent code block. After all, these declarations apply to the entire function.
You have to distinguish between simple loop counter variables and variables holding the actual size of the array. In your code, CaseNumberArrayCount describes the size of CaseNumberArray whereas in the answer to which you're referring, it's a counter only used to iterate over the array, which might as well be local.
But you don't have to use two "variables" anyway. Your pseudo array (which can be accessed like CaseNumberArray1, CaseNumberArray2, CaseNumberArray3, ...) has an unused CaseNumberArray0, so why not store the size there?
A pseudo array is actually a collection of sequentially numbered variables. global CaseNumberArray (which by the way you didn't seem to try) will only allow access to the variable named CaseNumberArray, but not CaseNumberArray1 or CaseNumberArray2 and so on.
One solution would be to use Assume-global mode which makes every global variable accessible by default:
; Now, CaseNumberArray0 will hold the array length,
; rendering CaseNumberArrayCount unnecessary
CaseNumberArray0 := 0
; Helper function to load the case number into the array
ReadInputFile() {
; We want to access every global variable we have,
; beware of name conflicts within your function!
global
Loop, Read, test.txt
{
CaseNumberArray0 += 1
CaseNumberArray%CaseNumberArray0% := A_LoopReadLine
}
}
; Here's an alternative: Let AHK build the pseudo array!
ReadInputFileAlternative() {
global caseAlt0
FileRead, fileCont, test.txt
StringSplit, caseAlt, fileCont, `n, `r
}
ReadInputFile()
out := ""
Loop, %CaseNumberArray0%
{
out .= CaseNumberArray%A_Index% "`n"
}
MsgBox, There are %CaseNumberArray0% case(s) in the file:`n`n%out%
; Now, let's test the alternative!
ReadInputFileAlternative()
out := ""
Loop, %caseAlt0%
{
out .= caseAlt%A_Index% "`n"
}
MsgBox, There are %caseAlt0% case(s) in the alternative pseudo-array:`n`n%out%
Edit: "Real Arrays"
As suggested in the comments, here's what I would do instead: I would convince my boss to allow the use of an up-to-date version of AHK and then work with real arrays. This comes with several benefits:
Real arrays are fully managed by AHK, which means that things like inserting, removing, iterating and indexing can all automagically be done by AHK.
A real array resides in one real variable, meaning that you can pass it along functions and anywhere you want, without having to worry about the current scope and whether you can access it in the first place.
The array syntax is very similar to most other languages, making your code intuitive and easier to read. And maybe it helps you in the future when dealing with another language.
Primitive n-dimensional arrays (and primitive AHK objects in general) can be expressed using JSON. This provides you with an easy way to (de-)serialize AHK objects.
The following code snippet shows the two methods used above (reading loop and splitting), but with real arrays. You will notice that we don't need any global declarations anymore, since we now can declare the array inside our function, and simply pass it back to the caller. In my opinion, this is what functions should really look like: A "black box" that doesn't affect its surroundings.
; Method 1: Line by line
ReadLineByLine(file) {
out := []
Loop, Read, % file
{
out.Insert(A_LoopReadLine)
}
return out
}
; Method 2: StrSplit
ReadAndSplit(file) {
FileRead, fileCont, % file
return StrSplit(fileCont, "`n", "`r")
}
caseNumbers := ReadLineByLine("test.txt")
out := "ReadLineByLine() yields " caseNumbers.MaxIndex() " entries:`n`n"
; using the for loop
for idx, caseNumber in caseNumbers
{
out .= caseNumber "`n"
}
MsgBox % out
caseNumbers := ReadAndSplit("test.txt")
out := "ReadAndSplit() yields " caseNumbers.MaxIndex() " entries:`n`n"
; using the normal loop
Loop % caseNumbers.MaxIndex()
{
out .= caseNumbers[A_Index] "`n"
}
MsgBox % out
MsgBox % "The second item is " caseNumbers[2]

Math::Complex screwing up my array references

I'm trying to optimize some code here, and wrote two different simple subroutines that will subtract one vector from another. I pass a pair of vectors to these subroutines and the subtraction is then performed. The first subroutine uses an intermediary variable to store the result whereas the second one does an inline operation using the '-=' operator. The full code is located at the bottom of this question.
When I use purely real numbers, the program works fine and there are no issues. However, if I am using complex operands, then the original vectors (the ones originally passed to the subroutines) are modified! Why does this program work fine for purely real numbers but do this sort of data modification when using complex numbers?
Note my process:
Generate random vectors (either real or complex depending on the commented out code)
Print the main vectors to the screen
Perform the first subroutine subtraction (using the third variable intermediary within the subroutine)
Print the main vectors to the screen again to prove that they have not changed, no matter the use of real or complex vectors
Perform the second subroutine subtraction (using the inline computation method)
Print the main vectors to the screen again, showing that @main_v1 has changed when using complex vectors, but will not change when using real vectors (@main_v2 is unaffected)
Print the final answers to the subtraction, which are always the correct answers, regardless of real or complex vectors
The issue arises because in the case of the second subroutine (which is quite a bit faster), I don't want the @main_v1 vector changed. I need that vector to do further calculations down the road, so I need it to stay the same.
Any idea on how to fix this, or what I'm doing wrong? My entire code is below, and should be functional. I've been using the CLI syntax shown below to run the program. I choose 5 just to keep everything easy for me to read.
c:\> bench.pl 5 REAL
or
c:\> bench.pl 5 IMAG
#!/usr/local/bin/perl
# when debugging: add -w option above
#
use strict;
use warnings;
use Benchmark qw (:all);
use Math::Complex;
use Math::Trig;
use Time::HiRes qw (gettimeofday);
system('cls');
my $dimension = $ARGV[0];
my $type = $ARGV[1];
if(!$dimension || !$type){
print "bench.pl <n> <REAL | IMAG>\n";
print " <n> indicates the dimension of the vector to generate\n";
print " <REAL | IMAG> dictates to use real or complex vectors\n";
exit(0);
}
my @main_v1;
my @main_v2;
my @vector_sum1;
my @vector_sum2;
for($a=1;$a<=$dimension;$a++){
my $r1 = sprintf("%.0f", 9*rand)+1;
my $r2 = sprintf("%.0f", 9*rand)+1;
my $i1 = sprintf("%.0f", 9*rand)+1;
my $i2 = sprintf("%.0f", 9*rand)+1;
if(uc($type) eq "IMAG"){
# Using complex vectors has the issue
$main_v1[$a] = cplx($r1,$i1);
$main_v2[$a] = cplx($r2,$i2);
}elsif(uc($type) eq "REAL"){
# Using real vectors shows no issue
$main_v1[$a] = $r1;
$main_v2[$a] = $r2;
}else {
print "bench.pl <n> <REAL | IMAG>\n";
print " <n> indicates the dimension of the vector to generate\n";
print " <REAL | IMAG> dictates to use real or complex vectors\n";
exit(0);
}
}
# cmpthese(-5, {
# v1 => sub {@vector_sum1 = vector_subtract(\@main_v1, \@main_v2)},
# v2 => sub {@vector_sum2 = vector_subtract_v2(\@main_v1, \@main_v2)},
# });
# print "\n";
print "main vectors as defined initially\n";
print_vector_matlab(@main_v1);
print_vector_matlab(@main_v2);
print "\n";
@vector_sum1 = vector_subtract(\@main_v1, \@main_v2);
print "main vectors after the subtraction using 3rd variable\n";
print_vector_matlab(@main_v1);
print_vector_matlab(@main_v2);
print "\n";
@vector_sum2 = vector_subtract_v2(\@main_v1, \@main_v2);
print "main vectors after the inline subtraction\n";
print_vector_matlab(@main_v1);
print_vector_matlab(@main_v2);
print "\n";
print "subtracted vectors from both subroutines\n";
print_vector_matlab(@vector_sum1);
print_vector_matlab(@vector_sum2);
sub vector_subtract {
# subroutine to subtract one [n x 1] vector from another
# result = vector1 - vector2
#
my @vector1 = @{$_[0]};
my @vector2 = @{$_[1]};
my @result;
my $row = 0;
my $dim1 = @vector1 - 1;
my $dim2 = @vector2 - 1;
if($dim1 != $dim2){
syswrite STDOUT, "ERROR: attempting to subtract vectors of mismatched dimensions\n";
exit;
}
for($row=1;$row<=$dim1;$row++){$result[$row] = $vector1[$row] - $vector2[$row]}
return(@result);
}
sub vector_subtract_v2 {
# subroutine to subtract one [n x 1] vector from another
# implements the inline subtraction method for alleged speedup
# result = vector1 - vector2
#
my @vector1 = @{$_[0]};
my @vector2 = @{$_[1]};
my $row = 0;
my $dim1 = @vector1 - 1;
my $dim2 = @vector2 - 1;
if($dim1 != $dim2){
syswrite STDOUT, "ERROR: attempting to subtract vectors of mismatched dimensions\n";
exit;
}
for($row=1;$row<=$dim1;$row++){$vector1[$row] -= $vector2[$row]} # subtract inline
return(@vector1);
}
sub print_vector_matlab { # for use with outputting square matrices only
my (@junk) = (@_);
my $dimension = @junk - 1;
print "V=[";
for($b=1;$b<=$dimension;$b++){
# $temp_real = sprintf("%.3f", Re($junk[$b][$c]));
# $temp_imag = sprintf("%.3f", Im($junk[$b][$c]));
# $temp_cplx = cplx($temp_real,$temp_imag);
print "$junk[$b];";
# print "$temp_cplx,";
}
print "];\n";
}
I've even tried modifying the second subroutine so that it has the following lines, and it STILL alters the @main_v1 vector when using complex numbers. I am completely confused as to what's going on.
@result = @vector1;
for($row=1;$row<=$dim1;$row++){$result[$row] -= $vector2[$row]}
return(@result);
And I've tried this too; it still modifies @main_v1 with complex numbers:
for($row=1;$row<=$dim1;$row++){$result[$row] = $vector1[$row]}
for($row=1;$row<=$dim1;$row++){$result[$row] -= $vector2[$row]}
return(@result);
Upgrade Math::Complex to at least version 1.57. As the changelog explains, one of the changes in that version was:
Add copy constructor and arrange for it to be called appropriately, problem found by David Madore and Alexandr Ciornii.
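In Perl, an object is a blessed reference, so an array of Math::Complex objects is an array of references. Copying that array, as my @vector1 = @{$_[0]} does, copies only the references, so an in-place -= on an element of the copy modifies the same object that @main_v1 still points to. This is not true of real numbers, which are just ordinary scalars and are copied by value.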
If you change this:
$vector1[$row] -= $vector2[$row]
to this:
$vector1[$row] = $vector1[$row] - $vector2[$row]
you'll be good to go: that will set $vector1[$row] to refer to a new object, rather than modifying the existing one.
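The aliasing described above is not specific to Perl. As a rough analogy only (it says nothing about how Math::Complex is implemented), the C++ sketch below uses a vector of shared pointers to play the role of a Perl array of object references. The type alias RefVector and both function names are hypothetical.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <memory>
#include <vector>

using Cplx = std::complex<double>;
// Plays the role of a Perl array whose elements are object references.
using RefVector = std::vector<std::shared_ptr<Cplx>>;

// In-place variant: v1 is a copy of the caller's vector, but its elements
// still point at the caller's objects, so "-=" changes what the caller sees.
RefVector subtractInPlace(RefVector v1, const RefVector& v2) {
    assert(v1.size() == v2.size());
    for (std::size_t i = 0; i < v1.size(); ++i) {
        *v1[i] -= *v2[i];
    }
    return v1;
}

// Rebinding variant: each slot is pointed at a freshly allocated result,
// so the caller's objects are left untouched.
RefVector subtractRebind(RefVector v1, const RefVector& v2) {
    assert(v1.size() == v2.size());
    for (std::size_t i = 0; i < v1.size(); ++i) {
        v1[i] = std::make_shared<Cplx>(*v1[i] - *v2[i]);
    }
    return v1;
}
```

Both functions return the same differences; only subtractInPlace alters the values reachable through the caller's original vector, which mirrors the difference between the two Perl statements shown above.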
