Error: Error in dimnames(x) <- dn : length of 'dimnames' [2] not equal to array extent - database

I have the following dataset about the choices of different car brands and their attributes. I would like to create a matrix based on each attribute of the cars.
RespNum Task Concept Make Exterior.Design Interior.design
1 100086500 1 1 3 2 3
2 100086500 1 2 1 3 2
3 100086500 1 3 4 1 1
4 100086500 1 4 0 0 0
5 100086500 2 1 1 3 2
6 100086500 2 2 5 1 3
Driving.performance Driving.attributes Comfort Practibility Safety
1 1 1 1 3 3
2 3 3 3 2 1
3 2 2 2 1 2
4 0 0 0 0 0
5 3 2 1 1 3
6 1 3 3 3 2
Quality Equipment Sustainability Economy Price Response
1 2 1 1 3 1 0
2 1 3 3 1 3 0
3 3 2 2 2 2 1
4 0 0 0 0 0 0
5 3 2 1 1 4 0
6 1 3 3 3 8 0
I am using the function:
Make = attribcoding(6,4,'Other')
The first input (6) is the number of levels, the second (4) is the column position in the dataset, and the last ('Other') is the name of the outside option. However, I get the following error message:
Error in dimnames(x) <- dn :
length of 'dimnames' [2] not equal to array extent

Related

Nested for-loop: error variable already defined

I have a nested loop in Stata with four levels of foreach statements. With this loop, I am trying to create a new variable named strata that ranges from 1 to 40.
foreach x in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 {
foreach r in 1 2 3 4 5 {
foreach s in 1 2 {
foreach a in 1 2 3 4 {
gen strata= `x' if race==`r' & sex==`s' & age==`a'
}
}
}
}
I get an error :
"variable strata already defined"
Even with the error, the loop does assign strata = 1, but not the rest of the strata. All other cells are missing/empty.
Example data:
* Example generated by -dataex-. To install: ssc install dataex
clear
input byte(age sex race)
1 2 2
1 2 1
1 1 1
1 1 1
1 2 1
2 2 1
2 2 1
4 2 1
1 2 1
4 2 1
3 2 1
2 2 1
4 2 1
4 2 2
3 2 1
4 1 3
4 2 1
4 2 1
2 1 2
4 2 1
2 2 1
3 2 1
3 2 1
1 2 3
4 2 1
1 2 5
4 2 1
4 2 1
4 2 2
4 2 1
2 2 1
4 1 1
3 2 1
1 2 1
2 2 1
4 2 1
1 2 2
2 2 3
1 1 3
4 2 1
2 2 3
1 2 1
1 1 1
2 2 3
1 2 1
1 1 3
1 2 1
2 2 1
3 2 1
1 2 1
4 2 1
1 2 2
1 2 1
2 2 1
4 2 1
4 2 1
1 2 1
1 2 1
4 2 1
2 2 1
4 2 1
1 2 1
1 1 3
2 2 1
1 1 1
4 1 1
3 2 1
2 2 1
1 2 1
1 1 1
2 2 3
4 2 2
2 2 1
2 2 1
3 2 1
2 2 2
3 2 1
2 1 1
1 1 1
3 2 1
1 2 3
4 2 1
4 2 1
2 2 1
1 2 1
1 1 1
3 2 1
4 2 1
2 2 3
1 2 3
4 2 1
3 2 1
2 2 1
4 2 1
3 2 1
2 1 1
1 2 1
2 2 1
2 2 3
1 1 1
end
label values sex sex
label def sex 1 "male (1)", modify
label def sex 2 "female (2)", modify
label values race race
label def race 1 "non-Hispanic white (1)", modify
label def race 2 "black (2)", modify
label def race 3 "AAPI/other (3)", modify
label def race 5 "Hispanic (5)", modify
generate is for generating new variables. The second time your code reaches a generate statement, the code fails for the reason given.
One answer is that you need to generate your variable outside the loops and then replace inside.
For other reasons your code can be rewritten in stages.
First, integer sequences can be more easily and efficiently specified with forvalues, which can be abbreviated: I tend to write forval.
gen strata = .
forval x = 1/40 {
forval r = 1/5 {
forval s = 1/2 {
forval a = 1/4 {
replace strata = `x' if race==`r' & sex==`s' & age==`a'
}
}
}
}
Second, the code is flawed any way. Everything ends up as 40!
Third, you can do allocations much more directly, say by
gen strata = 8 * (race - 1) + 4 * (sex - 1) + age
This is a self-contained reproducible demonstration:
clear
set obs 5
gen race = _n
expand 2
bysort race : gen sex = _n
expand 4
bysort race sex : gen age = _n
gen strata = 8 * (race - 1) + 4 * (sex - 1) + age
isid strata
Clearly you can and should vary the recipe for a different preferred scheme.

Summarizing dataframe and sorting with inserting 0 if the target words doesn't exist

I have a tab-delimited data frame like this:
Sample 1
A 2
B 1
D 2
Sample 2
C 1
D 1
E 2
Sample 3
D 1
E 3
Sample 4
A 1
E 3
Sample 5
A 2
B 3
Sample 6
C 1
D 2
Sample 7
D 1
E 3
Sample 8
A 3
D 2
Sample 9
A 1
Sample 10
A 1
C 2
E 3
and I would like to transpose it and link the alphabets and the associated values. If the sample does not contain alphabets, I want "0" inserted.
So the modified dataframe I desire is
Sample A B C D E
1 2 1 0 2 0
2 0 0 1 1 1
3 0 0 0 1 3
4 1 0 0 0 3
5 2 3 0 0 0
6 0 0 1 2 0
7 0 0 0 1 3
8 3 0 0 2 0
9 1 0 0 0 0
10 1 0 2 0 3
I tried to summarize the datafreame, and then transpose it. When I used,
awk 'NR==1{print} NR>1{a[$1]=a[$1]" "$2}END{for (i in a){print i " " a[i]}}' TEST
the output data was
Sample 1
A 2 1 2 3 1 1
B 1 3
C 1 1 2
D 2 1 1 2 1 2
E 2 3 3 3 3
Sample 2 3 4 5 6 7 8 9 10
Sample 1 was isolated, and there was no space when the alphabets were not included by the samples.
I hope that makes sense.

Unique Columns Across an Array?

I have an array structured like so:
a = [1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 4 4 4 4 5 5 5 5;
1 1 1 1 2 2 2 2 2 2 2 1 1 1 1 2 2 3 3 1 1 1 2 3 4 4 4 1 1 1 1 2 2 3 3];
Pretty much, it's a 2 by n (I simplified my matrix in this question with reduced number of columns for simplicity's sake), no real pattern. I want to be able to find the unique number of columns. So in this simplified example, I can (but it'll take a while) count by hand and noticed that my unique matrix b is:
b= 1 1 2 2 2 3 3 3 3 4 5 5
1 2 1 2 3 1 2 3 4 1 2 3
In MATLAB, I can do something like
size(b,2)
To get the number of unique columns. In this example
size(b,2) = 12
My question is, how do I go from matrix a to matrix b so that I can do this computationally for very large n dimensional matrices that I have?
Use unique:
a = [1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 4 4 4 4 5 5 5 5;
1 1 1 1 2 2 2 2 2 2 2 1 1 1 1 2 2 3 3 1 1 1 2 3 4 4 4 1 1 1 1 2 2 3 3];
% Transpose to leverage the rows flag, then transpose back
b = unique(a.', 'rows').';
Which returns:
b =
1 1 2 2 2 3 3 3 3 4 5 5
1 2 1 2 3 1 2 3 4 1 2 3

how to vectorize the following for loop?

can any one help me to Vectorized this loop.
i have large Matrix and i want to replace all the pixel values whose length is less then some threshold Value For simplicity lets say
a = randi([1 5],10,10);
for i = 1:length(a)
someMat=a(a==i);
if length(someMat)<20
a(a==i)=0;
end
end
but its killing me.
Example:
a = randi([1 5],10,10)
a =
5 2 1 5 5 5 2 2 3 2
3 3 5 4 4 4 3 1 1 5
5 1 3 5 3 3 4 1 3 1
3 1 5 3 2 5 1 1 5 1
1 1 4 3 4 3 4 4 5 1
1 4 3 5 1 1 2 2 2 1
3 3 5 2 4 1 1 3 2 4
4 1 5 3 4 5 3 4 3 3
5 3 5 5 4 3 1 3 4 1
4 1 1 3 5 5 1 3 3 5
Result for Thresold 20
5 0 1 5 5 5 0 0 3 0
3 3 5 0 0 0 3 1 1 5
5 1 3 5 3 3 0 1 3 1
3 1 5 3 0 5 1 1 5 1
1 1 0 3 0 3 0 0 5 1
1 0 3 5 1 1 0 0 0 1
3 3 5 0 0 1 1 3 0 0
0 1 5 3 0 5 3 0 3 3
5 3 5 5 0 3 1 3 0 1
0 1 1 3 5 5 1 3 3 5
length of pixel 4 was 17
length of pixel 2 was 10
i try it by some thing like
[nVal Index] = histc(a(:),unique(a)); %
nVal(nVal>20) = 1; % just some threshold value and assigning by some Number may be zero as well
But I dont Know how to replace the Index Values of the corresponding Pixal and apply reshape to get it in original form. Here Even i am not sure that i will get the same Matrix With Reshape . Please Help me.....
thanks
I think this does what you want:
threshold_length = 20;
replace_value = 0;
u = unique(a); %// values of a
h = histc(a(:), u); %// count for each value
r = u(h<threshold_length); %// values to be removed
a(ismember(a,r)) = replace_value; %// remove those values
I see #LuisMendo arrived at mostly the same solution quicker than I did, but an alternative to using ismember is to use more of what unique gives you:
threshold = 20;
[vals, ~, ix] = unique(a); % capture the values and their indices
counts = histc(a(:), vals); % count the occurrences of each value
vals(counts<threshold) = 0; % zero the values that aren't common enough
a(:) = vals(ix); % recreate the matrix with updated values

Comparing adjacent elements in MATLAB

Does anyone know how I can compare the elements in an array with the adjacent elements?
For example, if I have an array:
0 0 0 1 1 1 1 0
0 1 1 1 1 1 1 0
0 1 0 1 1 1 1 0
0 1 1 1 1 1 0 0
0 0 0 0 1 1 1 1
1 1 1 1 1 1 1 1
Is there a way to cycle through each element and perform a logical test of whether the elements around it are equal to 1?
Oops, it looks like someone is doing a homework assignment. Game of life maybe?
There are many ways to do such a test. But learn to do it in a vectorized form. This involves understanding how matlab does indexing, and how the elements of a 2-d array are stored in memory. That will take some time to explain in detail, more than I want to do at this exact moment. I would definitely recommend you learn it though.
Until then, I'll just suggest that if you really are doing the game of life, then the best trick is to use conv2. Thus,
A =[0 0 0 1 1 1 1 0
0 1 1 1 1 1 1 0
0 1 0 1 1 1 1 0
0 1 1 1 1 1 0 0
0 0 0 0 1 1 1 1
1 1 1 1 1 1 1 1];
B = conv2(A,[1 1 1;1 0 1;1 1 1],'same')
B =
1 2 4 4 5 5 3 2
2 2 5 6 8 8 5 3
3 4 8 7 8 7 4 2
2 2 4 5 7 7 6 3
3 5 6 7 7 7 6 3
1 2 2 3 4 5 5 3
Loren has recently posted about this very issue: http://blogs.mathworks.com/loren/2010/01/19/mathematical-recreations-tweetable-game-of-life/ - lots of interesting things can be learned by studying the code in that post and its comments

Resources