This is a follow-up question to "convert 4-dimensional array to 2-dimensional data set in R", which was answered by @Ben-Bolker.
I have a 3D array called 'y' with dimensions [37,29,2635] (i.e. firms, years, class). Using Ben's formula:
avm11<-matrix(aperm(y,c(1,3,2)),prod(dim(y)[c(1,3)]))
I managed to convert it to a 2D array with dimensions [37*2635,29]. However, the row names have become meaningless numbers and I'd need to generate row names during the permutation so that I'd get 97495 unique row names of the type firm_class.
I've been trying to do so via paste0() but I'm doing something wrong. Any suggestions?
You could use seq_len() for this. E.g.
rownames(avm11)<-paste(seq_len(length(avm11[,1])),"observation",sep=" ")
To make the row names depend on the variable firm_class:
#Create data frame:
df<-data.frame(X=c(0,1,2,3,4,5),Y=c("a","b","c","d","e","f"))
n<-length(df[,1])
#Generate names containing class name and unique number:
namevector<-sapply(seq_len(n),function(i) paste("Class", df[i,1], i,sep="."))
#Equate rownames to generated names:
rownames(df)<-namevector
The picture gives a schematic representation of the data. Columns up to the 17th in the actual spreadsheet are the raw data entries.
First I sort Brand according to SUMIF and add the Brand Month Sorted values into an array, ArrayCurrentMonth.
Then I add the individual categories into the Specialty array.
Now I want to count the number of entries with a COUNTIFS formula for the columns schematically pictured as 9-12, using ArrayCurrentMonth and Specialty as Range arguments in a SUM(COUNTIFS(...)) function.
Thus columns 10-12 hold ArrayCurrentMonth transposed;
column 9 holds the Specialty values.
I rummaged through earlier Q&As and found useful variants that seem to work for others, but for me Transpose stubbornly gives "Sub or Function not defined" no matter how I use it, with a Range or with an Array.
ActiveSheet.Range(X;Y) = Application.WorksheetFunction.Sum(Application.WorksheetFunction.CountIfs(Range("C:C"), RTrim(Month(Mesyaz1)), Range("H:H"), "Headephones", Transpose(Range("F:F"), ArrayCurrentMonth(c)), Range("L:L"), Specialty(g)))
or
ActiveSheet.Range(X;Y) = Application.WorksheetFunction.Sum(Application.WorksheetFunction.CountIfs(Range("C:C"), "Month1", Range("H:H"), "Product1", Transpose(ArrayCurrentMonth), Range("L:L"), Specialty(g)))
or any other way I could think of just to check how it would work.
I would try looping with For g=1 etc., but I guess I need some kind of Transpose anyhow.
Would be grateful for some tips.
I have a 2-d array in numpy. I wish to obtain unique values only in a particular column.
import numpy as np
data = np.genfromtxt('somecsvfile',dtype='str',delimiter=',')
#data looks like
[a,b,c,d,e,f,g],
[e,f,z,u,e,n,c],
...
[g,f,z,u,a,v,b]
Using numpy/scipy only, how do I obtain an array or list of the unique values in the 5th column? (I know it can easily be done with pandas.)
The expected output would be 2 values: [e,a]
Correct answer posted. A simple referencing question in essence.
np.unique(data[:, 4])
With thanks.
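As a small check of the answer above, here is a sketch that uses a hypothetical in-memory array in place of the CSV file (the three rows are the ones shown in the question; the variable names are illustrative). Note that np.unique returns its result sorted, so it gives a, e rather than e, a. If the order of first appearance matters, return_index can be used to restore it.

```python
import numpy as np

# Hypothetical stand-in for the data loaded with np.genfromtxt in the question.
data = np.array([
    ["a", "b", "c", "d", "e", "f", "g"],
    ["e", "f", "z", "u", "e", "n", "c"],
    ["g", "f", "z", "u", "a", "v", "b"],
])

# Column index 4 is the 5th column (indices start at 0).
fifth = data[:, 4]

# Sorted unique values of that column.
uniq = np.unique(fifth)

# Same values in order of first appearance: take the index of each value's
# first occurrence, sort those indices, and index back into the column.
_, first_idx = np.unique(fifth, return_index=True)
in_order = fifth[np.sort(first_idx)]

print(uniq.tolist())
print(in_order.tolist())
```

Call .tolist() on either result if a plain Python list is needed rather than an array.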
How do I convert a string from a cell to a text array inside a function that should use the array, without using VBA and without adding the array to any other part of the document? There will be one such array on each of more than 1000 rows. The string format is ^[a-zA-Z0-9.,-]*$ with "," as delimiter.
This is the functionality I would like to achieve
I have an excel table with the following columns
A: ID numbers to compare, separated by commas (the delimiter can be changed if needed). It should support at least about 100 IDs.
B: ID (each value in the column is unique, but the rows are not sorted and can't be sorted, because sorting is needed based on other criteria)
C: Value (Several rows in the column can have the same value)
D: Output the one ID among the comma-separated IDs that has the highest value on its row
The problem part of the output
So far I have made a function which finds the correct ID in column B based on the values in column C, but only if I enter the string from column A manually as an array constant within the function. I have not managed to get the function to create the array itself from column A, which is required.
Working part of the code
In this code I have entered the values from column A manually and this is working as it should.
=INDEX({"1.01-1","1.01-3","1.08-1","1.01-1-1A"},MATCH(MAX(INDEX(C$10:C$20,N(IF(1,MATCH({"1.01-1","1.01-3","1.08-1","1.01-1-1A"},B$10:B$20,0))))),INDEX(C$10:C$20,N(IF(1,MATCH({"1.01-1","1.01-3","1.08-1","1.01-1-1A"},B$10:B$20,0)))),0))
Note that the start row is not the first row and the array is used 3 times in the function.
Code to try to convert the string to a text array
This is not working, but if wrapped in SUMPRODUCT() it provides an array to the SUMPRODUCT() function. That is of course not usable, since I then can't pass the array on. The background to this code can be found in the question "Split a string (cell) in Excel without VBA (e.g. for array formula)".
=TRIM(MID(SUBSTITUTE(A10,",",REPT(" ",99)),(ROW(OFFSET($A$1,,,LEN(A10)-LEN(SUBSTITUTE(A10,",",""))+1))-1)*99+((ROW(OFFSET($A$1,,,LEN(A10)-LEN(SUBSTITUTE(A10,",",""))+1)))=1),99))
The second formula outputs only the first item of the array, and inserting it into the first formula does not change this result, as wrapping the second formula in SUMPRODUCT() did.
Here is a picture of my simplified test setup in Excel for this case, identical to what is described above.
Simplified test setup
I'm not really sure what you are doing with your formula.
But to convert the contents of a cell to a comma-separated text array to be used as the array argument to the INDEX or MATCH functions, you can use the FILTERXML function. You'll need to educate yourself about XML and XPath to understand what's going on, but there are plenty of web resources for this.
For example, with
A10: "1.01-1","1.01-3","1.08-1","1.01-1-1A"
The formula below will return 1.08-1. Note the 3 for the row argument to the INDEX function.
=INDEX(FILTERXML("<t><s>" & SUBSTITUTE(SUBSTITUTE(A10,"""",""), ",", "</s><s>") & "</s></t>", "//s"),3)
My data is the following:
> print(xr)
[1] 1.1235685 1.0715964 0.2043725 4.0639341
> class(xr)
[1] "array"
I'm trying to divide the values of all the columns in my array by the value given by the 1st column (i.e., 1.1235685). The resulting array would be:
1.000 0.953 0.181 3.616
How would I do this in R, given my R data object type? The columns do not have names, because of the data type. (If there's a way I can assign column names before dividing them, then that's even better.)
I'm new to R, so apologies for the simple question.
Thank you.
Some people already answered this in the comments, but I'll try to provide a more comprehensive one. The code to do what you want is pretty simple.
xr <- array(data = c(1.1235685, 1.0715964, 0.2043725, 4.0639341))
xr/xr[1]
However, if you created that array with only one dimension, I would recommend you use a numeric vector instead, which has no "dim" attribute. You'd create it as follows:
xr <- c(1.1235685, 1.0715964, 0.2043725, 4.0639341)
xr/xr[1]
I am using the following function in matlab:
getgenpept(AccessionNumber)
where the only parameter is a unique identifier. The problem is that I want to have a structure with around 50 different records, all based on their unique identifiers. Is there a way I can define a structure and then add my 50 different records as I go along, or ideally specify a list of identifiers beforehand and loop the getgenpept() function in one go?
I want to end up with a cell array that stores each record in their own cell.
Hope this is clear!
If A is a cell array containing all the identifiers, then it's as easy as:
A = {'AAA59174', 'AAA59175','AAA59176'};
B = cellfun(@getgenpept,A);
B(1) is then the record for 'AAA59174', and so on.