Update dynamically headers of DataTable in Matlab - arrays

Is there an efficient way to dynamically update the headers of a table after running T = array2table(P) on a matrix P of dimension e.g. (1x120), rather than manually setting T.Properties.VariableNames{i} = xxx for each i in 2:120?
As MATLAB assigns {Var1, ..., Var120} to T by default, the idea is to iterate over each column header of T and assign it a new name {T_Var2, ..., T_Var120}, where T represents the table name.
Once T has unique column headers, outerjoin could be used with other tables (using the expected solution), as they will share a unique key on Var1. Any feedback would be much appreciated.

You can define the headers when you call array2table.
Here I first create all of the names (use your desired table name in place of 'Table_Var') and then assign them when creating the table.
% Create unique headers, 'Table_Var1', 'Table_Var2', 'Table_Var3', ...
headers = arrayfun( @(x) sprintf('Table_Var%d',x), 1:size(P,2), 'uni', 0 );
% Assign headers when creating the table
T = array2table( P, 'VariableNames', headers );
If you wanted column 1 to always have the same name, say 'Col1', and only columns 2:end to use the table name, you could use:
headers = ['Col1', arrayfun( @(x) sprintf('Table_Var%d',x), 2:size(P,2), 'uni', 0 ) ]
Once you have a cell array of headers, you can also just rename the table variable names in one shot (you don't need to do this if you did it at the array2table stage)
T.Properties.VariableNames = headers;

Related

Convert string to variable name in Lua

In Lua, I have a set of tables:
Column01 = {}
Column02 = {}
Column03 = {}
ColumnN = {}
I am trying to access these tables dynamically depending on a value. So, later on in the programme, I am creating a variable like so:
local currentColumn = "Column" .. variable
where variable is a number from 01 to N.
I then try to do something to all elements in my array like so:
for i = 1, #currentColumn do
currentColumn[i] = *do something*
end
But this doesn't work as currentColumn is a string and not the name of the table. How can I convert the string into the name of the table?
If I understand correctly, you're saying that you'd like to access a variable based on its name as a string? I think what you're looking for is the global variable, _G.
Recall that in a table, you can use strings as keys. Think of _G as one giant table where each global table or variable you make is just a key for a value. (Variables declared with local are not stored in _G.)
Column1 = {"A", "B"}
string1 = "Column".."1" --concatenate column and 1. You might switch out the 1 for a variable. If you use a variable, make sure to use tostring, like so:
var = 1
string2 = "Column"..tostring(var) --becomes "Column1"
print(_G[string2]) --prints the location of the table. You can index it like any other table, like so:
print(_G[string2][1]) --prints the 1st item of the table. (A)
So if you wanted to loop through 5 tables called Column1,Column2 etc, you could use a for loop to create the string then access that string.
C1 = {"A"} --I shortened the names to just C for ease of typing this example.
C2 = {"B"}
C3 = {"C"}
C4 = {"D"}
C5 = {"E"}
for i=1, 5 do
local v = "C"..tostring(i)
print(_G[v][1])
end
Output
A
B
C
D
E
Edit: I'm a doofus and I overcomplicated everything. There's a much simpler solution. If you only want to access the columns within a loop instead of accessing individual columns at certain points, the easier solution here for you might just be to put all your columns into a bigger table then index over that.
columns = {{"A", "1"},{"B", "R"}} --each anonymous table is a column. If a column is stored under a string key, like column1 = {"A"}, it can't be numerically iterated over.
--You could also insert on the fly.
column3 = {"C"}
table.insert(columns, column3)
for i,v in ipairs(columns) do
print(i, v[1]) --i is the index and v is the table. This will print which column you're on, and get the 1st item in the table.
end
Output:
1 A
2 B
3 C
To future readers: If you want a general solution to getting tables by their name as a string, the first solution with _G is what you want. If you have a situation like the asker, the second solution should be fine.

Counting rows in a table based on multiple array criteria

I am trying to count rows in a table based on multiple criteria in different columns of that table. The criteria are not directly in the formula though; they are arrays which I would like to refer to (and not list them in the formula directly).
Range table example:
Group Status
a 1
b 4
b 6
c 4
a 6
d 5
d 4
a 2
b 2
d 3
b 2
c 1
c 2
c 1
a 4
b 3
Criteria/arrays example:
group
a
b
status
1
2
3
I am able to do this if I only have one array to search for in a single range (one column) of that table:
{=SUM(COUNTIFS(data[Group],group[group]))}
Returns "9" as expected (=sum of rows in the group column of the data table which match any values in group[group])
But if I add a second different range and a different array I get an incorrect result:
{=SUM(COUNTIFS(data[Group],group[group], data[Status],status[status]))}
Returns "3" but should return "5" (=sum of rows which have 1, 2 or 3 in the status column and a or b in the group column)
I searched and googled for various ideas related to using sumproduct or defining arrays within the formula instead of classifying the whole formula as an array but I was not able to get expected results via those means.
Thank you for your help.
Because your group and status criteria have a different number of values (2 values for group, but 3 values for status), I'm not sure you can do this in a single formula. The best way I know of to do this would be to use a helper column (which can be hidden if preferred).
Put this array formula in a helper column and copy down the length of your data (array formulas must be confirmed with Ctrl+Shift+Enter):
=AND(OR(data[@Group]=group[group]),OR(data[@Status]=status[status]))
And then get the count with: =COUNTIF(helpercolumn,TRUE)
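As a cross-check of the expected result, here is a minimal Python sketch (not Excel) of the same logic as the helper column: a row counts only if its group matches any value in the group list AND its status matches any value in the status list. The variable names are made up for illustration; the data is the sample table from the question.

```python
# Sample data from the question as (group, status) pairs.
rows = [
    ("a", 1), ("b", 4), ("b", 6), ("c", 4),
    ("a", 6), ("d", 5), ("d", 4), ("a", 2),
    ("b", 2), ("d", 3), ("b", 2), ("c", 1),
    ("c", 2), ("c", 1), ("a", 4), ("b", 3),
]

# Criteria lists of different lengths (2 groups, 3 statuses).
groups = {"a", "b"}
statuses = {1, 2, 3}

# Single criterion: rows whose group is in the group list.
group_only = sum(1 for g, _ in rows if g in groups)

# Both criteria: AND of two ORs, one per column.
both = sum(1 for g, s in rows if g in groups and s in statuses)

print(group_only, both)  # prints "9 5"
```

This agrees with the 9 and 5 quoted in the question, so the helper-column formula is computing the intended quantity.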
You could use a slightly different approach, using Power Query / Power Pivot.
Name your tables Data, Group and Status, then create the following query, named Filtered Data:
let
tbData = Excel.CurrentWorkbook(){[Name="Data"]}[Content],
tbGroup = Excel.CurrentWorkbook(){[Name="Group"]}[Content],
tbStatus = Excel.CurrentWorkbook(){[Name="Status"]}[Content],
#"Merged Group" = Table.NestedJoin(tbData,{"Group"},tbGroup,{"Group"},"tbGroup",JoinKind.Inner),
#"Merged Status" = Table.NestedJoin(#"Merged Group",{"Status"},tbStatus,{"Status"},"Merged Status",JoinKind.Inner),
#"Removed Columns" = Table.RemoveColumns(#"Merged Status",{"tbGroup", "Merged Status"}),
#"Changed Type" = Table.TransformColumnTypes(#"Removed Columns",{{"Status", type number}})
in
#"Changed Type"
Load To as connection only, and tick Load to Data Model
Now create a DAX measure:
Status Sum:=SUM ( 'Filtered Data'[Status] )
You can then use the following formula on your worksheet, to get the Sum of Status values, for rows matching the criteria specified in the Group and Status tables:
=CUBEVALUE("ThisWorkbookDataModel","[Measures].[Status Sum]")
Simply refresh the data connection to update the value.

MATLAB Extract all rows between two variables with a threshold

I have a cell array called BodyData in MATLAB that has around 139 columns and 3500 odd rows of skeletal tracking data.
I need to extract all rows between two string values that I have (these are timestamps of when an event happened),
e.g.
BodyData{}=
Column 1 2 3
'10:15:15.332' 'BASE05' ...
...
'10:17:33.230' 'BASE05' ...
The two timestamps should match a value in the array but might also be within a few ms of those in the array e.g.
TimeStamp1 = '10:15:15.560'
TimeStamp2 = '10:17:33.233'
I have several questions!
How can I return an array of all the data between the two string values, plus or minus a small threshold of, say, .100ms?
Can I also add another condition that all string values in column 2 must be the same, and otherwise ignore the row? For example, only return the rows between timestamps A and B if column 2 is 'BASE02'.
Many thanks,
The best approach to the first part of your problem is probably to change from strings to numeric date values. In MATLAB this can be done quite painlessly with datenum.
For the second part you can just use logical indexing: this is where you put a condition (i.e. that the second column is 'BASE02') within the indexing expression.
A self-contained example:
% some example data:
BodyData = {'10:15:15.332', 'BASE05', 'foo';...
'10:15:16.332', 'BASE02', 'bar';...
'10:15:17.332', 'BASE05', 'foo';...
'10:15:18.332', 'BASE02', 'foo';...
'10:15:19.332', 'BASE05', 'bar'};
% create column vector of numeric times, and define start/end times
dateValues = datenum(BodyData(:, 1), 'HH:MM:SS.FFF');
startTime = datenum('10:15:16.100', 'HH:MM:SS.FFF');
endTime = datenum('10:15:18.500', 'HH:MM:SS.FFF');
% select data in range, and where second column is 'BASE02'
BodyData(dateValues > startTime & dateValues < endTime & strcmp(BodyData(:, 2), 'BASE02'), :)
Returns:
ans =
'10:15:16.332' 'BASE02' 'bar'
'10:15:18.332' 'BASE02' 'foo'
References: datenum manual page, MATLAB help page on logical indexing.
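The example above does not include the tolerance asked about in the question. One way to add it is to widen the window by the threshold on both sides before comparing. Below is a rough Python sketch of that idea (not MATLAB), assuming the threshold is meant to be 100 ms; the function name rows_between and the parameter tol_ms are hypothetical, and the data is the example data from the answer.

```python
from datetime import datetime, timedelta

FMT = "%H:%M:%S.%f"  # e.g. '10:15:16.332'

body_data = [
    ("10:15:15.332", "BASE05", "foo"),
    ("10:15:16.332", "BASE02", "bar"),
    ("10:15:17.332", "BASE05", "foo"),
    ("10:15:18.332", "BASE02", "foo"),
    ("10:15:19.332", "BASE05", "bar"),
]

def rows_between(rows, start, end, label, tol_ms=100):
    """Return rows with start - tol <= time <= end + tol and column 2 == label."""
    tol = timedelta(milliseconds=tol_ms)  # assumed threshold: 100 ms
    lo = datetime.strptime(start, FMT) - tol
    hi = datetime.strptime(end, FMT) + tol
    return [
        r for r in rows
        if lo <= datetime.strptime(r[0], FMT) <= hi and r[1] == label
    ]

# Both event times are slightly "inside" the data timestamps, so the rows
# at 16.332 and 18.332 are only picked up thanks to the tolerance.
selected = rows_between(body_data, "10:15:16.400", "10:15:18.300", "BASE02")
for row in selected:
    print(row)
```

In the MATLAB version the same effect could be had by subtracting the tolerance from startTime and adding it to endTime; since datenum values are in days, the tolerance would have to be expressed as a fraction of a day.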

How can I match multiple contents of an array to a separate single array?

I have a .csv file where multiple postcodes (characters and numbers) correspond to a unique ID number (also characters and numbers).
e.g
BS2 9TL, E00073143
BS2 9TB, E00073143
BS2 9XJ, E00073143
BS2 8AT, E00073144
BS2 8TY, E00073144
BS2 8UA, E00073144
BS2 8UG, E00073144
I need to create a new array for each unique ID number that stores the respective postcodes. The number of postcodes for each ID number is not the same every time.
The file contains 9010 postcodes and 1258 ID numbers.
Can anyone show me how to go about doing this?
PCs = importdata('PostalCodes.csv'); %// import data
PostalCodes = cell(numel(PCs),1); %// create storage
IDs = cell(numel(PCs),1);
for ii = 1:numel(PCs)
    tmp = strsplit(PCs{ii,1}, ','); %// split on comma
    PostalCodes{ii,1} = strtrim(tmp{1});
    IDs{ii,1} = strtrim(tmp{2}); %// drop the space after the comma
end
[IDs,idx] = sort(IDs); %// sort on ID
PostalCodes = PostalCodes(idx); %// sort PCs the same way
PostalCodes = cell2mat(PostalCodes); %// go to char matrix (needs equal-length postcodes)
[IdNums,~,tmp2] = unique(IDs); %// get unique IDs
tmp3 = [0; find(diff(tmp2)); numel(IDs)]; %// last row of each ID group, with a leading 0
for ii = 1:numel(tmp3)-1
    PostalCode(ii).IDs = PostalCodes(tmp3(ii)+1:tmp3(ii+1),:); %// store in struct
end
You don't actually want separate arrays, because that's very bad practice, so I've put everything in a structure for you. You can now access the structure by simply typing:
PostalCode(1).IDs(2,:)
ans =
BS2 9TL
where the (1) between PostalCode and IDs corresponds to the ID (which is found in IdNums), and the (2,:) plucks out the second postal code corresponding to ID IdNums(1).
You could use an array of structs
[x,y]=textread('/tmp/file.csv' , '%s %s','delimiter',',')
csv=[x,y]
values=struct('key',{},'value',{})
keys= unique(csv(:,2));
for i = 1:length(keys)
values(i).key=keys{i}
values(i).value=csv( strcmp( csv(:,2) , keys{i}),1)
end
Tested this using Octave. In MATLAB you could use a containers.Map instead of key/value structs for direct access via IDs.
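For comparison, the map-style grouping mentioned above can be sketched in a few lines of Python (not MATLAB or Octave). This is only an illustration of the key-to-list idea: the in-memory sample string stands in for the real .csv file, and the name postcodes_by_id is made up.

```python
import csv
import io
from collections import defaultdict

# Stand-in for the real file; same layout as the sample in the question.
sample = (
    "BS2 9TL, E00073143\n"
    "BS2 9TB, E00073143\n"
    "BS2 9XJ, E00073143\n"
    "BS2 8AT, E00073144\n"
    "BS2 8TY, E00073144\n"
    "BS2 8UA, E00073144\n"
    "BS2 8UG, E00073144\n"
)

# Map each ID to the list of its postcodes; lists may differ in length.
postcodes_by_id = defaultdict(list)
for row in csv.reader(io.StringIO(sample), skipinitialspace=True):
    if len(row) != 2:
        continue  # skip blank or malformed lines
    postcode, area_id = (field.strip() for field in row)
    postcodes_by_id[area_id].append(postcode)

for area_id, postcodes in postcodes_by_id.items():
    print(area_id, postcodes)
```

Looking up postcodes_by_id["E00073144"] then gives the four postcodes for that ID directly, without any sorting or index arithmetic.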

Web2py: Write SQL query: "SELECT x FROM y" using DAL, when x and y are variables, and convert the results to a list?

My action passes a list of values from a column x in table y to the view. How do I write the following SQL: SELECT x FROM y, using DAL "language", when x and y are variables given by the view? Here it is, using executesql():
def myAction():
    x = request.args(0, cast=str)
    y = request.args(1, cast=str)
    myrows = db.executesql('SELECT ' + x + ' FROM ' + y)
    # Let's convert it to a list:
    mylist = []
    for row in myrows:
        value = row  # this line doesn't work
        mylist.append(value)
    return dict(mylist=mylist)
Also, is there a more convenient way to convert that data to a list?
First, note that you must create table definitions for any tables you want to access (i.e., db.define_table('mytable', ...)). Assuming you have done that and that y is the name of a single table and x is the name of a single field in that table, you would do:
myrows = db().select(db[y][x])
mylist = [r[x] for r in myrows]
Note, if any records are returned, .select() always produces a Rows object, which comprises a set of Row objects (even if only a single field was selected). So, to extract the individual values into a list, you have to iterate over the Rows object and extract the relevant field from each Row object. The above code does so via a list comprehension.
Also, you might want to add some code to check whether db[y] and db[y][x] exist.
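To illustrate why the existence check matters when table and column names come from the request, here is a small self-contained sketch using Python's built-in sqlite3 module as a stand-in for the web2py database. It is not DAL code: the person table, its contents, and the helper column_as_list are hypothetical. In web2py itself, the analogous checks would presumably be y in db.tables and x in db[y].fields before calling db().select(db[y][x]).

```python
import sqlite3

# In-memory database with a hypothetical table, standing in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [("Ann", "Oslo"), ("Bob", "Rome")],
)

def column_as_list(conn, table, column):
    """Return all values of table.column, rejecting unknown identifiers."""
    tables = {
        r[0]
        for r in conn.execute("SELECT name FROM sqlite_master WHERE type='table'")
    }
    if table not in tables:
        raise ValueError("unknown table: %r" % table)
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    columns = {r[1] for r in conn.execute('PRAGMA table_info("%s")' % table)}
    if column not in columns:
        raise ValueError("unknown column: %r" % column)
    # Identifiers are only interpolated after matching the schema above.
    cursor = conn.execute('SELECT "%s" FROM "%s"' % (column, table))
    # Each result row is a tuple, so take its first element.
    return [row[0] for row in cursor]

names = column_as_list(conn, "person", "name")
print(names)
```

The last line of the helper also shows what was missing in the question's loop: each row of a raw SQL result is a tuple, so the value is row[0] rather than row.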
