Accessing data in a nested structure - arrays

Okay so I've found a better way of accessing my files but I'm still a bit stuck.
My code so far:
clc % clear window
clear %clear workspace
numfiles = 21;
data = cell(1, numfiles);
obsdata = dir('*.mat');
numfiles = length(obsdata);
data = cell(1, numfiles);
for k = 1:numfiles
data{k} = load(obsdata(k).name);
end
This sorts my data out.
There are 21 cells that contain the J6 files.
Clicking each cell brings me to a structure, each of which contains the data that I want to access.
I'm unsure how to write my code so that I can store the data in the last part in two arrays (wavelength and intensity).

Try this. I used deal to copy out each of the obsdata.name's into a cell array called names. (It's probably a poor choice of name for a variable since you already have one called name, but anyway...)
obsdata = dir('*.mat');
numfiles = length(obsdata);
data = cell(1, numfiles);
names = cell(1,numfiles);
[names{:}] = deal(obsdata.name);
for k = 1:numfiles
data{k} = load(names{k});
end

It'd help to hear a little bit more about how you want to store the data, but generally the work will be done inside the loop you use to load the files. A good starting point might be to get the data out of structs and into cell arrays:
for k = 1:numfiles
data{k} = struct2cell(load(obsdata(k).name));
end
I believe (it's been a while, and I don't have access to MATLAB anymore) that each of the 21 cells will now contain a cell array storing the matrices you're interested in. Perhaps this is enough? From this point I think you can access data like this:
data{file_num}{struct_field_num}(x,y)
where x and y are the indices within the matrices that used to be stored as fields in a struct.
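If you want to concatenate these matrices so that you get 21 cells, each holding a single matrix, you can modify the loop. Note that load returns a struct, so it has to be converted to a cell array before it can be expanded with {:}. Use horzcat instead of vertcat if the fields are column vectors and you want an m-by-2 result:
for k = 1:numfiles
tmp = struct2cell(load(obsdata(k).name)); % struct -> cell array of field values
data{k} = vertcat(tmp{:}); % stack the field values on top of each other
end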
With more information about how you want to structure the data we can refine the answer.
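If each file holds two equal-length vectors, the loop could also collect them directly into two matrices with one column per file. This is only a minimal sketch: the field names `wavelength` and `intensity` are hypothetical (the real names inside the J6 files aren't shown in the question), and it assumes every file has the same number of samples. It writes three small demo files to a temporary folder so it can run on its own.

```matlab
% Create demo files (stand-ins for the real .mat files).
folder = tempname;
mkdir(folder);
for k = 1:3
    wavelength = (400:100:700)';     % hypothetical field name
    intensity  = k * ones(4, 1);     % hypothetical field name
    save(fullfile(folder, sprintf('demo%d.mat', k)), 'wavelength', 'intensity');
end

% Load every file and keep the two fields as column vectors.
files = dir(fullfile(folder, '*.mat'));
numfiles = numel(files);
wlCells = cell(1, numfiles);
inCells = cell(1, numfiles);
for k = 1:numfiles
    s = load(fullfile(folder, files(k).name));
    wlCells{k} = s.wavelength(:);
    inCells{k} = s.intensity(:);
end

% One column per file (requires equal lengths across files).
wl    = [wlCells{:}];
inten = [inCells{:}];

rmdir(folder, 's');                  % remove the demo files again
```

If the files have different lengths, keeping `wlCells` and `inCells` as cell arrays avoids the concatenation step.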

Related

MATLAB - repmat values into cell array where individual cell elements have unequal size

I am trying to repeat values from an array (values) to a cell array where the individual elements have unequal sizes (specified by array_height and array_length).
I hope to apply this to a larger data set (containing ~100 x ~100 values) and my current solution is to have a line of code for each value (code example below). Surely there is a better way... Please could someone offer an alternative solution?
C = cell(3,2);
values = rand(3,2);
array_height = randi(10,3,2);
array_length = randi(10,3,2);
C{1,1} = repmat((values(1,1)),[array_height(1,1),array_length(1,1)]);
C{2,1} = repmat((values(2,1)),[array_height(2,1),array_length(2,1)]);
C{3,1} = repmat((values(3,1)),[array_height(3,1),array_length(3,1)]);
C{1,2} = repmat((values(1,2)),[array_height(1,2),array_length(1,2)]);
C{2,2} = repmat((values(2,2)),[array_height(2,2),array_length(2,2)]);
C{3,2} = repmat((values(3,2)),[array_height(3,2),array_length(3,2)]);
If you did this in a for loop, it might look something like this:
for i = 1:size(C,1)
for j = 1:size(C,2)
C{i,j} = repmat(values(i,j),[array_height(i,j),array_length(i,j)]);
end
end
However, if you are trying to generate or use this with a much larger dataset, this loop could become slow. I suspect whatever your overall objective is can be better served by MATLAB's many optimizations for matrices and vectors, but without more information I can't help more than that.
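For reference, the same nested loop can be written as a single arrayfun call, since arrayfun accepts several equally sized arrays and passes their elements in step. This is a sketch of a more compact spelling, not a speed-up: arrayfun is not guaranteed to run faster than the explicit loop.

```matlab
values = rand(3, 2);
array_height = randi(10, 3, 2);
array_length = randi(10, 3, 2);

% One repmat call per element; 'UniformOutput', false collects the
% differently sized results in a cell array shaped like the inputs.
C = arrayfun(@(v, h, w) repmat(v, h, w), ...
    values, array_height, array_length, 'UniformOutput', false);
```

`C{i,j}` then has size `array_height(i,j)`-by-`array_length(i,j)` and is filled with `values(i,j)`.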

Store big structure in Matlab

I am fitting a statistical model in matlab using fitglm which returns a structure mdl. I would like to store many such structures in an array of cells to reuse them later but this seems not to work. Here is the code:
models = cell(size(quarterList,1)-lag-1,1);
for i=1:size(quarterList,1)-lag-1
%indicesTemp = find(and(annQuarters(:,2) <= quarterList(i+11,2),annQuarters(:,2) >= quarterList(i,2)));
memberTemp = ismember(annQuarters(:,:), quarterList(i:i+lag,:));
indicesTemp = find(memberTemp(:,2));
fprintf('Perdiod: Q%i %i to Q%i %i - Nb samples: %i \n',annQuarters(i,1),annQuarters(i,2),annQuarters(i+lag,1),annQuarters(i+lag,2),size(indicesTemp,1));
[Xtemp Ytemp] = categorizeVariables(X(indicesTemp,:),Y(indicesTemp,:));
mdl = fitglm(Xtemp,Ytemp-1,'Distribution','binomial', 'Link','logit');
models(i,1) = mdl;
end
Now when I try to assign such structure to a single cell, it works:
temp = cell(1,1);
mdl = fitglm(Xtemp,Ytemp-1,'Distribution','binomial', 'Link','logit');
temp = mdl;
Why is the assignment in the array of cells not working in that case? Any suggestion on how to go around this?
This doesn't work because using models(index) assignment (with ()) assumes that the thing on the right side is a cell. You instead want to use curly brackets which will copy the item on the right (of any type) into the cell array at the specified element.
models{i,1} = mdl;
If you really wanted to use (), you could instead convert the thing on the right to a cell first.
models(i,1) = {mdl};
The reason that your second example (with a scalar cell array) doesn't result in an error is because you aren't putting the output of fitglm into the cell array but rather overwriting the variable temp to point to mdl instead of the cell array.
temp = cell(1,1);
% Check if temp is a cell
iscell(temp)
%// TRUE
mdl = fitglm(Xtemp,Ytemp-1,'Distribution','binomial','Link','logit');
temp = mdl;
% Check if temp is still a cell (it isn't)
iscell(temp)
%// FALSE
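The difference between the two indexing styles can be checked without fitglm. The sketch below uses placeholder values in place of the fitted model and catches the error that parenthesis assignment raises when the right-hand side is not a cell.

```matlab
c = cell(3, 1);
c{1} = struct('a', 1);     % braces set the contents of cell 1
c(2) = {magic(3)};         % parentheses need a cell on the right-hand side

try
    c(3) = 5;              % a double is not a cell, so this assignment errors
    failed = false;
catch err
    failed = true;
    disp(err.message);
end
```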
All of that aside, you can actually store structs within an array themselves. You don't actually need a cell array unless the fields are different.
for i = 1:N
mdl(i) = fitglm(Xtemp, Ytemp - 1, 'Distribution', 'binomial', 'Link', 'logit');
end

How to save dynamic variable from workspace in a separate file in matlab?

I'm working on a problem where I have an array A of 100 elements.
All these 100 elements are changing with time.
So in my workspace, I only get the final values of all these elements after the entire time cycle has run.
I'm trying to save the values with time in a separate file (.txt or .mat) so that I can access that file in order to check how the variable varies with time.
I'm trying the following command:
save('file.mat','A','-append');
But this command overwrites the existing values in my file.
Kindly suggest me a way to save these values without overwriting them and also guide me how to access them in MATLAB.
You can also change the output filename to be unique for each iteration:
for iter=1:n
A = rand(10);
save(sprintf('file%d.mat',iter), 'A');
end
That way each iteration creates one file.
The reason that saving to a file (even using the -append flag) didn't work is that the variable A already exists in the file and is overwritten each time through the loop. You would need to create a new file or a new variable name every time through the loop for this not to happen.
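A minimal sketch of the "new variable name every time" option: each iteration stores A under its own name in the same MAT-file. The names `A_001`, `A_002`, ... and the temporary output file are assumptions made for this illustration only.

```matlab
outFile = [tempname '.mat'];
n = 5;
for k = 1:n
    A = rand(1, 100);                        % placeholder for the real update of A
    s = struct(sprintf('A_%03d', k), A);     % one field per time step
    if k == 1
        save(outFile, '-struct', 's');       % first step creates the file
    else
        save(outFile, '-struct', 's', '-append');
    end
end

stored  = whos('-file', outFile);            % list what the file contains
history = load(outFile);                     % struct with fields A_001 ... A_005
delete(outFile);
```

Because every step uses a different variable name, -append adds to the file instead of replacing the previous value.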
Saving the results in a file is probably not the best way to store the time-varying values of A. You would be better off using a cell array to store all intermediate values of A.
A_over_time = cell(1, n); %// preallocate one cell per time step
for k = 1:n
%// Get A somehow
A_over_time{k} = A;
end
Depending on what A is, you could also store the values of A in a numeric array or matrix.
%// Using an array
A_over_time = zeros(N, 1);
for k = 1:N
A_over_time(k) = A;
end
%// Using a matrix
A_over_time = zeros(N, numel(A));
for k = 1:N
A_over_time(k,:) = A;
end
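Once the history is held in a single matrix it can be written to disk once, after the loop, and read back later. The sketch below assumes A is a 1-by-100 row vector and uses a placeholder update rule and a temporary file name.

```matlab
N = 20;
A = zeros(1, 100);
A_over_time = zeros(N, numel(A));
for k = 1:N
    A = A + k;                       % placeholder for the real update of A
    A_over_time(k, :) = A;           % row k holds A at time step k
end

outFile = [tempname '.mat'];
save(outFile, 'A_over_time');        % a single save after the loop

loaded = load(outFile);
trace  = loaded.A_over_time(:, 7);   % history of element 7 over all time steps
delete(outFile);
```

`plot(trace)` would then show how that one element changes over time.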

How to name variables in a data array using a for loop

I have an array within an array and I am trying to name the variables using a for loop as there are a lot of variables. When I use the following simple code Time1 = dataCOMB{1,1}{1,1}(1:1024, 1); it opens the first cell in an array and proceeds to open the first cell in the following array and finally defines all the values in column 1 rows 1 to 1024 as Time1. However I have 38 of these different sets of data and when I apply the following code:
for t = 1:38
for aa = 1:38
Time(t) = dataCOMB{1,1}{1,aa}(1:1024, 1);
end
end
I get an error
In an assignment A(I) = B, the number of elements in B and I must be the same.
Error in Load_Files_working (line 39)
Time(t) = dataCOMB{1,1}{1,aa}(1:1024, 1);
Basically I am trying to get matlab to call the first column in each data set Time1, Time2, etc.
The problem:
1)You'd want to extract in a cell row...
2) ...the first 1024 numbers in the 1st column...
3) ...from each of the first 38 cells of a cell array.
The plan:
1) To get information from each element of a cell array (that is, an array accessed via {} indexing), one may use cellfun. Calling cellfun(some_function, a_cell_array) collects the results of some_function(a_cell_array{k}) for every subscript k. If the results are heterogeneous (i.e. not of the same type and size) or simply not scalar, as here, one may use the cellfun(..., 'UniformOutput', false) option to put them in an output cell array (cell arrays are good at grouping heterogeneous data).
2) To extract the first 1024 numbers from the first column of a numeric array x, one may use this anonymous function: @(x) x(1:1024,1). The x argument will come from each element of the cell array, and the anonymous function plays the role of some_function in the step above.
3) Now we need to specify a_cell_array, i.e. the cell array that contains the first 38 cells of the target. That would be, simply dataCOMB{1,1}(1,1:38).
The solution:
This one-liner implements the plan:
Time = cellfun(@(x) x(1:1024,1), dataCOMB{1,1}(1,1:38), 'UniformOutput', false);
Then you can access your data as in this example:
this_time = Time{3};
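To try the one-liner without the original files, here is a sketch with a synthetic dataCOMB of the same shape (a cell holding a 1-by-38 cell of numeric matrices). The matrix contents are made up for illustration.

```matlab
% Build a stand-in for dataCOMB: 38 matrices with 2000 rows and 3 columns.
inner = cell(1, 38);
for aa = 1:38
    inner{aa} = [(1:2000)' * aa, rand(2000, 2)];
end
dataCOMB = {inner};

% First 1024 rows of column 1 from each of the 38 matrices.
Time = cellfun(@(x) x(1:1024, 1), dataCOMB{1,1}(1, 1:38), 'UniformOutput', false);

% All pieces have the same length, so they also fit in one matrix.
Time_Array = [Time{:}];          % 1024-by-38, one column per data set
```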
Your error is with Time(t). That's not how you create a new variable in MATLAB. To do exactly what you want (i.e., create variables named Time1, Time2, etc.), you'll need to use the eval function:
for aa = 1:38
eval(['Time' num2str(aa) '= dataCOMB{1,1}{1,aa}(1:1024,1);']);
end
Many people do not like recommending the eval function. Others wouldn't recommend moving all of your data out of a data structure and into their own independently-named variables. So, to address these two criticisms, a better alternative might be to pull your data out of your complicated data structure and to put it into a simpler array:
Time_Array = zeros(1024,38);
for aa = 1:38
Time_Array(:,aa) = dataCOMB{1,1}{1,aa}(1:1024,1);
end
Or, if you don't like that because you really like the names Time1, Time2, etc, you could create them as fields to a data structure:
Time_Data = [];
for aa = 1:38
fieldname = ['Time' num2str(aa)];
Time_Data.(fieldname) = dataCOMB{1,1}{1,aa}(1:1024,1);
end
And, in response to a comment below by the original poster, this method can be extended to further unpack the data:
Time_Data = [];
count = 0;
for z = 1:2;
for aa = 1:38
count = count+1;
fieldname = ['Time' num2str(count)];
Time_Data.(fieldname) = dataCOMB{1,z}{1,aa}(1:1024,1);
end
end

I am looking for an elegant means of extracting nested data from a MATLAB data structure

Using MATLAB, other than the brute force technique of using nested FOR loops, I am curious if there is a more elegant means of extracting the X & Y data from the sample data structure that I have shown below. I haven't been able to devise an elegant way of doing this in MATLAB using bsxfun, arrayfun, or structfun.
% Create an example of the input structure that I need to parse
for i =1:100
setName = ['n' num2str(i)];
for j = 1:randi(10,1)
repName = ['n' num2str(j)];
data.sets.(setName).replicates.(repName).X = i + randn();
data.sets.(setName).replicates.(repName).Y = i + randn();
end
end
clearvars -except data
% Brute force technique using nested FOR Loops to extract X & Y from this
% nested structure for easy plotting. Is there a better way to extract the
% X & Y values created above without using FOR loops?
n = 1;
setNames = fieldnames(data.sets);
for i =1:length(setNames)
replicateNames = fieldnames(data.sets.(setNames{i}).replicates);
for j = 1:length(replicateNames)
X(n) = data.sets.(setNames{i}).replicates.(replicateNames{j}).X;
Y(n) = data.sets.(setNames{i}).replicates.(replicateNames{j}).Y;
n = n+1;
end
end
scatter(X,Y);
MATLAB works best with arrays/matrices (be it numeric arrays, struct arrays, cell arrays, object arrays, etc..). The language offers constructs to slice and index into arrays easily.
So the idiomatic way in MATLAB would have been to create a non-scalar structure array, as opposed to a deeply nested structure.
For example, let's first convert the nested structure into a 2D array of structures, where the first dimension denotes the "replicates" and the second dimension denotes the "sets":
ds = struct('X',[], 'Y',[]);
sets = fieldnames(data.sets);
for i=1:numel(sets)
reps = fieldnames(data.sets.(sets{i}).replicates);
for j=1:numel(reps)
ds(j,i) = data.sets.(sets{i}).replicates.(reps{j});
end
end
The result is a 10-by-100 structure array, each with two fields X and Y:
>> ds
ds =
10x100 struct array with fields:
X
Y
Accessing data.sets.n99.replicates.n9 in the original structure would be equivalent to ds(9,99) in the new structure.
>> data.sets.n99.replicates.n9
ans =
X: 100.3616
Y: 98.8023
>> ds(9,99)
ans =
X: 100.3616
Y: 98.8023
This new struct has the benefit that it can easily be accessed using array-indexing notation and comma-separated lists. So we can extract the X and Y vectors like you did simply as:
XX = [ds.X]; % or XX = cat(2, ds.X)
YY = [ds.Y];
scatter(XX, YY, 1)
So if you had control over building the struct, I would design it as described above to begin with. Otherwise the double for-loop in your code with the dynamic field names is the best way to extract the values from it.
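One detail worth knowing about the struct array: sets with fewer replicates than the largest set leave elements whose fields are empty, and concatenating a comma-separated list silently drops those empties. A small sketch with made-up values:

```matlab
ds = struct('X', [], 'Y', []);
ds(1,1).X = 1;  ds(1,1).Y = 10;
ds(2,1).X = 2;  ds(2,1).Y = 20;
ds(1,2).X = 3;  ds(1,2).Y = 30;     % ds(2,2) is created with empty fields

XX = [ds.X];                         % empties vanish: XX has 3 elements, not 4
YY = [ds.Y];

% Logical mask of the elements that actually hold data.
filled = ~arrayfun(@(s) isempty(s.X), ds);
```

XX and YY stay aligned as long as X and Y are empty in the same elements, which is the case when both are filled together.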
You could probably write a bunch of structfun called on each other, but that won't be the most readable code. Here is what I came up with to flatten the nested structure:
D = structfun(@(n) ...
structfun(@(nn) [nn.X nn.Y], n.replicates, 'UniformOutput',false), ...
data.sets, 'UniformOutput',false);
The resulting structure can be accessed with less nested fields:
>> D.n99.n9
ans =
100.3616 98.8023
Slightly better than the original one, but still not easily traversed without some for-loops.
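If the goal is only the X and Y vectors, the nested structfun idea can be pushed one step further by converting each level to a matrix with struct2cell and cell2mat. This is a sketch on a small made-up structure with the same layout; `repsToMatrix` is a helper name introduced here, and the approach assumes every replicate holds scalar X and Y.

```matlab
% Small example with the same nesting: set i has i replicates.
data = struct();
for i = 1:3
    for j = 1:i
        setName = sprintf('n%d', i);
        repName = sprintf('n%d', j);
        data.sets.(setName).replicates.(repName).X = i;
        data.sets.(setName).replicates.(repName).Y = 10*i + j;
    end
end

% Replicates of one set -> K-by-2 matrix with rows [X Y].
repsToMatrix = @(reps) cell2mat(struct2cell( ...
    structfun(@(r) [r.X r.Y], reps, 'UniformOutput', false)));

% Apply to every set, then stack the per-set matrices.
perSet = structfun(@(s) repsToMatrix(s.replicates), data.sets, 'UniformOutput', false);
XY = cell2mat(struct2cell(perSet));

X = XY(:, 1);
Y = XY(:, 2);
```

This avoids explicit for-loops, though the double loop is arguably easier to read.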
Since we are often "given" deeply nested structures from sources we can't control (other business units, customers, etc.), sometimes a baby's gotta do what a baby's gotta do. Here's a hack that seems to work to completely flatten a nested structure. It is also posted elsewhere in case one of these questions gets deleted. Copyright Carl Witthoft under usual GPL-3 rules.
% struct2sims converter
function simout = struct2sims(structin)
fnam = fieldnames(structin);
for jf = 1:numel(fnam)
subnam = [inputname(1),'_',fnam{jf}];
if isstruct(structin.(fnam{jf}) ) ,
% need to dive; build a new variable that's not a substruct
eval(sprintf('%s = structin.(fnam{jf});', fnam{jf}));
eval(sprintf('simtmp = struct2sims(%s);',fnam{jf}) );
% try removing the struct before getting any farther...
simout.(subnam) = simtmp;
else
% at bottom, ok
simout.(subnam) = structin.(fnam{jf});
end
end
% need to unpack structs here, after each level of recursion
% returns...
subfnam = fieldnames(simout);
for kf = 1:numel(subfnam)
if isstruct(simout.(subfnam{kf}) ),
subsubnam = fieldnames(simout.(subfnam{kf}));
for fk = 1:numel(subsubnam)
simout.([inputname(1),'_',subsubnam{fk}])...
= simout.(subfnam{kf}).(subsubnam{fk}) ;
end
simout = rmfield(simout,subfnam{kf});
end
end
% if desired write to file with:
% save('flattened','-struct','simout');
end
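For comparison, a flattening pass can also be written without eval or inputname by walking the structure with an explicit stack. This is an independent sketch, not a rewrite of struct2sims: it joins field names with underscores, leaves struct arrays unflattened, and does not guard against joined names exceeding MATLAB's maximum identifier length or colliding with each other.

```matlab
% Example input with three levels of nesting.
s = struct();
s.a.b.c = 1;
s.a.d   = 2;
s.e     = 3;

flat  = struct();
stack = {s, ''};                       % rows of {struct to visit, name prefix}
while ~isempty(stack)
    node   = stack{end, 1};
    prefix = stack{end, 2};
    stack(end, :) = [];                % pop the last row
    names = fieldnames(node);
    for k = 1:numel(names)
        key   = [prefix names{k}];
        value = node.(names{k});
        if isstruct(value) && isscalar(value)
            stack(end + 1, :) = {value, [key '_']};   % visit the substruct later
        else
            flat.(key) = value;        % leaf: copy under the joined name
        end
    end
end
```

For the example input, `flat` ends up with the fields `a_b_c`, `a_d` and `e`.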
