How to extract struct element name given part of the name? - arrays

I am currently post-processing a lot of .mat files (300+) created by a long simulation (3+ days).
Each of these .mat files contains several variables, each named after its position on a grid (yes, they were created with eval).
I therefore wrote a script that opens each of the files sequentially:
s = what(pwd);
s = s.mat;
for i=1:length(s)
    data = open(sprintf('%s',s{i,:}));
    % here goes what I would like to do
end
What I am trying to do now is to extract the current name of the data component that fits a certain pattern.
Specifically, I know that in data there is a vector named coef_%i_%i and I would like to isolate it and assign it to a multi-dimensional array.
I know how to do the second part: I can scan the variable name for the _ characters, isolate the integers, and assign the vector to its appropriate location in the array.
My question is:
Is there a way in MATLAB to do something along the lines of vectorname = extractname('data.coef*');?

Assuming you have something like:
clear data
% create a dummy field
name = sprintf ( 'coef_%i_%i', randi(100,[2 1]) );
% a data struct with a field which starts "coef*"
data.(name) = rand;
% and your data field may contain some other fields?
data.other = [];
You can then extract the fields which contain the coef string:
function output = extractname ( data )
    % extract the field names from data:
    fn = fieldnames(data);
    % flag all names that do NOT have coef in them
    index = cellfun(@isempty, strfind(fn,'coef'));
    % remove any that don't have coef in them
    fn(index) = [];
    % you are then left with one (or more) that contain coef in the name:
    output = fn;
end
If your data struct contains fields which may have "coef" elsewhere in the name, you would need to go through them and check whether coef is at the start. You should also check at the end of your function that one and only one field has been found.
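The two checks mentioned above (coef at the start of the name, and exactly one match) can be combined with the index parsing in a single regular expression. The following is only a minimal sketch, assuming the field names have the exact form coef_<int>_<int>; the dummy struct, its field names, and the output array name coef are made up for illustration.

```matlab
% Dummy struct standing in for one loaded .mat file (field names are hypothetical)
data = struct();
data.coef_3_7 = rand(4, 1);      % the vector we want
data.other = [];                 % unrelated field
data.mycoef_1_2 = 0;             % contains "coef", but not at the start

fn = fieldnames(data);
% Anchor the pattern so only names of the exact form coef_<int>_<int> match
tok = regexp(fn, '^coef_(\d+)_(\d+)$', 'tokens', 'once');
hit = ~cellfun(@isempty, tok);
assert(nnz(hit) == 1, 'Expected exactly one coef_i_j field.');

vectorname = fn{hit};            % name of the matching field
idx = str2double(tok{hit});      % the two grid indices as numbers

% Store the vector at its grid position in a multi-dimensional array
coef = [];
coef(idx(1), idx(2), 1:numel(data.(vectorname))) = data.(vectorname);
```

Inside the file loop, coef would normally be preallocated once before the loop rather than reset for each file.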

Related

Extract Data From NetCDF4 File Using List

I am using a list of integers corresponding to the x,y indices of a gridded NetCDF array to extract specific values; the initial code was derived from here. My NetCDF file has a single dimension at a single timestep, which is named TMAX2M. The code I wrote to do this is as follows (please note that I have not shown the import of netCDF4 at the top of the script):
# grid point lists
lat = [914]
lon = [2141]
# Open netCDF File
fh = Dataset('/pathtofile/temperaturedataset.nc', mode='r')
# Variable Extraction
point_list = zip(lat,lon)
dataset_list = []
for i, j in point_list:
    dataset_list.append(fh.variables['TMAX2M'][i,j])
print(dataset_list)
The code executes, and the result is as follows:
[masked_array(data=73, mask=False, fill_value=999999, dtype=int16)]
The data value here is correct, however I would like the output to only contain the integer contained in "data". The goal is to pass a number of x,y points as seen in the example linked above and join them into a single list.
Any suggestions on what to add to the code to make this achievable would be great.
Extracting the value at each x,y point for the single timestep in the dataset can be done as follows:
dataset_list = []
for i, j in point_list:
    dataset_list.append(fh.variables['TMAX2M'][:][i,j])
The previously linked example used [0,16] for the indexed variables; [:] can be used in this case.
I suggest converting to a NumPy array like this:
import numpy as np
for i, j in point_list:
    dataset_list.append(np.array(fh.variables['TMAX2M'][i,j]))

How to name variables in a data array using a for loop

I have an array within an array and I am trying to name the variables using a for loop as there are a lot of variables. When I use the following simple code Time1 = dataCOMB{1,1}{1,1}(1:1024, 1); it opens the first cell in an array and proceeds to open the first cell in the following array and finally defines all the values in column 1 rows 1 to 1024 as Time1. However I have 38 of these different sets of data and when I apply the following code:
for t = 1:38
    for aa = 1:38
        Time(t) = dataCOMB{1,1}{1,aa}(1:1024, 1);
    end
end
I get an error
In an assignment A(I) = B, the number of elements in B and I must be the same.
Error in Load_Files_working (line 39)
Time(t) = dataCOMB{1,1}{1,aa}(1:1024, 1);
Basically I am trying to get MATLAB to call the first column in each data set Time1, Time2, etc.
The problem:
1) You'd want to extract in a cell row...
2) ...the first 1024 numbers in the 1st column...
3) ...from each of the first 38 cells of a cell array.
The plan:
1) If one wants to get info from each element of a cell array (that is, an array accessed via {} indexing), one may use cellfun. Calling cellfun(some_function, a_cell_array) will aggregate the results of some_function(a_cell_array{k}) for all possible k subscripts. If the results are heterogeneous (i.e. not having the same type and size), one may use the cellfun(..., 'UniformOutput', false) option to put them in an output cell array (cell arrays are good at grouping together heterogeneous data).
2) To extract the first 1024 numbers from the first column of a numeric array x, one may use this anonymous function: @(x) x(1:1024,1). The x argument will come from each element of a cell array, and our anonymous function will play the role of some_function in the step above.
3) Now we need to specify a_cell_array, i.e. the cell array that contains the first 38 cells of the target. That would be simply dataCOMB{1,1}(1,1:38).
The solution:
This one-liner implements the plan:
Time = cellfun(@(x) x(1:1024,1), dataCOMB{1,1}(1,1:38), 'UniformOutput', false);
Then you can access your data as in this example:
this_time = Time{3};
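Since every extracted column has the same length here, the resulting cell array can also be collapsed into a plain matrix. The following is a small sketch of that idea; the dataCOMB built below is a made-up stand-in (38 numeric matrices inside one outer cell), not the asker's real data, and Time_Array is just an illustrative name.

```matlab
% Hypothetical stand-in for dataCOMB: one outer cell holding 38 numeric matrices,
% where matrix k is filled with the value k so the result is easy to check
dataCOMB = {arrayfun(@(k) k * ones(1024, 2), 1:38, 'UniformOutput', false)};

% Same one-liner as above: first 1024 rows of column 1 from each inner cell
Time = cellfun(@(x) x(1:1024, 1), dataCOMB{1,1}(1, 1:38), 'UniformOutput', false);

% All columns have equal length, so they can be concatenated side by side
Time_Array = [Time{:}];          % 1024-by-38 matrix, column k holds data set k
third_time = Time_Array(:, 3);   % same values as Time{3}
```

The matrix form is convenient for vectorised operations across data sets, while the cell form remains the safer choice if the columns might differ in length.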
Your error is with Time(t). It refers to a single element of an array named Time, which cannot hold 1024 values, and it is not how you create a new variable in MATLAB. To do exactly what you want (i.e., create variables named Time1, Time2, etc.), you'll need to use the eval function:
for aa = 1:38
    eval(['Time' num2str(aa) '= dataCOMB{1,1}{1,aa}(1:1024,1);']);
end
Many people do not like recommending the eval function. Others wouldn't recommend moving all of your data out of a data structure and into their own independently-named variables. So, to address these two criticisms, a better alternative might be to pull your data out of your complicated data structure and to put it into a simpler array:
Time_Array = zeros(1024,38);
for aa = 1:38
    Time_Array(:,aa) = dataCOMB{1,1}{1,aa}(1:1024,1);
end
Or, if you don't like that because you really like the names Time1, Time2, etc, you could create them as fields to a data structure:
Time_Data = [];
for aa = 1:38
    fieldname = ['Time' num2str(aa)];
    Time_Data.(fieldname) = dataCOMB{1,1}{1,aa}(1:1024,1);
end
And, in response to a comment below by the original poster, this method can be extended to further unpack the data:
Time_Data = [];
count = 0;
for z = 1:2
    for aa = 1:38
        count = count+1;
        fieldname = ['Time' num2str(count)];
        Time_Data.(fieldname) = dataCOMB{1,z}{1,aa}(1:1024,1);
    end
end

Accessing data in a nested structure

Okay so I've found a better way of accessing my files but I'm still a bit stuck.
My code so far:
clc % clear window
clear %clear workspace
numfiles = 21;
data = cell(1, numfiles);
obsdata = dir('*.mat');
numfiles = length(obsdata);
data = cell(1, numfiles);
for k = 1:numfiles
data{k} = load(obsdata(k).name);
end
This sorts my data out.
There are 21 cells that contain the J6 files, as shown (the list of the files can be seen on the left):
Clicking each cell brings me to a structure:
Each of which contains data that I want to access.
I'm unsure how to write my code so that I can store the data in the last part in two arrays (wavelength and intensity).
Try this. I used deal to copy out each of the obsdata.name's into a cell array called names. (It's probably a poor choice of name for a variable since you already have one called name, but anyway...)
obsdata = dir('*.mat');
numfiles = length(obsdata);
data = cell(1, numfiles);
names = cell(1,numfiles);
[names{:}] = deal(obsdata.name);
for k = 1:numfiles
    data{k} = load(names{k});
end
It'd help to hear a little bit more about how you want to store the data, but generally the work will be done inside the loop you use to load the files. A good starting point might be to get the data out of structs and into cell arrays:
for k = 1:numfiles
data{k} = struct2cell(load(obsdata(k).name));
end
I believe (it's been a while, and I don't have access to matlab anymore) that each of the 21 cells will now contain a cell array, storing the matrices you're interested in. Perhaps this is enough? From this point I think you can access data like this:
data{file_num}{struct_field_num}(x,y)
where x and y are the indices within the matrices that used to be stored as fields in a struct.
If you want to concatenate each of these matrices so that you get 21 cells, each with a single mx2 matrix, you can modify the loop:
for k = 1:numfiles
    % load returns a struct, so convert it to a cell array before expanding it
    tmp = struct2cell(load(obsdata(k).name));
    data{k} = vertcat(tmp{:});
end
With more information about how you want to structure the data we can refine the answer.

Read multiple values from array dynamically in MATLAB

I have an array of structures.
I'm trying to select several records from the array, that match some condition.
I know there's this option: (For example array A with field f1):
A([A.f1]==5)
Which would return all the records that have f1 = 5.
But I want to do it for several different fields, in a loop. I saved the fields names in a cell array, but I don't know how to do the same with a dynamic field name.
I know there's the 'getfield' function, but it only selects a field from a single structure.
Is there a way to do it?
Thanks!
To access a field of a structure dynamically:
% Create example structure
s.a = 1;
s.b = 2;
% Suppose you retrieve the fieldnames (or hardcode them fnames = {'a','b'})
fnames = fieldnames(s);
Then you can retrieve, e.g., the second one:
s.(fnames{2})
In a loop:
for f = 1:numel(fnames)
    s.(fnames{f})
end
In your case:
A([A.(fnames{ii})] == n)
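Put together, the selection can be repeated for each stored field name in a loop. The following is a minimal sketch, assuming every listed field holds a numeric scalar in each record (otherwise the [A.(...)] concatenation will not line up with the records); the struct array, the field list, and the value n are made up for illustration.

```matlab
% Hypothetical struct array and field list, for illustration only
A = struct('f1', {5, 3, 5}, 'f2', {1, 5, 5});
fnames = {'f1', 'f2'};
n = 5;

matches = cell(size(fnames));       % one result per field name
all_match = true(size(A));          % records where every listed field equals n
for ii = 1:numel(fnames)
    mask = [A.(fnames{ii})] == n;   % dynamic field name applied to the whole array
    matches{ii} = A(mask);          % records whose current field equals n
    all_match = all_match & mask;
end
both = A(all_match);                % records matching on all listed fields
```

Keeping the logical masks around makes it easy to combine conditions with & or | instead of re-filtering the array for each field.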
This code will run through the first 5 entries of your cell array of field names:
for i=1:5
    eval(['A([A.' cell_array{i} ']==5)'])
end

How to save a structure array to a text file

In MATLAB how do I save a structure array to a text file so that it displays everything the structure array shows in the command window?
I know this thread is old but I hope it's still going to help someone:
I think this is a shorter solution (with the constraint that each struct field can contain only scalars, arrays, or strings):
%assume that your struct array is named data
temp_table = struct2table(data);
writetable(temp_table,'data.csv')
Now your struct array is stored in the data.csv file. The column names are the field names of the struct, and the rows are the individual structs of your struct array.
You have to define a format for your file first.
Saving to a MATLAB workspace file (.MAT)
If you don't care about the format, and simply want to be able to load the data at a later time, use save, for example:
save 'myfile.mat' structarr
That stores the struct array structarr in a binary MAT-file named "myfile.mat". To read it back into your workspace, you can use load:
load 'myfile.mat'
Saving as comma-separated values (.CSV)
If you want to save your struct array in a text file as comma-separated value pairs, where each pair contains the field name and its value, you can do something along these lines:
%// Extract field data
fields = repmat(fieldnames(structarr), numel(structarr), 1);
values = struct2cell(structarr);
%// Convert all numerical values to strings
idx = cellfun(@isnumeric, values);
values(idx) = cellfun(@num2str, values(idx), 'UniformOutput', 0);
%// Combine field names and values in the same array
C = {fields{:}; values{:}};
%// Write fields to CSV file
fid = fopen('myfile.csv', 'wt');
fmt_str = repmat('%s,', 1, size(C, 2));
fprintf(fid, [fmt_str(1:end - 1), '\n'], C{:});
fclose(fid);
This solution assumes that each field contains a scalar value or a string, but you can extend it as you see fit, of course.
To convert any data type to a character vector as displayed in the MATLAB command window, use the function
str = matlab.unittest.diagnostics.ConstraintDiagnostic.getDisplayableString(yourArray);
You can then write the contents to a file
fid = fopen('myFile.txt', 'w');
fwrite(fid, str, '*char');
fclose(fid);

Resources