Finding an element of a structure based on a field value - arrays

I have a 1x10 structure array with plenty of fields, and I would like to remove from the struct array the element that has a specific value in one of its fields.
I know the value I'm looking for and the field to look in, and I also know how to delete the element from the struct array once I find it. The question is how (if possible) to identify it elegantly without a brute-force solution, i.e. a for-loop that goes through the elements of the struct array and compares each with the value I'm looking for.
Sample code: buyers as 1x10 struct array with fields:
id,n,Budget
and the variable to find in the id values like id_test = 12

You can use the fact that if you have an array of structs, and you use the dot referencing, this creates a comma-separated list. If you enclose this in [] it will attempt to create an array and if you enclose it in {} it will be coerced into a cell array.
a(1).value = 1;
a(2).value = 2;
a(3).value = 3;
% Into an array
[a.value]
% 1 2 3
% Into a cell array
{a.value}
% [1] [2] [3]
So to do your comparison, you can convert the field you care about into either an array or a cell array. The comparison then yields a logical array which you can use to index into the original structure.
For example
% Some example data
s = struct('id', {1, 2, 3}, 'n', {'a', 'b', 'c'}, 'Budget', {100, 200, 300});
% Remove all entries with id == 2
s = s([s.id] ~= 2);
% Remove entries that have an id of 2 or 3
s = s(~ismember([s.id], [2 3]));
% Keep only the entries with an `n` of 'a' (uses a cell array since they're strings)
s = s(ismember({s.n}, 'a'));

Related

How to convert two associated arrays so that elements are evenly distributed?

There are two arrays, an array of images and an array of the corresponding labels (e.g. pictures of figures and their values).
The occurrences in the labels are unevenly distributed.
What I want is to cut both arrays in such a way, that the labels are evenly distributed. E.g. every label occurs 2 times.
To test I've just created two 1D arrays and it was working:
labels = np.array([1, 2, 3, 3, 1, 2, 1, 3, 1, 3, 1,])
images = np.array(['A','B','C','C','A','B','A','C','A','C','A',])
x, y = zip(*sorted(zip(images, labels)))
label = list(set(y))
new_images = []
new_labels = []
amount = 2
for i in label:
    start = y.index(i)
    stop = start + amount
    new_images = np.append(new_images, x[start: stop])
    new_labels = np.append(new_labels, y[start: stop])
What I get/want is this:
new_labels: [ 1. 1. 2. 2. 3. 3.]
new_images: ['A' 'A' 'B' 'B' 'C' 'C']
(It is not necessary, that the arrays are sorted)
But when I tried it with the right data (images.shape = (35000, 32, 32, 3), labels.shape = (35000)) I've got an error:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
This error message does not help me a lot.
I think that my solution is quite dirty anyhow. Is there a way to do it right?
Thank you very much in advance!
sorted compares the tuples it receives element by element. With the 1D test data every element is a single string or number, but with your real data each tuple contains an image array. As soon as two image arrays have to be compared, the result is an array rather than a single True or False, and Python raises this error.
Let me explain it in a bit more detail:
x, y = zip(*sorted(zip(images, labels)))
First, you zip your images and labels. This creates tuples of corresponding elements from images and labels: the first element of images paired with the first element of labels, and so on.
In the case of your real data, each label is paired with an array of shape (32, 32, 3).
Second, you sort all those tuples. sorted compares the first elements of two tuples, and only when they are equal does it move on to the second elements. Whenever this comparison reaches two image arrays, it cannot reduce the result to a single True or False and throws the error.
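The element-by-element comparison of tuples can be reproduced without NumPy. The following is only a minimal sketch: plain dicts are used as hypothetical stand-ins for the image arrays, because they also cannot be ordered with <. Dicts raise a TypeError where NumPy arrays raise the ValueError shown in the question, but the fall-through to the second tuple element is the same.

```python
# Dicts stand in for image arrays here (an assumption for illustration):
# like arrays, they cannot be ordered with "<".
no_ties = [(2, {"img": "b"}), (1, {"img": "a"})]
with_ties = [(2, {"img": "b"}), (1, {"img": "a"}), (2, {"img": "c"})]

# All first elements differ, so the second elements are never compared.
sorted_no_ties = sorted(no_ties)
print(sorted_no_ties)

# Two tuples share the first element 2, so Python falls through to
# comparing the dicts, which is not supported.
error_raised = False
try:
    sorted(with_ties)
except TypeError as err:
    error_raised = True
    print("sort failed:", err)
```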
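You can solve this by explicitly telling sorted to sort only on the labels. Because the tuples are built with zip(images, labels), the label is the second tuple element (index 1), so the key has to select index 1. A key on index 0 would pick the image arrays and raise the same error.
x, y = zip(*sorted(zip(images, labels), key=lambda pair: pair[1]))
If performance matters, itemgetter is typically a little faster than a lambda.
from operator import itemgetter
x, y = zip(*sorted(zip(images, labels), key=itemgetter(1)))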
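Since the question notes that the result does not have to be sorted, the sort can also be avoided altogether. The sketch below is one possible approach, not part of the original answer: balanced_indices is a hypothetical helper that collects the positions of the first `amount` occurrences of each label, so the images themselves are never compared. It uses plain lists with the toy data from the question.

```python
from collections import defaultdict


def balanced_indices(labels, amount):
    """Return the positions of the first `amount` occurrences of each label."""
    kept = defaultdict(int)  # label -> number of occurrences kept so far
    indices = []
    for i, label in enumerate(labels):
        if kept[label] < amount:
            kept[label] += 1
            indices.append(i)
    return indices


labels = [1, 2, 3, 3, 1, 2, 1, 3, 1, 3, 1]
images = ['A', 'B', 'C', 'C', 'A', 'B', 'A', 'C', 'A', 'C', 'A']

idx = balanced_indices(labels, amount=2)
new_labels = [labels[i] for i in idx]
new_images = [images[i] for i in idx]
print(new_labels)  # [1, 2, 3, 3, 1, 2]
print(new_images)  # ['A', 'B', 'C', 'C', 'A', 'B']
```

With NumPy arrays, the same index list should be usable directly as images[idx] and labels[idx], which keeps the two arrays aligned. A label that occurs fewer than `amount` times is kept as often as it occurs, so the result is only balanced if every label is frequent enough.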

An array of arrays of different sizes

I'm learning R and I'd like to make an "array of arrays" (not sure if the expression is correct) inserting for example these values
N_seq = c(10,50,100,500,1000)
inside this function (not correct):
x = rnorm(N_seq,3.2,1)
The desired result should be an object made of five arrays (as length(N_seq) = 5), where each one is the result of x for the corresponding value of N_seq, so that x[1] has the values of rnorm(N_seq[1], 3.2, 1) with length 10, x[2] has the values of rnorm(N_seq[2], 3.2, 1) with length 50, etc.
For a ragged array, use a "list". This is a special type of "vector" in R. Each list element can hold not only a vector of a different length, but also a different type of object.
The lapply function for "list apply" is frequently used to process a list and / or return a list. For your task, you can do:
lapply(N_seq, FUN = rnorm, mean = 3.2, sd = 1)
lapply applies the function FUN to each element of N_seq, where mean = 3.2 and sd = 1 are additional parameters passed to FUN, which is rnorm here.

Modify struct array and return struct array

I have a struct array: a 1x10 struct array with fields: N, t, q, r, T, each of which is a vector of type double.
The 10 array entries each represent the outcome of a testing condition in an experiment. I would like to be able to make a function that takes two indices, index1 and index2, and modifies the constituent N, t, q, r vectors (T is a single number) so that they become length index1:index2. Something like
function sa = modifier(struct_array, index1, index2)
    sa = structfun(@(x) x(index1:index2), struct_array, 'UniformOutput', false)
    stuff
end
Now, where stuff is, I've tried using structfun and cellfun (see here), except that those return a struct and a cell array, respectively, whereas I need to return a struct array.
The purpose of this is to be able to get certain sections of the experimental results, e.g. maybe the first five entries in each vector inside each cell correspond to the initial cycles of the experiment.
Please let me know if this is possible, and how I might go about it!
You can try this:
From this question's answer, I figured out how to loop through struct fields. I modified the code to address your question by extracting a subsample from each field that goes through the for loop and then copy the desired subset of that data into a new struct array with identically named fields.
% Define indexes for extraction
fieldsToTrim = {'a' 'b'};
idx = 2:3; % Create index vector for extracting selected data range
% Define test struct to be read
teststruct.a = [1 2 3];
teststruct.b = [4 5 6];
teststruct.c = [7 8 9];
% Get names of struct fields
fields = fieldnames(teststruct);
% Loop through each field and extract the subset
for i = 1:numel(fields)
    if any(strcmp(fields{i}, fieldsToTrim))
        % If the current field matches one of the fields selected for
        % extraction, extract the subset
        teststructResults.(fields{i}) = teststruct.(fields{i})(idx);
    else
        % Else, copy all contents of the field to the resulting struct
        teststructResults.(fields{i}) = teststruct.(fields{i});
    end
end
Finally, to turn this into a function, you can modify the above code to this:
function teststructResults = extractSubsetFromStruct(teststruct, fieldsToTrim, idx1, idx2)
    % idx1 and idx2 are the start and end indices of the desired range
    % fieldsToTrim is a cell array of the field names you want
    % included in the trimming, all other fields will be fully copied
    % teststruct is your input structure which you are extracting the
    % subset from
    % teststructResults is the output containing identically named
    % struct fields to the input, but only containing data from the selected range
    idx = idx1:idx2; % Create index vector for extracting selected data range
    % Get names of struct fields
    fields = fieldnames(teststruct);
    % Loop through each field and extract the subset
    for i = 1:numel(fields)
        if any(strcmp(fields{i}, fieldsToTrim))
            % If the current field matches one of the fields selected for
            % extraction, extract the subset
            temp = teststruct.(fields{i});
            teststructResults.(fields{i}) = temp(idx);
        else
            % Else, copy all contents of the field to the resulting struct
            teststructResults.(fields{i}) = teststruct.(fields{i});
        end
    end
end
I successfully ran the function like this:
teststruct =
a: [1 2 3]
b: [4 5 6]
c: [7 8 9]
>> extractSubsetFromStruct(teststruct,{'a' 'b'},2,3)
ans =
a: [2 3]
b: [5 6]
c: [7 8 9]

Get indices of string occurrences in cell-array

I have a cell array that contains a long list of strings. Most of the strings are in duplicates. I need the indices of instances of a string within the cell array.
I tried the following:
[bool,ind] = ismember(string,var);
This consistently returns a scalar ind, while there is clearly more than one index at which the contents of the cell array match string.
How can I have a list of indices that points to the locations in the cell array that contains string?
As an alternative to Divakar's comment, you could use strcmp. This works even if some cell doesn't contain a string:
>> strcmp('aaa', {'aaa', 'bb', 'aaa', 'c', 25, [1 2 3]})
ans =
1 0 1 0 0 0
Alternatively, you can ID each string and thus have representative numeric arrays corresponding to the input cell array and string. For IDing, you can use unique and then use find as you would with numeric arrays. Here's how you can achieve that -
var_ext = [var string]
[~,~,idx] = unique(var_ext)
out = find(idx(1:end-1)==idx(end))
Breakdown of the code:
var_ext = [var string]: Concatenate everything (string and var) into a single cell array, with the string ending up at the end (last element) of it.
[~,~,idx] = unique(var_ext): ID everything in that concatenated cell array.
find(idx(1:end-1)==idx(end)): idx(1:end-1) represents the numeric IDs for the cell array elements and idx(end) would be the ID for the string. Compare these IDs and use find to pick up the matching indices to give us the final output.
Sample run -
Inputs:
var = {'er','meh','nop','meh','ya','meh'}
string = 'meh'
Output:
out =
2
4
6
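regexp is another simple option. Note that [] concatenates the pieces into a single char array, so the result gives character positions within that array rather than cell indices.
string = ['my' 'bat' 'my' 'ball' 'my' 'score']
expression = 'my'
regexp(string, expression)
ans = 1 6 12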

How do concatenation and indexing differ for cells and arrays in MATLAB?

I am a little confused about the usage of cells and arrays in MATLAB and would like some clarification on a few points. Here are my observations:
An array can dynamically adjust its own memory to allow for a dynamic number of elements, while cells seem to not act in the same way:
a=[]; a=[a 1]; b={}; b={b 1};
Several elements can be retrieved from cells, but it doesn't seem like they can be from arrays:
a={'1' '2'}; figure; plot(...); hold on; plot(...); legend(a{1:2});
b=['1' '2']; figure; plot(...); hold on; plot(...); legend(b(1:2));
%# b(1:2) is an array, not its elements, so it is wrong with legend.
Are these correct? What are some other different usages between cells and array?
Cell arrays can be a little tricky since you can use the [], (), and {} syntaxes in various ways for creating, concatenating, and indexing them, although they each do different things. Addressing your two points:
To grow a cell array, you can use one of the following syntaxes:
b = [b {1}]; % Make a cell with 1 in it, and append it to the existing
% cell array b using []
b = {b{:} 1}; % Get the contents of the cell array as a comma-separated
% list, then regroup them into a cell array along with a
% new value 1
b{end+1} = 1; % Append a new cell to the end of b using {}
b(end+1) = {1}; % Append a new cell to the end of b using ()
When you index a cell array with (), it returns a subset of cells in a cell array. When you index a cell array with {}, it returns a comma-separated list of the cell contents. For example:
b = {1 2 3 4 5}; % A 1-by-5 cell array
c = b(2:4); % A 1-by-3 cell array, equivalent to {2 3 4}
d = [b{2:4}]; % A 1-by-3 numeric array, equivalent to [2 3 4]
For d, the {} syntax extracts the contents of cells 2, 3, and 4 as a comma-separated list, then uses [] to collect these values into a numeric array. Therefore, b{2:4} is equivalent to writing b{2}, b{3}, b{4}, or 2, 3, 4.
With respect to your call to legend, the syntax legend(a{1:2}) is equivalent to legend(a{1}, a{2}), or legend('1', '2'). Thus two arguments (two separate characters) are passed to legend. The syntax legend(b(1:2)) passes a single argument, which is a 1-by-2 string '12'.
Every cell array is an array! From this answer:
[] is an array-related operator. An array can be of any type - array of numbers, char array (string), struct array or cell array. All elements in an array must be of the same type!
Example: [1,2,3,4]
{} is a type. Imagine you want to put items of different type into an array - a number and a string. This is possible with a trick - first put each item into a container {} and then make an array with these containers - cell array.
Example: [{1},{'Hallo'}] with shorthand notation {1, 'Hallo'}

Resources