How to use CONCATENATEX or a similar function to get distinct IDs with many types in Power BI

I tried the classic formulation of CONCATENATEX, but it didn't work.
I need to transform this table
ID TYPE
1 A
1 B
2 B
3 A
4 A
4 A
4 A
4 C
4 D
4 E
5 B
5 B
6 A
7 A
7 B
7 C
8 B
8 B
9 D
10 A
10 A
10 D
to this table
ID TYPES
1 A,B
2 B
3 A
4 A,A,A,C,D,E
5 B,B
6 A
7 A,B,C
8 B,B
9 D
10 A,A,D
I looked for an answer on exceltown, but it didn't help.
kombi = CONCATENATEX(TABLE;TYPE;"+")
I expect a result like A+B, A+A+A, or A+C for each ID, but the result looks like
A+A+A+A+A+B+B+B+B+B+C+C+C+C+B+B+B+B+B+A+A+A+A++D+D+D+

You can use the Power Query below to get the desired result.
let
Source = Excel.Workbook(File.Contents("c:\Desktop\stac.xlsx"), null, true),
Sheet1_Sheet = Source{[Item="Sheet1",Kind="Sheet"]}[Data],
#"Promoted Headers" = Table.PromoteHeaders(Sheet1_Sheet, [PromoteAllScalars=true]),
#"Changed Type1" = Table.TransformColumnTypes(#"Promoted Headers",{{"TYPE", type text}, {"ID", type text}}),
#"Changed Type" = Table.TransformColumnTypes(#"Changed Type1",{{"ID", type text}, {"TYPE", type text}}),
#"Grouped Rows1" = Table.Group(#"Changed Type", {"ID"}, {{"All Rows", each _, type table [ID=text, TYPE=text]}}),
#"Added Custom" = Table.AddColumn(#"Grouped Rows1", "Custom", each [All Rows][TYPE]),
#"Extracted Values" = Table.TransformColumns(#"Added Custom", {"Custom", each Text.Combine(List.Transform(_, Text.From), "+"), type text}),
#"Removed Columns" = Table.RemoveColumns(#"Extracted Values",{"All Rows"})
in
#"Removed Columns"
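The grouping step in the query above can be illustrated outside Power BI. The following is only a minimal Python sketch of the same idea (group the rows by ID, then join each group's TYPE values); the function name `concat_types` and the hard-coded rows are assumptions made for illustration and are not part of the original answer.

```python
from collections import defaultdict

# Sample rows copied from the question, as (ID, TYPE) pairs.
rows = [
    (1, "A"), (1, "B"), (2, "B"), (3, "A"),
    (4, "A"), (4, "A"), (4, "A"), (4, "C"), (4, "D"), (4, "E"),
    (5, "B"), (5, "B"), (6, "A"),
    (7, "A"), (7, "B"), (7, "C"),
    (8, "B"), (8, "B"), (9, "D"),
    (10, "A"), (10, "A"), (10, "D"),
]


def concat_types(rows, sep=","):
    """Group rows by ID and join the TYPE values of each group (hypothetical helper)."""
    groups = defaultdict(list)
    for id_, type_ in rows:
        groups[id_].append(type_)  # keeps duplicates and original row order
    return {id_: sep.join(types) for id_, types in groups.items()}


result = concat_types(rows)
for id_, types in result.items():
    print(id_, types)
```

The key point is that the concatenation runs once per ID group rather than once over the whole table, which is what the single unfiltered CONCATENATEX call in the question ends up doing.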

Related

Cumulative Sum from a range identified based on Vlookup

In Excel sheet 1, I have the following data:
A B C D E F G
------------------------------
Name1 1 2 3 4 5 6
Name2 2 9 3 8 4 7
Name3 4 6 0 3 2 1
In Excel sheet 2, I have to calculate the cumulative sum based on the values in sheet 1.
For example,
A B C D E F G
------------------------------
Name1 1 3 6 10 15 21
While I can calculate cumulative sum easily, I do not know how to select the correct range of cells from sheet 1, by searching for 'Name1'
You need a SUMPRODUCT with both relative and absolute column/row cell references.
=SUMPRODUCT(($A2:INDEX($A:$A,MATCH(1E+99,$B:$B))=$I5)*($B2:INDEX(B:B,MATCH(1E+99, B:B))))
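To make the intended result concrete, here is a small Python sketch of the same lookup-then-accumulate logic: find the row by its name, then take running totals across it. It is not a translation of the formula; the dictionary `sheet1` and the helper `cumulative_row` are assumptions introduced for illustration.

```python
from itertools import accumulate

# Sheet 1 data from the question, keyed by the name in column A.
sheet1 = {
    "Name1": [1, 2, 3, 4, 5, 6],
    "Name2": [2, 9, 3, 8, 4, 7],
    "Name3": [4, 6, 0, 3, 2, 1],
}


def cumulative_row(table, name):
    """Look up the row for `name` and return its running totals (hypothetical helper)."""
    return list(accumulate(table[name]))


name1_running = cumulative_row(sheet1, "Name1")
print(name1_running)
```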

Create result sets from matches across 2 tables

Given a set of results in table 1
Col 1 Col 2 Col 3 Result
A B C 1
A B D 2
A B D 3
A B E 4
A B E 5
A B F 6
and a set of conditions in table 2
Col 1 Col 2 Col 3
A B C
A B D
A B E
how do I return a table of 'Result Sets' (or grouped results) in T-SQL to identify all the possible combinations of results from table 1 where all conditions in table 2 are met?
Result Set Result
1 1
1 2
1 4
2 1
2 2
2 5
3 1
3 3
3 4
4 1
4 3
4 5
EDIT: To clarify, the 'Result Set' value in the output table would be generated in the T-SQL to identify each set of results.
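One way to read the requirement is as a Cartesian product: collect the results that match each condition row, then take every combination that picks one result per condition. The Python sketch below illustrates that reading only; it is not T-SQL, and the function name `result_sets` and the tuple layout are assumptions made for this example.

```python
from itertools import product

# Table 1: (Col 1, Col 2, Col 3, Result)
table1 = [
    ("A", "B", "C", 1),
    ("A", "B", "D", 2),
    ("A", "B", "D", 3),
    ("A", "B", "E", 4),
    ("A", "B", "E", 5),
    ("A", "B", "F", 6),
]

# Table 2: conditions as (Col 1, Col 2, Col 3)
table2 = [("A", "B", "C"), ("A", "B", "D"), ("A", "B", "E")]


def result_sets(results, conditions):
    """Return (result_set_id, result) pairs, one set per combination (hypothetical helper)."""
    # For each condition, the list of results whose first three columns match it.
    matches = [[r[3] for r in results if r[:3] == cond] for cond in conditions]
    out = []
    # product() yields nothing if any condition has no match,
    # so only complete combinations are numbered.
    for set_id, combo in enumerate(product(*matches), start=1):
        out.extend((set_id, result) for result in combo)
    return out


sets = result_sets(table1, table2)
for set_id, result in sets:
    print(set_id, result)
```

In SQL the same shape would typically come from joining the per-condition matches against each other, but the exact query depends on whether the number of conditions is fixed.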

Calculating difference between both adjacent and non-adjacent pairs using multiple index vectors

I have three numerical vectors containing position values (pos), a category (type), and an index (ind), in these general forms:
pos =
2 4 5 11 1 5 8 11 12 20
type =
1 2 1 2 1 1 2 1 2 3
ind =
1 1 1 1 2 2 2 2 2 2
I want to calculate the difference between values held within pos but only between the same types, and confined to each index. Using the above example:
When ind = 1
The difference(s) between type 1 positions = 3 (5-2).
The difference(s) between type 2 positions = 7 (11-4).
In the case where more than two instances of any given type exist within any index, the differences are calculated sequentially from left to right as shown here:
When ind = 2
The difference(s) between type 1 positions = 4 (5-1), 6 (11-5).
The difference(s) between type 2 positions = 4 (12-8).
Even though index 2 contains type '3', no difference is calculated as only 1 instance of this type is present.
Types are not always only 1, 2 or 3.
Ideally, the desired output would be a matrix containing the same number of columns as length(unique(type)), with each column holding all differences calculated for that type. The output does not need to be separated by index; only the calculation itself does. In this case there are three unique types, so the output would be (labels added for clarity only):
Type 1 Type 2 Type 3
3 7 0
4 4 0
6 0 0
Any empty entries can be padded with zeroes.
Is there a concise or fast manner to do this?
EDIT 2:
Additional input/output example.
pos = [1 15 89 120 204 209 8 43 190 304]
type = [1 1 1 2 2 1 2 3 2 3]
ind = [1 1 1 1 1 1 2 2 2 2]
Desired output:
Type 1 Type 2 Type 3
14 84 261
74 182 0
120 0 0
In this case, the script works perfectly:
At least for creating the output matrix a loop is required:
pos = [2 4 5 11 1 5 8 11 12 20]
type = [1 2 1 2 1 1 2 1 2 3]
ind = [1 1 1 1 2 2 2 2 2 2]
%// get unique combinations of type and ind
[a,~,subs] = unique( [type(:) ind(:)] , 'rows')
%// create differences
%// output is cell array according to a
temp = accumarray(subs,1:numel(subs),[],@(x) {abs(diff(pos(x(end:-1:1))))} )
%// creating output matrix
for ii = 1:max(a(:,1)) %// iterating over types
vals = [temp{ a(:,1) == ii }]; %// differences for each type
out(1:numel(vals),ii) = vals;
end
out =
3 7 0
4 4 0
6 0 0
In case it doesn't work for your real data you may need unique(...,'rows','stable') and a 'stable' accumarray.
It turned out that the above solution gives different results depending on the system.
The only reason the code could give different results on different machines is that accumarray is not "stable", as mentioned above, so in some very rare cases it can return unpredictable results. Please try the following:
pos = [2 4 5 11 1 5 8 11 12 20]
type = [1 2 1 2 1 1 2 1 2 3]
ind = [1 1 1 1 2 2 2 2 2 2]
%// get unique combinations of type and ind
[a,~,subs] = unique( [type(:) ind(:)] , 'rows')
%// take care of unstable accumarray
[~, I] = sort(subs);
pos = pos(I);
subs = subs(I,:);
%// create differences
%// output is cell array according to a
temp = accumarray(subs,1:numel(subs),[],@(x) {abs(diff(pos(x(end:-1:1))))} )
%// creating output matrix
for ii = 1:max(a(:,1)) %// iterating over types
vals = [temp{ a(:,1) == ii }]; %// differences for each type
out(1:numel(vals),ii) = vals;
end
out =
3 7 0
4 4 0
6 0 0
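For readers who want to check the expected numbers independently of accumarray's ordering behaviour, the same computation can be sketched in plain Python. This is only an illustration under the assumption that the inputs are processed left to right; the function name `grouped_differences` is hypothetical and not part of the answer above.

```python
from collections import defaultdict


def grouped_differences(pos, types, ind):
    """Differences between consecutive positions of the same type within each index.

    Returns a zero-padded matrix (list of rows) with one column per unique type.
    """
    last = {}                  # last position seen for each (index, type) pair
    diffs = defaultdict(list)  # all differences collected per type
    for p, t, i in zip(pos, types, ind):
        key = (i, t)
        if key in last:
            diffs[t].append(abs(p - last[key]))
        last[key] = p
    columns = sorted(set(types))
    height = max((len(diffs[t]) for t in columns), default=0)
    # Pad shorter columns with zeros, as requested in the question.
    return [
        [diffs[t][r] if r < len(diffs[t]) else 0 for t in columns]
        for r in range(height)
    ]


out = grouped_differences(
    [2, 4, 5, 11, 1, 5, 8, 11, 12, 20],
    [1, 2, 1, 2, 1, 1, 2, 1, 2, 3],
    [1, 1, 1, 1, 2, 2, 2, 2, 2, 2],
)
for row in out:
    print(row)
```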

Filter matrix by multiple column values w/o loops (Matlab)?

Say I have the following:
Data matrix M (m-by-n);
Matching row V (1-by-n);
Matching positions I (1-by-n logical);
I want to keep only the rows of M that have the same values as V at the matching positions I. I believe that Matlab indexing is powerful enough to do that without loops. But how?
Current solution: run through all the columns and update the filtered row positions F (m-by-1 logical).
F = true(m,1);
for k = 1:n;
if I(k);
F = F & (M(:,k)==V(k));
end;
end;
M = M(F,:);
Here's one way:
result = M(all(bsxfun(@eq, M(:,I), V(I)), 2), :);
How it works
Each row of M(:,I) is compared element-wise with the row vector V(I) using bsxfun. Rows for which all columns match are selected. The resulting logical vector is used to index the rows of M.
Example
M = [ 8 3 6 9
5 4 9 8
8 9 6 9 ];
I = [ true false true true ];
V = [ 8 1 6 9 ];
>> result = M(all(bsxfun(@eq, M(:,I), V(I)), 2), :)
result =
8 3 6 9
8 9 6 9
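The selection rule itself (keep a row only if it equals V at every position flagged in I) can be restated as a short Python sketch using the same example data. The helper name `filter_rows` is an assumption for illustration; it mirrors the logic, not MATLAB's vectorised execution.

```python
def filter_rows(M, V, I):
    """Keep the rows of M that equal V at every position where I is True (hypothetical helper)."""
    cols = [k for k, use in enumerate(I) if use]  # indices of the matching positions
    return [row for row in M if all(row[k] == V[k] for k in cols)]


M = [
    [8, 3, 6, 9],
    [5, 4, 9, 8],
    [8, 9, 6, 9],
]
I = [True, False, True, True]
V = [8, 1, 6, 9]

result = filter_rows(M, V, I)
for row in result:
    print(row)
```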

How to create equivalence class from an array in matlab

I am working on matlab and have an array:
a =
2 1 5 3 2 1 2 1
As you can see, a value may appear multiple times. I want a function that gives me, for each distinct value, an array containing the indices at which that value occurs.
Using the above example, the output would be:
(1 5 7)
(2 6 8)
(3)
(4)
(1 5 7) are the indices of 2 in the input array. The same applies to 1, 5 and 3.
This can be done using for loops etc. I just want to know if there is a built-in function for this in matlab.
**** EDIT ****
There may be two rows, as follows.
2 1 5 3 2 1 2 1
3 4 3 2 4 4 3 4
In that case output will be
(1 7)
(2 6 8)
(3)
(4)
(5)
Use the third output of unique to get unique labels for each column of a, and then apply accumarray with a custom function:
[~, ~, kk] = unique(a.', 'rows', 'stable'); %'
result = accumarray(kk, (1:numel(kk)).', [], @(x) {sort(x).'});
This works for any number of rows. For your two-row example
a = [2 1 5 3 2 1 2 1
3 4 3 2 4 4 3 4];
the result is
result{1} =
1 7
result{2} =
2 6 8
result{3} =
3
result{4} =
4
result{5} =
5
If element order is not important, you can simplify the code a little:
[~, ~, kk] = unique(a.', 'rows'); %'
result = accumarray(kk, (1:numel(kk)).', [], @(x) {x.'});
which gives
result{1} =
2 8 6
result{2} =
7 1
result{3} =
5
result{4} =
4
result{5} =
3
I don't think there is a built-in function to do that, but you can do it in one line with unique and arrayfun:
Res = arrayfun(@(x) find(x==a), unique(a, 'stable'), 'UniformOutput', false);
Extending Ratbert's nice answer gives a more general one that addresses the "edited" request:
[~, ~, J]= unique(a.', 'rows');
Res = cellfun(@str2num,accumarray(J,[1:size(a,2)]',[], @(x)num2str(x)),'uni',0);
Use of conversion from double to char and back is clunky, but it works with Octave 3.6.4, and should also work on MATLAB. In MATLAB a more elegant answer with accumarray is likely possible.
Edit: The following is a more elegant answer with accumarray (see the MATLAB documentation for more details) - equivalent to LuisMendo's answer.
[~, ~, J]= unique(a.', 'rows');
Res = accumarray(J,[1:numel(J)]',[],@(x){x});
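The grouping that unique and accumarray perform together can also be shown as a small Python sketch: treat each column as a key and collect the 1-based positions at which that key occurs. The function name `index_classes` is an assumption for illustration; groups are returned in order of first appearance, matching the 'stable' variant above.

```python
def index_classes(*rows):
    """Group 1-based column indices by the column's values (hypothetical helper).

    Each argument is one row of the array; a column is the tuple of values
    taken from the same position in every row.
    """
    classes = {}
    for idx, key in enumerate(zip(*rows), start=1):
        classes.setdefault(key, []).append(idx)
    # dicts preserve insertion order, so classes appear in order of first occurrence
    return list(classes.values())


one_row = index_classes([2, 1, 5, 3, 2, 1, 2, 1])
two_rows = index_classes(
    [2, 1, 5, 3, 2, 1, 2, 1],
    [3, 4, 3, 2, 4, 4, 3, 4],
)
print(one_row)
print(two_rows)
```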

Resources