How to map Huggingface trainer output to actual label - artificial-intelligence

I have a dataset for which I compute one-hot encoded labels for the Hugging Face Trainer. However, I have to drop some labels before training, and I don't know in advance which ones. The resulting label space looks something like this:
{[1,0,0,0], [0,0,1,0], [0,0,0,1]}
Note how [0,1,0,0] is not in the list.
Now, when evaluating the model, it returns an array of probabilities.
Assuming I have hundreds of labels and don't know which ones have been dropped, how do I map the trainer's output (e.g. after softmax) to the correct labels?
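One possible way to keep track of this, sketched under assumptions: record which label columns survive the drop and build an index-to-name mapping from them. The names all_labels, build_id2label and decode are hypothetical, as are the label names themselves. The sketch assumes the model emits one probability per surviving column, in the original column order. If the model output instead keeps the full original width (as the 4-element vectors above might suggest), the argmax index already matches the original label list and no remapping is needed.

```python
import numpy as np

def build_id2label(all_labels, one_hot_rows):
    """Map each compacted output index to the name of a surviving label column."""
    one_hot = np.asarray(one_hot_rows)
    # A column survives if at least one remaining example uses it.
    kept_columns = np.flatnonzero(one_hot.any(axis=0))
    return {new_idx: all_labels[old_idx] for new_idx, old_idx in enumerate(kept_columns)}

def decode(probabilities, id2label):
    """Turn rows of probabilities into label names via argmax."""
    probs = np.asarray(probabilities)
    return [id2label[int(i)] for i in probs.argmax(axis=-1)]

# Hypothetical label names, in the original column order.
all_labels = ["cat", "dog", "bird", "fish"]
# The label space from the question: column 1 never occurs.
remaining = [[1, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

id2label = build_id2label(all_labels, remaining)
# Assumed model output: one probability per surviving column.
predicted = decode([[0.1, 0.7, 0.2]], id2label)
```

The resulting dictionary has the same int-to-name shape as the id2label mapping stored in transformers model configs, so it could be saved together with the model.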

Related

Dynamic Spill Formula For Reduce Function Mixed With Split

My sample sheet is probably easier to understand than my writing, but here's the issue: I have a sheet in which I'm trying to create a spill formula that sums an array of numbers on each line.
Columns B:D are my existing data that's being evaluated.
If values exist in Column D (which is not always the case), split values (defined by , ) and lookup each one's most recent entry (column B) and sum its value from column C with other members in same cell.
I can accomplish this using the Reduce formula shown in my sample data in blue column F, and dragging the formula to the latest entry, however it will not spill down dynamically.
=iferror(REDUCE(0,SPLIT(D2,",",false),lambda(total,value,xlookup(value,B:B,C:C,"",0,-1)+total)),0)
I can get the C values to spill down dynamically (as shown in green columns in sample) as numeric values, but I can't figure out how to sum them.
=Filter(iferror(XLOOKUP(SPLIT(D2:D,", ",false),B:B,C:C,"",0,-1),0),A2:A>0)
I would have expected something like either of these to work, but both generate a #N/A
=Filter(iferror(REDUCE(0,SPLIT(D2:D,", ",false),
lambda(total,value,xlookup(value,B:B,C:C,"",0,-1)+total)),0),A2:A>0)
=Filter(sum(iferror(XLOOKUP(SPLIT(D2:D,", ",false),B:B,C:C,"",0,-1),0)),A2:A>0)
I've also tried these as named functions with only the spilled variables as input, but same result.
I know the reduce function can produce a spilled range, as shown here on Ben Collins' site, however I can't figure out how to get it to do so with my dataset. It's occurred to me that because I'm generating a horizontal array, a vertical array may not be possible?
Any helpful answers will be upvoted if not accepted. Thanks.
Here's one approach:
=byrow(index(map(iferror(split(D2:index(D:D,match(2,1/(D:D<>""))),", ",0,1)),lambda(z,xlookup(z,B:B,C:C,)))),lambda(y,sum(y)))
To have your formula spill down you can use MAP or BYROW:
Your formula:
=iferror(REDUCE(0,SPLIT(D2,", ",false),lambda(total,value,xlookup(value,B:B,C:C,"",0,-1)+total)),0)
With MAP:
=MAP(D2:D7,LAMBDA(ζ,iferror(REDUCE(0,SPLIT(ζ,", ",false),lambda(total,value,xlookup(value,B:B,C:C,"",0,-1)+total)),0)))
Here's another solution using FILTER:
=MAP(D2:INDEX(D:D,MAX(ROW(D:D)*(D:D<>""))),LAMBDA(ζ,IFNA(SUM(FILTER(C:C,COUNTIF(SPLIT(ζ,", ",),B:B))),0)))
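For readers who find the nested formulas hard to follow, here is a rough Python sketch of the per-row logic they implement: split each reference cell on commas, look up the most recent value for each key, and sum. The function name row_totals and the sample data are hypothetical, and the sketch assumes that a key with no match contributes 0.

```python
def row_totals(keys, values, refs):
    """For each reference cell, sum the latest value of every key it lists."""
    latest = {}
    for key, value in zip(keys, values):
        # Later rows overwrite earlier ones, like a last-to-first lookup.
        latest[key] = value
    totals = []
    for ref in refs:
        parts = [p.strip() for p in ref.split(",") if p.strip()]
        # Assumption: keys that are not found add nothing to the total.
        totals.append(sum(latest.get(p, 0) for p in parts))
    return totals

# Hypothetical stand-ins for columns B, C and D.
keys = ["a", "b", "a", "c"]
values = [1, 2, 5, 10]
refs = ["", "a", "a, b", "a, c, x"]

totals = row_totals(keys, values, refs)
```

Each element of totals corresponds to one row, which is what MAP or BYROW achieves by applying the lambda once per cell in column D.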

Can I make an array out of a range of countif functions?

A truncated version of my data is in the form shown in the screenshot below: three columns of 5 unique names. The names appear in any order and in any position but never repeat in a single row.
My goal is to create an array that contains the number of times Adam appears in each row. I can fill down the formula =countif(A2:C2,$I$2) in a new column, or if I write the array manually for each row, it looks like:
={countif(A2:C2,$I$2);countif(A3:C3,$I$2);countif(A4:C4,$I$2);countif(A5:C5,$I$2);countif(A6:C6,$I$2)}
Where cell I2 contains "Adam". Of course, this is not feasible for large data sets.
I know that arrays are effectively cells turned into ranges, but my main issue is that the cell I'm trying to transform already references a range, and I don't know how to tell the software to apply the countif down each row (i.e. I intuitively would like to do something like countif((A2:C2):(A99:C99),"Adam") but understand that's not how spreadsheets work).
My goal is ultimately to perform some operations on the corresponding array but I think I'm comfortable enough with that once I can get the array formula I'm looking for.
try:
=ARRAYFORMULA(IF(A2:A="",,MMULT(IF(A2:C="Adam", 1, 0), {1;1;1})))
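The MMULT trick works because multiplying a 0/1 match matrix by a column of ones adds up each row. A small NumPy sketch of the same idea, with hypothetical names other than Adam:

```python
import numpy as np

# Hypothetical data: three columns of names, one row per record.
names = np.array([
    ["Adam", "Beth", "Carl"],
    ["Dana", "Eve", "Beth"],
    ["Carl", "Adam", "Eve"],
])

# 1 where the cell equals the target name, 0 elsewhere (the IF(...) part).
matches = (names == "Adam").astype(int)

# Multiplying by a vector of ones sums each row (the MMULT(..., {1;1;1}) part).
counts = matches @ np.ones(names.shape[1], dtype=int)
```

Here counts holds one entry per row, so it can be used directly in further array operations.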

How to use index match (array formula) to return corresponding values from a drop down list?

Excel Screenshot
Excel Screenshot with Formulas
I have attached photos to show an idea of what I am trying to do. Basically, I have a very large list of features that are shared between certain groups. I want to use a drop down list of the features, and then have a formula that will output the group that has the lowest cost of that feature along with the cost of that feature within the group.
(Also you will see that I purposefully ignore zero values. I do this because not every group has a certain feature and those cells default to zero).
I figured out how to get the cost of the feature to output, but I'm having trouble getting to output the group name. I am assuming there will be an array formula to do this, but I am just starting to learn those and I'm having trouble with this one.
Well you could always use the same approach you used to pull in the value, by pulling in the index of the column heading that matches the computed min, and using an offset function to match on the right row:
=+INDEX($B$1:$D$1,MATCH($B$10,OFFSET($B$1:$D$1,MATCH($A$7,$A$2:$A$4,0),0),0))
The thing is, I'm not sure how you would want to handle ties: if two vendors had the same price, this would match the first one in the list.
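To make the lookup logic and the tie behaviour concrete, here is a minimal Python sketch. The function name cheapest_group and the sample groups and costs are hypothetical. It ignores zero costs, as the question does, and returns the first group on a tie.

```python
def cheapest_group(groups, costs):
    """Return (group, cost) for the lowest non-zero cost, or None if all are zero."""
    # Pair each non-zero cost with its position so ties resolve to the first group.
    candidates = [(cost, idx) for idx, cost in enumerate(costs) if cost != 0]
    if not candidates:
        return None
    cost, idx = min(candidates)
    return groups[idx], cost

# Hypothetical row for one feature: a zero means the group lacks the feature.
groups = ["Group A", "Group B", "Group C"]
result = cheapest_group(groups, [0, 12.5, 12.5])
```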

QlikView - Apply different colors on concatenated values

I have a small issue in QlikView. I'm trying to apply different colors to concatenated values. When the values are in separate columns, the colors work fine, but when they are concatenated, none is applied:
code on value definition: =concat(Milestones,' / ')
code on Text color under value definition:
=if(Status='Finished',Green(),if(Status='In Progress',Blue(), if(Status='Overdue',Red())))
I've tried something like
=concat(Milestones,if(Status='Finished',Green(),if(Status='In Progress',Blue(), if(Status='Overdue',Red()))),' / ')
in the value definition, but it gives me an error.
I hope I've explained my issue well. Thank you in advance for your help.
The problem is that when you concat the fields in the expression, you now have many lines of data on one line. When you have them in separate columns, they appear on separate lines. Have a look at the difference between these two tables.
QlikView no longer knows which of the statuses is the correct one for each id since there are multiple possibilities. So it can't evaluate an answer which means it can't give you a colour.
The way to fix this is to give it something to look for. I would do it like this. (My test data is slightly different from yours, but the important part is the index()>0 test: index() finds a given piece of text in a larger string and returns its numeric position, where 0 means not found and anything greater than 0 means found.)
if(index(concat(Status,'/'),'Resolved')>0,lightGreen(),
if(index(concat(Status,'/'),'New')>0,lightRed()))
That should give you something like this.
The order of the nested ifs is important, as the first true condition determines the assigned colour. So if things can be both In Progress and Overdue, you should test for Overdue first.
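The priority idea can be sketched in Python as follows. This is only an illustration of the ordering logic, not QlikView syntax. The function name pick_colour and the colour strings are hypothetical, and the statuses are the ones from the question.

```python
def pick_colour(statuses, priority):
    """Return the colour of the first status in `priority` found in the joined text."""
    joined = "/".join(statuses)  # plays the role of concat(Status,'/')
    for status, colour in priority:
        if status in joined:  # plays the role of index(...) > 0
            return colour
    return None  # no status matched, so no colour is assigned

# Most urgent status first, so it wins when several are present.
PRIORITY = [("Overdue", "red"), ("In Progress", "blue"), ("Finished", "green")]

colour = pick_colour(["In Progress", "Overdue"], PRIORITY)
```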

Format text in flow document table cell

I have a program that creates a table and adds it to a flow document, with table cells that are populated with text. Everything works great with one exception. One column of cells in the table displays costs, which are formatted as follows:
cellValue = "$" + string.Format("{0:##,#.00}", int.Parse(cellValue)).PadLeft(22 -
cellValue.Length);
As it turns out, with this formatting numbers like $ 11,111 take up a different width than numbers like $ 10,000. I would guess this is because the font is not equal width for each character.
What I would like to do is display the costs just as they appear in an Excel spreadsheet when formatted as Accounting (i.e. the dollar sign is left-justified, the numbers are right-justified, and the numbers line up from cell to cell).
Example:
$ 10,000.00
$ 11,111.11
If someone knows what formatting to apply to reach this goal please let me know.
This is probably too simple, but at a guess I would use a fixed-width font for the cell, like "Courier New", so that all the characters are the same width and line up. The downside is that "Courier New" is not the most elegant font to use.
I am sure Excel uses a much more sophisticated mechanism for aligning numeric values.
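As a rough illustration of the fixed-width approach, here is a Python sketch that pads the number to a constant width after the dollar sign. The function name accounting and the width of 14 characters are hypothetical choices. The columns only line up when the text is rendered in a monospaced font.

```python
def accounting(amount, width=14):
    """Format as '$' on the left and the number right-aligned in a fixed width."""
    number = f"{amount:,.2f}"  # thousands separator, two decimals
    # Reserve one character for the dollar sign and pad the rest with spaces.
    return "$" + number.rjust(width - 1)

rows = [accounting(10000), accounting(11111.11), accounting(5)]
```

Every string in rows has the same length, so the dollar signs and the decimal points fall in the same character columns.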
Maybe you should use something that better mimics a spreadsheet than a table in a flowdocument.
