SQL Server: FOR XML PATH and avoiding duplicate words

I've looked around for quite a while for an answer to my specific issue, but I'm not having any luck.
I'm using the following query, which concatenates values with FOR XML PATH:
SELECT
    productid, title,
    (SELECT colors + ' '
     FROM dbo.productdetails
     WHERE (active = 1)
       AND (productid = j.productid)
     GROUP BY colors
     FOR XML PATH('')) AS 'color_tags'
FROM
    dbo.jewelry AS j
I have a GROUP BY applied and it works, but it treats values like "black blue" and "black green" as whole values, so I get duplicate words in the output. I've also tried DISTINCT, but it does exactly the same thing.
I have some data stored in a table like this:
+-----------+-------------+
| productid | color_tags |
+-----------+-------------+
| 1 | black |
| 1 | black blue |
| 1 | black green |
| 1 | blue green |
| 1 | black |
+-----------+-------------+
The data I want to have on a single line is like this (basically no duplicate values):
black, blue, green
But what I'm getting is this:
black, black blue, black green, blue green
Any help would be appreciated, thank you!
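The duplicates arise because GROUP BY and DISTINCT compare whole column values, so "black" and "black blue" count as different rows. Below is a minimal Python sketch of the logic the query would need: split each value into words first, then de-duplicate. It uses the sample rows above, and the function name unique_words is hypothetical. In T-SQL the splitting step would need some string-splitting mechanism, for example STRING_SPLIT on SQL Server 2016 or later.

```python
def unique_words(values):
    """Split each space-separated value into words and keep each word once."""
    seen = set()
    ordered = []
    for value in values:
        for word in value.split():
            if word not in seen:
                seen.add(word)
                ordered.append(word)
    return ordered


# Sample values for productid 1, copied from the table above.
rows = ["black", "black blue", "black green", "blue green", "black"]
result = ", ".join(unique_words(rows))
print(result)  # black, blue, green
```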

Related

Conditional Formatting of Axis Label Names

I have a bar chart where I need to conditionally format the colour of the axis label names. E.g. if a label name from Table_1 also appears in Table_2, the label text colour should be green, otherwise red.
Table_1
|Customer | Value |
|---------|-------|
|ABCD | 100 |
|EFGH | 150 |
|IJKL | 200 |
|MNOP | 250 |
|QRST | 300 |
|UVWX | 350 |
Table_2
|Customer | Value |
|---------|-------|
|ABCD | 500 |
|QRST | 550 |
|IJKL | 750 |
If I plot Table_1, I want the axis label text for ABCD, QRST and IJKL to be green, as those labels are also present in Table_2. For EFGH, MNOP and UVWX, I want the axis label text to be red (as those labels are not in Table_2). How can I do that in Spotfire?
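The rule itself is a set-membership test. Here is a minimal Python sketch of the intended mapping, using the sample tables above. It is not Spotfire configuration, only an illustration of the desired result, and the name label_colours is hypothetical.

```python
# Sample data copied from Table_1 and Table_2 above.
table_1 = {"ABCD": 100, "EFGH": 150, "IJKL": 200, "MNOP": 250, "QRST": 300, "UVWX": 350}
table_2 = {"ABCD": 500, "QRST": 550, "IJKL": 750}


def label_colours(plotted, reference):
    """Map each plotted label to Green if it also exists in the reference table, else Red."""
    return {name: ("Green" if name in reference else "Red") for name in plotted}


colours = label_colours(table_1, table_2)
for name, colour in colours.items():
    print(name, colour)
```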

Merge multiple tabs in Google Sheets and add a column for where the data came from

I have a spreadsheet which contains multiple tabs with similar layouts. I want to use a formula to merge these into a single tab which has a new column naming the tab it came from.
Example
Tab: Area A
| Item | Status |
|------|-------------|
| Foo | Blocked |
| Bar | In Progress |
Tab: Area B
| Item | Status |
|--------|-----------|
| Foobar | Completed |
Tab: Merged
| Area | Item | Status |
|------|--------|-------------|
| A | Foo | Blocked |
| A | Bar | In Progress |
| B | Foobar | Completed |
Merging without new column
I can merge the data without the additional column, using this formula:
=ARRAYFORMULA(SORT({'Area A'!A2:B; 'Area B'!A2:B}))
Which looks like this:
| Item | Status |
|--------|-------------|
| Foo | Blocked |
| Bar | In Progress |
| Foobar | Completed |
Adding the Area column
What's missing from the above formula is the addition of the area column. This would be possible by cross-referencing the item in every tab using a vlookup and labelling it. But that wouldn't be very efficient and some updates are already slow to re-calculate in this document. I expect this to have approx. 40 tabs with 10,000 rows in total to merge.
E.g.:
=IFS(NOT(ISERROR(VLOOKUP(B2,'Area A'!A$2:A,1,FALSE))), "A", NOT(ISERROR(VLOOKUP(B2,'Area B'!A$2:A,1,FALSE))), "B")
Is there a better way to do this?
I'd like something like this, but it doesn't work as the constant I'm adding doesn't match the number of rows it needs to be:
=ARRAYFORMULA(SORT({{"A",'Area A'!A2:B}; {"B", 'Area B'!A2:B}}))
You can borrow an empty column (X in this example) and do:
=ARRAYFORMULA(SORT({{'Area A'!X2:X&"A", 'Area A'!A2:B};
{'Area B'!X2:X&"B", 'Area B'!A2:B}}))
Or you can prepend the label to the first column and then split it:
=ARRAYFORMULA(QUERY(SORT({SPLIT(
{"A♦"&'Area A'!A2:A;
"B♦"&'Area B'!A2:A}, "♦"),
{'Area A'!B2:B;
'Area B'!B2:B}}), "where Col2 is not null", 0))
see: https://stackoverflow.com/a/63496191/5632629
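Both formulas build the label as a full column, one value per data row, before the ranges are stacked. That is why a single constant such as "A" fails. The same idea in a minimal Python sketch, using the sample tabs above (the dictionary layout and the name merge_tabs are assumptions made for illustration):

```python
def merge_tabs(tabs):
    """Stack the rows of every tab, prefixing each row with the tab's area label."""
    merged = []
    for area, rows in tabs.items():
        for item, status in rows:
            merged.append((area, item, status))
    return merged


# Sample tabs copied from the question; keys are the area labels.
tabs = {
    "A": [("Foo", "Blocked"), ("Bar", "In Progress")],
    "B": [("Foobar", "Completed")],
}

merged = merge_tabs(tabs)
for row in merged:
    print(row)
```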

How to combine rows into one column based on boolean values in Google Sheets?

I'm having a hard time wrapping my mind around how to approach this...
Say I have a table with several items with different potential properties, which are assigned a boolean value for each property for each item. (in this case we have several options for each "color" and "size")
      | Color                            | Size
-------------------------------------------------------------------
Item  | Red | Blue | White | Black | Tan | Large | Medium | Small |
-------------------------------------------------------------------
Pants |     | y    |       | y     | y   | y     | y      | y     |
Shirt | y   |      | y     | y     |     | y     | y      | y     |
Skirt |     | y    | y     |       |     |       | y      | y     |
Socks | y   | y    | y     | y     | y   | y     | y      | y     |
And I want to then output this into a single column like:
Pants, Blue, Large
Pants, Blue, Medium
Pants, Blue, Small
Pants, Black, Large
Pants, Black, Medium
Pants, Black, Small
Pants, Tan, Large
Pants, Tan, Medium
Pants, Tan, Small
Shirt, Red, Large
etc etc
So, I'm trying to populate a column with all the possible results of the true values for each row, concatenated with the header for each column that has a true value. In this example, it would further break out color and size for each item.
Thoughts on the best approach to this in Google Sheets?
Example sheet:
https://docs.google.com/spreadsheets/d/199msbUtUuZzb0HvBMgjQW98hRt3StFEyUFfTfLzSQzw/edit?usp=sharing
On a small scale:
=ARRAYFORMULA(TRANSPOSE(SPLIT(TEXTJOIN("\", 1,
IF(B3:F6="y", REPT(A3:A6&", "&B2:F2&"\", MMULT(N(G3:I6="y"),
TRANSPOSE(COLUMN(G2:I2)^0))), )), "\"))&", "&
TRANSPOSE(SPLIT(JOIN(" ", REPT(TRIM(TRANSPOSE(QUERY(TRANSPOSE(
IF(G3:I6="y", G2:I2, )),, COLUMNS(IF(G3:I6="y", G2:I2, )))))&" ",
MMULT(N(B3:F6="y"), TRANSPOSE(COLUMN(B3:F6)^0)))), " ")))
For just the two-way combination (item and color):
=ARRAYFORMULA(QUERY(TRANSPOSE(SPLIT(QUERY(TRANSPOSE(QUERY(
IF(B3:F="y", "♦"&A3:A&", "&B2:F2&"♦", ),,99^99)),,99^99), "♦")),
"where not Col1 starts with ' ' order by Col1"))
A modified version of MK's formula that mimics CSV output:
=ARRAYFORMULA(TRANSPOSE(QUERY(TRANSPOSE(QUERY(VLOOKUP(SEQUENCE(4*5*3, 1, 0)/(5*3)+3,
{ROW(A:A), A:A&",", IF(B:I="y", {B2:F2&",", G2:I2}, )},
INT(MOD(SEQUENCE(4*5*3, 1, 0), {9^99, 5*3, 3})/{9^99, 4, 1})+{2, 3, 3+4}),
"where Col2<>'' and Col3<>''")),,9^99)))
For problems like these, I like to use a VLOOKUP() with an [index] value made up of an array constructed from some of the parameters of your data. I made a new tab called mK.Help and put this formula there, which will retabulate your data into the form you're hoping for (I think).
If you definitely want comma-separated output, I can help with that, but usually folks just want the data in columns. Each of the parameters is fed by a simple formula counting the various aspects of your data. The big reorganizing formula can be rewritten to include each of those formulae instead of referring to a helper cell, but I have found that it is easier for folks to understand what's going on when I leave it broken out like that.
=ARRAYFORMULA(QUERY(VLOOKUP(SEQUENCE(Q3*Q4*Q5,1,0)/(Q4*Q5)+Q2,{ROW(A:A),A:A,IF(B:I="y",B2:I2,)},INT(MOD(SEQUENCE(Q3*Q4*Q5,1,0),{9^99,Q4*Q5,Q5})/{9^99,Q5,1})+{2,3,3+Q4}),"where Col2<>'' and Col3<>''"))
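What all of these formulas compute is, per item, the Cartesian product of its true colors and its true sizes. Below is a minimal Python sketch of that logic on the sample table from the question. It is not a Sheets formula, and the names table and expand are hypothetical.

```python
from itertools import product

colors = ["Red", "Blue", "White", "Black", "Tan"]
sizes = ["Large", "Medium", "Small"]

# "y" marks a true value and "" a false one, copied from the sample table.
# The first five flags belong to the colors, the last three to the sizes.
table = {
    "Pants": ["", "y", "", "y", "y", "y", "y", "y"],
    "Shirt": ["y", "", "y", "y", "", "y", "y", "y"],
    "Skirt": ["", "y", "y", "", "", "", "y", "y"],
    "Socks": ["y", "y", "y", "y", "y", "y", "y", "y"],
}


def expand(table):
    """Return one 'Item, Color, Size' line per true color/size pair of each item."""
    lines = []
    for item, flags in table.items():
        item_colors = [c for c, f in zip(colors, flags[:5]) if f == "y"]
        item_sizes = [s for s, f in zip(sizes, flags[5:]) if f == "y"]
        for color, size in product(item_colors, item_sizes):
            lines.append(f"{item}, {color}, {size}")
    return lines


lines = expand(table)
print("\n".join(lines))
```

The first lines printed are "Pants, Blue, Large", "Pants, Blue, Medium" and "Pants, Blue, Small", matching the desired output in the question.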

How to aggregate column values into array after groupBy?

I want to group by name and collect the color values into an array. I tried the following, but it didn't help:
val uid = flatten(collect_list($"color")).alias("color")
val df00 = df_a.groupBy($"name")
  .agg(color)
I have a DataFrame with the following values:
---------------
|name |color |
---------------
|gaurav| red |
|harsh |black |
|nitin |yellow|
|gaurav|white |
|harsh |blue |
---------------
I want to group by name and store the color values into array using scala, to get a result like this:
----------------------
|name | color |
----------------------
|gaurav| [red,white] |
|harsh | [black,blue]|
|nitin | [yellow] |
----------------------
Use collect_list. The code is shown below:
import org.apache.spark.sql.functions._
// The $"..." column syntax needs spark.implicits._ (already in scope in spark-shell).
df.groupBy($"name").agg(collect_list($"color").as("color_list")).show()
Hope it helps!!
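For readers unfamiliar with collect_list, here is a minimal plain-Python sketch of what the aggregation does: group rows by key and append each value to that key's list. It assumes the last row's name is harsh, as the expected output implies, and the function name collect_by_name is hypothetical. Unlike this sketch, Spark does not guarantee the order of elements inside each collected list.

```python
from collections import defaultdict

# Sample rows from the question (last name taken as "harsh", per the expected output).
rows = [
    ("gaurav", "red"),
    ("harsh", "black"),
    ("nitin", "yellow"),
    ("gaurav", "white"),
    ("harsh", "blue"),
]


def collect_by_name(rows):
    """Group (name, color) pairs by name, collecting the colors into a list."""
    grouped = defaultdict(list)
    for name, color in rows:
        grouped[name].append(color)
    return dict(grouped)


grouped = collect_by_name(rows)
for name, color_list in grouped.items():
    print(name, color_list)
```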

Geographically display item with highest score by country - Tableau

I have data in the following format:
Type | Country | Item | Value
Category A | Afghanistan | Item 1 | 5
Category A | Afghanistan | Item 2 | 3
Category A | Afghanistan | Item 3 | 1
Category B | Afghanistan | Item 1 | 2
Category B | Afghanistan | Item 2 | 5
Category B | Afghanistan | Item 3 | 1
I'm trying to create a map of country values, such that:
- Each country is colored depending on which item received the highest cumulative value (in this case, if Item 1 is red, 2 is blue, and 3 is green, Afghanistan would be colored blue)
- The item with the highest cumulative value is displayed when hovering over the country
I know that I could just manually calculate it on my own end, but I want to introduce additional filters to the file so that you could, for example, exclude Category A or B and have it recalculate the top value item.
Thanks in advance for the help!
Here's what worked for me:
1. Create a max value calculation using the FIXED level-of-detail syntax: {FIXED [Country] : max([Value])}
2. Drag that calculated field into the Dimensions area
3. Place that calculated field on the Color shelf of the Marks card
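To check what the map should show, the quantity described in the question (the item with the highest cumulative value per country, recalculated when a category is excluded) can be computed outside Tableau. Below is a minimal Python sketch on the sample rows. The function name top_item_by_country and its excluded_types parameter are hypothetical.

```python
from collections import defaultdict

# Sample rows copied from the question: (Type, Country, Item, Value).
rows = [
    ("Category A", "Afghanistan", "Item 1", 5),
    ("Category A", "Afghanistan", "Item 2", 3),
    ("Category A", "Afghanistan", "Item 3", 1),
    ("Category B", "Afghanistan", "Item 1", 2),
    ("Category B", "Afghanistan", "Item 2", 5),
    ("Category B", "Afghanistan", "Item 3", 1),
]


def top_item_by_country(rows, excluded_types=()):
    """Sum values per (country, item), skipping excluded types, and pick the top item."""
    totals = defaultdict(lambda: defaultdict(int))
    for type_, country, item, value in rows:
        if type_ in excluded_types:
            continue
        totals[country][item] += value
    return {country: max(items, key=items.get) for country, items in totals.items()}


print(top_item_by_country(rows))                                 # {'Afghanistan': 'Item 2'}
print(top_item_by_country(rows, excluded_types=("Category B",)))  # {'Afghanistan': 'Item 1'}
```

With both categories included, Item 2 totals 8 against 7 for Item 1, which matches the blue colouring expected in the question.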
