Row and Column search to map values to another df with python - loops

I need to find the values from one df in another by looking at each value per column per row.
Basically, I have this df, which is my main_df (note this is a mock-up of the data; in reality I have hundreds of columns and thousands of rows):
user_id  fav_fruit  fav_veg   basket
1        apple      potato    apple
2        pear       potato    fruit
3        banana     carrot    fruit
4        apple      broccoli  carrot
And I have another df, which is my auxiliary_df:
category   answer    value
fav_fruit  apple     0.5
fav_fruit  pear      0.5
fav_fruit  banana    0.8
fav_veg    potato    0.7
fav_veg    carrot    1
fav_veg    broccoli  1
basket     apple     3
basket     fruit     5
basket     carrot    3
For each user id (row loop), I need to find the column name in the category column of auxiliary_df and, among the answers in that category, look up the value I have in my main_df to get the corresponding value from that table, so that at the end I get the following:
user_id  fav_fruit  fav_veg  basket
1        0.5        0.7      3
2        0.5        0.7      5
3        0.8        1        5
4        0.5        1        3
So in the end, all the values in my main_df are replaced with the numerical values from the auxiliary_df. I cannot map just by answer, since many answers are repeated across categories, so I am unsure how to map them properly. I tried building a dictionary for the mapping, but could not get it to pick the correct combination of category, answer and value. Also, I cannot hardcode anything, since there are hundreds of columns and the values change regularly.

You can .pivot() the auxiliary_df and .map():
# one row per answer, one column per category
tmp = auxiliary_df.pivot(index='answer', columns='category', values='value')
main_df = main_df.set_index('user_id')
# map each column through the lookup column of the same name
for c in main_df:
    main_df[c] = main_df[c].map(tmp[c])
print(main_df.reset_index())
Prints:
   user_id  fav_fruit  fav_veg  basket
0        1        0.5      0.7     3.0
1        2        0.5      0.7     5.0
2        3        0.8      1.0     5.0
3        4        0.5      1.0     3.0
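
As an alternative sketch (not the approach from the answer above), the same lookup can be written as a melt followed by a merge on the (category, answer) pair. The sample frames below are rebuilt from the question's mock-up, and the names long_df, merged and result are only illustrative:

```python
import pandas as pd

# Sample data rebuilt from the question's mock-up tables
main_df = pd.DataFrame({
    "user_id": [1, 2, 3, 4],
    "fav_fruit": ["apple", "pear", "banana", "apple"],
    "fav_veg": ["potato", "potato", "carrot", "broccoli"],
    "basket": ["apple", "fruit", "fruit", "carrot"],
})
auxiliary_df = pd.DataFrame({
    "category": ["fav_fruit"] * 3 + ["fav_veg"] * 3 + ["basket"] * 3,
    "answer": ["apple", "pear", "banana", "potato", "carrot",
               "broccoli", "apple", "fruit", "carrot"],
    "value": [0.5, 0.5, 0.8, 0.7, 1, 1, 3, 5, 3],
})

# Long format: one row per (user_id, category, answer)
long_df = main_df.melt(id_vars="user_id", var_name="category", value_name="answer")

# Match on the (category, answer) pair, so repeated answers stay unambiguous
merged = long_df.merge(auxiliary_df, on=["category", "answer"], how="left")

# Back to wide format, keeping the original column order
value_cols = main_df.columns.drop("user_id")
result = (
    merged.pivot(index="user_id", columns="category", values="value")
    .reindex(columns=value_cols)
    .reset_index()
)
result.columns.name = None
print(result)
```

Because the merge uses how="left", any answer that has no entry in auxiliary_df ends up as NaN rather than raising an error, which makes unmapped values easy to spot.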

Related

In Google Data Studio, how can I make easy counts from the following data?

Let's say I have this data. How can I segment it in Google Data Studio to return the result shown in the second table below?
Item Name  Category     Returned
Apples     Fruits       0
Potato     Vegetables   1
TV         Electronics  2
Banana     Fruits       2
Tomato     Vegetables   0
Fridge     Electronics  2
Grapes     Fruits       1
Onion      Vegetables   2
AC         Electronics  2
Pineapple  Fruits       0
Carrot     Vegetables   1
Oven       Electronics  1
I am looking for an end result like the following, where each Returned (0-2) column is a count, not a sum:
Category     Returned (0)  Returned (1)  Returned (2)
Fruits       2             1             2
Vegetables   1             2             1
Electronics  0             1             2
I tried filtering, but the result does not appear correctly.

Create a new calculated field with the formula:
CONCAT("Returned(",Returned,")")
Next, create a pivot table chart with Category as the row dimension and the calculated field created above as the column dimension.
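
Outside Data Studio, the same idea (turn the number into a label, then count rows per Category and label) can be sketched in Python with pandas. This is only an illustration of the counting logic, not a Data Studio feature; the frame is rebuilt from the sample rows in the question, and the column name "Returned label" is made up for the example:

```python
import pandas as pd

# Sample rows rebuilt from the question
df = pd.DataFrame({
    "Item Name": ["Apples", "Potato", "TV", "Banana", "Tomato", "Fridge",
                  "Grapes", "Onion", "AC", "Pineapple", "Carrot", "Oven"],
    "Category": ["Fruits", "Vegetables", "Electronics"] * 4,
    "Returned": [0, 1, 2, 2, 0, 2, 1, 2, 2, 0, 1, 1],
})

# Same idea as the calculated field: build a text label from the number
df["Returned label"] = "Returned(" + df["Returned"].astype(str) + ")"

# Count rows (not sum values) per Category and label
counts = pd.crosstab(df["Category"], df["Returned label"])
print(counts)
```

Note that counting the sample rows this way gives 1 under Returned(2) for Fruits and 3 for Electronics, which differs from the expected table in the question in those two cells.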

How can I filter multiple columns that include the same value in Power BI?

I currently have a table in Power BI named Jira Tickets.
Here is sample data from Jira Tickets:
Issue id  Label    Label 1       Label 2  Label 3       Label 4
1000      Apples   Grapes        Bananas  Oranges       Strawberries
1001      Oranges  Pears         Apples   Bananas       Strawberries
1002      Pears    Dragon Fruit  Apples   Strawberries  Dragon Fruit
1003      Bananas  Oranges       Apples   Grapes        Pears
1004      Grapes   Apples        Bananas  Pears         Strawberries
I want to create a slicer in Power BI to filter by the values in the columns Label, Label 1, Label 2, Label 3 and Label 4. The issue is that choosing which column value to filter by gets confusing, as the same value exists in different columns. For example, if I wanted to filter by Apples, I would need to select multiple Apples values from Label, Label 1, Label 2 and Label 4.
How can I create a slicer in Power BI to ensure I can filter uniquely by:
Apples
Oranges
Pears
Bananas
Strawberries
Dragon Fruit
Grapes?

Create a disconnected table for all labels (the formulas below assume the source table is named Fruit, so substitute your own table name):
AllLabel = SUMMARIZE(UNION(VALUES(Fruit[Label 1]), VALUES(Fruit[Label 2]), VALUES(Fruit[Label 3]), VALUES(Fruit[Label 4])),Fruit[Label 1])
Create a measure:
PickThis =
var _selectedFruit = SELECTEDVALUE(AllLabel[Label 1])
return
CALCULATE(COUNTROWS(Fruit) , FILTER(Fruit,Fruit[Label 1] = _selectedFruit || Fruit[Label 2] = _selectedFruit || Fruit[Label 3] = _selectedFruit || Fruit[Label 4] = _selectedFruit))
Add the measure to your visualization (or use it as a filter).
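
To make the logic of the disconnected table and the measure easier to follow, here is a rough Python/pandas sketch of the same idea: collect the distinct labels from all label columns, then count the rows in which any label column equals the selected value. It is an illustration only, not Power BI code. The frame is rebuilt from the question's sample, count_rows_with is a hypothetical helper name, and unlike the DAX above the sketch also includes the plain Label column:

```python
import pandas as pd

# Sample data rebuilt from the question
tickets = pd.DataFrame({
    "Issue id": [1000, 1001, 1002, 1003, 1004],
    "Label": ["Apples", "Oranges", "Pears", "Bananas", "Grapes"],
    "Label 1": ["Grapes", "Pears", "Dragon Fruit", "Oranges", "Apples"],
    "Label 2": ["Bananas", "Apples", "Apples", "Apples", "Bananas"],
    "Label 3": ["Oranges", "Bananas", "Strawberries", "Grapes", "Pears"],
    "Label 4": ["Strawberries", "Strawberries", "Dragon Fruit", "Pears", "Strawberries"],
})

# Assumption: every label column takes part, including the plain "Label" one
label_cols = ["Label", "Label 1", "Label 2", "Label 3", "Label 4"]

# Rough equivalent of the disconnected AllLabel table: distinct values across columns
all_labels = sorted(set(tickets[label_cols].to_numpy().ravel()))


def count_rows_with(label):
    """Count tickets where at least one label column equals the given label."""
    return int(tickets[label_cols].eq(label).any(axis=1).sum())


print(all_labels)
print(count_rows_with("Apples"))
```

The any(axis=1) step plays the role of the chained || conditions in the measure: a row is counted once, no matter how many of its label columns match.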

How to filter a Line Chart with a Measure in Power BI?

I currently have a table in Power BI named Fruit.
Here is sample data from Fruit:
Issue id  Label    Label 1       Label 2  Label 3       Label 4       Created              Resolved             Time Difference (MINS)
1000      Apples   Grapes        Bananas  Oranges       Strawberries  14/03/2021 11:38:23  11/02/2022           525632
1001      Oranges  Pears         Apples   Bananas       Strawberries  13/03/2021 12:34:34  18/03/2022 11:38:23  524324
1002      Pears    Dragon Fruit  Apples   Strawberries  Dragon Fruit  04/03/2021 18:31:11  12/03/2022 11:38:23  525345
1003      Bananas  Oranges       Apples   Grapes        Pears         11/03/2021 19:34:57  11/03/2022 11:38:23  528264
1004      Grapes   Apples        Bananas  Pears         Strawberries  12/03/2021 12:32:52  15/03/2022 11:38:23  521927
I have created a table to join the label values into one:
AllLabel = SUMMARIZE(UNION(VALUES(Fruit[Label 1]), VALUES(Fruit[Label 2]), VALUES(Fruit[Label 3]), VALUES(Fruit[Label 4])),Fruit[Label 1])
I have also created a measure to count the labels and filter them uniquely by Apples, Oranges, Pears, Bananas, Strawberries, Dragon Fruit and Grapes:
SELECTEDFruit =
var _selectedFruit = SELECTEDVALUE(AllLabel[Label 1])
return
CALCULATE(COUNTROWS(Fruit) , FILTER(Fruit,Fruit[Label 1] = _selectedFruit || Fruit[Label 2] = _selectedFruit || Fruit[Label 3] = _selectedFruit || Fruit[Label 4] = _selectedFruit))
This is how the tables look:
I have a line chart that shows the average resolution time.
However, when I choose a particular fruit using a slicer, the value in the Average Resolution Time line chart does not change.
How can I filter the Average Resolution Time line chart using the Label 1 slicer?

The reason is that your tables have no relationships between them. Make sure relationships exist between the tables. Only one relationship between two tables can be active at a time, so to calculate over the other columns you should use DAX relationship functions such as USERELATIONSHIP.
However, I recommend unpivoting the label columns into a single column with DAX functions such as UNION and SELECTCOLUMNS. This creates another table that works best with a slicer, and there is then no need for your AllLabel formula:
UnpivotedTable =
FILTER(
    UNION(
        SELECTCOLUMNS('Fruit',"Created",[Created],"Resolved",[Resolved],"Time Difference (MINS)",[Time Difference (MINS)],"label",[Label]),
        SELECTCOLUMNS('Fruit',"Created",[Created],"Resolved",[Resolved],"Time Difference (MINS)",[Time Difference (MINS)],"label",[Label 1]),
        SELECTCOLUMNS('Fruit',"Created",[Created],"Resolved",[Resolved],"Time Difference (MINS)",[Time Difference (MINS)],"label",[Label 2]),
        SELECTCOLUMNS('Fruit',"Created",[Created],"Resolved",[Resolved],"Time Difference (MINS)",[Time Difference (MINS)],"label",[Label 3]),
        SELECTCOLUMNS('Fruit',"Created",[Created],"Resolved",[Resolved],"Time Difference (MINS)",[Time Difference (MINS)],"label",[Label 4])
    ),
    [Label] <> ""
)
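
For readers who want to see what this unpivoting does to the data, here is a small Python/pandas sketch of the same reshaping. It is an illustration under assumptions, not Power BI code: the frame is rebuilt from the question's sample, the Created and Resolved columns are left out for brevity, and Issue id is carried along even though the DAX table above does not keep it:

```python
import pandas as pd

# Sample data rebuilt from the question (Created/Resolved omitted for brevity)
fruit = pd.DataFrame({
    "Issue id": [1000, 1001, 1002, 1003, 1004],
    "Label": ["Apples", "Oranges", "Pears", "Bananas", "Grapes"],
    "Label 1": ["Grapes", "Pears", "Dragon Fruit", "Oranges", "Apples"],
    "Label 2": ["Bananas", "Apples", "Apples", "Apples", "Bananas"],
    "Label 3": ["Oranges", "Bananas", "Strawberries", "Grapes", "Pears"],
    "Label 4": ["Strawberries", "Strawberries", "Dragon Fruit", "Pears", "Strawberries"],
    "Time Difference (MINS)": [525632, 524324, 525345, 528264, 521927],
})
label_cols = ["Label", "Label 1", "Label 2", "Label 3", "Label 4"]

# One row per (issue, label column): the five label columns become one "label" column
unpivoted = fruit.melt(
    id_vars=["Issue id", "Time Difference (MINS)"],
    value_vars=label_cols,
    var_name="source column",
    value_name="label",
)

# Drop empty labels, as the FILTER(..., [Label] <> "") step does
unpivoted = unpivoted[unpivoted["label"] != ""]

# With a single label column, "average time per selected fruit" is a plain group-by
avg_minutes = unpivoted.groupby("label")["Time Difference (MINS)"].mean()
print(avg_minutes)
```

As with UNION in DAX, a ticket that carries the same label in two columns (issue 1002 with Dragon Fruit) appears twice in the unpivoted table, so it is weighted twice in any average unless duplicates are removed first.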

How to recap cell into column where the data is shown in rows

Sorry for the bad subject title. I have a sheet #1 like this:
A        B       C      D       E      F      G
Invoice  Fruit   Price  Fruit   Price  Fruit  Price
101      Apple   10     Orange  30     Mango  40
102      Orange  30     Pear    20     Berry  10
103      Melon   50     Apple   10     Berry  10
104      Pear    20     Melon   50     Apple  10
This basically details which fruits are in each invoice, but spread across columns.
I then want another sheet #2 that looks like this:
A    B      C       D
Inv  Price  Fruit:  Apple
101  10
103  10
104  10
So basically sheet #2 is a recap of sheet #1, where cell D1 decides which fruit's recap is shown in columns A and B.
Any ideas about what formula to use for A2:A and B2:B in sheet #2?

Try the following (here the fruit name is read from D13, so adjust the ranges and that cell reference to your own sheets):
=QUERY({A1:C10; A1:A10, D1:E10; A1:A10, F1:G10}, "select Col1,Col3 where Col2='"&D13&"'", 1)
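
The array literal in that formula stacks the three (Fruit, Price) column pairs under one another, each next to the invoice number, and QUERY then filters on the fruit. The same steps can be sketched in Python with pandas, purely as an illustration of the logic. The column names Fruit1, Price1 and so on are invented here to tell the repeated headers apart, and the selected fruit is hardcoded:

```python
import pandas as pd

# Sheet #1 rebuilt from the question; numbered column names are an assumption
sheet1 = pd.DataFrame(
    [
        (101, "Apple", 10, "Orange", 30, "Mango", 40),
        (102, "Orange", 30, "Pear", 20, "Berry", 10),
        (103, "Melon", 50, "Apple", 10, "Berry", 10),
        (104, "Pear", 20, "Melon", 50, "Apple", 10),
    ],
    columns=["Invoice", "Fruit1", "Price1", "Fruit2", "Price2", "Fruit3", "Price3"],
)

# Stack the three (Fruit, Price) pairs under one another, like {A:C; A,D:E; A,F:G}
parts = []
for i in (1, 2, 3):
    part = sheet1[["Invoice", f"Fruit{i}", f"Price{i}"]].copy()
    part.columns = ["Inv", "Fruit", "Price"]
    parts.append(part)
stacked = pd.concat(parts, ignore_index=True)

# Equivalent of: select Col1, Col3 where Col2 = <selected fruit>
selected = "Apple"
recap = (
    stacked.loc[stacked["Fruit"] == selected, ["Inv", "Price"]]
    .sort_values("Inv")
    .reset_index(drop=True)
)
print(recap)
```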

Mapping vertical values to Horizontal rows in SQL Server

I have two tables. First is ItemDetails and second is ItemHeaders.
ItemHeaders:
ItemID ItemName
1 Apple
2 Orange
3 Grapes
ItemDetails:
ID ItemHeader1 ItemHeader2 ItemHeader3
1 1 2 1
2 3 2 1
3 2 1 2
4 2 3 3
Expected output:
ID Categroy1 Categroy2 Category3
1 Apple Orange Apple
2 Grapes Orange Apple
3 Orange Apple Orange
4 Orange Grapes Grapes
My Query:
Select ID, i1.ItemName as Categroy1, i2.ItemName as Categroy2, i3.ItemName as Categroy3
From ItemDetails d
Left Join ItemHeaders i1 on d.ItemHeader1 = i1.ItemID
Left Join ItemHeaders i2 on d.ItemHeader2 = i2.ItemID
Left Join ItemHeaders i3 on d.ItemHeader3 = i3.ItemID
Question: This is sample data; I have 50,000 records in ItemDetails. When I run my query, it takes a lot of time. Can someone suggest an optimized query or a better option to achieve the above result? Please let me know if the question or query is not clear.
Edit: There is an index on ItemID. You mentioned PIVOT. How can I use PIVOT to get my result? Also, there are 10 headers instead of 3; I have shown only 3 here.

Apart from the index, I assume the LEFT JOIN is intended, i.e. with your requirement you cannot use an INNER JOIN.
What is the purpose of ItemDetails?
You could have a single column called itemHeader plus a type column.
Why do you need to fetch 50,000 or more rows in one go? Why not use paging?
You can also do the pivot in the front end.
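
As a rough sketch of what doing the lookup in the front end could look like, the small ItemHeaders table can be loaded once into a dictionary and each ItemDetails row translated in application code. This is an assumption about the client side rather than anything stated in the question; the data below is the question's sample, hardcoded instead of fetched from SQL Server:

```python
# ItemHeaders loaded once as an ID -> name lookup (sample data from the question)
item_headers = {1: "Apple", 2: "Orange", 3: "Grapes"}

# ItemDetails rows as (ID, ItemHeader1, ItemHeader2, ItemHeader3)
item_details = [
    (1, 1, 2, 1),
    (2, 3, 2, 1),
    (3, 2, 1, 2),
    (4, 2, 3, 3),
]

# Translate every header ID; .get() yields None for unknown IDs, like a LEFT JOIN
output = [
    (row_id, *(item_headers.get(header_id) for header_id in header_ids))
    for row_id, *header_ids in item_details
]

for row in output:
    print(row)
```

Because the comprehension unpacks however many header columns each row has, the same code works unchanged for 10 headers instead of 3.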
