Pandas making a new list from groupby object - loops

My data has a 'Country' column and a 'Clicked on Ad' column, which holds a boolean value indicating whether the customer clicked the ad. I want to group my data by country to see the number of clicks per country. Then I want to cut out the countries whose click counts fall in the (4,8) range, which are the highest click counts per country. I want to take these rows and create a new list while keeping all the features (columns) of the rows.
ad_country=ad_data.groupby('Country')
Country_sum=[]
for i in range(4,8):
    if ad_country['Clicked on Ad']==i:
        Country_sum.append(iloc[ad_country])
Sample:
Daily    Age    Daily Internet Usage    Country    Clicked on Ad (Boolean)
68.95    35     256.09                  Tunisia    0
75.78    28     214.9                   Mexico     1
My result should be a DataFrame with country names as the index and, in the columns, the total of 'Clicked on Ad' along with the totals of the other features (though these are not important for the analysis).

ad_country=ad_data.groupby('Country')['Clicked on Ad'].sum()
I could get a per-country list using groupby and sum(). I can also see the maximum sample size with count().
I am still trying to find out whether I can slice a groupby object by the highest counts, as with value_counts().
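One possible way to get both results is sketched below. It is not the asker's code: the sample rows are hypothetical (only the column names come from the question), and `rows_for_click_range` is a made-up helper name. It assumes "(4,8)" means an inclusive range of total clicks per country, so the bounds may need adjusting.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
ad_data = pd.DataFrame({
    "Age": [35, 28, 41, 30, 52, 26],
    "Daily Internet Usage": [256.09, 214.90, 180.50, 199.10, 230.00, 145.20],
    "Country": ["Tunisia", "Mexico", "Mexico", "Peru", "Mexico", "Peru"],
    "Clicked on Ad": [0, 1, 1, 1, 1, 0],
})


def rows_for_click_range(df, low, high):
    """Return all rows whose country has between low and high total clicks (inclusive)."""
    # transform("sum") broadcasts each country's click total back onto its own rows
    totals = df.groupby("Country")["Clicked on Ad"].transform("sum")
    return df[totals.between(low, high)]


# Per-country totals of the numeric columns, with Country as the index,
# sorted so the countries with the most clicks come first
country_totals = (
    ad_data.groupby("Country")
    .sum(numeric_only=True)
    .sort_values("Clicked on Ad", ascending=False)
)

# With this small sample, only Mexico (3 clicks) falls in the 2..3 range
selected = rows_for_click_range(ad_data, 2, 3)
```

On the real data, `rows_for_click_range(ad_data, 4, 7)` would mirror the `range(4,8)` loop in the question. It keeps every original column of the matching rows.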

Related

Google Data Studio, how to get a sum of all Max or Min values

I am working with a data set where I have to get the Min or Max for different text fields. My dataset can have thousands of rows, so below is a simpler example. I have 3 categories with multiple values, and I can put this dataset in GDS to build a table where I select Category as the dimension and Max(Value) as the metric.
Now I also need to see the sum of all those values. But like a pivot table in Excel, the subtotal in GDS shows the maximum of all the maxima listed above. So instead of 65, it shows 30 in GDS. Is there a way to get it to show the sum?
To reach the desired result you will need to:
Create a data blend. It is not necessary to add a second data source; the current source only has to be defined as a blend.
In the blend, use the Category dimension and define Max of Value as the metric. The blend is needed only so that the metric can be used in the table like a dimension (a property that results from blending data).
Configure the table with the Category dimension and add the Value metric with the Sum aggregation. Remember that Value now holds the maximum value (as defined in the blend).
Finally, display the Summary row, and the desired result is obtained.
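The two-stage aggregation described above (first the maximum per category, then a sum over those maxima) can be illustrated outside Data Studio with a small Python sketch. The rows are hypothetical, chosen only so that the per-category maxima add up to the 65 mentioned in the question; `max_per_category` is a made-up helper name.

```python
# Hypothetical (category, value) rows; the per-category maxima are 30, 20 and 15
rows = [("A", 10), ("A", 30), ("B", 20), ("B", 5), ("C", 15), ("C", 12)]


def max_per_category(rows):
    """First stage: keep the largest value seen for each category."""
    best = {}
    for category, value in rows:
        if category not in best or value > best[category]:
            best[category] = value
    return best


maxima = max_per_category(rows)

# Second stage: aggregate the already-aggregated values
sum_of_maxima = sum(maxima.values())  # what the question asks for
max_of_maxima = max(maxima.values())  # what the summary row shows without the blend
```

The blend plays the role of the first stage: once the maximum is materialised as a plain field, the table can apply a different aggregation (Sum) on top of it.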

Identify elements that do not appear in the period (Google Data Studio)

I have a table that shows the recurrence of purchasing a product, with the columns: product_id, report_date, quantity.
I need to list in a table the products that have gone unsold for more than 50 days. I managed to do the opposite (list those that were sold in the last 50 days), but I have not yet been able to implement the inverse logic.
Does anyone have any tips?
An example of the table:
product_id,date,report_date,quantity
329,2019-01-02 08:19:17,2019-01-02 14:34:12,6
243,2019-01-03 09:19:17,2019-01-03 15:34:12,6
238,2019-02-02 08:19:17,2019-03-02 14:34:12,84
170,2019-04-02 08:19:17,2019-04-02 14:34:12,84
238,2019-04-02 08:19:17,2019-04-02 14:34:12,8
238,2019-04-02 08:19:17,2019-04-02 14:34:12,100
238,2019-08-02 08:19:17,2019-08-02 14:34:12,100
238,2019-10-02 08:19:17,2019-10-02 14:34:12,100
170,2020-01-02 08:19:17,2020-01-02 14:34:12,84
170,2020-01-02 08:19:17,2020-01-02 14:34:12,84
There are several steps to this task. I assume the date column is the one to work with. Your example table includes duplicated entries. Is it correct that the same order appears twice at the same time?
So here are the steps:
First, add a calculated field date_past to your dataset:
DATE_DIFF(CURRENT_DATE(),date)
Add a filter SO_demo to the dataset with:
include date_past<30
Then blend the data with itself. Use product_id as the join key. Only the 2nd dataset has the SO_demo filter. Add to the dimensions of this dataset the calculated field sold_last_30_days with the formula "yes".
In the table/chart to display, add a filter on the field: include sold_last_30_days is Null.
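The self-blend with a Null filter is effectively an anti-join: keep the products that have no sale inside the window. A minimal Python sketch of the same logic follows, using a few rows from the sample table's date column. The helper name `unsold_products`, the fixed reference date, and the 50-day default (taken from the question; the steps above use 30) are assumptions for illustration.

```python
from datetime import date, datetime

# (product_id, sale timestamp) pairs taken from the sample table's date column
sales = [
    (329, "2019-01-02 08:19:17"),
    (243, "2019-01-03 09:19:17"),
    (238, "2019-02-02 08:19:17"),
    (170, "2019-04-02 08:19:17"),
    (238, "2019-10-02 08:19:17"),
    (170, "2020-01-02 08:19:17"),
]


def unsold_products(sales, today, days=50):
    """Return sorted product ids whose latest sale is more than `days` days before `today`."""
    last_sale = {}
    for product_id, stamp in sales:
        sold_on = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S").date()
        if product_id not in last_sale or sold_on > last_sale[product_id]:
            last_sale[product_id] = sold_on
    return sorted(p for p, d in last_sale.items() if (today - d).days > days)


# The reference date is fixed here (an assumption) so the result is reproducible;
# in a live report it would be the current date.
stale = unsold_products(sales, today=date(2020, 1, 10))
```

With that reference date, product 170 was sold 8 days earlier and is excluded, while 238, 243 and 329 are listed.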

Calculating dynamic pricing on Google Sheets

I have imported data from a trading exchange listing sellers of a particular cryptocurrency.
From this data, I want to create dynamic pricing that displays the average cost of an order based on a given order size.
I will give an example of what I am looking for:
Example dataset
In this example, we would be purchasing the cryptocurrency 'SINS'. As per the data shown in this table, if 29.06 SINS was purchased, that would fill the first order, and the total BTC paid would be 0.00459 BTC.
If an order was placed for 145 SINS, it would fill the orders up to row 12 and partially fill the order in row 13. By calculating that manually, I know that would cost 0.02293365 BTC (calculated using col D) at an average price of 0.00015816 per SIN.
What I would like to achieve is that, when a number is entered in a cell, the sheet returns the average price of an order based on the number entered and the orders imported from the trading exchange.
=INDIRECT(ADDRESS(MATCH(VLOOKUP(O2,F2:F,1),F:F,0),7,4))+(
INDIRECT(ADDRESS(MATCH(VLOOKUP(O2,F2:F,1),F:F,0)+1,4,4))*(O2-
INDIRECT(ADDRESS(MATCH(VLOOKUP(O2,F2:F,1),F:F,0),6,4)))/
INDIRECT(ADDRESS(MATCH(VLOOKUP(O2,F2:F,1),F:F,0)+1,3,4)))
spreadsheet demo
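The calculation the formula performs (fill whole orders from the top of the book, then partially fill the next one) can be sketched in Python as a cross-check. This is not a translation of the formula cell by cell: the order book below is hypothetical with round numbers, and `average_fill_price` is a made-up name.

```python
def average_fill_price(orders, quantity):
    """Walk the order book and return (average price, total cost) for `quantity` units.

    `orders` is a list of (price_per_unit, amount) pairs, cheapest first.
    """
    if quantity <= 0:
        raise ValueError("quantity must be positive")
    remaining = quantity
    total_cost = 0.0
    for price, amount in orders:
        take = min(amount, remaining)  # fill fully, or partially on the last order
        total_cost += take * price
        remaining -= take
        if remaining <= 0:
            break
    if remaining > 0:
        raise ValueError("order book too shallow for requested quantity")
    return total_cost / quantity, total_cost


# Hypothetical order book: 10 units at 1.0, 10 at 2.0, 10 at 3.0
orders = [(1.0, 10.0), (2.0, 10.0), (3.0, 10.0)]

# Buying 15 units fills the first order and half of the second
avg, total = average_fill_price(orders, 15.0)
```

Here the total is 10 × 1.0 + 5 × 2.0 = 20.0, so the average price is 20.0 / 15.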

SQL syntax to calculate total menu items per meal orderID across multiple meal orderIDs

I am new to SQL and Stack Overflow and have a question about SQL Server syntax. I have searched online but am not finding what I need, and I would appreciate your assistance in this matter.
I have data in a source table for meal orders, each with a specific ID (e.g. 12345C), and for the items of each order (e.g. sandwich, drink, chips), each with an associated number starting with 1. For instance, the sandwich would have an item number of 1, the chips would be item # 2, and the drink would be item # 3 for the same orderID 12345C. The prior example would therefore have 3 rows of data in the source table for orderID 12345C.
My questions are these:
how do I use a SQL expression to determine the number of items per each order (e.g. 3 for the above example, which is also the maximum value for item number for each orderID)
and then add all of these items per order across hundreds of orders per day to determine the daily total number of items ordered.
So, if I had 3 orders in one day - one with 2 items, the second with 3 items, and the third with 4 items, I would like my final number to be 9.
This number is for use in a Sisense dashboard that allows SQL syntax in the field definition. Thank you for your help!
It is a bit difficult to explain, but I am not able to use a query against a table because I am working with a dashboard in Sisense. I am adding fields in a pivot display, and one of the fields I would like to include is the total count of order items per day (across several dozen orderIDs).
Here is an example of the data in the table. From the example, I would like the final answer for orderID 1787588 to be 3 (there are 3 items within the order).
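The two aggregations asked for can be sketched with standard SQL, run here through Python's built-in sqlite3 module so the example is self-contained. The table name `order_items` and its column names are hypothetical, SQLite stands in for SQL Server, and whether a Sisense field definition accepts this exact form is not something the question confirms.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_items (order_id TEXT, item_number INTEGER, order_date TEXT)"
)

# Three hypothetical orders on one day with 2, 3 and 4 items, as in the question
rows = [("A", n, "2020-01-01") for n in range(1, 3)]
rows += [("B", n, "2020-01-01") for n in range(1, 4)]
rows += [("C", n, "2020-01-01") for n in range(1, 5)]
conn.executemany("INSERT INTO order_items VALUES (?, ?, ?)", rows)

# Items per order: one row per item, so COUNT(*) per order_id is the item count
per_order = dict(
    conn.execute(
        "SELECT order_id, COUNT(*) FROM order_items GROUP BY order_id"
    ).fetchall()
)

# Daily total: take the highest item number per order, then sum those per day
daily_total = dict(
    conn.execute(
        """
        SELECT order_date, SUM(items)
        FROM (
            SELECT order_date, order_id, MAX(item_number) AS items
            FROM order_items
            GROUP BY order_date, order_id
        ) AS per_order
        GROUP BY order_date
        """
    ).fetchall()
)
conn.close()
```

If every item occupies exactly one row, a plain `COUNT(*)` grouped by `order_date` gives the same daily total without the subquery. The `MAX(item_number)` form is shown because the question describes the item count that way.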

Excel formula for creating a list using predefined criteria

The array formula I need should create a list of unique divisions for the country entered in cell A1, including only divisions that have forecasted sales. The raw data sheet the list pulls from is called 'Forecast' and has the following information:
Column A = Country, Column B = Division, Column C = Forecasted Sales, Column D = Forecast month
As the forecast data is broken into months, each division is repeated 24 times (for the 24-month forecast), so the formula will need to return one record while eliminating the rest.
The closest solution I have found is this entry, but I was unable to tailor it to my situation: Create a dynamic list from multiple criteria in data block
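To pin down what the formula should return, the intended logic is sketched here in Python rather than as an Excel array formula. The rows are hypothetical and follow the column order of the 'Forecast' sheet. Treating "has forecasted sales" as a sales value greater than zero is an assumption, and `divisions_with_forecast` is a made-up name.

```python
# Hypothetical rows in the sheet's column order:
# (Country, Division, Forecasted Sales, Forecast month)
forecast = [
    ("France", "Retail", 100, "2020-01"),
    ("France", "Retail", 120, "2020-02"),
    ("France", "Wholesale", 0, "2020-01"),
    ("France", "Online", 50, "2020-01"),
    ("Spain", "Retail", 80, "2020-01"),
    ("France", "Online", 0, "2020-02"),
]


def divisions_with_forecast(rows, country):
    """Return each division of `country` once, in first-seen order, if it has sales > 0."""
    result = []
    for row_country, division, sales, _month in rows:
        if row_country == country and sales > 0 and division not in result:
            result.append(division)
    return result


france_divisions = divisions_with_forecast(forecast, "France")
```

A division is kept as soon as any one of its monthly rows has sales, which is why 'Online' appears even though one of its months is zero, and 'Wholesale' does not.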

Resources