With this query I get the result below. The idea is that each sold product has a sale date. I want to check whether that date falls between date_start and date_end in the Marketing table and, if so, put that row's price in marketingprice.
In some cases there is no end date. This means the campaign is still running, and we still want to use it.
Because there is no end date, I want to use today's date instead. So if E (date_end) is empty, use today's date; otherwise use E.
=query(Marketing!$B$2:E,
"select C,D where B='" & D2 & "' and
D<=date '"&TEXT(E2,"yyyy-MM-dd")&"' and
date'"&TEXT(today(),"yyyy-MM-dd")&"'>=date '"&TEXT(E2,"yyyy-MM-dd")&"' ")
+------------+---------------+--------+-----------------+----------+----------------+
| product_no | product_price | amount | deliver_country | datum | marketingprice |
+------------+---------------+--------+-----------------+----------+----------------+
| 1001 | 2.8 | 2 | de | 2-1-2020 | |
+------------+---------------+--------+-----------------+----------+----------------+
+-----+------------+
| 3.2 | 01-01-2020 |
| 1.2 | 02-01-2020 |
+-----+------------+
I want to use IF(isblank(E),date'"&TEXT(TODAY(),"yyyy-MM-dd")&"', E)
Then the code will be
=query(Marketing!$B$2:E,
"select C,D where B='" & D2 & "' and
D<=date '"&TEXT(E2,"yyyy-MM-dd")&"' and
IF(isblank(E),date'"&TEXT(today(),"yyyy-MM-dd")&"',E)>=date '"&TEXT(E2,"yyyy-MM-dd")&"' ")
Then I get an error:
QUERY: PARSE_ERROR: Encountered " "IF "" at line 1, column 58.
Was expecting one of: "(" ... "(" ...
Marketing Table
+---------+---------+-------+-------------+------------+
| channel | country | price | date_start | date_end |
+---------+---------+-------+-------------+------------+
| Google | de | 3.2 | 01-01-2020 | 01-01-2020 |
| Google | de | 1.2 | 02-01-2020 | |
| Amazon | en | 5.4 | 01-01-2020 | |
+---------+---------+-------+-------------+------------+
Output how it should be
+------------+---------------+--------+-----------------+----------+----------------+
| product_no | product_price | amount | deliver_country | datum | marketingprice |
+------------+---------------+--------+-----------------+----------+----------------+
| 1001 | 2.8 | 2 | de | 2-1-2020 | 1.2 |
| 1002 | 3.8 | 4 | en | 3-1-2020 | 5.4 |
| 1001 | 2.8 | 1 | de | 1-1-2020 | 3.2 |
+------------+---------------+--------+-----------------+----------+----------------+
In MySQL I use this code:
b.start_date <= date(i.system_created) AND
coalesce(b.end_date,now()) >= date(i.system_created)
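As a cross-check of the logic, here is a minimal Python sketch of the rule the MySQL snippet expresses: a campaign matches when it started on or before the sale date and its end date, or today when the end date is empty, is on or after the sale date. The function name `marketing_price` and the tuple layout are assumptions made for illustration; the rows mirror the Marketing table in this question.

```python
from datetime import date

# (country, price, date_start, date_end); None marks a campaign with no end date.
MARKETING = [
    ("de", 3.2, date(2020, 1, 1), date(2020, 1, 1)),
    ("de", 1.2, date(2020, 1, 2), None),
    ("en", 5.4, date(2020, 1, 1), None),
]

def marketing_price(rows, country, sale_date, today=None):
    """Return the price of the first campaign covering sale_date, or None."""
    today = today or date.today()
    for row_country, price, start, end in rows:
        # Same idea as COALESCE(end_date, NOW()): an empty end date means "still running".
        effective_end = end if end is not None else today
        if row_country == country and start <= sale_date <= effective_end:
            return price
    return None
```

Passing `today` explicitly keeps the function easy to test; in normal use it can be left out.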
Solution I found myself:
=QUERY(Marketing!$B$2:$E;IF(Marketing!E$2:E="";
"select E where B ='"&D2&"' and D <=date'"&TEXT(E2;"yyyy-MM-dd")&"' and date'"&TEXT(VANDAAG();"yyyy-MM-dd")&"' >=date'"&TEXT(E2;"yyyy-MM-dd")&"' limit 1";
"select E where B ='"&D2&"' and D <=date'"&TEXT(E2;"yyyy-MM-dd")&"' and E >=date'"&TEXT(E2;"yyyy-MM-dd")&"'"
);0)
try:
=ARRAYFORMULA(IFERROR(QUERY(
{Marketing!B$2:D\ IF(Marketing!E$2:E=""; TODAY(); Marketing!E$2:E)};
"select Col2
where Col1 = '"&D2&"'
and Col3 <= date '"&TEXT(E2;"yyyy-MM-dd")&"'
and Col4 >= date '"&TEXT(E2;"yyyy-MM-dd")&"'
limit 1"; 0)))
You could use Apps Script to solve this issue.
How does it work?
1. Assign the Sales data (D2:E7) to the sales variable and the Marketing data (B2:E7) to the marketing variable.
2. Loop through sales with a for loop; if sale_date < date_start, write the previous row's marketing cost (marketing_cost[i-1]) to the cell, otherwise write the current row's cost.
Below you can find a screenshot attached before and after.
function worker(){
let ss = SpreadsheetApp.getActive();
let sheetSales = ss.getSheetByName("Sales");
let sheetMarketing = ss.getSheetByName("Marketing");
// 1.
let sales = sheetSales.getRange("D2:E7").getValues();
/** sales
[
[de, 02-01-2020],
[de, 03-01-2020],
[de, 04-01-2020],
[de, 06-01-2020],
[de, 10-01-2020],
[en, 10-01-2020]
]
*/
let marketing = sheetMarketing.getRange("B2:E7").getValues();
/** marketing
[
[de, 3.2, 01-01-2020, 02-01-2020],
[de, 1.2, 03-01-2020, 04-01-2020],
[de, 4.4, 05-01-2020, 06-01-2020],
[de, 8.8, 07-01-2020, 08-01-2020],
[de, 9.9, 09-01-2020, 25-02-2020],
[en, 5.4, 01-01-2020, 25-02-2020]
]
*/
let sale_date,
date_start,
date_end,
marketing_cost = [];
for(let i = 0; i < sales.length; i++){
sale_date = new Date(sales[i][1]);
date_start = new Date(marketing[i][2]);
date_end = new Date(marketing[i][3]);
marketing_cost.push(marketing[i][1]);
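// 2. Use the previous row's cost when the sale predates this row's start.
// The i > 0 guard avoids reading marketing_cost[-1] on the first row.
if(i > 0 && sale_date < date_start){
sheetSales.getRange(2+i, 6).setValue(marketing_cost[i-1]);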
}else{
sheetSales.getRange(2+i, 6).setValue(marketing_cost[i]);
}
}
}
Before:
After:
Reference:
Spreadsheet Service
Objective:
Looking up Class ID (ignoring anything past the **), and return corresponding date in (B2:B).
Then Looking up Class Event, and return the corresponding date in (C2:C).
I have tried combinations of HLookup, VLookup, Index & Match, and Query, but cannot seem to get it to work correctly.
My Sheet:
Column | A | B? | C? | D | E | F |
Row1 | [Class ID's] | [Class ID Date] | [Class Event Date] | [Dates] | [Name1] | [Name2] |
Row2 | Class ID1 | 01/02/2021 | 01/04/2021 | 01/01/2021 | | Class ID3** |
Row3 | Class ID2 | 01/08/2021 | 01/09/2021 | 01/02/2021 | Class ID1** | |
Row4 | Class ID3 | 01/01/2021 | 01/07/2021 | 01/03/2021 | | Class ID4** |
Row5 | Class ID4 | 01/03/2021 | 01/09/2021 | 01/04/2021 | Class Event | |
Row6 | Class ID5 | * Formula #1 * | * Formula #2 * | 01/05/2021 | | |
Row7 | Class ID6 | | | 01/06/2021 | | |
Row8 | Class ID7 | | | 01/07/2021 | | Class Event |
Row9 | Class ID8 | | | 01/08/2021 | Class ID2** | |
Row10 | Class ID9 | | | 01/09/2021 | Class Event | Class Event |
Row11 | Class ID10 | | | 01/10/2021 | | |
Formula #1 (Column: B)
Find the class ID and return the dates into range B2:B (working, but not efficient)
=IFERROR(INDEX($D$2:$D,MATCH($A2,E$2:E,0)),INDEX($D$2:$D,MATCH($A2,F$2:F,0)))
...and so on, for each column (There are 60 columns).
Formula #2 (Column: C)
Find class ID, search column, find first instance of "Class Event", return date into range: C2:C
="I have absolutely no clue for this one"
Is this even possible in Google Sheets?
I can use Excel if needed (but preferably not, as this sheet pulls data from another Google Sheet).
C2:
=ARRAYFORMULA(IFNA(VLOOKUP(B2:B, {F2:F, E2:E; G2:G, E2:E}, 2, 0)))
D2:
As the dataset stands, there is nothing to pair events with specific IDs. It would be possible if you had
Class ID4 Event
instead of just
Class Event
update 1:
C2 for 60 columns would be:
=ARRAYFORMULA(IFNA(VLOOKUP(B2:B,
SPLIT(FLATTEN(IF(F2:BN="",,F2:BN&"×"&E2:E)), "×"), 2, 0)))
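To make the idea behind the flattened lookup easier to follow, here is a rough Python sketch of what the formula does conceptually: every non-empty cell is paired with the date of its row, the pairs are read row by row, and the first pair for each label wins (as VLOOKUP with an exact match would). The function name `first_date_by_label` and the sample data are assumptions taken from the question's table. Stripping everything from `**` onwards follows the question's wording and is not something the formula itself does.

```python
def first_date_by_label(dates, grid):
    """Map each non-empty cell label to the date of the first row it appears in."""
    lookup = {}
    for row_date, row in zip(dates, grid):
        for cell in row:
            # Ignore anything from "**" onwards, as the question requests.
            label = cell.split("**")[0].strip()
            if label and label not in lookup:
                lookup[label] = row_date
    return lookup

# First four rows of the question's sheet: dates column plus the two name columns.
DATES = ["01/01/2021", "01/02/2021", "01/03/2021", "01/04/2021"]
GRID = [
    ["", "Class ID3**"],
    ["Class ID1**", ""],
    ["", "Class ID4**"],
    ["Class Event", ""],
]

LOOKUP = first_date_by_label(DATES, GRID)
```

Because the pairs are built once for the whole grid, the approach does not need one MATCH per column.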
update 2:
Try this in D2 (it only works if a Class Event entry follows each Class ID):
=ARRAYFORMULA(IFNA(VLOOKUP(B2:B, {
QUERY(SPLIT(FLATTEN(IF(TRANSPOSE(F2:BN)="",,
TRANSPOSE(F2:BN)&"×"&TRANSPOSE(E2:E))), "×"),
"select Col1 where Col2 is not null", 0), {
QUERY(QUERY(SPLIT(FLATTEN(IF(TRANSPOSE(F2:BN)="",,
TRANSPOSE(F2:BN)&"×"&TRANSPOSE(E2:E))), "×"),
"select Col2 where Col2 is not null", 0),
"offset 1", 0); ""}}, 2, 0)))
So I have a view like this in SQL Server which I'm using to build a dashboard in Power BI:
ID | Name | IsRegional | IsFederal | Department | ...
1 | John | Yes | No | Paris | ...
2 | Mike | No | Yes | Brussels | ...
3 | Bill | No | Yes | Berlin | ...
4 | Bart | Yes | Yes | Berlin | ...
5 | Suzy | Yes | No | New York | ...
Currently I have two slicers in Power BI that say "Is Regional: Yes/No" and "Is Federal: Yes/No". I want to combine them into one slicer that says "Type: Federal/Regional".
My idea was to add a column TYPE in the view that says
WHEN IsRegional = 'Yes' THEN 'Regional'
WHEN IsFederal = 'Yes' THEN 'Federal'
ELSE 'None'
and then use the new column for the slicer
ID | Name | IsRegional | IsFederal | Type | Department | ...
1 | John | Yes | No | Regional | Paris | ...
2 | Mike | No | Yes | Federal | Brussels | ...
3 | Bill | No | Yes | Federal | Berlin | ...
4 | Bart | Yes | Yes | Regional | Berlin | ...
5 | Suzy | Yes | No | Regional | New York | ...
However, this creates an issue with record 4, where the type can be both. I would like the slicer to include row 4 when Federal is selected as the type (since it is both Federal and Regional). Is there a way to solve this so I can use a single slicer? I would rather not add another option saying "Both" to the slicer, because I'm sure people will overlook it.
Just change your CASE expression slightly:
CASE WHEN IsRegional = 'Yes' AND IsFederal = 'Yes' THEN 'Both'
WHEN IsRegional = 'Yes' THEN 'Regional'
WHEN IsFederal = 'Yes' THEN 'Federal'
ELSE 'None'
END
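If a "Both" option is not wanted, one alternative (not part of the CASE expression above, and only a sketch) is to produce one row per ID and type, so that a record which is both Regional and Federal appears under each type. The Python below illustrates that shape with the sample rows from the question. The function name `type_rows` is an assumption, and how the resulting rows are related back to the view in Power BI is not shown here.

```python
PEOPLE = [
    {"ID": 1, "Name": "John", "IsRegional": "Yes", "IsFederal": "No"},
    {"ID": 2, "Name": "Mike", "IsRegional": "No", "IsFederal": "Yes"},
    {"ID": 4, "Name": "Bart", "IsRegional": "Yes", "IsFederal": "Yes"},
]

def type_rows(people):
    """Return (ID, Type) pairs, one pair per type that applies to each person."""
    rows = []
    for person in people:
        types = []
        if person["IsRegional"] == "Yes":
            types.append("Regional")
        if person["IsFederal"] == "Yes":
            types.append("Federal")
        # Fall back to 'None' when neither flag is set, as in the CASE expression.
        for type_name in types or ["None"]:
            rows.append((person["ID"], type_name))
    return rows

TYPE_ROWS = type_rows(PEOPLE)
```

With this shape, filtering on "Federal" keeps IDs 2 and 4, and filtering on "Regional" keeps IDs 1 and 4.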
I have a table with a number of variables such as:
+-----------+------------+---------+-----------+--------+
| DateFrom | DateTo | Price | Discount | Cost |
+-----------+------------+---------+-----------+--------+
| 01jan17 | 01jul17 | 17 | 4 | 5 |
| 01aug17 | 01feb18 | 15 | 1 | 3 |
| 01mar18 | 01dec18 | 12 | 2 | 1 |
| ... | ... | ... | ... | ... |
+-----------+------------+---------+-----------+--------+
However I want to split this so I have:
+------------+------------+----------+-------------+---------+-------------+------------+----------+-------------+-------------+
| DateFrom1 | DateTo1 | Price1 | Discount1 | Cost1 | DateFrom2 | DateTo2 | Price2 | Discount2 | Cost2 ... |
+------------+------------+----------+-------------+---------+-------------+------------+----------+-------------+-------------+
| 01jan17 | 01jul17 | 17 | 4 | 5 | 01aug17 | 01feb18 | 15 | 1 | 3 |
+------------+------------+----------+-------------+---------+-------------+------------+----------+-------------+-------------+
There's a neat (not at all obvious) solution using proc summary and the idgroup option of the output statement that takes only a few lines of code. It runs in memory, so you are likely to run into problems if the dataset is large; otherwise it works very well.
Note that out[3] relates to the number of rows in the source data. You could easily make this dynamic by adding a prior step that calculates the number of rows and stores it in a macro variable.
/* create initial dataset */
data have;
input (DateFrom DateTo) (:date7.) Price Discount Cost;
format DateFrom DateTo date7.;
datalines;
01jan17 01jul17 17 4 5
01aug17 01feb18 15 1 3
01mar18 01dec18 12 2 1
;
run;
/* transform data into 1 row */
proc summary data=have nway;
output out=want (drop=_:)
idgroup(out[3] (_all_)=) / autoname;
run;
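For readers who do not use SAS, the reshaping itself can be sketched in Python: each source row contributes its columns to a single output row, with the row number appended to every column name. This only illustrates the target layout from the question. The function name `widen` is an assumption, and the suffix style follows the question's desired output rather than whatever names SAS generates.

```python
ROWS = [
    {"DateFrom": "01jan17", "DateTo": "01jul17", "Price": 17, "Discount": 4, "Cost": 5},
    {"DateFrom": "01aug17", "DateTo": "01feb18", "Price": 15, "Discount": 1, "Cost": 3},
    {"DateFrom": "01mar18", "DateTo": "01dec18", "Price": 12, "Discount": 2, "Cost": 1},
]

def widen(rows):
    """Collapse a list of rows into one dict with numbered column names."""
    wide = {}
    for number, row in enumerate(rows, start=1):
        for column, value in row.items():
            wide[f"{column}{number}"] = value
    return wide

WIDE = widen(ROWS)
```

Here the number of source rows is picked up automatically, which corresponds to making the `out[3]` count dynamic.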
I have a HIVE Table with following schema like this:
hive>desc books;
gen_id int
author array<string>
rating double
genres array<string>
hive>select * from books;
| gen_id | rating | author |genres
+----------------+-------------+---------------+----------
| 1 | 10 | ["A","B"] | ["X","Y"]
| 2 | 20 | ["C","A"] | ["Z","X"]
| 3 | 30 | ["D"] | ["X"]
Is there a query where I can perform some SELECT operation and that returns individual rows, like this:
| gen_id | rating | SplitData
+-------------+---------------+-------------
| 1 | 10 | "A"
| 1 | 10 | "B"
| 1 | 10 | "X"
| 1 | 10 | "Y"
| 2 | 20 | "C"
| 2 | 20 | "A"
| 2 | 20 | "Z"
| 2 | 20 | "X"
| 3 | 30 | "D"
| 3 | 30 | "X"
Can someone guide me on how to get this result? Thanks in advance for any help.
You need to use LATERAL VIEW with explode, i.e.
SELECT
gen_id,
rating,
SplitData
FROM (
SELECT
gen_id,
rating,
array (ex_author,ed_genres) AS ar_SplitData
FROM
books
LATERAL VIEW explode(books.author) exploded_authors AS ex_author
LATERAL VIEW explode(books.genres) exploded_genres AS ed_genres
) tab
LATERAL VIEW explode(tab.ar_SplitData) exploded_SplitData AS SplitData;
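I have not had a chance to test it, but it should show the general approach. Be aware that the two inner lateral views pair every author with every genre, so a row with several authors and genres emits the same value more than once; add DISTINCT to the outer SELECT if each value should appear only once per row. Good luck!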
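For reference, the row-per-element result the question asks for can be modelled in plain Python, which may help when checking a Hive query's output. This is only a sketch of the expected result, not of how Hive evaluates lateral views; the function name `split_rows` is an assumption, and the data is the sample from the question.

```python
BOOKS = [
    {"gen_id": 1, "rating": 10, "author": ["A", "B"], "genres": ["X", "Y"]},
    {"gen_id": 2, "rating": 20, "author": ["C", "A"], "genres": ["Z", "X"]},
    {"gen_id": 3, "rating": 30, "author": ["D"], "genres": ["X"]},
]

def split_rows(books):
    """Emit one (gen_id, rating, value) tuple per author, then per genre."""
    result = []
    for book in books:
        # Authors first, then genres, matching the order shown in the question.
        for value in book["author"] + book["genres"]:
            result.append((book["gen_id"], book["rating"], value))
    return result

SPLIT = split_rows(BOOKS)
```

Each array element appears exactly once per book, so the sample data yields ten rows.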
I need to retrieve data from a specific time period.
The query works fine until I specify the time period. Is there something wrong with the way I specify time period? I know there are many entries within that time-frame.
This query returns empty:
SELECT stop_times.stop_id, STR_TO_DATE(stop_times.arrival_time, '%H:%i:%s') as stopTime, routes.route_short_name, routes.route_long_name, trips.trip_headsign FROM trips
JOIN stop_times ON trips.trip_id = stop_times.trip_id
JOIN routes ON routes.route_id = trips.route_id
WHERE stop_times.stop_id = 5508
HAVING stopTime BETWEEN DATE_SUB(stopTime,INTERVAL 1 MINUTE) AND DATE_ADD(stopTime,INTERVAL 20 MINUTE);
Here is its EXPLAIN:
+----+-------------+------------+--------+------------------+---------+---------+-------------------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+--------+------------------+---------+---------+-------------------------------+------+-------------+
| 1 | SIMPLE | stop_times | ref | trip_id,stop_id | stop_id | 5 | const | 605 | Using where |
| 1 | SIMPLE | trips | eq_ref | PRIMARY,route_id | PRIMARY | 4 | wmata_gtfs.stop_times.trip_id | 1 | |
| 1 | SIMPLE | routes | eq_ref | PRIMARY | PRIMARY | 4 | wmata_gtfs.trips.route_id | 1 | |
+----+-------------+------------+--------+------------------+---------+---------+-------------------------------+------+-------------+
3 rows in set (0.00 sec)
The query works if I remove the HAVING clause (don't specify time range). Returns:
+---------+----------+------------------+-----------------+---------------+
| stop_id | stopTime | route_short_name | route_long_name | trip_headsign |
+---------+----------+------------------+-----------------+---------------+
| 5508 | 06:31:00 | "80" | "" | "FORT TOTTEN" |
| 5508 | 06:57:00 | "80" | "" | "FORT TOTTEN" |
| 5508 | 07:23:00 | "80" | "" | "FORT TOTTEN" |
| 5508 | 07:49:00 | "80" | "" | "FORT TOTTEN" |
| 5508 | 08:15:00 | "80" | "" | "FORT TOTTEN" |
| 5508 | 08:41:00 | "80" | "" | "FORT TOTTEN" |
| 5508 | 09:08:00 | "80" | "" | "FORT TOTTEN" |
I am using Google Transit format Data loaded into MySQL.
The query is supposed to provide stop times and bus routes for a given bus stop.
For a bus stop, I am trying to get:
Route Name
Bus Name
Bus Direction (headsign)
Stop time
The results should be limited to bus times from 1 minute ago to 20 minutes from now.
Please let me know if you could help.
UPDATE
The problem was that I was comparing DATE to DATETIME as one answer said.
I could not use DATE because my values had times but not dates.
So my solution was to use Unix time:
SELECT stop_times.stop_id, stop_times.trip_id, UNIX_TIMESTAMP(CONCAT(DATE_FORMAT(NOW(),'%Y-%m-%d '), stop_times.arrival_time)) as stopTime, routes.route_short_name, routes.route_long_name, trips.trip_headsign FROM trips
JOIN stop_times ON trips.trip_id = stop_times.trip_id
JOIN routes ON routes.route_id = trips.route_id
WHERE stop_times.stop_id = 5508
HAVING stopTime > (UNIX_TIMESTAMP(NOW()) - 60) AND stopTime < (UNIX_TIMESTAMP(NOW()) + (60*20));
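The same window check can be sketched outside the database, which makes the idea easier to test: attach today's date to each time-only value, then keep the ones that fall strictly between one minute ago and twenty minutes from now. The function name `upcoming` and its parameters are assumptions for illustration. The sketch also assumes every arrival time is a valid clock time; GTFS feeds may contain values past 24:00:00 for trips running after midnight, and those would need separate handling.

```python
from datetime import datetime, timedelta

def upcoming(arrival_times, now, back=timedelta(minutes=1), ahead=timedelta(minutes=20)):
    """Return the 'HH:MM:SS' strings that fall inside (now - back, now + ahead)."""
    matches = []
    for text in arrival_times:
        # The feed stores a time without a date, so combine it with today's date.
        clock = datetime.strptime(text, "%H:%M:%S").time()
        moment = datetime.combine(now.date(), clock)
        if now - back < moment < now + ahead:
            matches.append(text)
    return matches

NOW = datetime(2020, 1, 1, 6, 40, 0)
NEXT_STOPS = upcoming(["06:31:00", "06:57:00", "07:23:00"], NOW)
```

Passing `now` in as an argument, rather than reading the clock inside the function, keeps the comparison reproducible.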
Stoptime is a time value, and DATE_ADD/SUB work with datetime fields. Ensure they are both the same type.
Try this instead:
SELECT * FROM
(SELECT stop_times.stop_id, STR_TO_DATE(stop_times.arrival_time, '%H:%i:%s') as stopTime, routes.route_short_name, routes.route_long_name, trips.trip_headsign FROM trips
JOIN stop_times ON trips.trip_id = stop_times.trip_id
JOIN routes ON routes.route_id = trips.route_id
WHERE stop_times.stop_id = 5508) AS qu_1
WHERE qu_1.stopTime BETWEEN DATE_SUB(qu_1.stopTime,INTERVAL 1 MINUTE) AND DATE_ADD(qu_1.stopTime,INTERVAL 20 MINUTE);
Have to warn you I haven't tested this but it does remove the need for the HAVING clause.
Don't work with the synthetic column stopTime other than as the output.
I think your query should be something like:
SELECT stop_times.stop_id, STR_TO_DATE(stop_times.arrival_time, '%H:%i:%s') as stopTime, routes.route_short_name, routes.route_long_name, trips.trip_headsign FROM trips
JOIN stop_times ON trips.trip_id = stop_times.trip_id
JOIN routes ON routes.route_id = trips.route_id
WHERE stop_times.stop_id = 5508
AND arrival_time BETWEEN <something> AND <something else>
The HAVING clause you wrote should always return true, so I'm guessing that's not what you really had in mind.