How can I get the NYC taxi zone using a given coordinate? - dataset

The NYC Taxi trips dataset in BigQuery has changed its pickup and dropoff locations from geographic positions (longitude and latitude) to taxi zones. I found the NYC Taxi Zones table, but I'm struggling to get the zone for a given coordinate.
I think it is related to multipolygons, but I'm not sure. Can anyone provide a simple example of getting the zone from a coordinate?

Found this link to figure out how.
/* find the boroughs and zone names for dropoff locations */
INNER JOIN `bigquery-public-data.new_york_taxi_trips.taxi_zone_geom` tz_do
  ON ST_DWithin(tz_do.zone_geom, ST_GeogPoint(dropoff_longitude, dropoff_latitude), 0)
/* find the boroughs and zone names for pickup locations */
INNER JOIN `bigquery-public-data.new_york_taxi_trips.taxi_zone_geom` tz_pu
  ON ST_DWithin(tz_pu.zone_geom, ST_GeogPoint(pickup_longitude, pickup_latitude), 0)
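
To make the idea behind that join more concrete, here is a minimal Python sketch of a point-in-multipolygon lookup. It is only an illustration under stated assumptions: the zone names and rectangles in `ZONES` are made up, and the test is a planar ray-casting check, whereas BigQuery's geography functions work on the sphere.

```python
from typing import Optional

# Hypothetical zones (not real taxi zones). Each zone is a "multipolygon":
# a list of polygons, each polygon a list of (longitude, latitude) vertices.
ZONES = {
    "Zone A": [[(-74.00, 40.70), (-73.98, 40.70), (-73.98, 40.72), (-74.00, 40.72)]],
    "Zone B": [
        [(-73.98, 40.70), (-73.96, 40.70), (-73.96, 40.72), (-73.98, 40.72)],
        [(-73.95, 40.73), (-73.94, 40.73), (-73.94, 40.74), (-73.95, 40.74)],
    ],
}


def point_in_polygon(lon: float, lat: float, polygon) -> bool:
    """Ray casting: toggle 'inside' each time a ray going east crosses an edge."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def find_zone(lon: float, lat: float) -> Optional[str]:
    """Return the first zone with a polygon containing the point, else None."""
    for name, polygons in ZONES.items():
        if any(point_in_polygon(lon, lat, p) for p in polygons):
            return name
    return None
```

With these made-up shapes, `find_zone(-73.99, 40.71)` falls in "Zone A", and a point in the second, detached polygon of "Zone B" is still attributed to "Zone B", which is the reason a zone is modelled as a multipolygon.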

Related

How can I apply 2 date controls on report/page level?

I want to know if it is possible to have 2 date range controls on my page. Each date range control would be connected to a different date of our products (Purchase date / Consumption date).
Here is a simplified editable copy of the data studio report.
The Google Sheet source looks like:
ID     | Purchase date | Consumption date | Product | Price
ABCD12 | 21/03/2022    | 09/11/2022       | A       | £50
EFGH34 | 22/03/2022    | 22/11/2022       | B       | £80
IJKL56 | 23/04/2022    | 15/11/2022       | A       | £50
MNOP78 | 24/03/2022    | 06/12/2022       | A       | £50
The output I'm looking for is to be able to filter data so that I can answer the question "how many products were purchased in March 2022 that have a consumption date in November 2022". The expected output is as follows:
ID     | Purchase date | Consumption date | Product | Price
ABCD12 | 21/03/2022    | 09/11/2022       | A       | £50
EFGH34 | 22/03/2022    | 22/11/2022       | B       | £80
Supermetrics has a Date Picker that essentially does what I need. But it has 2 downsides: 1) it is bulky and does not work well with many years of data, and 2) it does not allow breaking down below the monthly level.
Is there another way to make this happen with parameters?
Through this post I've gotten as far as getting a 'switch' for my graphs and tables between the two date datapoints, but that is not the solution I'm looking for.
Actually, you have already found a good solution in the third-party add-on "Date Picker" from Supermetrics. An alternative route is to include two extra tables, each of which has only one of the date fields as a column. The user can then select values in these tables to cross-filter the main table.
In the first table, the dimension has to be changed to "Year Month":
An alternative community visualisation to the Date Picker (based on the limitations cited in the question) would be the Range Slider.
Two Range Sliders could be used (one for each date field), however, the below will use one Date range control and one Range Slider to demonstrate that they can work together (as well as maintaining the original setup in the question):
1) Purchase date
1.1) Date range control
1.2) Table
Date Range Dimension: Purchase date
Dimension 1: ID
Dimension 2: Purchase date
Dimension 3: Consumption date
Dimension 4: Product
Metric: Price
2) Consumption date
2.1) Range Slider
Column to filter on: Consumption date
(Chart Interactions) Cross Filtering: ☑
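
As a rough illustration of what the two controls compute together (not of how Data Studio implements them), here is a small Python sketch that applies one range per date field to the sample rows from the question. The function name `filter_rows` is made up for this example.

```python
from datetime import date

# Sample rows from the question:
# (ID, purchase date, consumption date, product, price in GBP)
ROWS = [
    ("ABCD12", date(2022, 3, 21), date(2022, 11, 9), "A", 50),
    ("EFGH34", date(2022, 3, 22), date(2022, 11, 22), "B", 80),
    ("IJKL56", date(2022, 4, 23), date(2022, 11, 15), "A", 50),
    ("MNOP78", date(2022, 3, 24), date(2022, 12, 6), "A", 50),
]


def filter_rows(rows, purchase_range, consumption_range):
    """Keep rows whose purchase AND consumption dates fall in their ranges (inclusive)."""
    p_start, p_end = purchase_range
    c_start, c_end = consumption_range
    return [
        row for row in rows
        if p_start <= row[1] <= p_end and c_start <= row[2] <= c_end
    ]


# Purchased in March 2022 and consumed in November 2022.
result = filter_rows(
    ROWS,
    (date(2022, 3, 1), date(2022, 3, 31)),
    (date(2022, 11, 1), date(2022, 11, 30)),
)
result_ids = [row[0] for row in result]
```

Here `result_ids` contains ABCD12 and EFGH34, matching the expected output in the question.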
Publicly editable Google Data Studio report (embedded Google Sheets data source) and a GIF to elaborate:

How to calculate the tax free amount of sales, based on date fields?

I need your help with a task that I have undertaken and am having difficulties with.
I have to calculate the NET amount of sales for some products, which were sold in different cities in different years, so a different tax rate applies to each sale.
Specifically, I have a dimension table (Dim_Cities) which lists the cities where the products can be sold.
Dim_Cities columns:
CityID, CityName, Area, District
Dim_Cities sample row:
1, "Athens", "Attiki", "Central Greece"
I also have a file/table with the following columns:
[SalesArea]
,[EffectiveFrom_2019]
,[EffectiveTo_2019]
,[VAT_2019]
,[EffectiveFrom_2018]
,[EffectiveTo_2018]
,[VAT_2018]
,[EffectiveFrom_2017]
,[EffectiveTo_2017]
,[VAT_2017]
,[EffectiveFrom_2016_Semester1]
,[EffectiveTo_2016_Semester1]
,[VAT_2016_Semester1]
,[EffectiveFrom_2016_Semester2]
,[EffectiveTo_2016_Semester2]
,[VAT_2016_Semester2]
For example:
"Athens", "2019-01-01", "2019-12-31", 0.24,
"2018-01-01", "2018-12-31", 0.24,
"2017-01-01", "2017-12-31", 0.17,
"2016-01-01", "2016-05-31", 0.16,
"2016-06-01", "2016-12-31", 0.24
And of course there is a fact table that holds all the sales information, with these columns:
FactSales_ID, CityID, SaleAmount (with VAT), SaleDate_ID.
The question is how to compute, for every city, the "TAX-Free SalesAmount" that corresponds to each particular sale date. In other words, I think I have to create a function that computes the NET amount each time, removing the tax at the rate that applies to the sale's date and city. Can anyone help or guide me to achieve this, please?
I'm not sure whether you are asking how to query your data to produce this result or how to design your data warehouse to make this data available. I'm hoping it is the latter, as this information should definitely be pre-calculated and held in your DW rather than being calculated every time anyone wants to report on the data.
One of the key points of building a DW is that all the complex business logic should be handled in the ETL (as much as possible) so that the actual reporting is simple; the only calculations in a reporting process should be those that can't be pre-calculated.
If your CITY Dim is SCD2 (or could be made SCD2), then I would add the VAT rate as an attribute of that Dim; otherwise you could hold the VAT rate in a "worker" table.
When your ETL loads your Fact table, you would use the VAT rate on the CITY Dim (or in the worker table) to calculate the Net and Gross amounts and hold both as measures in your fact table.
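
As a hedged sketch of that ETL step, the Python below looks up the rate valid for a city on the sale date and derives the net amount from the VAT-inclusive amount as gross / (1 + rate). The table layout (one row per city and validity period) and the names `VAT_RATES`, `vat_rate` and `net_amount` are assumptions for illustration. The rates are taken from the question's sample, reading the second 2016 period as starting on 1 June 2016.

```python
from datetime import date
from decimal import Decimal

# Assumed "worker" table: one row per (city, valid from, valid to, VAT rate).
# The second 2016 period is assumed to start on 2016-06-01.
VAT_RATES = [
    ("Athens", date(2016, 1, 1), date(2016, 5, 31), Decimal("0.16")),
    ("Athens", date(2016, 6, 1), date(2016, 12, 31), Decimal("0.24")),
    ("Athens", date(2017, 1, 1), date(2017, 12, 31), Decimal("0.17")),
    ("Athens", date(2018, 1, 1), date(2018, 12, 31), Decimal("0.24")),
    ("Athens", date(2019, 1, 1), date(2019, 12, 31), Decimal("0.24")),
]


def vat_rate(city: str, sale_date: date) -> Decimal:
    """Return the VAT rate valid for the city on the sale date."""
    for rate_city, valid_from, valid_to, rate in VAT_RATES:
        if rate_city == city and valid_from <= sale_date <= valid_to:
            return rate
    raise LookupError(f"no VAT rate for {city} on {sale_date}")


def net_amount(gross: Decimal, city: str, sale_date: date) -> Decimal:
    """Net = gross / (1 + rate), rounded to 2 decimal places."""
    rate = vat_rate(city, sale_date)
    return (gross / (1 + rate)).quantize(Decimal("0.01"))
```

For example, a VAT-inclusive sale of 124.00 in Athens during 2018 (rate 0.24) gives a net amount of 100.00. Note that the net is obtained by dividing by (1 + rate), not by subtracting rate × gross.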

How to select a nonblank date

In Power BI, I am computing the percentage difference between stock price index levels over the last year.
Ann pch =
VAR __EarliestValue =
    CALCULATE(
        SUM('Equity Markets (2)'[Value]),
        DATEADD(LASTDATE('Calendar'[Date]), -1, YEAR)
    )
VAR __LastDateValue =
    CALCULATE(
        SUM('Equity Markets (2)'[Value]),
        LASTDATE('Calendar'[Date])
    )
RETURN
    CALCULATE(DIVIDE(__LastDateValue, __EarliestValue) - 1)
The above is mostly correct, but there is a bug: some dates fall on a weekend or another non-trading day. In that case I want to select the next nonblank value for __EarliestValue and the previous nonblank value for __LastDateValue.
Could anyone suggest the code to implement this?
I am very much a DAX/Power BI novice. Thank you very much.
Data Sample:
I created a slicer based on the column 'Equity Markets (2)'[Date].
Note: do not build the slicer on your calendar date, otherwise you get your "holes".
Then I created a measure with the following formula:
Measure = LOOKUPVALUE(EquityMarkets[Value];EquityMarkets[Date]; MAX(EquityMarkets[Date]))/ LOOKUPVALUE(EquityMarkets[Value];EquityMarkets[Date]; MIN(EquityMarkets[Date]))
I show this measure in a card visual. When the slicer is used, the calculation is made on the first and last dates that actually have data.
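
The "next nonblank / previous nonblank" selection the question asks for can be sketched outside DAX as follows. This is only an illustration of the logic in Python, with made-up index levels in `PRICES`, not a Power BI measure.

```python
from bisect import bisect_left, bisect_right
from datetime import date

# Hypothetical index levels; only trading days have a value.
PRICES = {
    date(2021, 1, 4): 100.0,    # Monday
    date(2021, 1, 5): 102.0,
    date(2021, 12, 31): 125.0,  # Friday
    date(2022, 1, 3): 126.0,    # Monday
}
TRADING_DAYS = sorted(PRICES)


def next_value(day: date):
    """Value on `day`, or on the first trading day after it (None if there is none)."""
    i = bisect_left(TRADING_DAYS, day)
    return PRICES[TRADING_DAYS[i]] if i < len(TRADING_DAYS) else None


def previous_value(day: date):
    """Value on `day`, or on the last trading day before it (None if there is none)."""
    i = bisect_right(TRADING_DAYS, day)
    return PRICES[TRADING_DAYS[i - 1]] if i > 0 else None


def pct_change(start: date, end: date):
    """Change from the next nonblank value at `start` to the previous one at `end`."""
    first, last = next_value(start), previous_value(end)
    if first is None or last is None:
        return None
    return last / first - 1
```

With this data, a window from Saturday 2 January 2021 to Sunday 2 January 2022 uses the values of 4 January 2021 and 31 December 2021, so `pct_change` returns 0.25.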

MS SQL - Calculating plan payments for a month

I need to calculate how much a plan has cost the customer in a specific month.
Plans have floating billing cycles of a month's length - for example a billing cycle can run from '2014-04-16' to '2014-05-16'.
I know the start date of a plan, and the end date can either be a specific date or NULL if the plan is still running.
If the end date is not null, then the customer is still charged for a whole month, not pro rata. Example: the billing cycle runs from the 4th to the 4th of each month, but the customer ends his plan on the 10th; he will still be charged until the 4th of the next month.
Can anyone help me? I feel like I've been going over this a million times, and just can't figure it out.
Variables I have:
@planStartDate [Plan's start date]
@planEndDate [Plan's end date - can be null]
@billStartDate [The bill's start date - example: 2015-02-01]
@billEndDate [One month after bill's start date - 2015-03-01]
@price [The plan's price per billing cycle]
Here's the best answer I can give based on the very limited information you have provided so far (by the way, in the future it would help people answer your question faster and more accurately if you specified more detail: tables involved, all columns, etc.):
"I need to calculate how much a plan has cost the customer in a specific month."
SELECT customerID, SUM(price) -- customerID: I assume you have a column of some sort in this table to distinguish between customers
FROM table_foo
WHERE planStartDate BETWEEN 'start date you specify' AND 'end date you specify'
GROUP BY customerID
It's a rough query, but that's the best I can give until you describe your data more clearly (i.e. tables involved, all columns in each table, etc.).
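
One possible reading of the billing rules can be sketched in Python as below. The interpretation is an assumption, not something the question confirms: the full price is charged at the start of each monthly cycle, cycles start on the plan's start date and every month after it, no new cycle starts on or after the plan's end date, and a month's cost is the sum of the charges whose cycle start falls within [bill start, bill end). The helper names are made up.

```python
import calendar
from datetime import date
from typing import Optional


def add_months(d: date, months: int) -> date:
    """Shift a date by whole months, clamping the day to the target month's length."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def cost_in_month(plan_start: date, plan_end: Optional[date],
                  bill_start: date, bill_end: date, price: int) -> int:
    """Sum the full-cycle charges whose cycle start lies in [bill_start, bill_end)."""
    total = 0
    k = 0
    while True:
        cycle_start = add_months(plan_start, k)
        if cycle_start >= bill_end:
            break  # later cycles fall outside the billing month
        if plan_end is not None and cycle_start >= plan_end:
            break  # assumption: no new cycle starts once the plan has ended
        if cycle_start >= bill_start:
            total += price  # never pro-rated: a started cycle is charged in full
        k += 1
    return total
```

For a plan started on 2014-04-16 that is still running, the bill for 2015-02-01 to 2015-03-01 contains one cycle start (2015-02-16), so the cost is one full price. If the charge should instead be attributed to the days covered rather than to the cycle start, the condition on `cycle_start` would need to change.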

How to keep track changing items in a stock portfolio?

I have a system where people can pick some stocks and it values their portfolios, but I'm having trouble doing this in an efficient way on a daily basis because I'm creating entries for days that don't have any changes (think of it as measuring the values and having version control, so I can track changes to the way the portfolio is designed).
Here's an example (each day's portfolio with stock name and weight):
Day1:
ibm = 10%
microsoft = 50%
google = 40%
day5:
ibm = 20%
microsoft = 20%
google = 40%
cisco = 20%
I can measure the value of the portfolio on day 1 and understand that I need to measure it again on day 5 (when it changed), but how do I measure days 2-4 without recreating day 1's entry in the database?
My approach right now (which I don't like) is to create a temp entry in my database when someone changes the portfolio. At the end of the day, when I calculate the values, I use the temp entry if there is one; otherwise I create a new entry (for days 2-4) using the last day's data. The issue is that, as the data often doesn't change, I'm creating entries that are basically duplicates. The catch is that my stock data is all daily. I also thought of taking the portfolio and, if it hasn't been updated in 3 days, finding the returns of the last 3 days for each stock, but I wasn't sure if there was a better solution.
Any ideas? I think this is a straightforward problem, but I just can't see an efficient way of doing it.
Note: in finance terms, it's called creating a NAV, and most firms do it the inefficient way I'm doing it, but that's because the process was created some 50 years ago and hasn't changed. I think this problem is very similar to version control, but I can't seem to come up with a solution.
In storage terms it makes most sense to store only the changes:
UserId - StockId1 - 23% - 2012-06-25
UserId - StockId2 - 11% - 2012-06-26
UserId - StockId1 - 20% - 2012-06-30
So you can see that stock 1 went down on the 30th. Now if you want to know the StockId1 percentage on the 28th, you just select the most recent row on or before that date (in practice also filtered by user and stock):
SELECT *
FROM stocks
WHERE datecolumn <= DATE('2012-06-28')
ORDER BY datecolumn DESC LIMIT 0,1
If it gives nothing back you did not have it, otherwise you get the last position back.
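
The same "latest change on or before the date" lookup can be sketched in Python. This is a minimal illustration using the example rows above (user id omitted); the names `CHANGES` and `weight_on` are made up, and each history list is assumed to be sorted by date.

```python
from bisect import bisect_right
from datetime import date

# Change records only: per stock, (date the weight changed, weight in %),
# sorted by date. Values come from the example rows above.
CHANGES = {
    "StockId1": [(date(2012, 6, 25), 23), (date(2012, 6, 30), 20)],
    "StockId2": [(date(2012, 6, 26), 11)],
}


def weight_on(stock: str, day: date):
    """Return the weight in force on `day`, or None if the stock was not held yet."""
    history = CHANGES.get(stock, [])
    change_dates = [changed for changed, _ in history]
    # Index of the first change strictly after `day`; the one before it applies.
    i = bisect_right(change_dates, day)
    return history[i - 1][1] if i > 0 else None
```

Here `weight_on("StockId1", date(2012, 6, 28))` returns 23, because the change on the 25th is still in force on the 28th, so days without changes never need their own rows.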
By the way, if you need, for example, a graph of stock 1, you could left join against a table full of dates. Then you can fill in the gaps easily.
Found this post here for example:
UPDATE mytable
SET number = (@n := COALESCE(number, @n))
ORDER BY date;
SQL QUERY replace NULL value in a row with a value from the previous known value
