Google Analytics Cohort data doesn't make sense - mobile

I'm trying to understand my cohort data, and something in it doesn't make sense:
On day/week 0 I sometimes have less than 100%. How can that be logical?
If I choose "all sessions", shouldn't it always be 100% on day/week 0?
Why isn't it 100% when I choose a segment? What does it mean if I have 50% on week 0 for a specific segment: the starting group is 50% of what?
Also, where can I find benchmarks for this data, so I can understand where I stand in the industry (mobile app, transportation or service)?

Related

Firebase Firestore Stock Count updating?

I am using Firebase purely to integrate a simple ticket-buying system.
I would think this is a very common scenario, and I am wondering what the solutions are.
My issue is with the write rate limit: it means I can't keep the stock count updated.
Because of Firebase's one-write-per-second limit and the way transactions work, the transactions keep timing out when a large number of tickets is bought at the same moment.
For example:
Let's say we have a simple ticket document like this:
{
  name: "Taylor Bieber Concert",
  stock: 100,
  price: 1000
}
I use a Firebase transaction server-side that does roughly this:
db.runTransaction(async (t) => {
  const ticket = (await t.get(ticketRef)).data(); // read the ticketRef document
  if (ticket.stock <= 0) return; // stop unless the stock is more than 0
  t.update(ticketRef, { stock: FieldValue.increment(-1) }); // decrement the stock by 1
});
The transaction and the rest of the functionality work. However, if 20-100 people try to buy a ticket as it is released, the document seems to go into contention and a bunch of the requests time out...
Is there a way to avoid these timeouts? Some sort of queue or something?
I have tried using transactions server-side in Firebase Functions to update the stock value, but when many people try to purchase the product simultaneously, the majority of the transactions get locked out / fail with ABORTED (code 10).
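One direction that is often suggested for a single hot document is to split the value across several shard documents, as in Firestore's distributed counter pattern. The sketch below is a plain in-memory Python model of that idea applied to the stock guard above, not Firestore code. The class name `ShardedStock`, the shard count, and the random shard choice are all assumptions made for illustration.

```python
import random


class ShardedStock:
    """In-memory stand-in for several shard documents that together hold the stock."""

    def __init__(self, total_stock, num_shards, seed=None):
        base, extra = divmod(total_stock, num_shards)
        # Spread the stock as evenly as possible across the shards.
        self.shards = [base + (1 if i < extra else 0) for i in range(num_shards)]
        self._rng = random.Random(seed)

    def buy_one(self):
        """Take one ticket. Returns the shard index used, or None if sold out."""
        order = list(range(len(self.shards)))
        self._rng.shuffle(order)  # a random order spreads buyers across shards
        for i in order:
            if self.shards[i] > 0:   # same guard as the transaction, but per shard
                self.shards[i] -= 1  # in a real system: one small transaction on shard i
                return i
        return None

    def remaining(self):
        return sum(self.shards)


stock = ShardedStock(total_stock=100, num_shards=10, seed=1)
results = [stock.buy_one() for _ in range(105)]
sold = sum(r is not None for r in results)
print(sold, stock.remaining())
```

With 10 shards, each shard document would see roughly a tenth of the writes. The trade-off is that reading the total stock means summing all shards, and a buyer may have to try more than one shard when the event is nearly sold out.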

Is there a dataset for products (UPC/EAN level) and their recycling information?

I am looking to do some analysis around plastic recycling and am interested to know whether there is any dataset that gives recycling information for products sold in the US. For example: a product with a given UPC/EAN number has a resin code of 1 (the number written at the bottom of a plastic container). If you have any ideas on how to start creating such a dataset, that would be helpful as well. I understand there is something out there that gives information for a generic 1-gallon milk container, but I am looking for information at the brand/manufacturer level.
Thanks

Trouble Excluding Nodes from Graph

Set-Up
I'm very new to graph databases and neo4j/cypher and I'm having a hard time understanding how to exclude various pieces from my results. Below is an image of my graph. Every node and every relationship has an activeFrom and activeTo property to allow me to view the graph as it existed at any given point in history.
MATCH (:Collective:Company)<-[tree *0..4]-(downline:Collective) RETURN downline
(A relationship with a date has either already expired or is scheduled (future date) to expire. No date or a future date means it's active.)
Question
My ultimate goal here is to view this same graph, minus all expired nodes and relationships. Right now, I'm trying to build the query that will let me see that and am failing :(
What I'm not understanding is why:
Region5's relationship to Company1 is still active... why isn't the company showing up? (Shouldn't the 0-length path bring the company back, like in the first image?)
Both Office5 and Office27 have expired relationships, so why are they still in the result?
Offices 1, 2, 6, 9, and 11 are active nodes but have no active relationships, so why are they being returned? (My GUESS here is that my 2nd WHERE clause (the branch filter) is filtering out the relationships but not the nodes they connect, but I'm not sure how to do it differently.)
MATCH (:Collective:Company)<-[tree *0..4]-(downline:Collective)
WHERE
// -- node(s) are active
downline.activeFrom <= '2015-08-31 23:59:59'
AND (downline.activeTo IS NULL OR downline.activeTo > '2015-08-31 23:59:59')
UNWIND tree AS branch
WITH branch, downline
WHERE
// -- branch is active
branch.activeFrom <= '2015-08-31 23:59:59'
AND (branch.activeTo IS NULL OR branch.activeTo > '2015-08-31 23:59:59')
RETURN downline
Bonus
I've set up a neo4j sandbox with this data for you guys to play with if needed. Please be mature with this, as I don't know how to make it read only. Please don't go deleting data and messing things up for other people. I'm also personally paying for this cloud instance, so please don't abuse the VM/resources :)
You can access it here: (sorry, removed for security purposes now that question has been answered).
Based on your questions, I'm trying to piece together what you require, and I understand that you want to return only paths in which all nodes and relationships are active. I say this because you asked about Office 27 and Office 5, which are both active nodes, but their single relationship to Region 5 is inactive, so you do not want the paths Office 27->Region 5 and Office 5->Region 5.
Office 2, however, is active, and it has an active relationship to Region 4 (also active). Region 4 has an inactive relationship to Company 1, so since you don't expect Office 2 in the results, I'm assuming that's because there is an inactive relationship somewhere in the path?
If this is the case, here's a query that hopefully does what you want:
MATCH p=(:Collective:Company)<-[tree*0..4]-(downline:Collective)
WHERE
ALL(x in relationships(p) WHERE x.activeFrom <= '2015-08-31 23:59:59'
AND (x.activeTo IS NULL OR x.activeTo > '2015-08-31 23:59:59'))
AND ALL(x in nodes(p) WHERE x.activeFrom <= '2015-08-31 23:59:59'
AND (x.activeTo IS NULL OR x.activeTo > '2015-08-31 23:59:59'))
RETURN p
This makes sure that every relationship and every node in a path is active. To bring back Offices 2 and 1, change ALL to ANY and you'll see them in the results again, because partially active paths now match.
BTW, you could also set up your graph at http://console.neo4j.org/?init=0 and share it
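To see why the UNWIND version and the ALL version behave differently, here is a small Python model of the two filters. It is only a sketch: the five paths are a hand-built subset based on the descriptions above, every node is assumed to be active, and the dates and helper names (`rel`, `is_active`, `EXPIRED`) are made up for illustration.

```python
AS_OF = "2015-08-31 23:59:59"
EXPIRED = "2014-01-01 00:00:00"  # hypothetical expiry date in the past


def rel(active_to=None):
    """Build a relationship record; active_to=None means it never expires."""
    return {"activeFrom": "2010-01-01 00:00:00", "activeTo": active_to}


def is_active(item, as_of=AS_OF):
    """Same test as the Cypher predicate, using string comparison on timestamps."""
    return item["activeFrom"] <= as_of and (
        item["activeTo"] is None or item["activeTo"] > as_of
    )


# downline name -> relationships on the path, listed from the company downwards
paths = {
    "Company1": [],                   # the zero-length path
    "Region5": [rel()],               # active link to Company1
    "Region4": [rel(EXPIRED)],        # expired link to Company1
    "Office2": [rel(EXPIRED), rel()],  # active link to Region4, which is cut off
    "Office5": [rel(), rel(EXPIRED)],  # expired link to Region5
}

# UNWIND style: one row per relationship, keep the rows whose relationship is active.
# A path survives if ANY of its relationships is active, and an empty list yields no rows.
unwind_result = sorted(
    {name for name, rels in paths.items() for r in rels if is_active(r)}
)

# ALL style: keep a path only if every relationship on it is active.
# all() over an empty list is True, so the zero-length path is kept.
all_result = sorted(
    name for name, rels in paths.items() if all(is_active(r) for r in rels)
)

print(unwind_result)
print(all_result)
```

In this model the UNWIND filter drops Company1 (nothing to unwind) and keeps Office2 and Office5 (each has one active relationship somewhere on its path), while the ALL filter returns only Company1 and Region5.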

Can a value in AWS DynamoDB point to value in different table?

First off, I have very minimal experience with servers and databases (I have only used one once in my entire life and am only beginning to learn). Strictly speaking this is not a "code" question, because it concerns a concept in DynamoDB, but I am asking here because I cannot find an answer no matter how much I search!
I am trying to make an application where users can see if their friends are "online" or not. There will be a table that keeps track of the users who are online and offline like this:
user_id | online
1 | O
2 | X
3 | O
When user_id 1, who has friends 2 and 3, "refreshes", 1 would be able to see that 2 is offline and 3 is online. This would normally be done with batch_get in DynamoDB, but each item I read would count as one unit, meaning that if user 1 had 20 friends, one refresh would use up 20 read units. To me, that would cost too much, so I thought that if I made a second table holding, for each user, a list of their friends and whether each one is online, each refresh would cost only one read unit.
user_id | friends_on_off_line
1 | {2:X, 3:O}
2 | {1:O}
3 | {1:O}
However, the values in the list would have to be "pointers" to the first table, because I cannot update the values every time someone goes online or offline (if 1 went offline, I would have to write 1 as offline in both tables, and in the second table write it twice, using 3 write units, which would end up costing even more).
So I am trying to make the values in the second table point to the first table, so that a read would look up whether each friend is online/offline and return the values as a list using only 1 read unit, like this:
user_id | friends_on_off_line
1 | {pointer_to_2.online , pointer_to_3.online}
2 | {pointer_to_1.online}
3 | {pointer_to_1.online}
Is this possible in DynamoDB? If not, which service should I use and how can I make it possible?
Thanks in advance!
I don't think DynamoDB is the right tool for this kind of job.
SQL databases (MySQL, PostgreSQL) both make this design easy: just use joins (the "pointers").
You can also look at this question regarding this area for MongoDB.
What you should ask yourself is what the most common questions are that the database needs to answer, and what the update/read rate is. These questions usually point you in the right direction when picking a database.
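To make the cost trade-off in the question concrete, here is a rough Python sketch of the capacity arithmetic. It assumes DynamoDB's standard accounting: one read unit per strongly consistent read of an item up to 4 KB (half that for an eventually consistent read, with each item in a batch rounded up on its own), and one write unit per write of an item up to 1 KB. The item sizes and the helper names are made up for illustration.

```python
import math

RCU_BLOCK = 4 * 1024  # one read unit covers up to 4 KB of a single item
WCU_BLOCK = 1024      # one write unit covers up to 1 KB of a single item


def read_units(item_sizes, strongly_consistent=True):
    """Read units for fetching these items; each item is rounded up separately."""
    units = sum(max(1, math.ceil(size / RCU_BLOCK)) for size in item_sizes)
    return units if strongly_consistent else units / 2


def write_units(item_sizes):
    """Write units for writing each of these items once."""
    return sum(max(1, math.ceil(size / WCU_BLOCK)) for size in item_sizes)


STATUS_ITEM = 50   # assumed size in bytes of one "user_id | online" row
FRIEND_LIST = 400  # assumed size in bytes of one "friends_on_off_line" row
friends = 20

# Design 1: batch-get the status row of every friend on each refresh.
refresh_batch = read_units([STATUS_ITEM] * friends)
refresh_batch_eventual = read_units([STATUS_ITEM] * friends, strongly_consistent=False)

# Design 2: read one denormalized friend-list row on each refresh...
refresh_denormalized = read_units([FRIEND_LIST])
# ...but every status change rewrites the user's own row plus each friend's list.
status_change_denormalized = write_units([STATUS_ITEM] + [FRIEND_LIST] * friends)

print(refresh_batch, refresh_batch_eventual)
print(refresh_denormalized, status_change_denormalized)
```

Under these assumptions the denormalized table turns a 20-unit refresh into a 1-unit refresh, but each online/offline change costs 21 write units instead of 1, so the better design depends on whether refreshes or status changes happen more often.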

How to keep track of changing items in a stock portfolio?

I have a system where people can pick some stocks and it values their portfolios, but I'm having trouble doing this in an efficient way on a daily basis because I'm creating entries for days that don't have any changes (think of it as measuring the values while keeping version control, so I can track changes to the way the portfolio is designed).
Here's an example (each day's portfolio with stock name and weight):
Day1:
ibm = 10%
microsoft = 50%
google = 40%
Day5:
ibm = 20%
microsoft = 20%
google = 40%
cisco = 20%
I can measure the value of the portfolio on day 1 and understand that I need to measure it again on day 5 (when it changed), but how do I measure days 2-4 without recreating day 1's entry in the database?
My approach right now (which I don't like) is to create a temp entry in my database when someone changes the portfolio. Then, at the end of the day when I calculate the values, if there is a temp entry I use that; otherwise I create a new entry (for days 2-4) using the last day's data. The issue is that, as the data often doesn't change, I'm creating entries that are basically duplicates. The catch is: my stock data is all daily. I also thought of taking the portfolio and, if it hasn't been updated in 3 days, finding the returns of the last 3 days for each stock, but I wasn't sure if there was a better solution.
Any ideas? I think this is a straightforward problem, but I just can't see an efficient way of doing it.
Note: in finance terms, it's called creating a NAV, and most firms do it the inefficient way I'm doing it, but that's because the process was created like 50 years ago and hasn't changed. I think this problem is very similar to version control, but I can't seem to come up with a solution.
In storage terms it makes the most sense to just store:
UserId - StockId1 - 23% - 2012-06-25
UserId - StockId2 - 11% - 2012-06-26
UserId - StockId1 - 20% - 2012-06-30
So you see that stock 1 went down on the 30th. Now if you want to know the StockId1 percentage on the 28th, you just select:
SELECT *
FROM stocks
WHERE datecolumn <= DATE('2012-06-28') -- also filter on the user and stock columns
ORDER BY datecolumn DESC LIMIT 0,1
If it gives nothing back, you did not hold it; otherwise you get the last position back.
BTW, if you need, for example, a graph of stock 1, you could left join against a table full of dates. Then you can fill in the gaps easily.
I found this post, for example: "SQL QUERY replace NULL value in a row with a value from the previous known value":
UPDATE mytable
SET number = (@n := COALESCE(number, @n))
ORDER BY date;
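As a rough, runnable illustration of the "store only the changes, look up the latest row on or before a date" idea, here is a Python sketch using an in-memory SQLite database. The table name `weights`, its column names, the user id `u1`, and the helper functions are assumptions for the example; the three rows are the ones listed above.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE weights (user_id TEXT, stock_id TEXT, weight REAL, changed_on TEXT)"
)
# Only changes are stored, never one row per day.
conn.executemany(
    "INSERT INTO weights VALUES (?, ?, ?, ?)",
    [
        ("u1", "StockId1", 23.0, "2012-06-25"),
        ("u1", "StockId2", 11.0, "2012-06-26"),
        ("u1", "StockId1", 20.0, "2012-06-30"),
    ],
)


def weight_as_of(user_id, stock_id, day):
    """Latest stored weight on or before `day`, or None if none exists yet."""
    row = conn.execute(
        """
        SELECT weight FROM weights
        WHERE user_id = ? AND stock_id = ? AND changed_on <= ?
        ORDER BY changed_on DESC
        LIMIT 1
        """,
        (user_id, stock_id, day.isoformat()),  # ISO dates sort correctly as text
    ).fetchone()
    return row[0] if row else None


def daily_weights(user_id, stock_id, start, end):
    """Fill in every day from start to end (inclusive) from the stored changes."""
    days = [start + timedelta(days=n) for n in range((end - start).days + 1)]
    return [(d, weight_as_of(user_id, stock_id, d)) for d in days]


print(weight_as_of("u1", "StockId1", date(2012, 6, 28)))
series = daily_weights("u1", "StockId1", date(2012, 6, 24), date(2012, 6, 30))
for day, weight in series:
    print(day, weight)
```

The days without a stored row are filled at read time from the last change, so nothing has to be duplicated in the table. For long date ranges, a single query joined against a table of dates would avoid running one lookup per day.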

Resources