Social feed system design - google-app-engine

http://twitter.com/#!/ladygaga
When ladygaga tweet 1 message, does it mean to insert 1 data record for EACH of her followers (total 12,221,751)? So totally 12,221,751 records are inserted?
Any clues in designing such a social feed system?
------------------------------- Edit line -------------------------------
Real issue:
Performing SELECT tweet FROM Tweets IN ([FollowingIDs]) is not possible in google app engine, which limiting to a maximum of 30 items in the IN clause
While in app engine it actually means, performing 30 queries in parallel, which is not very wise to do so I guess.
Even if I am allowed to overtake the 30 limits,
what if I am subscribing to 10000 people? I am not sure if there are any performance issues to do it in MYSQL or any other kind of database infrastructure using the "IN clause"
(the bigtable of app engine is different from MYSQL)
So it is better to use the IN clause to query?
or setting up a UserFeed table for storing the feed relationship?
or 3rd method?
Database/SQL guru please help

Please see this talk from Google I/O 2009 to see how to handle these sort of cases on App Engine with a 'fan out' data structure.

Can you imagine it?
no
every people has list of followers
ID | FOLLOWER__ID
ladygaga | genesis
ladygaga | user
//php
$result = mysql_query("SELECT ID FROM followers WHERE FOLLOWER__ID = 'genesis';");
while($row = mysql_fetch_assoc($result)){
$select[] = $row['ID'];
}
$tweets = mysql_query("SELECT * FROM tweets WHERE owner IN (".implode(",", $select).")");

Related

Firebase Firestore Stock Count updating?

I am using firebase purely to integrate a simple ticket buying system.
I would think this is a very common scenario people have and wondering what the solutions are.
I have an issue with the write limit time, it means I can't keep the stock count updated.
Due to Firebase's 1 second write limit and the way transactions work, they keep timing out when there is a large buy of tickets at one point in time.
For example:
Let's say we have a simple ticket document like this
{
name: "Taylor Bieber Concert"
stock: 100
price: 1000
}
I use a firebase transaction server side that does (pseudo)
transaction{
ticket = t.get(ticketRef).data() //get the data of ticketRef doc
guard (ticket .stock > 0) else return //check the stock is more than 0
t.update(ticketRef, {stock : increment(-1) }) //update the document and remove 1 stock value
}
The transaction and functionality all works however if I get 20-100 people trying to buy a ticket as it releases, it goes into contention it seems and times out a bunch of the requests...
Is there a way to avoid these timeouts? Some sort of queue or something?
I have tried using Transactions server-side in firebase functions to update the stock value, when many people try to purchase the product simultaneously it leads to majority of the transactions being locked out / Aborted Code 10

Database logic for user achievments

I have a table in my database that stores users, for example:
userID | userName | email | password | wins | losses | exp
Now I want the user to be able to get achievments in my game, like "win 5 games in a row", and I obviously want that progress in the database (Google app-engine) so progress is not lost when user exits client. Example of achievment table:
achievmentID | achievmentTitle | description | reward
Now how would I go about saving achievment progress for each user in the best manner? I need to save both progress (like 3/5 games in a row won) and if achievment is completed or not.
The product is for Android/iOS and uses google app engine (datastore) as database.
The way you set up your table would not be very efficient. In my mind to be the most efficient, you would have to make a new column on your users table (not new table, but after the 'exp') and create a sort of key for achievements. For example, you could give each achievement an ID (which you would keep track of like in a notes on notepad or something).
Then, when they get that achievement, you would put "123/" and if you did another achievement, it would say something like "123/461/".
Then you could make a script that breaks apart these IDs to see what achievements have been completed.

Can a value in AWS DynamoDB point to value in different table?

First off, I have very minimal experience with servers and databases (I have only used it once in my entire life and only beginning to learn) and this would not exactly be a "code" question strictly speaking because it is a question concerning a concept regarding DynamoDB.. But here it is because I cannot find answer to it no matter how much I search!
I am trying to make an application where users can see if their friends are "online" or not. There will be a table that keeps track of the users who are online and offline like this:
user_id | online
1 | O
2 | X
3 | O
and when user_id 1 who has friends 2 & 3 "refreshes", 1 would be able to see that 2 is offline and 3 is online. This would normally be done by batch_get in dynamodb, but each item I read would count as one unit, meaning if user1 had 20 friends, one refresh would use up 20 read units. To me, that would cost too much, and I thought that if I made a table for each user that would hold list of their friends that shows whether they are online or not, each refresh would cost only one read unit.
user_id | friends_on_off_line
1 | {2:X, 3:O}
2 | {1:O}
3 | {1:O}
However, the values in the list would have to be a "pointer" to the first table, because I cannot update the value everytime someone goes online or offline (if 1 went offline, I would have to write 1 as offline to both tables, and in second table, write it twice, using 3 write units which would end up costing even more)
So I am trying to make it so that in second table, values would point to the first table that would read whether they are online/offline and return the values as a list using only 1 read unit: like this
user_id | friends_on_off_line
1 | {pointer_to_2.online , pointer_to_3.online}
2 | {pointer_to_1.online}
3 | {pointer_to_1.online}
Is this possible in DynamoDB? If not, which service should I use and how can I make it possible?
Thanks in advance!
I don't think DynamoDB is the right tool for this kind of job.
SQL databases (Mysql/PostgreSQL) both have easy designs - just use joins (pointers).
You can also look at this question regarding this area for MongoDB.
What you should ask yourself is what are the most common questions the database needs to answer and what is the update / read rate. This questions usually navigate you to the right direction when picking up a database.

How to keep track changing items in a stock portfolio?

I have a system where people can pick some stocks and it values their portfolios but I'm having trouble doing this in a efficient way on a daily basis because I'm creating entries for days that don't have any changes(think of it like I'm measuring the values and having version control so I can track changes to the way the portfolio is designed).
Here's a example(each day's portfolio with stock name and weight):
Day1:
ibm = 10%
microsoft = 50%
google = 40%
day5:
ibm = 20%
microsoft = 20%
google = 40%
cisco = 20%
I can measure the value of the portfolio on day1 and understand I need to measure it again on day5(when it changed) but how do I measure day2-4 without recreating day1's entry in the database?
My approach right now(which I don't like) is to create a temp entry in my database for when someone changes the portfolio and then at the end of the day when I calculate the values if there is a temp entry I use that otherwise I create a new entry(for day2-4) using the last days data. The issue is as data often doesn't change I'm creating entries that are basically duplicates. The catch is: my stock data is all daily. I also thought of taking the portfolio and if it hasn't been updated in 3 days to find the returns of the last 3 days for each stock but I wasn't sure if there was a better solution.
Any ideas? I think this is a straight forward problem but I just can't see a efficient way of doing it.
note: in finance terms, its called creating a NAV and most firms do it the inefficient way I'm doing it but its because the process was created like 50 years ago and hasn't changed. I think this problem is very similar to version control but I can't seem to make a solution.
In storage terms is makes most sense to just store:
UserId - StockId1 - 23% - 2012-06-25
UserId - StockId2 - 11% - 2012-06-26
UserId - StockId1 - 20% - 2012-06-30
So you see that stock 1 went down at 30th. Now if you want to know the StockId1 percentage at the 28th you just select:
SELECT *
FROM stocks
WHERE datecolumn<=DATE(2012-06-28)
ORDER BY datecolumn DESC LIMIT 0,1
If it gives nothing back you did not have it, otherwise you get the last position back.
BTW. if you need for example a graph of stock 1 you could left join against a table full of dates. Then you can fill in the gaps easily.
Found this post here for example:
UPDATE mytable
SET number = (#n := COALESCE(number, #n))
ORDER BY date;
SQL QUERY replace NULL value in a row with a value from the previous known value

What is a viable local database for Windows Phone 7 right now?

I was wondering what is a viable database solution for local storage on Windows Phone 7 right now. Using search I stumbled upon these 2 threads but they are over a few months old. I was wondering if there are some new development in databases for WP7. And I didn't found any reviews about the databases mentioned in the links below.
windows phone 7 database
Local Sql database support for Windows phone 7
My requirements are:
It should be free for commercial use
Saving/updating a record should only save the actual record and not the entire database (unlike WinPhone7 DB)
Able to fast query on a table with ~1000 records using LINQ.
Should also work in simulator
EDIT:
Just tried Sterling using a simple test app: It looks good, but I have 2 issues.
Creating 1000 records takes 30 seconds using db.Save(myPerson). Person is a simple class with 5 properties.
Then I discovered there is a db.SaveAsync<Person>(IList) method. This is fine because it doesn't block the current thread anymore.
BUT my question is: Is it save to call db.Flush() immediately and do a query on the currently saving IList? (because it takes up to 30 seconds to save the records in synchronous mode). Or do I have to wait until the BackgroundWorker has finished saving?
Query these 1000 records with LINQ and a where clause the first time takes up to 14 sec to load into memory.
Is there a way to speed this up?
Here are some benchmark results: (Unit tests was executed on a HTC Trophy)
-----------------------------
purging: 7,59 sec
creating 1000 records: 0,006 sec
saving 1000 records: 32,374 sec
flushing 1000 records: 0,07 sec
-----------------------------
//async
creating 1000 records: 0,04 sec
saving 1000 records: 0,004 sec
flushing 1000 records: 0 sec
-----------------------------
//get all keys
persons list count = 1000 (0,007)
-----------------------------
//get all persons with a where clause
persons list with query count = 26 (14,241)
-----------------------------
//update 1 property of 1 record + save
persons list with query count = 26 (0,003s)
db saved (0,072s)
You might want to take a look at Sterling - it should address most of your concerns and is very flexible.
http://sterling.codeplex.com/
(Full disclosure: my project)
try Siaqodb is commercial project and as difference from Sterling, not serialize objects and keep all in memory for query.Siaqodb can be queried by LINQ provider which efficiently can pull from database even only fields values without create any objects in memory, or load/construct only objects that was requested.
Perst is free for non-commercial use.
You might also want to try Ninja Database Pro. It looks like it has more features than Sterling.
http://www.kellermansoftware.com/p-43-ninja-database-pro.aspx

Resources