I am a fairly new web developer and need your help with a project I am currently working on. I have worked in the past on a very simple Realtime Database example and have little to no experience with Firestore or NoSQL in general.
I want to create a system which allows end users to get an email once a week that contains a list of special offers from bars the end user has subscribed to. The offers change each day of the week. Bar owners can fill out a form in a Vue.js web application every week with their weekly special offers.
Every Monday morning a cron job has to look up which end user has subscribed to which bars and then aggregate the data and send it via email.
The question is how would you structure the data so that I can easily compose the email and send it via a cloud function?
My approach would be to have three main collections: BarOwner, EndUser, SpecialOffers
BarOwner and EndUser are pretty straightforward. However, the difficult part is how to structure the SpecialOffers so they can be queried the right way.
My idea would be to structure it based on the calendar week and link it to the uid from the barOwner:
specialOffers: {
  2019_CW27: {
    barUID001: {
      mon: { title: 'Banana Daiquiri', price: 4.99 },
      tue: { title: 'After Five', price: 2.99 },
      wed: { title: 'Cool Colada', price: 6.99 },
      thu: { title: 'Crantini', price: 5.99 },
      fri: { title: 'French Martini', price: 4.99 }
    },
    barUID002: {
      mon: { title: 'Gin & Tonic', price: 8.99 },
      tue: { title: 'Crantini', price: 4.99 },
      wed: { title: 'French Martini', price: 4.99 },
      thu: { title: 'After Five', price: 3.99 },
      fri: { title: 'Cool Colada', price: 6.99 }
    }
  },
  2019_CW28: {
    barUID001: { ... },
    barUID002: { ... }
  }
}
The disadvantage of this approach is that it creates a deeply nested object when you imagine 52 calendar weeks with, e.g., 100 signed-up bars at 5 special offers per week each, and I am not sure whether I can query it the way I need to.
Is this approach reasonable or what would you do differently?
Thank you so much for your help! I highly appreciate it.
I'm assuming the following scenarios:
1) The bar owners make modifications to their offers very often.
2) The bar owners should be the only ones allowed to modify each bar's offers.
If you have these two scenarios, I would recommend a sub-collections approach here.
When to use sub-collections:
1) When there are a lot of fields in a document. Cloud Firestore has a 20,000-field limit per document. (This applies if the number of bars can exceed 20,000 fields.)
2) When updating the parent collection is a common operation. Firestore only lets you update a document at a rate of about 1 write per second. (This applies if the SpecialOffers information of each bar is modified very often: if two bar owners modify their offers at once, only one write succeeds immediately and the second write operation waits until the first is completed. This can delay offer updates, particularly at the end of a week when almost all the bars update their offers.)
3) When you want to limit the access to particular fields of a document. (If you want to restrict the access to a Bar's Offers to the barOwner alone. You can restrict the access to each document in the Bars sub-collection according to its owner using Firestore Security Rules)
So I would recommend a sub-collection Bars under the main collection SpecialOffers. This way the design becomes scalable, and you can add restaurants and supermarkets as other similar sub-collections in the future without heavily altering your design.
Another advantage is that sub-collections are basically collections, and they don't have a limit on the number of documents they can hold. So even if the number of registered bars exceeds 20,000 (the field limit for a Firestore document), your sub-collection won't have a problem, whereas a single document would run out of fields to save the offers for a new bar.
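To make the Monday cron job concrete, here is a minimal sketch of the weekly read with this layout, assuming a structure of SpecialOffers/{week}/Bars/{barUID} and the firebase-admin SDK; the collection and field names are illustrative, not prescribed:

const admin = require('firebase-admin');
admin.initializeApp();
const db = admin.firestore();

// Fetch the offers of all bars a user has subscribed to for a given week,
// e.g. week = '2019_CW27'.
async function getOffersForWeek(week, subscribedBarUIDs) {
  const snapshots = await Promise.all(
    subscribedBarUIDs.map((uid) =>
      db.collection('SpecialOffers').doc(week).collection('Bars').doc(uid).get()
    )
  );
  return snapshots
    .filter((snap) => snap.exists)
    .map((snap) => ({ barUID: snap.id, ...snap.data() })); // { mon: {...}, tue: {...}, ... }
}

The result can then be grouped per subscriber to compose the email body inside a scheduled Cloud Function.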
Ultimately the choice depends on your use cases.
Hope this helps.
I'm going to create an achievement system in MongoDB, but I'm not sure how I should format/store it in the database.
Users should have progress stored for each achievement (some progress value per achievement), and I'm really confused about the best way to do this without running into performance issues.
What should I do? What I had in mind was maybe something like this:
Should I store each achievement as a unique document in an Achievements collection, with a users array within that document containing objects with a user ID and the achievement progress?
Would I then get a performance issue when there are 1000+ achievements being checked fairly often?
Or should I do something else?
example schema for the option above:
{
  name: {
    type: String,
    default: 'Achievement name'
  },
  users: [
    {
      userid: {
        type: String,
        default: ' users id here'
      },
      progress: {
        type: Number,
        default: 0
      }
    }
  ]
}
Even though the question is specifically about the database design, I will give a solution for the tracking/awarding logic as well to establish more accurate context for the db design.
I would store the achievements progress separately from the already awarded achievements for cleaner tracking and discovery.
The whole logic is event based and has multiple layers of event handling. This gives you TONS of flexibility on how you track your data and also gives you a pretty good mechanism to track history. Basically, you can look at it as a form of logging.
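To make "multiple layers of event handling" concrete, here is a minimal runnable sketch using Node's built-in EventEmitter; the emitter is just an assumption for illustration, and any pub/sub mechanism would do:

const { EventEmitter } = require('events');
const bus = new EventEmitter();

// Each layer subscribes only to the event type it cares about
// and emits the next event further down the line.
bus.on('ACTION', (event) => {
  // update tracking data here, then:
  bus.emit('ACHIEVEMENT_PROGRESS', { name: 'PROGRESS_INCREASE', payload: { /* ... */ } });
});
bus.on('ACHIEVEMENT_PROGRESS', (event) => {
  // run the achievement checkers, possibly emit 'ACHIEVEMENT_AWARD'
});
bus.on('ACHIEVEMENT_AWARD', (event) => {
  // persist the award / write the history log
});

bus.emit('ACTION', { name: 'USER_LOGIN', payload: { userID: 7, date: new Date() } });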
Of course, your system design and contracts are highly dependent on the information you're gonna be tracking and its complexity. A simple progress field may not suffice for every case (you might want to track something more complex than a simple number between X and Y). There is also the case of tracking data which updates quite frequently (such as distance travelled in games, for example). You didn't give any context on the topic of your achievement system, so we're gonna stick with a generic solution. It's just a couple of things that you should take note of, as they will affect the design.
Okay, so, let's start from the top and track the entire flow for a tracked piece of data and its eventual achievement progress. Let's say we're tracking consecutive days of user login and we're gonna award him with an achievement when he reaches [10].
Note that everything below is just a pseudo-code.
So, let's say today is [8th of July, 2017]. For now, our User entity looks like this:
User: {
  id: 7,
  trackingData: {
    lastLogin: 7 of July, 2017 (should be a full DateTime object, but using this for brevity),
    consecutiveDays: 9
  },
  achievementProgress: [
    {
      achievementID: 10,
      progress: 9
    }
  ],
  achievements: []
}
And our achievements collection contains the following entity:
Achievement: {
  id: 10,
  name: '10 Consecutive Days',
  rewardValue: 10
}
The user tries to log in (or visit the site). The application handler takes note of that and, after handling the login logic, fires an event of type ACTION:
ACTION_EVENT = {
  type: ACTION,
  name: USER_LOGIN,
  payload: {
    userID: 7,
    date: 8 of July, 2017 (should be a full DateTime object, but using this for brevity)
  }
}
We have an ActionHandler which listens for events of type ACTION:
ActionHandler.handleEvent(actionEvent) {
  subscribersMap = Map<eventName, handlers>;
  subscribersMap[actionEvent.name].forEach(subscriber => subscriber.execute(actionEvent.payload));
}
subscribersMap gives us a collection of handlers that should respond to each specific action (this should resolve to USER_LOGIN for us). In our case we can have 1 or 2 handlers that concern themselves with updating the lastLogin and consecutiveDays tracking properties in the user entity. The handlers in our case will update the tracking information and fire new events further down the line.
Once again, for brevity, we're gonna incorporate both into one:
updateLoginHandler: function(payload) {
  user = db.getUser(payload.userID);
  let eventType;
  let eventValue;
  if (payload.date - user.trackingData.lastLogin > 1 day) {
    user.trackingData.consecutiveDays = 1;
    eventType = 'PROGRESS_RESET';
    eventValue = 1;
  }
  else {
    const newValue = user.trackingData.consecutiveDays + 1;
    user.trackingData.consecutiveDays = newValue;
    eventType = 'PROGRESS_INCREASE';
    eventValue = newValue;
  }
  user.trackingData.lastLogin = payload.date;
  /* DISPATCH NEW EVENT OF TYPE ACHIEVEMENT_PROGRESS */
  AchievementProgressHandler.dispatch({
    type: ACHIEVEMENT_PROGRESS,
    name: eventType,
    payload: {
      userID: payload.userID,
      achievementID: 10,
      value: eventValue
    }
  });
}
Here, PROGRESS_RESET has the same contract as PROGRESS_INCREASE but has a different semantic meaning, and I would keep them separate for history/tracking purposes. If you wish, you can combine them into a single PROGRESS_UPDATE event.
Basically, we update the tracked fields that depend on the lastLogin date and fire a new ACHIEVEMENT_PROGRESS event, which should be handled by a separate handler following the same pattern (AchievementProgressHandler). In our case:
ACHIEVEMENT_PROGRESS_EVENT = {
  type: ACHIEVEMENT_PROGRESS,
  name: PROGRESS_INCREASE,
  payload: {
    userID: 7,
    achievementID: 10,
    value: 10
  }
}
Then, in AchievementProgressHandler we follow the same pattern:
AchievementProgressHandler: function(event) {
  achievementCheckers = Map<achievementID, achievementChecker>;
  /* update user.achievementProgress code */
  switch (event.name) {
    case 'PROGRESS_INCREASE':
      achievementCheckers[event.payload.achievementID].execute(event.payload);
      break;
    case 'PROGRESS_RESET':
      ...
  }
}
achievementCheckers contains a checker function for each specific achievement that decides whether the achievement has reached its desired value (a progress of 100%) and should be awarded. This enables us to handle all kinds of complex cases. If you only track a simple X-out-of-Y scenario, you can share the function between all achievements.
The handler basically does this:
achievementChecker: function(payload) {
  achievementAwardHandler;
  achievement = db.getAchievement(payload.achievementID);
  if (payload.value >= achievement.rewardValue) {
    achievementAwardHandler.dispatch({
      type: ACHIEVEMENT_AWARD,
      name: ACHIEVEMENT_AWARD,
      payload: {
        userID: payload.userID,
        achievementID: payload.achievementID,
        awardedAt: [current date]
      }
    });
    /* Here you can clear the entry from user.achievementProgress as you no longer need it. You can also move this inside the achievementAwardHandler. */
  }
}
We once again dispatch an event and use an event handler - achievementAwardHandler. You can skip the event creation step and award the achievement to the user directly but we keep it consistent with the whole history logging flow.
An added benefit here is that you can use the handler to defer the achievement awarding to a specific later time, thus effectively batching awarding for multiple users, which serves a couple of purposes, including performance.
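A hedged sketch of that deferral idea, reusing the event-bus style from the sketch above and assuming a connected MongoDB db handle and an awards collection (both illustrative): award events are buffered and flushed periodically in a single bulk write.

const pendingAwards = [];

bus.on('ACHIEVEMENT_AWARD', (event) => {
  pendingAwards.push(event.payload); // defer instead of writing immediately
});

setInterval(async () => {
  if (pendingAwards.length === 0) return;
  const batch = pendingAwards.splice(0); // take everything queued so far
  await db.collection('awards').insertMany(batch); // one bulk write for many users
}, 60 * 1000);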
Basically, this pseudo-code handles the flow from [a user action] to [achievement rewarding] with all intermediate steps included. It's not set in stone; you can modify it as you like, but all in all it gives you a clean separation of concerns, cleaner entities, good performance, and complex checks and handlers that are easy to reason about, while at the same time providing a great history log of the user's overall progress.
Regarding the DB schema entities, I would suggest the following:
User: {
  id: any,
  trackingData: {},
  achievementProgress: {} || [],
  achievements: []
}
Where:
trackingData is an object that contains everything you're willing to track about the user. The beauty is that properties here are independent from achievement data. You can track whatever you want and eventually use it for achievement purposes.
achievementProgress: a map of <key: achievementID, value: data> or an array containing the current progress for each achievement.
achievements: an array of awarded achievements.
and Achievement:
Achievement: {
  id: any,
  name: any,
  rewardValue: any (or any other field/fields; you have complete freedom to introduce any kind of tracking with the approach above),
  users?: [
    {
      userID: any,
      awardedAt: date
    }
  ]
}
users is a collection of users who have been awarded the given achievement. This is optional and is here only if you have a use for it and query this data frequently.
What you might be looking for is a Badge-style implementation, just like Stack Overflow rewards its users with badges for specific achievements.
Method 1: You can have flags in the user profile for each badge. Since you're doing it in a NoSQL database, you just have to set a flag for each badge.
const badgeSchema = new mongoose.Schema({
  badgeName: {
    type: String,
    required: true,
  },
  badgeDescription: {
    type: String,
    required: true,
  }
});

const userSchema = new mongoose.Schema({
  userName: {
    type: String,
    required: true,
  },
  badges: {
    type: [Object],
    required: true,
  }
});
If your application architecture is event-based, you can trigger awarding badges to users. That operation is just inserting a Badge object with progress into the User's badges array:
{
  badgeId: ObjectId("602797c8242d59d42715ba2c"),
  progress: 10
}
The update operation will find the entry and update the badges array with the progress percentage number.
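For example, with the schemas above, a hedged sketch of that update using Mongoose's positional $ operator (the model registration and the 50% value are illustrative):

const User = mongoose.model('User', userSchema);

// Find the user's entry for this badge and update its progress in place.
await User.updateOne(
  { _id: userId, 'badges.badgeId': badgeId },
  { $set: { 'badges.$.progress': 50 } }
);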
And while displaying user achievements in the user interface, you can just loop over the badges array to show the badges this user has achieved and their progress.
Method 2: Have a separate Mongo collection for the Badge and User mapping. Whenever a user achieves a badge, you insert a record into that collection. It will be a one-to-one mapping of user _id, badge _id, and a progress value. But as the collection grows huge, you will need indexing to query the user-badge mapping efficiently; a possible shape is sketched below.
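One possible shape for that mapping collection (all names are illustrative); the unique compound index keeps user/badge lookups fast as the collection grows:

const userBadgeSchema = new mongoose.Schema({
  userId: { type: mongoose.Schema.Types.ObjectId, ref: 'User', required: true },
  badgeId: { type: mongoose.Schema.Types.ObjectId, ref: 'Badge', required: true },
  progress: { type: Number, default: 0 },
});

// One document per user/badge pair, indexed for efficient lookups.
userBadgeSchema.index({ userId: 1, badgeId: 1 }, { unique: true });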
You will have to do analysis on best approach according to your specific use case.
MongoDB is flexible enough to allow teams to develop applications quickly and evolve their model with little friction as the application needs it. In cases where you need a robust model from day one, there is a flexible methodology that can guide you through the process of modeling your data.
The methodology is composed of:
Workload: This stage is about gathering as much information as possible to understand your data. This will allow you to formulate assumptions about your data size and the operations that will be performed against it (reads and writes), and to quantify and qualify those operations.
You can get this by:
Scenarios
Prototype
Production Logs & Stats (if you are migrating).
Relationships: Identify the relationship between the different entities in your data, quantify those relationships and apply embedding or linking. In general you should prefer embedding by default, but remember that arrays should not grow without bound (6 Rules of Thumb for MongoDB Schema Design: Part 3).
Patterns: Apply schema design patterns. Take a look at Building with Patterns: A Summary, it presents a matrix that highlights the pattern that could be useful for a given use case.
Finally, the goal of this methodology is to help you create a model that can scale and perform well under stress.
If you design the achievement schema like this:
{
  name: {
    type: String,
    default: "Achievement name",
  },
  userid: {
    type: String,
    default: " users id here",
  },
  progress: {
    type: Number,
    default: 0,
  },
}
When an achievement is gained, you just add another entry.
For reading the achievements back, map-reduce is a good candidate: you can run it on a less regular basis, using it for offline computation of the data that you want; the MongoDB documentation shows how to set it up. A sketch using the aggregation pipeline follows below.
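Assuming the one-document-per-user-per-achievement schema above is registered as an Achievement model, the aggregation pipeline (the modern alternative to map-reduce) could summarize a user's progress like this; the 100-progress threshold is an illustrative assumption:

// Summarize one user's achievements offline: how many are tracked, how many are complete.
const summary = await Achievement.aggregate([
  { $match: { userid: someUserId } },
  {
    $group: {
      _id: '$userid',
      tracked: { $sum: 1 },
      completed: { $sum: { $cond: [{ $gte: ['$progress', 100] }, 1, 0] } },
    },
  },
]);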
In my DB schema, I need an auto-increment primary key. How can I realize this feature?
PS: For access to DynamoDB, I use dynode, a module for Node.js.
Disclaimer: I am the maintainer of the Dynamodb-mapper project
Intuitive workflow of an auto-increment key:
get the last counter position
add 1
use the new number as the index of the object
save the new counter value
save the object
This is just to explain the underlying idea. Never do it this way: it's not atomic, so under certain workloads you may allocate the same ID to 2+ different objects, which would result in data loss.
The solution is to use the atomic ADD operation along with ALL_NEW of UpdateItem:
atomically generate an ID
use the new number as the index of the object
save the object
In the worst-case scenario, the application crashes before the object is saved, but you never risk allocating the same ID twice.
There is one remaining problem: where to store the last ID value? We chose:
{
  "hash_key": -1, // 0 was judged too risky as it is the default value for integers
  "__max_hash_key__": N
}
Of course, to work reliably, all applications inserting data MUST be aware of this system; otherwise you might (again) overwrite data.
The last step is to automate the process. For example:
When hash_key is 0:
atomically_allocate_ID()
actual_save()
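For reference, a minimal Node.js sketch of the atomic allocation step using the AWS SDK's DocumentClient; the table layout follows the counter row above, and none of this is part of dynamodb-mapper itself:

const AWS = require('aws-sdk');
const docClient = new AWS.DynamoDB.DocumentClient();

async function atomicallyAllocateId(tableName) {
  const result = await docClient.update({
    TableName: tableName,
    Key: { hash_key: -1 }, // the reserved counter row
    UpdateExpression: 'ADD #max :one', // atomic increment
    ExpressionAttributeNames: { '#max': '__max_hash_key__' },
    ExpressionAttributeValues: { ':one': 1 },
    ReturnValues: 'UPDATED_NEW', // return the freshly incremented value
  }).promise();
  return result.Attributes['__max_hash_key__'];
}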
For implementation details (Python, sorry), see https://bitbucket.org/Ludia/dynamodb-mapper/src/8173d0e8b55d/dynamodb_mapper/model.py#cl-67
To tell you the truth, my company does not use it in production because most of the time it is better to find a natural key: for a user, an ID; for a transaction, a datetime; ...
I wrote some examples in dynamodb-mapper's documentation, and they can easily be extrapolated to Node.js.
If you have any question, feel free to ask.
Another approach is to use a UUID generator for primary keys, as these are highly unlikely to clash.
IMO you are more likely to experience errors consolidating primary key counters across highly available DynamoDB tables than from clashes in generated UUIDs.
For example, in Node:
npm install uuid
var uuid = require('uuid');
// Generate a v1 (time-based) id
uuid.v1(); // -> '6c84fb90-12c4-11e1-840d-7b25c5ee775a'
// Generate a v4 (random) id
uuid.v4(); // -> '110ec58a-a0f2-4ac4-8393-c866d813b8d1'
Taken from SO answer.
If you're okay with gaps in your incrementing id, and you're okay with it only roughly corresponding to the order in which the rows were added, you can roll your own: Create a separate table called NextIdTable, with one primary key (numeric), call it Counter.
Each time you want to generate a new id, you would do the following:
Do a GetItem on NextIdTable to read the current value of Counter --> curValue
Do a PutItem on NextIdTable to set the value of Counter to curValue + 1. Make this a conditional PutItem so that it will fail if the value of Counter has changed.
If that conditional PutItem failed, it means someone else was doing this at the same time as you were. Start over.
If it succeeded, then curValue is your new unique ID.
Of course, if your process crashes before actually applying that ID anywhere, you'll "leak" it and have a gap in your sequence of IDs. And if you're doing this concurrently with some other process, one of you will get value 39 and one of you will get value 40, and there are no guarantees about which order they will actually be applied in your data table; the guy who got 40 might write it before the guy who got 39. But it does give you a rough ordering.
Parameters for a conditional PutItem in Node.js are detailed here: http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/frames.html#!AWS/DynamoDB.html. If you had previously read a value of 38 from Counter, your conditional PutItem request might look like this:
var conditionalPutParams = {
  TableName: 'NextIdTable',
  Item: {
    Counter: {
      N: '39'
    }
  },
  Expected: {
    Counter: {
      AttributeValueList: [
        {
          N: '38'
        }
      ],
      ComparisonOperator: 'EQ'
    }
  }
};
For those coding in Java, DynamoDBMapper can now generate unique UUIDs on your behalf.
DynamoDBAutoGeneratedKey
Marks a partition key or sort key property as being auto-generated. DynamoDBMapper will generate a random UUID when saving these attributes. Only String properties can be marked as auto-generated keys.
Use the DynamoDBAutoGeneratedKey annotation like this
@DynamoDBTable(tableName = "AutoGeneratedKeysExample")
public class AutoGeneratedKeys {
    private String id;

    @DynamoDBHashKey(attributeName = "Id")
    @DynamoDBAutoGeneratedKey
    public String getId() { return id; }

    public void setId(String id) { this.id = id; }
}
As you can see in the example above, you can apply both the DynamoDBAutoGeneratedKey and DynamoDBHashKey annotations to the same attribute to generate a unique hash key.
Addition to @yadutaf's answer
AWS supports Atomic Counters.
Create a separate table (order_id) with a row holding the latest order_number:
+----+--------------+
| id | order_number |
+----+--------------+
| 0 | 5000 |
+----+--------------+
This will allow you to increment order_number by 1 and get the incremented result in a callback from AWS DynamoDB:
const AWS = require('aws-sdk');

const config = {
  region: 'us-east-1',
  endpoint: "http://localhost:8000"
};
const docClient = new AWS.DynamoDB.DocumentClient(config);

const params = {
  TableName: 'order_id',
  Key: {
    "id": 0
  },
  UpdateExpression: "set order_number = order_number + :val",
  ExpressionAttributeValues: {
    ":val": 1
  },
  ReturnValues: "UPDATED_NEW"
};

docClient.update(params, function(err, data) {
  if (err) {
    console.log("Unable to update the table. Error JSON:", JSON.stringify(err, null, 2));
  } else {
    console.log(data);
    console.log(data.Attributes.order_number); // <= here is our incremented result
  }
});
🛈 Be aware that in some rare cases there might be problems with the connection between your calling point and the AWS API. In that case the DynamoDB row is still incremented while you get a connection error, so some incremented values may go unused.
You can use the incremented data.Attributes.order_number in your table, e.g. to insert {id: data.Attributes.order_number, otherfields: {}} into the order table.
I don't believe it is possible to do a SQL-style auto-increment because the tables are partitioned across multiple machines. I generate my own UUID in PHP, which does the job; I'm sure you could come up with something similar in JavaScript.
I've had the same problem and created a small web service just for this purpose. See this blog post that explains how I'm using stateful.co with DynamoDB in order to simulate auto-increment functionality: http://www.yegor256.com/2014/05/18/cloud-autoincrement-counters.html
Basically, you register an atomic counter at stateful.co and increment it every time you need a new value, through RESTful API. The service is free.
Auto-increment is not good from a performance perspective, as it overloads specific partitions while keeping others idle and doesn't distribute data evenly in DynamoDB.
awsRequestId looks like it's actually a v4 (random) UUID; here is a code snippet to try it:
exports.handler = function(event, context, callback) {
  console.log('remaining time =', context.getRemainingTimeInMillis());
  console.log('functionName =', context.functionName);
  console.log('AWSrequestID =', context.awsRequestId);
  callback(null, context.functionName);
};
In case you want to generate this yourself, you can use https://www.npmjs.com/package/uuid or ulid to generate different versions of UUID based on RFC 4122:
V1 (timestamp based)
V3 (Namespace)
V4 (Random)
For Go developers, you can use the UUID packages from Google, Pborman, or Satori. Pborman is better in performance; check these articles and benchmarks for more details.
More info on the Universally Unique Identifier specification can be found here.
Create a new file.js and put this code in it:
exports.guid = function () {
  // Returns 8 pseudo-random hex characters; with `s` set, wraps two
  // 4-character halves in dashes.
  function _p8(s) {
    var p = (Math.random().toString(16) + "000000000").substr(2, 8);
    return s ? "-" + p.substr(0, 4) + "-" + p.substr(4, 4) : p;
  }
  // Concatenate three random segments plus today's date, then strip the dashes.
  return (_p8() + _p8(true) + _p8(true) + new Date().toISOString().slice(0, 10)).replace(/-/g, "");
};
Then you can apply this function to the primary key id. It will generate the UUID.
In case you are using DynamoDB with the Dynamoose ORM, you can easily set a default unique ID. Here is a simple user-creation example:
// User.model.js
const dynamoose = require("dynamoose");

const userSchema = new dynamoose.Schema(
  {
    id: {
      type: String,
      hashKey: true,
    },
    displayName: String,
    firstName: String,
    lastName: String,
  },
  { timestamps: true },
);

const User = dynamoose.model("User", userSchema);
module.exports = User;
// User.controller.js
const { v4: uuidv4 } = require("uuid");
const User = require("./user.model");

// `to`, `badRes` and `goodRes` are app-level helpers assumed to exist
// elsewhere (await-to-js style error handling and response formatting).
exports.create = async (req, res) => {
  const user = new User({ id: uuidv4(), ...req.body }); // set unique id
  const [err, response] = await to(user.save());
  if (err) {
    return badRes(res, err);
  }
  return goodRes(res, response);
};
Instead of using UUIDs, use KSUIDs for IDs; they are naturally ordered by generation time.
https://www.npmjs.com/package/ksuid?activeTab=readme
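A minimal sketch with that package (API as documented on its npm page):

const KSUID = require('ksuid');

async function newId() {
  // KSUIDs embed a timestamp, so lexicographic order roughly matches creation order.
  const id = await KSUID.random();
  return id.string; // e.g. '0ujsswThIGTUYm2K8FjOOfXtY1K'
}

// Or synchronously:
const idSync = KSUID.randomSync().string;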
I have a table with a location column and "count" column (with values from 1 to 100).
I'd like to map the records with markers that change in size, i.e. the bigger the count value is, the bigger the marker is.
Is that possible in Google Fusion? How would you suggest to do that?
Thanks.
Currently there are only 2 sizes of icons available: small and large. I put together a little example to show you how to use them together with the FusionTablesLayer, a special layer for Google Maps that you can use to query your Google Fusion Tables.
FusionTablesLayer allows you to apply a style to your data (markers, lines or polygons); it boils down to this:
layer = new google.maps.FusionTablesLayer({
  query: {
    select: 'Location',
    from: '3609183'
  },
  styles: [
    {
      where: "Number > 1000",
      markerOptions: {
        iconName: 'large_green'
      }
    },
    {
      where: "Number <= 1000",
      markerOptions: {
        iconName: 'large_red'
      }
    },
    {
      where: "Number <= 100",
      markerOptions: {
        iconName: 'small_purple'
      }
    }
  ]
});
If two sizes are not enough, then maybe you can play around with different colors/icons (there is a list with supported icons). Otherwise you have to retrieve your data and create custom markers with images of different size.
Javram pointed to one approach, but the list of available marker icons is limited in Fusion Tables and AFAIK there is no way to vary the icon size. Another approach might be to use the JSONP support provided by Fusion Tables to retrieve your data and create your own markers. This blog post explains how to do it.
The answer is here: http://support.google.com/fusiontables/bin/answer.py?hl=en&answer=185991. Basically, you need to add a column to your table that holds the name of the marker type you want to use for that location.
What is the best way to store object data in HTML5's localStorage? I haven't worked much with key-value storage.
In my research I've seen a few different approaches.
example data:
var commands = [
  {invokes: 'Window', type: 'file', data: '/data/1'},
  {invokes: 'Action', type: 'icon', data: '/data/2'},
  {invokes: 'Window', type: 'file', data: '/data/3'}
];
Approach 1: store keys that represent each data item
// for(...) {
  localStorage["command[" + i + "].invokes"] = commands[i].invokes;
  localStorage["command[" + i + "].type"] = commands[i].type;
  localStorage["command[" + i + "].data"] = commands[i].data;
// }
Approach 2: key is the entity name, store JSON
localStorage["commands"] = JSON.stringify(commands);
The second approach would require a JSON.parse().
pros/cons?
For the record, I went with approach 2. My key is similar to a table name, and the value is a JSON-stringified array of records. When retrieving the table you must call JSON.parse().
An overview of client-side storage technologies is available here: http://madhukaudantha.blogspot.com/2011/02/client-side-storages-with-html-5.html
Approach 2 is OK; there are more ways as well.
Certainly your approach works just fine; however, it does leave you a bit stuck with the convention you chose moving forward. I would recommend wrapping your localStorage access in a class so that your convention is isolated in one place, as in the sketch below.
Otherwise, should you choose to change how you approach it, you will have implementation code scattered all over your code base.
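A minimal sketch of that idea (class and method names are illustrative): the stringify/parse convention lives in one place, so changing it later touches a single class.

class LocalTable {
  constructor(name) {
    this.name = name; // acts like a table name
  }
  read() {
    const raw = localStorage.getItem(this.name);
    return raw ? JSON.parse(raw) : [];
  }
  write(records) {
    localStorage.setItem(this.name, JSON.stringify(records));
  }
}

// Usage with the example data from the question:
const table = new LocalTable('commands');
table.write(commands);
console.log(table.read());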