I have a question concerning microservices and databases. I am developing an application: a user sees a list of countries and can click through to see a list of attractions in that country. I created a country-service, an auth-service (containing users for OAuth2), and an attraction-service. Each service has its own database. I mapped the association between an attraction and its country by ISO code (for example, BE = Belgium): /api/attraction/be.
The approach above seems to work, but I am a bit stuck on the following: a user must be able to add an attraction to his/her list of favorites, but I do not see how that is possible since I have so many different databases.
Do I create a favorite-service? Do I pass IDs (I don't think I should do this)? What kind of business key can I create? How do I associate the data in a correct way?
Thanks in advance!
From the information you have provided, using a standalone favourite service sounds like the right option.
A simpler, quicker option might be to handle this in your user service, which already looks after the persistence of your users' data, since favourites are exclusive to a user entity.
As for IDs, I haven't seen many reasons why passing them would be a bad idea. Your individual services are going to need to store some identifying value for related data, and the main issue here, I feel, is just keeping this ID field consistent across your different services. Whatever you choose just needs to be reliable and predictable, to keep things easy and simple as your system grows.
If you are using RESTful HTTP, you already have a persistent, bookmarkable identification of resources: URLs (URIs, or IRIs if you want to be pedantic). Those are the IDs you can use to refer to an entity in another microservice.
There is no need to introduce another layer of IDs, be it country codes or database IDs. Those things are internal to your microservice anyway and should be transparent to all clients, including other microservices.
To be clear: you can store the URI of the country in the attraction service. That URI should not change anyway (although you might want to be prepared to change it if you receive permanent redirects), and you have to keep that URI around anyway, to be able to include it in the attraction representation.
You don't really need any "business key" for favorites either, other than the URI of the attraction. You can bookmark that URI, just as you would in a browser.
I would imagine if there is an auth-service, there are URIs also for identifying individual users. So in a "favorites" service, you could simply link the User URI with Attraction URIs.
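A favorites service along these lines can be sketched as follows. This is an illustrative, in-memory sketch; the URIs and class names are hypothetical, and a real service would persist these pairs in its own database.

```python
# Hypothetical sketch: a favorites service that stores nothing but URIs,
# linking a user's URI (from the auth service) to attraction URIs.
from collections import defaultdict


class FavoritesService:
    """Links a user URI to the URIs of that user's favorite attractions."""

    def __init__(self):
        # user URI -> set of attraction URIs
        self._favorites = defaultdict(set)

    def add_favorite(self, user_uri: str, attraction_uri: str) -> None:
        self._favorites[user_uri].add(attraction_uri)

    def list_favorites(self, user_uri: str) -> set:
        # Return a copy so callers cannot mutate internal state.
        return set(self._favorites[user_uri])


svc = FavoritesService()
svc.add_favorite("https://auth.example.com/users/42",
                 "https://attractions.example.com/api/attraction/be/atomium")
print(svc.list_favorites("https://auth.example.com/users/42"))
```

Note that the favorites service never needs to understand the structure of either URI; it only bookmarks them, just as a browser would.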
Related
How do multiple teams (which own different system components/microservices) in a big tech company share their databases?
I can think of multiple use cases where this would be required. For example, in an e-commerce firm, the same product will be shared among multiple teams: the product will at first be part of a product-onboarding service, then maybe a catalog service (which stores all products and categories), then a search service, cart service, order-placement service, recommendation service, cancellation & return service, and so on.
If they don't share any DB, then:
Do they all keep a redundant copy of the products with the same product ID?
Wouldn't it be a challenge to achieve consistency among multiple teams?
I have multiple related doubts in both cases, whether they share a DB or not.
I have been through multiple tech blogs and videos on software design and still haven't found a satisfying answer. Please share some resources that give a complete, end-to-end workflow of how things work in a big tech firm.
Thank you
In the microservice architecture, each microservice exposes endpoints where other microservices can access the information shared between the services. So one service stores only minimal information about a record that is managed by another microservice.
For example, if a user service would like to fetch orders for a particular user in an e-commerce setting, the order service would expose an endpoint that, given a user ID, returns all orders related to that user, and so on. Essentially, the only user-related field the order service needs to store is the user ID; the rest of the user's details are irrelevant to it.
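A minimal sketch of that idea, with illustrative field names: the order service keeps only the user ID on each order record, and an endpoint would simply filter by it.

```python
# Toy order store: the only user-related field kept is "user_id".
orders = [
    {"order_id": 1, "user_id": "u1", "total": 20.0},
    {"order_id": 2, "user_id": "u2", "total": 35.5},
    {"order_id": 3, "user_id": "u1", "total": 12.0},
]


def orders_for_user(user_id):
    """What a hypothetical GET /orders?user_id=... endpoint would return."""
    return [o for o in orders if o["user_id"] == user_id]


print(orders_for_user("u1"))  # the two orders belonging to u1
```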
To further improve cohesion and understanding between teams, data-discovery APIs and documentation are also built to share database metadata with other teams, explaining what each table/field means so that one can efficiently plan out a microservice. You can read more about how such companies build data discovery tools.
If I understand you correctly, you are unsure how different departments receive data in a company?
The idea is that you create reusable and effective API's to solve this problem.
Let's say, generically, that the company we're looking at is Walmart. Walmart has millions of items in a database (or databases), and each item has a unique ID.
If Walmart is selling items online via walmart.com, they have to have a way to get those items, so they create APIs and use them to grab items based on certain query conditions.
Now, let's say Walmart has decided to build an app. Well, they need those exact same items! Good thing we already created those APIs; we can use the exact same ones to grab the data.
Now, how does Walmart manage which items are available at which store, and at what price? They would usually link this metadata through additional database schema tables, tying them all together with primary and foreign keys.
This essentially allows Walmart to grab from their CORE database only the details intrinsic to the item (e.g. name, size, color, SKU, description), and link it to another database, say the one for YOUR local Walmart, that contains information relevant only to that location in regard to that item (e.g. price, stock, aisle number).
So yes, in a sense, this uses multiple databases.
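The primary/foreign-key linking described above can be sketched with SQLite. Table and column names here are illustrative, not Walmart's actual schema: a core `item` table holds the details common to every store, and a `store_item` table attaches store-specific data (price, stock) to the core item via a foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (
        item_id INTEGER PRIMARY KEY,     -- core, store-independent details
        name    TEXT,
        sku     TEXT
    );
    CREATE TABLE store_item (
        store_id INTEGER,
        item_id  INTEGER REFERENCES item(item_id),  -- foreign key to core item
        price    REAL,
        stock    INTEGER,
        PRIMARY KEY (store_id, item_id)
    );
""")
conn.execute("INSERT INTO item VALUES (1, 'Toaster', 'SKU-001')")
conn.execute("INSERT INTO store_item VALUES (99, 1, 24.99, 12)")

# Join the core item details with the data for store 99 only.
row = conn.execute("""
    SELECT i.name, s.price, s.stock
    FROM item i JOIN store_item s ON i.item_id = s.item_id
    WHERE s.store_id = 99
""").fetchone()
print(row)  # ('Toaster', 24.99, 12)
```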
Perhaps this may drive you down some more roads: https://learnsql.com/blog/why-use-primary-key-foreign-key/
https://towardsdatascience.com/designing-a-relational-database-and-creating-an-entity-relationship-diagram-89c1c19320b2
There's a substantial diversity of approaches used between and even within big tech companies, driven by different company/org cultures and different requirements around consistency and availability.
Any time you have an explicit "query another service/another DB" dependency, you have a coupling which tends to turn a problem in one service into a problem in both services. This isn't necessarily a one-way thing: it's quite possible for the querying service to encounter a problem which cascades into a problem in the queried service. This is especially likely when a cache becomes load-bearing, which has led to major outages at at least one FANMAG company in the not-that-distant past.
This has led some companies that could be fairly called big tech to eschew that approach in their service design, typically by having services publish events describing what has changed to a durable log (append-only storage). Other services subscribe to that log and use the events to construct their own eventually consistent view of the data owned by the other service (i.e. there's some level of data duplication, with services storing exactly the data they need to function).
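The publish/subscribe pattern above can be sketched as a toy in-memory version. The list stands in for a durable append-only log (such as Kafka), and the event names are made up for illustration: one service appends events describing its changes, and another replays the log to build its own local, eventually consistent view.

```python
event_log = []  # stands in for a durable, append-only log


def publish(event):
    """The owning service appends an event describing what changed."""
    event_log.append(event)


# The "product" service publishes changes to the data it owns.
publish({"type": "product_created", "id": "p1", "name": "Lamp"})
publish({"type": "product_renamed", "id": "p1", "name": "Desk Lamp"})


def build_local_view(log):
    """A subscribing service replays the log, keeping only what it needs."""
    view = {}
    for event in log:
        if event["type"] in ("product_created", "product_renamed"):
            view[event["id"]] = event["name"]
    return view


print(build_local_view(event_log))  # {'p1': 'Desk Lamp'}
```

The subscriber's view lags the log slightly in a real system, which is exactly the "eventually consistent" trade-off: no cross-service query at read time, at the cost of some duplication.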
In my app I am integrating different third-party APIs, like Google Contacts, Google Calendar, Mailchimp, etc.
Each third-party API has specific settings I can ask the user to apply during the sync with my app.
I have two options:
I make a single table, say "integration", where I store all connections. Here I am afraid I will need to add many columns to track the settings of the different APIs (contact privacy, calendar ID, Mailchimp audience ID, etc.).
I make a table for each integrated third-party API in order to track all its relevant settings separately through dedicated columns. Here I may end up with many tables once I have integrated many different third-party APIs.
I would think the second is best, but I need your help to structure my database according to best practices so that it scales.
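For what it's worth, one hedged way to structure the second option is a shared table for the fields common to every provider, plus one detail table per provider for its specific settings (all table and column names below are illustrative). A SQLite sketch:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE integration (
        integration_id INTEGER PRIMARY KEY,
        user_id        INTEGER,
        provider       TEXT    -- 'google_calendar', 'mailchimp', ...
    );
    -- One detail table per provider, holding only its specific settings.
    CREATE TABLE google_calendar_settings (
        integration_id INTEGER PRIMARY KEY REFERENCES integration(integration_id),
        calendar_id    TEXT
    );
    CREATE TABLE mailchimp_settings (
        integration_id INTEGER PRIMARY KEY REFERENCES integration(integration_id),
        audience_id    TEXT
    );
""")
conn.execute("INSERT INTO integration VALUES (1, 7, 'mailchimp')")
conn.execute("INSERT INTO mailchimp_settings VALUES (1, 'aud-123')")

row = conn.execute("""
    SELECT i.provider, m.audience_id
    FROM integration i JOIN mailchimp_settings m USING (integration_id)
""").fetchone()
print(row)  # ('mailchimp', 'aud-123')
```

Queries that only care about "which integrations does this user have" touch the shared table; provider-specific screens join in the one detail table they need.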
Thanks!
What you are trying to do is clearly against Google's Terms of Service, specifically Section 5 e. Prohibitions on Content:
Unless expressly permitted by the content owner or by applicable law, you will not, and will not permit your end users or others acting on your behalf to, do the following with content returned from the APIs:
Scrape, build databases, or otherwise create permanent copies of such content, or keep cached copies longer than permitted by the cache header;
[...]
And there are very good legal and moral reasons for that. So the answer is: don't do it, and call the APIs every time.
I am using neo4j to build a social network web app where users that are friends can communicate with each other through video calls. Each participating user will also be able to submit a review at the end of each call. I structured the graph such that two (:User) nodes can have a [:FRIEND] relationship between each other. For a particular video call, I am planning on creating a (:VideoCall) node (which contains properties such as roomId) and a [:PARTICIPANT] relationship from the (:VideoCall) node to each participating (:User) node. The [:PARTICIPANT] relationship will have a rating property containing the user's review for that video call. Would this model be performant if there are a large number of user and video call nodes? Is there a better way to design the database for this type of feature?
Yes, it should perform well. Just make sure the properties you want to look up by are indexed, and that you have constraints in place.
What kinds of use cases would you want to cover besides the regular ones?
It is a good model if the video calls involve multiple users AND you want to use roomId as a query condition, because this way you can easily find all users that participated in a specific video call.
However, I noticed that you mentioned it is a social networking web app, so chances are the video calls are just between TWO users. If that's the case, then there's an alternative to your current model: make each video call an edge between users: (:User)-[:VIDEO_CALL]->(:User). Properties such as roomId can be assigned to the edge. This model saves memory because you have fewer nodes.
I've been following the examples in the book "MEAN Machine", and I've implemented a simple token-based authentication system that makes the contents of a certain model only available to authenticated users.
I'd like to take this to a more complex level: I need three different user types.
I am building an app where some users (let's say, vendors) can upload certain data that could only be accessible to certain authenticated users (let's say, consumers), but vendors also need to be able to see, but not edit data uploaded by other vendors. Then, there would be a third type of user, the admin, who would be able to edit and see everything, including the details of other, lower level users.
How should I proceed in constructing this?
Thanks in advance for your help.
As you mentioned, the authentication system is already working, so what you need to implement now is an Access Control List (ACL). The final ACL implementation depends a lot on your database model and requirements. There are also Node modules which support more advanced models like this, e.g. the acl module (https://www.npmjs.com/package/acl), which also supports MongoDB.
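To make the three roles from the question concrete, here is a minimal hand-rolled role/resource/permission sketch (the acl npm module offers checks of the same shape). The role and resource names are illustrative, and the ownership check that would let a vendor edit their own uploads is omitted for brevity.

```python
# role -> resource -> set of allowed actions
PERMISSIONS = {
    "admin":    {"vendor_data": {"read", "edit"},
                 "user_details": {"read", "edit"}},
    "vendor":   {"vendor_data": {"read"}},   # can see, but not edit, others' data
    "consumer": {"vendor_data": {"read"}},
}


def is_allowed(role, resource, action):
    """Return True if the role may perform the action on the resource."""
    return action in PERMISSIONS.get(role, {}).get(resource, set())


print(is_allowed("vendor", "vendor_data", "edit"))   # False
print(is_allowed("admin", "user_details", "edit"))   # True
```

In practice you would check `is_allowed` (or the equivalent acl-module call) in a middleware that runs after your existing token authentication has identified the user and their role.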
Thinking about how to model a simple graph for a Side project.
The project's users will be able to add their social network accounts so that they can find other users through that social information.
Given neo4j's architecture, which I'm new to, which of these is the correct way to do this:
A different type for each social network (e.g, twitter, LinkedIn) and a relationship of user --> has_twitter_account / user --> has_linkedin_account with relevant keys
One type (SocialMediaAccount) with a user --> has_socialmedia_acct relationship and the relevant keys; a more generic approach, with an attribute for the name of the social network
Adding social networks as attributes under each User entity?
Something else I haven't thought of?
I'm leaning towards number 1 but given that I'm a newcomer I want to make sure this isn't an anti-pattern or setting me up for pain later.
Not sure if I understand your requirements correctly, but in general it's good to model real-world entities as nodes and relationships between things, quite naturally, as relationships.
In your case, I'd go one of the two following ways, depending on how much you want to do with their social media accounts.
1) A node for each social media, e.g. (LinkedIn), (Twitter), (Facebook). Then a single relationship type, call it HAS_ACCOUNT, which links (user) nodes to accounts. For example:
(user1)-[:HAS_ACCOUNT]->(LinkedIn)
2) If you find you're storing too many properties on the HAS_ACCOUNT relationship, or you feel it should be linked to something else (e.g. links between social media accounts), create a node for each account the user has.
(user1)-[:HAS_ACCOUNT]->(user1LinkedInAccount)-[:IS_ACCOUNT]->(LinkedIn)
That way, your model is more flexible and you can link users' accounts together with a different kind of relationship. Say user1 follows user2 on Twitter:
(user1TwitterAccount)-[:IS_LINKED_TO]->(user2TwitterAccount)
Hope it makes sense.