Google App Engine large IN clause query - google-app-engine

I have an Account entity that has a Facebook ID.
Sometimes the client sends a full list of Facebook IDs (the client's Facebook friends) to the server.
We want to select all Accounts whose Facebook ID is IN the list the client provided.
Looping and calling get on each Facebook ID seems rather slow, considering people might have 1000+ friends. Furthermore, GAE limits an IN query to 30 subqueries, so the list cannot simply be passed to IN.
Has anyone had a similar situation? How did you handle it?
Thanks!

You can set up a model that uses the Facebook ID as its key name, which allows you to use Model.get_by_key_name(key_names=fb_ids) to fetch all the models with keys in fb_ids at once.
e.g.
from google.appengine.ext import db

class FBModel(db.Model):
    account = db.ReferenceProperty(reference_class=Account)  # the linked Account
When creating the model:
model = FBModel(key_name=fb_id)
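A minimal sketch of the batched lookup, assuming the FBModel above. get_by_key_name returns a list with None in the position of every key name that does not exist, so the helper filters those out. The lookup parameter stands in for FBModel.get_by_key_name so the sketch also runs outside App Engine; the helper name and the batch size of 500 are illustrative assumptions, not documented limits.

```python
def fetch_existing(fb_ids, lookup, batch_size=500):
    """Return the entities found for fb_ids, skipping unknown IDs.

    lookup stands in for FBModel.get_by_key_name: it takes a list of key
    names and returns a list of the same length, with None for misses.
    batch_size is an illustrative assumption, not a documented limit.
    """
    key_names = [str(fb_id) for fb_id in fb_ids]  # key names must be strings
    found = []
    for start in range(0, len(key_names), batch_size):
        batch = key_names[start:start + batch_size]
        found.extend(entity for entity in lookup(batch) if entity is not None)
    return found
```

On App Engine this would be called as fetch_existing(fb_ids, FBModel.get_by_key_name), and each result's account property gives the linked Account.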

Related

How to query data from different types of databases in a microservice based architecture?

We are using a microservice-based pattern for our project, where we have Users and their Orders. Users' personal information (name, email, mobile) is stored in a User table in a relational database, while their Orders data is stored in an Orders collection in a NoSQL database. We want to develop an API that returns a paginated list of all orders placed, with the order details and the details of the associated user (name, mobile, email) alongside each order. We store the userId in the Orders collection.
The problem is how do we get User details for each order in this list since both the resources are in different databases. We also thought of storing user name, email and mobile in Orders collection only but what if a user updates their profile, the Orders collection will have stale user data.
What is the best approach to address this issue?
You can use the API gateway pattern: the UI calls the API gateway endpoint, the endpoint calls both services to get their results, aggregates them, and returns the aggregated response to the UI (the caller).
https://microservices.io/patterns/apigateway.html
It mostly depends on your scalability needs in terms of data size and request volume. An API gateway works if you don't have much data and the service doesn't receive many requests.
Otherwise, if you really need something scalable, implement your own idea of storing the user fields in the Orders collection, and keep them current with event-based communication.
I already answered a similar question; take a look:
https://stackoverflow.com/a/63957775/3719412
You have two services, Orders and Users. You request all orders from the Orders service. Its response contains the user IDs (each order contains the ID of its user). Then you request the Users service for information about the users whose IDs you just received. Finally, you aggregate the two results (if needed).
As others mentioned, a good solution is to implement an API gateway here. As a client, you send a request to a single endpoint (the gateway), and the gateway implements the logic described above.
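The aggregation step described above can be sketched as follows. This is a minimal illustration, not a full gateway: fetch_users_by_ids is a hypothetical stand-in for a batch endpoint on the Users service, and the field names other than userId are assumptions.

```python
def orders_with_users(orders_page, fetch_users_by_ids):
    """Attach user details to each order in one page of results.

    orders_page: list of dicts, each with a "userId" field (from the Orders service).
    fetch_users_by_ids: hypothetical callable standing in for a batch endpoint
    on the Users service; takes a list of IDs, returns {user_id: user_dict}.
    """
    # Deduplicate IDs so each user is requested once per page, in first-seen order.
    user_ids = list(dict.fromkeys(order["userId"] for order in orders_page))
    users = fetch_users_by_ids(user_ids)
    return [
        {**order, "user": users.get(order["userId"])}  # None if the user is missing
        for order in orders_page
    ]
```

Because user details are read at request time, a profile update shows up on the next page load, and each page costs one extra call to the Users service rather than one call per order.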

Automated DynamoDB Database Checks | ReactJS + AWS Amplify

My team and I are working on a Full-Stack Application using ReactJS on the frontend and AWS Amplify on the backend. We are using AWS AppSync to Query data in our DynamoDB tables (through GraphQL Queries), Cognito for User Authentication, and SES to send out emails to users.
Basically, the user inputs some info (DynamoDB Table #1), and that is matched against an opportunity database (DynamoDB table #2), and the top 3 opportunities are shown to the user. If none are found, an email is sent to inform the user that they will receive an email when opportunities are found.
Now for the Question: I wanted to know if there is a way to automatically Query a DynamoDB table (Like once a day or every time a new opportunity is added to the DynamoDB Table #2) and send out emails with matching opportunities to users who were waiting for them?
I tried using Lambda Triggers but the only way I could do it was by querying each row of DynamoDB Table #1 against DynamoDB Table #2. That is computationally infeasible as there will be too many resources being used up.
I am asking for advice on how I can go about making that daily check because I haven't been able to figure it out yet! Any responses are appreciated, and let me know if you need any additional information from my side! Thank you!
You could look into using DynamoDB Streams. When a new Opportunity is added to DynamoDB, the stream would trigger a lambda to be called. Your lambda could then execute your business logic to match the opportunity with the appropriate user.
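A minimal sketch of such a stream-triggered Lambda handler in Python. It assumes the stream on table #2 is configured with a view type that includes the new image, and it only reads string attributes. The matching and emailing step is left as a comment because it depends on your schema; everything in that comment is hypothetical.

```python
def new_opportunities(event):
    """Extract newly inserted items from a DynamoDB Streams Lambda event."""
    items = []
    for record in event.get("Records", []):
        if record.get("eventName") != "INSERT":
            continue  # ignore MODIFY and REMOVE events
        image = record.get("dynamodb", {}).get("NewImage", {})
        # Stream images use DynamoDB's typed format, e.g. {"S": "text"}.
        # This sketch keeps string attributes only.
        items.append({k: v["S"] for k, v in image.items() if "S" in v})
    return items


def handler(event, context):
    """Lambda entry point: match each new opportunity against waiting users."""
    for opportunity in new_opportunities(event):
        # Hypothetical: query table #1 for users waiting on this kind of
        # opportunity (ideally through an index) and email them via SES.
        print("new opportunity:", opportunity)
    return {"processed": len(event.get("Records", []))}
```

This way the work is driven by each new opportunity, so there is no need to compare every row of table #1 against every row of table #2.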

Microsoft Graph AD Users or people API to search all users?

I'm trying to build functionality into my app for 'admins' to assign users from their AD group to certain groups that are further assigned to app-specific roles. Basically a simple management component.
Adding the user with the oid to a group is easy, the problem I'm facing is finding the actual user.
Currently, the only option I'm seeing is making multiple api requests to v1.0/users (999 items max) and grouping them all in memory and then provide a simple search function to narrow it down.
I have also used the v1.0/me/people endpoint to search for users but this does not reveal all users from the AD group, just relevant users they deal with, so not too useful.
Is there any other api endpoint I could tap into to do a search ONLY on members of the same active directory?
Using the startsWith filter on multiple properties is probably the closest we can get to user search in MS Graph at the moment:
https://graph.microsoft.com/v1.0/users?$filter=startswith(displayName,'sarah') or startswith(givenName,'sarah') or startswith(surname,'sarah') or startswith(mail,'sarah') or startswith(userPrincipalName,'sarah')
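For illustration, a small Python helper that builds this kind of URL from a search term. The helper name and the default field list are assumptions; the sketch doubles single quotes, which is how OData string literals escape them, and percent-encodes the filter expression.

```python
from urllib.parse import quote

GRAPH_USERS_URL = "https://graph.microsoft.com/v1.0/users"
SEARCH_FIELDS = ("displayName", "givenName", "surname", "mail", "userPrincipalName")


def build_user_search_url(term, fields=SEARCH_FIELDS):
    """Return a /users URL whose $filter matches term as a prefix of any field."""
    escaped = term.replace("'", "''")  # OData escapes a single quote by doubling it
    clauses = " or ".join(f"startswith({field},'{escaped}')" for field in fields)
    # Percent-encode the filter expression; keep commas and parentheses readable.
    return f"{GRAPH_USERS_URL}?$filter={quote(clauses, safe='(),')}"
```

The request itself still needs an access token with permission to read users, which is not shown here.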
Ended up switching to the old AD Graph API and implementing a query on the endpoint as follows:
https://graph.windows.net/{ tenant ID }/users?api-version=1.6&$select=mail,displayName,objectId,givenName,surname&$filter=startswith(givenName,'SEARCH TERM') or startswith(surname,'SEARCH TERM')
If the function receives a single parameter, it searches for that value in both givenName and surname, but you could configure it to search across any other supported fields.
You could also drop the $select= entirely to get the whole user object. I didn't want the clutter, though, and those keys are enough for me.
Instead of startswith, you may get a better experience using the $search query parameter:
https://learn.microsoft.com/en-us/graph/api/user-list?view=graph-rest-1.0&tabs=http#example-6-use-search-to-get-users-with-display-names-that-contain-the-letters-wa-including-a-count-of-returned-objects
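A rough sketch of what such a request looks like, built in Python. The function name is an assumption; the sketch follows the linked example in sending the ConsistencyLevel: eventual header with $search, and it does not handle terms that themselves contain double quotes.

```python
from urllib.parse import quote


def build_user_search_request(term):
    """Return (url, headers) for a $search on displayName.

    Sketch only: terms containing double quotes are not escaped here.
    """
    search = quote(f'"displayName:{term}"')  # the search clause is sent in double quotes
    url = f"https://graph.microsoft.com/v1.0/users?$search={search}&$count=true"
    headers = {"ConsistencyLevel": "eventual"}  # sent with $search on directory objects
    return url, headers
```

Unlike startswith, this matches the term inside the display name as well, as in the linked example for names containing "wa".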

What are the best practices for building a REST API with different subscribers (companies)?

What is the best design approach in terms of security, performance and maintenance for a REST API that has many subscribers (companies)?
Which approach is best?
1. Build a general API and a sub-API for each subscriber (company). When a request comes in, we check it and forward it to the sub-API (using the API key), then return the data through the general API to the client.
2. Build a single API with a separate database for each subscriber's (company's) data. Each company has a huge number of records, which is why we suggested separate databases to improve performance. When a request comes in, we verify it and switch the database connection string based on the client.
3. Build one API and one big database that handles all subscribers' data.
Do you suggest any other approach to solve this problem? We use Web API, MS SQL Server and Azure Cloud.
In the past I've had one API, secured using OAuth/JWT, with a company ID in the token. When a request comes in, we read the company ID from the JWT and perform a lookup in a master database, which holds global information such as the connection string for each company. We then create a unit of work that has the company's connection string associated with it, and any database lookups use that.
This means that you can start with one master and one node database. When the node database starts getting overloaded, you can bring up another one and either add new companies to it or move existing companies over to take the pressure off. Essentially, you're just scaling out when the need arises.
We had no performance issues with this setup.
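The lookup step can be sketched roughly like this (in Python for brevity, although the question uses Web API). It assumes the token's signature has already been verified; the claim name company_id, the exception type and the master_lookup callable are all assumptions for illustration.

```python
class UnknownCompany(Exception):
    """Raised when a token names a company the master database does not know."""


def connection_string_for(claims, master_lookup):
    """Resolve the per-company connection string from verified JWT claims.

    claims: dict of claims from a token whose signature was already verified.
    master_lookup: callable standing in for the master-database query;
    takes a company ID and returns a connection string or None.
    The claim name "company_id" is an assumption for illustration.
    """
    company_id = claims.get("company_id")
    if company_id is None:
        raise UnknownCompany("token carries no company_id claim")
    connection_string = master_lookup(company_id)
    if connection_string is None:
        raise UnknownCompany(f"no database registered for {company_id!r}")
    return connection_string
```

Moving a company to another node database then only means changing its row in the master database; the API code stays the same.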
Depending on the transaction volume and the nature of the data, you can go for a single database or a separate database for each company.
Option 2 would be the best if you have a complex data model.
I don't see any advantage in option 1, because the general API will be called for each request anyway.
You can use client ID verification while issuing access tokens.
As I understand your question, you want a REST API for multiple consumers (companies). Logically, employees of each company (admins, HR, and so on) will consume your API. For that scenario I suggest a single REST API serving all consumers, secured with OpenID Connect on top of OAuth 2, which handles authentication and authorization for you.

Sync google contacts by group to a limited number of users

I am trying to build an open-source Python application hosted on GAE to sync contacts by group to a limited number of users. In a web interface, users will be able to pick a group and choose whom it will be synced with.
I understand there are a lot of applications on the marketplace with the same functionality, but my organization is concerned about those providers selling contacts to third parties. We are a non-profit organization, so the code could be hosted on Google Project Hosting or GitHub for community contribution.
(sorry for the long intro)
What is the best way to start? Is there a tutorial with similar functionality that I can expand on?
What is the best way to compare two Contact kind elements, to see if they need to be synced?
Is there a last-updated field on the Contact kind elements, in case I want to implement "last update wins"?
thanks!
I don't know of any tutorials for syncing and comparing contacts specifically, but there is a getting started guide for the Google Contacts API at https://developers.google.com/google-apps/contacts/v3/.
The contacts are sent as XML blobs, so you could compare two contacts by parsing them and looking at the individual elements within them. I don't think there's a better way to do this but there are libraries to handle it for you.
There is a last updated field sent as part of the contacts when retrieving them with the API. It is an XML element labeled <updated>.
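As a minimal sketch of "last update wins", assuming each contact is available as an Atom entry string with an <updated> element in the standard Atom namespace. The function names are illustrative, and a real sync would also have to compare the individual fields.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

ATOM = "{http://www.w3.org/2005/Atom}"


def updated_at(entry_xml):
    """Parse the <updated> timestamp out of one Atom contact entry."""
    text = ET.fromstring(entry_xml).findtext(ATOM + "updated")
    # fromisoformat on older Pythons rejects a trailing "Z", so spell out UTC.
    return datetime.fromisoformat(text.replace("Z", "+00:00"))


def newer_entry(entry_a, entry_b):
    """Last update wins: return whichever entry has the later <updated>."""
    return entry_a if updated_at(entry_a) >= updated_at(entry_b) else entry_b
```

When both timestamps are equal, this sketch keeps the first entry.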
How are you getting different users' contacts feeds?
I tried saving the tokens in the datastore when the users grant access, but when I get the tokens back from the datastore for 2 users at a time, then after an hour, when the token expires, all tokens start working like the current user's token and I can only get the current user's contacts.
token = Get_Shared_User_Token(user_email)
contact_client = gdata.contacts.client.ContactsClient(source=USER_AGENT)
authorized_client = token.authorize(contact_client)
contacts_feed = authorized_client.GetContacts(q = query)
Can you please tell me how to get any user's contacts?
