How does Zapier/IFTTT implement the triggers and actions for different API providers? - mashup

How does Zapier/IFTTT implement the triggers and actions for different API providers? Is there a generic approach to doing that, or is each one implemented individually?
I think the implementation is based on REST/OAuth, which looks generic at a high level. But Zapier/IFTTT define a lot of trigger conditions and filters, and these should be specific to each provider. Is the corresponding implementation individual or generic? If individual, it must take a vast labor force. If generic, how is it done?

Zapier developer here - the short answer is, we implement each one!
While standards like OAuth make it easier to reuse some of the code from one API to another, there is no getting around the fact that each API has unique endpoints and unique requirements. What works for one API will not necessarily work for another. Internally, we have abstracted away as much of the process as we can into reusable bits, but there is always some work involved to add a new API.

PipeThru developer here...
There are common elements to each API which can be re-used, such as OAuth authentication, common data formats (JSON, XML, etc). Most APIs strive for a RESTful implementation. However, theory meets reality and most APIs are all over the place.
Each service offers its own endpoints, and there is no commonly agreed-upon set of endpoints that is correct for a given kind of service. For example, within CRM software, it's not clear how a person, notes on said person, corresponding phone numbers, addresses, as well as activities should be represented. Do you provide one endpoint or several? How do you update each? Do you provide tangential records (like the company for the person) with the record or not? Each requires specific knowledge of that service as well as some data normalization.
Most of the triggers involve checking for a new record (a unique id) or an updated field, most often the last-update timestamp. Most services present their timestamps in ISO 8601 format, which makes parsing timestamps easy, but not all of them do. Dropbox actually provides a delta API endpoint to which you can present a hash value, and Dropbox will send you everything new or changed from that point. I'd love to see delta and/or activity endpoints in more APIs.
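To make the polling idea concrete, here is a minimal Python sketch of a trigger check that fires on an unseen id or on a newer last-update timestamp. The record shape (`id`, `updated_at`) and the function name `find_triggered` are assumptions made for illustration, not any particular provider's API.

```python
from datetime import datetime


def find_triggered(records, seen_ids, last_poll):
    """Return (kind, record) pairs for records that should fire a trigger.

    A record fires as "new" when its id has not been seen before, and as
    "updated" when its last-update timestamp is later than the previous poll.
    """
    triggered = []
    for record in records:
        if record["id"] not in seen_ids:
            seen_ids.add(record["id"])
            triggered.append(("new", record))
        elif datetime.fromisoformat(record["updated_at"]) > last_poll:
            triggered.append(("updated", record))
    return triggered


# Hypothetical poll result: ids 1 and 3 were already seen on an earlier poll.
last_poll = datetime.fromisoformat("2015-03-01T00:00:00+00:00")
seen_ids = {1, 3}
records = [
    {"id": 1, "updated_at": "2015-03-02T10:00:00+00:00"},  # changed since last poll
    {"id": 2, "updated_at": "2015-02-01T10:00:00+00:00"},  # never seen before
    {"id": 3, "updated_at": "2015-02-15T10:00:00+00:00"},  # unchanged
]
events = find_triggered(records, seen_ids, last_poll)
```

Services that do not use ISO 8601 would need their own timestamp parsing in place of `datetime.fromisoformat`, which is part of the per-service work described above.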
Bottom line, integrating each individual service does require a good amount of effort and testing.
I will point out that Zapier did implement an API for other companies to plug into their tool. Instead of Zapier implementing your API and Zapier polling you for data, you can send new/updated data to Zapier to trigger one of their Zaps. I like to think of this like webhooks on crack. This allows Zapier to support many more services without having to program each one.

I've implemented a few APIs on Zapier, so I think I can provide at least a partial answer here. If not using webhooks, Zapier will examine the API response from a service for the field with the shortest name that also includes the string "id". Changes to this field cause Zapier to trigger a task. This is based off the assumption that an id is usually incremental or random.
I've had to work around this by shifting the id value to another field and writing different values to id when it was failing to trigger, or triggering too frequently (dividing by 10 and then writing id can reduce the trigger sensitivity, for example). Ambiguity is also a problem, for example in an API response that contains fields like post_id and mesg_id.
Short answer is that the system makes an educated guess, but to get it working reliably for a specific service, you should be quite specific in your code regarding what constitutes a trigger event.
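As a rough illustration of the guess described above (this is my reading of the observed behavior, not Zapier's actual code), a Python sketch could look like the following. The function name `guess_id_field` and the alphabetical tie-break are assumptions.

```python
def guess_id_field(record):
    """Pick the shortest field name containing "id", or None if there is none.

    Ties are broken alphabetically here, which is an arbitrary choice made
    only so that the sketch is deterministic.
    """
    candidates = [name for name in record if "id" in name.lower()]
    if not candidates:
        return None
    return min(candidates, key=lambda name: (len(name), name))


# Unambiguous case: "id" is the shortest name containing "id".
clear = guess_id_field({"id": 7, "user_id": 3, "title": "hello"})

# Ambiguous case: both names have the same length, so the result depends
# entirely on the tie-break rule rather than on what the fields mean.
ambiguous = guess_id_field({"post_id": 10, "mesg_id": 20})
```

The ambiguous case is why being explicit about the trigger field is more reliable than leaving it to a guess.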

Related

Where to store possibly sensitive but unimportant information

I am working on an app which touches sensitive information, like money.
We have some calculators, and we want to prefill the values with whatever the user entered last. Apart from improving UX, we don't need those values. But we cannot store them in web storage or a cookie because of security.
We have
a JS frontend,
an API Gateway backend that is supposed to be "stupid", so it only handles authentication and sending messages to the corresponding services
some services that actually care about the business logic
These possibilities come to mind and I cannot decide which I should do (and foremost: why)
Add a table in backend, that is a catch all for implementing cookie-like functionality in backend
Add a specific table in the service it fits the most
Use a key value store in backend (don't know about this, a coworker put it out there)
As I read your requirements, it seems that this is a kind of defaulting that includes some business logic (stupid or smart). Personally I see defaulting as part of the business logic, and on that basis it belongs in the service that cares about this functionality.
Add a table in backend, that is a catch all for implementing cookie-like functionality in backend
This sounds like a generic solution for a pretty generic requirement. What do you want to achieve with this?
Add a specific table in the service it fits the most
Sounds reasonable, especially because you put it where it belongs. Does it have to be a table? Why not calculate or copy the values at runtime?
Use a key value store in backend (don't know about this, a coworker put it out there)
This is maybe a technological decision but first you need a design decision.
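If you do go with a specific table in the owning service, a minimal Python sketch might look like this. The table name, column names and the `CalculatorDefaults` class are all hypothetical, and SQLite stands in for whatever database the service really uses.

```python
import json
import sqlite3


class CalculatorDefaults:
    """Stores the last inputs a user entered, per calculator, server-side."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS calculator_defaults ("
            "user_id TEXT, calculator TEXT, inputs TEXT, "
            "PRIMARY KEY (user_id, calculator))"
        )

    def save(self, user_id, calculator, inputs):
        # One row per (user, calculator); a newer save overwrites the old one.
        self.conn.execute(
            "INSERT OR REPLACE INTO calculator_defaults VALUES (?, ?, ?)",
            (user_id, calculator, json.dumps(inputs)),
        )
        self.conn.commit()

    def load(self, user_id, calculator):
        row = self.conn.execute(
            "SELECT inputs FROM calculator_defaults "
            "WHERE user_id = ? AND calculator = ?",
            (user_id, calculator),
        ).fetchone()
        # No stored values simply means the form starts empty.
        return json.loads(row[0]) if row else {}


store = CalculatorDefaults(sqlite3.connect(":memory:"))
store.save("user-1", "loan", {"amount": 10000, "years": 5})
prefill = store.load("user-1", "loan")
```

Because the data is not important, losing a row only costs the user some retyping, which keeps the design question (where it belongs) separate from the storage technology.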

Microservices unsuitable for business domain?

The business domain has five high-level bounded contexts
Customers
Applications
Documents
Decisions
Preforms
Further, these bounded contexts have sub-contexts, like ordering and delivery of the documents. Despite the project consisting of tens of thousands of classes and dozens of EJBs, most of the business logic resides in relational database views and triggers for a reason: a lot of joins, unions and constraints are involved in all business transactions. In other words, there is a complex web of dependencies and constraints between the bounded contexts, which restricts the state transfers. In layman's terms: the business rules are very complicated.
Now, if I were to split this monolith into a database-per-service microservices architecture, with bounded contexts as the suggested service boundaries, I would have to implement all the business logic with explicit API calls. I would end up with hundreds of APIs implementing all these stupid little business rules. As performance is the main factor (we put a lot of effort into optimizing the SQL as it is now), this is out of the question. Secondly, segregated APIs would probably be a nightmare to maintain in this web of ever-evolving business rules, whereas database triggers actually support the high cohesion and DRY mentality, enforcing the business rules transparently.
I came to the conclusion that a microservice architecture is unsuitable for this type of document management system. Am I correct, or am I approaching the idea from the wrong angle?
First of all, you don't have to have a Microservices architecture. I really mean it! If you were ordered by management/architect to do it, and it doesn't solve any real problems you are having, you are probably right for pushing back.
That being said, and with the disclaimer that I don't know the exact requirements of your application, having "things" as bounded context is a smell. So having "Customers", "Applications", "Documents", etc. as services is very likely the wrong approach.
Bounded contexts should not be CRUD operations on a specific entity. They should be completely independent (or as independent as possible) "vertical" parts of the whole application. Preferably with their own Database and GUI. They should also operate independently of each other, not requiring input from other services for own decisions.
It is the complete opposite of data-centric design, where tables/fields and relations are the core concepts. Here, functionality is the core concept. You would have to split your application along functionality to arrive at a good separation.
I could imagine a document management system having these independent bounded contexts / services: Search, Workflow, Editing, etc.
Here is how you would think about it: Search does not require any (synchronous) input from any other service. It may receive regular, even near-real-time updates with new documents, but that does not impact its main feature: searching already indexed documents. The GUI is also independent, something like one Google-like page with a search box maybe. It can deliver results independently, and would link back to the Workflow or Editing apps when you click on a result.
The others would be similarly independent. Again, the point is to split the services in a way that makes them work independently. If you don't have that, you will only make things worse with Microservices.
First of all, the above answer is correct in suggesting that you need to break up your microservices in a better way.
Now, if scalability is your concern (lots of API calls between microservices):
I strongly suggest you validate how many of the constraints are really required up front, and how many of them you could check asynchronously. What I mean is that in a distributed environment we do not actually need to validate everything at the same time.
Sometimes these things are not directly visible. For example, say there are two services, an order service and a customer service. The order service exposes an API that places an order for a customer id, and the business says you cannot place an order for an unknown customer.
One implementation is to call the customer service synchronously from the order service. In this case, the customer service being down will impact your service. So let's ask: do we really need this?
A scenario could still happen where a customer has just placed an order and somebody then deletes that customer from the customer service; now we have an order which doesn't belong to any customer. Consistency cannot be guaranteed.
In the new solution, we allow the order service to place the order without checking the customer id, and do one of the following:
Using a ProcessManager, check the customer's validity and mark the order as invalid if the check fails; when a customer gets deleted, use the ProcessManager to mark that customer's orders as invalid or perform other business logic.
Do not check at all, because placing an order doesn't count for anything yet; when the order is in the process of dispatch, that service will check the customer status anyway.
In this way your API hits are reduced and you get better, more independent services.
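The first option can be sketched in Python roughly as follows. The class and method names are hypothetical, and in a real system the two `on_...` handlers would be driven by messages from a queue rather than called directly.

```python
class OrderService:
    """Accepts orders without calling the customer service synchronously."""

    def __init__(self):
        self.orders = {}

    def place_order(self, order_id, customer_id):
        self.orders[order_id] = {"customer_id": customer_id, "status": "PENDING"}


class ProcessManager:
    """Reacts to events later and settles each order's status."""

    def __init__(self, order_service, known_customers):
        self.order_service = order_service
        self.known_customers = set(known_customers)

    def on_order_placed(self, order_id):
        order = self.order_service.orders[order_id]
        known = order["customer_id"] in self.known_customers
        order["status"] = "VALID" if known else "INVALID"

    def on_customer_deleted(self, customer_id):
        self.known_customers.discard(customer_id)
        for order in self.order_service.orders.values():
            if order["customer_id"] == customer_id:
                order["status"] = "INVALID"


orders = OrderService()
manager = ProcessManager(orders, known_customers={"c1"})

orders.place_order("o1", "c1")      # accepted immediately, no remote call
orders.place_order("o2", "ghost")   # also accepted, validated afterwards
manager.on_order_placed("o1")
manager.on_order_placed("o2")
manager.on_customer_deleted("c1")   # a later deletion invalidates o1 too
```

The order service stays available even when the customer service is down; the price is that an order can sit in a pending or later-invalidated state for a while.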

Online Secure Message Centre Design

I've got a requirement for an Online Customer Portal Secure 'Message Centre' to allow the back and front office to communicate with their customers in a two way fashion once the Customer has logged in via a secure channel.
We have procured a CMS platform with this widget presentation layer out of the box that expects to connect to an API to handle the communication and persistence i.e., the CMS is stateless.
I was wondering how people have designed and solutioned this - my current thinking:
Shoehorn it into our backend CRM system via a REST API - this would need custom dev
Use an RDBMS (custom DB data model adhering to the message structure) and build a REST API over the DB to handle the customer interaction events i.e., read, delete, new message
Build a pure microservice architecture with persistence coupled to the service - i.e., adhering to the pattern - engineering wise we don't have this capability yet
Other obvious solution that I have missed?
Am sure this has been solved multiple times over, keen to hear what works best?
*One thing I forgot to mention, is that we are migrating from an old legacy system and will need to bring about 10GB of customer messages with us i.e., historical data; this data needs to migrate into the new solution.
Many thanks
However you implement the back-end, the key here is to spend time getting your REST interfaces 'right' before doing any coding. Try to break the interfaces down into small, specialized interfaces that each serve a specific business-focused responsibility. Also, think about the data model abstraction and its representation in the HTTP payload, and how to cross-reference other data using links embedded in the data transferred over the interface. If you get the interfaces right, then you can swap out the implementation down the line.
It is impossible to say without a deep analysis of the options what is best way to go. Unfortunately you haven't really explained the full extent of the API required or the capabilities of your existing CRM, but I am assuming there would be useful business advantages to option 1, as it integrates with your existing systems and business process. Option 2/3 would need your office staff users to use a different system, requiring training/support, which to my mind doesn't seem ideal. Option 3 requires a significant amount of work (not just coding, but integration testing, deployment, orchestration etc!), and from your description of the task, it is not clear that there really is a need to go down this route. My very high level hunch is option 1, but you will obviously need to research whether there is appropriate mapping between the API you present to the CMS and the API that is available on the CRM. Also bear in mind the security model with the CRM and of course responsiveness/throughput.

AngularJS + Breeze + Security

We are trying to use AngularJS with Breeze, with a .NET backend. We have gotten things hooked up working together. However, we are having trouble figuring out how to lock things down based on the user role and the user's own data.
Can anyone point us in the general direction? We couldn't find anything explicit in Breeze's documentation.
There is no reason why Breeze should be insecure. Security is orthogonal. My question remains: what are your concerns?
Update 2 March 2015
Thanks for the clarifying comment ... which reflects concerns that are widely shared. I really am going to have to write about this at length in our documentation.
Fortunately, I believe I can ease your mind about the specific issues you raised.
BreezeJS, the client library, can only reach the data that your server allows the current user to access. It's the server's job to grant or refuse such requests.
This is fundamentally the same story for a client written with any technology talking to a server written with any technology. If the server has a "Customers" endpoint, then a client can request all of your customers and will receive them unless you guard that endpoint with logic on the server. This is true with or without Breeze.
You may be thinking that the metadata describes your entire database schema and therefore exposes the entire database to Breeze client requests. That statement is not true on a couple of counts.
First, even if the client knows about your entire database schema, it can't do anything with that knowledge unless you go to the trouble of exposing every table in your web api with unguarded endpoints. This is entirely within your control and it's not something you can do by accident.
Second, there is no good reason to send metadata that describe your entire database. If you let the server generate the metadata based on the Entity Framework model, you can easily limit the size and shape of that model to the subset of the database that you want to expose in your client-facing api.
After you've narrowed the model and the web api to the size and shape appropriate for your application, you must take the next step ... the step you'd take for any web api imaginable ... which is to guard the endpoints.
At a minimum that means ensuring that the user is authenticated and authorized to make requests of each endpoint. But it also means preventing unwanted responses even to authorized user requests. For example, you might want to limit on the server the number of Customers that can be returned for any given customer query. You might want to throttle the number of requests that you'll process in a single interval of time. You might want to filter the customers down to just those few that the user is allowed to see.
The techniques for doing these things are all part of the ASP.NET Web API itself, having nothing to do with Breeze whatsoever. You'll want to investigate the options that Web API has to offer.
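To show the shape of those guards without tying them to a framework, here is a small language-neutral sketch written in Python rather than ASP.NET Web API. The permission string, the `account_manager` field and the function name are made up for illustration.

```python
MAX_RESULTS = 100


class Forbidden(Exception):
    """Raised when the caller may not use the endpoint at all."""


def get_customers(user, all_customers, requested_top=None):
    # 1. Authentication and authorization for the endpoint itself.
    if user is None or "customers:read" not in user["permissions"]:
        raise Forbidden("not authorized for the Customers endpoint")
    # 2. Row-level filter: only the customers this user is allowed to see.
    visible = [c for c in all_customers if c["account_manager"] == user["name"]]
    # 3. Server-side cap, whatever page size the client asked for.
    if requested_top is None:
        top = MAX_RESULTS
    else:
        top = max(0, min(requested_top, MAX_RESULTS))
    return visible[:top]


alice = {"name": "alice", "permissions": {"customers:read"}}
customers = [
    {"id": 1, "account_manager": "alice"},
    {"id": 2, "account_manager": "bob"},
    {"id": 3, "account_manager": "alice"},
]
result = get_customers(alice, customers, requested_top=1000)
```

Request throttling is left out of the sketch; it would sit in front of this function rather than inside it.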
The update side of things is actually much easier to manage with Breeze components for ASP.NET Web API. The conventional Breeze update mechanism is a batch post to a single SaveChanges endpoint. In other words, the surface area of the attack can be limited to a single endpoint. The Breeze SaveChanges method for .NET offers two interception points for security and integrity checks:
BeforeSaveEntity where you can inspect and confirm every entity individually before it gets saved to the database.
BeforeSaveEntities where you can inspect the entire batch as a whole ... to ensure that the save request is cohesive and legitimate. This is something you can't do in a simple REST-ish api where PUT/POST/DELETE requests arrive as separate, autonomous events
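The difference between the two interception points can be illustrated with a plain Python sketch. This is not the Breeze .NET API; the function names, the entity dictionaries and the business rule are all invented to show a per-entity check followed by a whole-batch check.

```python
class SaveRejected(Exception):
    """Raised when a save batch fails a whole-batch check."""


def check_entity(user, entity):
    """Per-entity check: keep only entities owned by the current user."""
    return entity["owner"] == user


def check_batch(batch):
    """Whole-batch check (made-up rule): every OrderDetail in the batch
    must refer to an Order that is part of the same batch."""
    order_ids = {e["id"] for e in batch if e["type"] == "Order"}
    for entity in batch:
        if entity["type"] == "OrderDetail" and entity["order_id"] not in order_ids:
            raise SaveRejected("OrderDetail refers to an order outside the batch")
    return batch


def save_changes(user, batch):
    # First inspect every entity on its own, then the surviving batch as a whole.
    kept = [entity for entity in batch if check_entity(user, entity)]
    return check_batch(kept)


batch = [
    {"type": "Order", "id": 1, "owner": "alice"},
    {"type": "OrderDetail", "id": 10, "order_id": 1, "owner": "alice"},
    {"type": "Order", "id": 2, "owner": "bob"},  # dropped by the per-entity check
]
to_save = save_changes("alice", batch)
```

A rule like the one in `check_batch` cannot be expressed when each entity arrives in its own request, because no single request sees the other entities.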
The Breeze query language is highly expressive so it is possible that the client may query the server for something that you were not expecting. The expand clause is probably the most "dangerous" in this regard. Someone can query the Customer endpoint and get their related Orders, their related OrderDetails, the related Products, etc ... all at the same time.
That is a great power and with it comes responsibility. You may choose to withhold that power by refusing to allow expand queries. You can refuse select queries that can "expand" by selecting related entities. The ASP.NET Web API makes it easy to impose these restrictions.
Alternatively, you can allow expand in some cases and not others. Or you can inspect the query request within the GET endpoint's implementation and refuse it if it fails your security checks.
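As a rough picture of "allow expand in some cases and not others", here is a framework-neutral Python sketch that inspects the query string of an incoming request. The allow-list contents and the function name `check_expand` are assumptions; on the real server this check would live in the GET endpoint or in Web API configuration.

```python
from urllib.parse import parse_qs

# Hypothetical allow-list: Customers may be expanded to Orders, nothing deeper.
ALLOWED_EXPANDS = {"Orders"}


def check_expand(query_string):
    """Return (ok, rejected_paths) for the $expand option of a query string."""
    params = parse_qs(query_string)
    requested = []
    for value in params.get("$expand", []):
        requested.extend(path.strip() for path in value.split(",") if path.strip())
    rejected = [path for path in requested if path not in ALLOWED_EXPANDS]
    return (not rejected, rejected)


ok, rejected = check_expand("$top=10&$expand=Orders,Orders/OrderDetails")
```

A request with no `$expand` at all passes, and a request that reaches past the allow-list can be refused before any query is run.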
You could decide that you don't want certain entity types to be "queryable" at all. You can create just the specialized GET endpoints you need to support safe access to those highly sensitive types. If the supporting methods don't return IQueryable, neither Breeze nor Web API will attempt to turn the OData query parameters into LINQ queries. These endpoints look and behave exactly like the traditional REST-ish apis that are familiar to you now. And the Breeze client will be happy to play along. When you compose a Breeze query, the client doesn't know whether the server will honor that request. With Breeze you can compose any request you want and send it to any HTTP endpoint you desire. You're not limited to OData-style queries.
You don't need ONE approach for querying. You decide what entity types are exposed, how, and under what conditions. You can and should write the guard logic to ensure that the proper data are returned by a query or updated by a save ... just as you would for any web api. Both Breeze and Web API give you a rich set of tools for writing such guard logic. Your options are unbounded.
Finally, I observe that Breeze-oriented apis tend to be much smaller than the typical RESTy api ... that is, they offer fewer endpoints and (in this sense) a smaller surface area. As a practical matter, that means you can concentrate your server-side security budget on fewer methods and potentially improve both the quality of that code and your capacity to scrutinize your api's security risks.

Is OData suitable for multi-tenant LOB application?

I'm working on a cloud-based line of business application. Users can upload documents and other types of object to the application. Users upload quite a number of documents and together there are several million docs stored. I use SQL Server.
Today I have a somewhat-restful-API which allow users to pass in a DocumentSearchQuery entity where they supply keyword together with request sort order and paging info. They get a DocumentSearchResult back which is essentially a sorted collection of references to the actual documents.
I now want to extend the search API to other entity types than documents, and I'm looking into using OData for this. But I get the impression that if I use OData, I will face several problems:
There's no built-in limit on what fields users can query, which means that either the perf will depend on whether they query an indexed field or not, or I will have to implement my own parsing of incoming OData requests to ensure they only query indexed fields. (Since it's a multi-tenant application and they share physical hardware, slow queries are not really acceptable since those affect other customers)
Whatever I use to access data in the backend needs to support IQueryable. I'm currently using Entity Framework which does this, but I will probably use something else in the future. Which means it's likely that I need to do my own parsing of incoming queries again.
There's no built-in support for limiting what data users can access. I need to validate incoming Odata queries to make sure they access data they actually have permission to access.
I don't think I want to go down the road of manually parsing incoming expression trees to make sure they only try to access data which they have access to. This seems cumbersome.
My question is: Considering the above, is using OData a suitable protocol in a multi-tenant environment where customers write their own clients accessing the entities?
I think it is suitable here. Let me give you some opinions about the problems you think you will face:
There's no built-in limit on what fields users can query which means
that either the perf will depend on if they query a indexed field or
not, or I will have to implement my own parsing of incoming OData
requests to ensure they only query indexed fields. (Since it's a
multi-tenant application and they share physical hardware, slow
queries are not really acceptable since those affect other customers)
True. However, you can check the filter for allowed fields and allow or deny the operation accordingly.
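A rough Python sketch of such a check is shown here. It is a token scan over the `$filter` string, not a real OData parser, and the field list, the reserved-word list and the function names are assumptions made for illustration.

```python
import re

# Hypothetical set of fields that are backed by an index.
INDEXED_FIELDS = {"Title", "CreatedAt", "OwnerId"}

# Operators, literals and a few functions that are not field names.
RESERVED = {
    "eq", "ne", "gt", "ge", "lt", "le", "and", "or", "not",
    "true", "false", "null", "datetime",
    "startswith", "endswith", "substringof", "tolower", "toupper",
}


def fields_in_filter(filter_expr):
    """Return the identifiers in a $filter expression that look like fields."""
    # Remove quoted string literals first so their contents are not scanned.
    without_strings = re.sub(r"'(?:[^']|'')*'", " ", filter_expr)
    identifiers = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", without_strings)
    return {name for name in identifiers if name not in RESERVED}


def is_filter_allowed(filter_expr):
    """Allow the query only if every referenced field is indexed."""
    return fields_in_filter(filter_expr) <= INDEXED_FIELDS


allowed = is_filter_allowed("Title eq 'and Body' and OwnerId eq 42")
denied = is_filter_allowed("Body eq 'quarterly report'")
```

The same idea extends to `$orderby`, which can be just as expensive on an unindexed column.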
Whatever I use to access data in the backend needs to support
IQueryable. I'm currently using Entity Framework which does this, but
i will probably use something else in the future. Which means it's
likely that I need to do my own parsing of incoming queries again.
Yes, there is a provider for EF. That means that if you use something else in the future, you will need to write your own provider. If you end up changing EF, you probably took a decision too early. I don't recommend WCF DS in that case.
There's no built-in support for limiting what data users can access. I
need to validate incoming Odata queries to make sure they access data
they actually have permission to access.
There isn't any out-of-the-box support for that in WCF Data Services, right. However, that is part of the authorization mechanism that you will need to implement anyway. But I have good news for you: doing it is pretty easy with QueryInterceptors, which simply intercept the query and restrict it based on the user's privileges. This is something you will have to implement regardless of the technology you use.
My answer: considering the above, WCF Data Services is a suitable choice in a multi-tenant environment where customers write their own clients accessing the entities, unless you change EF. And you should keep in mind the huge effort it saves you.
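The idea behind a query interceptor can be modeled in a few lines of Python: the server supplies its own predicate, and it is always combined with whatever the client asked for. This is only a conceptual sketch, not WCF Data Services code, and the `tenant_id` field and function names are assumptions.

```python
def tenant_interceptor(user):
    """Server-side predicate: a user only ever sees rows of their own tenant."""
    return lambda doc: doc["tenant_id"] == user["tenant_id"]


def run_query(documents, client_predicate, user):
    # The interceptor is ANDed with the client's filter, so the client
    # cannot widen the result beyond what the server permits.
    allowed = tenant_interceptor(user)
    return [doc for doc in documents if allowed(doc) and client_predicate(doc)]


documents = [
    {"id": 1, "tenant_id": "t1", "title": "Invoice"},
    {"id": 2, "tenant_id": "t2", "title": "Invoice"},
    {"id": 3, "tenant_id": "t1", "title": "Contract"},
]
user = {"name": "alice", "tenant_id": "t1"}

# The client filters on title only; tenant isolation is added by the server.
found = run_query(documents, lambda doc: doc["title"] == "Invoice", user)
```

Because the restriction is applied on the server for every query against the entity set, there is no need to inspect the client's expression to enforce it.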

Resources