Can all Twilio resource SIDs be considered globally unique? - database

I work on an application that stores data from multiple Twilio accounts in a single backend DB. Part of the stored data is SIDs for various resources within an account - tasks, channels, services etc.
When designing my schema, can I consider all these various types of SIDs to be unique across all accounts, or would I need to consider them only to be unique within the scope of their owning account, and specify my constraints, relationships & queries accordingly?

Yes, they are globally unique :)
A String Identifier (SID) is a unique key that is used to identify specific resources.
source
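As a minimal sketch of what this could mean for a schema, the example below uses Python's built-in sqlite3 module: because a SID is unique on its own, it can act as the primary key, while the owning account is stored as an ordinary indexed column. The table name, column names and sample SIDs are made up for illustration, and the format check assumes the documented SID shape of a two-letter prefix followed by 32 hexadecimal digits.

```python
import re
import sqlite3

# Assumed SID shape: two-letter resource prefix + 32 hex digits (34 chars).
SID_PATTERN = re.compile(r"[A-Z]{2}[0-9a-fA-F]{32}")


def is_valid_sid(value: str) -> bool:
    return SID_PATTERN.fullmatch(value) is not None


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE resources (
        sid         TEXT PRIMARY KEY,  -- unique across all accounts
        account_sid TEXT NOT NULL,     -- owning account, kept for filtering
        kind        TEXT NOT NULL      -- e.g. 'task', 'channel', 'service'
    )
    """
)
conn.execute("CREATE INDEX idx_resources_account ON resources (account_sid)")


def store_resource(sid: str, account_sid: str, kind: str) -> None:
    if not (is_valid_sid(sid) and is_valid_sid(account_sid)):
        raise ValueError("not a well-formed SID")
    conn.execute(
        "INSERT INTO resources VALUES (?, ?, ?)", (sid, account_sid, kind)
    )


# Hypothetical sample values, not real SIDs.
ACCOUNT_A = "AC" + "a" * 32
ACCOUNT_B = "AC" + "b" * 32
TASK = "WT" + "1" * 32

store_resource(TASK, ACCOUNT_A, "task")
```

Keeping account_sid as a column still lets every query be scoped to one account. If you would rather not rely on global uniqueness, a composite key on (account_sid, sid) remains a valid defensive alternative.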

Related

User identity claims where the combination of two identifiers is unique

We are using IdentityServer4 for managing user identities and logins. In the business domain there are companies that are divided into organizations. Companies are distinguished by unique identifiers, while organization identifiers might clash between different companies, which makes only the combination of company ID and organization ID unique.
A normal user typically has access only to some organizations in one company, but there is a need for superusers that have access to organizations in multiple companies. The claims in IdentityServer were originally designed for only one company, so no real thought has been given to how multiple companies should fit in.
For a normal user it would be totally fine to have one company claim and one or many organization claims. But this does not really work for superusers who need to access many organizations in many companies.
How should this kind of combination identifier be modelled? The only way we can come up with is combining the company ID and organization ID into one claim, but that means that we must split the claim everywhere we need to have the identifiers separated, which feels rather cumbersome and error prone. So is there a better or more proper way, or is this perhaps a problem that should be fixed outside of Identity server?
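To make the combined-claim option concrete, here is a rough, language-neutral sketch in Python (not IdentityServer code). It assumes a hypothetical claim type named org_access and a ":" separator that never occurs inside either ID. The point is only that the splitting can live in one helper instead of being repeated everywhere; it does not settle whether a better modelling exists.

```python
from dataclasses import dataclass

CLAIM_TYPE = "org_access"  # hypothetical claim type name
SEPARATOR = ":"            # assumed delimiter; must not occur inside an ID


@dataclass(frozen=True)
class OrgAccess:
    company_id: str
    organization_id: str

    def to_claim(self) -> str:
        return f"{self.company_id}{SEPARATOR}{self.organization_id}"

    @classmethod
    def from_claim(cls, value: str) -> "OrgAccess":
        company_id, sep, organization_id = value.partition(SEPARATOR)
        if not sep or not company_id or not organization_id:
            raise ValueError(f"malformed organization claim: {value!r}")
        return cls(company_id, organization_id)


def parse_org_claims(claims, claim_type=CLAIM_TYPE):
    """Collect every (company, organization) pair from (type, value) claims."""
    return {
        OrgAccess.from_claim(value)
        for kind, value in claims
        if kind == claim_type
    }


# A superuser with the same organization ID in two different companies.
claims = [
    ("sub", "alice"),
    ("org_access", "acme:sales"),
    ("org_access", "globex:sales"),
]
granted = parse_org_claims(claims)
```

With the parsing centralised, the rest of the application can work with OrgAccess values and never touches the raw claim string.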

In general with Active Directory, what do most companies use as unique identifier for people?

I am trying to build a database that stores Active-Directory entries for users/employees.
Is it safe to assume to query on: (objectClass=person)
What attribute should I store as a unique identifier that isn't the DN? e.g. should I use mail or uid
Also when an employee gets de-activated is there a new attribute that gets added or are they simply removed entirely from AD?
Your question seems somewhat opinion based, but I'll answer it in the context of the general options available in AD and the usual practices followed.
Is it safe to assume to query on: (objectClass=person)?
All users created do fall under (objectClass=person). But if you create a generic user that is not an employee, for example one for file-share access on a system (through ADUC (dsa.msc), PowerShell, C#, etc.), it would still match your search condition because it is also of the person class. I can think of many other scenarios where creating generic users (which again belong to the person objectClass) is unavoidable, at least for mid-sized companies and above.
Hence it is better to follow a naming convention in your environment to avoid any such confusion. For example, set the UPN/sAMAccountName of non-employee users to start with genXXXX, and you can then easily search for employee users only.
What attribute should I store as a unique identifier that isn't the DN? e.g. should I use mail or uid?
There are unique identifiers already available in AD like objectGUID and objectSid. In a domain, the sAMAccountName/UPN values are also unique. But, you cannot rely on that for forest-level search.
objectSid for a user can change when the user is migrated to another domain, but objectGUID never changes. You can read more about SIDs versus GUIDs here.
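If you store objectGUID, note that LDAP clients return it as 16 raw bytes with the first three fields in little-endian order. The sketch below shows one way to turn that into the usual text form using Python's standard uuid module; the sample bytes are invented for illustration.

```python
import uuid


def object_guid_to_str(raw: bytes) -> str:
    """Convert a 16-byte objectGUID value into canonical GUID text."""
    if len(raw) != 16:
        raise ValueError("objectGUID must be exactly 16 bytes")
    # The first three GUID fields arrive in little-endian byte order,
    # which is the layout the bytes_le constructor expects.
    return str(uuid.UUID(bytes_le=raw))


# Hypothetical raw value, shaped like what an LDAP client would return.
sample_raw = bytes.fromhex("33221100554477668899aabbccddeeff")
print(object_guid_to_str(sample_raw))
# prints 00112233-4455-6677-8899-aabbccddeeff
```

Storing the text form (or a native UUID column) gives you a key that stays stable even if the DN, mail or sAMAccountName changes.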
Also when an employee gets de-activated is there a new attribute that gets added or are they simply removed entirely from AD?
There is no automatic trigger on the AD side. There is an attribute called lastLogonTimestamp which helps keep track of when a user or computer account last logged on to the domain (it is not a live value but a recent one, depending on whether it keeps updating properly).
Someone has to manually disable or delete the account when an employee/user leaves the organisation. Companies usually have processes set up for this, where access management solutions are linked with AD modules, handle the entry and exit of users, and perform the relevant actions in AD.
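When an account is disabled rather than deleted, no new attribute appears; the state is recorded as a bit in the existing userAccountControl attribute. A small sketch of how a sync job might test for it follows. The helper name is made up, while the flag value and the bitwise matching rule OID are the standard AD ones.

```python
# Bit in userAccountControl that marks an account as disabled.
ACCOUNTDISABLE = 0x0002

# LDAP filter for disabled user accounts, using the bitwise-AND matching rule.
DISABLED_USERS_FILTER = (
    "(&(objectCategory=person)(objectClass=user)"
    "(userAccountControl:1.2.840.113556.1.4.803:=2))"
)


def is_disabled(user_account_control: int) -> bool:
    """Return True if the ACCOUNTDISABLE bit is set."""
    return bool(user_account_control & ACCOUNTDISABLE)


# 512 is a normal enabled account; 514 is the same account once disabled.
print(is_disabled(512), is_disabled(514))
```

Deleted accounts, by contrast, simply stop appearing in normal searches, so a database mirroring AD has to detect their absence itself.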
Hope this gives a rough idea of how to handle the points you raised.

How to adequately combine a database and an LDAP directory?

I intend to build a system that stores information in a relational database (PostgreSQL) and in a directory (OpenLDAP).
the directory is for the system's users (workers, customers, brokers), each of them has a UUID that uniquely identifies them;
the database is for things that change often (e.g. order details, transactions currently in progress, etc);
some tables in the database will have a UUID attribute, pointing to entities in OpenLDAP.
A directory is chosen because I want to leverage the ability to build entities that have variable sets of attributes, or entities that inherit or combine attributes from other classes of entities. Such flexibility is needed to support a wider range of business cases.
In other words, the directory provides some "object-orientedness", which would have to be reinvented from scratch, had I chosen to use an RDBMS exclusively.
There's a catch that I am not sure how to deal with yet: if a table refers to a UUID that was removed from the directory - the database will effectively point to nothing. Thus the data will become incomplete or inconsistent.
My solution is to never remove entities from OpenLDAP, instead just mark them as "inactive".
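The "mark as inactive" idea can be sketched as follows. A plain dictionary stands in for OpenLDAP and sqlite3 for PostgreSQL, so all names here are illustrative assumptions and not a real integration. The sketch also shows a periodic check that would report any UUID the database references but the directory no longer knows.

```python
import sqlite3
import uuid

# Stand-in for the directory: UUID -> entry. A real system would query OpenLDAP.
directory = {}


def add_entry(name: str) -> str:
    entry_id = str(uuid.uuid4())
    directory[entry_id] = {"name": name, "active": True}
    return entry_id


def deactivate_entry(entry_id: str) -> None:
    # Soft delete: the entry stays resolvable and is only flagged.
    directory[entry_id]["active"] = False


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_uuid TEXT NOT NULL)"
)


def dangling_references() -> list:
    """Return UUIDs used in the database that the directory no longer knows."""
    rows = conn.execute("SELECT DISTINCT customer_uuid FROM orders")
    return [u for (u,) in rows if u not in directory]


customer = add_entry("Example Customer")
conn.execute("INSERT INTO orders (customer_uuid) VALUES (?)", (customer,))
deactivate_entry(customer)
```

Because nothing is ever removed from the directory, dangling_references() stays empty; the check only matters as a safety net against accidental hard deletes.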
My questions:
is it common practice to utilize a directory and a relational database in such a way?
are there other approaches that can solve the same problem using a single component?

Is there existing terminology / pattern to cover this scenario?

In our SaaS system we're dividing users into separate "pools" according to the customer that originally "owns" the user. We're using "email addresses plus ID of owning organisation" to identify users, rather than just email addresses - so duplicate email addresses can exist between customers (don't ask). Users arrive at the site on various subdomains, and we use these subdomains to identify the "user pool" we're authenticating the user against.
My question: is there any established name for this pattern or something similar?
Cheers!
In database terminology, when uniquely identifying a row using more than one column, this is called a composite primary key (aka compound key).
The scenario you describe is used commonly when a single database is used for multiple customers - one form of multitenancy.
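A minimal sketch of such a composite key, using Python's sqlite3 module with invented table and column names: the same email address can exist under two organisations, but not twice within one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE users (
        org_id INTEGER NOT NULL,  -- owning organisation (the tenant)
        email  TEXT    NOT NULL,
        name   TEXT,
        PRIMARY KEY (org_id, email)
    )
    """
)
# The same address is allowed once per organisation.
conn.execute("INSERT INTO users VALUES (1, 'sam@example.com', 'Sam at org 1')")
conn.execute("INSERT INTO users VALUES (2, 'sam@example.com', 'Sam at org 2')")


def find_user(org_id: int, email: str):
    """Look a user up by the full composite key."""
    return conn.execute(
        "SELECT name FROM users WHERE org_id = ? AND email = ?",
        (org_id, email),
    ).fetchone()
```

Every lookup has to supply both parts of the key, which is why the tenant must be known before authentication starts.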
"home-realm-discovery" is a common term for identifying what tenant a user belongs to in a multi-tenant SaaS application. It's most often talked about in the context of Federated Identity but applies in your case too. Using a sub-domain like you're doing is a common practice.
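The subdomain lookup might look roughly like this. The base domain, the tenant table and the function name are all hypothetical, and a real application would load the mapping from its own configuration.

```python
BASE_DOMAIN = "example.com"  # hypothetical base domain of the SaaS site

# Hypothetical mapping from subdomain to tenant ("user pool") ID.
TENANT_BY_SUBDOMAIN = {"acme": 1, "globex": 2}


def resolve_tenant(host: str):
    """Return the tenant ID for a Host header value, or None if unknown."""
    # Drop an optional port, normalise case and a trailing dot.
    hostname = host.split(":", 1)[0].lower().rstrip(".")
    suffix = "." + BASE_DOMAIN
    if not hostname.endswith(suffix):
        return None
    subdomain = hostname[: -len(suffix)]
    return TENANT_BY_SUBDOMAIN.get(subdomain)


print(resolve_tenant("acme.example.com"))  # prints 1
```

The resolved tenant ID is then combined with the email address to find the user, so two customers can share an address without colliding.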
I am not aware of any specific name for this scenario, but in general, this would fall under the phrase "multi-tenant" / "multi-tenancy". Many SaaS implementations do customer (or rather tenant) based branding already on the login screen, which would mean that they'd have to identify the user based on the URL / subdomain, or at least in some way other than the email address used.
Routing to different servers based on the subdomain is also a common way to achieve tiered service levels for SaaS implementations.
I'm not sure I've answered the question, but I hope the general info helps!

Multi Tenant Database with some Shared Data

I have a full multi-tenant database with TenantID's on all the tenanted databases. This all works well, except now we have a requirement to allow the tenanted databases to "link to" shared data. So, for example, the users can create their own "Bank" records and link accounts to them, but they could ALSO link accounts to "global" Bank records that are shared across all tenants.
I need an elegant solution which keeps referential integrity.
The ways I have come up with so far:
Copy: all shared data is copied to each tenant, perhaps with a "System" flag. Changes to shared data involve huge updates across all tenants. Probably the simplest solution, but I don't like the data duplication
Special ID's: all links to shared data use special ID's (e.g. negative ID numbers). These indicate that the TenantID is not to be used in the relation. You can't use an FK to enforce this properly, and certainly cannot reuse ID's within tenants if you have ANY FK. Only triggers could be used for integrity.
Separate ID's: all tables which can link to shared data have TWO FK's; one uses the TenantID and links to local data, the other does not use TenantID and links to shared data. A constraint indicates that one or the other is to be used, not both. This is probably the most "pure" approach, but it just seems...ugly, but maybe not as ugly as the others.
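For reference, the "Separate ID's" option can be sketched as below, using Python's sqlite3 module. Table and column names are invented, and the CHECK expression is just one way to say "exactly one of the two links is set".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript(
    """
    CREATE TABLE shared_banks (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE tenant_banks (
        tenant_id INTEGER NOT NULL,
        id        INTEGER NOT NULL,
        name      TEXT NOT NULL,
        PRIMARY KEY (tenant_id, id)
    );
    CREATE TABLE accounts (
        tenant_id      INTEGER NOT NULL,
        id             INTEGER NOT NULL,
        local_bank_id  INTEGER,  -- link to the tenant's own bank
        shared_bank_id INTEGER,  -- link to a global bank
        PRIMARY KEY (tenant_id, id),
        FOREIGN KEY (tenant_id, local_bank_id)
            REFERENCES tenant_banks (tenant_id, id),
        FOREIGN KEY (shared_bank_id) REFERENCES shared_banks (id),
        -- exactly one of the two links must be present
        CHECK ((local_bank_id IS NULL) <> (shared_bank_id IS NULL))
    );
    INSERT INTO shared_banks VALUES (1, 'Global Bank');
    INSERT INTO tenant_banks VALUES (10, 1, 'My Bank');
    INSERT INTO accounts VALUES (10, 1, 1, NULL);  -- links to local bank
    INSERT INTO accounts VALUES (10, 2, NULL, 1);  -- links to shared bank
    """
)
```

Both links are real foreign keys, so integrity is enforced without triggers; the cost is the extra nullable column on every table that can point at shared data.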
So, my question is in two parts:
Are there any options I haven't considered?
Has anyone had experience with these options and has any feedback on advantages/disadvantages?
A colleague gave me an insight that worked well. Instead of thinking about tenant access as per-tenant, think about it as group access. A tenant can belong to multiple groups, including its own specific group. Data then belongs to a group, possibly the tenant's specific group, or maybe a more general one.
So, "My Bank" would belong to the Tenant's group, "Local Bank" would belong to a regional grouping which the tenant has access to, and "Global Bank" would belong to the "Everyone" group.
This keeps integrity, FK's and also adds in the possibility of having hierarchies of tenants, not something I need at all in my scenario, but a nice little possibility.
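A rough sketch of the group idea, again with Python's sqlite3 module and invented names: every bank row belongs to exactly one group, and a tenant sees the rows of every group it is a member of.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript(
    """
    CREATE TABLE access_groups (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE tenant_groups (
        tenant_id INTEGER NOT NULL,
        group_id  INTEGER NOT NULL REFERENCES access_groups (id),
        PRIMARY KEY (tenant_id, group_id)
    );
    CREATE TABLE banks (
        id       INTEGER PRIMARY KEY,
        group_id INTEGER NOT NULL REFERENCES access_groups (id),
        name     TEXT NOT NULL
    );
    INSERT INTO access_groups VALUES (1, 'Everyone'), (2, 'Tenant 10'), (3, 'Tenant 20');
    INSERT INTO tenant_groups VALUES (10, 1), (10, 2), (20, 1), (20, 3);
    INSERT INTO banks VALUES (1, 1, 'Global Bank'), (2, 2, 'My Bank');
    """
)


def visible_banks(tenant_id: int) -> list:
    """Return the names of all banks in any group the tenant belongs to."""
    rows = conn.execute(
        """
        SELECT b.name
        FROM banks AS b
        JOIN tenant_groups AS tg ON tg.group_id = b.group_id
        WHERE tg.tenant_id = ?
        ORDER BY b.name
        """,
        (tenant_id,),
    )
    return [name for (name,) in rows]
```

Here tenant 10 sees both 'Global Bank' and 'My Bank', while tenant 20 sees only 'Global Bank', and every link is still a single ordinary foreign key.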
At Citus, we're building a multi-tenant database using PostgreSQL. For shared information, we keep it in what we call "reference" tables, which are indeed copied across all the nodes. However, we keep this in-sync and consistent using 2PC, and can also create FK relationships between reference and non-reference data.
You can find more information here.
