How to automatically authenticate with Hadoop using Active Directory? - active-directory

We have an application that accesses Hadoop via HDFS, YARN, and Hive interfaces. This application works fine against Kerberos-secured clusters if kinit has been run. It also works fine if we call UserGroupInformation.loginUserFromKeytab(). We are able to delegate the HDFS and Hive tokens to YARN applications. The thing we cannot figure out is the following scenario:
The Hadoop cluster is secured using Kerberos
The Hadoop cluster either uses Active Directory as its KDC, or has
established a one-way trust between its KDC and the AD controller.
Our software is running in a session that has been authenticated
using AD directly on Windows, or via PAM or LDAP (or some other mechanism) on
Linux.
Our software queries the active AD session to extract a TGT or
equivalent, and relays that information to the Hadoop APIs (via
UserGroupInformation, presumably).
Hadoop authentication is thus achieved without the need for the user
to enter a principal, password, or keytab.
We know this is possible in theory, because there are two examples of software that achieve this. The first is HDFS Explorer from RedGate. The second is Hue. However, we just can't seem to figure out the right incantation, and even Hortonworks support can't seem to help.

Hue comes with a LDAP backend that can transparently authenticate users against your company directory,
Hue also comes with a KT renewer command for keeping its Kerberos ticket up to date. It is even ran automatically when using CM.

Related

Replace AD with Azure AD

We are using a third-party IT provider that handles our network administration and domain accounts, but as part of moving to a different office and setting up new infrastructure, we are considering dropping that and using Azure Active Directory only.
Researching the topic online seems to indicate that Azure AD is not a complete replacement for on-premises Active Directory, as things like local resource access and group policies outside of Azure would be missing. However, we are moving towards using Azure for most things (file storage, etc), so that should be fine if we still have that functionality there.
Before finalizing the decision to go in that direction, we just need to be certain of a few things:
1) Is there a way to create a new account in Azure AD so that it can be used to login from any machine in the office, without having to create it locally first and then connect the two?
2) Is there a way to sync user data, such as user/desktop files, across any devices the account is used to log into?
3) Is it possible to have an office printer configured in Azure so that it can be used with an Azure AD login, completely independent on any on-premises setup (i.e, not Hybrid Cloud Print, which seems to require an on-premises network/AD to be joined with Azure AD)?
The goal is to be able to log in and work from any internet-connected device, whether in the office or at home, without needing to use a VPN and/or remote desktop, and forego on-premises AD administration.
This is possible as long as the device is joined to Azure AD. Once the device is joined to Azure AD, then newly created cloud-only users can also login to the devices.
Ref: https://learn.microsoft.com/en-us/azure/active-directory/devices/concept-azure-ad-join
Enterprise state roaming should help in this aspect. It might not cover everything you are looking for but the important app-specific data and user settings are synced.
Ref: https://learn.microsoft.com/en-us/azure/active-directory/devices/enterprise-state-roaming-overview
There is no direct solution from Microsoft for pure cloud scenarios. There are few 3rd party services offered for this.
Ref: https://appsource.microsoft.com/en-us/product/azure/printix.64182edf-4951-40d5-91c8-733e1c896b70
Hope this helps.

Best way to implement SSO with .net applications

I have to implement Single Sign on with following user case:
We have three kinds of users:
1) Corporate employees [Stored in Active directory]
2) Clients can access our application
3) We have hosted separate application for each client and clients employees can access this application [hosted on our server] and number of employees can be have million.
So we cannot store use credentials in active directory because we need per user license to use.
Please help me to find better solutions
ADFS only works with AD. The next version of ADFS will work against LDAP so that's a possibility.
I would look at Azure Active Directory (you could use something like AAD Sync. to migrate your existing AD users) or something like OpenAM or PingFederate (both of which are Java based) which you have to pay for or something like shibboleth (Java based but open source). These all support LDAP.
Or if you want to go the SQL Server route, look at thinktecture's identityserver.

Connecting to ad-lds without credentials

I've generated an AD-LDS instance on a Windows Server 2008 R2 and successfully connected to it via ADSI Edit on a windows 7 machine (both computers are situated on the same domain).
My goal is to create a lightweight .NET program that will be run by all domain users and determine whether a specific user can or cannot perform a certain action (roles & groups).
So far i've managed to write most of it, but i'm now facing a small security issue: althought no credentials are required when running from the server itself, when running from another user (in the same domain, ofcourse), LDS connection requires the instance's administrator credentials - and i'm not too keen to leave this kind of thing lie around in my code.
I've search the web quite a lot for a way to bypass that (Active Directory binding? / SimpleBinding?), but all solutions i found involved SSL and certificate installations.
Is there a simple way for a user in the domain to connect the LDS instance without exposing his/the server's credentials?
Thanks.
Have you looked at permissions in the instance itself? There are groups you can add principals to. It sounds like you're running the code locally as the user that installed LDS which by default gets all sorts of perms, but other users were not granted enough rights (secure by default and all that).

JBoss Database Identity Propagation?

I've read how IBM's WebSphere can propagate the identity of a user back to a backend database (http://www.ibm.com/developerworks/websphere/techjournal/0506_barghouthi/0506_barghouthi.html). Does JBoss have similar functionality? Ideally, I'd like to be able to login to my JBoss application using SPNEGO and propagate that identity back to a PostgreSQL database using Kerberos or some other mechanism. Is this possible?
The article you've linked to could be used for that, but there are some caveats.
Having the app server re-authenticate as different users using Kerberos is probably not realistic. From my knowledge of Kerberos (admittedly limited), it is designed so that end-user interaction is required to do an actual authentication handshake. The user does the handshake with the app server over HTTP-- there isn't really a mechanism for asking them to re-authenticate with the DB itself.
You could use their hooks to issue "SET SESSION AUTHORIZATION TO ..." commands to PostgreSQL, though, if your app server performs its connections to the DB as a superuser. That doesn't actually re-authenticate, just changes the session authorisation role temporarily.
You could also use one of the myriad "store some session information in the DB" solutions (custom variables, PL/Perl etc global variables) and use their hooks to work with that. This is actually what their Oracle solution seems to do, it sets the client identifier which iirc is just a global variable in dbms_util somewhere that is included in views showing current sessions (and postgresql 9.0 has an "application name" that performs the same role)

What allows a Windows authentication username to work (flow) between 2 servers?

Typical ISP setup. One server is the web server, another is the DB SQL server. There is a local administrator account, let's say XYZ, created on both machines. So when I log in remotely, I am either WebServer\XYZ or DBServer\XYZ, depending where I log in.
Now, when I login to SQL Server SSMS on DBServer using Windows Authentication, and execute "SELECT SUSER_NAME()", I get DBServer\XYZ. That makes sense since it's picking up the fact that I logged in with those credentials.
Now, move over to the WebServer. I remotely login as WebServer\XYZ. I've installed the SQL client components there. When I launch SSMS, choose the DBServer, login with Windows Authentication, and execute "SELECT SUSER_NAME()", I somehow get DBSERVER\XYZ, instead of what I would assume should be WebServer\XYZ.
Somehow, the XYZ from the WebServer becomes the XYZ from the DBServer. Why is that? How does that happen? Surely, it can't just be because the names happen to be the same?
I've heard of trusted domains, but neither machine is a Domain Controller, so I don't have access to that info. How can I tell if it's trusted or not, without the GUI tools?
The reason I ask the question is because, I'm trying to implement the same thing on my XP laptop (using Virtual PC), so I can imitate the production environment, but I'm not having any luck.
The NTLM challenge between machines is a little more complex #Quassnoi indicates but it is similar. The machines may well be in the same domain or trusted domains, but the accounts you are using are local machine accounts, scoped only to the local machine's security access management.
Local SAM accounts patterned as machinename\userid are non-propagatable. You'd experience a series of negotiated fallbacks when you tried to authenticate against external resources using that account as follows:
Pass current domain/username/password hash token - it'll fail, the account is untrusted
Fallback - revert passing hash of UserID + Password
Fallback - revert to connecting as anonymous credentials.
The fallbacks can also be disabled through configuration, it is very common for anonymous authentication to be prevented.
As #Quassnoi indicates in this instance you managed to login using the #2 fallback.
To enable account credentials to propagate, you'd need the following to be true:
machines would need to be members of domains with at least one-way trust between each other (they don't necessarily have to be members of the same domain).
use domain accounts - not local machine accounts - would look something like domainname\userid. A special case is the Network Service account which has a proxy account in the domain scenario - domainname\machinename$.
How do you tell if your machine is a member of the domain? It's pretty easy if you've got interactive login to the machines. There are a few strategies
interactively the System control panel will show workgroup or domain membership. (Right-click properties on Computer in the start menu)
at the command-line, IPCONFIG /ALL will also show the default DNS prefix which is typically the same as your domain name.
I suspect your ISP would create a domain just to make it easy to manage and monitor their machines. Whether they'd let you create domain accounts is a different question.
You XYZ accounts seem to have same passwords on both machines, and they are not a part of a domain.
WebServer sends just XYZ as a username and answers all password challenges successfuly, as the passwords do match.
DbServer, of course, thinks of you as of DbServer/XYZ, as it knows of no others.
Exactly same thing happens when you try to access one standalone machine from another one over SMB. If your usernames and password match, you succeed.

Resources