Deploying an AWS EMR cluster with on-prem/cross-cloud Active Directory Authentication
If you're ever in the enviable position of having to get your AWS Elastic MapReduce (EMR) cluster authenticating against an on-prem/cross-cloud Active Directory instance, this post is for you!
Let's break this down into the separate pieces we're going to need:
A VPN/Direct-Connect connection to the on-prem/cross-cloud Active Directory network
AWS actually has all of this pretty well documented, so I'm not going to list individual steps. However, I'll list a couple of gotchas that ended up taking us a couple of days to work through.
First, the resources:
Setting up a VPN connection to your AD network: https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/SetUpVPNConnections.html
Instead of deploying a Windows Server EC2 instance, use a machine on your internal network (or your router) and work through the steps.
Deploying a Kerberized EMR cluster with a cross-realm AD trust: https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-kerberos-cross-realm.html
Things to look out for:
- You will need to initiate a connection from your external network to the AWS VPC to activate the VPN after you configure it. The easiest way to do this is to allow incoming ICMP packets to an existing EC2 instance in your VPC and ping it.
- When you create the DHCP option set that specifies your AD DC as a DNS server, there is a line that specifies xx.xx.xx.xx,AmazonProvidedDNS. You literally have to enter the string AmazonProvidedDNS after the IP of your DC. I'd never used a DHCP option set before, so that tripped me up for a bit.
- Follow the casing exactly as specified in the post for the realms, domains and servers
- You do not have to add individual users. The EMR deployment handles the PAM and sssd configs. If your cluster has been set up correctly, you should be able to ssh into the cluster with AD_username@yourdomain.com and the user's AD password. The first time you log in as a user, the user's home directory is automatically created and a Kerberos ticket is requested.
- You can absolutely configure the trust to be transitive. I'm not sure why the AWS documentation specifies a non-transitive one. (This can be changed after the initial deployment, so it's not a big deal.)
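The first two gotchas (the ICMP rule for the ping, and the DHCP option set) can also be scripted. What follows is only a rough sketch using boto3, not the way we actually did it. The function names, the security group ID, the VPC ID, the CIDR, and the DC IP are all placeholders I made up for illustration. Note that through the API the DNS servers are a list of values rather than the comma-separated string you type in the console.

```python
def icmp_echo_rule(source_cidr: str) -> dict:
    """Build an IpPermissions entry that allows ping from source_cidr."""
    return {
        "IpProtocol": "icmp",
        "FromPort": 8,   # for ICMP this field is the type; 8 = echo request
        "ToPort": -1,    # for ICMP this field is the code; -1 = any code
        "IpRanges": [
            {"CidrIp": source_cidr, "Description": "ping to bring up the VPN"}
        ],
    }


def dhcp_configurations(dc_ip: str) -> list:
    """Build DhcpConfigurations: the AD DC first, then AmazonProvidedDNS."""
    return [
        {
            "Key": "domain-name-servers",
            # The second value is the literal string, not a placeholder.
            "Values": [dc_ip, "AmazonProvidedDNS"],
        }
    ]


def apply_network_settings(vpc_id: str, security_group_id: str,
                           source_cidr: str, dc_ip: str) -> str:
    """Apply both settings to a VPC (hypothetical helper, needs AWS credentials)."""
    import boto3  # imported here so the builders above work without the SDK

    ec2 = boto3.client("ec2")
    ec2.authorize_security_group_ingress(
        GroupId=security_group_id,
        IpPermissions=[icmp_echo_rule(source_cidr)],
    )
    created = ec2.create_dhcp_options(
        DhcpConfigurations=dhcp_configurations(dc_ip)
    )
    options_id = created["DhcpOptions"]["DhcpOptionsId"]
    ec2.associate_dhcp_options(DhcpOptionsId=options_id, VpcId=vpc_id)
    return options_id
```

Something like apply_network_settings("vpc-0123example", "sg-0123example", "192.168.0.0/16", "192.168.1.10") would then do both in one go, with those IDs and addresses swapped for your own. Instances that are already running only pick up the new DNS servers when they renew their DHCP lease.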
Slow Kerberos auth/tickets?
- Kerberos tries UDP before TCP by default. Switching to TCP sped things up significantly. Add the following line to the [libdefaults] section of krb5.conf:
udp_preference_limit = 1
This will prioritize TCP over UDP.
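If you need to make that krb5.conf change on more than one node, it can be scripted. Here is a minimal sketch, assuming a conventionally laid out krb5.conf with one [section] header per line. The function name set_udp_preference_limit is my own, and the script only transforms text; reading and writing the actual file (usually /etc/krb5.conf, as root) is left to you.

```python
import re


def set_udp_preference_limit(conf_text: str, limit: int = 1) -> str:
    """Return krb5.conf text with udp_preference_limit set in [libdefaults]."""
    setting = f"    udp_preference_limit = {limit}"
    out = []
    in_libdefaults = False
    seen_section = False
    for line in conf_text.splitlines():
        header = re.match(r"\s*\[(\w+)\]\s*$", line)
        if header:
            in_libdefaults = header.group(1) == "libdefaults"
            out.append(line)
            if in_libdefaults and not seen_section:
                # Put the setting directly under the section header.
                out.append(setting)
                seen_section = True
            continue
        # Drop any older value in [libdefaults] so the setting appears once.
        if in_libdefaults and re.match(r"\s*udp_preference_limit\s*=", line):
            continue
        out.append(line)
    if not seen_section:
        # No [libdefaults] section at all: append one with the setting.
        out += ["[libdefaults]", setting]
    return "\n".join(out) + "\n"
```

Keep a backup of the original file before overwriting it, since a broken krb5.conf will lock everyone out of Kerberos auth on that node.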
Feel free to reach out to me @rohchak if you have any questions!