Configuring Kerberos Security in Hortonworks Data Platform 2.0


Hadoop was originally created without any external security in mind. It was meant to be used by trusted users in a secure environment, and the constraints that were put in place were intended to prevent users from making mistakes, not to stop malicious actors from harming the system.

Any person who wanted to wreak havoc on a Hadoop system could impersonate another user to circumvent the imposed constraints. In other words, while Hadoop had the means to enforce restrictions through authorization policy, it lacked the ability to authenticate users.

As Hadoop became more widespread and saw use in the enterprise arena, security from outsiders became a real concern. The need for authentication became critical.

Enter Kerberos. Kerberos is a single sign-on authentication protocol that uses the concept of “tickets” to provide identity. A user provides their username and password once to the Kerberos server, either directly or through a keytab file, and receives a “ticket granting ticket” (TGT) that can be used to request tickets for any specific service.

The Hadoop community has accepted Kerberos as the de facto standard and it is supported in most Hadoop services and tools. For example, after receiving a TGT, a user can use it to obtain a ticket for Hive and then use the Hive ticket to interact with the Hive service.
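As an illustration, the ticket lifecycle from the command line looks roughly like the session below (the username, realm, and timestamps are placeholders; your klist output will vary):

```
$ kinit alex@SPRY.DEV.COM
Password for alex@SPRY.DEV.COM:
$ klist
Ticket cache: FILE:/tmp/krb5cc_500
Default principal: alex@SPRY.DEV.COM

Valid starting     Expires            Service principal
02/01/14 10:00:00  02/02/14 10:00:00  krbtgt/SPRY.DEV.COM@SPRY.DEV.COM
```

The krbtgt entry is the TGT; service tickets obtained with it show up as additional lines in the same cache.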

Configuring Kerberos in HDP 2.0

Install, configure and start your Kerberos server

The following steps should be performed on the machine where your Kerberos services will run. It is recommended that you install them on a server separate from your Hadoop cluster.

1. Install the Kerberos server and client packages
sudo yum install krb5-server krb5-workstation

2. Modify /etc/krb5.conf with the correct realm and hostnames. Here is the one I used for a single Kerberos server, containing both the Key Distribution Center (KDC) and the Kerberos Admin service:

[logging]
 default = FILE:/var/log/krb5libs.log
 kdc = FILE:/var/log/krb5kdc.log
 admin_server = FILE:/var/log/kadmind.log

[libdefaults]
 default_realm = SPRY.DEV.COM
 dns_lookup_realm = false
 dns_lookup_kdc = false
 ticket_lifetime = 24h
 renew_lifetime = 7d
 forwardable = true

[realms]
 SPRY.DEV.COM = {
  kdc = vm-centos6-4-anastetsky
  admin_server = vm-centos6-4-anastetsky
 }

[domain_realm]
 vm-centos6-4-anastetsky = SPRY.DEV.COM

Replace SPRY.DEV.COM with the name of the Kerberos realm.
Replace vm-centos6-4-anastetsky with the host name of the Kerberos server.

The realm name can be anything; it is just a namespace.

If you have more than one Kerberos machine, e.g. you have the KDC and Admin on separate machines, you can specify a domain in the domain_realm section instead of an explicit host. For example:

[domain_realm]
 .spry.dev.com = SPRY.DEV.COM

Then any machine under that domain, e.g. kdc.spry.dev.com and admin.spry.dev.com, will map to the realm. (The leading dot matches all hosts in the domain.)

3. Create the initial Kerberos database and supply a master password
sudo kdb5_util create -s
4. Update /var/kerberos/krb5kdc/kadm5.acl for principals who have administrative access to the Kerberos database.
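For example, a single line like the following grants full administrative rights to every principal with an /admin instance (matching the admin principal created in step 6; the realm is the one used in this walkthrough):

```
*/admin@SPRY.DEV.COM *
```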
5. Start the kadmin service
sudo service kadmin start 
6.  Use kadmin.local to create an admin principal (e.g. alex/admin)
addprinc alex/admin
7. Start the Kerberos service (krb5kdc)
sudo service krb5kdc start
8. Make sure you open the right ports:
sudo iptables -I INPUT -p udp --dport 88 -j ACCEPT
sudo iptables -I INPUT -p tcp --dport 749 -j ACCEPT
sudo iptables -I INPUT -p udp --dport 464 -j ACCEPT
sudo service iptables save

Hadoop / Ambari configuration, part 1

1. Log in to your Ambari web interface as an admin user
2. Go to Admin > Security, and click Enable Security
3. Click Next.
4. Enter your realm name, e.g. SPRY.DEV.COM
5. Click Next.
6. Click Download CSV (host-principal-keytab-list.csv). (We will return to this wizard at the final step of this section.)
7. Copy the CSV to the server where Ambari is installed
8. Run the keytab script that ships with Ambari (under /var/lib/ambari-server/resources/scripts/) with host-principal-keytab-list.csv as its argument, redirecting the output to a file. This generates a custom script that you can use to create Kerberos principals and keytabs.
9. Copy the generated script to your Kerberos server.
10. Before you can run this script, you will need to create the appropriate users and group so that the keytabs will be properly "chowned". These users are never used on the Kerberos server and can be deleted once the generated script has been run.

The following users need to exist:
hdfs, yarn, mapred, hive, hbase, oozie, nagios, zookeeper, ambari-qa
You also need to create the "hadoop" group.
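The user and group creation can be sketched as follows. This is a dry run that only prints the commands; remove the echo prefixes and run them as root to actually create the accounts:

```shell
# Service accounts required for the keytab "chown" step (list from above).
KRB_USERS="hdfs yarn mapred hive hbase oozie nagios zookeeper ambari-qa"

# Dry run: print the commands instead of executing them.
echo "groupadd hadoop"
for u in $KRB_USERS; do
  echo "useradd -g hadoop $u"
done
```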

11. If you have any policies you want your Kerberos principals to be created with, edit the generated script in the section named "Creating Kerberos Principals". E.g.

kadmin.local -q "addprinc -randkey -policy user ambari-qa@SPRY.DEV.COM"
kadmin.local -q "addprinc -randkey -policy service HTTP/<hostname>@SPRY.DEV.COM"

12. Run the generated script with sudo. This creates a tar file for each node/host in your Hadoop cluster. Each tar contains the keytabs that need to be on that host.

13. Copy each tar file to the right host and extract it to the root directory (it already contains the correct directory structure).

sudo tar -C / -xvzf keytabs_hostname.tar.gz

NOTE: There is a utility that can do this for you automatically (a script under /var/lib/ambari-server/resources/scripts/ that uses SCP); however, I did not test it because, as of this writing, it does not support SSH agent forwarding and requires your private key to exist on the server where Ambari is installed.

14. On each Hadoop node, add the unlimited strength JCE policy jars to $JAVA_HOME/jre/lib/security/. You can download them from Oracle's site. Make sure to get the right ones for your version of Java.

15. Back in the Ambari Secure Wizard, click Apply. It will update the configuration and restart the services.
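The per-host copy-and-extract described above can also be scripted. The sketch below is a dry run that only prints the scp/ssh commands for each host; the hostnames are placeholders for your cluster's nodes:

```shell
# Hypothetical host list; replace with your cluster's actual node names.
CLUSTER_HOSTS="master1 worker1 worker2"

# Dry run: print the copy/extract commands rather than executing them.
# Remove the echo quoting to run them for real (requires SSH access).
for h in $CLUSTER_HOSTS; do
  echo "scp keytabs_${h}.tar.gz ${h}:/tmp/"
  echo "ssh ${h} sudo tar -C / -xvzf /tmp/keytabs_${h}.tar.gz"
done
```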

At this point all or most of the Hadoop services should start up, but there are still some things missing that the Ambari Secure Wizard does not do automatically.

Hadoop / Ambari configuration, part 2

1. Make the following additional config changes to the Hadoop services. This can be done from the Ambari UI by going to each service's configuration and adding custom properties.

hdfs-site: dfs.permissions.enabled = true
mapred-site: mapreduce.cluster.acls.enabled = true
hive-site: hive.server2.enable.impersonation = true
core-site: hadoop.proxyuser.hive.hosts = *
core-site: hadoop.proxyuser.hive.groups = *
core-site: hadoop.http.filter.initializers = org.apache.hadoop.security.AuthenticationFilterInitializer
core-site: hadoop.http.authentication.type = kerberos
core-site: hadoop.http.authentication.signature.secret.file = /etc/security/http_secret
core-site: hadoop.http.authentication.kerberos.principal = HTTP/_HOST@SPRY.DEV.COM
core-site: hadoop.http.authentication.kerberos.keytab = /etc/security/keytabs/spnego.service.keytab

Replace SPRY.DEV.COM with your actual realm name.
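If you edit core-site.xml by hand instead of using Ambari, the properties take the usual Hadoop XML form. Shown here for two of them; the rest follow the same pattern:

```xml
<property>
  <name>hadoop.http.authentication.type</name>
  <value>kerberos</value>
</property>
<property>
  <name>hadoop.http.authentication.signature.secret.file</name>
  <value>/etc/security/http_secret</value>
</property>
```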

2. Create the secret file for HTTP authentication that you referenced above.
sudo dd if=/dev/urandom of=/etc/security/http_secret bs=1024 count=1
3. Copy that file to the same location on every node.
4. Modify min.user.id from 1000 to 500 in /var/lib/ambari-agent/puppet/modules/hdp-yarn/templates/container-executor.cfg.erb (this is necessary on some distributions, such as CentOS, where normal user ids start at 500).
5. Add the Kerberos principal and keytab path to the Zookeeper JAAS client file -- modify /etc/zookeeper/conf/zookeeper_client_jaas.conf and make it look like this:

Client {
 com.sun.security.auth.module.Krb5LoginModule required
 useKeyTab=true
 useTicketCache=true
 keyTab="/etc/security/keytabs/zk.service.keytab"
 principal="zookeeper/ZK_HOST@SPRY.DEV.COM";
};
Replace ZK_HOST with the host where Zookeeper is installed.
Replace SPRY.DEV.COM with your realm name.

6. Restart the Hadoop services from Ambari
