HDFS Permissions: Overcoming the "Permission Denied" AccessControlException

After installing the services that make up a Hadoop cluster, the natural next step is to start experimenting with some of the tools. Unfortunately, the most common hindrance at this point is the following error:

Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=spry, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x

A quick Google search for this error turns up many responses suggesting a workaround: disable permission checking by setting the dfs.permissions property in hdfs-site.xml to false. That workaround is unnecessary, however, and it turns off all of HDFS's built-in permission checks!
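
For reference, the commonly suggested (and discouraged) workaround looks like the following entry in hdfs-site.xml; in newer Hadoop releases the same setting goes by the name dfs.permissions.enabled. It is shown here only so it can be recognized and left at its default of true:

<property>
  <name>dfs.permissions</name>
  <value>false</value>
</property>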

HDFS permissions mirror the POSIX permission model used on the local file system, and with the default (simple) authentication the identity HDFS checks is simply the name of the local account running the command. If we take a look at the last part of the error message, we will see that there are two additional ways to address the problem.
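
Before changing anything, it can be useful to confirm the ownership directly; a listing of the parent directory will show the same owner (hdfs), group (supergroup), and mode (drwxr-xr-x) that appear in the error message:

hadoop fs -ls /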


inode="/user":hdfs:supergroup:drwxr-xr-x

The hdfs in this line is the user that owns the folder being accessed. Since the owning user will always have the correct permissions, one solution is to always interact with files as the user that owns them in HDFS. For example, if we re-ran the original command as the hdfs user using sudo -u hdfs, no error would be thrown. However, this is not an ideal solution, since there are many built-in users when working with Hadoop (hdfs, mapred, root, oozie, etc.), and this approach means consciously modifying any command found in standard documentation.
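
As an illustration, if the failing command had been a simple directory creation (hadoop fs -mkdir here is only a hypothetical stand-in for whatever command actually failed), it could be re-run as the hdfs user like this:

sudo -u hdfs hadoop fs -mkdir /user/spry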

inode="/user":hdfs:supergroup:drwxr-xr-x

The supergroup in this line is the group that owns the folder being accessed. More importantly, supergroup is also the default name of the HDFS superuser group, so members of that group pass every permission check (which is why this works even though the group bits on /user are only r-x). The ideal solution is therefore to create a user group called supergroup and add to it every user that should be able to work with those files in HDFS. This is performed using the following commands, run as root:

# Create the group; its name must match HDFS's superuser group (supergroup by default).
groupadd supergroup
# Add the user from the error message to the group; -a appends without removing existing group memberships.
usermod -a -G supergroup spry
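
To confirm the change took effect, groups shows the local membership, and (in Hadoop 2 and later) hdfs groups shows the membership as the NameNode resolves it; note that by default group lookups happen on the NameNode's host, so the group must exist there as well:

groups spry
hdfs groups spry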


Once the user is in the proper group, the original command can be re-run without the AccessControlException being thrown. The name of this overarching group is specified by the dfs.permissions.supergroup property in hdfs-site.xml, so if your environment already keeps all of the appropriate users in an existing group, pointing that property at that group will have the same effect.
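
For example, if all of the appropriate users already belong to a group named hadoopusers (a hypothetical name), the property in hdfs-site.xml would look like the following; the NameNode generally needs to be restarted for the change to take effect:

<property>
  <name>dfs.permissions.supergroup</name>
  <value>hadoopusers</value>
</property>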

7 comments:

  1. This is a really nice post! Thanks, it helped me! :)

  2. Good observation. With the help of this article my issue got resolved; I am able to connect to a remote Hadoop server from Eclipse on my local Windows machine. Thanks

  3. Thanks, this helped me resolve my issue.

  4. This is a small post, but it's really helpful. Great work!

  5. [root@master tmp]# hadoop fs -mkdir /tmp
    mkdir: Permission denied: user=root, access=WRITE, inode="/":hdfs:supergroup:drwxr-xr-x

    I created a new user, since I didn't want to mess with permissions for the root user, and added it to the supergroup group:
    [root@master ~]# groupadd supergroup
    [root@master ~]# useradd ravi
    [root@master ~]# chmod -R 777 /home/ravi
    [root@master ~]# usermod -a -G supergroup ravi
    After logging in as user ravi, I executed the command again and it completed successfully.
    Thank you very much; your post was the most helpful in understanding HDFS security.
