Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=spry, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x
A quick Google search for this error will lead to many responses suggesting the workaround of disabling permission checking by setting the dfs.permissions.enabled property (dfs.permissions on older releases) in hdfs-site.xml to false. However, that workaround is unnecessary, and it disables all of HDFS's built-in permission checking!
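If you only want to confirm how your cluster is currently configured rather than change it, a quick sanity check (assuming the hdfs client on your machine points at the cluster's configuration):

# Print the effective value of the permission-checking flag; "true" (the default) means checks are enforced.
hdfs getconf -confKey dfs.permissions.enabled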
HDFS permissions follow the same owner/group/mode model as the local file system, and the users and groups they reference are the operating system accounts of whoever is accessing the cluster. The last part of the error message (inode="/user":hdfs:supergroup:drwxr-xr-x) tells us that /user is owned by the user hdfs and the group supergroup, and it points to two additional ways to address the problem.
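You can confirm that ownership at any time without decoding the exception (again assuming the hdfs client is configured for your cluster):

# List /user itself (-d) rather than its contents; the output should show owner hdfs and group supergroup.
hdfs dfs -ls -d /user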
The first of these is the owner, hdfs, the user that owns the folder being accessed. Since the owning user will always have the correct permissions, one solution is to always interact with files as the user that owns them in HDFS. For example, if we re-ran the original command as the hdfs user with sudo -u hdfs, it would not throw an error. However, this is not an ideal solution, since there are many built-in users when working with Hadoop (hdfs, mapred, root, oozie, etc.), and any command found in normal documentation would need a conscious modification to run as the appropriate user.
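As an illustration only (the exact command depends on what originally failed, and /user/spry is a hypothetical home directory for the spry user), the approach looks like this:

# Run the operation as the hdfs user instead of as spry.
sudo -u hdfs hdfs dfs -mkdir /user/spry
# Optionally hand the new directory back to the intended user.
sudo -u hdfs hdfs dfs -chown spry:spry /user/spry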
The second is the group, supergroup, which owns the folder being accessed and, by default, is also HDFS's superuser group, so its members pass permission checks. The ideal solution is to create a user group called supergroup and add to it every user that should access files under that ownership in HDFS. This is performed using the following commands:
# Create the group if it does not already exist, then add the user to it.
groupadd supergroup
usermod -a -G supergroup spry
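Note that with the default shell-based group mapping, the NameNode resolves group membership on its own host, so the change must be visible there, and the mapping is cached. You can check what HDFS sees and, if needed, refresh it (whether a refresh is required depends on your configuration):

# Show the groups for spry as resolved by HDFS.
hdfs groups spry
# If the new group does not appear yet, clear the NameNode's cached user-to-group mappings (run as an HDFS superuser).
sudo -u hdfs hdfs dfsadmin -refreshUserToGroupsMappings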
Once the user is in the proper group, the original command can be re-run without the AccessControlException being thrown. The name of this overarching group is specified by the dfs.permissions.superusergroup property in hdfs-site.xml, so if your environment already has all of the appropriate users in an existing group, changing the value of that property to that group's name will have the same effect.
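To see which group name your cluster is actually using:

# Print the configured HDFS superuser group; the default is "supergroup".
hdfs getconf -confKey dfs.permissions.superusergroup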