Using Oozie in a Kerberized Cluster

In general, most Hadoop ecosystem tools work rather transparently in a kerberized cluster. Most of the time things “just work”, and this includes Oozie. Still, when things don’t “just work”, they tend to fail with slightly alarming and highly ambiguous error messages. Here are a few tips for using Oozie when your Hadoop cluster is kerberized. Note that this is a client/user guide: I assume you have already followed the documentation on configuring the Oozie server in a kerberized cluster (or you are using Cloudera Manager, which magically configures it for you).

    1. As always, use “kinit” to authenticate with Kerberos and get your TGT before trying to run Oozie commands. Verify with “klist”. Failure to do this will result in “No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)”.
    2. I always enable security debug messages. This helps me troubleshoot, and also helps when I need to ask support/mailing list for help.
      export HADOOP_ROOT_LOGGER=TRACE,console;
      export HADOOP_JAAS_DEBUG=true;
      export HADOOP_OPTS="-Dsun.security.krb5.debug=true"
    3. Your Oozie command typically contains a URL, something like “oozie job -oozie http://myserver:11000/oozie -config job.properties -run”. The server name in the URL must match the host name in an existing Kerberos principal. If your principals actually use “myserver.mydomain.com”, make sure you use that name in the URL.
    4. If you decide to use curl to connect to your Oozie server, either for troubleshooting or for using the REST API, don’t forget to use “--negotiate -u foo:bar”. The actual username and password don’t matter (you are authenticating with your Kerberos ticket), but curl throws a fit if they don’t exist.
    5. If you have a Hive action in your Oozie workflow, you need to define and use credentials. Here’s an example:
      <workflow-app xmlns="uri:oozie:workflow:0.2.5" name="example-wf">
              <credentials>
                      <credential name='hive_credentials' type='hcat'>
                              <property>
                                  <name>hcat.metastore.uri</name>
                                  <value>thrift://metastore_server:9083</value>
                              </property>
                              <property>
                                  <name>hcat.metastore.principal</name>
                                  <value>hive/_HOST@KERBDOM.COM</value>
                              </property>
                      </credential>
              </credentials>
      <start to="hive-example"/>
      <action name="hive-example" cred="hive_credentials">
              <hive xmlns="uri:oozie:hive-action:0.2">
                      <job-tracker>${jobTracker}</job-tracker>
                      <name-node>${nameNode}</name-node>
                      <job-xml>${hiveSiteXML}</job-xml>
                      <script>${dbScripts}/hive-example.hql</script>
              </hive>
              <ok to="end"/>
              <error to="fail"/>
      </action>
      <kill name="fail">
              <message>Workflow failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
      </kill>
      <end name="end"/>
      </workflow-app>
      
    6. To make step #5 actually work (i.e. allow Oozie to run Hive actions), you will also need to do the following in Cloudera Manager (CM):

      – Go to “HDFS Service->Configuration->Service-Wide->Advanced->Cluster-wide Configuration Safety Valve for core-site.xml” and add:

      <property>
      <name>hadoop.proxyuser.oozie.hosts</name>
      <value>*</value>
      </property>
      <property>
      <name>hadoop.proxyuser.oozie.groups</name>
      <value>*</value>
      </property>

      – Go to “Oozie service->Configuration->Oozie Server(default)->Advanced-> Oozie Server Configuration Safety Valve for oozie-site.xml” and add:

      <property>
      <name>oozie.credentials.credentialclasses</name>
      <value>hcat=org.apache.oozie.action.hadoop.HCatCredentials</value>
      </property>

      – Deploy client configuration and restart Hive service and Oozie service.

    7. Oozie doesn’t kinit a user for you on the node it’s launching the action on, and it doesn’t move principals and tickets around. Instead it uses delegation tokens. If you want to authenticate to Hadoop inside a shell or Java action, you’ll need to use the same tokens.

      In a shell action, it will be something like:

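      if [ -z "${HADOOP_TOKEN_FILE_LOCATION}" ]
      then
          # No token file: presumably not launched by Oozie, so rely on the Kerberos ticket
          hive -e "select x from test" -S
      else
          # Token file present: point Hive at the delegation tokens
          hive -e "SET mapreduce.job.credentials.binary=$HADOOP_TOKEN_FILE_LOCATION; select x from test" -S
      fi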
      

      In Java it will be:

      if (System.getenv("HADOOP_TOKEN_FILE_LOCATION") != null) {
                  jobConf.set("mapreduce.job.credentials.binary", System.getenv("HADOOP_TOKEN_FILE_LOCATION"));
      }
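
Tips 1, 3 and 4 above can be combined into a small pre-flight check that you run before submitting a job. The following is only a sketch under a few assumptions: the function names (url_host, is_fqdn, have_ticket, preflight) are made up for illustration, “klist -s” is assumed to behave as in MIT Kerberos (silent, reporting ticket validity via its exit status), and the status endpoint is assumed to live at v1/admin/status under the Oozie URL.

```shell
#!/usr/bin/env bash
# Pre-flight checks before talking to a kerberized Oozie server.
# All helper names here are hypothetical, not part of Oozie or Kerberos.

# Extract the host part of a URL such as http://myserver.mydomain.com:11000/oozie
url_host() {
  local rest="${1#*://}"        # drop the scheme
  rest="${rest%%/*}"            # drop the path
  printf '%s\n' "${rest%%:*}"   # drop the port
}

# Crude check: succeeds if the host name contains a dot.
# (An IP address also passes, so treat this as a hint, not a guarantee.)
is_fqdn() {
  case "$1" in
    *.*) return 0 ;;
    *)   return 1 ;;
  esac
}

# Succeeds only if the credentials cache holds a usable ticket.
# Assumes MIT Kerberos, where `klist -s` prints nothing and sets the exit status.
have_ticket() {
  klist -s
}

# Run all checks, then ask the server for its status over SPNEGO.
preflight() {
  local url="$1" host
  host="$(url_host "$url")"
  have_ticket || { echo "No valid Kerberos ticket; run kinit first" >&2; return 1; }
  is_fqdn "$host" || { echo "Host '$host' is not fully qualified; it may not match a principal" >&2; return 1; }
  # The -u value is a placeholder; authentication uses the Kerberos ticket.
  curl --silent --show-error --fail --negotiate -u foo:bar "$url/v1/admin/status"
}
```

With these functions sourced, something like “preflight http://myserver.mydomain.com:11000/oozie” would fail early with a readable message instead of one of the ambiguous Kerberos errors.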
      

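If a shell action runs more than one Hive query, the token check from the shell-action tip above can be factored into a helper so it is not repeated around every call. This is a minimal sketch; hive_token_prefix is a hypothetical name, and it relies on the same HADOOP_TOKEN_FILE_LOCATION variable and mapreduce.job.credentials.binary property used in that tip.

```shell
# Emit a HiveQL prefix that points Hive at the delegation tokens.
# Prints nothing when HADOOP_TOKEN_FILE_LOCATION is unset or empty.
hive_token_prefix() {
  if [ -n "${HADOOP_TOKEN_FILE_LOCATION:-}" ]; then
    printf 'SET mapreduce.job.credentials.binary=%s; ' "$HADOOP_TOKEN_FILE_LOCATION"
  fi
}

# Usage inside the shell action's script:
#   hive -e "$(hive_token_prefix)select x from test" -S
```

The same script then works both when launched by Oozie and when run by hand after kinit, because the prefix is simply empty in the second case.
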
    Hope this helps! Feel free to comment with questions, especially if you ran into errors that I did not address. I’ll be happy to add more tips to the list.