How to Securely Manage Azure OAuth Credentials
Follow this process to protect Azure OAuth credentials with the Hadoop credential provider framework when accessing Azure Data Lake Storage (ADLS) through Spark or Hadoop:
1. Create a credential file to store the OAuth credentials. For example:
hadoop credential create fs.azure.account.oauth2.client.endpoint -value OAUTH_CLIENT_ENDPOINT -provider jceks://file/opt/Actian/VectorVH/ingres/files/hdfs/abfs.jceks
hadoop credential create fs.azure.account.oauth2.client.id -value OAUTH_CLIENT_ID -provider jceks://file/opt/Actian/VectorVH/ingres/files/hdfs/abfs.jceks
hadoop credential create fs.azure.account.oauth2.client.secret -value OAUTH_CLIENT_SECRET -provider jceks://file/opt/Actian/VectorVH/ingres/files/hdfs/abfs.jceks
Notes:
The credential file can exist on local or HDFS storage but not ABFS. To use HDFS, replace ‘file’ with ‘hdfs’ or the storage class being used. If the credential file is on local storage, it must be present in the same location on all nodes.
The credential store is password protected. The default password will be used unless another is specified by setting HADOOP_CREDSTORE_PASSWORD in the environment prior to running the above commands. The password can also be stored in a file pointed to by hadoop.security.credstore.java-keystore-provider.password-file in the Hadoop configuration. For details, see the Credential Provider API Guide.
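For example, to protect the store with a password other than the default, the variable can be exported before running the create commands (a sketch; the password shown is a placeholder):
export HADOOP_CREDSTORE_PASSWORD=mystorepassword
hadoop credential create fs.azure.account.oauth2.client.secret -value OAUTH_CLIENT_SECRET -provider jceks://file/opt/Actian/VectorVH/ingres/files/hdfs/abfs.jceks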
2. Verify the contents of the credential store (only the aliases are listed; values are not shown):
hadoop credential list -provider jceks://file/opt/Actian/VectorVH/ingres/files/hdfs/abfs.jceks
fs.azure.account.oauth2.client.endpoint
fs.azure.account.oauth2.client.id
fs.azure.account.oauth2.client.secret
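If a stored value later needs to be rotated, the alias can be deleted and recreated with the standard hadoop credential subcommands (a sketch against the same store; -f skips the confirmation prompt):
hadoop credential delete fs.azure.account.oauth2.client.secret -f -provider jceks://file/opt/Actian/VectorVH/ingres/files/hdfs/abfs.jceks
hadoop credential create fs.azure.account.oauth2.client.secret -value NEW_OAUTH_CLIENT_SECRET -provider jceks://file/opt/Actian/VectorVH/ingres/files/hdfs/abfs.jceks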
3. Update configuration files:
a. Update the Hadoop configuration (core-site.xml) to define where OAuth credentials are stored.
<property>
  <name>hadoop.security.credential.provider.path</name>
  <value>jceks://file/opt/Actian/VectorVH/ingres/files/hdfs/abfs.jceks</value>
  <description>
    Comma-separated list of credential provider URIs consulted
    when resolving configuration secrets
  </description>
</property>
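The OAuth settings themselves can also be placed in core-site.xml so that plain Hadoop clients (such as the hdfs command in step 4) can authenticate. A sketch using the same keys that step 3b sets for Spark:
<property>
  <name>fs.azure.account.auth.type</name>
  <value>OAuth</value>
</property>
<property>
  <name>fs.azure.account.oauth.provider.type</name>
  <value>org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider</value>
</property>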
Note:  Configuration files should be updated on all nodes. We recommend using a cluster manager such as Ambari to make these changes.
b. For stand-alone Spark deployments, add the following lines to spark_provider.conf:
spark.hadoop.fs.azure.account.auth.type=OAuth
spark.hadoop.fs.azure.account.oauth.provider.type=org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider
spark.hadoop.hadoop.security.credential.provider.path=jceks://file/opt/Actian/VectorVH/ingres/files/hdfs/abfs.jceks
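Once these lines are in place, a quick end-to-end check is to read a file from the container with PySpark. A minimal sketch, assuming the container, account, and CSV file shown in the listing in step 4:
from pyspark.sql import SparkSession

# Build a session; in a stand-alone deployment the OAuth settings above
# are picked up from spark_provider.conf, so nothing extra is set here.
spark = SparkSession.builder.appName("abfs-oauth-check").getOrCreate()

# A successful read confirms that the client id, secret, and endpoint
# resolve correctly from the credential store.
df = spark.read.csv(
    "abfs://my-container@myaccount.dfs.core.windows.net/On_Time_On_Time_Performance_1988_1.csv",
    header=True,
)
df.show(5)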
4. If using Hadoop, verify that the ABFS container can be accessed using the HDFS client:
hdfs dfs -ls abfs://my-container@myaccount.dfs.core.windows.net/
Found 8 items
-rw-rw-rw- 1 actian actian 181080083 2019-02-14 16:50 abfs://my-container@myaccount.dfs.core.windows.net/On_Time_On_Time_Performance_1988_1.csv
-rw-rw-rw- 1 actian actian 18452125824 2019-11-28 12:04 abfs://my-container@myaccount.dfs.core.windows.net/On_Time_On_Time_Performance_Part1
-rw-rw-rw- 1 actian actian 19494962213 2019-11-29 23:54 abfs://my-container@myaccount.dfs.core.windows.net/On_Time_On_Time_Performance_Part2
-rw-rw-rw- 1 actian actian 19725641334 2019-11-29 19:53 abfs://my-container@myaccount.dfs.core.windows.net/On_Time_On_Time_Performance_Part3
-rw-rw-rw- 1 actian actian 19773821142 2019-11-28 13:51 abfs://my-container@myaccount.dfs.core.windows.net/On_Time_On_Time_Performance_Part4
 
Last modified date: 01/26/2023