Using Azure Data Lake Store with HBase
CDH 5.12 and higher support using Azure Data Lake Store (ADLS) as a storage layer for HBase.
There are two scenarios in which ADLS can be used with HBase:
- ADLS-only: In this scenario, both HFiles, which contain user data, and write-ahead logs (WALs) are written to ADLS. This configuration is not recommended for use cases that demand high performance.
- ADLS + HDFS: In this scenario, HFiles are written to ADLS, but WALs are written to HDFS. This configuration provides higher throughput and lower latency for writes than does the ADLS-only configuration.
Configuring HBase to Use ADLS as a Storage Layer
- Set up credentials to enable communication between HBase and ADLS. See Configuring ADLS Connectivity for CDH and use one of the configuration methods listed there that HBase supports.
- In the Cloudera Manager Admin Console, select the HBase service, click the Configuration tab, and locate the HBase Service Advanced Configuration Snippet (Safety Valve) for hbase-site.xml.
- Depending on which scenario you plan to use, add the following values for the Name and Value fields:
  - ADLS-only:
    - Name: hbase.rootdir
      Value: adl://<adls_account_name>.azuredatalakestore.net/<hbase_directory>
  - ADLS + HDFS:
    - Name: hbase.rootdir
      Value: adl://<adls_account_name>.azuredatalakestore.net/<hbase_directory>
    - Name: hbase.wal.dir
      Value: hdfs://<name_node>:8020/<hbase_wal_directory>
- Still on the Configuration page for the HBase service, locate the HBase Service Advanced Configuration Snippet (Safety Valve) for core-site.xml and add the following Name and Value pairs for both configuration scenarios (ADLS-only and ADLS + HDFS):
  - Name: fs.defaultFS
    Value: adl://<adls_account_name>.azuredatalakestore.net/
  - Name: adl.debug.override.localuserasfileowner
    Value: true
  Note: All files and folders in ADLS are owned by the same account owner. When HDFS checks a file's owner, it sees the Azure Active Directory (AD) owner, which does not match the HBase user making the request, so the Access Control List (ACL) check fails. The configuration above works around this issue by instructing the HDFS client to assume that the current user owns all files when requesting data stored in ADLS.
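
For reference, the Name and Value pairs from the steps above can also be written as Hadoop-style property elements, which may be convenient if your Cloudera Manager version offers an XML view for the safety valve fields (an assumption; otherwise enter the pairs individually as described). The following is a minimal sketch of the ADLS + HDFS scenario. The uppercase tokens ADLS_ACCOUNT_NAME, HBASE_DIRECTORY, NAME_NODE, and HBASE_WAL_DIRECTORY stand in for the angle-bracket placeholders used in the steps and must be replaced with your own values.

```xml
<!-- Snippet for hbase-site.xml (ADLS + HDFS scenario).
     For the ADLS-only scenario, omit the hbase.wal.dir property. -->
<property>
  <name>hbase.rootdir</name>
  <value>adl://ADLS_ACCOUNT_NAME.azuredatalakestore.net/HBASE_DIRECTORY</value>
</property>
<property>
  <name>hbase.wal.dir</name>
  <value>hdfs://NAME_NODE:8020/HBASE_WAL_DIRECTORY</value>
</property>

<!-- Snippet for core-site.xml (same for both scenarios). -->
<property>
  <name>fs.defaultFS</name>
  <value>adl://ADLS_ACCOUNT_NAME.azuredatalakestore.net/</value>
</property>
<property>
  <name>adl.debug.override.localuserasfileowner</name>
  <value>true</value>
</property>
```

The two groups belong in different safety valves: the first in the snippet for hbase-site.xml, the second in the snippet for core-site.xml.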
©2016 Cloudera, Inc. All rights reserved