Describe the enhancement requested
Read support probably requires an Azure implementation of `arrow::io::RandomAccessFile`, which can then be used to implement the `OpenInputStream` and `OpenInputFile` methods of the `AzureFileSystem`.

#12914 implemented all of these features, so this will largely be a case of extracting the relevant parts from there. One modification I would suggest compared to that would be to avoid branching logic based on whether the Azure storage account has the hierarchical namespace enabled. Utilising features of the hierarchical namespace can make renames and listing tasks faster, but for just reading blobs it shouldn't make any difference.
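To make the shape of the read path concrete, here is a minimal sketch of a random-access reader built on ranged downloads. It is not the real `arrow::io::RandomAccessFile` interface and does not call the Azure SDK: `BlobRandomAccessReader`, `RangeDownloader` and `MakeInMemoryDownloader` are hypothetical names, and the download step is modelled as a plain callback so the example stays self-contained. Nothing in it depends on whether the hierarchical namespace is enabled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for a ranged blob download: (offset, length) -> bytes.
// A real implementation would issue the request through the Azure SDK instead.
using RangeDownloader = std::function<std::string(int64_t offset, int64_t length)>;

// Simplified model of a random-access file: positional reads plus a cursor.
// This is an illustration only, not the Arrow interface itself.
class BlobRandomAccessReader {
 public:
  BlobRandomAccessReader(int64_t size, RangeDownloader download)
      : size_(size), download_(std::move(download)) {
    assert(size_ >= 0 && static_cast<bool>(download_));
  }

  int64_t GetSize() const { return size_; }
  int64_t Tell() const { return position_; }

  void Seek(int64_t position) {
    if (position < 0 || position > size_) {
      throw std::out_of_range("seek outside of blob");
    }
    position_ = position;
  }

  // Positional read: leaves the cursor untouched and clamps at end of blob.
  std::string ReadAt(int64_t position, int64_t nbytes) const {
    if (position < 0 || nbytes < 0) {
      throw std::invalid_argument("negative position or length");
    }
    if (position >= size_) return {};
    const int64_t length = std::min(nbytes, size_ - position);
    if (length == 0) return {};
    return download_(position, length);
  }

  // Sequential read: a positional read at the cursor, then advance the cursor.
  std::string Read(int64_t nbytes) {
    std::string out = ReadAt(position_, nbytes);
    position_ += static_cast<int64_t>(out.size());
    return out;
  }

 private:
  int64_t size_;
  int64_t position_ = 0;
  RangeDownloader download_;
};

// In-memory fake used to exercise the reader without any network access.
inline RangeDownloader MakeInMemoryDownloader(std::string data) {
  return [data = std::move(data)](int64_t offset, int64_t length) {
    return data.substr(static_cast<std::size_t>(offset),
                       static_cast<std::size_t>(length));
  };
}
```

With this shape, an input stream only needs `Read`, while an input file also exposes `ReadAt`, `Seek` and `GetSize`, so a single reader class could plausibly back both `OpenInputStream` and `OpenInputFile`.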
If we want to use features of the hierarchical namespace that adds some complexities:
- `ServiceClient::GetAccountInfo()` requires Storage Account Contributor permissions (https://learn.microsoft.com/en-us/rest/api/storageservices/get-blob-service-properties?tabs=azure-ad#authorization), which is a significantly elevated level of access. Hadoop solves this by essentially calling `PathClient::GetAccessControlList()` and, if it raises an exception, treating the hierarchical namespace as not supported: https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/AzureBlobFileSystemStore.java#L356-L385.

Related Issues:
Component(s)
C++
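
For reference, the Hadoop-style detection mentioned in the list of complexities above could be sketched roughly as follows. This is an illustration under stated assumptions, not SDK code: `ProbeError`, `AclProbe` and `DetectHierarchicalNamespace` are hypothetical names, and the probe callback stands in for a call such as `PathClient::GetAccessControlList()` on the container root. The choice to treat only HTTP 400 as "no hierarchical namespace" and to propagate every other failure is an assumption of this sketch.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical error type standing in for an SDK storage exception.
struct ProbeError : std::runtime_error {
  ProbeError(int status, const std::string& message)
      : std::runtime_error(message), http_status(status) {}
  int http_status;
};

// Hypothetical probe: performs the ACL request and throws ProbeError on failure.
using AclProbe = std::function<void()>;

// Returns true if the ACL probe succeeds, false if the service rejects it with
// HTTP 400 (assumed here to mean the account has no hierarchical namespace).
// Any other failure (auth, network, ...) is rethrown rather than guessed at.
inline bool DetectHierarchicalNamespace(const AclProbe& get_root_acl) {
  assert(static_cast<bool>(get_root_acl));
  try {
    get_root_acl();
    return true;
  } catch (const ProbeError& error) {
    if (error.http_status == 400) return false;
    throw;
  }
}
```

The point of the probe is that it needs no elevated account-level permissions, unlike `ServiceClient::GetAccountInfo()`; the cost is one extra request, so a real implementation would probably cache the result per filesystem instance.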