Microsoft Fabric: Shortcutting to a firewall-protected Azure Data Lake, distilled.

Christian Henrik Reich
5 min read · Oct 2, 2024

Motivation

This post is based on a case where there was a need to shortcut access to Azure Data Lake from OneLake, with the Azure Data Lake protected by the network setting “Enabled from selected virtual networks and IP addresses.” This is a good practice — the less direct access, the better. Microsoft Fabric offers a feature called ‘Trusted workspace access’ for accessing protected Azure Data Lakes. Although documentation exists, it can be a bit of a hassle at first. So, this is the distilled version to help you succeed faster.

Most important!

Shortcutting to a firewall-protected Azure Data Lake only works with Lakehouses in workspaces that are backed by a Fabric Capacity (also known as an F SKU). If you’re using a Lakehouse with a Power BI Premium license or the trial license, this feature will not work.

Be aware that adding members to the Lakehouse’s workspace with a “Reader” role can also disrupt this functionality.

Additionally, it currently appears that the storage network settings can only allow one Fabric workspace.

Setup flow

Start with the principal for the shortcut

The workspace identity is one of the best additions to Fabric in terms of infrastructure. Assigning an identity to the workspace, so that the workspace owns the shortcuts instead of a personal work account, is far preferable.

Microsoft Fabric workspace identities are not the same as Azure managed identities. They are Microsoft Fabric-managed Azure service principals and can be found in Microsoft Entra ID as app registrations with the same name as the workspace.

These Microsoft Fabric-managed Azure service principals might concern a company’s Azure administrators. They can clutter Microsoft Entra ID with app registrations that are beyond the administrators’ control and that may violate the company’s Azure naming standards.

In such cases, it’s possible to create an Azure service principal manually and use that instead.

Either way, remember to assign the appropriate Azure RBAC role for blob data access on the storage account. In most cases, Storage Blob Data Reader should suffice. For this post, the role assignment is done under Access Control (IAM), and the principal in question is the Microsoft Fabric-managed service principal: Exploring_ADLS_shortcut.
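For those who prefer scripting the role assignment over clicking through Access Control (IAM), here is a minimal sketch in Python. It only builds the storage account scope and the argument list for the Azure CLI command `az role assignment create`; it does not call Azure itself. The function names and the example IDs are hypothetical, and the role name used is the built-in role's full name, Storage Blob Data Reader.

```python
import shlex


def storage_account_scope(subscription_id: str, resource_group: str, account_name: str) -> str:
    """Build the Azure resource ID of a storage account, used as the RBAC scope."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{account_name}"
    )


def role_assignment_command(principal_object_id: str, scope: str,
                            role: str = "Storage Blob Data Reader") -> list:
    """Return the argument list for `az role assignment create`.

    principal_object_id is the object ID of the workspace identity's
    (or the manually created) service principal in Entra ID.
    """
    return [
        "az", "role", "assignment", "create",
        "--assignee-object-id", principal_object_id,
        "--assignee-principal-type", "ServicePrincipal",
        "--role", role,
        "--scope", scope,
    ]


if __name__ == "__main__":
    # All values below are placeholders, not real IDs.
    scope = storage_account_scope("<subscription id>", "<resource group name>", "explfabric")
    command = role_assignment_command("<object id of Exploring_ADLS_shortcut>", scope)
    # Print a copy-pasteable command line instead of executing it.
    print(shlex.join(command))
```

Scoping the assignment to the single storage account, rather than the resource group or subscription, is in line with the "less access is better" idea of this post.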

Setting the storage network settings

In this case, the storage is named explfabric. We start by enabling the setting “Enabled from selected virtual networks and IP addresses,” which locks down the storage from internet access and restricts access to only those explicitly allowed.

Traditionally, it has been possible to allow certain Azure services through this setting. However, this option doesn’t work for Microsoft Fabric, as it isn’t included on the trusted services list. In general, Microsoft has begun recommending against relying on this feature, and it appears to be deprecated.

It should be possible to enable Microsoft Fabric access through Resource Instances, but we need to be specific about which workspace we are referring to, and the UI does not currently support this.

This can be done with the following ARM snippet from the documentation. Fill in the placeholders:

{
  "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "resources": [
    {
      "type": "Microsoft.Storage/storageAccounts",
      "apiVersion": "2023-01-01",
      "name": "<storage account name>",
      "id": "/subscriptions/<subscription id of storage account>/resourceGroups/<resource group name>/providers/Microsoft.Storage/storageAccounts/<storage account name>",
      "location": "<region>",
      "kind": "StorageV2",
      "properties": {
        "networkAcls": {
          "resourceAccessRules": [
            {
              "tenantId": "<tenantid>",
              "resourceId": "/subscriptions/00000000-0000-0000-0000-000000000000/resourcegroups/Fabric/providers/Microsoft.Fabric/workspaces/<workspace-id>"
            }
          ]
        }
      }
    }
  ]
}

The ARM snippet can be deployed by searching for “Deploy a custom template” in the Azure Portal and then selecting “Build your own template in the editor” to open the editor for the snippet.

Once the snippet has run, the specified Microsoft Fabric workspace is allowed to access the storage account.
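To avoid hand-editing the placeholders, the template can also be generated. The following is a minimal sketch in Python that mirrors the ARM snippet above field by field. The function name `build_template` and the example values are hypothetical. The all-zero subscription ID and the `Fabric` resource group in the workspace resource ID are kept literally as they appear in the snippet.

```python
import json

# Literal prefix of the workspace resource ID, copied from the ARM snippet.
FABRIC_WORKSPACE_PREFIX = (
    "/subscriptions/00000000-0000-0000-0000-000000000000"
    "/resourcegroups/Fabric/providers/Microsoft.Fabric/workspaces/"
)


def build_template(subscription_id: str, resource_group: str, storage_account: str,
                   region: str, tenant_id: str, workspace_id: str) -> dict:
    """Return the ARM template as a dict with all placeholders filled in."""
    storage_id = (
        f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{storage_account}"
    )
    return {
        "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
        "contentVersion": "1.0.0.0",
        "resources": [
            {
                "type": "Microsoft.Storage/storageAccounts",
                "apiVersion": "2023-01-01",
                "name": storage_account,
                "id": storage_id,
                "location": region,
                "kind": "StorageV2",
                "properties": {
                    "networkAcls": {
                        # One rule for the single workspace that is allowed in.
                        "resourceAccessRules": [
                            {
                                "tenantId": tenant_id,
                                "resourceId": FABRIC_WORKSPACE_PREFIX + workspace_id,
                            }
                        ]
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder values; replace with your own before deploying.
    template = build_template(
        subscription_id="<subscription id of storage account>",
        resource_group="<resource group name>",
        storage_account="explfabric",
        region="<region>",
        tenant_id="<tenantid>",
        workspace_id="<workspace-id>",
    )
    print(json.dumps(template, indent=2))
```

The printed JSON can be pasted into the portal's template editor, or kept in source control as a starting point for an IaC pipeline.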

Create the shortcut

Prefer using a workspace identity when connecting to an Azure Data Lake within the same Microsoft Entra ID tenant; otherwise, use a service principal. Other authentication options should generally be avoided, except in special cases.

The data engineer setting up the RBAC role should have the Storage Blob Data Contributor or Owner role on the storage account.
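The shortcut can be created in the Lakehouse UI, but it may also be scripted. The sketch below is in Python and only assembles the URL and JSON body for a create-shortcut request; it sends nothing. The endpoint and body shape follow my understanding of the Fabric OneLake Shortcuts REST API and should be checked against the current API reference. The function name, the IDs, and the shortcut name and paths are hypothetical, and `connection_id` is assumed to refer to a Fabric connection that authenticates with the workspace identity or a service principal.

```python
def adls_shortcut_request(workspace_id: str, lakehouse_id: str, shortcut_name: str,
                          shortcut_path: str, storage_account: str, subpath: str,
                          connection_id: str):
    """Return (url, body) for creating an ADLS Gen2 shortcut in a Lakehouse.

    Assumption: endpoint and field names match the Fabric REST API's
    create-shortcut operation; verify before relying on this.
    """
    url = (
        "https://api.fabric.microsoft.com/v1"
        f"/workspaces/{workspace_id}/items/{lakehouse_id}/shortcuts"
    )
    body = {
        "path": shortcut_path,   # where the shortcut appears, e.g. "Files"
        "name": shortcut_name,   # name of the shortcut in the Lakehouse
        "target": {
            "adlsGen2": {
                # DFS endpoint of the firewall-protected storage account
                "location": f"https://{storage_account}.dfs.core.windows.net",
                # container and optional folder inside the account
                "subpath": subpath,
                "connectionId": connection_id,
            }
        },
    }
    return url, body


if __name__ == "__main__":
    # Placeholder values only.
    url, body = adls_shortcut_request(
        workspace_id="<workspace-id>",
        lakehouse_id="<lakehouse-id>",
        shortcut_name="explfabric_data",
        shortcut_path="Files",
        storage_account="explfabric",
        subpath="/<container>/<folder>",
        connection_id="<connection-id>",
    )
    print("POST", url)
    print(body)
```

The request would be sent as a POST with a bearer token for the Fabric API; token acquisition is left out here on purpose.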

Conclusion

From my perspective, the less access and openness we allow to our data stores, the better. It might take a few tries to get Trusted workspace access to work, but it becomes fairly easy once you get the hang of it. Closing down network access to Azure Data Lake and accessing it via Trusted workspace access should be integrated into the Infrastructure as Code (IaC) process to ensure it is done by default.

Written by Christian Henrik Reich

Renaissance man @ twoday Kapacity, Renaissance man @ Mugato.com. Focusing on data architecture, ML/AI and backend dev, cloud and on-premise.
