Microsoft Fabric: Writing to OneLake with .NET C#
Introduction
We have many options for ingesting data into OneLake. Most of these options are pull-based: data is pulled from various sources using Pipelines, Python, Dataflows, and so on. Data can also be pushed from external sources, as in Real-Time Intelligence scenarios.
In some cases, it might be best to push the data directly from an app to OneLake. This post focuses on how to do that when the app is written in C#. Hopefully, the same principles can apply to other languages as well.
I’ve covered this topic before using Python, so you may find additional information there as well:
- https://medium.com/@christianhenrikreich/microsoft-fabric-diving-into-lakehouse-access-from-local-machines-and-other-remotes-with-delta-rs-79faa6cf1fdf
- https://medium.com/@christianhenrikreich/accessing-delta-lakes-on-azure-data-lake-in-ordinary-python-b07d1af85a1a
Procedure
OneLake is essentially a collection of Azure Data Lakes, meaning we can use the same C# libraries that we use for Azure Data Lakes. Each Fabric workspace has its own Azure Data Lake. In our example, we create a Lakehouse within a workspace to store files in its ‘Files’ section. Conceptually, a Lakehouse can be thought of as a folder within an Azure Data Lake.
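To make the folder analogy concrete, here is a minimal sketch of how a full OneLake DFS path is composed. The workspace and Lakehouse names are illustrative, matching the ones used later in this post:

```csharp
// Illustrative path composition: the workspace maps to the file system,
// and the Lakehouse ("<name>.Lakehouse") to a top-level folder within it.
var workspace = "Here_Lakehouses";
var lakehouse = "Bronze.Lakehouse";
var filePath  = "Files/FromCsharp/sample.txt";

var fullPath = $"https://onelake.dfs.fabric.microsoft.com/{workspace}/{lakehouse}/{filePath}";
Console.WriteLine(fullPath);
// → https://onelake.dfs.fabric.microsoft.com/Here_Lakehouses/Bronze.Lakehouse/Files/FromCsharp/sample.txt
```

This is the same layout the DataLake client libraries navigate below: file system, then directory, then file.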
Accessing the Lakehouse requires authorization via a Microsoft Entra ID principal (for example, a user, a managed identity, or a service principal).
Authentication
The preferred approach is token-based authentication, so we don’t have to provide or store passwords. Microsoft provides the DefaultAzureCredential class for this purpose. It’s available in many languages and is used to authenticate against Microsoft Entra ID.
On a local machine, DefaultAzureCredential can use user principal credentials from Visual Studio, VS Code, or from ‘az login’. When running within an Azure service, it can leverage the service’s managed identity automatically, without requiring code changes.
If that ideal setup isn’t possible, we can still use a service principal.
Authorization
Regardless of the principal type, the principal must be assigned either to the Fabric workspace or directly to a Lakehouse. The latter approach is more secure, because workspace-level assignment grants access to all items in the workspace.
Prerequisites
This solution depends on two Microsoft libraries, which can be found on NuGet.org or installed directly into your project from the command line:
dotnet add package Azure.Storage.Files.DataLake
dotnet add package Azure.Identity
For my examples, I have created a Lakehouse called Bronze.
Token-based solution
When developing locally, we need to sign in to the Microsoft Entra ID tenant where our Microsoft Fabric resources are hosted. The easiest way to do this is often through the Azure CLI (Command Line Interface).
A principal doesn’t require an Azure subscription to authenticate, so it’s safest to log in while allowing for the absence of a subscription; this avoids potential login errors. From PowerShell or any other shell, enter:
az login --allow-no-subscriptions
In some cases, when the user has access to multiple tenants, it might be necessary to specify which tenant to log into.
az login --allow-no-subscriptions --tenant <insert_tenant_guid_here>
The code will work both locally and from an Azure service (such as Azure Functions or an Azure Web App) without any changes. Only the workspaceId and lakehouseId need to be provided.
using System.Text;
using Azure.Identity;
using Azure.Storage.Files.DataLake;

namespace OneLakeConnect;

class Program
{
    static void Main(string[] args)
    {
        // Workspace and Lakehouse names as they appear in Fabric
        var workspaceId = "Here_Lakehouses";
        var lakehouseId = "Bronze.Lakehouse";
        var fileName = "Files/FromCsharp/Filessample.txt";
        var dfsUri = "https://onelake.dfs.fabric.microsoft.com";

        // DefaultAzureCredential picks up the local 'az login' session,
        // an IDE sign-in, or a managed identity when running in Azure
        var credential = new DefaultAzureCredential();
        var serviceClient = new DataLakeServiceClient(new Uri(dfsUri), credential);

        // The workspace acts as the file system, the Lakehouse as a directory
        var fileSystemClient = serviceClient.GetFileSystemClient(workspaceId);
        var directoryClient = fileSystemClient.GetDirectoryClient(lakehouseId);
        var fileClient = directoryClient.GetFileClient(fileName);

        var content = "abc,123";
        var byteArray = Encoding.UTF8.GetBytes(content);
        using var stream = new MemoryStream(byteArray);
        fileClient.Upload(stream, overwrite: true);

        Console.WriteLine("File uploaded successfully.");
    }
}
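As a quick sanity check, the uploaded file can be read back through the same client types. This is a minimal sketch using the same workspace, Lakehouse, and file names as the listing above; it requires a live OneLake endpoint and a signed-in principal:

```csharp
using Azure.Identity;
using Azure.Storage.Files.DataLake;

// Sketch: read the file back to verify the upload succeeded.
var credential = new DefaultAzureCredential();
var serviceClient = new DataLakeServiceClient(
    new Uri("https://onelake.dfs.fabric.microsoft.com"), credential);

var fileClient = serviceClient
    .GetFileSystemClient("Here_Lakehouses")
    .GetDirectoryClient("Bronze.Lakehouse")
    .GetFileClient("Files/FromCsharp/Filessample.txt");

// Read() returns the file contents as a stream
var download = fileClient.Read();
using var reader = new StreamReader(download.Value.Content);
Console.WriteLine(reader.ReadToEnd());
```

If the upload worked, this prints the content written earlier.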
Service principal solution (secret-based solution)
Here we need the credentials for the service principal (which your Azure admin can provide). We should also store the secret in a key vault rather than in code, at least when running in an Azure service context.
using System.Text;
using Azure.Identity;
using Azure.Storage.Files.DataLake;

namespace OneLakeConnect;

class Program
{
    static void Main(string[] args)
    {
        // Service principal credentials (provided by your Azure admin)
        var tenantId = "<tenant id>";
        var clientId = "<client id>";
        var clientSecret = "<client secret>";

        var workspaceId = "Here_Lakehouses";
        var lakehouseId = "Bronze.Lakehouse";
        var fileName = "Files/FromCsharpSP/Filessample.txt";
        var dfsUri = "https://onelake.dfs.fabric.microsoft.com";

        // Authenticate as the service principal using its client secret
        var credential = new ClientSecretCredential(tenantId, clientId, clientSecret);
        var serviceClient = new DataLakeServiceClient(new Uri(dfsUri), credential);

        var fileSystemClient = serviceClient.GetFileSystemClient(workspaceId);
        var directoryClient = fileSystemClient.GetDirectoryClient(lakehouseId);
        var fileClient = directoryClient.GetFileClient(fileName);

        var content = "abc,123";
        var byteArray = Encoding.UTF8.GetBytes(content);
        using var stream = new MemoryStream(byteArray);
        fileClient.Upload(stream, overwrite: true);

        Console.WriteLine("File uploaded successfully.");
    }
}
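Rather than hard-coding the client secret, the key vault mentioned above can supply it at runtime. A minimal sketch, assuming the Azure.Security.KeyVault.Secrets package is installed (dotnet add package Azure.Security.KeyVault.Secrets) and that the vault URI and secret name ("onelake-sp-secret") are illustrative placeholders:

```csharp
using Azure.Identity;
using Azure.Security.KeyVault.Secrets;

// Sketch: resolve the service principal's secret from Azure Key Vault
// instead of embedding it in source code. The vault name and secret
// name below are hypothetical.
var vaultUri = new Uri("https://<your-vault-name>.vault.azure.net/");
var secretClient = new SecretClient(vaultUri, new DefaultAzureCredential());
var clientSecret = secretClient.GetSecret("onelake-sp-secret").Value.Value;

// clientSecret can then be passed to ClientSecretCredential
// exactly as in the listing above.
Console.WriteLine("Secret retrieved.");
```

Note that the Key Vault lookup itself authenticates with DefaultAzureCredential, so the same token-based chain (managed identity in Azure, ‘az login’ locally) applies there too.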
What about Delta Tables?
I haven’t come across a de facto solution for C# and Delta tables yet. That said, in an ingest-to-Bronze scenario, raw files are a viable approach, and the one I personally prefer. The compute for further data processing shouldn’t be done outside Fabric, because that would, in some ways, miss the point of Fabric.
Conclusion
There are many ways to ingest data into Fabric, so you can select the method best suited to each task. As mentioned in the introduction, the posts I’ve referenced follow the same general pattern for external access to OneLake, regardless of the programming language used.