Microsoft Fabric: Navigating Git and team collaboration

Christian Henrik Reich
Jan 27, 2024


Introduction

For a long time, Git has been at the center of IT development. While there are a few patterns for how to use Git, the main idea remains the same. There is a collaboration branch, often called ‘main’, from which developers branch out to do their work. At some point, the work in the work branch is merged back into the collaboration branch.

The power of branching lies in developers’ ability to isolate their work, avoiding interference with other developers or even the working product.

It is customary that when a work branch is merged into the collaboration branch, an automatic build of a new version of the product is triggered. This process is known as continuous integration, often abbreviated as CI. Following CI, the result is released to the testing environment and later to a production environment. Sometimes, there is also a release to a development environment. With some variations, this integrated approach is referred to as continuous delivery or CD. So, CI/CD.

These are common procedures that most developers know. Still, even traditional CI can be a challenge with the way Fabric currently works with Git.

This post is about how to set up workspaces so that several developers can branch out in the same Git repository while still having a shared environment for collaboration.

How Fabric works with Git

To create a collaboration practice, we need to understand how Fabric currently works with Git, and which pitfalls there might be.

To start with, a Fabric Workspace is both a work environment and a serving environment, which is a bit unconventional for developers. For comparison, in other development scenarios developers work in a development environment, such as Power BI Desktop or Visual Studio Code. The work is later pushed to a serving environment, such as the Power BI service or a web server. A Workspace currently allows doing it all in one place, which can make it hard to protect production and hard to follow established development best practices.

Regarding Git, a Fabric Workspace is tightly connected to a Git branch with one set of credentials. It is not connected to a Git repository but to a specific branch. It is still possible to branch out from the current branch, and the new branch will become the active branch for the Workspace. If multiple developers are working in the same Workspace, there are no other options than to work on the same branch. If one developer branches out or changes the branch, it affects the entire Workspace and all other developers.

There is no Git branch isolation within a Fabric Workspace

Furthermore, when assigning a workspace to Git, the credentials of the assignee are used. Consequently, every Git commit will have the assignee as the author. It is considered best practice to have the authoring developer as the author of a commit.
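The two constraints above (one active branch per Workspace, one set of Git credentials per Workspace) can be illustrated with a small toy model. This is only a sketch to make the behaviour concrete; the `Workspace` class and its methods are hypothetical and are not part of any Fabric API.

```python
from dataclasses import dataclass, field


@dataclass
class Workspace:
    """Toy model of a Fabric Workspace: one active branch, one set of Git credentials."""
    name: str
    git_user: str        # credentials of whoever connected the Workspace to Git
    active_branch: str
    members: list = field(default_factory=list)

    def switch_branch(self, branch: str) -> None:
        # Switching is a Workspace-level action, not a per-developer one.
        self.active_branch = branch

    def branch_seen_by(self, developer: str) -> str:
        # Every member sees the same branch, whoever they are.
        if developer not in self.members:
            raise ValueError(f"{developer} is not a member of {self.name}")
        return self.active_branch

    def commit_author(self, developer: str) -> str:
        # The commit is made with the Workspace's credentials, not the developer's.
        return self.git_user


shared = Workspace("shared-dev", git_user="alice", active_branch="main",
                   members=["alice", "bob"])
shared.switch_branch("feature/bob-experiment")  # Bob branches out...
print(shared.branch_seen_by("alice"))            # ...and Alice is moved along with him
print(shared.commit_author("bob"))               # Bob's work is committed as alice
```

In this model, the only way to give Bob his own branch and his own authorship is to give him his own Workspace, which is the idea behind the setup described later.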

The developer is not connected directly to Git, and a Workspace doesn’t reflect exactly what is in Git. Workspaces hold much more information than is shown in Git, such as the item IDs of the Fabric items. The Git repository holds a logical ID, which is not the item ID. Git repositories are not aware of item IDs.

Workspaces don’t forget. Currently, it is not possible to delete an item and recreate it, for example, by an upload in a Workspace. The process of deleting and recreating has many quirks in general and may even require the recreation of the Workspace. So, how can Git perform an update? It is possible due to the logical ID. If an item is recognized by a logical ID from Git, then it can be overwritten with the information from Git.
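As a conceptual sketch of the logical ID idea, the update can be thought of as a merge keyed by logical ID. This is my own simplified model, not Fabric's actual algorithm; the function name `update_from_git`, the dictionary layout, and the sample IDs are all hypothetical.

```python
import uuid


def update_from_git(workspace_items: dict, git_items: dict) -> dict:
    """Apply Git content to a Workspace, matching items by logical ID.

    Both dicts are keyed by logical ID. Workspace entries also carry an
    'item_id', which exists only in the Workspace and is never stored in Git.
    """
    # Work on copies so the caller's Workspace state is not mutated.
    result = {logical_id: dict(item) for logical_id, item in workspace_items.items()}
    for logical_id, definition in git_items.items():
        if logical_id in result:
            # Known logical ID: overwrite the content, keep the Workspace item ID.
            result[logical_id].update(definition)
        else:
            # Unknown logical ID: a new item is created and gets a fresh item ID.
            result[logical_id] = {**definition, "item_id": str(uuid.uuid4())}
    return result


workspace = {"lg-001": {"item_id": "ws-item-42", "name": "load_sales", "content": "v1"}}
git = {
    "lg-001": {"name": "load_sales", "content": "v2"},
    "lg-002": {"name": "load_customers", "content": "v1"},
}
synced = update_from_git(workspace, git)
print(synced["lg-001"])
```

The existing item keeps its item ID while its content is replaced, and the item that Git knows but the Workspace does not is created with a new item ID.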

Fabric Workspaces setup

More workspaces are needed, so a naming standard for workspaces is needed first. There is usually only one tenant, and with many workspaces we need some sort of naming order.

Lakehouses should have their own Workspaces, without Notebooks. The reason is that Lakehouse definitions are stored in Git, so connecting a new Workspace to a Git repository containing both Notebooks and Lakehouse definitions will create yet another set of Lakehouses with the same names. A Lakehouse can also block Git updates within a Workspace.

Notebooks should have a production Workspace, which is connected to the Git collaboration branch (often named main). There are no issues with Notebooks referring to Lakehouses in other Workspaces.

Short-lived developer Workspaces should be used. Creating and deleting a Workspace requires the administrator role. A permanent developer Workspace can be used, but it might be affected by quirks.
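A naming standard for this setup could be enforced with a small helper. The pattern below (project, kind, stage and, for developer Workspaces, the branch name) is only an assumption for illustration; the post does not prescribe a specific format, and `workspace_name` and `KINDS` are hypothetical names.

```python
import re
from typing import Optional

# Hypothetical short codes for the two kinds of Workspaces in this setup.
KINDS = {"lakehouse": "lh", "notebook": "nb"}


def workspace_name(project: str, kind: str, stage: str,
                   branch: Optional[str] = None) -> str:
    """Build a Workspace name such as 'sales-nb-dev-feature-fix-null-dates'."""
    if kind not in KINDS:
        raise ValueError(f"unknown kind: {kind!r}")
    if stage not in {"prod", "dev"}:
        raise ValueError(f"unknown stage: {stage!r}")
    parts = [project.lower(), KINDS[kind], stage]
    if stage == "dev":
        if not branch:
            raise ValueError("a developer Workspace needs a branch name")
        # Keep letters and digits, turn everything else (such as '/') into '-'.
        parts.append(re.sub(r"[^a-z0-9]+", "-", branch.lower()).strip("-"))
    return "-".join(parts)


print(workspace_name("Sales", "lakehouse", "prod"))
print(workspace_name("Sales", "notebook", "prod"))
print(workspace_name("Sales", "notebook", "dev", "feature/fix-null-dates"))
```

Including the branch name in the developer Workspace name makes it easy to see which Workspaces are temporary and which branch each one belongs to.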

Developer process

For creating new features, fixing bugs, etc., the process is:

  1. The developer creates a new branch in Azure DevOps.
  2. Create a new developer Workspace. A common Fabric Capacity can be used for this.
  3. Connect the new Workspace to the branch created in step 1.
  4. Do the work and commit it.
  5. Do the PR flow in Azure DevOps.
  6. Press the Sync button in the Notebook production Workspace to release.
  7. Delete the developer Workspace created in step 2.

With these steps, we’ll get:

  1. Each developer Workspace is a copy of the Notebook production Workspace.
  2. Each developer Workspace has Git branch isolation.
  3. Each commit has the right developer as author.
  4. Keeping a developer Workspace short-lived avoids quirks regarding recreate, rename, etc.
  5. Fabric is not flooded with empty Lakehouses.
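The steps in this post are done by hand in the portal. If one wanted to script the Workspace part (steps 2, 3 and 7), a sketch could look like the one below. It only describes the HTTP requests and sends nothing. The endpoint paths and payload fields reflect my understanding of the Fabric REST API and should be treated as assumptions to verify against the official documentation; the `Request` class, the function names and all sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed base URL of the Fabric REST API; verify against the official documentation.
FABRIC_API = "https://api.fabric.microsoft.com/v1"


@dataclass
class Request:
    """A described HTTP request. Nothing is sent by this sketch."""
    method: str
    url: str
    body: Optional[dict] = None


def create_dev_workspace(name: str, capacity_id: str) -> Request:
    # Step 2: a short-lived developer Workspace on a shared capacity.
    return Request("POST", f"{FABRIC_API}/workspaces",
                   {"displayName": name, "capacityId": capacity_id})


def connect_to_branch(workspace_id: str, organization: str, project: str,
                      repository: str, branch: str, directory: str = "/") -> Request:
    # Step 3: bind the Workspace to the branch created in step 1.
    # Field names are assumptions; check them before use.
    return Request("POST", f"{FABRIC_API}/workspaces/{workspace_id}/git/connect",
                   {"gitProviderDetails": {
                       "gitProviderType": "AzureDevOps",
                       "organizationName": organization,
                       "projectName": project,
                       "repositoryName": repository,
                       "branchName": branch,
                       "directoryName": directory,
                   }})


def delete_dev_workspace(workspace_id: str) -> Request:
    # Step 7: remove the Workspace once the pull request is merged.
    return Request("DELETE", f"{FABRIC_API}/workspaces/{workspace_id}")


# Hypothetical values, for illustration only.
req = connect_to_branch("<workspace-id>", "contoso", "analytics",
                        "fabric-notebooks", "feature/fix-null-dates")
print(req.method, req.url)
```

To actually run this, each `Request` would have to be sent by an HTTP client with a valid access token, and the workspace ID returned when the Workspace is created would be passed on to the later calls. The elevated rights mentioned earlier would still apply to whoever runs it.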

Last comment

This is getting the best out of what we currently have in Fabric. It is not perfect, as it requires elevated rights for developers creating Workspaces.

Hopefully, Microsoft will improve this aspect. If I were asked, I would suggest types of Workspaces, such as short-lived Developer Workspaces, which only Developer roles could create and see. Perhaps, by default, you could only view your own Developer Workspace.


Written by Christian Henrik Reich

Renaissance man @ twoday Kapacity, Renaissance man @ Mugato.com. Focusing on data architecture, ML/AI and backend dev, cloud and on-premise.
