Sharing individual Power BI dataflows, but how?

Recently, I have had a challenge at a customer, where a central teams maintains many dataflows in Power BI, to store their only and single version of the truth. However, this central team maintained many different dataflows in a single workspace, but did not want to share the entire workspace with others. What now? How can they share a single dataflows in Power BI?

In this blog, I will describe different ways to share dataflows in the Power BI service and highlight pros and cons of each solution. Read on to find out what options you have, and what my personal preference would be.

Why sharing dataflows?

Master data management is still a challenge for many organization. Also, backlogs for data platform teams are packed and priorities spread all over. With that, a centrally build and maintained single version of the truth is our every-day challenge. Power BI dataflows can play a role in this area, as self-service data preparation, where individual teams can build up their satellite solutions, which then can be reused by others.

In the same setup, I phased a challenge at a customer. A central team had a Power BI dataflow, which contained data they wanted to make available to other departments, so that they can reuse this data to setup row-level-security in Power BI. Preferably, they managed certain data in a central place, in the data platform. However, as already shared in the introduction of this post, you often have to deal with different priorities in certain teams and overly full backlogs. So the team decided to build a dataflow, where all logic is captured in one place and the resulting table can be used in further solutions.

Sharing a dataflow

So far, so good… but the team has many dataflows like this, and sharing of a dataflow can only be done through workspace permissions. So the entire workspace has to be shared to all dataset creators who use this dataflow. As said, this team has many dataflows like this one, all in one workspace. Patterns like object level security or row level security does not exist in Power BI dataflows unfortunately.

Going forward, imagine that you work in that central team to control all these dataflows and aim for the single source of truth. Let’s go through the different solution options that can apply here, with all its pros and cons.

Setup different workspaces

Since workspace permissions are required, but the data in various dataflows might be sensitive, you do not want to grant workspace permissions with which users can see more data than they should. As a result, you might consider to setup multiple workspaces. Each workspace contains one dataflow that you want to share.

Pros

  • Using native functionality from the Power BI ecosystem.
  • Nothing really fancy and easy to setup.

Cons

  • Endless set of workspaces with one dataflow in each workspace.
  • Access management will be challenging, as you might have to grant permissions on different workspaces for a single user.
  • Active Directory groups can help to keep more control, but still many workspaces to maintain.

Build linked entities

An alternative could be to have one workspace with many different dataflows, all built and maintained by the team. For each user of the dataflows, the team creates one linked entity in the target workspace for reuse. With that, only the dataflow they need is linked to a different workspace. This linked entity is created by the central team that maintains all the dataflows, since they already have permissions on the workspace where the original copy of the upstream dataflow resides.

Pros

  • Single source of truth.
  • All logic and schema in a single place defined.
  • Easy overview with lineage across workspaces.

Cons

Creating copies of the dataflow

The logic of Dataflows can also be exported easily, in a json file structure. The other way around, dataflows can also be restored and imported from a json file back in the Power BI service. By clicking the ellipsis in the workspace menu, you will find a button to export the json file. Via this method, you could potentially create various copies of the dataflow logic, stored in different workspaces.

There are a lot of similarities with the Different workspaces approach mentioned before, though in this case you still have one workspace where you continuously develop and improve your dataflows, and various copies of those dataflows are distributed to other workspaces. Potentially you can use Power BI Deployment Pipelines combined with Azure DevOps for deployment and distribution to other workspaces.

Pros

  • Easy access management via workspace settings.

Cons

  • Logic is duplicated across different workspaces.
  • Access management will be challenging, as you might have to grant permissions on different workspaces for a single user.
  • All dataflows must be refreshed, which causes a higher impact on the upstream data source.
  • If refreshes are out of sync, this might lead to different versions of the truth across the organization.

DirectQuery for Power BI datasets with Object Level Security

A final alternative, is the usage of object level security on Power BI datasets. You might wonder what a dataset has to do with the fact that we try to share a dataflow, but this is where the fancy stuff comes in. Since you cannot share an individual dataflow on a workspace, but you can grant build permissions to a Power BI dataset. So, when we import all dataflows in a dataset, they will become just tables. These tables can be part of a Security role in Power BI, which can be on object level.

If you start working this way, remember that (right now, April 2022) for Power BI Pro everyone creating a composite model on top of your dataset does requires Build permissions. For Power BI Premium this limitation has already been lifted. Read more about this in the blog post about Update to required permissions when using composite models on a Power BI dataset.

Pros

  • One solution to rule them all.
  • Single source of truth.
  • Everything maintained in one place, no copies of logic in various workspaces.
  • No duplicate (out of sync) refresh schedules.

Cons

  • At this moment (April 2022) Build permissions might be required on Power BI Pro.
  • An additional refresh between dataflow and dataset is required.
  • Having DirectQuery to another dataset, might result in limited relationships in your data model.

If you are not familiar with setting up Object Level Security, I recommend to read through the step-by-step guide conducted by Gilbert on his website Fourmoo. Also, if you want to learn more how you can combine Row Level Security and Object Level Security in one model, cause that might be part of your requirements, I encourage you to watch the recent video from Guy in a Cube.

Wrap up

What is the best solution for you? The always existing phrase comes back again, “It depends”. I think that each solution has its pros and cons as I described. However, my personal preference is at the latest described option by using DirectQuery for Power BI datasets. Why? Well this mainly relates to having a single source of truth and maintaining everything in one place. So, the central team can keep control over everything in their own workspace.

Though, we should not forget about the cons. We should be very aware of the potential limited relationships that might occur in our remote model. The side-effect that comes with these limited relationships, potentially directly influences the results we present in our reports.

Furthermore, if you have everything in your dataset, and you share these artifacts via a dataset to other users, the dataflows might become obsolete as a whole. Still dataflows can be used as staging and transformation layer before you import the data in your data model.

3 thoughts on “Sharing individual Power BI dataflows, but how?

  1. JenHen

    Hi Marc,

    Thanks for this article. This is great timing as we’re looking to summarise our options for a one source of the truth solution with dataflows. We already tried the linked entities approach where we have dataflow A in workspace A locked down to just IT, we then created linked tables in workspace B giving permissions to another department user but when they used the Power BI Dataflows connector in Power BI Desktop (Dec 2022) the navigation tree structure to the linked tables didn’t show them, only when we granted the user viewer access to workspace A did the linked tables became available? Am I doing something wrong? The user could navigate to the linked tables within Power BI Service but what use is that if the user wanted to further develop the linked tables?

    Thanks.

    Like

    1. Hi,
      Thanks for your comment! The person building the linked entity in workspace B must have sufficient permissions on the upstream sources (being the dataflow in workspace A). Typically, the dataflow in workspace B is build by someone owning workspace A. If then they want to continue their developments in workspace B, there is no other option than building another linked entity on top, if you want to prevent them from granting specific permissions on workspace A. So they either build another linked entity on top, or do transformation in the data model.

      Hope this answers your question.
      –Marc

      Like

      1. JenHen

        Hi,

        I’m not sure if I understand. I build the linked tables in workspace B as I am the creator of the dataflow in workspace A. Now I give permissions to the linked tables to another user by granting them permission to workspace B but not workspace A but they cannot see the linked tables in the PBI Desktop connector, am I missing something?

        Thanks.

        Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s