After everything about sharing content and requesting access to content, it is time to talk about discoverability of content in the Power BI Service. This is the next blog in the series on transforming a local solution into a global, enterprise-grade solution.
In this blog, we will look closer at the discoverability of datasets, and datasets only. As of today (August 2021), this feature is only available for datasets in the Power BI Service and relates to the datasets hub. It is a feature that contributes to driving a single source of truth and the reuse of datasets across many reports and dashboards. Let’s get started!
The discoverability feature is closely related to the sharing experience for datasets. In Power BI, datasets are only visible to those who have access to them, and they can be found in the datasets hub. However, this only works when people already have access to the dataset in one way or another, either through workspace access or because access to the dataset has been explicitly granted. This makes it difficult for people without access to find these trusted sources of data. Not only do they lack access to the dataset, they also do not know it exists, so they cannot even request access.
Since we are aiming for a single source of truth and want to avoid dataset duplication, we have to overcome this challenge. By using the discoverability setting of a dataset, which can be set by an Admin or Member in the workspace, the dataset becomes discoverable for everyone in the datasets hub, even for users who do not have access to it yet. Those users can then request build permissions for the dataset, as described in the previous episode of this blog series.
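To make the rule above concrete, here is a minimal sketch in Python of how visibility in the datasets hub could be reasoned about. This is only an illustrative model, not the Power BI API: the `Dataset` class and the functions `visible_in_hub` and `can_request_access` are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Hypothetical model of a dataset; not a Power BI API object."""
    name: str
    discoverable: bool = False
    users_with_access: set = field(default_factory=set)


def visible_in_hub(dataset: Dataset, user: str) -> bool:
    # A dataset shows up for a user who already has access,
    # or for anyone when it has been made discoverable.
    return user in dataset.users_with_access or dataset.discoverable


def can_request_access(dataset: Dataset, user: str) -> bool:
    # Requesting access only makes sense for a dataset the user
    # can see but does not yet have access to.
    return dataset.discoverable and user not in dataset.users_with_access
```

In this model, a user without access never sees a non-discoverable dataset, which is exactly the situation in which they cannot request access because they do not know the dataset exists.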
What is the datasets hub?
The datasets hub is an element of the Power BI Service which can be found in the left-hand menu. With the datasets hub you can easily discover the datasets you have access to, or find datasets that might be of interest to you once they are made discoverable. In general, there are two sections in the datasets hub: recommended datasets and all others. The recommended datasets are prioritized based on endorsement, meaning whether they are promoted or certified. More about content endorsement will follow later in this series.
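As a rough illustration of the two sections, the sketch below splits a list of datasets into endorsed ones and all others. It is an assumption-laden simplification: the function name `split_hub_sections` and the endorsement labels are hypothetical, and the actual logic the Power BI Service uses to pick and rank recommended datasets is not covered in this blog.

```python
# Endorsement labels used in this sketch (assumed spelling).
ENDORSED = ("Certified", "Promoted")


def split_hub_sections(datasets):
    """Split (name, endorsement) pairs into two lists of names.

    `endorsement` is "Certified", "Promoted", or None.
    Hypothetical helper; it does not call the Power BI Service.
    """
    recommended = [name for name, label in datasets if label in ENDORSED]
    others = [name for name, label in datasets if label not in ENDORSED]
    return recommended, others
```

Called with a mix of endorsed and non-endorsed datasets, the function returns the endorsed names in the first list and everything else in the second, keeping the original order within each list.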
Once you start exploring the datasets in the datasets hub, there are several things you can find and do here:
- All datasets you have access to, because you are part of the workspace or the dataset is specifically shared with you, either directly or via an App.
- Discoverable datasets, those which are owned by others but might be of interest to you.
- Find all related information about the datasets you have access to, such as:
  - Last refresh date
  - Related reports
  - Tables and columns in the dataset
  - Full lineage view
- Connect directly to the dataset with Analyze in Excel
- Chat in Teams about the dataset
- Start creating a new report on top of this dataset
Besides acting like a catalog within the Power BI Service, the datasets hub also helps dataset owners to quickly navigate to all datasets they own. The tab “My datasets” will show you all your datasets across the different workspaces you have access to.
Why should I make my dataset discoverable?
You could ask yourself, why should I make my dataset discoverable to others? Well, there are many good reasons to do so. Let’s start with the fact that it helps to drive a single source of truth. If there are many initiatives in your organization that all start analyzing the same data, from the same source, it is not uncommon for the outcomes to differ. Therefore, having one version of the truth, represented by a single dataset, would help. By making your dataset discoverable, it will be easier for others to explore already existing datasets through the datasets hub in the Power BI Service before they start building their own data model.
Besides aiming for a single version of the truth, there is also a technical advantage of having a single dataset pulling the data from the source to Power BI. It simply lowers the impact on source systems by only pulling the data once.
All this aside, I can think of a third reason to make your dataset discoverable to others. If you are the owner of a process or data source within your organization, it makes sense that you want to control access to that data for analytical purposes as well. If you build the dataset yourself and make it discoverable in Power BI, you can set access levels in the Power BI dataset that match the permissions set in the source system itself. If you allow every user to build their own datasets, they might share them with unauthorized users, which can lead to a security incident.
Personally, I believe that every enterprise-grade dataset should be discoverable through the datasets hub. Especially as these datasets are often driven by teams or processes in the organization that are aiming for relevant insights and a single source of truth.