2016-11-08

One Power BI feature that almost passed me by (because it was released in August while I was on holiday) was the ability to create streaming datasets in the Power BI web app and push data to them via the Power BI REST API. This blog post has the announcement:
https://powerbi.microsoft.com/en-us/blog/real-time-in-no-time-with-power-bi/
The documentation is here:
https://powerbi.microsoft.com/en-us/documentation/powerbi-service-real-time-streaming/
And Charles Sterling has an example of how to use it with Flow and PowerApps here:
https://blogs.msdn.microsoft.com/charles_sterling/2016/10/17/how-to-create-and-customize-applications-with-powerapps-with-laura-onu-webinar-1020-10am-pst/

However, when I played around with this I found there were a few things that were either confusing or not properly documented, so I thought it would be useful to give an example of how to use this functionality to automatically synch data from a table in Excel to Power BI using a Power Query query.

Creating the streaming dataset in Power BI

Imagine that you have a table called Sales in an Excel workbook on your desktop:



There are three columns: Month and Product, which contain text values, and Sales, which contains an integer value. This is the data that we want to push up to Power BI.

The first step is to create a streaming dataset in Power BI to receive this data. Go to PowerBI.com and on the left-hand menu, under the Datasets heading, you’ll find a link called Streaming Datasets right at the bottom next to the Get Data button:



Click it and you’ll go to the Streaming data screen. Click on the “Add streaming dataset” button in the top-right to create a new streaming dataset:



Choose the API option in the pop-out pane then click Next:

Then give your dataset a name, enter the names and data types for the columns in the table and leave the Historic data analysis option turned off (we’ll come back to this later):

Hit Create and you’ll see a screen showing the URL to use to push data to the dataset and an example of the JSON to use to send the data:

Copy the URL and put it somewhere handy – you’ll need it in a moment.

Pushing data to the streaming dataset from Excel

Back in your Excel workbook, open the Power Query Query Editor window and create a new text parameter called PowerBIStreamingDatasetURL and paste in the URL for the streaming dataset:

Next, create a new blank query and use the following M code:

This query does the following:

Reads the data from the Sales table in the workbook

Converts the data to JSON (for some background on how it does this, see here)

Sends the data to the streaming dataset using Web.Contents() to make a POST request. See this post on how to make POST requests using Web.Contents() and this post on the technique I’m using to handle HTTP errors manually.

Returns a table containing a message saying whether the data was updated successfully or not, and the time of execution like so:

Finally, back in Excel, go to the Data tab on the ribbon, click on the Connections button to open the Workbook Connections dialog, select the connection that represents the query you’ve just created, click Properties, then in the Connection Properties dialog tick the “Refresh every” box and set the query to refresh automatically every minute:

Displaying data in a Power BI dashboard

Back in the browser in Power BI, create a new dashboard, click the Add Tile button and choose the Custom Streaming Data option:

Click Next and select the streaming dataset created earlier:

Click Next again and then choose Clustered bar chart as your Visualization Type, select the Month field of the dataset for the Axis, Product for the Legend…

…the Sales field for the Value and set the time window to display to 1 second:

Frustratingly there’s no way to create a measure or otherwise aggregate data here. In this example you’re using all of the fields in dataset in the chart; if you left out Product, however, you wouldn’t see aggregated sales for all products you would just see data for one (the last?) row in the table for each month.

Finally, set a title for the chart:

You now have a dashboard that gets updated automatically and shows the data from the Sales table in the Excel workbook:

When you change the data in Excel, after the Power Query query has run in the background every minute, the new data will appear in the chart.

[Be aware that it might take a few minutes for everything to start working when you first create a new tile]

Other ways of visualising the data

There are other types of data visualisation your can use such as line charts that are all very straightforward. One thing that did confuse me was the card visual: it shows one number, but which number? In this example if you create a card and link it to the Sales field in the dataset, it will always display the value from the last row in the table:

Again, it would be really nice if there was a way of creating a measure here…

The Historic Data Analysis option

You may remember the Historic Data Analysis option from an earlier step. What happens if you turn it on? Basically, instead of storing just one copy of the table you push through the API it stores multiple copies of the table (although it doesn’t store everything – I guess it’s subject to this retention policy or something similar). For example, consider the following variation on the streaming dataset above:

There’s a new field called UpdateDateTime (which is of type text, not datetime, because I found this worked better in reports) and the Historic data analysis switch is turned on.

Here’s an updated version of the Power Query query that populates the UpdateDateTime field with the date and time that the query was run:

You can download a demo workbook with this second example query in here.

The dashboards now work in more or less the same way. The Time Window To Display option that we set to 1 Second above can be used to control the number of copies of the pushed table that are displayed. For example, setting it to five minutes shows data from all of the copies of the table pushed in the last five minutes:

[Incidentally, if you’re using the Power Query query above you’ll see that every time the query runs, the web service is actually called twice! This is a feature of Power Query and M in general – there’s no guarantee that the web service will be called just once even if the query itself is executed once. This is why the dev team always tells people never to use Power Query to update data in a data source (see here for another example of this)]

You now also get a new option to create a report from a streaming dataset on the Streaming Data screen – you need to click the small graph icon next to the name of the streaming dataset:

So now you can create reports that show how the data in your Excel table has changed over time, and slice by the values in the UpdateDateTime field:

It’s important to realise that unlike the dashboards, reports connected to a streaming dataset don’t refresh automatically – you have to click the Refresh button on the report.

Conclusion

Although the examples in this post are perhaps not all that practical, hopefully they show what’s possible with the streaming API and some M code. It would be great if we could do data modelling-type stuff like add measures in the web interface, in the same way that we can in Power BI Desktop, because that would open the door to doing even more cool things with streaming data.

Show more