How I ruined Power BI Desktop file

This is a short story of how I ruined one of my Power BI Desktop files. Luckily I stored it on SharePoint where I have versioning enabled, so I could restore a working version.

In that Power BI report I used an Excel file as a source and decided to turn it into a parameter for other queries – Binary parameter.

BTW, I faced that issue using Power BI Desktop June 2019 version from MS Store.

Parameter

When you create a parameter in Power BI Desktop, you can choose type Binary, but cannot set a value – fields are greyed out

In my case, I wanted to link a parameter to another query. There is no such option in the UI.

So, I created a parameter and opened it in the Advanced Editor to change type to Binary and replace value with a reference to another query #”SourceFile” .

My Mistake

I wrongly entered parameter’s Type, used “binary” instead of “Binary”. I don’t usually use this type, so didn’t remember the correct spelling, I though it should be in lowcase as other frequently used types “text” or “number”.

And got an error message

I reported an issue and clicked Cancel.

Then by mistake clicked on the Close button which closed window with parameter’s code with no questions

For some reason I decided to “Close and Apply” at certain point.

I didn’t know this will be the end of story for my report.

Soon, I tried to get back to the queries but couldn’t. Yes, I couldn’t open Power Query Editor window anymore – lost access to this area of Power BI Desktop file.

Every time I’m trying to open it I see this error window

Conclusion

Those who watched Chernobyl by HBO, probably remember old good Russian proverb “Trust but verify” (Доверяй, но проверяй) said by the head of KGB in one of the episodes. This applies to your files and the way you work as well. “Trust” that everything will be OK with your files, but “verify” by storing them in the locations with versioning enabled such as OneDrive or SharePoint and don’t forget to save frequently during active development phase.

Yammer Analytics in Power BI Part 2

This is the second post in the series “Yammer Analytics in Power BI”

1 – Intro to Yammer Analytics in Power BI

2 – Paging through Yammer group messages

3 – Paging through Yammer user messages

In the first post I shed some light on simple Yammer authorization approach and general Yammer API URLs.

In this post, let’s try to build a function to get group messages with their attributes.

(more…)

Intro to Yammer Analytics In Power BI

yammer group analytics

Have you heard about Yammer? It is a social network for organizations, kind of a “private Facebook”.

Let’s imagine your organization is using Yammer as an internal platform for digital collaboration, engagement, knowledge sharing etc.

It is a large international organization with employees around the globe (think big).

I have access to Yammer in one of such organizations since 2014 and in November 2014 I founded a group dedicated to “Data Analysis & Reporting” related questions.

IMHO, any large organization needs something like this. There are usually tons of company-specific questions around data & reporting: where to get data from, how to clean / transform, how to interpret etc.

With the good thoughts in my mind I started utilizing Yammer to spread the knowledge, share tips and tricks on various topics: Excel, VBA, Power BI, O365 services, SharePoint, MS Flow etc..

Why not in public place? Because of company-related technologies/data on screenshots and videos.

After a certain time I was obviously keen to know stats about my group. Does anyone read my posts?

Back in the days we used https://analytics.tryane.com for that. Great tool I have to admit!

Below are stats of my group after 1 year



It is not a secret that adoption of internal social media in any organization takes time.

That screenshot was made in Nov-15, pretty much one year since Yammer was widely introduced in the organization.

People were reluctant to use Yammer in the beginning. As of today, many are still considering it as evil, “waste of time” etc. But this is not something I want to talk about today.

You are here to know how to get data from Yammer to Power BI in the end.

Before pulling data to Power BI and reinventing the wheel, you might want to consider existing options like Tryane analytics or something similar.

For example, by default we have Yammer’s Group Insights, which it not a bad thing I should say, but too high-level.

Below is the last 12 months report for my group with 556 members at the time of writing.

Default ‘Group Insights’ feature gives a very high-level overview as you may see.

And we always want more, aren’t we? 🙂

Another option would be a solution from tyGraph for Yammer! Yes… but I couldn’t get it working. Perhaps, because I’m not a Verified Admin.

Ooooookay, but I just need the info about messages in my group and my personal messages (how many likes do I have, eh?)

Building Yammer analytics on Power BI

Fortunately, Yammer has the REST API.

https://developer.yammer.com/docs/rest-api-rate-limits

Documentation is far not perfect, but it provides an idea on where to start.

There is also a link to mass Data Export from Yammer, but again – for verified admins only.

Good news! If you are not a verified admin, you still can use GET requests to API.

Firstly, we need to connect to Yammer API. One of the easiest options would be using Microsoft Flow and export necessary data to CSV / Excel / Dataflow

However, O365 admins can limit list of allowed for MS Flow connectors.

As you may guess, for me this option doesn’t work. But you can give it a try!

Alternatively, consider making GET requests using Power Query to get data from Yammer

Explanation of methods in that documentation is not very detailed and not everything is covered.

For example, I was looking for a way of exporting all my messages.

In other words, “all messages from a user with a certain ID”.

Ended up Googling for that query.

Crikey! Seems like Queensland government uses Yammer!

Google has indexed one of their documents

Link to that doc: https://www.forgov.qld.gov.au/file/21891/download?token=5DVxAoBr

It covers following

1. export all messages from a group

2. export new messages

3. export group members

4. export all messages from a user

5. export all messages in a conversation

Querying Users of a Certain Group

Shortly, for getting users from Yammer group we can use following pattern

https://www.yammer.com/api/v1/users/in_group/[groupid].json

To get response in JSON format, or

https://www.yammer.com/api/v1/users/in_group/[groupid].xml

To get response in XML format.

For paging we need to add “page=N” as a query option

https://www.yammer.com/api/v1/users/in_group/[groupid].json?page=1

Using paging is as important as breathing, because by default Yammer returns only limited number of records.

For example,

https://www.yammer.com/api/v1/users/in_group/[groupid].json

Returns only 50 users per page.

If you are eager to test this yourself, open your group page in a browser and get Group ID from the URL.

Build a query string using pattern above and navigate to it in the same browser, just in a different tab.

That way you don’t need to enter credentials second time as you has been authorized in that browser (when opened your group page).

On the screenshot below you may see fields provided by a standard response to https://www.yammer.com/api/v1/users/in_group/[groupid].json

As it is not an OData service, we don’t receive NextLink in the response, hence cannot utilize code from TripPin Tutorial (best PowerQuery vs OData tutorial available on the web today).

The only indicator of the next page existence is “more_available” tag, which keeps “true” value until we get to the last page.

It turns to “false” once we reach the last page

I have to add here, that current version of Yammer API doesn’t allow us to define list of fields we need, so we receive a standard response (leave a comment if I’m wrong).

What could be good to know about user for the resulting analysis?

User ID – can be used to count # of unique users

Name – to display user names when we drill down to user level

Job Title – allows to roughly count number of managers reading post

Mugshot_Url – link to an employee photo.

Activated_at – allows to split users on new/old employees.

Stats – users with the large (higher than average) number of followers can be treated as “VIP-employees”, so we can count likes from them as “super-likes”.

Web_Url – could be useful if you need to provide a link to user profile from your report

Rule of thumb: “keep your data model lean – don’t import columns that you don’t plan to use”.

As it is an “intro” post, I won’t publish queries with paging here yet, it will be in the next post.

From Theory to Practice

“Enough theory, show us practicalities!” – I clearly hear from you, dear reader. Fair enough!

Querying API from Power BI is not the same as navigating to the URL in a browser because we have to provide authorization token with each individual request.

There is a separate section about authentication in the Yammer API documentation

https://developer.yammer.com/docs/api-requests

IMHO, the simplest way is to use “Authorization” query option and provide a “Bearer value”

Where can we get a token value from?

Navigate to: https://developer.yammer.com/docs/userscurrentjson

Find a “Key icon”, Log In, copy token

This is a temporary token, that will expire after a certain time, but this is the simplest option for getting started.

Alternatively, you can register an application and get developer’s token, read more here: https://developer.yammer.com/docs/test-token

Finally, we can create a query in Power BI.

Query to Yammer in Power BI

Go to New Query -> From Web, switch to Advanced Mode

Use https://www.yammer.com/api/v1/users/current.json to test your connection.

Paste URL and add Header “Authorization”.

Type and paste “Bearer token” (don’t miss a space between “Bearer” and “token value”)

Click “OK” to test your query, you should receive an info about yourself

That’s all for today.

In the next post I’ll explain how to build a query with paging through the group members and messages, and through specific user’s messages.

Stay tuned and thank you for reading!

Filter Excel Function CUBEVALUE Connected to Power BI

cubevalue cubeset formula

You might know that we can connect Excel to Power BI using “Analyze in Excel” feature.

If this is something new for you – check out following video (3 min)

By default, after clicking on “Analyze in Excel” you get Excel file with Pivot Table connected to Power BI.

There you can add rows, columns, slicers – everything as usual.


But what if Pivot Table is not what you need?

What if you have pixel perfect dashboard which you need to fill with CUBEVALUE formulas (or maybe you use Think Cell addin)?

And what if you want dropdown list of values in cell instead of slicers (or want filter to be generated via formula)?

Let’s check what we can do.

CUBEVALUE formulas

The easiest way to get CUBEVALUE formulas instead of Pivot Table is to convert Pivot Table to Formulas


That action deletes Pivot Table and replaces all its cells with functions CUBEMEMBER…


 

…and CUBEVALUE


As you may see, CUBEVALUE is referencing to a Slicer with countries and a cell with a CUBEMEMBER function.

Replace single value slicer with a cell

Everybody likes slicers, me too. However, in certain scenarios it is better to have a cell with value instead.

Why?

(more…)

Data type conversion in custom columns

data type conversion

You probably know that you can manually set data type for custom column created in Power Query with help of Table.AddColumn function (using 4th argument).

Rick de Groot has recently published a good post about this at Gorilla.BI.

Just be sure you use right data type for the result of your calculation.

Once I faced strange Power Query behavior, as I thought, but later I understood that everything works fine (probably).

(more…)

Export PowerQuery query to CSV

Recently I found a PowerQuery gem, trick with Java/VB Script that allows to export data from Power Query to CSV without R / DAX Studio / SMS and Registration. However, related with risk. As everything else in our life.

Kudos to user Shi Yang from Stack Overflow who replied to How to write to DATA sources with Power Query?.

Shi proposes to use following code (extended with my comments)

Let
     // reference to a query you wish to export to CSV
    Source = ReferenceToYourTableOrQuery,
     // demote headers to have headers in resulting CSV
     // if you don't need headers, remove Table.DemoteHeaders
     Json = Text.FromBinary(Json.FromValue(Table.ToRows(Table.DemoteHeaders(Source)))),
     // trigger execution of script
  Export = Web.Page("
 var fso=new ActiveXObject('Scripting.FileSystemObject');
 var f1=fso.CreateTextFile('C:/Temp/test.csv',true);
 var arr=" & Json & ";
 f1.WriteLine(arr.join('\n'));
 f1.WriteBlankLines(1);
 f1.Close();
 ")
 in Export

All great, but this method doesn’t work with default settings of Internet Explorer.

(more…)

How to track refresh time in Power BI Desktop

Usually, I use Power Pivot and VBA in Excel to measure Power Query performance by comparing refresh time.

But, I suppose Power BI Desktop refresh process may be different, therefore would be nice to have something that would allow measure time between start and end of refresh.

Unfortunately, we do not have VBA in Power BI Desktop, nor can trigger and monitor refresh of Power BI Desktop from another application. So, we have only M and queries.

For the experiment I’ve created a new Power BI Desktop file, three queries in it, and put queries in order Start-Delay-End

Start

= DateTime.LocalNow()

Delay

= Function.InvokeAfter(()=> DateTime.LocalNow(), #duration(0,0,0,3))

End

= DateTime.LocalNow()

“Delay” must make a 3-seconds delay. You may read about Function.InvokeAfter in old good post from Chris Webb.

The idea is naive – hope that Power BI Desktop will execute queries in the same order as in the list of queries.

If so – query “Start” will load start time, Delay will make a pause, then “End” will load time of the end of refresh.

However, by default, Power BI Desktop loads tables in parallel, to optimize load time.

This property can be found in Options -> Current File ( Data Load ) -> Enable parallel loading of tables

This really works. I was receiving same time for Start and End tables while this property was enabled.

When I disabled it, I finally got desired difference between Start and End

Then I changed order of queries in the list of queries to check whether it impacts on execution order

Result shows that “Yes – order does matter“:

Of course, Power Query engine must be generating Execution plan when user press Refresh and then follows it. But in simple scenarios, when “Parallel loading of tables” is disabled, seems like Power BI Desktop follows order of queries from Query Editor.

I have no groups of queries, have no references between queries etc. It allowed me to check load time.

What about complex models

I tried to use same technique in more complex file – with groups of queries (but with no references).

I created query “Start” and placed it into the first group.

Similar for “End” query – but in the last group.

Result seems correct

Conclusion

Above is, of course, not a serious solution. Mainly, because you won’t want to disable parallel loading of tables, and won’t rely on order or queries.

Nevertheless, it would be good to have total refresh time directly in the model. It would allow to monitor refresh time of growing datasets.

More sophisticated way of query engine processing analysis is hidden in diagnostics of trace logs. You may read about this in several posts from Chris Webb here, here, and here.

Set of CSVs as a database for Power BI

What if part of your reporting database is a set of CSV files?

Apart from possible problems with different number of columns, data types, delimiters, encoding etc., you have to care about performance.

According to my practice, large number of files kills productivity. It is better to firstly combine CSV / TXT files into one, then use Power Query to load it.

(more…)

Shift cells up/down in same column in Power Query

In my previous post I wrote about one interesting technique used by my colleague Zoltán Kaszaki-Krsjak.

Categoty_tmp = Table.AddColumn(Buffer, "Category_tmp", each Buffer[Category]{[ID2]}?),
// It helps to shift values of column [Category] one row upwards.

It is a trick that you most probably will not use in any of your solutions. However, worth to know how it works and how to do this without adding new columns.

“each Buffer[Category]{[ID2]}?” is just a function, so we can combine it with methods described in one of my posts (Transform Column Using Custom Function).

Here is the code.

let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Duplicated Column" = Table.DuplicateColumn(Source, "Category", "Method1"),
#"Duplicated Column1" = Table.DuplicateColumn(#"Duplicated Column", "Category", "Method2"),
#"Duplicated Column2" = Table.DuplicateColumn(#"Duplicated Column1", "Category", "Method3"),
Buffer = Table.Buffer( #"Duplicated Column2" ),

Method1 = Table.FromRecords( Table.TransformRows( Buffer, each [Index = [Index],
    Category = [Category],
    Method1 = Buffer[Method1]{[Index]}?,
    Method2 = [Method2],
    Method3 = [Method3] ] ) ),

Method2 = Table.FromRecords( Table.TransformRows( Method1, (row) =>
    Record.TransformFields( row,
    {"Method2", each Buffer[Method2]{ row[Index] }? } ) ) ),
    // method offered by Miguel Escobar in comment to previous post

Method3 = Table.ReplaceValue(Method2, each [Method3], each Buffer[Method3]{[Index]}?, Replacer.ReplaceValue, {"Method3"})
in
Method3

File with methods is here.

But which method is faster?

(more…)

Transform Column using custom function in Power Query (part 2)

Year ago I wrote a post Transform table column using own function in Power Query. According to stats, it is quite popular topic, and I’m not surprised.

However, I think that old post is quite complex and too long. Advice should be shorter.

In addition, that post do not cover another very important scenario, when you need to transform column using value from another column.

Consider following case


Task:

  1. We need to Trim column [Name]
  2. We need to divide [Amount] by [Counter] (and do not want to add additional column)

Let’s do this.

(more…)