This is the third post in the series “Yammer Analytics in Power BI”
1 – Intro to Yammer Analytics in Power BI
2 – Paging through Yammer group messages
3 – Paging through Yammer group members
4 – Paging through Yammer user messages
Group Members
Querying group users (members) is a relatively simple task as we have an API for that
https://developer.yammer.com/docs/usersin_groupidjson
Note that API returns only 50 users per page, meaning we have to build a loop through unknown number of pages.
Basic Patterns
For getting users from Yammer group we can use following pattern:
https://www.yammer.com/api/v1/users/in_group/[groupid].json
or to get response in XML format
https://www.yammer.com/api/v1/users/in_group/[groupid].xml
For paging we need to add “page=N” as a query option
https://www.yammer.com/api/v1/users/in_group/[groupid].json?page=1
Using paging is as important as breathing, because as we may see from documentation, Yammer returns only 50 records per query.
It is not said in the documentation but API returns [more_available] tag
So we can use it to exit loop.
First Step
Let’s firstly build a “response catcher”. A function that will transform web response into a meaningful form.
We will be providing URLs with a following pattern
url_base = "https://www.yammer.com/api/v1/users/in_group/" & GroupID & ".json"
as an argument to a function fGetUsersPage:
(url as text) as table => let response = Web.Contents(url, [Headers=[Authorization=#"Authorization"]]), body = Json.Document(response), moreavailable = try Logical.From( body[more_available] ) otherwise false, users = body[users], #"Converted to Table" = Table.FromList(users, Splitter.SplitByNothing(), null, null, ExtraValues.Error), #"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"id", "state", "full_name", "job_title", "mugshot_url", "web_url", "activated_at", "stats", "email"}, {"id", "state", "full_name", "job_title", "mugshot_url", "web_url", "activated_at", "stats", "email"}), #"Expanded stats" = Table.ExpandRecordColumn(#"Expanded Column1", "stats", {"following", "followers"}, {"following", "followers"}) in #"Expanded stats" meta [ MoreAvailable = moreavailable ]
This function returns a page with list of users (up to 50 items) and some of their attributes available through API. On top of this, response will contain a meta parameter – MoreAvailable, which takes true/false value. This meta flag will help us to stop “do-while” loop.
Pagination
I prefer to use List.Generate function for loops in Power Query, so I ended up with the following code that iterates through pages with Group Members and returns a table with users and their parameters
let Delay = 1, GroupID = Text.From( GroupID ), url_base = "https://www.yammer.com/api/v1/users/in_group/" & GroupID & ".json", Source = List.Generate( ()=> [ i = 2, url = url_base, Page = fGetUsersPage(url_base), more = true, last_page = false // change to true when last page reached ], // arg2 = do while each [last_page] = false and [i]<100, // hardcoded limit, increase for groups with more than 5000 members // arg3 - iteration each [ i = [i] + 1, Page = try Function.InvokeAfter( ()=>fGetUsersPage(url_base & "?page=" & Text.From([i])), #duration(0,0,0,Delay) ) otherwise null, more = try Value.Metadata(Page)[MoreAvailable]? = true otherwise false, url = url_base & "?page=" & Text.From([i]), last_page = if not [more] then true else false // [more] in square brackets is a reference to the result of previous iteration ], // arg4 - output of each iteration each [[i], [url], [Page], [more], [last_page]] ), #"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error), #"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"i", "url", "Page", "more", "last_page"}, {"i", "url", "Page", "more", "last_page"}), #"Removed Other Columns" = Table.SelectColumns(#"Expanded Column1",{"Page"}), #"Expanded Page" = Table.ExpandTableColumn(#"Removed Other Columns", "Page", {"id", "state", "full_name", "job_title", "mugshot_url", "web_url", "activated_at", "following", "followers", "email"}, {"id", "state", "full_name", "job_title", "mugshot_url", "web_url", "activated_at", "following", "followers", "email"}), #"Changed Type" = Table.TransformColumnTypes(#"Expanded Page",{{"id", Int64.Type}, {"state", type text}, {"full_name", type text}, {"job_title", type text}, {"mugshot_url", type text}, {"web_url", type text}, {"email", type text}}), #"Changed Type1" = Table.TransformColumnTypes(#"Changed Type",{{"activated_at", type datetimezone}}), #"Extracted Date" = Table.TransformColumns(#"Changed Type1",{{"activated_at", DateTime.Date, type date}}), #"Changed Type2" = Table.TransformColumnTypes(#"Extracted Date",{{"following", Int64.Type}, {"followers", Int64.Type}}), #"Inserted Age" = Table.AddColumn(#"Changed Type2", "Age", each Date.From(DateTime.LocalNow()) - [activated_at], type duration), #"Inserted Total Years" = Table.AddColumn(#"Inserted Age", "Total Years", each Duration.TotalDays([Age]) / 365, type number), #"Removed Columns" = Table.RemoveColumns(#"Inserted Total Years",{"Age"}), #"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Total Years", "UserYears"}}), #"Replaced Value" = Table.ReplaceValue(#"Renamed Columns","48x48","100x100",Replacer.ReplaceText,{"mugshot_url"}) in #"Replaced Value"
Highlights
I’d like to highlight some parts of this code as you might want to re-use these techniques.
1. To add a delay between API queries we can use Function.InvokeAfter
Page = try Function.InvokeAfter( ()=>fGetUsersPage(url_base & "?page=" & Text.From([i])), #duration(0,0,0,Delay) ) otherwise null,
2. Use “try … otherwise …” to suppress potential errors, unless you really want to fail entire refresh process when one of the pages cannot be returned.
3. And the last tip I find really helpful – return all record fields in 4th argument of List.Generate
// arg4 - output of each iteration each [[i], [url], [Page], [more], [last_page]]
It helps a lot with debugging as I may see steps of List.Generate: page index, URL, returned page table and “loop stopping” parameters (“more” and “last_page” in this case):
That’s all for today.
Stay tuned. New posts are coming hopefully soon.