In this post I consider two types of hierarchies
First table defines typical hierarchy of companies.
Usually, such hierarchy is used for financial reports to group key figures.
Most probably, we know number of hierarchy levels upfront, but not always.
Task: Expand hierarchy, automatically detect quantity of levels in resulting table using Power Query (Get & Transform) / Power BI
Stay close to scenario described in pattern: http://www.daxpatterns.com/parent-child-hierarchies/
Goal:
Second table defines substitution of products.
Case: sometimes we required to get sales history of all predecessor products and group result on Actual Product. There can be multiple substitutions, we never know amount.
Task: In this case we need to find the latest successor, Actual Product for each product in other words.
Goal:
I’ll start from second task – find Actual Product.
As we don’t know amount of substitutions we need an iterator, such as List.Generate function, that allows to easily create loops in Power Query (Get & Transform) / Power BI.
First attempt
fGetActualProduct = ( product as text, optional prev_substitutions as nullable list ) =>
let
prev_substitutions = if prev_substitutions = null then {} else prev_substitutions,
#”Filtered Rows” = Table.SelectRows( buffer, each [Product] = product ),
actual_product = if not Table.IsEmpty( #”Filtered Rows” ) then
// found something – take first element and check it
if #”Filtered Rows”{0}[Substitute] <> null and #”Filtered Rows”{0}[Substitute] <> “”
// if also substituted – look for next
then
// check if was substituted earlier
if not List.Contains( prev_substitutions, #”Filtered Rows”{0}[Substitute] )
then @fGetActualProduct( #”Filtered Rows”{0}[Substitute],
List.Combine( { prev_substitutions,
{ #”Filtered Rows”{0}[Substitute] } } ) )
else “-” & product // error in hierarchy
// otherwise – take what came as argument
else product
else product
in
actual_product,
I had to add second argument “prev_substitutions” because I faced stack overflow error, when found circular reference in own data.
When I worked on solution for 1st problem, I realized, that I can avoid filtering of entire table
#”Filtered Rows” = Table.SelectRows( buffer, each [Product] = product ),
Instead, I can use reference to list element, where list is a Product column of initial table.
For example:
buffer[Substitute]{ List.PositionOf( buffer[Product], product ) }
We can find position of product in column Product and get element from column Substitute with the same position.
I’m sure List.PositionOf works much faster than Table.SelectRows, especially, if table contains many other columns. Sometimes we can skip tests and simply believe, right?
Table.SelectRows technique comes from semi-UI approach, when I firstly tried to make action using UI, then change M code. Same story as with VBA macro recorder :-).
So, I re-wrote function
fGetActualProduct2 = ( product as text, optional prev_substitutions as nullable list ) =>
let
prev_substitutions = if prev_substitutions = null then {} else prev_substitutions,
position = List.PositionOf( buffer[Product], product ),
actual_product = if position >=0 then
// found something
if buffer[Substitute]{ position } <> null and buffer[Substitute]{ position } <> “”
// if also substituted – look for next
then
// check if was substituted earlier
if not List.Contains( prev_substitutions, buffer[Substitute]{ position } )
then @fGetActualProduct2( buffer[Substitute]{ position },
List.Combine( { prev_substitutions, { buffer[Substitute]{ position } } } ) )
else
“-” & Text.From( product ) // error in hierarchy
// otherwise take what came as argument
else product
else product
in
actual_product,
Left to apply this function for all rows with substituted products.
1st problem took slightly more time. Firstly, I thought that I can make it with help of multiple joins, using List.Generate, but this approach led me to wrong result.
If we take a look once again on expected result
Produced in DAX:
We may see empty cells starting from Level 2. With “Join” we cannot do this.
In addition, if look closer on Level 2… How to get, using Join, Company3 for all those Childs?
Multiple Joins – many new columns, then new column with formula, then repeat for another level… probably it is possible. You may try if you want.
I went another way.
Firstly, took as base that Root element has reference to itself, so Child=Parent.
RootElement = Table.SelectRows( buffer, each [Child] = [Parent] )[Child]{0},
Then got Hierarchy Depth for each Child
// Get depth of each child
fGetDepthLevel = (child as text, optional level_counter as nullable number) =>
let
Level = if level_counter = null then 2 else level_counter,
parent = buffer[Parent]{ List.PositionOf( buffer[Child], child ) },
res = if parent = RootElement then Level else @fGetDepthLevel( parent, Level + 1 )
in
res,
And added as new column
buffer_w_HD = Table.Buffer( Table.AddColumn(
Table.AddColumn( buffer, “HierarchyDepth”, each
if [Child] = [Parent] then 1
else fGetDepthLevel( [Child] )
),
“Level 1”, each RootElement ) ),
Now we can say that Hierarchy Depth is equal to number of Levels in resulting table
HierarchyDepth = List.Max( buffer_w_HD[HierarchyDepth] ),
So, what left? Add several columns for Levels.
E.g. in Level 2 column we should have for each Child: Parent from Level 2 of Hierarchy, or better to say Depth=2.
We will need following function to get Parent from Level X of Child-Parent Hierarchy
fGetParentByLevel = (child as text, parent_level as number) =>
let
// search position of Child and take its Hierarchy Depth
parent_level_result = buffer_w_HD[HierarchyDepth]{ List.PositionOf( buffer_w_HD[Child], child ) },
// if this is desired Depth – take this Child
res = if parent_level_result = parent_level
then child
// otherwise – search further
else @fGetParentByLevel( buffer_w_HD[Parent]{ List.PositionOf( buffer_w_HD[Child], child ) }, parent_level )
in res,
Ok, and the final step – List.Generate to add required columns with levels
table_generator_buffered = Table.Buffer(
List.Last(
List.Generate( ()=>
[i = 2, T = buffer_w_HD],
each [i] <= HierarchyDepth + 1,
each [i = [i] + 1,
T = Table.AddColumn( [T], “Level ” & Text.From( [i] ), each if [HierarchyDepth] >=i-1 then fGetParentByLevel( [Child], i-1 ) else null ) ],
each [T] ) ) ),
Pay attention to row in List.Generate
T = Table.AddColumn( [T], “Level ” & Text.From( [i] ), each if [HierarchyDepth] >=i-1 then fGetParentByLevel( [Child], i-1 ) else null ) ],
“i” is used with square brackets (first part) and without them (second part – under “each”)
For the moment I should say “Works? -Works! -Don’t touch!”.
My first intention was to use [i] in all places of formula, but compiler couldn’t find it
Seems, everything that is under “each” words is considered by compiler as under “filter context” of Table in first argument. Hence, when we use square brackets PQL engine tries to find field [i], which doesn’t exist in table T.
You may download workbook with described samples – here.
Afterwards,
If your hierarchy is based on IDs of Companies / Products / Employees, for example
Child | Parent | Name |
1 | 1 | First |
2 | 1 | Second |
3 | 3 | 3rd |
4 | 1 | 4th |
But you want to have hierarchies with Names – join table with itself. Add Parent_Name column and use it to build desired hierarchy.
Flexible hierarchy is good and bad.
If today you have less levels than yesterday, some columns in PowerPivot / Power BI / Visuals / Pivot Tables / will disappear. It can lead to undesired results, broken visuals, DAX etc.
As soon as you are sure that hierarchy is very slowly changing object in you model, using PQL approach you get less DAX formulas in your model, less calculated columns – then switch off update of query in Power BI.
Plan properly.
Wonder what is PQL? Read the proposal for official TLA for M Language.
This is an excellent utility function. Thanks!
nice article thank you very much
Is there an example of being able to perform the opposite ?