Recently I faced interesting Power Query problem. Actually, initial problem has nothing related to Power Query.
I required to export quite big volume of data from SAP BW using Business Objects Analysis addin (BO Analysis, or BOA) and save as CSV. Amount of data is literally huge, so it was impossible to export it in one query and even in 10.
I wrote a VBA macro with a loop, where each iteration had to 1) set variables and filters 2) refresh data 3) save result as csv.
So, I had to prepare set of variables and filters for each step of loop, setting filters on several dimensions.
Following four tables describe all possible filters I needed:
KPI, Business Area, Time Aggregation, Sales Channel.
Each set of variables and filters should look like combination
COGS, YTD, GE, 01
COGS, YTD, GE, 02
…
COGS, YTD, !GE, 01
…
COGS, !YTD, GE,01
…
In other words, all possible combinations of rows from these tables.
So, the task is simple: Staying in Excel build a table with all possible combinations of rows from provided tables with an indicator (ID) of Set (for further usage in macro)
Piece of cake for old-school VBA programmers.
“But what if I meet similar task in Power BI Desktop where no VBA?”
“Would be great to have Power Query function for such operation with tables… any list of tables!”
My first thought was: “Hah, Table.CrossJoin trick will solve this easily”.
I took Table1, added new column with Table2 for each row
Then expanded
But wait, how resulting table should look for two tables? Something like this:
So straightforward CrossJoin doesn’t work here.
Aha, seems like I can append table from previous step to itself and sort by values
I got desired 12 rows and logic of getting resulting table is peeping out: “Custom column, odd row – Value 1, even – Value 2”.
However, at this point I imagined nightmare when I add 3rd and 4th table.
So, I decided to find another way.
If we look at the task from different angle, we need to take
1st row from Table1, 1st row from Table2, 1st row from Table3, 1st row from Table4, then
1st row from Table1, 1st row from Table2, 1st row from Table3, 2nd row from Table4, then
..
1st row from Table1, 1st row from Table2, 1st row from Table3, 4th row from Table4, then
1st row from Table1, 1st row from Table2, 2nd row from Table3, 1st row from Table4, then
..
1st row from Table1, 1st row from Table2, 2nd row from Table3, 4th row from Table4, then
..
In other words, we get tuples of four elements
1111
1112
…
1114
1121
..
1124
1211
And so on.
Each of 4 dimensions in tuple depends on amount of rows in corresponding table.
1st position related to Table1, where 3 rows, so value can have values from list {1..3}.
2nd position – Table2, where 2 rows, so – 1..2
3rd – Table3, so – 1..2
4th- Table4, so – 1..4
In general, list { 1 .. Table.RowCount } defines dimension.
Having such lists I can List.CrossJoin them to get all possible combinations. Huge thanks to Tycho Grouwstra for this function.
I tried step by step
It gives a table
Where using Merge Columns, I got first list of pairs
Then repeated with 3rd table
And so on… got in the end table with 48 rows of all possible combinations
Each combination describes set of variables and filters for Run X, which can be identified in Index column
Finally, having reference to initial tables, I could pull corresponding rows from each table
Left to expand
All these operations can be transformed into a function.
In general, such function can receive a list of tables as argument, we don’t know number of tables. Hence, should use iterator, such as List.Generate to cross join all tables.
You may find function in my library of Power Query functions.
As usually, sample workbook is available here.
Happy combining!
Update:
I had a strong feeling that my approach is not optimal. Loops are old good way, but PQ always has secret ways, often not obvious. New workbook with three solutions, mine, from Imke Feldman, and Bill Szysz is available here.
Nice work Ivan 🙂 Thanks!
But I think there is a simpler way to do this 😉
UPD:Link to file with my approach in next comment
Regards
ooopss… Sorry Ivan, wrong link 🙁
Could you remove the previous one?
here is correct link
https://drive.google.com/open?id=0B6UlMk8OzUrxSXZuajJLLUxfczA
Hi Ivan,
Am wondering if you can help out.
I have a similar situation as above, but I believe it’s simpler.
Say In Table1, I have 4 columns, and in Table2 (which is just a list of dates), I only have 1 column.
So my question is, since there’s no direct relationship between those 2 tables, how can I ‘combined’ the 2 tables where for each record in Table1, I want to insert all of the dates from Table2
for example, Table1 is ‘ID’, ‘Name’, ‘DateOfSale’, ‘SaleAmount’.
1 , Ivan , Nov 1, 2017, $1,000
Table2 is just have 1 column ‘Date’
So my desired will be
‘ID’, ‘Name’, ‘DateOfSale’, ‘SaleAmount’, ‘Date’
1 , Ivan , Nov 1, 2017, $1,000 , Jan 1, 2017
1 , Ivan , Nov 1, 2017, $1,000 , Feb 1, 2017
1 , Ivan , Nov 1, 2017, $1,000 , Mar 1, 2017
1 , Ivan , Nov 1, 2017, $1,000 , Apr 1, 2017
1 , Ivan , Nov 1, 2017, $1,000 , May 1, 2017
1 , Ivan , Nov 1, 2017, $1,000 , Jun 1, 2017
Hi David,
You are right, this is simpler scenario.
I’d add dummy column in Table1 and Table2 with only one value – 1. Using this dummy column we can perform Merge operation (join). As all dummy values in all rows are the same, we will get crossjoin (Cartesian product) of two tables. Hope this explanation is helpful.
Check out also https://exceleratorbi.com.au/cross-join-with-power-query/
BR,
Ivan
Thanks Ivan for the explanation as well as giving me a link as a reference.
Learn something new today.