Callum

Callum Green's Blog

Loop Through Webpages with Multiple Parameters in Power Query

Back in February 2016, I posted a step by step article (found here) on how to loop through a website and dynamically pull down multiple pages into a single table in Power Query. I was recently approached by a gentleman who had read my blog, wanting to enhance the looping process. The requirements were:

Current Problem:

-          The user can currently only loop through Year, having to hard code the Page and Month.

Requirements:

-          Dynamically loop through three webpage parameters and output them into a single table – highlighted below:

http://www.boxofficemojo.com/monthly/?page=1&view=calendargross&yr=2017&month=1&p=.htm

-          Specify the Page, Year and Month to loop through

o   Years 2010 - 2017

o   Months 1 - 12

o   Pages 1 - 10

Parameters

Whilst GUI Parameters are not a new concept in Power Query, they did not exist when I wrote my initial blog.  As a result, I had to code a parameter into the M query and then define it as a function.  A similar concept will be adopted here, but the out of the box parameter functionality makes life so much easier now.

I previously wrote an article explaining how to set up parameters (here), so I will simply show you the output of them below:

clip_image002[4]

Note – all three parameters are set to Text.  This is needed in order to pass the parameters into the website address later on.

Web Connection

The first step is to import the data from the website (shown above) using the web connector.  To pass our parameter values into the URL, click the cog in the first Applied Step – typically called ‘Source’.

clip_image004[4]

This will open a new window, where by design, the hardcoded URL is shown under the Basic radio button.  Select Advanced and you will see additional configuration capabilities.

clip_image006[4]

URL Parts enable you to build up elements of the webpage, whilst dynamically passing in parameters.  The URL Preview will show you how the URL looks with any parameters or URL Parts.  All other options can be ignored for this example.

The URL Parts had to be broken down quite significantly and I will show you both the code and how it looks in the Advanced window.

User Interface:

clip_image007[4]

The sections highlighted in red are the parameters and sit in between some of the hard-coded URL text

Code Breakdown:

-          Text = http://www.boxofficemojo.com/monthly/?page=

-          Parameter = [Page]

-          Text = &view=calendargross&yr=

-          Parameter = [Year]

-          Text = &month=

-          Parameter = [Month]

-          Text = &p=.htm

URL Preview:

http://www.boxofficemojo.com/monthly/?page={Page}&view=calendargross&yr={Year}&month={Month}&p=.htm

Year/Month/Page Permutations Table

Before creating/invoking the function, a table that contains the different years, months and pages is needed.  There are many ways to achieve this, but I used a cross join technique (found here) with three disparate tables.  I followed the end user requirements and end up with a table like below:

clip_image009[4]

This will be the final table where the function is called.

Function

Let’s go back to the original table, where the parameters have been applied to the Web URL.  Right click the table and select Create Function.  Nothing else is required (apart from naming it), as Power Query will automatically package up the table and you end up with a new Function like below:

clip_image010[4]

The last step is to invoke the function.  Go back to the Permutations table (called ‘All Data’ in my example), and go to Add Column > Invoke Custom Function.

clip_image011[4]

When prompted with the next screen, map each parameter to the equivalent column.  The red highlighted values are the columns and yellow contain the parameters.  Both are named the same in my example and make mapping even easier.

clip_image013[4]

The function will loop through each URL permutation, dynamically changing the parameters on execution.  Once it has finished, tidy up any errors (for Years, Pages and/or Months) that do not exist.  Apply all transformations within the Query Editor and that is it – the data is ready for Reporting!  You may need to configure some of the measures (e.g. Total Gross) to be numbers, but that is easy enough to do within the Query Editor.

The example in this blog can be adapted in many ways and illustrates how dynamic Power Query can be – both in Excel and Power BI.

Further Reading

o   Cross Join in Power BI –  http://bit.ly/2ij8zJ1

o   Reza Rad Custom Functions – http://bit.ly/2zz6X2f

Contact Me

If you would like a copy of the PBIX workbook or have any general questions, feel free to leave a comment below.

Twitter:                @DataVizWhizz

Query Parameters in Power Query – Part 1

A new feature has recently been added into Power Query – called Query Parameters.  This was actually released to Power BI as part of May 2016’s update.

So what is a Query Parameter?  For anyone familiar with Microsoft Reporting Services, it has pretty much the same functionality as a Report Parameter.  In simple terms, it enables you to filter your data dynamically in Power Query.  There has always been a Filter UI function but it could not be linked to a parameter or plugged into the M code.  We can use Query Parameters as a way of reducing the data volumes brought into a PowerBook (useful for large datasets or tables) or to allow a user to switch between filters (e.g. changing database connections dynamically).

In this blog, I will concentrate on how we can reduce data volumes by implementing a Query Parameter.  Look out for Part 2 in the coming weeks, where I will pick a use case for when we would want to dynamically switch between parameter values.

Reducing Data Volumes

Imagine transforming or just analysing a fact table that contains over 1 billion records.  Not only will Power BI (and Power Query) probably crash or at best, perform badly, there is no need to bring all this data back into a PowerBook.  A subset, or perhaps an individual month is all that is required to undertake the necessary Power Query ETL operations.

The below example demonstrates how a Query Parameter can be used to filter records in a given dataset.  I have used the AdventureWorksDW2014 Database (download here) and picked the FactProductInventory table, which contains 776,286 rows.  I will reduce this count by adding a new Query Parameter in Power Query – using the Datekey to find all rows in between the 31st November and 30th December 2010.

1.       Assuming I have already loaded the data into the workbook, in the Query Editor, click Manage Parameters from the ribbon.
 

clip_image001

2.       Create two parameters – one named DateKeyFrom and the other DateKeyTo.  The former contains the value 20101131 and the latter 20101230.  Keep the default settings and confirm by clicking OK.  You two parameters should look like the below:

clip_image003

 

3.       Click on the DateKey column drop down and choose Number Filters and Between – highlighted below.

 

clip_image005

 

 

4.       Set the filter to pick up the two newly created parameters.  The following screenshot illustrates the criteria required, with the key change involving changing the second drop down from ‘Decimal Number’ to ‘Parameter’.

clip_image007

5.       Check that your dataset has been filtered and only contains data for the time period December 2010.

6.       If you need to change the date criteria and let’s say, only display 5 days of data, this can be achieved easily clicking the parameters in the Queries pane.  I have edited the DateKeyFrom value to 20101225.  You can also update a value by clicking the Manage Properties button.

clip_image009

7.       As soon as you click back onto the FactProductInventory table query, the data automatically begins to refresh.  We are now left with only the last 5 days in December, shown below. 

 

clip_image011

Conclusion

As Query Parameters are hot off the press, I have only just started experimenting with it.  It looks like it has been added to not only reduce and filter on data, but to enable us to plug a dynamic parameters into M.  Previously, I used a Function as a workaround but can now actually use them in the proper context.  Click here to view a previous blog, where I specifically create and invoke a function.  There are a number of other uses for Functions and Rafael Salas’ blog provides an excellent walkthrough.

My next blog will focus on how you can switch between different parameter values and dynamically display the results.  If anyone has built their own Query Parameter examples and want to share them, feel free to leave me a comment below.  I would be very interested to see how far people have already taken this new feature.

Further Reading

There are already a few great blogs and use cases out there:

1.       Chris Webb’s Blog - https://blog.crossjoin.co.uk/2016/05/08/the-m-code-behind-power-bi-parameters/

2.       Soheil Bakhshi Blog (1 of 2) – http://biinsight.com/power-bi-desktop-query-parameters-part-1/

3.       Soheil Bakhshi Blog (2 of 2) - http://biinsight.com/power-bi-desktop-query-parameters-part2-dynamic-data-masking-and-query-parameters/