Adatis

Adatis BI Blogs

SQL Order by with aggregation

In this post we will look into applying an order by clause to a query containing an aggregate function. The purpose of this is to allow us to order a result set by a column which does not appear in the select or group by clause.  Without using an aggregate function, or including the column to order by in your select/group by, you will be unable to execute the query successfully. This is quite a straight forward problem, but one that can easily trip people up.  I started by creating a simple table containing Colours which appear multiple times over multiple dates and against multiple IDs. Using the count() function to count how many times a colour appears on a certain date doesn’t cause us any problems. The result set shows us 7 rows of data, we note that the colour Aqua Marine Green appears first in the list and while black appears second in the list as we will use this information later. However if we wanted to see the result set in the order of their respective ID’s this is where we run into issues, The error message is quite clear if not a little confusing on what needs to happen here, either we need to add the ID column to our group by clause, or wrap it in an aggregate function.  Adding the ID to the group by clause will change your result set, if we take the query above and simply add the ID column to our select and group by we no longer get an error message, however the data has lost all meaning as now we are back to our 10 original rows of data each being counted once. Using an aggregate function in the order by clause “"eg: Min(ID) fixes the problem and provides us with the result set we are after. If we look above we can see that “Black” appears 3 times against Date “2015-09-03” and the ID’s for those 3 records are 1,3,8.  It is only after specifying which ID to order by is SQL able to correctly execute the query. Without adding an aggregate function be that MIN() SUM() AVG() SQL is unable to determine which ID it should use for each group to order the data by. Be careful when picking the aggregate function to use as your choice will affect the result set.  The screenshots below show the result set ordered by MIN(ID), AVG(ID), SUM(ID). Each returning the same set of data but not in the same order. (the aggregated ID column has been added to the screenshots for reference only.) The first example orders by the MIN() function and results in Black appearing first in the result set. The next example is ordered by the AVG() function and results in Black appearing fourth in the result set. Finally ordering by the SUM() function results in Black appearing last in the result set.