SQL GROUP BY

The SQL GROUP BY clause is used along with the SQL aggregate functions and specifies the groups where selected rows are placed. WHEN one or more aggregate functions are presented in the SQL SELECT column list, the SQL GROUP BY clause calculates a summary value for each group. To grasp this concept we will illustrate the SQL GROUP BY application with example. Consider the already familiar Weather table:

CityAverageTemperatureDate
New York22 C10/10/2005
Seattle21 C10/10/2005
Washington20 C10/10/2005
New York18 C10/09/2005
Seattle20 C10/09/2005
Washington17 C10/09/2005

Our task is to calculate the average temperature for each of the cities in the Weather table. Here is how to accomplish that using the SQL GROUP BY clause:

SELECT City, AVG(AverageTemperature) FROM Weather GROUP BY City

The result of this SQL GROUP BY statement is the following:

CityAverageTemperature
Washington18.5 C
Seattle20.5 C
New York20 C

As you can see in the result data set, we have one row for each city. Each city in our SQL ORDER BY statement, represents one group because the City column is specified in the GROUP BY clause.

But what is the number in the AverageTemperature column? How come we have only one number for each city when we have multiple temperature entries for each of the cities in the Weather table? Because the AverageTemperature column is an argument for the SQL AVG aggregate function, the single value we see in the final result set is simply the average temperature value for its respective city:

Washington average temperature = (20 + 17)/2 = 18.5 Seattle average temperature = (21 + 20)/2 = 20.5 New York average temperature = (20 + 20)/2 = 20

When GROUP BY is used, one of the following must be true:

1. Each column which is not part of an aggregate expression in the select list is presented in the SQL GROUP BY clause.

2. The SQL GROUP BY expression matches exactly the select list expression.

Here is an example of incorrect GROUP BY statement:

SELECT City, AVG(AverageTemperature), Date FROM Weather GROUP BY City

This SQL GROUP BY statement is wrong because the Date column is presented in the SQL SELECT list, but is not in the GROUP BY clause. The correct SQL GROUP BY statement should look like this:

SELECT City, AVG(AverageTemperature), Date FROM Weather GROUP BY City, Date