(guest@joequery.me)~ $ |

SQL aggregation functions (with exercises)

(This post is part of the learning sql series.)

These notes follow along with Sams Teach Yourself SQL in 10 Minutes. If you would like to follow along with the examples, create a database and execute this sql script. Small exercises are scattered throughout this article and should be accomplished before you move on to the next section. Review problems are provided at the bottom of article.

Aggregation functions

Topics covered

  • What are aggregation functions?
  • Why do we need aggregation functions?
  • Aggregation function examples
  • Review problems

What are aggregation functions?

As discussed in the previous notes, calculated fields in SQL are calculations performed on a field (or group of fields) for an individual row. In contrast, an aggregation function is a function performed on multiple rows.

Example 1

Adding a dollar sign to each price in a Product table is an example of a calculated field. is applied.

SELECT CONCAT('$', prod_price)
FROM Products
ORDER BY prod_price
LIMIT 5;

Results:

+-------------------------+
| CONCAT('$', prod_price) |
+-------------------------+
| $3.49                   |
| $3.49                   |
| $3.49                   |
| $4.99                   |
| $5.99                   |
+-------------------------+

For each row that returns from the query, the value of the field prod_price is passed to the CONCAT() function. The CONCAT() function takes values as arguments. A value or multiple values go in, an individual value comes out.

In contrast, aggregation functions do not take individual values as arguments. Aggregation functions take in a set of values as arguments. A set of values go in, an individual value comes out.

Why do we need aggregation functions?

Humans constantly query for aggregate data in their day to day lives. Examples include

  • What will the high temperature for today be?
  • What's the cheapest toy in the store?
  • How many students of the high school are in 10th grade?

What makes these real-life queries aggregate? The distinction between aggregate data and non-aggregate data is whether the individual components of the data are significant.

As a way to demonstrate the distinction between aggregate and non-aggregate data, suppose our boss asks us "How many orders were placed yesterday?"

Option 1 - non aggregate answer

In order to provide a non-aggregate answer to this question, we must list out each individual component of data. In the context of "How many orders were placed yesterday?", this means listing out all the orders that were placed yesterday. So imagine you pull up the order sheet from the day before and reply with

"Company A ordered at 9AM, Company B ordered at 9:12AM, Company C ordered at 10:15AM, Company D ordered at 11:30AM, Company E ordered at 11:33AM, Company F ordered at noon, Company G ordered at 12:18PM, Company H ordered at 2:45PM, Company I ordered at 2:53PM, Company J ordered at 3:02PM, Company K ordered at 3:09PM."

Option 2 - aggregate answer

In order to provide an aggregate answer to this question, we must process the data as a whole and provide an individual result that does not include the individual components of the data. This could look like opening up the order sheet from the day before, internally counting the orders that took place, and replying with

"11 orders"

In this case, the aggregate answer best answers the question at hand since the individual orders themselves were not a point of interest.

Exercise 1: Provide 5 real world examples of aggregate data and 5 examples of non-aggregate data.

Aggregation function reference

To view a list of all aggregation functions available in MySQL, visit the Aggregate Function Reference section of the official MySQL documentation.

Aggregation function examples

Example 2

For example, to calculate the average price of all of the products in the Product table, we use the AVG() aggregation function.

SELECT AVG(prod_price) AS avg_price
FROM Products;

Results:

+-----------+
| avg_price |
+-----------+
|  6.823333 |
+-----------+
1 row in set (0.00 sec)

The AVG() function takes in the set of prod_price values from all rows in the Product table. The function returns the average of this set of values.

Example 3

Suppose we wanted to figure out the cost of buying all the products below \$5 in the product table.

First let's examine what products meet this criteria.

SELECT prod_name, prod_price
FROM Products
WHERE prod_price < 5;

Results:

+---------------------+------------+
| prod_name           | prod_price |
+---------------------+------------+
| Fish bean bag toy   |       3.49 |
| Bird bean bag toy   |       3.49 |
| Rabbit bean bag toy |       3.49 |
| Raggedy Ann         |       4.99 |
+---------------------+------------+
4 rows in set (0.01 sec)

If we were to purchase all of these products, the total would be $$3.49 + 3.49 + 3.49 + 4.99 = 15.46$$

Aggregation functions can work with filtering via WHERE clauses. To directly query mysql the sum of the product prices under \$5, we perform the following:

SELECT SUM(prod_price) as total_price
FROM Products
WHERE prod_price < 5;

Results:

+-------------+
| total_price |
+-------------+
|       15.46 |
+-------------+
1 row in set (0.00 sec)

Example 4

Since aggregation functions return single values, you can treat the return value of an aggregation function as any other standard value. This includes passing the value to functions.

SELECT CONCAT('$', SUM(prod_price)) as total_price
FROM Products
WHERE prod_price < 5;

Results:

+-------------+
| total_price |
+-------------+
| $15.46      |
+-------------+

Exercise 2: Rewrite Example 2 to provide the same result without using the AVG() function. Use the MySQL Aggregation Function Reference.

Review problems

The following review questions require the database provided at the beginning of the article.

Write a query that answers each of the following questions. Be sure to include the query and the result of the query for each answer.

Problem 1

How many bean bag toys are in the Products table?

Problem 2

How many customers have yet to provide their email?

Problem 3

What is the average price of products that are 12 inches long?

Problem 4

How much would it cost to order 100 units of the most expensive product?

Problem 5

How much would it cost to order 10 quantities of the 3 most expensive products?

Problem 6

How many vendors are incorporated?

Tagged as databases

(This post is part of the learning sql series.)

Date published - January 18, 2016