Scenario
A wildlife research team maintains separate tables for animal records and the climate classification of each region.
Task
Write a Snowflake SQL query that:
- Joins
{{ ref("animals") }} with {{ ref("regions") }} on the region column
- Groups the results by species and climate
- Returns the species, climate, average age (rounded to 2 decimal places), average weight (rounded down to the nearest integer), and total count of animals for each group
Schema
animals
| Column |
Type |
Description |
| id |
String |
Unique animal identifier |
| species |
String |
Animal species name |
| age |
Integer |
Age in years |
| weight |
Float |
Weight in kilograms |
| region |
String |
Geographic region where the animal lives |
regions
| Column |
Type |
Description |
| region |
String |
Geographic region name |
| climate |
String |
Climate classification for the region |
Example
animals:
| id |
species |
age |
weight |
region |
| 1 |
Wolf |
4 |
45.8 |
Europe |
| 2 |
Elephant |
25 |
5200.5 |
Africa |
| 3 |
Wolf |
6 |
52.3 |
Europe |
| 4 |
Panda |
8 |
110.7 |
Asia |
| 5 |
Elephant |
30 |
5400.2 |
Africa |
regions:
| region |
climate |
| Africa |
Tropical |
| Europe |
Temperate |
| Asia |
Subtropical |
Expected Output:
| species |
climate |
avg_age |
avg_weight |
total_animals |
| Elephant |
Tropical |
27.5 |
5300 |
2 |
| Panda |
Subtropical |
8.0 |
110 |
1 |
| Wolf |
Temperate |
5.0 |
49 |
2 |
Note: Each species belongs to a single region in this dataset, so each group contains one species paired with one climate. The average weight is rounded down to the nearest integer.
WITH joined AS (
SELECT a.*, r.climate
FROM {{ ref("animals") }} AS a
INNER JOIN {{ ref("regions") }} AS r
ON a.region = r.region
)
SELECT
species,
climate,
ROUND(AVG(age), 2) AS avg_age,
FLOOR(AVG(weight)) AS avg_weight,
COUNT(*) AS total_animals
FROM joined
GROUP BY species, climate
Explanation
Step 1: Joining the tables
FROM {{ ref("animals") }} AS a
INNER JOIN {{ ref("regions") }} AS r
ON a.region = r.region
An INNER JOIN matches each animal row to its corresponding region row based on the region column. Animals in regions that do not appear in the regions table (and vice versa) are excluded from the result.
Step 2: Grouping and aggregating
GROUP BY species, climate
GROUP BY collapses all rows that share the same species and climate into a single output row. Every non-aggregated column in the SELECT list must appear in the GROUP BY clause.
Step 3: Computing the aggregates
ROUND(AVG(age), 2) AS avg_age,
FLOOR(AVG(weight)) AS avg_weight,
COUNT(*) AS total_animals
ROUND(AVG(age), 2) computes the arithmetic mean within each group and rounds to two decimal places. FLOOR rounds the average weight down to the nearest integer (for example, 84.47 becomes 84). Note that CAST(... AS INTEGER) rounds to the nearest integer rather than always rounding down, so FLOOR is the correct choice here. COUNT(*) counts the number of rows in each group.
Step 4: Alternative without a CTE
You can write the same query without a CTE by placing the join directly in the FROM clause:
SELECT
a.species,
r.climate,
ROUND(AVG(a.age), 2) AS avg_age,
FLOOR(AVG(a.weight)) AS avg_weight,
COUNT(*) AS total_animals
FROM {{ ref("animals") }} AS a
INNER JOIN {{ ref("regions") }} AS r
ON a.region = r.region
GROUP BY a.species, r.climate
Both approaches produce identical results. The CTE version separates the join logic from the aggregation, which can improve readability for more complex queries.