Microsoft: Species Climate Aggregation — Data Engineering Interview Q&A (2026)

All Questions

Species Climate Aggregation

Microsoft 👶 Easy Snowflake Joins Aggregation

Beginner Mode

Scenario

A wildlife research team maintains separate tables for animal records and the climate classification of each region.

Task

Write a Snowflake SQL query that:

Joins {{ ref("animals") }} with {{ ref("regions") }} on the region column
Groups the results by species and climate
Returns the species, climate, average age (rounded to 2 decimal places), average weight (rounded down to the nearest integer), and total count of animals for each group

Schema

animals

Column	Type	Description
id	String	Unique animal identifier
species	String	Animal species name
age	Integer	Age in years
weight	Float	Weight in kilograms
region	String	Geographic region where the animal lives

regions

Column	Type	Description
region	String	Geographic region name
climate	String	Climate classification for the region

Example

animals:

id	species	age	weight	region
1	Wolf	4	45.8	Europe
2	Elephant	25	5200.5	Africa
3	Wolf	6	52.3	Europe
4	Panda	8	110.7	Asia
5	Elephant	30	5400.2	Africa

regions:

region	climate
Africa	Tropical
Europe	Temperate
Asia	Subtropical

Expected Output:

species	climate	avg_age	avg_weight	total_animals
Elephant	Tropical	27.5	5300	2
Panda	Subtropical	8.0	110	1
Wolf	Temperate	5.0	49	2

Note: Each species belongs to a single region in this dataset, so each group contains one species paired with one climate. The average weight is rounded down to the nearest integer.

WITH joined AS (
    SELECT a.*, r.climate
    FROM {{ ref("animals") }} AS a
    INNER JOIN {{ ref("regions") }} AS r
        ON a.region = r.region
)
SELECT
    species,
    climate,
    ROUND(AVG(age), 2) AS avg_age,
    FLOOR(AVG(weight)) AS avg_weight,
    COUNT(*) AS total_animals
FROM joined
GROUP BY species, climate

Explanation

Step 1: Joining the tables

FROM {{ ref("animals") }} AS a
INNER JOIN {{ ref("regions") }} AS r
    ON a.region = r.region

An INNER JOIN matches each animal row to its corresponding region row based on the region column. Animals in regions that do not appear in the regions table (and vice versa) are excluded from the result.

Step 2: Grouping and aggregating

GROUP BY species, climate

GROUP BY collapses all rows that share the same species and climate into a single output row. Every non-aggregated column in the SELECT list must appear in the GROUP BY clause.

Step 3: Computing the aggregates

ROUND(AVG(age), 2) AS avg_age,
FLOOR(AVG(weight)) AS avg_weight,
COUNT(*) AS total_animals

ROUND(AVG(age), 2) computes the arithmetic mean within each group and rounds to two decimal places. FLOOR rounds the average weight down to the nearest integer (for example, 84.47 becomes 84). Note that CAST(... AS INTEGER) rounds to the nearest integer rather than always rounding down, so FLOOR is the correct choice here. COUNT(*) counts the number of rows in each group.

Step 4: Alternative without a CTE

You can write the same query without a CTE by placing the join directly in the FROM clause:

SELECT
    a.species,
    r.climate,
    ROUND(AVG(a.age), 2) AS avg_age,
    FLOOR(AVG(a.weight)) AS avg_weight,
    COUNT(*) AS total_animals
FROM {{ ref("animals") }} AS a
INNER JOIN {{ ref("regions") }} AS r
    ON a.region = r.region
GROUP BY a.species, r.climate

Both approaches produce identical results. The CTE version separates the join logic from the aggregation, which can improve readability for more complex queries.

Quick Solution

Code Environment

Track

	Question	Difficulty	Company	Access

Need more practice in this area? Explore more questions →