GitHub: Department Budget Variance — Data Engineering Interview Q&A (2026)

Department Budget Variance

GitHub 🧠 Hard Snowflake Aggregation Joins Type Casting

Beginner Mode

Scenario

You work as a data analyst for a government agency that tracks annual budget allocations and actual spending across departments.

Task

Write a Snowflake SQL query that:

Joins {{ ref("budgets") }} with {{ ref("spending") }} on both department and fiscal_year
Calculates budget_variance as CAST(VARIANCE(amount_budgeted) AS INTEGER) grouped by department
Calculates spending_variance as CAST(VARIANCE(amount_spent) AS INTEGER) grouped by department
Returns department, budget_variance, and spending_variance

Schema

budgets

Column	Type	Description
department	String	Name of the government department
fiscal_year	Integer	Fiscal year of the allocation
amount_budgeted	Float	Total budget allocated (in millions)

spending

Column	Type	Description
department	String	Name of the government department
fiscal_year	Integer	Fiscal year of the expenditure
amount_spent	Float	Total amount spent (in millions)

Example

budgets:

department	fiscal_year	amount_budgeted
Engineering	2022	400.0
Engineering	2023	460.0
Marketing	2022	300.0
Marketing	2023	350.0

spending:

department	fiscal_year	amount_spent
Engineering	2022	380.0
Engineering	2023	440.0
Marketing	2022	280.0
Marketing	2023	320.0

Expected Output:

department	budget_variance	spending_variance
Engineering	1800	1800
Marketing	1250	800

Note: Variance here uses sample variance (divides by N-1, not N). The result is a whole number with no decimals. Results are ordered alphabetically by department.

WITH combined AS (
    SELECT
        b.department,
        b.amount_budgeted,
        s.amount_spent
    FROM {{ ref("budgets") }} b
    INNER JOIN {{ ref("spending") }} s
        ON b.department = s.department
        AND b.fiscal_year = s.fiscal_year
),
aggregated AS (
    SELECT
        department,
        CAST(VARIANCE(amount_budgeted) AS INTEGER) AS budget_variance,
        CAST(VARIANCE(amount_spent) AS INTEGER) AS spending_variance
    FROM combined
    GROUP BY department
)
SELECT * FROM aggregated
ORDER BY department

Explanation

Step 1: Join the two tables on department and fiscal year

FROM {{ ref("budgets") }} b
INNER JOIN {{ ref("spending") }} s
    ON b.department = s.department
    AND b.fiscal_year = s.fiscal_year

The INNER JOIN matches each budget row with its corresponding spending row for the same department and year. Rows that exist in only one table (for example, a department that appears in spending but not in budgets) are excluded. Joining on both columns ensures you pair the correct budget and spending figures before aggregating.

Step 2: Compute sample variance with VARIANCE()

CAST(VARIANCE(amount_budgeted) AS INTEGER) AS budget_variance,
CAST(VARIANCE(amount_spent) AS INTEGER) AS spending_variance

Snowflake's VARIANCE() function computes sample variance, which divides the sum of squared deviations by N-1 (where N is the number of values in the group). This is the standard unbiased estimator for population variance from a sample. The alias VAR_SAMP() is equivalent.

Step 3: Cast to INTEGER

CAST(VARIANCE(...) AS INTEGER)

VARIANCE() returns a float. Wrapping it in CAST(... AS INTEGER) truncates the decimal portion to produce a whole number. For example, a variance of 1733.33 becomes 1733. This matches the expected output schema.

Step 4: Group by department

GROUP BY department

Grouping by department produces one row per department with the aggregated variance values across all matched fiscal years.

Alternative: Single CTE approach

You can combine the join and aggregation into a single CTE:

WITH result AS (
    SELECT
        b.department,
        CAST(VARIANCE(b.amount_budgeted) AS INTEGER) AS budget_variance,
        CAST(VARIANCE(s.amount_spent) AS INTEGER) AS spending_variance
    FROM {{ ref("budgets") }} b
    INNER JOIN {{ ref("spending") }} s
        ON b.department = s.department
        AND b.fiscal_year = s.fiscal_year
    GROUP BY b.department
)
SELECT * FROM result
ORDER BY department

This is more concise and produces the same result. The two-CTE version separates the join logic from the aggregation, which can be easier to read and debug for more complex queries.

Quick Solution

Code Environment

Track

	Question	Difficulty	Company	Access

Need more practice in this area? Explore more questions →