Scenario
You work at a venture capital firm and have two tables: one with company profiles and another with individual investment records.
Task
Write a Snowflake SQL query that:
- Joins
{{ ref("companies") }} with {{ ref("investments") }} on company_id
- Calculates
total_investment as the sum of amount per industry
- Groups results by
industry
- Sorts results by
total_investment in descending order
Schema
companies
| Column |
Type |
Description |
| company_id |
Integer |
Unique company identifier |
| company_name |
String |
Name of the company |
| industry |
String |
Industry sector the company belongs to |
investments
| Column |
Type |
Description |
| investment_id |
Integer |
Unique investment identifier |
| company_id |
Integer |
Reference to the company receiving the investment |
| amount |
Float |
Investment amount in dollars |
Example
companies:
| company_id |
company_name |
industry |
| 1 |
NovaSoft |
Technology |
| 2 |
MedVista |
Healthcare |
| 3 |
GreenPeak |
Clean Energy |
| 4 |
FinBridge |
Finance |
| 5 |
StreamCore |
Technology |
investments:
| investment_id |
company_id |
amount |
| 1 |
1 |
4500000 |
| 2 |
2 |
2800000 |
| 3 |
3 |
3200000 |
| 4 |
4 |
1500000 |
| 5 |
5 |
2000000 |
Expected Output:
| industry |
total_investment |
| Technology |
6500000 |
| Clean Energy |
3200000 |
| Healthcare |
2800000 |
| Finance |
1500000 |
Note: NovaSoft and StreamCore both belong to Technology, so their amounts are summed (4500000 + 2000000 = 6500000). Results are sorted from highest to lowest total investment.
SELECT
c.industry,
SUM(i.amount) AS total_investment
FROM {{ ref("companies") }} c
INNER JOIN {{ ref("investments") }} i
ON c.company_id = i.company_id
GROUP BY c.industry
ORDER BY total_investment DESC
Explanation
Step 1: Joining the Two Tables
FROM {{ ref("companies") }} c
INNER JOIN {{ ref("investments") }} i
ON c.company_id = i.company_id
An INNER JOIN combines rows from companies and investments where company_id matches. This pairs each investment record with the company's industry. Companies with no investments (or investments referencing a missing company) are excluded. If you needed to keep companies with zero investments, you would use a LEFT JOIN instead.
Step 2: Aggregating by Industry
SELECT
c.industry,
SUM(i.amount) AS total_investment
...
GROUP BY c.industry
GROUP BY c.industry collapses all rows sharing the same industry into a single output row. SUM(i.amount) adds up every investment amount within each group. The alias total_investment gives the aggregated column a meaningful name.
Step 3: Sorting the Results
ORDER BY total_investment DESC
ORDER BY total_investment DESC arranges the output from highest to lowest total investment. You can reference the alias directly in the ORDER BY clause because Snowflake evaluates ORDER BY after SELECT.
Step 4: Alternative with CTE
WITH aggregated AS (
SELECT
c.industry,
SUM(i.amount) AS total_investment
FROM {{ ref("companies") }} c
INNER JOIN {{ ref("investments") }} i
ON c.company_id = i.company_id
GROUP BY c.industry
)
SELECT *
FROM aggregated
ORDER BY total_investment DESC
A CTE separates the aggregation logic from the final sorting step. For a query this size, both approaches produce identical results and execution plans. CTEs become more valuable when you need to chain multiple transformation steps or reuse an intermediate result.