SQL JOIN with Pandas Data Processing and CSV Export
Beginner Mode
Start your terminal to use beginner mode.
Scenario
A SQLite database contains sales data across multiple tables that need to be analyzed together.
Task
Write a Python script at /home/interview/analyze_sales.py that:
- Connects to
/home/interview/sales.db - Executes a SQL query using INNER JOIN to combine
customers,orders, andorder_itemstables (customers → orders → order_items), filtering for orders with status 'completed' - Loads the results into a pandas DataFrame
- Calculates
total_amount(quantity × unit_price) rounded to 2 decimal places for each order item - Calculates
customer_total(sum of all total_amounts per customer) rounded to 2 decimal places and adds it as a column - Calculates
revenue_percentageshowing each customer's percentage of total revenue (customer_total / overall_total × 100), rounded to 2 decimal places - Exports the result to
/home/interview/sales_report.csv
Note: pandas and sqlite3 are already installed. The output will have customer-level metrics (customer_total, revenue_percentage) repeated for each order item row.
Database Schema
customers: id, name, email, city, countryorders: id, customer_id, order_date, statusorder_items: id, order_id, product_name, quantity, unit_price
Terminal requires a larger screen
Open this page on a desktop or tablet (≥ 768px) to launch the terminal and practice hands-on.
Linux Terminal Environment
Write and execute your solution in the terminal below.
Track
| Question | Difficulty | Company | Access |
|---|
Need more practice in this area? Explore more questions →
Intel