SQL JOIN with Pandas Data Processing and CSV Export
Beginner Mode

Start your terminal to use beginner mode.

Sign in to watch the walkthrough video

Sign In

Scenario

A SQLite database contains sales data across multiple tables that need to be analyzed together.

Task

Write a Python script at /home/interview/analyze_sales.py that:

  1. Connects to /home/interview/sales.db
  2. Executes a SQL query using INNER JOIN to combine customers, orders, and order_items tables (customers → orders → order_items), filtering for orders with status 'completed'
  3. Loads the results into a pandas DataFrame
  4. Calculates total_amount (quantity × unit_price) rounded to 2 decimal places for each order item
  5. Calculates customer_total (sum of all total_amounts per customer) rounded to 2 decimal places and adds it as a column
  6. Calculates revenue_percentage showing each customer's percentage of total revenue (customer_total / overall_total × 100), rounded to 2 decimal places
  7. Exports the result to /home/interview/sales_report.csv

Note: pandas and sqlite3 are already installed. The output will have customer-level metrics (customer_total, revenue_percentage) repeated for each order item row.

Database Schema

  • customers: id, name, email, city, country
  • orders: id, customer_id, order_date, status
  • order_items: id, order_id, product_name, quantity, unit_price

Terminal requires a larger screen

Open this page on a desktop or tablet (≥ 768px) to launch the terminal and practice hands-on.

Linux Terminal Environment

Write and execute your solution in the terminal below.

Sign In

Track

Question Difficulty Company Access
Need more practice in this area? Explore more questions →