Scenario
A SQLite database and a CSV file both contain customer records, but they may be out of sync. You need to identify which records exist in one source but not the other.
Task
Write a Python script at /home/interview/compare_data.py that connects to the SQLite database at /home/interview/customers.db, reads the CSV file at /home/interview/customers.csv, compares the customer IDs in both sources, and saves a report to /home/interview/comparison_report.json showing which IDs are unique to each source.
Example
Expected output format in /home/interview/comparison_report.json:
{
  "extra_in_sql": [103, 105, 108],
  "extra_in_csv": [15, 42, 87, 91]
}
Here, extra_in_sql contains the IDs present in the database but missing from the CSV, and extra_in_csv contains the IDs present in the CSV but missing from the database.
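One possible approach is sketched below: load the IDs from each source into a set, take the two set differences, and write them out as sorted lists. The task does not name the database table, the ID column, or the CSV header, so the `customers` table and the `id` column used here are assumptions that should be checked against the real files. The function names are likewise only illustrative.

```python
import csv
import json
import sqlite3

DB_PATH = "/home/interview/customers.db"
CSV_PATH = "/home/interview/customers.csv"
REPORT_PATH = "/home/interview/comparison_report.json"

# ASSUMPTIONS (not stated in the task): the table is named "customers" and
# both the table and the CSV header expose the customer ID as "id".
TABLE = "customers"
ID_COLUMN = "id"


def load_sql_ids(db_path, table=TABLE, column=ID_COLUMN):
    """Return the set of non-NULL customer IDs stored in the SQLite table."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(f"SELECT {column} FROM {table}").fetchall()
    finally:
        conn.close()
    return {int(row[0]) for row in rows if row[0] is not None}


def load_csv_ids(csv_path, column=ID_COLUMN):
    """Return the set of customer IDs found in the CSV, skipping blank cells."""
    with open(csv_path, newline="") as f:
        reader = csv.DictReader(f)
        return {
            int(row[column])
            for row in reader
            if (row.get(column) or "").strip()
        }


def build_report(sql_ids, csv_ids):
    """Compute the IDs unique to each source, sorted for stable output."""
    return {
        "extra_in_sql": sorted(sql_ids - csv_ids),
        "extra_in_csv": sorted(csv_ids - sql_ids),
    }


def main(db_path=DB_PATH, csv_path=CSV_PATH, report_path=REPORT_PATH):
    """Compare both sources and write the JSON report."""
    report = build_report(load_sql_ids(db_path), load_csv_ids(csv_path))
    with open(report_path, "w") as f:
        json.dump(report, f, indent=2)

# When saving this as compare_data.py, end the file with a call to main().
```

Before running it, the actual names can be confirmed with `sqlite3 /home/interview/customers.db ".schema"` and by reading the first line of the CSV file. Sets are used because the comparison only concerns which IDs exist, so duplicate rows in either source do not affect the report.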