Vercel: Parse Multi-Format Data File with Different Delimiters per Row Type — Data Engineering Interview Q&A (2026)

Parse Multi-Format Data File with Different Delimiters per Row Type

Scenario

A data file contains mixed record types with different delimiters. Each row starts with a type indicator that determines its format and delimiter.

Task

Create a Python script at /home/interview/parse_mixed.py that reads /home/interview/mixed_data.txt, parses rows based on their type indicator and delimiter, and saves each type to separate CSV files: /home/interview/customers.csv, /home/interview/products.csv, and /home/interview/orders.csv.

Output Format

Each output CSV should omit the type indicator field and begin with a header row:

File	Columns
customers.csv	customer_id, name, email, country
products.csv	product_id, name, category, price
orders.csv	order_id, customer_id, product_id, quantity, date
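
The column lists above also say how many fields a parsed row should have once the type indicator is dropped. A minimal sketch of that check, assuming the hypothetical names COLUMNS and has_expected_width (neither appears in the task itself):

```python
# Expected columns per row type, copied from the table above.
# COLUMNS and has_expected_width are illustrative names, not part of the task.
COLUMNS = {
    "CUSTOMER": ["customer_id", "name", "email", "country"],
    "PRODUCT": ["product_id", "name", "category", "price"],
    "ORDER": ["order_id", "customer_id", "product_id", "quantity", "date"],
}


def has_expected_width(row_type, fields):
    """Return True if fields (type indicator already removed) match the header width."""
    return row_type in COLUMNS and len(fields) == len(COLUMNS[row_type])


print(has_expected_width("PRODUCT", ["P001", "Laptop", "Electronics", "999.99"]))  # True
print(has_expected_width("ORDER", ["O001", "C001"]))  # False
```

A check like this can flag truncated or over-long rows before they reach an output file.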

Example

Input (mixed_data.txt):

CUSTOMER,C001,John Doe,[email protected],USA
PRODUCT|P001|Laptop|Electronics|999.99
ORDER;O001;C001;P001;2;2026-02-15

Output (customers.csv):

customer_id,name,email,country
C001,John Doe,[email protected],USA
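
The example can be traced line by line with a small lookup from type indicator to delimiter. This is only a sketch: DELIMITERS and parse_line are hypothetical names, the delimiters are inferred from the three sample rows, and the email address is a stand-in because the one in the example is obscured on the page.

```python
# Delimiter used by each row type, inferred from the sample rows above.
DELIMITERS = {"CUSTOMER": ",", "PRODUCT": "|", "ORDER": ";"}


def parse_line(line):
    """Return (row_type, fields_without_type), or None if the line is not recognized."""
    line = line.strip()
    for row_type, delimiter in DELIMITERS.items():
        # Require the delimiter right after the type so "ORDERS;..." is not treated as ORDER.
        if line.startswith(row_type + delimiter):
            return row_type, line.split(delimiter)[1:]
    return None


sample = [
    "CUSTOMER,C001,John Doe,john@example.com,USA",  # placeholder email
    "PRODUCT|P001|Laptop|Electronics|999.99",
    "ORDER;O001;C001;P001;2;2026-02-15",
]
for line in sample:
    print(parse_line(line))
```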

Step 1: Examine the input file

head -20 /home/interview/mixed_data.txt

Review the different row types and their delimiters to understand the parsing requirements.
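
For a file too long to scan by eye, the same review can be approximated in Python by tallying each line's leading word together with the character that follows it. A rough sketch, assuming type indicators consist of letters or underscores; tally_prefixes is a hypothetical helper:

```python
import re
from collections import Counter


def tally_prefixes(lines):
    """Count (leading word, following non-word character) pairs across lines."""
    tally = Counter()
    for line in lines:
        match = re.match(r"([A-Za-z_]+)(\W)", line.strip())
        if match:
            tally[(match.group(1), match.group(2))] += 1
    return tally


# Intended use (path taken from the task description):
# with open("/home/interview/mixed_data.txt") as f:
#     for (row_type, delimiter), count in tally_prefixes(f).items():
#         print(row_type, repr(delimiter), count)
```

Any pair beyond the three expected ones would point to a row type or delimiter the parser needs to handle or skip.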

Step 2: Create the Python script

nano /home/interview/parse_mixed.py

Write a script that parses each row type with its specific delimiter:

import csv

# Read and separate lines by type
customer_lines = []
product_lines = []
order_lines = []

with open('/home/interview/mixed_data.txt', 'r') as f:
    for line in f:
        line = line.strip()
        # Match the type indicator together with its delimiter so that
        # blank lines and unrecognized prefixes are skipped
        if line.startswith('CUSTOMER,'):
            customer_lines.append(line.split(','))
        elif line.startswith('PRODUCT|'):
            product_lines.append(line.split('|'))
        elif line.startswith('ORDER;'):
            order_lines.append(line.split(';'))

# Write CUSTOMER records
with open('/home/interview/customers.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['customer_id', 'name', 'email', 'country'])
    for row in customer_lines:
        writer.writerow(row[1:])  # Skip type field

# Write PRODUCT records
with open('/home/interview/products.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['product_id', 'name', 'category', 'price'])
    for row in product_lines:
        writer.writerow(row[1:])  # Skip type field

# Write ORDER records
with open('/home/interview/orders.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['order_id', 'customer_id', 'product_id', 'quantity', 'date'])
    for row in order_lines:
        writer.writerow(row[1:])  # Skip type field

print(f"Parsed {len(customer_lines)} customers, {len(product_lines)} products, {len(order_lines)} orders")

The script reads each line, identifies its type, splits by the appropriate delimiter, and writes to separate CSV files.
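
The same approach can also be written in a table-driven style. The variant below is a sketch rather than the reference solution. It assumes input fields may be quoted, so it parses each line with csv.reader instead of str.split, and it skips rows whose field count does not match the header. FORMATS and split_mixed are hypothetical names.

```python
import csv
from contextlib import ExitStack
from pathlib import Path

# row type -> (delimiter, output filename, header); delimiters inferred from the example rows
FORMATS = {
    "CUSTOMER": (",", "customers.csv", ["customer_id", "name", "email", "country"]),
    "PRODUCT": ("|", "products.csv", ["product_id", "name", "category", "price"]),
    "ORDER": (";", "orders.csv", ["order_id", "customer_id", "product_id", "quantity", "date"]),
}


def split_mixed(input_path, output_dir):
    """Stream rows into one CSV per type; return (per-type counts, skipped row count)."""
    output_dir = Path(output_dir)
    counts = {row_type: 0 for row_type in FORMATS}
    skipped = 0
    with ExitStack() as stack:
        source = stack.enter_context(open(input_path, "r", newline=""))
        writers = {}
        for row_type, (_, filename, header) in FORMATS.items():
            out = stack.enter_context(open(output_dir / filename, "w", newline=""))
            writers[row_type] = csv.writer(out)
            writers[row_type].writerow(header)
        for line in source:
            line = line.strip()
            if not line:
                continue  # blank lines are ignored, not counted as skipped
            row_type = next(
                (t for t, (d, _, _) in FORMATS.items() if line.startswith(t + d)), None
            )
            if row_type is None:
                skipped += 1
                continue
            delimiter, _, header = FORMATS[row_type]
            # csv.reader respects quotes, so a quoted field may contain the delimiter
            fields = next(csv.reader([line], delimiter=delimiter))[1:]
            if len(fields) != len(header):
                skipped += 1
                continue
            writers[row_type].writerow(fields)
            counts[row_type] += 1
    return counts, skipped
```

Calling split_mixed('/home/interview/mixed_data.txt', '/home/interview') would write the three files named in the task. Rows are written as they are read, so the whole input never has to be held in memory.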

Step 3: Run the script

python3 /home/interview/parse_mixed.py

Step 4: Verify the output files

head /home/interview/customers.csv
head /home/interview/products.csv
head /home/interview/orders.csv

Each file should start with its header row, followed by one row per record with the type field removed.
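
Beyond eyeballing the first few lines, the outputs can be checked programmatically. The sketch below assumes the headers from the Output Format table; EXPECTED_HEADERS and check_outputs are hypothetical names.

```python
import csv
from pathlib import Path

# Expected header per output file, taken from the Output Format table.
EXPECTED_HEADERS = {
    "customers.csv": ["customer_id", "name", "email", "country"],
    "products.csv": ["product_id", "name", "category", "price"],
    "orders.csv": ["order_id", "customer_id", "product_id", "quantity", "date"],
}


def check_outputs(output_dir):
    """Return a list of problems; an empty list means every file looks consistent."""
    problems = []
    for filename, header in EXPECTED_HEADERS.items():
        path = Path(output_dir) / filename
        if not path.exists():
            problems.append(f"{filename}: missing")
            continue
        with open(path, newline="") as f:
            rows = list(csv.reader(f))
        if not rows or rows[0] != header:
            problems.append(f"{filename}: unexpected header")
            continue
        # Data rows are numbered from 2 because row 1 is the header.
        for number, row in enumerate(rows[1:], start=2):
            if len(row) != len(header):
                problems.append(f"{filename}: row {number} has {len(row)} fields")
    return problems
```

Running check_outputs('/home/interview') after the script should return an empty list if all three files are present and well formed.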
