Video Streaming Subscriptions
Beginner Mode

Start your terminal to use beginner mode.

Objective

A streaming company wants to analyze its users' behavior alongside their subscription data to understand engagement. You have been provided with a UserBehavior log and a Subscription ledger.

Task

Calculate the total number of minutes each user watched only while they had an active subscription.

Assume a user's subscription is active on the exact day it starts and the exact day it ends. If a user watched content on a date when they didn't have an active subscription, do not include that watch time in the total. Handle the edge case where the subscriptionEnd is listed as "ongoing". Each user's subscription periods are guaranteed to be non-overlapping.

Save your resulting DataFrame as result_df. Ensure the output matches the exact schema order requested.

File Path

  • Behavior Dataset: /home/interview/behavior.csv
  • Subscription Dataset: /home/interview/subscription.csv
  • Starter script: /home/interview/watch_time.py

Schema

behavior.csv

Column Name Data Type Description
userId string A unique identifier for each user
watchDuration integer Total amount of minutes a user watched content on a particular date
date string The date when the user watched the content (YYYY-MM-DD)

subscription.csv

Column Name Data Type Description
userId string A unique identifier for each user
subscriptionStart string The date when the user's subscription started (YYYY-MM-DD)
subscriptionEnd string The date when the subscription ended (YYYY-MM-DD), or "ongoing"

Expected Output Schema

Column Name Data Type Description
userId string A unique identifier for each user
totalWatchTime integer The total watch time in minutes during an active subscription

Example

Given this sample input:

UserBehavior

userId watchDuration date
U1 45 2023-01-01
U1 60 2023-01-02
U2 70 2023-01-03
U3 30 2023-01-04
U2 50 2023-01-02

Subscription

userId subscriptionStart subscriptionEnd
U1 2023-01-01 2023-01-10
U2 2023-01-02 ongoing
U3 2022-12-25 2023-01-05

The output would be:

userId totalWatchTime
U1 105
U2 120
U3 30

Explanation: * U1 watched 45 mins on 01-01 and 60 mins on 01-02. Both fall within their active window (01-01 to 01-10), total = 105.

  • U2 watched 70 mins on 01-03 and 50 mins on 01-02. Both fall within their active window (started 01-02 and is ongoing), total = 120.

  • U3 watched 30 mins on 01-04. This falls within their active window (12-25 to 01-05), total = 30.

Terminal requires a larger screen

Open this page on a desktop or tablet (≥ 768px) to launch the terminal and practice hands-on.

Linux Terminal Environment

Write and execute your solution in the terminal below.

Sign In

Track

Question Difficulty Company Access
Need more practice in this area? Explore more questions →