Start your terminal to use beginner mode.
Objective
A streaming company wants to analyze its users' behavior alongside their subscription data to understand engagement. You have been provided with a UserBehavior log and a Subscription ledger.
Task
Calculate the total number of minutes each user watched only while they had an active subscription.
Assume a user's subscription is active on the exact day it starts and the exact day it ends. If a user watched content on a date when they didn't have an active subscription, do not include that watch time in the total. Handle the edge case where the subscriptionEnd is listed as "ongoing". Each user's subscription periods are guaranteed to be non-overlapping.
Save your resulting DataFrame as result_df. Ensure the output matches the exact schema order requested.
File Path
- Behavior Dataset:
/home/interview/behavior.csv - Subscription Dataset:
/home/interview/subscription.csv - Starter script:
/home/interview/watch_time.py
Schema
behavior.csv
| Column Name | Data Type | Description |
|---|---|---|
| userId | string | A unique identifier for each user |
| watchDuration | integer | Total amount of minutes a user watched content on a particular date |
| date | string | The date when the user watched the content (YYYY-MM-DD) |
subscription.csv
| Column Name | Data Type | Description |
|---|---|---|
| userId | string | A unique identifier for each user |
| subscriptionStart | string | The date when the user's subscription started (YYYY-MM-DD) |
| subscriptionEnd | string | The date when the subscription ended (YYYY-MM-DD), or "ongoing" |
Expected Output Schema
| Column Name | Data Type | Description |
|---|---|---|
| userId | string | A unique identifier for each user |
| totalWatchTime | integer | The total watch time in minutes during an active subscription |
Example
Given this sample input:
UserBehavior
| userId | watchDuration | date |
|---|---|---|
| U1 | 45 | 2023-01-01 |
| U1 | 60 | 2023-01-02 |
| U2 | 70 | 2023-01-03 |
| U3 | 30 | 2023-01-04 |
| U2 | 50 | 2023-01-02 |
Subscription
| userId | subscriptionStart | subscriptionEnd |
|---|---|---|
| U1 | 2023-01-01 | 2023-01-10 |
| U2 | 2023-01-02 | ongoing |
| U3 | 2022-12-25 | 2023-01-05 |
The output would be:
| userId | totalWatchTime |
|---|---|
| U1 | 105 |
| U2 | 120 |
| U3 | 30 |
Explanation: * U1 watched 45 mins on 01-01 and 60 mins on 01-02. Both fall within their active window (01-01 to 01-10), total = 105.
U2watched 70 mins on 01-03 and 50 mins on 01-02. Both fall within their active window (started 01-02 and is ongoing), total = 120.U3watched 30 mins on 01-04. This falls within their active window (12-25 to 01-05), total = 30.
Terminal requires a larger screen
Open this page on a desktop or tablet (≥ 768px) to launch the terminal and practice hands-on.
Linux Terminal Environment
Write and execute your solution in the terminal below.
Track
| Question | Difficulty | Company | Access |
|---|
Broadcom