Author Numbering

Objective

You are working with a dataset of research paper co-authorships where each row links an author to the paper they contributed to.

Task

Assign a sequential row number to each author within their paper group, ordered by author_id. The row number should reset to 1 for each new paper_id. Use a window function partitioned by paper_id and ordered by author_id to compute the numbering. The output should contain paper_id, author_id, name, and row_number. Save your result as result_df.
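The partition-and-order logic described above can be sketched with a SQL window function. This is only an illustration, not the starter script's expected solution: it assumes Python's built-in sqlite3 module (with SQLite 3.25 or newer, which is needed for window functions), uses hard-coded rows taken from the example in this problem instead of reading authors.csv, and stores the result in a plain list named numbered rather than a DataFrame called result_df.

```python
import sqlite3

# Sample rows copied from the problem's example; a real solution would
# load /home/interview/authors.csv instead (assumption: same three columns).
rows = [
    ("P1", "A3", "Carol Davis"),
    ("P1", "A1", "Alice Chen"),
    ("P1", "A2", "Bob Martinez"),
    ("P2", "A5", "Eva Patel"),
    ("P2", "A4", "David Kim"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE authors (paper_id TEXT, author_id TEXT, name TEXT)")
conn.executemany("INSERT INTO authors VALUES (?, ?, ?)", rows)

# ROW_NUMBER() restarts at 1 for every paper_id partition and counts
# upward in author_id order inside that partition.
query = """
    SELECT paper_id,
           author_id,
           name,
           ROW_NUMBER() OVER (
               PARTITION BY paper_id
               ORDER BY author_id
           ) AS "row_number"
    FROM authors
    ORDER BY paper_id, author_id
"""
numbered = conn.execute(query).fetchall()
conn.close()

for row in numbered:
    print(*row)
```

The same PARTITION BY paper_id / ORDER BY author_id pair carries over to DataFrame libraries that expose window functions; only the surrounding syntax changes.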

File Paths

  • Authors: /home/interview/authors.csv
  • Starter script: /home/interview/author_numbering.py

Schema

authors.csv

Column     Type
paper_id   string
author_id  string
name       string

Expected output schema

Column      Type
paper_id    string
author_id   string
name        string
row_number  integer

Example

Given this sample input:

authors

paper_id  author_id  name
P1        A3         Carol Davis
P1        A1         Alice Chen
P1        A2         Bob Martinez
P2        A5         Eva Patel
P2        A4         David Kim

The output would be:

paper_id  author_id  name          row_number
P1        A1         Alice Chen    1
P1        A2         Bob Martinez  2
P1        A3         Carol Davis   3
P2        A4         David Kim     1
P2        A5         Eva Patel     2

Within paper P1, authors are ordered by author_id (A1, A2, A3) and assigned row numbers 1 through 3. The numbering restarts at 1 for paper P2.
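To make the reset behaviour concrete, the example can be reproduced by hand in plain Python. This is a rough sketch for checking intuition, not the intended DataFrame solution: it assumes only the standard library, and the names authors, ordered, and result are hypothetical.

```python
from itertools import groupby
from operator import itemgetter

# Sample input rows from the example: (paper_id, author_id, name).
authors = [
    ("P1", "A3", "Carol Davis"),
    ("P1", "A1", "Alice Chen"),
    ("P1", "A2", "Bob Martinez"),
    ("P2", "A5", "Eva Patel"),
    ("P2", "A4", "David Kim"),
]

# Sort by (paper_id, author_id) so each paper's authors sit together,
# already in the order the numbering should follow.
ordered = sorted(authors, key=itemgetter(0, 1))

result = []
for paper_id, group in groupby(ordered, key=itemgetter(0)):
    # enumerate(..., start=1) restarts the counter for every paper group.
    for row_number, (pid, author_id, name) in enumerate(group, start=1):
        result.append((pid, author_id, name, row_number))

for row in result:
    print(*row)
```

Because author_id is a string column, the comparison in this sketch is lexicographic, so a hypothetical ID such as A10 would sort before A2.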
