Start your terminal to use beginner mode.
Objective
You are working as a Materials Engineer and you are required to process two sets of data related to various experiments conducted in your lab. You need to write a function that performs a join operation between your experiments log and your materials catalog.
Task
Join df_experiments and df_materials to bring more context to the experiment results. Specifically, you want to join on the material_id field that is common between the two DataFrames.
The join operation should be such that all records from both the df_experiments and df_materials DataFrames are included, regardless if they have a match in the other DataFrame (a full outer join).
Save your resulting DataFrame as result_df. Ensure the output matches the exact schema order requested, and order the final output by material_id in ascending order, then by experiment_id in ascending order.
File Path
- Experiments Dataset:
/home/interview/experiments.csv - Materials Dataset:
/home/interview/materials.csv - Starter script:
/home/interview/materials_join.py
Schema
experiments.csv
| Column Name | Data Type |
|---|---|
| experiment_id | integer |
| material_id | integer |
| experiment_date | string |
| experiment_results | float |
materials.csv
| Column Name | Data Type |
|---|---|
| material_id | integer |
| material_name | string |
| material_type | string |
Expected Output Schema
| Column Name | Data Type |
|---|---|
| experiment_id | integer |
| material_id | integer |
| material_name | string |
| material_type | string |
| experiment_date | string |
| experiment_results | float |
Example
Given this sample input:
df_experiments
| experiment_id | material_id | experiment_date | experiment_results |
|---------------+-------------+-----------------+--------------------|
| 1 | 101 | 2023-07-01 | 7.6 |
| 2 | 102 | 2023-07-02 | 8.3 |
| 3 | 103 | 2023-07-03 | 6.9 |
| 4 | 101 | 2023-07-04 | 7.2 |
df_materials
| material_id | material_name | material_type |
|-------------|---------------+---------------|
| 101 | Material A | Type X |
| 102 | Material B | Type Y |
| 104 | Material C | Type Z |
The expected output would be:
| experiment_id | material_id | material_name | material_type | experiment_date | experiment_results |
|---------------+-------------+---------------+---------------+-----------------+--------------------|
| 1 | 101 | Material A | Type X | 2023-07-01 | 7.6 |
| 4 | 101 | Material A | Type X | 2023-07-04 | 7.2 |
| 2 | 102 | Material B | Type Y | 2023-07-02 | 8.3 |
| 3 | 103 | null | null | 2023-07-03 | 6.9 |
| null | 104 | Material C | Type Z | null | null |
(Note: Material 103 has no name metadata, and Material 104 has no experiment results. Both are retained in the final output).
Terminal requires a larger screen
Open this page on a desktop or tablet (≥ 768px) to launch the terminal and practice hands-on.
Linux Terminal Environment
Write and execute your solution in the terminal below.
Track
| Question | Difficulty | Company | Access |
|---|
Twilio