Reshape Data: Concatenate is an easy-level problem.
DataFrame df1

| Column Name | Type |
|---|---|
| student_id | int |
| name | object |
| age | int |

DataFrame df2

| Column Name | Type |
|---|---|
| student_id | int |
| name | object |
| age | int |
Write a solution to concatenate these two DataFrames vertically into one DataFrame.
The result format is in the following example.
Example 1:
Input:

df1

| student_id | name | age |
|---|---|---|
| 1 | Mason | 8 |
| 2 | Ava | 6 |
| 3 | Taylor | 15 |
| 4 | Georgia | 17 |

df2

| student_id | name | age |
|---|---|---|
| 5 | Leo | 7 |
| 6 | Alex | 7 |

Output:

| student_id | name | age |
|---|---|---|
| 1 | Mason | 8 |
| 2 | Ava | 6 |
| 3 | Taylor | 15 |
| 4 | Georgia | 17 |
| 5 | Leo | 7 |
| 6 | Alex | 7 |

Explanation: The two DataFrames are stacked vertically, and their rows are combined.
Problem Overview: You receive two datasets with identical column structure and need to reshape the data by concatenating them vertically. The result should contain all rows from both datasets while preserving column alignment.
Approach 1: Using pandas for Vertical Concatenation (O(n + m) time, O(n + m) space)
This approach relies on the built-in pandas.concat() function to combine two DataFrames along the row axis. The key insight is that pandas already provides optimized internal routines for stacking datasets with identical schemas. You pass both DataFrames in a list and set axis=0 (the default) to append rows. Pandas aligns columns by name and creates a new DataFrame containing all rows. This is the most practical method when working in data pipelines, analytics scripts, or interview questions involving pandas and DataFrame operations. The runtime is linear because every row from both inputs must be copied into the resulting DataFrame.
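A minimal sketch of this approach, using the data from Example 1. The function name concatenate_tables is hypothetical, and passing ignore_index=True is an assumption: the problem statement does not say how the row index should be handled, and without that flag each input keeps its own index labels.

```python
import pandas as pd


def concatenate_tables(df1: pd.DataFrame, df2: pd.DataFrame) -> pd.DataFrame:
    # Stack df2's rows beneath df1's rows (axis=0 is the row axis).
    # ignore_index=True renumbers the result 0..n+m-1; this is an
    # assumption, since the problem does not specify index handling.
    return pd.concat([df1, df2], axis=0, ignore_index=True)


# Sample data taken from Example 1.
df1 = pd.DataFrame(
    {
        "student_id": [1, 2, 3, 4],
        "name": ["Mason", "Ava", "Taylor", "Georgia"],
        "age": [8, 6, 15, 17],
    }
)
df2 = pd.DataFrame(
    {
        "student_id": [5, 6],
        "name": ["Leo", "Alex"],
        "age": [7, 7],
    }
)

result = concatenate_tables(df1, df2)
print(result)
```

Neither input is modified: concat() returns a new DataFrame, so df1 and df2 can still be used afterwards.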
Approach 2: Manual Row-wise Appending for Concatenation (O(n + m) time, O(n + m) space)
This method recreates the concatenation behavior manually. Iterate through each row of the first dataset and append it to a result container, then repeat the same process for the second dataset. If the data is represented as lists of records or dictionaries, you simply push rows sequentially into a new list. For DataFrames, this may involve iterating with iterrows() or converting to records first. The key idea is sequential row aggregation: iterate over all rows and build the combined dataset step by step. While the complexity remains linear, this approach is less efficient in real-world scenarios because manual loops in Python add overhead compared with vectorized pandas operations. It is mainly useful for demonstrating how data processing pipelines work internally.
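The manual variant might look like the sketch below, which takes the "convert to records first" route rather than iterrows(). The function name concatenate_manually is hypothetical, and the sketch assumes both inputs share the same column names, as the problem guarantees.

```python
import pandas as pd


def concatenate_manually(df1: pd.DataFrame, df2: pd.DataFrame) -> pd.DataFrame:
    # Result container: one dict per row, in input order.
    rows = []
    for frame in (df1, df2):
        # to_dict("records") yields each row as {column name: value}.
        for record in frame.to_dict("records"):
            rows.append(record)
    # Rebuild a DataFrame, keeping df1's column order.
    # Assumes df2 has the same columns as df1.
    return pd.DataFrame(rows, columns=df1.columns)


# Sample data taken from Example 1.
first = pd.DataFrame(
    {
        "student_id": [1, 2, 3, 4],
        "name": ["Mason", "Ava", "Taylor", "Georgia"],
        "age": [8, 6, 15, 17],
    }
)
second = pd.DataFrame(
    {
        "student_id": [5, 6],
        "name": ["Leo", "Alex"],
        "age": [7, 7],
    }
)

combined = concatenate_manually(first, second)
print(combined)
```

Every row passes through a Python-level loop here, which is where the extra overhead comes from.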
Recommended for interviews: The pandas concat() approach is what interviewers expect when the problem explicitly involves DataFrames. It shows familiarity with the ecosystem and avoids unnecessary loops. The manual row-wise approach demonstrates that you understand the underlying mechanics of concatenation, but the optimized library method reflects stronger practical engineering judgment.
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Using pandas concat() | O(n + m) | O(n + m) | Best for pandas/DataFrame problems and real data workflows |
| Manual Row-wise Appending | O(n + m) | O(n + m) | Useful for understanding how concatenation works internally or when pandas is unavailable |