Reshape Data: Concatenate - Solution & Explanation

Difficulty: Easy. Reading time: 4 min. Asked at: Amazon, Google

Problem Statement

DataFrame df1
+-------------+--------+
| Column Name | Type   |
+-------------+--------+
| student_id  | int    |
| name        | object |
| age         | int    |
+-------------+--------+

DataFrame df2
+-------------+--------+
| Column Name | Type   |
+-------------+--------+
| student_id  | int    |
| name        | object |
| age         | int    |
+-------------+--------+

Write a solution to concatenate these two DataFrames vertically into one DataFrame.

The result format is in the following example.

Example 1:

Input:
df1
+------------+---------+-----+
| student_id | name    | age |
+------------+---------+-----+
| 1          | Mason   | 8   |
| 2          | Ava     | 6   |
| 3          | Taylor  | 15  |
| 4          | Georgia | 17  |
+------------+---------+-----+
df2
+------------+------+-----+
| student_id | name | age |
+------------+------+-----+
| 5          | Leo  | 7   |
| 6          | Alex | 7   |
+------------+------+-----+
Output:
+------------+---------+-----+
| student_id | name    | age |
+------------+---------+-----+
| 1          | Mason   | 8   |
| 2          | Ava     | 6   |
| 3          | Taylor  | 15  |
| 4          | Georgia | 17  |
| 5          | Leo     | 7   |
| 6          | Alex    | 7   |
+------------+---------+-----+
Explanation:
The two DataFrames are stacked vertically, and their rows are combined.

Approach Overview

Problem Overview: You receive two datasets with identical column structure and need to reshape the data by concatenating them vertically. The result should contain all rows from both datasets while preserving column alignment.

Approach 1: Using pandas for Vertical Concatenation (O(n + m) time, O(n + m) space)

This approach relies on the built-in pandas.concat() function to combine two DataFrames along the row axis. The key insight is that pandas already provides optimized internal routines for stacking datasets with identical schemas. You pass both DataFrames in a list and set axis=0 to append rows. Pandas automatically aligns columns and creates a new DataFrame containing all rows. This is the most practical method when working in data pipelines, analytics scripts, or interview questions involving pandas and DataFrame operations. The runtime is linear because every row from both inputs must be copied into the resulting DataFrame.
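
The column alignment described above can be sketched as follows, assuming pandas is installed. The rows are taken from Example 1, but the column order of df2 is deliberately shuffled here (an assumption made only for illustration, not part of the problem) to show that concat matches columns by label rather than by position.

```python
import pandas as pd

df1 = pd.DataFrame({"student_id": [1, 2], "name": ["Mason", "Ava"], "age": [8, 6]})
# Same columns as df1, listed in a different order (illustrative assumption).
df2 = pd.DataFrame({"age": [7, 7], "name": ["Leo", "Alex"], "student_id": [5, 6]})

# axis=0 stacks rows; columns are matched by name, so df2's order does not matter.
result = pd.concat([df1, df2], axis=0, ignore_index=True)

print(result)
```

The combined frame keeps df1's column order, and each value from df2 lands under the column with the matching name.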

Approach 2: Manual Row-wise Appending for Concatenation (O(n + m) time, O(n + m) space)

This method recreates the concatenation behavior manually. Iterate through each row of the first dataset and append it to a result container, then repeat the same process for the second dataset. If the data is represented as lists of records or dictionaries, you simply push rows sequentially into a new list. For DataFrames, this may involve iterating with iterrows() or converting to records first. The key idea is sequential row aggregation: iterate over all rows and build the combined dataset step by step. While the complexity remains linear, this approach is less efficient in real-world scenarios because manual loops in Python add overhead compared with vectorized pandas operations. It is mainly useful for demonstrating how data processing pipelines work internally.
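
As a rough sketch of the records-based variant mentioned above, the helper below (concat_records is a hypothetical name, not from the original article) converts each DataFrame to a list of row dictionaries and appends them one by one. It assumes both inputs share the same columns.

```python
import pandas as pd


def concat_records(first: pd.DataFrame, second: pd.DataFrame) -> pd.DataFrame:
    """Append every row of both frames, as dicts, to one list, then rebuild a frame."""
    combined = []
    for frame in (first, second):
        # to_dict("records") yields one {column: value} dict per row.
        for record in frame.to_dict("records"):
            combined.append(record)
    # Reuse the first frame's column order for the result.
    return pd.DataFrame(combined, columns=first.columns)


df1 = pd.DataFrame({
    "student_id": [1, 2, 3, 4],
    "name": ["Mason", "Ava", "Taylor", "Georgia"],
    "age": [8, 6, 15, 17],
})
df2 = pd.DataFrame({"student_id": [5, 6], "name": ["Leo", "Alex"], "age": [7, 7]})

result = concat_records(df1, df2)
print(result)
```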

Recommended for interviews: The pandas concat() approach is what interviewers expect when the problem explicitly involves DataFrames. It shows familiarity with the ecosystem and avoids unnecessary loops. The manual row-wise approach demonstrates that you understand the underlying mechanics of concatenation, but the optimized library method reflects stronger practical engineering judgment.

Approach 1: Using pandas for Vertical Concatenation

The most straightforward way to concatenate two DataFrames vertically in Python is by using the pandas.concat function. This function allows you to combine multiple DataFrames along either the rows or columns, specified by the axis parameter. Setting axis=0 will stack the DataFrames on top of each other.

This Python solution uses the pandas library to concatenate df1 and df2. The pd.concat function combines the two DataFrames by stacking them on top of each other, specified by axis=0. Setting ignore_index=True reassigns an automatic sequential index to the concatenated DataFrame.

Code

Python
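
The original code listing did not survive on this page, so what follows is a minimal sketch of the solution described above rather than the article's exact code. The function name concatenateTables is an assumption about the signature the judge expects; the sample data comes from Example 1.

```python
import pandas as pd


def concatenateTables(df1: pd.DataFrame, df2: pd.DataFrame) -> pd.DataFrame:
    # axis=0 stacks df2's rows beneath df1's rows.
    # ignore_index=True discards the original row labels and renumbers from 0.
    return pd.concat([df1, df2], axis=0, ignore_index=True)


# Sample data from Example 1.
df1 = pd.DataFrame({
    "student_id": [1, 2, 3, 4],
    "name": ["Mason", "Ava", "Taylor", "Georgia"],
    "age": [8, 6, 15, 17],
})
df2 = pd.DataFrame({
    "student_id": [5, 6],
    "name": ["Leo", "Alex"],
    "age": [7, 7],
})

result = concatenateTables(df1, df2)
print(result)
```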

Complexity

Time Complexity: O(n + m), where n and m are the number of rows in df1 and df2, respectively.
Space Complexity: O(n + m) for storing the new concatenated DataFrame.

Approach 2: Manual Row-wise Appending for Concatenation

This approach manually appends the rows of the second DataFrame after those of the first. It is typically slower than the built-in function, but it makes the mechanics of vertical concatenation explicit.

This Python solution manually creates a list of rows from both df1 and df2 using the iterrows() function. It concatenates these lists to form a complete list of rows and constructs a new DataFrame from it.

Code

Python
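
The original listing is missing here as well. The sketch below follows the description above under the assumption that both DataFrames have the same columns; concat_with_iterrows is a hypothetical name chosen for illustration.

```python
import pandas as pd


def concat_with_iterrows(df1: pd.DataFrame, df2: pd.DataFrame) -> pd.DataFrame:
    # iterrows() yields (index, Series) pairs; keep only each row's values.
    rows1 = [row.tolist() for _, row in df1.iterrows()]
    rows2 = [row.tolist() for _, row in df2.iterrows()]
    # Join the two lists of rows and rebuild a DataFrame with df1's column names.
    return pd.DataFrame(rows1 + rows2, columns=df1.columns)


df1 = pd.DataFrame({
    "student_id": [1, 2, 3, 4],
    "name": ["Mason", "Ava", "Taylor", "Georgia"],
    "age": [8, 6, 15, 17],
})
df2 = pd.DataFrame({"student_id": [5, 6], "name": ["Leo", "Alex"], "age": [7, 7]})

result = concat_with_iterrows(df1, df2)
print(result)
```

One caveat: iterrows() returns each row as a Series, so column dtypes are not preserved row by row, and the rebuilt DataFrame infers its types again from the collected values.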

Complexity

Time Complexity: O(n + m), similar to the concatenation function, but with additional overhead due to manual row iteration.
Space Complexity: O(n + m) for storing the combined rows in a new DataFrame.

Approach 3: Default Approach

Code

Python

Complexity Comparison

+---------------------------------------------------------+----------+----------+
| Approach                                                | Time     | Space    |
+---------------------------------------------------------+----------+----------+
| Approach 1: Using pandas for Vertical Concatenation     | O(n + m) | O(n + m) |
| Approach 2: Manual Row-wise Appending for Concatenation | O(n + m) | O(n + m) |
| Default Approach                                        | n/a      | n/a      |
+---------------------------------------------------------+----------+----------+

Here n and m are the numbers of rows in df1 and df2, respectively. Approach 2 has the same asymptotic cost as Approach 1 but carries additional overhead from manual row iteration. Both approaches use O(n + m) space to store the combined rows in a new DataFrame.

Detailed Complexity Analysis

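+---------------------------+----------+----------+-------------------------------------------------------------------------------------------+
| Approach                  | Time     | Space    | When to Use                                                                               |
+---------------------------+----------+----------+-------------------------------------------------------------------------------------------+
| Using pandas concat()     | O(n + m) | O(n + m) | Best for pandas/DataFrame problems and real data workflows                                |
| Manual Row-wise Appending | O(n + m) | O(n + m) | Useful for understanding how concatenation works internally or when pandas is unavailable |
+---------------------------+----------+----------+-------------------------------------------------------------------------------------------+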

Video Solution

LeetCode 2888 Reshape Data: Concatenate in Python | Pandas Tutorial for Beginners • JR: Educational Channel • 908 views

Frequently Asked Questions

Reshape Data: Concatenate Python solution

The typical Python solution uses pandas.concat(). Example: result = pd.concat([df1, df2], axis=0). This combines both DataFrames vertically and returns a new DataFrame containing all rows. The operation runs in O(n + m) time.
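
A small sketch, using the Example 1 data, of how the one-liner above treats row labels. With axis=0 alone the original index values are kept, so labels repeat; adding ignore_index=True renumbers the rows from 0.

```python
import pandas as pd

df1 = pd.DataFrame({
    "student_id": [1, 2, 3, 4],
    "name": ["Mason", "Ava", "Taylor", "Georgia"],
    "age": [8, 6, 15, 17],
})
df2 = pd.DataFrame({"student_id": [5, 6], "name": ["Leo", "Alex"], "age": [7, 7]})

# Original row labels are carried over from each input.
kept = pd.concat([df1, df2], axis=0)
# Row labels are replaced by a fresh 0..n+m-1 range.
reset = pd.concat([df1, df2], axis=0, ignore_index=True)

print(kept.index.tolist())   # [0, 1, 2, 3, 0, 1]
print(reset.index.tolist())  # [0, 1, 2, 3, 4, 5]
```

Both results contain the same six rows in the same order; only the index differs.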

Is Reshape Data: Concatenate easy or hard?

Reshape Data: Concatenate is considered an Easy problem. It mainly tests familiarity with pandas operations and basic data manipulation concepts rather than complex algorithms or advanced data structures.

How to solve Reshape Data: Concatenate in O(n)?

Treat the operation as a linear merge of rows from two datasets. Using pandas, call concat([df1, df2], axis=0) to stack rows vertically. Each row is processed once, giving O(n + m) time complexity, which is optimal because all elements must appear in the output.

What is the best approach for Reshape Data: Concatenate?

The best approach is using pandas concat() to vertically combine the two DataFrames. It directly stacks rows while preserving column alignment and runs in O(n + m) time where n and m are the number of rows in each dataset. This method is concise, optimized, and commonly used in real data processing tasks.

Is Reshape Data: Concatenate asked at Google/Amazon/Meta?

Data reshaping and DataFrame manipulation problems appear in data engineering, analytics, and machine learning interviews at companies like Google, Amazon, and Meta. While this exact problem may vary, the ability to concatenate datasets using pandas or similar tools is a common expectation.

What data structure is used in Reshape Data: Concatenate?

The primary data structure is a pandas DataFrame. The operation stacks rows from multiple DataFrames into a single DataFrame while maintaining the column schema. Internally, pandas manages the underlying column arrays efficiently during concatenation.

What is the time complexity of Reshape Data: Concatenate?

The time complexity is O(n + m) because every row from both input DataFrames must be copied into the result, and each row is copied only once. Space complexity is also O(n + m) since a new combined DataFrame is created.

Problem Info

Difficulty: Easy
Acceptance: 90.6%
Approaches: 3
Reading time: 4 min
Asked at: Amazon, Google
