#2883 Drop Missing Data - Solution

DataFrame students
+-------------+--------+
| Column Name | Type   |
+-------------+--------+
| student_id  | int    |
| name        | object |
| age         | int    |
+-------------+--------+

There are some rows having missing values in the name column.

Write a solution to remove the rows with missing values.

The result format is in the following example.

Example 1:

Input:
+------------+---------+-----+
| student_id | name    | age |
+------------+---------+-----+
| 32         | Piper   | 5   |
| 217        | None    | 19  |
| 779        | Georgia | 20  |
| 849        | Willow  | 14  |
+------------+---------+-----+
Output:
+------------+---------+-----+
| student_id | name    | age |
+------------+---------+-----+
| 32         | Piper   | 5   |
| 779        | Georgia | 20  | 
| 849        | Willow  | 14  | 
+------------+---------+-----+
Explanation: 
Student with id 217 has an empty value in the name column, so it will be removed.

The key idea in #2883 Drop Missing Data is to clean a dataset by removing rows that contain missing values in a specific column. In many data processing tasks, missing or NULL/NaN entries can affect analysis, so filtering them out is a common preprocessing step.

A straightforward approach is to scan the dataset and keep only the rows where the target column contains a valid value. In data-processing libraries such as Pandas, this can be done using functions like dropna() or by applying a boolean filter that checks whether the column value is not null.
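The boolean-filter variant mentioned above can be sketched as follows. This is a minimal illustration that assumes pandas is installed; the function name `drop_missing_with_mask` is hypothetical, and the sample data is taken from Example 1.

```python
import pandas as pd


def drop_missing_with_mask(students: pd.DataFrame) -> pd.DataFrame:
    # notna() is True for rows whose 'name' is present (neither None nor NaN).
    has_name = students["name"].notna()
    # Boolean indexing returns a new DataFrame holding only the True rows.
    return students[has_name]


# Sample input from Example 1.
students = pd.DataFrame({
    "student_id": [32, 217, 779, 849],
    "name": ["Piper", None, "Georgia", "Willow"],
    "age": [5, 19, 20, 14],
})
result = drop_missing_with_mask(students)
```

The original `students` DataFrame is left untouched; only `result` lacks the row for student 217.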

The algorithm processes each row once and decides whether it should remain in the dataset. Because it performs a single pass over the rows, the method is efficient and easy to implement. The time complexity is O(n), where n is the number of rows. Auxiliary space is usually quoted as O(1) because no extra bookkeeping structure is needed, although in Pandas both dropna() and boolean indexing return a new DataFrame by default, so the filtered result itself can occupy up to O(n) memory.

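+----------------------------------------------------------------------------+-----------------+------------------+
| Approach                                                                   | Time Complexity | Space Complexity |
+----------------------------------------------------------------------------+-----------------+------------------+
| Filter rows with non-missing values (e.g., using `dropna` or boolean mask) | O(n)            | O(1)             |
+----------------------------------------------------------------------------+-----------------+------------------+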

The first approach uses a built-in filtering method to eliminate the rows with missing 'name' values. The method checks for nullity in the 'name' column and keeps only those rows where 'name' is not null.

Time Complexity: O(n), where n is the number of rows, as we potentially check each row.
Space Complexity: O(1) if the rows are removed in place. Note that Pandas returns a new DataFrame by default, in which case the result occupies up to O(n) space.

In Python, Pandas provides a convenient dropna method that removes rows with missing values. Passing the 'name' column through the subset parameter ensures that only rows where 'name' is missing are dropped.
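A minimal sketch of this dropna call, assuming pandas is installed. The function name `drop_missing_names` and the sample rows are illustrative assumptions, not part of the problem statement. The third row has a missing age but a valid name, to show that subset restricts the check to the 'name' column.

```python
import pandas as pd


def drop_missing_names(students: pd.DataFrame) -> pd.DataFrame:
    # subset=["name"] limits the null check to the 'name' column, so missing
    # values in other columns do not cause a row to be dropped.
    return students.dropna(subset=["name"])


# Hypothetical sample: row 2 lacks a name, row 3 lacks an age.
sample = pd.DataFrame({
    "student_id": [1, 2, 3],
    "name": ["Ava", None, "Cleo"],
    "age": [10, 11, None],
})
cleaned = drop_missing_names(sample)
```

Only the row with the missing name is removed; the row with the missing age stays, because 'age' is not listed in subset.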

The second approach manually iterates over each row in the dataset, checking whether the 'name' field is missing. Rows with missing 'name' values are filtered out, which can be useful in environments that lack high-level filtering functions.

Time Complexity: O(n), to iterate over each student.
Space Complexity: O(n), to store the valid students in the new array.


This C program defines a structure Student and iterates over each student, checking if the name is NULL. Valid entries are copied to a new array, which is returned after the loop.
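The same iterative idea can be sketched in Python for readers who do not work in C. This is an illustrative equivalent, not the page's C program: the `Student` type and the function name `drop_missing_iterative` are assumptions, and a missing name is represented as None.

```python
from typing import List, Optional, TypedDict


class Student(TypedDict):
    student_id: int
    name: Optional[str]  # None stands in for a missing name
    age: int


def drop_missing_iterative(students: List[Student]) -> List[Student]:
    valid: List[Student] = []
    for student in students:
        # Keep the record only when the name is present.
        if student["name"] is not None:
            valid.append(student)
    # A new list is returned, which is where the O(n) extra space comes from.
    return valid


# Sample input from Example 1.
records: List[Student] = [
    {"student_id": 32, "name": "Piper", "age": 5},
    {"student_id": 217, "name": None, "age": 19},
    {"student_id": 779, "name": "Georgia", "age": 20},
    {"student_id": 849, "name": "Willow", "age": 14},
]
kept = drop_missing_iterative(records)
```

The input list is not modified; `kept` contains the three students whose names are present, in their original order.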
