DataFrame students
+-------------+--------+
| Column Name | Type   |
+-------------+--------+
| student_id  | int    |
| name        | object |
| age         | int    |
+-------------+--------+
There are some rows having missing values in the name column.
Write a solution to remove the rows with missing values.
The result format is in the following example.
Example 1:
Input:
+------------+---------+-----+
| student_id | name    | age |
+------------+---------+-----+
| 32         | Piper   | 5   |
| 217        | None    | 19  |
| 779        | Georgia | 20  |
| 849        | Willow  | 14  |
+------------+---------+-----+
Output:
+------------+---------+-----+
| student_id | name    | age |
+------------+---------+-----+
| 32         | Piper   | 5   |
| 779        | Georgia | 20  |
| 849        | Willow  | 14  |
+------------+---------+-----+
Explanation: The student with id 217 has an empty value in the name column, so that row is removed.
Problem Overview: You receive a dataset (or table-like structure) that may contain missing values in some rows. The task is to remove rows with missing data and return only the valid records. The operation is common in data preprocessing where incomplete entries must be excluded before further analysis.
Approach 1: Using Filtering Method (O(n) time, O(n) space)
This approach relies on filtering the dataset and keeping only rows that contain valid values in the required column(s). You iterate through each record and check whether the target field is null, None, or otherwise missing. If the value exists, the row is added to the result collection; otherwise it is skipped. High-level languages such as Python and JavaScript make this concise with built‑in filtering utilities like DataFrame.dropna() or array.filter(). The algorithm scans the data once, giving O(n) time complexity with O(n) auxiliary space for the filtered result. This approach is straightforward and commonly used in real-world data pipelines.
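The filtering idea can be sketched without any library, assuming each row is a plain dictionary and a missing value is represented as `None`. The helper name `keep_rows_with_value` and the row representation are illustrative assumptions, not part of the original problem.

```python
from typing import Any

Row = dict[str, Any]


def keep_rows_with_value(rows: list[Row], field: str) -> list[Row]:
    """Return a new list holding only the rows whose `field` is not None.

    Hypothetical helper: scans the input once (O(n) time) and builds a
    separate result list (O(n) auxiliary space).
    """
    return [row for row in rows if row.get(field) is not None]


# Sample data taken from Example 1 of the problem statement.
students = [
    {"student_id": 32, "name": "Piper", "age": 5},
    {"student_id": 217, "name": None, "age": 19},
    {"student_id": 779, "name": "Georgia", "age": 20},
    {"student_id": 849, "name": "Willow", "age": 14},
]

valid = keep_rows_with_value(students, "name")
print([row["student_id"] for row in valid])  # [32, 779, 849]
```

The input list is left untouched, which is why this variant needs O(n) space for the result.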
Approach 2: Iterative Approach (O(n) time, O(1) extra space)
The iterative method manually traverses the dataset using a loop and removes or skips rows with missing fields. In lower-level languages such as C or C++, you check each element and copy valid rows into the output position of the same array or another buffer. The key idea is simple: iterate once, test for missing values, and keep only valid entries. If you modify the structure in place, the extra space can be reduced to O(1). This technique is useful when working with raw arrays or when memory usage must stay minimal. The runtime remains O(n) because every row is inspected exactly once.
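The in-place variant can be illustrated with a two-index compaction. This is a sketch in Python rather than C, assuming rows are dictionaries stored in a list and that `None` marks a missing value; the function name `compact_in_place` is hypothetical.

```python
from typing import Any


def compact_in_place(rows: list[dict[str, Any]], field: str) -> int:
    """Remove rows whose `field` is None by compacting the list in place.

    `read` visits every row once; `write` marks the next slot for a valid
    row. Returns the number of rows kept. Only two index variables are
    used, so the extra space is O(1).
    """
    write = 0
    for read in range(len(rows)):
        if rows[read].get(field) is not None:
            rows[write] = rows[read]  # shift the valid row forward
            write += 1
    del rows[write:]  # drop the leftover tail
    return write


students = [
    {"student_id": 32, "name": "Piper", "age": 5},
    {"student_id": 217, "name": None, "age": 19},
    {"student_id": 779, "name": "Georgia", "age": 20},
    {"student_id": 849, "name": "Willow", "age": 14},
]

kept = compact_in_place(students, "name")
print(kept, [row["student_id"] for row in students])  # 3 [32, 779, 849]
```

Unlike the filtering version, this mutates the caller's list, so it only fits when the original data is no longer needed.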
Problems like this mainly test careful data traversal and validation rather than complex algorithms. The pattern appears frequently in preprocessing tasks and general array manipulation. Practicing these operations builds comfort with linear scans and conditional filtering, which also appear in problems involving arrays and basic iteration. The same idea can extend to more advanced filtering logic such as validating ranges, removing duplicates, or applying rules across multiple fields.
Recommended for interviews: The filtering approach is typically expected because it expresses the intent clearly and uses a single linear pass. Showing the iterative implementation demonstrates that you understand how the operation works internally. In interviews, start with the simple scan that checks each row and excludes missing values, then mention that built‑in filtering utilities provide the same O(n) performance with cleaner code.
This approach leverages a filtering method to iterate over each row and eliminate the rows with missing 'name' values. The method checks for nullity in the 'name' column and keeps only those rows where 'name' is not null.
In Python, Pandas provides a convenient dropna function which is used to remove missing values. Here, we specify the 'name' column in the subset parameter to ensure only rows where the 'name' is missing are dropped.
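A minimal sketch of the `dropna` call described above, assuming pandas is installed. The function name `dropMissingData` is an assumption about the expected signature, and the sample frame reproduces Example 1.

```python
import pandas as pd


def dropMissingData(students: pd.DataFrame) -> pd.DataFrame:
    """Return the rows of `students` whose 'name' value is present."""
    # subset=["name"] restricts the missing-value check to that column,
    # so gaps in other columns would not cause a row to be dropped.
    return students.dropna(subset=["name"])


students = pd.DataFrame(
    {
        "student_id": [32, 217, 779, 849],
        "name": ["Piper", None, "Georgia", "Willow"],
        "age": [5, 19, 20, 14],
    }
)

result = dropMissingData(students)
print(result["student_id"].tolist())  # [32, 779, 849]
```

By default `dropna` returns a new DataFrame and leaves `students` unchanged; the retained rows keep their original index labels.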
Time Complexity: O(n), where n is the number of rows, as we potentially check each row.
Space Complexity: O(n), because dropna returns a new DataFrame containing the retained rows by default rather than modifying the original in place.
This approach manually iterates over each row in the dataset, checking if the 'name' field is missing. Rows with missing 'name' values are filtered out, which can be useful in environments that lack high-level filtering functions.
This C program defines a structure Student and iterates over each student, checking if the name is NULL. Valid entries are copied to a new array, which is returned after the loop.
Time Complexity: O(n), to iterate over each student.
Space Complexity: O(n), to store the valid students in the new array.
| Approach | Complexity |
|---|---|
| Using Filtering Method | Time Complexity: O(n), where n is the number of rows, as we potentially check each row. |
| Iterative Approach | Time Complexity: O(n), to iterate over each student. |
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Filtering Method | O(n) | O(n) | Best for high-level languages with built-in filtering utilities like Python or JavaScript |
| Iterative Approach | O(n) | O(1) extra space | Useful in C/C++ or memory-constrained environments where manual traversal is required |