DataFrame students
+-------------+--------+
| Column Name | Type   |
+-------------+--------+
| student_id  | int    |
| name        | object |
| age         | int    |
+-------------+--------+
There are some rows having missing values in the name column.
Write a solution to remove the rows with missing values.
The result format is in the following example.
Example 1:
Input:
+------------+---------+-----+
| student_id | name    | age |
+------------+---------+-----+
| 32         | Piper   | 5   |
| 217        | None    | 19  |
| 779        | Georgia | 20  |
| 849        | Willow  | 14  |
+------------+---------+-----+
Output:
+------------+---------+-----+
| student_id | name    | age |
+------------+---------+-----+
| 32         | Piper   | 5   |
| 779        | Georgia | 20  |
| 849        | Willow  | 14  |
+------------+---------+-----+
Explanation: The student with id 217 has a missing value in the name column, so that row is removed.
This approach leverages a filtering method to iterate over each row and eliminate the rows with missing 'name' values. The method checks for nullity in the 'name' column and keeps only those rows where 'name' is not null.
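The null check described above can be sketched with a boolean mask. This is a minimal illustration, assuming the example data from the problem statement; the variable names `mask` and `filtered` are chosen here for illustration and are not part of the original solution.

```python
import pandas as pd

# Example input from the problem statement; None marks a missing name.
students = pd.DataFrame({
    "student_id": [32, 217, 779, 849],
    "name": ["Piper", None, "Georgia", "Willow"],
    "age": [5, 19, 20, 14],
})

# notna() returns True for rows where 'name' is present, False where it is missing.
mask = students["name"].notna()

# Boolean indexing keeps only the rows where the mask is True.
filtered = students[mask]
print(filtered)
```

The mask is computed for the whole column at once, so no explicit Python loop over rows is needed.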
In Python, Pandas provides a convenient dropna function which is used to remove missing values. Here, we specify the 'name' column in the subset parameter to ensure only rows where the 'name' is missing are dropped.
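A minimal sketch of the `dropna` call with the `subset` parameter follows. The function name `drop_missing_data` is an assumption made for this example; the sample data comes from the problem statement.

```python
import pandas as pd

def drop_missing_data(students: pd.DataFrame) -> pd.DataFrame:
    # Drop only the rows whose 'name' is missing; other columns are not checked.
    # By default dropna returns a new DataFrame and leaves the input unchanged.
    return students.dropna(subset=["name"])

students = pd.DataFrame({
    "student_id": [32, 217, 779, 849],
    "name": ["Piper", None, "Georgia", "Willow"],
    "age": [5, 19, 20, 14],
})

result = drop_missing_data(students)
print(result)
```

Without `subset`, `dropna` would drop a row if any of its columns held a missing value, which is broader than what the problem asks for.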
Time Complexity: O(n), where n is the number of rows, as we potentially check each row.
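Space Complexity: O(n) in the default case, as dropna returns a new DataFrame holding the kept rows rather than modifying the original. Passing inplace=True updates the existing object, though Pandas may still create a copy internally.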
This approach manually iterates over each row in the dataset, checking if the 'name' field is missing. Rows with missing 'name' values are filtered out, which can be useful in environments that lack high-level filtering functions.
This C program defines a structure Student and iterates over each student, checking if the name is NULL. Valid entries are copied to a new array, which is returned after the loop.
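The same copy-the-valid-entries loop can be sketched in Python for comparison. This is not the C program referred to above; the `Student` dataclass and the function name `drop_missing_names` are hypothetical, with `None` standing in for C's NULL.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Student:
    student_id: int
    name: Optional[str]  # None plays the role of a NULL name
    age: int

def drop_missing_names(students: List[Student]) -> List[Student]:
    valid: List[Student] = []
    for student in students:
        # Keep the record only if its name is present.
        if student.name is not None:
            valid.append(student)
    return valid

students = [
    Student(32, "Piper", 5),
    Student(217, None, 19),
    Student(779, "Georgia", 20),
    Student(849, "Willow", 14),
]

kept = drop_missing_names(students)
print([s.student_id for s in kept])
```

The input list is left untouched and the valid records are collected in a new list, which matches the O(n) extra space stated for this approach.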
Time Complexity: O(n), to iterate over each student.
Space Complexity: O(n), to store the valid students in the new array.
| Approach | Time Complexity |
|---|---|
| Using Filtering Method | O(n), where n is the number of rows, as we potentially check each row. |
| Iterative Approach | O(n), to iterate over each student. |