Watch the video solution for Books with NULL Ratings, a easy level problem involving Database. This walkthrough by Everyday Data Science has 558 views views. Want to try solving it yourself? Practice on FleetCode or read the detailed text solution.
Table: books
+----------------+---------+ | Column Name | Type | +----------------+---------+ | book_id | int | | title | varchar | | author | varchar | | published_year | int | | rating | decimal | +----------------+---------+ book_id is the unique key for this table. Each row of this table contains information about a book including its unique ID, title, author, publication year, and rating. rating can be NULL, indicating that the book hasn't been rated yet.
Write a solution to find all books that have not been rated yet (i.e., have a NULL rating).
Return the result table ordered by book_id in ascending order.
The result format is in the following example.
Example:
Input:
books table:
+---------+------------------------+------------------+----------------+--------+ | book_id | title | author | published_year | rating | +---------+------------------------+------------------+----------------+--------+ | 1 | The Great Gatsby | F. Scott | 1925 | 4.5 | | 2 | To Kill a Mockingbird | Harper Lee | 1960 | NULL | | 3 | Pride and Prejudice | Jane Austen | 1813 | 4.8 | | 4 | The Catcher in the Rye | J.D. Salinger | 1951 | NULL | | 5 | Animal Farm | George Orwell | 1945 | 4.2 | | 6 | Lord of the Flies | William Golding | 1954 | NULL | +---------+------------------------+------------------+----------------+--------+
Output:
+---------+------------------------+------------------+----------------+ | book_id | title | author | published_year | +---------+------------------------+------------------+----------------+ | 2 | To Kill a Mockingbird | Harper Lee | 1960 | | 4 | The Catcher in the Rye | J.D. Salinger | 1951 | | 6 | Lord of the Flies | William Golding | 1954 | +---------+------------------------+------------------+----------------+
Explanation:
Problem Overview: The task is to return all books whose rating value is NULL. In SQL-style datasets, NULL represents missing or unknown data, so the query must explicitly check for that condition rather than comparing with standard equality operators.
Approach 1: Conditional Filtering (SQL / MySQL) (Time: O(n), Space: O(1))
Scan the Books table and filter rows where the rating column is NULL. SQL treats NULL as a special marker for missing values, so using = NULL will not work. Instead, the correct condition is IS NULL. The database engine evaluates this condition for each row during the table scan and returns only those entries with missing ratings. This approach is straightforward and commonly used in SQL queries when dealing with incomplete datasets.
Because the database must check each row's rating value, the time complexity is O(n), where n is the number of rows in the table. Space complexity remains O(1) since the query only filters rows without creating additional structures.
Approach 2: Conditional Filtering (Pandas) (Time: O(n), Space: O(n))
When solving the same problem using Pandas, load the Books table into a DataFrame and apply a boolean filter using isna() or isnull(). These methods detect missing values in a column and return a boolean mask. Applying that mask to the DataFrame keeps only rows where the rating is missing. This pattern is common in database-style data analysis tasks where datasets contain incomplete records.
The operation scans the column once, giving a time complexity of O(n). The boolean mask created by Pandas requires additional memory proportional to the dataset size, so the space complexity is O(n).
Recommended for interviews: Interviewers expect the SQL IS NULL filtering approach. The key detail they look for is understanding that NULL cannot be compared using =. Demonstrating the correct conditional filter shows familiarity with SQL semantics and real-world data handling.
| Approach | Time | Space | When to Use |
|---|---|---|---|
| SQL Conditional Filtering (IS NULL) | O(n) | O(1) | Standard database queries when filtering rows with missing values |
| Pandas DataFrame Filtering (isna) | O(n) | O(n) | Data analysis workflows where the table is loaded into a Pandas DataFrame |