Watch 10 video solutions for Movie Rating, a medium level problem involving Database. This walkthrough by Learn With Chirag has 9,883 views views. Want to try solving it yourself? Practice on FleetCode or read the detailed text solution.
Table: Movies
+---------------+---------+ | Column Name | Type | +---------------+---------+ | movie_id | int | | title | varchar | +---------------+---------+ movie_id is the primary key (column with unique values) for this table. title is the name of the movie.
Table: Users
+---------------+---------+ | Column Name | Type | +---------------+---------+ | user_id | int | | name | varchar | +---------------+---------+ user_id is the primary key (column with unique values) for this table. The column 'name' has unique values.
Table: MovieRating
+---------------+---------+ | Column Name | Type | +---------------+---------+ | movie_id | int | | user_id | int | | rating | int | | created_at | date | +---------------+---------+ (movie_id, user_id) is the primary key (column with unique values) for this table. This table contains the rating of a movie by a user in their review. created_at is the user's review date.
Write a solution to:
February 2020. In case of a tie, return the lexicographically smaller movie name.The result format is in the following example.
Example 1:
Input:
Movies table:
+-------------+--------------+
| movie_id | title |
+-------------+--------------+
| 1 | Avengers |
| 2 | Frozen 2 |
| 3 | Joker |
+-------------+--------------+
Users table:
+-------------+--------------+
| user_id | name |
+-------------+--------------+
| 1 | Daniel |
| 2 | Monica |
| 3 | Maria |
| 4 | James |
+-------------+--------------+
MovieRating table:
+-------------+--------------+--------------+-------------+
| movie_id | user_id | rating | created_at |
+-------------+--------------+--------------+-------------+
| 1 | 1 | 3 | 2020-01-12 |
| 1 | 2 | 4 | 2020-02-11 |
| 1 | 3 | 2 | 2020-02-12 |
| 1 | 4 | 1 | 2020-01-01 |
| 2 | 1 | 5 | 2020-02-17 |
| 2 | 2 | 2 | 2020-02-01 |
| 2 | 3 | 2 | 2020-03-01 |
| 3 | 1 | 3 | 2020-02-22 |
| 3 | 2 | 4 | 2020-02-25 |
+-------------+--------------+--------------+-------------+
Output:
+--------------+
| results |
+--------------+
| Daniel |
| Frozen 2 |
+--------------+
Explanation:
Daniel and Monica have rated 3 movies ("Avengers", "Frozen 2" and "Joker") but Daniel is smaller lexicographically.
Frozen 2 and Joker have a rating average of 3.5 in February but Frozen 2 is smaller lexicographically.
Problem Overview: You are given three tables: Movies, Users, and MovieRating. The task has two parts: find the user who submitted the most ratings, and find the movie with the highest average rating during February 2020. If multiple results tie, return the lexicographically smallest name or title.
Approach 1: SQL Grouping and Ordering (O(n log n) time, O(n) space)
This approach relies on SQL aggregation. First, group ratings by user_id and count how many ratings each user submitted. Join with the Users table to access names, then sort by rating count descending and name ascending to break ties. For the movie result, filter ratings within February 2020, compute AVG(rating) grouped by movie_id, join with Movies, and order by average rating descending and title ascending. Sorting introduces O(n log n) time complexity, while grouped results require O(n) intermediate storage.
Approach 2: Direct SQL with Subquery Utilization (O(n log n) time, O(n) space)
This version pushes more logic into subqueries. A subquery calculates rating counts per user and returns the top result using ORDER BY and LIMIT. Another subquery filters February 2020 ratings and computes movie averages. The outer query selects the final rows. The database engine still performs grouping and sorting internally, so complexity remains about O(n log n). This structure keeps the main query concise and separates each requirement clearly.
Approach 3: Programmatic HashMap Aggregation (O(n) time, O(n) space)
If you load the tables into application code (Python or another language), you can compute results using hash maps. Iterate through the rating records once to maintain a count per user and accumulate sums and counts per movie. Use another pass to compute average ratings for February 2020 entries. Hash lookups are constant time, so the full pass runs in O(n). A final comparison step selects the maximum values while resolving ties lexicographically.
These solutions revolve around relational aggregation concepts from database queries and grouping operations commonly used in SQL. The core idea is transforming raw rating records into aggregated statistics that can be ranked.
Recommended for interviews: The SQL grouping and ordering approach is the expected solution. It demonstrates fluency with joins, GROUP BY, and tie‑breaking with ordered results. The HashMap version shows strong algorithmic thinking when solving the problem outside a database environment.
| Approach | Time | Space | When to Use |
|---|---|---|---|
| SQL Grouping and Ordering | O(n log n) | O(n) | Standard SQL solution using GROUP BY, joins, and sorting |
| Direct SQL with Subqueries | O(n log n) | O(n) | Cleaner query structure when separating logic with subqueries |
| HashMap Aggregation (Programmatic) | O(n) | O(n) | Useful when processing data outside SQL using Python or other languages |