Table: Movies
+---------------+---------+ | Column Name | Type | +---------------+---------+ | movie_id | int | | title | varchar | +---------------+---------+ movie_id is the primary key (column with unique values) for this table. title is the name of the movie.
Table: Users
+---------------+---------+ | Column Name | Type | +---------------+---------+ | user_id | int | | name | varchar | +---------------+---------+ user_id is the primary key (column with unique values) for this table. The column 'name' has unique values.
Table: MovieRating
+---------------+---------+ | Column Name | Type | +---------------+---------+ | movie_id | int | | user_id | int | | rating | int | | created_at | date | +---------------+---------+ (movie_id, user_id) is the primary key (column with unique values) for this table. This table contains the rating of a movie by a user in their review. created_at is the user's review date.
Write a solution to:
February 2020. In case of a tie, return the lexicographically smaller movie name.The result format is in the following example.
Example 1:
Input:
Movies table:
+-------------+--------------+
| movie_id | title |
+-------------+--------------+
| 1 | Avengers |
| 2 | Frozen 2 |
| 3 | Joker |
+-------------+--------------+
Users table:
+-------------+--------------+
| user_id | name |
+-------------+--------------+
| 1 | Daniel |
| 2 | Monica |
| 3 | Maria |
| 4 | James |
+-------------+--------------+
MovieRating table:
+-------------+--------------+--------------+-------------+
| movie_id | user_id | rating | created_at |
+-------------+--------------+--------------+-------------+
| 1 | 1 | 3 | 2020-01-12 |
| 1 | 2 | 4 | 2020-02-11 |
| 1 | 3 | 2 | 2020-02-12 |
| 1 | 4 | 1 | 2020-01-01 |
| 2 | 1 | 5 | 2020-02-17 |
| 2 | 2 | 2 | 2020-02-01 |
| 2 | 3 | 2 | 2020-03-01 |
| 3 | 1 | 3 | 2020-02-22 |
| 3 | 2 | 4 | 2020-02-25 |
+-------------+--------------+--------------+-------------+
Output:
+--------------+
| results |
+--------------+
| Daniel |
| Frozen 2 |
+--------------+
Explanation:
Daniel and Monica have rated 3 movies ("Avengers", "Frozen 2" and "Joker") but Daniel is smaller lexicographically.
Frozen 2 and Joker have a rating average of 3.5 in February but Frozen 2 is smaller lexicographically.
Problem Overview: You are given three tables: Movies, Users, and MovieRating. The task has two parts: find the user who submitted the most ratings, and find the movie with the highest average rating during February 2020. If multiple results tie, return the lexicographically smallest name or title.
Approach 1: SQL Grouping and Ordering (O(n log n) time, O(n) space)
This approach relies on SQL aggregation. First, group ratings by user_id and count how many ratings each user submitted. Join with the Users table to access names, then sort by rating count descending and name ascending to break ties. For the movie result, filter ratings within February 2020, compute AVG(rating) grouped by movie_id, join with Movies, and order by average rating descending and title ascending. Sorting introduces O(n log n) time complexity, while grouped results require O(n) intermediate storage.
Approach 2: Direct SQL with Subquery Utilization (O(n log n) time, O(n) space)
This version pushes more logic into subqueries. A subquery calculates rating counts per user and returns the top result using ORDER BY and LIMIT. Another subquery filters February 2020 ratings and computes movie averages. The outer query selects the final rows. The database engine still performs grouping and sorting internally, so complexity remains about O(n log n). This structure keeps the main query concise and separates each requirement clearly.
Approach 3: Programmatic HashMap Aggregation (O(n) time, O(n) space)
If you load the tables into application code (Python or another language), you can compute results using hash maps. Iterate through the rating records once to maintain a count per user and accumulate sums and counts per movie. Use another pass to compute average ratings for February 2020 entries. Hash lookups are constant time, so the full pass runs in O(n). A final comparison step selects the maximum values while resolving ties lexicographically.
These solutions revolve around relational aggregation concepts from database queries and grouping operations commonly used in SQL. The core idea is transforming raw rating records into aggregated statistics that can be ranked.
Recommended for interviews: The SQL grouping and ordering approach is the expected solution. It demonstrates fluency with joins, GROUP BY, and tie‑breaking with ordered results. The HashMap version shows strong algorithmic thinking when solving the problem outside a database environment.
This approach uses SQL queries with aggregation and sorting to determine the required results. It relies on the GROUP BY clause for aggregating data, and ORDER BY for sorting. Joins are used to link information from different tables.
The solution consists of creating three CTEs (Common Table Expressions): UserRatings, MaxUserRatings, and AvgMovieRatings. The first calculates the number of ratings per user, the second selects the user(s) with the maximum number of ratings, and the third calculates the average movie rating in February 2020. We then find the lexicographically smallest user name and movie title in case of ties.
SQL
Time Complexity: The time complexity primarily depends on the size of the dataset and the usage of GROUP BY and ORDER BY operations, which in general takes O(n log n) where n is the number of rows in the MovieRating table.
Space Complexity: O(m + k) for holding intermediate aggregates, where m is the distinct number of users and k is the number of distinct movies.
Implementing the solution programmatically requires gathering data from the three different tables into memory and using hashmaps (dictionaries) to process it. Aggregations like counting ratings per user and averaging movie ratings in February are performed using these data structures.
This Python code uses dictionaries to store and compute the necessary counts and averages. For the first part, it counts how many movies each user rated. Then, it finds the user(s) who has the highest count. In case of ties, the lexicographically smallest is chosen. For the second part, it calculates the average ratings for movies within February 2020 and selects the movie with the highest average. Again, ties are resolved by choosing lexicographically.
Python
Time Complexity: O(u + m), where u is the number of users and m is the number of ratings. We traverse all ratings once.
Space Complexity: O(u + v), where u is the number of users and v is the number of movies, due to storage in hashmaps.
This approach involves two main SQL queries to achieve the desired results. For the first part, the task is to determine which user rated the most movies. We achieve this using the COUNT and GROUP BY functions to tally up the number of ratings per user. The results are then ordered by the count in descending order and lexicographically by name to resolve ties.
For the second task, we filter the MovieRating table to include only entries from February 2020. Using the AVG function, we calculate the average rating for each movie and use ORDER BY to sort the results by average ratings, resolving ties lexicographically by movie title.
In C, direct execution of SQL would involve using a library for database connection, such as MySQL Connector/C, and executing these queries within a connected session. This stub demonstrates how you might set up and define the queries required to solve the problem. Actual SQL execution would involve interfacing with a database to execute the queries and fetch results.
Time Complexity: O(N + M) for parsing and fetching results from the database for both queries, where N and M are the number of records queried.
Space Complexity: O(1), assuming output directly to the console or a single row fetch at a time.
This approach still operates with SQL as the problem is fundamentally a SQL-based operation. Here's how we expand on using nested subqueries and CTEs (Common Table Expressions) to simplify readability and management of data within SQL scripts.
This SQL-based approach in C through the use of CTEs simplifies the organization of complex operations. Utilizing CTE, we first calculate ratings for each user in a subquery, and separately calculate and order the average ratings for February 2020 by the movie, utilizing easy-to-read sequential querying structure.
Time Complexity: O(N + M) as queries sequentially parse data, where N and M typically align with the size of data in executions.
Space Complexity: O(1), assuming data is streamed or fetched minimally around summarization processes.
MySQL
| Approach | Complexity |
|---|---|
| Using SQL Queries | Time Complexity: The time complexity primarily depends on the size of the dataset and the usage of GROUP BY and ORDER BY operations, which in general takes O(n log n) where n is the number of rows in the MovieRating table. |
| Programmatically with HashMaps | Time Complexity: O(u + m), where u is the number of users and m is the number of ratings. We traverse all ratings once. |
| Approach 1: SQL Grouping and Ordering | Time Complexity: O(N + M) for parsing and fetching results from the database for both queries, where N and M are the number of records queried. |
| Approach 2: Direct SQL with Subquery Utilization | Time Complexity: O(N + M) as queries sequentially parse data, where N and M typically align with the size of data in executions. |
| Default Approach | — |
| Approach | Time | Space | When to Use |
|---|---|---|---|
| SQL Grouping and Ordering | O(n log n) | O(n) | Standard SQL solution using GROUP BY, joins, and sorting |
| Direct SQL with Subqueries | O(n log n) | O(n) | Cleaner query structure when separating logic with subqueries |
| HashMap Aggregation (Programmatic) | O(n) | O(n) | Useful when processing data outside SQL using Python or other languages |
Movie Rating | Leetcode 1341 | Crack SQL Interviews in 50 Qs #mysql #leetcode • Learn With Chirag • 9,883 views views
Watch 9 more video solutions →Practice Movie Rating with our built-in code editor and test cases.
Practice on FleetCode