Movie Rating - Solution & Explanation

MediumDatabase23 min readAsked at: Amazon, Microsoft, Meta +3

Problem Statement

Table: Movies

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| movie_id      | int     |
| title         | varchar |
+---------------+---------+
movie_id is the primary key (column with unique values) for this table.
title is the name of the movie.

Table: Users

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| user_id       | int     |
| name          | varchar |
+---------------+---------+
user_id is the primary key (column with unique values) for this table.
The column 'name' has unique values.

Table: MovieRating

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| movie_id      | int     |
| user_id       | int     |
| rating        | int     |
| created_at    | date    |
+---------------+---------+
(movie_id, user_id) is the primary key (column with unique values) for this table.
This table contains the rating of a movie by a user in their review.
created_at is the user's review date.

Write a solution to:

Find the name of the user who has rated the greatest number of movies. In case of a tie, return the lexicographically smaller user name.
Find the movie name with the highest average rating in February 2020. In case of a tie, return the lexicographically smaller movie name.

The result format is in the following example.

Example 1:

Input: 
Movies table:
+-------------+--------------+
| movie_id    |  title       |
+-------------+--------------+
| 1           | Avengers     |
| 2           | Frozen 2     |
| 3           | Joker        |
+-------------+--------------+
Users table:
+-------------+--------------+
| user_id     |  name        |
+-------------+--------------+
| 1           | Daniel       |
| 2           | Monica       |
| 3           | Maria        |
| 4           | James        |
+-------------+--------------+
MovieRating table:
+-------------+--------------+--------------+-------------+
| movie_id    | user_id      | rating       | created_at  |
+-------------+--------------+--------------+-------------+
| 1           | 1            | 3            | 2020-01-12  |
| 1           | 2            | 4            | 2020-02-11  |
| 1           | 3            | 2            | 2020-02-12  |
| 1           | 4            | 1            | 2020-01-01  |
| 2           | 1            | 5            | 2020-02-17  | 
| 2           | 2            | 2            | 2020-02-01  | 
| 2           | 3            | 2            | 2020-03-01  |
| 3           | 1            | 3            | 2020-02-22  | 
| 3           | 2            | 4            | 2020-02-25  | 
+-------------+--------------+--------------+-------------+
Output: 
+--------------+
| results      |
+--------------+
| Daniel       |
| Frozen 2     |
+--------------+
Explanation: 
Daniel and Monica have rated 3 movies ("Avengers", "Frozen 2" and "Joker") but Daniel is smaller lexicographically.
Frozen 2 and Joker have a rating average of 3.5 in February but Frozen 2 is smaller lexicographically.

Approach Overview

Problem Overview: You are given three tables: Movies, Users, and MovieRating. The task has two parts: find the user who submitted the most ratings, and find the movie with the highest average rating during February 2020. If multiple results tie, return the lexicographically smallest name or title.

Approach 1: SQL Grouping and Ordering (O(n log n) time, O(n) space)

This approach relies on SQL aggregation. First, group ratings by user_id and count how many ratings each user submitted. Join with the Users table to access names, then sort by rating count descending and name ascending to break ties. For the movie result, filter ratings within February 2020, compute AVG(rating) grouped by movie_id, join with Movies, and order by average rating descending and title ascending. Sorting introduces O(n log n) time complexity, while grouped results require O(n) intermediate storage.

Approach 2: Direct SQL with Subquery Utilization (O(n log n) time, O(n) space)

This version pushes more logic into subqueries. A subquery calculates rating counts per user and returns the top result using ORDER BY and LIMIT. Another subquery filters February 2020 ratings and computes movie averages. The outer query selects the final rows. The database engine still performs grouping and sorting internally, so complexity remains about O(n log n). This structure keeps the main query concise and separates each requirement clearly.

Approach 3: Programmatic HashMap Aggregation (O(n) time, O(n) space)

If you load the tables into application code (Python or another language), you can compute results using hash maps. Iterate through the rating records once to maintain a count per user and accumulate sums and counts per movie. Use another pass to compute average ratings for February 2020 entries. Hash lookups are constant time, so the full pass runs in O(n). A final comparison step selects the maximum values while resolving ties lexicographically.

These solutions revolve around relational aggregation concepts from database queries and grouping operations commonly used in SQL. The core idea is transforming raw rating records into aggregated statistics that can be ranked.

Recommended for interviews: The SQL grouping and ordering approach is the expected solution. It demonstrates fluency with joins, GROUP BY, and tie‑breaking with ordered results. The HashMap version shows strong algorithmic thinking when solving the problem outside a database environment.

Approach 1: Using SQL Queries

This approach uses SQL queries with aggregation and sorting to determine the required results. It relies on the GROUP BY clause for aggregating data, and ORDER BY for sorting. Joins are used to link information from different tables.

The solution consists of creating three CTEs (Common Table Expressions): UserRatings, MaxUserRatings, and AvgMovieRatings. The first calculates the number of ratings per user, the second selects the user(s) with the maximum number of ratings, and the third calculates the average movie rating in February 2020. We then find the lexicographically smallest user name and movie title in case of ties.

Code

SQL

Complexity

Time Complexity: The time complexity primarily depends on the size of the dataset and the usage of GROUP BY and ORDER BY operations, which in general takes O(n log n) where n is the number of rows in the MovieRating table.
Space Complexity: O(m + k) for holding intermediate aggregates, where m is the distinct number of users and k is the number of distinct movies.

Try this approach in the editor →

Approach 2: Programmatically with HashMaps

Implementing the solution programmatically requires gathering data from the three different tables into memory and using hashmaps (dictionaries) to process it. Aggregations like counting ratings per user and averaging movie ratings in February are performed using these data structures.

This Python code uses dictionaries to store and compute the necessary counts and averages. For the first part, it counts how many movies each user rated. Then, it finds the user(s) who has the highest count. In case of ties, the lexicographically smallest is chosen. For the second part, it calculates the average ratings for movies within February 2020 and selects the movie with the highest average. Again, ties are resolved by choosing lexicographically.

Code

Python

Complexity

Time Complexity: O(u + m), where u is the number of users and m is the number of ratings. We traverse all ratings once.
Space Complexity: O(u + v), where u is the number of users and v is the number of movies, due to storage in hashmaps.

Try this approach in the editor →

Approach 3: Approach 1: SQL Grouping and Ordering

This approach involves two main SQL queries to achieve the desired results. For the first part, the task is to determine which user rated the most movies. We achieve this using the COUNT and GROUP BY functions to tally up the number of ratings per user. The results are then ordered by the count in descending order and lexicographically by name to resolve ties.

For the second task, we filter the MovieRating table to include only entries from February 2020. Using the AVG function, we calculate the average rating for each movie and use ORDER BY to sort the results by average ratings, resolving ties lexicographically by movie title.

In C, direct execution of SQL would involve using a library for database connection, such as MySQL Connector/C, and executing these queries within a connected session. This stub demonstrates how you might set up and define the queries required to solve the problem. Actual SQL execution would involve interfacing with a database to execute the queries and fetch results.

Code

C C++Java Python C#JavaScript

C++

Java

Python

JavaScript

Complexity

Time Complexity: O(N + M) for parsing and fetching results from the database for both queries, where N and M are the number of records queried.
Space Complexity: O(1), assuming output directly to the console or a single row fetch at a time.

Try this approach in the editor →

Approach 4: Approach 2: Direct SQL with Subquery Utilization

This approach still operates with SQL as the problem is fundamentally a SQL-based operation. Here's how we expand on using nested subqueries and CTEs (Common Table Expressions) to simplify readability and management of data within SQL scripts.

This SQL-based approach in C through the use of CTEs simplifies the organization of complex operations. Utilizing CTE, we first calculate ratings for each user in a subquery, and separately calculate and order the average ratings for February 2020 by the movie, utilizing easy-to-read sequential querying structure.

Code

C C++Java Python C#JavaScript

C++

Java

Python

JavaScript

Complexity

Time Complexity: O(N + M) as queries sequentially parse data, where N and M typically align with the size of data in executions.
Space Complexity: O(1), assuming data is streamed or fetched minimally around summarization processes.

Try this approach in the editor →

Approach 5: Default Approach

Code

MySQL

Try this approach in the editor →

Complexity Comparison

Approach	Complexity
Using SQL Queries	Time Complexity: The time complexity primarily depends on the size of the dataset and the usage of GROUP BY and ORDER BY operations, which in general takes O(n log n) where n is the number of rows in the MovieRating table. Space Complexity: O(m + k) for holding intermediate aggregates, where m is the distinct number of users and k is the number of distinct movies.
Programmatically with HashMaps	Time Complexity: O(u + m), where u is the number of users and m is the number of ratings. We traverse all ratings once. Space Complexity: O(u + v), where u is the number of users and v is the number of movies, due to storage in hashmaps.
Approach 1: SQL Grouping and Ordering	Time Complexity: O(N + M) for parsing and fetching results from the database for both queries, where N and M are the number of records queried. Space Complexity: O(1), assuming output directly to the console or a single row fetch at a time.
Approach 2: Direct SQL with Subquery Utilization	Time Complexity: O(N + M) as queries sequentially parse data, where N and M typically align with the size of data in executions. Space Complexity: O(1), assuming data is streamed or fetched minimally around summarization processes.
Default Approach	—

Detailed Complexity Analysis

Approach	Time	Space	When to Use
SQL Grouping and Ordering	O(n log n)	O(n)	Standard SQL solution using GROUP BY, joins, and sorting
Direct SQL with Subqueries	O(n log n)	O(n)	Cleaner query structure when separating logic with subqueries
HashMap Aggregation (Programmatic)	O(n)	O(n)	Useful when processing data outside SQL using Python or other languages

Video Solution

Movie Rating | Leetcode 1341 | Crack SQL Interviews in 50 Qs #mysql #leetcode • Learn With Chirag • 9,883 views views

Watch 9 more video solutions →

Frequently Asked Questions

Is Movie Rating easy or hard?

Movie Rating is considered a medium difficulty database problem. The challenge is not complex algorithms but correctly structuring SQL joins, aggregations, and tie breaking conditions while filtering ratings by date.

How to solve Movie Rating in O(n)?

An O(n) approach is possible when processing records programmatically. Iterate through all rating rows once while maintaining hash maps for user rating counts and movie rating sums and counts. After aggregation, compute the best user and movie with simple comparisons. Hash lookups keep operations constant time.

Movie Rating Python or Java solution

In Python or Java, store user rating counts in a hash map and track movie rating sums and counts for February 2020. After processing all records, compute averages and choose the highest values while applying lexicographic tie breaking. This approach runs in O(n) time with O(n) space.

What is the best approach for Movie Rating?

The standard solution uses SQL aggregation with GROUP BY and ORDER BY. First count ratings per user to find the most active reviewer, then compute the average rating per movie for February 2020. Sorting resolves ties by lexicographical order of names or titles. This approach runs in about O(n log n) time due to sorting.

Is Movie Rating asked at Google/Amazon/Meta?

Database aggregation problems like Movie Rating frequently appear in SQL interview rounds at companies such as Amazon, Meta, and other data‑heavy teams. Interviewers use them to evaluate joins, GROUP BY usage, and ranking logic with tie breaking.

What data structure is used in Movie Rating?

The SQL version primarily uses relational operations such as joins, GROUP BY aggregation, and ORDER BY sorting. In a programmatic solution, hash maps store counts and rating totals for users and movies, enabling efficient aggregation in linear time.

What is the time complexity of Movie Rating?

Most SQL implementations run in O(n log n) time because grouped results are sorted to determine the top user and movie. The grouping itself is roughly O(n), while ORDER BY operations add the log factor. Space complexity is O(n) for intermediate aggregated results.

Ready to solve this problem?

Practice Movie Rating with our built-in code editor and test cases.

Practice on FleetCode

Combine Two Tables

Second Highest Salary

Problem Info

DifficultyMedium

Acceptance40.8%

Approaches5

Reading time23 min

Asked at

Amazon Microsoft Meta SAP Google

Practice this problem

Open in Editor

Problem Statement

Approach Overview

Approach 1: Using SQL Queries

Code

Complexity

Approach 2: Programmatically with HashMaps

Code

Complexity

Approach 3: Approach 1: SQL Grouping and Ordering

Code

Complexity

Approach 4: Approach 2: Direct SQL with Subquery Utilization

Code

Complexity

Approach 5: Default Approach

Code

Complexity Comparison

Detailed Complexity Analysis

Video Solution

Frequently Asked Questions

Related Problems

Ready to solve this problem?

Problem Info

Table of Contents

Problem Statement

Approach Overview

Approach 1: Using SQL Queries

Code

Complexity

Approach 2: Programmatically with HashMaps

Code

Complexity

Approach 3: Approach 1: SQL Grouping and Ordering

Code

Complexity

Approach 4: Approach 2: Direct SQL with Subquery Utilization

Code

Complexity

Approach 5: Default Approach

Code

Complexity Comparison

Detailed Complexity Analysis

Video Solution

Frequently Asked Questions

Related Problems

Ready to solve this problem?

Problem Info

Table of Contents