Watch 10 video solutions for Queries Quality and Percentage, an easy-level problem involving Database. This walkthrough by Everyday Data Science has 17,283 views. Want to try solving it yourself? Practice on FleetCode or read the detailed text solution.
Table: Queries
+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| query_name  | varchar |
| result      | varchar |
| position    | int     |
| rating      | int     |
+-------------+---------+

This table may have duplicate rows. This table contains information collected from some queries on a database. The position column has a value from 1 to 500. The rating column has a value from 1 to 5. A query with rating less than 3 is a poor query.
We define query quality as:
The average of the ratio between query rating and its position.
We also define poor query percentage as:
The percentage of all queries with rating less than 3.
Write a solution to find each query_name, the quality and poor_query_percentage.
Both quality and poor_query_percentage should be rounded to 2 decimal places.
Return the result table in any order.
The result format is in the following example.
Example 1:
Input:
Queries table:
+------------+-------------------+----------+--------+
| query_name | result            | position | rating |
+------------+-------------------+----------+--------+
| Dog        | Golden Retriever  | 1        | 5      |
| Dog        | German Shepherd   | 2        | 5      |
| Dog        | Mule              | 200      | 1      |
| Cat        | Shirazi           | 5        | 2      |
| Cat        | Siamese           | 3        | 3      |
| Cat        | Sphynx            | 7        | 4      |
+------------+-------------------+----------+--------+
Output:
+------------+---------+-----------------------+
| query_name | quality | poor_query_percentage |
+------------+---------+-----------------------+
| Dog        | 2.50    | 33.33                 |
| Cat        | 0.66    | 33.33                 |
+------------+---------+-----------------------+
Explanation:
Dog queries quality is ((5 / 1) + (5 / 2) + (1 / 200)) / 3 = 2.50
Dog queries poor_query_percentage is (1 / 3) * 100 = 33.33
Cat queries quality equals ((2 / 5) + (3 / 3) + (4 / 7)) / 3 = 0.66
Cat queries poor_query_percentage is (1 / 3) * 100 = 33.33
Problem Overview: You are given a Queries table where each row represents the result of a search query with its rating and position. For every query_name, compute two metrics: quality (the average of rating / position) and poor_query_percentage (percentage of rows where rating < 3). The output should group results by query_name and round values to two decimal places.
Approach 1: SQL Aggregation and Group By (O(n) time, O(1) space)
This problem fits naturally into SQL aggregation. Scan the table once, group rows by query_name, and compute two aggregates. For quality, calculate AVG(rating * 1.0 / position). For poor query percentage, use a conditional aggregation such as SUM(CASE WHEN rating < 3 THEN 1 ELSE 0 END) divided by the total count, multiplied by 100. SQL engines handle grouping efficiently, making this the cleanest and most common approach in interviews and production analytics queries.
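A minimal sketch of this aggregation, run here through Python's built-in sqlite3 so the query can be checked against the example data (table and column names follow the problem statement; the exact rounding syntax can vary by engine):

```python
import sqlite3

# Build the example Queries table in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Queries (query_name TEXT, result TEXT, position INT, rating INT);
INSERT INTO Queries VALUES
  ('Dog', 'Golden Retriever', 1, 5), ('Dog', 'German Shepherd', 2, 5),
  ('Dog', 'Mule', 200, 1),           ('Cat', 'Shirazi', 5, 2),
  ('Cat', 'Siamese', 3, 3),          ('Cat', 'Sphynx', 7, 4);
""")

# One pass, grouped by query_name: AVG of rating/position for quality,
# conditional SUM over COUNT(*) for the poor-query percentage.
rows = conn.execute("""
SELECT query_name,
       ROUND(AVG(rating * 1.0 / position), 2) AS quality,
       ROUND(SUM(CASE WHEN rating < 3 THEN 1 ELSE 0 END) * 100.0 / COUNT(*), 2)
         AS poor_query_percentage
FROM Queries
GROUP BY query_name
""").fetchall()
print(rows)
```

The `* 1.0` forces floating-point division, which matters on engines (like SQLite) where `rating / position` would otherwise be integer division.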
Approach 2: SQL Aggregation with Subqueries (O(n) time, O(1) space)
Another way is to compute intermediate metrics in subqueries and combine them in an outer query. One subquery calculates the average rating / position per query_name, while another computes the percentage of poor queries. The outer query joins these results on query_name. This structure is useful when metrics are complex or derived from separate aggregations. It still performs a linear scan over the dataset and relies on relational aggregation features from database systems.
Approach 3: Using DataFrames or List Aggregations (O(n) time, O(n) space)
In Python environments such as analytics scripts or interview variants, load the data into a DataFrame or dictionary keyed by query_name. Iterate through rows, compute rating / position, and store values per query. Then compute the mean and the proportion of ratings below 3. Libraries like pandas provide groupby with vectorized aggregation, which mirrors SQL GROUP BY behavior. Space complexity increases because the dataset is held in memory.
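A plain-Python sketch of this idea using a dictionary keyed by query_name (a pandas `groupby().agg()` version has the same shape; the row tuples below are assumed to mirror the table's columns):

```python
from collections import defaultdict

rows = [
    ("Dog", "Golden Retriever", 1, 5), ("Dog", "German Shepherd", 2, 5),
    ("Dog", "Mule", 200, 1),           ("Cat", "Shirazi", 5, 2),
    ("Cat", "Siamese", 3, 3),          ("Cat", "Sphynx", 7, 4),
]

# Collect per-query lists of rating/position ratios and raw ratings.
ratios = defaultdict(list)
ratings = defaultdict(list)
for name, _result, position, rating in rows:
    ratios[name].append(rating / position)
    ratings[name].append(rating)

# Mean of the ratios gives quality; the share of ratings below 3 gives the percentage.
report = {
    name: (round(sum(ratios[name]) / len(ratios[name]), 2),
           round(100 * sum(r < 3 for r in ratings[name]) / len(ratings[name]), 2))
    for name in ratios
}
print(report)
```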
Approach 4: Iterative Calculation After Data Retrieval (O(n) time, O(k) space)
If rows are retrieved through an API or cursor, process them iteratively. Maintain running totals for each query_name: sum of rating / position, total count, and count of rating < 3. After processing all rows, compute averages and percentages. This avoids storing all intermediate values and keeps only aggregated counters per query, which is efficient when working with streaming data.
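The running-totals version can be sketched as a single function over any row iterator (the `aggregate` name and tuple layout are illustrative, not part of the problem):

```python
from collections import defaultdict

def aggregate(stream):
    """Consume (query_name, result, position, rating) rows one at a time."""
    # Per query_name: [sum of rating/position, total rows, poor rows].
    acc = defaultdict(lambda: [0.0, 0, 0])
    for name, _result, position, rating in stream:
        a = acc[name]
        a[0] += rating / position
        a[1] += 1
        a[2] += rating < 3
    # Only the three counters per query are kept, never the raw rows.
    return {name: (round(s / n, 2), round(100 * p / n, 2))
            for name, (s, n, p) in acc.items()}

sample = [
    ("Dog", "Golden Retriever", 1, 5), ("Dog", "German Shepherd", 2, 5),
    ("Dog", "Mule", 200, 1),           ("Cat", "Shirazi", 5, 2),
    ("Cat", "Siamese", 3, 3),          ("Cat", "Sphynx", 7, 4),
]
print(aggregate(sample))
```

Because only three counters are stored per query_name, memory scales with the number of distinct queries (k), not the number of rows.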
Recommended for interviews: The expected solution is the SQL GROUP BY aggregation. It expresses both metrics directly with aggregate functions and conditional counting. Showing the subquery version demonstrates understanding of query decomposition, but the grouped aggregation is the most concise and idiomatic solution.
| Approach | Time | Space | When to Use |
|---|---|---|---|
| SQL Aggregation and Group By | O(n) | O(1) | Standard SQL analytics queries; simplest and most common solution |
| SQL Aggregation with Subqueries | O(n) | O(1) | When metrics are computed in separate logical steps |
| DataFrames or List Aggregations (Python) | O(n) | O(n) | When solving outside SQL using pandas or in-memory processing |
| Iterative Calculation After Data Retrieval | O(n) | O(k) | Streaming data or cursor-based processing with limited memory |