Watch 2 video solutions for Employees With Deductions, a medium level problem involving Database. This walkthrough by Everyday Data Science has 1,497 views views. Want to try solving it yourself? Practice on FleetCode or read the detailed text solution.
Table: Employees
+--------------+------+ | Column Name | Type | +--------------+------+ | employee_id | int | | needed_hours | int | +--------------+------+ employee_id is column with unique values for this table. Each row contains the id of an employee and the minimum number of hours needed for them to work to get their salary.
Table: Logs
+-------------+----------+ | Column Name | Type | +-------------+----------+ | employee_id | int | | in_time | datetime | | out_time | datetime | +-------------+----------+ (employee_id, in_time, out_time) is the primary key (combination of columns with unique values) for this table. Each row of this table shows the time stamps for an employee. in_time is the time the employee started to work, and out_time is the time the employee ended work. All the times are in October 2022. out_time can be one day after in_time which means the employee worked after the midnight.
In a company, each employee must work a certain number of hours every month. Employees work in sessions. The number of hours an employee worked can be calculated from the sum of the number of minutes the employee worked in all of their sessions. The number of minutes in each session is rounded up.
51 minutes and 2 seconds in a session, we consider it 52 minutes.Write a solution to report the IDs of the employees that will be deducted. In other words, report the IDs of the employees that did not work the needed hours.
Return the result table in any order.
The result format is in the following example.
Example 1:
Input:
Employees table:
+-------------+--------------+
| employee_id | needed_hours |
+-------------+--------------+
| 1 | 20 |
| 2 | 12 |
| 3 | 2 |
+-------------+--------------+
Logs table:
+-------------+---------------------+---------------------+
| employee_id | in_time | out_time |
+-------------+---------------------+---------------------+
| 1 | 2022-10-01 09:00:00 | 2022-10-01 17:00:00 |
| 1 | 2022-10-06 09:05:04 | 2022-10-06 17:09:03 |
| 1 | 2022-10-12 23:00:00 | 2022-10-13 03:00:01 |
| 2 | 2022-10-29 12:00:00 | 2022-10-29 23:58:58 |
+-------------+---------------------+---------------------+
Output:
+-------------+
| employee_id |
+-------------+
| 2 |
| 3 |
+-------------+
Explanation:
Employee 1:
- Worked for three sessions:
- On 2022-10-01, they worked for 8 hours.
- On 2022-10-06, they worked for 8 hours and 4 minutes.
- On 2022-10-12, they worked for 4 hours and 1 minute. Note that they worked through midnight.
- Employee 1 worked a total of 20 hours and 5 minutes across sessions and will not be deducted.
Employee 2:
- Worked for one session:
- On 2022-10-29, they worked for 11 hours and 59 minutes.
- Employee 2 did not work their hours and will be deducted.
Employee 3:
- Did not work any session.
- Employee 3 did not work their hours and will be deducted.
Problem Overview: You are given an Employees table with required working hours and a Logs table that records when employees clock in and out. The task is to compute the total worked time for each employee and return those whose worked hours are less than the required hours, meaning a deduction should apply.
Approach 1: Correlated Subquery (O(n^2) time, O(1) space)
A straightforward way is to calculate the total working time for each employee using a correlated subquery. For every row in Employees, run a subquery on Logs that sums out_time - in_time for the same employee_id. If that computed value is smaller than needed_hours, the employee has deductions. This approach is easy to reason about but inefficient because the database repeatedly scans the Logs table for each employee.
In large datasets this quickly becomes expensive since each employee triggers another aggregation query. It still works well for small tables or quick prototyping when performance is not a concern.
Approach 2: Aggregation with GROUP BY (O(n) time, O(n) space)
The efficient solution aggregates working hours in a single pass using GROUP BY. Join Employees with Logs on employee_id, compute SUM(out_time - in_time) for each employee, and compare it directly against needed_hours. SQL handles the grouping internally and calculates total working hours for all employees in one scan.
This approach leverages standard database operations such as joins and aggregation. The key idea is computing all employee totals once instead of recalculating them repeatedly. Using HAVING or filtering on the aggregated result makes it easy to return only the employees with insufficient hours.
Conceptually, the query performs three steps: join the two tables, aggregate log durations per employee, and filter rows where total work time is less than required. These operations are core patterns in SQL query design and appear frequently in interview-style database problems.
Recommended for interviews: The GROUP BY aggregation approach is what interviewers expect. It shows you understand relational joins and aggregate filtering using SUM and HAVING. Mentioning the correlated subquery demonstrates awareness of simpler but less efficient solutions, while the aggregated query shows you can design scalable SQL queries.
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Correlated Subquery | O(n^2) | O(1) | Simple queries or small datasets where readability matters more than performance |
| JOIN + GROUP BY Aggregation | O(n) | O(n) | Best approach for large tables; computes all employee totals in one pass |