Watch 10 video solutions for Employees With Missing Information, a easy level problem involving Database. This walkthrough by Everyday Data Science has 6,373 views views. Want to try solving it yourself? Practice on FleetCode or read the detailed text solution.
Table: Employees
+-------------+---------+ | Column Name | Type | +-------------+---------+ | employee_id | int | | name | varchar | +-------------+---------+ employee_id is the column with unique values for this table. Each row of this table indicates the name of the employee whose ID is employee_id.
Table: Salaries
+-------------+---------+ | Column Name | Type | +-------------+---------+ | employee_id | int | | salary | int | +-------------+---------+ employee_id is the column with unique values for this table. Each row of this table indicates the salary of the employee whose ID is employee_id.
Write a solution to report the IDs of all the employees with missing information. The information of an employee is missing if:
Return the result table ordered by employee_id in ascending order.
The result format is in the following example.
Example 1:
Input: Employees table: +-------------+----------+ | employee_id | name | +-------------+----------+ | 2 | Crew | | 4 | Haven | | 5 | Kristian | +-------------+----------+ Salaries table: +-------------+--------+ | employee_id | salary | +-------------+--------+ | 5 | 76071 | | 1 | 22517 | | 4 | 63539 | +-------------+--------+ Output: +-------------+ | employee_id | +-------------+ | 1 | | 2 | +-------------+ Explanation: Employees 1, 2, 4, and 5 are working at this company. The name of employee 1 is missing. The salary of employee 2 is missing.
Problem Overview: Two database tables store employee records separately. Employees contains employee_id and name, while Salaries stores employee_id and salary. Some employees appear in only one table. Your task is to return all employee_id values that are missing information in either table and sort the result in ascending order.
Approach 1: SQL JOIN and NULL Check (O(n + m) time, O(1) extra space)
The most common solution uses LEFT JOIN operations to detect unmatched rows. Join Employees with Salaries on employee_id and filter rows where the salary record is NULL. Do the reverse join to find salaries that do not have a matching employee record. Combine both result sets with UNION. The key insight: unmatched rows in joins produce NULL values, which directly reveal missing information. This approach works efficiently because the database performs indexed join operations internally. When working with relational datasets, this pattern appears frequently in SQL and database interview questions.
Approach 2: SQL UNION with NOT IN / NOT EXISTS (O(n + m) time, O(1) extra space)
Another clean SQL solution checks membership explicitly. Select employee IDs from Employees that do not appear in Salaries, then union them with IDs from Salaries that do not appear in Employees. Both queries run independently and the UNION merges the results while removing duplicates. The core operation is a set difference between the two tables. This approach is easy to read and commonly used when joins would make the query harder to follow.
Approach 3: SQL FULL OUTER JOIN (O(n + m) time, O(1) extra space)
A FULL OUTER JOIN directly exposes rows that exist in only one table. When joining Employees and Salaries, rows missing from either side produce NULL columns. Filtering where either the employee record or salary record is NULL returns exactly the employees with incomplete data. This is often the most expressive solution conceptually, though some SQL dialects do not support full outer joins. ORM-based implementations (such as SQLAlchemy or Entity Framework) typically translate this pattern cleanly.
Approach 4: Hashing (O(n + m) time, O(n + m) space)
Outside SQL environments, the problem becomes a simple set comparison. Insert all employee_id values from the first dataset into a hash set, then iterate through the second dataset. IDs missing in the set indicate salaries without employee records. Repeat the process in the opposite direction. Hash lookups run in constant time, making the overall complexity linear. This technique mirrors classic hashing interview problems where you detect elements present in one collection but absent in another.
Recommended for interviews: The SQL JOIN + NULL filtering approach is the most expected answer for database interviews. It demonstrates a solid understanding of relational joins and how missing matches appear as NULL. Mentioning the FULL OUTER JOIN variant shows deeper SQL knowledge, while the hashing approach demonstrates how the same logic translates to application code.
| Approach | Time | Space | When to Use |
|---|---|---|---|
| SQL JOIN + NULL Check | O(n + m) | O(1) | Standard SQL solution for detecting unmatched rows between tables |
| SQL UNION with NOT IN / NOT EXISTS | O(n + m) | O(1) | When you want explicit set difference queries instead of joins |
| SQL FULL OUTER JOIN | O(n + m) | O(1) | Best conceptual approach if the SQL dialect supports full outer joins |
| Hashing (Application Code) | O(n + m) | O(n + m) | When solving outside SQL using Python or JavaScript collections |