Watch 10 video solutions for Average Time of Process per Machine, a easy level problem involving Database. This walkthrough by Learn With Chirag has 31,816 views views. Want to try solving it yourself? Practice on FleetCode or read the detailed text solution.
Table: Activity
+----------------+---------+
| Column Name | Type |
+----------------+---------+
| machine_id | int |
| process_id | int |
| activity_type | enum |
| timestamp | float |
+----------------+---------+
The table shows the user activities for a factory website.
(machine_id, process_id, activity_type) is the primary key (combination of columns with unique values) of this table.
machine_id is the ID of a machine.
process_id is the ID of a process running on the machine with ID machine_id.
activity_type is an ENUM (category) of type ('start', 'end').
timestamp is a float representing the current time in seconds.
'start' means the machine starts the process at the given timestamp and 'end' means the machine ends the process at the given timestamp.
The 'start' timestamp will always be before the 'end' timestamp for every (machine_id, process_id) pair.
It is guaranteed that each (machine_id, process_id) pair has a 'start' and 'end' timestamp.
There is a factory website that has several machines each running the same number of processes. Write a solution to find the average time each machine takes to complete a process.
The time to complete a process is the 'end' timestamp minus the 'start' timestamp. The average time is calculated by the total time to complete every process on the machine divided by the number of processes that were run.
The resulting table should have the machine_id along with the average time as processing_time, which should be rounded to 3 decimal places.
Return the result table in any order.
The result format is in the following example.
Example 1:
Input: Activity table: +------------+------------+---------------+-----------+ | machine_id | process_id | activity_type | timestamp | +------------+------------+---------------+-----------+ | 0 | 0 | start | 0.712 | | 0 | 0 | end | 1.520 | | 0 | 1 | start | 3.140 | | 0 | 1 | end | 4.120 | | 1 | 0 | start | 0.550 | | 1 | 0 | end | 1.550 | | 1 | 1 | start | 0.430 | | 1 | 1 | end | 1.420 | | 2 | 0 | start | 4.100 | | 2 | 0 | end | 4.512 | | 2 | 1 | start | 2.500 | | 2 | 1 | end | 5.000 | +------------+------------+---------------+-----------+ Output: +------------+-----------------+ | machine_id | processing_time | +------------+-----------------+ | 0 | 0.894 | | 1 | 0.995 | | 2 | 1.456 | +------------+-----------------+ Explanation: There are 3 machines running 2 processes each. Machine 0's average time is ((1.520 - 0.712) + (4.120 - 3.140)) / 2 = 0.894 Machine 1's average time is ((1.550 - 0.550) + (1.420 - 0.430)) / 2 = 0.995 Machine 2's average time is ((4.512 - 4.100) + (5.000 - 2.500)) / 2 = 1.456
Problem Overview: Each process on a machine has two records: a start event and an end event with timestamps. The goal is to compute the average processing time for every machine_id. For each process, subtract the start time from the end time, then average those durations per machine.
Approach 1: Using HashMap for Storing Start and End Times (O(n) time, O(n) space)
Iterate through all activity records and store the start timestamp of each process in a hash map keyed by (machine_id, process_id). When the corresponding end record appears, compute the duration by subtracting the stored start time from the end time. Maintain another map that tracks total processing time and process count per machine. At the end, divide total time by count to produce the average for each machine. Hash lookups make each pairing operation constant time, giving O(n) total time complexity with O(n) auxiliary space. This approach is straightforward and mirrors how you'd process event logs in production systems. It relies on efficient key lookup using a HashMap.
Approach 2: Sorting and Linear Scan Method (O(n log n) time, O(1) extra space)
Sort the activity records by machine_id, process_id, and timestamp. After sorting, each process’s start and end events appear next to each other. Perform a single linear scan: when a start record is encountered, the following record will be the matching end. Compute the duration immediately and accumulate totals per machine. Because sorting dominates the runtime, the complexity becomes O(n log n) with constant extra space if the sort is done in-place. This approach works well when input ordering is unpredictable and you want deterministic pairing through ordering, using concepts from sorting and sequential scans.
Both solutions compute identical results: total duration per machine divided by the number of completed processes. The logic is similar to aggregations commonly used in database problems where events must be paired and summarized.
Recommended for interviews: The HashMap approach is typically expected. It runs in linear time, directly models the pairing of start and end events, and demonstrates strong understanding of event processing with constant-time lookups. The sorting method still shows solid reasoning but is less optimal due to the O(n log n) sorting cost.
| Approach | Time | Space | When to Use |
|---|---|---|---|
| HashMap for Start-End Pairing | O(n) | O(n) | Best general solution when events arrive in arbitrary order |
| Sorting and Linear Scan | O(n log n) | O(1) extra | Useful when sorting is acceptable or memory usage must stay minimal |