Average Time of Process per Machine - Solution & Explanation

EasyDatabase11 min readAsked at: Amazon, Microsoft, Meta +6

Problem Statement

Table: Activity

+----------------+---------+
| Column Name    | Type    |
+----------------+---------+
| machine_id     | int     |
| process_id     | int     |
| activity_type  | enum    |
| timestamp      | float   |
+----------------+---------+
The table shows the user activities for a factory website.
(machine_id, process_id, activity_type) is the primary key (combination of columns with unique values) of this table.
machine_id is the ID of a machine.
process_id is the ID of a process running on the machine with ID machine_id.
activity_type is an ENUM (category) of type ('start', 'end').
timestamp is a float representing the current time in seconds.
'start' means the machine starts the process at the given timestamp and 'end' means the machine ends the process at the given timestamp.
The 'start' timestamp will always be before the 'end' timestamp for every (machine_id, process_id) pair.
It is guaranteed that each (machine_id, process_id) pair has a 'start' and 'end' timestamp.

There is a factory website that has several machines each running the same number of processes. Write a solution to find the average time each machine takes to complete a process.

The time to complete a process is the 'end' timestamp minus the 'start' timestamp. The average time is calculated by the total time to complete every process on the machine divided by the number of processes that were run.

The resulting table should have the machine_id along with the average time as processing_time, which should be rounded to 3 decimal places.

Return the result table in any order.

The result format is in the following example.

Example 1:

Input: 
Activity table:
+------------+------------+---------------+-----------+
| machine_id | process_id | activity_type | timestamp |
+------------+------------+---------------+-----------+
| 0          | 0          | start         | 0.712     |
| 0          | 0          | end           | 1.520     |
| 0          | 1          | start         | 3.140     |
| 0          | 1          | end           | 4.120     |
| 1          | 0          | start         | 0.550     |
| 1          | 0          | end           | 1.550     |
| 1          | 1          | start         | 0.430     |
| 1          | 1          | end           | 1.420     |
| 2          | 0          | start         | 4.100     |
| 2          | 0          | end           | 4.512     |
| 2          | 1          | start         | 2.500     |
| 2          | 1          | end           | 5.000     |
+------------+------------+---------------+-----------+
Output: 
+------------+-----------------+
| machine_id | processing_time |
+------------+-----------------+
| 0          | 0.894           |
| 1          | 0.995           |
| 2          | 1.456           |
+------------+-----------------+
Explanation: 
There are 3 machines running 2 processes each.
Machine 0's average time is ((1.520 - 0.712) + (4.120 - 3.140)) / 2 = 0.894
Machine 1's average time is ((1.550 - 0.550) + (1.420 - 0.430)) / 2 = 0.995
Machine 2's average time is ((4.512 - 4.100) + (5.000 - 2.500)) / 2 = 1.456

Approach Overview

Problem Overview: Each process on a machine has two records: a start event and an end event with timestamps. The goal is to compute the average processing time for every machine_id. For each process, subtract the start time from the end time, then average those durations per machine.

Approach 1: Using HashMap for Storing Start and End Times (O(n) time, O(n) space)

Iterate through all activity records and store the start timestamp of each process in a hash map keyed by (machine_id, process_id). When the corresponding end record appears, compute the duration by subtracting the stored start time from the end time. Maintain another map that tracks total processing time and process count per machine. At the end, divide total time by count to produce the average for each machine. Hash lookups make each pairing operation constant time, giving O(n) total time complexity with O(n) auxiliary space. This approach is straightforward and mirrors how you'd process event logs in production systems. It relies on efficient key lookup using a HashMap.

Approach 2: Sorting and Linear Scan Method (O(n log n) time, O(1) extra space)

Sort the activity records by machine_id, process_id, and timestamp. After sorting, each process’s start and end events appear next to each other. Perform a single linear scan: when a start record is encountered, the following record will be the matching end. Compute the duration immediately and accumulate totals per machine. Because sorting dominates the runtime, the complexity becomes O(n log n) with constant extra space if the sort is done in-place. This approach works well when input ordering is unpredictable and you want deterministic pairing through ordering, using concepts from sorting and sequential scans.

Both solutions compute identical results: total duration per machine divided by the number of completed processes. The logic is similar to aggregations commonly used in database problems where events must be paired and summarized.

Recommended for interviews: The HashMap approach is typically expected. It runs in linear time, directly models the pairing of start and end events, and demonstrates strong understanding of event processing with constant-time lookups. The sorting method still shows solid reasoning but is less optimal due to the O(n log n) sorting cost.

Approach 1: Using HashMap for Storing Start and End Times

This approach involves using a data structure like a HashMap to store the start and end timestamps for each process on each machine. For each machine and process, compute the time taken using these timestamps. Then, calculate the average processing time for all machines.

This Python solution uses a dictionary to collect timestamps by process and machine. It then calculates process times and averages them by machine.

Code

Python JavaScript

Python

JavaScript

Complexity

Time Complexity: O(n), where n is the number of rows in the input. We iterate once through the list to collect timestamps.
Space Complexity: O(m), where m is the number of unique machine and process combinations.

Try this approach in the editor →

Approach 2: Sorting and Linear Scan Method

In this approach, we first sort the data by machine_id and process_id, ensuring that each start and end appears consecutively. We then do a single linear scan to compute the time taken for each process and maintain a cumulative sum for each machine, eventually computing the average processing time.

This Java implementation sorts the activity data and employs a dual-index scan to compute and accumulate the process time for each machine. The average times are then calculated and displayed.

Code

Java C++

Java

C++

Complexity

Time Complexity: O(n log n) due to sorting the list of activities.
Space Complexity: O(m), for storing accumulated processing times and counts by machine.

Try this approach in the editor →

Approach 3: Grouping and Aggregation

We can group by machine_id and use the AVG function to calculate the average time consumption of all process tasks on each machine. Since each process task on the machine has a pair of start and end timestamps, the time consumption of each process task can be calculated by subtracting the start timestamp from the end timestamp. Therefore, we can use the CASE WHEN or IF function to calculate the time consumption of each process task, and then use the AVG function to calculate the average time consumption of all process tasks on each machine.

Note that each machine has 2 process tasks, so we need to multiply the calculated average time consumption by 2.

Code

MySQL

Try this approach in the editor →

Approach 4: Default Approach

Code

MySQL

Try this approach in the editor →

Complexity Comparison

Approach	Complexity
Using HashMap for Storing Start and End Times	Time Complexity: O(n), where n is the number of rows in the input. We iterate once through the list to collect timestamps. Space Complexity: O(m), where m is the number of unique machine and process combinations.
Sorting and Linear Scan Method	Time Complexity: O(n log n) due to sorting the list of activities. Space Complexity: O(m), for storing accumulated processing times and counts by machine.
Grouping and Aggregation	—
Default Approach	—

Detailed Complexity Analysis

Approach	Time	Space	When to Use
HashMap for Start-End Pairing	O(n)	O(n)	Best general solution when events arrive in arbitrary order
Sorting and Linear Scan	O(n log n)	O(1) extra	Useful when sorting is acceptable or memory usage must stay minimal

Video Solution

Average Time of Process per Machine | Leetcode 1661 | Crack SQL Interviews in 50 Qs #mysql • Learn With Chirag • 31,816 views views

Watch 9 more video solutions →

Frequently Asked Questions

Is Average Time of Process per Machine easy or hard?

Average Time of Process per Machine is classified as an Easy problem. The core idea is straightforward: pair start and end events, compute the duration, and average the results per machine. The challenge mainly lies in organizing the data efficiently.

Average Time of Process per Machine Python/Java solution

Python and JavaScript implementations typically use dictionaries or objects as hash maps to pair start and end events. Java and C++ solutions often rely on sorting the records and performing a linear scan to compute durations, though hash maps can also be used in those languages.

How to solve Average Time of Process per Machine in O(n)?

Iterate through the activity list and store start timestamps in a hash map keyed by machine_id and process_id. When an end event is encountered, subtract the stored start time to compute the processing duration. Accumulate total duration and process count per machine, then divide total time by count to get the average.

What is the best approach for Average Time of Process per Machine?

The HashMap pairing approach is the most efficient. Store the start timestamp for each (machine_id, process_id) pair, then compute the duration when the end record appears. This processes all records in O(n) time with O(n) space while directly aggregating totals per machine.

Is Average Time of Process per Machine asked at Google/Amazon/Meta?

Problems involving log pairing, event processing, and aggregation appear frequently in interviews at companies like Amazon and Google. This question tests the ability to match related events and compute aggregated metrics efficiently using hash maps or database-style grouping.

What data structure is used in Average Time of Process per Machine?

A HashMap is the primary data structure used in the optimal solution. It stores the start timestamp for each process and allows constant-time lookup when the corresponding end event appears. Additional maps or counters track the accumulated time and process count per machine.

What is the time complexity of Average Time of Process per Machine?

The optimal solution runs in O(n) time using a HashMap to match start and end events. Each record is processed once with constant-time lookups. A sorting-based alternative takes O(n log n) time because the activity records must be ordered before scanning.

Ready to solve this problem?

Practice Average Time of Process per Machine with our built-in code editor and test cases.

Practice on FleetCode

Combine Two Tables

Second Highest Salary

Problem Info

DifficultyEasy

Acceptance68.7%

Approaches4

Reading time11 min

Asked at

Amazon Microsoft Meta Accenture Adobe

Practice this problem

Open in Editor

Average Time of Process per Machine - Solution & Explanation

Problem Statement

Approach Overview

Approach 1: Using HashMap for Storing Start and End Times

Code

Complexity

Approach 2: Sorting and Linear Scan Method

Code

Complexity

Approach 3: Grouping and Aggregation

Code

Approach 4: Default Approach

Code

Complexity Comparison

Detailed Complexity Analysis

Video Solution

Frequently Asked Questions

Ready to solve this problem?

Problem Info

Table of Contents

Average Time of Process per Machine - Solution & Explanation

Problem Statement

Approach Overview

Approach 1: Using HashMap for Storing Start and End Times

Code

Complexity

Approach 2: Sorting and Linear Scan Method

Code

Complexity

Approach 3: Grouping and Aggregation

Code

Approach 4: Default Approach

Code

Complexity Comparison

Detailed Complexity Analysis

Video Solution

Frequently Asked Questions

Ready to solve this problem?

Problem Info

Table of Contents

Problem Statement

Approach Overview

Approach 1: Using HashMap for Storing Start and End Times

Code

Complexity

Approach 2: Sorting and Linear Scan Method

Code

Complexity

Approach 3: Grouping and Aggregation

Code

Approach 4: Default Approach

Code

Complexity Comparison

Detailed Complexity Analysis

Video Solution

Frequently Asked Questions

Related Problems

Ready to solve this problem?

Problem Info

Table of Contents

Problem Statement

Approach Overview

Approach 1: Using HashMap for Storing Start and End Times

Code

Complexity

Approach 2: Sorting and Linear Scan Method

Code

Complexity

Approach 3: Grouping and Aggregation

Code

Approach 4: Default Approach

Code

Complexity Comparison

Detailed Complexity Analysis

Video Solution

Frequently Asked Questions

Related Problems

Ready to solve this problem?

Problem Info

Table of Contents