Given the root of a binary tree, return all duplicate subtrees.
For each kind of duplicate subtrees, you only need to return the root node of any one of them.
Two trees are duplicate if they have the same structure with the same node values.
Example 1:
Input: root = [1,2,3,4,null,2,4,null,null,4] Output: [[2,4],[4]]
Example 2:
Input: root = [2,1,1] Output: [[1]]
Example 3:
Input: root = [2,2,2,3,null,3,null] Output: [[2,3],[3]]
Constraints:
[1, 5000]-200 <= Node.val <= 200Problem Overview: You are given the root of a binary tree and need to return the roots of all subtrees that appear more than once. Two subtrees are considered duplicates if they have the same structure and the same node values. The challenge is efficiently detecting identical subtree structures across the entire tree.
Approach 1: Tree Serialization Using DFS (O(n2) time, O(n) space)
This approach converts every subtree into a unique string representation using postorder traversal. During a DFS traversal, you serialize the left subtree, the right subtree, and the current node value into a string like "left,right,val". A hash table tracks how many times each serialized subtree appears. When a serialization is seen for the second time, you record that subtree root as a duplicate.
The key insight is that identical subtree structures generate identical serialized strings. Using depth-first search, every node contributes to exactly one serialization result, allowing you to systematically compare subtrees without pairwise comparisons. The downside is that repeated string concatenation can grow large for deep trees, leading to a worst‑case time complexity of O(n2).
Approach 2: Tree Serialization with Unique Identifiers (O(n) time, O(n) space)
A more efficient strategy replaces full string serialization with compact integer identifiers. During DFS, each subtree is represented by a tuple containing the current node value and the IDs of its left and right subtrees. This tuple is stored in a hash map that assigns a unique integer ID the first time the structure appears.
Each subtree is therefore represented by a small numeric signature instead of a long string. Another map counts how many times each subtree ID appears, and once the count reaches two, the subtree root is added to the result list. Because each node is processed once and tuple lookups are constant time, the algorithm runs in O(n) time and uses O(n) space.
This approach works particularly well for large binary tree inputs because subtree comparisons become integer comparisons instead of string comparisons.
Recommended for interviews: The unique identifier approach is the one interviewers typically expect. It demonstrates strong understanding of DFS traversal, hashing, and subtree memoization while achieving optimal O(n) complexity. Explaining the basic serialization idea first shows clear reasoning, but optimizing it with subtree IDs highlights deeper algorithmic thinking.
Utilize a Depth-First Search (DFS) strategy and serialize each subtree. Treat the serialization as a key in a hashmap to track occurrences of identical subtrees. Duplicate entries are detected when a serialized key is encountered more than once.
This Python solution uses DFS to serialize each subtree into a string. These strings are stored in a dictionary. If a string appears more than once, it indicates a duplicate subtree. The function returns one node from each set of duplicates.
Time Complexity: O(N), where N is the number of nodes since we visit each node once.
Space Complexity: O(N), accounting for the hashmap to store serialized trees and recursion stack space.
Rather than using direct serialization strings, assign each unique subtree encountered a unique ID using a hash map. If a subtree's ID is seen again, it's a duplicate.
This Python approach maps each subtree structure to a unique identifier, incrementally assigning IDs using a dictionary. Duplicates are recognized when the same ID appears multiple times.
Python
Java
C++
JavaScript
Time Complexity: O(N), as we traverse each node.
Space Complexity: O(N), necessary for dictionaries and recursive processing space.
| Approach | Complexity |
|---|---|
| Tree Serialization Using DFS | Time Complexity: O(N), where N is the number of nodes since we visit each node once. |
| Tree Serialization with Unique Identifiers | Time Complexity: O(N), as we traverse each node. |
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Tree Serialization Using DFS | O(n^2) | O(n) | Simple implementation when tree size is moderate and string serialization cost is acceptable |
| Serialization with Unique Subtree Identifiers | O(n) | O(n) | Optimal solution for interviews and large trees where repeated string comparisons are expensive |
Find Duplicate Subtrees - Leetcode 652 - Python • NeetCodeIO • 27,013 views views
Watch 9 more video solutions →Practice Find Duplicate Subtrees with our built-in code editor and test cases.
Practice on FleetCodePractice this problem
Open in Editor