Given a list of folders folder, return the folders after removing all sub-folders in those folders. You may return the answer in any order.
If a folder[i] is located within another folder[j], it is called a sub-folder of it. A sub-folder of folder[j] must start with folder[j], followed by a "/". For example, "/a/b" is a sub-folder of "/a", but "/b" is not a sub-folder of "/a/b/c".
The format of a path is one or more concatenated strings of the form: '/' followed by one or more lowercase English letters.
"/leetcode" and "/leetcode/problems" are valid paths while an empty string and "/" are not.
Example 1:
Input: folder = ["/a","/a/b","/c/d","/c/d/e","/c/f"] Output: ["/a","/c/d","/c/f"] Explanation: Folders "/a/b" is a subfolder of "/a" and "/c/d/e" is inside of folder "/c/d" in our filesystem.
Example 2:
Input: folder = ["/a","/a/b/c","/a/b/d"] Output: ["/a"] Explanation: Folders "/a/b/c" and "/a/b/d" will be removed because they are subfolders of "/a".
Example 3:
Input: folder = ["/a/b/c","/a/b/ca","/a/b/d"] Output: ["/a/b/c","/a/b/ca","/a/b/d"]
Constraints:
1 <= folder.length <= 4 * 1042 <= folder[i].length <= 100folder[i] contains only lowercase letters and '/'.folder[i] always starts with the character '/'.Problem Overview: You receive a list of folder paths in a filesystem. Some folders may be subfolders of others (for example /a/b is inside /a). The task is to return only the top-level folders and remove every path that is a subfolder of another.
Approach 1: Sorting and Checking Prefix (O(n log n) time, O(1) extra space)
Sort all folder paths lexicographically. Once sorted, subfolders naturally appear right after their parent folder because they share the same prefix. Iterate through the sorted list and keep a folder only if it is not prefixed by the last accepted folder plus /. The key insight is that sorting groups related paths together, which eliminates the need to compare every pair. This approach relies heavily on string prefix comparison and works well for interview settings because it is simple and efficient. The overall time complexity is O(n log n) for sorting plus linear scanning, and space complexity is O(1) aside from the output. This approach mainly uses concepts from Array and String processing.
Approach 2: Using Trie Data Structure (O(T) time, O(T) space)
Insert each folder path into a Trie where each node represents a directory name between slashes. While inserting, stop if you reach a node already marked as a complete folder path. That indicates a parent folder already exists, so the current path is a subfolder and should be ignored. If insertion completes, mark the final node as an end node and discard any deeper children. The key idea is that the Trie naturally models hierarchical filesystem structure. Each traversal step corresponds to moving deeper into directories, making it easy to detect when a parent folder already exists. Time complexity is O(T) where T is the total number of characters across all paths, and space complexity is also O(T) due to storing the Trie nodes. This method highlights structured path processing using Trie and recursive traversal patterns similar to Depth-First Search.
Recommended for interviews: The sorting and prefix check approach is typically what interviewers expect. It demonstrates strong reasoning about lexicographic ordering and efficient scanning. The Trie approach shows deeper understanding of hierarchical data structures and filesystem modeling. Mentioning both approaches signals strong problem-solving breadth.
The main idea of this approach is to first sort the folders alphabetically. This way, any sub-folder will appear immediately after its parent folder. We can then iterate through the sorted list and check if the current folder is a prefix of the next folder (followed by '/'). If it is not a prefix, we add it to the final result list.
This Python solution sorts the folder paths and iterates through them. It maintains a 'prev' variable to store the last added folder to the results, ensuring we only add a folder if it is not a sub-folder of the previous folder in the result list.
Python
C++
Java
JavaScript
Time Complexity: O(n log n) due to sorting the list. Space Complexity: O(n) for storing the result list.
In this approach, we use a Trie (or Prefix Tree) to represent the hierarchy of folders. We insert each folder path into the Trie while ensuring to mark the end of any folder. During insertion, if we find that a folder is already marked as the end of a path in the Trie, we ignore any further paths as sub-folders of this path.
This Python solution builds a Trie for all folder paths. It only marks paths as ends when they are not sub-folders, thus ignoring deeper sub-folders in the structure.
Time Complexity: O(n * l) where n is the number of folders, and l is the average length of the folder paths. Space Complexity: O(n * l) for the Trie.
First, we sort the array folder in lexicographical order, then traverse the array. For the current folder f we are traversing, if its length is greater than or equal to the length of the last folder in the answer array, and its prefix includes the last folder in the answer array plus a /, then f is a subfolder of the last folder in the answer array, and we don't need to add it to the answer array. Otherwise, we add f to the answer array.
After the traversal ends, the folders in the answer array are the answer required by the problem.
The time complexity is O(n times log n times m), and the space complexity is O(m). Where n and m are the length of the array folder and the maximum length of the strings in the array folder, respectively.
Python
Java
C++
Go
TypeScript
JavaScript
We can use a trie to store all the folders in the array folder. Each node of the trie contains a children field, used to store the child nodes of the current node, and a fid field, used to store the index of the folder corresponding to the current node in the array folder.
For each folder f in the array folder, we first split f into several substrings according to /, then start from the root node and add the substrings to the trie in turn. Next, we start from the root node to search the trie. If the fid field of the current node is not -1, it means that the folder corresponding to the current node is a folder in the answer array. We add it to the answer array and return. Otherwise, we recursively search all child nodes of the current node and finally return the answer array.
The time complexity is O(n times m), and the space complexity is O(n times m). Where n and m are the length of the array folder and the maximum length of the strings in the array folder, respectively.
Python
Java
C++
Go
TypeScript
JavaScript
| Approach | Complexity |
|---|---|
| Approach 1: Sorting and Checking Prefix | Time Complexity: O(n log n) due to sorting the list. Space Complexity: O(n) for storing the result list. |
| Approach 2: Using Trie Data Structure | Time Complexity: O(n * l) where n is the number of folders, and l is the average length of the folder paths. Space Complexity: O(n * l) for the Trie. |
| Sorting | — |
| Trie | — |
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Sorting + Prefix Check | O(n log n) | O(1) | Best general solution. Simple implementation and commonly expected in interviews. |
| Trie Data Structure | O(T) | O(T) | Useful when modeling hierarchical paths or when many shared prefixes exist. |
Remove Sub-Folders from the Filesystem - Leetcode 1233 - Python • NeetCodeIO • 12,861 views views
Watch 9 more video solutions →Practice Remove Sub-Folders from the Filesystem with our built-in code editor and test cases.
Practice on FleetCode